This page is dedicated to the proposition that not all uses of large-scale, schema-oriented ExtensibleMarkupLanguage are best-fits for their problem spaces. It is the counter-part of BenefitsOfXml.
----
''[Note: Considerable refactoring; still needs work.]''
* http:/~ward/ascent.html -- WardCunningham gets into xml.
Recently performed refactorings:
* Discussion of XML being too complex is moved to XmlIsTooComplex
* Much comparison of LispLanguage to XML (for whatever reason) moved to LispVsXml.
* Much comparison of EssExpressions to XML moved to XmlIsaPoorCopyOfEssExpressions.
* Poor examples of XML usage and surrounding discussion moved to XmlAbuse.
* Long discussion about the meaning of meaning moved to XmlIsJustDumbText.
----
'''Perceived problems with XML:'''
* XmlIsTooComplex for what it does.
* "Knowing" XML just looks good on a CV.
* It's too hard for programs to parse and too verbose and unreadable for humans to write.
* The benefits of "everyone is using XML, so we should too" are usually outweighed by the costs of time, training and mistakes involved in understanding it.
* Because it's increasingly used for data interchange, it is promoted as a data storage model. XML is only a data encoding format.
* or just comments wrapped around data. Too many comments and symbols.
* , when they could just be comments instead.
* Encourages non-relational data structures
** i.e., data is not even in 1st normal form, let alone 5th.
* Poor OnceAndOnlyOnce syntax factoring
* It's a poor copy of EssExpressions
* It is ExtremelyInterstrangled.
* Perhaps worst of all too many programmers don't understand the need for data description languages with broad support.
* Transformations, even identity transforms, result in changes to format (whitespace, attribute ordering, attribute quoting, whitespace around attributes, newlines). These problems can make "diff"ing the XML source very difficult.
A designer of Microsoft's MSXML and System.Xml criticizes XML (see http://nothing-more.blogspot.com/2004/10/where-xml-goes-astray.html):
* Character set allowed is too restrictive
* short on White Space provisions
* XmlNameSpace
----
'''XML is too hard for programs to parse and too verbose and unwritable for humans.'''
''It's not too hard for programs to parse - XML is a subset of SGML, which is well understood and well implemented, and because it's more rigorous than HTML it's easier to parse than HTML, which is a solved problem. It's not too hard for humans, by a long shot; a well-written DTD is a cakewalk to write in.''
Tedious rather than hard. It takes more time and code to extract the information you want from XML than it does to have the information formatted in flat files. Parsing flat files is easier than processing DOM unless tools are provided.
''Well, this is certainly true. You get an old argument of the virtues of (new thingy) over (old thingy). People thought HTML was silly in the light of Gopher, which was flat text, easier to write, edit and parse, and faster to transmit; over time they were shown to be incorrect (correction: over time they were shown different means serve different purposes). XML provides a mechanism for us to provide a parsable definition of document structure, which means that unlike CommaSeparatedValues or similar setups, the software doesn't have to know the document's structure ahead of time ''(given an XML parser; magic? Fact: XML is a document format; the use of DOM and IPC is the key to the success of XML (see SOAP))''. File space requirements matter less every day ''(tell that to a CPU designer, and he will laugh out loud)'', and though not trivial, XPath and XSLT are important features over and above what CSV provides. For many applications it's overkill. So is sending readme.1st files in RTF.''
----
'''The benefits of "everyone is using XML, so we should too" are usually outweighed by the costs of time, training and mistakes involved in understanding it.'''
''What are those costs? Many people said this about HTML, but frankly it's just not that hard - commands go in angle brackets, slash means off, i for italic, hit save, you're done. Technical workers can handle that, and XML is no worse (if they need to write their own DTDs, that's worse, but give that job to qualified staff).'' (Not really, by the way: the evolution of HTML seems to be moving many elements (like i, for example) out of HTML completely, so to be current you're left learning CSS as well as HTML.)
''Training: everything takes training.''
Some things more than others.
''Most things more than XML.''
----
'''Because XML is increasingly used for data interchange, it is too easily promoted as a data storage model. XML is only a data encoding format.'''
It's not designed as a data storage model, although models can be built on top of it. Compared to older ASN.1 (correction: ASN.1 is really only a language for defining protocols; actually the protocol defined in ASN.1 can use XML as its data transfer format) or GIOP, such XML models suck. Inherent limitations make them unsalvageable. But many folks confuse storage and exchange.
XML must be concrete enough for light-weight programs to parse; the same data may be described in many ways, and different XML representations are suitable for different tasks, in opposition to the OnceAndOnlyOnce goal. In contrast the relational model and SQL use a canonical representation not favoring a particular task. In particular, many to many relationships are problematic in XML.
We have gone back to the sequential text file model at the expense of the kind of abstraction we gained when moving from COBOL to SQL. If you really want to process data sequentially, COBOL is a far better tool than XSLT applied to XML - but sensible people use SQL.
XML should just be used for transport, and there should be a canonical representation (schema) of the relational model. A simple subset of SQL could be implemented to operate on this representation to allow programmers to extract data. Imagine how much simpler life would be if instead of writing XML parsers, and editing enormous, complex and verbose text files by hand, we had a simple SQL-style interface.
In fact - I think I'll write one! (that will be easier than XPath and XQuery?)
-- Tim Glover (ed SkipSailors)
''Why do people insist on complaining that XML doesn't do this or XML doesn't do that, when XML is just supposed to be a data storage and transport mechanism? And now this comparison to COBOL?'' '''COBOL?!?''' ''Oy, vay!''
''XML isn't a database language ''per se''. It is a means of expressing data in a tree structure. If you need flat storage of your data for relational reasons and you don't feel like parsing out an XML file full of relational data items then how about using something other than XML? Although'' any ''data'' can ''be stored in an XML format; it's just a matter of designing the storage translation in and out. XML reliably transports the stored data for you.''
XML is a means of storing data in a tree structure and can express relationships. The XML community try to push it far too far. XML databases are a silly idea. XSLT is a silly idea. When you start embedding Java in XML a la Cocoon you know you've gone completely bonkers.
I have another problem: XML has to be processed by a computer program eventually, be it XSLT, Java, whatever. Because XML is very concrete and highly non-canonical, it introduces a very strong coupling between the actual representation chosen and the processing program, to which I object. You cannot change your XML DTD to optimize a particular task without having to rewrite all your existing programs. I don't think this has really hit home yet - but it will. It is going to cause BIG problems. SQL solves this problem by providing an abstract interface to the data.
My comparison with COBOL was with XSLT, a programming language written in XML for XML, not with XML itself. They are very similar - XML elements correspond to sequential file record types. XML attributes correspond to COBOL data division templates (conceptually at any rate. COBOL is very concrete in its layout of data attributes). COBOL has the great advantage over XSLT that it provides a very clean separation of program from data. In XSLT these are hopelessly confused, which causes much of the difficulty in reading and understanding it.
Thanks for engaging in this with me - I find it a useful and constructive discussion.
-- Tim
''XML is not a good basis for developing data models. It is not a shortcoming of XML, rather a problem that engineers pick the wrong tool for the job so often. Don't use a screwdriver as a crowbar.''
----
'''In XML there would be elements with names that make some sense to somebody reading them.'''
A central fallacy of XML is that natural language will convey meaning. It is deceptively easy to develop language that appears sufficient but allows disastrous semantics. ''Rely on the DTD/schema for syntax. If XML conforms to a schema it will pass muster through the validating parser. The value is then a syntactically legitimate value for that entity, and I can probably use it appropriately.''
Syntactically correct phrases in any language can hold semantic gibberish. What's the difference between a bicycle?
''When poor schema allow illegitimate values for a particular entity it doesn't matter what name you put on it, it's still going to get buggered up. XML depends on Schema to really offer protection against that sort of thing.''
----
''XML represents just a tree with attribute name/attribute value associative arrays.''
Oh boy, like we haven't had that already. Cut all the angle-bracket verbosity and that is essentially what you get: a tree with named nodes. The only novelty is that it is a textual representation, and the node names got our brains into such a frenzy that we thought "oh gee, we finally have a data representation we can all understand, with semantics we can all agree on!".
No we don't. In some projects I participated in, even XML Schema was not enough to express just the type semantics and what is allowed where; namely, that some attributes in parent nodes excluded some attributes in child nodes, depending on the semantics of an item.
YAML with a schema would be the semantic equivalent of XML, and much easier to read, parse and understand. About the only (limited) value in XML is just that: a schema, which allows SOME validation of the data.
Semantics of the data is left undefined, though, which is nowhere near enough: your understanding of "due_tax" attribute in "taxpayer" node might be very well different from mine. What XML represents is trivial, and what would be very useful to represent XML can't represent.
----
''XML should not have values in nodes, only attributes''
The distinction between value and attributes complicates parsing and code in a completely unnecessary manner. Could you not just have a __value attribute, restricted like a keyword?
----
''XML does not have predefined attributes''
General structure of XML is trivial for any programmer to pass around, using say JSON or YAML.
What it could contribute but doesn't would be predefined attributes, so that everyone around the world could understand and agree on what, say, "EBITDA" in the XML entity "company" means, without negotiating within small areas such as finance, and without creating myriad dialects and incomplete schemas.
An ideal, if perhaps overcomplicated, solution would be to make those attributes inheritable in an OO manner or tunable in some other way, so we could distinguish between the "EBITDA" attribute as proper in, say, Nevada/US or Germany/EU.
The main value of XML is contributing a bit of semantics to a tree structure, but that is way too little value for the cost it imposes.
----
''What is the '''alternative'''?''
SGML. A database. CommaSeparatedValues. JavaScriptObjectNotation. Something hand-rolled. Whatever's more appropriate. This is one of my personal hates: people who think that because tech X doesn't fit their current application perfectly, it's bad. It's as if people don't understand that there are other requirements than their own, and that this technology might not have been hand-crafted to support their CGI-based lemonade stand.
YamlAintMarkupLanguage solves some of the same problems as XML. LayeredMarkupAnnotationLanguage (LMNL) attempts to solve other problems of XML.
The suite of XML processing tools available for Scheme (SXML, SSAX and friends) makes XML processing incredibly painless in Scheme. See http://pobox.com/~oleg/ftp/Scheme/xml.html
Nelson's Simple Document Format Specification: http://blog.russnelson.com/opensource/xml-sucks.html It can be defined in a single line:
UTF-8, name=value, % encoding of [ %=\n], use Python indentation for hierarchy but no tabs.
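A rough sketch of how little code a reader for such a format might need. The one-line definition leaves details open, so several things here are assumptions rather than part of Nelson's specification: every line is taken to be name=value (the value may be empty), a more deeply indented line is a child of the nearest less-indented line above it, and the function name parse_simple is made up for this page.

```python
from urllib.parse import unquote


def parse_simple(text):
    """Parse indented name=value lines into a list of (name, value, children) tuples.

    Assumptions (not from the one-line spec): every line is name=value,
    and indentation with spaces marks children of the preceding shallower line.
    """
    root = []
    # Stack of (indent, children-list); the sentinel holds the top-level list.
    stack = [(-1, root)]
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if "\t" in line:
            raise ValueError("tabs are not allowed")
        indent = len(line) - len(line.lstrip(" "))
        name, _, value = line.strip().partition("=")
        # Percent-decoding restores space, %, = and newline characters.
        node = (unquote(name), unquote(value), [])
        # Pop back to the nearest line that is less indented than this one.
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][1].append(node)
        stack.append((indent, node[2]))
    return root
```

For example, parse_simple("server=\n  motd=hello%20world") gives one top-level "server" node whose single child has the value "hello world".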
See RelationalAlternativeToXml.
----
''I don't like that s-expr don't require names for all constructs. An interpreter won't care. I as a human want to know what is meant by a list. The only way to know is through a name. -- AnonymousDonor''
"I don't like that this language doesn't restrict people's behavior". S-exps give you the freedom to label elements by names, or not to do so. You are assumed to be an intelligent human being who can design the most convenient representation for your data. There is this concept called context, okay? Context is very helpful in that it gives rise to concise, clear expressions of meaning, if the receiver is only willing to keep a little bit of state and do a little bit of work. Intelligent humans can handle contexts without effort. When I open my dresser in the morning, I don't need every element therein to be labelled as a or . These objects have a recognizable form, and additionally the dresser establishes a context for their interpretation. I don't care that these objects don't mean anything to someone who has never seen a pair of socks or a dresser.
Nor do you pretend that labelling the socks will define them for someone who has never seen feet. Meaning depends on usage. The danger of XML is that two clients may assume their usage is the same when it isn't.
''Yes, that is a problem. Just as when binary data is transferred over a network, both ends must agree on the byte order and size of the data. XML can't address that particular issue; that problem is higher up. All XML can do is make certain that the elements are labeled and that their data complies with the DTD/schema. Remember, XML just moves the data around for you without adding, deleting, or changing anything. It's up to you to use the data appropriately when it gets to its destination.''
----
XML can be overkill when your data doesn't have a deep hierarchical structure. If a Windows-style INI file (sections containing name=value pairs) suffices, use that. If you need more hierarchical structure, or you don't mind markup clutter, use XML. Visually reading "FooBar=6" is easier than reading "<FooBar>6</FooBar>". (LimitsOfHierarchies)
''XML is a document format, not a data format and this is commonly forgotten.''
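To make the INI comparison concrete, here is a small sketch that reads the same shallow setting from both notations using only the Python standard library. The section name "display", the wrapper element "config" and the function names are invented for illustration; only FooBar=6 comes from the text above.

```python
import configparser
import xml.etree.ElementTree as ET

INI_TEXT = "[display]\nFooBar=6\n"
XML_TEXT = "<config><display><FooBar>6</FooBar></display></config>"


def read_ini(text):
    """Return {section: {name: value}} from INI text."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


def read_xml(text):
    """Return {section: {name: value}} from a two-level XML document."""
    root = ET.fromstring(text)
    return {section.tag: {item.tag: item.text for item in section} for section in root}
```

Both readers produce the same two-level dictionary, so for data this shallow the difference lies mostly in what the human has to look at, not in what the program receives.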
----
XML doesn't even have a standard way to handle whitespace or newlines. This really should be in the language specification, but it's not. Every XML parser I've seen has had to have 10 different options for the different ways to parse XML.
''XML isn't perfect, but the statement above is definitely false; XML has extremely precise rules for whitespace (it all, without exception, must be presented to the reading program exactly as it appears in the document) and newlines (any instances of \n, \r\n, or \r must be presented to the program as \n). You may not like the rules, but they exist and are precise.''
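A quick way to see both rules in action, sketched with Python's standard xml.etree.ElementTree (which sits on top of the expat parser). The helper name text_of is made up for this example.

```python
import xml.etree.ElementTree as ET


def text_of(document):
    """Return the character data of the root element exactly as the parser reports it."""
    return ET.fromstring(document).text


# Whitespace inside content is handed to the program untouched.
padded = text_of("<a>  x \n y  </a>")

# Line endings, however they were written, arrive as a single "\n".
normalized = text_of("<a>one\r\ntwo\rthree</a>")
```

Here padded keeps its leading and trailing spaces, while normalized contains only "\n" line breaks. Any trimming or collapsing beyond that is a decision made by the application or by parser options layered on top, which may be where the "10 different options" impression comes from.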
''That also makes it extremely difficult to round-trip XML. Consider how you would attach a digital signature to an XML document: serialize the document, build the signature, then stick the document and the signature into a "container" document and send the container. On the receiving side, parse the "container" and extract the signature, and then re-serialize the document to generate a comparison signature. If your two XML implementations follow slightly different de/serialization rules, you'll get different text and the signatures won't match. ''Oddly enough'', Ron Rivest (of RSA fame) submitted an IETF draft for S-Expressions back in 1997 (http://theory.lcs.mit.edu/~rivest/sexp.txt) that defines a canonical serialized representation for data.''
There is a canonicalized format for XML, and perfect roundtripping is not necessary, because it is irrelevant whether unnecessary whitespace is preserved or order of attributes is preserved. "Good-enough" (tm) roundtripping is possible.
The problem with this draft is that it is, in my opinion, the worst of both worlds. You can include arbitrary binary data in a document so it's hard to view in a text editor, but it still uses a verbose syntax that is only necessary if you want to edit it in a text editor. I think if size or parsing speed is so important that you can't properly escape your binary data you should use an all binary format. Possibly one with separate headers, so you don't even have to store the structure with the data. See above for my choice - Scheme R^5 minus some things.
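On the canonicalized-XML point raised in this exchange, a minimal sketch of the signing scenario, assuming Python 3.8 or later, where xml.etree.ElementTree.canonicalize implements C14N 2.0. The function name canonical_digest and the sample documents are invented, and a bare SHA-256 digest stands in for a real signature.

```python
import hashlib
import xml.etree.ElementTree as ET


def canonical_digest(xml_text):
    """Hash the canonical (C14N 2.0) form of a document instead of its raw bytes."""
    canonical = ET.canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same information, different serializations: attribute order, quoting,
# spacing between attributes, and empty-element syntax all differ.
SENDER = "<doc b=\"2\"   a='1'><item/></doc>"
RECEIVER = '<doc a="1" b="2"><item></item></doc>'

# Whitespace inside element content is data, so this one is a different document.
PADDED = '<doc a="1" b="2"><item> </item></doc>'
```

The digests of SENDER and RECEIVER agree, while PADDED hashes differently. That fits both sides of the argument: serialization noise can be normalized away, but whitespace in content is significant and an implementation that adds or drops it will break the comparison.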
----
Could someone comment on why people want to use a markup language as a standard for data representation? It really doesn't make sense to me. Programmers and people think in things like sets, lists, trees, dictionaries and tables. XML can represent these, but I haven't seen a nice way to do it. The only thing I've seen XML do nicely is mark up documents.
* ''That's because that's what it's for.''
* Not necessarily exclusively. People may think "in trees" (for example), but often we need a ''textual'' (notational) representation of a tree because it's hard to prepare and feed images or diagrams to computers. So far XML seems more popular than the alternative textual notations for such. I suppose it is ArgumentByPopularity, but sometimes that's the easiest way to measure WetWare issues, and representation is generally a WetWare issue. -t
Here is a quick example of a dictionary in XML:
electronic mail
hypertext markup language
extensible markup language
----
Anybody can come up with a single example of why one approach is superior to another one. Here you have chosen to use the attribute for the XML element as the ID of the word to be defined. That isn't the only way to do this; for instance:
e-mail
Electronic mail
...etc. There are lots of flexible ways in which the data can be designed and communicated via XML. The whole argument about verbosity is a RedHerring; XML is ''not'' supposed to be storage space efficient -- it is supposed to represent data in a means that humans and automata can both read effectively and use by translating into more efficient forms.
''Yes, that is what a printed notation for data is supposed to do. But XML is a failure in achieving these goals.''
Oh? And exactly how is XML a "failure" at expressing data? It certainly has nothing to do with the printed expression, since the XML data needs to be run through an XSLT or something to be put into user-defined format.
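The two dictionary layouts argued over above might look something like the following. This is only a sketch: the element and attribute names (dictionary, word, id, entry, term, definition), the "html" and "xml" keys, and the loader names are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Layout 1: the word being defined is carried in an attribute.
ATTRIBUTE_STYLE = """<dictionary>
  <word id="e-mail">electronic mail</word>
  <word id="html">hypertext markup language</word>
  <word id="xml">extensible markup language</word>
</dictionary>"""

# Layout 2: the word and its definition are both child elements.
ELEMENT_STYLE = """<dictionary>
  <entry><term>e-mail</term><definition>electronic mail</definition></entry>
  <entry><term>html</term><definition>hypertext markup language</definition></entry>
  <entry><term>xml</term><definition>extensible markup language</definition></entry>
</dictionary>"""


def load_attribute_style(text):
    """Build a dict from <word id="...">definition</word> elements."""
    return {word.get("id"): word.text for word in ET.fromstring(text).iter("word")}


def load_element_style(text):
    """Build a dict from <entry><term/><definition/></entry> elements."""
    root = ET.fromstring(text)
    return {e.findtext("term"): e.findtext("definition") for e in root.iter("entry")}
```

Both loaders return the same three-entry dictionary, but each is tied to its own layout. Nothing in XML itself says which shape a "dictionary" should take, which illustrates both the flexibility one side praises and the lack of a canonical form the other side complains about.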
----
I like YAML because there exist conceptually simple universal routines to map reliably and predictably between chunks of text and well-defined data structures such as hashes and lists. (See example at http://yaml.org/start.html).
----
My conjecture is that XML dialects are going to include abstractions for conditionals (if/then/else) and iterations (for/while/repeat) - or they most probably already do. So when you read in XML fragments and evaluate those constructs, you are actually writing an interpreter for a programming language. Except that you write your interpreter in a different language (say, Java or Smalltalk) than the source language (your XML dialect). And you need to take care of parsing, internal representations, namespaces, and so forth. In Lisp, you just write a bunch of macros and functions, and you're done. This new set of macros and functions is just another set of sexprs, so afterwards you can even create another level on top of them, for example checkers that check for the correctness of your macros, program transformations that weave in treatment of multithreading, security restrictions, and so on, whatever you need. And guess what, all these layers can be compiled into machine language by Lisp compilers, instead of being executed by dumb interpreters written in Java, or other languages that artificially distinguish between code and data. (Yes, the distinction between data and code is an artificial one, inside a computer there is no such distinction. That's where the power of computers really comes from. Lisp is the only language that gives you this power directly, without artificial boundaries, and in a very usable and easily comprehensible way. Sexprs might be harder to read than traditional languages at first, but they are relatively easy to comprehend when taking the power into account that they give you.)
This is already more about this issue than should be on this page. Please read the papers linked in MetaCircularEvaluator for more information.
''Is this some argument about the use of XML as a '''programming''' language? Because that certainly isn't the design intent for XML or any other markup language. XML has nothing to do with procedural decision making or loop repeating; it is strictly a mechanism for describing data and transporting it from one place to another without distortion. Any other use constitutes abuse, which voids the warranty.''
Fine, could you please explain that to http://ant.apache.org/, http://nant.sourceforge.net/ and http://xplusplus.sourceforge.net, just to name a few? I think XML is abused about everywhere now and you just cannot get it back into the bottle it came from.
''And this is the fault of the XML spec?!? Please, if you find examples of XML abuse, report it to the proper authorities! Put those abusers under the spotlight!''
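To illustrate the conjecture that opened this section: once an XML dialect grows conditionals and loops, whoever consumes it ends up writing an interpreter in some other language. The dialect below (seq, print, if, repeat and their attributes) is entirely made up for this sketch and does not correspond to Ant, NAnt or any real tool.

```python
import xml.etree.ElementTree as ET


def run(node, env, out):
    """Evaluate one element of the made-up dialect, appending printed lines to out."""
    if node.tag == "seq":
        for child in node:
            run(child, env, out)
    elif node.tag == "print":
        # {name} placeholders in the text are filled from the environment.
        out.append(node.get("text", "").format(**env))
    elif node.tag == "if":
        # <if var="n" equals="1"> runs its children when env["n"] == 1.
        if env.get(node.get("var")) == int(node.get("equals")):
            for child in node:
                run(child, env, out)
    elif node.tag == "repeat":
        # <repeat times="3" var="n"> binds n to 0, 1, 2 in turn.
        for i in range(int(node.get("times"))):
            env[node.get("var", "i")] = i
            for child in node:
                run(child, env, out)
    else:
        raise ValueError(f"unknown element: {node.tag}")


PROGRAM = """
<seq>
  <repeat times="3" var="n">
    <if var="n" equals="1"><print text="one"/></if>
    <print text="n={n}"/>
  </repeat>
</seq>
"""


def execute(source):
    """Parse a program in the dialect and return the lines it prints."""
    out = []
    run(ET.fromstring(source), {}, out)
    return out
```

Even this toy needs a dispatch on tag names, an environment and its own error handling, all living outside the XML. That is the cost being described above, whether one blames the spec or the people who built such dialects.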
----
XML seems to be too verbose and unwieldy to be a universal document format. Programmers hate it because it's hard to parse, and users hate it because it's too verbose. It is good as a markup language, but I don't think it's good as a universal document format. If you try to represent a database table in XML you end up with junk. It has so much markup that you can't see the data.
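A small sketch of the database-table complaint, serializing the same two rows as CSV and as element-per-field XML with the Python standard library. The rows, the element names "table" and "row", and the function names are invented for illustration; other XML layouts (for instance attributes instead of child elements) would give different sizes.

```python
import csv
import io
import xml.etree.ElementTree as ET

ROWS = [
    {"id": "1", "name": "Ada", "city": "London"},
    {"id": "2", "name": "Linus", "city": "Helsinki"},
]


def to_csv(rows):
    """Serialize rows as CSV with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_xml(rows):
    """Serialize rows as <table><row><field>value</field>...</row>...</table>."""
    table = ET.Element("table")
    for row in rows:
        record = ET.SubElement(table, "row")
        for key, value in row.items():
            ET.SubElement(record, key).text = value
    return ET.tostring(table, encoding="unicode")
```

In the CSV form the column names appear once, in the header; in this XML form every field name appears twice per row, which is the "so much markup that you can't see the data" effect.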
''I'm neutral on this issue, but is XML designed to be hand-written from scratch? I've only seen it (in real life as opposed to on this wiki) in the context of MacOsx .plist files, and some archived objects on that system. For human-parsing, it may be a bit simpler than sexprs, if only because each field is named. But then, I think the old NeXtStep .plists are easy to read, too, and I'm quite ignorant of Lisp, Scheme, and the like, so... take this for what it's worth. -- JoeOsborn''
The MacOsx .plist files are a perfect example of the unbelievable stupidity of XML. Take the terminal application for example. The background opaqueness can be configured as a floating point number, something like 0.4. But a property that is a list of values, like colors or whatever, is just stupid text, like 0.1 0.9 ... Doh! The XML has failed to capture the list as a real object; as a result, the program which reads the .plist must do its own tokenization to extract the elements.
''Don't confuse bad use of XML with XML. I agree it's braindead to include data that must be further parsed. The list should be composed of items. But they didn't have to do it that way and shouldn't have.'' -- AnonymousDonor
It's impossible not to include data that has to be further parsed, because at the leaf level every XML datum is just an unstructured text string. To extract what you want from that, you are on your own.
''Ahh, but the element name tells you what the element quantity or property is ''in context,'' so you can convert the parsed string into the proper native format without fear. Also, since you've already run the XML data through your validating parser and compared it to the DTD/schema, you know the data is legit.''
Converting a text into a native format has a name: lexical analysis and parsing. But you already ran the data through an XML parser. So in fact, the XML processing software has not done the full job; it only did half of the parsing, leaving still more work to do. Now who or what determines the grammar for this remaining parsing task? And does the DTD validate *that* grammar as well? Can the DTD specify a check that a floating point number is a digit sequence with an optional period, an optional exponent with an E or e followed by an optional sign and digit sequence? Does XML specify the semantics of such a constant?
''(DTDs don't, but XML Schema does (syntax and some semantics (mapping from physical representation (numeral) to logical value (number))). -- DanielBarclay)''
''No. The whole purpose of XML was to transport data from source to sink without distortion. Source and sink still have to agree on the content and format of the actual data itself.''
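The .plist exchange above can be sketched as follows. The element names (color, real) and the values are made up for illustration and are not taken from any actual .plist file.

```python
import xml.etree.ElementTree as ET

# The list is buried in one text node.
FLAT = "<color>0.1 0.9 0.4</color>"

# The list structure is expressed in the markup itself.
STRUCTURED = "<color><real>0.1</real><real>0.9</real><real>0.4</real></color>"


def read_flat(text):
    """The XML parser returns one opaque string; splitting it is a second, ad hoc parse."""
    return [float(token) for token in ET.fromstring(text).text.split()]


def read_structured(text):
    """The parser delivers the list, but each leaf is still text that needs float()."""
    return [float(element.text) for element in ET.fromstring(text)]
```

Both functions return the same three numbers. The structured form removes the tokenizing step but not the final string-to-number conversion, which a plain (non-schema-aware) XML parser leaves to the application in either case.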
----
''XML was designed to be read and written by humans and computers. Configuration files in XML are good examples.'' (ed SkipSailors)
Why do end-users want to read XML files? If a program can read it (and "understand" it), shouldn't the program include features to display it to the end-users? Data exchange formats that allow recursive inclusion of unknown tags like XML are not new. Cryptic data can have its meaning exposed to end-users with software. -- EricHodges (ed SkipSailors)
''But non-end-users need to look at cryptic data for any number of [unpredictable] reasons. Interpretation is possible if data is XML.'' (ed SkipSailors)
One doesn't study much XML data with console LEDs or oscilloscopes. Software translates the raw data to text, and software renders the text on a display. Software can render this text meaningfully. -- EricHodges (ed SkipSailors)
''Raw machine data needs tools for analysis. Network packet sniffing requires a tool for capturing and displaying such data. We're talking instead about conveying data between applications or systems or subsystems where data needs to be converted from one format to another. XML allows human examination of data in transit.'' (ed SkipSailors)
We *are* talking about the same level of data transfer. My original challenge is in that context. There's no good reason for an end-user to read XML files. If data is intended for an end-user, [write software to] present it. If data needs no end-user examination use a software-friendly format. When end-users want to view data intended for software, use the same software to present it in a human-friendly format. Configuration files are a perfect example. They should be read and written by software that obeys the rules for those actions. If the software needs names in the config file that's its business. End-users will only be lulled into a false sense of understanding by them. -- EricHodges (ed SkipSailors)
''.INI files are readable, as evidenced by Windoze hackers. XML is superior to .INI for configuration information.'' (ed SkipSailors)
What's missing here is the existence of good XML editors. XML is at least programmer-readable, and there are good XML editors out there, so you don't have to write the GUI yourself. It's a step up from a text editor for some random text format, particularly when the XML editor can auto-complete your XML tags and highlight errors. When your users want to look at the data in a spreadsheet, CommaSeparatedValues is a good choice. When they want some kind of glorified hierarchical data editor, XML is better. It's not as good as a real GUI, but many data files aren't sufficiently important to be worth the development cost. - BrianSlesinsky
''XML editing sucks less with JayEdit and its XML and Side''''''Kick plug-ins. -- ElizabethWiethoff''
----
'''"XML seems to be too verbose and unwieldy to be a universal document format. Programmers hate it because it's hard to parse, and users hate it because it's too verbose."'''
''"Programmers hate it," "users hate it." Can we have some data to back up that assertion, please? Maybe a link or two to some studies or polling data?''
I'm not sure what you want in terms of data. I've seen studies in terms of file size and parsing speed. These aren't very important when you are just storing documents, but when it comes to something like XML-RPC and SOAP, it may make a big difference. Compare the number of transactions you can do with Corba vs. SOAP. Yes, the other stuff is just my opinion. I think using XML for SVG is silly, because a markup language should mark up text, not vectors, but other people disagree. If you want links I can get some.
''Yes. Please provide links to studies showing that "programmers hate it" and "users hate it." Otherwise such statements amount to rather useless hot air. Sorry.''
[I like to think I'm a programmer, am definitely a user, and I don't hate it. -- SkipSailors]
----
'''There's no good reason for a human to read XML files.'''
''I would argue the reverse. There are only files that matter to humans. Very few of us have the luxury of a custom application to create and manage structured data. So that leaves a human doing it by hand. And XML works well enough for that purpose. There are files that only applications touch and then it doesn't matter. The conclusion then is that all files must be human editable and understandable.''
If the file is for humans-only, what is the benefit of using XML? Why not write it in natural language or outline form?
''Because the file is for humans talking to a program. This doesn't mean that the program creates the file or stores its data in the same file.''
So you're advocating XML as a user interface? Eek. What programs are these that can only accept input in XML format?
''Many programs do not have user interfaces and/or allow substantial configuration for which there is no gui component.''
Why?
''Because they have better things to do than provide a useless gui for doing something that is very easy in a configuration file. You can concentrate on the functionality of the system rather than the interface. Most commercial apps are certainly gui-based, but many in-house apps and tools are configuration-based.''
Any GUI that prevents a human from having to edit XML can't be labeled "useless".
[Sure it can. Time spent making a GUI that serves only as a front-end to a configuration subsystem is time better spent elsewhere.]
This doesn't make it useless, it just makes XML costs higher. The problem with XML for human input is that it grew out of text markup, where the markup noise was small (i.e., a smallish percentage of the total text was markup). See LaTeX for an example where this works beautifully. Using XML for arbitrary data vastly inflates this noise, and makes it a real pain in the ass to work with or read. So you need to create infrastructure to deal with it. To tie in with other parts of this page, of course the same could be done with s-exprs, with less overhead.
''GUIs are cheap. User time and effort is expensive. Forcing users to hand edit XML is an insult.''
GUIs are often very platform dependent. GUIs can be a maintenance sink during rapid change. An incomplete or out-of-date GUI can be worse than useless. GUIs are hard to test. GUIs require a graphical infrastructure, bitmapped display, etc. {In other words: GUIs are not cheap.}
Text editors are effectively ubiquitous. Over the years people have developed effective, efficient ways of editing plain text on everything from serial terminals, character-only-displays and resource-constrained devices to every form of fancy graphical UI.
[So don't use a GUI. But don't use XML either. There are much better alternatives.]
''Please don't discount the simplicity and familiarity of editing marked up text, just because you have access to some GUIs in your situation.''
Editing XML is anything but simple or familiar. If the structure is flat, it's more cumbersome than a simple properties file. If the structure is deep, it's hard to navigate.
''I am NoGreatMind, yet I find XML trivial to use and edit. In a minute, I can create a file and use it in my program. That's pretty good. I don't know what is much better than that. But I am a pragmatist.''
I am a GreatMind and I find XML needlessly complex. Its reason for existing (mixing schema with data) has no value. Schema alone doesn't mean diddly without something to act upon that schema.
''If XML has no value then move on. And yes, there will be something that acts on the data, and that something can read the XML. It's highly unlikely that that something cares deeply about the form. It probably just wants a reasonably easy to use common method. XML suffices. It has no value to you, but it has value to me and others. It is not the best of all worlds, but I hardly care about that.''
This is a page for discussing how much XML sucks [or doesn't]. I came here to lend my voice to that discussion. If you don't want to talk about it, don't. Your statement is an attempt to stop the argument, not to make a point about XML's suckage.
----
This page seems to have been written by someone who is not familiar with schemas and namespaces. DTDs blow chunks and always have.
* It's too complex for what it does.
''Is it? I get the strong impression that you don't know what it does.''
* It's too hard for programs to parse and too verbose and unreadable for humans to write.
''That's what you use a library for. You use the XML library that comes with Java and never worry about parsing text files ever again. Using XML as your data interchange standard means the end of discussions about how data being transmitted between two parties should be encoded. It means never again hunting through code to find the line that -- [expletive deleted, just like the former President] -- detonates when someone puts ''two'' commas at the end of the line with a trailing space after the second.''
No, it means hunting for the line that treats whitespace between elements wrong.
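A minimal Python sketch of the whitespace complaint, using the standard library's xml.etree.ElementTree. The element names "config" and "item" are made up for illustration. The parser does not discard indentation between elements; it hands it to the consuming code as text, and that code has to decide what to do with it.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: the names "config" and "item" are illustrative only.
PRETTY = "<config>\n  <item>a</item>\n  <item>b</item>\n</config>"
COMPACT = "<config><item>a</item><item>b</item></config>"

def whitespace_report(xml_text):
    """Return (text before the first child, list of tails after each child)."""
    root = ET.fromstring(xml_text)
    return root.text, [child.tail for child in root]

# Same data, different whitespace, different values seen by the program.
pretty_text, pretty_tails = whitespace_report(PRETTY)
compact_text, compact_tails = whitespace_report(COMPACT)
```

Code that compares `root.text` against an expected value, or that walks every text node, behaves differently for the two documents even though a human would call them the same.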
If a human isn't going to read it, why use such a verbose format? Why stick the complete element name at the beginning and end of every element? The computer doesn't need all that jibber jabber.
''No, but someone '''is''' going to have to read it sooner or later.''
''The use of schemas ends those misunderstandings about low-level formatting, allowing parties to spend their months profitably arguing about the business-level meaning of the data. A chunk of XML either is valid according to its schema document or it is not. The integration of schemas with URL naming means that schemas can be published in an agreed-on location, and everything just works. A schema can even ''include'' 3rd-party schemas, raising the possibility of storing enterprise-specific data definitions in LDAP and having the various systems share them. Thus, your schema includes the corporate "employee-id" tag, as does mine, and we can be sure that copy-and-pasted employee-ids will drop straight from my document into yours.''
True, but that applies to the use of any schema, not just XML Schema.
''Finally, you use a tool to generate a class library from the schema, then extend those base classes with whatever functionality you need. You don't operate on the DOM: the extended DOM objects operate on themselves and each other. Bingo. An object hierarchy that knows how to load and store itself in a plain text format.''
If both ends are Java then serialization already does that for less bandwidth and lower parsing cost. If they aren't, there are much more efficient formats for transferring the data that don't have the pretense of being human readable.
----
'''The benefits of "everyone is using XML, so we should too" are usually outweighed by the costs of time, training and mistakes involved in understanding it.'''
''So the industry as a whole is facing a learning curve. Big deal. They said the same thing about programming without GOTOs.''
'''Because it's increasingly used for data interchange, it is promoted as a data model. XML is only a data encoding format.'''
''It encodes trees. Trees are a common data model. Yes, it's weak on relational data. Is this the point you were trying to make?''
'''It's a poor copy of EssExpressions.'''
''[Discussion moved to XmlIsaPoorCopyOfEssExpressions]''
----
'''In XML there would be elements with names that make some sense to somebody reading them.'''
''That's the central fallacy of XML: natural language will convey meaning. It's comforting when it works, but can be disastrous when it doesn't. Never rely on it.''
What more can one reply to this nonsense? ''Pay'' someone to make ''sure'' that data is named meaningfully. Appoint someone the corporate XML naming Nazi - or get the DBA to do it - he only sits on his ass 99% of the time anyway (although the 1% does involve 72-hour database recovery shifts, so fair enough).
''You're missing the point. Part of the hype around XML is that an application that was never intended as the target of a message will be able to extract useful information from it because the element names are in the message. Just naming them meaningfully doesn't guarantee that a naive receiver will interpret the data properly. You can name the fields in any schema meaningfully and leave them on the sender and receiver ends without sending them as part of each and every message with just as much re-use and without the illusion that a message can be "understood" on its own.''
[See XmlIsJustDumbText]
'''Quoting from above:'''
''Yes, that is a problem. Just as when binary data is transferred over a network, both ends must agree on the byte order and size of the data. XML can't address that particular issue; that problem is higher up. All XML can do is make certain that the elements are labeled and that their data complies with the DTD/schema.''
What is the benefit of sending the name at the start and end of each element if both ends have already agreed on the format of the message?
''The analogy was a bit strained; XML is a '''transport''' for data. The idea here is to make sure the data gets transported from source to sink without decay. XML is a mechanism for making sure that this is so. However, XML cannot do anything about guaranteeing that the application treats the clean data appropriately. This is the synchronization problem discussed in the pull quote.''
----
''XML won't do weird-ass self-referential structures, but businesses, generally, don't need them. You might like textual data that looks more like math than like text, but XML is very big in that segment of the computing community where paid work gets done.''
XML is very big in that segment of the computing community that doesn't get charged by the byte for messages, the segment that doesn't have to allocate funds for additional servers to handle all those redundant element names.
''That is a total RedHerring. We have already beat the terseness argument into the ground.''
Not a RedHerring. My experience is that CIOs on golf courses love XML and data communications programmers hate it. We may have beaten the terseness argument into the ground but we haven't altered the nature of XML.
''What exactly is the statement you are trying to make?''
----
'''The only gain from reiterating the element name in the closing tag is redundancy.'''
''I used to dislike the redundancy of the closing tags too. In practice I found it useful because it made it quite obvious what wasn't matching both when implementing a parser and by visual inspection. This actually removes a whole class of errors. You shouldn't need a paren-matching editor to figure out why your file is badly formed.''
[Also, with S-expressions, you might end up with more opening than closing parens in a 1000-line file, and no clue where to look for the missing closing paren. This cannot practically happen with XML, since as soon as you forget one closing tag, your document is no longer well-formed and the error easily spotted (that is, unless you have a ...-style document going on).]
You don't have to look for closing parentheses if you have a text editor that does it for you. I am guessing you could write a script for emacs to do this, .. not sure, someone please clarify this for me. Unless using notepad, or some simple lame editor, is worth the effort just so you can brag about it to people. Many times, I lose my place in thought and need my editor to find the parenthesis for me. This should be done by a computer, not a human brain. Take the load off your brain so you can focus on more important things, like more coding rather than tedious work.
The other option, if you still can't find that parenthesis (or if, say, you have so many parentheses that you are still confused, or you made a mistake) is just to make a comment after the parenthesis.. i.e.:
Then do a simple search for the parenth you are trying to find...i.e. search for "close of first" using your text editor's find option. In fact I'm wondering why notepad even has a find function, why not just forget the whole find function, keep notepad simple. Notepad doesn't have a parenthesis matching tool, but it has a find function...how silly, let's remove the find utility in notepad, it's too much. (note sarcasm). -- bozo ''(RealNamesPlease.)''
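The claim earlier in this thread, that a forgotten closing tag makes the document ill-formed and that the parser says so, can be checked with a rough Python sketch. It uses the standard library's ElementTree; the documents and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

def check_well_formed(xml_text):
    """Return None if the text parses, else (line, column) of the first error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return err.position
    return None

# Hypothetical documents; in BROKEN the closing </b> on line 2 was forgotten.
GOOD = "<a>\n  <b>text</b>\n  <c>more</c>\n</a>"
BROKEN = "<a>\n  <b>text\n  <c>more</c>\n</a>"

good_result = check_well_formed(GOOD)
broken_result = check_well_formed(BROKEN)
```

Note that the error is reported where the parser first notices the mismatch, which can be some lines after the place where the tag was actually left out, so the "easily spotted" part still depends on how deep the nesting is.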
----
''If redundant open/close tags are really the biggest gripe, then XML has a solution: define your elements all with attributes and no content/subnodes and the closing tag becomes optional. Using examples from this page, instead of:''
3
abc
''You can define your DTD differently and use:''
''-- StevenNewton''
That only works if your schema is flat. No lists, no trees. A comma delimited list would be more efficient.
''Granted, this is a rather extreme example just to prove a point. However, addressing your other issue, a comma delimited list would be order sensitive. Source and sink would have to agree on the order of all entries, and, in fact, their very presence in the data stream. The XML representation can have any or all of the entries in any order; the parser sorts all this good stuff out.''
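A small Python sketch of the order-sensitivity argument above. The element name "rec" and the attribute names "count" and "label" are invented for the example; they are not taken from the page. Attributes are read back by name, so their order in the file does not matter, while a comma delimited line means whatever the agreed column order says it means.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical record: element and attribute names are made up.
XML_A = '<rec count="3" label="abc"/>'
XML_B = '<rec label="abc" count="3"/>'

def read_xml(xml_text):
    """Attributes come back as a name -> value mapping, whatever their order."""
    return dict(ET.fromstring(xml_text).attrib)

def read_csv(line, field_names):
    """Positional fields: meaning depends entirely on the agreed column order."""
    row = next(csv.reader(io.StringIO(line)))
    return dict(zip(field_names, row))

same = read_xml(XML_A) == read_xml(XML_B)
csv_a = read_csv("3,abc", ["count", "label"])
csv_b = read_csv("abc,3", ["count", "label"])
```

Swapping the two attributes changes nothing; swapping the two CSV fields silently puts "abc" in the count column.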
----
I can't resist chiming in.
LISP is my favorite language. But I'm also an XML fan.
S-expressions are powerful, in some ways more powerful than XML. I have designed quite a few data file formats using S-expression like notation, particularly when using the LISP reader was an option. Have you ever tried doing text processing in LISPy S-expressions? I have - I coded a TeX like language in S-expressions. At first, I actually used LISP, but it was FAR too verbose...far too much quoting was needed. And that is what I think the key difference between S-expressions and XML is: XML is a mainly-text language, where quoting is not required by default, whereas S-expressions are a language where the text is secondary. See the mention of S-expressions on my own wiki page, http://glew.dyndns.org/cgi-bin/wiki.pl?Non-XML_Data_Formats, in addition to the other pages I mention here.
I recognize that XML is verbose. In
http://glew.dyndns.org/cgi-bin/wiki.pl?A_Modest_Proposal_For_Evolving_Lighter_Than_XML_Data_Formats I propose how line oriented data file formats be evolved towards XML and/or S-expressions.
I consider S-expressions and XML almost equivalent. I personally plan to use either, depending on whatever is needed. See http://glew.dyndns.org/cgi-bin/wiki.pl?XML-Like_Data_Formats
By the way, whoever said the following were equivalent
XML: 3abc
S-exp: (3 "abc")
does not know the formal logic theory behind S-expressions: the proper S-expression equivalent is
S-exp: (foo 3 "abc")
Yes, LISP has some implicit primitive datatypes, but if we are going to talk about typing we must be explicit. There are flavours of S-expression based languages that don't, yielding
S-exp: (foo (integer "3") (string "abc"))
or
S-exp: (foo (integer 3) (string abc))
Don't give LISP the advantage of positional parameters if you are not willing to give the same to XML tools.
''This isn't about positional parameters, it's about XML's inability to express one number and one string in any other way than 42foo, while with S-expressions you can simply use 42 "foo". Sure, we could use (number 42) (string "foo"), but that would be completely pointless and redundant; comparable to 42foo.''
''I am willing to give XML "the advantage of positional parameters," whatever that is. Now please show me how XML can use it.''
OK, I will. We did this at my work (just using XML for the already available parser):
...
Rearranging the action tags had an effect on the order in which they were executed.
''Yeeeeah, that's because you didn't identify one action from another. If the actions were supposed to be order sensitive then you needed some way to convey that in the transport. Adding an element or attribute like order="1" would have cured that problem.''
By the way, XML has one significant notational convenience over LISP: the angle bracket tag notation allows the creation of arbitrary numbers of superbrackets. I allude briefly to this in http://glew.dyndns.org/cgi-bin/wiki.pl?SuperBrackets-ProvidingMissingClosingTags. Now, I admit that superbrackets are not legal XML; they are just implied by the tag notation; I don't think superbrackets are legal in any modern LISP, but they were a great convenience when they were present, but I think they have died out mainly because (a) of smart editors, and (b) they weren't scalable with the small set of brackets in the character set. XML's tags have solved the latter problem, and http://glew.dyndns.org/cgi-bin/wiki.pl?SuperBrackets-ProvidingMissingClosingTags describes at least one situation where superbrackets are still nice to have.
----
'''Encourages non-relational data structures'''
''This is like a seller of plums saying any fruit you buy is wrong if it is not a plum. The mapping step from an object to a database is a painful and often useless step. It adds friction when I often just want to save an object and get an object. For applications where all access is through a program there is often no need for a database backend.''
Of course a RelationalWeenie is going to disagree with you. We feel that either you use a database or end up reinventing one the hard way. I will see if I can find a good existing topic on this opinion, as some have been deleted by those tired of the topic. Note there are other comments related to relational structures above. In my opinion, relational is mostly just good OnceAndOnlyOnce of both DatabaseVerbs and data repetition management.
Related: RelationalAlternativeToXml
----
Well, sometimes you end up reinventing one.. but sometimes you end up using a database in a situation where you don't need one. Just look at an INI file. Why would you use a database in that situation? With an INI file, are you reinventing a database? Surely not.
Of course if you DO need a database, and you have a valid situation where you need one, at least use a proper tool (PhpMyAdminSucks).
Sometimes a database costs you more time and bandwidth. Take for example an FAQ page that you need to make. You won't need anyone interacting with the FAQ page.. maybe people interacting with the webpage (say people need to search the FAQ), but not so far as to need a database. So it isn't dynamic enough for you to fire up the heavy bloated database. But you still want to be updating this FAQ frequently yourself, you want it easy to manage, and you want to be able to search the file with less code than when using a database. With a simple text file using {question} and {answer}, you would search for the latter with regexes, using say "eregi" in PHP. Whereas a database is way overboard just to make an easy to manage FAQ page. A database requires that you connect to the database, manage the database, create the database, create fields, write tons of code just to do a simple thing, etc. etc... all for just a simple FAQ page?
Just make a text file like:
: {question 1}
: This is question 1?
: {answer 1}
: This is answer 1.
: {question 2}
: This is question 2?
: {answer 2}
: This is answer 2.
Then search the text file with "eregi" for instances of {question 1} ... and print it accordingly on your webpage. And there you have it. You needed a database for your FAQ, but you made an easier to manage one without the bloat of having to use a database. Sometimes you don't need all the garbage that a database has. Imagine having to create all the tables and fields in a database, then having to access this database (using a hard to get around, slow, annoying web interface like phpmyadmin.. or a proper but still CPU heavy offline database GUI tool), then writing all those lines of code just to connect to and check the database.. just for a plain and simple page like an FAQ. You waste all that time with the database.. You have to create fields and specify the primary key, etc. All you need is a simple database... but there isn't one out there, since databases are meant to be complex. So you use a text file .. and this text file can even be edited easily in a spreadsheet if you really wanted to speed things up.
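For what it's worth, here is a rough sketch of that flat-file FAQ lookup. The post suggests PHP's "eregi"; this sketch swaps in Python's re module instead, and it assumes one marker per line exactly as in the sample above. The function name load_faq is made up.

```python
import re

FAQ_TEXT = """\
{question 1}
This is question 1?
{answer 1}
This is answer 1.
{question 2}
This is question 2?
{answer 2}
This is answer 2.
"""

# A marker line such as "{question 1}", then everything up to the next marker.
ENTRY = re.compile(r"^\{(question|answer) (\d+)\}\n(.*?)(?=^\{|\Z)", re.M | re.S)

def load_faq(text):
    """Return {number: {"question": ..., "answer": ...}} from the flat file."""
    faq = {}
    for kind, number, body in ENTRY.findall(text):
        faq.setdefault(int(number), {})[kind] = body.strip()
    return faq

faq = load_faq(FAQ_TEXT)
```

The sketch breaks if a question or answer has a line that itself starts with "{", which is the kind of edge case a home-grown format has to settle for itself.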
If you wanted to add information to the FAQ, you'd just go into the text file and add more questions and answers.
: {question 3}
: This is question 3?
: {answer 3}
: This is answer 3.
But the above is not XML.. Can someone do it better in XML, or explain the advantages?
''Um, how about''
This is question 3?
This is answer 3
''?''
[[ And how is that easier to access with my keyboard and edit? And how is that magically better? And how is it any faster? And what's so special about what you just wrote? I don't see any magic in all your <> <> <> everywhere, what's so great about that? How is that beneficial over a simple bookmark or a number representing something?
I see a lot of mark up in your example, and a lot of things that get in the way of my cursor, my selector, mouse, keyboard, and most importantly, my eyes get lost in all the markup. I am crowded with information surrounding the text I am trying to get at, rather than it being on the side (like a bookmark, or even just a number)]]
''As for what the advantages of this would be, how about''
* ''not having to create your own parser (just link to whatever XML library is available to you);''
* ''not having to create your own Emacs major mode (just use the XML one); and''
* ''not having to create your own data transformer (just use XSLT)?''
Yes, but all those "<" ">" and "<>" are very hard on my keyboard.. I have to use the arrow key and the mouse to get in between them, select things, etc. for even small jobs (like a simple config file).
''This is a problem with your { and } characters, too. The difference is that Emacs can insert the < and > characters for me.''
[What do you mean? A character tool, A character Map.. etc.? Artificial intelligence? how does it insert this character?]
I might even need to go as far as writing a regular expression just to find what I'm looking for in the XML. It's not laid out like a database is, where you just pinpoint what you need to edit (i.e. an INI file).
''The regexp comment is ironic, because regular expressions is exactly the kind of tool you would need to parse your {question 4} format above; to parse XML, you use an XML processor. As for the INI files = databases thing, I have no idea what you're getting at.''
Speed of XML:
Why is my own hand written parser, or "regex with ini file" way of doing things, so fast compared to xml?
''Perhaps you're not utterly incompetent after all.''
Every xml app I have seen or tried is slow.
''Let me guess: they were all written in Java?''
[I don't know, probably, I am not a fan of Java at all.. This may be way off topic but: XML and Java together would probably give me a heart attack. A virtual machine is a nice idea in a virtual world, but in a real world.. there is something called time and speed. I am typing from a Pentium 1 computer, and Java won't work on a Pentium 1 computer.. so Java is not platform independent (i.e. able to work in a reasonable fashion)]
It almost is so slow I'd rather take the time to write a parser myself. Writing a parser isn't that hard for a simple application.. for a complex one, then fine. Also, not that you have to write a parser yourself.. who says someone hasn't already written a parser for my INI file I'm trying to parse?
''Of course someone has written a parser for INI files. But can INI files store tree-structured data? And how do you transform the data in an INI file to an arbitrary format? If INI files is all you need and you're comfortable with those, by all means, use them. But XML can do a lot more than INI files can.''
Why do I have to look up an XML library, why can't I just find a website that allows me to download parsers (if this isn't available, who says it couldn't be.. we don't HAVE to conform to XML and what they've started).
''I fail to see how "downloading parsers" is so much more convenient than "looking up an XML library".''
Another problem is that you can barely see the text in the XML that you are looking for since there is just so much markup surrounding everything.
''Granted, this is a well-known problem with high-markup-density XML.''
You could just use bookmarks, and ditch the xml all together? Why have the markup inside the actual language, when you could just have the mark ups externally in a bookmark or class?
''What do you mean by "bookmark" and "class"?''
I was being a bit vague when I used those phrases, sorry.
I think the idea is that people fear corruption of the file or something - things like bookmarks or references in a text editor (a class browser) are simple ways of organizing things without embedding any text in the file. Look at RTF. When was the last time someone manually hacked or read their RTF file without proper software to read it? I'm not saying RTF files are a replacement for XML, I am just saying there is no real reason to continue to want to embed markup within the file - keep it separate and let software organize it, like an RTF file or like those 'squares' in a database or cell reader program do.
The funny thing is, I've had so many applications crash that use XML with the following error message: 'While reading XML blah blah is corrupt'. One example is Trillian.. not sure what other software, but I have seen it numerous times.
----
It's an example of XproductMarketing. It sounds really cool, so people assume it must be good.
----
XML doesn't scale well. The redundancy in the tags has, in my experience, produced up to a factor of 5 of bloat. Now take a nice 20MB flat file, add XML and ship it. And while you can compress, the bloat is still a factor of 5 or more, making it useless for large scale data transfer. Only good for small data packets.
''"XML doesn't scale well." Hah, hah. Yeah, that's good. Please tell this to the SQL people who use multi-hundred megabyte .sql files to replicate their data. XML is actually ''smaller'' to reproduce a lot of those databases.''
Sounds like a case of a stupid programmer or DBA using a hammer when they should be using a screwdriver. XML however *forces* the bloat on you. In the case you cite lack of skill causes the user to *choose* bloat. I still find XML horribly wasteful of memory and bandwidth.
''Under what conditions?''
Any conditions. Bloat is bloat and I find it offensive. In my case I am using multi GB databases and I cannot overlook the bloat. The examples above are toys in comparison to what I do, but they are still bloated. It has been suggested by some that XML is part of a conspiracy by memory (including disk) and network gear manufacturers to force you to upgrade. And while I am at it, DOM parsers are dumb, dumb, dumb. The ones I have seen have processed the entire document not only by slurping it into memory but by parsing it recursively. Once we sent an XML file to a client who tried to open it using a DOM parser, and imagine their surprise when they locked up their system. We switched to flat files shortly thereafter. To be fair SAX parsers do stream processing, and so avoid this problem. But it is still inefficient.
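To make the DOM versus SAX distinction concrete, here is a minimal Python sketch of stream processing with the standard library's xml.sax. The handler sees one event at a time and never builds a tree of the whole document. The element names "rows" and "row" and the class name RowCounter are hypothetical.

```python
import io
import xml.sax

class RowCounter(xml.sax.ContentHandler):
    """Counts elements of one name as they stream past; keeps no tree."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.count = 0

    def startElement(self, name, attrs):
        if name == self.wanted:
            self.count += 1

def count_rows(stream, wanted="row"):
    """Parse a file-like object incrementally and count matching elements."""
    handler = RowCounter(wanted)
    xml.sax.parse(stream, handler)
    return handler.count

# Hypothetical data: a document made of 1000 small "row" elements.
SAMPLE = b"<rows>" + b"<row>x</row>" * 1000 + b"</rows>"
total = count_rows(io.BytesIO(SAMPLE))
```

Memory use stays roughly flat as the input grows, which addresses the lock-up described above; it does nothing about the per-byte cost of the tags themselves.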
----
To be fair SAX parsers do stream processing, and so avoid this problem. But it is still inefficient.
Why would they be inefficient? Read text. Figure out what to call. Call handler shit.
What do you do that is more efficient?
''slurp line from flat file, split on delimiters, stuff into array, do stuff. no muss, no fuss no stacks. as light weight as it gets.''
Except when you want to know what data are in each delimiter or handling nested data.
You also need to encode commas.
''Both protocols have some encoding problems. With comma-delimited, generally you put quotes around strings with commas in them. As far as knowing the positions of data, see RelationalAlternativeToXml. Further, what some call "nested" is simply lack of normalization.''
''Notice I did NOT say commas, I said delimiters. Use what works (tabs, semi-colons, unprintable ascii characters). Also in the past I have done things like :: which can be used like '''''first_name:string:Wolfgang''''' solving the above problem. SO instead of giving a downstream user a DTD you send them an equivalent map defining the structure of the data. Same result, less bloat. I suspect (though have not proven) that this is logically identical to XML, just fewer tags.''
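A sketch of the delimited approach described above, in Python. It assumes each field looks like the first_name:string:Wolfgang example, that fields are tab separated, and that there is one record per line. The type names "integer" and "float", the field "age", and the function names are assumptions added for illustration; the post only shows "string".

```python
# Hypothetical type names; the post itself only shows "string".
CONVERTERS = {"string": str, "integer": int, "float": float}

def parse_field(field):
    """Split 'name:type:value' into (name, converted value)."""
    # maxsplit=2 so the value part may itself contain ':' characters.
    name, type_name, raw = field.split(":", 2)
    return name, CONVERTERS[type_name](raw)

def parse_record(line, delimiter="\t"):
    """One record per line, fields separated by the chosen delimiter."""
    return dict(parse_field(f) for f in line.rstrip("\n").split(delimiter))

record = parse_record("first_name:string:Wolfgang\tage:integer:42\n")
```

This is the "slurp, split, stuff into array" loop with names and types carried per field. It still needs a rule for values that contain the delimiter, which is the encoding problem raised above.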
The comma point I agree with.. why use commas as a delimiter? Use commas as the delimiter only if the text you are working with does not contain commas.
There could even be a dedicated "CSV" key on the keyboard, specifically for delimiting text. ''(The AsciiCode already has these separator characters; see TabDelimitedTables.)'' Who says we can't change the keyboard standard? I feel far too much time is wasted during coding "pretending" something is something (pretending comma is a delimiter, when in fact there should be a dedicated delimiter key available). People that heavily use CommaSeparatedValues could optionally go to the store and buy a keyboard with the "Separator" keys on it. Grandma and pop who never touch or see CSV, or know what CSV is, could stick to using their old keyboard.
''There actually are 4 dedicated ASCII control characters intended for delimiting. I have never, ever, seen them used. Still, that would require some keyboard mangling to get the imaginary characters, but the functionality was planned decades ago. -- MartinZarate''
It is a hardware issue sometimes, or a "current standard" issue. But at some point, there must be a change when we constantly need something and it isn't there (a dedicated delimit character - what would be wrong with one?). Far too often software programmers spend a lot of their time pretending by faking by substituting.
One could argue that there are already enough keys and combinations on the keyboard.. so we just don't need another key. The problem is, there is a point in which hardware should be upgraded.. you can't have 6 alt keys and 2 regular keys, for example, and be efficient when typing English.
Generally speaking, I think there should be a dedicated delimit character. ''(There are four of them; see TabDelimitedTables.)'' A keyboard with a delimit key on it is not required for people who just rarely use CommaSeparatedValues, but still use it: just the character should be available to them.
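The four ASCII separators mentioned in this thread are FS (0x1C), GS (0x1D), RS (0x1E) and US (0x1F). A small Python sketch of using two of them as field and record delimiters follows; the function names dump and load are made up.

```python
# ASCII control characters reserved for delimiting data.
FS, GS, RS, US = "\x1c", "\x1d", "\x1e", "\x1f"  # file, group, record, unit

def dump(rows):
    """Join fields with US and records with RS. No quoting or escaping is
    needed as long as the data itself contains neither control character."""
    return RS.join(US.join(fields) for fields in rows)

def load(blob):
    """Inverse of dump: split into records, then into fields."""
    return [record.split(US) for record in blob.split(RS)]

# Commas, quotes and tabs in the data need no special treatment.
rows = [["Smith, John", 'said "hi"'], ["tab\there", "plain"]]
round_trip = load(dump(rows))
```

The catch raised above still applies: there is no key for these characters and most editors display them poorly, so the format is easy for programs and awkward for people.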
----
-rw-r--r-- 1 root root 1376386 Oct 31 14:01 /usr/lib/libxml2.a
-rwxr-xr-x 1 root root 1056400 Oct 31 14:01 /usr/lib/libxml2.so.2.6.15
Built with "optimize for space". I rest my case.
----
Whiners. If you don't like it then don't use it. It's that simple.
The advantages of XML are not in its size, efficient form, etc., but rather in its cross platform format. That is what XML was designed for. So that multiple platforms/products could exchange data in a format that was understood by all. It's that plain and simple.
''But there are alleged better alternative standards for even just data exchange.''
''I think cross platform format is a bit moot when Microsoft can have a proprietary XML format and lock out users. You may as well use EBCDIC. Face it, the party is over ''
* I don't know of any way that a text file format can prevent deliberately-obfuscated (even encrypted) schemas from being used to intentionally hinder interoperability. This state of affairs is an indictment of MicroSoft, not XML.
----
It seems all of the linux applications I am using do not use XML setup files. The configuration files and data files that GnuLinux uses are very easy to read. It seems all the files I need to adjust are easy to adjust since there is no XML in the way. I am happy that GnuLinux has easy to read and configure files all over the place and has not adopted Xml.
----
'''References:'''
"The Case Against XML": http://www.krisandsusanna.com/Documents/XML.rtf
http://www.clearsilver.net/ ''[Found this the other day; a good alternative, pragmatic and in wide use. Don't know why I missed it so long...]''
----
Why do we have such things called "datafeeds" on the net, which transfer data via XML? Sure, I can see datafeeds as extremely useful if they were only in data format. But people keep saying XML isn't for data (and I agree). They are really missing something here (people on the net). Why not CSV feeds, or SQL feeds? Or ''real'' data feeds. I know the current short answer would be: because not everyone uses SQL, and CSV has some limitations. That doesn't make XML the answer for data transport!
People won't agree on a common database file format, with more options and versatility than CSV? I suggest a new standard that allows comments along with structured data (versatile comments, but not forced markup). CSV is pure data with no comments, and sometimes data doesn't need comments. The column headers are enough. Sometimes it does need extra comments, and CSV doesn't offer that. So don't stop right there at CSV and just give up. I don't like overly marked up data, when all I want to do is parse someone's inventory or someone's web data (which is obviously square in format and doesn't need much markup or description. But when it does need an additional description or comment, then fine. But don't force markup on me).
During transport of data, you don't need to transport the comments.. with XML, you do. With a data format that is designed for data, comments can be held in a separate column or file. In XML, the whole markup is transported together with the data, and then has to be parsed. Not efficient - even if computers are getting faster with more wattage (but electricity is getting more expensive).
Yeah, um, actually efficiency isn't the point, as has been stated about umpty-fourteen times previously. Not only that, but you can have additive XML descriptions for the same entity, like this:
S''''''omeValue
S''''''omeOtherValue
When the day is ended the attributes can total up to a single entity. If you don't want to use some particular feature then don't bother describing it.
----
'''Source file diffing problems'''
Yeah, you can't diff the XML source round trip, that's for sure. However, you can always diff the parsed tree, so that's what you should do. Eh? Because, after all, you aren't interested in the XML representation of the data, but the data itself. Diff before XML encoding and after XML decoding. Problem solved.
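One rough way to do that comparison in Python is to canonicalize both documents before comparing them. This is a stand-in for diffing the parsed tree, not the same thing. It uses xml.etree.ElementTree.canonicalize from the standard library (Python 3.8 or later); the sample documents and the function name same_data are hypothetical.

```python
from xml.etree.ElementTree import canonicalize  # available from Python 3.8

def same_data(xml_a, xml_b):
    """Compare two documents after C14N canonicalization.

    strip_text=True also drops leading/trailing whitespace in text content,
    so this assumes such whitespace carries no meaning in the data.
    """
    return canonicalize(xml_a, strip_text=True) == canonicalize(xml_b, strip_text=True)

# Hypothetical round trip: attribute order, quoting and indentation changed.
BEFORE = '<rec b="2" a="1"><item>x</item></rec>'
AFTER = "<rec a='1'  b='2'>\n  <item>x</item>\n</rec>"
# Hypothetical real change: the value of attribute b differs.
CHANGED = '<rec a="1" b="3"><item>x</item></rec>'

unchanged = same_data(BEFORE, AFTER)
modified = same_data(BEFORE, CHANGED)
```

Running a line-based diff tool over the two canonical strings, rather than over the original files, gives a diff that ignores the formatting churn listed in the complaint.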
----
In MathMl, the situation is still poorer than in the rest of the XML world. People are misusing MathML in rather alarming ways. All the hype around the new "cool technology" is forcing people to download and install special fonts and plugins, use browsers with MathML native support, declare "special" mime types and change the rest of the document from HTML to XHTML (for using the XHTML 1.1 + MathML 2.0 DTD).
All cool and promising at the beginning. In the end, the final MathML code served on the Internet is ultraverbose (on the order of 10-20 times more than similar code in other languages), not accessible and even wrong. For example, Distler is using MathML in his "advanced" blog for encoding [2s ds] whereas visually rendering the square of ds "thanks" to MathML and the rest of XML hype.
After obtaining specially designed fonts and plugins, and developing DTDs, browser layers, four specifications, and special tools for editing code, we end up observing that an '''old piece of HTML code''' ds2 '''is more correct, more accessible, and effective''' in practice!
XML hype really sucks, with things such as XSL-FO, Schemas, namespaces, or MathML being a complete disaster for most tasks.
----
The SubVersion project removed some XML in its 1.4 release:
http://subversion.tigris.org/svn_1.4_releasenotes.html
The release notes cite a smarter client, better performance, lower disk usage, and fewer bugs:
"Working copy performance improvements (client)
The way in which the Subversion client manages your working copy has undergone radical changes. The .svn/entries file is no longer XML, and the client has become smarter about the way it manages and stores property metadata.
As a result, there are substantial performance improvements. The new working copy format allows the client to more quickly search a working copy, detect file modifications, manage property metadata, and deal with large files. The overall disk footprint is smaller as well, with fewer inodes being used. Additionally, a number of long standing bugs related to merging and copying have been fixed."
----
FUSDX format is an alternative to XML and CSV for data dumps that I am working on
http://z505.com/cgi-bin/qkcont/qkcont.cgi?p=FUSDX-Standard
Further discussion of FUSDX at RelationalAlternativeToXml.
----
'''XML is verbose and brings nothing new'''
As stated above, human readability is no advantage once the XML file gets big. In fact, XML brings hardly anything new. ASN.1 (AbstractSyntaxNotationOne) and XDR (ExternalDataRepresentation) have the same hierarchical structure but are binary and compact; ASN.1 is very compact indeed if the Packed Encoding Rules (PER) are used.
The fact that few editors will show the matching tag makes XML hard to edit.
Especially for machine-to-machine communication, a human-readable format is ridiculous.
''This stuff has all been addressed above. Conciseness is not an issue when XML is applied where it makes sense. Easy.''
''XML is no harder to edit than any source code. Easy.''
''If the application is strictly M2M and does not require any human intervention at all (such as a proven server-to-server exchange) then simply don't use XML. Easy.''
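To put a rough number on the size argument, here is a small sketch that encodes the same three records as XML and as a fixed-width binary layout. The binary side uses Python's struct module with big-endian fields; it is neither real XDR nor ASN.1 PER, only a stand-in in the same spirit, and the record layout (a sensor id and a reading) is invented for the example.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical data: (sensor id, reading) pairs.
readings = [(1, 20.5), (2, 21.0), (3, 19.75)]

def encode_xml(rows):
    """Encode the rows as a small XML document (returned as bytes)."""
    root = ET.Element("readings")
    for sensor_id, value in rows:
        ET.SubElement(root, "reading", id=str(sensor_id), value=str(value))
    return ET.tostring(root)

def encode_binary(rows):
    """Encode a row count, then one (uint32, float64) pair per row, big-endian."""
    out = struct.pack(">I", len(rows))
    for sensor_id, value in rows:
        out += struct.pack(">Id", sensor_id, value)
    return out

xml_bytes = encode_xml(readings)
bin_bytes = encode_binary(readings)
print(len(xml_bytes), "bytes as XML")
print(len(bin_bytes), "bytes as fixed-width binary")  # 4 + 3 * 12 = 40
```

The binary form is smaller, but it is only readable by a program that already knows the layout, which is exactly the trade-off the replies above are pointing at.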
----
XML is like violence, if it doesn't solve the problem, just use more. --"itior" (bash.org #546813)
----
Hi, just a simple question. I'm not a programmer, but I have the misfortune of having to read XML as text far too often, because so many manufacturers use it as the output format for system controller logs. I am a SAN engineer and find the lack of readily available parsers very frustrating. So come on, chaps: someone out there is coding these systems to output log files in XML; who's going to code the application that makes that output actually useful?
''Load the files in your Web browser. The popular browsers recognise XML and present it in a readable format.''
{They may present a click-able tree, but that may not be sufficient to spot or extract the info you are looking for. That's one thing I like about table-oriented approaches: it's easier to "re-project" the info in a way that you wish to see it.}
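A sketch of the kind of "re-projection" being asked for: flatten each log event into a row, filter, and print only the columns of interest. The log layout below (event elements with time and severity attributes and port and message children) is entirely made up; a real controller log would need its own tag names substituted.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical controller log; real vendors will use different tags.
LOG = """<log>
  <event time="2007-01-01T10:00:00" severity="warning"><port>3</port><message>link down</message></event>
  <event time="2007-01-01T10:00:05" severity="info"><port>3</port><message>link up</message></event>
</log>"""

def events_to_rows(xml_text):
    """Turn every <event> into a flat dict of its attributes and child elements."""
    rows = []
    for event in ET.fromstring(xml_text).iter("event"):
        row = dict(event.attrib)
        for child in event:
            row[child.tag] = (child.text or "").strip()
        rows.append(row)
    return rows

def rows_to_csv(rows, columns):
    """Project the rows onto the chosen columns and return CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

rows = events_to_rows(LOG)
warnings = [r for r in rows if r["severity"] == "warning"]
print(rows_to_csv(warnings, ["time", "port", "message"]))
```

Once the events are rows, sorting, filtering and loading them into a spreadsheet or database are the usual table operations.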
----
It's possible that the DomainSpecificLanguage (or D.S. markup) is the problem, and XML '''gets blamed for bad DSL designs'''. It's possible to muck up the alternatives just as well. Bad designers can fuck up anything you give them. Often using the tool right is more important than using the right tool.
----
See: TheRealStrengthOfXml, XmlAbuse, XmlIsaPoorCopyOfEssExpressions, XmlIsTheHtmlOfTheFuture, RelationalAlternativeToXml
CategoryRant, CategorySucks, CategoryXml