''Note: This is not a "pattern" as much as it is a "refactoring". I've used the common form for pattern descriptions because I'm not sure how to lay out a refactoring description.'' ---- '''Problem:''' An XmlLanguage (or similar) document format is too hard to process. '''Context:''' Even though each datum is well-isolated (see IsolateEachDatum), the data is still unwieldy. Cumbersome and unnatural parsing must be performed and non-obvious conventions must be applied in order to make heads or tails of the data structure. In particular, there are flat lists that contain data with implicit internal structure. '''Solution:''' Make sure there are enclosures for all groups of logically connected data. However, avoid multiple layers of unnecessary wrapping; once your data is well-structured, leave it at that. The deeper you make it from that point on, the more annoying it will be to work with -- at little to no benefit. It may appear as though adding structure would just mean adding bloat/noise/weight or whatever. Nevertheless, if in doubt, go with more structure. It is easy to transform well-structured data to poorly-structured data; going the other way is harder. If still in doubt, maybe you'll have to take other factors into account, such as expected amount of hand-editing and the effect on that, and whether or not size is absolutely critical (if you're using XML, now is a good time to reconsider). '''Resulting Context:''' This is primarily from an XML point of view, but most of this applies to, e.g., EssExpressions as well: * An XSLT stylesheet can now straightforwardly match all elements of a certain type and do whatever it needs to with those and their children. This is the way XSLT was designed to work. With poorly-enclosed data, the stylesheet was required to essentially do some nasty tree/list parsing to figure out where the data it needed was located. * Writing readable and modular schemas to validate the data is now much easier. * The data is much easier to read, since quickly scanning the left edge of the markup now reveals its overall structure in a glance. '''Known Uses:''' This problem is very common and often occurs when people inexperienced with processing data is responsible for designing data formats. It is relevant to XML as well as S-expressions, although it's more common in the XML world -- if only because the XML world is less mature. '''Author:''' DanielBrockman 2004-01-20 ---- We start out with an abstract example that quickly get to the core of the issue. EssExpression examples are in italics. ---- '''Polygon example''' ''(See IsolateEachDatum for beginning of story.)'' Bad -- hard to process ''(does not group related data)'': 1 6 3 5 4 6 ''(polygon 1 6 3 5 4 6)'' Good -- easily processable ''(provides good structure by grouping related data)'': ''(polygon (1 6) (3 5) (4 6))'' '''Dictionary example''' Bad -- hard to process ''(has implicit structure in an element list)'': ROFL rolling on the floor laughing LMAO laughing my arms off ''(dictionary "ROFL" "rolling on the floor laughing" "LMAO" "laughing my arms off")'' Good -- easy to process ''(has just the right amount of structure)'': ROFL rolling on the floor laughing LMAO laughing my arms off ''(dictionary'' '' ("ROFL" "rolling on the floor laughing")'' '' ("LMAO" "laughing my ass off"))'' Also good (effectively equivalent): rolling on the floor laughing laughing my arms off ''(no direct S-expression equivalent)'' ''Wrong. Here it is:'' ''(dictionary'' '' (entry :term "ROFL" "rolling on the floor laughing")'' '' (entry :term "LMAO" "laughing my ass off"))'' Right, but I think that one doesn't really make sense. First, what purpose do the seemingly dead-weight copies of "entry" serve? Second, why use ":term x" instead of the more obvious "(term x)"? Third, why mention "term" at all? It seems all those things are just there to create a literal translation of a piece of non-ideal XML, and that's kind of silly. I mean, in that case, you might as well emulate the end tags as well, yielding ''(dictionary'' '' (entry :term "ROFL" "rolling on the floor laughing" /entry)'' '' (entry :term "LMAO" "laughing my ass off" /entry) /dictionary)'' Mind you, I certainly didn't intend this to be yet another XML vs. S-expressions page; rather, I wanted to show how to refactor XML, and while at it, also roughly corresponding S-expressions. However, I can see that the "no S-expression equivalent" looks kind of provocative. Maybe we should instead stick the alternative XML solutions right next to each other and show the single sane S-expression solution below them? Thanks for your input. ---- Real-world examples follow. ---- '''Apple's new plist format''' ''(See IsolateEachDatum for the beginning of the story.)'' Bad -- hard to process ''(uses an ungodly mix of both too much and too little structure)'': A''''''nimateSnapToGrid E''''''mptyTrashProgressWindowLocation F''''''ileViewer.L''''''astWindowLocation Good: A''''''nimateSnapToGrid E''''''mptyTrashProgressWindowLocation F''''''ileViewer.L''''''astWindowLocation Also good (but note that this variant would not have been an option had I chosen to go with structured keys): ''How about:'' '''''' ''Here, the boolean validation is according to type, type being 'boolean' element (meaning a property may contain 'boolean' element and not 'true' or 'false' element).'' Yes, you're absolutely right. I like this better too. In your version, ElementNamesAreTypeNames, which strikes me as highly intuitive. ------ See also: GroupRelatedInformation ---- CategoryXml, CategoryRefactoring, CategoryInfoPackaging