Ned Batchelder
Elements vs. attributes
December 2004
The classic XML design question came up the other day: whether to use elements or attributes. During the discussion I became somewhat heated, for a few reasons:
First, we weren't debating whether to create a new design using elements or attributes; we were talking about changing an existing attribute-based design to an element-based one. To my mind, the reasons for changing an existing system had better be pretty good.
Second, I had designed the system in question, and I thought the attribute decision was a sound one. The values were all simple datatypes, had no significant order, and could each appear only once. In a case like this, attributes are perfectly reasonable, and they let you avoid the overhead of end tags.
Third, I sensed ill-reason, or worse, dogma, approaching.
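To make the end-tag overhead from the second point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The record and its field names (id, name, price) are invented for illustration and are not from the system under discussion. It serializes the same data once as attributes and once as child elements, then compares the lengths.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: these field names are made up for the example.
fields = {"id": "42", "name": "widget", "price": "9.99"}

# Attribute form: every field is an attribute on a single element.
as_attrs = ET.Element("item", fields)

# Element form: every field is a child element with text content,
# so each one pays for an opening tag and an end tag.
as_elems = ET.Element("item")
for key, value in fields.items():
    child = ET.SubElement(as_elems, key)
    child.text = value

attr_xml = ET.tostring(as_attrs, encoding="unicode")
elem_xml = ET.tostring(as_elems, encoding="unicode")

print(attr_xml)
print(elem_xml)
print(len(attr_xml), "characters as attributes,", len(elem_xml), "as elements")
```

Both forms carry the same three values. The element form is longer because each field name is written twice, once in the start tag and once in the end tag.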
The X12 Reference Model for XML Design was offered as guidance for the decision. This is a long and detailed document that describes many things I don't understand. Like many documents of its ilk, it generalizes concepts and terms to the point that I no longer know what they refer to.
But section 7.2.5 (Elements vs. Attributes) did apply, and for the most part it consists of a clear explanation of the pros and cons of elements and attributes. Here it is (used without permission, mea culpa):
All is well and good. These are the pros and cons based on the XML semantics of elements and attributes. But then it continues with this recommendation:
What?! How does that relate to all the pros and cons? This recommendation is a commonly repeated mantra about elements and attributes, and it is nearly meaningless. What if I have "metadata" that has order significance, or needs to be repeated, or is itself structured? And what do they mean by "metadata" anyway? One man's data is another man's metadata. It's impossible to separate the two without specifying the audience for the information.
In the HTML world, there's a handy rule of thumb: element content gets put on the screen, and attributes do not. That's basically the metadata rule, but it only works in this case because HTML has a very clear consumer (a browser) with a very clear processing model (render the HTML for display). Most other XML dialects don't have such clear processing models. The particular case I'm dealing with is data served by an API, with dozens of potential consumers, doing a dozen different things with the data. The metadata rule is useless to me.
I say: Use attributes unless you truly need elements. You need elements for a thing if the thing can be repeated, or is itself structured, or has semantics based on its order among its peers.
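As a rough illustration of that rule, here is a small Python sketch. The build function and the sample order data are hypothetical, invented for this example rather than taken from the API in question. It assumes that a Python dict stands for a structured thing and a list stands for a repeated or ordered thing; everything else is treated as a simple value and becomes an attribute.

```python
import xml.etree.ElementTree as ET

def build(tag, data):
    """Build an element from a dict: simple values become attributes,
    dicts and lists become child elements."""
    elem = ET.Element(tag)
    for key, value in data.items():
        if isinstance(value, dict):
            # Structured: it needs an element of its own.
            elem.append(build(key, value))
        elif isinstance(value, list):
            # Repeated, possibly order-significant: one child element per item.
            for item in value:
                if isinstance(item, dict):
                    elem.append(build(key, item))
                else:
                    child = ET.SubElement(elem, key)
                    child.text = str(item)
        else:
            # Simple, single, order-less: an attribute is enough.
            elem.set(key, str(value))
    return elem

# Hypothetical sample data.
order = {
    "id": 1001,
    "status": "shipped",
    "address": {"city": "Boston", "zip": "02134"},
    "line": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

root = build("order", order)
print(ET.tostring(root, encoding="unicode"))
```

Here id and status end up as attributes on the order element, address becomes a child element because it is structured, and line becomes two child elements because it repeats. Within each child, the simple values are attributes again.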