A WebDeveloper.com Feature
XML: A Look At Some Real Data
Part 2
One problem with creating arbitrary tags is how to express the relationships between elements. In HTML, these relationships are defined explicitly by the language's basic object model: every HTML document shares the same hierarchy of objects, even when the markup doesn't state them explicitly. That structure looks basically like this:
<HTML>
<HEAD>
<TITLE>Some Title</TITLE>
</HEAD>
<BODY>
document contents
</BODY>
</HTML>
The top-level HTML element contains the entire document and identifies it as a Web page. Inside it, the HEAD element holds header information such as the TITLE, and the BODY element holds the visible portion of the document. The relationships between these elements are described in a template called a Document Type Definition (DTD).
XML vocabularies can also have DTDs. The CDF vocabulary used in this sample application has one that describes the relationships among CHANNELs, AUTHORs, ABSTRACTs and ITEMs. It also describes which attributes these elements take, such as the HREF attribute that lets CHANNELs and ITEMs point to specific Web pages.
But XML, unlike SGML, doesn't require a DTD to describe the relationships between elements. Instead, the hierarchy of arbitrarily defined elements in an XML document is implied by how the elements are nested inside one another.
For instance, if I were to create a FOO element that contains a BAR element, I could simply nest BAR inside FOO like this:
<FOO>
<BAR/>
</FOO>
The relationship--the object model--of this XML is implied by the fact that BAR is inside FOO and is therefore its child element.
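As a rough illustration of that point, the sketch below feeds the FOO/BAR fragment to a parser with no DTD at all. It assumes Python and its standard-library xml.etree.ElementTree module, neither of which is part of the original sample application; any XML parser should behave similarly.

```python
import xml.etree.ElementTree as ET

# The FOO/BAR fragment from the text; no DTD is supplied anywhere.
doc = "<FOO><BAR/></FOO>"

root = ET.fromstring(doc)

# The parser recovers the hierarchy purely from how the tags are nested.
children = [child.tag for child in root]
print(root.tag, "contains", children)  # FOO contains ['BAR']
```

Nothing told the parser in advance what FOO or BAR mean; the parent and child roles come entirely from the nesting.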
You'll also notice a slight difference from HTML in how the BAR element is closed. HTML tags that contain no data, such as IMG, are never closed. This makes parsers much harder to write, because they have to know about each of these exceptions to the convention that every tag closes explicitly.
Obviously, this can't work if elements can be defined arbitrarily, since a parser has no way to know in advance which tags are meant to stand alone. So stand-alone (called "empty") elements are closed within the tag itself by placing a slash before the closing bracket. In the CDF tag AUTHOR, for example, the data is carried in the VALUE attribute. Since the tag doesn't enclose any data, it ends with a slash before the closing bracket.
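To make the empty-tag syntax concrete, here is a small sketch, again assuming Python's standard xml.etree.ElementTree module. The AUTHOR fragment is hypothetical: the tag and attribute names come from the text, but the author name is a placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical CDF-style fragment; the author name is a placeholder.
author = ET.fromstring('<AUTHOR VALUE="Jane Doe"/>')

# The data lives in the VALUE attribute, not between an open and close tag.
print(author.get("VALUE"))  # Jane Doe
print(author.text)          # None, because the element encloses nothing

# The self-closing form and an explicit open/close pair parse to the same thing.
short_form = ET.fromstring("<BAR/>")
long_form = ET.fromstring("<BAR></BAR>")
print(ET.tostring(short_form) == ET.tostring(long_form))  # True
```

The trailing slash is only shorthand: to a parser, an empty element written either way is the same element.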
XML documents that can be parsed without a DTD are called "well-formed" because they follow all of the proper syntax rules, such as closing every tag and nesting elements correctly.
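One way to see what well-formedness means in practice is to hand a parser both a correct fragment and an HTML-style fragment with an unclosed tag. The sketch below assumes Python's standard xml.etree.ElementTree module, which checks syntax only and does not validate against a DTD; the helper name is_well_formed is made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if text parses as XML; no DTD is consulted."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<FOO><BAR/></FOO>"))  # True
print(is_well_formed("<FOO><BAR></FOO>"))   # False: BAR is never closed
```

The second fragment is the kind of markup an HTML browser would tolerate, but an XML parser rejects it outright.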
Finally, you may have noticed that the CDF file is not really a document as we generally think of one. Instead of containing text, it contains information about text and where that text can be found. This concept is called "meta data," which literally means "beyond data" but in practice means information about information. Using XML for meta data is one of the most powerful applications of this technology on the Web, and it's a topic I'll explore in the next XML File.
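A short sketch may help show what "information about information" looks like to a program. It assumes Python's standard xml.etree.ElementTree module and uses a hypothetical CDF-style fragment: the element and attribute names are the ones mentioned in this article, but the nesting, URLs and descriptions are placeholders rather than the real sample file.

```python
import xml.etree.ElementTree as ET

# Hypothetical CDF-style fragment: element names come from the article,
# but the structure, URLs and text are placeholders, not the real file.
cdf = """
<CHANNEL HREF="http://example.com/">
  <ABSTRACT>Placeholder channel description</ABSTRACT>
  <AUTHOR VALUE="Jane Doe"/>
  <ITEM HREF="http://example.com/one.html">
    <ABSTRACT>First placeholder page</ABSTRACT>
  </ITEM>
  <ITEM HREF="http://example.com/two.html">
    <ABSTRACT>Second placeholder page</ABSTRACT>
  </ITEM>
</CHANNEL>
"""

channel = ET.fromstring(cdf)

# Each ITEM says where a page lives (HREF) and what it is about (ABSTRACT).
items = [
    (item.get("HREF"), item.findtext("ABSTRACT"))
    for item in channel.findall("ITEM")
]
for href, abstract in items:
    print(href, "-", abstract)
```

None of the pages themselves appear in the file; the program learns only where they are and what they are about.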
Last modified:
Friday, 22-Aug-2008 13:46:48 EDT