XML achieves its flexibility by allowing you to extend a base markup language (the XML specification itself) with tags of your own design. You create tags to structure the text within a document so that its underlying meaning is clearly presented. For example, to denote an item on an invoice, you could use an <ITEM> tag.
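As a rough illustration of inventing your own tags, the sketch below builds a tiny invoice fragment with Python's standard xml.etree.ElementTree module. The <INVOICE> wrapper and the item text "Widget" are hypothetical, chosen only for this example; only <ITEM> comes from the discussion above.

```python
import xml.etree.ElementTree as ET

# INVOICE is a hypothetical wrapper tag; ITEM is the tag suggested in the text.
invoice = ET.Element("INVOICE")
item = ET.SubElement(invoice, "ITEM")
item.text = "Widget"  # hypothetical item description

# Serialize the element tree to a string of XML markup.
xml_text = ET.tostring(invoice, encoding="unicode")
print(xml_text)  # prints <INVOICE><ITEM>Widget</ITEM></INVOICE>
```

Nothing in XML itself gives <ITEM> a meaning; the tag name simply labels the text it encloses so that a program or a reader can find it.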
While XML and HTML documents look a lot alike, there are several important syntactical differences. HTML is fairly flexible. You can omit end tags from many of an HTML document's most important structures, such as list items, and most browsers will happily display the document as best they can. XML documents, however, must meet a more rigid set of requirements:
A document should begin with an XML declaration, a line that identifies it as XML and states the version of the XML specification with which it complies. Since XML is a brand-new standard, this line is currently <?xml version="1.0"?>.[2]
[2] The line can also include additional metadata that I've omitted for purposes of simplicity.
Tags are case sensitive. For example, <INVOICE_NUMBER> and <invoice_number> are not the same. In general, the convention is to always use uppercase.
All attribute values must appear in quotes, as in <CUSTOMER CUST_ID="12345">.
A start tag must always have a corresponding end tag. The combination of a start tag (plus any attributes), an end tag, and any intervening text is called an element.
Elements cannot overlap. For example, the following markup is illegal: <INVOICE_ITEM><PART_NUM>PN-1234</INVOICE_ITEM></PART_NUM>.
"Empty" tags that don't mark up any text (like HTML's <p> or <br>) must have corresponding end tags. For example, if you want to use a <PAID_IN_FULL> tag to indicate that an invoice has been paid, you must end with a </PAID_IN_FULL> tag, even though there is no text in between. XML also has an alternative notation for empty tags that lets you simply append a "/" to the end of the start tag (for example, <PAID_IN_FULL/>).
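Two of the rules above, case sensitivity and the two notations for empty tags, can be seen in how a parser treats the markup. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the <INVOICE> wrapper and the values 1 and 2 are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# The long and short notations for an empty element parse to the same thing.
long_form = ET.fromstring("<PAID_IN_FULL></PAID_IN_FULL>")
short_form = ET.fromstring("<PAID_IN_FULL/>")
same_empty = ET.tostring(long_form) == ET.tostring(short_form)

# Tag names are case sensitive, so these two children are different elements.
# INVOICE is a hypothetical wrapper tag used only for this sketch.
doc = ET.fromstring(
    "<INVOICE>"
    "<INVOICE_NUMBER>1</INVOICE_NUMBER>"
    "<invoice_number>2</invoice_number>"
    "</INVOICE>"
)
upper = doc.findall("INVOICE_NUMBER")
lower = doc.findall("invoice_number")

print(same_empty)                    # prints True
print(upper[0].text, lower[0].text)  # prints 1 2
```

A search for INVOICE_NUMBER never returns the lowercase element, which is why mixing cases in one document tends to cause hard-to-spot errors.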
A document that follows all these rules is called well-formed, which means that it is syntactically correct. Even more so than with HTML, XML requires a precise syntax to make sure the documents follow a predictable structure. Fortunately, there are several commercially available tools that help you create well-formed XML documents. Figure 9.1 shows Vervet Logic's XML Pro (http://www.vervet.com).
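If you have no dedicated editor at hand, any XML parser can serve as a quick well-formedness check, since a conforming parser must reject markup that breaks the syntax rules. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample labels are hypothetical, while the markup samples are taken from the rules above.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Each sample exercises one of the rules listed earlier.
samples = {
    "with declaration": '<?xml version="1.0"?><PAID_IN_FULL/>',
    "quoted attribute": '<CUSTOMER CUST_ID="12345"></CUSTOMER>',
    "unquoted attribute": '<CUSTOMER CUST_ID=12345></CUSTOMER>',
    "missing end tag": '<PAID_IN_FULL>',
    "case mismatch": '<INVOICE_NUMBER>1</invoice_number>',
    "overlapping elements":
        '<INVOICE_ITEM><PART_NUM>PN-1234</INVOICE_ITEM></PART_NUM>',
}

results = {name: is_well_formed(xml) for name, xml in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'not well-formed'}")
```

Only the first two samples pass. Note that this checks syntax alone: the parser has no opinion on stylistic conventions such as using uppercase tag names.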
In the next section, we'll look at how you can define strict rules that the tags in your documents must follow.
Copyright (c) 2000 O'Reilly & Associates. All rights reserved.