XML Examples -- How XML Works
Here are some XML samples which illustrate its simplicity and flexibility, followed by
a brief description of how XML works.
Here's a sample describing two people, with their first and last names (shown as
fragments; a complete XML document would enclose them in a single root element):
<person>
<firstName>Jane</firstName>
<lastName>Smith</lastName>
</person>
<person>
<firstName>Evelyn</firstName>
<lastName>Doe</lastName>
</person>
Now let's add their eye colors and their height in meters:
<person>
<firstName>Jane</firstName>
<lastName>Smith</lastName>
<eyes>brown</eyes>
<ht>1.5</ht>
</person>
<person>
<firstName>Evelyn</firstName>
<lastName>Doe</lastName>
<eyes>blue</eyes>
<ht>1.5234782</ht>
</person>
Now suppose Jane Smith has two pet dogs:
<person>
<firstName>Jane</firstName>
<lastName>Smith</lastName>
<eyes>brown</eyes>
<ht>1.5</ht>
<pet>
<petType>dog</petType>
<petName>Snoopy</petName>
<petBreed>Golden Retriever</petBreed>
</pet>
<pet>
<petType>dog</petType>
<petName>Fido</petName>
<petBreed>German Shepherd</petBreed>
</pet>
</person>
<person>
<firstName>Evelyn</firstName>
<lastName>Doe</lastName>
<eyes>blue</eyes>
<ht>1.5234782</ht>
</person>
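As a rough illustration of how a program might read this data, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The <people> wrapper element is an assumption added for the example (the parser needs a single root element), and summarize is a hypothetical helper name, not part of any XML tool.

```python
import xml.etree.ElementTree as ET

# The sample from above, wrapped in a hypothetical <people> root element
# because a well-formed XML document needs exactly one root.
SAMPLE = """
<people>
  <person>
    <firstName>Jane</firstName>
    <lastName>Smith</lastName>
    <eyes>brown</eyes>
    <ht>1.5</ht>
    <pet>
      <petType>dog</petType>
      <petName>Snoopy</petName>
      <petBreed>Golden Retriever</petBreed>
    </pet>
    <pet>
      <petType>dog</petType>
      <petName>Fido</petName>
      <petBreed>German Shepherd</petBreed>
    </pet>
  </person>
  <person>
    <firstName>Evelyn</firstName>
    <lastName>Doe</lastName>
    <eyes>blue</eyes>
    <ht>1.5234782</ht>
  </person>
</people>
"""


def summarize(xml_text):
    """Return one dict per person: full name, height in meters, pet names."""
    root = ET.fromstring(xml_text.strip())
    people = []
    for person in root.findall("person"):
        ht = person.findtext("ht")  # None when the optional element is absent
        people.append({
            "name": person.findtext("firstName") + " " + person.findtext("lastName"),
            "height_m": float(ht) if ht is not None else None,
            # findall returns an empty list when a person has no pet elements
            "pets": [pet.findtext("petName") for pet in person.findall("pet")],
        })
    return people


if __name__ == "__main__":
    for entry in summarize(SAMPLE):
        print(entry["name"], entry["height_m"], entry["pets"])
```

Because pet may occur any number of times, the sketch collects pet names into a list, which is simply empty for Evelyn.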
How XML works:
- An XML file is a standard text file, just like an HTML file.
- Also as with HTML, whitespace between elements (tabs, carriage returns and linefeeds,
and extra spaces) is generally ignored by programs which use XML files, although the
parser itself passes it along and whitespace inside a data value is preserved. You can
add whitespace between elements to make your XML documents easier to read, as little
or as much as desired.
- Data is contained in elements, for example: <firstName>Evelyn</firstName>
- The actual data value -- Evelyn in this example -- is contained inside
the container tags -- <firstName> and </firstName>.
Here firstName is the element name (or tagname), <firstName> is the opening
tag, and </firstName> is the closing tag.
- Elements can contain other elements, or they can contain a data value, or they can
contain both. For example, the person element contains the firstName,
lastName, eyes, ht and pet elements.
The firstName element contains a data value. Note how the person
element may contain zero or more pet elements.
Evelyn has no pets, but Jane has two.
- Element names are created by the designer of the XML file. In assigning element
names, there's a tradeoff between making them easy to understand for people, and
minimizing the size of the file. Generally, for small files, or for elements which
occur infrequently in the file, it's a good idea to use descriptive, easily-understood
tags. For elements which occur many times in large files, it's worthwhile to create
brief tagnames.
- Element names are case-sensitive. A widely used convention for creating element
names is as follows: use lower-case throughout, except for the first letter of the
second and subsequent words in the element name, which is capitalized. For
example: person and firstName.
- The elements allowable in an XML file, and how they relate to each other, are described
in a Document Type Definition (DTD) file. For example, the DTD file for
the above XML file would look like this:
<!ELEMENT person (
firstName,
lastName,
eyes?,
ht?,
pet*
)>
<!ELEMENT firstName (#PCDATA)>
<!ELEMENT lastName (#PCDATA)>
<!ELEMENT eyes (#PCDATA)>
<!ELEMENT ht (#PCDATA)>
<!ELEMENT pet (
petType,
petName,
petBreed?
)>
<!ELEMENT petType (#PCDATA)>
<!ELEMENT petName (#PCDATA)>
<!ELEMENT petBreed (#PCDATA)>
This tells you that:
- Each person element must contain a firstName and a lastName.
- The question marks after eyes and ht (as in eyes?
and ht?) mean that each person may contain one eyes
element and/or one ht element, but is not required to. In this
case, for example, to describe a person you need to know their first name and their last
name, but you may or may not know their eye color or their height.
- The asterisk after the pet element (as in pet*) means
that each person can have zero or more pets. (If a person were required to have at
least one pet, and could have any number, the element name would be followed by a plus
sign, as in pet+.)
- The firstName, lastName, eyes, ht,
petType, petName, and petBreed elements
contain data values.
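- Data values for specific elements may be alphanumeric, numeric,
or one of a set of enumerated allowable values. If alphanumeric,
they may be of any length. If numeric, they may contain a decimal point and, if so,
any number of digits to the right of the decimal point. Note that the DTD declares all
of these simply as #PCDATA (character data), so such restrictions are part of the
file's design rather than something the DTD itself enforces.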
- There is a wide variety of off-the-shelf software tools available for working with XML
datafiles. Some of the best are shareware or freeware:
- Microsoft's Internet Explorer web browser understands XML, and can parse XML datafiles
and transform the data they contain into formatted reports.
- XML editors use the DTD file for a particular XML file to make it a snap to create,
modify or display XML data.
- Internet Explorer and other programs which can parse XML files make it
easy to programmatically extract the data contained in an XML file.
- Some parsing programs can validate the data contained in an XML file to ensure that it
is consistent with the structure defined in the associated DTD file. These validating
parsers make it easy to check your XML datafiles as you create them.
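To make the idea of validation more concrete, here is a minimal Python sketch that checks person elements against the content models for person and pet described in the DTD. It is not a real validating parser: the content models are translated by hand into regular expressions, the sketch does not check text between child elements, and the names CONTENT_MODELS and is_valid are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models (an assumption of this sketch; a real
# validating parser reads the DTD itself). Each child element name is
# followed by a space so the child sequence can be matched as one string.
CONTENT_MODELS = {
    "person": re.compile(r"firstName lastName (eyes )?(ht )?(pet )*"),
    "pet": re.compile(r"petType petName (petBreed )?"),
}


def is_valid(element):
    """Check an element and its descendants against CONTENT_MODELS."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        # Treated as (#PCDATA): a data value only, no child elements.
        return len(element) == 0
    children = "".join(child.tag + " " for child in element)
    if not model.fullmatch(children):
        return False
    return all(is_valid(child) for child in element)


if __name__ == "__main__":
    good = ET.fromstring(
        "<person><firstName>Evelyn</firstName>"
        "<lastName>Doe</lastName><eyes>blue</eyes></person>"
    )
    # Same elements, but firstName and lastName are in the wrong order.
    bad = ET.fromstring(
        "<person><lastName>Doe</lastName>"
        "<firstName>Evelyn</firstName></person>"
    )
    print(is_valid(good))  # True
    print(is_valid(bad))   # False
```

The second document is rejected because the commas in the DTD's person declaration require the child elements to appear in the listed order.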
For more information, see the section on XML Documentation.