What is XML?

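Well, a simple answer is that XML stands for Extensible Markup Language. Satisfied? Maybe you have a few other questions: Why isn't it called 'EML' then? Is 'markup' really a word? The 'X' is simply borrowed from the second letter of 'Extensible'. And yes, 'markup' is a word: it originally described the instructions editors wrote on a manuscript for the typesetter, and it now describes the tags added to a text to explain its structure.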

You might also be wondering what XML actually is. To understand XML we have to think a little about how software works. Essentially, software programs, although functionally sophisticated, are "as dumb as slugs" (as Derek Holzer would say). That means they can only understand information if it is delivered according to very strict rules. If the information is not delivered in exactly the form a program understands, the program will crash, make a mess, or do nothing at all. So it is important to find ways to pass information to software using these strict rules.

XML is designed exactly for this purpose. XML is a set of rules governing how a text document should be written so that software can understand it.
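
To see how strict these rules are in practice, here is a minimal sketch in Python using the xml.etree.ElementTree module from the standard library. The helper name is_well_formed and the two tiny sample documents are made up for this illustration. A document that follows the rules is accepted, while one with a tag that is never closed is rejected outright.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text follows the XML rules, False otherwise.

    is_well_formed is a hypothetical helper written for this example.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser refuses anything that breaks the rules.
        return False


# Two made-up sample documents.
good = "<NOTE><TO>Jane</TO></NOTE>"
bad = "<NOTE><TO>Jane</NOTE>"  # the <TO> element is never closed

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

The parser does not try to guess what the second document meant. It simply reports an error, which is the "dumb as slugs" behaviour described above.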

What does XML Look Like?

Interestingly, XML has been designed in a very flexible way so that it can be used to describe almost any kind of information. Essentially, you can define your own categorisation of information within an XML document. For example, let's imagine that you have an address book program. This program might store all the contact details in a text file that follows the XML rule set (normally just referred to as 'an XML file').

So there might be the contact details for two people in the file, and for each person we have the name and phone number. An XML document representing the data of one person might look like this:

<CONTACT NAME="Jane Hudson">
    <NAME>
        <FAMILY>Hudson</FAMILY>
        <FIRST>Jane</FIRST>
    </NAME>
    <TELEPHONE>
        <AREACODE>+21</AREACODE>
        <NUMBER>29210101</NUMBER>
    </TELEPHONE>
</CONTACT>

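An XML document must have exactly one outermost element, so to hold two people the address book file needs one element that wraps all the contacts. Here it is called <ADDRESSBOOK>, a name chosen just for this example. The file might look like this:

<ADDRESSBOOK>
    <CONTACT NAME="Jane Hudson">
        <NAME>
            <FAMILY>Hudson</FAMILY>
            <FIRST>Jane</FIRST>
        </NAME>
        <TELEPHONE>
            <AREACODE>+21</AREACODE>
            <NUMBER>29210101</NUMBER>
        </TELEPHONE>
    </CONTACT>
    <CONTACT NAME="Robert Hull">
        <NAME>
            <FAMILY>Hull</FAMILY>
            <FIRST>Robert</FIRST>
        </NAME>
        <TELEPHONE>
            <AREACODE>+67</AREACODE>
            <NUMBER>8128282</NUMBER>
        </TELEPHONE>
    </CONTACT>
</ADDRESSBOOK>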

So, as you can see, it is a very structured way of representing information. The categories are defined by items known as elements. So '<NAME>' is referred to as the 'name element', and each element may have as many 'sub-elements' as you wish. For example, the <NAME> element in the above example contains two other elements, <FAMILY> and <FIRST>. It could contain more if there were a need (for example <MIDDLE> for recording a middle name). The opening tag of an element can also carry extra labels known as attributes, such as NAME="Jane Hudson" on the <CONTACT> element.
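
As a rough illustration of how a program might read this structure, the following Python sketch walks through the contacts and picks out their sub-elements. It assumes the contacts are wrapped in a single enclosing element, here called ADDRESSBOOK, because an XML document needs exactly one outermost element. That name, the function read_contacts and the dictionary keys are all invented for this example.

```python
import xml.etree.ElementTree as ET

# The two contacts from the text, wrapped in an assumed ADDRESSBOOK element.
ADDRESS_BOOK = """
<ADDRESSBOOK>
    <CONTACT NAME="Jane Hudson">
        <NAME>
            <FAMILY>Hudson</FAMILY>
            <FIRST>Jane</FIRST>
        </NAME>
        <TELEPHONE>
            <AREACODE>+21</AREACODE>
            <NUMBER>29210101</NUMBER>
        </TELEPHONE>
    </CONTACT>
    <CONTACT NAME="Robert Hull">
        <NAME>
            <FAMILY>Hull</FAMILY>
            <FIRST>Robert</FIRST>
        </NAME>
        <TELEPHONE>
            <AREACODE>+67</AREACODE>
            <NUMBER>8128282</NUMBER>
        </TELEPHONE>
    </CONTACT>
</ADDRESSBOOK>
"""


def read_contacts(xml_text):
    """Return one dictionary per CONTACT element (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    contacts = []
    for contact in root.findall("CONTACT"):
        area = contact.findtext("TELEPHONE/AREACODE")
        number = contact.findtext("TELEPHONE/NUMBER")
        contacts.append({
            "label": contact.get("NAME"),               # the NAME attribute
            "first": contact.findtext("NAME/FIRST"),    # sub-element of NAME
            "family": contact.findtext("NAME/FAMILY"),  # sub-element of NAME
            "phone": area + " " + number,
        })
    return contacts


for person in read_contacts(ADDRESS_BOOK):
    print(person["first"], person["family"], person["phone"])
```

Paths such as "NAME/FIRST" mirror the nesting in the file: the FIRST element is found inside the NAME element, which in turn sits inside a CONTACT.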

What are Common Uses of XML?

XML defines many types of documents on the internet. For example, if you click on a link to subscribe to a podcast, you are actually clicking on a link to what is known as an RSS file. This RSS file can be read by most media players (e.g. iTunes or Rhythmbox). Although the media player downloads the audio (the podcast) for you to play, it is actually the RSS file that you subscribe to. An RSS file is a specific type of XML file, one with a specific set of categories (elements) already defined. The rules used to create the RSS file are specific to RSS but follow the 'meta' rules of XML. In the case of a podcast, the RSS file tells your media player the name of each audio file, as defined by the 'title element' (<title>), and there are other elements for storing a description, information about where the audio can be downloaded, and so on.

RSS files are also used to manage 'news feeds', 'blog feeds' and other subscription content online.
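
To give a feel for what a media player does with such a file, here is a small Python sketch that reads a podcast feed. The feed itself is made up (the podcast name, the episode and the example.org address are placeholders), but the element names rss, channel, item, title, description and enclosure are the ones RSS 2.0 defines. The function list_episodes is a hypothetical helper, not part of any media player.

```python
import xml.etree.ElementTree as ET

# A made-up, minimal podcast feed used only for illustration.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <description>A made-up first episode.</description>
      <enclosure url="http://example.org/episode1.mp3"
                 length="1234567" type="audio/mpeg"/>
    </item>
  </channel>
</rss>"""


def list_episodes(feed_text):
    """Return the title, description and audio address of each item."""
    channel = ET.fromstring(feed_text).find("channel")
    episodes = []
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")
        episodes.append({
            "title": item.findtext("title"),
            "description": item.findtext("description"),
            # The enclosure element says where the audio can be downloaded.
            "audio_url": enclosure.get("url") if enclosure is not None else None,
        })
    return episodes


for episode in list_episodes(FEED):
    print(episode["title"], "->", episode["audio_url"])
```

A real feed would normally be downloaded from the web first and would contain many items, but the reading step is the same.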

That isn't the full extent of XML's impact on the WWW. Its effect is far, far greater. HTML is another type of file; it is understood by web browsers and describes the content and look of webpages. Most webpages are HTML pages, and there is a variety of HTML that follows the XML rules. This variety of HTML is known as XHTML and is very commonly used.
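
Because XHTML follows the XML rules, an ordinary XML parser can read it. The Python sketch below assumes a tiny made-up page. The only fixed detail is the address http://www.w3.org/1999/xhtml, which is the namespace XHTML elements belong to. The last lines show that a habit tolerated in plain HTML, leaving a <br> tag unclosed, is rejected by the XML rules.

```python
import xml.etree.ElementTree as ET

# A made-up XHTML page used only for illustration.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>My Page</title></head>
  <body>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</html>"""

# XHTML elements live in this namespace; "x" is just a local shorthand.
NS = {"x": "http://www.w3.org/1999/xhtml"}

root = ET.fromstring(PAGE)
title = root.findtext("x:head/x:title", namespaces=NS)
paragraphs = [p.text for p in root.findall("x:body/x:p", NS)]

print(title)
print(paragraphs)

# Plain HTML habits such as an unclosed <br> break the XML rules.
try:
    ET.fromstring("<p>line one<br>line two</p>")
    html_style_ok = True
except ET.ParseError:
    html_style_ok = False

print(html_style_ok)
```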

It is impossible to know how big the world of XML is, because many programs you won't have even heard of use it as a document storage format, and many more use it for exchanging data with other programs.