Web Server Programming

Chapter 9
XML

Standards and Applications

The home for XML is the W3C site (W3.org), which has a subsection summarizing current activity. Google has an XML page; most of its links are grouped in subsections such as applications, programming, and XSL. The XML.com site, part of the O'Reilly group, has a variety of articles, tutorials, and technical material.

Much of the work on inventing domain-specific standards (XML DTDs or Schemas) is organized by OASIS. This is an industry-sponsored consortium with subcommittees working in numerous areas. Examples include vehicle repair (documents with details on vehicle emissions and pollution control mechanisms), legal court filings ("develop specifications for the use of XML to create legal documents and to transmit legal documents from an attorney, party or self-represented litigant to a court, from a court to an attorney, party or self-represented litigant or to another court, and from an attorney or other user to another attorney or other user of legal documents"), and tax ("analyze personal and business tax reporting and compliance information, represented in XML, to facilitate interoperability in a way that is open, flexible and international in scope"). Other news from industry is presented at CoverPages.org.

Technical applications of XML data exchange include astronomical data, chemistry, mathematics, VoiceXML (a markup language associated with synthetic speech generators; a development kit is supposedly available from IBM), and SMIL (an XML-based language for defining interactive multimedia presentations).

XML is used in areas quite different from the original "portable data" exchange applications. The SOAP protocols for remote procedure calls tunneling over HTTP use XML documents to define requests and responses (see Userland's intro). Ant's use of XML documents to define program build procedures illustrates another quite unexpected application.

The major players each have their own take on XML and how it integrates into their world models:

XSLT stylesheets

Often, it will be convenient to obtain data in the form of an XML document and then extract and format subsets of the data using XSLT. Some databases can return data directly in the form of XML documents; in other cases, one may use Java "beans" that obtain collections of data and have a toXML() method that returns the data as a long String encoding an XML document (build it in a buffer; don't use repeated String concatenation!). Once you have your XML text, you need an XSLT script to transform it.
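As an illustration of the bean approach just described, here is a minimal sketch of a toXML() method that assembles its document in a buffer. The class name TeamBean, the league/team/points element names, and the escape() helper are all hypothetical, chosen for this sketch; they are not taken from the book's soccer example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical bean: holds (team, points) pairs and renders them as XML.
public class TeamBean {
    private final List<String[]> results = new ArrayList<>();

    public void addResult(String team, int points) {
        results.add(new String[] { team, Integer.toString(points) });
    }

    // Replace the characters that are not allowed to appear literally
    // in XML text or in a double-quoted attribute value.
    public static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Build the whole document in one buffer instead of creating a new
    // String object for every "+" inside the loop.
    public String toXML() {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\"?>\n");
        xml.append("<league>\n");
        for (String[] r : results) {
            xml.append("  <team name=\"").append(escape(r[0])).append("\">");
            xml.append("<points>").append(r[1]).append("</points>");
            xml.append("</team>\n");
        }
        xml.append("</league>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        TeamBean bean = new TeamBean();
        bean.addResult("Rovers & United", 12);
        System.out.print(bean.toXML());
    }
}
```

Escaping matters as soon as the data come from a database rather than from compile-time constants: a single unescaped ampersand makes the document ill-formed and the XSLT step will reject it.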

Naturally, the W3C site is the official home for XSL; the site has news, links to the specifications, some reference and example data, and links to software. Tutorials include:

The soccer example, with an XML file and an XSL stylesheet for IE-5 or above (Netscape-6 or above) is available.

Of course you need an engine that runs the XSL transforms. The usual one for Java is Xalan from the Apache XML project.
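For orientation, the following sketch shows roughly how a transform is run from Java through the standard javax.xml.transform API. It assumes the XSLT processor bundled with the JDK (or Xalan on the classpath, which plugs in behind the same factory). The class name XsltDemo, the stylesheet, and the league/team/points data layout are made up for this sketch and are not the book's soccer files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltDemo {
    // Hypothetical stylesheet: emits "name=points;" for every team, as plain text.
    public static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:for-each select=\"league/team\">"
        + "<xsl:value-of select=\"@name\"/>=<xsl:value-of select=\"points\"/>;"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Apply the stylesheet text to the XML text and return the result.
    public static String transform(String xml, String xsl) {
        try {
            // The factory picks whichever XSLT engine is configured.
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Unchecked wrapper keeps the sketch short; real code would report the error.
            throw new IllegalArgumentException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<league>"
            + "<team name=\"Rovers\"><points>12</points></team>"
            + "<team name=\"United\"><points>9</points></team>"
            + "</league>";
        System.out.println(transform(xml, STYLESHEET));
    }
}
```

In a JSP or servlet the StreamResult would normally wrap the response writer, and the XML source would be the String returned by a bean rather than a literal.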

The same soccer example is also available as a JSP that uses a style sheet to format XML data obtained from a bean. The example is packaged as a zip file containing a .war archive (once again, the extra zip wrapping is to avoid problems with browsers that want to treat .war files as text). The example is very simple: the JSP instantiates a bean that has a method that returns data as an XML string. The data in the bean are defined at compile time. (The example is packaged with an older version of Xalan. See the Apache Xalan site for details of the new archive structures, which redistribute class files amongst archives.)


WAP & WML

The phone companies' share prices are still depressed; WAP and WML still haven't hit the big time, and those costly purchases of radio bandwidth are not proving to be good investments. But maybe it will all eventuate as hoped for. The new mobile phones are starting to gain market share, selling to those who are rich and need the comfort of constant visual and aural contact with their peer group. Once the phone base is there, applications may follow, finally resulting in significant use of WML.

The WebReference site has a large collection of links to tutorials, references, and sites such as Nokia for development kits (links deep into sites like Nokia fairly soon get broken; in Q2-2003 that link did lead to a page on the Nokia development kit). There are a number of tutorials on WML available on the net including:

The "Open Mobile Alliance" is the consortium promoting WAP and related technologies. They have a number of sites including a main business page and a WAP Forum site. Another site with general information for developers is Wireless Developer Network. The Internet.com site is All Net Devices; this has news, tutorials, FAQs, etc.


Parsers

XML parsers (SAX and DOM styles) are available for most languages. The Apache XML sub-project has Xerces parsers for C++, Java, and Perl. Your PHP download should include its own versions of these parsers.

There are others, for example Crimson; this was Sun's main parser library, and is also available at Apache ("hibernating code"). The JAXP library really only adds an extra layer that tries to make the program using a parser slightly less dependent on the specific parser instantiated: the parser is created via a factory object that is controlled via properties, rather than by code that explicitly instantiates a parser defined by a specific implementation.
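The factory indirection described above looks roughly like this in code. This is a sketch using the standard javax.xml.parsers API; the class name JaxpDemo and the sample document are invented for illustration. Note that no Xerces or Crimson class is named anywhere: the implementation is chosen at run time, for instance through the javax.xml.parsers.DocumentBuilderFactory system property, with the JDK's built-in parser as the fallback.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class JaxpDemo {
    // Parse XML text into a DOM tree without naming a parser implementation.
    public static Document parse(String xml) {
        try {
            // newInstance() consults configuration properties to pick the parser.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Unchecked wrapper keeps the sketch short.
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<catalog><book/><book/></catalog>");
        System.out.println(doc.getDocumentElement().getTagName());       // catalog
        System.out.println(doc.getElementsByTagName("book").getLength()); // 2
    }
}
```

Swapping parsers then becomes a deployment decision (a different jar on the classpath or a different property value) rather than a code change.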

The Sun site includes an introduction to the various Java APIs for XML parsing. There are tutorials on Java and XML at Visual Builder, 'Fortune City', Sun (part of the WebServices tutorial), and O'Reilly's XML site.

This is the little example from the text, with a SAX parser used to find price data on books; it includes the data file and a copy of xerces.jar.
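For readers without the download, the general shape of such a SAX program is sketched below. This is not the code from the text: the class name BookPrices, the handler, and the assumed data layout (book elements containing a price element) are hypothetical, and the parser is obtained through JAXP rather than directly from xerces.jar.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class BookPrices {
    // SAX delivers events as the parser reads; the handler keeps only what it needs.
    static class PriceHandler extends DefaultHandler {
        final List<Double> prices = new ArrayList<>();
        private StringBuilder text; // non-null only while inside a <price> element

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            if ("price".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one element, so accumulate.
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("price".equals(qName)) {
                prices.add(Double.valueOf(text.toString().trim()));
                text = null;
            }
        }
    }

    // Return every price found in the document, in document order.
    public static List<Double> findPrices(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            PriceHandler handler = new PriceHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.prices;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse book data", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<catalog>"
            + "<book><title>First</title><price>10.50</price></book>"
            + "<book><title>Second</title><price>7</price></book>"
            + "</catalog>";
        System.out.println(findPrices(xml));
    }
}
```

Because the handler never builds a tree, memory use stays flat however large the data file is, which is the usual reason for preferring SAX over DOM for simple extraction tasks like this one.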