XSL Transformation With Xalan

Take your Java/XML skills to the next level by converting your XML into other formats with the very powerful Xalan XSLT engine.

The Next Step

The past couple of weeks have been fairly hectic. I've finally got my arms around Java-based XML parsing with the Xerces parser, and have successfully figured out the basics of SAX and DOM programming with Xerces. Hey, I even managed to do some simple XML-to-HTML conversion by adding a little JSP to the mix.

In this article, I'm going to take things further, exploring yet another addition to the Java/XML family. It's called Xalan, and it's an XSLT transformation engine that should substantially simplify the task of converting, or "transforming", XML documents into other formats. I'll be looking at the basics of the Xalan engine, demonstrating how to write Java applications that can be used to convert XML data into regular ASCII text files and browser-readable HTML. And, of course, I'll come up with a number of silly examples to amuse myself (and hopefully, you).

Sounds interesting? Flip the page for more.

The Introductions

Xalan is an XSL engine developed by the people behind the Apache Web server. Consequently, it's fairly feature-rich and specification-compliant, with the latest version, version 2.3, coming with support for XSL 1.0, XPath 1.0 and the Java API for XML Parsing (JAXP).

Xalan can be configured to work with any XML parser that is compliant to the Java API for XML Parsing (JAXP) - I'll be using Xerces here - and can also be run as a standalone program from the command line, or within a servlet for XML-HTML transformation.

With the introductions out of the way, let's put together the tools you'll need to get started with Xalan. Here's a quick list of the software you'll need:

  1. The Java Development Kit (JDK), available from the Sun Microsystems Web site (http://java.sun.com)

  2. The Apache Web server, available from the Apache Software Foundation's Web site (http://httpd.apache.org)

  3. The Tomcat Application Server, available from the Apache Software Foundation's Web site (http://httpd.apache.org)

  4. The Xerces parser, available from the Apache XML Project's Web site (http://xml.apache.org)

  5. The Xalan XSLT processor, available from the Apache XML Project's Web site (http://xml.apache.org)

  6. The mod_jk extension for Apache-Tomcat communication, available from the Jakarta Project's Web site (http://httpd.apache.org)

Installation instructions for all these packages are available in their respective source archives. In case you get stuck, you might want to look at http://www.devshed.com/Server_Side/Java/JSPDev, or at the Tomcat User Guide at http://jakarta.apache.org/tomcat/tomcat-3.3-doc/tomcat-ug.html

I'm assuming here that you're familiar with XML and XSLT, and know the basics of node selection with XPath and template creation with XSLT. In case you're not, you aren't going to get much joy from this article. Flip to the end, get an education via the links included there, and then come right back for some code.

Meeting The World's Greatest Detective

With everything installed and configured, it's time to get your hands dirty. First up, a simple example, just to get you comfortable with how Xalan works. Consider the following XML file, snipped out from my XML-encoded address book:

<?xml version="1.0"?>
<me>
   <name>Sherlock Holmes</name>
   <title> World's Greatest Detective</title>
   <address>221B Baker Street, London, England</address>
   <tel>123 6789</tel>
   <email>sherlock@holmes.domain.com</email>
   <url>http://www.method_and_madness.com/</url>
</me>

Now, let's suppose I wanted to convert this snippet into the following plaintext file:

Contact information for "Sherlock Holmes, World's Greatest Detective"
Mailing address: 221B Baker Street, London, England
Phone: 123 6789
Email address: sherlock@holmes.domain.com
Web site URL:  http://www.method_and_madness.com/

Here's the XSLT stylesheet to get from point A to point B:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:template match="/" >
Contact information for "<xsl:value-of select="normalize-space(me/name), <xsl:value-of select="normalize-space(me/title)" />"
Mailing address: <xsl:value-of select="normalize-space(me/address)" />
Phone: <xsl:value-of select="normalize-space(me/tel)" />
Email address: <xsl:value-of select="normalize-space(me/email)" />
Web site URL: <xsl:value-of select="normalize-space(me/url)" />
</xsl:template>
</xsl:stylesheet>

Now, we've got the XML and the XSLT. All we need is something to marry the two together.

// import required classes
import javax.xml.transform.*;
import javax.xml.transform.stream.*;
import java.io.*;

public class addressBookConverter {

   // store the names of the files
   public static String xmlFile, xslFile, resultFile = "";

   // constructor
   public addressBookConverter(String xmlFile, String xslFile, String resultFile) {
      try {

         // create an instance of the TransformerFactory
         // this allows the developer to use an API that is independent of
         // a particular XML processor implementation.
         TransformerFactory tFactory = TransformerFactory.newInstance();

         // create a transformer which takes the name of the stylesheet
         // as an input parameter.
         // this creates a Templates object which is applied to the XML file
         // in the next step
         Transformer transformer = tFactory.newTransformer(new StreamSource(xslFile));

         // transform the given XML file with the Templates object
         transformer.transform(new StreamSource(xmlFile), new StreamResult(resultFile));

         System.out.println("Done!");
        } catch (TransformerException e) {
         System.err.println("The following error occured: " + e);
        }
   }

   // everything starts here
   public static void main (String[] args) {
      if(args.length != 3) {
               System.err.println("Please specify three parameters:\n1. The name and path to the XML file.\n2. The name and path to the XSL file.\n3. The name of the output file.");
              return;
       }

      // assign the parameters passed as input parameters to
      // the variables defined above
      xmlFile = args[0];
      xslFile = args[1];
      resultFile = args[2];

      addressBookConverter myFirstExample = new addressBookConverter (xmlFile, xslFile, resultFile);
    }

}

Now compile the class:

$ javac addressBookConverter.java

Assuming that all goes well, you should now have a class file named "addressBookConverter.class". Copy this class file to your Java CLASSPATH, and then execute it.

$ java addressBookConverter

Ummm...Houston, we have a problem. Here's what you should see (unless you're really intelligent and spotted me setting you up):

Please specify three parameters:
1. The name and path to the XML file.
2. The name and path to the XSL file.
3. The name of the output file.

Let's try it again:

So, if the XML and XSL files are placed in the same folder as our application, you can use the following syntax to run the application:

$ java addressBookConverter addresses.xml addresses.xsl result.txt

And here's what you should see:

Contact information for "Sherlock Holmes, World's Greatest Detective"
Mailing address: 221B BakerStreet, London, England
Phone: 123 6789
Email address: sherlock@holmes.domain.com
Web site URL: http://www.method_and_madness.com/

Let's look at the code in detail.

The Anatomy Of A Transformation

As always with Java, the first step involves pulling all the required classes into the application.

// import required classes
import javax.xml.transform.*;
import javax.xml.transform.stream.*;
import java.io.*;

In case you're wondering, first come the classes for JAXP, followed by the classes for exception handling and file I/O.

Now, I bet you're wondering, what's JAXP? According to the Unofficial JAXP FAQ, available at http://xml.apache.org/~edwingo/jaxp-faq.html, the Java API for XML Processing (JAXP) "enables applications to parse and transform XML documents using an API that is independent of a particular XML processor implementation". Or, to put it very simply, JAXP provides an abstraction layer, or standard API, that allows for code reuse across different XSLT processors.

You might be wondering how JAXP, as an abstraction layer, knows which class to use during the transformation process. This information is available via the javax.xml.transform.TransformerFactory property. In case this didn't make any sense to you, don't worry about it; if it did, and you want to know more, take a look at http://xml.apache.org/xalan-j/apidocs/javax/xml/transform/TransformerFactory.html

Next, I've instantiated some variables to hold the names of the various files I'll be using in the application.

   // store the names of the files
   public static String xmlFile, xslFile, resultFile = "";

And now for the constructor:

  // constructor
   public addressBookConverter(String xmlFile, String xslFile, String resultFile) {
      try {

         // create an instance of the TransformerFactory
         // this allows the developer to use an API that is independent of
         // a particular XML processor implementation.
         TransformerFactory tFactory = TransformerFactory.newInstance();

         // create a transformer which takes the name of the stylesheet
         // as an input parameter
         Transformer transformer = tFactory.newTransformer(new StreamSource(xslFile));

         // transform the given XML file
         transformer.transform(new StreamSource(xmlFile), new StreamResult(resultFile));

         System.out.println("Done!");
        } catch (TransformerException e) {
         System.err.println("The following error occured: " + e);
        }
   }

The first step is to create an instance of the TransformerFactory class. This can be used to create a Transformer object, which reads the XSLT stylesheet and converts the templates within it into a Templates object. This Templates object is a dynamic representation of the instructions present in the XSLT file - you won't see any reference to it in the code above, because it all happens under the hood, but trust me, it exists.

Once the stylesheet is processed, a new Transformer object is generated. This Transformer object does the hard work of applying the templates within the Templates object to the XML data to produce a new result tree, via its transform() method. The result tree is stored in the specified output file.

Finally, the main() method sets the ball in motion:

 // everything starts here
   public static void main (String[] args) {
      if(args.length != 3) {
               System.err.println("Please specify three parameters:\n1. The name and path to the XML file.\n2. The name and path to the XSL file.\n3. The name of the output file.");
              return;
       }

      // assign the parameters passed as input parameters to
      // the variables defined above
      xmlFile = args[0];
      xslFile = args[1];
      resultFile = args[2];

      addressBookConverter myFirstExample = new addressBookConverter (xmlFile, xslFile, resultFile);
    }

This method first checks to see if the correct number of arguments was passed. If so, it invokes the constructor to create an instance of the addressBookConverter class; if not, it displays an appropriate error message.

Six Degrees Of Separation

Let's move on to something a little more complicated. Let's suppose that I wanted to convert my address book from XML into a delimiter-separated ASCII file format, for easy import into another application. With XSLT and Xalan, the process is a snap.

Here's the XML data:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/home/me/xsl/xml2csv.xsl"?>
<addressbook>
   <item>
      <name>Bugs Bunny</name>
      <address>The Rabbit Hole, The Field behind Your House</address>
      <email>bugs@bunnyplanet.com</email>
   </item>
   <item>
      <name>Batman</name>
      <address>The Batcave, Gotham City</address>
      <tel>123 7654</tel>
      <url>http://www.belfry.net/</url>
      <email>bruce@gotham-millionaires.com</email>
   </item>
</addressbook>

Now, in order to convert this XML document into a delimiter-separated file, I need an XSLT stylesheet like this:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>
   <xsl:strip-space elements="*"/>
   <!- Set the "|" symbol as the default delimiter à
   <xsl:param name="delimiter" select="normalize-space('|')"/>
   <xsl:template match="/addressbook">
      <xsl:for-each select="item">
         <xsl:value-of select="normalize-space(name)"/>
         <xsl:value-of select="$delimiter"/>
         <xsl:value-of select="normalize-space(address)"/>
         <xsl:value-of select="$delimiter"/>
         <xsl:value-of select="normalize-space(tel)"/>
         <xsl:value-of select="$delimiter"/>
         <xsl:value-of select="normalize-space(email)"/>
         <xsl:value-of select="$delimiter"/>
         <xsl:value-of select="normalize-space(url)"/>
         <!- hexadecimal value for the new-line character à
         <xsl:text>&#x0A;</xsl:text>
      </xsl:for-each>
   </xsl:template>
</xsl:stylesheet>

And here's the Java code to tie it all together:

// imported java classes
import javax.xml.transform.*;
import javax.xml.transform.stream.*;
import java.io.*;

public class xml2csv {

   // store the names of the files
   public static String xmlFile, xslFile, resultFile,delimiterValue = "";

   // parameters for the getAssociatedStylesheet() method of the TransformerFactory class
   // Set them to null as they are not essential here
   String media = null , title = null, charset = null;

   public xml2csv(String xmlFile, String resultFile, String delimiterValue) {
      try {

         // create an instance of the TransformerFactory class
         TransformerFactory tFactory = TransformerFactory.newInstance();

         // get the name of the stylesheet that has been defined in the XML file.
         // create a Source object for use by the upcoming newTransformer() method
         Source stylesheet = tFactory.getAssociatedStylesheet (new StreamSource(xmlFile),media, title, charset);

         // create a transformer which takes the name of the stylesheet
         // as an input parameter
         Transformer transformer = tFactory.newTransformer(stylesheet);

         // set a delimiter via the setParameter() method of the transformer
         transformer.setParameter("delimiter",delimiterValue);

         // perform the transformation
         transformer.transform(new StreamSource(xmlFile), new StreamResult(resultFile));

         System.out.println("Done!");

         } catch (TransformerException e) {
         System.err.println("The following error occured: " + e);
         }
   }

   // everything starts here
   public static void main (String[] args) {
      if(args.length != 3) {
            System.err.println("Please specify three parameters:\n1. The name and path to the XML file.\n2. The name of the output file.\n3. The delimiter to be used.");
              return;
       }

   // set some variables
   xmlFile = args[0];
   resultFile = args[1];
   delimiterValue = args[2];

        xml2csv mySecondExample = new xml2csv(xmlFile, resultFile, delimiterValue);
    }

}

Most of the code is identical to the previous example. There are a couple of important differences, though. First, the parameters passed to the constructor are different in this case.

   public xml2csv(String xmlFile, String resultFile, String delimiterValue) {

     // snip

   }

Over here, I'm passing three parameters to the constructor: the name of the XML file, the name of the output file, and the delimiter to be used between the various elements of a record. What about the XSLT file, you ask? Well, that's sourced automatically from the XML file via the getAssociatedStylesheet() method of the TransformerFactory class.

         // create an instance of the TransformerFactory class
         TransformerFactory tFactory = TransformerFactory.newInstance();

         // get the name of the stylesheet that has been defined in the XML file.
         // create a Source object for use by the upcoming newTransformer() method
         Source stylesheet = tFactory.getAssociatedStylesheet (new StreamSource(xmlFile),media, title, charset);

A new Source object is created to represent this stylesheet; this Source object is ultimately passed to the Transformer class.

         // create a transformer which takes the name of the stylesheet
         // as an input parameter
         Transformer transformer = tFactory.newTransformer(stylesheet);

If you take a close look at the XSLT file above, you'll see that I've defined an XSLT parameter named "delimiter". This parameter is essentially a variable which can be accessed by XSLT, and a value can be assigned to it via the setParameter() method of the Transformer class.

         // set a delimiter via the setParameter() method of the transformer
         transformer.setParameter("delimiter",delimiterValue);

In this case, the "delimiter" parameter is set to whatever delimiter was specified by the user.

Finally, the actual transformation is performed, the output stored in the desired output file, and a result code generated.

         // perform the transformation
         transformer.transform(new StreamSource(xmlFile), new StreamResult(resultFile));

         System.out.println("Done!");

Here's what it looks like, assuming you use a pipe (|) as delimiter (I haven't used a comma simply because some of the entries already contain commas):

Bugs Bunny|The Rabbit Hole, The Field behind Your House||bugs@bunnyplanet.com|
Batman|The Batcave, Gotham City|123 7654|bruce@gotham-millionaires.com|http://www.belfry.net/

Isn't it simple when you know how?

The Write Stuff

Now, if I've done my job right, you should have a fairly clear idea of how to perform XSLT transformation at the command prompt. How about the doing the same in a Web environment?

Have no fear. The process is very similar, and will cause you no heartburn whatsoever. Let's take the very first example in the article and update it to work with the Web. Here's the XML:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="addresses.xsl"?>
<me>
   <name>Sherlock Holmes</name>
   <title> World's Greatest Detective</title>
   <address>221B Baker Street, London, England</address>
   <tel>123 6789</tel>
   <email>sherlock@holmes.domain.com</email>
   <url>http://www.method_and_madness.com/</url>
</me>

Let's suppose I wanted to convert this XML document into the following Web page:

The first step, obviously, is to create a new XSLT stylesheet to produce HTML, rather than ASCII, output.

<?xml version="1.0"?>

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">

   <html>
   <head>
   </head>
   <body>
   <h1>Contact information for <b><xsl:value-of select="me/name" /></b></h1>

   <h2>Mailing address:</h2>
   <xsl:value-of select="me/address" />

   <h2>Phone:</h2>
   <xsl:value-of select="me/tel" />

   <h2>Email address:</h2>
   <xsl:value-of select="me/email" />

   <h2>Web site URL:</h2>
   <xsl:value-of select="me/url" />

   </body>
   </html>

</xsl:template>

</xsl:stylesheet>

And here's the Java class code:

// imported java classes
import org.xml.sax.SAXException;
import javax.xml.transform.*;
import javax.xml.transform.stream.*;
import java.io.*;

public class addressBookHTMLConverter {

   // store the names of the files
   public static String xmlFile = "";

   // a Writer object to communicate with JSP
   private Writer out;

   // parameters for the getAssociatedStylesheet() method of the TransformerFactory class
   // Set them to null as they are not essential here
   String media = null , title = null, charset = null;

   // constructor
   public addressBookHTMLConverter(String xmlFile, Writer out) throws SAXException {
         try {

            this.out = out;

            // create an instance of the TransformerFactory
            TransformerFactory tFactory = TransformerFactory.newInstance();

             // get the name of the stylesheet that has been defined in the XML file.
             // create a Source object for use by the upcoming newTransformer() method
             Source stylesheet = tFactory.getAssociatedStylesheet (new StreamSource(xmlFile),media, title, charset);

            // create a transformer which takes the name of the stylesheet
            // from the XML document
            transformer.transform(new StreamSource(xmlFile),new StreamResult(out));

            out.flush();

         } catch (TransformerException e) {
            throw new SAXException(e);
         } catch (IOException e) {
            throw new SAXException(e);
         }
   }
}

The key difference between this example and the earlier application is the introduction of a new Writer object, which makes it possible to redirect output to the browser instead of the standard output device.

private Writer out;

The constructor also needs to be modified to work with the Writer object:

   public addressBookHTMLConverter(String xmlFile, Writer out) throws SAXException {

   // snip

}

Finally, the transform() method of the Transformer object needs to know that, this time, output must go to the Writer object and not to a file. Note the change in the second argument to the transform() method, below.

            // create a transformer which takes the name of the stylesheet
            // from the XML document
            transformer.transform(new StreamSource(xmlFile),new StreamResult(out));

Note that, since I'm using a Writer object, I will have to build some error-handling routines into the class as well. It's instructive to examine them, and understand the reason for their inclusion.

You'll remember that I defined a Writer object at the top of my program; this Writer object provides a convenient way to output a character stream, either to a file or elsewhere. However, if the object does not initialize correctly, there is no way of communicating the error to the final JSP page.

The solution to the problem is simple: throw an exception. This exception can be captured by the JSP page and resolved appropriately. This is no different from the mechanism used when catching XML errors with the Xerces XML parser.

So that's the class. Now, we need something to tie the class to the Web server. Enter JSP.

<%@ page language="java" import="java.io.IOException" %>
<html>
<head>
</head>
<body>
<%
   try {
        addressBookHTMLConverter myThirdExample = new addressBookHTMLConverter("/home/me/xml/addresses.xml ", out);

   } catch (Exception e) {
         out.println("Something bad happened!");
   }
%>

</body>
</html>

Now, start up your Web server, with the Tomcat engine, and browse to the JSP page. Here's what you'll see:

This example, though intended only as an illustration, should serve to demonstrate how easy it is to perform XSLT transformations in a Web environment with Xalan.

Still Hungry?

And that's about it from me. In this article, I attempted to demonstrate the basics of the Xalan XSLT processor, offering a broad overview of how it work by transforming XML documents into ASCII and HTML formats. I showed you how to write simple command-line Java applications to perform transformations, and then ported these examples over to the Web via JSP.

I've made a conscious attempt to stay away from the geekier aspects of the processor, hoping to keep things as simple as possible. If, however, those geekier aspects do interest you, consider visiting the following links

The official Xalan Web site, at http://xml.apache.org/xalan-j/index.html

The official Xerces Web site, at http://xml.apache.org/xerces-j/index.html

Xalan code samples, at http://xml.apache.org/xalan-j/samples.html

The W3C's XSL and XSLT site, at http://www.w3c.org/Style/XSL/

XSLT Basics, at http://www.melonfire.com/community/columns/trog/article.php?id=82

XML Basics, at http://www.melonfire.com/community/columns/trog/article.php?id=78

The goal here was to offer you a broad overview of how Xalan works, in order to give you some insight into the processor's capabilities and lay the groundwork for more advanced applications. Did I succeed? Write in and tell me.

Note: All examples in this article have been tested with JDK 1.3.0, Apache 1.3.11, mod_jk 1.1.0, Xalan 2.3 and Tomcat 3.3. Melonfire offers no support or warranties for the source code in this article. Examples are illustrative only, and are not meant for a production environment. YMMV!

This article was first published on08 Mar 2002.