

Copyright notice:

This article is copyright Melonfire, 2018. All rights reserved.

All source code, brand names, trademarks and other content contained herein is proprietary to Melonfire, 2018. All rights reserved.

Source code within this article is provided with NO WARRANTY WHATSOEVER. It is meant for illustrative purposes only, and is NOT recommended for use in production environments.

Copyright infringement is a violation of law.

Printed from http://www.melonfire.com/community/columns/trog/article.php?id=108



XML Parsing With SAX and Xerces (part 2)
Get down and dirty with the Xerces SAX parser.

Looking Back

In the first part of this article, I introduced you to the Xerces XML parser, explaining how it could be used to parse XML documents using an event-driven approach called SAX. I also demonstrated how the parser worked by using it in a couple of simple Java programs, and explained some of the interfaces and callbacks available in the API.

Now, writing a Java program to parse an XML document is all well and good. However, it's not really all that useful if you're a Web developer and your primary goal is the dynamic generation of Web pages from an XML file. And so, this concluding part takes everything you learned last time and tosses it out into the wild and wacky world of the Web, demonstrating clearly how Java, JSP, Xerces and XML can be combined to create simple, real-world Web applications. Take a look!

Nailing It To The Wall

Now, how about something a little more useful? Consider the following modification of the previous example:


<?xml version="1.0"?>
<inventory>
<item>
<id>758</id>
<name>Rusty, jagged nails for nailgun</name>
<supplier>NailBarn, Inc.</supplier>
<cost>2.99</cost>
<quantity alert="500">10000</quantity>
</item>
<item>
<id>6273</id>
<name>Power pack for death ray</name>
<supplier>QuakePower.domain.com</supplier>
<cost currency="USD">9.99</cost>
<quantity alert="20">10</quantity>
</item>
<item>
<id>3784</id>
<name>Axe</name>
<supplier>Axe And You Shall Receive, Inc.</supplier>
<cost currency="USD">56.74</cost>
<quantity alert="5">25</quantity>
</item>
<item>
<id>986</id>
<name>NVGs</name>
<supplier>Quake Eyewear</supplier>
<cost currency="USD">1399.99</cost>
<quantity alert="5">2</quantity>
</item>
</inventory>


Now, let's suppose I want to display this information in a neatly formatted table, with the items I'm running low on highlighted in red. My preferred output would look something like this:

Output image

Here's the code to accomplish this:


import org.apache.xerces.parsers.SAXParser;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import java.io.*;

public class MyFifthSaxApp extends DefaultHandler {

    private Writer out;
    private String ElementName, AttributeName, AttributeValue = "";
    private Integer Quantity, Alert;

    // constructor
    public MyFifthSaxApp(String xmlFile, Writer out) throws SAXException {

        this.out = out;

        // create a Xerces SAX parser
        SAXParser parser = new SAXParser();

        // set the content handler
        parser.setContentHandler(this);

        // parse the document
        try {
            parser.parse(xmlFile);
            out.flush();
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // called when an opening tag is found
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        try {

            // this is useful later
            ElementName = local;

            if (local.equals("inventory")) {
                // display table header
                out.write("<h1><font face=Verdana>Inventory Management</font></h1>\n"
                    + "<table width=\"55%\" cellpadding=\"5\" cellspacing=\"5\" border=\"1\"><tr>"
                    + "<td><p align=right><b><font face=Verdana size=2>Code</font></b></p></td>"
                    + "<td><b><font face=Verdana size=2>Name</font></b></td>"
                    + "<td><b><font face=Verdana size=2>Supplier</font></b></td>"
                    + "<td><p align=right><b><font face=Verdana size=2>Cost</font></b></p></td>"
                    + "<td><p align=right><font face=Verdana size=2><b>Quantity</b></font></p></td></tr>");
            } else if (local.equals("item")) {
                // "item" element starts a new row
                out.write("<tr>");
            } else if (local.equals("name") || local.equals("supplier")) {
                // create table cells within row
                // align strings left, numbers right
                out.write("<td><p align=left><font face=Verdana size=2>");
            } else if (local.equals("id") || local.equals("cost") || local.equals("quantity")) {
                out.write("<td><p align=right><font face=Verdana size=2>");
            } else {
                out.write("<br>");
            }

            // process the attributes of this element, if any
            for (int i = 0; i < atts.getLength(); i++) {
                AttributeName = atts.getLocalName(i);
                // fetch the value by index; getValue(String) expects a qualified name
                AttributeValue = atts.getValue(i);
                if (AttributeName.equals("currency")) {
                    out.write(AttributeValue + "&nbsp;");
                } else if (AttributeName.equals("alert")) {
                    // remember the alert level for the quantity check in characters()
                    Alert = Integer.valueOf(AttributeValue);
                } else {
                    out.write("&nbsp;");
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // called when character data is found
    public void characters(char[] text, int start, int length) throws SAXException {
        try {
            String Content = new String(text, start, length);
            // ignore whitespace between elements
            if (!Content.trim().equals("")) {
                if ((ElementName != null && ElementName.equals("quantity"))
                        && (AttributeName != null && AttributeName.equals("alert"))) {
                    Quantity = Integer.valueOf(Content);
                    // if quantity lower than expected, highlight in red
                    if (Quantity.intValue() < Alert.intValue()) {
                        out.write("<font color=\"#ff0000\">" + Quantity + "</font>");
                    } else {
                        out.write("<font color=\"#000000\">" + Quantity + "</font>");
                    }
                } else {
                    out.write(Content);
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // called when a closing tag is found
    public void endElement(String uri, String local, String qName) throws SAXException {
        try {
            if (local.equals("inventory")) {
                out.write("</table>");
            } else if (local.equals("item")) {
                // "item" closes table row
                out.write("</tr>");
            } else if (local.equals("id") || local.equals("name") || local.equals("supplier")
                    || local.equals("cost") || local.equals("quantity")) {
                // close table cells
                out.write("</font></p></td>");
            } else {
                out.write("&nbsp;");
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

}


As you can see, the callback functions used here have evolved substantially from the previous examples - they now contain more conditional tests, and better error handling capabilities. Let's take a closer look.

Most of the work in this class is done by the startElement() callback function, which prints different HTML output depending on the element the parser has just encountered.


// called when an opening tag is found
public void startElement(String uri, String local, String qName, Attributes atts)
        throws SAXException {
    try {

        // this is useful later
        ElementName = local;

        if (local.equals("inventory")) {
            // display table header
            out.write("<h1><font face=Verdana>Inventory Management</font></h1>\n"
                + "<table width=\"55%\" cellpadding=\"5\" cellspacing=\"5\" border=\"1\"><tr>"
                + "<td><p align=right><b><font face=Verdana size=2>Code</font></b></p></td>"
                + "<td><b><font face=Verdana size=2>Name</font></b></td>"
                + "<td><b><font face=Verdana size=2>Supplier</font></b></td>"
                + "<td><p align=right><b><font face=Verdana size=2>Cost</font></b></p></td>"
                + "<td><p align=right><font face=Verdana size=2><b>Quantity</b></font></p></td></tr>");
        } else if (local.equals("item")) {
            // "item" element starts a new row
            out.write("<tr>");
        } else if (local.equals("name") || local.equals("supplier")) {
            // create table cells within row
            // align strings left, numbers right
            out.write("<td><p align=left><font face=Verdana size=2>");
        } else if (local.equals("id") || local.equals("cost") || local.equals("quantity")) {
            out.write("<td><p align=right><font face=Verdana size=2>");
        } else {
            out.write("<br>");
        }

        // process the attributes of this element, if any
        for (int i = 0; i < atts.getLength(); i++) {
            AttributeName = atts.getLocalName(i);
            // fetch the value by index; getValue(String) expects a qualified name
            AttributeValue = atts.getValue(i);
            if (AttributeName.equals("currency")) {
                out.write(AttributeValue + "&nbsp;");
            } else if (AttributeName.equals("alert")) {
                // remember the alert level for the quantity check in characters()
                Alert = Integer.valueOf(AttributeValue);
            } else {
                out.write("&nbsp;");
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}


This function maps different XML elements to appropriate HTML markup. As you can see, the document element "inventory", which marks the start of the XML document, is used to create the skeleton and first row of an HTML table, while the different "item" elements correspond to rows within this table. The details of each item - name, supplier, quantity and so on - are formatted as cells within each row of the table. Any attributes are handled in the loop at the end: a "currency" value is printed ahead of the cost, and an "alert" value is stored for later use.

Next, the characters() callback function handles formatting of the content embedded within the elements.


// called when character data is found
public void characters(char[] text, int start, int length) throws SAXException {
    try {
        String Content = new String(text, start, length);
        // ignore whitespace between elements
        if (!Content.trim().equals("")) {
            if ((ElementName != null && ElementName.equals("quantity"))
                    && (AttributeName != null && AttributeName.equals("alert"))) {
                Quantity = Integer.valueOf(Content);
                // if quantity lower than expected, highlight in red
                if (Quantity.intValue() < Alert.intValue()) {
                    out.write("<font color=\"#ff0000\">" + Quantity + "</font>");
                } else {
                    out.write("<font color=\"#000000\">" + Quantity + "</font>");
                }
            } else {
                out.write(Content);
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}


For most of the elements, I'm simply displaying the content as is. The only deviation from this standard policy occurs with the "quantity" element, which has an additional "alert" attribute. This "alert" attribute specifies the minimum number of units of the item that should be in stock; if the quantity drops below this minimum level, an alert should be generated. Consequently, the characters() callback includes some code to compare the current quantity with the minimum quantity, and highlight the data in red if stock has fallen below it.
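One caveat is worth a quick sketch. A SAX parser is allowed to deliver the text of a single element in more than one characters() call, so converting one chunk straight into a number can fail on some inputs. The version below buffers the text and runs the comparison in endElement() instead. It is only an illustration under a few assumptions: it uses the JAXP parser bundled with the JDK (javax.xml.parsers.SAXParserFactory) rather than instantiating the Xerces SAXParser class directly, it assumes every "quantity" element carries an "alert" attribute, and the names LowStockSketch and lowStockFlags are made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: collects one "low stock" flag per <quantity> element
public class LowStockSketch extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();
    private final List<Boolean> flags = new ArrayList<>();
    private int alert;

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        // start a fresh buffer for every element
        text.setLength(0);
        // the default JAXP parser is not namespace-aware, so match on qName
        if (qName.equals("quantity")) {
            // assumption: the "alert" attribute is always present
            alert = Integer.parseInt(atts.getValue("alert"));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // may be called several times per element, so only accumulate here
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (qName.equals("quantity")) {
            // the complete text is available only once the element closes
            int quantity = Integer.parseInt(text.toString().trim());
            flags.add(quantity < alert);
        }
    }

    // Parses the XML string and returns true for each item that is below its alert level
    public static List<Boolean> lowStockFlags(String xml) {
        LowStockSketch handler = new LowStockSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.flags;
    }

    public static void main(String[] args) {
        String xml = "<inventory>"
                + "<item><quantity alert=\"500\">10000</quantity></item>"
                + "<item><quantity alert=\"20\">10</quantity></item>"
                + "</inventory>";
        System.out.println(lowStockFlags(xml)); // prints [false, true]
    }
}
```

The same buffering idea could be folded into the class above by moving the red/black decision from characters() into endElement().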

And finally, to wrap things up, the endElement() callback closes the HTML tags opened earlier.


// called when a closing tag is found
public void endElement(String uri, String local, String qName) throws SAXException {
    try {
        if (local.equals("inventory")) {
            out.write("</table>");
        } else if (local.equals("item")) {
            // "item" closes table row
            out.write("</tr>");
        } else if (local.equals("id") || local.equals("name") || local.equals("supplier")
                || local.equals("cost") || local.equals("quantity")) {
            // close table cells
            out.write("</font></p></td>");
        } else {
            out.write("&nbsp;");
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}


Once you've compiled this class, you can use it in a JSP page, as you did with the previous example. Here's the code:


<%@ page language="java" import="java.io.IOException" %>
<html>
<head>
</head>
<body>

<%
try {
MyFifthSaxApp myFifthExample = new MyFifthSaxApp("/www/xerces/WEB-INF/classes/inventory.xml",out);
} catch (Exception e) {
      out.println("<font face=\"verdana\" size=\"2\">The following error occurred: <br><b>" + e + "</b></font>");
}
%>
</body>
</html>


And here's the output:

Output image

When Things Go Wrong

If you take a close look at the previous example, you'll notice some fairly complex error-handling built into it. It's instructive to examine that, and understand the reason for its inclusion.

You'll remember that I defined a Writer object at the top of my program; this Writer object provides a convenient way to output a character stream, either to a file or elsewhere. However, if a write to this object fails, the resulting IOException surfaces inside a parser callback, and there is no obvious way of communicating the error to the final JSP page.

The solution to the problem is simple: throw an exception. This exception can be captured by the JSP page and resolved appropriately.

Let's take another look at the startElement() callback, this time focusing on the error-handling built into it:


// called when an opening tag is found
public void startElement(String uri, String local, String qName, Attributes atts)
        throws SAXException {
    try {

        // snip

    } catch (IOException e) {
        throw new SAXException(e);
    }
}


The SAX ContentHandler interface declares startElement() as throwing only a SAXException, so an IOException raised by the Writer object cannot leave the callback as it is. The fix is to catch the IOException inside the callback and wrap it in a new SAXException, which then propagates out through the parser to the target JSP document.

Why is this necessary? Because if you don't do this, and your Writer object throws an error, there's no way of letting the JSP document know what happened, simply because the Writer object is the only available line of communication between the Java class and the JSP document. It's a little like that chicken-and-egg situation we all know and love...

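Now, in the JSP page, it's possible to set up a basic error resolution mechanism to display the error on the screen. In order to test-drive it, try removing one of the opening "item" tags from the XML document used in this example and accessing the JSP page again through your browser. The document is then no longer well-formed, so the parser itself raises an exception, which travels the same route to the catch block in the JSP page.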
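To see the wrapping pattern in isolation, here is a small sketch. It makes a few assumptions of its own: it uses the JAXP parser bundled with the JDK instead of the Xerces SAXParser class, and WrapErrorSketch, FailingWriter and describeFailure are hypothetical names invented for this illustration. FailingWriter simply throws an IOException on every write, standing in for a broken output stream; the caller then recovers the original error through SAXException.getException().

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.Writer;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WrapErrorSketch {

    // Hypothetical Writer that fails on every write
    public static class FailingWriter extends Writer {
        @Override
        public void write(char[] buf, int off, int len) throws IOException {
            throw new IOException("output stream closed");
        }
        @Override
        public void flush() { }
        @Override
        public void close() { }
    }

    // Parses the XML, writing "<p>" for every opening tag, and reports what went wrong
    public static String describeFailure(String xml, final Writer out) {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts)
                    throws SAXException {
                try {
                    out.write("<p>");
                } catch (IOException e) {
                    // the callback may only throw SAXException, so wrap the I/O error
                    throw new SAXException(e);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return "no error";
        } catch (SAXException e) {
            // getException() returns the wrapped exception, or null if there is none
            Exception cause = e.getException();
            if (cause instanceof IOException) {
                return "I/O error: " + cause.getMessage();
            }
            return "parse error: " + e.getMessage();
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints "I/O error: output stream closed"
        System.out.println(describeFailure("<todo/>", new FailingWriter()));
    }
}
```

In the JSP pages in this article, the catch block plays the role of describeFailure(): it receives the SAXException and prints it.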

Skinning A Cat, Technique Two

How about another example, this one utilizing a different technique to format XML into HTML?

Here's the XML file I plan to use - it's a simple to-do list, with tasks, priorities and due dates marked up in XML.


<?xml version="1.0"?>
<todo>
<item>
<priority>1</priority>
<task>Figure out how Xerces works</task>
<due>2001-12-12</due>
</item>
<item>
<priority>2</priority>
<task>Conquer the last Quake map</task>
<due>2001-12-31</due>
</item>
<item>
<priority>3</priority>
<task>Buy a Ferrari</task>
<due>2005-12-31</due>
</item>
<item>
<priority>1</priority>
<task>File tax return</task>
<due>2002-03-31</due>
</item>
<item>
<priority>3</priority>
<task>Learn to cook</task>
<due>2002-06-30</due>
</item>
</todo>


As with the previous examples, this has two components: the source code for the Java class, and the JSP page which uses the class. Here's the class:


import java.util.*;
import java.io.*;
import org.xml.sax.*;
import org.apache.xerces.parsers.SAXParser;
import org.xml.sax.helpers.DefaultHandler;

public class MySixthSaxApp extends DefaultHandler {

    private Writer out;
    private String ElementName = "";

    // define hash maps to store HTML markup
    // these maps are used in the callback functions
    // for start tags, end tags and character data ("priority" only)
    private Map StartElementHTML = new HashMap();
    private Map EndElementHTML = new HashMap();
    private Map PriorityHTML = new HashMap();

    // constructor
    public MySixthSaxApp(String xmlFile, Writer out) throws SAXException {

        this.out = out;

        // initialize StartElementHTML HashMap
        StartElementHTML.put("todo", "<ol>\n");
        StartElementHTML.put("item", "<li>");
        StartElementHTML.put("task", "<b>");
        StartElementHTML.put("due", "&nbsp;<i>(");

        // initialize EndElementHTML HashMap
        EndElementHTML.put("todo", "</ol>\n");
        EndElementHTML.put("item", "</font></li>\n");
        EndElementHTML.put("task", "</b>");
        EndElementHTML.put("due", ")</i>");

        // initialize PriorityHTML HashMap
        PriorityHTML.put("1", "<font face=\"Verdana\" color=\"#ff0000\" size=\"2\">");
        PriorityHTML.put("2", "<font face=\"Verdana\" color=\"#0000ff\" size=\"2\">");
        PriorityHTML.put("3", "<font face=\"Verdana\" color=\"#000000\" size=\"2\">");

        // create a Xerces SAX parser
        SAXParser parser = new SAXParser();

        // set the content handler
        parser.setContentHandler(this);

        // parse the document
        try {
            parser.parse(xmlFile);
            out.flush();
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // start element callback function
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        try {
            // keep track of the element being parsed
            ElementName = local;

            // only consult the HashMap if the element is not the "priority" element
            if (local != null && (!local.equals("priority"))) {

                // this ensures that elements not present
                // in the HashMap are handled
                // basically, taking care of those ugly NullPointerExceptions
                if (StartElementHTML.get(local) != null) {
                    out.write((StartElementHTML.get(local)).toString());
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // character data callback function
    public void characters(char[] text, int start, int length) throws SAXException {
        try {
            String Content = new String(text, start, length);
            if (!Content.trim().equals("")) {
                if (ElementName != null) {

                    // if the element name is not "priority", then display content
                    if (!ElementName.equals("priority")) {
                        out.write(Content);
                    } else {
                        // if it is the "priority" element,
                        // get the HTML tag from the PriorityHTML HashMap
                        // this defines the color for tasks with different priorities
                        if (PriorityHTML.get(Content) != null) {
                            out.write((PriorityHTML.get(Content)).toString());
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // end element callback function
    public void endElement(String uri, String local, String qName) throws SAXException {
        try {
            if (local != null && (!local.equals("priority"))) {
                if (EndElementHTML.get(local) != null) {
                    out.write((EndElementHTML.get(local)).toString());
                    ElementName = null;
                }
            }
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }
}


This is much cleaner and easier to read than the previous example, since it uses Java's HashMap object to store key-value pairs mapping XML element names to HTML markup. Three HashMaps have been used here: StartElementHTML, which stores the HTML tags for opening XML elements; EndElementHTML, which stores the HTML tags for closing XML elements; and PriorityHTML, which stores the HTML tags for the "priority" elements defined for each "item".

These HashMaps are populated with data in the class constructor:


// initialize StartElementHTML Hashmap
StartElementHTML.put("todo","<ol>\n");
StartElementHTML.put("item","<li>");
StartElementHTML.put("task","<b>");
StartElementHTML.put("due","&nbsp;<i>(");

// initialize EndElementHTML Hashmap
EndElementHTML.put("todo","</ol>\n");
EndElementHTML.put("item","</font></li>\n");
EndElementHTML.put("task","</b>");
EndElementHTML.put("due",")</i>");

// initialize PriorityHTML Hashmap
PriorityHTML.put("1","<font face=\"Verdana\" color=\"#ff0000\" size=\"2\">");
PriorityHTML.put("2","<font face=\"Verdana\" color=\"#0000ff\" size=\"2\">");
PriorityHTML.put("3","<font face=\"Verdana\" color=\"#000000\" size=\"2\">");


A string variable named ElementName is also used to store the name of the element currently being parsed; this is used within the characters() callback function.


private String ElementName = "";


Now, when an opening tag is found, the startElement() callback is triggered; this callback function uses the current element name as a key into the HashMap previously defined, retrieves the corresponding HTML markup for that element, and prints it.


// start element callback function
public void startElement(String uri, String local, String qName, Attributes atts)
        throws SAXException {
    try {
        // keep track of the element being parsed
        ElementName = local;

        // only consult the HashMap if the element is not the "priority" element
        if (local != null && (!local.equals("priority"))) {

            // this ensures that elements not present
            // in the HashMap are handled
            // basically, taking care of those ugly NullPointerExceptions
            if (StartElementHTML.get(local) != null) {
                out.write((StartElementHTML.get(local)).toString());
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}


Note the numerous checks to avoid NullPointerExceptions, the bane of every Java programmer on the planet.
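If you're on a newer Java release than the one this article was written against, the lookup and the null check can collapse into a single call. The sketch below assumes Java 8 or later (for generics and Map.getOrDefault()); MarkupLookupSketch and htmlFor are hypothetical names, and the map contents are copied from StartElementHTML above.

```java
import java.util.HashMap;
import java.util.Map;

public class MarkupLookupSketch {

    // typed map, so no casts or toString() calls are needed on lookup
    private static final Map<String, String> START_HTML = new HashMap<>();

    static {
        START_HTML.put("todo", "<ol>\n");
        START_HTML.put("item", "<li>");
        START_HTML.put("task", "<b>");
        START_HTML.put("due", "&nbsp;<i>(");
    }

    // Returns the markup for an element, or an empty string if nothing is mapped
    public static String htmlFor(String element) {
        return START_HTML.getOrDefault(element, "");
    }

    public static void main(String[] args) {
        System.out.println(htmlFor("task"));               // prints <b>
        System.out.println(htmlFor("priority").isEmpty()); // prints true
    }
}
```

Since an unmapped element simply yields an empty string, the callback can write the result unconditionally.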

With the opening element handled, the next step is to process the character data that follows it. This is handled by the characters() callback, which performs the important task of displaying the element content, with appropriate modification to the font colour depending on the element priority.


// character data callback function
public void characters(char[] text, int start, int length) throws SAXException {
    try {
        String Content = new String(text, start, length);
        if (!Content.trim().equals("")) {
            if (ElementName != null) {

                // if the element name is not "priority", then display content
                if (!ElementName.equals("priority")) {
                    out.write(Content);
                } else {
                    // if it is the "priority" element,
                    // get the HTML tag from the PriorityHTML HashMap
                    // this defines the color for tasks with different priorities
                    if (PriorityHTML.get(Content) != null) {
                        out.write((PriorityHTML.get(Content)).toString());
                    }
                }
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}


Here, the priority of the task is used to retrieve the corresponding display colour from the PriorityHTML HashMap; the task and due date that follow are then printed in that colour.

Finally, the endElement() callback function mirrors the startElement() callback, closing the HTML tags opened earlier.


// end element callback function
public void endElement(String uri, String local, String qName) throws SAXException {
    try {
        if (local != null && (!local.equals("priority"))) {
            if (EndElementHTML.get(local) != null) {
                out.write((EndElementHTML.get(local)).toString());
                ElementName = null;
            }
        }
    } catch (IOException e) {
        throw new SAXException(e);
    }
}


And here's the JSP page that uses the class above:


<%@ page language="java" import="java.io.IOException" %>
<html>
<head>
</head>
<body>
<h1><font face="Verdana">My Todo List</font></h1>
<% try {
MySixthSaxApp mySixthExample = new MySixthSaxApp("/www/xerces/WEB-INF/classes/todo.xml",out);
} catch (Exception e) {
out.println("<font face=\"verdana\" size=\"2\">Something bad just happened: <br><b>" + e + "</b></font>");
}
%>
</body>
</html>


And here's what it looks like:

Output image

Because I've used HashMaps to map XML elements to HTML markup, the code in the example above is cleaner and easier to maintain. Further, this approach makes it simpler to edit the XML-to-HTML mapping; if I need to add a new element to the source XML document, I need only update the HashMaps in my class code, with minimal modification to the callbacks themselves.
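Here's a rough sketch of what that looks like in practice. It is not the class above: it assumes Java 8 or later and the JAXP parser bundled with the JDK, it leaves out the "priority" handling, and MapDrivenSketch, render and the "owner" element are all invented for this illustration. The point is that supporting the new "owner" element takes two extra map entries and no change to the callbacks.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: a handler whose output is driven entirely by two maps
public class MapDrivenSketch extends DefaultHandler {

    private final Map<String, String> startHtml;
    private final Map<String, String> endHtml;
    private final StringBuilder html = new StringBuilder();

    public MapDrivenSketch(Map<String, String> startHtml, Map<String, String> endHtml) {
        this.startHtml = startHtml;
        this.endHtml = endHtml;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        // unmapped elements contribute no markup
        html.append(startHtml.getOrDefault(qName, ""));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        String content = new String(ch, start, length);
        // skip whitespace-only chunks between elements
        if (!content.trim().isEmpty()) {
            html.append(content);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        html.append(endHtml.getOrDefault(qName, ""));
    }

    // Parses the XML string and returns the generated HTML
    public static String render(String xml, Map<String, String> start, Map<String, String> end) {
        MapDrivenSketch handler = new MapDrivenSketch(start, end);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.html.toString();
    }

    public static void main(String[] args) {
        Map<String, String> start = new HashMap<>();
        Map<String, String> end = new HashMap<>();
        start.put("todo", "<ol>");  end.put("todo", "</ol>");
        start.put("item", "<li>");  end.put("item", "</li>");
        start.put("task", "<b>");   end.put("task", "</b>");

        // supporting the hypothetical "owner" element needs only these two entries
        start.put("owner", " <i>[");
        end.put("owner", "]</i>");

        String xml = "<todo><item><task>File tax return</task><owner>Trog</owner></item></todo>";
        // prints <ol><li><b>File tax return</b> <i>[Trog]</i></li></ol>
        System.out.println(render(xml, start, end));
    }
}
```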

Endnote

That's about it for this article. Over the preceding pages, you learned more than you ever wanted to know about the Xerces SAX parser, using it to develop simple XML-based applications in both Web and non-Web environments. You (hopefully) understood how SAX works, gained an insight into what callback functions do, and learned how to use Xerces' interfaces in combination with simple Java constructs to quickly and easily create dynamic Web pages from static XML documents.

I hope you enjoyed it, and that it helped you to gain a greater understanding of how to process XML and use it in a Java-based environment - both on and off the Web. In case you'd like more information on the topic, you should consider bookmarking the following sites:

The official Xerces Web page, at http://xml.apache.org/xerces-j/

The Xerces FAQ, at http://xml.apache.org/xerces-j/faq-write.html

The SAX project, at http://www.saxproject.org/

The SAX2 Quick Start, at http://www.megginson.com/SAX/Java/quick-start.html

The Xerces-Java Quick Start, at http://www.ecerami.com/xerces/

See you soon!

Note: All examples in this article have been tested with JDK 1.3.0, Apache 1.3.11, mod_jk 1.1.0, Xerces 1.4.4 and Tomcat 3.3. Examples are illustrative only, and are not meant for a production environment. YMMV!





Copyright © 1998-2018 Melonfire. All rights reserved