XML Parsing With DOM and Xerces (part 1)
Figured out SAX parsing in Java? Cat-skinning technique two involves a little thing called the DOM.

| Nailguns, Going Cheap |

I'll begin with something simple. Consider the following XML file, an XML-encoded inventory statement for a business selling equipment to Quake enthusiasts.


<?xml version="1.0"?>
<inventory>
    <item>
        <id>758</id>
        <name>Rusty, jagged nails for nailgun</name>
        <supplier>NailBarn, Inc.</supplier>
        <cost>2.99</cost>
        <quantity>10000</quantity>
    </item>
    <item>
        <id>6273</id>
        <name>Power pack for death ray</name>
        <supplier>QuakePower.domain.com</supplier>
        <cost>9.99</cost>
        <quantity>10</quantity>
    </item>
</inventory>


The Xerces DOM parser is designed to read an XML file, build a tree to represent the structures found within it, and expose object methods and properties to manipulate them. This next example demonstrates how, building a simple Java application that initializes the parser and reads the XML file.


import org.apache.xerces.parsers.DOMParser;
import org.w3c.dom.*;
import java.io.*;

public class MyFirstDomApp {

    // constructor
    public MyFirstDomApp (String xmlFile) {

        // create a DOM parser
        DOMParser parser = new DOMParser();

        // parse the document
        try {
            parser.parse(xmlFile);
            Document document = parser.getDocument();
            NodeDetails(document);
        } catch (org.xml.sax.SAXException e) {
            // parse() reports malformed XML through a SAXException
            System.err.println (e);
        } catch (IOException e) {
            System.err.println (e);
        }
    }

    // this function prints out the type and name of a specific node
    // (in this example, the "#document" node)
    // it then does the same for that node's first child, if there is one
    private void NodeDetails (Node node) {
        System.out.println ("Node Type:" + node.getNodeType() + "\nNode Name:" + node.getNodeName());
        if (node.hasChildNodes()) {
            System.out.println ("Child Node Type:" + node.getFirstChild().getNodeType() + "\nNode Name:" + node.getFirstChild().getNodeName());
        }
    }

    // the main method to create an instance of our DOM application
    public static void main (String[] args) {
        MyFirstDomApp app = new MyFirstDomApp (args[0]);
    }
}


I'll explain what all this gobbledygook means shortly - but first, let's compile and run the code.

$ javac MyFirstDomApp.java

Assuming that all goes well, you should now have a class file named "MyFirstDomApp.class". Make sure that the directory holding this class file (along with the Xerces JAR) is in your Java CLASSPATH, and then execute the class, with the name of the XML file as argument.

$ java MyFirstDomApp /home/me/dom/inventory.xml

Here's what the output looks like:

Node Type:9
Node Name:#document
Child Node Type:1
Node Name:inventory

Now, this might not look like much, but it demonstrates the basic concept of the DOM, and builds the foundation for more complex code. Let's look at the code in detail:

1. The first step is to import all the classes required to execute the application. First comes the Xerces DOM parser class, followed by the W3C DOM interfaces (Document, Node and friends), and finally the java.io classes, which supply the IOException used for error handling.


import org.apache.xerces.parsers.DOMParser;
import org.w3c.dom.*;
import java.io.*;


2. Next, a constructor is defined for the class (in case you didn't already know, a constructor is a method that is invoked automatically when you create an instance of the class).


// constructor
public MyFirstDomApp (String xmlFile) {

    // create a DOM parser
    DOMParser parser = new DOMParser();

    // parse the document
    try {
        parser.parse(xmlFile);
        Document document = parser.getDocument();
        NodeDetails(document);
    } catch (org.xml.sax.SAXException e) {
        // parse() reports malformed XML through a SAXException
        System.err.println (e);
    } catch (IOException e) {
        System.err.println (e);
    }
}


As you can see, the constructor uses the parse() method to perform the actual parsing of the XML document; it accepts the XML file name as method argument. This method call is enclosed within a "try-catch" error handling block, in order to gracefully recover from errors.

The end result of this parsing is a DOM tree consisting of a single root and its child nodes, each of which exposes methods that describe the object in greater detail.
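If you don't have Xerces handy, the same "parse, then grab the document node" idea can be tried with the JAXP classes bundled with the JDK. Treat what follows as a rough sketch rather than the method used in this article: it swaps the Xerces DOMParser for javax.xml.parsers.DocumentBuilder, reads the XML from a string instead of a file, and the class name RootPeek and the method describe() are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class: uses JAXP (not the Xerces DOMParser class)
public class RootPeek {

    // parses XML held in a string, then describes the document node
    // and its first child
    static String describe(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            Node first = document.getFirstChild();
            return document.getNodeName() + " (type " + document.getNodeType() + ") -> "
                + first.getNodeName() + " (type " + first.getNodeType() + ")";
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // malformed XML arrives as a SAXException, read errors as an IOException
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>"
            + "<inventory><item><id>758</id></item></inventory>";
        // prints: #document (type 9) -> inventory (type 1)
        System.out.println(describe(xml));
    }
}
```

Note that the XML declaration does not show up as a node; the first child of the document is the <inventory> element, just as in the output above.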

3. The getDocument() method returns an object representing the entire XML document; this object reference is then passed on to the NodeDetails() method, which displays information about the node itself and about its first child.


// this function prints out the type and name of a specific node
// (in this example, the "#document" node)
// it then does the same for that node's first child, if there is one
private void NodeDetails (Node node) {
    System.out.println ("Node Type:" + node.getNodeType() + "\nNode Name:" + node.getNodeName());
    if (node.hasChildNodes()) {
        System.out.println ("Child Node Type:" + node.getFirstChild().getNodeType() + "\nNode Name:" + node.getFirstChild().getNodeName());
    }
}


4. Once a reference to a node has been obtained, a number of other methods and properties become available to obtain the name and value of that node, as well as references to parent and child nodes. In the code snippet above, I've used the getNodeType() and getNodeName() methods of the Node object to obtain the node type and name respectively. Similarly, the hasChildNodes() method can be used to find out if a node has child nodes under it, while the getFirstChild() method can be used to get a reference to the first child node.
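To see how far these few methods can take you, here's a small sketch that walks the whole tree instead of stopping at the first child. It is an illustration only, built on a couple of assumptions: it uses the JDK's bundled JAXP parser rather than the Xerces DOMParser, it leans on getNextSibling() (a standard Node method not used in the listing above), and the names ElementOutline, outline() and outlineOf() are invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class: prints an indented outline of element names
public class ElementOutline {

    // appends the name of every element at or below "node",
    // indented two spaces per level of nesting
    static void outline(Node node, int depth, StringBuilder out) {
        int childDepth = depth;
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(node.getNodeName()).append('\n');
            childDepth = depth + 1;
        }
        // getFirstChild() returns null when a node has no children
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            outline(child, childDepth, out);
        }
    }

    // parses the XML string with JAXP and returns the outline as text
    static String outlineOf(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            outline(document, 0, out);
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(outlineOf(
            "<inventory><item><id>758</id><quantity>10000</quantity></item></inventory>"));
    }
}
```

Text nodes (type 3) are skipped here because only element nodes are printed; the document node itself is not an element either, so the outline starts at <inventory>.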

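In case you're wondering about the getNodeType() method - every node is of a specific type, and this method returns a numeric code for that type; each code also has a named constant on the Node interface (Node.ELEMENT_NODE, for example). Here's the list of available types, along with what getNodeName() returns for each: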

Type   Type                         Description        Node name
(num)  (str)
---------------------------------------------------------------------------
1      ELEMENT_NODE                 Element            The element name
2      ATTRIBUTE_NODE               Attribute          The attribute name
3      TEXT_NODE                    Text               #text
4      CDATA_SECTION_NODE           CDATA              #cdata-section
5      ENTITY_REFERENCE_NODE        Entity reference   The entity reference name
6      ENTITY_NODE                  Entity             The entity name
7      PROCESSING_INSTRUCTION_NODE  PI                 The PI target
8      COMMENT_NODE                 Comment            #comment
9      DOCUMENT_NODE                Document           #document
10     DOCUMENT_TYPE_NODE           DocType            The doctype name (normally the root element name)
11     DOCUMENT_FRAGMENT_NODE       DocumentFragment   #document-fragment
12     NOTATION_NODE                Notation           The notation name
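Since getNodeType() hands back a bare number, it can be handy to turn that number into its constant name when printing debug output. The sketch below shows one way to do it with the constants defined on org.w3c.dom.Node; the class NodeTypeNames and its nameOf() method are hypothetical helpers written for this illustration, not part of Xerces or the DOM API.

```java
import org.w3c.dom.Node;

// Hypothetical helper: maps the number returned by getNodeType()
// to the name of the matching constant on the Node interface
public class NodeTypeNames {

    static String nameOf(int type) {
        switch (type) {
            case Node.ELEMENT_NODE:                return "ELEMENT_NODE";
            case Node.ATTRIBUTE_NODE:              return "ATTRIBUTE_NODE";
            case Node.TEXT_NODE:                   return "TEXT_NODE";
            case Node.CDATA_SECTION_NODE:          return "CDATA_SECTION_NODE";
            case Node.ENTITY_REFERENCE_NODE:       return "ENTITY_REFERENCE_NODE";
            case Node.ENTITY_NODE:                 return "ENTITY_NODE";
            case Node.PROCESSING_INSTRUCTION_NODE: return "PROCESSING_INSTRUCTION_NODE";
            case Node.COMMENT_NODE:                return "COMMENT_NODE";
            case Node.DOCUMENT_NODE:               return "DOCUMENT_NODE";
            case Node.DOCUMENT_TYPE_NODE:          return "DOCUMENT_TYPE_NODE";
            case Node.DOCUMENT_FRAGMENT_NODE:      return "DOCUMENT_FRAGMENT_NODE";
            case Node.NOTATION_NODE:               return "NOTATION_NODE";
            default:                               return "UNKNOWN";
        }
    }

    public static void main(String[] args) {
        // the two type numbers seen in the sample output earlier
        System.out.println(nameOf(9));  // prints: DOCUMENT_NODE
        System.out.println(nameOf(1));  // prints: ELEMENT_NODE
    }
}
```

In real code you would usually compare against the constants directly, as in node.getNodeType() == Node.ELEMENT_NODE, rather than against the raw numbers.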


Copyright © 1998-2018 Melonfire. All rights reserved