XPath Basics
Use XPath to navigate to any point within an XML document.

| Dog Days |

If you've worked with XML data, you already know that the XML specification defines certain rules that a document must adhere to in order to be well-formed. One of the most important rules is that every XML document must have a single outermost element, called the "root element", which may in turn contain other elements, nested in a hierarchical manner.

Now, it seems logical to assume that if an XML document is laid out in this structured, hierarchical tree, it's possible to move at will from any node on the tree to any other node on the tree. And that's where XPath comes in - it provides a standard addressing mechanism for an XML document which lays bare every element, attribute and text node on the tree, making it a snap to access and manipulate them.

In fact, XPath gets its name from the way node addresses look a lot like standard *NIX or Windows paths - a hierarchical list of all the branches between the current node and the tree root, separated by slashes.
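To make the file-path analogy concrete, here's a minimal sketch using Python's standard xml.etree.ElementTree module. The library/shelf/book document is a made-up sample, not part of this article's example, and note that ElementTree only understands a subset of XPath, so treat this as an illustration of the addressing style rather than a full XPath engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this sketch.
XML = """<library>
  <shelf>
    <book>
      <title>Dune</title>
    </book>
  </shelf>
</library>"""

root = ET.fromstring(XML)  # "root" here is the outermost <library> element

# A location path reads like a file path: one step per branch,
# separated by slashes, starting from the current node (".").
title = root.find("./shelf/book/title")
print(title.text)

# The same address, used to fetch the text in a single call.
print(root.findtext("shelf/book/title"))
```

Both calls walk the same three branches, much as "cd shelf/book/title" would in a shell.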

XPath represents an XML document as a tree containing a number of different node types - seven of them, actually. In order to illustrate this, consider the following XML document:

<?xml version="1.0"?>
<movie>
<title>X-Men</title>
<!-- in case you didn't know, this is based on the comic - Ed -->
<cast>Hugh Jackman, Patrick Stewart and Ian McKellen</cast>
<director>Bryan Singer</director>
<year>2000</year>
<?play_trailer?>
</movie>

Here's how XPath would represent this:

    | -- movie
        | -- title
            | -- X-Men
        | -- in case you didn't know...
        | -- cast
            | -- Hugh Jackman, Patrick Stewart and Ian McKellen
        | -- director
            | -- Bryan Singer
        | -- year
            | -- 2000
        | -- play_trailer
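If you'd like to generate an outline like this yourself, the sketch below walks the document with Python's standard xml.dom.minidom module. It assumes the document contains the movie, title and year elements and the play_trailer instruction that the tree shows, and the outline() helper is a hypothetical name invented for this example. One caveat: the DOM is not the XPath data model, and the sketch quietly skips the whitespace-only text between elements to match the outline above.

```python
from xml.dom import minidom

# The movie document, assumed to match the tree shown above.
XML = """<?xml version="1.0"?>
<movie>
<title>X-Men</title>
<!-- in case you didn't know, this is based on the comic - Ed -->
<cast>Hugh Jackman, Patrick Stewart and Ian McKellen</cast>
<director>Bryan Singer</director>
<year>2000</year>
<?play_trailer?>
</movie>"""


def outline(node, depth=0, lines=None):
    """Collect one indented line per node (hypothetical helper)."""
    if lines is None:
        lines = []
    prefix = "    " * depth + "| -- "
    if node.nodeType in (node.TEXT_NODE, node.COMMENT_NODE):
        # Skip the whitespace-only text that sits between elements.
        if node.data.strip():
            lines.append(prefix + node.data.strip())
    else:
        # Elements are labelled by tag name, PIs by their target.
        lines.append(prefix + node.nodeName)
    for child in node.childNodes:
        outline(child, depth + 1, lines)
    return lines


doc = minidom.parseString(XML)
tree_lines = outline(doc.documentElement)
print("\n".join(tree_lines))
```

The comment shows up in full here, rather than abbreviated as in the hand-drawn version.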

As you can see, the various nodes in the tree above are not identical - some are elements, some contain text fragments, and some simply represent comments. Since XML itself supports a limited number of constructs, the XPath specification is able to categorize these different types of nodes into:

Element nodes: Elements within the XML document are represented as element nodes in the XPath data model. Since elements can have other elements nested within them, they typically appear as branches on the tree (although so-called "empty" elements would appear as leaves). In the example above, "title" would be an element node.

Text nodes: The character sequences that are enclosed within elements constitute text nodes on the XPath tree. If the text contains an entity reference, the reference is automatically expanded to its full value before the text node is created. In the example above, "X-Men" would be a text node.

Attribute nodes: If an element has attributes, those attributes are also represented as nodes. Since attributes are always linked to elements, the corresponding element node is the parent of each attribute node; oddly enough, though, XPath does not count an attribute node among the children of that element.

Namespace nodes: If an XML document defines one or more namespaces for the elements within it, the namespaces in scope for each element are represented as separate nodes by XPath. Like attributes, namespace nodes have the associated element node as their parent in the XPath tree, but are not counted among its children.

Processing instruction (PI) nodes: If a document contains a processing instruction - well, that's a separate node too. Note, however, that although the XML declaration at the top of the document looks like a PI, it isn't one, and so there exists no node corresponding to it.

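Comment nodes: You figure this one out... (every comment in the document, like the editor's aside in the example above, gets a node of its own.)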
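Here's a rough sketch that tallies the different kinds of nodes in a small document, again with Python's standard xml.dom.minidom. The sample document (including the namespace URI, the id attribute and the "now" instruction data) and the count_kinds() helper are assumptions made up for this illustration. Keep in mind that the DOM is only an approximation of the XPath model: it stores namespace declarations as ordinary attributes, so the sketch separates them by name, and it counts declarations rather than the per-element namespace nodes XPath defines.

```python
from collections import Counter
from xml.dom import Node, minidom

# Hypothetical sample; the namespace URI and attribute are invented.
XML = """<?xml version="1.0"?>
<m:movie xmlns:m="http://example.com/movies" id="67">
<m:title>X-Men &amp; friends</m:title>
<!-- a comment -->
<?play_trailer now?>
</m:movie>"""

KINDS = {
    Node.DOCUMENT_NODE: "root",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
}


def count_kinds(node, counts):
    """Tally nodes by kind, ignoring whitespace-only text (hypothetical helper)."""
    if node.nodeType in KINDS:
        if node.nodeType != Node.TEXT_NODE or node.data.strip():
            counts[KINDS[node.nodeType]] += 1
    if node.nodeType == Node.ELEMENT_NODE:
        for i in range(node.attributes.length):
            attr = node.attributes.item(i)
            # The DOM keeps xmlns declarations among the attributes.
            is_ns = attr.name == "xmlns" or attr.name.startswith("xmlns:")
            counts["namespace declaration" if is_ns else "attribute"] += 1
    for child in node.childNodes:
        count_kinds(child, counts)
    return counts


doc = minidom.parseString(XML)
doc.normalize()  # merge any adjacent text nodes before counting
counts = count_kinds(doc, Counter())
for kind, total in sorted(counts.items()):
    print(kind, total)

# The entity reference has already been expanded in the text node.
title_text = doc.getElementsByTagName("m:title")[0].firstChild.data
print(title_text)
```

Only one processing instruction is counted, since the XML declaration on the first line produces no node at all.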

Now, in addition to these six types (which, if you look at your leather-bound copy of the XML specification, correspond rather closely with the six basic constructs available in XML), XPath also defines something called a "root node", which is unique to every XML document. This root node represents the base of the XML document tree, and encloses everything within it. There can be only one root node in an XML document, and all the elements within the document exist as descendants of this root node.

It should be noted that the root node of a document is not the same as the outermost element (sometimes referred to as the "document element"); rather, it is a hypothetical node which exists as the parent of the outermost element.

The hierarchical nature of XML data itself imposes a couple of other rules, which you might think of as pretty obvious - however, they bear repeating in this context. Every node (other than the root node) has a single parent. The root node and element nodes may have one or more children; the other node types are always leaves. And every dog has his day.
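These rules are easy to check for yourself. The sketch below uses Python's standard xml.dom.minidom, on the assumption that the DOM Document object is a fair stand-in for the XPath root node; the trimmed-down movie document is just a sample for this illustration.

```python
from xml.dom import minidom

# A trimmed-down sample document, used only for this sketch.
doc = minidom.parseString(
    "<movie><title>X-Men</title><year>2000</year></movie>"
)

root_node = doc                  # stands in for the XPath root node
outermost = doc.documentElement  # the <movie> element

# The root node is not the outermost element; it is its parent.
print(root_node is outermost)
print(outermost.parentNode is root_node)

# The root node is the one node without a parent.
print(root_node.parentNode)

# Every other node has exactly one parent...
title = outermost.firstChild
text = title.firstChild
print(title.parentNode.nodeName, text.parentNode.nodeName)

# ...elements may have children, while text nodes are always leaves.
print(len(outermost.childNodes), len(text.childNodes))
```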

Copyright © 1998-2018 Melonfire. All rights reserved