Using the JAXP XPath API

Saxon provides an API for executing XPath expressions. The API is an implementation of the JAXP 1.3 XPath API, which in turn is loosely modelled on the DOM Level 3 API for XPath (which Saxon does not implement). For full documentation, see the Javadoc description of package net.sf.saxon.xpath. Two sample applications using this API are available: they are called XPathExample.java and ApplyXPathJAXP.java, and can be found in the samples/java directory.

The XPathExample.java application has been rewritten in Saxon 8.2 to use JAXP 1.3 interfaces. To run this application, see the instructions in Shakespeare XPath Sample Application.

The ApplyXPathJAXP.java application is an enhanced version of the class of the same name issued as a sample application in the JAXP 1.3 distribution. It has been enhanced to show the use of more advanced features, such as the ability to bind namespaces, variables, and functions, and also to demonstrate use of the XPath API with different object models.

Because the XPath API in Saxon predates the introduction of the JAXP 1.3 XPath API, there is often more than one way of achieving the same effect. It is likely that in time, some of the native Saxon methods will be deprecated and replaced by the standard JAXP methods.

An application using the JAXP 1.3 XPath API starts by instantiating a factory class. This is done by calling:


XPathFactory xpathFactory = XPathFactory.newInstance(objectModel);
XPath xpath = xpathFactory.newXPath();

Here objectModel is a URI that identifies the object model you are using. Saxon recognizes four values for the object model:

Symbolic name: Meaning

XPathConstants.DOM_OBJECT_MODEL: The DOM object model.

NamespaceConstant.OBJECT_MODEL_SAXON: Saxon's native object model. This means anything that implements the NodeInfo interface, including the standard tree, the tiny tree, and third-party implementations of NodeInfo.

NamespaceConstant.OBJECT_MODEL_JDOM: The JDOM object model.

NamespaceConstant.OBJECT_MODEL_XOM: The XOM object model.

To ensure that Saxon is selected as your XPath implementation, you must specify one of these constants as your chosen object model, and you must ensure that the Java system property javax.xml.xpath.XPathFactory is set to the value net.sf.saxon.xpath.XPathFactoryImpl. Normally, if Saxon is on your classpath then the Saxon XPath implementation will be picked up automatically, but if there are other implementations on the classpath as well then it is best to set the system property explicitly to be sure.
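When several XPath implementations may be on the classpath, it can help to check which factory JAXP actually selected. The following is a minimal sketch that uses only standard JAXP calls; the class name FactoryLookupSketch and its method are hypothetical, introduced for illustration. Without Saxon on the classpath it reports the JDK's built-in factory.

```java
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

public class FactoryLookupSketch {

    /**
     * Asks JAXP for a factory for the DOM object model and returns the
     * name of the implementation class that was selected.
     */
    public static String selectedFactoryClass() throws Exception {
        XPathFactory factory =
                XPathFactory.newInstance(XPathConstants.DOM_OBJECT_MODEL);
        return factory.getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        // Prints the implementation class chosen by the JAXP lookup.
        System.out.println("XPathFactory in use: " + selectedFactoryClass());

        // The XPath object comes from the same factory lookup.
        XPath xpath =
                XPathFactory.newInstance(XPathConstants.DOM_OBJECT_MODEL).newXPath();
        System.out.println("XPath object class: " + xpath.getClass().getName());
    }
}
```

If the printed class is not the one you expect, that suggests another implementation was found first, which is the case where setting the system property explicitly is worthwhile.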

This API is based on the class net.sf.saxon.xpath.XPathEvaluator. This class provides a few simple configuration interfaces to set the source document, the static context, and the context node, plus a number of methods for evaluating XPath expressions.

The XPath object returned from the above calls allows you to set the static context for evaluating XPath expressions (you can pre-declare namespaces, variables, and functions), and to compile XPath expressions in this context. A compiled XPath expression (an object of class XPathExpression) can then be evaluated, with a node (represented by a class in the selected object model) supplied as the context node. For further details, see the Javadoc specifications and the supplied example applications.
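As an illustration of setting the static context and then compiling an expression, here is a minimal sketch using only the standard JAXP interfaces with the DOM object model. The class name, the namespace URI http://example.com/books, the sample document, and the variable name limit are all assumptions made for the example; they do not come from the Saxon samples.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class StaticContextSketch {

    // Hypothetical namespace and sample data, used only for this illustration.
    static final String BOOKS_NS = "http://example.com/books";
    static final String XML =
            "<b:books xmlns:b='" + BOOKS_NS + "'>"
          + "<b:book price='12.5'>A</b:book>"
          + "<b:book price='30'>B</b:book>"
          + "</b:books>";

    /** Returns the text of the first book whose price exceeds the limit. */
    public static String firstTitleOver(final double limit) throws Exception {
        // Build a namespace-aware DOM document to act as the context node.
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));

        XPath xpath =
                XPathFactory.newInstance(XPathConstants.DOM_OBJECT_MODEL).newXPath();

        // Pre-declare the prefix "b" used in the expression.
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "b".equals(prefix) ? BOOKS_NS : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) {
                return BOOKS_NS.equals(uri) ? "b" : null;
            }
            public Iterator<String> getPrefixes(String uri) {
                String p = getPrefix(uri);
                return p == null
                        ? Collections.<String>emptyIterator()
                        : Collections.singleton(p).iterator();
            }
        });

        // Bind the variable $limit; any other variable name is left unresolved.
        xpath.setXPathVariableResolver(name ->
                "limit".equals(name.getLocalPart()) ? Double.valueOf(limit) : null);

        // Compile in the static context set above, then evaluate with the
        // document node supplied as the context node.
        XPathExpression expr =
                xpath.compile("/b:books/b:book[@price > $limit][1]");
        return expr.evaluate(doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstTitleOver(20.0));
    }
}
```

The namespace context and variable resolver are set before compile() is called, so the sketch works whether the implementation resolves them at compile time or at evaluation time.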

The JAXP specification leaves much of the decision about how the results of an XPath expression are returned to the implementation. This is partly because it is defined only for XPath 1.0, which has a much simpler type system, and partly because it is deliberately designed to be independent of the object model used to represent XML trees.

If you specify the return type XPathConstants.BOOLEAN then Saxon will return the effective boolean value of the expression, as a java.lang.Boolean. This is the same as wrapping the expression in a call of the XPath boolean() function.

If you specify the return type XPathConstants.STRING then Saxon will return the result of the expression converted to a string, as a java.lang.String. This is the same as wrapping the expression in a call of the XPath string() function.

If you specify the return type XPathConstants.NUMBER then Saxon will return the result of the expression converted to a double as a java.lang.Double. This is the same as wrapping the expression in a call of the XPath number() function.
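The three atomic return types can be sketched as follows, using only standard JAXP calls and the DOM object model. The class name ScalarResultSketch and the sample order document are hypothetical, chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ScalarResultSketch {

    // Hypothetical sample data for the illustration.
    static final String XML = "<order><item qty='2'/><item qty='3'/></order>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
    }

    static XPath newXPath() throws Exception {
        return XPathFactory.newInstance(XPathConstants.DOM_OBJECT_MODEL).newXPath();
    }

    /** BOOLEAN: the effective boolean value, as if wrapped in boolean(). */
    public static Boolean hasItems() throws Exception {
        return (Boolean) newXPath()
                .evaluate("/order/item", parse(), XPathConstants.BOOLEAN);
    }

    /** STRING: the result converted to a string, as if wrapped in string(). */
    public static String firstQty() throws Exception {
        return (String) newXPath()
                .evaluate("/order/item[1]/@qty", parse(), XPathConstants.STRING);
    }

    /** NUMBER: the result converted to a double, as if wrapped in number(). */
    public static Double totalQty() throws Exception {
        return (Double) newXPath()
                .evaluate("sum(/order/item/@qty)", parse(), XPathConstants.NUMBER);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(hasItems());
        System.out.println(firstQty());
        System.out.println(totalQty());
    }
}
```

The casts reflect the Java classes named above: evaluate() is declared to return Object, so the caller narrows the result according to the return type it requested.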

If you specify the return type XPathConstants.NODE then Saxon will return the result as a node object in the selected object model. With the DOM model this will be an instance of org.w3c.dom.Node, with the native Saxon model it will be an instance of net.sf.saxon.om.NodeInfo, and so on.

If the return type is XPathConstants.NODESET, the result will be a Java List containing node objects in the selected object model. Note that the DOM NodeList class is not returned.
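Because the class returned for NODESET depends on the implementation, code that may run against more than one XPath implementation can check the result type before using it. The sketch below uses the DOM object model and standard JAXP calls; the class name and sample document are hypothetical. It accepts either a java.util.List, as described above for Saxon, or an org.w3c.dom.NodeList, which is what the JDK's built-in implementation returns.

```java
import java.io.StringReader;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NodeResultSketch {

    // Hypothetical sample data for the illustration.
    static final String XML = "<order><item qty='2'/><item qty='3'/></order>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
    }

    static XPath newXPath() throws Exception {
        return XPathFactory.newInstance(XPathConstants.DOM_OBJECT_MODEL).newXPath();
    }

    /** NODE: with the DOM object model the result is an org.w3c.dom.Node. */
    public static String firstItemName() throws Exception {
        Object result = newXPath()
                .evaluate("/order/item[1]", parse(), XPathConstants.NODE);
        return ((Node) result).getNodeName();
    }

    /** NODESET: count the nodes whether a List or a NodeList comes back. */
    public static int countItems() throws Exception {
        Object result = newXPath()
                .evaluate("/order/item", parse(), XPathConstants.NODESET);
        if (result instanceof List) {
            return ((List<?>) result).size();       // List of node objects
        }
        if (result instanceof NodeList) {
            return ((NodeList) result).getLength(); // DOM NodeList
        }
        throw new IllegalStateException(
                "Unexpected NODESET result class: " + result.getClass().getName());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstItemName());
        System.out.println(countItems());
    }
}
```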

Saxon does not recognize any values for the return type other than those defined in JAXP. If you want to return a different result type, for example a list of integers or a date, use one of the methods in which the result type is unspecified. If any conversions are necessary, do them within the XPath expression itself, using casts or constructor functions. The Java object that is returned will be a representation of the XPath value, converted in the same way as arguments to extension functions.

Saxon's implementation of XPathExpression (namely net.sf.saxon.xpath.XPathExpressionImpl) provides additional methods for evaluating the XPath expression. In particular the rawIterator() method with no arguments returns a Saxon SequenceIterator which allows the application to process the results of any XPath expression, with no conversion: all values will be represented using a native Saxon class, for example a node will be represented as a NodeInfo and a QName as a QNameValue. The NodeInfo interface is described in the next section.

The native Saxon methods rely on the dynamic context being established using separate setXX() calls on the XPathExpressionImpl object. If these methods are used, the XPathExpression object will not be thread-safe.

XPath itself provides no sorting capability. Saxon therefore allows you to specify the order in which the results of an expression are returned. This is done by nominating another expression, via the setSortKey method: this second expression is applied to each item in the result sequence, and its value determines the position of that item in the sorted result order.

You can call methods directly on the NodeInfo object to get information about a node: for example getDisplayName() gets the name of the node in a form suitable for display, and getStringValue() gets the string value of the node, as defined in the XPath data model. You can also use the node as the context node for evaluation of subsequent expressions.
