This page and the following pages describe Saxon's original native Java API for XQuery. For the new XQJ interface, see Invoking XQuery using the XQJ API. For .NET interfaces, see Saxon API for .NET.
Rather than using the query processor from the command line, you may want to issue queries from your own application, perhaps one that runs within an applet or servlet. If you run the processor repeatedly, this is much faster than launching it from the command line each time, even if each run handles a different query.
In the absence of a standard API for XQuery, Saxon provides its own. It is fully described
in the JavaDoc included in the download: look for the package net.sf.saxon.query.
The starting point is the class StaticQueryContext. What follows here is an
overview. For an example of how the API can be used, take a look at the source code for the class
QueryAPIExamples in the samples/java directory.
Getting started
The first thing you need to do is to create a net.sf.saxon.Configuration object.
This holds the values of all the system settings, corresponding to flags available on the command line.
You don't need to set any properties in the Configuration object if you are happy
with the default settings. However, there are many options that you can set by calling setter
methods on the Configuration object or, if you prefer, by calling the general-purpose
method setConfigurationProperty(name, value), where the name is a constant from the
class net.sf.saxon.FeatureKeys. The available properties are described on the page
Using XSLT from an Application.
For schema-aware processing, assuming you are using the schema-aware version of Saxon,
create an instance of com.saxonica.validate.SchemaAwareConfiguration, which is a subclass of
Configuration. Alternatively, there is a factory method Configuration.makeConfiguration()
which makes a schema-aware configuration if Saxon-SA is available and licensed, and a non-schema-aware
configuration otherwise.
Then you need to create a net.sf.saxon.query.StaticQueryContext object. As the name
implies, this holds information about the static (compile-time) context for a query. Most aspects
of the static context can be defined in the Query Prolog, but this object allows you to initialize
the static context from the application instead if you need to. Some of the facilities provided are
very much for advanced users only, for example the ability to declare variables and functions, and
the ability to specify a NamePool to be used. One aspect of the static context that you may need
to use is the ability to declare collations. Using the method declareCollation you can
create a mapping between a collation URI (which can then be used anywhere in the Query) and a Java
Comparator object used to implement that collation.
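As a rough illustration of the kind of Comparator that could be registered in this way, the sketch below builds one from the JDK's java.text.Collator. It uses only standard Java classes. The class and method names are invented for this example, and the call that hands the comparator to declareCollation together with a collation URI is deliberately left out: check the JavaDoc for its exact signature in your Saxon version.

```java
import java.text.Collator;
import java.util.Comparator;
import java.util.Locale;

// Hypothetical helper class: not part of Saxon.
public class CollationComparatorSketch {

    /**
     * Returns an English comparator that ignores case differences.
     * java.text.Collator implements Comparator<Object>, so the collator
     * itself can be used wherever a Comparator is expected.
     */
    public static Comparator<Object> caseInsensitiveEnglish() {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        // SECONDARY strength distinguishes letters and accents but not case.
        collator.setStrength(Collator.SECONDARY);
        return collator;
    }

    public static void main(String[] args) {
        Comparator<Object> comparator = caseInsensitiveEnglish();
        System.out.println(comparator.compare("apple", "APPLE"));
        System.out.println(comparator.compare("apple", "banana") < 0);
    }
}
```

A comparator built like this would then be associated with a collation URI of your choosing, so that queries can refer to it by that URI.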
Compiling the Query
The StaticQueryContext object can now be used to compile a Query. The text of the
Query can be supplied either as a String or as a Java Reader. There
are thus two different compileQuery methods. Each of them returns the compiled
query in the form of an XQueryExpression. The XQueryExpression, as you would expect,
can be executed repeatedly, as often as you want, in the same or in different threads.
For example:
Configuration config = new Configuration();
StaticQueryContext staticContext =
        new StaticQueryContext(config);
XQueryExpression exp =
        staticContext.compileQuery("count(//ITEM)");
Note: the StaticQueryContext object no longer gets updated by the query parser with additional
information defined in the query prolog. It is therefore no longer necessary to create a
new StaticQueryContext object for each query you compile. This also means that you can't
use the StaticQueryContext to obtain information about the query you have just compiled;
instead, use the internal StaticQueryContext object created by Saxon, which is available
using the getStaticContext() method on the XQueryExpression object.
You can optionally register a ModuleURIResolver with the Configuration
(using the setModuleURIResolver() method).
This will be used to handle the URIs found in any import module
declaration. The resolver returns a set of JAXP StreamSource
objects, each containing either an InputStream or a Reader
providing access to the text of the query module. The StreamSource must
also contain a SystemId, representing the base URI of the query module. (Supply a Reader
if you want to handle encoding issues yourself, or an InputStream
if you want Saxon to deal with this.)
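The sketch below shows, using JDK classes only, how the StreamSource objects described above might be put together: the module text is supplied through a Reader and the base URI is set as the system ID. The class name, method name, module text, and URI are all made up for illustration, and the method is not an implementation of the ModuleURIResolver interface itself, whose exact signature is given in the JavaDoc.

```java
import java.io.StringReader;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class: not part of Saxon.
public class ModuleSourceSketch {

    /**
     * Wraps the text of one query module in a StreamSource.
     * A Reader is used, so the caller has already dealt with character encoding.
     */
    public static StreamSource[] moduleSources(String moduleText, String baseUri) {
        StreamSource source = new StreamSource(new StringReader(moduleText));
        // The system ID carries the base URI of the query module.
        source.setSystemId(baseUri);
        return new StreamSource[] { source };
    }

    public static void main(String[] args) {
        StreamSource[] sources = moduleSources(
                "module namespace m = 'http://example.com/m';",
                "http://example.com/modules/m.xq");
        System.out.println(sources.length + " source(s), base URI " + sources[0].getSystemId());
    }
}
```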
Building a Source Document
Before you run your query, you may want to build one or more trees representing
XML documents that can be used as input to your query. You don't need to do this: if the query
loads its source documents using the doc() function then this will be done
automatically, but doing it yourself gives you more control. A document node at the root of
a tree is represented in Saxon by the net.sf.saxon.om.DocumentInfo interface.
The Configuration provides a convenience method, buildDocument(),
that allows an instance of DocumentInfo to be constructed. The input parameter to
this is defined by the class javax.xml.transform.Source, which is part of the
standard Java JAXP API: the Source interface is an umbrella for different kinds of
XML document source, including a StreamSource which parses raw XML from a byte
or character stream, SAXSource which takes the input from a SAX parser (or an
object that is simulating a SAX parser), and DOMSource which provides the input
from a DOM.
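For readers less familiar with JAXP, here is a small sketch of how the three standard kinds of Source mentioned above can be created with JDK classes alone. The class name, method names, and sample XML are invented for this example; any of the resulting objects could then be supplied wherever a Source is expected.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class: not part of Saxon.
public class SourceKindsSketch {

    static final String XML =
            "<BOOKS><ITEM><TITLE>Pride and Prejudice</TITLE></ITEM></BOOKS>";

    /** Raw XML read from a character stream. */
    public static Source fromStream() {
        return new StreamSource(new StringReader(XML));
    }

    /** Input delivered by a SAX parser, which is chosen by the consumer. */
    public static Source fromSax() {
        return new SAXSource(new InputSource(new StringReader(XML)));
    }

    /** Input taken from a DOM tree that has already been built. */
    public static DOMSource fromDom() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return new DOMSource(doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not build the DOM", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromStream().getClass().getSimpleName());
        System.out.println(fromSax().getClass().getSimpleName());
        System.out.println(fromDom().getNode().getFirstChild().getNodeName());
    }
}
```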
Saxon also provides several additional implementations of the Source interface
that can be used as input to this method. Saxon's DocumentInfo and NodeInfo
classes both implement this interface, though this isn't useful for this particular method because
you will only have one of these once you have built the tree from some other source.
There are a number of wrapper classes that allow trees in other object models to be treated as
Saxon trees: net.sf.saxon.jdom.DocumentWrapper for JDOM,
net.sf.saxon.xom.DocumentWrapper for XOM, net.sf.saxon.dom.DocumentWrapper for DOM,
and net.sf.saxon.dom4j.DocumentWrapper for DOM4J.
The net.sf.saxon.AugmentedSource object
can wrap any other kind of Source, and provides additional options as to how the Source
should be processed, for example whether it should be validated against a schema, whether whitespace
should be stripped, and whether XInclude processing should take place. Validation is only possible if
you created a SchemaAwareConfiguration.
Running the Query
To execute your compiled query, you need to create a DynamicQueryContext object
that holds the run-time context information. The main things you can set in the run-time context are:
Values of parameters (external global variables). You can set these using the setParameter()
method. The mapping from Java classes to XQuery/XPath data types is the same as the mapping used for the
returned values from an external Java method call, and is described under
Result of an Extension Function.
The context node, which can be set using the method setContextNode() or setContextItem().
A URIResolver and/or ErrorListener. These default to the ones that were used during query compilation.
You are now ready to evaluate the query. There are several methods on the XQueryExpression
object that you can use to achieve this. The evaluate() method returns the result sequence
as a Java java.util.List. The evaluateSingle() method is suitable when you know
that the result sequence will contain a single item: this returns this item as an Object, or returns null
if the result is an empty sequence. There is also an iterator() method that returns an iterator
over the results. This is a Saxon object of class net.sf.saxon.om.SequenceIterator: it is similar
to the standard Java iterator, but not quite identical; for example, it can throw exceptions. Finally,
there is a run() method, which executes the query, converts the results to an XML document,
and writes this document to a JAXP Result object, which may represent a DOM, a SAX ContentHandler,
or a serial output stream.
The evaluate() and evaluateSingle() methods return the result as a Java object
of the most appropriate type: for example a string is returned as a java.lang.String, a
boolean as a java.lang.Boolean. A node is returned using the Saxon representation of a node,
net.sf.saxon.om.NodeInfo. With the standard and tinytree models, this object also implements
the DOM Node interface (but any attempt to update the node throws an error).
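Because the items come back as ordinary Java objects, application code usually ends up inspecting the type of each one. The following sketch, which uses only JDK classes, shows one way such a dispatch might look. The class and method names are invented, the list in main() merely stands in for a result returned by evaluate(), and the Long case is an assumption based on the integer example later on this page.

```java
import java.util.Arrays;
import java.util.List;
import org.w3c.dom.Node;

// Hypothetical helper class: not part of Saxon.
public class ResultItemSketch {

    /** Describes one converted result item according to its Java type. */
    public static String describe(Object item) {
        if (item == null) {
            // evaluateSingle() returns null for an empty sequence.
            return "empty sequence";
        }
        if (item instanceof String) {
            return "string: " + item;
        }
        if (item instanceof Boolean) {
            return "boolean: " + item;
        }
        if (item instanceof Long) {
            return "integer: " + item;
        }
        if (item instanceof Node) {
            // Read-only access through the DOM Node interface.
            return "node: " + ((Node) item).getNodeName();
        }
        return "other: " + item.getClass().getName();
    }

    public static void main(String[] args) {
        // Stand-in for the List returned by evaluate().
        List<Object> results = Arrays.asList("Emma", Boolean.TRUE, 3L);
        for (Object item : results) {
            System.out.println(describe(item));
        }
    }
}
```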
The iterator() method, by contrast, does not do any conversion of the result. Each item is returned
using its native Saxon representation; for example, a string is returned as an instance of
net.sf.saxon.value.StringValue. You can then use all the methods available on this class
to process the returned value.
The run() method is probably the most efficient in the case of queries that construct a new
document as their output, because it allows the nodes of the result document to be serialized (or sent to the
destination) as they are created, without creating a tree structure in memory first.
Here is a simple example for a query that returns a singleton integer result:
DynamicQueryContext dynamicContext =
        new DynamicQueryContext(config);
dynamicContext.setContextNode(
        config.buildDocument(
                new StreamSource(new File("books.xml"))));
Long count = (Long)exp.evaluateSingle(dynamicContext);
System.out.println("There are " + count.intValue() + " books");
Here is an example where the query returns a list of nodes:
XQueryExpression exp = staticContext.compileQuery("//ITEM/TITLE");
DynamicQueryContext dynamicContext =
        new DynamicQueryContext(config);
dynamicContext.setContextNode(
        config.buildDocument(
                new StreamSource(new File("books.xml"))));
SequenceIterator books = exp.iterator(dynamicContext);
while (true) {
    // next() returns null when the sequence is exhausted
    NodeInfo book = (NodeInfo)books.next();
    if (book==null) break;
    String title = book.getStringValue();
    System.out.println(title);
}
Wrapped Output
If you want to process the results of the query in your application, that's all there is to it. But you
may want to output the results as serialized XML. Saxon provides two ways of doing this: you can produce
wrapped output, or raw output. Raw output works only if the result consists of a single document or element
node, and it outputs the subtree rooted at that node in the form of a serialized XML document. The
simplest way to produce raw output is to use the run() method on the XQueryExpression
object, but you can also do it by retrieving the result as a SequenceIterator and passing each node
it delivers to the serialize() method of the QueryResult class.
Wrapped output works for any result sequence, for example a sequence of integers or a sequence of attribute and
comment nodes; this works by wrapping each item in the result sequence as an XML element, with details
of its type and value. To produce wrapped output, you first wrap the result sequence as an XML tree, and then serialize the
tree. This can be done using the QueryResult class. This class doesn't need to be
instantiated: its methods are static. The method QueryResult.wrap takes as input the iterator
produced by evaluating the query using the iterator() method, and produces as output
a DocumentInfo object representing the results wrapped as an XML tree. The method
QueryResult.serialize takes any document or element node as input, and writes it to
a specified destination, using specified output properties. The destination is supplied as an object
of class javax.xml.transform.Result. Like the Source, this is part of the
JAXP API, and allows the destination to be specified as a StreamResult (representing a byte stream or
character stream), a SAXResult (which wraps a SAX ContentHandler), or a DOMResult
(which delivers the result as a DOM). The output properties are used only when writing to
a StreamResult: they correspond to the properties available in the xsl:output element
for XSLT. The property names are defined by constants in the JAXP javax.xml.transform.OutputKeys
class (or net.sf.saxon.event.SaxonOutputKeys for Saxon extensions): for details of the
values that are accepted, see the JavaDoc documentation or the JAXP specification.
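To show how output properties and a StreamResult fit together in JAXP terms, the sketch below serializes a small document using the JDK's identity Transformer rather than Saxon's QueryResult class. This is a substitute technique chosen so that the example runs with the JDK alone; the class name, method name, and sample XML are invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class: not part of Saxon.
public class OutputPropertiesSketch {

    /** Copies the XML to a character stream, applying the given output properties. */
    public static String serialize(String xml, Properties props) {
        try {
            // A Transformer created without a stylesheet performs an identity copy.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperties(props);
            StringWriter out = new StringWriter();
            identity.transform(new StreamSource(new StringReader(xml)),
                               new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Property names come from the JAXP OutputKeys constants.
        props.setProperty(OutputKeys.METHOD, "xml");
        props.setProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        System.out.println(serialize("<ITEM><TITLE>Emma</TITLE></ITEM>", props));
    }
}
```

The same Properties object, populated in the same way, is what the Saxon examples on this page pass to QueryResult.serialize and to run().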
Here is an example that produces wrapped output:
XQueryExpression exp =
        staticContext.compileQuery("//ITEM");
DynamicQueryContext dynamicContext =
        new DynamicQueryContext(config);
dynamicContext.setContextNode(
        config.buildDocument(
                new StreamSource(new File("books.xml"))));
SequenceIterator books = exp.iterator(dynamicContext);
// Wrap the whole result sequence as a single XML document
DocumentInfo resultDoc = QueryResult.wrap(books, config);
Properties props = new Properties();
props.setProperty(OutputKeys.METHOD, "xml");
props.setProperty(OutputKeys.INDENT, "yes");
QueryResult.serialize(resultDoc,
        new StreamResult(System.out), props);
This example produces output without wrapping:
XQueryExpression exp = staticContext.compileQuery("//ITEM");
DynamicQueryContext dynamicContext =
        new DynamicQueryContext(config);
dynamicContext.setContextNode(
        config.buildDocument(
                new StreamSource(new File("books.xml"))));
SequenceIterator books = exp.iterator(dynamicContext);
Properties props = new Properties();
props.setProperty(OutputKeys.METHOD, "xml");
props.setProperty(OutputKeys.INDENT, "no");
int nr = 1;
while (true) {
    NodeInfo book = (NodeInfo)books.next();
    if (book==null) break;
    System.out.println("===== BOOK " + nr + " =====");
    // Serialize each node separately, without a wrapper element
    QueryResult.serialize(book, new StreamResult(System.out), props);
    nr++;
}
If the results do not need to be processed by the application, the same effect can be achieved more efficiently using the code shown below:
XQueryExpression exp = staticContext.compileQuery("//ITEM");
DynamicQueryContext dynamicContext =
        new DynamicQueryContext(config);
dynamicContext.setContextNode(
        config.buildDocument(
                new StreamSource(new File("books.xml"))));
Properties props = new Properties();
props.setProperty(OutputKeys.METHOD, "xml");
props.setProperty(OutputKeys.INDENT, "no");
exp.run(dynamicContext, new StreamResult(System.out), props);