JAXP Source Types
This section is relevant to the Java platform only.
When a user application invokes Saxon via the Java API, the source document is supplied as an
instance of the JAXP Source interface (javax.xml.transform.Source). This is true whether invoking
an XSLT transformation, an XQuery query, or a free-standing XPath expression. Source is
essentially a marker interface, so the object that is supplied must be a kind of Source that
Saxon recognizes.
Saxon recognizes the three kinds of Source defined in JAXP: a StreamSource, a SAXSource, and a
DOMSource.
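As a minimal sketch of the three standard kinds, the following builds each one over the same
small document and passes it to a JAXP identity transformer. It uses only JDK classes, so it
runs against whichever JAXP implementation TransformerFactory.newInstance() finds (Saxon, if it
is the configured factory). The class name SourceKindsDemo and its helper methods are
hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: not part of Saxon or JAXP.
class SourceKindsDemo {
    static final String XML = "<greeting>hello</greeting>";

    // StreamSource: lexical XML read from a stream or reader.
    static Source asStreamSource() {
        return new StreamSource(new StringReader(XML));
    }

    // SAXSource: a SAX InputSource (optionally paired with an XMLReader).
    static Source asSaxSource() {
        return new SAXSource(new InputSource(new StringReader(XML)));
    }

    // DOMSource: an already-built DOM tree.
    static Source asDomSource() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(XML)));
        return new DOMSource(doc);
    }

    // Serializes any Source by running it through an identity transformation.
    static String serialize(Source source) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(asStreamSource()));
        System.out.println(serialize(asSaxSource()));
        System.out.println(serialize(asDomSource()));
    }
}
```

Because all three helpers return the common Source type, the calling code does not need to know
which kind it has been given.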
Saxon also accepts input from an XMLStreamReader (javax.xml.stream.XMLStreamReader), that is, a
StAX pull parser as defined in JSR 173. This is achieved by creating an instance of
net.sf.saxon.pull.StaxBridge, supplying the XMLStreamReader using the setXMLStreamReader()
method, and wrapping the StaxBridge object in an instance of net.sf.saxon.pull.PullSource, which
implements the JAXP Source interface and can be used in any Saxon method that expects a Source.
Saxon has been validated with two StAX parsers: the Zephyr parser from Sun (which is supplied as
standard with JDK 1.6), and the open-source Woodstox parser from Tatu Saloranta. In my
experience, Woodstox is the more reliable of the two. However, there is no immediate benefit in
using a pull parser rather than a push parser to supply Saxon input; the main use case for an
XMLStreamReader is when the data comes from some source other than the parsing of lexical XML.
Nodes in Saxon's implementation of the XPath data model are represented by the interface
NodeInfo. A NodeInfo is itself a Source, which means that any method in the API that requires a
Source object will accept any implementation of NodeInfo. As discussed in the next section,
implementations of NodeInfo are available to wrap Axiom, DOM, DOM4J, JDOM, JDOM2, or XOM nodes,
and in all cases these wrapper objects can be used wherever a Source is required.
Saxon also provides a class net.sf.saxon.lib.AugmentedSource which implements the Source
interface. This class encapsulates one of the standard Source objects, and allows additional
processing options to be specified. These options include whitespace handling, schema and DTD
validation, XInclude processing, error handling, choice of XML parser, and choice of Saxon tree
model.
Saxon allows additional Source types to be supported by registering a SourceResolver with the
Configuration object. The task of a SourceResolver is to convert a Source that Saxon does not
recognize into a Source that it does recognize. For example, this may be done by building the
document tree in memory and returning the NodeInfo object representing the root of the tree.
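The conversion step can be sketched without any Saxon classes. In the example below,
PropertiesSource is a hypothetical custom Source holding key/value pairs, and
PropertiesSourceConverter.resolve() is a hypothetical helper that builds a DOM tree from it and
returns a DOMSource, one of the kinds Saxon recognizes. The example deliberately does not
implement Saxon's SourceResolver interface or show how it is registered with the Configuration;
returning null for an unhandled Source is a convention of this sketch only. A real resolver
might instead return a NodeInfo, as described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical custom Source: key/value pairs rather than XML.
class PropertiesSource implements Source {
    private final Map<String, String> entries;
    private String systemId;

    PropertiesSource(Map<String, String> entries) {
        this.entries = new LinkedHashMap<>(entries);
    }

    Map<String, String> getEntries() {
        return entries;
    }

    @Override
    public void setSystemId(String systemId) {
        this.systemId = systemId;
    }

    @Override
    public String getSystemId() {
        return systemId;
    }
}

// Hypothetical converter illustrating the job a source resolver performs.
class PropertiesSourceConverter {

    // Returns a DOMSource for a PropertiesSource, or null (sketch convention)
    // if the supplied Source is of some other kind.
    static Source resolve(Source source) throws Exception {
        if (!(source instanceof PropertiesSource)) {
            return null;
        }
        PropertiesSource props = (PropertiesSource) source;

        // Build the document tree in memory.
        Document doc = DocumentBuilderFactory.newInstance()
                                             .newDocumentBuilder()
                                             .newDocument();
        Element root = doc.createElement("properties");
        doc.appendChild(root);
        for (Map.Entry<String, String> e : props.getEntries().entrySet()) {
            Element entry = doc.createElement("entry");
            entry.setAttribute("key", e.getKey());
            entry.setTextContent(e.getValue());
            root.appendChild(entry);
        }

        // Wrap the tree in a standard Source, preserving the system ID.
        DOMSource result = new DOMSource(doc);
        result.setSystemId(props.getSystemId());
        return result;
    }
}
```

Keeping the conversion in a plain static method makes it easy to test in isolation before
wiring it into a resolver.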