JAXP Source Types
This section is relevant to the Java platform only.
When a user application invokes Saxon via the Java API, it supplies the source document as an instance of the JAXP Source interface (javax.xml.transform.Source). This is true whether it is invoking an XSLT transformation, an XQuery query, or a free-standing XPath expression. Source is essentially a marker interface, so the Source that is supplied must be a kind of Source that Saxon recognizes.
Saxon recognizes the three kinds of Source defined in JAXP: a StreamSource, a SAXSource, and a DOMSource.
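As an illustration, the following minimal sketch builds each of the three standard JAXP Source kinds from the same XML string and passes them through a JAXP identity transformation. It uses only JDK classes; the class name SourceKinds and its helper methods are hypothetical, and the sketch assumes that TransformerFactory.newInstance() returns whichever JAXP implementation is found on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class: shows the three standard JAXP Source kinds side by side.
class SourceKinds {
    static final String XML = "<greeting>hello</greeting>";

    // A StreamSource wraps a byte or character stream of lexical XML.
    static Source streamSource() {
        return new StreamSource(new StringReader(XML));
    }

    // A SAXSource wraps an InputSource (and optionally an XMLReader) for push parsing.
    static Source saxSource() {
        return new SAXSource(new InputSource(new StringReader(XML)));
    }

    // A DOMSource wraps a DOM tree that has already been built in memory.
    static Source domSource() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return new DOMSource(doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not build DOM tree", e);
        }
    }

    // Runs an identity transformation over any Source and returns the serialized result.
    static String copy(Source source) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(source, new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Identity transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(copy(streamSource()));
        System.out.println(copy(saxSource()));
        System.out.println(copy(domSource()));
    }
}
```

The copy() method is written against the Source interface only, which is the point of the example: the caller chooses the kind of Source, and the processor decides how to read it.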
Saxon also accepts input from an XMLStreamReader (javax.xml.stream.XMLStreamReader), that is, a StAX pull parser as defined in JSR 173. To use one, create an instance of net.sf.saxon.pull.StaxBridge, supply the XMLStreamReader using the setXMLStreamReader() method, and wrap the StaxBridge object in an instance of net.sf.saxon.pull.PullSource, which implements the JAXP Source interface and can be used in any Saxon method that expects a Source. Saxon has been validated with two StAX parsers: the Zephyr parser from Sun (which is supplied as standard with JDK 1.6), and the open-source Woodstox parser from Tatu Saloranta. In my experience, Woodstox is the more reliable of the two. However, there is no immediate benefit in using a pull parser rather than a push parser to supply Saxon input; the main use case for an XMLStreamReader is when the data comes from some source other than the parsing of lexical XML.
Nodes in Saxon's implementation of the XPath data model are represented by the interface NodeInfo. A NodeInfo is itself a Source, which means that any method in the API that requires a source object will accept any implementation of NodeInfo. As discussed in the next section, implementations of NodeInfo are available to wrap DOM, DOM4J, JDOM, or XOM nodes, and in all cases these wrapper objects can be used wherever a Source is required.
Saxon also provides a class net.sf.saxon.lib.AugmentedSource, which implements the Source interface. This class encapsulates one of the standard Source objects and allows additional processing options to be specified. These options include whitespace handling, schema and DTD validation, XInclude processing, error handling, choice of XML parser, and choice of Saxon tree model.
Saxon allows additional Source types to be supported by registering a SourceResolver with the Configuration object. The task of a SourceResolver is to convert a Source that Saxon does not recognize into a Source that it does recognize. For example, this may be done by building the document tree in memory and returning the NodeInfo object representing the root of the tree.
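The conversion step can be sketched without any Saxon classes. The example below is an illustration only: StringSource and StringSourceConverter are hypothetical names, the code does not implement Saxon's SourceResolver interface, and a DOMSource stands in for the NodeInfo that a real resolver might return. Returning null for an unrecognized Source is a convention chosen for this sketch.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical application-specific Source that an XSLT processor would not recognize.
class StringSource implements Source {
    private final String xml;
    private String systemId;

    StringSource(String xml) {
        this.xml = xml;
    }

    String getXml() {
        return xml;
    }

    @Override
    public void setSystemId(String systemId) {
        this.systemId = systemId;
    }

    @Override
    public String getSystemId() {
        return systemId;
    }
}

// Hypothetical converter: the kind of logic a source resolver would contain.
class StringSourceConverter {

    // Converts a StringSource into a DOMSource by building the tree in memory.
    // Returns null for any other kind of Source (a convention of this sketch).
    static Source convert(Source source) {
        if (!(source instanceof StringSource)) {
            return null;
        }
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(((StringSource) source).getXml())));
            DOMSource result = new DOMSource(doc);
            // Carry the system ID across so relative URIs can still be resolved.
            result.setSystemId(source.getSystemId());
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse StringSource content", e);
        }
    }
}
```

In a real deployment, logic of this kind would live inside a SourceResolver implementation registered with the Configuration; the exact method signature should be taken from the Saxon Javadoc for the release in use.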