Saxon.Api
Class DocumentBuilder
public class DocumentBuilder
The DocumentBuilder class enables XDM documents to be built from various sources.
The class is always instantiated using the NewDocumentBuilder method
on the Processor object.
| Property Summary | |
|---|---|
| Uri | BaseUri The base URI of a document loaded using this DocumentBuilder. |
| XQueryExecutable | DocumentProjectionQuery Set a compiled query to be used for implementing document projection. |
| bool | DtdValidation Determines whether DTD validation is applied to documents loaded using this DocumentBuilder. |
| bool | LineNumbering Determines whether line numbering is enabled for documents loaded using this DocumentBuilder. |
| SchemaValidationMode | SchemaValidationMode Determines whether schema validation is applied to documents loaded using this DocumentBuilder, and if so, whether it is strict or lax. |
| SchemaValidator | SchemaValidator Property to set and get the schemaValidator to be used. This determines whether schema validation is applied to an input document and whether type annotations in a supplied document are retained. If no schemaValidator is supplied, then schema validation does not take place. |
| QName | TopLevelElementName The required name of the top level element in a document instance being validated against a schema. |
| TreeModel | TreeModel The Tree Model implementation to be used for the constructed document. By default the TinyTree is used. |
| WhitespacePolicy | WhitespacePolicy Determines the whitespace stripping policy applied when loading a document using this DocumentBuilder. |
| ResourceResolver | XmlDocumentResolver An XmlDocumentResolver used to resolve any URI passed to the Build method. |
| XmlResolver | XmlResolver A System.Xml.XmlResolver, which will be used to resolve references to external entities within XML documents being loaded (including any external DTD) when the DocumentBuilder allocates an XmlReader. |
| Method Summary | |
|---|---|
| XdmNode | Build (Uri uri) Load an XML document, retrieving it via a URI. |
| XdmNode | Build (Stream input) Load an XML document supplied as raw (lexical) XML on a Stream. |
| XdmNode | Build (TextReader input) Load an XML document supplied using a TextReader. |
| XdmNode | Build (XmlReader reader) Load an XML document, delivered using an XmlReader. |
| XdmNode | Build (XmlNode source) Load an XML DOM document, supplied as an XmlNode, into a Saxon XdmNode. |
| XdmNode | Wrap (XmlDocument doc) Wrap an XML DOM document, supplied as an XmlDocument, as a Saxon XdmNode. |
| XdmNode | Wrap (XdmNode docWrapper, XmlNode node) Wrap an XML DOM node (other than a document node), as a Saxon XdmNode. |
Property Detail
BaseUri
The base URI of a document loaded using this DocumentBuilder.
This is used for resolving any relative URIs appearing
within the document, for example in references to DTDs and external entities.
This information is required when the document is loaded from a source that does not
provide an intrinsic URI, notably when loading from a Stream or a TextReader.
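As an illustration of the case described above, the following is a minimal sketch of loading a document from a Stream with the base URI supplied explicitly. It assumes the Saxon.Api assembly is referenced and that a Processor can be created with its parameterless constructor; the URI and the XML content are hypothetical placeholders.

```csharp
using System;
using System.IO;
using System.Text;
using Saxon.Api;

class BaseUriExample
{
    static void Main()
    {
        // The DocumentBuilder is obtained from the Processor, as stated in the class description.
        Processor processor = new Processor();
        DocumentBuilder builder = processor.NewDocumentBuilder();

        // A Stream has no intrinsic URI, so the base URI is set before calling Build.
        // This URI is a hypothetical placeholder.
        builder.BaseUri = new Uri("http://example.com/docs/input.xml");

        // Hypothetical in-memory XML source.
        byte[] xml = Encoding.UTF8.GetBytes("<doc><item>one</item></doc>");
        using (Stream input = new MemoryStream(xml))
        {
            // Closing the stream remains the caller's responsibility (handled here by 'using').
            XdmNode doc = builder.Build(input);
            Console.WriteLine(doc.BaseUri);
        }
    }
}
```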
DocumentProjectionQuery
Set a compiled query to be used for implementing document projection.
The effect of using this option is that the tree constructed by the
DocumentBuilder contains only those parts of the source document that are
needed to answer this query. Running this query against the projected document
should give the same results as against the raw document, but the projected
document typically occupies significantly less memory. It is permissible to run
other queries against the projected document, but unless they are carefully chosen,
they will give the wrong answer, because the document being used is different from
the original.
The query should be written to use the projected document as its initial context item.
For example, if the query is //ITEM[COLOR='blue'], then only ITEM
elements and their COLOR children will be retained in the projected document.
This facility is only available in Saxon-EE; if the facility is not available, setting this property has no effect.
DtdValidation
Determines whether DTD validation is applied to documents loaded using this
DocumentBuilder.
By default, no DTD validation takes place.
LineNumbering
Determines whether line numbering is enabled for documents loaded using this
DocumentBuilder.
By default, line numbering is disabled.
Line numbering is not available for all kinds of source: in particular,
it is not available when loading from an existing XmlDocument.
The resulting line numbers are accessible to applications using the
extension function saxon:line-number() applied to a node.
Line numbers are maintained only for element nodes; the line number returned for any other node will be that of the most recent element.
SchemaValidationMode
Determines whether schema validation is applied to documents loaded using this
DocumentBuilder, and if so, whether it is strict or lax. If schema validation
is requested and the document is not valid, then the Build method will fail
with an exception.
By default, no schema validation takes place.
This option requires Saxon Enterprise Edition (Saxon-EE).
SchemaValidator
Property to set and get the schemaValidator to be used. This determines whether schema validation is applied to an input document and whether type annotations in a supplied document are retained. If no schemaValidator is supplied, then schema validation does not take place.
If validation is requested using this mechanism, and the document is not valid, then no exception is raised; it is for the application to handle any invalidity reports from the SchemaValidator.
TopLevelElementName
The required name of the top level element in a document instance being validated against a schema.
If this property is set, and if schema validation is requested, then validation will fail unless the outermost element of the document has the required name.
This option requires the schema-aware version of the Saxon product (Saxon-EE).
TreeModel
The Tree Model implementation to be used for the constructed document. By default
the TinyTree is used. The main reason for using the LinkedTree alternative is if
updating is required (the TinyTree is not updateable).
WhitespacePolicy
Determines the whitespace stripping policy applied when loading a document
using this DocumentBuilder.
By default, whitespace text nodes appearing in element-only content are stripped, and all other whitespace text nodes are retained.
If DTD or schema validation is applied, the only permitted setting
is WhitespacePolicy#IGNORABLE. Any other value results
in an exception from the Build() method.
XmlDocumentResolver
An XmlDocumentResolver used to resolve any URI passed to the Build method.
If a resolver is supplied, it must take total responsibility for resolving all URIs;
there is no fallback if it returns null or raises an error. If the resolver is to
handle some URIs but delegate the handling of others, this can be achieved by creating a
CommonResourceResolver and chaining a DirectResourceResolver.
XmlResolver
A System.Xml.XmlResolver, which will be used to resolve references to external entities
within XML documents being loaded (including any external DTD) when the DocumentBuilder
allocates an XmlReader.
If no XmlResolver is supplied, the ResourceResolver associated with the
Saxon configuration is used (Configuration.getResourceResolver()).
In Saxon releases prior to 11.1, the supplied XmlResolver was also used to
resolve any relative URI passed to the DocumentBuilder.Build() method.
Method Detail
Build
Load an XML document supplied as raw (lexical) XML on a Stream.
The document is parsed using the Microsoft System.Xml parser.
Before calling this method, the BaseUri property should be set to identify the
base URI of this document, used for resolving any relative URIs contained within it;
if it has not been set, the current working directory is assumed.
Note that the Microsoft System.Xml parser does not report whether attributes are
defined in the DTD as being of type ID and IDREF. This is true whether or not
DTD-based validation is enabled. This means that such attributes are not accessible
to the id() and idref() functions.
Parameters:
input - The Stream containing the XML source to be parsed. Closing this stream
on completion is the responsibility of the caller.
Returns:
XdmNode, the document node at the root of the tree of the resulting
in-memory document.
Build
Load an XML document supplied using a TextReader.
The document is parsed using the Microsoft System.Xml parser.
Before calling this method, the BaseUri property should be set to identify the
base URI of this document, used for resolving any relative URIs contained within it;
if it has not been set, the current working directory is assumed.
Note that the Microsoft System.Xml parser does not report whether attributes are
defined in the DTD as being of type ID and IDREF. This is true whether or not
DTD-based validation is enabled. This means that such attributes are not accessible
to the id() and idref() functions.
Parameters:
input - The TextReader containing the XML source to be parsed
Returns:
XdmNode, the document node at the root of the tree of the resulting
in-memory document.
Build
Load an XML document, delivered using an XmlReader.
The XmlReader is responsible for parsing the document; this method builds a tree
representation of the document (in an internal Saxon format) and returns its document
node.
The XmlReader is not required to perform validation but it must expand any entity references.
Saxon uses the properties of the XmlReader as supplied.
Use of a plain XmlTextReader is discouraged, because it does not expand entity
references. This should only be used if you know in advance that the document will
contain no entity references (or perhaps if your query or stylesheet is not interested
in the content of text and attribute nodes). Instead, with .NET 1.1 use an
XmlValidatingReader (with ValidationType set to None). The constructor for
XmlValidatingReader is obsolete in .NET 2.0, but the same effect can be achieved by
using the Create method of XmlReader with appropriate XmlReaderSettings.
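One way to obtain such a reader is sketched below. This is an illustration under stated assumptions rather than a prescribed configuration: it assumes .NET Framework 4.0 or later (where XmlReaderSettings exposes the DtdProcessing property), a referenced Saxon.Api assembly, and a hypothetical inline document and base URI. Readers produced by XmlReader.Create expand entity references once DTD processing is permitted.

```csharp
using System;
using System.IO;
using System.Xml;
using Saxon.Api;

class XmlReaderExample
{
    static void Main()
    {
        // Hypothetical input containing an internal entity declaration.
        string xml = "<!DOCTYPE doc [<!ENTITY greeting \"hello\">]><doc>&greeting;</doc>";

        XmlReaderSettings settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,  // read the DTD so that entities can be expanded
            ValidationType = ValidationType.None  // the reader itself performs no validation
        };

        Processor processor = new Processor();
        DocumentBuilder builder = processor.NewDocumentBuilder();

        // A reader over a string has no base URI of its own, so one is set
        // on the DocumentBuilder. The URI is a hypothetical placeholder.
        builder.BaseUri = new Uri("http://example.com/docs/input.xml");

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Saxon builds its tree from the events delivered by the reader.
            XdmNode doc = builder.Build(reader);
            Console.WriteLine(doc.OuterXml);
        }
    }
}
```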
The base URI of the resulting document is taken from the BaseURI property
of the XmlReader if this is non-null and non-empty; otherwise it is taken from the
BaseUri property of this DocumentBuilder.
Conformance with the W3C specifications requires that the Normalization property
of an XmlTextReader should be set to true. However, Saxon does not insist
on this.
If the XmlReader performs schema validation, Saxon will ignore any resulting type
information. Type information can only be obtained by using Saxon's own schema validator,
which will be run if the SchemaValidationMode property is set to Strict or Lax.
Note that the Microsoft System.Xml parser does not report whether attributes are
defined in the DTD as being of type ID and IDREF. This is true whether or not
DTD-based validation is enabled. This means that such attributes are not accessible
to the id() and idref() functions.
Note that setting the XmlResolver property of the DocumentBuilder
has no effect when this method is used; if an XmlResolver is required, it must
be set on the XmlReader itself.
Parameters:
reader - The XmlReader that supplies the parsed XML source
Returns:
XdmNode, the document node at the root of the tree of the resulting
in-memory document.
Build
Load an XML DOM document, supplied as an XmlNode, into a Saxon XdmNode.
The returned document will contain only the subtree rooted at the supplied node.
This method copies the DOM tree to create a Saxon tree. See the Wrap method for
an alternative that creates a wrapper around the DOM tree, allowing it to be modified
in situ.
Parameters:
source - The DOM Node to be copied to form a Saxon tree
Returns:
XdmNode, the document node at the root of the tree of the resulting
in-memory document.
Wrap
Wrap an XML DOM document, supplied as an XmlDocument, as a Saxon XdmNode.
This method must be applied at the level of the Document Node. Unlike the
Build method, the original DOM is not copied. This saves memory and
time, but it also means that it is not possible to perform operations such as
whitespace stripping and schema validation.
Parameters:
doc - The DOM document node to be wrapped
Returns:
XdmNode, the Saxon document node at the root of the tree of the resulting
in-memory document.
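The difference between copying and wrapping a DOM can be sketched as follows. This is a minimal illustration, assuming a referenced Saxon.Api assembly and a parameterless Processor constructor; the XML content and base URI are hypothetical, and setting BaseUri here is a precaution rather than a documented requirement.

```csharp
using System;
using System.Xml;
using Saxon.Api;

class WrapExample
{
    static void Main()
    {
        // Hypothetical DOM document built in memory.
        XmlDocument dom = new XmlDocument();
        dom.LoadXml("<doc><item>one</item></doc>");

        Processor processor = new Processor();
        DocumentBuilder builder = processor.NewDocumentBuilder();
        builder.BaseUri = new Uri("http://example.com/docs/input.xml"); // hypothetical placeholder

        // Build copies the DOM into a Saxon tree, so builder options such as
        // whitespace stripping can be applied to the copy.
        XdmNode copied = builder.Build(dom);

        // Wrap creates a view over the existing DOM without copying it.
        XdmNode wrapped = builder.Wrap(dom);

        Console.WriteLine(copied.OuterXml);
        Console.WriteLine(wrapped.OuterXml);
    }
}
```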
Wrap
Wrap an XML DOM node (other than a document node), as a Saxon XdmNode.
Parameters:
docWrapper - The wrapper for the containing DOM document node
node - The DOM node containing the node to be wrapped
Returns:
XdmNode, wrapping the supplied DOM node
Build
Load an XML document, retrieving it via a URI.
Note that the type Uri requires an absolute URI.
The URI is dereferenced using the registered XmlResolver.
This method takes no account of any fragment part in the URI.
The role passed to the GetEntity method of the XmlResolver is "application/xml",
and the required return type is System.IO.Stream.
The document located via the URI is parsed using the System.Xml parser.
Note that the Microsoft System.Xml parser does not report whether attributes are
defined in the DTD as being of type ID and IDREF. This is true whether or not
DTD-based validation is enabled. This means that such attributes are not accessible
to the id() and idref() functions.
Parameters:
uri - The URI identifying the location where the document can be found. This will
also be used as the base URI of the document (regardless of the setting of the
BaseUri property).
Returns:
XdmNode, the document node at the root of the tree of the resulting in-memory document.