This document is also available in these non-normative formats: XML and Revision Markup.
Copyright © 2020 Saxonica Ltd, published by the EXPath Community Group under the W3C Community Final Specification Agreement (FSA). A human-readable summary is available.
This specification was published by the EXPath Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Final Specification Agreement (FSA) other conditions apply. Learn more about W3C Community and Business Groups.
This proposal provides an API for XPath 3.1 to construct XDM node trees.
The module homepage, with more information, is on the EXPath website at http://expath.org/modules/binary/.
1 Status of this document
2 Introduction
2.1 Namespace conventions
2.2 Error management
2.3 Relationship to XSLT and XQuery
2.4 Node names
2.5 Node identity
2.6 Schema awareness
2.7 Test suite
3 Functions
3.1 build:attribute
3.2 build:comment
3.3 build:document
3.4 build:element
3.5 build:namespace
3.6 build:processing-instruction
3.7 build:text
The module defined by this document defines several functions, all contained in the namespace http://expath.org/ns/builder. In this document, the build prefix, when used, is bound to this namespace URI.
Error codes are defined in the same namespace (http://expath.org/ns/build), and in this document are displayed with the same prefix, build.
TODO: should we use XSLT error codes, XQuery error codes, or our own error codes?
Both XSLT and XQuery provide native syntax for constructing XDM node trees.
There are a number of reasons for providing a function library to achieve the same thing:
- Given a function replace-with(string, regex, function) that applies the supplied function to the matching substrings, the call replace-with($in, "\[.*?\]", build:element("cite", ?)) turns See [Kay, 1970] into See <cite>Kay, 1970</cite>.
- An options parameter allows for setting of properties that are not easy to control using XSLT and XQuery language facilities, for example, the base URI of a document node.
The semantics of operations for node construction in XSLT and XQuery are very similar. There are, however, a few minor differences, and in these cases the XSLT rules have been chosen in preference. The differences include:
- XSLT and XQuery define different error codes for the same conditions. This specification currently uses the XSLT codes, but will change to use its own error codes.
In function arguments where a node name is required, the required type is denoted as union(xs:string, xs:QName). Such a type can be defined in a schema, but in the absence of a schema, it cannot be directly expressed as a SequenceType in XPath 3.1 syntax. Implementations may therefore use the nearest supertype, namely xs:anyAtomicType, with a dynamic check that the supplied value conforms to the required constraints.
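The dynamic check described above can be sketched outside XPath. The following Python fragment is illustrative only and is not part of this specification: the QName class and the check_node_name function are hypothetical names, and Python's str stands in for xs:string.

```python
from typing import NamedTuple, Union


class QName(NamedTuple):
    """Hypothetical stand-in for an xs:QName value."""
    prefix: str
    uri: str
    local: str


def check_node_name(value: object) -> Union[str, QName]:
    """Accept only values that belong to union(xs:string, xs:QName).

    The declared parameter type is effectively "any atomic value", so the
    membership test has to happen at run time.
    """
    # Only str and QName instances pass; a plain tuple or a number does not.
    if isinstance(value, (str, QName)):
        return value
    raise TypeError(
        f"node name must be a string or QName, got {type(value).__name__}"
    )
```

A call such as check_node_name(42) raises TypeError in this sketch; the error code a real implementation would raise is not defined here.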
The functions are defined to be non-deterministic with respect to node identity (as defined in F+O §1.7.4). This means that it is implementation-dependent whether two calls with the same arguments return the same node or different nodes. This approach gives implementations maximum freedom for optimization, for example it allows function calls to be extracted from a loop.
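As an informal illustration of this freedom, the Python sketch below models two hypothetical implementations of a text-node builder, one that always allocates a new node and one that caches results. TextNode, build_text_fresh and build_text_cached are invented names; under the rule above, both behaviours would be conformant.

```python
from functools import lru_cache


class TextNode:
    """Hypothetical node object; Python object identity models node identity."""

    def __init__(self, content: str) -> None:
        self.content = content


def build_text_fresh(content: str) -> TextNode:
    # Always allocates a new node, so two calls never share identity.
    return TextNode(content)


@lru_cache(maxsize=None)
def build_text_cached(content: str) -> TextNode:
    # Reuses the node created by an earlier call with the same argument.
    return TextNode(content)
```

Callers therefore cannot rely on either outcome when comparing the results of two calls by identity.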
Options are available to request validation of constructed nodes against a schema, in the same way as for instructions such as xsl:element in XSLT. The semantics of the various operations are outlined briefly in this specification, but the intent is that the rules should be the same as XSLT (which defines them in much more detail) unless otherwise specified.
Processors that do not support schema-awareness should raise an error if options dependent on a schema are selected.
A suite of test-cases for all the functions defined in this module, in [QT3] format, will be created.
This EXPath module defines the following functions:
Returns an attribute node, with given name and content.
build:attribute($name as union(xs:QName, xs:string), $content as xs:string?) as attribute()
build:attribute($name as union(xs:QName, xs:string), $content as xs:string?, $options as map(*)) as attribute()
The effect of the two-argument form of this function is the same as calling the three-argument form with an empty map as the value of the $options argument.
The name of the attribute may be supplied in a number of ways:
- As an xs:QName, in which case the attribute's name will have the prefix, namespace URI, and local name supplied in this QName.
- As an xs:string that conforms to the rules for a valid xs:NCName. The attribute's name will have this local part, with no namespace URI or prefix.
- As an xs:string in the format Q{uri}local. The attribute's local name and namespace URI will be taken from this value, and the name will have a system-allocated prefix.
- As an xs:string in the format prefix:local. In this case the prefix must be declared in the static context of the function call, and the attribute's name will use this prefix and local name, together with the namespace URI associated with this prefix in the static context.
If the attribute name has a URI but no prefix, then the system will allocate an arbitrary prefix. If the attribute name is given as an xs:QName with a prefix but no URI, then the prefix will be ignored.
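The name forms listed above can be summarised in a small Python sketch. It is illustrative only and not part of this specification: QName, resolve_attribute_name and the static_ns mapping (standing in for the statically known namespaces) are hypothetical, the NCName pattern is a simplified ASCII-only approximation, and "ns0" is an arbitrary placeholder for the system-allocated prefix.

```python
import re
from typing import Mapping, NamedTuple, Union

# Simplified NCName pattern (ASCII only); the real XML production allows
# many more characters.
NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"


class QName(NamedTuple):
    """Hypothetical stand-in for an xs:QName value."""
    prefix: str
    uri: str
    local: str


def resolve_attribute_name(name: Union[str, QName],
                           static_ns: Mapping[str, str]) -> QName:
    """Map the accepted name forms onto (prefix, uri, local)."""
    if isinstance(name, QName):
        resolved = name
    elif re.fullmatch(NCNAME, name):
        # Plain NCName: no namespace URI and no prefix.
        resolved = QName("", "", name)
    elif m := re.fullmatch(r"Q\{([^{}]*)\}(" + NCNAME + r")", name):
        # Q{uri}local: URI and local name come from the string itself.
        resolved = QName("", m.group(1), m.group(2))
    elif m := re.fullmatch(r"(" + NCNAME + r"):(" + NCNAME + r")", name):
        # prefix:local: the prefix must be declared in the static context.
        prefix, local = m.groups()
        if prefix not in static_ns:
            raise ValueError(f"undeclared prefix: {prefix}")
        resolved = QName(prefix, static_ns[prefix], local)
    else:
        raise ValueError(f"not a valid attribute name: {name!r}")

    if not resolved.uri:
        # A prefix without a URI is ignored.
        return QName("", "", resolved.local)
    if not resolved.prefix:
        # A URI without a prefix gets an arbitrary allocated prefix.
        return resolved._replace(prefix="ns0")
    return resolved
```

For instance, with static_ns = {"xlink": "http://www.w3.org/1999/xlink"}, the string "xlink:href" resolves to the prefix xlink, that URI, and the local name href, while "Q{http://example.com/ns}id" receives the placeholder prefix.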
The content of the attribute node (that is, the string value of the node) is formed by evaluating the second argument. If this is an empty sequence, the string value of the attribute will be a zero-length string.
The type annotation of the new attribute node will be xs:untypedAtomic.
If the attribute has the name xml:id then xml:id processing is performed, and the attribute will have the is-id property.
The function imposes rules preventing the misuse of reserved names such as "xml" and "xmlns", in the same way as the xsl:attribute instruction in XSLT, or the attribute constructor expression in XQuery. The error codes used are those defined in XSLT 3.0.
The entries that may appear in the $options map are as follows. The option parameter conventions apply.
Key | Meaning |
---|---|
type | Causes the value to be validated against a named simple type. The value must be the name of a type in the in-scope schema definitions. The supplied string is validated against this type and a dynamic error occurs if it is not valid. The returned attribute node has this type annotation. Validation may also affect the string value of the attribute, for example by collapsing whitespace. |
If the function is called twice with the same arguments, it is unpredictable whether it returns the same attribute node or different attribute nodes from the two invocations.
The XSLT/XQuery rules for constructing simple content do not apply. The value must be supplied as a string, or as a value that is converted to a string by virtue of the function conversion rules.
These rules are similar to the XQuery rules for the element {...} expression. However, there are some differences. Most notably, the XSLT rules allow multiple attribute nodes with the same name to appear in the content sequence (the last one wins). Furthermore, the error codes used for invalid conditions (such as the presence of maps or functions or conflicting namespace nodes in the content) are those given in the XSLT 3.0 specification.
Since the declared type of the first argument is namespace-sensitive, error XPTY0117 will be raised if an untyped atomic value (or an untyped node) is supplied as the actual argument. Conversion to a string should therefore be done explicitly. For example, to convert the element <prop name="x" value="y"/> to the attribute node x="y", use
build:attribute(string(@name), string(@value))
Returns a comment node, with given content.
build:comment($content as xs:string?) as comment()
The content of the comment (that is, the string value of the node) is formed by evaluating the first argument. If this is an empty sequence, the string value will be a zero-length string.
If the function is called twice with the same arguments, it is unpredictable whether it returns the same node or different nodes from the two invocations.
If the content contains the substring "--", or if it ends in "-", this is handled in the same way as the xsl:comment instruction in XSLT: the value is adjusted by inserting spaces.
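A minimal Python sketch of this adjustment follows. It assumes the xsl:comment rule is read as "insert a space after any hyphen that is followed by another hyphen or that ends the comment"; adjust_comment_content is a hypothetical helper name, not a function of this module, and None stands in for the empty sequence.

```python
import re
from typing import Optional


def adjust_comment_content(content: Optional[str]) -> str:
    """Make a string safe to use as the content of an XML comment."""
    if content is None:
        # An empty sequence yields a zero-length string.
        return ""
    # Insert a space after every hyphen that is followed by another hyphen
    # or that ends the string, so that "--" never appears and the text
    # never ends in "-".
    return re.sub(r"-(?=-|\Z)", "- ", content)
```

Under this reading, "a--b" becomes "a- -b" and "end-" becomes "end- ", while content without adjacent or trailing hyphens is returned unchanged.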
The XSLT/XQuery rules for constructing simple content do not apply. The value must be supplied as a string, or as a value that is converted to a string by virtue of the function conversion rules.
Returns a document node, with given content.
build:document($content as item()*) as document-node()
build:document($content as item()*, $options as map(*)) as document-node()
The effect of calling the single-argument function is the same as the effect of calling the two-argument function supplying an empty map as the second argument.
The content of the document node (that is, the children of the node) is formed by evaluating the first argument, and applying the rules given in the XSLT 3.0 specification section 5.7.1, Constructing Complex Content.
The base URI of the new document node is taken from the static base URI of the calling expression.
If the function is called twice with the same arguments, it is unpredictable whether it returns the same document node or different document nodes from the two invocations.
It is not required that the resulting document should satisfy the XML rules for a well-formed document; specifically, the node may contain multiple element and text nodes among its children.
The entries that may appear in the $options map are as follows. The option parameter conventions apply.
Key | Value | Meaning |
---|---|---|
base-uri | | Determines the base URI of the returned document node. This should be an absolute URI. |
type | | Causes the content of the outermost element to be validated against a named schema type. The validation and type options are mutually exclusive. The value must be the name of a type in the in-scope schema definitions. The supplied content is validated against this type and a dynamic error occurs if it is not valid. |
validation | | Causes the content of the outermost element to be validated against a schema. The validation and type options are mutually exclusive. |
validation | strict | There must be an element declaration with matching name in the in-scope schema definitions. The element is validated against this declaration. |
validation | lax | There may be an element declaration with matching name in the in-scope schema definitions: if there is, then the element is validated against this declaration. |
validation | skip | The content is not validated. |
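The constraints in this table can be expressed as a simple pre-check, sketched here in Python. This is illustrative only: check_document_options is a hypothetical helper, ValueError is a placeholder because this specification does not yet define its own error codes, and treating an unlisted validation value as an error is an assumption.

```python
from typing import Any, Mapping

# Validation values as listed in the options table for build:document.
DOCUMENT_VALIDATION_VALUES = {"strict", "lax", "skip"}


def check_document_options(options: Mapping[str, Any]) -> None:
    """Raise ValueError for option combinations that the table rules out."""
    # The validation and type options are mutually exclusive.
    if "type" in options and "validation" in options:
        raise ValueError("'type' and 'validation' are mutually exclusive")
    validation = options.get("validation")
    # Assumption: a validation value outside the listed set is rejected.
    if validation is not None and validation not in DOCUMENT_VALIDATION_VALUES:
        raise ValueError(f"unknown validation value: {validation!r}")
```

An empty map passes the check, which matches the single-argument form of the function.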
These rules are almost identical to the XQuery rules for the document {...} expression. However, the error codes used for invalid conditions (such as the presence of attributes, namespace nodes, maps or functions in the content) are those given in the XSLT 3.0 specification.
Returns an element node, with given name and content.
build:element($name as union(xs:QName, xs:string), $content as item()*) as element(*)
build:element($name as union(xs:QName, xs:string), $content as item()*, $options as map(*)) as element(*)
The effect of the two-argument form of this function is the same as calling the three-argument form with an empty map as the value of the $options argument.
The name of the element node is determined by the first argument. This may be supplied as an instance of either xs:string or xs:QName.
The name of the element may be supplied in a number of ways:
- As an xs:QName, in which case the element's name will have the prefix, namespace URI, and local name supplied in this QName.
- As an xs:string that conforms to the rules for a valid xs:NCName. The element's name will have this local part, with no namespace URI or prefix.
- As an xs:string in the format Q{uri}local. The element's local name and namespace URI will be taken from this value, and the name will have no prefix (that is, the URI will be the default namespace).
- As an xs:string in the format prefix:local. In this case the prefix must be declared in the static context of the function call, and the element's name will use this prefix and local name, together with the namespace URI associated with this prefix in the static context.
The content of the element node (that is, the children of the node) is formed by evaluating the second argument, and applying the rules given in the XSLT 3.0 specification section 5.7.1, Constructing Complex Content.
The base URI of the new element node is taken from the static base URI of the calling expression.
The type annotation of the new element node will be xs:untyped.
Namespace fixup is applied to the new element as described in the XSLT 3.0 specification to ensure that all namespaces used in element and attribute names are properly declared.
The entries that may appear in the $options map are as follows. The option parameter conventions apply.
Key | Value | Meaning |
---|---|---|
base-uri | | Determines the base URI of the returned element node. This should be an absolute URI. |
is-id | | Determines whether the element has the is-id property. |
is-idrefs | | Determines whether the element has the is-idrefs property. |
inherit-namespaces | | Determines whether the namespaces of the newly constructed element are propagated to the copies of its descendants. The semantics correspond to the inherit-namespaces attribute of the xsl:element instruction. |
type | | Causes the content to be validated against a named schema type. The validation and type options are mutually exclusive. The value must be the name of a type in the in-scope schema definitions. The supplied content is validated against this type and a dynamic error occurs if it is not valid. The returned element node has this type annotation. |
validation | | Causes the content to be validated against a schema. The validation and type options are mutually exclusive. |
validation | strict | There must be an element declaration with matching name in the in-scope schema definitions. The element is validated against this declaration. |
validation | lax | There may be an element declaration with matching name in the in-scope schema definitions: if there is, then the element is validated against this declaration. |
validation | preserve | The content of the new element is not validated, but any descendant nodes that are copied retain their type annotations. |
validation | strip | The content is not validated, and any descendant nodes that are copied have their type annotation changed to xs:untyped. |
If the function is called twice with the same arguments, it is unpredictable whether it returns the same element node or different element nodes from the two invocations.
These rules are similar to the XQuery rules for the element {...} expression. However, there are some differences. Most notably, the XSLT rules allow multiple attribute nodes with the same name to appear in the content sequence (the last one wins). Furthermore, the error codes used for invalid conditions (such as the presence of maps or functions or conflicting namespace nodes in the content) are those given in the XSLT 3.0 specification.
Any attribute nodes in the content sequence become attributes of the constructed element; they are not atomized to form text nodes.
Since the declared type of the first argument is namespace-sensitive, error XPTY0117 will be raised if an untyped atomic value (or an untyped node) is supplied as the actual argument. Conversion to a string should therefore be done explicitly. For example, to convert <prop name="x" value="y"/> to <x>y</x>, use
build:element(string(@name), string(@value))
Supplying a simple NCName as the first argument means the element will be in no namespace. The default namespace for elements is NOT used.
Creates a namespace node.
build:namespace($prefix as xs:string, $uri as xs:string) as namespace-node()
This function creates a new parentless namespace node. The first argument gives the name of the namespace node (that is, the namespace prefix), while the second gives the namespace URI. The prefix may be "" to create a default namespace; otherwise it must be a valid NCName. The URI must not be the empty string.
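The argument constraints just stated can be sketched in Python as follows. This is not part of the specification: NamespaceNode and build_namespace are hypothetical Python names, the NCName pattern is a simplified ASCII-only approximation, and ValueError is a placeholder for whatever error code is eventually defined.

```python
import re
from typing import NamedTuple

# Simplified NCName pattern (ASCII only); the real XML production allows
# many more characters.
NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"


class NamespaceNode(NamedTuple):
    """Hypothetical model of a parentless namespace node."""
    prefix: str
    uri: str


def build_namespace(prefix: str, uri: str) -> NamespaceNode:
    """Check the prefix and URI constraints, then return the node model."""
    # The prefix is either "" (default namespace) or a valid NCName.
    if prefix != "" and not re.fullmatch(NCNAME, prefix):
        raise ValueError(f"prefix is not a valid NCName: {prefix!r}")
    # The URI must not be the empty string.
    if uri == "":
        raise ValueError("namespace URI must not be the empty string")
    return NamespaceNode(prefix, uri)
```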
Returns a new processing instruction node, with given name and content.
build:processing-instruction($name as xs:string, $content as xs:string?) as processing-instruction()
This function constructs a new parentless processing instruction node.
The name of the processing instruction is determined by the first argument. This must be an instance of xs:string that conforms to the rules for an xs:NCName; it must not match the name "xml" in a case-blind comparison.
The content of the processing instruction (that is, the string value of the node) is formed by evaluating the second argument. If this is an empty sequence, the string value will be a zero-length string.
Any substring of the string value that matches ?> is replaced by ? > (that is, a space is inserted).
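The name and content rules for processing instructions can be sketched in Python as shown below. The helper processing_instruction_parts is a hypothetical name, the NCName pattern is a simplified ASCII-only approximation, None stands in for the empty sequence, and ValueError is a placeholder for an error code that this specification does not yet define.

```python
import re
from typing import Optional, Tuple

# Simplified NCName pattern (ASCII only); the real XML production allows
# many more characters.
NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"


def processing_instruction_parts(name: str,
                                 content: Optional[str]) -> Tuple[str, str]:
    """Return the (name, string value) pair of the new node."""
    # The name must be an NCName and must not be "xml" in any letter case.
    if not re.fullmatch(NCNAME, name) or name.lower() == "xml":
        raise ValueError(f"invalid processing instruction name: {name!r}")
    # An empty sequence yields a zero-length string value.
    text = "" if content is None else content
    # Break up every "?>" so the content cannot close the instruction early.
    return name, text.replace("?>", "? >")
```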
If the function is called twice with the same arguments, it is unpredictable whether it returns the same node or different nodes from the two invocations.
The XSLT/XQuery rules for constructing simple content do not apply. The value must be supplied as a string, or as a value that is converted to a string by virtue of the function conversion rules.
Returns a new text node, with given content.
build:text($content as xs:string?) as text()
This function constructs a new parentless text node.
The content of the text node (that is, the string value of the node) is formed by evaluating the first argument. If this is an empty sequence, the string value will be a zero-length string.
If the function is called twice with the same arguments, it is unpredictable whether it returns the same node or different nodes from the two invocations.
The XSLT/XQuery rules for constructing simple content do not apply. The value must be supplied as a string, or as a value that is converted to a string by virtue of the function conversion rules.