Saxon extensions to the W3C XSLT/XQuery specifications
Extension Functions
The extension function saxon:sort() is dropped, as it is superseded by fn:sort() in XPath 3.1.
A new extension function saxon:doc() is available. It is similar to fn:doc(), but takes a second argument which is a map supplying options for how the source document is to be parsed. The options allow control of schema and DTD validation, whitespace stripping, and use of accumulators. The function is non-deterministic with respect to node-identity: that is, if it is called twice with the same arguments, there are no guarantees about whether the same result is returned each time.
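As a rough sketch of how the options map might be supplied from XSLT: the option names dtd-validation and strip-space, their values, and the file name input.xml are assumptions made for illustration and should be checked against the function's reference documentation. The saxon prefix is assumed to be bound to http://saxon.sf.net/.

```xml
<!-- Assumed option keys: 'dtd-validation' and 'strip-space' are illustrative only -->
<xsl:variable name="source"
    xmlns:saxon="http://saxon.sf.net/"
    select="saxon:doc('input.xml',
                      map{ 'dtd-validation': false(), 'strip-space': 'all' })"/>
```

Because node identity is not guaranteed across calls, it is safer to bind the result to a variable once and reuse it than to call the function repeatedly with the same arguments.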
The new extension function saxon:function-annotations() is made available to obtain access to the annotations defined in a function declaration.
The extension function saxon:timestamp() is added. It takes no arguments and returns the xs:dateTimeStamp representing the instant at which the function is called. The optimizer makes some attempt to avoid rearranging calls to this function (for example by not lifting the call out of a loop), but order of execution is not guaranteed, so there may still be surprises when the function is used to time program execution.
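A minimal sketch of using the function to time a step is shown here. The template name timed-step and the mode expensive are hypothetical, and because variables may be evaluated lazily or out of order, the reported duration should be treated as indicative only.

```xml
<xsl:template name="timed-step" xmlns:saxon="http://saxon.sf.net/">
  <!-- Intended to capture the instant before the work starts -->
  <xsl:variable name="start" select="saxon:timestamp()"/>
  <!-- 'expensive' is a hypothetical mode standing in for the work being timed -->
  <xsl:variable name="result">
    <xsl:apply-templates select="/" mode="expensive"/>
  </xsl:variable>
  <xsl:sequence select="$result"/>
  <!-- Subtracting two xs:dateTimeStamp values yields an xs:dayTimeDuration -->
  <xsl:message select="'Elapsed:', saxon:timestamp() - $start"/>
</xsl:template>
```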
The new extension function saxon:array-member() takes a single argument, which evaluates to an arbitrary sequence, and it returns an external object of type ArrayMemberValue wrapping this sequence. The significance of this external object is that it is specially recognized by operations that construct an array, indicating that the sequence contained by the ArrayMemberValue is to become a single member of the resulting array, rather than having each item treated independently.
So, for example, the XPath expression array{ saxon:array-member((1,2)), saxon:array-member((3,4)) } creates the array [(1,2), (3,4)]. This is different from the result of array{ (1,2), (3,4) }, which produces the array [1, 2, 3, 4]. This provides a practical way of creating arrays containing nested sequences, especially useful where the number of members in the array is not known statically.
A new function saxon:message-count($errCode) is available. It returns the number of xsl:message instructions that have been output with a given error code. If the argument is an empty sequence, it returns the total number of xsl:message instructions that have been output. A typical use case is to limit the number of messages that are output, or to terminate execution after reaching a particular threshold.
Because the xsl:message instruction has side-effects (including the side-effect of updating these counters), the results are not guaranteed to be predictable. This is particularly true when multi-threading.
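The threshold use case might be sketched as follows. The record element and the limit of 100 are hypothetical; the empty-sequence argument asks for the total across all error codes.

```xml
<xsl:template match="record[not(@id)]" xmlns:saxon="http://saxon.sf.net/">
  <xsl:choose>
    <!-- An empty-sequence argument counts messages for all error codes -->
    <xsl:when test="saxon:message-count(()) lt 100">
      <xsl:message>Record without an id at position <xsl:number/></xsl:message>
    </xsl:when>
    <xsl:otherwise>
      <!-- Stop the transformation once the (arbitrary) limit is reached -->
      <xsl:message terminate="yes">Too many problems reported; giving up.</xsl:message>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
```

Given the caveat about side-effects, the exact point at which the limit takes effect should not be relied on.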
The implementation of saxon:stream() has undergone further changes, which bring it closer into line with the streamability rules for XSLT 3.0 streaming. When a filter expression is used as the argument, for example saxon:stream(doc('a.xml')/b/c[PREDICATE]), the predicate must now be motionless: this means for example that it cannot be positional, and it cannot access children or descendants or the string value of the node being tested. These restrictions can be circumvented by moving the predicate outside the call on saxon:stream, for example saxon:stream(doc('a.xml')/b/c)[PREDICATE].
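To make the distinction concrete, here is a sketch with a predicate that reads a child element and is therefore not motionless. The child element price is a hypothetical name added for illustration.

```xml
<!-- Not allowed: the predicate reads a child element, so it is not motionless -->
<!-- <xsl:for-each select="saxon:stream(doc('a.xml')/b/c[price > 100])"> -->

<!-- Allowed: the same predicate applied outside the call on saxon:stream -->
<xsl:for-each select="saxon:stream(doc('a.xml')/b/c)[price > 100]"
              xmlns:saxon="http://saxon.sf.net/">
  <xsl:copy-of select="."/>
</xsl:for-each>
```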
Extension Attributes
The attribute saxon:time="yes" on xsl:message causes a timestamp to be added to the message output.
The attribute saxon:trace="yes" on xsl:accumulator causes tracing of all changes to the current value of the accumulator.
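A small stylesheet using both attributes might look like this. The accumulator name item-count and the item element are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/">

  <!-- Trace every change to the accumulator's current value -->
  <xsl:accumulator name="item-count" initial-value="0" saxon:trace="yes">
    <xsl:accumulator-rule match="item" select="$value + 1"/>
  </xsl:accumulator>

  <xsl:template match="/">
    <!-- A timestamp is added to this message's output -->
    <xsl:message saxon:time="yes">Starting transformation</xsl:message>
    <xsl:value-of select="accumulator-after('item-count')"/>
  </xsl:template>
</xsl:stylesheet>
```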
The serialization property saxon:single-quotes="yes|no" controls whether the XML and HTML output methods use single or double quotes to delimit attributes (see Serialization Parameters). The default is to use double quotes, except in cases where character maps are involved; single quotes can be useful if the lexical XML/HTML is to be wrapped in a string literal delimited by double quotes, for example in JSON. This does not eliminate the need to escape double quotes (as \") in such a context, but it means that fewer characters are affected.
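In a stylesheet, the property would typically be set on xsl:output, as in this sketch (the saxon prefix is assumed to be bound to http://saxon.sf.net/):

```xml
<!-- Ask the serializer to delimit attribute values with single quotes -->
<xsl:output method="xml" saxon:single-quotes="yes"
            xmlns:saxon="http://saxon.sf.net/"/>
```

With this setting an attribute would be written as href='x.html' rather than href="x.html", so the serialized text can be embedded in a double-quoted string with fewer escapes.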
Extension Instructions
The new instruction <saxon:array> has the same content model as <xsl:sequence>, but builds an array rather than a sequence. The members of the array will all be singleton items, unless constructed using the saxon:array-member extension function (see above) or the saxon:array-member extension instruction (see below).
The new instruction <saxon:array-member> has the same content model as <xsl:sequence>, but builds an external object of type ArrayMemberValue wrapping the constructed sequence. The significance of this external object is that it is specially recognized by operations that construct an array, indicating that the sequence contained by the ArrayMemberValue is to become a single member of the resulting array, rather than having each item treated independently.
An extension function saxon:array-member() is also available.
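The two instructions are intended to be used together. The following sketch builds an array with one member per row, where each member is the sequence of that row's cell values. The element names table, row and cell are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:array="http://www.w3.org/2005/xpath-functions/array"
    xmlns:saxon="http://saxon.sf.net/"
    extension-element-prefixes="saxon">

  <xsl:template match="table">
    <!-- One array member per row; each member holds all of that row's cells -->
    <xsl:variable name="rows" as="array(*)">
      <saxon:array>
        <xsl:for-each select="row">
          <saxon:array-member select="cell/string()"/>
        </xsl:for-each>
      </saxon:array>
    </xsl:variable>
    <!-- The array size equals the number of row elements -->
    <xsl:value-of select="array:size($rows)"/>
  </xsl:template>
</xsl:stylesheet>
```

Without saxon:array-member, each cell value would become a separate singleton member of the array.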
Collection URIs
The standard collection URI resolver now accepts the parameter match=regex as an alternative to select=glob. The value is an XPath 3.1 regular expression. Characters that are not allowed in a URI (such as backslash and curly braces) must be escaped using the %HH convention; this is best achieved by processing the value through the fn:encode-for-uri() function.
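For instance, a collection URI with a regular expression might be assembled as follows. The directory file:///data/book/ and the pattern are hypothetical, and whether the expression has to match the whole file name is not stated here, so the pattern is illustrative only.

```xml
<!-- encode-for-uri() escapes the backslashes and other characters not allowed in a URI -->
<xsl:variable name="chapters"
    select="collection('file:///data/book/?match=' || encode-for-uri('chap\d+\.xml'))"/>
```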
XPath Syntax Extensions
Saxon implements a number of extensions to the XPath 3.1 grammar. These are enabled only if the configuration option FeatureKeys.ALLOW_SYNTAX_EXTENSIONS is set, and they require Saxon-PE or higher.
Simple inline functions: an abbreviated syntax is available for defining simple inline functions. For example, the expression fn{.+1} represents a function that takes a single argument and returns the value of that argument plus one. A simple inline function takes a single argument with required type item(), and returns any sequence (type item()*). The function body is evaluated with a singleton focus based on the supplied argument value.
Simple inline functions are particularly convenient when providing functions as arguments to higher-order functions, many of which accept a single item as their one argument. For example, to sort employees in order of salary, you can write sort(//employee, (), fn{@salary}).
Simple inline functions can access externally-defined local variables in the usual way (that is, they have a closure).
The expression fn{EXPR} is a syntactic shorthand for function($x as item()) as item()* {$x!(EXPR)}.
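A few usages, sketched on the assumption that syntax extensions are enabled; the variable names are hypothetical.

```xml
<!-- fn{. * 2} is shorthand for function($x as item()) as item()* { $x ! (. * 2) } -->
<xsl:variable name="doubled" select="for-each(1 to 3, fn{. * 2})"/>

<!-- The function body can refer to a variable in scope (closure over $limit) -->
<xsl:variable name="limit" select="10"/>
<xsl:variable name="small" select="filter(1 to 20, fn{. lt $limit})"/>
```

In each case the item supplied by the higher-order function becomes the context item inside the braces.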
Type Aliases. Wherever an ItemType is expected, instead of writing an explicit type like map(xs:string, map(xs:integer, xs:string)), it is possible to use a type alias, which is a QName prefixed with a tilde, for example ~inventory. Type aliases can be declared in XSLT using the extension element <saxon:type-alias>; they are not currently available in XQuery. More details: see Type Aliases.
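A declaration and use of the ~inventory alias might look like the following sketch. The attribute names name and type on saxon:type-alias are assumptions here and should be checked against the Type Aliases page; the variable and its content are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:saxon="http://saxon.sf.net/"
    extension-element-prefixes="saxon">

  <!-- Assumed attribute names: 'name' for the alias, 'type' for the item type -->
  <saxon:type-alias name="inventory"
                    type="map(xs:string, map(xs:integer, xs:string))"/>

  <!-- The alias is referenced with a leading tilde wherever an ItemType is expected -->
  <xsl:variable name="stock" as="~inventory"
      select="map{ 'warehouse-1': map{ 101: 'widget', 102: 'gadget' } }"/>
</xsl:stylesheet>
```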
Union Types. A union type is an extension to the syntax for an ItemType. An example is union(xs:dateTime, xs:date, xs:time), which denotes a union type whose members are xs:date, xs:time, and xs:dateTime. These types can conveniently be used in function signatures when writing a function that is designed to take arguments of more than one type. They can also be used in other places where an ItemType is required, for example in a cast as expression. Union types require Saxon-EE. The semantics are the same as for union types defined in a schema. For further details see Union Types.
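As an illustration of a function signature using a union type, the sketch below accepts either an xs:dateTime or an xs:date. The function name f:year and its namespace binding are hypothetical, and a narrower union than the one above is used because xs:time has no year component.

```xml
<xsl:function name="f:year" as="xs:integer"
    xmlns:f="http://example.com/functions"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- The parameter accepts a value of either member type -->
  <xsl:param name="when" as="union(xs:dateTime, xs:date)"/>
  <xsl:sequence select="
      if ($when instance of xs:dateTime)
      then year-from-dateTime($when)
      else year-from-date($when)"/>
</xsl:function>
```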
Tuple Types. A tuple type is an extension to the syntax for an ItemType. An example is tuple(ssn: xs:string, emp: element(employee)). A tuple type is essentially a way of defining a more precise type for maps. This example type will be satisfied by any map that has an entry whose key is "ssn" and whose value is a string, plus an entry whose key is "emp" and whose value is an employee element. More details: see Tuple Types.
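Since a tuple is a map, its fields can be read with the ordinary lookup operator. In this sketch the function name f:describe, its namespace binding and the name attribute on employee are hypothetical.

```xml
<xsl:function name="f:describe" as="xs:string"
    xmlns:f="http://example.com/functions"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Accepts any map with a string-valued 'ssn' entry and an employee element under 'emp' -->
  <xsl:param name="rec" as="tuple(ssn: xs:string, emp: element(employee))"/>
  <xsl:sequence select="$rec?ssn || ': ' || $rec?emp/@name"/>
</xsl:function>
```

A caller would pass an ordinary map, for example f:describe(map{ 'ssn': '000-00-0000', 'emp': $e }), where $e is an employee element.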