Saxon extensions to the W3C XSLT/XQuery specifications

Extension Functions

The extension function saxon:sort() is dropped, as it is superseded by fn:sort() in XPath 3.1.

A new extension function saxon:doc() is available. It is similar to fn:doc(), but takes a second argument, which is a map supplying options for how the source document is to be parsed. The options allow control of schema and DTD validation, whitespace stripping, and use of accumulators. The function is non-deterministic with respect to node identity: that is, if it is called twice with the same arguments, there is no guarantee that the same nodes are returned each time.

The new extension function saxon:function-annotations() is made available to obtain access to the annotations defined in a function declaration.

The extension function saxon:timestamp() is added. It takes no arguments and returns the xs:dateTimeStamp representing the instant at which the function is called. The optimizer makes some attempt to avoid rearranging calls to this function (for example, by not lifting the call out of a loop), but order of execution is not guaranteed, so there may still be surprises when the function is used to time program execution.
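
As a rough illustration of the timing use case, the following XPath sketch brackets a computation with two calls to the function. It assumes the prefix saxon is bound to the Saxon namespace http://saxon.sf.net/, and count(//item) is only a stand-in workload on a hypothetical input document. Because order of execution is not guaranteed, the reported duration should be treated as indicative at best.

```xpath
(: Assumption: prefix saxon is bound to http://saxon.sf.net/ :)
(: count(//item) is a placeholder for the work being timed :)
let $start  := saxon:timestamp(),
    $result := count(//item),
    $end    := saxon:timestamp()
return ($result, $end - $start)   (: the difference is an xs:dayTimeDuration :)
```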

The new extension function saxon:array-member() takes a single argument, which evaluates to an arbitrary sequence, and it returns an external object of type ArrayMemberValue wrapping this sequence. The significance of this external object is that it is specially recognized by operations that construct an array, indicating that the sequence contained by the ArrayMemberValue is to become a single member of the resulting array, rather than having each item treated independently.

So, for example, the XPath expression array{ saxon:array-member((1,2)), saxon:array-member((3,4)) } creates the array [(1,2), (3,4)]. This is different from the result of array{ (1,2), (3,4) }, which produces the array [1, 2, 3, 4]. This provides a practical way of creating arrays containing nested sequences, and is especially useful where the number of members in the array is not known statically.
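
The case where the number of members is not known statically can be sketched as follows. This is a minimal example that assumes the prefix saxon is bound to http://saxon.sf.net/; the range 1 to 3 stands in for any dynamically computed sequence.

```xpath
(: Assumption: prefix saxon is bound to http://saxon.sf.net/ :)
(: Each iteration contributes one array member, whatever its length :)
array{ for $n in 1 to 3 return saxon:array-member(1 to $n) }
```

Following the rules described above, this should build an array with three members: 1, (1, 2), and (1, 2, 3).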

A new function saxon:message-count($errCode) is available. It returns the number of xsl:message instructions that have been output with a given error code. If the argument is an empty sequence, it returns the total number of xsl:message instructions that have been output. A typical use case is to limit the number of messages that are output, or to terminate execution after reaching a particular threshold. Because the xsl:message instruction has side-effects (including the side-effect of updating these counters) the results are not guaranteed to be predictable. This is particularly true when multi-threading.
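
The use case of limiting the number of messages might look like the following sketch. The record element, the threshold of 20, and the message text are hypothetical; only the empty-sequence form of the argument is used, since that is the form described above. Given the caveat about side-effects, the cutoff should be regarded as approximate.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Hypothetical rule: warn about records lacking an id, but stop
       reporting once 20 messages (with any error code) have been output. -->
  <xsl:template match="record[not(@id)]">
    <xsl:if test="saxon:message-count(()) lt 20">
      <xsl:message>Record without id, number <xsl:number/></xsl:message>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```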

The implementation of saxon:stream() has undergone further changes, which bring it closer into line with the streamability rules for XSLT 3.0 streaming. When a filter expression is used as the argument, for example saxon:stream(doc('a.xml')/b/c[PREDICATE]), the predicate must now be motionless: this means, for example, that it cannot be positional, and it cannot access children or descendants or the string value of the node being tested. These restrictions can be circumvented by moving the predicate outside the call on saxon:stream, for example saxon:stream(doc('a.xml')/b/c)[PREDICATE].

Extension Attributes

The attribute saxon:time="yes" on xsl:message causes a timestamp to be added to the message output.

The attribute saxon:trace="yes" on xsl:accumulator causes tracing of all changes to the current value of the accumulator.

The serialization property saxon:single-quotes="yes|no" controls whether the XML and HTML output methods use single or double quotes to delimit attributes (see Serialization Parameters). The default is to use double quotes, except in cases where character maps are involved; single quotes can be useful if the lexical XML/HTML is to be wrapped in a string literal delimited by double quotes, for example in JSON. This does not eliminate the need to escape double quotes (as \") in such a context, but it means that fewer characters are affected.
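
The three extension attributes in this section can be seen together in the following sketch. The accumulator name item-count, the item element, and the message text are hypothetical, and the xsl:mode declaration is included only to make sure the accumulator is applicable to the input document.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Make all accumulators applicable to the principal input document. -->
  <xsl:mode use-accumulators="#all" on-no-match="shallow-copy"/>

  <!-- Ask the serializer to delimit attributes with single quotes. -->
  <xsl:output method="xml" saxon:single-quotes="yes"/>

  <!-- Hypothetical accumulator counting item elements; every change
       to its value is traced because of saxon:trace="yes". -->
  <xsl:accumulator name="item-count" initial-value="0" saxon:trace="yes">
    <xsl:accumulator-rule match="item" select="$value + 1"/>
  </xsl:accumulator>

  <xsl:template match="/">
    <xsl:apply-templates/>
    <!-- saxon:time="yes" adds a timestamp to this message. -->
    <xsl:message saxon:time="yes">Items seen: <xsl:value-of
        select="accumulator-after('item-count')"/></xsl:message>
  </xsl:template>

</xsl:stylesheet>
```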

Extension Instructions

The new instruction <saxon:array> has the same content model as <xsl:sequence>, but builds an array rather than a sequence. The members of the array will all be singleton items, unless constructed using the saxon:array-member extension function (see above) or the saxon:array-member extension instruction (see below).

The new instruction <saxon:array-member> has the same content model as <xsl:sequence>, but builds an external object of type ArrayMemberValue wrapping the constructed sequence. The significance of this external object is that it is specially recognized by operations that construct an array, indicating that the sequence contained by the ArrayMemberValue is to become a single member of the resulting array, rather than having each item treated independently.

An extension function saxon:array-member() is also available.
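
A minimal sketch of the two instructions working together is shown below. It assumes the prefix saxon is bound to http://saxon.sf.net/ and declared as an extension element prefix; the variable name pairs is hypothetical. The member content is supplied using child instructions, which is the part of the <xsl:sequence> content model that the description above clearly covers.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:array="http://www.w3.org/2005/xpath-functions/array"
    xmlns:saxon="http://saxon.sf.net/"
    extension-element-prefixes="saxon"
    exclude-result-prefixes="array">

  <xsl:template name="xsl:initial-template">
    <!-- Intended to build the array [(1,2), (3,4)] -->
    <xsl:variable name="pairs" as="array(*)">
      <saxon:array>
        <saxon:array-member><xsl:sequence select="1, 2"/></saxon:array-member>
        <saxon:array-member><xsl:sequence select="3, 4"/></saxon:array-member>
      </saxon:array>
    </xsl:variable>
    <!-- Two members are expected if each wrapped sequence is kept intact -->
    <xsl:value-of select="array:size($pairs)"/>
  </xsl:template>

</xsl:stylesheet>
```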

Collection URIs

The standard collection URI resolver now accepts the parameter match=regex as an alternative to select=glob. The value is an XPath 3.1 regular expression. Characters that are not allowed in a URI (such as backslash and curly braces) must be escaped using the %HH convention; this is best achieved by processing the value through the fn:encode-for-uri() function.
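
For instance, a collection URI using match might be built as in the sketch below. The directory file:///data/reports/ and the file-naming pattern are hypothetical; how the regular expression is applied to file names is not stated here, so the pattern is illustrative only.

```xpath
(: Hypothetical directory and pattern. encode-for-uri() turns the regex
   report-\d{4}\.xml into report-%5Cd%7B4%7D%5C.xml :)
collection(
  'file:///data/reports/?match=' || encode-for-uri('report-\d{4}\.xml')
)
```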

XPath Syntax Extensions

Saxon implements a number of extensions to the XPath 3.1 grammar. These are enabled only if the configuration option FeatureKeys.ALLOW_SYNTAX_EXTENSIONS is set, and they require Saxon-PE or higher.

Simple inline functions: an abbreviated syntax is available for defining simple inline functions. For example, the expression fn{.+1} represents a function that takes a single argument and returns the value of that argument plus one. A simple inline function takes a single argument with required type item(), and returns any sequence (type item()*). The function body is evaluated with a singleton focus based on the supplied argument value.

Simple inline functions are particularly convenient when providing functions as arguments to higher-order functions, many of which expect a function that takes a single item as its argument. For example, to sort employees in order of salary, you can write sort(//employee, (), fn{@salary}), where the empty second argument selects the default collation.

Simple inline functions can access externally-defined local variables in the usual way (that is, they have a closure).

The expression fn{EXPR} is a syntactic shorthand for function($x as item()) as item()* {$x!(EXPR)}.
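
The equivalence and the closure behaviour can be sketched as follows, assuming syntax extensions are enabled. The variable names are hypothetical.

```xpath
let $inc      := fn{. + 1},
    $inc-long := function($x as item()) as item()* { $x ! (. + 1) },
    $rate     := 2
return (
  $inc(41),                           (: 42 :)
  $inc-long(41),                      (: 42, the expanded form of the same function :)
  for-each((10, 20), fn{. * $rate})   (: 20, 40; $rate is captured from the enclosing scope :)
)
```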

Type Aliases. Wherever an ItemType is expected, instead of writing an explicit type like map(xs:string, map(xs:integer, xs:string)), it is possible to use a type alias, which is a QName prefixed with a tilde, for example ~inventory. Type aliases can be declared in XSLT using the extension element <saxon:type-alias>; they are not currently available in XQuery. More details: see Type Aliases.
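
A declaration and use of the ~inventory alias might look like the sketch below. The attribute names name and type on <saxon:type-alias> are an assumption that is not confirmed by this section and should be checked against the Type Aliases page; the parameter name stock is hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:saxon="http://saxon.sf.net/"
    extension-element-prefixes="saxon"
    exclude-result-prefixes="xs">

  <!-- Assumption: the alias is declared with name and type attributes. -->
  <saxon:type-alias name="inventory"
      type="map(xs:string, map(xs:integer, xs:string))"/>

  <!-- The alias is then written with a leading tilde wherever an
       ItemType is expected. -->
  <xsl:param name="stock" as="~inventory" select="map{}"/>

</xsl:stylesheet>
```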

Union Types. A union type is an extension to the syntax for an ItemType. An example is union(xs:dateTime, xs:date, xs:time), which denotes a union type whose members are xs:dateTime, xs:date, and xs:time. These types can conveniently be used in function signatures when writing a function that is designed to take arguments of more than one type. They can also be used in other places where an ItemType is required, for example in a cast as expression. Union types require Saxon-EE. The semantics are the same as for union types defined in a schema. For further details, see Union Types.
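
A function signature using the union type from this example could be sketched as follows, assuming syntax extensions are enabled. The variable name $kind and the sample values are hypothetical.

```xpath
(: Accepts any of the three member types and reports which one was supplied :)
let $kind := function($v as union(xs:dateTime, xs:date, xs:time)) as xs:string {
  if ($v instance of xs:dateTime) then 'dateTime'
  else if ($v instance of xs:date) then 'date'
  else 'time'
}
return ($kind(xs:date('2024-02-29')), $kind(xs:time('12:00:00')))
(: 'date', 'time' :)
```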

Tuple Types. A tuple type is an extension to the syntax for an ItemType. An example is tuple(ssn: xs:string, emp: element(employee)). A tuple type is essentially a way of defining a more precise type for maps. This example type will be satisfied by any map that has an entry whose key is "ssn" and whose value is a string, plus an entry whose key is "emp" and whose value is an employee element. More details: see Tuple Types.
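
The matching rule described above can be tried with an instance of test, as in this sketch. It assumes syntax extensions are enabled and that the context document contains at least one employee element; the key value 'AB-123' is made up.

```xpath
(: Assumption: (//employee)[1] selects exactly one employee element :)
let $rec := map{ 'ssn': 'AB-123', 'emp': (//employee)[1] }
return $rec instance of tuple(ssn: xs:string, emp: element(employee))
```

If no employee element is present, the emp entry is an empty sequence and the test would be expected to fail.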