Saxon extensions to the W3C XSLT/XQuery specifications
Extension Functions
A number of new extension functions are available:
-
Parses a supplied URI, returning a map containing its various components (such as the scheme, port, path, fragment, and query).
-
Splits the supplied string into a sequence of single-character strings.
-
Given a string in the form of a lexical EQName, returns the corresponding xs:QName value.
-
saxon:in-scope-namespaces($element)
Returns the in-scope namespaces of an element, in the form of a map from prefixes to URIs.
-
saxon:index-where($sequence, $predicate)
Returns the integer positions of items in the sequence that match the supplied predicate function.
-
Returns true if the supplied argument is the xs:float or xs:double value NaN.
-
saxon:items-after($input, $predicate)
Returns all items in the input sequence that follow the first item that matches the predicate.
-
saxon:items-before($input, $predicate)
Returns all items in the input sequence that precede the first item that matches the predicate.
-
saxon:items-from($input, $predicate)
Returns all items in the input sequence starting with the first item that matches the predicate.
-
saxon:items-until($input, $predicate)
Returns all items in the input sequence up to and including the first item that matches the predicate.
-
saxon:parse-dateTime($string, $format)
Parses dates and times in non-standard formats. The first argument is a string containing the value to be parsed; the second is a pattern giving the expected format, in the notation used by the Java DateTimeFormatter class. The result is an xs:dateTime, xs:date, xs:time, xs:gYear, xs:gYearMonth, xs:gMonth, xs:gMonthDay, or xs:gDay, depending on the components that were actually present in the input value.
-
Similar to fn:replace, but instead of supplying a replacement string, the caller supplies a callback function that computes the replacement string from the matched substring.
-
Returns a map containing the values of all tunnel parameters (whether or not they are declared in the current template). The keys in the map are QNames (the parameter names); the corresponding values are arbitrary XDM values.
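The relationship between saxon:items-before, saxon:items-after, saxon:items-from, saxon:items-until, and saxon:index-where described above can be modelled in a few lines of Python. This is only an illustrative sketch of the semantics as described, not Saxon code: the Python function names are invented for the illustration, sequences are modelled as lists, and the behavior when no item matches the predicate is an assumption that the descriptions above do not settle.

```python
from typing import Any, Callable, List, Optional, Sequence

Predicate = Callable[[Any], bool]


def _first_match(items: Sequence[Any], predicate: Predicate) -> Optional[int]:
    """Return the 0-based index of the first matching item, or None."""
    for i, item in enumerate(items):
        if predicate(item):
            return i
    return None


def items_before(items: Sequence[Any], predicate: Predicate) -> List[Any]:
    # Items preceding the first match (assumption: everything if no match).
    i = _first_match(items, predicate)
    return list(items) if i is None else list(items[:i])


def items_until(items: Sequence[Any], predicate: Predicate) -> List[Any]:
    # Items up to and including the first match (assumption: everything if no match).
    i = _first_match(items, predicate)
    return list(items) if i is None else list(items[: i + 1])


def items_from(items: Sequence[Any], predicate: Predicate) -> List[Any]:
    # Items starting with the first match (assumption: empty if no match).
    i = _first_match(items, predicate)
    return [] if i is None else list(items[i:])


def items_after(items: Sequence[Any], predicate: Predicate) -> List[Any]:
    # Items following the first match (assumption: empty if no match).
    i = _first_match(items, predicate)
    return [] if i is None else list(items[i + 1:])


def index_where(items: Sequence[Any], predicate: Predicate) -> List[int]:
    # 1-based positions of all matching items, as in XPath.
    return [pos for pos, item in enumerate(items, start=1) if predicate(item)]


if __name__ == "__main__":
    seq = [1, 2, 3, 4, 5]
    at_least_3 = lambda x: x >= 3
    print(items_before(seq, at_least_3))  # [1, 2]
    print(items_until(seq, at_least_3))   # [1, 2, 3]
    print(items_from(seq, at_least_3))    # [3, 4, 5]
    print(items_after(seq, at_least_3))   # [4, 5]
    print(index_where([1, 8, 2, 9], lambda x: x > 5))  # [2, 4]
```

In this model, items-before and items-from partition the input, as do items-until and items-after.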
A number of changes have been made to existing extension functions:
-
The saxon:evaluate-node() function has been dropped. The same functionality is available using xsl:evaluate in XSLT 3.0.
-
The extension function saxon:get-pseudo-attribute() now parses the supplied input much more rigorously, applying the rules found in the W3C specification, and raising an error for invalid syntax that was previously allowed through.
Extensions to Standard Functions
Various Saxon-specific options have been provided for the map:merge() function: saxon:on-duplicates provides a callback function for handling duplicate keys; saxon:key-type allows Saxon to optimize the resulting map if the keys are all strings; saxon:final allows Saxon to optimize for the case where no further changes to the content of the map are likely; saxon:duplicates-error-code defines an error code to be used when duplicate keys are encountered.
Extension Instructions
A new instruction saxon:for-each-member is available; it iterates over the members of a supplied array.
For example:
<saxon:for-each-member select="[(1,2), (3,4)]" bind-to="m">
  <subtotal>{sum($m)}</subtotal>
</saxon:for-each-member>
outputs <subtotal>3</subtotal><subtotal>7</subtotal>
A new attribute xsl:mode/@saxon:as is available. Its value is a sequence type, with Saxon extensions permitted. This provides a default value for the as attribute of all template rules in the mode, unless they have their own required type defined in an as or saxon:as attribute. This is handy in cases where, for example, all the template rules in a particular mode are required to return a boolean value. It is particularly useful in cases where the return type is a complex tuple type, as this means that changes to the tuple type only need to be made in one place.
XPath Syntax Extensions
Tuple Types
The experimental syntax for declaring tuple types has been revised; the specification has been expanded and clarified, and the implementation is much more thoroughly tested. The colon separating field name and required type has been replaced with "as", and the notation ",*" can be used after the last field to indicate that the tuple type is extensible (that is, additional fields are permitted beyond those declared). In addition, field names that are not NCNames can now be used, written in quotes. For example, a tuple type may now be declared as tuple(key as xs:string, 'max size' as xs:numeric?, value, *).
Item Type Aliases
Where named type aliases are defined in XSLT or XQuery, the syntax for referring to them in an XPath SequenceType has changed from ~typename to type(typename).
Processing all Members of an Array
A new for-member expression is available to process all the members of an array:
for member $x in EXPR (, $y in EXPR)* return EXPR
For example, the expression for member $m in [(3,5,6), (8,12)] return sum($m) returns the sequence (14, 20).
This syntax is currently available only as a free-standing expression, not as a clause in a FLWOR expression; it has been designed, however, to allow integration into a FLWOR expression in the future.
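The difference between iterating over the members of an array and iterating over its flattened items can be modelled in Python. This is a sketch of the semantics only, not Saxon code: an array is modelled as a list of lists (each inner list being one member, itself a sequence), and the helper name for_member is invented for the illustration.

```python
from typing import Any, Callable, List

Sequence_ = List[Any]      # model of an XDM sequence
Array = List[Sequence_]    # model of an XDM array: a list of members


def for_member(array: Array, body: Callable[[Sequence_], Sequence_]) -> Sequence_:
    """Evaluate body once per array member and concatenate the results."""
    result: Sequence_ = []
    for member in array:
        result.extend(body(member))
    return result


if __name__ == "__main__":
    array = [[3, 5, 6], [8, 12]]

    # One evaluation per member: corresponds to the for-member example above.
    print(for_member(array, lambda m: [sum(m)]))  # [14, 20]

    # Flattening first loses the member boundaries: five items, not two members.
    flattened = [item for member in array for item in member]
    print(flattened)  # [3, 5, 6, 8, 12]
```

The second print shows why a member-wise construct is useful: once the array is flattened, the grouping of items into members is no longer available.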
Extensions to the Lookup Operator
Following a unary or binary "?" operator, Saxon now allows a string literal or variable reference to appear without surrounding parentheses: for example $map?"first name" or [1 to 10]?$i.
The otherwise Operator
The expression chapter[title='Introduction'] otherwise chapter[1] returns the chapter(s) whose title is "Introduction" if such a chapter exists, or the first chapter if not. More generally, A otherwise B returns A, unless it is an empty sequence, in which case it returns B.
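The rule "A otherwise B returns A unless A is an empty sequence" can be modelled in Python as follows. This is a sketch of the semantics only: sequences are modelled as lists, and the function name is invented for the illustration. Note that the test is on emptiness of the sequence, not on the truth value of its items.

```python
from typing import Any, List


def otherwise(a: List[Any], b: List[Any]) -> List[Any]:
    """Return a unless it is the empty sequence, in which case return b."""
    return list(a) if len(a) > 0 else list(b)


if __name__ == "__main__":
    chapters = ["Preface", "Basics", "Advanced"]
    intro = [c for c in chapters if c == "Introduction"]  # no such chapter

    print(otherwise(intro, chapters[:1]))   # ['Preface']
    print(otherwise(["Basics"], chapters[:1]))  # ['Basics']

    # A sequence holding a single "falsy" item is still non-empty.
    print(otherwise([0], [1]))  # [0]
```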
KindTests
The syntax for the element() and attribute() KindTests is extended to allow constructs of the form element(*:div) or attribute(myns:*, myns:someType).
Abbreviated inline functions
The expression .{@x} (referred to as a "dot function") is an anonymous inline function that returns an attribute of the node passed as the function parameter. (This obsoletes the syntax fn{@x} introduced experimentally in Saxon 9.9, which is retained for the time being.) For example, sort(//employee, .{@lastname, @firstname}) returns employees sorted by last name then first name. A dot function has signature function(item()) as item()*: that is, it has arity one, and expects a single item as its argument.
The expression _{$1 + $2} (referred to as an "underscore function") is an anonymous inline function with two arguments, which may be of any type; it returns the sum of the two arguments. The numeric variable references $1 and $2 refer to the argument values based on their position in the argument list. The arity of the function is determined from the highest numeric variable reference. Arguments other than the last need not be referenced: for example, _{$2} is an arity-2 function that returns the value of the second argument, ignoring the first. The numeric argument references must appear directly in the body of the underscore function; they cannot be referenced in a nested inline function (whether or not this is itself an underscore function). For example, for-each-pair((1,2,3), (4,5,6), _{$1 + $2}) returns (5,7,9). The signature of the function in this example is function(item()*, item()*) as item()*.
As a special case, _{12} is a zero-arity function that always returns the value 12.
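The arity rule for underscore functions (the highest numeric variable reference in the body, or zero if there is none) can be illustrated with a small Python sketch. This is not how Saxon parses expressions: it is a deliberately naive text scan, the function name is invented, and it assumes the body contains no string literals, comments, or nested inline functions that might contain "$" followed by digits.

```python
import re


def underscore_arity(body: str) -> int:
    """Arity implied by the body of an underscore function.

    Naive model: scans the text for references such as $1 or $2 and returns
    the highest number found, or 0 when there are none.
    """
    references = [int(n) for n in re.findall(r"\$(\d+)", body)]
    return max(references, default=0)


if __name__ == "__main__":
    print(underscore_arity("$1 + $2"))  # 2
    print(underscore_arity("$2"))       # 2 (the first argument is ignored)
    print(underscore_arity("12"))       # 0
```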
XSLT extensions
The xsl:map instruction has acquired an extension attribute saxon:on-duplicates. The value is a user-supplied function which is called when map entries with duplicate keys are encountered. The function is supplied with the two conflicting values and can combine them to create a new value which is stored in the resulting map. This can be used to emulate all the options supplied on the map:merge function (use-first, use-last, combine, or fail) and to achieve other effects, for example delivering the sum, maximum, or string-join of the set of values associated with a single key, or selecting one of the values based on data such as a time-stamp attribute.
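How a single duplicate-handling callback can reproduce the use-first, use-last, combine, and fail policies, as well as effects such as summing, can be sketched in Python. This is a model of the behavior described above, not Saxon code: the names build_map, use_first, and so on are invented, map values are modelled as lists, and it is an assumption of this sketch that the callback receives the existing value first and the new value second, applied pairwise in entry order.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Value = List[Any]  # model of an XDM value (a sequence)
OnDuplicates = Callable[[Value, Value], Value]


def build_map(entries: Iterable[Tuple[str, Value]],
              on_duplicates: OnDuplicates) -> Dict[str, Value]:
    """Build a map, resolving duplicate keys with the supplied callback."""
    result: Dict[str, Value] = {}
    for key, value in entries:
        if key in result:
            # Assumed argument order: (existing value, new value).
            result[key] = on_duplicates(result[key], value)
        else:
            result[key] = value
    return result


def use_first(a: Value, b: Value) -> Value:
    return a


def use_last(a: Value, b: Value) -> Value:
    return b


def combine(a: Value, b: Value) -> Value:
    return a + b  # sequence concatenation


def fail(a: Value, b: Value) -> Value:
    raise ValueError("duplicate key")


def total(a: Value, b: Value) -> Value:
    return [sum(a) + sum(b)]


if __name__ == "__main__":
    entries = [("a", [1]), ("b", [2]), ("a", [3])]
    print(build_map(entries, use_first))  # {'a': [1], 'b': [2]}
    print(build_map(entries, use_last))   # {'a': [3], 'b': [2]}
    print(build_map(entries, combine))    # {'a': [1, 3], 'b': [2]}
    print(build_map(entries, total))      # {'a': [4], 'b': [2]}
```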
Provided that Saxon syntax extensions are enabled, some extensions to XSLT 3.0 syntax are implemented:
-
The xsl:when and xsl:otherwise elements can have a select attribute in place of a contained sequence constructor.
-
The xsl:if element can have a then attribute in place of a contained sequence constructor, and it can also have an else attribute.
For example, this function found in a W3C specification:
<xsl:function name="f:product">
  <xsl:param name="seq"/>
  <xsl:choose>
    <xsl:when test="empty($seq)">
      <xsl:sequence select="1"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:sequence select="head($seq) * f:product(tail($seq))"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>
can now be written:
<xsl:function name="f:product">
  <xsl:param name="seq"/>
  <xsl:if test="empty($seq)" then="1" else="head($seq) * f:product(tail($seq))"/>
</xsl:function>
Provided Saxon syntax extensions are enabled, a range of new match patterns can be defined, particularly suitable when processing JSON. These include (by example):
-
atomic(xs:integer): matches an atomic value of a given atomic type
-
union(xs:integer, xs:date): matches an atomic value of a given union type
-
map(xs:integer, element()*): matches a map of a given type
-
array(xs:integer*): matches an array of a given type
-
tuple(first, middle, last, *): matches a map conforming to a given tuple type
All of these may be followed by optional predicates. Default priorities are defined, designed to reflect the type hierarchy so that more selective types have higher priority than less selective types; the rules for allocating priorities, however, should be regarded as provisional.
The s9api XsltCompiler interface, and the net.sf.saxon.Transform command line, now allow a default namespace to be specified; this acts as a "default default" for the value of the xpath-default-namespace attribute, and it has no effect if an explicit value for xpath-default-namespace appears in the stylesheet.
The s9api XsltCompiler interface, and the net.sf.saxon.Transform command line, also allow you to specify that unprefixed element names used in path expressions and match patterns should match by local name only, ignoring the namespace entirely. For example, a path X/Y/Z is then treated as if it were written *:X/*:Y/*:Z. This option overrides the effect of xpath-default-namespace in cases where it applies.
A further option is to indicate that unprefixed element names should match elements either in the default namespace (as specified using xpath-default-namespace) or in no namespace. This option is provided primarily to reflect the XSLT/XPath variation defined in the HTML5 specification, which says that unprefixed element names should match elements in the XHTML namespace when the context item is a node "in an HTML DOM", and elements in no namespace otherwise. It is difficult to reproduce this rule precisely in Saxon, because it's not clear what being "in an HTML DOM" means when the data model is XDM (for example, does it apply to a node constructed by using xsl:copy-of applied to an HTML element?). But this option provides an approximation that ensures Saxon (in particular, Saxon-JS) will behave the same way as an XSLT 1.0 stylesheet running in the browser in most practical situations. The option applies when an NCName is used as a NameTest (on any axis other than the attribute and namespace axes) and to the element name in an ElementTest (that is, element(name)), whether in an XPath expression, a pattern, or a SequenceType. It does not apply to unprefixed type names or to names used in a SchemaElementTest (that is, schema-element(name)).
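The three ways of matching an unprefixed element name described above (default namespace only, local name only, and default namespace or no namespace) can be compared with a small Python model. This is only a sketch of the matching rule: the mode names and the function name are invented for the illustration and are not Saxon option names, and "no namespace" is modelled as the empty string.

```python
XHTML = "http://www.w3.org/1999/xhtml"


def unprefixed_name_matches(test_name: str, elem_ns: str, elem_local: str,
                            mode: str, default_ns: str = "") -> bool:
    """Does the unprefixed NameTest test_name match the element {elem_ns}elem_local?

    mode is one of three hypothetical labels:
      "default-namespace"         : match only in the XPath default namespace
      "any-namespace"             : match by local name only (like *:name)
      "default-namespace-or-none" : match in the default namespace or in no namespace
    """
    if test_name != elem_local:
        return False
    if mode == "default-namespace":
        return elem_ns == default_ns
    if mode == "any-namespace":
        return True
    if mode == "default-namespace-or-none":
        return elem_ns in (default_ns, "")
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for mode in ("default-namespace", "any-namespace", "default-namespace-or-none"):
        in_xhtml = unprefixed_name_matches("div", XHTML, "div", mode, XHTML)
        in_none = unprefixed_name_matches("div", "", "div", mode, XHTML)
        in_other = unprefixed_name_matches("div", "urn:other", "div", mode, XHTML)
        print(mode, in_xhtml, in_none, in_other)
```

With the XHTML namespace as the default, only the third mode accepts a div element both in the XHTML namespace and in no namespace while still rejecting a div in an unrelated namespace.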