Saxon extensions to the W3C XSLT/XQuery specifications
Changes to existing Saxon extensions and new extensions in Saxon 12 are outlined below. This includes extension functions and instructions in the saxon namespace, as well as experimental implementations for version 4.0 extensions to XPath, XSLT, and XQuery. A W3C Community Group is working on these proposals; for more information see the QT4CG Specifications, and documentation about the Saxon implementations at Experimental 4.0 extensions.
New Extension Functions
A number of proposed 4.0 functions are implemented (see New functions): fn:all(), fn:all-different(), fn:all-equal(), fn:characters(), fn:contains-sequence(), fn:ends-with-sequence(), fn:expanded-QName(), fn:foot(), fn:highest(), fn:identity(), fn:index-where(), fn:in-scope-namespaces(), fn:intersperse(), fn:is-NaN(), fn:items-after(), fn:items-at(), fn:items-before(), fn:items-ending-where(), fn:items-starting-where(), fn:iterate-while(), fn:lowest(), fn:op(), fn:parcel()*, fn:parse-html(), fn:parse-QName(), fn:parts()*, fn:replicate(), fn:some(), fn:starts-with-sequence(), fn:trunk(), and fn:unparcel()*. Specifications of these functions can be found in the QT4CG draft specification.
The proposed 4.0 functions map:build() and map:filter() are implemented.
The proposed 4.0 functions array:empty(), array:exists(), array:foot(), array:index-where(), and array:trunk() are implemented.
From Saxon 12.1:
- The options parameter of fn:deep-equal() is implemented.
- The fn:atomic-equal() function is implemented.
- The fn:slice() function (available in 12.0) is dropped.
- The functions fn:build-uri() and fn:parse-uri() are implemented.
From Saxon 12.2:
- The fn:char() function is implemented.
- The array:members() and array:of() functions are implemented (both were previously available but not fully documented, and the detailed specification has changed).
- The array:of() function is aligned with the latest 4.0 specification: it expects a single argument whose value is a sequence of "value records" (maps with a single entry having the key "value"). The fn:parcel function remains available, though it is not in the 4.0 specification, and now delivers "value records" in this format.
- The array:build() function is implemented (previously partly available but undocumented as array:_from_sequence).
- The map:get() and array:get() functions support a third argument, defining the fallback action to be taken when no entry is found.
From Saxon 12.3:
- The fn:parse-integer() function is implemented. This allows parsing of integers in bases other than 10, for example parse-integer('ffff', 16) returns 65535.
- The fn:format-integer() function has been extended to allow output in bases other than 10, for example fn:format-integer(65534, '16^XXXX_XXXX') outputs 0000_FFFE.
- The fn:partition() function is implemented. This provides positional grouping as a higher-order function.
- The map:of() function is implemented. This provides another way of constructing a map from a set of key-value pairs.
- The map:pairs() function is implemented. This decomposes a map into a set of key-value pairs. The function was previously available under the name map:key-value-pairs.
- The fn:remove() function allows several items to be removed from a sequence at the same time.
- The fn:slice() function is implemented; it allows selection of part of a sequence with capability for numbering from the end, counting backwards, selecting every Nth item, etc.
- The array:replace() function is implemented; it replaces one member of an array with a new value that is computed by applying a function to the existing value.
- The array:slice() function is implemented; it does the same as fn:slice(), but for arrays rather than sequences.
- The map:keys#2 function is implemented; the second argument is a function that tests each value and decides whether the corresponding key should be returned.
- The fn:xdm-to-json() function is implemented. This allows conversion of an arbitrary XDM sequence (including, of course, an XML document) to JSON format.
For 12.3, there have been changes to the keywords used for arguments to built-in functions, tracking the draft specifications which have adjusted the keywords to give greater consistency.
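The two integer examples in the 12.3 list above can be checked with a small Python model. This is only a sketch of the arithmetic, not Saxon code: the names parse_integer and format_hex_grouped are hypothetical, Python's int() is more permissive about its input than fn:parse-integer() is likely to be, and the second helper hard-codes the one picture string ('16^XXXX_XXXX') used in the example rather than implementing picture parsing.

```python
def parse_integer(text: str, radix: int = 10) -> int:
    """Rough analogue of fn:parse-integer(): read digits in the given base.

    Assumption: Python's int() stands in for the XPath function; it accepts
    some inputs (surrounding whitespace, a '0x' prefix) that XPath may reject.
    """
    return int(text, radix)


def format_hex_grouped(value: int, digits: int = 8, group: int = 4) -> str:
    """Mimic the picture '16^XXXX_XXXX': upper-case hex, zero-padded to
    `digits` places, with '_' between groups of `group` digits."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    padded = format(value, "X").zfill(digits)
    # Group from the right so that longer numbers stay aligned.
    reversed_digits = padded[::-1]
    groups = [
        reversed_digits[i:i + group][::-1]
        for i in range(0, len(reversed_digits), group)
    ]
    return "_".join(reversed(groups))


print(parse_integer("ffff", 16))   # 65535
print(format_hex_grouped(65534))   # 0000_FFFE
```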
From Saxon 12.4:
- Functions that accept a collation argument now allow it to be set to an empty sequence, which has the same effect as omitting the argument.
- The fn:duplicate-values() function is implemented. This returns all values that appear more than once in the input sequence.
- The fn:void() function is implemented. This evaluates the supplied argument, raises an error if evaluation fails, and then returns an empty sequence. The function is primarily designed (like the saxon:do instruction) for evaluating expressions that have side-effects, such as functions in the EXPath file module.
- The new fn:all() function is renamed fn:every().
- The fn:transitive-closure() function is completely redesigned according to the latest draft specification.
- The new map:of() function is renamed map:of-pairs().
- The new array:of() function is renamed array:of-members().
- The array:split() and map:pair() functions are implemented.
- The action parameter of fn:replace() is implemented, allowing the replacement string to be computed. This makes the Saxon extension function saxon:replace-with() obsolescent. The new function is more powerful in that it supplies captured groups to the action function.
- Constructor functions have a zero-arity version that takes the context item as the default argument. Also, the single-argument version can be called using value as the argument keyword.
- The fn:codepoints-to-string() function had been changed to allow a variable number of arguments (rather than a single sequence-valued argument). This change didn't make it into the spec and has now been reverted.
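As an illustration of what fn:duplicate-values() computes, here is a minimal Python sketch. The function name duplicate_values is hypothetical, Python equality stands in for XPath atomic comparison, and returning each duplicated value once in order of first appearance is an assumption of this sketch rather than something the text above states.

```python
from typing import Dict, Hashable, Iterable, List


def duplicate_values(values: Iterable[Hashable]) -> List[Hashable]:
    """Return each value that occurs more than once in the input.

    Assumption: results are listed once each, in order of first appearance.
    """
    counts: Dict[Hashable, int] = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    # dict preserves insertion order, so this keeps first-appearance order.
    return [value for value, count in counts.items() if count > 1]


print(duplicate_values([1, 2, 3, 1.0, 2]))   # [1, 2]
```

Note that 1 and 1.0 count as the same value here, because Python compares them as equal numbers.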
Dropped or Changed Extension Functions
The Saxon extension functions saxon:evaluate(), saxon:eval(), and saxon:expression() were dropped in Saxon 12.0. (The function saxon:evaluate-node() was dropped in Saxon 10.) The same effect (and more) can be achieved using the standard XSLT 3.0 instruction xsl:evaluate. From Saxon 12.4, the functions are reinstated for the benefit of XQuery users, where the xsl:evaluate instruction is not available.
The Saxon extension function saxon:in-scope-namespaces() has been
aligned with the proposed 4.0 function fn:in-scope-namespaces()
in that the returned map now always includes an entry for the
XML namespace.
The extension function saxon:parse-html() is now a synonym for fn:parse-html(), a new function proposed for 4.0. The function has been reimplemented on SaxonJ to use the validator.nu library in place of TagSoup. In SaxonCS, it has been reimplemented to use AngleSharp in place of HtmlAgilityPack. In both cases this gives much closer conformance to the HTML5 parsing algorithm; the function has also been much more thoroughly tested. However, a handful of the 1300 new tests are currently giving unexplained results (some of these may turn out to be correct), so it remains work in progress.
XPath 4.0 Syntax Extensions
Some experimental syntax extensions intended for 4.0 have been dropped. "Dot Functions" (written as .{.+1} or fn{.+1}) are now written as ->{.+1}. "Underscore functions" (written as _{$1 + $2}) are dropped entirely (write instead ->($p1, $p2){$p1 + $p2}).
Union NodeTests are implemented. This feature allows steps such as ancestor::(div1|div2|div3), @(id|name), and following-sibling::(comment()|processing-instruction()).
Static function calls may supply arguments by keyword as well as by position.
From Saxon 12.1, XPath 4.0 string templates are implemented. Example: let $message := `{$day} of {$month}, {$year}`.
From Saxon 12.1, XPath "braced if" expressions are implemented. Examples:
- if ($condition) {<x>It's true</x>}
- if ($condition) {<x>It's true</x>} else {<x>It's a lie</x>}
- if ($condition) {<x>It's true</x>} else if ($polite) {<x>He mis-spoke</x>} else {<x>It's a lie</x>}
From 12.2, numeric integer literals can be written in hex (0xFFFF0000) or binary (0b101010), and underscores can be used as separators (1_000_000).
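Python happens to accept the same three literal forms, so the values these literals denote can be checked directly. This is only a cross-check of the arithmetic in another language, not a statement about how Saxon parses the literals.

```python
# The three literal forms from the text, written as Python integer literals.
hex_value = 0xFFFF0000       # hexadecimal
binary_value = 0b101010      # binary
grouped_value = 1_000_000    # underscores as digit separators

print(hex_value)      # 4294901760
print(binary_value)   # 42
print(grouped_value)  # 1000000
```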
From 12.3, the non-ASCII characters × (xD7) and ÷ (xF7) can be used in place of * and div to represent the multiplication and division operators. The ＜ and ＞ characters (xFF1C and xFF1E) can be used in place of < and > to represent the less-than and greater-than operators; these characters can also be used in place of < and > in the compound tokens <=, >=, <<, >>, =>, ->, and =!>.
From 12.3, the mapping arrow operator =!> is implemented. This is similar to the XPath 3.1 arrow operator (for example $x => abs()), but it applies the function on the right-hand side to each item delivered by the left-hand side individually. For example (-2 to +2) =!> abs() returns (2, 1, 0, 1, 2). The effect is similar to writing (-2 to +2) ! abs(.), but the operator precedences make it easier to construct a pipeline of operations without parentheses.
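The difference between the two arrow operators can be modelled in a few lines of Python. This is a sketch of the semantics only: the helper names arrow and mapping_arrow are hypothetical, Python lists stand in for XPath sequences, and each function call is assumed to return a single item.

```python
from typing import Any, Callable, List, Sequence


def arrow(sequence: Sequence[Any], fn: Callable[[Sequence[Any]], Any]) -> Any:
    """Model of `=>`: the whole left-hand sequence is passed as one argument."""
    return fn(sequence)


def mapping_arrow(sequence: Sequence[Any], fn: Callable[[Any], Any]) -> List[Any]:
    """Model of `=!>`: the function is applied to each item individually."""
    return [fn(item) for item in sequence]


values = list(range(-2, 3))          # models (-2 to +2)

print(mapping_arrow(values, abs))    # [2, 1, 0, 1, 2]
print(arrow(values, len))            # 5
```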
From 12.3, an inline function following the arrow operator no longer needs to be parenthesized: the expression $in => (function($x){$x+1})() can now be written $in => function($x){$x+1}(). It is also possible to use a focus function: $in => function{.+1}()
From 12.3, the function coercion rules are extended to allow a supplied function item to have lower arity than that implied by the signature of the required type. For example, map:for-each() expects a function with two arguments, which are set respectively to the key and the value of an entry in the map. But if you are only interested in the key, you can supply a function of arity 1, and your function will be called omitting the second argument. Similarly, for a function that expects a predicate (a function of arity one), you can now supply the value fn:true#0 which has arity zero: this has the effect that the predicate will always be true.
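The arity rule can be pictured as wrapping the supplied function so that surplus trailing arguments are dropped before the call. The Python sketch below models that idea; coerce_arity and map_for_each are hypothetical names, and the wrapper is an illustration of the rule rather than a description of how Saxon implements it.

```python
import inspect
from typing import Any, Callable, Dict, List


def coerce_arity(fn: Callable[..., Any], required: int) -> Callable[..., Any]:
    """Wrap `fn` so it can be called with `required` positional arguments,
    even if it declares fewer; the surplus trailing arguments are dropped."""
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    supplied = sum(
        1 for p in inspect.signature(fn).parameters.values()
        if p.kind in positional_kinds
    )
    if supplied > required:
        raise TypeError(f"arity {supplied} exceeds required arity {required}")

    def wrapper(*args: Any) -> Any:
        return fn(*args[:supplied])

    return wrapper


def map_for_each(mapping: Dict[Any, Any], action: Callable[..., Any]) -> List[Any]:
    """Model of map:for-each(): the action is nominally called with (key, value)."""
    coerced = coerce_arity(action, 2)
    return [coerced(key, value) for key, value in mapping.items()]


print(map_for_each({"a": 1, "b": 2}, lambda k, v: f"{k}={v}"))  # ['a=1', 'b=2']
print(map_for_each({"a": 1, "b": 2}, lambda k: k.upper()))      # ['A', 'B']

# A zero-arity function used where a predicate of arity one is expected.
always_true = coerce_arity(lambda: True, 1)
print(always_true("anything"))                                  # True
```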
From 12.3, the syntax for "focus functions" is changed from ->{EXPR} to function{EXPR}. At the same time, the abbreviated syntax for inline functions with named parameters (known as lambda expressions) is changed from ->($x, $y){$x + $y} to ($x, $y)->{$x + $y}. In both cases, Saxon continues to support the older format for the time being.
From 12.3, for and let expressions can use the keyword repeatedly, rather than using a comma: for $i in 1 to 10 for $j in 1 to 10 return $i * $j, or let $i := 10 let $j := 20 return $i + $j. This syntax was already valid in XQuery.
The implementation of "for member" expressions has changed in 12.3: as a result of this, any SEF files using the construct will need to be recompiled.
From 12.3, Switch and Typeswitch expressions in XQuery allow curly braces, for example:
switch($x){case 1: return "a" case 2: return "b" default: return "?"}
From 12.4, element and attribute tests can use the syntax element(A|B) and attribute(A|B) to accept a union of names. Wildcards are also allowed, for example element(p:*|q:*).
From 12.4, casting to locally-defined union types and enumeration types is supported.
From 12.4, an extensible record test with no named fields is allowed: record(*).
From 12.4, there has been a change to the for member construct. To do two nested iterations over arrays $A and $B, you now need to write for member $a in $A, member $b in $B return .... Previously the second member keyword was not needed; it was assumed to apply to subsequent clauses.
From 12.4, the atomic types xs:hexBinary and xs:base64Binary are mutually comparable; and promotion on function calls and variable binding now ensures that either type can be supplied where the other is required.
XQuery 4.0 extensions
See also XPath 4.0 extensions.
Function declarations may declare some parameters as optional, with a default value. From Saxon 12.1, the default value must either be a constant (for example, (), 0, false(), or ""), or the context item expression (.). This restriction is imposed pending clarification of the specification.
From 12.3, XQuery switch expressions are generalized so that each case may match multiple values, for example case 0 to 9 return "single".
From 12.3, some of the constructs in a FLWOR window clause become optional, for example it is no longer necessary to say when true().
From 12.3, variables (both global variable declarations and local variables bound in let, for [member] and group by clauses) are subject to type coercion in the same way as function parameters. For example it becomes possible to say let $x as xs:double := 1 because the integer 1 is now coerced to an xs:double.
From 12.4, true() and false() are allowed as annotation values. (In previous XQuery versions, only string literals and numeric literals were allowed.)
From 12.4, the comparand expression in a switch expression can be omitted, and defaults to true().
From 12.4, the XQuery syntax for declaring named item types is changed to match the syntax in the draft specification (declare item-type my:type as union(x, y); in place of declare type my:type = union(x, y)).
XSLT 4.0 extensions
If XSLT 4.0 extensions are enabled, two new (experimental) values are available for xsl:mode/@on-no-match, namely shallow-copy-all and shallow-skip-all. These have the same effect as shallow-copy and shallow-skip respectively, except when processing maps and arrays.
The shallow-skip-all option has been dropped from the spec, and is removed in Saxon 12.4.
For shallow-copy-all:
- When processing an array that does not match any template rule, a new array is created. The existing array is split into a set of parcels, one for each member of the array, and each parcel is processed by making an implicit call on xsl:apply-templates. A parcel is represented as a single-entry map with the key "value", and can be matched using the pattern match="record(value)". The value of the array member can be extracted from the parcel using the function fn:unparcel().
- When processing a map that does not match any template rule, a new map is created. The existing map is split into a set of map entries, one for each key-value pair in the map. These entries are represented as singleton maps, and they are processed by making an implicit call on xsl:apply-templates. Any template rule that processes such a singleton map is required to return a result that is a map. The resulting maps are merged, effectively by calling map:merge() with the duplicates option set to use-last. If a singleton map does not match any template rule, then it is effectively skipped.
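The map case of shallow-copy-all described above (split into singleton maps, process each, merge with the last duplicate winning) can be modelled in Python as follows. This is a sketch of the data flow only: process_map and example_rule are hypothetical names, a plain function stands in for the template rules, and returning None is this sketch's way of modelling a singleton map that matches no rule.

```python
from typing import Any, Callable, Dict, Optional

Rule = Callable[[Dict[Any, Any]], Optional[Dict[Any, Any]]]


def process_map(source: Dict[Any, Any], rule: Rule) -> Dict[Any, Any]:
    """Split `source` into singleton maps, apply `rule` to each, and merge
    the resulting maps so that later entries overwrite earlier ones."""
    result: Dict[Any, Any] = {}
    for key, value in source.items():
        entry = {key: value}        # one singleton map per key-value pair
        produced = rule(entry)
        if produced is None:        # models "no template rule matched"
            continue                # the entry is effectively skipped
        result.update(produced)     # last duplicate wins, like use-last
    return result


def example_rule(entry: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Hypothetical rule: ignore keys starting with '_' and lower-case the rest."""
    (key, value), = entry.items()
    if key.startswith("_"):
        return None
    return {key.lower(): value}


print(process_map({"A": 1, "a": 2, "_x": 3}, example_rule))   # {'a': 2}
```

Here the entries for "A" and "a" both produce the key "a", so the later value replaces the earlier one, and the "_x" entry is dropped.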
The experimental @array and @map attributes of xsl:for-each, xsl:for-each-group, and xsl:iterate added in Saxon 11 are dropped. In their place, use select="array:members($array)" or select="map:key-value-pairs($map)".
From 12.2, the xsl:array/@use attribute is implemented. The @composite attribute is retained for the time being: composite="yes" is treated as equivalent to use="?value".
The proposed instruction xsl:match added in Saxon 11 has been dropped.
Function declarations (xsl:function) may declare some parameters as optional, with a default value. From Saxon 12.1, the default value must either be a constant (for example, (), 0, false(), or ""), or the context item expression (.). This restriction is imposed pending clarification of the specification.
From 12.2 the syntax for type patterns changes so that match="type(T)" may specify any item type as T (including, for example, an atomic type name: match="type(xs:integer)"). The syntax match="atomic(T)" is dropped from the draft specification, and match="type(T)" should be used instead. For now Saxon still allows match="atomic(T)" but will produce a warning; it will be removed in a future Saxon release. The priority rules for type patterns are not yet fully aligned with the evolving specification: matching of atomic types works correctly according to the type hierarchy, but if union types and/or record types are used in match patterns (for example match="record(lat, long)") they should be disambiguated using explicit priorities.
From 12.3 xsl:matching-substring and xsl:non-matching-substring may take a select attribute in place of a contained sequence constructor.
From 12.3 enclosing modes are implemented: xsl:template rules can be defined as a child of xsl:mode, and within those template rules, any xsl:apply-templates instruction defaults to the enclosing mode.
From 12.4 the xsl:accumulator-rule/@saxon:capture extension has now been incorporated as a standard XSLT 4.0 feature under the name xsl:accumulator-rule/@capture. The Saxon attribute name remains available for the time being as a synonym.