Saxonica.com

XPath changes

Static type checking has become a little stronger. One change is that in a conditional expression (which includes xsl:choose in XSLT and typeswitch in XQuery), each branch of the conditional must satisfy the required type, as must the default branch if there is one.

In regular expressions, a back-reference can no longer appear within a character class expression (thus "(abc|def)[\1]" is now illegal). This change has been agreed as an erratum by the W3C working groups, on the grounds that such an expression is meaningless.
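As a rough illustration of the rule, the following sketch scans a regular expression and reports whether a back-reference occurs inside a character class. The function name has_backreference_in_class is hypothetical and the scanner is deliberately simplified; it is not Saxon's parser.

```python
def has_backreference_in_class(regex: str) -> bool:
    """Return True if a back-reference (\\1 to \\9) appears inside [...]."""
    depth = 0  # nesting depth of character classes (subtraction can nest them)
    i = 0
    while i < len(regex):
        ch = regex[i]
        if ch == "\\" and i + 1 < len(regex):
            # An escape sequence: inspect the escaped character, then skip it.
            if depth > 0 and regex[i + 1] in "123456789":
                return True
            i += 2
            continue
        if ch == "[":
            depth += 1
        elif ch == "]" and depth > 0:
            depth -= 1
        i += 1
    return False


print(has_backreference_in_class(r"(abc|def)[\1]"))  # now illegal
print(has_backreference_in_class(r"(abc|def)\1"))    # still allowed
```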

In regular expressions, the interpretation of the character class escapes [\c] and [\i] is now sensitive to the selected version of XML in the configuration: if the configuration is set to XML 1.1, then the XML 1.1 definitions of Letter and NameChar are used in place of the XML 1.0 definitions. Internally, the routines for classifying characters according to their validity in different contexts in XML 1.0 and XML 1.1 have been reorganized: the data tables for XML 1.0 and XML 1.1 have been merged into a single table, and this is now used also by the regular expression routines to support the \c and \i character class escapes. One side-effect of this change is that Saxon now includes no Apache code, which slightly simplifies the license conditions.

The implementation of durations has changed to use a two-component model (months and seconds) rather than a six-component model (years, months, days, hours, minutes, seconds). Previously Saxon was capable of maintaining unnormalized durations (for example 18 months), but there were no longer any XPath functions or operators that relied on this. The change raises some implementation-defined limits. Some operations that exceed the implementation-defined limits may now be detected rather than causing incorrect results. The change may also affect Java applications that relied on maintaining durations in unnormalized form.
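The effect of moving to two components can be sketched as follows. This is only an illustration of the arithmetic; the Duration class and normalize function are hypothetical names and do not reflect Saxon's Java classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Duration:
    """Two-component model: whole months plus seconds."""
    months: int
    seconds: int


def normalize(years=0, months=0, days=0, hours=0, minutes=0, seconds=0):
    """Fold six components into the two-component form."""
    total_months = years * 12 + months
    total_seconds = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    return Duration(total_months, total_seconds)


# "18 months" and "1 year 6 months" are no longer distinguishable.
print(normalize(months=18) == normalize(years=1, months=6))
print(normalize(days=1, hours=12))
```

An application that needs to remember that a duration was originally written as 18 months would have to keep that information separately.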

The implicit timezone is now a genuine part of the dynamic context. In previous releases, it was a property of the Configuration, which meant it was known at compile time. The effect of this is that a compiled query or stylesheet can now be used across a change of timezone. As a result, operations that depend on the implicit timezone can no longer be evaluated statically, even if the operands are known.

One minor glitch in this: when xsl:key is used and the values that are indexed are timezone-less dates or times, the index is built using the implicit timezone from the session in which the index was built. If the same source document is used with the same stylesheet in a number of separate transformations, then indexes are reused across sessions. In this situation, it is necessary to ensure that all such sessions use the same implicit timezone. To avoid this problem, don't construct indexes using timezone-less dates or times.
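The dependency on the implicit timezone, and the index problem described above, can be illustrated with a small sketch. It uses Python's datetime module rather than Saxon itself, and the helper names and the "order-1" entry are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def with_implicit_timezone(value, implicit_tz):
    """Attach the implicit timezone to a value that has none."""
    if value.tzinfo is None:
        return value.replace(tzinfo=implicit_tz)
    return value


def equal_under(a, b, implicit_tz):
    """Compare two values after applying the implicit timezone."""
    return (with_implicit_timezone(a, implicit_tz)
            == with_implicit_timezone(b, implicit_tz))


def index_key(value, implicit_tz):
    """Key used by a toy index: the value as a UTC instant."""
    return with_implicit_timezone(value, implicit_tz).astimezone(timezone.utc)


utc = timezone.utc
plus_one = timezone(timedelta(hours=1))

local_noon = datetime(2024, 1, 1, 12, 0)                     # no timezone
noon_plus_one = datetime(2024, 1, 1, 12, 0, tzinfo=plus_one)

# The same comparison gives different answers in different sessions,
# so it cannot be evaluated at compile time.
print(equal_under(local_noon, noon_plus_one, plus_one))
print(equal_under(local_noon, noon_plus_one, utc))

# An index built in one session misses when probed from another.
index = {index_key(local_noon, plus_one): "order-1"}
print(index_key(local_noon, utc) in index)
```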

The rule in min() and max() that the returned value should be an instance of the lowest common supertype of the values in the input sequence is now correctly implemented. For example, max((xs:unsignedInt(3), xs:positiveInteger(2))) returns the value 3 with type label xs:nonNegativeInteger.
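The type-label rule can be sketched as a walk up the type hierarchy. The PARENT table below covers only a fragment of the XML Schema integer types, and the function names are hypothetical; the sketch ignores the other rules of min() and max(), such as numeric promotion.

```python
from functools import reduce

# Fragment of the XML Schema built-in type hierarchy (child -> parent).
PARENT = {
    "xs:integer": "xs:decimal",
    "xs:nonNegativeInteger": "xs:integer",
    "xs:positiveInteger": "xs:nonNegativeInteger",
    "xs:unsignedLong": "xs:nonNegativeInteger",
    "xs:unsignedInt": "xs:unsignedLong",
    "xs:long": "xs:integer",
    "xs:int": "xs:long",
}


def ancestors(type_name):
    """Return the chain from a type up to the root of the table."""
    chain = [type_name]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def lowest_common_supertype(a, b):
    """Return the nearest type that both a and b derive from."""
    seen = set(ancestors(a))
    for candidate in ancestors(b):
        if candidate in seen:
            return candidate
    raise ValueError(f"no common supertype for {a} and {b}")


def typed_max(items):
    """items is a list of (value, type_name) pairs."""
    value = max(v for v, _ in items)
    label = reduce(lowest_common_supertype, (t for _, t in items))
    return value, label


print(typed_max([(3, "xs:unsignedInt"), (2, "xs:positiveInteger")]))
# (3, 'xs:nonNegativeInteger')
```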

When casting from a value other than xs:string to a user-defined type on which a pattern facet is defined, the specification requires the canonical lexical representation of the target value to be valid against the facet. In previous releases Saxon was checking not the canonical lexical representation as defined in XML Schema, but the result of casting to a string as defined in XPath. This has now been fixed. Cases where the two are different include: xs:decimal when the value is a whole number; xs:double and xs:float; xs:dateTime and xs:time when the value is in a timezone other than Z; xs:date with a timezone outside the range -12:00 to +11:59.
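The xs:decimal case can be sketched as follows, assuming a hypothetical user-defined type whose pattern facet requires a decimal point. The function names are illustrative only, and the sketch covers just the whole-number difference between the two representations.

```python
import re
from decimal import Decimal

# Hypothetical pattern facet: digits, a decimal point, then more digits.
PATTERN = r"[0-9]+\.[0-9]+"


def xpath_string(value: Decimal) -> str:
    """Result of casting xs:decimal to xs:string in XPath."""
    if value == value.to_integral_value():
        return str(int(value))            # whole numbers lose the point
    return format(value.normalize(), "f")


def schema_canonical(value: Decimal) -> str:
    """Canonical lexical representation of xs:decimal in XML Schema."""
    if value == value.to_integral_value():
        return f"{int(value)}.0"          # the decimal point is required
    return format(value.normalize(), "f")


three = Decimal("3")
print(xpath_string(three), bool(re.fullmatch(PATTERN, xpath_string(three))))
print(schema_canonical(three), bool(re.fullmatch(PATTERN, schema_canonical(three))))
```

With this facet, the value 3 fails a check against the XPath string but passes against the canonical form, which is the check the specification requires.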
