Functions, operators, and data types for XPath 2.0

Implemented the escape-uri() function. The '#' character is treated as a reserved character, in addition to those listed in the specification. {expr85}

Implemented the item-at() function, with one restriction: an out-of-range subscript should raise an error, but currently returns the empty sequence. {pos65}
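The difference between the current and the intended behaviour of item-at() can be sketched as follows. This is a Python illustration only, not Saxon code: the function names are hypothetical, positions are assumed to be 1-based as elsewhere in XPath, and None stands in for the empty sequence.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def item_at_current(seq: Sequence[T], position: int) -> Optional[T]:
    """Behaviour described in the note: out-of-range gives the empty sequence.

    None is used here as a stand-in for the empty sequence (an assumption
    of this sketch).
    """
    if 1 <= position <= len(seq):
        return seq[position - 1]  # XPath positions are 1-based
    return None


def item_at_strict(seq: Sequence[T], position: int) -> T:
    """Intended behaviour: an out-of-range subscript is an error."""
    if not 1 <= position <= len(seq):
        raise IndexError(f"subscript {position} is out of range")
    return seq[position - 1]
```

A stylesheet that relies on the current lenient behaviour would start failing once the error is raised, so it is probably safer to test the subscript against the sequence length before calling the function.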

Implemented the data() function. {schema008, 009, 010, 012}

The concept of "effective boolean value" has been implemented. This algorithm is now used when converting any value to a boolean in contexts such as conditional expressions, filter predicates, and the boolean() function. It is fully backwards compatible with XPath 1.0.

A different, more restricted algorithm is used when casting values to booleans using a cast expression or the xs:boolean() constructor. For strings in particular, the effective boolean value is false for a zero-length string and true for any other string, whereas xs:boolean() (in line with W3C Schema) gives true for "1" or "true", false for "0" or "false", and an error for any other string. {type034} The xs:boolean() changes are not yet complete for supplied values other than strings.
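The two string-to-boolean rules are easy to confuse, so here is a minimal sketch of both for string input only. It is written in Python purely for illustration; the function names are hypothetical, and whitespace normalization of the lexical form is deliberately not modelled.

```python
def effective_boolean_value_of_string(value: str) -> bool:
    """Rule used in conditionals, predicates, and boolean():
    a zero-length string is false, any other string is true."""
    return len(value) > 0


def cast_string_to_boolean(value: str) -> bool:
    """Rule used by a cast expression or xs:boolean():
    only the four lexical forms named in the note are accepted."""
    if value in ("1", "true"):
        return True
    if value in ("0", "false"):
        return False
    # Any other string is an error (ValueError is this sketch's choice).
    raise ValueError(f"cannot cast {value!r} to xs:boolean")
```

The string "false" shows the difference most clearly: its effective boolean value is true, because it is not zero-length, while casting it to xs:boolean yields false.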

The algorithm for "atomize" is also available for all expressions, though at present it is used only for the argument of a cast. It is also simpler than the algorithm described in the specification because at present the typed value of a node is always the same as the string value.

Changed the EXSLT set:leading() and set:trailing() functions (as required by the spec) so that if the second argument is empty, the first argument is returned. Changed saxon:before() and saxon:after() to work the same way. Previously, the empty node-set was returned. This change will be retrofitted to 6.5.x. There is a further deviation from the spec: if no node in the second node-set is present in the first node-set, Saxon returns all nodes before/after the first/last in the second node-set, whereas the spec requires it to return an empty sequence. Conforming would require a redesign and would prevent a pipelined implementation, so I don't intend to implement this change.
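The set:leading() case can be sketched as follows, with set:trailing() being the mirror image. This is an illustrative Python model, not Saxon's implementation: nodes are represented as integers giving their position in document order, the function names are hypothetical, and the membership test in the second function follows the EXSLT rule that the first node of the second node-set must be present in the first node-set.

```python
from typing import Iterable, List


def leading_as_described(ns1: Iterable[int], ns2: Iterable[int]) -> List[int]:
    """Behaviour described in the note, on integer stand-ins for nodes."""
    first_set = sorted(ns1)
    second_set = sorted(ns2)
    if not second_set:
        return first_set  # empty second argument: return the first argument
    boundary = second_set[0]  # first node of ns2 in document order
    # No membership check: every node of ns1 before the boundary is returned.
    return [node for node in first_set if node < boundary]


def leading_per_exslt(ns1: Iterable[int], ns2: Iterable[int]) -> List[int]:
    """Behaviour required by the spec, for comparison."""
    first_set = sorted(ns1)
    second_set = sorted(ns2)
    if not second_set:
        return first_set
    boundary = second_set[0]
    if boundary not in first_set:
        return []  # spec: boundary node absent from ns1 gives an empty result
    return [node for node in first_set if node < boundary]
```

In this model the first function can emit nodes as soon as it sees them, whereas the second cannot emit anything until it knows whether the boundary node occurs in the first node-set, which is one way to read the pipelining concern.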

Implemented the string-to-codepoints() and codepoints-to-string() functions, replacing saxon:string-to-unicode() and saxon:unicode-to-string(). {saxon68-69}
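As a rough illustration of what this pair of functions computes, the following Python sketch maps a string to a list of Unicode code points and back. The function names simply mirror the XPath names and are not part of any real library. Python strings iterate by code point, so a character outside the Basic Multilingual Plane is assumed to count as a single integer here.

```python
from typing import Iterable, List


def string_to_codepoints(value: str) -> List[int]:
    """Return the Unicode code point of each character, in order."""
    return [ord(char) for char in value]


def codepoints_to_string(codepoints: Iterable[int]) -> str:
    """Build a string from a sequence of Unicode code points."""
    return "".join(chr(codepoint) for codepoint in codepoints)
```

The two functions are inverses of each other: converting a string to code points and back yields the original string.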

Implemented the string-join() function. {str125}

Implemented the "castable as" operator. {type030}

Implemented the types xs:anyURI and xs:QName, and the functions expanded-QName(), get-local-name-from-QName(), and get-namespace-from-QName(). {type031-33}

Implemented the SequenceType grammar for "attribute of type T" and "element of type T". T must be a built-in simple type. {schema002-004, 014; error009, 012}

The second argument of saxon:serialize() must now be known at compile-time. This is because details of xsl:output declarations are not available at run-time unless they are actually referenced.

The results of the function-available() and element-available() functions may be inaccurate if the argument is not known at compile-time. Specifically, only system-defined functions and instructions are known at run-time. In practice, these functions are designed to perform compile-time tests so this is very unlikely to be a problem. There is also some justification in that the only functions that can be called dynamically (using saxon:evaluate()) are system-defined functions.

As a result of the changes affecting stylesheet compilation, there are some new restrictions on the extension function saxon:evaluate() (and also saxon:expression()). In particular, the dynamically constructed expression can no longer reference any XSLT variables, and it cannot access any stylesheet functions, Saxon extension functions, or XSLT-specific functions such as key() and generate-id().