Saxonica.com

Internal Changes

The Expression and Instruction classes, originally used for XPath expressions and XSLT instructions respectively, have now been fully merged. This creates a cleaner execution model for XQuery, and allows optimizations that were previously confined to individual XPath expressions (specifically, tree rewriting) to be applied at the XSLT level. This change affects a number of internal APIs that may be used by user applications.

All dynamic errors are now handled internally using the DynamicError class; the TransformerException is used only on public interfaces. The DynamicError object allows an error code to be set, and this is now used for many error conditions; if set, the error code is displayed by the standard error listener, and it is available to applications via the getErrorCode() method on the exception object. The DynamicError object can also hold a reference to the XPathContext, that is, the dynamic context in which the error occurred: this can be used for diagnostics.
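
The relationship between an internal error object, its optional error code, and its optional context can be sketched as follows. This is a simplified model rather than Saxon's own code: the names SketchDynamicError, SketchContext and report(), and the code "ERR0001", are invented for illustration, and the sketch assumes that the internal error class extends TransformerException.

```java
import javax.xml.transform.TransformerException;

final class DynamicErrorSketch {

    /** Hypothetical stand-in for the dynamic context; it records only a description. */
    static final class SketchContext {
        final String description;

        SketchContext(String description) {
            this.description = description;
        }
    }

    /** Simplified internal error: a message plus an optional code and context. */
    static final class SketchDynamicError extends TransformerException {
        private static final long serialVersionUID = 1L;
        private String errorCode;                  // null until a code is set
        private transient SketchContext context;   // null when the context is unknown

        SketchDynamicError(String message) {
            super(message);
        }

        void setErrorCode(String errorCode) { this.errorCode = errorCode; }
        String getErrorCode() { return errorCode; }
        void setContext(SketchContext context) { this.context = context; }
        SketchContext getContext() { return context; }
    }

    /** Formats an error as a listener might: the code is shown only if one was set. */
    static String report(SketchDynamicError error) {
        String code = error.getErrorCode();
        return (code == null ? "" : code + ": ") + error.getMessage();
    }

    public static void main(String[] args) {
        SketchDynamicError error = new SketchDynamicError("Division by zero");
        error.setErrorCode("ERR0001");   // placeholder, not a real Saxon error code
        error.setContext(new SketchContext("evaluating a select expression"));
        System.out.println(report(error));
        System.out.println("Context: " + error.getContext().description);
    }
}
```

Because the sketch's error class is itself a TransformerException, code on a public interface can catch the standard exception type and still reach the error code when one is present.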

There are now essentially three methods for evaluating an expression: the iterate() method, which returns an iterator over the items in the expression's value; the evaluateItem() method, which is suitable only for expressions returning zero or one items, and which returns the item in question, or null if there is none; and the process() method, which pushes the results of the expression to a Receiver. Every expression must support at least one of these three methods directly, and supports the others indirectly through a superclass. There are three main families of expressions: ComputedExpression, which handles all the traditional XPath expressions (including functions); Value, which handles constant values (including sequences); and Instruction, which handles XSLT instructions and also supports XQuery node construction expressions. All three families implement the Expression interface.
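
The idea that an expression implements one evaluation method directly and inherits the other two can be illustrated with a minimal sketch. It is not Saxon's class hierarchy: the names Expr, PullExpr, SingleItemExpr and PushExpr are hypothetical, items are modelled as plain Java objects, and a java.util.function.Consumer stands in for the Receiver.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

final class EvaluationSketch {

    /** Simplified expression: items are plain Objects, the receiver is a Consumer. */
    interface Expr {
        Iterator<Object> iterate();
        Object evaluateItem();
        void process(Consumer<Object> receiver);
    }

    /** For expressions that implement iterate() directly. */
    abstract static class PullExpr implements Expr {
        public Object evaluateItem() {
            Iterator<Object> items = iterate();
            return items.hasNext() ? items.next() : null;
        }
        public void process(Consumer<Object> receiver) {
            iterate().forEachRemaining(receiver);
        }
    }

    /** For expressions that implement evaluateItem() directly (zero or one item). */
    abstract static class SingleItemExpr implements Expr {
        public Iterator<Object> iterate() {
            List<Object> items = new ArrayList<>();
            Object item = evaluateItem();
            if (item != null) items.add(item);
            return items.iterator();
        }
        public void process(Consumer<Object> receiver) {
            Object item = evaluateItem();
            if (item != null) receiver.accept(item);
        }
    }

    /** For expressions that implement process() directly. */
    abstract static class PushExpr implements Expr {
        public Iterator<Object> iterate() {
            List<Object> buffer = new ArrayList<>();
            process(buffer::add);   // capture the pushed items, then pull from the buffer
            return buffer.iterator();
        }
        public Object evaluateItem() {
            Iterator<Object> items = iterate();
            return items.hasNext() ? items.next() : null;
        }
    }

    /** A constant single item, evaluated natively via evaluateItem(). */
    static Expr literal(final Object value) {
        return new SingleItemExpr() {
            public Object evaluateItem() { return value; }
        };
    }

    /** An integer range, evaluated natively via process(). */
    static Expr range(final int from, final int to) {
        return new PushExpr() {
            public void process(Consumer<Object> receiver) {
                for (int i = from; i <= to; i++) receiver.accept(i);
            }
        };
    }

    /** Drains an expression through iterate() into a list. */
    static List<Object> toList(Expr expr) {
        List<Object> result = new ArrayList<>();
        expr.iterate().forEachRemaining(result::add);
        return result;
    }
}
```

In this sketch the push-based range buffers its output when a caller asks for an iterator, which is the simplest way to adapt push to pull; a real implementation may well avoid that buffering.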

The body of an XSLT function (xsl:function) is now always compiled into a single expression (using an append expression if there is more than one instruction). This means that the executable code for equivalent XSLT and XQuery functions is now identical.

The mechanism for binding function calls in XPath (and XQuery) expressions has changed, to reduce the amount of duplication between different implementations of the static context. The static context now supports a method getFunctionLibrary(), which returns an object of type FunctionLibrary. In practice this will be a FunctionLibraryList, which is a composite function library consisting of several component libraries. There are implementations of FunctionLibrary to support standard system functions, vendor-defined functions, user-defined stylesheet functions, user-defined XQuery functions, and Java extension functions. The different implementations of StaticContext merely assemble these different function libraries in slightly different ways. A FunctionLibrary supports two methods: isAvailable(), which can be used to check whether a given function is available for use, and bind(), which returns a FunctionCall object representing a call on the named function.
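
The composite arrangement can be sketched as below. This is an illustration only: the type names Library, SimpleLibrary, LibraryList and BoundCall are hypothetical, and the method parameters (a function name and an arity) and the convention of returning null for an unknown function are assumptions of the sketch, not the actual signatures of FunctionLibrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FunctionLibrarySketch {

    /** Hypothetical stand-in for a bound function call. */
    static final class BoundCall {
        final String libraryName;
        final String functionName;
        final int arity;

        BoundCall(String libraryName, String functionName, int arity) {
            this.libraryName = libraryName;
            this.functionName = functionName;
            this.arity = arity;
        }
    }

    /** Simplified library interface: functions are identified by name and arity only. */
    interface Library {
        boolean isAvailable(String functionName, int arity);
        BoundCall bind(String functionName, int arity);   // null if not in this library
    }

    /** A library backed by a fixed set of "name#arity" keys. */
    static final class SimpleLibrary implements Library {
        private final String libraryName;
        private final Set<String> keys;

        SimpleLibrary(String libraryName, String... keys) {
            this.libraryName = libraryName;
            this.keys = new HashSet<>(Arrays.asList(keys));
        }
        public boolean isAvailable(String functionName, int arity) {
            return keys.contains(functionName + "#" + arity);
        }
        public BoundCall bind(String functionName, int arity) {
            return isAvailable(functionName, arity)
                    ? new BoundCall(libraryName, functionName, arity)
                    : null;
        }
    }

    /** Composite library: consults its components in the order they were added. */
    static final class LibraryList implements Library {
        private final List<Library> libraries = new ArrayList<>();

        LibraryList add(Library library) {
            libraries.add(library);
            return this;
        }
        public boolean isAvailable(String functionName, int arity) {
            for (Library library : libraries) {
                if (library.isAvailable(functionName, arity)) return true;
            }
            return false;
        }
        public BoundCall bind(String functionName, int arity) {
            for (Library library : libraries) {
                BoundCall call = library.bind(functionName, arity);
                if (call != null) return call;   // the first library that knows the function wins
            }
            return null;
        }
    }

    /** One possible assembly; a different static context could order the libraries differently. */
    static Library exampleLibrary() {
        return new LibraryList()
                .add(new SimpleLibrary("system", "concat#2", "count#1"))
                .add(new SimpleLibrary("stylesheet", "my:total#1"));
    }
}
```

With this shape, each kind of static context only has to decide which component libraries to include and in what order; the lookup logic itself lives in one place.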

There are some changes that affect user-written extension instructions. If you have implemented such instructions, study the revised code for the SQL extension instructions to see how they now work. Extension instructions should now be implemented as subclasses of ExtensionInstruction, which contains some useful helper methods. In many cases it will be convenient to compile the extension instruction to a subclass of SimpleExpression, which implements many of the methods that every expression is required to support. See the JavaDoc for more information.

The current output destination is now maintained in the XPath context, not in the Controller. Similarly, the current template, current mode, current group, and current regex iterator are all now maintained in the XPathContext object. This change allows lazy evaluation of constructs that use these context variables.

The Controller no longer maintains a current item. Instead, the current item in the XPath context is always used. The XSLT current() function is now implemented by static rewriting of the expression in which it appears, so that on entry to the expression, the value of "." is assigned to a local variable, which is then referenced at the point where the call to current() appeared.
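
The rewriting of current() described above can be illustrated with a toy expression tree. This sketch is not Saxon's implementation: every class name, the replaceCurrent() method, and the variable name "saved-current" are invented for the example, and values are modelled as strings.

```java
import java.util.HashMap;
import java.util.Map;

final class CurrentRewriteSketch {

    /** Simplified expression node; values are strings, variables live in a map. */
    interface Expr {
        String eval(String contextItem, Map<String, String> variables);
        Expr replaceCurrent(String variableName);   // copy with current() replaced
    }

    /** The context item expression ".". */
    static final class ContextItem implements Expr {
        public String eval(String contextItem, Map<String, String> variables) { return contextItem; }
        public Expr replaceCurrent(String variableName) { return this; }
    }

    /** An unrewritten call to current(); it cannot be evaluated directly. */
    static final class CurrentCall implements Expr {
        public String eval(String contextItem, Map<String, String> variables) {
            throw new IllegalStateException("current() was not rewritten");
        }
        public Expr replaceCurrent(String variableName) { return new VariableReference(variableName); }
    }

    /** A reference to a local variable. */
    static final class VariableReference implements Expr {
        final String name;
        VariableReference(String name) { this.name = name; }
        public String eval(String contextItem, Map<String, String> variables) { return variables.get(name); }
        public Expr replaceCurrent(String variableName) { return this; }
    }

    /** Evaluates its body with "." changed, as a path step or predicate would. */
    static final class WithContext implements Expr {
        final String newContextItem;
        final Expr body;
        WithContext(String newContextItem, Expr body) { this.newContextItem = newContextItem; this.body = body; }
        public String eval(String contextItem, Map<String, String> variables) {
            return body.eval(newContextItem, variables);
        }
        public Expr replaceCurrent(String variableName) {
            return new WithContext(newContextItem, body.replaceCurrent(variableName));
        }
    }

    /** String concatenation of two subexpressions. */
    static final class Concat implements Expr {
        final Expr left;
        final Expr right;
        Concat(Expr left, Expr right) { this.left = left; this.right = right; }
        public String eval(String contextItem, Map<String, String> variables) {
            return left.eval(contextItem, variables) + right.eval(contextItem, variables);
        }
        public Expr replaceCurrent(String variableName) {
            return new Concat(left.replaceCurrent(variableName), right.replaceCurrent(variableName));
        }
    }

    /** Binds a variable to the value of an expression, then evaluates the body. */
    static final class Let implements Expr {
        final String name;
        final Expr value;
        final Expr body;
        Let(String name, Expr value, Expr body) { this.name = name; this.value = value; this.body = body; }
        public String eval(String contextItem, Map<String, String> variables) {
            Map<String, String> extended = new HashMap<>(variables);
            extended.put(name, value.eval(contextItem, variables));
            return body.eval(contextItem, extended);
        }
        public Expr replaceCurrent(String variableName) {
            return new Let(name, value.replaceCurrent(variableName), body.replaceCurrent(variableName));
        }
    }

    /** Rewrites an expression so that current() reads a variable bound to "." on entry. */
    static Expr rewrite(Expr expression) {
        String variableName = "saved-current";   // hypothetical variable name
        return new Let(variableName, new ContextItem(), expression.replaceCurrent(variableName));
    }

    /** Example tree: inside a changed context, concatenate "." with current(). */
    static Expr example() {
        return new WithContext("inner", new Concat(new ContextItem(), new CurrentCall()));
    }
}
```

Evaluating rewrite(example()) with the context item "outer" concatenates the inner "." with the saved outer item, so no component outside the expression needs to remember a current item.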

All XSLT variables are now compiled into similar code, as if they all had a select expression. Where the xsl:variable element contains a sequence of instructions, this is compiled into an expression. The as attribute generates type-checking code in the same way for all kinds of xsl:variable. As with functions, the value of a variable is now generally found by evaluating the contained instructions in "pull" rather than "push" mode, with a saving in memory usage.

Some preparatory changes have been made to ease the transition to JDK 1.5 (which brings with it JAXP 1.3 and DOM Level 3). Unfortunately it is not possible to produce an implementation of DOM classes such as org.w3c.dom.Node that compiles both when DOM Level 3 is installed, and when it is not installed. At present, therefore, Saxon cannot be compiled if DOM Level 3 interfaces are present on the classpath (which they will be if JDK 1.5 is installed). Most of the missing methods have been added where they cause no conflict with existing DOM Level 2 code, and where there are conflicts, the new methods have been coded but commented out. Existing Saxon methods such as isSameNode() that conflict with methods in DOM Level 3 have been renamed.

Changes have been made to the way nodes are tested against a NodeTest, to avoid the cost of calculating the name fingerprint in cases where it is not needed and is not readily available. This especially benefits performance when using a DOM or JDOM data source.

Type-checking of expressions involving the context item has been improved. A static error is now reported if an expression that is dependent on the context item is used in a position where the context item is known to be undefined, or if an expression that requires the context item to be a node is used where it is known to be an atomic value.

The class NodeTest is no longer a subclass of Pattern. Instead, the class NodeTestPattern has been introduced to represent a pattern that consists of nothing more than a simple NodeTest. This change allows NodeTest objects to be shared, and gives a better separation of concerns because NodeTests are used widely in XPath and XQuery, and they are now freed of their XSLT baggage.
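
The separation can be pictured with the following sketch, in which a pattern holds a node test by reference instead of the node test being a kind of pattern. The classes here are simplified stand-ins: the Node and NameTest types, the method signatures, and the getNodeTest() accessor are assumptions made for the example.

```java
final class NodeTestSketch {

    /** Minimal node: a kind (for example "element") and a name. */
    static final class Node {
        final String kind;
        final String name;

        Node(String kind, String name) {
            this.kind = kind;
            this.name = name;
        }
    }

    /** A node test knows nothing about XSLT; it only accepts or rejects a node. */
    interface NodeTest {
        boolean matches(Node node);
    }

    /** Matches nodes of a given kind and name. */
    static final class NameTest implements NodeTest {
        private final String kind;
        private final String name;

        NameTest(String kind, String name) {
            this.kind = kind;
            this.name = name;
        }
        public boolean matches(Node node) {
            return kind.equals(node.kind) && name.equals(node.name);
        }
    }

    /** XSLT-side abstraction; in this sketch it exposes only a match method. */
    abstract static class Pattern {
        abstract boolean matches(Node node);
    }

    /** A pattern consisting of nothing more than a node test, held by reference. */
    static final class NodeTestPattern extends Pattern {
        private final NodeTest test;

        NodeTestPattern(NodeTest test) {
            this.test = test;
        }
        boolean matches(Node node) {
            return test.matches(node);
        }
        NodeTest getNodeTest() {
            return test;
        }
    }

    public static void main(String[] args) {
        NodeTest para = new NameTest("element", "para");
        // The same NodeTest object serves two patterns and could also serve an XPath step.
        NodeTestPattern first = new NodeTestPattern(para);
        NodeTestPattern second = new NodeTestPattern(para);
        System.out.println(first.matches(new Node("element", "para")));
        System.out.println(first.getNodeTest() == second.getNodeTest());
    }
}
```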
