Optimizations and performance improvements

When the descendant axis is used in a schema-aware query or stylesheet, and the type of the context node is statically known, the step that uses the descendant axis is now replaced by a sequence of steps using the child axis (and the descendant or descendant-or-self axes, if necessary) that restricts the search to the parts of the tree where the required element can actually be found. If it is not possible for the descendant element to exist within the subtree, a compile-time warning is produced (in the same way as previous releases do for the child and attribute axes).

Expressions using the axis step child::* now have their static type inferred if the schema only allows one possible element type in this context. This may lead to warnings being produced when a path expression using such a construct cannot select anything.

In previous releases of the Saxon-SA join optimizer, document-level indexes (keys) were not used to index expressions that required sorting into document order, for example doc('abc.xml')//b/c[@d=$x]". This restriction has been removed.

Saxon-SA is now better at detecting when there is an indexable term in a predicate masked by other terms that are not indexable, for example doc('abc.xml')/a/b/c[@d gt 5 and @e=$x]" which will now be indexed on the value of @e

Saxon has always gone to some efforts to ensure that the result of a path expression is not sorted at run-time if the path is naturally sorted, that is, if the nested-loop evaluation of the path expression will deliver nodes in document order anyway. One situation where this is not possible is with a path of the form $v/a/b/c/d, in the case where Saxon cannot determine statically that $v will be a singleton. In this situation Saxon was effectively generating the expression sort($v/a/b/c/d). This has now changed in Saxon-SA so that in the case where the tail of the path expression (a/b/c/d) is naturally sorted, Saxon now generates a conditional sort expression, which performs the sort only if the condition exists($v[2]) is true. (Note: it is not possible to rewrite the expression as sort($v)/a/b/c/d, because this can result in duplicates if $v is not a peer node-set, that is, if it contains one node that is an ancestor of another.)

This optimization (which is available only in Saxon-SA) benefits many queries of the form:

let $x := doc('abc.xml')//item[@code='12345'] return $x/price, $x/value, $x/size

Where the expression EXP1 in for $i in EXP1 return EXP2 is known to be a singleton, the expression is rewritten let $i := EXP1 return EXP2. This creates the opportunity for further simplifications.

Local variables are now inlined if they are bound to a constant value. Previously variables were inlined only in cases where there is just one reference to the variable. This creates the opportunity for further static evaluation of constant subexpressions.

In Saxon-SA, function calls are now inlined, provided certain conditions are met. These conditions are currently rather conservative. The function that is inlined must not call any user-defined functions, and it must not exceed a certain size (currently set, rather arbitrarily, to 15 nodes in the expression tree). It must also not contain certain constructs: for example, various XSLT instructions such as xsl:number and xsl:apply-templates, any instruction that performs sorting, or a "for" expression with an "at" variable.

There are two reasons to inline function calls. Firstly, with very simple functions, the cost of doing a function call can be noticeable. Secondly and more importantly, combining the calling and called functions into a single expression enables further optimizations, for example it often becomes possible to extract subexpressions from the function body out of a loop. Where a function call has constants as its arguments, this might even include evaluating the result of the function at compile-time.

When a function call is inlined, the original function remains available even if there are no further calls to it. This is because there are interfaces in Saxon that allow functions in a query module to be located and invoked dynamically.

If a subexpression within a function or template body does not depend on the parameters to the function, does not create new nodes, and is not a constant, then it is now extracted from the function body and evaluated as a global variable. This might apply to an expression that depends on other global variables or parameters, or to an expression such as doc('abc.xml') that is never evaluated at compile time. This optimization applies only to Saxon-SA. However, during testing of this optimization a considerable number of cases were found where Saxon was not taking the opportunity to do "constant folding" (compile-time evaluation of expressions) and these have been fixed, benefitting both Saxon-B and Saxon-SA.

Static type checking when applied to a conditional expression is now distributed to the branches of the conditional. ("Static type checking" here means checking that the static type of an expression is compatible with the required type, and generating run-time type checking code where this proves necessary). This means that no run-time checking code is now generated or executed for those branches of the conditional that are statically type-safe. This in turn means that if one branch of the conditional is a recursive tail call, tail call optimization is no longer inhibited by the unnecessary run-time type check on the value returned by the recursive call. Another effect of the change is that a static type error may now be reported if any branch of the conditional has a static type that is incompatible with the required type; previously this error would have been reported only when this branch was actually executed. This change affects XPath if/then/else, XSLT's xsl:if and xsl:choose, and XQuery typeswitch.

Tail-call optimization on xsl:call-template has also been improved. In the past this optimization was never applied if the named template declared a return type. This restriction is removed. To enable this, the static type inferencing on xsl:call-template has been improved. (Note however that declaring a return type on a match template will still generally inhibit tail call optimization, because calls on xsl:apply-templates cannot be statically analyzed.)

Saxon-SA now optimizes certain multi-branch conditional expressions into an efficient switch expression. The expressions that qualify are XSLT xsl:choose instructions or multi-way XPath/XQuery if () then ... else if () then ... expressions where all the conditions take the form of singleton comparisons of the same expression against different literal constants, for example @a = 3, @a = 7, @a = 8. The expression on the left must be identical in each case, and the constants on the right must all be of the same type. The expression is optimized by putting the constant values (or collation keys derived from them) in a hash table and doing a direct lookup on the value of the expression.

There has been some tuning applied to the DOM interface, specifically the wrapper code which implements the NodeInfo interface on top of org.w3.dom.Node. The frequently-used iterator for the child axis was creating nodes for all the children in a list, to ensure that adjacent text nodes were properly concatenated. This has changed so that the creation of nodes is now incremental.

Saxon now tries more aggressively to precompile regular expressions that are known at compile time, where these are used in the XPath functions matches(), tokenize(), and replace(), or in the xsl:analyze-string instruction. Previously, this was only done (in general) when the regex was written as a string literal or a constant subexpression. It is now done also when the regex can be reduced to a string literal during earlier stages of optimization. In particular, it now handles the case where the expression is written in the content of an XSLT variable, as this is a popular coding idiom because it avoids problems with escaping curly braces and other special characters.

There are some improvements in the optimization of expressions used in a context where the effective boolean value is required. These now all use the same logic (implemented as a static method in class BooleanFn). Expressions known to return nodes are now wrapped in a call of exists(), which takes advantage of the ability of some iterators to report whether any nodes exist without materializing the node. Expressions of the form A | B appearing in a boolean context are rewritten as exists(A) or exists(B), which eliminates the costs of sorting into document order and checking for duplicates. Calls to normalize-space() in a boolean context are optimized to simply test whether the string contains any non-whitespace characters.

There has been a complete redesign of the optimization of expressions such as SEQ[position() gt 5] - specifically, filter expressions that perform an explicit test on the position() function. These are generally rewritten into a call of subsequence(), remove(), or the new internal function saxon:item-at(), or (typically for S[position() != 1]) into a TailExpression. Where necessary, conditional logic is added to the call to handle the case where the expression being compared to position() is not guaranteed to be an integer, or might be an empty sequence. This redesign eliminates the need for the expression types PositionRange and SliceExpression.

Compile-time performance has been improved for expressions containing long lists of subexpressions separated by the comma operator. Lists longer than 5000 or so items were blowing the Java stack, and the compile time was also quadratic in the number of subexpressions.

Document Projection

Document Projection is a mechanism that analyzes a query to determine what parts of a document it can potentially access, and then while building a tree to represent the document, leaves out those parts of the tree that cannot make any difference to the result of the query.

In this release document projection is an option on the XQuery command line interface. Currently it is only used if requested.

Internally, the class PathMap computes (and represents) the set of paths within a document that are followed by an expression, that is, for each document accessed by an expression, the set of nodes that are reachable by the expression. A PathMap can be set on an AugmentedSource supplied to the Configuration.buildDocument() method to request filtering of the document while constructing the tree. If -explain is specified, the output includes feedback on the use of document projection.