Optimizations and performance improvements
The code for type checking and conversion in GeneralComparisons has been revised, in particular the rule that in backwards compatibility mode an argument is converted to a double if the other argument is numeric. {bool73, bool74}
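As a rough illustration of the backwards-compatibility rule mentioned above, the following Python sketch compares two single atomic values. It is not Saxon code: the function names are hypothetical, only singleton operands are handled (real general comparisons range over sequences), and Python's float() is assumed as a stand-in for the XPath number conversion, whose lexical rules differ in detail.

```python
import math
from typing import Union

Atomic = Union[str, int, float]


def is_numeric(value: Atomic) -> bool:
    # Booleans are excluded because Python treats bool as a subclass of int.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def to_double(value: Atomic) -> float:
    """Convert to a double; strings that do not parse become NaN (assumption)."""
    if is_numeric(value):
        return float(value)
    try:
        return float(str(value).strip())
    except ValueError:
        return math.nan


def general_equals_compat(left: Atomic, right: Atomic) -> bool:
    """'=' on two single values under the rule described in the text."""
    if is_numeric(left) or is_numeric(right):
        # One side is numeric, so the other side is converted to a double.
        return to_double(left) == to_double(right)
    # Neither side is numeric: compare the values as they are.
    return left == right
```

Under these assumptions the string "1.0" compares equal to the number 1, while the strings "1.0" and "1" do not compare equal to each other, because no numeric operand triggers the conversion.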
There are four functions whose result depends implicitly on the current document: key, id,
unparsed-entity-uri, and unparsed-entity-public-id. These functions are now statically rewritten
to call an equivalent internally-defined function that takes "/" as an additional explicit argument:
for example id($x) is rewritten as id+($x,/).
This has allowed the removal of the special code needed to handle the
fact that these functions have an implicit dependency on the context. (This is designed as a step
on the way to elimination of expression rewriting during reduction.)
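The static rewrite described above can be sketched on a toy expression tree. This is a minimal illustration in Python, not Saxon's implementation: the node classes Root, VarRef and Call and the functions rewrite and show are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Root:
    """The expression "/" (hypothetical node type)."""


@dataclass(frozen=True)
class VarRef:
    name: str


@dataclass(frozen=True)
class Call:
    name: str
    args: Tuple["Expr", ...]


Expr = Union[Root, VarRef, Call]

# The four functions the text names as depending implicitly on the document.
DOCUMENT_DEPENDENT = {"key", "id", "unparsed-entity-uri", "unparsed-entity-public-id"}


def rewrite(expr: Expr) -> Expr:
    """Replace f(args) by f+(args, /) for each document-dependent function."""
    if isinstance(expr, Call):
        args = tuple(rewrite(arg) for arg in expr.args)
        if expr.name in DOCUMENT_DEPENDENT:
            return Call(expr.name + "+", args + (Root(),))
        return Call(expr.name, args)
    return expr


def show(expr: Expr) -> str:
    """Render an expression in the notation used in the text."""
    if isinstance(expr, Root):
        return "/"
    if isinstance(expr, VarRef):
        return "$" + expr.name
    return expr.name + "(" + ",".join(show(arg) for arg in expr.args) + ")"


if __name__ == "__main__":
    print(show(rewrite(Call("id", (VarRef("x"),)))))  # prints id+($x,/)
```

After the rewrite, the dependency on the document is an ordinary argument, so no later processing step needs a special case for these four functions.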
The changes made in 7.4 to extract constant sub-expressions out of a loop had an adverse side-effect in forcing path expressions to be sorted into document order unnecessarily. This has been fixed.
The decision whether a sort is necessary to deliver the results of a path expression in document
order is now made at compile time, and is reported by saxon:explain. The phrase "naturally sorted"
means that the path expression delivers its results in document order; the phrase "requiring sort" means
that unless the containing expression asks for the results in arbitrary order, a sort will be performed
to get them in document order.
Some additional cases have been identified where a sort into document order can be avoided.
The main such case is path expressions starting with a variable reference, for example $x/a/b/c.
These are increasingly common within for expressions, and within stylesheet functions.
The system now maintains enough static information about the expression to which the variable is bound to
eliminate the sort in many cases. Note that with expressions used inside a stylesheet function, where
$x is a parameter, this works only when the parameter is given a type that disallows multiple
nodes. Sorting is also avoided in a path expression that starts with the document() function.
A performance bug that caused the results of the key() function to be sorted unnecessarily has been fixed.
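To show the kind of static reasoning involved, here is a deliberately conservative Python sketch that classifies a path such as $x/a/b/c using the two phrases reported by saxon:explain. The rules are a simplification assumed for this example, not Saxon's actual analysis: only the child and descendant axes are recognized, and the function name classify_path is hypothetical.

```python
from typing import Sequence


def classify_path(start_is_single_node: bool, axes: Sequence[str]) -> str:
    """Decide statically whether nested-loop evaluation yields document order.

    start_is_single_node: True when the starting variable is known to hold
    at most one node (for example, through its declared type).
    axes: the axis of each step, in order, e.g. ["child", "child", "child"].
    """
    if not start_is_single_node:
        # Several start nodes may arrive in any order, so nothing is known.
        return "requiring sort"

    # "peers" means no node in the current set is an ancestor of another.
    peers = True
    for axis in axes:
        if not peers:
            # Stepping from overlapping subtrees can interleave the results.
            return "requiring sort"
        if axis == "child":
            # Children of ordered peers are themselves ordered peers.
            continue
        if axis == "descendant":
            # Still in document order, but the nodes may now nest.
            peers = False
        else:
            # Any other axis: give up rather than risk a wrong answer.
            return "requiring sort"
    return "naturally sorted"
```

With these rules, three child steps from a single start node are classified as "naturally sorted", whereas the same path is classified as "requiring sort" when the start may hold several nodes, which mirrors the restriction on function parameters noted above.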
The expression rewriting in 7.4 sometimes introduced a redundant range variable (in the
saxon:explain output this appears as let $zz:r1 := $zz:r0). In most cases this has been eliminated.
Apart from a small intrinsic cost saving, this also enables some further expression optimization which
was previously inhibited by the extra complexity of the expression.
The mechanism previously used for lazy evaluation has been changed. In previous releases, the
reduce() method was called to create a copy of the expression, in which context-dependent
subexpressions (such as variable references, or the current() function) were replaced by their values.
This copying was expensive, in both time and memory. The strategy at this release is that a lazily-evaluated
value is represented by a Closure, which consists of the original expression (unchanged), together with
a SavedContext, which holds all the values of context variables.
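The Closure idea can be sketched in a few lines of Python. The names Closure and SavedContext come from the text, but everything else is an assumption made for illustration: the real classes are part of Saxon's Java code, an expression is modelled here as a plain function of the context, and the context as a dictionary of variable values.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping

# Assumption: an expression is modelled as a function of the variable values.
Expression = Callable[[Mapping[str, Any]], Any]


@dataclass(frozen=True)
class SavedContext:
    """Snapshot of the context values an expression may depend on."""
    values: Mapping[str, Any]


class Closure:
    """A lazily evaluated value: the unchanged expression plus a saved context."""

    def __init__(self, expression: Expression, context: Mapping[str, Any]) -> None:
        self.expression = expression                 # not copied, not rewritten
        self.saved = SavedContext(dict(context))     # snapshot taken at creation

    def evaluate(self) -> Any:
        # Evaluation is deferred until the value is actually needed.
        return self.expression(self.saved.values)


if __name__ == "__main__":
    context = {"x": 3}
    lazy = Closure(lambda values: values["x"] * 2, context)
    context["x"] = 100          # later changes do not affect the closure
    print(lazy.evaluate())      # prints 6
```

The point of the design is that only the context values are captured; the expression itself is shared, which avoids the cost of producing a reduced copy for every lazily evaluated value.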
The changes to lazy evaluation revealed some problems with saxon:preview. The problems
are not actually new; they were exposed by the changes. I eventually decided there was no alternative
to withdrawing this facility. I think the time is coming near when saxon:assign
may also have to go: these "features" are starting to interfere too much with optimization.
The changes can also cause problems with extension functions that have side-effects, by changing
the order of execution of different instructions. These problems can generally be fixed by writing
saxon:assignable="yes" on each xsl:variable element where the order of
execution is significant. However, it is best to avoid using extension functions with side effects
if at all possible.