Internal changes
The optimization of count($x)=0 as empty($x) was working only when the stylesheet specifies
version="2.0". It now works with version="1.0" also.
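Why this rewrite pays off can be illustrated with a small Java sketch. It is not Saxon code, and the class and method names are hypothetical. It assumes a lazily evaluated sequence: testing for emptiness needs to look no further than the first item, whereas counting must read every item.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical lazy sequence of integers that records how many items were read. */
final class CountingSequence implements Iterator<Integer> {
    private final int size;
    private int next = 0;
    int itemsRead = 0;

    CountingSequence(int size) { this.size = size; }

    @Override public boolean hasNext() { return next < size; }

    @Override public Integer next() {
        if (next >= size) throw new NoSuchElementException();
        itemsRead++;
        return next++;
    }
}

final class EmptyVersusCount {
    /** Analogue of count($x) = 0: walks the whole sequence before comparing. */
    static boolean countIsZero(Iterator<?> it) {
        int n = 0;
        while (it.hasNext()) { it.next(); n++; }
        return n == 0;
    }

    /** Analogue of empty($x): only asks whether a first item exists. */
    static boolean isEmpty(Iterator<?> it) {
        return !it.hasNext();
    }

    public static void main(String[] args) {
        CountingSequence a = new CountingSequence(1000);
        CountingSequence b = new CountingSequence(1000);
        System.out.println(countIsZero(a) + ", items read: " + a.itemsRead);
        System.out.println(isEmpty(b) + ", items read: " + b.itemsRead);
    }
}
```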
Text-only temporary trees are used in a wider range of circumstances than before. This data structure
is used when it is known statically that a temporary tree will consist of a single text node; it is a
lot more efficient than a general-purpose temporary tree. It is now used in cases where the content
of the variable invokes xsl:text and xsl:value-of, including calls that are within xsl:for-each,
xsl:choose, xsl:if, xsl:sequence and xsl:analyze-string, and also where it uses xsl:call-template,
provided that all the subordinate instructions generate text nodes. This has been done
in a generalized way which will eventually lead to static type inferencing working at the XSLT level
in the same way as it currently works at the XPath level. For xsl:template, the type
of the results is inferred from the as attribute if present, or from the contents of
the template otherwise. (This has not been very thoroughly tested.)
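The kind of static check described above might be sketched in Java as a recursive walk over an instruction tree. This is only an illustration under simplifying assumptions: the Instr class and its Kind values are hypothetical, xsl:call-template and the as attribute are left out, and Saxon's actual inference is not shown here.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical instruction tree node; the kinds mirror the instructions named above. */
final class Instr {
    enum Kind { TEXT, VALUE_OF, FOR_EACH, CHOOSE, IF, SEQUENCE, ANALYZE_STRING, ELEMENT }

    final Kind kind;
    final List<Instr> children;

    Instr(Kind kind, Instr... children) {
        this.kind = kind;
        this.children = Arrays.asList(children);
    }

    /** True if it can be shown statically that this instruction yields only text nodes. */
    boolean producesOnlyText() {
        switch (kind) {
            case TEXT:
            case VALUE_OF:
                return true;
            case FOR_EACH:
            case CHOOSE:
            case IF:
            case SEQUENCE:
            case ANALYZE_STRING:
                // A container qualifies only if every subordinate instruction does.
                for (Instr child : children) {
                    if (!child.producesOnlyText()) return false;
                }
                return true;
            default:
                // Anything else (for example an element constructor) disqualifies the tree.
                return false;
        }
    }

    public static void main(String[] args) {
        Instr textOnly = new Instr(Kind.FOR_EACH,
                new Instr(Kind.IF, new Instr(Kind.VALUE_OF)),
                new Instr(Kind.TEXT));
        Instr mixed = new Instr(Kind.CHOOSE,
                new Instr(Kind.TEXT),
                new Instr(Kind.ELEMENT));
        System.out.println(textOnly.producesOnlyText());
        System.out.println(mixed.producesOnlyText());
    }
}
```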
Range variables (that is, variables declared in an XPath for, some, or every expression) are now stored on the local stack frame in the Bindery rather than directly in the XPathContext object. This simplifies the machinery for handling variables and allows instructions and expressions to be treated more interchangeably.
The Expression class has been refactored. The original class net.sf.saxon.expr.Expression
is now an interface. The various expressions are now structured under ComputedExpression for
"true" XPath expressions, net.sf.saxon.value.Value for constant values, and InstructionExpr
for instructions (such as xsl:element) that act as expressions when used from XQuery.
The utility methods, including the make factory method, are now in
net.sf.saxon.expr.ExpressionTool.
When the dependencies of a ComputedExpression are determined, the information is now saved
with the expression rather than being recalculated whenever it is needed. For complex expressions this
calculation can be quite involved, and there are still some cases where it is being done at run-time.
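The idea of saving the dependencies with the expression can be sketched in Java as follows. This is a minimal illustration, not Saxon's ComputedExpression: the class names, the bit-mask representation of dependencies, and the computations counter are all assumptions made for the example.

```java
/** Hypothetical expression node that computes its dependencies once and saves them. */
abstract class Expr {
    static final int DEPENDS_ON_POSITION = 1;
    static final int DEPENDS_ON_LAST = 2;
    static final int DEPENDS_ON_CONTEXT_ITEM = 4;

    /** Counts how often a dependency calculation actually ran (for the demonstration only). */
    static int computations = 0;

    private int cachedDependencies = -1; // -1 means "not yet computed"

    final int getDependencies() {
        if (cachedDependencies < 0) {
            cachedDependencies = computeDependencies();
            computations++;
        }
        return cachedDependencies;
    }

    protected abstract int computeDependencies();
}

/** Leaf expression with a fixed set of dependencies. */
final class Leaf extends Expr {
    private final int deps;
    Leaf(int deps) { this.deps = deps; }
    @Override protected int computeDependencies() { return deps; }
}

/** Binary expression whose dependencies are the union of its operands' dependencies. */
final class Binary extends Expr {
    private final Expr left;
    private final Expr right;
    Binary(Expr left, Expr right) { this.left = left; this.right = right; }
    @Override protected int computeDependencies() {
        return left.getDependencies() | right.getDependencies();
    }
}

final class DependencyDemo {
    public static void main(String[] args) {
        Expr e = new Binary(new Leaf(Expr.DEPENDS_ON_POSITION), new Leaf(Expr.DEPENDS_ON_LAST));
        e.getDependencies();
        e.getDependencies(); // answered from the saved value; nothing is recomputed
        System.out.println("calculations performed: " + Expr.computations);
    }
}
```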
Path expressions now use the standard type-checking machinery to check that both arguments of "/" are node-sets. This means that in some cases an error in this area will now be detected statically; and it means that if the expression is found statically to be safe, no run-time checking is done.
I have changed the way delayed evaluation is done: when an expression is evaluated lazily, a Closure object is created as a surrogate for the value. This now contains the expression itself together with all the context information that the expression needs. The separate SavedContext object is no longer used. The Closure is evaluated using the ordinary XPathContext object, which now holds a reference to the local stack frame. With delayed evaluation, this "stack frame" is not actually on the stack at all; it is on the heap, so it survives if the Closure is returned from a function call.
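A rough Java sketch of this arrangement follows. It is not the Saxon implementation: the Context and Closure classes here are hypothetical stand-ins, and copying the variable slots when the closure is created is an assumption made to keep the example self-contained.

```java
import java.util.function.Function;

/** Hypothetical stand-in for the evaluation context: it references the local variable slots. */
final class Context {
    final Object[] stackFrame;
    Context(Object[] stackFrame) { this.stackFrame = stackFrame; }
}

/** Hypothetical closure: the expression plus the context it needs, both held on the heap. */
final class Closure<T> {
    private final Function<Context, T> expression;
    private final Context savedContext;

    Closure(Function<Context, T> expression, Context current) {
        this.expression = expression;
        // Assumption: copy the slots so later changes to the caller's frame are not seen.
        this.savedContext = new Context(current.stackFrame.clone());
    }

    /** Evaluates the expression using an ordinary context object. */
    T evaluate() { return expression.apply(savedContext); }
}

final class ClosureDemo {
    /** The closure outlives this call because its frame is a heap object. */
    static Closure<Integer> makeDelayedSum() {
        Context ctx = new Context(new Object[] {3, 4});
        Closure<Integer> delayed = new Closure<Integer>(
                c -> (Integer) c.stackFrame[0] + (Integer) c.stackFrame[1], ctx);
        ctx.stackFrame[0] = 100; // changing the caller's frame does not affect the closure
        return delayed;
    }

    public static void main(String[] args) {
        System.out.println(makeDelayedSum().evaluate());
    }
}
```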
I have reverted to the principle used prior to Saxon 7.x, that lazy evaluation is used only for expressions that are expected to return a (non-singleton) sequence. However, the classification of such expressions is now much more accurate. The reason for this policy is that delaying evaluation of singleton expressions is usually not beneficial - it saves no memory, and incurs a cost for saving and restoring the context. Also, lazy evaluation is not used for expressions that have unusual context dependencies, for example those that depend on current(), position(), last(), or current-group(). This eliminates the problem of saving these values and ensuring that they are referenced correctly during the delayed evaluation.
The delayed evaluation code now evaluates the underlying expression at most once, thus ensuring that
it never takes longer than direct evaluation. At the same time, if only the first item in the sequence is
used then only the first item will be read. In a construct such as
if (exists($x)) then $x else "nothing", the first reference to $x primes the iterator, and saves
anything it reads in a buffer (called the "reservoir") within the Closure object. The second reference to $x
starts by reading what it can from the reservoir, and if it needs more, it picks up iterating the underlying
expression where the first evaluation left off. Once some user of the variable has accessed all the items
in the underlying expression, the reservoir contains all the values needed and subsequent evaluations read
the value from there.
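The reservoir mechanism described above can be sketched in Java as follows. This is an illustration under stated assumptions rather than Saxon's Closure class: MemoSequence, ReservoirDemo and the counting Source are hypothetical names. Each call to iterate() reads from the reservoir first and pulls from the underlying source only when it runs out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical lazily evaluated sequence that buffers what it has read in a reservoir. */
final class MemoSequence<T> {
    private final Iterator<T> source;                 // underlying expression, evaluated at most once
    private final List<T> reservoir = new ArrayList<>();

    MemoSequence(Iterator<T> source) { this.source = source; }

    /** Returns a fresh iterator over the whole sequence. */
    Iterator<T> iterate() {
        return new Iterator<T>() {
            private int position = 0;

            @Override public boolean hasNext() {
                if (position < reservoir.size()) return true;   // already buffered
                if (source.hasNext()) {                         // resume where the last reader stopped
                    reservoir.add(source.next());
                    return true;
                }
                return false;
            }

            @Override public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                return reservoir.get(position++);
            }
        };
    }
}

final class ReservoirDemo {
    /** Hypothetical source that counts how many items have been pulled from it. */
    static final class Source implements Iterator<String> {
        private final String[] items;
        private int next = 0;
        int pulled = 0;

        Source(String... items) { this.items = items; }

        @Override public boolean hasNext() { return next < items.length; }

        @Override public String next() {
            if (next >= items.length) throw new NoSuchElementException();
            pulled++;
            return items[next++];
        }
    }

    /** Analogue of: if (exists($x)) then $x else "nothing" */
    static String existsThenUse(MemoSequence<String> x) {
        if (x.iterate().hasNext()) {                 // first reference: reads one item only
            StringBuilder sb = new StringBuilder();
            for (Iterator<String> it = x.iterate(); it.hasNext(); ) {
                sb.append(it.next());                // second reference: reservoir first, then source
            }
            return sb.toString();
        }
        return "nothing";
    }

    public static void main(String[] args) {
        Source src = new Source("a", "b", "c");
        MemoSequence<String> x = new MemoSequence<>(src);
        System.out.println(existsThenUse(x) + ", pulled from source: " + src.pulled);
        System.out.println(existsThenUse(x) + ", pulled from source: " + src.pulled);
    }
}
```

In this sketch the second call to existsThenUse is served entirely from the reservoir, so the source is never read more than once per item.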
Certain instructions, specifically those that are used for XQuery as well as XSLT processing, now act as expressions as well as instructions. There are two modes of evaluating these instructions: the process() method causes the instruction to write its output to the current Receiver, while the evaluateItem() and iterate() methods return the results in the same way as for any other expression.
To support this mechanism, the process() method now takes an XPathContext as its argument, instead of the Controller. This is because in XQuery, the XPath context needs to be passed unchanged by an element constructor to its child expressions.
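The two evaluation modes might be pictured with the following Java sketch. It is deliberately simplified and is not the Saxon API: ListReceiver, EvalContext, InstructionSketch, TextItems and Block are hypothetical names, items are plain strings, and having process() default to draining iterate() is a choice made for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical simplified receiver that collects whatever is pushed to it. */
final class ListReceiver {
    final List<String> events = new ArrayList<>();
    void append(String item) { events.add(item); }
}

/** Hypothetical evaluation context carrying the current receiver. */
final class EvalContext {
    final ListReceiver receiver;
    EvalContext(ListReceiver receiver) { this.receiver = receiver; }
}

/** Hypothetical instruction that can be evaluated in push mode or pull mode. */
interface InstructionSketch {
    /** Pull mode: return the results like any other expression. */
    Iterator<String> iterate(EvalContext context);

    /** Push mode: write the results to the current receiver. */
    default void process(EvalContext context) {
        for (Iterator<String> it = iterate(context); it.hasNext(); ) {
            context.receiver.append(it.next());
        }
    }
}

/** Leaf instruction producing a fixed list of items. */
final class TextItems implements InstructionSketch {
    private final List<String> items;
    TextItems(String... items) { this.items = Arrays.asList(items); }
    @Override public Iterator<String> iterate(EvalContext context) { return items.iterator(); }
}

/** Container instruction: hands the same context, unchanged, to each child. */
final class Block implements InstructionSketch {
    private final List<InstructionSketch> children;
    Block(InstructionSketch... children) { this.children = Arrays.asList(children); }

    @Override public Iterator<String> iterate(EvalContext context) {
        List<String> out = new ArrayList<>();
        for (InstructionSketch child : children) {
            child.iterate(context).forEachRemaining(out::add);
        }
        return out.iterator();
    }

    @Override public void process(EvalContext context) {
        for (InstructionSketch child : children) {
            child.process(context);
        }
    }
}

final class DualModeDemo {
    public static void main(String[] args) {
        InstructionSketch block = new Block(new TextItems("a", "b"), new TextItems("c"));
        ListReceiver receiver = new ListReceiver();
        block.process(new EvalContext(receiver));
        System.out.println("pushed: " + receiver.events);
    }
}
```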