Internal changes
The Receiver interface has been changed: instead of a Configuration and a LocationProvider being passed down the pipeline, the context information for the pipeline is now passed as a PipelineConfiguration object. This provides access to the Configuration and the LocationProvider, as well as other information: currently an ErrorListener and a URIResolver. This means that errors and warnings detected by receivers in the pipeline (for example, serialization errors and validation warnings) can now be properly reported to the ErrorListener associated with a transformation, rather than to the global ErrorListener associated with the Configuration.
This change was necessary in order to implement the JAXP Validator interface (which uses a local ErrorHandler), but it has other spin-off benefits. In particular, it means that the information passed down a pipeline can in future be extended by adding new fields to the PipelineConfiguration class, with no impact on the Receiver interface itself.
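The benefit of passing a single context object can be illustrated with a minimal Java sketch. All the types below (MessageListener, CollectingListener, PipelineContext, SimpleReceiver, LengthCheckingReceiver) are hypothetical stand-ins invented for illustration; they are not Saxon's actual classes or method signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an error listener that is local to one transformation.
interface MessageListener {
    void warning(String message);
}

// Collects warnings so that the caller can inspect them afterwards.
class CollectingListener implements MessageListener {
    final List<String> warnings = new ArrayList<String>();

    @Override
    public void warning(String message) {
        warnings.add(message);
    }
}

// Hypothetical context object. Further fields could be added here later
// without any change to the receiver interface below.
class PipelineContext {
    private MessageListener listener;

    MessageListener getListener() {
        return listener;
    }

    void setListener(MessageListener listener) {
        this.listener = listener;
    }
}

// Hypothetical receiver interface: one context setter replaces a separate
// setter for each piece of pipeline information.
interface SimpleReceiver {
    void setPipelineContext(PipelineContext context);

    void characters(String text);
}

// A receiver that reports warnings to the listener held in its own context,
// rather than to a single global listener.
class LengthCheckingReceiver implements SimpleReceiver {
    private PipelineContext context;

    @Override
    public void setPipelineContext(PipelineContext context) {
        this.context = context;
    }

    @Override
    public void characters(String text) {
        if (text.length() > 10) {
            context.getListener().warning("text longer than 10 characters: " + text);
        }
    }
}
```

In this sketch, two receivers given different PipelineContext objects report to different listeners, which is the behaviour described above for transformations that each have their own ErrorListener.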
User-defined functions are now evaluated lazily (that is, if the function returns a sequence, each item in the sequence
is evaluated only when it is needed). This has required some changes to the implementation of tail call optimization.
There are now two kinds of Closure: the new Closure class is used when the results are needed only once, as when evaluating a function call. The old Closure class is renamed MemoClosure, and is used when the results are likely to be needed more than once, as when evaluating a variable.
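The difference between the two kinds of closure can be sketched in Java as follows. The classes OneShotLazySequence and MemoLazySequence are hypothetical illustrations of the two strategies, not the Saxon classes themselves, and the use of Supplier and Iterator is an assumption made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical single-use lazy sequence: items are produced on demand and
// nothing is retained once they have been consumed.
class OneShotLazySequence<T> {
    private final Supplier<Iterator<T>> computation;

    OneShotLazySequence(Supplier<Iterator<T>> computation) {
        this.computation = computation;
    }

    // Each call starts the underlying computation afresh.
    Iterator<T> iterate() {
        return computation.get();
    }
}

// Hypothetical memoizing lazy sequence: the computation is started at most
// once, and each item is saved as soon as it has been computed.
class MemoLazySequence<T> {
    private final Supplier<Iterator<T>> computation;
    private final List<T> saved = new ArrayList<T>();
    private Iterator<T> source; // null until the first unsaved item is requested

    MemoLazySequence(Supplier<Iterator<T>> computation) {
        this.computation = computation;
    }

    Iterator<T> iterate() {
        return new Iterator<T>() {
            private int position = 0;

            @Override
            public boolean hasNext() {
                if (position < saved.size()) {
                    return true;
                }
                if (source == null) {
                    source = computation.get();
                }
                return source.hasNext();
            }

            @Override
            public T next() {
                if (position < saved.size()) {
                    return saved.get(position++); // replay a saved item
                }
                if (source == null) {
                    source = computation.get();
                }
                T item = source.next(); // pull exactly one more item
                saved.add(item);
                position++;
                return item;
            }
        };
    }

    // Number of items computed and retained so far.
    int savedCount() {
        return saved.size();
    }
}
```

With the memoizing variant, a second iteration replays the saved items and only resumes the underlying computation when it runs past them; the single-use variant trades that reuse for a smaller memory footprint.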
Saxon now does static analysis of variable references to identify variables that are never referenced, and variables that are only referenced once. If a variable is only referenced once, then during lazy evaluation of the variable the value will be discarded rather than being retained in memory for subsequent reference. There are now two classes supporting lazy evaluation: Closure is a value that is evaluated when first needed and is immediately discarded from memory, while MemoClosure also defers evaluation, but retains each item in the evaluated sequence once it is known. This analysis is currently done only for local variables and function parameters (not for global variables or XSLT template parameters).
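As a rough illustration of this kind of analysis, the Java sketch below counts the static references to a variable in a small expression tree and picks an evaluation strategy from the count. The Expr and ReferenceAnalysis classes and the Strategy names are hypothetical, and the sketch ignores loops: a real analysis would presumably have to treat a single reference inside a loop body as if it were several.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical expression tree: a node is either a variable reference
// (name is non-null) or an operator with child expressions (name is null).
class Expr {
    final String name;
    final List<Expr> children;

    private Expr(String name, List<Expr> children) {
        this.name = name;
        this.children = children;
    }

    static Expr varRef(String name) {
        return new Expr(name, new ArrayList<Expr>());
    }

    static Expr op(Expr... children) {
        return new Expr(null, Arrays.asList(children));
    }
}

class ReferenceAnalysis {
    // Hypothetical outcomes of the analysis.
    enum Strategy { UNREFERENCED, DISCARD_AFTER_USE, RETAIN_ITEMS }

    // Count the static references to the named variable within the expression.
    static int countReferences(Expr expr, String variable) {
        int count = variable.equals(expr.name) ? 1 : 0;
        for (Expr child : expr.children) {
            count += countReferences(child, variable);
        }
        return count;
    }

    // One reference: the value can be discarded once used.
    // Several references: the items must be retained for later use.
    static Strategy choose(Expr body, String variable) {
        int references = countReferences(body, variable);
        if (references == 0) {
            return Strategy.UNREFERENCED;
        }
        if (references == 1) {
            return Strategy.DISCARD_AFTER_USE;
        }
        return Strategy.RETAIN_ITEMS;
    }
}
```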
The algorithm for type-checking (the XPath function call rules) has been rewritten to follow the specification more precisely. The rules have gradually been refined over successive W3C drafts, and although the changes are very minor, the implementation had got a little out of step.
The optimizer now recognizes that certain expressions cannot be moved out of a loop. A classic example is the XQuery expression count(./(for $i in 1 to 5 return <a/>)), which should return 5. Previous Saxon releases moved the element constructor out of the loop, and thus returned the value 1. Similar constructs occur in XSLT in the case of an XPath expression that calls a stylesheet function. At present all calls on user-defined functions (and XSLT templates) are treated as if they might create new nodes. This does not affect expressions that create new nodes in a context where the final result cannot depend on the identity of the new nodes: for example, if a node is created or a function is called within the predicate of a filter expression, this will still be extracted from the loop and evaluated only once.
The same considerations apply to path expressions in which one of the steps constructs new nodes. For example the result of count(a/<x/>) should be equal to the number of a elements selected; in previous Saxon releases it was always 1.
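The reason hoisting changes the result is that a path expression eliminates duplicate nodes by identity. The Java sketch below models this with object identity; ElementNode and HoistingDemo are hypothetical names, and the identity-based set is only an analogy for the duplicate elimination performed on the result of a path expression.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical node type: every constructed instance has its own identity.
class ElementNode {
    final String name;

    ElementNode(String name) {
        this.name = name;
    }
}

class HoistingDemo {
    // Count the nodes that remain after eliminating duplicates by identity.
    static int countDistinct(ElementNode[] nodes) {
        Set<ElementNode> distinct =
                Collections.newSetFromMap(new IdentityHashMap<ElementNode, Boolean>());
        for (ElementNode node : nodes) {
            distinct.add(node);
        }
        return distinct.size();
    }

    // Correct behaviour: the constructor runs on every iteration,
    // so five distinct nodes are created.
    static int constructInsideLoop() {
        ElementNode[] result = new ElementNode[5];
        for (int i = 0; i < 5; i++) {
            result[i] = new ElementNode("a");
        }
        return countDistinct(result);
    }

    // Incorrect optimization: the constructor is moved out of the loop,
    // so the same node appears five times and collapses to one.
    static int constructHoisted() {
        ElementNode hoisted = new ElementNode("a");
        ElementNode[] result = new ElementNode[5];
        for (int i = 0; i < 5; i++) {
            result[i] = hoisted;
        }
        return countDistinct(result);
    }
}
```

Here constructInsideLoop() yields 5 and constructHoisted() yields 1, mirroring the correct and incorrect results for the XQuery example.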
The TinyTree data structure now allows a single TinyTree to contain any number of trees rooted either at document nodes or element nodes. Allowing multiple parentless element nodes in a single TinyTree reduces the overhead involved in constructing sequences of elements.
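The idea of holding several trees in one flat structure can be sketched as follows. FlatForest is a hypothetical, much simplified model that stores only a name and a depth per node in document order; it is not the actual TinyTree layout. A node at depth 0 starts a new parentless tree, so adding another root costs no more than adding any other node.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical flat forest: nodes are stored in document order, and each
// node records only its name and its depth within its own tree.
class FlatForest {
    private final List<String> names = new ArrayList<String>();
    private final List<Integer> depths = new ArrayList<Integer>();

    // Append a node in document order and return its node number.
    // A depth of 0 starts a new parentless tree.
    int addNode(String name, int depth) {
        names.add(name);
        depths.add(depth);
        return names.size() - 1;
    }

    // The parent is the nearest preceding node that is one level shallower;
    // returns -1 for a root.
    int parentOf(int node) {
        int depth = depths.get(node);
        for (int i = node - 1; i >= 0 && depth > 0; i--) {
            if (depths.get(i) == depth - 1) {
                return i;
            }
        }
        return -1;
    }

    // Walk up the parent chain to find the root of the tree containing the node.
    int rootOf(int node) {
        int current = node;
        while (parentOf(current) != -1) {
            current = parentOf(current);
        }
        return current;
    }

    // Number of separate trees held in this structure.
    int rootCount() {
        int count = 0;
        for (int depth : depths) {
            if (depth == 0) {
                count++;
            }
        }
        return count;
    }

    String nameOf(int node) {
        return names.get(node);
    }
}
```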
Where appropriate, xsl:copy-of now creates virtual copies of nodes, using the new class VirtualCopy. This is simply a reference to the node that was copied, together with sufficient information to give the copy a different node identity from the original. This technique is used in cases where the copy is not being directly written to another tree, for example where it is returned as the value of a variable or function.
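A minimal Java sketch of the virtual-copy idea is shown below. OriginalNode and VirtualNodeCopy are hypothetical classes, and the use of a counter to supply a distinct identity is an assumption made for illustration; the text above does not say how VirtualCopy itself distinguishes a copy from its original.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical source node holding the actual content.
class OriginalNode {
    final String name;
    final String text;

    OriginalNode(String name, String text) {
        this.name = name;
        this.text = text;
    }
}

// Hypothetical virtual copy: it shares the content of the original by
// reference, but carries its own identifier so that it has a distinct identity.
class VirtualNodeCopy {
    private static final AtomicLong NEXT_COPY_ID = new AtomicLong();

    private final OriginalNode original;
    private final long copyId;

    VirtualNodeCopy(OriginalNode original) {
        this.original = original;
        this.copyId = NEXT_COPY_ID.incrementAndGet();
    }

    // Content is read through to the original; nothing is duplicated.
    String getName() {
        return original.name;
    }

    String getText() {
        return original.text;
    }

    // Two copies are the same node only if they have the same identifier.
    boolean isSameNode(VirtualNodeCopy other) {
        return other != null && copyId == other.copyId;
    }

    // True if both copies refer to the same underlying original.
    boolean sharesContentWith(VirtualNodeCopy other) {
        return other != null && original == other.original;
    }
}
```

Creating such a copy takes constant time and space regardless of the size of the subtree being copied, which is why the technique pays off when the copy is never written to another tree.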