Optimizations and performance improvements

Arrays have been re-implemented to use a "persistent" immutable data structure (based on the PCollections Java library). This means that operations such as array:put() no longer make a copy of the entire array.

As an example of the performance savings, using array:put() to modify every item of an array takes the following times in microseconds:

No of members	Saxon 9.8 (microsecs)	Saxon 9.9 (microsecs)
100	10	24
1000	1149	475
10000	113531	6680

Type checking has been improved: the main impact is to give better diagnostics, but there will also be cases where performance is improved. The main changes are:

Static type checking of returned item types is now distributed to all operands of an expression that contribute directly to the returned result (operands with usage Transmission in terms of the XSLT 3.0 streamability analysis). For example, if a template is required to return a sequence of integers, then every instruction in the template that contributes directly to the result is checked to ensure that it returns integers. This means that more errors are detected statically, and less time is spent at run-time checking the results of subexpressions that are statically known to be correct. It also enables more precise diagnostics as to which instruction or subexpression generated invalid results; this is particularly useful when functions have been inlined.
Diagnostics for type errors now give more precise information, particularly for functions, maps, and arrays. When a supplied value V does not match the required type R, then instead of reporting that V is of type S, the message reports why V does not match the required type R. For example if R is map(xs:string, item()*) and V contains an entry whose key is of type xs:untypedAtomic, the error message in in 9.8 would report that V was of type map(xs:anyAtomicType, item()*), whereas 9.9 will give a message in the form "The map contains an entry with key ("abc") of type xs:untypedAtomic that is not an instance of the required type xs:string".

The lookup operator "?" has been re-implemented as a native expression (previously it was compiled into a complex expression involving loops and conditionals). As a result it should now be faster, but more particularly, it now benefits from improved type checking, and any run-time type errors are now much clearer because they relate to the expression as written, and not to some internally-generated code.

The map:merge() function has been re-implemented to avoid quadratic performance when a large number of maps containing duplicate keys are combined using the duplicates:combine option.

In Saxon-EE, regular expressions used in functions such as matches(), replace(), tokenize(), and analyze-string() are now cached so that when the same regular expression is used repeatedly, the compiled form of the regex is reused. (This is only relevant where the regular expression is not known statically: regular expressions that are known at query/stylesheet compile time are always compiled at compile time.) The cache is held at the level of a Configuration. The optimization can be suppressed using opt:-e.

Saxon by default checks axis steps at compile time to ensure they are capable of returning a non-empty result. "Void" steps such as comment()/* result in a compile-time warning. This check appears to be quite expensive in some use cases, so an optimization flag (opt:-d) has been introduced to suppress it.

A new optimization has been introduced to reduce the cost of executing simple path expressions like books/book/price. Specifically, if the right-hand operand of "/" is a simple axis step, Saxon now evaluates it without creating a new dynamic context object, and without tracking the context item and position, because they are never needed in this situation. This appears to give a speed-up of around 10% for such path expressions.

Changes to the TinyTree data structure

Two significant changes have been implemented for the TinyTree:

Where an element has a single text node child (and no attributes or namespace declarations), the element and text nodes are now combined in a single entry in the main node array (called a TextualElement). For documents that have many such simple-content elements, this can reduce the space overhead (over and above the space used by character content) by up to 50%. (Note: further reductions can be achieved if whitespace is stripped.) As a side-effect, searching for elements using the child or descendant axes is fractionally faster because there is no need to skip the entries containing text nodes.
It is now possible for a TinyTree to contain element nodes created as virtual copies of nodes in another external tree (typically but not necessarily another TinyTree). These are referred to as "grafted" nodes. They arise when a tree is constructed by copying parts of another tree. For example, given:
<xsl:variable name="sorted"> <temp> <xsl:copy-of select="sort(//employee, function($e){number($e!salary)})"/> </temp> </xsl:variable>
a new tree will be constructed that consists largely of grafted nodes containing references to the employee elements in the original unsorted input tree. This saves the cost in time and space of copying the employee subtrees from the old tree to the new. Saxon ensures that the nodes behave like copies even though they are not physically copied: for example, generate-id() will return a different value from the original, and navigating down into the subtree and then back up again will find the correct parent node.