Streamed processing of input documents

Streamed execution of XQuery is now supported in Saxon-EE.

The streamability rules have been brought into line with the latest XSLT 3.0 draft. The impact on what is streamable is minor, although some details have changed.

The XSLT extension attribute saxon:read-once="yes", which was an early interface provided for streamed processing, has been dropped. In place of <xsl:copy-of select="doc('a.xml')//e" saxon:read-once="yes"/>, use <xsl:stream href="a.xml"><xsl:apply-templates select=".//e"/></xsl:stream>.
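
To show the replacement in context, the following is a minimal sketch of a complete stylesheet, not a definitive migration recipe. The xsl:stream instruction is taken from the text above; the initial template name "main", the wrapper element out, the streamable default mode, and the template rule for e are assumptions added for illustration.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the default mode is declared streamable so that the
       template rule below can be applied to streamed nodes. -->
  <xsl:mode streamable="yes"/>

  <!-- Hypothetical entry point, to be invoked as the initial template. -->
  <xsl:template name="main">
    <out>
      <!-- The replacement given in the text for saxon:read-once="yes". -->
      <xsl:stream href="a.xml">
        <xsl:apply-templates select=".//e"/>
      </xsl:stream>
    </out>
  </xsl:template>

  <!-- Assumption: copy each selected e element, mirroring what the
       old xsl:copy-of instruction produced. -->
  <xsl:template match="e">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Whether this exact combination is accepted as streamable may depend on the structure of the input (for example, whether e elements can be nested), so it is best treated as a starting point.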

Many more functions and operators are now fully streamable (and more thoroughly tested for streaming). These include the functions boolean, codepoints-to-string, deep-equal (partially), filter, fold-left, index-of, insert-before, not, one-or-more, outermost, remove, snapshot, subsequence, tail, trace, and unordered, as well as comma expressions, instance of, and conditional expressions.

Streaming of deep-equal() is restricted in that any nodes in the streamed input sequence are materialized (one at a time) in memory.

A number of functions and operators can cause parsing of the input document to terminate early if no more input is required. These include empty, exists, boolean, not, head, E[1], subsequence, deep-equal, general comparisons, and instance of. Parsing is only terminated if the streamed argument to the function consumes the entire document. Terminating the parsing early means that well-formedness errors in the rest of the file will not be detected.
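
As an illustration, here is a small sketch of a stylesheet in which early termination might apply. The file name a.xml and the element name e are reused from the earlier example; the template name "main" and the result element has-e are hypothetical. Whether parsing actually stops early depends on the condition described above.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical entry point, to be invoked as the initial template. -->
  <xsl:template name="main">
    <xsl:stream href="a.xml">
      <!-- exists() needs no further input once the first e element
           has been seen, so the rest of a.xml may go unread. -->
      <has-e><xsl:value-of select="exists(.//e)"/></has-e>
    </xsl:stream>
  </xsl:template>

</xsl:stylesheet>
```

If parsing does stop at the first e element, a malformed tag later in a.xml would go unreported, which is the trade-off noted above.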

Two configuration options for streaming have been added. The option FeatureKeys.STREAMABILITY controls whether streaming is used at all, and if so, whether Saxon streaming extensions are enabled. The option FeatureKeys.STREAMING_FALLBACK controls what happens when non-streamable constructs are encountered, the options being either to raise a compile-time error, or to attempt a non-streamed evaluation (which will produce the same result as a streamed evaluation, provided enough memory is available).

Saxon has come into line with the W3C specification by no longer allowing an xsl:for-each or xsl:apply-templates instruction whose select expression is crawling (for example select="//*") and whose body is consuming. (A crawling expression, informally, is one that selects elements on the descendant axis, unless it is wrapped in a call on outermost() to ensure that nested elements are not selected.) Saxon previously allowed this combination, but there were cases where it did not work (notably when the body of the instruction contained local variable declarations), and these proved difficult to fix.
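
The restriction can be sketched as follows. The element names section and para, the template name "main", and the result elements are hypothetical; only the use of outermost() comes from the text above.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical entry point, to be invoked as the initial template. -->
  <xsl:template name="main">
    <counts>
      <xsl:stream href="a.xml">
        <!-- No longer allowed: select=".//section" is crawling and the
             body below is consuming (it reads descendants of each
             selected section). -->
        <!-- Wrapping the selection in outermost() excludes nested
             section elements, so the expression is no longer crawling. -->
        <xsl:for-each select="outermost(.//section)">
          <section paras="{count(.//para)}"/>
        </xsl:for-each>
      </xsl:stream>
    </counts>
  </xsl:template>

</xsl:stylesheet>
```

Note that outermost() changes the result when section elements are nested: inner sections are not visited separately, and their para descendants are counted as part of the outermost section.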