saxon:stream
Provides streamed input.
stream($input as item()*) ➔ item()*
Arguments
| $input | item()* | The input to be streamed |
Result
| item()* |
Namespace
http://saxon.sf.net/
Notes on the Saxon implementation
Available since before Saxon 8.0. Obsolescent in XSLT, since the xsl:stream instruction provides equivalent functionality; but still useful in XQuery.
Changed in 9.6 so that the function delivers snapshots of the selected nodes rather than copies: that is, the delivered nodes include copies of the ancestors and their attributes, as well as the attributes and descendants of the selected node.
Details
Conceptually, this function returns a copy of its input. The intent, however, is to evaluate the supplied argument in "streaming mode", which allows an input document to be processed without building a tree representation of the whole document in memory. This allows much larger documents to be processed using Saxon than would otherwise be the case.
When there is a requirement to stream documents other than the principal input, this can be achieved in XQuery using the saxon:stream extension function, which enables burst-mode streaming by reading a source document and delivering a sequence of element nodes representing selected elements within that document. For example:
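A minimal sketch of such a query, consistent with the description that follows. The file name employees.xml and the path come from the text; the salary child element and the exact content placed inside <sal> are assumptions made for illustration.

```xquery
declare namespace saxon = "http://saxon.sf.net/";

(: Assumed input: employees.xml, whose document element contains
   <employee> children, each with a (hypothetical) <salary> child. :)
for $e in saxon:stream(doc('employees.xml')/*/employee)
return <sal>{ $e/salary }</sal>
```

The FLWOR expression handles one <employee> at a time, which is what allows each element to be discarded before the next one is read.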
This example returns a sequence of <sal> elements. The result of the saxon:stream call is a sequence of <employee> elements. Each <employee> element is linked to copies of: its attributes and namespaces; its descendants and their attributes; its ancestors and their attributes. But the siblings of the selected node, and of its ancestors, are missing, and any attempt to select them will return an empty sequence. This means it is not possible to navigate from one <employee> element to others in the file; in fact, only one of them actually exists in memory at any one time.
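The shape of each delivered node can be inspected directly. The following is a sketch only, assuming the same employees.xml layout and the snapshot behaviour described for Saxon 9.6 and later; the <check> element and its attribute names are invented for illustration.

```xquery
declare namespace saxon = "http://saxon.sf.net/";

for $e in saxon:stream(doc('employees.xml')/*/employee)
return
  (: ancestors are copied into the snapshot; siblings are not :)
  <check ancestors="{ count($e/ancestor::*) }"
         siblings="{ count($e/preceding-sibling::* | $e/following-sibling::*) }"/>
```

Under those assumptions, the ancestors count should reflect the copied document element, while the siblings count should be zero for every <employee>, however many the file contains.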
The function saxon:stream may be regarded as a pseudo-function. Conceptually, it takes the set of nodes supplied in its argument, and makes a snapshot of each one (in the sense of the XSLT 3.0 fn:snapshot() function). The resulting sequence of nodes will usually be processed by an expression such as an XQuery FLWOR expression, which handles the nodes one at a time. The actual implementation of saxon:stream, however, is rather different, in that it changes the way in which its argument is evaluated: instead of the doc() function building a tree in the normal way, the path expression doc('employees.xml')/*/employee is evaluated in streamed mode, which means that it must conform to a subset of the XPath syntax which Saxon can evaluate in streamed mode. For more information, see Streaming in XQuery.
The facility should not be used if the source document is read more than once in the course of the query/transformation. There are two reasons for this: firstly, if it is read more than once then performance will be better if the document is read into memory; and secondly, when this optimization is used, there is no guarantee that the doc() function will be stable, that is, that it will return the same results when called repeatedly with the same URI.
If the path expression cannot be evaluated in streaming mode, execution does not fail; rather, the expression is evaluated in unstreamed mode. This will give the same results provided enough memory is available for this mode of evaluation. To check whether streamed processing is actually being used, set the -t option from the command line or the FeatureKeys.TIMING option from the configuration API; the output will indicate whether a particular source document has been processed by building a tree, or by streaming.
The expression used as an argument to the saxon:stream function must consist of:
- A call to the document() or doc() function, followed by
- A streamable pattern
Streamable patterns use a subset of XPath expression syntax corresponding to the rules for motionless match patterns in XSLT 3.0.
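As an illustration of these rules, a predicate that tests only an attribute of the selected element would be expected to qualify as motionless, whereas a predicate that tests child content (for example, a salary value) would require reading ahead and so would not. The sketch below assumes a hypothetical dept attribute on <employee>; whether a particular expression is actually streamed is best confirmed with the -t option.

```xquery
declare namespace saxon = "http://saxon.sf.net/";

(: doc() call followed by a pattern that inspects only the
   element's own attributes; @dept is a hypothetical attribute. :)
for $e in saxon:stream(doc('employees.xml')/*/employee[@dept = 'sales'])
return <sal>{ $e/salary }</sal>
```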
For further details see Streaming of Large Documents.