Whitespace Stripping in Source Documents
A number of factors combine to determine whether whitespace-only text nodes in the source document are visible to the user-written XSLT or XQuery code.
By default, if there is a DTD or schema, then ignorable whitespace is stripped from
any source document loaded from a StreamSource or SAXSource.
Ignorable whitespace is defined as the whitespace that appears separating the child elements
in elements declared to have element-only content. This whitespace is removed regardless of
any xml:space attributes in the source document.
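The notion of ignorable whitespace can be illustrated with the JDK's built-in JAXP DOM parser, independently of Saxon. The following is a minimal sketch, not Saxon's own code: the class name, the sample document, and the helper methods are hypothetical, and it assumes a validating parser so that the DTD content models are known. It parses the same document twice and counts the whitespace-only text children of an element declared with element-only content.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class IgnorableWhitespaceDemo {

    // Hypothetical sample: <list> has element-only content, <item> has #PCDATA content.
    static final String XML =
          "<!DOCTYPE list [\n"
        + "  <!ELEMENT list (item*)>\n"
        + "  <!ELEMENT item (#PCDATA)>\n"
        + "]>\n"
        + "<list>\n"
        + "  <item> </item>\n"
        + "  <item>b</item>\n"
        + "</list>";

    // Parses the sample with the JDK parser (not Saxon). When dropIgnorable is true,
    // whitespace in element-only content is left out of the resulting DOM.
    static Document parse(boolean dropIgnorable) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // the parser needs the DTD content models
            factory.setIgnoringElementContentWhitespace(dropIgnorable);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts the direct children of parent that are whitespace-only text nodes.
    static int whitespaceOnlyTextChildren(Node parent) {
        int count = 0;
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.TEXT_NODE && c.getNodeValue().trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Node kept = parse(false).getDocumentElement();
        Node stripped = parse(true).getDocumentElement();
        System.out.println("whitespace children of <list>, kept:     "
                + whitespaceOnlyTextChildren(kept));
        System.out.println("whitespace children of <list>, stripped: "
                + whitespaceOnlyTextChildren(stripped));
        // The space inside the first <item> is not ignorable, because <item> is
        // declared as #PCDATA, so it is still present after stripping.
        System.out.println("whitespace children of first <item>:     "
                + whitespaceOnlyTextChildren(stripped.getFirstChild()));
    }
}
```

Only the whitespace between the item elements is affected. The whitespace inside the first item element is character data in its own right and remains in the tree.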
It is possible to change this default behavior in several ways.
- From the com.saxonica.Query or com.saxonica.Transform command line, options are available: -strip:all strips all whitespace text nodes, -strip:none strips no whitespace text nodes, and -strip:ignorable strips ignorable whitespace text nodes only (this is the default).
- If the -p option is used on the command line, then query parameters are recognized in the URI passed to the document() or doc() function. The parameter strip-space=yes strips all whitespace text nodes, strip-space=no strips no whitespace text nodes, and strip-space=ignorable strips ignorable whitespace text nodes only. This overrides anything specified on the command line.
- Options corresponding to the above can also be set on the TransformerFactory object or on the Configuration. These settings are global.
Whitespace stripping that is specified in any of the above ways is not limited to source
documents parsed under Saxon's control (that is, documents supplied as a JAXP
StreamSource or SAXSource). It also applies where the input is
supplied in the form of a tree (for example, a DOM). In this case Saxon wraps the supplied
tree in a virtual tree that provides a view of the original tree with whitespace text nodes
omitted.
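The idea of a view that omits whitespace text nodes without modifying the underlying tree can be sketched in plain Java over a DOM. This is an illustration only and does not reproduce Saxon's wrapper classes: the class and method names are hypothetical, and the sketch assumes the "strip all" policy, where every whitespace-only text node is hidden.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class StrippedViewSketch {

    // True if the node is a text node consisting only of XML whitespace characters.
    static boolean isWhitespaceOnlyText(Node node) {
        if (node.getNodeType() != Node.TEXT_NODE) {
            return false;
        }
        String value = node.getNodeValue();
        for (int i = 0; i < value.length(); i++) {
            char ch = value.charAt(i);
            if (ch != ' ' && ch != '\t' && ch != '\n' && ch != '\r') {
                return false;
            }
        }
        return true;
    }

    // Returns the children of parent as seen through the stripped view.
    // The underlying DOM is left untouched; nodes are filtered on access.
    static List<Node> visibleChildren(Node parent) {
        List<Node> visible = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (!isWhitespaceOnlyText(c)) {
                visible.add(c);
            }
        }
        return visible;
    }

    // Convenience helper for the example: builds a DOM from a string using the JDK parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<a>\n  <b>x</b>\n  <c> </c>\n</a>");
        Node a = doc.getDocumentElement();
        System.out.println("children in the original tree: " + a.getChildNodes().getLength());
        System.out.println("children in the stripped view: " + visibleChildren(a).size());
    }
}
```

Filtering on access rather than deleting nodes means the caller's tree is never altered, which is the property the virtual tree described above provides.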
This whitespace stripping is additional (and prior) to any stripping carried out as a result
of the xsl:strip-space declaration in the stylesheet.