Whitespace Stripping in Source Documents
A number of factors combine to determine whether whitespace-only text nodes in the source document are visible to the user-written XSLT or XQuery code.
By default, if there is a DTD or schema, then ignorable whitespace is stripped from
any source document loaded from a StreamSource or SAXSource.
Ignorable whitespace is defined as the whitespace that separates the child elements
of an element declared to have element-only content. This whitespace is removed regardless of
any xml:space attributes in the source document.
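The definition of ignorable whitespace can be illustrated with a small sketch. It uses the JDK's own DOM parser rather than Saxon, so it shows the concept only, not Saxon's implementation. The sample document, the class name IgnorableWhitespaceDemo, and the method name are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class IgnorableWhitespaceDemo {
    // Hypothetical sample: <list> is declared element-only, <item> holds text.
    static final String XML =
          "<!DOCTYPE list [\n"
        + "  <!ELEMENT list (item+)>\n"
        + "  <!ELEMENT item (#PCDATA)>\n"
        + "]>\n"
        + "<list>\n"
        + "  <item>a</item>\n"
        + "  <item>b</item>\n"
        + "</list>";

    // Counts the whitespace-only text nodes that are children of <list>.
    static int whitespaceOnlyChildren(boolean dropIgnorable) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // The parser needs the DTD content model to know which
            // whitespace is ignorable, so validation is switched on.
            factory.setValidating(true);
            factory.setIgnoringElementContentWhitespace(dropIgnorable);
            NodeList children = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)))
                    .getDocumentElement()
                    .getChildNodes();
            int count = 0;
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.TEXT_NODE
                        && child.getNodeValue().trim().isEmpty()) {
                    count++;
                }
            }
            return count;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ignorable whitespace kept:    " + whitespaceOnlyChildren(false));
        System.out.println("ignorable whitespace dropped: " + whitespaceOnlyChildren(true));
    }
}
```

Because list is declared with element-only content, the line breaks and indentation between its item children are ignorable, and the parser can drop them once it knows the content model.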
It is possible to change this default behavior in several ways.
- From the com.saxonica.Query or com.saxonica.Transform command line, options are available: -strip:all strips all whitespace text nodes, -strip:none strips no whitespace text nodes, and -strip:ignorable strips ignorable whitespace text nodes only (this is the default).
- If the -p option is used on the command line, then query parameters are recognized in the URI passed to the document() or doc() function. The parameter strip-space=yes strips all whitespace text nodes, strip-space=no strips no whitespace text nodes, and strip-space=ignorable strips ignorable whitespace text nodes only. This overrides anything specified on the command line.
- Options corresponding to the above can also be set on the TransformerFactory object or on the Configuration. These settings are global.
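As a rough sketch of the last option, the setting might be applied through the standard JAXP TransformerFactory.setAttribute() call. The attribute URI below is an assumption not stated in this section: it is the value commonly documented for Saxon's strip-whitespace feature, and it should be checked against the FeatureKeys documentation for the Saxon release in use. The class and method names are hypothetical.

```java
import javax.xml.transform.TransformerFactory;

class StripWhitespaceSetting {
    // ASSUMPTION: attribute name for Saxon's whitespace-stripping policy.
    // Verify it against the FeatureKeys list of your Saxon release.
    static final String STRIP_WHITESPACE =
            "http://saxon.sf.net/feature/strip-whitespace";

    // Tries to set the policy ("all", "none" or "ignorable") on the factory.
    // Returns false if the factory does not recognise the attribute, which
    // is what a non-Saxon JAXP implementation is expected to report.
    static boolean trySetStripPolicy(TransformerFactory factory, String policy) {
        try {
            factory.setAttribute(STRIP_WHITESPACE, policy);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        TransformerFactory factory = TransformerFactory.newInstance();
        boolean applied = trySetStripPolicy(factory, "all");
        System.out.println(factory.getClass().getName()
                + " accepted the setting: " + applied);
    }
}
```

Since the setting is global, it affects every source document subsequently loaded through that factory, so it is best applied once, immediately after the factory is created.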
Whitespace stripping that is specified in any of the above ways is not limited to the case
where the source document is parsed under Saxon's control (that is, where it is supplied as
a JAXP StreamSource or SAXSource). It also applies where the input is
supplied in the form of a tree (for example, a DOM). In this case Saxon wraps the supplied
tree in a virtual tree that provides a view of the original tree with whitespace text nodes
omitted.
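The idea of a view that omits whitespace text nodes without modifying the underlying tree can be sketched as follows. This is an illustration only, written against the JDK's DOM API; it is not Saxon's wrapper, and the class WhitespaceHidingView and its methods are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class WhitespaceHidingView {
    // True for text nodes consisting only of XML whitespace characters.
    static boolean isWhitespaceOnlyText(Node node) {
        if (node.getNodeType() != Node.TEXT_NODE) {
            return false;
        }
        String value = node.getNodeValue();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                return false;
            }
        }
        return true;
    }

    // Returns the children a caller would see through the view.
    // The DOM itself is left untouched.
    static List<Node> visibleChildren(Node parent) {
        List<Node> visible = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (!isWhitespaceOnlyText(c)) {
                visible.add(c);
            }
        }
        return visible;
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node root = parse("<r>\n  <a/>\n  <b>x</b>\n</r>").getDocumentElement();
        System.out.println("children in the DOM:  " + root.getChildNodes().getLength());
        System.out.println("children in the view: " + visibleChildren(root).size());
    }
}
```

The filtering happens at navigation time, so the original tree still contains its whitespace text nodes for any other code that reads it directly.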
This whitespace stripping is additional (and prior) to any stripping carried out as a result
of the xsl:strip-space declaration in the stylesheet.
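For comparison, the stylesheet-level mechanism can be sketched with a minimal XSLT 1.0 stylesheet run through the standard JAXP API. The example runs on whichever TransformerFactory is on the classpath and does not involve the Saxon-specific options described above. The sample document, the stylesheet, and the class name StripSpaceDeclarationDemo are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StripSpaceDeclarationDemo {
    // Hypothetical source: three whitespace-only text nodes inside <r>.
    static final String SOURCE = "<r>\n  <a/>\n  <b/>\n</r>";

    // Builds a stylesheet that reports how many text-node children <r> has,
    // optionally declaring xsl:strip-space for all elements.
    static String stylesheet(boolean withStripSpace) {
        return "<xsl:stylesheet version='1.0'"
             + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
             + (withStripSpace ? "<xsl:strip-space elements='*'/>" : "")
             + "<xsl:output method='text'/>"
             + "<xsl:template match='/'>"
             + "<xsl:value-of select='count(/r/text())'/>"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    static String countTextChildren(boolean withStripSpace) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer(
                    new StreamSource(new StringReader(stylesheet(withStripSpace))));
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(SOURCE)), new StreamResult(out));
            return out.toString().trim();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without xsl:strip-space: " + countTextChildren(false));
        System.out.println("with xsl:strip-space:    " + countTextChildren(true));
    }
}
```

Any whitespace already removed by the options described earlier is gone before the stylesheet's declaration is considered, so xsl:strip-space can only remove further nodes; it cannot restore ones that were stripped at load time.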