Whitespace Stripping in Source Documents
A number of factors combine to determine whether whitespace-only text nodes in the source document are visible to the user-written XSLT or XQuery code.
By default, if there is a DTD or schema, ignorable whitespace is stripped from any source document loaded from a StreamSource or SAXSource. Ignorable whitespace is defined as the whitespace that separates the child elements of an element declared to have element-only content. This whitespace is removed regardless of any xml:space attributes in the source document.
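The definition of ignorable whitespace can be observed with a plain SAX parser. The following is a minimal sketch that uses only the JDK's built-in parser, not Saxon; the sample document and the class name IgnorableWhitespaceDemo are made up for illustration, and validation is switched on only so that the parser is certain to read the content models from the DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative sketch using only the JDK's SAX parser, not Saxon itself.
final class IgnorableWhitespaceDemo {

    // Hypothetical sample: <list> is declared with element-only content,
    // <item> with character content.
    static final String XML =
          "<!DOCTYPE list [\n"
        + "<!ELEMENT list (item+)>\n"
        + "<!ELEMENT item (#PCDATA)>\n"
        + "]>\n"
        + "<list>\n"
        + "  <item>a</item>\n"
        + "  <item> </item>\n"
        + "</list>";

    /** Records what the parser reports as ignorable whitespace and as ordinary text. */
    static final class Counter extends DefaultHandler {
        int ignorableChars = 0;
        final StringBuilder text = new StringBuilder();

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            ignorableChars += length;      // whitespace between child elements of <list>
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // content of <item>, including a lone space
        }
    }

    static Counter parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);        // assumption: forces the DTD to be applied
        SAXParser parser = factory.newSAXParser();
        Counter counter = new Counter();
        parser.parse(new InputSource(new StringReader(xml)), counter);
        return counter;
    }

    public static void main(String[] args) throws Exception {
        Counter c = parse(XML);
        System.out.println("ignorable whitespace characters: " + c.ignorableChars);
        System.out.println("ordinary text: [" + c.text + "]");
    }
}
```

The whitespace inside the second item element is not ignorable, because item is not declared with element-only content; only the whitespace separating the children of list is reported through the ignorableWhitespace callback.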
It is possible to change this default behavior in several ways.
- From the Transform or Query command line, options are available: -strip:all strips all whitespace text nodes, -strip:none strips no whitespace text nodes, and -strip:ignorable strips ignorable whitespace text nodes only (this is the default).
- If the -p option is used on the command line, then query parameters are recognized in the URI passed to the document() or doc() function. The parameter strip-space=yes strips all whitespace text nodes, strip-space=no strips no whitespace text nodes, and strip-space=ignorable strips ignorable whitespace text nodes only. This overrides anything specified on the command line.
- Options corresponding to the above can also be set on the TransformerFactory object or on the Configuration. These settings are global.
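Setting the option on a TransformerFactory might look like the sketch below. The factory class name and the attribute URI are assumptions not stated in the text above (they are believed to correspond to Saxon's TransformerFactoryImpl and to its FeatureKeys.STRIP_WHITESPACE constant) and should be checked against the Saxon release in use; the class StripWhitespaceConfig and its method are hypothetical. The code returns null rather than failing when Saxon is not on the classpath.

```java
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

// Illustrative sketch; the two string constants below are assumptions.
final class StripWhitespaceConfig {

    // Assumed: Saxon's JAXP factory class name.
    static final String SAXON_FACTORY = "net.sf.saxon.TransformerFactoryImpl";
    // Assumed: the URI behind Saxon's FeatureKeys.STRIP_WHITESPACE constant.
    static final String STRIP_WHITESPACE = "http://saxon.sf.net/feature/strip-whitespace";

    // The three policies named for the command-line -strip option.
    static final List<String> POLICIES = Arrays.asList("all", "none", "ignorable");

    /** Returns a configured factory, or null when Saxon is not on the classpath. */
    static TransformerFactory newFactory(String policy) {
        if (!POLICIES.contains(policy)) {
            throw new IllegalArgumentException("Unknown strip policy: " + policy);
        }
        try {
            TransformerFactory factory = TransformerFactory.newInstance(SAXON_FACTORY, null);
            // Applies to every source document processed through this factory.
            factory.setAttribute(STRIP_WHITESPACE, policy);
            return factory;
        } catch (TransformerFactoryConfigurationError e) {
            return null; // Saxon could not be loaded
        }
    }

    public static void main(String[] args) {
        TransformerFactory factory = newFactory("all");
        System.out.println(factory == null
                ? "Saxon not found on the classpath"
                : "Configured " + factory.getClass().getName());
    }
}
```

Because the setting is global to the factory, a separate factory is needed for any transformation that requires a different policy.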
Whitespace stripping specified in any of the above ways is not limited to the case where the source document is parsed under Saxon's control, that is, where it is supplied as a JAXP StreamSource or SAXSource. It also applies where the input is supplied in the form of a tree (for example, a DOM). In this case Saxon wraps the supplied tree in a virtual tree that provides a view of the original tree with whitespace text nodes omitted.
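The idea of a view that omits whitespace text nodes without modifying the underlying tree can be sketched with the JDK's DOM API. This is not Saxon's implementation, only an illustration of the concept for the strip-all case; the class WhitespaceFilteredView and the sample document are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative sketch of a read-only filtered view over a DOM, not Saxon's wrapper.
final class WhitespaceFilteredView {

    /** True for a text node consisting solely of XML whitespace characters. */
    static boolean isWhitespaceOnlyText(Node node) {
        if (node.getNodeType() != Node.TEXT_NODE) {
            return false;
        }
        String value = node.getNodeValue();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                return false;
            }
        }
        return true;
    }

    /** Children of the parent as the view exposes them; the DOM itself is untouched. */
    static List<Node> visibleChildren(Node parent) {
        List<Node> visible = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (!isWhitespaceOnlyText(child)) {
                visible.add(child);
            }
        }
        return visible;
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Node root = parse("<a>\n  <b>x</b>\n  <c/>\n</a>").getDocumentElement();
        System.out.println("children in the DOM:  " + root.getChildNodes().getLength());
        System.out.println("children in the view: " + visibleChildren(root).size());
    }
}
```

Filtering at access time, rather than deleting nodes, leaves the caller's tree intact, which matters when the same DOM is used elsewhere in the application.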
This whitespace stripping is additional (and prior) to any stripping carried out as a result of the xsl:strip-space declaration in the stylesheet.
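For comparison, the effect of the stylesheet-level xsl:strip-space declaration on its own can be seen with the sketch below. It runs on whichever JAXP XSLT processor is the default (the JDK's built-in one if Saxon is not installed) and counts the text nodes visible to the stylesheet; the class StripSpaceDeclarationDemo, the sample source and the stylesheet are all made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustrative sketch using the default JAXP transformer.
final class StripSpaceDeclarationDemo {

    // Hypothetical source with two whitespace-only text nodes around <b>.
    static final String SOURCE = "<a> <b>x</b> </a>";

    /** Builds a stylesheet that outputs the number of text nodes it can see. */
    static String stylesheet(boolean strip) {
        return "<xsl:stylesheet version='1.0'"
             + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
             + (strip ? "<xsl:strip-space elements='*'/>" : "")
             + "<xsl:output method='text'/>"
             + "<xsl:template match='/'>"
             + "<xsl:value-of select='count(//text())'/>"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    static int countTextNodes(boolean strip) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet(strip))));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(SOURCE)),
                              new StreamResult(out));
        return Integer.parseInt(out.toString().trim());
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without xsl:strip-space: " + countTextNodes(false));
        System.out.println("with xsl:strip-space:    " + countTextNodes(true));
    }
}
```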