Saxon provides a number of additional serialization parameters, with names in the Saxon
namespace. These can be specified as attributes on the xsl:output and xsl:result-document elements (XSLT only), in
the Query prolog (XQuery only), as parameters in the fn:serialize() function, or as extra parameters on
the Query or Transform command line. They can also be specified in the query or
transformation API.
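As a minimal sketch of the first option, the stylesheet below binds the prefix saxon to the Saxon namespace (http://saxon.sf.net/) and sets one of the parameters described below on xsl:output. The choice of saxon:indent-spaces and the value 2 are arbitrary illustrations, and exclude-result-prefixes is there only to keep the Saxon namespace out of the result.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Standard parameters and Saxon extension parameters sit side by side. -->
  <xsl:output method="xml" indent="yes" saxon:indent-spaces="2"/>

  <!-- Copy the input unchanged, so that only the serialization differs. -->
  <xsl:mode on-no-match="shallow-copy"/>

</xsl:stylesheet>
```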
saxon:attribute-order
eqnames
Available with the XML, HTML, and XHTML output methods, to control the order in
which attributes appear within an element start tag (in the absence of the
property the order of attributes is unpredictable).
The value of the parameter is a list of tokens, each of which is either a QName or
the token "*" to match unspecified attributes. Attributes whose names are
listed before the "*" token appear first, in the order they are listed; unlisted
attributes follow, sorted first by namespace URI and then by local name; and finally any attributes
whose names are listed after the "*" appear at the end.
For example saxon:attribute-order="a b c * xml:space" will cause attributes to be output in the order
a, then b, then c, then everything else (sorted by URI and local name),
then xml:space.
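The example above might be tried with a stylesheet along the following lines. This is a sketch only: the item element and its attribute values are invented for illustration, and the attributes are deliberately written out of order.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <xsl:output method="xml" saxon:attribute-order="a b c * xml:space"/>

  <xsl:template match="/">
    <!-- Hypothetical element; z is not listed, so it falls under "*". -->
    <item xml:space="preserve" z="26" c="3" a="1" b="2"/>
  </xsl:template>

</xsl:stylesheet>
```

Under the rule described above, the attributes should be serialized in the order a, b, c, z, xml:space.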
saxon:canonical
boolean
Available with the XML output method, to request that the serialized XML conforms to the W3C XML Canonicalization 1.1
specification (C14N). This can be useful, for example, when test results are to be compared.
Specifically, this option changes XML serialization as follows:
Empty elements are output as <empty></empty> rather than <empty/>.
Namespaces within a start tag are sorted in alphabetical order of prefix.
Attributes within a start tag are sorted first by namespace URI, then by local name.
Processing instructions and comments that appear as children of the document node are separated by newlines.
Specifying saxon:canonical="yes" forces omit-xml-declaration="yes",
indent="no", and encoding="utf-8", and causes use-character-maps and cdata-section-elements
to be ignored. No DOCTYPE declaration is output. The option does not force Unicode normalization; if in doubt, set normalization-form="C".
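A minimal sketch of requesting canonical output from a stylesheet follows. The doc and empty elements are invented for illustration; normalization-form="C" is included only because the text above suggests it when in doubt.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- indent, encoding and omit-xml-declaration are forced by saxon:canonical. -->
  <xsl:output method="xml" saxon:canonical="yes" normalization-form="C"/>

  <xsl:template match="/">
    <!-- Hypothetical content: unsorted attributes and an empty element. -->
    <doc b="2" a="1"><empty/></doc>
  </xsl:template>

</xsl:stylesheet>
```

Going by the changes listed above, the result should have the attributes in the order a, b and the empty element written with separate start and end tags.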
saxon:character-representation
"native" | "entity" | "decimal" | "hex"
Allows greater control over how non-ASCII characters will be represented on
output.
When the output method is XML, two values are supported: decimal and
hex. These control whether numeric character references are
output in decimal or hexadecimal when the character is not available in the
selected encoding.
When the output method is HTML, the value may hold two strings, separated by a
semicolon. The first string defines how non-ASCII characters within the
character encoding will be represented, the values being native,
entity, decimal, or hex. The second
string defines how characters outside the encoding will be represented, the
values being entity, decimal, or hex.
Here native means output the character as itself;
entity means use a defined entity reference (such as
"&eacute;") if known; decimal and hex refer to
numeric character references. For example entity;decimal (the
default) means that with encoding="iso-8859-1", characters in the
range 160-255 will be represented using standard HTML entity references, while
Unicode characters above 255 will be represented as decimal character
references.
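To illustrate the two-part value for the HTML method, the sketch below asks for characters inside the encoding to be written as themselves and characters outside it to be written as hexadecimal references. The paragraph content is invented; it is written with character references only to keep the stylesheet itself in ASCII.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <xsl:output method="html" encoding="iso-8859-1"
              saxon:character-representation="native;hex"/>

  <xsl:template match="/">
    <!-- &#233; (e-acute) is in iso-8859-1; &#8364; (euro sign) is not. -->
    <p>caf&#233; &#8364;5</p>
  </xsl:template>

</xsl:stylesheet>
```

With this setting the e-acute should be output as a single iso-8859-1 character, and the euro sign as a hexadecimal character reference.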
saxon:double-space
eqnames
When the output method is XML with indent="yes", the
saxon:double-space attribute causes an extra blank line to be output
before the start tag of selected elements. The value is a whitespace-separated list of
element names. The attribute follows the same conventions as
cdata-section-elements: values specified in separate xsl:output or
xsl:result-document elements are cumulative, and if the value is
supplied programmatically via an API, or from the command line, the element
names are given in Clark notation, namely {uri}local.
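A small sketch of the attribute in use, assuming a hypothetical book document whose chapter elements should be visually separated:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- In a stylesheet the names are ordinary QNames, not Clark notation. -->
  <xsl:output method="xml" indent="yes" saxon:double-space="chapter"/>

  <xsl:template match="/">
    <!-- Hypothetical content. -->
    <book>
      <chapter><title>One</title></chapter>
      <chapter><title>Two</title></chapter>
    </book>
  </xsl:template>

</xsl:stylesheet>
```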
saxon:indent-spaces
integer
When the output method is XML, HTML, or XHTML with indent="yes", the
saxon:indent-spaces attribute may be used to control the amount
of indentation. The default value in the absence of this attribute is 3.
saxon:internal-dtd-subset
string
When the output method is XML, the
saxon:internal-dtd-subset attribute may be used to generate an internal DTD.
The value is a string conforming to the XML grammar production intSubset; it is
included in the serialized document "as is", without checking. As with any string, special
characters will need to be escaped, for example "<" is written as "&lt;".
The square brackets that enclose the internal subset within the Document Type Declaration
should not be included in the value.
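As a sketch of the escaping involved, the stylesheet below supplies a one-declaration internal subset. The note element and its declaration are invented for illustration; the value is written without the enclosing square brackets and with "<" and ">" escaped.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- After XML parsing, the attribute value is: <!ELEMENT note (#PCDATA)> -->
  <xsl:output method="xml"
              saxon:internal-dtd-subset="&lt;!ELEMENT note (#PCDATA)&gt;"/>

  <xsl:template match="/">
    <note>remember the milk</note>
  </xsl:template>

</xsl:stylesheet>
```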
saxon:line-length
integer
Default value 80. With the XML output method, attributes are output on a new line
if they would otherwise extend beyond this column position. With the HTML output
method, text lines are split at this line length when possible.
saxon:newline
string
Default value x0A (a single newline character, decimal codepoint 10). Defines the string
that is used by the text output method
to represent a newline. The Windows line ending x0Dx0A (CRLF) may sometimes be preferred
to the default of a single newline character; this can be specified using
saxon:newline="&#xD;&#xA;".
saxon:next-in-chain
uri
XSLT only. Used to direct the output to another stylesheet. The value is the URL
of a stylesheet that should be used to process the output stream. In this case
the output stream must always be pure XML, and attributes that control the
format of the output (such as method and
cdata-section-elements) will have no effect. The output of
the second stylesheet will be directed to the destination that would have been
used for the first stylesheet if no saxon:next-in-chain attribute
were present.
This serialization property is available only on xsl:output declarations and xsl:result-document instructions. It cannot be
supplied as an attribute to any of the various APIs that control serialization; nor can it be used
on the command line. It is not supported as an XQuery serialization parameter.
Supplying a zero-length string is equivalent to omitting the attribute, except
that it can be used to override a previous setting.
If the value is a relative URI, it is interpreted relative to the base URI of the
stylesheet element (xsl:output or xsl:result-document)
on which the attribute appears.
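A minimal sketch of a first-stage stylesheet that hands its result to a second one. The file name stage2.xsl is hypothetical; as a relative URI it would be resolved against the base URI of the xsl:output element.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- The result tree of this stylesheet becomes the input of stage2.xsl. -->
  <xsl:output saxon:next-in-chain="stage2.xsl"/>

  <!-- Stand-in for the first stage's real logic: copy the input unchanged. -->
  <xsl:mode on-no-match="shallow-copy"/>

</xsl:stylesheet>
```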
saxon:property-order
eqnames
Available with the JSON output method, to control the order in
which properties appear within the serialization of a map/object (in the absence of saxon:property-order
the order of properties is unpredictable).
The value of the parameter is a list of tokens, in which the token "*" is treated specially. Properties whose names are
listed before the "*" token appear first, in the order they are listed; other
unlisted properties follow, sorted alphabetically, and finally any properties
whose names are listed after the "*" appear at the end.
For example saxon:property-order="@ a b c * $" will cause properties to be output in the order
@, then a, then b, then c, then everything else,
then $. Although JSON property names can include spaces, there is no provision for such names to be included in the list.
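The following sketch applies the same idea to a map with ordinary property names. The map content and the property names are invented for illustration.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <xsl:output method="json" saxon:property-order="id name * notes"/>

  <xsl:template match="/">
    <!-- Hypothetical map; city and zip are unlisted, so they fall under "*". -->
    <xsl:sequence select="map{'notes': 'n/a', 'zip': '12345', 'name': 'Ann',
                              'id': 1, 'city': 'Oslo'}"/>
  </xsl:template>

</xsl:stylesheet>
```

Under the rule described above, the properties should appear in the order id, name, city, zip, notes.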
saxon:recognize-binary
boolean
Relevant only when using the text output method. If set to yes, the
processing instructions <?hex XXXX?> and <?b64
XXXX?> will be recognized; the value is taken as a hexBinary or
base64 representation of a character string, encoded using the encoding in use
by the serializer, and this character string will be output without validating
it to ensure it contains valid XML characters. Also recognized are
<?hex.EEEE XXXX?> and <?b64.EEEE
XXXX?>, where EEEE is the name of the encoding of the base64 or
hexBinary data: for example hex.ascii or b64.utf8.
This enables non-XML characters, notably binary zero, to be output.
For example, given <xsl:output method="text"
saxon:recognize-binary="yes"/>, an xsl:processing-instruction instruction
named b64.utf8 whose content is the string value of the securityKey element
outputs the decoded value of that element, on the assumption that
it holds base64-encoded UTF-8 text.
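An instruction of the kind described here might look like the sketch below. The credentials wrapper element and the path to securityKey are assumptions about a hypothetical input document, and the second instruction is an additional illustration of the hex form rather than something taken from the text.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <xsl:output method="text" saxon:recognize-binary="yes"/>

  <xsl:template match="/">
    <!-- Decode the base64 content of securityKey as UTF-8 text. -->
    <xsl:processing-instruction name="b64.utf8"
                                select="/credentials/securityKey"/>
    <!-- Hex form: 00 is intended to produce a single binary zero. -->
    <xsl:processing-instruction name="hex.ascii" select="'00'"/>
  </xsl:template>

</xsl:stylesheet>
```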
saxon:require-well-formed
boolean
Affects the handling of result documents that contain multiple top-level elements
or top-level text nodes. The W3C specifications allow such a result document,
even though it is not a well-formed XML document. It is, however, a well-formed
external general parsed entity, which means it can be incorporated
into a well-formed XML document by means of an entity reference.
The default is no. If the value is set to yes, and a
SAX destination (for example a SAXResult, a
JDOMResult, or a user-written ContentHandler) is
supplied to receive the results of the transformation, then Saxon will report an
error rather than sending a non-well-formed stream of SAX events to the
ContentHandler. This attribute is useful when the output of the
stylesheet is sent to a component (for example an XSL-FO rendering engine) that
is not designed to accept non-well-formed XML result trees.
Note also that namespace undeclarations of the form xmlns:p="" (as
permitted by XML Namespaces 1.1) are passed to the
startPrefixMapping() method of a user-defined
ContentHandler only if undeclare-prefixes="yes" is
specified on xsl:output.
saxon:single-quotes
boolean
If set to yes, the XML, HTML, and XHTML output methods will generally use
single quotes (apostrophes) rather than double quotes to delimit attribute values.
This can be useful if the serialized XML/HTML is to be subsequently wrapped in double quotes,
for example as part of a JSON text, or within a Java string literal. It does not eliminate the
need to escape double quotes (using \") in such a context, but it means that
fewer characters will be affected, which improves the readability of the result.
The property is ignored in the case of attributes affected by character maps, where
single or double quotes are used intelligently based on the actual content of the attribute.
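A minimal sketch of the setting, using an invented link element:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <xsl:output method="xml" omit-xml-declaration="yes"
              saxon:single-quotes="yes"/>

  <xsl:template match="/">
    <!-- Hypothetical element with one attribute. -->
    <link href="index.html"/>
  </xsl:template>

</xsl:stylesheet>
```

The attribute should then be delimited with apostrophes, as in href='index.html', so the element can be embedded in a double-quoted string without escaping.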
saxon:supply-source-locator
boolean
Relevant only when output is sent to a user-written ContentHandler,
that is, a SAXResult. It causes extra information to be maintained
and made available to the ContentHandler for diagnostic purposes:
specifically, the Locator that is passed to the
ContentHandler via the setDocumentLocator method
may be cast to a ContentHandlerProxyLocator, which exposes the
method getContextItemStack(). This returns a
java.util.Stack. The top item on the stack is the current
context item, and below this are previous context items. Each item is
represented by the interface net.sf.saxon.om.Item. If the item is a node, and if the node is one
derived by parsing a source document with the line-numbering option enabled,
then it is possible to obtain the URI and line number of this node in the
original XML source.
For this to work, the stylesheet or query must be compiled with tracing enabled. This can be
achieved by calling config.setCompileWithTracing(true)
on the Configuration
object, or equivalently by setting the configuration property COMPILE_WITH_TRACING.
Note that this compile-time option imposes
a substantial run-time overhead, even if tracing is not switched on at run-time
by providing a TraceListener.