Serialization parameters

Saxon provides a number of additional serialization parameters, whose names are in the Saxon namespace (http://saxon.sf.net/). They can be specified as attributes on the xsl:output and xsl:result-document elements (XSLT only), in the query prolog (XQuery only), as parameters to the fn:serialize() function, or as extra parameters on the Query or Transform command line. They can also be set through the query or transformation API.
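
As a minimal sketch of the first of these routes, the stylesheet below declares the Saxon namespace and sets one Saxon-specific parameter on xsl:output. The template and the choice of saxon:indent-spaces are illustrative assumptions, not a recommended configuration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Standard and Saxon-specific serialization attributes side by side -->
  <xsl:output method="xml" indent="yes" saxon:indent-spaces="2"/>

  <!-- Illustrative template: copy the source document unchanged -->
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```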

saxon:attribute-order?

eqnames

Available with the XML, HTML, and XHTML output methods, to control the order in which attributes appear within an element start tag (in the absence of the property the order of attributes is unpredictable). Attributes whose names are listed in this property appear first, in the order they are listed; other attributes follow, sorted first by namespace URI and then by local-name.
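
A possible declaration, assuming the saxon prefix is bound to http://saxon.sf.net/ on the enclosing xsl:stylesheet; the attribute names id and class are hypothetical:

```xml
<!-- id is written first, then class; any other attributes follow,
     sorted by namespace URI and then by local name -->
<xsl:output method="xml" saxon:attribute-order="id class"/>
```

With this setting, an element constructed with the attributes class, title, and id would be expected to be serialized with id first, then class, then title.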

saxon:character-representation?

"native" | "entity" | "decimal" | "hex"

Allows greater control over how non-ASCII characters will be represented on output.

When the output method is XML, two values are supported: decimal and hex. These control whether numeric character references are output in decimal or hexadecimal when the character is not available in the selected encoding.

When the output method is HTML, the value may hold two strings, separated by a semicolon. The first string defines how non-ASCII characters within the character encoding will be represented, the values being native, entity, decimal, or hex. The second string defines how characters outside the encoding will be represented, the values being entity, decimal, or hex. Here native means output the character as itself; entity means use a defined entity reference (such as "&eacute;") if known; decimal and hex refer to numeric character references. For example entity;decimal (the default) means that with encoding="iso-8859-1", characters in the range 160-255 will be represented using standard HTML entity references, while Unicode characters above 255 will be represented as decimal character references.
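
To illustrate the two-part form, the following sketch (assuming the saxon prefix is bound to http://saxon.sf.net/) overrides the default for HTML output:

```xml
<!-- First part "native": characters that ISO-8859-1 can encode are written as themselves.
     Second part "hex": characters outside the encoding become hexadecimal
     numeric character references. -->
<xsl:output method="html" encoding="iso-8859-1"
            saxon:character-representation="native;hex"/>
```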

saxon:double-space?

eqnames

When the output method is XML with indent="yes", the saxon:double-space attribute may be used to generate an extra blank line before selected elements. The value is a whitespace-separated list of element names. The attribute follows the same conventions as cdata-section-elements: values specified in separate xsl:output or xsl:result-document elements are cumulative, and if the value is supplied programmatically via an API, or from the command line, then the element names are given in Clark notation, namely {uri}local. The effect of the attribute is to cause an extra blank line to be output before the start tag of the specified elements.
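
For instance, with hypothetical element names chapter and section, and assuming the saxon prefix is bound to http://saxon.sf.net/:

```xml
<!-- An extra blank line is output before each <chapter> and <section> start tag -->
<xsl:output method="xml" indent="yes" saxon:double-space="chapter section"/>
```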

saxon:indent-spaces?

integer

When the output method is XML, HTML, or XHTML with indent="yes", the saxon:indent-spaces attribute may be used to control the amount of indentation. The default value in the absence of this attribute is 3.

saxon:line-length?

integer

Default value 80. With the XML output method, attributes are output on a new line if they would otherwise extend beyond this column position. With the HTML output method, text lines are split at this line length when possible. In releases 9.2 and earlier, the HTML output method attempted to split lines that exceeded 120 characters in length.
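
A sketch of raising the limit for XML output, assuming the saxon prefix is bound to http://saxon.sf.net/. The value 120 is arbitrary, and indent="yes" is included here as an assumption about when attribute wrapping applies:

```xml
<!-- Attributes wrap onto a new line only if they would extend beyond column 120 -->
<xsl:output method="xml" indent="yes" saxon:line-length="120"/>
```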

saxon:newline?

string

Default value x0A (a single newline character). Defines the string that is used by the text output method to represent a newline. The Windows line ending x0Dx0A (CRLF) may sometimes be preferred to the default.
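
One way to request Windows line endings, assuming the saxon prefix is bound to http://saxon.sf.net/, is to write the carriage return and line feed as character references so that they survive XML attribute-value normalization:

```xml
<!-- &#xD; is carriage return, &#xA; is line feed: together, CRLF -->
<xsl:output method="text" saxon:newline="&#xD;&#xA;"/>
```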

saxon:next-in-chain?

uri

XSLT only. Used to direct the output to another stylesheet. The value is the URL of a stylesheet that should be used to process the output stream. In this case the output stream must always be pure XML, and attributes that control the format of the output (e.g. method, cdata-section-elements, etc) will have no effect. The output of the second stylesheet will be directed to the destination that would have been used for the first stylesheet if no saxon:next-in-chain attribute were present.

This serialization property is available only on xsl:output declarations. It cannot be supplied as an attribute to xsl:result-document or in any of the various APIs that control serialization.

Supplying a zero-length string is equivalent to omitting the attribute, except that it can be used to override a previous setting.
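
A minimal sketch, assuming the saxon prefix is bound to http://saxon.sf.net/; the stylesheet name second-pass.xsl is hypothetical:

```xml
<!-- The result of this stylesheet is passed as XML to second-pass.xsl,
     whose own output goes to the original destination -->
<xsl:output saxon:next-in-chain="second-pass.xsl"/>
```

An importing stylesheet could then cancel the chaining by declaring saxon:next-in-chain="" in its own xsl:output, relying on the zero-length override described above.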

saxon:recognize-binary?

boolean

Relevant only when using the text output method. If set to yes, the processing instructions <?hex XXXX?> and <?b64 XXXX?> will be recognized; the value is taken as a hexBinary or base64 representation of a character string, encoded using the encoding in use by the serializer, and this character string will be output without validating it to ensure it contains valid XML characters. Also recognized are <?hex.EEEE XXXX?> and <?b64.EEEE XXXX?>, where EEEE is the name of the encoding of the base64 or hexBinary data: for example hex.ascii or b64.utf8.

This enables non-XML characters, notably binary zero, to be output.

For example, given <xsl:output method="text" saxon:recognize-binary="yes"/>, the following instruction:

<xsl:processing-instruction name="hex.ascii" select="'00'"/>

outputs the Unicode character with codepoint zero ("NUL"), while

<xsl:processing-instruction name="b64.utf8" select="securityKey"/>

outputs the value of the securityKey element, on the assumption that this is base64-encoded UTF-8 text.

saxon:require-well-formed?

boolean

Affects the handling of result documents that contain multiple top-level elements or top-level text nodes. The W3C specifications allow such a result document, even though it is not a well-formed XML document. It is, however, a well-formed external general parsed entity, which means it can be incorporated into a well-formed XML document by means of an entity reference.

The default is no. If the value is set to yes, and a SAX destination (for example a SAXResult, a JDOMResult, or a user-written ContentHandler) is supplied to receive the results of the transformation, then Saxon will report an error rather than sending a non-well-formed stream of SAX events to the ContentHandler. This attribute is useful when the output of the stylesheet is sent to a component (for example an XSL-FO rendering engine) that is not designed to accept non-well-formed XML result trees.
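
For example, a stylesheet whose output feeds such a component might declare the following, assuming the saxon prefix is bound to http://saxon.sf.net/:

```xml
<!-- Report an error instead of sending a non-well-formed event stream
     (multiple top-level elements or top-level text) to a SAX destination -->
<xsl:output method="xml" saxon:require-well-formed="yes"/>
```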

Note also that namespace undeclarations of the form xmlns:p="" (as permitted by XML Namespaces 1.1) are passed to the startPrefixMapping() method of a user-defined ContentHandler only if undeclare-prefixes="yes" is specified on xsl:output.

saxon:supply-source-locator?

boolean

Relevant only when output is sent to a user-written ContentHandler, that is, a SAXResult. It causes extra information to be maintained and made available to the ContentHandler for diagnostic purposes: specifically, the Locator that is passed to the ContentHandler via the setDocumentLocator method may be cast to a ContentHandlerProxyLocator, which exposes the method getContextItemStack(). This returns a java.util.Stack. The top item on the stack is the current context item, and below this are previous context items. Each item is represented by the interface net.sf.saxon.om.Item. If the item is a node, and if the node is one derived by parsing a source document with the line-numbering option enabled, then it is possible to obtain the URI and line number of this node in the original XML source.

For this to work, the code must be compiled with tracing enabled. This can be achieved by setting the option config.setCompileWithTracing(true) on the Configuration object, or equivalently by setting the property FeatureKeys.COMPILE_WITH_TRACING on the JAXP TransformerFactory. Note that this compile-time option imposes a substantial run-time overhead, even if tracing is not switched on at run-time by providing a TraceListener.

saxon:suppress-indentation?

eqnames

When the output method is XML with indent="yes", the saxon:suppress-indentation attribute may be used to suppress indentation for certain elements. The value is a whitespace-separated list of element names. The attribute follows the same conventions as cdata-section-elements: values specified in separate xsl:output or xsl:result-document elements are cumulative, and if the value is supplied programmatically via an API, or from the command line, then the element names are given in Clark notation, namely {uri}local. The effect of the attribute is to suppress indentation for the content of the specified elements, that is, no whitespace will be inserted within such elements, at any depth. The option is useful where elements contain mixed content in which whitespace is significant.
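
As an illustration with hypothetical element names para and title, and assuming the saxon prefix is bound to http://saxon.sf.net/:

```xml
<!-- The document is indented, but no whitespace is inserted anywhere
     inside <para> or <title> elements -->
<xsl:output method="xml" indent="yes" saxon:suppress-indentation="para title"/>
```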