saxonica.com

The saxon:character-representation attribute

This attribute allows greater control over how non-ASCII characters will be represented on output.

With method="xml", two values are supported: "decimal" and "hex". These control whether numeric character references are output in decimal or hexadecimal when the character is not available in the selected encoding.
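As an illustration, the following minimal stylesheet sketches how the attribute might be declared for the XML output method. The namespace URI bound to the saxon prefix is an assumption that holds for Saxon 8.x and later (older releases used a different URI), and the price element is purely hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Assumption: the saxon prefix is bound to the Saxon 8.x+ namespace.
       Characters not available in iso-8859-1 are requested as
       hexadecimal character references. -->
  <xsl:output method="xml" encoding="iso-8859-1"
              saxon:character-representation="hex"/>

  <xsl:template match="/">
    <!-- U+20AC (euro sign) is not present in iso-8859-1 -->
    <price>&#x20AC;10</price>
  </xsl:template>

</xsl:stylesheet>
```

With this declaration the euro sign would be expected to appear in the output as a hexadecimal character reference; changing the value to "decimal" should produce the decimal form instead.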

With method="html", the value may hold two strings, separated by a semicolon. The first string defines how non-ASCII characters within the character encoding will be represented; the permitted values are "native", "entity", "decimal", and "hex". The second string defines how characters outside the encoding will be represented; the permitted values are "entity", "decimal", and "hex". Here "native" means output the character as itself; "entity" means use a defined entity reference (such as "&amp;eacute;") if known; "decimal" and "hex" mean use a decimal or hexadecimal numeric character reference. For example, "entity;decimal" (the default) means that with encoding="iso-8859-1", characters in the range 160-255 will be represented using standard HTML entity references, while Unicode characters above 255 will be represented as decimal character references.
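The two-part value for the HTML output method can be sketched as follows. Again the saxon namespace URI is assumed to be the one used by Saxon 8.x and later, and the document content is hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- "native;hex": characters within iso-8859-1 are written as
       themselves; characters outside it are written as hexadecimal
       character references. -->
  <xsl:output method="html" encoding="iso-8859-1"
              saxon:character-representation="native;hex"/>

  <xsl:template match="/">
    <html>
      <body>
        <!-- U+00E9 is inside iso-8859-1; U+20AC is outside it -->
        <p>caf&#xE9; &#x20AC;10</p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Under the rules described above, U+00E9 would be written as the character itself and U+20AC as a hexadecimal reference. With the default "entity;decimal", U+00E9 would instead be written as an entity reference and U+20AC as a decimal reference.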

This attribute is retained for the time being in the interests of backwards compatibility. However, the latest XSLT 2.0 specification makes it technically a non-conformance to provide attributes that change serialization behavior except in cases where the behavior is implementation-defined, and this is not such a case: the specification, at least in the case of the XML output method, does not allow a character to be substituted with a character reference in cases where the character is present in the chosen encoding. The best way of ensuring that non-ASCII characters are output using character references is to use encoding="us-ascii".
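A minimal sketch of the recommended alternative, using only standard XSLT serialization attributes and no Saxon extension. The greeting element is hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- No non-ASCII character can be represented in us-ascii, so the
       serializer has to write each one as a character reference. -->
  <xsl:output method="xml" encoding="us-ascii"/>

  <xsl:template match="/">
    <greeting>caf&#xE9;</greeting>
  </xsl:template>

</xsl:stylesheet>
```

Because U+00E9 is not available in us-ascii, it is written as a numeric character reference in the text of the greeting element, without relying on any implementation-specific attribute.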
