This page summarizes the changes in 8.7.1. In addition to the changes listed here, all bugs listed on the SourceForge site under group v8.7.1 have been fixed.
New features
A new extension attribute saxon:allow-all-built-in-types="yes" has been added to enable the use of types such as xs:int which are not permitted by the W3C conformance rules for a Basic XSLT Processor. These types are already allowed by Saxon-SA, of course, but this switch also enables their use with Saxon-B. The particular use case that prompted this extension was Dimitre Novatchev's XPath 2.0 Visualizer tool, which uses dynamically-constructed XSLT stylesheets as a vehicle for exercising XPath expressions.
Saxon-B is now capable of taking as input a source tree that contains typed elements and attributes, provided that the type annotations are restricted to the built-in types. Such input can be supplied, for example, by sending the document to Saxon in the form of a sequence of Receiver events with type annotations included, or by creating a user-defined implementation of the NodeInfo interface. If it is known that all nodes will be untyped, it is worth calling the method Configuration.setAllNodesUntyped(true), because this information helps the compiler. This is done automatically when the XSLT or XQuery processor is invoked from the command line with Saxon-B.
A number of new output character encodings are now supported natively, including EUC-JP, EUC-KR, Big5, GB2312, ISO 8859-5, ISO 8859-7, ISO 8859-8, and ISO 8859-9. Thanks to Lauren Ward of Hewlett Packard for supplying these.
The DOM4J object model is now recognized in the same way as DOM, JDOM, and XOM. The code has been lifted from the Orbeon OPS server (it was originally written as a modification of the JDOM support module in Saxon). A few minor bugs have been fixed. Thanks to Erik Bruchez for identifying this opportunity.
The StaxBridge class now has a method that allows you to supply your own XMLStreamReader.
Previously, the default language for format-date() and related functions in XSLT was taken from the Java default locale. This has been changed so that a non-English language is used as the default only if (a) it is the language of the Java default locale, and (b) there is an installed numberer for that language. The effect of this change is to eliminate the warning output [Language: en] produced when the Java default locale is non-English but there is no localized numberer available for that language.
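The two-part rule above can be sketched in plain Java as follows. This is only an illustration of the decision, not Saxon's code: the class DefaultLanguageSketch, the method chooseDefaultLanguage, and the set of installed numberer languages are all hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the Saxon API: it illustrates the rule only.
final class DefaultLanguageSketch {

    // Returns the language to use when no language is requested explicitly.
    // installedNumberers is an assumed stand-in for the registry of localized numberers.
    static String chooseDefaultLanguage(Locale defaultLocale, Set<String> installedNumberers) {
        // Condition (a): the candidate is the language of the Java default locale.
        String lang = defaultLocale.getLanguage();
        // Condition (b): a numberer must be installed for that language.
        if (!"en".equals(lang) && installedNumberers.contains(lang)) {
            return lang;
        }
        // Otherwise fall back to English, so no [Language: en] warning is needed.
        return "en";
    }

    public static void main(String[] args) {
        Set<String> installed = new HashSet<String>(Arrays.asList("de", "fr"));
        System.out.println(chooseDefaultLanguage(Locale.GERMANY, installed));
        System.out.println(chooseDefaultLanguage(Locale.JAPAN, installed));
    }
}
```

With a German default locale and a German numberer installed, the sketch selects "de"; with a Japanese default locale and no Japanese numberer, it quietly selects "en".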
Problems fixed
The new code introduced in Saxon 8.7 for converting floating point numbers to strings was found to be unsatisfactory, and has been completely rewritten using a different algorithm.
An optimization used by the schema validator while constructing finite state machines to implement the schema grammar was found to be unsound in a very small number of cases; the optimization has therefore been removed. Unfortunately this means that compiling a schema is now a little slower.
In xsl:analyze-string, a check is now made for error XTDE1150 (regex matches a zero-length string) in the case where the regex is not known until run-time.
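A run-time check of this kind can be sketched with java.util.regex. The sketch assumes the regex has already been translated into Java regex syntax; the class ZeroLengthRegexCheck and its methods are hypothetical and are not Saxon's actual implementation.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not Saxon's actual code.
final class ZeroLengthRegexCheck {

    // True if the regex (assumed to be in Java syntax) matches the empty string.
    static boolean matchesZeroLengthString(String javaRegex) {
        return Pattern.compile(javaRegex).matcher("").matches();
    }

    // Check applied when the regex is only known at run-time.
    static void checkRegex(String javaRegex) {
        if (matchesZeroLengthString(javaRegex)) {
            throw new IllegalArgumentException(
                "XTDE1150: regex matches a zero-length string: " + javaRegex);
        }
    }
}
```

For example, checkRegex("[0-9]+") returns normally, whereas checkRegex("x?") throws, because "x?" matches the empty string.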
In XSLT, the attribute stable="yes" or stable="no" is now permitted on xsl:sort. It currently has no effect (sorting is always stable in Saxon). This is conformant behaviour, because the effect of stable="no" is implementation-dependent.
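For readers unfamiliar with the term, a stable sort keeps items with equal sort keys in their original relative order. The following plain Java sketch illustrates the property using Collections.sort, which is documented as stable; it says nothing about how Saxon sorts internally, and the class StableSortSketch and the "key:label" item format are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustration of sort stability only; not related to Saxon's sort implementation.
final class StableSortSketch {

    // Extracts the sort key from an item written as "key:label".
    static String keyOf(String item) {
        return item.substring(0, item.indexOf(':'));
    }

    // Sorts by key only. Items with equal keys keep their input order,
    // because Collections.sort is guaranteed to be stable.
    static List<String> sortByKey(List<String> items) {
        List<String> copy = new ArrayList<String>(items);
        Collections.sort(copy, new Comparator<String>() {
            public int compare(String a, String b) {
                return keyOf(a).compareTo(keyOf(b));
            }
        });
        return copy;
    }
}
```

Sorting the items b:1, a:2, b:3, a:4 by key yields a:2, a:4, b:1, b:3: within each key, the labels stay in input order.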
In XQuery, when a ModuleURIResolver is set on the StaticQueryContext for a main module, it is now also used for resolving module imports contained in any transitively-imported library modules.
The "tiny forest" mechanism, whereby a single TinyTree structure is used to hold multiple trees (root nodes) in a sequence, was found not to be working reliably in Saxon 8.7, and has been redesigned to make it more robust. Generally speaking, this mechanism reduces the number of objects that are allocated but increases their size; this may affect the performance profile of some applications.
The XSLT xsl:number instruction now recognizes non-BMP digits in its format string. (This works best with JDK 1.5; there are some restrictions under JDK 1.4.)
The TypeHierarchy object, which holds a cache of type information, is now held as part of the Configuration and no longer as part of the NamePool. This is to avoid memory leaks in cases where one long-lived NamePool was used with many transient Configuration objects. (This happened with the schema-aware product only, because user-defined types held in the TypeHierarchy hold a reference to the Configuration under which they were created.)
A change has been made to the way in which the XSLT current template rule is maintained. This
is to implement the rule that when a template is defined using a union pattern, it is treated
as a set of template rules with potentially different priorities. The xsl:next-match
instruction can therefore invoke the same template more than once. To implement this, the
currentTemplate maintained in the context is now a Rule object rather than a Template object.
In schema-aware processing, improvements have been made to the type inferencing. The type
of a path expression starting with a variable whose static type is document-node(schema-element(x))
is now inferred more precisely, and the cardinality of an expression using the child axis is also
now inferred more precisely. This enables better compile-time detection of type errors, and in some cases
better optimization.
On .NET, Saxon 8.7.1 is built using IKVM 0.26. The associated version of GNU Classpath fixes a number of bugs, including a serious one involving decimal arithmetic.
W3C language conformance
The component extraction functions get-years-from-duration(), get-months-from-duration(), etc., now operate on any xs:duration value, not only on an xdt:yearMonthDuration or xdt:dayTimeDuration value. (W3C Bugzilla 2934)
The type names dayTimeDuration, yearMonthDuration, untypedAtomic, untyped, and anyAtomicType are now recognized in the xs namespace http://www.w3.org/2001/XMLSchema as well as in the previous xdt namespace (in fact several versions of the xdt namespace are recognized). This situation is transitional: eventually only the XMLSchema namespace will be allowed.
The functions encode-for-uri() and iri-to-uri() have been modified according to the changes agreed in W3C Bugzilla 2457.
Casting from a derived type to a supertype is no longer a no-op. Although I believe that the language specification permits the previous behavior, it was controversial, and it seems better to do something that causes fewer surprises even if it is slower.
In schema-aware XQuery with multiple modules, error XQST0036 (an imported function or variable uses an unknown type) is now reported only if the function or variable is actually referenced in the importing module. See W3C Bugzilla 2546.
XSLT and XQuery error codes have been added for most validation errors. I have also started the process of incorporating XML Schema error codes as mandated by Appendix C of XML Schema Part 1.
Performance tuning
The internal MappingIterator and MappingFunction classes have been subdivided into three pairs of classes that provide different subsets of the functionality: ContextMappingIterator is used when each item being mapped becomes the context item; MappingIterator when this is not the case, and ItemMappingIterator when the mapping is from one input item to zero-or-one output items. This change was made to reduce code pathlengths in the most commonly used cases.
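To show why the zero-or-one case allows a shorter code path, here is a minimal sketch of such an iterator in plain Java. It is not Saxon's internal code: the names ItemMappingFunctionSketch and ItemMappingIteratorSketch, the use of java.util.Iterator, and the convention that null means "no output item" are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical interface: maps one input item to one output item, or to null for "no output".
interface ItemMappingFunctionSketch<A, B> {
    B mapItem(A item);
}

// Hypothetical iterator for the zero-or-one case. Because each input item yields
// at most one output item, no nested iterator over a result sequence is needed.
final class ItemMappingIteratorSketch<A, B> implements Iterator<B> {
    private final Iterator<A> base;
    private final ItemMappingFunctionSketch<A, B> function;
    private B pending; // next mapped item, or null if none has been computed yet

    ItemMappingIteratorSketch(Iterator<A> base, ItemMappingFunctionSketch<A, B> function) {
        this.base = base;
        this.function = function;
    }

    public boolean hasNext() {
        // Skip input items that map to nothing.
        while (pending == null && base.hasNext()) {
            pending = function.mapItem(base.next());
        }
        return pending != null;
    }

    public B next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        B result = pending;
        pending = null;
        return result;
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }

    // Convenience method for the example: applies the mapping to a whole list.
    static <A, B> List<B> mapAll(List<A> input, ItemMappingFunctionSketch<A, B> function) {
        List<B> out = new ArrayList<B>();
        Iterator<B> it = new ItemMappingIteratorSketch<A, B>(input.iterator(), function);
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

A general mapping iterator has to hold and drain a sub-iterator for each input item; in this sketch the single pending field takes its place.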
Handling of decimal values has been speeded up by using the JDK 1.5 method stripTrailingZeros if it is available (Saxon uses an equivalent but slower routine otherwise).
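The difference between the two paths can be sketched as follows. The fast path calls BigDecimal.stripTrailingZeros directly; the slow path is one possible hand-written equivalent and is not necessarily the routine Saxon uses. The class TrailingZerosSketch is a hypothetical name, and the run-time detection of which JDK is present is omitted.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical illustration; Saxon's own fallback routine may differ.
final class TrailingZerosSketch {

    // Fast path: available from JDK 1.5 onwards.
    static BigDecimal fast(BigDecimal d) {
        return d.stripTrailingZeros();
    }

    // Slow path: repeatedly divide the unscaled value by ten
    // while the remainder is zero, reducing the scale each time.
    static BigDecimal slow(BigDecimal d) {
        BigInteger unscaled = d.unscaledValue();
        int scale = d.scale();
        if (unscaled.signum() == 0) {
            return BigDecimal.ZERO;
        }
        while (true) {
            BigInteger[] quotientAndRemainder = unscaled.divideAndRemainder(BigInteger.TEN);
            if (quotientAndRemainder[1].signum() != 0) {
                break;
            }
            unscaled = quotientAndRemainder[0];
            scale--;
        }
        return new BigDecimal(unscaled, scale);
    }

    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("1.2300");
        System.out.println(fast(d) + " " + slow(d));
    }
}
```

Both methods reduce 1.2300 to 1.23; the slow path performs one BigInteger division per stripped zero, plus a final one that finds a non-zero remainder.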
There have been some improvements to the join optimizer in Saxon-SA, allowing hash joins to be used in some situations where they were not used previously.
Improvements have been made to the memoizing optimization used for <xsl:number level="any"/>.
API Change on .NET
In the .NET API, the BaseUri property of an XsltCompiler is now a Uri rather than a String. This change is for compatibility with the BaseUri property of the DocumentBuilder class.