DITA-OT Optimization

The DITA Open Toolkit (DITA-OT) is a generic framework for processing DITA documents, written in a mixture of XSLT 1.0 and 2.0. Because DITA uses tags within the @class attribute to describe the logical structure of a document, the XSLT stylesheets in the framework contain a very large number of templates with match patterns of the form:

<xsl:template match="*[contains(@class,' tag ')]">...

where the combination of the contains() function and the spaces around the tag string simulates a tokenized comparison. (Much of the code dates from XSLT 1.0, which lacked the tokenize() function.) In processing some example documents, up to 35% of the entire execution time is taken up by testing a few hundred very similar patterns.
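
For comparison, the same test can be written with an explicit tokenization in XSLT 2.0. The stylesheet below is a minimal sketch, not code taken from DITA-OT: the token 'topic/p' and the output element are chosen purely for illustration, and the two forms are assumed to agree only for @class values in the usual DITA shape, where every tag is surrounded by spaces.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Legacy XSLT 1.0 idiom, for reference:
       match="*[contains(@class, ' topic/p ')]" -->

  <!-- XSLT 2.0 form: split @class on whitespace and compare whole tokens.
       'topic/p' is an illustrative token, not a recommendation. -->
  <xsl:template match="*[tokenize(@class, '\s+') = 'topic/p']">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

Written this way, the pattern states directly that the match depends on a single whitespace-separated token of @class rather than on a substring of it.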

When DITA-OT optimization is enabled, Saxon-EE attempts to improve template selection performance by tokenizing the @class attribute on whitespace and finding the appropriate templates from an index derived from the tag values.

This technique is experimental and should be used only after testing representative documents for correctness, by comparing the outputs produced with the optimization enabled and disabled. Invoking Saxon-EE through the class com.saxonica.StatsTransform can help with this testing, and can indicate the level of performance improvement to be expected. For more details of this analysis tool, see Detailed Pattern-Matching Statistics.