Compiling a Stylesheet

Generally, the cost of analyzing the XSLT source code in a stylesheet and preparing it for execution can be high in relation to the cost of actually running the code to transform an individual source document, especially where the stylesheet is large and the source document is small. Saxon provides several capabilities designed to ensure that when you use the same stylesheet repeatedly, you only need to incur this overhead once.

In simple cases, you can exploit the ability to process an entire directory of source files using a single invocation of the Transform command on the command line.
Both the JAXP and s9api interfaces separate the process of compiling a stylesheet from the process of using it to transform a source document. (With JAXP the object representing the compiled stylesheet is the javax.xml.transform.Templates object; with s9api it is the XsltExecutable.) If you run transformations within a web service, it is always a good idea to cache the compiled form of the stylesheets it uses.
From Saxon 9.7, it is also possible to export the compiled form of a stylesheet as an XML file (called the stylesheet export file), in much the same way that object code from other languages is saved to filestore, and distributed from developers to users.
A related capability is the ability in Saxon-EE to generate bytecode (intermediate Java code) to improve the speed of stylesheet execution.

Caching compiled stylesheets in memory

The JAXP interface represents a compiled stylesheet as a Templates object. The object contains the entire stylesheet; all modules must be compiled as a single unit. JAXP was designed before packages were added to the XSLT 3.0 language. The Templates object is thread-safe, so once created it can be used by many transformations running separately in parallel. To use the Templates object to run a transformation of a particular source document, a Transformer object is created. The Transformer is not thread-safe; its transform() method must not be called while a transformation is active. The Transformer can be serially reused, but with Saxon there is no benefit in doing so; better garbage collection generally occurs if a new Transformer is created for each transformation.
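The pattern described above can be sketched with the standard JAXP classes alone. This is a minimal illustration, not Saxon's own code: the class name TemplatesCache, its method names, and the string cache key are all hypothetical, and the stylesheet is supplied as a string only to keep the example self-contained. It runs with whichever JAXP processor TransformerFactory.newInstance() finds.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: compiles each stylesheet once and reuses the Templates object.
final class TemplatesCache {
    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final Map<String, Templates> cache = new ConcurrentHashMap<>();
    private final AtomicInteger compilations = new AtomicInteger();

    // Returns the compiled stylesheet for this key, compiling it on first use only.
    Templates get(String key, String stylesheetText) throws TransformerException {
        Templates templates = cache.get(key);
        if (templates == null) {
            // JAXP does not promise a thread-safe factory, so compilation is serialized.
            synchronized (factory) {
                templates = cache.get(key);
                if (templates == null) {
                    templates = factory.newTemplates(
                            new StreamSource(new StringReader(stylesheetText)));
                    compilations.incrementAndGet();
                    cache.put(key, templates);
                }
            }
        }
        return templates;
    }

    // Runs one transformation with a fresh Transformer, which is never shared or reused.
    String transform(String key, String stylesheetText, String xml)
            throws TransformerException {
        Transformer transformer = get(key, stylesheetText).newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(out));
        return out.toString();
    }

    // Number of times a stylesheet was actually compiled.
    int compilations() {
        return compilations.get();
    }

    public static void main(String[] args) throws Exception {
        String xsl =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\">Hello, <xsl:value-of select=\"name\"/>"
            + "</xsl:template></xsl:stylesheet>";
        TemplatesCache cache = new TemplatesCache();
        System.out.println(cache.transform("greet", xsl, "<name>Ann</name>"));
        System.out.println(cache.transform("greet", xsl, "<name>Bob</name>"));
        System.out.println("compilations: " + cache.compilations());
    }
}
```

Because the Templates object is thread-safe, a single TemplatesCache instance of this kind could be shared by all request threads of a web service; only the Transformer is confined to one transformation.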

The s9api interface in its original form has a similar design: a compiled stylesheet is represented by an XsltExecutable object, and the instantiation of a stylesheet performing a single transformation by an XsltTransformer object. The s9api interface also adds a third class to the design, namely the XsltCompiler, which holds compile-time settings such as the base URI of the stylesheet, the values of static parameters, whether to generate bytecode, how to resolve references to modules (xsl:include/xsl:import), what schema definitions to use, and where to report compile-time errors. The XsltCompiler is also thread-safe, though the options in force should not be changed while the compiler is in use. Different XsltCompiler instances with different option settings can run concurrently with each other.

A preliminary implementation of XSLT 3.0 packages appeared in Saxon 9.6, with a much more complete implementation following in Saxon 9.7. A package may consist of a single module, or of a number of modules connected using xsl:include/xsl:import; a package is compiled as a unit, and may have references to other packages (via xsl:use-package) that are compiled independently. To allow independent compilation, there is much stronger control over the interfaces that a package exposes to the outside world, and over the ability of declarations in one package to override another. For example, if a function is declared to return an integer, then when compiling a call to that function, the compiler can be confident that any overriding declaration of the function will still return an integer result.

In the s9api interface, a package is represented by an XsltPackage object. The XsltCompiler has a method compilePackage which returns an XsltPackage if successful. The package may be made available for use by other packages being compiled, in the same or in a different XsltCompiler, by the XsltCompiler's importPackage method. When an xsl:use-package declaration is found while compiling one package, the compiler searches for a matching package among those that have been imported by the XsltCompiler in this way. It is possible to import several different versions of the same package, and the package-version attribute of xsl:use-package determines which of them is loaded.

The XsltPackage object, once created, is immutable and thread-safe. It is tied to a Saxon Configuration (or s9api Processor) but it can be imported by multiple XsltCompiler instances. If a common library package is used by many different stylesheets, it makes sense to define it as a reusable package, since this avoids the cost of compiling the code repeatedly, and avoids the need to keep multiple copies in memory.

Exporting Packages

From Saxon 9.7, a package, once compiled into an XsltPackage object, can be saved as a stylesheet export file using the save() method of the XsltPackage. The generated file is intended to be used for one purpose only, namely for reconstituting the XsltPackage at a different time and place. The format is XML, but its interpretation is not published and should not be considered stable. The file contains a checksum and cannot be loaded in the event of a checksum failure, so modifications to the content are not permitted. The content of the file is sufficiently far removed from the original source that distributing code in this form achieves a useful level of IP protection, though like Java bytecode, it is not intended to resist determined attempts at reverse engineering. Indeed, in the interests of run-time diagnostics, it preserves information such as variable names and line numbers that is not strictly needed at execution time.

The simplest way to generate an export file is from the command line:

java -jar dir/saxon9ee.jar -xsl:stylesheet.xsl -export:stylesheet.sef -nogo

Here, the option -nogo suppresses any attempt to execute the stylesheet.

Additionally, the -relocate:on option can be used to produce an export package which can be deployed to a different location, with a different base URI.

A stylesheet export file for a complete stylesheet (as distinct from a library package) is accepted by any Saxon interface that accepts a source stylesheet. For example, from the command line:

java -jar dir/saxon9ee.jar -xsl:stylesheet.sef -s:source.xml

A stylesheet export file is also needed when using the Saxon-JS product to run transformations in the browser. In this case the export file must be generated with the option -target:JS, because the format differs in minor ways: for example, for some constructs, such as node tests in path expressions and match patterns, the export file includes fragments of generated JavaScript code to speed evaluation.

When exporting a package, all components (templates, functions, etc.) from the packages it uses are also exported. It is therefore possible to export either an individual library package (typically having no dependencies on other packages) or a complete stylesheet (a package together with its tree of dependencies). As well as through the s9api interface, packages can be exported using the -export option on the net.sf.saxon.Transform command line. Packages can similarly be imported either by listing them in the -pack option of net.sf.saxon.Transform, or within s9api by use of the XsltCompiler methods loadLibraryPackage and loadExecutablePackage.

In the case of schema-aware stylesheets, the schema components needed by a stylesheet are not exported along with the stylesheet code. The user of the stylesheet needs to import the required schemas before the stylesheets can be loaded. The schema loaded at execution time must match the schema used when the stylesheet was compiled. Saxon is not draconian about checking this, and many minor changes will cause no trouble (for example, changing the regular expression used in a pattern facet). Structural changes that invalidate the assumptions made during XSLT compilation, however, are likely to cause execution to fail, not necessarily in predictable ways.

The computer on which the stylesheet is executed needs to have a Saxon license of sufficient capability to meet the requirements of the stylesheet. There are two ways this can be achieved. Either the run-time system can have a conventional Saxon license installed in the normal way, or it can take advantage of a license embedded within the exported stylesheet itself. Saxonica offers developers the option of purchasing a "developer master key" which, if installed, will cause all exported stylesheets to contain an embedded license key sufficient to execute the stylesheet in question. An embedded license key applies only to that stylesheet and cannot be used for any other code developed elsewhere; stylesheets that are exported with an embedded license can only be executed "as is", and cannot be incorporated as libraries into larger applications.

Exporting stylesheet packages requires Saxon-EE, optionally with the Developer Master Key if stylesheets with embedded license information are to be exported. Importing stylesheet packages requires the Saxon-PE or Saxon-EE software to make the package import possible, but no license key need be purchased unless the stylesheet to be executed uses licensable Saxon features. (This means that the run-time software needed to execute packaged code in this way is free-of-charge but not open source.)

Bytecode generation

When a stylesheet package is compiled into its in-memory representation, Saxon-EE by default generates Java bytecode for faster execution of selected parts of the code. The generated bytecode is mixed with interpreted code, each calling the other where appropriate.

The performance boost achieved by bytecode generation is variable; 25% is typical. The functions and templates that benefit the most are those where the expression tree contains many constructs that are relatively cheap in themselves, such as type conversion, comparisons, and arithmetic. This is because the saving from bytecode generation is mainly not in the cost of performing primitive operations, but in the cost of deciding which operations to perform: so the saving is greater where the number of operations is high relative to their average cost.

There are configuration options to suppress bytecode generation (FeatureKeys.GENERATE_BYTE_CODE), to insert debugging logic into the generated bytecode (FeatureKeys.DEBUG_BYTE_CODE), and to display the generated bytecode (FeatureKeys.DISPLAY_BYTE_CODE). See Configuration Features for more information.

Currently, exported packages do not include bytecode.