Compiling a Stylesheet
Generally, the cost of analyzing the XSLT source code in a stylesheet and preparing it for execution can be high in relation to the cost of actually running the code to transform an individual source document, especially where the stylesheet is large and the source document is small. Saxon provides several capabilities designed to ensure that when you use the same stylesheet repeatedly, you only need to incur this overhead once.
- In simple cases, you can exploit the ability to process an entire directory of source files using a single invocation of the Transform command on the command line.
- Both the JAXP and s9api interfaces separate the process of compiling a stylesheet from the process of using it to transform a source document. (With JAXP the object representing the compiled stylesheet is the Templates object; with s9api it is the XsltExecutable.) If you run transformations within a web service, it is always a good idea to cache the compiled form of the stylesheets it uses.
- From Saxon 9.7, it is also possible to export the compiled form of a stylesheet as an XML file (called the stylesheet export file), in much the same way that object code from other languages is saved to filestore, and distributed from developers to users.
- A related capability is the ability in Saxon-EE to generate bytecode (intermediate Java code) to improve the speed of stylesheet execution.
Caching compiled stylesheets in memory
The JAXP interface represents a compiled stylesheet as a Templates object. The Templates object contains the entire stylesheet; all modules must be compiled as a single unit. JAXP was designed before packages were added to the XSLT 3.0 language. The Templates object is thread-safe, so once created it can be used by many transformations running separately in parallel. To use the Templates object to run a transformation of a particular source document, a Transformer object is created. The Transformer is not thread-safe; its transform() method must not be called while a transformation is active. The Transformer can be serially reused, but with Saxon there is no benefit in doing so; better garbage collection generally occurs if a new Transformer is created for each transformation.
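The caching pattern described above can be sketched with the standard JAXP classes alone. This is a minimal illustration, not Saxon's own code: the class name TemplatesCache, its get and transform methods, and the string cache keys are all hypothetical. TransformerFactory.newInstance() returns whichever JAXP implementation is configured, so the sample stylesheet sticks to XSLT 1.0 to remain usable with any of them.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper: compiles each stylesheet once and reuses the Templates object. */
final class TemplatesCache {
    // Whichever JAXP implementation is configured; the factory itself is not
    // guaranteed to be thread-safe, so compilation is serialized below.
    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final Map<String, Templates> cache = new ConcurrentHashMap<>();

    /** Returns the compiled stylesheet for the key, compiling it on first request only. */
    Templates get(String key, String stylesheetText) {
        return cache.computeIfAbsent(key, k -> compile(stylesheetText));
    }

    private synchronized Templates compile(String stylesheetText) {
        try {
            return factory.newTemplates(new StreamSource(new StringReader(stylesheetText)));
        } catch (TransformerException e) {
            throw new IllegalStateException("Cannot compile stylesheet", e);
        }
    }

    /** Runs one transformation with a fresh Transformer created from the shared Templates. */
    String transform(String key, String stylesheetText, String sourceXml) {
        try {
            // A new Transformer per transformation: Transformers are not thread-safe.
            Transformer transformer = get(key, stylesheetText).newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(sourceXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xsl =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/greeting'>Hello, <xsl:value-of select='@to'/>!</xsl:template>"
          + "</xsl:stylesheet>";
        TemplatesCache cache = new TemplatesCache();
        // The stylesheet is compiled on the first iteration and reused on the second.
        for (String name : new String[] {"Ada", "Grace"}) {
            System.out.println(cache.transform("greeting", xsl, "<greeting to='" + name + "'/>"));
        }
    }
}
```

In a web service, a single TemplatesCache instance would typically be shared by all request threads, with each request calling transform() and so obtaining its own Transformer.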
The s9api interface in its original form has a similar design: a compiled stylesheet is represented by an XsltExecutable object, and the instantiation of a stylesheet performing a single transformation by an XsltTransformer object. The s9api interface also adds a third class to the design, namely the XsltCompiler, which holds compile-time options such as the base URI of the stylesheet, the values of static parameters, whether to generate bytecode, how to resolve references to modules (xsl:include/xsl:import), what schema definitions to use, and where to report compile-time errors. The XsltCompiler is also thread-safe, though the options in force should not be changed while the compiler is in use. Different XsltCompiler instances with different option settings can run concurrently with each other.
A preliminary implementation of XSLT 3.0 packages appeared in Saxon 9.6, with a much more complete implementation following in Saxon 9.7. A package may consist of a single module, or of a number of modules connected using xsl:include/xsl:import; a package is compiled as a unit, and may have references to other packages (via xsl:use-package) that are compiled independently. To allow independent compilation, there is much stronger control over the interfaces that a package exposes to the outside world, and over the ability of declarations in one package to override another. For example, if a function is declared to return an integer, then when compiling a call to that function, the compiler can be confident that any overriding declaration of the function will still return an integer result.
In the s9api interface, a package is represented by an XsltPackage object. The XsltCompiler has a method compilePackage which returns an XsltPackage if successful. The package may be made available for use by other packages being compiled, in the same or in a different XsltCompiler, by the XsltCompiler's importPackage method. When an xsl:use-package declaration is found while compiling one package, the compiler searches for a matching package among those that have been imported by the XsltCompiler in this way. It is possible to import several different versions of the same package, and the package-version attribute of xsl:use-package determines which of them is loaded.
The XsltPackage object, once created, is immutable and thread-safe. It is tied to a Saxon Configuration (or s9api Processor) but it can be imported by multiple XsltCompiler instances. If a common library package is used by many different stylesheets, it makes sense to define it as a reusable package, since this avoids the cost of compiling the code repeatedly, and avoids the need to keep multiple copies in memory.
Exporting Packages
From Saxon 9.7, a package, once compiled into an XsltPackage object, can be saved as a stylesheet export file using the save() method of the XsltPackage. The generated file is intended to be used for one purpose only, namely for reconstituting the XsltPackage at a different time and place. The format is XML, but its interpretation is not published and should not be considered stable. The file contains a checksum and cannot be loaded in the event of a checksum failure, so modifications to the content are not permitted. The content of the file is sufficiently far removed from the original source that distributing code in this form achieves a useful level of IP protection, though like Java bytecode, it is not intended to resist determined attempts at reverse engineering. Indeed, in the interests of run-time diagnostics, it preserves information such as variable names and line numbers that is not strictly needed at execution time.
The simplest way to generate an export file is from the command line:
java -jar dir/saxon9ee.jar -xsl:stylesheet.xsl -export:stylesheet.sef -nogo
Here, the option -nogo suppresses any attempt to execute the stylesheet.
A stylesheet export file for a complete stylesheet (as distinct from a library package) is accepted by any Saxon interface that accepts a source stylesheet. For example, from the command line:
java -jar dir/saxon9ee.jar -xsl:stylesheet.sef -s:source.xml
A stylesheet export file is also needed when using the Saxon-JS product to run transformations in the browser. In this case the export file must be generated with the option -target:JS because there are minor differences (for example, for some constructs, such as node tests in path expressions and match patterns, the export file actually includes fragments of generated JavaScript code to speed evaluation).
When exporting a package, all components (templates, functions, etc.) from the packages it uses are also exported. It is therefore possible to export either an individual library package (typically having no dependencies on other packages) or a complete stylesheet (a package together with its tree of dependencies). As well as through the s9api interface, packages can be exported using the -export option on the net.sf.saxon.Transform command line. Packages can similarly be imported either by listing them in the -pack option of net.sf.saxon.Transform, or within s9api by use of the XsltCompiler methods loadLibraryPackage and loadExecutablePackage.
In the case of schema-aware stylesheets, the schema components needed by a stylesheet are not exported along with the stylesheet code. The user of the stylesheet needs to import the required schemas before the stylesheets can be loaded. The schema loaded at execution time must match the schema used when the stylesheet was compiled. Saxon is not draconian about checking this, and many minor changes will cause no trouble (for example, changing the regular expression used in a pattern facet). Structural changes that invalidate the assumptions made during XSLT compilation, however, are likely to cause execution to fail, not necessarily in predictable ways.
The computer on which the stylesheet is executed needs to have a Saxon license of sufficient capability to meet the requirements of the stylesheet. There are two ways this can be achieved. Either the run-time system can have a conventional Saxon license installed in the normal way, or it can take advantage of a license embedded within the exported stylesheet itself. Saxonica offers developers the option of purchasing a "developer master key" which, if installed, will cause all exported stylesheets to contain an embedded license key sufficient to execute the stylesheet in question. An embedded license key applies only to that stylesheet and cannot be used for any other code developed elsewhere; stylesheets that are exported with an embedded license can only be executed "as is", and cannot be incorporated as libraries into larger applications.
Exporting stylesheet packages requires Saxon-EE, optionally with the Developer Master Key if stylesheets with embedded license information are to be exported. Importing stylesheet packages requires the Saxon-PE or Saxon-EE software to make the package import possible, but no license key need be purchased unless the stylesheet to be executed uses licensable Saxon features. (This means that the run-time software needed to execute packaged code in this way is free-of-charge but not open source.)
Bytecode generation
When a stylesheet package is compiled into its in-memory representation, Saxon-EE by default generates Java bytecode for faster execution of selected parts of the code. The generated bytecode is mixed with interpreted code, each calling the other where appropriate.
The performance boost achieved by bytecode generation is variable; 25% is typical. The functions and templates that benefit the most are those where the expression tree contains many constructs that are relatively cheap in themselves, such as type conversion, comparisons, and arithmetic. This is because the saving from bytecode generation is mainly not in the cost of performing primitive operations, but in the cost of deciding which operations to perform: so the saving is greater where the number of operations is high relative to their average cost.
There are configuration options to suppress bytecode generation (FeatureKeys.GENERATE_BYTE_CODE), to insert debugging logic into the generated bytecode (FeatureKeys.DEBUG_BYTE_CODE), and to display the generated bytecode (FeatureKeys.DISPLAY_BYTE_CODE). See Configuration Features for more information.
Currently (Saxon 9.7), exported packages do not include bytecode.