Saxonica: Technology FAQ

Answers to frequently asked questions

Why should I want to use a schema-aware XSLT or XQuery processor?

The short answer is that it makes it easier and faster to develop correct stylesheets and queries, especially when handling complex input or output vocabularies. By declaring the types of data that individual functions or templates are designed to manipulate, and by validating source and result data against a schema at each stage of processing, bugs are caught earlier in the development cycle. Incorrect queries and stylesheets typically produce an error message rather than simply producing incorrect output. Error messages are likely to refer to the exact point in the query or stylesheet where the error needs to be corrected, rather than simply reporting that the output of the query or stylesheet is invalid.
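As a rough illustration of just the "validate the result against a schema" step, the sketch below uses the javax.xml.validation API that ships with the JDK rather than Saxon's own schema processor. The one-element schema, the class name, and the validate helper are all invented for the example. A standalone validator like this can only report a problem in the output document; the benefit described above is that a schema-aware processor can relate the error back to the stylesheet or query.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class ResultValidationSketch {

    // Hypothetical schema for the example: <price> must hold an xs:decimal.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='price' type='xs:decimal'/>"
      + "</xs:schema>";

    // Returns null when the document is valid, otherwise the validator's message.
    // (A problem in the schema itself would be reported the same way.)
    static String validate(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, the first validity error is thrown.
            validator.validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(validate("<price>12.50</price>")); // valid input
        System.out.println(validate("<price>cheap</price>")); // not a decimal
    }
}
```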

For more information, see Michael Kay's article on the Stylus Studio web site.

I've heard XSLT is hard to learn. Is this true?

It really depends on your experience. If your background is in procedural programming, for example with Java or JavaScript, then some of the ideas in XSLT, such as its declarative, rule-based style, may be unfamiliar.

Mulberry Technology's paper Introduction to XSLT Concepts provides an excellent introduction.

It is easy enough to get started by looking at some existing code, but there is so much power in XSLT that it is well worth investing some effort into studying it in more depth; you will be much more productive in your use of the language as a result.

For an excellent beginner's guide, see Beginning XSLT 2.0 by Jeni Tennison.

Should I use XSLT or XQuery? Are the two languages in competition?

The two languages do have a high level of functional overlap, but each has unique strengths. XSLT 2.0 is better than XQuery 1.0 at rendering narrative (document-oriented) XML; for example, it offers formatting facilities such as format-number() and format-date(). XQuery makes it easier to perform some of the manipulations needed when handling more rigidly structured data. Saxon is unique in allowing the two languages to be mixed in a single application.

Does Saxon do static type-checking?

Saxon does not do static type-checking in the sense in which the term is used in the W3C language specifications, where it refers to pessimistic type checking: any construct that might fail at run-time is rejected at compile time. That form of checking is an optional feature of the W3C specifications. Saxon does, however, perform optimistic static analysis of queries and stylesheets, in which an error is reported only for constructions that must always fail at run-time. The information derived from this static analysis is also used to optimize the run-time code.
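The difference between the two approaches can be shown with a toy model. This is a simplified sketch, not Saxon's implementation: it treats a type as nothing more than a set of value kinds, and the names Kind, Verdict, optimistic, and pessimistic are invented for the example. Real XPath types also involve cardinality and a type hierarchy, which are ignored here.

```java
import java.util.EnumSet;
import java.util.Set;

class TypeCheckSketch {

    enum Kind { INTEGER, STRING, BOOLEAN }

    enum Verdict { ACCEPT, ACCEPT_WITH_RUNTIME_CHECK, REJECT }

    // inferred: the kinds of value the expression might produce.
    // required: the kinds of value the context will accept.

    // Optimistic: reject only when no possible value can satisfy the context,
    // that is, when the expression must always fail at run-time.
    static Verdict optimistic(Set<Kind> inferred, Set<Kind> required) {
        if (required.containsAll(inferred)) {
            return Verdict.ACCEPT;
        }
        boolean overlap = inferred.stream().anyMatch(required::contains);
        return overlap ? Verdict.ACCEPT_WITH_RUNTIME_CHECK : Verdict.REJECT;
    }

    // Pessimistic: reject whenever some possible value might fail at run-time.
    static Verdict pessimistic(Set<Kind> inferred, Set<Kind> required) {
        return required.containsAll(inferred) ? Verdict.ACCEPT : Verdict.REJECT;
    }

    public static void main(String[] args) {
        Set<Kind> inferred = EnumSet.of(Kind.INTEGER, Kind.STRING);
        Set<Kind> required = EnumSet.of(Kind.INTEGER);
        System.out.println(optimistic(inferred, required));
        System.out.println(pessimistic(inferred, required));
    }
}
```

In this model an expression that might yield either an integer or a string, used where an integer is required, is rejected by the pessimistic check but accepted by the optimistic one, subject to a check at run-time.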

Why didn't Saxon use the schema processor already available in Xerces?

Schema processing needs to be tightly integrated into a schema-aware XSLT or XQuery processor. By designing a new schema processor as an integral part of Saxon, it was possible to design data structures and interfaces that are optimized for use in this environment. For example, the schema processor and validator share the use of the Saxon NamePool for managing names and namespaces, and the validator is designed to slot into Saxon's SAX-like event processing pipeline used both for building source documents and for serializing output documents. This close integration also enables much better error reporting, something which is critical to the usability of a schema-aware XSLT or XQuery processor. Finally, Saxon has always prided itself on offering the highest possible level of adherence to W3C specifications, and this would not be possible if such a critical component were outside Saxon's direct control.

How do the products for different platforms relate to each other?

Saxon is offered on four technology platforms: Java, .NET, C, and JavaScript.

The first three (Java, .NET, and C) are derived from the same Java source code; the JavaScript product has its own code base:

  • Until version 10, Saxon for .NET was created by converting Java bytecode to the .NET equivalent (CIL) using IKVMC, an open-source cross-compiler. IKVMC, however, only supported .NET Framework, which has now been discontinued. From version 11, therefore, SaxonCS was created using an entirely different approach: the Java source code is translated into C# using a custom transpiler developed by Saxonica (largely in XSLT).
  • Saxon for the C platform (with language bindings for C++, PHP, and Python) was developed using the Excelsior JET tooling, which compiles Java "ahead of time" directly to machine code. (While this approach has proved successful and continues to work well, the tooling is no longer under active development and we will be looking at alternative mechanisms in the future.)
  • Saxon for JavaScript platforms (browsers and node.js) uses a completely separate code base, though much of the design is common. There is some interoperability, in that the XSLT compiler in the Java product can be used to generate an executable representation of a stylesheet (a SEF file) which the JavaScript product can load and execute.

Each product offers APIs adapted to the conventions of the relevant language (Java, C#, JavaScript, Python, etc.), takes advantage of the run-time capabilities of its platform, and integrates with other components, such as XML parsers, appropriate to that platform.

Will Saxon continue to be available free of charge?

The open-source development model has been very successful for Saxon, and the company Saxonica was established in order to make it possible to continue it. Saxonica aims to develop added-value options to the base Saxon technology that will be offered commercially, alongside the open-source product which will continue to meet the needs of most users.

In the twelve years since Saxonica was established, new releases of the open-source Saxon-HE product have continued to appear in parallel with the commercial Saxon-EE offering, and Saxon-HE users have benefited from many of the investments funded through Saxonica's commercial activities.

What are the differences between the open-source and commercial versions?

The open-source product, Saxon-HE, offers conformance to W3C standards at the minimum conformance level. The functionality and performance are good enough to build some serious applications, but the commercial products (PE, the professional edition, and EE, the enterprise edition) offer added capability that can improve scalability and reduce integration costs.

The focus in Saxon-PE is on extensibility. Saxon-PE includes a range of functional extensions to the W3C base standards developed by Saxonica (for example, libraries for SQL database access and for manipulating binary data), and it provides APIs allowing XSLT, XQuery, and XPath code to make calls to user-written Java or C# methods.

For Saxon-EE, the focus is on scalability. The main differences are:

  • Saxon-EE is schema-aware. This means it includes an XML Schema processor, and schema-aware XSLT 3.0 and XQuery 3.1 processors. The main benefit of schema-aware processing is that it makes it easier to find errors in large queries and stylesheets, leading to faster development and fewer bugs in deployed code. A schema-aware XSLT or XQuery processor takes advantage of the type information extracted when source documents are validated against a schema; this gives the potential for improved error reporting and improved performance. A schema-aware XSLT or XQuery processor also allows the result document to be validated as it is being written, so for example if the stylesheet or query generates incorrect XHTML, the error can be pinpointed to the place in the stylesheet or query that needs to be corrected.
  • Saxon-EE provides streamed processing as defined in the XSLT 3.0 specification. This allows source documents in the gigabyte to terabyte range to be transformed without running out of memory.
  • Saxon-EE allows stylesheets to be exported for execution elsewhere. This gives performance benefits when large stylesheets are used frequently; and it gives developers the ability to control how the code is used, protecting intellectual property and enabling change management. In addition, Saxon-EE allows critical parts of stylesheets or queries to be compiled as Java or .NET bytecode, giving a useful performance boost of typically 25-40%.
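To give a feel for why the streamed processing mentioned in the list above keeps memory use bounded, here is a minimal sketch. It is not XSLT 3.0 streaming and does not use Saxon at all: it uses the StAX pull parser that ships with the JDK, and the element name item, the attribute amount, and the class name are invented for the example. The point is only that the input is consumed one event at a time, so no tree of the whole document is ever built.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StreamingSketch {

    // Sums the (hypothetical) amount attribute of every <item> element.
    // Assumes each <item> carries an integer-valued amount attribute.
    static long sumAmounts(Reader in) {
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(in);
            long total = 0;
            while (reader.hasNext()) {
                // Only the current event is held; earlier content is discarded.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals("item")) {
                    total += Long.parseLong(reader.getAttributeValue(null, "amount"));
                }
            }
            reader.close();
            return total;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<items><item amount='2'/><item amount='40'/></items>";
        System.out.println(sumAmounts(new StringReader(xml)));
    }
}
```

The same loop works whether the input holds two items or many millions, because the only state kept between events is the running total.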

For more information, see Saxonica products.

Is source code available for the schema-aware version of Saxon?

No, this software is being made available on a commercial basis, protected by a license key. If you need access to source code for commercial reasons, please contact Saxonica to discuss possible licensing terms.

Does the commercial version of Saxon include all the open-source code?

Yes. The code from Saxon-HE is all present in Saxon-EE, without any modifications. Saxon-EE is built entirely by adding modules to Saxon-HE. Because the open source code is not modified, there is no need to publish separate Saxon-EE versions of these modules. In the terminology of the Mozilla Public License, Saxon-EE is a larger work rather than a modification.

Does Saxon support any languages other than English?

All the interfaces for developers are in English, but there is some localisation support in transformation to enable dates and numbers to be formatted in other Western European languages. So end-user output can be localised, but developer output cannot. Number and date formatting is currently available for: English, Danish, German, French, French (Belgium), Italian, Dutch, Flemish (Belgium) and Swedish.

Saxon includes APIs which allow support for additional languages to be developed. Should you wish to do so for a specific language, we are happy to provide advice and to incorporate the results into a future version of Saxon.

Who owns the IPR in Saxon?

The vast majority of the code was developed by Saxonica, and Saxonica therefore owns the copyright.

The open-source Saxon code is developed and released under the Mozilla Public License version 2.0, which you can obtain at http://www.mozilla.org/MPL/2.0/. The Mozilla license grants you a right to use the code, distribute it, and modify it, free of charge, for any purpose. Modifications to the code must be distributed under the same license as the original code. The Mozilla license is not viral: it allows you to incorporate Saxon in a commercial product without any requirement to make your own code open source.

The additional code in Saxon-PE and Saxon-EE is proprietary to Saxonica.

Prior to 2004, early versions of Saxon were produced by Michael Kay while working as an employee first of International Computers Limited (since merged into Fujitsu) and subsequently of Software AG. These early versions were released under the Mozilla Public License under the authority of those companies. Subsequent versions (including all commercial versions) of the software have all been released under the authority of Saxonica.

Over the years, some individual contributions to the open source code have been accepted from third parties. These are acknowledged in the documentation: see List of contributors. Currently Saxonica does not accept source code contributions without a formal assignment of copyright.

Saxon incorporates some open source components developed independently by third parties: an example is the sort routine. These components are all listed, together with their license conditions, in the documentation: see Third party source components.

What license do I need to run Saxon in the cloud?

It depends what you are doing.

If you're running a web site or an application that delivers information to internal or external users, or that enables them to interact with you, then we try to treat it in the same way as if you were running the web site in-house. Pricing is a bit complicated because there's no simple formula that works for everyone, so the rule we apply is that the charge we make is related to the amount you are paying for the cloud service, just as it would be if you bought your own hardware to run it. Please ask us for a quote: we can be flexible.

If you're using the cloud to deliver software-as-a-service, then we treat this as if you were shipping an application for customers to run on their own machines. That means you need to buy a redistribution license. The distinguishing feature for this scenario is that the application is yours, but the data belongs to your customer.
