Saxonica: Information
saxonica.com

Further information


Articles written for Stylus Studio

Saxonica has a close working relationship with the Stylus Studio team: Stylus Studio was the first XML development environment to offer Saxon-SA as a standard feature. As part of this collaboration, we wrote a regular column for their web site. The following articles have been published:

W3C Specifications

The main W3C specifications implemented by Saxon are listed below. Each of these documents contains many links to additional documents.

For information on Saxon's conformance to these specifications, see:

Saxon on SourceForge

The open-source version of Saxon is hosted on SourceForge.

Published Papers and Articles

Benchmarking XSLT Performance. This paper, by Michael Kay and Debbie Lockett, was presented at XML London 2014. It presents a new benchmarking framework for XSLT. The project, called XT-Speedo, is open source and we hope that it will attract a community of developers. The tangible deliverable consists of a set of test material, a set of test drivers for various XSLT processors, and tools for analyzing the test results. Underpinning these deliverables is a methodology and set of measurement objectives that influence the design and selection of material for the test suite, which are also described in this paper.

Streaming in the Saxon XSLT Processor. This paper was presented at XML Prague 2014. Streaming is a major new feature of the XSLT 3.0 specification, currently a Last Call Working Draft. This paper discusses streaming as defined in the W3C specification, and as implemented in Saxon. Streaming refers to the ability to transform a document that is too big to fit in memory, which depends on the transformation itself being in some sense linear, so that pieces of the output appear in the same order as the pieces of the input on which they depend. This constraint is reflected in the W3C specification by a set of streamability rules that determine statically whether a stylesheet is streamable or not. This paper gives a tutorial introduction to the streamability rules and the way they are implemented in Saxon. It then goes on to describe the architecture for implementing streaming in the Saxon run-time, by means of push pipelines, and gives the rationale for this choice of architecture.

XML on the Web: is it still relevant? This paper (presented at XML London 2013) discusses what is meant by the term XML on the Web and how this relates to the browser. The success of XSLT in the browser has so far been underwhelming; the paper examines the reasons for this and considers whether the situation might change. It describes the capabilities of the first XSLT 2.0 processor designed to run within web browsers, bringing not just the extra capabilities of a new version of XSLT, but also a new way of thinking about how XSLT can be used to create interactive client-side applications. Using this processor, the author demonstrates, as a use case, a technical documentation application that permits browsing and searching in an intuitive way, and shows its internals to illustrate how it works.

Multi-user interaction using client-side XSLT. This paper (presented at XML Prague 2013) describes two use-case applications to illustrate the capabilities of the first XSLT 2.0 processor designed to run within web browsers. The first is a technical documentation application, which permits browsing and searching in an intuitive way. The second is a multi-player chess game application; although it uses the same XSLT 2.0 processor as the first application, it is very different in purpose and design, in that it provides multi-user interaction on the GUI and implements communication via a social media network, namely Twitter.

The effects of bytecode generation in XSLT and XQuery. This paper (presented at Balisage 2011, Montreal) discusses the highly efficient expression optimization performed by XSLT and XQuery processors today, and presents the further speed improvements that can be gained by generating bytecode rather than interpreting queries directly. Although optimization produces the largest throughput gain, the gains from optimization and bytecode generation are orthogonal, and compilation can produce a gain of about 25% over and above the gains from optimization. Tests with two variants of a well-known XSLT/XQuery processor, one with code generation and one with optimization alone, demonstrate the effect on a range of queries.

A streaming XSLT processor. XSLT transformations can refer to any information in the source document from any point in the stylesheet, without constraint; XSLT implementations typically support this freedom by building a tree representation of the entire source document in memory and in consequence can process only documents which fit in memory. But many transformations can in principle be performed without storing the entire source tree. The paper (given at Balisage 2010, Montreal) reports on the progress of the W3C XSL Working Group implementation of a new version of XSLT, designed to make streamed implementations of XSLT feasible.

You pull, I’ll push: On the polarity of pipelines. This paper (given at Balisage 2009, Montreal) discusses the most effective way to move XML data through a processing pipeline. It draws on the concept of program inversion, originally developed to eliminate bottlenecks in magnetic-tape-based processes, and on ideas derived from Jackson Structured Programming which allow processes written in a convenient pull style to be compiled into push-style code, thus potentially reducing both coordination overhead and latency.
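As a rough illustration of the two polarities (a sketch in Python, not code taken from the paper or from Saxon; the function names `pull_upper` and `push_upper` are hypothetical), the same upper-casing stage can be written pull-style, as a generator that requests items from its upstream, or push-style, as a handler that upstream calls and that forwards results downstream:

```python
from typing import Callable, Iterable, Iterator, List


def pull_upper(source: Iterable[str]) -> Iterator[str]:
    """Pull style: the stage reads from upstream whenever its consumer asks for an item."""
    for item in source:
        yield item.upper()


def push_upper(downstream: Callable[[str], None]) -> Callable[[str], None]:
    """Push style: upstream calls the returned handler, which forwards to downstream."""
    def handle(item: str) -> None:
        downstream(item.upper())
    return handle


# Pull: the consumer (list) drives the pipeline.
pulled: List[str] = list(pull_upper(["a", "b"]))

# Push: the producer (the loop) drives the pipeline.
pushed: List[str] = []
stage = push_upper(pushed.append)
for item in ["a", "b"]:
    stage(item)
```

Both variants produce the same items; they differ only in which end of the pipeline holds the control loop.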

Ten Reasons why Saxon XQuery is Fast. A paper written for the IEEE Data Engineering Bulletin, included in a special issue published in December 2008 and devoted to papers on the state-of-the-art in XQuery implementation. Most of what the paper says is of course equally applicable to XSLT.

Writing an XSLT Optimizer in XSLT. This paper (given at Extreme Markup 2007) explores the possibility that since query optimization is an exercise in transforming expression trees, and XSLT is a language for transforming trees, it ought to be possible to write an optimizer in XSLT. (The rendition of the paper is poor because it has been only partially recovered after IDEAlliance, the conference organizers, withdrew their public archive of the conference proceedings.)

C24 White Paper: Using XQuery with Financial Messages. Back in 2006-7, Saxonica collaborated with C24 to enable Saxon to be used as the query engine within the C24 Integration Objects product. (The company was subsequently acquired by Iona, which in turn was acquired by Progress, but it is now independent again and trading under its old name. In 2013 we resumed the collaboration, and we hope to move the technology forward to take advantage of all the things that have happened in Saxon in the meantime.) This May 2007 paper describes how such an integration enables XQuery to be used to access non-XML data such as SWIFT financial messages, and to convert data between different formats.

Positional Grouping in XQuery. Published at the XIME-P 2006 XQuery workshop at the SIGMOD Conference in Chicago, this paper proposes an extension to XQuery to handle positional grouping problems, derived from experience with the xsl:for-each-group construct in XSLT 2.0.
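To give a flavour of what a positional grouping problem looks like, here is a minimal sketch in Python (it is not the syntax proposed in the paper; the helper `group_starting_with` and the sample data are hypothetical). It mimics the behaviour of the group-starting-with option of xsl:for-each-group: a new group begins at each item that satisfies a condition, and items that precede the first such item form an initial group of their own.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def group_starting_with(items: Iterable[T], starts: Callable[[T], bool]) -> List[List[T]]:
    """Split a sequence into groups, opening a new group at each item where starts(item) is true."""
    groups: List[List[T]] = []
    for item in items:
        if starts(item) or not groups:
            # A matching item (or the very first item) opens a new group.
            groups.append([item])
        else:
            # Any other item joins the group currently being built.
            groups[-1].append(item)
    return groups


# A flat sequence of headings and paragraphs, grouped into sections by heading.
flat = ["h1:Intro", "p:one", "p:two", "h1:Usage", "p:three"]
sections = group_starting_with(flat, lambda s: s.startswith("h1:"))
```

Here the grouping depends on the position of items relative to one another, not on a shared key value, which is what distinguishes positional grouping from value-based grouping.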

Using XSLT and XQuery for Life-Size Applications. This paper discusses the role of the XSLT 2.0 and XQuery 1.0 languages when it comes to writing real-life, sizeable applications for performing data transformations: especially factors such as error handling, debugging, performance, reuse and customization of code, relationships with XML Schema and other technologies such as XForms, and the use of pipeline-based application architectures.

Comparing XSLT and XQuery by Michael Kay. This paper was presented at XTech 2005 in Amsterdam. It compares XSLT and XQuery not just through a blow-by-blow feature comparison, but through an assessment of the suitability of the languages for different tasks, and of the kinds of users the two languages are aimed at.

Up-Conversion using XSLT 2.0 by Michael Kay. This paper was presented at XML 2004 in Washington DC. By means of a case study, it shows how some of the new features in XSLT 2.0 (notably the grouping instructions and the facilities for handling regular expressions) make XSLT 2.0 suitable for applications such as up-conversion (creating structured XML from unstructured input) that were quite infeasible in XSLT 1.0.

XSLT and XPath Optimization by Michael Kay. (.pdf format, 70kb) This paper, presented at XML Europe 2004 in Amsterdam, looked at the techniques used inside an XSLT processor (Saxon, of course!) to optimize performance. It described some of the techniques actually used in the Saxon processor, and surveyed other ideas coming from academia.

XML Five Years On (.pdf format, 144kb): a review of the achievements so far and the challenges ahead. Keynote address given by Michael Kay at the Document Engineering 2003 Conference in Grenoble, France.

XML & Co. - was bringt die Zukunft? Article in ComputerWoche (in German): XML begann als "SGML light" und sollte sich vor allem durch Einfachheit auszeichnen. Eine Reihe von Zusatzstandards erhöhten aber zwischenzeitlich die Komplexität beträchtlich. Während der Kernstandard weitgehend stabil bleibt, stehen in anderen Bereichen größere Änderungen bevor.

Saxon: Anatomy of an XSLT Processor by Michael Kay. This paper, although published as long ago as 2001, remains a frequently cited description of how XSLT processing in a product like Saxon actually works.

What kind of a language is XSLT? by Michael Kay. This paper, published at the same time as the one above, gives an overview of the capabilities of the XSLT language.

Reflections on open-source development. Some personal insights into the experience of undertaking open-source software development, in particular solo development as distinct from group development.

Demonstrations

In some of my tutorials and seminars I use a genealogy application to illustrate the features of XSLT 2.0. The files for this demonstration are available for download.

Books

XSLT 2.0 Programmer's Reference 4th edition by Michael Kay, published by Wrox Press. This book is widely recognized as the authoritative reference on the XSLT 2.0 language, second only to the W3C specification itself. It covers every feature of the language comprehensively, while at the same time explaining the concepts behind the language design, and giving many examples of practical stylesheets to illustrate each language feature.

Find it on amazon.com

Previous editions

Also available:

XQuery from the Experts: A Guide to the W3C XML Query Language http://www.amazon.com/exec/obidos/ASIN/0321180607

Eight chapters by members of W3C's Query Working Group provide an overview of XQuery designed to be of interest to programmers at every skill level. Coverage ranges from strictly technical subjects to historical essays on the language's ancestry and the process behind XQuery's design. The book presents its material in both tutorial and reference form.

Michael Kay's chapter provides a high-level comparison of XQuery and XSLT, looking both at the differences between the two languages and at their similarities.

A quote from a reader's review: "Chapter Three is especially helpful for understanding the similarities and differences between XQuery, XPath and XSLT. To really understand where XQuery fits, you must understand this interrelationship. Not only does Mr. Kay do a great job explaining that, he actually makes it fun to read."