Articles written for Stylus Studio
Saxonica has a close working relationship with the Stylus Studio team: Stylus Studio was the first
XML development environment to offer Saxon-SA as a standard feature. As part of this
collaboration, we wrote a regular column for their web site. The following articles
The main W3C specifications implemented by Saxon are listed below. Each of these
documents contains many links to additional documents.
For information on Saxon's conformance to these specifications, see:
Saxon on SourceForge
The open-source version of Saxon is hosted on SourceForge.
Published Papers and Articles
XSLT Performance. This paper, by Michael Kay and Debbie Lockett, was presented
at XML London 2014. It presents a new benchmarking framework for XSLT. The project,
called XT-Speedo, is open source and we hope that it will attract a community of
developers. The tangible deliverable consists of a set of test material, a set of
drivers for various XSLT processors, and tools for analyzing the test results.
Underpinning these deliverables are a methodology and a set of measurement objectives that
influence the design and selection of material for the test suite, which are also
described in this paper.
the Saxon XSLT Processor. This paper was presented at XML Prague 2014. Streaming
is a major new feature of the XSLT 3.0 specification, currently a Last Call Working
Draft. This paper discusses streaming as defined in the W3C specification, and as
implemented in Saxon. Streaming refers to the ability to transform a document that is
too big to fit in memory, which depends on the transformation itself being in some sense
linear, so that pieces of the output appear in the same order as the pieces of the input
on which they depend. This constraint is reflected in the W3C specification by a set of
streamability rules that determine statically whether a stylesheet is streamable or not.
This paper gives a tutorial introduction to the streamability rules and the way they
are implemented in Saxon. It then goes on to describe the architecture used to
implement streaming in the Saxon run-time, by means of push pipelines, and the
rationale for this choice of architecture.
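The idea of a linear, streamable transformation can be illustrated with a small sketch. It is written in Python using the standard library's incremental parser, not Saxon or XSLT, and it does not attempt to model the W3C streamability rules. The function name stream_titles and the element names books, book and title are assumptions made up for this example.

```python
import io
import xml.etree.ElementTree as ET


def stream_titles(source):
    """Yield the text of each <title> element in document order."""
    # iterparse reports each element when its end tag has been read,
    # so results can be emitted before the rest of the input is parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "title":
            yield elem.text
        # Drop the content of each finished element; only empty shells
        # remain attached to their parents.
        elem.clear()


# Hypothetical input document, used only for illustration.
sample = io.BytesIO(
    b"<books><book><title>A</title></book><book><title>B</title></book></books>"
)
titles = list(stream_titles(sample))
```

Each output item depends only on input that has already been read, and the outputs appear in the same order as the inputs they depend on, which is the property the abstract describes.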
XML on the
Web: is it still relevant?. This paper (presented at XML London 2013) discusses
what is meant by the term XML on the Web and how this relates to the browser. The
success of XSLT in the browser has so far been underwhelming, and it examines the
reasons for this and considers whether the situation might change. It describes the
capabilities of the first XSLT 2.0 processor designed to run within web browsers,
bringing not just the extra capabilities of a new version of XSLT, but also a new way of
thinking about how XSLT can be used to create interactive client-side applications.
Using this processor, the author demonstrates, as a use case, a technical documentation
application which permits browsing and searching in an intuitive way, and shows its
internals to illustrate how it works.
interaction using client-side XSLT. This paper (presented at XML Prague 2013)
describes two use-case applications to illustrate the capabilities of the first XSLT 2.0
processor designed to run within web browsers. The first is a technical documentation
application, which permits browsing and searching in an intuitive way. The second is a
multi-player chess game application; although it uses the same XSLT 2.0 processor as the
first application, it is very different in purpose and design, in that it provides
multi-user interaction on the GUI and implements communication via a social media
network, namely Twitter.
The effects of bytecode generation in XSLT and
XQuery. This paper (presented at Balisage 2011, Montreal) discusses the highly
efficient optimization of expressions in today's XSLT and XQuery processors, and presents
the further speed improvements that can be gained by generating bytecode rather than
interpreting queries directly. Although optimization produces the greatest throughput gains,
the gains from optimization and bytecode generation are orthogonal, and compilation can
produce a gain of about 25% over and above the gains from optimization. Tests with two variants
of a well-known XSLT/XQuery processor, one with code generation and one with
optimization alone, demonstrate the effect on a range of queries.
A streaming XSLT processor. XSLT
transformations can refer to any information in the source document from any point in
the stylesheet, without constraint; XSLT implementations typically support this freedom
by building a tree representation of the entire source document in memory and in
consequence can process only documents which fit in memory. But many transformations can
in principle be performed without storing the entire source tree. The paper (given at
Balisage 2010, Montreal) reports on the progress of the W3C XSL Working Group
implementation of a new version of XSLT, designed to make streamed implementations feasible.
You pull, I’ll push: On the polarity of
pipelines. This paper (given at Balisage 2009, Montreal) discusses the most
effective way to move XML data through a processing pipeline. It draws on the concept of
program inversion, originally developed to eliminate bottlenecks in magnetic-tape-based
processes, and on ideas derived from Jackson Structured Programming which allow processes
written in a convenient pull style to be compiled into push-style code, thus potentially
reducing both coordination overhead and latency.
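The difference between the two polarities can be sketched in a few lines. The example is in Python and is only an illustration, not code from the paper. The names pull_pairs and PushPairs are hypothetical, and the inversion is done by hand here rather than by a compiler. The stage pairs up adjacent items: in pull style the state lives in the control flow, while in the inverted push style it has to be kept in fields between calls.

```python
def pull_pairs(items):
    """Pull style: the stage asks its upstream for items when it wants them."""
    it = iter(items)
    for first in it:
        second = next(it, None)  # None marks a missing partner at the end
        yield (first, second)


class PushPairs:
    """Push style: the same logic inverted, driven by the upstream caller."""

    def __init__(self, downstream):
        self.downstream = downstream  # callable that receives each pair
        self.pending = None
        self.has_pending = False

    def receive(self, item):
        if self.has_pending:
            self.downstream((self.pending, item))
            self.has_pending = False
        else:
            self.pending = item
            self.has_pending = True

    def close(self):
        # Flush an unpaired final item, matching the pull version.
        if self.has_pending:
            self.downstream((self.pending, None))
            self.has_pending = False


pulled = list(pull_pairs([1, 2, 3]))

pushed = []
stage = PushPairs(pushed.append)
for value in [1, 2, 3]:
    stage.receive(value)
stage.close()
```

Both versions produce the same pairs. The pull version is easier to write, which is why compiling it automatically into the push form is attractive.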
Ten Reasons why Saxon XQuery is Fast. A paper written for the
IEEE Data Engineering Bulletin, included in a special issue published in December
and devoted to papers on the state of the art in XQuery implementation. Most of what the
paper says is of course equally applicable to XSLT.
Writing an XSLT Optimizer in XSLT. This paper (given at Extreme Markup 2007)
explores the possibility that since query optimization is an exercise in transforming
expression trees, and XSLT is a language for transforming trees, it ought to be possible
to write an optimizer in XSLT. (The rendition of the paper is poor because it has been
only partially recovered after IDEAlliance, the conference organizers, withdrew their
public archive of the conference proceedings.)
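The premise that optimization is tree transformation can be illustrated with a toy rewriter. This sketch is in Python rather than XSLT, and the tuple representation of expressions and the two rewrite rules are assumptions chosen for the example. They are not taken from the paper or from Saxon.

```python
def fold(expr):
    """Rewrite an expression tree bottom-up using two simple rules.

    An expression is a number, a variable name (a string), or a tuple
    (operator, left, right). This representation is hypothetical.
    """
    if not isinstance(expr, tuple):
        return expr  # leaf: nothing to rewrite
    op, left, right = expr
    left, right = fold(left), fold(right)
    numeric = (int, float)
    # Rule 1: add two constants at compile time.
    if op == "+" and isinstance(left, numeric) and isinstance(right, numeric):
        return left + right
    # Rule 2: multiplying by the constant 1 is a no-op.
    if op == "*" and right == 1:
        return left
    return (op, left, right)


# (x * 1) + (2 + 3) is rewritten to x + 5.
optimized = fold(("+", ("*", "x", 1), ("+", 2, 3)))
```

Each rule matches a pattern in the tree and replaces it with a simpler subtree, which is the same shape of work that an XSLT template rule performs.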
C24 White Paper: Using XQuery with Financial Messages. Back in 2006-7, Saxonica
collaborated with C24 to enable Saxon to be used as the query engine within the C24
Integration Objects product. (The company was subsequently acquired by Iona, which in
turn was acquired by Progress, but it is now independent again and trading under its own
name. In 2013 we resumed the collaboration and hope to move the technology forward to
take advantage of all the things that have happened in Saxon in the meantime.) This
2007 paper describes how such an integration enables XQuery to be used to access non-XML
data such as SWIFT financial messages, and to convert data between different
Grouping in XQuery. Published at the XIME-P 2006 XQuery workshop at the SIGMOD
Conference in Chicago, this paper proposes an extension to XQuery to handle positional
grouping problems, derived from experience with the xsl:for-each-group
construct in XSLT 2.0.
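For readers unfamiliar with the term, positional grouping means grouping items by where they occur in a sequence rather than by a shared key. The sketch below is in Python and is not the XQuery extension proposed in the paper. The function name group_starting_with is chosen to echo the group-starting-with attribute of xsl:for-each-group, and the sample data is invented.

```python
def group_starting_with(items, starts_group):
    """Split a sequence into groups, opening a new group at each item
    for which starts_group(item) is true. The first item always opens
    a group, whether or not it matches."""
    groups = []
    for item in items:
        if starts_group(item) or not groups:
            groups.append([item])
        else:
            groups[-1].append(item)
    return groups


# Hypothetical flat content: each heading is followed by its paragraphs.
flat = ["h1:Intro", "p:one", "p:two", "h1:Details", "p:three"]
sections = group_starting_with(flat, lambda s: s.startswith("h1:"))
```

A key-based grouping could not express this, because the paragraphs carry no value that identifies the heading they belong to; only their position does.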
Using XSLT and XQuery
for Life-Size Applications. This paper discusses the role of the XSLT 2.0 and
XQuery 1.0 languages when it comes to writing real-life, sizeable applications for
performing data transformations: especially factors such as error handling, debugging,
performance, reuse and customization of code, relationships with XML Schema and other
technologies such as XForms, and the use of pipeline-based application
XSLT and XQuery by Michael Kay. This paper was presented at XTech 2005 in
Amsterdam. It compares XSLT and XQuery not just using a blow-by-blow feature comparison,
but also through an assessment of the suitability of the languages for different tasks, and the
kind of users the two languages are aimed at.
Up-Conversion using XSLT 2.0 by Michael Kay. This paper was presented at XML
2004 in Washington DC. By means of a case study, it shows how some of the new features
in XSLT 2.0 (notably the grouping instructions and the facilities for handling regular
expressions) make XSLT 2.0 suitable for applications such as up-conversion (creating
structured XML from unstructured input) that were quite infeasible in XSLT 1.0.
XSLT and XPath
Optimization by Michael Kay. (.pdf format, 70kb) This paper, presented at XML
Europe 2004 in Amsterdam, looked at the techniques used inside an XSLT processor (Saxon,
of course!) to optimize performance. It described some of the techniques actually used
in the Saxon processor, and surveyed other ideas coming from academia.
XML Five Years On
(.pdf format, 144kb): a review of the achievements so far and the challenges
ahead. Keynote address given by Michael Kay at the Document Engineering 2003
Conference in Grenoble, France.
XML & Co. - was bringt die Zukunft? (XML & Co.: what does the future hold?) Article in
ComputerWoche (in German): XML began as "SGML light" and was meant to stand out above all
for its simplicity. In the meantime, however, a series of additional standards has
increased the complexity considerably. While the core standard remains largely stable,
major changes lie ahead in other areas.
Saxon: Anatomy of an
XSLT Processor by Michael Kay. This paper, although published as long ago as
2001, remains a frequently cited description of how XSLT processing in a product like
Saxon actually works.
What kind of a
language is XSLT? by Michael Kay. This paper, published at the same time as the
one above, gives an overview of the capabilities of the XSLT language.
on open-source development. Some personal insights into the experience of
undertaking open-source software development, in particular solo development as distinct
from group development.
In some of my tutorials and seminars I use a genealogy application to illustrate the
features of XSLT 2.0. The files for this demonstration are available for download.
XSLT 2.0 Programmer's Reference 4th edition by Michael Kay, published by Wrox
Press. This book is widely recognized as the authoritative reference on the XSLT 2.0
language, second only to the W3C specification itself. It covers every feature of the
language comprehensively, while at the same time explaining the concepts behind the
language design, and giving many examples of practical stylesheets to illustrate each
- The third edition was published in two separate volumes, covering XSLT 2.0
and XPath 2.0 separately. This edition was produced before the final specifications
were ratified by W3C, so there are some inaccuracies. The format (split into two
volumes) was not especially popular with readers, especially as many made the
mistake of buying the XSLT volume on its own, without realising that it relied
heavily on the reader also having access to the XPath book. Navigation in the book
was also difficult because of the absence of running heads for the alphabetical
chapters. The fourth edition corrects all these problems, and has received a much
more enthusiastic reception.
- The second edition remains in print, and is useful as the definitive
reference to the XSLT 1.0 language (though it does include some features from the
draft XSLT 1.1 specification, which W3C abandoned just before the book went to press).
- The first edition was published in April 2000, very soon after the XSLT 1.0
specification was ratified. It quickly established itself as the definitive guide to
the language and played a significant part in ensuring the rapid and successful
adoption of XSLT by the user community.
XQuery from the Experts: A Guide to the W3C XML Query Language
Eight chapters by members of W3C's Query Working Group provide an overview of XQuery
designed to be of interest to programmers at every skill level. Coverage ranges from
strictly technical subjects to historical essays on the language's ancestry and the
process behind XQuery's design. The book presents its material in both tutorial and
Michael Kay's chapter provides a high-level comparison of XQuery and XSLT, looking
at the differences between the two languages and at their similarities.
A quote from a reader's review: Chapter Three is especially helpful for understanding
the similarities and differences between XQuery, XPath and XSLT. To really
understand where XQuery fits, you must understand this interrelationship. Not only
does Mr. Kay do a great job explaining that, he actually makes it fun to