Michael Kay's Saxon diaries blog contains in-depth entries about a variety of topics relating to the current development of Saxon.
XSLT 2.0 Programmer's Reference 4th edition by Michael Kay, published by Wrox Press. This book is widely recognized as the authoritative reference on the XSLT 2.0 language, second only to the W3C specification itself. It covers every feature of the language comprehensively, while at the same time explaining the concepts behind the language design, and giving many examples of practical stylesheets to illustrate each language feature.
Michael Kay's XSLT 2.0 and XPath 2.0 (for XML, XSLT, and XPath) is some of the best money I've ever spent on XML-technology-related documentation - it is a fantastic piece of work.
— Bridger Dyson-Smith, posting on xsl-list, 2 August 2014
XQuery from the Experts: A Guide to the W3C XML Query Language http://www.amazon.com/exec/obidos/ASIN/0321180607
Eight chapters by members of W3C's Query Working Group provide an overview of XQuery designed to be of interest to programmers at every skill level. Coverage ranges from strictly technical subjects to historical essays on the language's ancestry and the process behind XQuery's design. The book presents its material in both tutorial and reference form.
Michael Kay's chapter provides a high-level comparison of XQuery and XSLT, looking both at the differences between the two languages and at their similarities.
Chapter Three is especially helpful for understanding the similarities and differences between XQuery, XPath and XSLT. To really understand where XQuery fits, you must understand this interrelationship. Not only does Mr. Kay do a great job explaining that, he actually makes it fun to read.
— A quote from a reader's review
Michael Kay and John Lumley. Presented at XML Prague 2019.
This paper discusses the implementation of an XSLT 3.0 compiler written in XSLT 3.0. XSLT is a language designed for transforming XML trees, and since the input and output of the compiler are both XML trees, compilation can be seen as a special case of the class of problems for which XSLT was designed. Nevertheless, the peculiar challenges of multi-phase compilation in a declarative language create performance challenges, and much of the paper is concerned with a discussion of how the performance requirements were met.
Michael Kay and John Lumley. "An XSLT compiler written in XSLT: can it perform?". XML Prague 2019. http://archive.xmlprague.cz/2019/files/xmlprague-2019-proceedings.pdf
Debbie Lockett and Adam Retter. Presented at XML Prague 2019.
XPDLs (XPath Derived Languages) such as XQuery and XSLT have been pushed beyond the envisaged scope of their designers. Perversions such as processing Binary Streams, File System Navigation, and Asynchronous Browser DOM Mutation have all been witnessed. Many of these novel applications of XPDLs intentionally incorporate non-sequential and/or concurrent evaluation and embrace side effects to achieve their purpose. To arrive at a solution for safely managing side effects and concurrent execution, this paper first surveys both the available XPDL vendor extensions and approaches offered in non-XPDLs, and then describes EXPath Tasks, a novel solution derived for the safe evaluation of side effects in XPDLs which respects both sequential and concurrent execution.
Debbie Lockett and Adam Retter. "Task Abstraction for XPath Derived Languages". XML Prague 2019. http://archive.xmlprague.cz/2019/files/xmlprague-2019-proceedings.pdf
Michael Kay. Presented at Markup UK 2018.
Michael Kay. "An XSD 1.1 Schema Validator Written in XSLT 3.0". Markup UK 2018. http://markupuk.org/2018/Markup-UK-2018-proceedings.pdf
O'Neil Delpratt and Debbie Lockett. Presented at XML Prague 2018.
In this paper, we discuss our experiences in developing Saxon-Forms, a new partial XForms implementation for browsers using "interactive" XSLT 3.0, and suggest some benefits of this implementation over others. Firstly we describe the mechanics of the implementation - how XForms features such as actions are implemented using the interactive XSLT extensions available with Saxon-JS, to update form data in the (X)HTML page, and handle user input using event handling templates. Secondly we discuss how Saxon-Forms can be used, namely by integrating it into the client-side XSLT of a web application, and examples of the advantages of this architecture. As a motivation and use case we use Saxon-Forms in our in-house license tool application.
O'Neil Delpratt and Debbie Lockett. "Implementing XForms using interactive XSLT 3.0". XML Prague 2018. http://archive.xmlprague.cz/2018/files/xmlprague-2018-proceedings.pdf
Michael Kay. Presented at XML Prague 2018.
A large class of XML transformations involves making fairly small changes to a document. The functional nature of the XSLT and XQuery languages means that data structures must be immutable, so these operations generally involve physically copying the whole document, including the parts that are unchanged, which is expensive in time and memory. Although efficient techniques are well known for avoiding these overheads with data structures such as maps, these techniques are difficult to apply to the XDM data model because of two closely-related features of that model: it exposes node identity (so a copy of a node is distinguishable from the original), and it allows navigation upwards in the tree (towards the root) as well as downwards. This paper proposes mechanisms to circumvent these difficulties.
Michael Kay. "XML Tree Models for Efficient Copy Operations". XML Prague 2018. http://archive.xmlprague.cz/2018/files/xmlprague-2018-proceedings.pdf
John Lumley. Presented at Balisage 2017, Washington DC.
Lumley, John, Debbie Lockett and Michael Kay. "Compiling XSLT3, in the browser, in itself." Presented at Balisage: The Markup Conference 2017, Washington, DC, August 1 - 4, 2017. In Proceedings of Balisage: The Markup Conference 2017. Balisage Series on Markup Technologies, vol. 19 (2017). doi:10.4242/BalisageVol19.Lumley01.
O'Neil Delpratt and Debbie Lockett. Presented at XML London 2017.
This paper presents work on improving an existing in-house License Tool application. The current tool is a server-side web application, using XForms in the front end. The tool generates licenses for the Saxon commercial products using server-side XSLT processing. Our main focus is to move parts of the tool's architecture client-side, by using "interactive" XSLT 3.0 with Saxon-JS. A beneficial outcome of this redesign is that we have produced a truly XML end-to-end application.
O'Neil Delpratt and Debbie Lockett. "Distributing XSLT Processing between Client and Server". Presented at XML London 2017, June 10 - 11th, 2017. doi:10.14337/XMLLondon17.Lockett01.
Michael Kay. Presented at XML Prague 2017.
This paper describes, compares, and contrasts two techniques designed to enable an XML document to be processed without building an entire tree representation of the document in memory. Document projection analyses a query to determine which parts of the document are relevant to the query, and discards everything else during source document parsing. Streaming attempts to execute a stylesheet "on the fly" while the source document is being read. For both techniques, the paper describes the way that they are implemented in the Saxon XSLT and XQuery engine. Performance results are given that apply to both techniques, in relation to the queries in the XMark benchmark applied to a 118Mb source document. The paper concludes with a discussion of ideas for combining the benefits of both techniques and getting more synergy between them.
Michael Kay. "Projection and Streaming: Compared, Contrasted, and Synthesized". XML Prague 2017. http://archive.xmlprague.cz/2017/files/xmlprague-2017-proceedings.pdf
John Lumley, Debbie Lockett, Michael Kay. Presented at XML Prague 2017.
This paper discusses the implementation of an XPath 3.1 processor with high levels of
standards compliance that runs entirely within current modern browsers. The runtime
pre-compiled XSLT 3.0 stylesheets, is extended with a dynamic XPath parser and
converter to the Saxon-JS compilation format. This is used to support both XSLT's
which supports XPath outside an XSLT context.
John Lumley, Debbie Lockett, and Michael Kay. "XPath 3.1 in the Browser". XML Prague 2017. http://archive.xmlprague.cz/2017/files/xmlprague-2017-proceedings.pdf
John Lumley. Presented at Balisage 2016, Washington DC.
This paper discusses transforming a CSS stylesheet into an XSLT transform that projects an approximation of the styling from the CSS onto a target XML document. It was developed during several XSLT-based projects involving multi-dialect XML documents, where there was a need either to evaluate CSS properties for another external tool, such as in an HTML → XSL-FO → PDF pipeline, or where a document styling needed to be "fixed" for embedding in another document, such as examples in professional papers. The paper presents examples, explains the general architecture of the generated XSLT transform, discusses how that transform is itself constructed from the CSS stylesheet and outlines the strengths and weaknesses and some of the directions in which the tool could be developed. It is approximate in that it only supports some of the core CSS features, assumes the user is "skilled in the art" and is working with CSS stylesheets that are understood and visible, and that the execution speed of the CSS "projection" is not an issue. Nevertheless, in the author's experience the ability to mix CSS styling into the "XSLT researcher's toolbox" has proved to be of some utility.
Lumley, John. "Approximate CSS Styling in XSLT". Presented at Balisage: The Markup Conference 2016, Washington, DC, August 2 - 5, 2016. In Proceedings of Balisage: The Markup Conference 2016. Balisage Series on Markup Technologies, vol. 17 (2016). doi:10.4242/BalisageVol17.Lumley01.
Debbie Lockett and Michael Kay. Presented at Balisage 2016, Washington DC.
Lockett, Debbie, and Michael Kay. "Saxon-JS: XSLT 3.0 in the Browser." Presented at Balisage: The Markup Conference 2016, Washington, DC, August 2 - 5, 2016. In Proceedings of Balisage: The Markup Conference 2016. Balisage Series on Markup Technologies, vol. 17 (2016). doi:10.4242/BalisageVol17.Lockett01.
Michael Kay. Presented at XML Prague 2016.
The XSLT 3.0 and XPath 3.1 specifications, now at Candidate Recommendation status, introduce capabilities for importing and exporting JSON data, either by converting it to XML, or by representing it natively using new data structures: maps and arrays. The purpose of this paper is to explore the usability of these facilities for tackling some practical transformation tasks. Two representative transformation tasks are considered, and solutions for each are provided either by converting the JSON data to XML and transforming that in the traditional way, or by transforming the native representation of JSON as maps and arrays. The exercise demonstrates that the absence of parent or ancestor axes in the native representation of JSON means that the transformation task needs to be approached in a very different way.
Kay, Michael. "Transforming JSON using XSLT 3.0". XML Prague 2016. http://archive.xmlprague.cz/2016/files/xmlprague-2016-proceedings.pdf
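As a loose illustration of the abstract's point about missing parent and ancestor axes (this is not code from the paper), the Python sketch below walks parsed JSON. A dict or list entry holds no reference to its container, so any context from higher up has to be passed down explicitly, here as a path tuple. The function name `walk` and the sample document are assumptions made for this sketch.

```python
import json

def walk(value, path=()):
    """Yield (path, leaf) pairs for every leaf value.

    The path is carried down explicitly because a dict or list entry
    has no reference back to the container that holds it.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            yield from walk(child, path + (key,))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from walk(child, path + (index,))
    else:
        yield path, value

# Hypothetical sample data, invented for this sketch.
doc = json.loads('{"order": {"id": 7, "lines": [{"sku": "A"}, {"sku": "B"}]}}')
leaves = list(walk(doc))
```

Each leaf arrives with the full path that led to it, which plays the role that an upward axis would play in an XML tree.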
John Lumley. Presented at Balisage 2015, Washington DC.
This paper discusses automated methods of 'downgrading' XSLT 3.0 programs into XSLT 2.0 syntax and semantics. The stimulus was running portions of a document processing system, that had been upgraded to use more coherent features of XSLT 3.0, in the environment of a browser-based standards-compliant XSLT 2.0 implementation (Saxon-CE). The work involves detailed knowledge of XSLT and is intended to automate significant sections of the 'downconversion', leaving other sections to conditional compilation directives. All conversion tools are of course written in XSLT and several aspects involve partial processing and evaluation of XSLT semantics within XSLT.
Lumley, John. "Two from Three (in XSLT)". Presented at Balisage: The Markup Conference 2015, Washington, DC, August 11 - 14, 2015. In Proceedings of Balisage: The Markup Conference 2015. Balisage Series on Markup Technologies, vol. 15 (2015). doi:10.4242/BalisageVol15.Lumley01.
John Lumley and Michael Kay. Presented at XML London 2015 and again at Balisage 2015, Washington DC.
This paper discusses improving the performance of XSLT programs that use very large numbers of similar patterns in their push-mode templates. The experimentation focusses around stylesheets used for processing DITA document frameworks, where much of the document logical structure is encoded in @class attributes. The processing stylesheets, often defined in XSLT 1.0, use string-containment tests on these attributes to describe push-template applicability. For some cases this can mean a few hundred string tests have to be performed for every element node in the input document to determine which template to evaluate, which sometimes means up to 30% of the entire processing time is taken up with such pattern matching. The paper examines methods, within XSLT implementations, to ameliorate this situation, including using sets of pattern preconditions and pretokenization of the class-describing attributes. How such optimisation may be configured for an XSLT implementation is discussed.
Dr. John Lumley and Dr. Michael Kay. "Improving Pattern Matching Performance in XSLT". Presented at XML London 2015, June 6 - 7th, 2015. doi:10.14337/XMLLondon15.Lumley01.
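The pretokenization idea mentioned in the abstract can be sketched roughly as follows. This is an illustration in Python, not the mechanism used inside any XSLT processor: the class attribute is split into tokens once per element, and templates are found by dictionary lookup instead of running every string-containment test. The rule table and the names `match_naive`, `build_index`, and `match_templates` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rule table: each "template" applies when the class
# attribute contains the given token, as in DITA @class matching.
RULES = [
    ("topic/p", "paragraph-template"),
    ("topic/ul", "list-template"),
    ("topic/li", "item-template"),
]

def match_naive(class_attr):
    """Baseline: one containment test per rule for every element."""
    padded = " " + class_attr + " "
    return [name for token, name in RULES if " " + token + " " in padded]

def build_index(rules):
    """Map each token to the templates that require it (built once)."""
    index = defaultdict(list)
    for token, name in rules:
        index[token].append(name)
    return index

def match_templates(class_attr, index):
    """Tokenize the attribute once, then do one lookup per token."""
    matched = []
    for token in class_attr.split():
        matched.extend(index.get(token, []))
    return matched

INDEX = build_index(RULES)
```

With a few hundred rules, the naive version performs a few hundred substring tests per element, while the indexed version performs one lookup per token in the attribute.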
Michael Kay. Presented at XML Prague 2015.
One of the supposed benefits of using declarative languages (like XSLT) is the potential for parallel execution, taking advantage of the multi-core processors that are now available in commodity hardware. This paper describes recent developments in one popular XSLT processor, Saxon, which start to exploit this potential. It outlines the challenges in implementing parallel execution, and reports on the benefits that have been observed.
Kay, Michael. "Parallel Processing in the Saxon XSLT Processor". XML Prague 2015. http://archive.xmlprague.cz/2015/files/xmlprague-2015-proceedings.pdf
John Lumley. Presented at Balisage 2014, Washington DC.
Determining streamability of constructs in XSLT 3.0 involves the application of a set of rules that appear to be complex. A tool that analyses these rules on a given stylesheet has been developed to help developers understand why sections which were designed with streaming might fail the required conditions. This paper discusses the structure of this analysis tool.
Lumley, John. "Analysing XSLT Streamability". Presented at Balisage: The Markup Conference 2014, Washington, DC, August 5 - 8, 2014. In Proceedings of Balisage: The Markup Conference 2014. Balisage Series on Markup Technologies, vol. 13 (2014). doi:10.4242/BalisageVol13.Lumley01.
Michael Kay and Debbie Lockett. Presented at XML London 2014.
This paper presents a new benchmarking framework for XSLT. The project, called XT-Speedo, is open source and we hope that it will attract a community of developers. The tangible deliverable consists of a set of test material, a set of test drivers for various XSLT processors, and tools for analyzing the test results. Underpinning these deliverables is a methodology and set of measurement objectives that influence the design and selection of material for the test suite, which are also described in this paper.
Dr. Michael Kay and Dr. Debbie Lockett. "Benchmarking XSLT Performance". Presented at XML London 2014, June 7 - 8th, 2014. doi:10.14337/XMLLondon14.Kay01.
Michael Kay. Presented at XML Prague 2014.
Streaming is a major new feature of the XSLT 3.0 specification, currently a Last Call Working Draft. This paper discusses streaming as defined in the W3C specification, and as implemented in Saxon. Streaming refers to the ability to transform a document that is too big to fit in memory, which depends on the transformation itself being in some sense linear, so that pieces of the output appear in the same order as the pieces of the input on which they depend. This constraint is reflected in the W3C specification by a set of streamability rules that determine statically whether a stylesheet is streamable or not. This paper gives a tutorial introduction to the streamability rules and the way they are implemented in Saxon. It then goes on to describe the implementation architecture for implementing streaming in the Saxon run-time, by means of push pipelines, and gives the rationale for this choice of architecture.
Kay, Michael. "Streamability in Saxon". XML Prague 2014. http://archive.xmlprague.cz/2014/files/xmlprague-2014-proceedings.pdf
John Lumley. Presented at XML Prague 2014.
This paper discusses issues and lessons that arose during the finalisation of a standard (library) for XSLT/XPath/XQuery extension functions to manipulate binary data. This process took place during 2013 in the EXPath community, through shared (mailing-list) commenting, specification redrafting, implementation experimentation and test suite development. The purpose, form and specification of the library (which isn’t technically difficult) are described briefly. Lessons and suggestions arising from the development are presented in four broad categories: establishing policies, concurrent implementation and application, using tools and declarative approaches, and pragmatic issues. None of these lessons are new, but bear reinforcement. This work was performed under the auspices of the EXPath community and was funded by Saxonica Ltd.
Lumley, John. "Finalising a (small) Standard". XML Prague 2014. http://archive.xmlprague.cz/2014/files/xmlprague-2014-proceedings.pdf
O'Neil Delpratt. Presented at XML London 2013.
This paper discusses what is meant by the term XML on the Web and how this relates to the browser. The success of XSLT in the browser has so far been underwhelming, and it examines the reasons for this and considers whether the situation might change. It describes the capabilities of the first XSLT 2.0 processor designed to run within web browsers, bringing not just the extra capabilities of a new version of XSLT, but also a new way of thinking about how XSLT can be used to create interactive client-side applications. Using this processor, the author demonstrates as a use-case, a technical documentation application which permits browsing and searching in an intuitive way and shows its internals to illustrate how it works.
O'Neil Delpratt. "XML on the Web: Is it still relevant?". Presented at XML London 2013, June 15 - 16th, 2013. doi:10.14337/XMLLondon13.Delpratt01.
O'Neil Delpratt and Michael Kay. Presented at XML Prague 2013.
This paper describes two use-case applications to illustrate the capabilities of the first XSLT 2.0 processor designed to run within web browsers. The first is a technical documentation application, which permits browsing and searching in an intuitive way. The second is a multi-player chess game application; using the same XSLT 2.0 processor as the first application, it is in fact very different in purpose and design in that it provides multi-user interaction on the GUI and implements communication via a social media network: namely Twitter.
O'Neil Delpratt and Michael Kay. "Multi-user interaction using client-side XSLT". XML Prague 2013. http://archive.xmlprague.cz/2013/files/xmlprague-2013-proceedings.pdf
O'Neil Delpratt and Michael Kay. Presented at Balisage 2011, Montréal.
This paper discusses highly efficient optimization of expression with XSLT and XQuery processors today and presents further speed improvements that can be gained by generating bytecode rather than interpreting queries directly. Although optimization produces the most throughput gain, the gains from optimization and bytecode generation are orthogonal, and compilation can produce about 25% gain over and above gains from optimization. Tests with two variants of a well-known XSLT/XQuery processor, one with code generation and one with optimization alone, demonstrate the effect on a range of queries.
Delpratt, O'Neil Davion, and Michael Kay. "The Effects of Bytecode Generation in XSLT and XQuery". Presented at Balisage: The Markup Conference 2011, Montréal, Canada, August 2 - 5, 2011. In Proceedings of Balisage: The Markup Conference 2011. Balisage Series on Markup Technologies, vol. 7 (2011). doi:10.4242/BalisageVol7.Delpratt01.
XSLT transformations can refer to any information in the source document from any point in the stylesheet, without constraint; XSLT implementations typically support this freedom by building a tree representation of the entire source document in memory and in consequence can process only documents which fit in memory. But many transformations can in principle be performed without storing the entire source tree. The paper (given at Balisage 2010, Montréal) reports on the progress of the W3C XSL Working Group implementation of a new version of XSLT, designed to make streamed implementations of XSLT feasible.
Kay, Michael. "A Streaming XSLT Processor". Presented at Balisage: The Markup Conference 2010, Montréal, Canada, August 3 - 6, 2010. In Proceedings of Balisage: The Markup Conference 2010. Balisage Series on Markup Technologies, vol. 5 (2010). doi:10.4242/BalisageVol5.Kay01.
This paper (given at Balisage 2009, Montréal) discusses the most effective way to move XML data through a processing pipeline. It draws on the concept of program inversion, originally developed to eliminate bottlenecks in magnetic-tape-based processes, and ideas derived from Jackson Structured Programming which allow processes written in a convenient pull style to be compiled into push-style code; thus potentially reducing both coordination overhead and latency.
Kay, Michael. "You Pull, I’ll Push: on the Polarity of Pipelines". Presented at Balisage: The Markup Conference 2009, Montréal, Canada, August 11 - 14, 2009. In Proceedings of Balisage: The Markup Conference 2009. Balisage Series on Markup Technologies, vol. 3 (2009). doi:10.4242/BalisageVol3.Kay01.
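The pull and push styles contrasted in the abstract above can be illustrated with a small Python analogy. This is not the compilation technique the paper describes: Python generators simply make it possible to write the same stage in either polarity. In the pull version the stage asks its upstream for items. In the push version the upstream hands items over with `send()`. All function names here are invented for the sketch.

```python
def pull_uppercase(source):
    """Pull style: this stage asks its upstream for the next item."""
    for item in source:
        yield item.upper()

def push_uppercase(sink):
    """Push style: the upstream stage calls send() to hand over each item."""
    while True:
        item = yield              # wait until the caller pushes an item in
        sink.append(item.upper())

def run_push(items):
    """Drive the push-style stage from a plain loop and collect its output."""
    out = []
    stage = push_uppercase(out)
    next(stage)                   # advance to the first `yield`
    for item in items:
        stage.send(item)
    stage.close()
    return out
```

Both `list(pull_uppercase(items))` and `run_push(items)` produce the same result; what differs is which side of the pipeline holds the control loop.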
A paper written for the IEEE Data Engineering Bulletin, included in a special issue published in December 2008 and devoted to papers on the state-of-the-art in XQuery implementation. Most of what the paper says is of course equally applicable to XSLT.
This paper (given at Extreme Markup 2007) explores the possibility that since query optimization is an exercise in transforming expression trees, and XSLT is a language for transforming trees, it ought to be possible to write an optimizer in XSLT. (The rendition of the paper is poor because it has been only partially recovered after IDEAlliance, the conference organizers, withdrew their public archive of the conference proceedings.)
Back in 2006-7, Saxonica collaborated with C24 to enable Saxon to be used as the query engine within the C24 Integration Objects product. (The company was subsequently acquired by Iona, which in turn was acquired by Progress, but it is now independent again and trading under its old name. In 2013 we resumed the collaboration and hope to move the technology forward to take advantage of all the things that have happened in Saxon in the meantime.) This May 2007 paper describes how such an integration enables XQuery to be used to access non-XML data such as SWIFT financial messages, and to convert data between different formats.
Published at the XIME-P 2006 XQuery workshop at the SIGMOD Conference in Chicago, this paper proposes an extension to XQuery to handle positional grouping problems, derived from experience with the xsl:for-each-group construct in XSLT 2.0.
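To show what a positional grouping problem looks like, here is a minimal Python sketch of the behaviour of the group-starting-with option of xsl:for-each-group. It is neither the XQuery extension proposed in the paper nor Saxon's implementation. A new group begins at every item that satisfies the test, and the first item always begins a group. The function name and the sample data are assumptions for illustration.

```python
def group_starting_with(items, starts_group):
    """Split a sequence into consecutive groups.

    A new group begins at each item for which starts_group(item) is true;
    the first item always begins a group, whether or not it matches.
    """
    groups = []
    for item in items:
        if starts_group(item) or not groups:
            groups.append([item])
        else:
            groups[-1].append(item)
    return groups

# Hypothetical flat sequence of element names: headings followed by paragraphs.
flat = ["h1", "p", "p", "h1", "p"]
sections = group_starting_with(flat, lambda name: name == "h1")
```

The grouping depends on where items sit in the sequence rather than on a shared key value, which is what sets positional grouping apart from value-based grouping.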
This paper discusses the role of the XSLT 2.0 and XQuery 1.0 languages when it comes to writing real-life, sizeable applications for performing data transformations: especially factors such as error handling, debugging, performance, reuse and customization of code, relationships with XML Schema and other technologies such as XForms, and the use of pipeline-based application architectures.
This paper by Michael Kay was presented at XTech 2005 in Amsterdam. It compares XSLT and XQuery not just using a blow-by-blow feature comparison, but an assessment of the suitability of the languages for different tasks, and the kinds of users the two languages are aimed at.
This paper by Michael Kay was presented at XML 2004 in Washington DC. By means of a case study, it shows how some of the new features in XSLT 2.0 (notably the grouping instructions and the facilities for handling regular expressions) make XSLT 2.0 suitable for applications such as up-conversion (creating structured XML from unstructured input) that were quite infeasible in XSLT 1.0.
This paper by Michael Kay, presented at XML Europe 2004 in Amsterdam, looked at the techniques used inside an XSLT processor (Saxon, of course!) to optimize performance. It described some of the techniques actually used in the Saxon processor, and surveyed other ideas coming from academia.
Keynote address given by Michael Kay at the Document Engineering 2003 Conference in Grenoble, France.
Article in ComputerWoche (in German): XML begann als "SGML light" und sollte sich vor allem durch Einfachheit auszeichnen. Eine Reihe von Zusatzstandards erhöhten aber zwischenzeitlich die Komplexität beträchtlich. Während der Kernstandard weitgehend stabil bleibt, stehen in anderen Bereichen größere Änderungen bevor.
This paper by Michael Kay, although published as long ago as 2001, remains a frequently cited description of how XSLT processing in a product like Saxon actually works.
This paper by Michael Kay, published at the same time as the one above, gives an overview of the capabilities of the XSLT language.
Saxonica has a close working relationship with the Stylus Studio team: Stylus Studio was the first XML development environment to offer Saxon-SA as a standard feature. As part of this collaboration, we wrote a regular column for their web site. The following articles have been published:
In some of my tutorials and seminars I use a genealogy application to illustrate the features of XSLT 2.0. The files for this demonstration are available for download.