Using XML catalogs

XML catalogs (defined by OASIS) provide a way to avoid hard-coding the locations of XML documents and other resources in your application. Instead, the application refers to the resource using a conventional system identifier (URI) or public identifier, and a local catalog is used to map the system and public identifiers to an actual location.

When using Saxon from the command line, you can specify a catalog with the option -catalog:files, where files is either a single catalog file or a list of filenames separated by semicolons. The catalog is used to locate DTDs and external entities required by the XML parser, XSLT stylesheet modules requested using xsl:import and xsl:include, documents requested using the document() and doc() functions, and schema documents, however they are referenced.
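As a minimal sketch of the option syntax described above, the following Java helper joins a list of catalog filenames with semicolons and prefixes the result with -catalog:. The class name, the method names, and all of the file names are hypothetical and chosen only for illustration. The -s: and -xsl: options are included only to show where the catalog option sits among the other arguments.

```java
import java.util.ArrayList;
import java.util.List;

class CatalogOptionDemo {

    // Builds the -catalog option value: filenames separated by semicolons.
    static String catalogOption(List<String> catalogFiles) {
        if (catalogFiles.isEmpty()) {
            throw new IllegalArgumentException("at least one catalog file is required");
        }
        return "-catalog:" + String.join(";", catalogFiles);
    }

    // Assembles an argument list for a transformation (file names are placeholders).
    static List<String> transformArgs(String source, String stylesheet, List<String> catalogFiles) {
        List<String> args = new ArrayList<>();
        args.add("-s:" + source);
        args.add("-xsl:" + stylesheet);
        args.add(catalogOption(catalogFiles));
        return args;
    }

    public static void main(String[] args) {
        List<String> catalogs = List.of("catalog.xml", "extra-catalog.xml");
        System.out.println(String.join(" ", transformArgs("source.xml", "style.xsl", catalogs)));
    }
}
```

If the option is typed directly into a POSIX shell, the whole argument usually needs quoting, because an unquoted semicolon is treated by the shell as a command separator.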

Similarly, catalog files can be nominated when using the Java or C# API.

The catalog is used for all kinds of external resources, including XML documents, JSON documents, query modules, unparsed text files, and schema modules (but not collations or collections). The kind of resource required is generally identified by the nature parameter, following the principles of the RDDL specification. The values used are listed as constants in class ResourceRequest (on both Java and C#).

Catalog-based resolution is performed using a third-party library.

See the documentation for the relevant library for options on how the catalog resolver can be configured, for example using system properties or a properties file. Remember that such settings are typically global (they may affect more than one Saxon Configuration).

Settings local to a configuration can be made by getting the ResourceResolver from the Configuration, casting it to class ConfigurableResourceResolver, and calling its setFeature() method.

Note: Saxon configures the resolver with an explicitly empty list of catalog files and does not use catalog files configured with properties or property files. This is a compromise for backwards compatibility because, historically, Saxon only used a catalog resolver if an explicit -catalog option was specified on the command line. If Saxon configured the resolver to use the global configuration, then it would always attempt to load ./catalog.xml as a catalog file and that could change the behavior of existing applications. You must use the -catalog option or settings local to the Saxon Configuration to specify the catalogs used by Saxon.

Here is an example of a very simple catalog file. The publicId and systemId attributes give the public or system identifier as used in the source document; the uri attribute gives the location (in this case a relative location) where the actual resource will be found.

<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <group prefer="public" xml:base="file:///usr/share/xml/">
    <public publicId="-//OASIS//DTD DocBook XML V4.5//EN"
            uri="docbook45/docbookx.dtd"/>
    <system systemId="http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
            uri="docbook45/docbookx.dtd"/>
  </group>
</catalog>
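To see how the entries in such a catalog are matched, the sketch below loads the same catalog and looks up the public and system identifiers. It deliberately uses the javax.xml.catalog API that ships with the JDK (Java 9 or later), not the resolver library that Saxon uses, so it illustrates only the catalog semantics and says nothing about Saxon's own configuration. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

class CatalogLookupDemo {

    static final String PUBLIC_ID = "-//OASIS//DTD DocBook XML V4.5//EN";
    static final String SYSTEM_ID = "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd";

    // The example catalog from the text.
    static final String CATALOG_XML = String.join("\n",
        "<?xml version=\"1.0\"?>",
        "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">",
        "  <group prefer=\"public\" xml:base=\"file:///usr/share/xml/\">",
        "    <public publicId=\"" + PUBLIC_ID + "\" uri=\"docbook45/docbookx.dtd\"/>",
        "    <system systemId=\"" + SYSTEM_ID + "\" uri=\"docbook45/docbookx.dtd\"/>",
        "  </group>",
        "</catalog>");

    // Writes the catalog to a temporary file and loads it with the JDK catalog API.
    static Catalog loadExampleCatalog() {
        try {
            Path file = Files.createTempFile("catalog", ".xml");
            file.toFile().deleteOnExit();
            Files.write(file, CATALOG_XML.getBytes(StandardCharsets.UTF_8));
            return CatalogManager.catalog(CatalogFeatures.defaults(), file.toUri());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Catalog catalog = loadExampleCatalog();
        // Both identifiers should map to docbook45/docbookx.dtd, resolved against xml:base.
        System.out.println(catalog.matchPublic(PUBLIC_ID));
        System.out.println(catalog.matchSystem(SYSTEM_ID));
        // An identifier with no catalog entry yields null.
        System.out.println(catalog.matchSystem("http://example.com/unknown.dtd"));
    }
}
```

Because the uri attributes are relative, the locations returned depend on the xml:base of the enclosing group; with the value shown they should point into /usr/share/xml/.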

There are many tutorials on XML catalogs available on the web, including some with information specific to Saxon, but check that any such information applies to the current release: significant changes were made in Saxon 11.1.