Using XML catalogs
XML catalogs (defined by OASIS) provide a way to avoid hard-coding the locations of XML documents and other resources in your application. Instead, the application refers to the resource using a conventional system identifier (URI) or public identifier, and a local catalog is used to map the system and public identifiers to an actual location.
When using Saxon from the command line, you can specify a catalog with the option -catalog:files, where files is the catalog file to be searched, or a list of filenames separated by semicolons. This catalog will be used to locate DTDs and external entities required by the XML parser, XSLT stylesheet modules requested using xsl:import and xsl:include, documents requested using the document() and doc() functions, and also schema documents, however they are referenced.
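As a rough illustration of the option syntax described above, the following Python sketch assembles a -catalog option from a list of catalog files and places it in a Saxon command line. The jar name and all file names are hypothetical placeholders, and the helper catalog_option is not part of Saxon; the sketch only builds the argument list and does not run Saxon.

```python
def catalog_option(files):
    """Build the -catalog option from one or more catalog file names."""
    files = list(files)
    if not files:
        raise ValueError("at least one catalog file is required")
    # Multiple catalog files are separated by semicolons.
    return "-catalog:" + ";".join(files)


# Hypothetical jar and file names, for illustration only.
command = [
    "java", "-jar", "saxon-he.jar",
    "-s:source.xml",
    "-xsl:style.xsl",
    catalog_option(["catalog.xml", "extra-catalog.xml"]),
]

print(" ".join(command))
```

If a command like this is launched from Python, passing the arguments as a list (for example to subprocess.run) avoids the shell treating the semicolon as a command separator; when typing the command directly in a shell, the option would normally need quoting.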
Similarly, when using the API, catalog files can be nominated:
- From Java, use the method Processor.setCatalogFiles() with a list of filenames.
- From C#, use the method Processor.SetCatalogFiles() with a list of filenames.
- From C++ and PHP, use the method SaxonProcessor.setCatalog() with the filename.
- From Python, use the method PySaxonProcessor.set_catalog() with the filename.
The catalog is used for all kinds of external resources, including XML documents, JSON documents, query modules, unparsed text files, and schema modules (but not collations or collections). The kind of resource required is generally identified by the nature parameter, following the principles of the RDDL specification. The values used are listed as constants in class ResourceRequest on Java (or ResourceRequest on C#).
Catalog-based resolution is performed using a third-party library:
- For SaxonJ and SaxonC, project xmlresolver/xmlresolver on GitHub.
- For SaxonCS, project xmlresolver/xmlresolvercs on GitHub.
See the documentation for the relevant library for options on how the catalog resolver can be configured, for example using system properties or a properties file. Remember that such settings are typically global (they may affect more than one Saxon Configuration).
Settings local to a configuration can be made by getting the ResourceResolver from the Configuration, casting it to class ConfigurableResourceResolver, and calling its setFeature() method.
Note: Saxon configures the resolver with an explicitly empty list of catalog files and does not use catalog files configured with properties or property files. This is a compromise for backwards compatibility because, historically, Saxon only used a catalog resolver if an explicit -catalog option was specified on the command line. If Saxon configured the resolver to use the global configuration, then it would always attempt to load ./catalog.xml as a catalog file and that could change the behavior of existing applications.
You must use the -catalog option or settings local to the Saxon Configuration to specify the catalogs used by Saxon.
Here is an example of a very simple catalog file. The publicId and systemId attributes give the public or system identifier as used in the source document; the uri attribute gives the location (in this case a relative location) where the actual resource will be found.
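A minimal catalog of this shape, with one public and one system entry, might look like the XML embedded in the Python sketch below. The identifiers, the relative path dtd/sample.dtd, and the catalog location file:///project/catalog.xml are hypothetical. The load_catalog and lookup functions are only a simplified illustration of the mapping idea using Python's standard library; they are not the resolution algorithm implemented by the xmlresolver library.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Namespace of OASIS XML Catalogs.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical identifiers and paths, for illustration only.
CATALOG_XML = """<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//EXAMPLE//DTD Sample 1.0//EN" uri="dtd/sample.dtd"/>
  <system systemId="http://www.example.com/dtd/sample.dtd" uri="dtd/sample.dtd"/>
</catalog>
"""


def load_catalog(xml_text, base_uri):
    """Return (public_map, system_map); uri values are resolved against base_uri."""
    root = ET.fromstring(xml_text)
    public = {
        entry.get("publicId"): urljoin(base_uri, entry.get("uri"))
        for entry in root.iter(f"{{{CATALOG_NS}}}public")
    }
    system = {
        entry.get("systemId"): urljoin(base_uri, entry.get("uri"))
        for entry in root.iter(f"{{{CATALOG_NS}}}system")
    }
    return public, system


def lookup(public, system, public_id=None, system_id=None):
    """Simplified lookup: try the system identifier first, then the public one."""
    if system_id in system:
        return system[system_id]
    if public_id in public:
        return public[public_id]
    return None  # no catalog entry matched


public, system = load_catalog(CATALOG_XML, "file:///project/catalog.xml")
print(lookup(public, system, system_id="http://www.example.com/dtd/sample.dtd"))
print(lookup(public, system, public_id="-//EXAMPLE//DTD Sample 1.0//EN"))
```

Because the uri attributes are relative, both lookups resolve to a location relative to the catalog file itself, which is what allows a catalog and the resources it points to to be moved together.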
There are many tutorials for XML catalogs available on the web, including some with information specific to Saxon. Check that any such information is relevant to the current release, because significant changes were made in Saxon 11.1.