Validation from the command line

To validate one or more source documents using SaxonJ, write:

java  com.saxonica.Validate   [options]  source.xml...  

For SaxonCS, there are two ways of running validation from the command line. If you download SaxonCS as an executable application from the Saxonica download page, then you can validate one or more source documents with the command:

SaxonCS validate [options]  source.xml...  

If instead you install SaxonCS as a NuGet package, you need to prefix the command with dotnet, like this:

dotnet SaxonCS validate [options]  source.xml...  

For SaxonC, the validate executable file must first be built. To do this, compile the Validate.c C command line program by running the batch script ./build64-linux.sh on Linux, ./build64-mac.sh on MacOS, or build-windows.bat on Windows.

Then to validate one or more source documents using SaxonC, write:

./validate   [options]  source.xml...  

The command allows you to validate one or more source XML documents against a given schema, or simply to check a schema for internal correctness.

It is possible to use glob syntax to process multiple files, for example Validate *.xml.

In the above form, the command relies on the use of xsi:schemaLocation attributes within the instance document to identify the schema to be loaded. As an alternative, the schema can be specified on the command line, for example with the parameters:

-xsd:schema.xsd -s:instance.xml

In this form of the command, it is possible to specify multiple schema documents and/or multiple instance documents, as a list of names separated by semicolon (on Windows) or colon (on Linux and MacOS). Glob syntax (such as *.xml) is available only if the -s: prefix is omitted, because the shell has to recognize the argument as a filename.

Thus, source files to be validated can be listed either using the -s option, or in any argument that is not prefixed with "-". This allows the standard wildcard expansion facilities of the shell interpreter to be used, for example *.xml validates all files in the current directory with extension "xml".
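The separator rule for -s: and -xsd: lists can be sketched in a POSIX shell as follows. This is only an illustration: the helper name join_list and all file names are hypothetical, the detection of Windows shells through uname is an assumption, and the command is printed rather than executed so that it can be inspected first.

```shell
# join_list is a hypothetical helper, not part of Saxon.
# It joins its remaining arguments using the separator given as the first argument.
join_list() (
  IFS="$1"              # "$*" joins arguments with the first character of IFS
  shift
  printf '%s\n' "$*"
)

# Colon on Linux and MacOS, semicolon on Windows.
# Assumption: Windows shells report MINGW, MSYS or CYGWIN from uname.
case "$(uname -s)" in
  MINGW*|MSYS*|CYGWIN*) sep=';' ;;
  *)                    sep=':' ;;
esac

sources=$(join_list "$sep" order1.xml order2.xml)
schemas=$(join_list "$sep" main.xsd types.xsd)

# Dry run: print the command line. Remove "echo" to execute it.
echo java com.saxonica.Validate "-xsd:$schemas" "-s:$sources"
```

Quoting the -xsd: and -s: arguments matters on Windows shells, where an unquoted semicolon would otherwise end the command.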

If no instance documents are supplied, the effect of the command is simply to check a schema for internal correctness. So a schema can be verified using the command:

java com.saxonica.Validate -xsd:schema.xsd
dotnet SaxonCS validate -xsd:schema.xsd
./validate -xsd:schema.xsd

More generally the syntax of the command is:

java com.saxonica.Validate [options] [params] [filenames]
dotnet SaxonCS validate [options] [params] [filenames]
./validate [options] [params] [filenames]

where options generally take the form -name:value and params take the form keyword=value.

Command line options

The options are as follows (in any order):

-catalog:filenames

filenames is either a file name or a list of file names separated by semicolons; the files are OASIS XML catalogs used to define how public identifiers and system identifiers (URIs) used in a source document or schema are to be redirected, typically to resources available locally. For more details see Using XML catalogs.

-config:filename

Loads options from a configuration file. This must describe a schema-aware configuration.

-dtd:(on|off|recover)

Setting -dtd:on requests DTD-based validation of the source files. Requires an XML parser that supports validation. The setting -dtd:off (which is the default) suppresses DTD validation. The setting -dtd:recover performs DTD validation but treats the error as non-fatal if it fails. Note that any external DTD is likely to be read even if not used for validation, because DTDs can contain definitions of entities.

-export:filename

Writes a copy of the compiled schema (provided it is valid), as a schema component model (SCM), to the specified XML file. This file will contain schema components corresponding to all the loaded schema documents. This option may be combined with other options: the SCM file is written after all document instance validation has been carried out.

-ext:(on|off)

If -ext:off is specified, access to certain external resources is suppressed. Specifically, the option sets the configuration option ALLOW_EXTERNAL_FUNCTIONS, whose effect is rather wider than the name suggests. This option is useful when loading an untrusted schema, perhaps from a remote site using an http:// URL; it ensures that the schema cannot call arbitrary Java methods and thereby gain privileged access to resources on your machine.

-init:initializer

The value identifies user-supplied code that is called during the initialization process, and may be used to set any options required on the Saxon configuration.

For SaxonJ, the value is the name of a user-supplied class that implements the interface Initializer. The class must be on the classpath.

For SaxonCS, the value is the filename of a user-written assembly that contains an implementation of the interface Saxon.Api.IProcessorInitializer. For details see Callbacks.

-limits:min,max

Sets upper limits on the values of minOccurs and maxOccurs allowed in a schema content model, in cases where Saxon is not able to implement the rules using a finite state machine with counters. For further details see Handling minOccurs and maxOccurs.

-opt:0...10

Set optimization level. The value is an integer in the range 0 (no optimization) to 10 (full optimization); currently all values other than 0 result in full optimization but this is likely to change in future. The default is full optimization; this feature allows optimization to be suppressed in cases where reducing compile time is important, or where optimization gets in the way of debugging, or causes extension functions with side-effects to behave unpredictably. (Note however, that even with no optimization, lazy evaluation may still cause the evaluation order to be not as expected.)

-quit:(on|off)

With the default setting, on, the command will quit the Java VM and return an exit code if a failure occurs. This is useful when running from an operating system shell. With the setting -quit:off the command instead throws a RuntimeException, which is more useful when the command is invoked from another Java application such as Ant.
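When the command is run from a shell with the default -quit:on, the exit code can be tested in a script. The following is a minimal sketch, assuming that a failure is signalled by a non-zero exit code (the text above says only that an exit code is returned). The wrapper name report_status is hypothetical.

```shell
# report_status is a hypothetical wrapper, not part of Saxon.
# It runs the command given as its arguments and reports the exit status.
# Assumption: a non-zero exit status indicates that validation failed.
report_status() {
  status=0
  "$@" || status=$?
  if [ "$status" -eq 0 ]; then
    echo "validation succeeded"
  else
    echo "validation failed (exit code $status)"
  fi
  return "$status"
}

# Example use (not executed here):
#   report_status java com.saxonica.Validate -xsd:schema.xsd -s:instance.xml
```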

-r:classname

Use the specified URIResolver to process the URIs of all schema documents and source documents. The URIResolver is a user-defined class that implements the URIResolver interface defined in JAXP, whose function is to take a URI supplied as a string and return a JAXP Source object. It is invoked to process URIs found in xs:include and xs:import schemaLocation attributes of schema documents, the URIs found in xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes in the source document, and (if -u is also specified) to process the URI of the source file provided on the command line. Specifying -r:org.apache.xml.resolver.tools.CatalogResolver selects the Apache XML resolver (part of the Apache Commons project, which must be on the classpath) and enables URIs to be resolved via a catalog, allowing references to external websites to be redirected to local copies.

-report:filename

This option switches on the capture of validation reporting. Here filename specifies where the validation report should be written to on disk. The validation report is in XML format. The format of the validation report is defined in a schema which is available in the saxon-resources download file (see validation-reports.xsd).

-s:file;file...

Supplies a list of source documents to be validated. Each document is validated using the same options. The value is a list of filenames separated by semicolons (on Windows) or by colons (on Linux and MacOS); but if the -u option is used, then the list is a whitespace-separated list of URIs. It is also possible to specify the names of source documents as arguments without any preceding option flag; in this case shell wildcards ("globs") can be used. A filename can be specified as "-" to read the source document from standard input, in which case the base URI is taken from that of the current directory.

The validation of multiple source documents is done concurrently (in parallel threads) by default. The number of threads used is set to the number of processors available on the machine. If the configuration property ALLOW_MULTITHREADING is set to false, the source documents are validated sequentially in a single thread.

-scmin:filename

Loads a precompiled schema component model from the given file. The file should be generated in a previous run using the -export option. When this option is used, the -xsd option should not be present. Schemas loaded from an SCM file are assumed to be valid, without checking.

This option is retained for compatibility. From Saxon 9.7, SCM files can also be supplied in the -xsd option.

-scmout:filename

Synonym of -export:filename, retained for compatibility.

-stats:filename

Requests creation of an XML document containing statistics showing which schema components were used during the validation episode, and how often (coverage data). This data can be used as input to further processes to produce user-readable reports; for example the data could be combined with the output of -scmout to show which components were not used at all during the validation.

-t

Requests display of version and timing information to the standard error output. This also shows all the schema documents that have been loaded.

-top:element-name

Requires that the outermost element of the instance being validated has the required name. This is written in Clark notation format {uri}local.
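As an illustration of the Clark notation expected by -top, the sketch below builds the {uri}local string from a namespace URI and a local name. The helper name clark_name, the namespace URI, and the file names are all hypothetical. The command is printed rather than executed.

```shell
# clark_name is a hypothetical helper, not part of Saxon.
# It formats a namespace URI and a local name as {uri}local.
clark_name() { printf '{%s}%s\n' "$1" "$2"; }

top=$(clark_name "http://example.com/ns/orders" "order")

# Dry run: print the command line. Remove "echo" to execute it.
# The quotes keep the braces from being interpreted by the shell.
echo java com.saxonica.Validate -xsd:orders.xsd "-top:$top" -s:order1.xml
```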

-u

Indicates that the names of the source document and schema document are supplied as URIs; otherwise they are taken as filenames, unless they start with "http:", "https:", "file:", or "classpath:", in which case they are taken as URLs.

In addition, when this option is specified, multiple URIs supplied in the -s option must be separated by whitespace rather than by the usual file separator (colon or semicolon). To achieve this, the list of names needs to be in quotes.
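The effect of the quotes can be seen in a small shell sketch. The URIs and the helper count_args are hypothetical. The helper prints only how many arguments the shell passes on, which is what the command would receive.

```shell
# Hypothetical URIs, used only for illustration.
uris="http://example.com/a.xml http://example.com/b.xml"

# count_args is a hypothetical helper that prints the number of arguments it receives.
count_args() { echo "$#"; }

count_args -u "-s:$uris"   # quoted: the list stays inside one -s: argument (prints 2)
count_args -u -s:$uris     # unquoted: the shell splits the list at the space (prints 3)
```

In the unquoted form the second URI arrives as a separate argument and is no longer part of the -s: list.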

-val:(strict|lax)

Invokes strict or lax validation (default is strict). Lax validation validates elements only if there is an element declaration to validate them against, or if they have an xsi:type attribute.

-x:classname

Requests use of the specified SAX parser for parsing the source file. The classname must be the fully-qualified name of a Java class that implements the org.xml.sax.XMLReader interface. In the absence of this argument, the standard JAXP facilities are used to locate an XML parser. Note that the XML parser performs the raw XML parsing only; Saxon always does the schema validation itself. Selecting -x:org.apache.xml.resolver.tools.ResolvingXMLReader selects a parser configured to use the Apache entity resolver, so that DTD and other external references in source documents are resolved via a catalog. The parser (part of the Apache Commons project) must be on the classpath.

-xi:(on|off)

Apply XInclude processing to all XML documents (both source documents and schema documents). This relies on XInclude support in the XML parser.

-xmlversion:(1.0|1.1)

If set to 1.1, allows XML 1.1 and XML Namespaces 1.1 constructs. This option must be set if source documents using XML 1.1 are to be validated, or if the schema itself is an XML 1.1 document. This option causes types such as xs:Name, xs:QName, and xs:ID to use the XML 1.1 definitions of these constructs.

-xsd:file;file...

Supplies a list of schema documents to be used for validation. The value is a list of filenames separated by semicolons (on Windows) or by colons (on Linux and MacOS); but if the -u option is used, then the list is a whitespace-separated list of URIs.

The documents may either be source XSD schema documents, or compiled SCM files generated previously using the -export option. Loading precompiled schemas in SCM format is substantially faster. In addition, an SCM file may contain an embedded license key, in which case it is possible to use it for validation using a Saxon-EE configuration that does not have its own license.

NOTE: the format changed between 11.x and 12.x. In 11.x the separator was a semicolon regardless of the operating system platform, and regardless of the -u option. It was changed for consistency with the -s option.

-xsdversion:(1.0|1.1)

Indicates whether the schema processor is to act as an XSD 1.0 or XSD 1.1 processor. The default is XSD 1.1.

-xsiloc:(on|off)

If set to on (the default) the schema processor attempts to load any schema documents referenced in xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes in the instance document, unless a schema for the specified namespace (or non-namespace) is already available. If set to off, these attributes are ignored.

-y:classname

Use the specified SAX parser for schema documents. The supplied classname must be the fully-qualified class name of a Java class that implements the org.xml.sax.XMLReader or javax.xml.parsers.SAXParserFactory interface, and it must be instantiable using a zero-argument public constructor.

--feature:value

Set a configuration feature: see Configuration features. The value used here is the part of the name after the last "/", for example --allow-external-functions:off. Only features accepting a string or boolean may be set; for booleans the values true/false or on/off are recognized.

-?

Display command syntax.

Note: under some shell languages, this needs to be escaped as -\?.

--?

Display a list of features that are available using the --feature:value syntax.

Note: under some shell languages, this needs to be escaped as --\?.

The results of processing the schema, and of validating the source document against the schema, are written to the standard error output. Unless the -t option is used, successful processing of the source document and schema results in no output.

Command line parameters

Parameters on the command line can be used to supply values for any saxon:param declarations in the schema. See Parameterizing schemas for details. The format of parameters is the same as for the XSLT and XQuery command lines: name=value to supply a simple value; +name=filename to supply the contents of an XML document as the parameter value; or ?name=expression to supply the result of evaluating an XPath expression (for example, ?date=current-date()).
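The three parameter forms might be written on a shell command line as sketched below. The parameter names (region, config, date) and the file names are hypothetical and would have to match saxon:param declarations in the actual schema. The helper show_args only prints each argument on its own line, to show what the shell passes on after quoting.

```shell
# show_args is a hypothetical helper, not part of Saxon.
# It prints each argument in brackets on a separate line.
show_args() { printf '[%s]\n' "$@"; }

# Single quotes stop the shell from interpreting ?, ( and ) itself.
show_args java com.saxonica.Validate -xsd:schema.xsd -s:instance.xml \
  region=EU \
  +config=settings.xml \
  '?date=current-date()'
```

Replacing show_args with nothing (so that the line starts with java) would run the command itself.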