net.sf.saxon.functions.regex
Class JDK15RegexTranslator
java.lang.Object
net.sf.saxon.functions.regex.RegexTranslator
net.sf.saxon.functions.regex.JDK15RegexTranslator
public class JDK15RegexTranslator
- extends RegexTranslator
This class translates XML Schema regex syntax into JDK 1.5 regex syntax. This differs from the JDK 1.4
translator because JDK 1.5 handles non-BMP characters (wide characters) in places where JDK 1.4 does not,
for example in a range such as [X-Y]. This enables much of the code from the 1.4 translator to be
removed.
Author: James Clark, Thai Open Source Software Center Ltd. See statement at end of file.
Modified by Michael Kay (a) to integrate the code into Saxon, and (b) to support XPath additions
to the XML Schema regex syntax. This version also removes most of the complexities of handling non-BMP
characters, since JDK 1.5 handles these natively.
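As an illustration of the non-BMP handling described above, the following minimal sketch uses only java.util.regex (not Saxon) to build a character-class range whose endpoints lie outside the BMP. The class name NonBmpRangeDemo and its helper methods are hypothetical and are not part of the Saxon API.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Saxon.
final class NonBmpRangeDemo {

    // Builds a pattern of the form [X-Y], where X and Y are given as code points
    // and may each be represented by a surrogate pair in the pattern string.
    static Pattern rangeOf(int from, int to) {
        String cls = "[" + new String(Character.toChars(from))
                + "-" + new String(Character.toChars(to)) + "]";
        return Pattern.compile(cls);
    }

    // Returns true if the string consisting of the single given code point
    // matches the pattern in its entirety.
    static boolean matchesSingle(Pattern p, int codePoint) {
        return p.matcher(new String(Character.toChars(codePoint))).matches();
    }

    public static void main(String[] args) {
        Pattern p = rangeOf(0x10000, 0x10FFF);
        System.out.println(matchesSingle(p, 0x10400)); // code point inside the range
        System.out.println(matchesSingle(p, 0x41));    // 'A', outside the range
    }
}
```

Because java.util.regex treats a surrogate pair in the pattern as one code point, the range is interpreted over code points rather than over individual UTF-16 code units.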
Fields inherited from class net.sf.saxon.functions.regex.RegexTranslator
ALL, captures, caseBlind, curChar, currentCapture, eos, expandComplementBlockNames, ignoreWhitespace, inCharClassExpr, isXPath, isXPath30, length, NONE, NOT_ALLOWED_CLASS, pos, regExp, result, SOME, SURROGATES1_CLASS, SURROGATES2_CLASS, warnings, xmlVersion, xsdVersion
Method Summary
static void main(String[] args)
- Main method for testing.
static String translate(CharSequence regExp, int options, int flagbits, List<RegexSyntaxException> warnings)
- Translates a regular expression in the syntax of XML Schemas Part 2 into a regular expression in the syntax of java.util.regex.Pattern.
protected boolean translateAtom()
- If what follows is an Atom, translate it and return true; otherwise return false.
Methods inherited from class net.sf.saxon.functions.regex.RegexTranslator
absorbSurrogatePair, advance, copyCurChar, expect, highSurrogateRanges, isAsciiAlnum, isJavaMetaChar, lowSurrogateRanges, makeException, makeException, parseQuantExact, recede, translateBranch, translateQuantifier, translateQuantity, translateRegExp, translateTop
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
categoryCharClasses
public static final JDK15RegexTranslator.CharClass[] categoryCharClasses
- Translates XML Schema and XPath regexes into java.util.regex regexes.
- See Also: Pattern, XML Schema Part 2
subCategoryCharClasses
public static final JDK15RegexTranslator.CharClass[] subCategoryCharClasses
translate
public static String translate(CharSequence regExp,
int options,
int flagbits,
List<RegexSyntaxException> warnings)
throws RegexSyntaxException
- Translates a regular expression in the syntax of XML Schemas Part 2 into a regular expression in the syntax of java.util.regex.Pattern. The translation assumes that the string to be matched against the regex uses surrogate pairs correctly. If the string comes from XML content, a conforming XML parser will automatically check this; if the string comes from elsewhere, it may be necessary to check surrogate usage before matching.
- Parameters:
regExp - a String containing a regular expression in the syntax of XML Schemas Part 2
options - bit-wise option settings
flagbits - Java bit-wise option settings based on supplied flags
warnings - a list to contain any warnings generated. If no list is supplied, this indicates that the caller is not interested in knowing about any warnings.
- Returns:
- a JDK 1.5 regular expression
- Throws:
RegexSyntaxException - if regExp is not a regular expression in the syntax of XML Schemas Part 2, or XPath 2.0, as appropriate
- See Also: Pattern, XML Schema Part 2
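The surrogate check that the description of translate says may be needed for strings not originating from XML content could be sketched as follows. This is a minimal illustration using only java.lang.Character; the class SurrogateCheck and the method hasWellFormedSurrogates are hypothetical names and are not part of the Saxon API.

```java
// Hypothetical helper, not part of Saxon.
final class SurrogateCheck {

    // Returns true if every high surrogate in s is immediately followed by a
    // low surrogate and no low surrogate appears on its own.
    static boolean hasWellFormedSurrogates(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed by a low surrogate.
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false;
                }
                i++; // skip the low surrogate that completes the pair
            } else if (Character.isLowSurrogate(c)) {
                // A low surrogate without a preceding high surrogate is malformed.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasWellFormedSurrogates("abc"));
        System.out.println(hasWellFormedSurrogates("\uD801\uDC00"));
        System.out.println(hasWellFormedSurrogates("a\uD801b"));
    }
}
```

A caller might run such a check on the input string before matching it against the translated pattern, and reject or repair the input if the check fails.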
translateAtom
protected boolean translateAtom()
throws RegexSyntaxException
- Description copied from class: RegexTranslator
- If what follows is an Atom, translate it and return true; otherwise return false.
- Specified by: translateAtom in class RegexTranslator
- Returns:
- true if we found an atom
- Throws:
RegexSyntaxException
- if the regex syntax is incorrect
main
public static void main(String[] args)
throws RegexSyntaxException
- Main method for testing. Outputs the Java translation of a supplied regular expression to System.err.
- Parameters:
args - command line arguments:
args[0] a regular expression
args[1] = xpath to invoke the XPath rules
- Throws:
RegexSyntaxException
- if the regex is invalid
Copyright (c) 2004-2011 Saxonica Limited. All rights reserved.