net.sf.saxon.functions.regex
Class JDK15RegexTranslator

java.lang.Object
  extended by net.sf.saxon.functions.regex.RegexTranslator
      extended by net.sf.saxon.functions.regex.JDK15RegexTranslator

public class JDK15RegexTranslator
extends RegexTranslator

This class translates XML Schema regex syntax into JDK 1.5 regex syntax. It differs from the JDK 1.4 translator because JDK 1.5 handles non-BMP characters (wide characters, represented in Java as surrogate pairs) in places where JDK 1.4 does not, for example in a range such as [X-Y]. Because JDK 1.5 handles these characters natively, much of the code from the 1.4 translator, and most of the complexity of handling non-BMP characters, has been removed.

Author: James Clark, Thai Open Source Software Center Ltd. See statement at end of file.

Modified by Michael Kay (a) to integrate the code into Saxon, and (b) to support XPath additions to the XML Schema regex syntax.
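The remark about non-BMP characters in a range such as [X-Y] can be illustrated with java.util.regex alone. The following is a minimal sketch, not part of Saxon: the class name NonBmpRangeDemo and its helper methods are hypothetical, and the range U+1D400 to U+1D419 (MATHEMATICAL BOLD CAPITAL A to Z) was chosen only because every character in it lies outside the BMP.

```java
import java.util.regex.Pattern;

// Hypothetical demo class (not part of Saxon): shows java.util.regex treating
// a character-class range with non-BMP endpoints as a range of code points.
public final class NonBmpRangeDemo {
    private NonBmpRangeDemo() {}

    /** Converts a code point to a String (two chars for non-BMP code points). */
    public static String cp(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Range U+1D400..U+1D419; both endpoints are surrogate pairs in UTF-16.
    private static final Pattern BOLD_CAPITALS =
            Pattern.compile("[" + cp(0x1D400) + "-" + cp(0x1D419) + "]");

    /** Returns true if s consists of exactly one character in the range. */
    public static boolean isBoldCapital(String s) {
        return BOLD_CAPITALS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isBoldCapital(cp(0x1D405))); // inside the range
        System.out.println(isBoldCapital(cp(0x1D41A))); // just past the range
        System.out.println(isBoldCapital("A"));         // BMP letter, outside the range
    }
}
```

Building the pattern from code points rather than from literal surrogate escapes keeps the intent of the range visible in the source.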


Nested Class Summary
protected static class JDK15RegexTranslator.CharClass
           
protected static class JDK15RegexTranslator.CharRange
           
 
Nested classes/interfaces inherited from class net.sf.saxon.functions.regex.RegexTranslator
RegexTranslator.Range
 
Field Summary
static JDK15RegexTranslator.CharClass[] categoryCharClasses
          Translates XML Schema and XPath regexes into java.util.regex regexes.
static JDK15RegexTranslator.CharClass[] subCategoryCharClasses
           
 
Fields inherited from class net.sf.saxon.functions.regex.RegexTranslator
ALL, captures, caseBlind, curChar, currentCapture, eos, expandComplementBlockNames, ignoreWhitespace, inCharClassExpr, isXPath, isXPath30, length, NONE, NOT_ALLOWED_CLASS, pos, regExp, result, SOME, SURROGATES1_CLASS, SURROGATES2_CLASS, warnings, xmlVersion, xsdVersion
 
Method Summary
static void main(String[] args)
          Main method for testing.
static String translate(CharSequence regExp, int options, int flagbits, List<RegexSyntaxException> warnings)
          Translates a regular expression in the syntax of XML Schema Part 2 into a regular expression in the syntax of java.util.regex.Pattern.
protected  boolean translateAtom()
          If what follows is an Atom, translate it and return true; otherwise return false.
 
Methods inherited from class net.sf.saxon.functions.regex.RegexTranslator
absorbSurrogatePair, advance, copyCurChar, expect, highSurrogateRanges, isAsciiAlnum, isJavaMetaChar, lowSurrogateRanges, makeException, makeException, parseQuantExact, recede, translateBranch, translateQuantifier, translateQuantity, translateRegExp, translateTop
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

categoryCharClasses

public static final JDK15RegexTranslator.CharClass[] categoryCharClasses
Translates XML Schema and XPath regexes into java.util.regex regexes.

See Also:
Pattern, XML Schema Part 2

subCategoryCharClasses

public static final JDK15RegexTranslator.CharClass[] subCategoryCharClasses

Method Detail

translate

public static String translate(CharSequence regExp,
                               int options,
                               int flagbits,
                               List<RegexSyntaxException> warnings)
                        throws RegexSyntaxException
Translates a regular expression in the syntax of XML Schema Part 2 into a regular expression in the syntax of java.util.regex.Pattern. The translation assumes that the string to be matched against the regex uses surrogate pairs correctly. If the string comes from XML content, a conforming XML parser will automatically check this; if the string comes from elsewhere, it may be necessary to check surrogate usage before matching.

Parameters:
regExp - a CharSequence containing a regular expression in the syntax of XML Schema Part 2
options - bit-wise option settings
flagbits - Java bit-wise options settings based on supplied flags
warnings - a list to contain any warnings generated. If no list is supplied, this indicates that the caller is not interested in knowing about any warnings.
Returns:
a JDK 1.5 regular expression
Throws:
RegexSyntaxException - if regExp is not a regular expression in the syntax of XML Schema Part 2 or XPath 2.0, as appropriate
See Also:
Pattern, XML Schema Part 2
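
The description of translate notes that strings from sources other than a conforming XML parser may need their surrogate usage checked before matching. One way such a check might look is sketched below. The class SurrogateCheck and its method hasValidSurrogates are hypothetical names introduced for illustration; they are not part of Saxon or of this class.

```java
// Hypothetical helper (not part of Saxon): verifies that every surrogate
// code unit in a string belongs to a well-formed high/low pair.
public final class SurrogateCheck {
    private SurrogateCheck() {}

    /** Returns true if s contains no unpaired or misordered surrogates. */
    public static boolean hasValidSurrogates(CharSequence s) {
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must be immediately followed by a low surrogate.
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false;
                }
                i += 2; // skip the whole pair
            } else if (Character.isLowSurrogate(c)) {
                // A low surrogate with no preceding high surrogate is unpaired.
                return false;
            } else {
                i++;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasValidSurrogates("abc"));           // no surrogates at all
        System.out.println(hasValidSurrogates("\uD835\uDC00"));  // well-formed pair
        System.out.println(hasValidSurrogates("a\uD835b"));      // lone high surrogate
    }
}
```

A caller could run this check on the input string and reject or repair it before matching against the translated pattern.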

translateAtom

protected boolean translateAtom()
                         throws RegexSyntaxException
Description copied from class: RegexTranslator
If what follows is an Atom, translate it and return true; otherwise return false.

Specified by:
translateAtom in class RegexTranslator
Returns:
true if we found an atom
Throws:
RegexSyntaxException - if the regex syntax is incorrect

main

public static void main(String[] args)
                 throws RegexSyntaxException
Main method for testing. Outputs the Java translation of a supplied regular expression to System.err.

Parameters:
args - command line arguments: args[0] is a regular expression; args[1] = xpath to invoke the XPath rules
Throws:
RegexSyntaxException - if the regex is invalid


Copyright (c) 2004-2011 Saxonica Limited. All rights reserved.