public final class Tokenizer
extends java.lang.Object
This code was originally derived from James Clark's xt, though it has been greatly modified since. See copyright notice at end of file.
Modifier and Type | Field and Description |
---|---|
boolean | allowSaxonExtensions<br>Flag to allow Saxon extensions |
static int | BARE_NAME_STATE<br>State in which a name is NOT to be merged with what comes next, for example "(" |
int | currentToken<br>The number identifying the most recently read token |
int | currentTokenStartOffset<br>The position in the input expression where the current token starts |
java.lang.String | currentTokenValue<br>The string value of the most recently read token |
static int | DEFAULT_STATE<br>Initial default state of the Tokenizer |
boolean | disallowUnionKeyword<br>Flag to disallow "union" as a synonym for "\|" when parsing XSLT 2.0 patterns |
java.lang.String | input<br>The string being parsed |
int | inputOffset<br>The current position within the input string |
boolean | isXQuery<br>Flag to indicate that this is XQuery as distinct from XPath |
int | languageLevel<br>XPath language level: e.g. |
static int | OPERATOR_STATE<br>State in which the next thing to be read is an operator |
static int | SEQUENCE_TYPE_STATE<br>State in which the next thing to be read is a SequenceType |

Constructor and Description |
---|
Tokenizer() |
Modifier and Type | Method and Description |
---|---|
int | getBinaryOp(java.lang.String s)<br>Identify a binary operator |
int | getColumnNumber()<br>Get the column number of the current token |
int | getColumnNumber(int offset)<br>Return the column number corresponding to a given offset in the expression |
long | getLineAndColumn(int offset)<br>Get the line and column number corresponding to a given offset in the input expression, as a long value with the line number in the top half and the column number in the lower half. |
int | getLineNumber()<br>Get the line number of the current token |
int | getLineNumber(int offset)<br>Return the line number corresponding to a given offset in the expression |
int | getState()<br>Get the current tokenizer state |
void | incrementLineNumber(int offset)<br>Increment the line number, making a record of where in the input string the newline character occurred. |
void | lookAhead()<br>Look ahead by one token. |
void | next()<br>Get the next token from the input expression. |
char | nextChar()<br>Read next character directly. |
int | peekAhead()<br>Peek ahead at the next token |
java.lang.String | recentText(int offset)<br>Get the most recently read text (for use in an error message) |
void | setState(int state)<br>Set the tokenizer into a special state |
void | tokenize(java.lang.String input, int start, int end)<br>Prepare a string for tokenization. |
void | treatCurrentAsOperator()<br>Force the current token to be treated as an operator if possible |
void | unreadChar()<br>Step back one character. |

public static final int DEFAULT_STATE
public static final int BARE_NAME_STATE
public static final int SEQUENCE_TYPE_STATE
public static final int OPERATOR_STATE
public int currentToken
public java.lang.String currentTokenValue
public int currentTokenStartOffset
public java.lang.String input
public int inputOffset
public boolean disallowUnionKeyword
public boolean isXQuery
public int languageLevel
public boolean allowSaxonExtensions
public int getState()

public void setState(int state)
state - the new state

public void tokenize(java.lang.String input, int start, int end) throws XPathException
input - the string to be tokenized
start - start point within the string
end - end point within the string (last character not read): -1 means end of string
XPathException - if a lexical error occurs, e.g. unmatched string quotes

public void next() throws XPathException
XPathException - if a lexical error is detected

public int peekAhead()

public void treatCurrentAsOperator()

public void lookAhead() throws XPathException
XPathException - if a lexical error occurs

public int getBinaryOp(java.lang.String s)
s - String representation of the operator - must be interned

public char nextChar() throws java.lang.StringIndexOutOfBoundsException
java.lang.StringIndexOutOfBoundsException - if an attempt is made to read beyond the end of the string. This will only occur in the event of a syntax error in the input.

public void incrementLineNumber(int offset)
offset - the place in the input string where the newline occurred

public void unreadChar()

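The character-level methods nextChar() and unreadChar() behave like a cursor over the input string. The following is a minimal sketch of that idea and not Saxon's implementation: CharCursor and hasMore() are hypothetical names introduced for illustration, and the handling of end == -1 mirrors the convention documented for tokenize().

```java
/** Hypothetical illustration, not Saxon code: a character cursor over part of a string. */
final class CharCursor {
    private final String input;
    private final int start;  // first position that may be read
    private final int end;    // exclusive end position (the character at 'end' is not read)
    private int offset;       // position of the next character to read

    /** An end value of -1 means "read to the end of the string". */
    CharCursor(String input, int start, int end) {
        this.input = input;
        this.start = start;
        this.end = (end == -1) ? input.length() : end;
        this.offset = start;
    }

    /** Read the next character, failing if the cursor has run past the permitted range. */
    char nextChar() {
        if (offset >= end) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        return input.charAt(offset++);
    }

    /** Step back one character so that the next read returns it again. */
    void unreadChar() {
        if (offset > start) {
            offset--;
        }
    }

    /** True while at least one more character can be read. */
    boolean hasMore() {
        return offset < end;
    }

    public static void main(String[] args) {
        CharCursor cursor = new CharCursor("a|b", 0, -1);
        char first = cursor.nextChar();   // reads 'a'
        char second = cursor.nextChar();  // reads '|'
        cursor.unreadChar();              // push '|' back
        System.out.println(first + " " + second + " " + cursor.nextChar());
    }
}
```

A caller that reads one character too many, for example while scanning an unterminated string literal, gets a StringIndexOutOfBoundsException, which matches the documented behaviour of nextChar() on a syntax error.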
public java.lang.String recentText(int offset)
offset - the offset of the offending token, if known, or -1 to use the current offset

public int getLineNumber()

public int getColumnNumber()

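The line and column accessors depend on the newline positions recorded through incrementLineNumber(int offset), and getLineAndColumn(int offset) is documented in the method summary as returning both values in one long, with the line number in the top half and the column number in the lower half. The sketch below shows one way such bookkeeping could work. It is not Saxon's implementation: LineTracker, recordNewline, lineOf and columnOf are hypothetical names, and both the 32-bit split and the 1-based numbering are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Saxon: maps offsets to line and column numbers. */
final class LineTracker {
    // Offsets at which newline characters were seen, in increasing order.
    private final List<Integer> newlineOffsets = new ArrayList<>();

    /** Record that a newline occurred at the given offset (call in increasing offset order). */
    void recordNewline(int offset) {
        newlineOffsets.add(offset);
    }

    /** Return line and column packed into a long: line in the upper 32 bits, column in the lower 32. */
    long getLineAndColumn(int offset) {
        int line = 1;        // assumption: lines are numbered from 1
        int lineStart = 0;   // offset of the first character on the current line
        for (int newline : newlineOffsets) {
            if (newline >= offset) {
                break;
            }
            line++;
            lineStart = newline + 1;
        }
        int column = offset - lineStart + 1;  // assumption: columns are numbered from 1
        return ((long) line << 32) | (column & 0xFFFFFFFFL);
    }

    /** Extract the line number from a packed value. */
    static int lineOf(long packed) {
        return (int) (packed >>> 32);
    }

    /** Extract the column number from a packed value. */
    static int columnOf(long packed) {
        return (int) (packed & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        String expr = "a +\nb";
        LineTracker tracker = new LineTracker();
        for (int i = 0; i < expr.length(); i++) {
            if (expr.charAt(i) == '\n') {
                tracker.recordNewline(i);
            }
        }
        long packed = tracker.getLineAndColumn(expr.indexOf('b'));
        System.out.println(lineOf(packed) + ":" + columnOf(packed)); // prints "2:1"
    }
}
```

Packing both numbers into a single long lets a method return a complete position without allocating an object; callers recover each half with a shift and a mask.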
public long getLineAndColumn(int offset)
offset - the byte offset in the expression

public int getLineNumber(int offset)
offset - the byte offset in the expression

public int getColumnNumber(int offset)
offset - the byte offset in the expression

Copyright (c) 2004-2020 Saxonica Limited. All rights reserved.