java.lang.Object
- net.sf.saxon.expr.parser.Tokenizer

```
public final class Tokenizer
extends java.lang.Object
```
Tokenizer for expressions and inputs.
This code was originally derived from James Clark's xt, though it has been greatly modified since. See copyright notice at end of file.

Field Summary

Fields
Modifier and Type	Field	Description
`boolean`	`allowSaxonExtensions`	Flag to allow Saxon extensions
`static int`	`BARE_NAME_STATE`	State in which a name is NOT to be merged with what comes next, for example "("
`int`	`currentToken`	The number identifying the most recently read token
`int`	`currentTokenStartOffset`	The position in the input expression where the current token starts
`java.lang.String`	`currentTokenValue`	The string value of the most recently read token
`static int`	`DEFAULT_STATE`	Initial default state of the Tokenizer
`boolean`	`disallowUnionKeyword`	Flag to disallow "union" as a synonym for "\|" when parsing XSLT 2.0 patterns
`java.lang.String`	`input`	The string being parsed
`int`	`inputOffset`	The current position within the input string
`boolean`	`isXQuery`	Flag to indicate that this is XQuery as distinct from XPath
`int`	`languageLevel`	XPath language level: e.g. 2.0, 3.0, or 3.1
`static int`	`OPERATOR_STATE`	State in which the next thing to be read is an operator
`static int`	`SEQUENCE_TYPE_STATE`	State in which the next thing to be read is a SequenceType

Constructor Summary

Constructors
Constructor	Description
`Tokenizer()`	

Method Summary

Modifier and Type	Method	Description
`int`	`getColumnNumber()`	Get the column number of the current token
`int`	`getColumnNumber(int offset)`	Return the column number corresponding to a given offset in the expression
`int`	`getLineNumber()`	Get the line number of the current token
`int`	`getLineNumber(int offset)`	Return the line number corresponding to a given offset in the expression
`int`	`getState()`	Get the current tokenizer state
`void`	`incrementLineNumber(int offset)`	Increment the line number, making a record of where in the input string the newline character occurred.
`void`	`lookAhead()`	Look ahead by one token.
`void`	`next()`	Get the next token from the input expression.
`char`	`nextChar()`	Read next character directly.
`void`	`setState(int state)`	Set the tokenizer into a special state
`void`	`tokenize(java.lang.String input, int start, int end)`	Prepare a string for tokenization.
`void`	`treatCurrentAsOperator()`	Force the current token to be treated as an operator if possible
`void`	`unreadChar()`	Step back one character.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - DEFAULT_STATE
```
public static final int DEFAULT_STATE
```
    Initial default state of the Tokenizer
    
    See Also:
    
    Constant Field Values
  - BARE_NAME_STATE
```
public static final int BARE_NAME_STATE
```
    State in which a name is NOT to be merged with what comes next, for example "("
    
    See Also:
    
    Constant Field Values
  - SEQUENCE_TYPE_STATE
```
public static final int SEQUENCE_TYPE_STATE
```
    State in which the next thing to be read is a SequenceType
    
    See Also:
    
    Constant Field Values
  - OPERATOR_STATE
```
public static final int OPERATOR_STATE
```
    State in which the next thing to be read is an operator
    
    See Also:
    
    Constant Field Values
  - currentToken
```
public int currentToken
```
    The number identifying the most recently read token
  - currentTokenValue
```
public java.lang.String currentTokenValue
```
    The string value of the most recently read token
  - currentTokenStartOffset
```
public int currentTokenStartOffset
```
    The position in the input expression where the current token starts
  - input
```
public java.lang.String input
```
    The string being parsed
  - inputOffset
```
public int inputOffset
```
    The current position within the input string
  - disallowUnionKeyword
```
public boolean disallowUnionKeyword
```
    Flag to disallow "union" as a synonym for "|" when parsing XSLT 2.0 patterns
  - isXQuery
```
public boolean isXQuery
```
    Flag to indicate that this is XQuery as distinct from XPath
  - languageLevel
```
public int languageLevel
```
    XPath language level: e.g. 2.0, 3.0, or 3.1
  - allowSaxonExtensions
```
public boolean allowSaxonExtensions
```
    Flag to allow Saxon extensions
- Constructor Detail
  - Tokenizer
```
public Tokenizer()
```
- Method Detail
  - getState
```
public int getState()
```
    Get the current tokenizer state
    
    Returns:
    
    the current state
  - setState
```
public void setState(int state)
```
    Set the tokenizer into a special state
    
    Parameters:
    
    state - the new state
  - tokenize
```
public void tokenize(java.lang.String input,
                     int start,
                     int end)
              throws XPathException
```
    Prepare a string for tokenization. The actual tokens are obtained by subsequent calls to next().
    
    Parameters:
    
    input - the string to be tokenized
    
    start - start point within the string
    
    end - end point within the string (last character not read): -1 means end of string
    
    Throws:
    
    XPathException - if a lexical error occurs, e.g. unmatched string quotes
  - next
```
public void next()
          throws XPathException
```
    Get the next token from the input expression. The method itself returns nothing: the token type is stored in the currentToken field, and the string value of the token in currentTokenValue.
    
    Throws:
    
    XPathException - if a lexical error is detected
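    
    The calling pattern described above (prepare the input with tokenize(), then call next() repeatedly and read the public fields) can be sketched as follows. `MiniTokenizer`, its token codes, and the `drain` helper are hypothetical stand-ins written for this illustration; they are not Saxon's implementation, and whether the real class needs an initial next() after tokenize() is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration only; not Saxon's Tokenizer. */
final class MiniTokenizer {
    // Made-up token codes for this sketch.
    static final int EOF = 0, NAME = 1, NUMBER = 2, SYMBOL = 3;

    int currentToken = EOF;        // type of the most recently read token
    String currentTokenValue;      // its string value
    int currentTokenStartOffset;   // where it starts in the input

    private String input;
    private int inputOffset;
    private int inputLength;
    // One-token look-ahead buffer.
    private int nextToken = EOF;
    private String nextTokenValue;
    private int nextTokenStartOffset;

    /** Prepare input[start, end) for tokenization; end == -1 means end of string. */
    void tokenize(String input, int start, int end) {
        this.input = input;
        this.inputOffset = start;
        this.inputLength = (end == -1) ? input.length() : end;
        lookAhead(); // prime the buffer
    }

    /** Publish the buffered token in the public fields, then read one more. */
    void next() {
        currentToken = nextToken;
        currentTokenValue = nextTokenValue;
        currentTokenStartOffset = nextTokenStartOffset;
        lookAhead();
    }

    /** Scan one token into the look-ahead buffer. */
    private void lookAhead() {
        while (inputOffset < inputLength && Character.isWhitespace(input.charAt(inputOffset))) {
            inputOffset++;
        }
        nextTokenStartOffset = inputOffset;
        if (inputOffset >= inputLength) {
            nextToken = EOF;
            nextTokenValue = null;
            return;
        }
        char first = input.charAt(inputOffset++);
        if (Character.isLetter(first)) {
            nextToken = NAME;
            while (inputOffset < inputLength && Character.isLetterOrDigit(input.charAt(inputOffset))) {
                inputOffset++;
            }
        } else if (Character.isDigit(first)) {
            nextToken = NUMBER;
            while (inputOffset < inputLength && Character.isDigit(input.charAt(inputOffset))) {
                inputOffset++;
            }
        } else {
            nextToken = SYMBOL; // any other single character
        }
        nextTokenValue = input.substring(nextTokenStartOffset, inputOffset);
    }

    /** Usage: collect "code:value" for every token in the requested slice. */
    static List<String> drain(String input, int start, int end) {
        MiniTokenizer t = new MiniTokenizer();
        t.tokenize(input, start, end);
        List<String> out = new ArrayList<>();
        t.next();
        while (t.currentToken != EOF) {
            out.add(t.currentToken + ":" + t.currentTokenValue);
            t.next();
        }
        return out;
    }
}
```

    In this sketch the `end` argument is exclusive, matching "last character not read", so `drain("a + 12", 0, 3)` sees only `a +`.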
  - treatCurrentAsOperator
```
public void treatCurrentAsOperator()
```
    Force the current token to be treated as an operator if possible
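    
    One way to picture why a tokenizer state and treatCurrentAsOperator() are useful: in XPath a word such as `div` is an operator in `a div b` but an ordinary name in `/div`. The sketch below is a simplified model of that idea, not Saxon's code. The class `OperatorStateSketch`, the `read` method, the numeric values of the state constants, and the token codes are all assumptions made for the example.

```java
/** Hypothetical model of state-dependent token classification; not Saxon's code. */
final class OperatorStateSketch {
    // Names mirror the documented constants; the numeric values are made up.
    static final int DEFAULT_STATE = 0, OPERATOR_STATE = 1;
    // Made-up token codes.
    static final int NAME = 1, DIV = 2, AND = 3, OR = 4;

    int state = DEFAULT_STATE;
    int currentToken;
    String currentTokenValue;

    /** Map a word to its operator code, or -1 if it cannot be an operator. */
    private static int operatorCode(String word) {
        switch (word) {
            case "div": return DIV;
            case "and": return AND;
            case "or":  return OR;
            default:    return -1;
        }
    }

    /** Classify a word: operator words are recognised only in OPERATOR_STATE. */
    void read(String word) {
        currentTokenValue = word;
        int op = (state == OPERATOR_STATE) ? operatorCode(word) : -1;
        currentToken = (op != -1) ? op : NAME;
    }

    /** Reinterpret the current NAME token as an operator, if its text allows it. */
    void treatCurrentAsOperator() {
        if (currentToken == NAME) {
            int op = operatorCode(currentTokenValue);
            if (op != -1) {
                currentToken = op;
            }
        }
    }

    public static void main(String[] args) {
        OperatorStateSketch t = new OperatorStateSketch();
        t.read("div");                 // read as a plain name in the default state
        t.treatCurrentAsOperator();    // the parser decides an operator is expected here
        System.out.println(t.currentToken == DIV);
    }
}
```

    The "if possible" in the description corresponds to the `-1` case here: a word that has no operator reading stays a name.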
  - lookAhead
```
public void lookAhead()
               throws XPathException
```
    Look ahead by one token. This method does the real tokenization work. The method is normally called internally, but the XQuery parser also calls it to resume normal tokenization after dealing with pseudo-XML syntax.
    
    Throws:
    
    XPathException - if a lexical error occurs
  - nextChar
```
public char nextChar()
              throws java.lang.StringIndexOutOfBoundsException
```
    Read the next character directly. Used by the XQuery parser when parsing pseudo-XML syntax.
    
    Returns:
    
    the next character from the input
    
    Throws:
    
    java.lang.StringIndexOutOfBoundsException - if an attempt is made to read beyond the end of the string. This will only occur in the event of a syntax error in the input.
  - incrementLineNumber
```
public void incrementLineNumber(int offset)
```
    Increment the line number, making a record of where in the input string the newline character occurred.
    
    Parameters:
    
    offset - the place in the input string where the newline occurred
  - unreadChar
```
public void unreadChar()
```
    Step back one character. If this steps back to a previous line, adjust the line number.
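    
    The interplay between nextChar(), incrementLineNumber(int) and unreadChar() can be modelled as below. This is a minimal sketch under stated assumptions: the class `CharReaderSketch` is hypothetical, it assumes that reading a newline is what triggers the line-number increment, and it stores newline offsets in a list, which may differ from what Saxon does internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical character reader with line tracking; not Saxon's code. */
final class CharReaderSketch {
    private final String input;
    private int inputOffset = 0;
    private int lineNumber = 0; // zero-based, as in the documented getLineNumber()
    private final List<Integer> newlineOffsets = new ArrayList<>();

    CharReaderSketch(String input) {
        this.input = input;
    }

    /** Read one character; String.charAt throws StringIndexOutOfBoundsException past the end. */
    char nextChar() {
        char c = input.charAt(inputOffset);
        inputOffset++;
        if (c == '\n') {
            incrementLineNumber(inputOffset - 1);
        }
        return c;
    }

    /** Record where a newline occurred and move to the next line. */
    void incrementLineNumber(int offset) {
        newlineOffsets.add(offset);
        lineNumber++;
    }

    /** Step back one character; undo the line increment if a newline is un-read. */
    void unreadChar() {
        if (inputOffset == 0) {
            throw new IllegalStateException("nothing to unread");
        }
        inputOffset--;
        if (input.charAt(inputOffset) == '\n') {
            lineNumber--;
            newlineOffsets.remove(newlineOffsets.size() - 1); // remove by index
        }
    }

    int getLineNumber() {
        return lineNumber;
    }
}
```

    Un-reading a newline and reading it again leaves the line number unchanged in this model, which is the adjustment the description of unreadChar() refers to.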
  - getLineNumber
```
public int getLineNumber()
```
    Get the line number of the current token
    
    Returns:
    
    the line number. Line numbers reported by the tokenizer start at zero.
  - getColumnNumber
```
public int getColumnNumber()
```
    Get the column number of the current token
    
    Returns:
    
    the column number. Column numbers reported by the tokenizer start at zero.
  - getLineNumber
```
public int getLineNumber(int offset)
```
    Return the line number corresponding to a given offset in the expression
    
    Parameters:
    
    offset - the character offset in the expression
    
    Returns:
    
    the line number. Line and column numbers reported by the tokenizer start at zero.
  - getColumnNumber
```
public int getColumnNumber(int offset)
```
    Return the column number corresponding to a given offset in the expression
    
    Parameters:
    
    offset - the character offset in the expression
    
    Returns:
    
    the column number. Line and column numbers reported by the tokenizer start at zero.
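    
    Given a record of newline offsets such as the one incrementLineNumber(int) is described as keeping, zero-based line and column numbers for any offset can be derived as in the following sketch. `LineIndexSketch` and its `forText` factory are hypothetical, and the convention that a newline character belongs to the line it terminates is an assumption of this example rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical offset-to-position lookup; not Saxon's code. */
final class LineIndexSketch {
    // Offsets of newline characters, in ascending order.
    private final List<Integer> newlineOffsets = new ArrayList<>();

    /** Record that a newline occurred at the given offset. */
    void incrementLineNumber(int offset) {
        newlineOffsets.add(offset);
    }

    /** Zero-based line: the number of newlines strictly before the offset. */
    int getLineNumber(int offset) {
        int line = 0;
        for (int nl : newlineOffsets) {
            if (nl >= offset) {
                break;
            }
            line++;
        }
        return line;
    }

    /** Zero-based column: the distance from the start of the containing line. */
    int getColumnNumber(int offset) {
        int lineStart = 0;
        for (int nl : newlineOffsets) {
            if (nl >= offset) {
                break;
            }
            lineStart = nl + 1;
        }
        return offset - lineStart;
    }

    /** Convenience for the example: index every newline in a string. */
    static LineIndexSketch forText(String text) {
        LineIndexSketch index = new LineIndexSketch();
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                index.incrementLineNumber(i);
            }
        }
        return index;
    }
}
```

    For the text `"ab\ncd\ne"`, offset 3 (the `c`) maps to line 1, column 0 in this model. Callers that display positions to users typically add one to both values.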
