com.saxonica.schema.fsa
Class FiniteStateMachine

java.lang.Object
  extended by com.saxonica.schema.fsa.FiniteStateMachine
All Implemented Interfaces:
Serializable

public class FiniteStateMachine
extends Object
implements Serializable

Class representing a finite state machine, as a collection of states with transitions between them. The finite state machine allows states with counters. The static machine created by the schema compiler contains states of type AutomatonState; these may have a minOccurs and maxOccurs value. When such a state is entered at validation time, a CountingState is dynamically created that contains a pointer to the fixed AutomatonState plus an integer counter. Any transition that goes from the state to itself causes a new CountingState to be created with an incremented value of the counter (provided it does not exceed the maxOccurs()); any transition to a different state causes the counter to be checked against the minOccurs value.

See Also:
Serialized Form

Nested Class Summary
protected static class FiniteStateMachine.TermParticlePair
           
 
Constructor Summary
FiniteStateMachine(UserComplexType type, boolean determinized)
          Create a finite state machine
 
Method Summary
 void allocateStateNumber(AutomatonState state)
          Allocate a unique number to a state, and index the state from the FSM
static NonDeterminizedState compileParticle(SchemaCompiler compiler, Particle particle, NonDeterminizedState endState, UserComplexType subjectType, FiniteStateMachine machine)
          Static method to translate a particle to a Finite State Automaton, returning the start state of the FSA.
static FiniteStateMachine determinize(SchemaCompiler compiler, FiniteStateMachine nfsa)
          Determinize the finite state machine: that is, ensure that there are no ambiguous transitions from this state.
 void display(PrintStream err)
          Display the finite state machine.
 AutomatonState getInitialState()
          Get the initial state of this finite state machine
 int getNumberOfStates()
          Determine the number of states in this finite state machine
 Wildcard getOpenContentWildcard()
          Get the open content wildcard, if any
 AutomatonState getState(int number)
          Get the state with a given unique state number
 boolean isOpenContentInterleaved()
          Ask whether the open content wildcard for this finite state machine (assuming there is one) permits interleaved content (as opposed to suffixed content only)
 void serialize(SchemaModelSerializer serializer)
          Output a reppresentation of this finite state machine
 void setInitialState(AutomatonState initialState)
          Set the initial state of this finite state machine
 void setOpenContentWildcard(Wildcard wildcard, boolean interleaved)
          Set the open content wildcard for this machine
static String subsumesMachine(FiniteStateMachine base, FiniteStateMachine sub, SchemaCompiler compiler)
          Test whether one finite state machine subsumes another FSM: that is, whether for each path through the second FSM, there is a corresponding path through the first FSM.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FiniteStateMachine

public FiniteStateMachine(UserComplexType type,
                          boolean determinized)
Create a finite state machine

Parameters:
type - the type whose content model is implemented by this finite state machine
determinized - true if this machine is determinized
Method Detail

allocateStateNumber

public void allocateStateNumber(AutomatonState state)
Allocate a unique number to a state, and index the state from the FSM

Parameters:
state - the state whose number is to be allocated

getNumberOfStates

public int getNumberOfStates()
Determine the number of states in this finite state machine

Returns:
the number of states

getInitialState

public AutomatonState getInitialState()
Get the initial state of this finite state machine

Returns:
the initial state

setInitialState

public void setInitialState(AutomatonState initialState)
Set the initial state of this finite state machine

Parameters:
initialState - the initial state

getState

public AutomatonState getState(int number)
Get the state with a given unique state number

Parameters:
number - the number of the state
Returns:
the state with this unique state number

setOpenContentWildcard

public void setOpenContentWildcard(Wildcard wildcard,
                                   boolean interleaved)
Set the open content wildcard for this machine

Parameters:
wildcard - the open content wildcard (may be null)
interleaved - true if open content may be interleaved, false if it must be suffixed

compileParticle

public static NonDeterminizedState compileParticle(SchemaCompiler compiler,
                                                   Particle particle,
                                                   NonDeterminizedState endState,
                                                   UserComplexType subjectType,
                                                   FiniteStateMachine machine)
                                            throws SchemaException,
                                                   UnresolvedReferenceException
Static method to translate a particle to a Finite State Automaton, returning the start state of the FSA. This precisely follows Henry Thompson's and Richard Tobin's algorithm, published here

Parameters:
compiler - user for error reporting
particle - the particle to be compiled, which is either an ElementDeclaration, a WildCard, a choice, or a sequence
endState - the State representing the final state of the automaton
subjectType - the complex type to which this particle belongs
machine - the finite state machine to which the state belongs
Returns:
the initial state of the new Automaton
Throws:
SchemaException - if XSD constraints are violated
UnresolvedReferenceException - if unresolved references in the schema are found

display

public void display(PrintStream err)
Display the finite state machine. This is a diagnostic method for internal use.

Parameters:
err - the output destination

subsumesMachine

public static String subsumesMachine(FiniteStateMachine base,
                                     FiniteStateMachine sub,
                                     SchemaCompiler compiler)
Test whether one finite state machine subsumes another FSM: that is, whether for each path through the second FSM, there is a corresponding path through the first FSM. The algorithm used is as published by Thompson and Tobin, XML Europe 2003.

Parameters:
base - the first FSM (representing the base type)
sub - the other FSM (representing the type that is derived by restriction, validly or otherwise)
compiler - used for error reporting
Returns:
null indicating that the first machine does indeed subsume the other; or a string indicating why it doesn't.

determinize

public static FiniteStateMachine determinize(SchemaCompiler compiler,
                                             FiniteStateMachine nfsa)
                                      throws SchemaException
Determinize the finite state machine: that is, ensure that there are no ambiguous transitions from this state. When presented with an input token, the FSA can then examine the specific transitions from this state and either choose one of them, or throw an error.

This is a revised implementation, based more closely on Aho and Ullman p93, to try to eliminate the rare problems that occur when a state has two transitions for the same symbol. (Given UPA, this should occur only with nested loops, e.g. (a{10,11}){1,3} which allows a sequence of 10, 20, 30, 11, 22, or 33 a's.)

Parameters:
compiler - the schema compiler
nfsa - the finite state machine to be determinized
Returns:
the new deterministic finite state machine
Throws:
SchemaException - if an XSD constraint is violated

getOpenContentWildcard

public Wildcard getOpenContentWildcard()
Get the open content wildcard, if any

Returns:
the open content wildcard for this finite state machine, if one is defined

isOpenContentInterleaved

public boolean isOpenContentInterleaved()
Ask whether the open content wildcard for this finite state machine (assuming there is one) permits interleaved content (as opposed to suffixed content only)

Returns:
true if the open content wildcard permits interleaved content.

serialize

public void serialize(SchemaModelSerializer serializer)
               throws XPathException
Output a reppresentation of this finite state machine

Parameters:
serializer - the schema model serializer to which this representation is to be written
Throws:
XPathException - if any error occurs during serialization


Copyright (c) 2004-2011 Saxonica Limited. All rights reserved.