This module defines an FSA class, for representing and operating on finite-state automata (FSAs). FSAs can be used to represent regular expressions and to test sequences for membership in the languages described by regular expressions.
FSAs can be deterministic or nondeterministic, and they can contain epsilon transitions. Methods to determinize an automaton (also eliminating its epsilon transitions), and to minimize an automaton, are provided.
The transition labels for an FSA can be symbols from an alphabet, as in the standard formal definition of an FSA, but they can also be instances that represent predicates. If these instances implement instance.matches(), then the FSA nextState() function and accepts() predicate can be used. If they also implement instance.complement() and instance.intersection(), the FSA can be determinized and minimized, to find a minimal deterministic FSA that accepts an equivalent language.
Instances of FSA can be created out of labels (for instance, strings) by the singleton() function, and combined to create more complex FSAs through the complement(), closure(), concatenation(), union(), and other constructors. For example, concatenation(singleton('a'), union(singleton('b'), closure(singleton('c')))) creates an FSA that accepts the strings 'a', 'ab', 'ac', 'acc', 'accc', and so on.
Instances of FSA can also be created with the compileRE() function, which compiles a simple regular expression (using only '*', '?', '+', '|', '(', and ')' as metacharacters) into an FSA. For example, compileRE('a(b|c*)') returns an FSA equivalent to the example in the previous paragraph.
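As an independent check on the language described by 'a(b|c*)', the same pattern can be run through Python's standard re module. This sketch does not use the FSA class at all; re.fullmatch is only a stand-in for an accepts() test, and the name in_language is made up for the illustration.

```python
import re

# Standard-library stand-in for the FSA; not part of this module.
PATTERN = re.compile(r'a(b|c*)')

def in_language(s):
    """Return True iff the whole string s matches a(b|c*)."""
    return PATTERN.fullmatch(s) is not None

# Strings the text lists as accepted, plus a few that should be rejected.
accepted = [s for s in ['a', 'ab', 'ac', 'acc', 'accc'] if in_language(s)]
rejected = [s for s in ['', 'b', 'abc', 'abb', 'acb'] if not in_language(s)]
```

Every string in the first list is accepted and every string in the second is rejected, which is the behavior an FSA equivalent to the pattern would be expected to show.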
FSAs can be determinized, to create equivalent FSAs (FSAs accepting the same language) with unique successor states for each input, and minimized, to create an equivalent deterministic FSA with the smallest number of states. FSAs can also be complemented, intersected, unioned, and so forth as described under 'FSA Functions' below.
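Determinization with epsilon elimination can be illustrated with the textbook subset construction. The sketch below is not this module's implementation: it works on plain (startState, endState, label) tuples with a closed alphabet, and it assumes None marks an epsilon transition. The names epsilon_closure, determinize, and accepts are hypothetical helpers for the illustration.

```python
from collections import deque

EPSILON = None  # assumption: None marks an epsilon transition in this sketch

def epsilon_closure(states, transitions):
    """All states reachable from `states` using only epsilon transitions."""
    closure, stack = set(states), list(states)
    while stack:
        state = stack.pop()
        for start, end, label in transitions:
            if start == state and label is EPSILON and end not in closure:
                closure.add(end)
                stack.append(end)
    return frozenset(closure)

def determinize(transitions, initial, finals, alphabet):
    """Subset construction; each deterministic state is a frozenset of
    original states. Returns (dfa_transitions, dfa_initial, dfa_finals)."""
    dfa_initial = epsilon_closure([initial], transitions)
    dfa_transitions, dfa_finals = {}, set()
    seen, queue = {dfa_initial}, deque([dfa_initial])
    while queue:
        subset = queue.popleft()
        if subset & set(finals):
            dfa_finals.add(subset)
        for symbol in alphabet:
            targets = [end for start, end, label in transitions
                       if start in subset and label == symbol]
            if not targets:
                continue  # no successor on this symbol
            successor = epsilon_closure(targets, transitions)
            dfa_transitions[(subset, symbol)] = successor
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return dfa_transitions, dfa_initial, dfa_finals

def accepts(dfa, sequence):
    """Run the deterministic automaton over sequence."""
    transitions, state, finals = dfa
    for symbol in sequence:
        state = transitions.get((state, symbol))
        if state is None:
            return False
    return state in finals

# A nondeterministic automaton for a(b|c*), with two epsilon transitions.
nfa = [(0, 1, 'a'), (1, 2, EPSILON), (2, 3, 'b'), (1, 4, EPSILON), (4, 4, 'c')]
dfa = determinize(nfa, initial=0, finals=[3, 4], alphabet='abc')
```

The result has a unique successor for each (state, symbol) pair and no epsilon transitions. Minimization would be a further step that merges equivalent states; it is not shown here.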
The class FSA defines the following methods.
EMPTY_STRING_FSA is an FSA that accepts the language consisting only of the empty string.
NULL_FSA is an FSA that accepts the null language.
UNIVERSAL_FSA is an FSA that accepts S*, where S is any object.
FSA is initialized with a list of states, an alphabet, a list of transitions, an initial state, and a list of final states. If fsa is an FSA, fsa.tuple() returns these values in that order, i.e. (states, alphabet, transitions, initialState, finalStates). They're also available as fields of fsa with those names.
Each element of transitions is a tuple of a start state, an end state, and a label: (startState, endState, label).
If the list of states is None, it's computed from initialState, finalStates, and the states in transitions.
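The following sketch shows one way the state list could be derived when it is not supplied. The function name infer_states and the first-seen ordering are assumptions made for the illustration; the source only says that the states are computed from initialState, finalStates, and the states in transitions.

```python
def infer_states(transitions, initial_state, final_states):
    """Collect every state mentioned anywhere, in first-seen order."""
    states = []

    def note(state):
        if state not in states:
            states.append(state)

    note(initial_state)
    for start, end, _label in transitions:  # (startState, endState, label)
        note(start)
        note(end)
    for state in final_states:
        note(state)
    return states

transitions = [(0, 1, 'a'), (1, 2, 'b'), (1, 1, 'c')]
# State 3 appears only as a final state, so it is picked up from that list.
states = infer_states(transitions, 0, [1, 2, 3])
```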
If alphabet is None, an open alphabet is used: labels are assumed to be objects that implement label.matches(input), label.complement(), and label.intersection() as follows:
- label.matches(input) returns true iff label matches input
- label.complement() returns either a label or a list of labels which, together with the receiver, partition the input alphabet
- label.intersection(other) returns either None (if no symbol is matched by both label and other), or a label that matches the set of symbols that both label and other match
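As an illustration of the three-method protocol listed above, here is a minimal sketch of a predicate label over an open alphabet of characters. The class name CharSet and its constructor are hypothetical and not part of this module; only the method names matches, complement, and intersection come from the description above.

```python
class CharSet:
    """Hypothetical label: a set of characters, optionally negated."""

    def __init__(self, chars, negated=False):
        self.chars = frozenset(chars)
        self.negated = negated

    def matches(self, symbol):
        # A negated set matches exactly the symbols outside self.chars.
        return (symbol in self.chars) != self.negated

    def complement(self):
        # A label that matches everything has nothing left to partition,
        # so the complement is an empty list of labels.
        if self.negated and not self.chars:
            return []
        return CharSet(self.chars, not self.negated)

    def intersection(self, other):
        if not self.negated and not other.negated:
            chars, negated = self.chars & other.chars, False
        elif self.negated and other.negated:
            chars, negated = self.chars | other.chars, True
        elif self.negated:
            chars, negated = other.chars - self.chars, False
        else:
            chars, negated = self.chars - other.chars, False
        if not chars and not negated:
            return None  # no symbol is matched by both labels
        return CharSet(chars, negated)

vowels = CharSet('aeiou')
not_a = CharSet('a', negated=True)
vowels_but_a = vowels.intersection(not_a)
```

Representing a negated set by the characters it excludes keeps the alphabet open: the label never needs to enumerate every possible input.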
As a special case, strings can be used as labels. If the strings 'a' and 'b' are used as labels and there's no alphabet, '~a' and '~b' are their respective complements, and '~a&~b' is the intersection of '~a' and '~b'. (The intersections of 'a' and 'b', 'a' and '~b', and '~a' and 'b' are, respectively, None, 'a', and 'b'.)
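The string convention can be sketched with two small helpers. These are hypothetical functions written for illustration, not the module's own code, and they assume simple labels only: a plain string or a single '~'-prefixed string. Compound labels such as '~a&~b' are produced but not accepted as input.

```python
def complement_label(label):
    """'a' -> '~a' and '~a' -> 'a' (simple labels only in this sketch)."""
    return label[1:] if label.startswith('~') else '~' + label

def intersect_labels(a, b):
    """Intersection of two simple string labels, or None if it is empty."""
    a_neg, b_neg = a.startswith('~'), b.startswith('~')
    if a_neg and b_neg:
        # Everything except both excluded symbols.
        return a if a == b else a + '&' + b
    if a_neg:
        # b is a single symbol; it survives unless a excludes exactly b.
        return None if a[1:] == b else b
    if b_neg:
        return None if b[1:] == a else a
    # Two plain symbols overlap only if they are the same symbol.
    return a if a == b else None
```

These helpers reproduce the four cases given in the paragraph above: None, 'a', 'b', and '~a&~b'.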
Design Goals:
Non-Goals: