S() = 1 + a S(1) + c S(2) (S1)
S(w 1) = b S(w) + a S(w 1 1) + c S(w 1 2)
S(w 2) = d S(w) + a S(w 2 1) + c S(w 2 2)
corresponding to the expression:
S = 1 + a S b S + c S d S (E1)
Also, as mentioned in the first article, there is a situation here that is
somewhat reminiscent of what goes on in Quantum Field Theory. You have a
state, S(), which looks like a ground state in Quantum Physics, and you
have what look like Creation operators:
<1|: S(w) -> S(w 1)
<2|: S(w) -> S(w 2)
To denote the ground state, S(), we'll use a third operator <0| and make
the following correspondence:
<0| <x1| ... <xn| S <-> S(x1 ... xn)
We'll formally define the Annihilation operators by the following
property:
Orthogonality <i| |i> = 1
<i| |j> = 0 if i is not equal to j (i, j = 0, 1, 2)
and assume that these operators commute with the symbols of the underlying
alphabet, X:
Commutativity x <i| = <i| x, x |i> = |i> x.
Then the grammar presented in S1 can be rewritten as a single equation:
S = |0> + |1> b S + |2> d S + <1| a S + <2| c S (S2)
For instance, the first equation in S1 can be recovered by applying the
ground state operator <0| to both sides:
<0| S = <0| |0> + <0| |1> b S + <0| |2> d S + <0| <1| a S + <0| <2| c S
which, after simplification, becomes:
<0| S = 1 + 0 b S + 0 d S + <0|<1| a S + <0|<2| c S
= 1 + a <0|<1| S + c <0|<2| S
thus:
S() = 1 + a S(1) + b S(2)
(3.2) Quantum Grammars
To the list of basic items used to build regular expression over an
alphabet, X:
(0) 0 ------- to denote the empty set.
(1) 1 ------- to denote the empty string in X*.
(2) x ------- an element of X.
(3) [A] ----- denotes the regular expression 1 + A.
(4) A+ ------ denotes the regular expression A A*.
(5) A* ------ ITERATION (the set 1 + A + A A + A A A + ...)
(6) A + B --- ALTERATION (the union of A and B)
(7) A B ----- CONCATENATION
(the difference operator is not being included here), we'll add the following
items:
(6) Creation operators <m|, m = 0, 1, 2, ..., M
(7) Annhiliation operators |m>, m = 0, 1, 2, ..., M
and write the following postulates.
* Completeness: |0><0| + |1><1| + |2><2| + ... |M><M| = 1
* Orthogonality: <m| |n> = <m|n> = 1 if m = n, = 0 otherwise
* Commutativity: <m| x = x <m|, |m> x = x |m>
where x is any input symbol in X.
A degree M Quantum Grammar is then defined as a system of equations of the
following form:
Definition: CS[M] Quantum Grammar
A CS[M] Index Grammar is a system of equations in q variables,
Q0, Q1, ..., Q{q-1} over an input alphabet X and operator sets
{ |0> ... |M> }, { <0| ... <M| }, of the forms:
Qi = Sum of terms of the following forms:
(1) |0>
(2) |0> a <0| <y1| ... <yn| Q'
(3) |y> a <y1| ... <yn| Q'
(4) a <y1| ... <yn| Q'
where
the y's are in the set { 1, ..., m },
a is in the set X,
Q' is in the set {Q0, Q1,..., Q{q-1}).
It's not actually necessary to include terms of type (4) since they can be
converted to terms of types (2) and (3) by using the Completeness Property:
a <y1| ... <yn| Q' = |0> a <0| <y1| ... <yn| Q'
+ |1> a <1| <y1| ... <yn| Q'
...
+ |M> a <M| <y1| ... <yn| Q'
In view of these developments, the correspondence between CS[M] Index
Grammars and CS[M] Quantum Grammars should be clear. They're really the
same thing represented in slightly different notation: the indexes are
now represented as operators.
It is also possible to define a Quantum Grammar as a system of equations that
allows terms of the form:
(2) |0> <0| <y1| ... <yn| Q'
(3) |y> <y1| ... <yn| Q'
(4) <y1| ... <yn| Q'
in analogy to lambda-PDA's, but as is the case with lambda-PDA's, the
resulting definition would actually be equivalent.
(3.3) Context Free Expressions
With the addition of these new operators, you now have the ability to
work with a more powerful class of expressions, known as the Context Free
Expressions. The definition is as follows:
Definition: CS[M] Context Free Expression
A regular expression over the alphabet:
X union { <m|: m = 0, ..., M } union { |m>: m = 0, ..., M }
where the Orthogonality, Commutativity and Completeness are
assumed to apply to the operators <m|, |m>, is called a Context
Free Expression if it has the form:
<0| E |0>
and if the only appearance of the operators <0| and |0> is in the
combination |0><0|.
Any other regular expression over this alphabet that can be reduced to this
form, using the Orthogonality, Commutativity and Completeness axioms will
also be called a CS[M] Context Free Expression.
Example:
<0| (a<1| + x + b|1>) (a<1| + x + b|1>) |0>
= (a <0 1| + x <0| + b <0|1>) (a <1|0> + x |0> + b |1 0>)
= (a <0 1| + x <0|) (x |0> + b |1 0>)
= a x <0 1|0> + x x <0|0> + a b <0 1|1 0> + x b <0|1 0>
= x x + a b
where I'm using the notations:
<a b| = <a| <b|
and <a|b> = <a| |b>
<0| <1| <1| (a |1>)* |0>
= <0 1 1| (1 + a |1> + a a |1 1> + a a a |1 1 1> (a |1>)*) |0>
= <0 1 1|0> + <0 1 1|1 0> a + <0 1 1|1 1 0> a a + <0 1 1|1 1 1> (a|1>)*|0>
= 0 + 0 + 1 a a + 0
= a a
It should be clear that a Context Free Expression can be converted into
a Quantum Grammar using the very same techniques that are used to convert
regular expressions into finite automata.
(3.4) Context Free Grammar = Recursive System
The expression:
<0| (a<1| + b|1>)* |0>
is of particular interest. If you work out the various components of this
expression, <0| (a<1| + b|1>)^n |0> for n >= 0, you'll soon find that the
only non-0 terms are those for which the a's and b's are properly nested.
This corresponds to the language defined by the recursive equation:
S = 1 + a S b S
and this provides an example demonstrating that context free expressions can
be used to solve recursive systems of equations.
Other examples of context free expressions and the recursive systems that
they are solutions to follow:
(1) S = y0 + a S a y1 + b S b y2
S = <0| (a<1| + b<2|)* y0 (|1>a y1 + |2>b y2)* |0>
(2) S = 1 + a S b + S S
S = <0| ( (a<1|)* |1>b)* |0>
(3) S = y0 + a S b y1 + S S
S = <0| ( (a<1|)* y0 |1>b y1 )* |0>
(4) S = y0 + a S b y1 + S S y2
S = <0| ( <2|* (a<1|)* y0 (|1>b y1 + |2> y2) )* |0>
(5) S = x + a S b + S S
S = <0| (a<1| + x + |1>b)* |0>
(6) S = { a^n b^n: n >= 0 }
S = <0| (a<1|)* (|1>b)* |0>
Formally define a recursive system of equations over alphabet X in variables
{Q0, ..., Q{q-1}} to be a system of the following form:
Qi = regular expression over X union { Q0, ..., Q{q-1} }.
You should be able to see that this is just a fancy way of defining a Context
Free Grammar, where the notation is almost identical to EBNF (extended BNF)
notation. Therefore, I will use the term EBNF to refer to recursive systems
of equations, as well. The variable, Q0, is defined as the "sentence symbol",
which represents the main variable with respect to which you're trying to
solve for. The language "generated" by this system is the solution to this
system in terms of Q0 or ... in light of previous discussion ... the Least
Fixed Point solution to this system.
Any method used to solve recursive systems of equations in terms of
context free expressions, or alternatively, in terms of Quantum Grammars,
will therefore be the basis for a parser generater.
(3.5) Converting Recursive Systems into Quantum Grammars.
The following example:
S = T (p T)*.
T = F (m F)*.
F = a S b + x.
alphabet X = { p, m, a, b, x }
will be converted to a Quantum Grammar to illustrate the general method.
Step 1: Use the regular expression -> NFA method outlined in the first
article to generate a recursive system of "linear" equations:
S = T v0, v0 = p v1 + 1, v1 = T v0
T = F v2, v2 = m v3 + 1, v3 = F v2
F = a v4 + x v6, v4 = S v5, v5 = b v6, v6 = 1.
Step 2: Group all the states coming from the equation Q = ... into the
class s(Q). In this example, the partition is as follows:
s(S) = { S, v0, v1 }
s(T) = { T, v2, v3 }
s(F) = { F, v4, v5, v6 }
Step 3: Insert the Creation and Annihilation operators
For each term of the form
v = ... + n v' + ..., where n is in the set N
introduce a new operator pair <m| and |m> (starting with m = 1) and
make the following transformations:
(a) Change n v' into <m| n
(b) For each state, v, in the set s(n), if 1 appears on the
right hand side of v's equation, then add the term |m> v' to the
right hand side of the equation.
By following this recipe, the above system is transformed into:
S = <1| T, v0 = p v1 + 1 + |5> v5, v1 = <2| T
T = <3| F, v2 = m v3 + 1 + |1> v0 + |2> v0, v3 = <4| F
F = a v4 + x v6, v6 = 1 + |3> v2 + |4> v2, v4 = <5| S, v5 = b v6
Step 4: Make the following transformation:
(A) for all states, v, if 1 appears on the right hand side of v's
equation:
(a) remove it.
(b) if v is in the partition s(S), add the term |0>
(B) add in a new start state S' with the single equation:
S' = <0| S
The result, applied to the example above is:
S' = <0| S
S = <1| T
v0 = p v1 + |0> + |5> v5
v1 = <2| T
T = <3| F
v2 = m v3 + |1> v0 + |2> v0
v3 = <4| F
F = a v4 + x v6
v4 = <5| S
v5 = b v6
v6 = |3> v2 + |4> v2
(3.6) Optimizations
This is not the most efficient way to make the transformation above,
because more operator symbols than are needed are introduced. In other
words, the method presented above will generate a CS[m] Quantum Grammar for
values of m much larger than optimal. Therefore there will exist a set of
optimization methods that can be applied to reduce m. Since the problem of
reducing m to its absolute minimum is unsolveable, it follows that no
system of optimization rules that you come up with will be complete. You
can make them efficient, but there will always be room for improvement.
Here are a few examples of optimizations that can be done:
(i) Removing uninformative operators
The symbols <1| and <2|, <3| and <4| can be merged because each pair
derived from terms of the same form (T v0, and F v2 respectively). The
result is, after renaming <3| and |3> to <2| and |2>, and <5| and |5> to
<3| and |3> is this:
S' = <0| S
S = <1| T
v0 = p v1 + |0> + |3> v5
v1 = <1| T
T = <2| F
v2 = m v3 + |1> v0
v3 = <2| F
F = a v4 + x v6
v4 = <3| S
v5 = b v6
v6 = |2> v2
(ii) Performing Equivalence Tests
States that are easily determined to be equivalent can be merged (S and v1,
T and v3):
S' = <0| S
S = <1| T
v0 = p S + |0> + |3> v5
T = <2| F
v2 = m T + |1> v0
F = a v4 + x v6
v4 = <3| S
v5 = b v6
v6 = |2> v2
There's plenty of other optimizations that apply that I won't get into, but
will let you discover if you're interested. The example used here can
actually be reduced to the following degree-1 Quantum Grammar:
S' = <0| S
S = a<1| S + x T
T = |1>b T + p S + m S + |0>
and thus to the following degree-1 CFE:
S' = <0| ( (a<1|)* x (|1>b)* (p + m) )* (a<1|)* x (|1>b)* |0>
the operators <1| and |1> being needed only to ensure the proper pairing
of a and b throughout. This is optimal.