Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Context Free Expressions

4 views
Skip to first unread message

Mark Hopkins

unread,
Jul 28, 1995, 3:00:00 AM7/28/95
to
This article is a continuation of the series:

The Spinor Representation of Formal Languages
The Formal Language Representation of Algebras
The Equational Representation of Formal Languages

previously posted here. By applying the axiom

State = Expression

we arrived at the correspondences

Automaton = System of Equations

in which every automaton is represented by a (possibly infinite) system
of right linear equations involving language expressions. As a further
consequence we established a link to the correspondence:

Language = Expression = State

develped in the first two articles. This, then, provides the basis (and
explanation) of the link between languages and automnata.

The case where the automaton class is restricted to push down automata
will be looked at here. The corresponding languages are known as the
Context Free Languages. Therefore the corresponding expressions will be
called the Context Free Expressions.

A Push Down Automaton is an Infinite State Automaton (defined in the
previous article) with the following restrictions:

* Q = Q0 x S*, S = stack alphabet. Q0 finite
* F subset of Q0 x { 1 }
* Only transitions (q, w t) x |- (q, w y)
allowed for t in S, y in S*.
* If (q, w t) x |- (q, w y)
then (q, w' t) x |- (q, w' y)

which will therefore correspond to an infinite system of equations in the
following form:

q(1) = \q + union ( x q'(y) : (q, 1) x |- (q', y) )
q(w t) = union ( x q'(w y): (q, w t) x |- (q', w y) )

with one equation given, identical in form to the second, for all w in S*.
The term \q is defined as follows:

\q = 1 if q is a final state (i.e. (q, 1) is in F).
= 0 else

Define the following state operators:

un: q(w) |--> q(w n)
vn: q(w n) |--> q(w)
q(w n') |--> 0 if n, n' distinct
q(1) |--> 0
p0: q(1) |--> q(1)
q(w n) |--> 0

where n and n' range over the set S. These operators form an algebra
satisfying the identities:

vn un' = 1 if n = n'
= 0 otherwise
p0 un = 0 = vn p0
p0^2 = p0

p0 + u1 v1 + ... + us vs = 1

where 1 is also being used to denote the identity operator. We can then
write each set of equations as follows

p0 q(1) = \q + sum x u(y) p0 q'(1) : (q, 1) x |- (q', y)
ut vt q(w t) = sum x u(y) vt q'(w t): (q, w t) x |- (q', w y)

where y is in S* and the notation un is extended as follows:

u(1) = 1
u(t) = ut, for t in S.
u(s1 s2 ... sn) = u(s1) u(s2) ... u(sn), for si in S.

Since p0 q(w t) = 0 = ut vt q(1) this system can be generalised to the
following form:

p0 q(w) = \q + sum x u(y) p0 q'(w): (q, 1) x |- (q', y)
ut vt q(w) = sum x u(y) vt q'(w): (q, w t) x |- (q', w y)

Using the property:

p0 + u1 v1 + ... + us vs = 1

this yields a single equation for each state q(w):

q(w) = \q + sum x u(y) p0 q'(w): (q, 1) x |- (q', y)
+ sum x u(y) vt q'(w): (q, w t) x |- (q', w y)


By enumerating the stack states as follows:

1 --> 0
w t --> sw + t for w in S*, t = 1, 2, ..., s.

we can represent each state set { q(w): w in S* } as a vector. Then
the operators un, vn, p0 will have the matrix realization given in the
article "The Spinor Representation of Formal Languages". Defining

q = (q(0), q(1), q(2), ... )'

(where R' denotes the transpose of the row vector R), then the system
above can be more compactly written in the form:

q = \q + sum x u(y) p0 q': (q, 1) x |- (q', y)
+ sum x u(y) vt q': (q, w t) x |- (q', w y)

This is a FINITE system of right linear equations of the form:

q = 1 + ... if q is a final state
q = ... + x u(y) p0 q' if (q, 1) x |- (q', y)
q = ... + x u(y) vt q' if (q, w t) x |- (q', w y)

where
x is in X union {1} (X = input alphabet)
and t is in S, y in S* (S = stack alphabet)

Being finite, it can be solved as a regular expression over the
algebra:
X* x { p0, u1, v1, u2, v2, ..., un, vn }

This regular expression is called a Context Free Expression.

By identifying x with x I (where I is the identity matrix), we can add
in the relations:

un x = x un, vn x = x vn, p0 x = x p0

Thus we arrive at the following theorem:

The Fundamental Theorem of Context Free Expressions --

Every Context Free Language over an alphabet X can be represented
as a regular expression over the algebra generated by:

X union { p0, u1, v1, u2, v2, ..., un, vn }

defined by the following relations:

vn un' = 1 if n = n'
= 0 otherwise
p0 un = 0 = vn p0
p0^2 = p0

p0 + u1 v1 + ... + us vs = 1

x un = un x, x vn = vn x, x p0 = p0 x

This will be called the Context Free Algebra. It is equivalent to the
algebra:

X* x { p0, u1, v1, u2, v2, ..., un, vn }*

An interesting question is whether the converse is true: that any
regular expression over this algebra represents a context free language
(answer: yes, if the expression can be written in the form p0 E p0. This
was established in the first article in the series).


This presentation can be made more uniform by adopting the fiction
that
p0 = u0 v0

Then the relations above will follow from the relations:

vn un' = 1 if n = n', 0 otherwise
u0 v0 + u1 v1 + ... + un vn = 1.
un x = x un, vn x = x vn
since
p0^2 = (u0 v0)(u0 v0) = u0 (v0 u0) v0 = u0 v0 = p0

p0 un = (u0 v0) un = u0 (v0 un) = 0 for n > 0
vn p0 = vn (u0 v0) = (vn u0) v0 = 0 for n > 0

p0 + sum(un vn: n > 0) = u0 v0 + sum(un vn: n > 0) = 1

p0 x = u0 v0 x = u0 x v0 = x u0 v0 = x p0


This algebra can be realised using tensor space over V* of an
(s+1)-dimensional vector space V (recall, s = the size of the stack alphabet)
by making the idenfications:

un = |n> vn = <n|

where |n> is the nth element of the V and vn the adjoint basis element of V+.
The quotient is taken with respect to the relations:

<n| |n'> = 1 if n = n', 0 otherwise (R)
|0><0| + |1><1| + ... + |n><n| = 1

By taking the product of of the algebras:

X* x (V*/R)

we arrive at an alternate representation of Context Free Expressions closely
related to that provided in the first article in the series.

Another interesting question is what if we take either of the products:

X* x Y* x (V*/R)

where X and Y are two finite alphabets, or:

X* x (V*/R) x (W*/S)

where V*/R and W*/S are two realisations of the Context Free Algebra. In
the first case, we get a representation for a automaton with two alphabets
(which may be interpreted as an input and output alphabet), and in the
second case an automaton with two stacks (which may be interpreted as a
Turing Machine). This leads to the basis of a notation for more general kinds
of expressions (Translation Expressions, Turing Expressions respectively).

0 new messages