Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

A Formal Language Representation of Algebras

4 views

Skip to first unread message

Mark Hopkins

unread,

Jul 26, 1995, 3:00:00 AM7/26/95

This is a followup to "The Spinor Representation of Formal Languages",
posted previously in comp.theory.

Several years ago I developed a method of representing formal languages
that makes use of infinite state automata. For instance, the Dyck
language D(p, q) can be represented as follows:

-- p --> -- p --> -- p --> ...
--> [[0]] [1] [2] [3]
<-- q -- <-- q -- <-- q -- ...

and the intersection of the Dyck languages D(p, q) and D(b, d) as follows:

p p p
----> ----> ---->
--> [[0,0]] [0,1] [0,2] ...
| ^ <---- | ^ <---- | ^ <----
| | q | | q | | q
b | | d b | | d b | | d
| | p | | p | | p
v | ----> v | ----> v | ---->
[1,0] [1,1] [1,2] ...
| ^ <---- | ^ <---- | ^ <----
| | q | | q | | q
b | | d b | | d b | | d
| | p | | p | | p
v | ----> v | ----> v | ---->
[2,0] [2,1] [2,2] ...
| ^ <---- | ^ <---- | ^ <----
| | q | | q | | q
...

This notation can be generalised so that it can be used for all rings
(including linear algebras) not just the semi-ring comprising language
expressions. In the process one will see that it is also a generalization
of the cycle notation used for permutations.

For instance, one could write the cycle (1 2 3) as:

[1] --> [2] --> [3]
^ |
| |
*---------------*

which is a union of the state transitions ([1] -> [2]), ([2] -> [3]), and
([3] -> [1]).

More generally, each automaton will have a direct correspondence to a
matrix with:

[m] -- a --> [n] <==> The matrix whose m-n entry is a
all other entries 0.

For unlabeled arcs, the value a = 1 is understood. Also, by convention,
multiplication by scalars will be understood to apply arc-wise as in the
following examples:

p ([0] --> [1] -- a --> [2]) = [0] -- p --> [1] -- pa --> [2]

([0] --> [1] -- a --> [2]) p = [0] -- p --> [1] -- ap --> [2]

For formal languages, scalars are taken over the semi-ring comprising
language expressions. But generally, they can be taken over any semi-ring
and (thus) any ring.

The rule for multiplying automata is essentially the same as the
corresponding rule for multiplying permutations in cycle notation:

The coefficient, a, of the transition [m] -- a --> [n]

is the sum of all products bc over all paths

[m] -- b --> [p] -- c --> [n]

where the m-p transition is from the first automaton and the
p-n transition is from the second.

Examples:
(1) --> [0] [1] -- a -->
[0] [1] X | ^ | ^ = [0] [1]
<-- | | | | <-- a --
*---* *---*
a a

(2) [0] -- a --> [1] -- b --> [2] X [2] -- c --> [0] -- d --> [1]

= [1] -- bc --> [0]

(3) [0] [1] -- a --> -- a -->
| ^ | ^ X [0] [1] = [0] [1]
| | | | <-- b -- <-- b --
*---* *---*

In the third example, the term on the left behaves as the identity with
respect to the set of states { [0], [1] }. In 2 x 2 matrix form, this takes
the form of the identity matrix.

A more complex automaton is then understood as the union of all the
automata constructed from the individual state transitions. For instace,
the automaton:
-- a -->
[0] [1]
<-- b --
will correspond to the matrix

A = / 0 a \
\ b 0 /

To recover the conventional interpretation of an automaton, one will
assume that an automaton, A, with its start state and final states
specified will correspond to the RTC of the corresponding matrix operating
on a vector to the right corresponding to the set of final state, and a
vector to the left corresponding to the start state. For instance, the
automaton:
-- a -->
--> [0] [[1]]
<-- b --

with start state [0] and final state [1] will correspond to the expression:

(1 0) / 0 a \* / 0 \
\ b 0 / \ 1 /

= (1 0) / (ab)* b(ab)* \ / 0 \
\ a(ba)* (ba)* / \ 1 /

= ( (ab)* a(ba)* ) / 0 \
\ 1 /

= a(ba)*

Thus, an automaton is not just a representation of a language expression,
but IS the language expression!

As a more complex example, consider the infinite automaton:

-- p --> -- p --> -- p --> ...
--> [[0]] [1] [2] [3]
<-- q -- <-- q -- <-- q -- ...

with [0] as both the start state final state. This is the enumeration of
the states of the following push down automaton:

Input Alphabet: p, q
Stack Alphabet: a
State Set: S = start state = final state
Transitions:
(empty) S p |- a S
a S p |- a a S
a S q |- S

where state [n] represents state S with n symbols on the stack. By the
notational convention just established, this automaton is equal to:

/ -- p --> -- p --> -- p --> ... \ *
<0| | [0] [1] [2] [3] | |0>
\ <-- q -- <-- q -- <-- q -- ... /

where <0| is the row vector (1 0 0 ...) and |0> the adjoint column vector.
The term inside the parentheses can be factored into the union:

[0] -- p --> [1] -- p --> [2] -- p --> [3] ...
+ [0] <-- q -- [1] <-- q -- [2] <-- q -- [3] ...
or
p ( [0] ---> [1] ---> [2] ---> [3] ... )
+ q ( [0] <--- [1] <--- [2] <--- [3] ... )

In matrix form the terms in the parentheses are respectively:

/ \ / \
| 0 1 0 0 0 ... | | 0 0 0 0 0 ... |
| 0 0 1 0 0 ... | | 1 0 0 0 0 ... |
v = | 0 0 0 1 0 ... | u = | 0 1 0 0 0 ... |
| 0 0 0 0 1 ... | | 0 0 1 0 0 ... |
| 0 0 0 0 0 ... | | 0 0 0 1 0 ... |
... ...

So the automaton takes the form:

<0| (p v + q u)* |0>

which is the 0-0 component of the matrix:

/ \ *
| 0 p 0 0 0 ... |
| q 0 p 0 0 ... |
(p v + q u)* = | 0 q 0 p 0 ... |
| 0 0 q 0 p ... |
...

/ \
| S pT p s* pU p s* p t* pV ... |
| SqS T s* pU s* p t* pV ... |
= | SqSqS SqT U t* pV ... |
| SqSqSqS SqSqT SqU V ... |
...

where S = 1 + p S q S
T = 1 + p S q T + s T s = q p
U = 1 + p S q U + t U t = q s* p
V = 1 + p S q V + u V u = q t* p
...

The 0-0 component is the set S, which is the Dyck language D(p, q), which
(of course) is none other than the language recognized by the push down
automaton.

However, this notation doesn't just apply to the semi-ring comprising
language expression where addition is interpreted as the union operation. It
can be used for any ring and, indeed, any linear algebra. For instance, one
way of representing the Dirac Algebra (in terms of Quaternions) would be as
follows:

gamma_0: [0] [1] gamma_1: -- i -->
| ^ | ^ [0] [1]
| | | | <-- i --
*---* *---*
+1 -1

with gamma_2 and gamma_3 defined similarly in terms of j and k respectively.

For instance:
-- i --> -- j -->
gamma_1 gamma_2 = [0] [1] X [0] [1]
<-- i -- <-- j --

[0] [1]
= | ^ | ^
| | | |
*---* *---*
k k

where ij = k. Since ji = -k, it follows that

[0] [1]
gamma_2 gamma_1 = | ^ | ^ = -gamma_1 gamma_2
| | | |
*---* *---*
-k -k

Any other matrix representation for the Dirac algebra will likewise yield
an automaton representation and vice versa.

As another example, if [n] denotes the nth harmonic of the Simple Harmonic
Oscillator, then the creation and annihilation operators a+ and a take on the
following forms:

a+ = [0] - sqrt(1) -> [1] - sqrt(2) -> [2] - sqrt(3) -> [3] ...

a = [0] <- sqrt(1) - [1] <- sqrt(2) - [2] <- sqrt(3) - [3] ...

analogous to the v and u operators defined previously.

0 new messages