Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Need Graph Isomorphism Algorithm De-bunked

6 views
Skip to first unread message

Bill Cox

unread,
Sep 22, 2006, 10:56:20 AM9/22/06
to
I thought this forum might be a good place to post, since graph
isomorphism is commonly suggested for use in cryptography.

I've written a graph isomorphism program that determines if two graphs
are isomorphic. It runs in O(N^3).

Since graph isomorphism is a long studied problem with (AFAIK) no known
polynomial time solution, my algorithm is more than likely flawed. Any
help finding the flaws would be appreciated! The web site is

http://www.billrocks.org/ideas

Thanks!

Message has been deleted

Bill Cox

unread,
Sep 22, 2006, 1:55:59 PM9/22/06
to
\> > I've written a graph isomorphism program that determines if two

graphs
> > are isomorphic. It runs in O(N^3).
> >
> > Since graph isomorphism is a long studied problem with (AFAIK) no known
> > polynomial time solution, my algorithm is more than likely flawed. Any
> > help finding the flaws would be appreciated! The web site is
> >
> > http://www.billrocks.org/ideas
>
> Your problem is the hash - what happens if, by very low chance, you
> encounter a collision? Then the algorithm fails.

Thanks for the reply!

This is an interesting point, and please keep trying to find more flaws
(which are most likely there). I think is a complication, but not a
killer. Given two truly random graphs of any significant size (say, N
> 100) with the same number of nodes and edges, I agree that the algorithm is far more likely to report a false positive than a correct result.

However, I've got a great function that is perfect for comparing large
random graphs:

bool compareRandomLargeGraphs(Graph G1, Graph G2)
{
return false;
}

Two truly random graphs of any significant size will realistically
never match.

There are plenty of algorithms that CAN fail, but realistically wont.
RSA is a good example. It tests primeness of it's two secret prime
numbers using a probabilistic algorithm. The important thing (I think)
with these kinds of algorithms is choosing how much risk we can live
with. The prototype code I wrote uses all 32-bit numbers, which makes
collisions fairly likely. In the real world, we'd need hash functions
of 64-bits or more. I begin to be comfortable somewhere around 256
bits. Even so, given two truly random graphs, it's more likely that
we'll find a collisions, than it is for two random graphs of any
significant size to actually be isomorphic.

As a real-world example (the one that got me thinking about this),
consider LVS (layout versus schematic) algorithms. In this case, we
try to show that two graphs representing circuits are the same. If the
user made no mistakes in implementing his circuit in silicon, the
graphs will match. If the algorithm says they don't match, then
there's an error the user has to fix. If the algorithm says the graphs
match, and if our hash function is good and uses many bits (say 256),
then the probability of being wrong should be something like 1/(2^256).
I can live with that. Also, we can easily check the graph
isomorphism, and I already included such a check in the code.

Since detecting a failure is easy, the algorithm could be enhanced to
re-randomize the input graphs, and try again. The chance of running
into yet another collision becomes vanishingly small.

Scott Fluhrer

unread,
Sep 22, 2006, 4:10:52 PM9/22/06
to

"Sebastian Gottschalk" <se...@seppig.de> wrote in message
news:4nif23F...@news.dfncis.de...
> Bill Cox wrote:
>
>> I thought this forum
>
> This is no forum, this is Usenet.

>
>> I've written a graph isomorphism program that determines if two graphs
>> are isomorphic. It runs in O(N^3).
>>
>> Since graph isomorphism is a long studied problem with (AFAIK) no known
>> polynomial time solution, my algorithm is more than likely flawed. Any
>> help finding the flaws would be appreciated! The web site is
>>
>> http://www.billrocks.org/ideas
>
> Your problem is the hash - what happens if, by very low chance, you
> encounter a collision? Then the algorithm fails.

Actually, that does not appear to be a fatal problem. Suppose you replace
the hashes with universal hashes (based on a randomly selected key). Then,
this is a randomized algorithm that has a probability of failing (that is,
reporting two different graphs as being identical), but as long as this
probability is bounded away from 1, this would show the graph isomorphism
problem is in coRP, which (AFAIK) is not currently known.

Of course, you'd still need to prove that a) the probability of a collision
somewhere in the run of the algorithm is bounded away from 1, and b) if
there isn't a collision, the algorithm always reports the correct result.
(a) appears to be fairly straightforward (given a few algorithm tweaks), (b)
looks like it's the tricky point.

--
poncho


Francois Grieu

unread,
Sep 22, 2006, 4:21:14 PM9/22/06
to
I paraphrase/modify/improve/pervert the algorithm
given by Bill Cox.

A graph G with n vertices is expressed as an n*n bit
matrix, G[u,v] set meaning that an edge exists from
node u to node v.

We want a computable graph hash invariant under graph
isomorphism; that is, such that for any u,v, exchanging
line u with line v and column u with column v in G
leaves the hash unchanged; but it is computationally
impossible to exhibit two non-isomorphic graphs with
the same hash; therefore such that, in a sense, comparing
hashes is computationally equivalent to testing for graph
isomorphism.

Select a cryptographic hash with the desired output
width w, say SHA-w with w=256.

Have two arrays A and B of n variables each w-bit wide,
and an array P of n variables each at least
ceil(log2(n))-bit wide.


Setup:
- for all u, 0<=u<n
- - set A[u] = 0
- - set P[u] = u

Compute:
- repeat n times
- - for all u, 0<=u<n
- - - initialize the cryptographic hash
- - - for all b, 0<=v<n
- - - - if G[P[u],P[v]] is set
- - - - - enter A[v] into the cryptographic hash
- - - set B[u] to the result of the cryptographic hash
- - set A = B
- - sort P such that A[P[u]] is nondecreasing

Finish:
- initialize the cryptographic hash
- for all u, 0<=u<n
- - enter A[v] into the cryptographic hash
- output the result of the cryptographic hash


It is left as an exercise to the reader / conjectured
that if two graphs produce the same hash, either they
are isomorphic, or two different inputs to the
cryptographic hash produced the same output during
the course of the algorithm.

The cost of the computation is O(n^3 * w)


In practice it seems the cost can be reduced greatly
with a few simple tricks outlined below
- assumming n <= 2^w, replace the first iteration of the
outer loop in Compute by setting A[u] to the number of
bits set in line G[u], then sorting P accordingly
- when sorting P, only sort the subintervals where
A[P[u]] was constant in the previous iteration of
the outer loop of Compute
- rather than entering in the cryptographic hash several
identical values of A belonging to the same subinterval
of P such that A[P[u]] was constant in the previous
iteration of the outer loop of Compute,
enter this value of A once, followed by the
number of such identical values on at least
ceil(log2(n)) bits
- perform only d+1 iterations of the outer loop in Compute,
where d is the maximum number of edges along the shortest
path between any two connected vertices
- if more than half of the bits are set in G, first
complement G, compute the hash, and complement the
overall result
- if the graph is undirected, that is when G is symetric
along the diagonal, change 0<=v<n to 0<=v<=u


François Grieu

Francois Grieu

unread,
Sep 22, 2006, 4:41:43 PM9/22/06
to
I paraphrase/modify/improve/pervert the algorithm
given by Bill Cox.

A graph G with n vertices is expressed as an n*n bit
matrix, G[u,v] set meaning that an edge exists from

vertex u to vertex v.

We want a computable graph hash invariant under graph
isomorphism; that is, such that for any u,v, exchanging
line u with line v and column u with column v in G
leaves the hash unchanged; but it is computationally
impossible to exhibit two non-isomorphic graphs with
the same hash; therefore such that, in a sense, comparing
hashes is computationally equivalent to testing for graph
isomorphism.

Select a cryptographic hash with the desired output
width w, say SHA-w with w=256.

Have two arrays A and B of n variables each w-bit wide,
and an array P of n variables each at least
ceil(log2(n))-bit wide.


Setup:
- for all u, 0<=u<n
- - set A[u] = 0
- - set P[u] = u

Compute:
- for all j, 0<=j<n


- - for all u, 0<=u<n
- - - initialize the cryptographic hash
- - - for all b, 0<=v<n
- - - - if G[P[u],P[v]] is set
- - - - - enter A[v] into the cryptographic hash
- - - set B[u] to the result of the cryptographic hash
- - set A = B
- - sort P such that A[P[u]] is nondecreasing

Finish:
- initialize the cryptographic hash
- for all u, 0<=u<n

- - enter A[u] into the cryptographic hash


- output the result of the cryptographic hash

It is left as an exercise to the reader / conjectured
that if two graphs produce the same hash, either they
are isomorphic, or two different inputs to the
cryptographic hash produced the same output during
the course of the algorithm.

Rationale; A[u] is representative of the structure
of a graph encompassing at least j step from a.


François Grieu
[reposted with cosmetic correction, and rationale]

Bill Cox

unread,
Sep 22, 2006, 4:55:59 PM9/22/06
to
> > Your problem is the hash - what happens if, by very low chance, you
> > encounter a collision? Then the algorithm fails.
>
> Actually, that does not appear to be a fatal problem. Suppose you replace
> the hashes with universal hashes (based on a randomly selected key). Then,
> this is a randomized algorithm that has a probability of failing (that is,
> reporting two different graphs as being identical), but as long as this
> probability is bounded away from 1, this would show the graph isomorphism
> problem is in coRP, which (AFAIK) is not currently known.

I'm still a bit confused about co-classes. I looked up RP at the fount
of all knowledge, Wikipedia :-)
Could you explain why this would be coRP, rather than RP? Thanks.

> Of course, you'd still need to prove that a) the probability of a collision
> somewhere in the run of the algorithm is bounded away from 1, and b) if
> there isn't a collision, the algorithm always reports the correct result.
> (a) appears to be fairly straightforward (given a few algorithm tweaks), (b)
> looks like it's the tricky point.

I may be getting just a bit confused by the term 'collision'. One kind
of collision is when the actual hash fuction gets the same output even
when the two values I feed it are different. This can be minimized by
using a strong hash function, like SHA-256. Then, there's graph-level
collisions, where the nodes all wind up with the same hash values, even
though the graphs are different, and there was no hash-collision.
Let's call those graph-collisions.

Graph collisions are the part that worry me. Let's face it, chances
are VERY slim that I just happened to luck into a good algorithm here.
In reality, it probably doesn't work. However, if it is true that all
non-isomorphic graphs can be differentiated by this hashing algorithm
so long as there are no hash-collisions, then I think it does work.

So, my intuition says that different graphs will have different
graph-level hashes. My intuition is the weak link.

Mike Amling

unread,
Sep 22, 2006, 6:16:35 PM9/22/06
to
Sebastian Gottschalk wrote:
> Bill Cox wrote:
>
>> I thought this forum
>
> This is no forum, this is Usenet.
>
>> I've written a graph isomorphism program that determines if two graphs
>> are isomorphic. It runs in O(N^3).
>>
>> Since graph isomorphism is a long studied problem with (AFAIK) no known
>> polynomial time solution, my algorithm is more than likely flawed. Any
>> help finding the flaws would be appreciated! The web site is
>>
>> http://www.billrocks.org/ideas
>
> Your problem is the hash - what happens if, by very low chance, you
> encounter a collision? Then the algorithm fails.

The hash is just a stand-in for the data being hashed. Using that
data itself rather than the hash, the method is still polynomial.

--Mike Amling

Mike Amling

unread,
Sep 22, 2006, 6:22:33 PM9/22/06
to

I withdraw this comment.

--Mike Amling

Francois Grieu

unread,
Sep 22, 2006, 7:28:10 PM9/22/06
to
Third try at a paraphrase/modification/improvement/
simplification/perversion of Bill Cox's algorithm.


We want a computable, computationally collision and
preimage resistant hash function accepting a graph as
input, that is invariant under graph isomorphism
(for any graph, exchanging any vertices leaves the hash
unchanged); implying that computing and comparing two such
hashes is computationally enough to test for graph
isomorphism.

A graph G with n vertices will be expressed as an n*n bit


matrix, G[u,v] set meaning that an edge exists from
vertex u to vertex v.

Select a cryptographic hash with the desired output

width w, supposed collision and preimage resistant,
say SHA-w with w=256.

Have two arrays A and B of n variables each w-bit wide,

and an array P of n variables each ceil(log2(n))-bit wide.


Setup:
- for all u, 0<=u<n

- - set A[u] = G[u,u]


- - set P[u] = u

- - sort P such that A[P[u]] is non-decreasing

Compute:
- for all j, 0<j<n


- - for all u, 0<=u<n
- - - initialize the cryptographic hash

- - - enter A[u] into the cryptographic hash
- - - for all v, 0<=v<n, v!=u


- - - - if G[P[u],P[v]] is set
- - - - - enter A[v] into the cryptographic hash
- - - set B[u] to the result of the cryptographic hash
- - set A = B

- - sort P such that A[P[u]] is non-decreasing

Finish:
- initialize the cryptographic hash
- for all u, 0<=u<n
- - enter A[u] into the cryptographic hash
- output the result of the cryptographic hash


This graph hash is invariant under graph isomorphism
(by induction, corresponding vertices in two isomorphic
graphs have equal values of corresponding A[u] across
2 parallel runs of the algorithm)

I can make a handwaving argument that if two graphs


produce the same hash, either they are isomorphic,
or two different inputs to the cryptographic hash
produced the same output during the course of the

algorithm, thus the scheme is collision-resistant.
The idea is that at the end of step j in Compute,
and assuming no collision in the cryptographic hash,
A[u] is representative of the structure of a sub-graph
encompassing at least j steps away from vertex u.

Once collision resistance is proved, preimage
resistance is inherited from the cryptographic hash,
merely by considering the Finish part.

The cost of the computation is O(n^3 * w)


In practice it seems that the cost can be reduced
greatly with a few tricks outlined below
- when sorting P during Compute, only sort the subintervals
where A[P[u]] was constant in the previous sort


- rather than entering in the cryptographic hash several
identical values of A belonging to the same subinterval
of P such that A[P[u]] was constant in the previous

sort, enter the concatenation of this value of A and
the number of such identical values on ceil(log2(n)) bits
- perform only d iterations of the outer loop in Compute,


where d is the maximum number of edges along the shortest

path between any two distinct connected vertices


- if more than half of the bits are set in G, first

complement G, compute the hash, then complement the


overall result
- if the graph is undirected, that is when G is symetric

along the diagonal, change "0<=v<n, v!=u" to "0<=v<u"


François Grieu

Disclaimer: I badly need sleep, this might be all wrong.

Francois Grieu

unread,
Sep 22, 2006, 8:01:15 PM9/22/06
to
Hum, I guess I should explain why I felt the urge to
change Bill Cox's algorithm in the first place.

One problem is that Bill needs a hash insensitive
to the order of the inputs, but still collision
resistant. And his XOR can make unwanted collisions
(in particular, even numbers of identical inputs
cancel out). Changing XOR to addition mod 2^k
only partially solve the problem, as we can still
contruct deliberate collisions (from about w value
w-bit wide, we can find two distinct colliding
subsets).

To solve this, I replace Bill's XOR with sorting
the inputs before hashing, and have that feeling
that a correct proof is much simplified.


François Grieu

Mike Amling

unread,
Sep 22, 2006, 8:39:10 PM9/22/06
to

First off, when you say

- For 2*D iterations, update each vertex's hash value, by hashing it
with it's neighbor's hash values, and the iteration number.
- The hashing values for neighbors has to be symmetric (commutative) so
that you don't need to know what order to hash them in. I just used XOR.

does that mean that you first XOR all a node's neighbors' current hashes
together and then hash the result in with the node's current hash and
the iteration number? XORing them seems like it loses information about
pairs of neighbors that at the moment have the same hash value. How
about addition? Or just concatenate the neighbors' hash values in
nondecreasing order of hash value.

The original algorithm hashes for each node the sequence
node's own degree
set of neighbors' degrees
set of neighbors' sets of neighbors' degrees
...

It generates the same hash for each node of two non-isomorphic graphs
if the two graphs have the same number of nodes and all the nodes on the
graphs have the same degree. Coming up with two such graphs is so easy
that I just did it.


The second algorithm is like the first except x is substituted for
one node (call it A)'s degree. The second algorithm's calculated hash
for a node B differs from the first's depending on the number of paths
(and degrees of nodes along the paths and ...) from B to A as a function
of path length (where path length corresponds to iteration number).
Since x is arbitrary, would it be useful to make a polynomial using
an abstract x?


Does the canonical Zero-Knowledge proof of a Hamiltonian circuit use
graphs for which an N**3 attack is feasible for a Verifier?

--Mike Amling

David Wagner

unread,
Sep 22, 2006, 9:32:45 PM9/22/06
to
Francois Grieu wrote:
>One problem is that Bill needs a hash insensitive
>to the order of the inputs, but still collision
>resistant.

This kind of hash function is easy to construct in a generic way.

H(a,b,c,d,..,z) = hash(sort(a,b,c,..,z))

There is no loss of collision-resistance.

Is this the same as what you are doing? Can your scheme be described
more simply as an algorithm that uses H (rather than as an algorithm which
includes the sorting as part of the algorithm)?

Phil Carmody

unread,
Sep 23, 2006, 1:31:57 AM9/23/06
to
Francois Grieu <fgr...@francenet.fr> writes:
> A graph G with n vertices will be expressed as an n*n bit
> matrix, G[u,v] set meaning that an edge exists from
> vertex u to vertex v.
...
> Have two arrays A and B of n variables each w-bit wide,
> and an array P of n variables each ceil(log2(n))-bit wide.
>
>
> Setup:
> - for all u, 0<=u<n
> - - set A[u] = G[u,u]
> - - set P[u] = u
> - - sort P such that A[P[u]] is non-decreasing

This is not an algorithm. The above does not uniquely
specify P[u].

Phil
--
"Home taping is killing big business profits. We left this side blank
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.

Francois Grieu

unread,
Sep 23, 2006, 2:51:09 AM9/23/06
to
In article <87fyejp...@nonospaz.fatphil.org>,
Phil Carmody <thefatphi...@yahoo.co.uk> wrote:

> Francois Grieu <fgr...@francenet.fr> writes:
> > A graph G with n vertices will be expressed as an n*n bit
> > matrix, G[u,v] set meaning that an edge exists from
> > vertex u to vertex v.
> ...
> > Have two arrays A and B of n variables each w-bit wide,
> > and an array P of n variables each ceil(log2(n))-bit wide.
> >
> >
> > Setup:
> > - for all u, 0<=u<n
> > - - set A[u] = G[u,u]
> > - - set P[u] = u
> > - - sort P such that A[P[u]] is non-decreasing
>
> This is not an algorithm. The above does not uniquely
> specify P[u].

Indeed this is underspecified, anb in addition thare is
a cut-and-paste typo; I meant

Setup:
- for all u, 0<=u<n
- - set A[u] = G[u,u]
- - set P[u] = u
- sort P such that A[P[u]] is non-decreasing

We can use straight insertion, or quicksort. Any stable sort
algorithm will produce the same result, locally. In fact any
sort algorith, including the unstable heapsort, will produce
the same hash (or my alg is deeply flawed).


Francois Grieu

Bill Cox

unread,
Sep 23, 2006, 4:30:53 AM9/23/06
to
Francois Grieu wrote:
> I can make a handwaving argument that if two graphs
> produce the same hash, either they are isomorphic,
> or two different inputs to the cryptographic hash
> produced the same output during the course of the
> algorithm, thus the scheme is collision-resistant.
> The idea is that at the end of step j in Compute,
> and assuming no collision in the cryptographic hash,
> A[u] is representative of the structure of a sub-graph
> encompassing at least j steps away from vertex u.

Thanks for the careful work on this! The formal description helps a
lot. The sorting of values and before hashing neighbor values is a
nice improvement (let's us stop worring about simple XOR collisions).
Also, the various speed improvements you suggest sound good for making
an actual useful algorithm. O(n^3) is too slow in practice to be
useful. If this algorithm is proven out, I'm confident a sub-O(N^2)
average-case algorithm can be developed.

I think we agree on the handwaving argument. If this could be
formalized into a convincing proof, I'd be convinced that the overall
algorithm works. I think we will very likely to fail to complete this
proof, since in reality isomorphism detection is proably
computationally hard, or somebody would have come up with a fast
algorithm by now.

So, this is where I'm stuck. I have not found a counter example graph
by hand, and I haven't been able to prove that "at the end of step j in


Compute, and assuming no collision in the cryptographic hash, A[u] is
representative of the structure of a sub-graph encompassing at least j

steps away from vertex u". Perhaps a proof by induction? I'm also not
getting much sleep, but I'll work on it.

Bill Cox

unread,
Sep 23, 2006, 4:49:45 AM9/23/06
to

Mike Amling wrote:
> does that mean that you first XOR all a node's neighbors' current hashes
> together and then hash the result in with the node's current hash and
> the iteration number? XORing them seems like it loses information about
> pairs of neighbors that at the moment have the same hash value. How
> about addition? Or just concatenate the neighbors' hash values in
> nondecreasing order of hash value.

The suggestion of sorting neighbor hash values, then hashing the result
is a good one. So, I think we can stop worring about the XOR hash.

> The original algorithm hashes for each node the sequence
> node's own degree
> set of neighbors' degrees
> set of neighbors' sets of neighbors' degrees
> ...
>
> It generates the same hash for each node of two non-isomorphic graphs
> if the two graphs have the same number of nodes and all the nodes on the
> graphs have the same degree. Coming up with two such graphs is so easy
> that I just did it.

Me, too. I suspect that the modified algorithm still has such a flaw.

> Since x is arbitrary, would it be useful to make a polynomial using
> an abstract x?

I'm afraid I don't know what you mean by making a polynomial using an
abstract x.

Francois Grieu

unread,
Sep 23, 2006, 5:14:08 AM9/23/06
to
In article <ef22rt$mm5$1...@agate.berkeley.edu>,
d...@taverner.cs.berkeley.edu (David Wagner) wrote:

> Francois Grieu wrote:
> >One problem is that Bill needs a hash insensitive
> >to the order of the inputs, but still collision
> >resistant.
>
> This kind of hash function is easy to construct in
> a generic way.
>
> H(a,b,c,d,..,z) = hash(sort(a,b,c,..,z))
>
> There is no loss of collision-resistance.
>
> Is this the same as what you are doing?

Yes. I keep the current hashes of the vertices
sorted using an auxiliary permutation table,
and enter them into the next stage in this order.

> Can your scheme be described more simply as an algorithm
> that uses H (rather than as an algorithm which includes
> the sorting as part of the algorithm)?

Yes, and that makes it clearer! Try 4 follows.
Supercedes earlier stuff. Still basically
Bill Cox's idea plus the sort trick.


Graph G is given as n*n bit array, G[u,v] set means


an edge exists from vertex u to vertex v.

We make use of a cryptographic hash function
producing a w-bit result, say SHA-w for w=256

We use a w-bit value A[u] for each vertex u,
and a temp B[u] of same size.

Initially, set A[u] to G[u,u] for all u.

Repeat n-1 times

- For each vertex u, in any order

- - Build new hash B[u] for vertex u, as the
- - cryptographic hash of A[u], and (entering
- - these in nondecreasing order) the A[v] of
- - each vertex v other than u with G[u,v]==1

- set A = B

Compute cryptographic hash of the n values A[u]
in nondecreasing order, output result.


Result is unchanged under graph isomorphism.
Proof by induction, considering the values
in array A in two parallel executions of the
algorithm.

I have a handwaving argument that (assuming no
cryptographic hash collision) the structure of a
subgraph comprising at least j steps from
vertex u is fully captured into the value of A[u]
at the jth outer iteration, concluding by
induction the we have cdifferent results for
non-isomorphic graphs, baring a collison in
the cryptographic hash. This is the difficult
step in a proof.

Preimage resistance is inherited from the
last cryptographic hash.


Notice that it is enough to maintain a permutation
P such that A[P[u]] is nondecreasing, rather than
sorting (a subset of) A for each of the (n-1)*n+1
independent cryptographic hashes performed.
Using this technique, the cost of the algorithm is
dominated by the cryptographic hashes. Assuming
the cost of the cryptographic hash per input bit
grows as log(w), cost is O(n^3 * w * log(w)),
with worst case the fully connected graph.


Miscelaneous improvements

We can stop updating A[u] as soon as there is no
other v with A[v] = A[u]. This helps a lot for
graphs without an internal symmetry, and also give
an early stop criteria (when all values in A are
unique). As a bonus, we recognize graphs with an
internal symmetry, as having some equal values in
the input of the final cryptographic hash.

We can reduce the number of outer iterations to d,
where d is the max of the length of the shortest
path between any two distinct vertices, which can
be precomputed with effort O(n^2 * log(n)).

Rather than entering repeating identical values
in the crytograhic hash, we can enter the
concatenation of the number of repeats and the
value. This improves the worst case of a highly
symmetric graph.

We actually need not fully sort A and P, we can
sort subintervals which identical values in A
at the previous iteration.

High values of d can only be obtained for very
sparse G, and for these we'd better convert G to
a linked lists of lines (linked list of edges for
each vertices). This seem to allow a reduction
in the asymptotic cost relative to n.

We can defer computing d until it becomes apparent
that the graph is probably symmetric, because
we find that the number of different values in
A is not making any progress. Or maybe we can
stop the outer loop as soon as this lack of
progress strikes, and need not compute d at all?
[I'm very uncertain on this]

If more than half the bits in G are set, we
can complement G, compute the graph hash, and
complement the result.

François Grieu

Phil Carmody

unread,
Sep 23, 2006, 5:15:07 AM9/23/06
to

I hate to say this, as I don't have a counterexample or
counterargument, but I think it's flawed.
I just don't think that you sort things enough to uniquely
specify an order for all nodes, such that two isomorphic
graphs will always be mapped onto the same sequence.

I don't even believe it is even possible to perform such
a sort. (without solving a group isomorphism problem...)

Bill Cox

unread,
Sep 23, 2006, 12:11:21 PM9/23/06
to
> I hate to say this, as I don't have a counterexample or
> counterargument, but I think it's flawed.
> I just don't think that you sort things enough to uniquely
> specify an order for all nodes, such that two isomorphic
> graphs will always be mapped onto the same sequence.
>
> I don't even believe it is even possible to perform such
> a sort. (without solving a group isomorphism problem...)
>
> Phil

I think you're right. I still haven't found counter examples, but hey,
this dumb idea is only two days old. I'll update the code to resolve
cases where it detects automorphism, and run a bunch of cases. This
will test that when the algorithm labels multiple nodes with the same
label in a graph, that they are indeed symmetric. If I can find a
counter-example, I wont have to waste more time trying to prove that it
works!

David Wagner

unread,
Sep 23, 2006, 6:46:46 PM9/23/06
to
Francois Grieu wrote:
>Graph G is given as n*n bit array, [...]
>We use a w-bit value A[u] for each vertex u, [...]

>Initially, set A[u] to G[u,u] for all u.

I don't get it. G[u,u] is a 1-bit value, but you are assigning
to a w-bit field. Type mismatch. Can you explain?

David Wagner

unread,
Sep 23, 2006, 7:06:41 PM9/23/06
to
Francois Grieu wrote:
>[...]

Can I rephrase your algorithm as follows? We compute a sequence
of arrays A_0[.], A_1[.], .., A_n[.] indexed by vertices of G, defined
iteratively as follows:

A_0[u] = 1 if u -> u is an edge in G, or 0 otherwise
A_{i+1}[u] = H(A_i[u], {A_i[v] : u -> v is an edge in G})

where 1 represents 256 repeated one bits, and 0 is 256 repeated
zero bits, and where H(x,S) is a cryptographic hash given by, e.g.,

H(x,S) = hash(x, sort(S))

where S is an arbitrary multi-set. Then, two graphs are classified
as isomorphic iff they have the same value for A_n, where n counts the
number of vertices in the graphs.

Is that the idea?


>I have a handwaving argument that (assuming no
>cryptographic hash collision) the structure of a
>subgraph comprising at least j steps from
>vertex u is fully captured into the value of A[u]
>at the jth outer iteration, concluding by
>induction the we have cdifferent results for
>non-isomorphic graphs, baring a collison in
>the cryptographic hash. This is the difficult
>step in a proof.

Can you elaborate on this argument? This is indeed the critical part.

I'm guessing you're going to use a proof by induction. However, I'm
having a hard time guessing what a good induction predicate would be.
If u is a vertex in G, let N_d(u) denote the subgraph of G given by
the set of all vertices that can be reached from u by a path of length
at most d, along with all the edges between these pairs of vertices.
A natural guess at an induction predicate P(d) might be:
for all u in V, u' in V',
A_d[u] = A'_d[u'] ==> N_d(u) is isomorphic to N_d(u')
However, I don't think this works. For instance, consider the graphs
G = (V,E) = ({a,b}, {a->b})
G' = (V',E') = ({a',b'}, {a'->b', b'->a'})
Note that P(1) is false for these two graphs, since A_1[a] = A'_1[a']
but N_1(a) is not isomorphic to N_1(a'). So we're going to need some
more clever induction predicate.

Mike Amling

unread,
Sep 23, 2006, 8:45:30 PM9/23/06
to
David Wagner wrote:
>
> I'm guessing you're going to use a proof by induction. However, I'm
> having a hard time guessing what a good induction predicate would be.
> If u is a vertex in G, let N_d(u) denote the subgraph of G given by
> the set of all vertices that can be reached from u by a path of length
> at most d, along with all the edges between these pairs of vertices.

I think you want to use only "edges traversed by those paths" in your
subgraph rather than all the edges between vertices reachable in d or
fewer steps. The hash depends on the paths, but is independent of edges
not in any such path.

> A natural guess at an induction predicate P(d) might be:
> for all u in V, u' in V',
> A_d[u] = A'_d[u'] ==> N_d(u) is isomorphic to N_d(u')
> However, I don't think this works. For instance, consider the graphs
> G = (V,E) = ({a,b}, {a->b})
> G' = (V',E') = ({a',b'}, {a'->b', b'->a'})
> Note that P(1) is false for these two graphs, since A_1[a] = A'_1[a']
> but N_1(a) is not isomorphic to N_1(a'). So we're going to need some
> more clever induction predicate.

--Mike Amling

David Wagner

unread,
Sep 23, 2006, 9:39:16 PM9/23/06
to
Mike Amling wrote:
>David Wagner wrote:
>> I'm guessing you're going to use a proof by induction. However, I'm
>> having a hard time guessing what a good induction predicate would be.
>> If u is a vertex in G, let N_d(u) denote the subgraph of G given by
>> the set of all vertices that can be reached from u by a path of length
>> at most d, along with all the edges between these pairs of vertices.
>
> I think you want to use only "edges traversed by those paths" in your
>subgraph rather than all the edges between vertices reachable in d or
>fewer steps. The hash depends on the paths, but is independent of edges
>not in any such path.

I see. That sounds much more reasonable. Thanks.

Now the tricky bit is figuring out whether P(d) ==> P(d+1), and if so,
why...

fgrieu

unread,
Sep 23, 2006, 10:55:39 PM9/23/06
to
Some thoughts later: unless I misunderstood the OP's algorithm,
it does not work. At least my variant does not work.
I found a counterexample. These two graphs are non-isomorhic,
but my algorithm fails to distinguish them.

0 1 0 0 0 1 0 1 1 0 0 0
1 0 1 0 0 0 1 0 1 0 0 0
0 1 0 1 0 0 1 1 0 0 0 0
0 0 1 0 1 0 0 0 0 0 1 1
0 0 0 1 0 1 0 0 0 1 0 1
1 0 0 0 1 0 0 0 0 1 1 0

The left is a ring of 6 vertices, the right is 2 independent
rings of 3 vertices. Problem is, the algorithm keeps
assigning the same value to all vertices in the two graphs.

The problem can occur even for fully connected graphs:
just complement the matrix except for the diagonal
and you get a fully-connected counterexample.

I retract all my previous posts in this thread.

Still I feel that the use of a crypto hash made insensitive
to order by sorting its input is a promizing technique
to recognize graph isomorphism.

François Grieu

Bill Cox

unread,
Sep 24, 2006, 7:14:47 AM9/24/06
to

I made one small addition to the algorithm since you posted your formal
version. I didn't think it was necessarily important, but I required
it to show that when two graphs are reported isomorphic by the
algorithm, that they will be. At each step, I also hash into the hash
at each vertex the iteration step number. In your small ring, values
coming back to the original node will have different hash values
because of this.

I ran the above graph through the latest code (now posted), and it
determines they are different. I have spent a couple more hours trying
to find small manually generated counter examples (which I feel do
exist), but haven't. Next, I'll run it on the hundreds of graphs from:

http://amalfi.dis.unina.it/graph

I've also added code to resolve the isomorphism when the graphs have
auto-morphisms. Now, if the code runs into a counter-example, and it
generates a bad isomorphism, it will report the error.

Hopefully, I can find a counter-example that way, though these graphs
all have one problem: they are trivially distinguishable.

I did find a troubling property of the hash propagation, which I think
is key to why it will fail for some graph. Let's say we're doing a
hash iteration from start node Ak. As the iteration progresses, hash
values influenced by Ak's unique hash value propagate through the
graph. The frontier of reached nodes describes a cut of the graph. If
this cut reaches a point where the sub-graph we've reached has
automorphisms between any two nodes in the cut, then the frontier hash
values will all be identical. At this point, if the nodes connected to
in the unreached portion of the graph by the cut form a subgraph of all
equal degree nodes, then the hash values will not propagate into the
rest of the graph in such a way as to cause nodes to be differentiated.

However, later, when we pick nodes reached by that cut to start from,
we do find the properties of the graph that differentiate these nodes.
This is why I haven't been able to find a counter-example by hand.

Mike Amling

unread,
Sep 24, 2006, 4:01:37 PM9/24/06
to

I don't understand that there's a problem with auto-morphism. If two
nodes end up with the same hash, then the graph with those nodes
exchanged is isomorphic to the original graph, which is as it should be.
And the nodes of any other graph isomorphic to that graph will generate
the same node hash values with the same duplicate values.

--Mike Amling

Message has been deleted

fgrieu

unread,
Sep 24, 2006, 5:04:08 PM9/24/06
to
Bill Cox wrote:
> At each step, I also hash into the hash at each vertex the
> iteration step number. In your small ring, values back to
> the original node will have different hash values of this.

If "the iteration step number" is relative to the outer loop,
performed 2*D times in
http://www.billrocks.org/ideas
then I fail to see how this extra hashed information helps
distinguish my two earlier test graphs

0 1 0 0 0 1 0 1 1 0 0 0
1 0 1 0 0 0 1 0 1 0 0 0
0 1 0 1 0 0 1 1 0 0 0 0
0 0 1 0 1 0 0 0 0 0 1 1
0 0 0 1 0 1 0 0 0 1 0 1
1 0 0 0 1 0 0 0 0 1 1 0

The curse with these graphs is that the value assigned to
any vertex at any step depends only on the step, and is the
same for both graphs. This is a consequence of computing
the hashes for all vertices at the same time: the algorithm
has no way to realize that it "gets back" at which vertex it
started, because it did not start from a particular vertex.

This can be fixed, solving my test case, at the expense
of raising the cost by a factor of n. A price I would be
more than happy to pay if it gave me a precise induction
property; but it does not, and I refrain from even
stating my tentative algorithm without further thoughts.

François Grieu

Mike Amling

unread,
Sep 24, 2006, 5:17:32 PM9/24/06
to

I think if there's a flaw, it would be false positives. The algorithm
may claim two graphs are isomorphic when they aren't. Just looking at
it, I can't see how two isomorphic graphs would get different hashes.
If there's at least one false positive and no false negatives, then
the number of isomorphically distinct (if that's the term) graphs with a
given number of nodes as measured by the algorithm would be less than
the number as measured by standard methods (whatever those may be). I
ran the second algorithm[1] on all simply connected graphs of 8 nodes or
fewer (with bidirectional edges only and with no edge from a node to
itself, restrictions which make it feasible to go up to 8 nodes).

Nodes Distinct graphs
2 1
3 2
4 6
5 21
6 112
7 853
8 10897 and counting (Give it another 10 or 20 hours.)

I'm sure the algorithm is right for 2..4 nodes. Does anyone know how
many distinct such graphs there really are for 5..8 nodes?

Note: Of course, there may be a bug in my code.

[1] As modified by David Wagner's suggestion. Rather than initialize all
but one node with its degree, I initialized all but one to the hash of a
bit string of length 0, and one special node to a different arbitrary
value. The degree of a node will strongly influence its hash after one
iteration, as the degree is the number of neighboring hashes included.
I used FNV64 hash as it's fast enough, has simple inline code, and
there's no adversary, so I don't see any need for cryptographic
strength. And 64 bits should be enough to avoid birthday problems for
this low number of nodes.

--Mike Amling

Mike Amling

unread,
Sep 24, 2006, 5:23:08 PM9/24/06
to

I was thinking of the x in, say, a CRC polynomial. Such an x is never
evaluated. It remains abstract. It doesn't look to be useful here anyhow.

--Mike Amling

Mike Amling

unread,
Sep 24, 2006, 5:23:24 PM9/24/06
to

Uh, "is independent of" was too string a term. More like "may not
fully reflect".

--Mike Amling

Bill Cox

unread,
Sep 24, 2006, 5:48:12 PM9/24/06
to
> I don't understand that there's a problem with auto-morphism. If two
> nodes end up with the same hash, then the graph with those nodes
> exchanged is isomorphic to the original graph, which is as it should be.
> And the nodes of any other graph isomorphic to that graph will generate
> the same node hash values with the same duplicate values.
>
> --Mike Amling

Automorphism isn't a problem if I'm only trying to detect that graphs
are isomorphic. However, if I want to report the node correspondence,
the automorphic nodes have to be resolved. For example two simple
isomorphic rings of nodes will have several possible valid mappings
(about 2*(N-1)). All the nodes will wind up with the same label, but I
can't just make a random mapping and expect it to be valid. What I do
in this case is just assign the first pair of nodes, and then do
another hash iteration with the assigned nodes acting as start-nodes.
In the case of the rings, the rest of the nodes will then all get
unique hash values, so they can be matched up correctly. This seems to
work so far.

Being able to report the correspondence is important, both in the
real-world, and to detect counter-examples that cause the algorithm to
fail. Once the actual isomorphism mapping is reported, I do a simple
check to see if it's valid. I'm hoping to catch a counter example this
way, but I'm still dealing with bugs, and it's unfortunate, but true
that I have other work to do!

One possible result from this effort so far... Even if graph
isomorphism isn't in P, I think it's not a good idea to use exact graph
isomorphism for cryptography. It seems VERY hard to generate a pair of
graphs that are difficult to match! Any ideas for generating difficult
to match graphs would be appreciated!

Mike Amling

unread,
Sep 24, 2006, 6:03:42 PM9/24/06
to
fgrieu wrote:

> Bill Cox wrote:
>> At each step, I also hash into the hash at each vertex the
>> iteration step number. In your small ring, values back to
>> the original node will have different hash values of this.
>
> If "the iteration step number" is relative to the outer loop,
> performed 2*D times in
> http://www.billrocks.org/ideas
> then I fail to see how this extra hashed information helps
> distinguish my two earlier test graphs
>
> 0 1 0 0 0 1 0 1 1 0 0 0
> 1 0 1 0 0 0 1 0 1 0 0 0
> 0 1 0 1 0 0 1 1 0 0 0 0
> 0 0 1 0 1 0 0 0 0 0 1 1
> 0 0 0 1 0 1 0 0 0 1 0 1
> 1 0 0 0 1 0 0 0 0 1 1 0
>
> The curse with these graphs is that the value assigned to
> any vertex at any step depends only on the step, and is the
> same for both graphs. This is a consequence of computing
> the hashes for all vertices at the same time: the algorithm
> has no way to realize that it "gets back" at which vertex it
> started, because it did not start from a particular vertex.
>
> This can be fixed, solving my test case, at the expense
> of raising the cost by a factor of n. A price I would be
> more than happy to pay if it gave me a precise induction
> property; but it does not, and I refrain from even
> stating my tentative algorithm without further thoughts.

Are you using the second algorithm as described on the web site? I
agree that all nodes in the left graph will get the same value, and all
nodes in the right graph will get the same value, but those two values
will be different.
Take the arbitrary initial values for the nodes as A for the special
node and B for the rest. Then the hashes for a 3-node connected subgraph
of your right graph are, listing the special node first,
Initially
A
B
B

After round 1, the nodes' hashes are of messages
A,1BB
B,1AB
B,1AB

After round 2, the nodes have hashes of the messages
A,1BB,21BB1BB
B,1AB,21AB1BB
B,1AB,21AB1BB
etc.


The hashes for your left graph are
Initially
A
B
B
B
B

After round 1, the hashes are of messages
A,1BB
B,1AB
B,1BB
B,1BB
B,1BB
B,1AB

The left graph has some hashes of messages that start "B,1BB", but
the right graph has no such messages. Hash values will appear in the
left graph that are not the right, and that fact should persist through
further processing when the other nodes are special and on into the
combination of all of a node's hash values (as calculated when it and
all the others are the special node) into a final hash value for the node.

--Mike Amling

Mike Amling

unread,
Sep 24, 2006, 6:53:57 PM9/24/06
to
Bill Cox wrote:
>> I don't understand that there's a problem with auto-morphism. If two
>> nodes end up with the same hash, then the graph with those nodes
>> exchanged is isomorphic to the original graph, which is as it should be.
>> And the nodes of any other graph isomorphic to that graph will generate
>> the same node hash values with the same duplicate values.
>>
>> --Mike Amling
>
> Automorphism isn't a problem if I'm only trying to detect that graphs
> are isomorphic. However, if I want to report the node correspondence,
> the automorphic nodes have to be resolved. For example two simple
> isomorphic rings of nodes will have several possible valid mappings
> (about 2*(N-1)). All the nodes will wind up with the same label, but I
> can't just make a random mapping and expect it to be valid.

Oh, now I get it.

> What I do
> in this case is just assign the first pair of nodes, and then do
> another hash iteration with the assigned nodes acting as start-nodes.
> In the case of the rings, the rest of the nodes will then all get
> unique hash values, so they can be matched up correctly. This seems to
> work so far.

Is this the pairing algorithm?
1. Pair any node that has a hash value that is unique within graph 1
to the one and only node in graph 2 with that value.
2. If all nodes have been paired, stop. If no nodes have been paired,
go to step 7.
3. Assign a fresh arbitrary value to each node in graph 1 that has
already been paired. Assign the same value to its counterpart in graph 2.
4. Get a fresh arbitrary value for each pool of unpaired nodes that
had the same hash value in step 1, and assign it to each member of the
pool (in both graphs).
5. Run the hash propagation through n iterations, starting with the
values assigned in steps 3 and 4, where n is the graph diameter or
something.
6. If any node in graph 1 has a hash value after step 5 that is
unique within its graph, go to step 1.
7. Arbitrarily pick one unpaired node from graph 1 and pair it with
an arbitrary node from graph 2 that has the same hash value. Go to step 3.

I can see how one might wonder if step 7 is dangerous if reached by
falling through step 6. But I think it's safe because equal hash values
would be generated only for nodes that are connected to the paired part
of the graph in equivalent ways.

>
> Being able to report the correspondence is important, both in the
> real-world, and to detect counter-examples that cause the algorithm to
> fail. Once the actual isomorphism mapping is reported, I do a simple
> check to see if it's valid. I'm hoping to catch a counter example this
> way, but I'm still dealing with bugs, and it's unfortunate, but true
> that I have other work to do!
>
> One possible result from this effort so far... Even if graph
> isomorphism isn't in P, I think it's not a good idea to use exact graph
> isomorphism for cryptography. It seems VERY hard to generate a pair of
> graphs that are difficult to match! Any ideas for generating difficult
> to match graphs would be appreciated!

I'm worried about the classic Zero Knowledge proof a Hamiltonian
circuit. (See the example in
http://en.wikipedia.org/wiki/Zero-knowledge_proof). It looks like the
article is wrong when it says "Because of the nature of the isomorphic
graph and Hamiltonian cycle problems, (namely, that they are NP), Victor
gains no information about the Hamiltonian cycle in G from the
information revealed in each round." If graph isomorphism is in RP (as
well as NP), I would hardly conclude that V is getting no knowledge
about the circuit in G by being shown the circuit in H.

--Mike Amling

fgrieu

unread,
Sep 24, 2006, 8:20:02 PM9/24/06
to
Mike Amling wrote:
> Are you using the second algorithm as described on the web site?

No. Some cache effect at my current browsing location
had made it unavailable to me. Now fixed.

I agree the new alg distinguish my test graphs.

François Grieu

Bill Cox

unread,
Sep 24, 2006, 9:46:11 PM9/24/06
to
> I'm worried about the classic Zero Knowledge proof a Hamiltonian
> circuit. (See the example in
> http://en.wikipedia.org/wiki/Zero-knowledge_proof). It looks like the
> article is wrong when it says "Because of the nature of the isomorphic
> graph and Hamiltonian cycle problems, (namely, that they are NP), Victor
> gains no information about the Hamiltonian cycle in G from the
> information revealed in each round." If graph isomorphism is in RP (as
> well as NP), I would hardly conclude that V is getting no knowledge
> about the circuit in G by being shown the circuit in H.

I agree. There is no way I would use graph isomorphism as a
hard-problem meant to secure important data. Somebody, please, show me
two graphs which are hard to compare!

Mike Amling

unread,
Sep 24, 2006, 10:44:36 PM9/24/06
to

The Wikipedia article may have distorted the ZK proof algorithm. In
the ZK proof version at
http://citeseer.ist.psu.edu/cachedpage/320165/25, the prover does not
send the allegedly isomorphic graph unconditionally, but instead sends
commitments of its edges. Then after the verifier responds, the prover
either sends keys to unlock only edges that form a Hamiltonian circuit
or sends the permutation of the nodes that constitutes the isomorphism
and keys to unlock all edges.

--Mike Amling

fgrieu

unread,
Sep 25, 2006, 8:22:25 AM9/25/06
to
> I ran the second algorithm[1] on all simply connected graphs of 8
> nodes or fewer (with bidirectional edges only and with no edge from
> a node to itself, restrictions which make it feasible to go up to
> 8 nodes).
>
> Nodes Distinct graphs
> 2 1
> 3 2
> 4 6
> 5 21
> 6 112
> 7 853
> 8 10897 and counting (Give it another 10 or 20 hours.)

So far so good
http://www.research.att.com/~njas/sequences/A001349

François Grieu

fgrieu

unread,
Sep 25, 2006, 1:06:03 PM9/25/06
to
Bill Cox asked:

> Somebody, please, show me two graphs which are hard
> to compare!

At the end of this message are three 25-vertices graphs
that will give a hard time to the algorithm now at
http://www.billrocks.org/ideas/

This is a variation of my earlier counterexample:
one ring of 24 vertices for A, 2 independent rings of
12 vertices for B, both with an extra vertex connected
to all the others. C is a shuffle of B.

The problem exhibited is really only about the "2*D"
number of iterations, which I (now) think is insufficient.
My counterexample graphs have D=2. Yet with 4 steps in
the outer loop of the "hash iteration", the hashes for
all but one node are equal, and the hashes are the same
in all three graphs. The hashes are thus not enough to
recognize A from B, or to reorder B and C into a canonical
graph.

I have no counterexample to the lazy version of the algorithm,
where we loop N times rather 2*D. I have no conjecture on
a minimum, but my counterexample implies it is at least
in the order of N/4.


On the problems side
- I have no precise recurrence property for a proof;
- the cost has raised to O(N^4) for constant hash width;
- I fail to see the benefit of hashing the iteration number.


On the positive side, **IF** the algorithm works save for
an accidental hash collision, I think we can get rid of any
possibility of collision, and still be in polynomial time
(as Mike conjectured then retracted).
I replace the cryptographic hash with ordering and concatenation
of input (each ceil(log2(n)) bits), giving a longhash.
After the step reading "update each vertex's hash value", I sort
the N longhashes, then output each longhash (with length prefix),
then replace each unique value with its index among unique values,
which fits ceil(log2(n)) bits. This avoids exponential explosion.
The result of each "hash iteration" is the concatenation of
all outputs during this "hash iteration", and the last longhash.
The rest is unchanged.


François Grieu

Now the boring graphs

[A]
0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1
1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0

[B]
0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0

[C]
0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1
1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1
1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0

Mike Amling

unread,
Sep 25, 2006, 1:30:10 PM9/25/06
to

It finished counting, and got 11117 for 8 nodes. So much for the easy
counterexample.

--Mike Amling

Mike Amling

unread,
Sep 25, 2006, 4:47:58 PM9/25/06
to
fgrieu wrote:
> Bill Cox asked:
>> Somebody, please, show me two graphs which are hard
>> to compare!
>
> At the end of this message are three 25-vertices graphs
> that will give a hard time to the algorithm now at
> http://www.billrocks.org/ideas/
>
> This is a variation of my earlier counterexample:
> one ring of 24 vertices for A, 2 independent rings of
> 12 vertices for B, both with an extra vertex connected
> to all the others. C is a shuffle of B.
>
> The problem exhibited is really only about the "2*D"
> number of iterations, which I (now) think is insufficient.
> My counterexample graphs have D=2. Yet with 4 steps in
> the outer loop of the "hash iteration", the hashes for
> all but one node are equal, and the hashes are the same
> in all three graphs. The hashes are thus not enough to
> recognize A from B, or to reorder B and C into a canonical
> graph.
>
> I have no counterexample to the lazy version of the algorithm,
> where we loop N times rather 2*D. I have no conjecture on
> a minimum, but my counterexample implies it is at least
> in the order of N/4.

I take it that D is Diameter, defined as

maximum over i,j of (length of shortest path between node i and node j).

We could call that the Inner Diameter and also define an Outer Diameter as

maximum over i,j of (length of any path between node i and node j in
which no node appears twice).


You've shown 2*ID iterations are not always sufficient. Are OD or
OD+1 iterations sufficient?
I'm sure there are efficient algorithms for determining ID, but I
don't know about OD. Of course, if one does go to the trouble of
determining OD for each of two graphs, and gets different values, then
the graphs are immediately non-isomorphic.

--Mike Amling

Mike Amling

unread,
Sep 26, 2006, 2:23:53 PM9/26/06
to
fgrieu wrote:
> On the positive side, **IF** the algorithm works save for
> an accidental hash collision, I think we can get rid of any
> possibility of collision, and still be in polynomial time
> (as Mike conjectured then retracted).
> I replace the cryptographic hash with ordering and concatenation
> of input (each ceil(log2(n)) bits), giving a longhash.
> After the step reading "update each vertex's hash value", I sort
> the N longhashes, then output each longhash (with length prefix),
> then replace each unique value with its index among unique values,
> which fits ceil(log2(n)) bits. This avoids exponential explosion.
> The result of each "hash iteration" is the concatenation of
> all outputs during this "hash iteration", and the last longhash.
> The rest is unchanged.

Last night I thought I understood what you are suggesting, and if the
algorithm works, it would put graph isomorphism into P, not just RP.
This morning I'm doubtful.

First, do I understand the what you suggest? Is it
0. All values are in the range 0..n-1.
1. Assign the special node the value 1. Assign all the other nodes the
value 0.
2. For each node, assign a bit string which is its value represented as
a bit string of length ceil(lg2(n)) followed by the bit strings of that
length representing its neighbors' values in sorted order.
3. Make a list of all the bit strings assigned in this iteration, with
duplicates removed. Sort the list (Regard a shorter string as strictly
less than any longer string.). Assign each node a new value which is a
perfect hash of its bit string in this iteration, namely the ordinal
number of its bit string in the sorted list.
4. Loop back to step 2 some number of times (N? N/4?).
5. Record each node's value assigned by the most recent iteration of step 3.
6. Repeat steps 1 through 5 once with each node as the special node.
7. Assign each node a final bit string of length n*ceil(lg2(n))
consisting of the bit string representing its recorded value when it was
special followed by its sorted recorded values when it was not special.
8. Compare the (sorted) list of bit strings assigned in step 7 for one
graph with those assigned to another graph.

I think the perfect hashing throws away too much data. I suspect that
the values of the CS hashes generated by previous versions of the
algorithm may differ in ways that allow the CS-hash method to
distinguish graphs that the perfect-hash method regards as isomorphic.
But I haven't looked for an example yet.

--Mike Amling

fgrieu

unread,
Sep 26, 2006, 7:14:48 PM9/26/06
to
Mike Amling wrote:

> François Grieu wrote:
>> On the positive side, **IF** the algorithm works save for
>> an accidental hash collision, I think we can get rid of any
>> possibility of collision, and still be in polynomial time
>> (as Mike conjectured then retracted).
>> I (snipped)

>
> Last night I thought I understood what you are suggesting,
> and if the algorithm works, it would put graph isomorphism
> into P, not just RP. This morning I'm doubtful.
>
> First, do I understand the what you suggest?

I did not quite recognize my idea. I'll paraphrase you,
at the same time fixing a flaw in my earlier description.
The algorithm is stated for an undirected graph only
(symmetric matrix), no loopy vertices (diagonal is 0).

0. Associated to each node we need
- a "value" in the range 0..n-1, assimilated to a
fixed length bit string of ceil(lg2(n)) bits;
- a "longhash" bit string;
- a "result" bit string, initially empty.
1. Repeat 2..5 with each node the special node in turn
2. Assign the special node the value 1. Assign all the other
nodes the value 0.
Repeat 3..4 some number of times (N? N/4?).
3. For each node, assign to longhash its value, followed
by its neighbors values in sorted order (duplicates
kept); longhash has length at most n*ceil(lg2(n)).
4. Make a list of all the longhash assigned in step 3.
Sort the list and remove duplicates (regard a shorter
string as strictly less than any longer string).
Assign each node a new value, which is the position of
its longhash in the list.
Append to the special node's result a bit string
reversibly coding the list (e.g. prefix each longhash
with a length prefix of appropriate fixed size, the
zero length coding end of list)
Notice that what we just appended is invariant under
graph isomorphism, and that it is enough to rebuild the
current longhash of each node from its current value.
5. Append to the special node's result the sorted list
of the values of each node in sorted order (duplicates
kept);
6. Sort all N results (duplicates kept, shorter strings
first), compute the whole graph's hash as a bit string
reversibly coding the N results (e.g. prefix each result
with a length prefix of appropriate fixed size). Notice
this graph's hash is invariant under graph isomorphism,
and enough to backtrack (except for a shuffle of the nodes)
the values and longhash assigned to each node, without
knowledge of the graph.

An alternative to step 6 is to output the graph in generic
order based on a sort of the results, making a much shorter
and useful representative of the graph's equivalence class
(this has been suggested by Bill Cox).
I however fail to convince myself that this is equivalent;
or that the "compare graph's hash" variant never falsely
declare graphs isomorphic, or that the "compare sorted
graphs" variant never falsely declare graphs non-isomorphic.

François Grieu

Mike Amling

unread,
Sep 26, 2006, 9:11:43 PM9/26/06
to

OK, sorry for my misinterpretation. I've waxed optimistic again, and
I think there may even be a chance for my misinterpreted version.
In any event, I think that the fixed number of iterations of your
steps 3 and 4 above may be replaced by the criterion that the nodes'
values become stagnant.

--Mike Amling

Mike Amling

unread,
Sep 26, 2006, 10:43:40 PM9/26/06
to
Mike Amling wrote:
> OK, sorry for my misinterpretation. I've waxed optimistic again, and I
> think there may even be a chance for my misinterpreted version.

No chance. It confuses two 5-node graphs, namely if B, C, and D
connect to both A and E, it's insensitive to whether A connects with E.

--Mike Amling

Ben Livengood

unread,
Sep 27, 2006, 4:55:56 AM9/27/06
to
fgrieu wrote:
> Mike Amling wrote:
...

> An alternative to step 6 is to output the graph in generic
> order based on a sort of the results, making a much shorter
> and useful representative of the graph's equivalence class
> (this has been suggested by Bill Cox).
> I however fail to convince myself that this is equivalent;
> or that the "compare graph's hash" variant never falsely
> declare graphs isomorphic, or that the "compare sorted
> graphs" variant never falsely declare graphs non-isomorphic.
>
> François Grieu

I think it is sufficient to represent the graph over the equivalence
class of its vertices. Let N be the number of equivalence classes over
the vertices in the graph. Let A be a modified N*N adjacency matrix
such that A[i,j] is the number of edges from vertices in equivalence
class i to vertices in equivalence class j. To handle vertices with
zero degree the original number of vertices will have to be output as
well.

I don't know much graph theory, so I apologize if this representation
already has a name or a better description, or if it doesn't actually
work.

Ben Livengood

Bill Cox

unread,
Sep 27, 2006, 7:06:19 AM9/27/06
to
> OK, sorry for my misinterpretation. I've waxed optimistic again, and
> I think there may even be a chance for my misinterpreted version.
> In any event, I think that the fixed number of iterations of your
> steps 3 and 4 above may be replaced by the criterion that the nodes'
> values become stagnant.
>
> --Mike Amling

You're proof for graphs of 8 or fewer nodes is very cool. Here's a
recent algorithm and proof that claim graph isomorphism is polynomial
in time:

http://arxiv.org/abs/math.CO/0202085
http://arxiv.org/abs/math.CO/0607770

I haven't yet taken the time to understand "K-orbits" of graphs, or
"combinational algebraics", or any of the group-theory jargon.
However, it does make me optimistic that a hashing-based algorithm
works.

Mike Amling

unread,
Sep 27, 2006, 10:36:45 PM9/27/06
to

I've coded up something I think is equivalent to what you suggest. It
works for up to 7 nodes (simply connected, undirected, no loopback), but
for 8 it only found 10126 equivalence classes, not 11117. I'll look
again at the code to see if something obvious is amiss. I instrumented
it to note the maximum number of iterations needed before the nodes'
values duplicate previous values, and it's surprisingly few:

Sufficient
Nodes Iterations
3 1
4 1
5 1
6 2
7 3
8 5

E.g., the fourth iteration of a 7-node graph always generates values
that duplicate a previous generation (the immediately previous). For 8
nodes the duplicate may be two generations back.

--Mike Amling

Mike Amling

unread,
Sep 27, 2006, 10:38:35 PM9/27/06
to

What's an orbit? Is it like a cycle in permutations?

--Mike Amling

Mike Amling

unread,
Sep 27, 2006, 10:47:29 PM9/27/06
to
Ben Livengood wrote:
> fgrieu wrote:
>> Mike Amling wrote:
> ...
>> An alternative to step 6 is to output the graph in generic
>> order based on a sort of the results, making a much shorter
>> and useful representative of the graph's equivalence class
>> (this has been suggested by Bill Cox).
>> I however fail to convince myself that this is equivalent;
>> or that the "compare graph's hash" variant never falsely
>> declare graphs isomorphic, or that the "compare sorted
>> graphs" variant never falsely declare graphs non-isomorphic.
>>
>> François Grieu
>
> I think it is sufficient to represent the graph over the equivalence
> class of its vertices. Let N be the number of equivalence classes over
> the vertices in the graph.

Equivalence under what relationship? Is this related to automorphism?

fgrieu

unread,
Sep 28, 2006, 8:30:38 AM9/28/06
to
Ben Livengood wrote

> fgrieu wrote:
>> An alternative to step 6 is to output the graph in
>> generic order based on a sort of the results, making
>> a much shorter and useful representative of the graph's
>> equivalence class (this has been suggested by Bill Cox).
>> I however fail to convince myself that this is
>> equivalent; or that the "compare graph's hash" variant
>> never falsely declare graphs isomorphic, or that the
>> "compare sorted graphs" variant never falsely declare
>> graphs non-isomorphic.
>

> I think it is sufficient to represent the graph over the
> equivalence class of its vertices. Let N be the number of
> equivalence classes over the vertices in the graph. Let A
> be a modified N*N adjacency matrix such that A[i,j] is the
> number of edges from vertices in equivalence class i to
> vertices in equivalence class j. To handle vertices with
> zero degree the original number of vertices will have to
> be output as well.


I guess we are talking about equivalence class for the
equivalence relation defined by: "two nodes are equivalent
if the graph's matrix is unchanged by swapping these nodes".

I believe that the proposition stated is false, and propose
two 9-vertices graphs as a counterexample:

0 1 1 1 1 0 0 0 0 0 1 1 1 1 0 0 0 0
1 0 1 0 1 1 0 0 0 1 0 1 0 1 1 0 0 0
1 1 0 1 0 0 1 0 0 1 1 0 1 0 0 1 0 0
1 0 1 0 1 0 0 1 0 1 0 1 0 1 0 0 1 0
1 1 0 1 0 0 0 0 1 1 1 0 1 0 0 0 0 1
0 1 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 0
0 0 1 0 0 1 0 1 0 0 0 1 0 0 1 0 0 1
0 0 0 1 0 0 1 0 1 0 0 0 1 0 1 0 0 1
0 0 0 0 1 1 0 1 0 0 0 0 0 1 0 1 1 0

The first graph can be visualized as a cube with 8 vertices
and 12 edges, plus an extra vertex connected to the
4 vertices of the top square of the cube.

The second is identical, except for twisting two vertices
on opposite sides of the bottom square.

Unless I make a mistake, both graphs have 3 equivalence
class: the 4 vertices with 3 edges, the 4 other vertices
connected to any of the previous, and the remaining vertex.
They have the same "modified adjacency matrix", as
defined by Ben Livengood, and number of edges.

Yet the graphs are not isomorphic.


François Grieu

fgrieu

unread,
Sep 28, 2006, 8:58:59 AM9/28/06
to
Mike Amling wrote :

> I've coded up something I think is equivalent to what you
> suggest. It works for up to 7 nodes (simply connected,
> undirected, no loopback), but for 8 it only found 10126
> equivalence classes, not 11117. I'll look again at the code
> to see if something obvious is amiss.

My intent was to write an equivalent to the (second)
Bill Cox algorithm that you verified up to 8 nodes, except
not relying on a hash. I'd bet the problem (assuming it is
not in your implementation) is in my rendering of Bill Cox's
algorithm, not in the removal of the hash. Maybe I should
have used Bill Cox's code as a basis, rather than his text.
I'll check this in more detail.

It would help if you could give two graphs that are not
distinguished by your new code, but are distinguished by
the previous version (is it Bill Cox's?).

I notice a fun thing: when using any variant of the algorithm
to compare two graphs, we can detect with certainty when the
algorithm fails, as follow.

We use the algorithm to produce a hash for each vertex.

If these hashes are not identical within order, we have
proved the graphs are not isomorphic.

Else, we reorder both graph's matrix according to
nondecreasing hashes. If the reordered graphs are equal,
we have proved the graphs are isomorphic.

Else, the algorithm failed.


François Grieu

Mike Amling

unread,
Sep 28, 2006, 11:45:42 AM9/28/06
to
fgrieu wrote:
> Mike Amling wrote :
>> I've coded up something I think is equivalent to what you
>> suggest. It works for up to 7 nodes (simply connected,
>> undirected, no loopback), but for 8 it only found 10126
>> equivalence classes, not 11117. I'll look again at the code
>> to see if something obvious is amiss.
>
> My intent was to write an equivalent to the (second)
> Bill Cox algorithm that you verified up to 8 nodes, except
> not relying on a hash. I'd bet the problem (assuming it is
> not in your implementation) is in my rendering of Bill Cox's
> algorithm, not in the removal of the hash. Maybe I should
> have used Bill Cox's code as a basis, rather than his text.
> I'll check this in more detail.

The only data that you keep is the "result" concatenation of sorted
unique "longhash"es (bad name; They're bit strings formed by
concatenation, and not hashes.), right? I'm discarding final values
assigned to the nodes. There is no marker between the list from one
iteration and the appended list from the next iteration, but none should
be necessary. And, mea culpa, I kept only a 32-bit hash of the results,
not the results themselves (but I can hardly believe there've been 991
collisions).

> It would help if you could give two graphs that are not
> distinguished by your new code, but are distinguished by
> the previous version (is it Bill Cox's?).

Yeah, when I saw that 10,126 I immediately regretted not keeping any
data from previous runs. It will take a while.

>
> I notice a fun thing: when using any variant of the algorithm
> to compare two graphs, we can detect with certainty when the
> algorithm fails, as follow.
>
> We use the algorithm to produce a hash for each vertex.
>
> If these hashes are not identical within order, we have
> proved the graphs are not isomorphic.

"proved" to the point where I believe it, but I haven't seen a formal
proof.

> Else, we reorder both graph's matrix according to
> nondecreasing hashes. If the reordered graphs are equal,
> we have proved the graphs are isomorphic.
>
> Else, the algorithm failed.

I think you have to specify more than "nondecreasing". The reordered
graphs are equal iff you have established an isomorphism between the two
graphs, and to do that you have to cope with Bill Cox's automorphism
issue. E.g., two graphs that are each 4 nodes connected in a ring have
only 8 isomorphisms, but all the node hashes are same and there are 24
ways to sort them nondecreasing.

--Mike Amling

Francois Grieu

unread,
Sep 28, 2006, 5:14:14 PM9/28/06
to
In article <efgqn6$d...@dispatch.concentric.net>,
Mike Amling <nos...@foobaz.com> wrote:

> François Grieu wrote:
> > I notice a fun thing: when using any variant of the algorithm
> > to compare two graphs, we can detect with certainty when the
> > algorithm fails, as follow.
> >
> > We use the algorithm to produce a hash for each vertex.
> >
> > If these hashes are not identical within order, we have
> > proved the graphs are not isomorphic.
>
> "proved" to the point where I believe it, but I haven't seen a
> formal proof.

I stand by "proved". The proof, by induction, that the graph hash
produced by all variants stated so far is invariant under graph
isomorphism, seems quite straightformard;
this is all that is needed to proove that different hashes imply
non-isomorphism.

> > Else, we reorder both graph's matrix according to
> > nondecreasing hashes. If the reordered graphs are equal,
> > we have proved the graphs are isomorphic.
> >
> > Else, the algorithm failed.

> I think you have to specify more than "nondecreasing". The reordered
> graphs are equal iff you have established an isomorphism between the two
> graphs, and to do that you have to cope with Bill Cox's automorphism
> issue. E.g., two graphs that are each 4 nodes connected in a ring have
> only 8 isomorphisms, but all the node hashes are same and there are 24
> ways to sort them nondecreasing.

You are right. What I stated is strictly speaking true, but the
"algorithm failed" box is reached hopelessly often.


François Grieu

Ben Livengood

unread,
Sep 29, 2006, 7:08:16 PM9/29/06
to
Mike Amling wrote:
>
> The only data that you keep is the "result" concatenation of sorted
> unique "longhash"es (bad name; They're bit strings formed by
> concatenation, and not hashes.), right? I'm discarding final values
> assigned to the nodes. There is no marker between the list from one
> iteration and the appended list from the next iteration, but none should
> be necessary. And, mea culpa, I kept only a 32-bit hash of the results,
> not the results themselves (but I can hardly believe there've been 991
> collisions).
>

The first of the two graphs François Grieu provided in his reply to me
only had longhash differences in the first and second iterations for
the 8 vertices of the cube, and after that all 8 longhashes are equal.
The differences between the longhashes of the top and bottom vertices
only differed by a few bits, so it's possible that a 32 bit hash isn't
long enough. When I implemented François's algorithm I used strings of
integers instead of creating bitstrings since it seems to preserve the
sorting order of the vertex values and was simpler to implement, but I
haven't proved that it's equivalent and I may also have gotten
different results than the algorithm he specified.

> I think you have to specify more than "nondecreasing". The reordered
> graphs are equal iff you have established an isomorphism between the two
> graphs, and to do that you have to cope with Bill Cox's automorphism
> issue. E.g., two graphs that are each 4 nodes connected in a ring have
> only 8 isomorphisms, but all the node hashes are same and there are 24
> ways to sort them nondecreasing.

I think that outputting sorted vertices and an equivalence relation
over the vertices might work. To generate a mapping from one graph to
another isomorphic graph just map vertices one at a time from one graph
to the other, and as each vertex is mapped all its neighbors can be
mapped as well. By restricting the available choices of mappable
vertexes to the equivalence class I think it is deterministic and in P.
The algorithm would work as follows:

Assume two graphs A and B of equal size and matching hashes.
Remove duplicate vertex hash results from the final hash, making an
equivalence relation over the vertices, e.g. an array Va and Vb with
Va[i] set to the index of the new duplicate-free hash result array
(which was sorted by the original algorithm), and similarly for Vb.
Generate an empty array M of size N for mapping vertices from A to B.
Assume M[i]=j implies vertex i in A is mapped to vertex j in B.
1. Pick an initial vertex from the same equivalence class of both
graphs, and map them to each other. I am not sure this step is in P.
There may be graphs where there are no equivalence classes containing
only a single vertex, which means every combination of vertices from A
and B from the smallest equivalence class must be tried to see if a map
can be found.
2. Pick a mapped vertex (M[i] is defined) from A that has unmapped
neighbors. For each unmapped neighbor X:
3. Let E be the equivalence class X belongs to in A. Pick an unmapped
vertex Y from the corresponding equivalence class of graph B. E.g.
Va[X]=Vb[Y] and there does not exist i such that M[i]=Y. Since Va and
Vb are sorted, equal indexes should imply corresponding equivalence
classes. Set M[X]=Y, and go to step 2 if there are more unmapped
neighbors. If at any step an edge implies that the mapping is incorrect
(X has only one choice Y in the other graph, but M[X]!=Y), either the
algorithm has failed or step 1 must be repeated with a different
combination of starting vertices.
4. If there are no unmapped neighbors of any mapped vertex, the graph
is disconnected, so go to step 1.

I'll try this tonight and see if it can generate a map for all
permutations of a graph back to itself. I don't think step 1 will
actually be a problem because if the choice of starting vertex
mattered, the graphs would likely not be isomorphic. The problem with
my earlier attempt was that no internal structure within equivalence
classes of vertices was possible to express with the equivalence
relation alone. Hopefully this fixes it.

Ben Livengood

Mike Amling

unread,
Sep 29, 2006, 8:11:28 PM9/29/06
to
Ben Livengood wrote:
> I think that outputting sorted vertices and an equivalence relation
> over the vertices might work. To generate a mapping from one graph to
> another isomorphic graph just map vertices one at a time from one graph
> to the other, and as each vertex is mapped all its neighbors can be
> mapped as well. By restricting the available choices of mappable
> vertexes to the equivalence class I think it is deterministic and in P.
> The algorithm would work as follows:
>
> Assume two graphs A and B of equal size and matching hashes.

Let's try this on two graphs that are each a 100-node ring.

> Remove duplicate vertex hash results from the final hash, making an
> equivalence relation over the vertices, e.g. an array Va and Vb with
> Va[i] set to the index of the new duplicate-free hash result array
> (which was sorted by the original algorithm), and similarly for Vb.

All nodes get the same hash values, so all Va[.] and Vb[.] are equal.

> Generate an empty array M of size N for mapping vertices from A to B.
> Assume M[i]=j implies vertex i in A is mapped to vertex j in B.
> 1. Pick an initial vertex from the same equivalence class of both
> graphs, and map them to each other. I am not sure this step is in P.
> There may be graphs where there are no equivalence classes containing
> only a single vertex, which means every combination of vertices from A
> and B from the smallest equivalence class must be tried to see if a map
> can be found.

Is same-equivalence-class the only criterion for this "pick"? If not,
what are the criteria?

> 2. Pick a mapped vertex (M[i] is defined) from A that has unmapped
> neighbors. For each unmapped neighbor X:
> 3. Let E be the equivalence class X belongs to in A. Pick an unmapped
> vertex Y from the corresponding equivalence class of graph B. E.g.
> Va[X]=Vb[Y] and there does not exist i such that M[i]=Y. Since Va and
> Vb are sorted, equal indexes should imply corresponding equivalence
> classes. Set M[X]=Y, and go to step 2 if there are more unmapped
> neighbors. If at any step an edge implies that the mapping is incorrect
> (X has only one choice Y in the other graph, but M[X]!=Y), either the
> algorithm has failed or step 1 must be repeated with a different
> combination of starting vertices.

You could be going back to step 1 a lot. The equivalence class does
not even restrict Y to being a neighbor of M[X]. The probability of M[1]
being a neighbor of M[0] is 2/99. I think the OP's matching algorithm
has a better chance of being in RP.

> 4. If there are no unmapped neighbors of any mapped vertex, the graph
> is disconnected, so go to step 1.

--Mike Amling

Ben Livengood

unread,
Sep 30, 2006, 5:25:23 PM9/30/06
to
After the equivalence class of equally hashed vertices has been
created, each component of the graph should be considered as a second
equivalence class over the vertices and be intersected with the first.
Then the vertices with equal hashes should be ordered according to the
component they are in, which are sorted by size. For instance three
rings of size 5 generate 15 equivalent vertices but three components of
size 5. The vertices are ordered into groups of 5 depending on which
component they belong to, and the final result of intersecting the
equivalence classes is output along with the sorted graph, perhaps as a
vector of size N containing equivalence classes for the corresponding
vertices. Any representation that describes the boundaries of the
equivalence classes would suffice. As far as I can tell this still
preserves the total order of the hashing, because the final sorting
step is only needed to separate individual components of the graph, it
maintains an order that hashing alone could have produced under a
permutation of the vertices, and by sorting equivalence classes by size
a total order is created since any equally sized equivalence class
containing vertices with the same hash should be swappable with any
other from the same component.

Note that within each equivalent set of vertices, the equivalence
classes over those vertices must be sorted by the graph component the
vertices belong to, otherwise it would might be possible for graph A to
have components A1 and A2 mixed up in order with graph B's components
so that in the final order one graph has A1B1A2B2 and the other has
B1A1A2B2. I think this can only happen in the case that A1 and B1 have
the same hash, but the size of the equivalence classes for A1,B1 and
A2,B2 differ. I don't know if this is possible, since subparts of a
graph with different numbers of vertices should not generate the same
vertex hash results. If it turns out to be a problem the graph can just
be split into its connected components and hashed individually. I don't
see any problems with ordering graph components by their final hash
affecting the isomorphism decision. In fact, for a database of graphs
it might prove useful to only hash connected graphs and test trial
graphs for isomorphism only after separating them into their components
anyway.

Mike Amling wrote:
...


>
> > Generate an empty array M of size N for mapping vertices from A to B.
> > Assume M[i]=j implies vertex i in A is mapped to vertex j in B.
> > 1. Pick an initial vertex from the same equivalence class of both
> > graphs, and map them to each other. I am not sure this step is in P.
> > There may be graphs where there are no equivalence classes containing
> > only a single vertex, which means every combination of vertices from A
> > and B from the smallest equivalence class must be tried to see if a map
> > can be found.
>
> Is same-equivalence-class the only criterion for this "pick"? If not,
> what are the criteria?
>

Yes, same-equivalence class is all that's necessary. For instance, if
there is only one vertex in an equivalence class of graph A, there must
exist a matching vertex in the same valued equivalence class in graph
B, and by sorting with the hash they must be in the same position in
the output which suffices to identify them as equivalent. If there are
multiple vertices in an equivalence class E in A, there must be an
equal number of vertices in the corresponding equivalence class F in B,
and by the sorted hash the equivalence classes must overlap exactly.
Since equivalent vertices are indistinguishable to the hash, then if
the hash determines isomorphism it doesn't matter which vertex in an
equivalence class of graph A is first matched with the same equivalence
class in an isomorphism of A. Obviously once vertices start being
mapped to each other this will no longer be the case and the edges of
the graph must be taken into account.

> > 2. Pick a mapped vertex (M[i] is defined) from A that has unmapped
> > neighbors. For each unmapped neighbor X:
> > 3. Let E be the equivalence class X belongs to in A. Pick an unmapped
> > vertex Y from the corresponding equivalence class of graph B. E.g.
> > Va[X]=Vb[Y] and there does not exist i such that M[i]=Y. Since Va and
> > Vb are sorted, equal indexes should imply corresponding equivalence
> > classes. Set M[X]=Y, and go to step 2 if there are more unmapped
> > neighbors. If at any step an edge implies that the mapping is incorrect
> > (X has only one choice Y in the other graph, but M[X]!=Y), either the
> > algorithm has failed or step 1 must be repeated with a different
> > combination of starting vertices.
>
> You could be going back to step 1 a lot. The equivalence class does
> not even restrict Y to being a neighbor of M[X]. The probability of M[1]
> being a neighbor of M[0] is 2/99. I think the OP's matching algorithm
> has a better chance of being in RP.
>

I don't think I fully specified the algorithm, because the only way for
it to work is to ensure that Y is a neighbor of M[X]. Let me try again:

3. for a mapped vertex X in A, Y is the equivalent vertex in B as
indicated by M[X]=Y. For each edge from X to X' in A, do the following:
3.1 if M[X'] is defined, repeat 3.1 with the next X'. If no X' remain,
then pick another X go to step 3
3.2 Let E=Va[X']. For all Y' in B such that Vb[Y']=E and for all i no
M[i]=Y', do the following:
3.3 If the edges between X and X' do not match the edges between Y and
Y' in B, repeat this step with the next Y'. If no Y' remain, then the
graphs are nonisomorphic [1]
3.4 Set M[X']=Y', and go to step 3.

[1] If I am right about step 1 always working and the algorithm is
correct, then this is true because every step is deterministic and
there is no other way to try mapping the vertices that is not
isomorphic to what was already tried. In other words, permuting the
vertices in equivalence classes will not allow a mapping to be created,
which implies that no permutation of the vertices (while still sorting
them by their hash result) will result in graph equality, so there is
no isomorphism.

The equivalence class only tells the algorithm which set of vertices in
B to pick from. If the edges between X and X' match the edges between Y
and Y', and Y' is unmapped, then X' can be mapped to Y'. This is
straightforward for graphs without automorphisms, since each vertex is
a singleton in its equivalence class. For disconnected graphs with no
isomorphisms on its components, it is also straightforward since the
components are in different equivalence classes sorted by component.
For graphs with multiple permutation groups, as one vertex from a
permutation group is mapped, the valid permutations in other groups
must depend on the choices already made. Outside of step 1, any
arbitrary mapping choices directly correspond to a choice of which
graph permutation to select. I hypothesize that if the group
represented by graph automorphisms has order n, then the product of all
possible choices at every step in the algorithm must equal n. I am
nowhere near being able to prove this, but I think any algorithm that
decides graph isomorphism must at least pick one permutation out of the
permutation group. The problem with the proof will be proving how each
choice affects the number of other choices available at future steps.

Ben Livengood

Mike Amling

unread,
Sep 30, 2006, 5:50:43 PM9/30/06
to
fgrieu wrote:
> It would help if you could give two graphs that are not
> distinguished by your new code, but are distinguished by
> the previous version (is it Bill Cox's?).

CNN's next headline may well be 'Graph Isomorphism in P'. I reran
both algorithms, keeping the lexically lowest graph connection matrix
from each set of graphs that produced the same output. Both my version
of your algorithm and my version of Bill Cox's now produce exactly the
same set of 11,117 isomorphically distinct connection matrices for 8
nodes. This time I used a 64-bit (FNV) hash of your result string
instead of the 32 LSBs. I'm still not sure how you can get away with
throwing away the values and just keeping the lists.

--Mike Amling

Mike Amling

unread,
Sep 30, 2006, 6:01:31 PM9/30/06
to
Mike Amling wrote:
> This time I used a 64-bit (FNV) hash of your result string instead of
> the 32 LSBs.

And 4 iterations are sufficient for 8 nodes. The 5th iteration's
result always duplicates the immediately previous result.

--Mike Amling

fgrieu

unread,
Oct 1, 2006, 12:01:44 PM10/1/06
to
Mike Amling wrote:

> CNN's next headline may well be 'Graph Isomorphism in P'.

As soon as we get a proof ; 8 is shy from infinity :-)

Many thanks for doing this hard work.
Can I assume you in effect verified the following up to
8 nodes?
[I restate things with better terminology, no handwaving
justifications, and without the alternate step 6 which
was found wrong, or in need of nontrivial replacement].

The algorithm produce a bit string from an undirected graph
(symmetric graph matrix) with N nodes, no loopy vertices
(matrix diagonal is 0).

0. Associated to each node we need

- a "value" in the range 0..N-1, assimilated to a
fixed length bit string of ceil(lg2(N)) bits;
- a "temp" bit string;


- a "result" bit string, initially empty.

1. Repeat 2..5 with each node the special node in turn.


2. Assign the special node the value 1. Assign all the other
nodes the value 0.

Repeat 3..4 some number of times (say N).
3. For each node, assign to temp the node's value,
followed by its neighbor's values in sorted order
(duplicates kept); temp has length at most
N*ceil(lg2(N)).
4. Make a list of the N temp assigned in step 3.


Sort the list and remove duplicates (regard a shorter
string as strictly less than any longer string).
Assign each node a new value, which is the position of

its temp in the list.


Append to the special node's result a bit string

reversibly coding the list (e.g. prefix each temp


with a length prefix of appropriate fixed size, the

zero length coding end of list).


5. Append to the special node's result the sorted list
of the values of each node in sorted order (duplicates

kept).


6. Sort all N results (duplicates kept, shorter strings

first), compute the result as a bit string reversibly


coding the N results (e.g. prefix each result with a
length prefix of appropriate fixed size).

The output of isomorphic graphs is identical. It is
conjectured that non-isomorphic graphs produce
distinct results.


> I'm still not sure how you can get away with
> throwing away the values and just keeping the lists.

I think I can explain how my algorithm's correctness
follows if the only failures of Bill Cox's algorithm
are attribuable to hash collisions.

I deviate from Bill Cox by not using the node's
number of neighbors, with rationale that this is
compensated by an extra iteration; and by not
entering the iteration number, with rationale
that if this data element prevents a collision, then
the hash is distinguishable from a random oracle.

The algorithm, using a W-bit cryptographic hash such
as SHA-W for W=256, at this stage boils down to:

0. Associated to each node we need

- a W-bit "value";
- a W-bit "temp";
- a W-bit "result".
1. Repeat 2..5 with each node the special node in turn.
2. Assign the special node the value "all bits clear
except the last".
Assign all the other nodes the value "all bits clear".
Repeat 3..4 some number of times (say N).
3. For each node, assign to temp the hash of
[the node's value, followed by its neighbor's
values in sorted order (duplicates kept)].
4. For each node, copy temp to value.
5. Set the special node's result to the hash of
[values of each node in sorted order (duplicates kept)].
6. Produce the overall result as the hash of
[outputs of all nodes in sorted order (duplicates kept)].

Next I replace the cryptographic hash with a concatenation
of inputs and an overall length prefix; the algorithm becomes
free from accidental collisions, but unfortunately, the output
length now grows exponentially with the graph size.

Then I tweak the previous monster by keeping a more
compact version of things, namely by replacing the
"temps" produced at each step by fixed-size "values",
doing so without throwing away any information (thanks
to what gets accumulated in "output").
Loking at "values" alone, they can colide between graphs
that did not collide in the cryptographic version, but this
can occur only after "outputs" have diverged.

- -

An idea to speed up experimental verification against
http://www.research.att.com/~njas/sequences/A001349

It is straightforward that isomorphic graphs produce
identical output. We thus do not need to run the
algorithm on the N*(N-1)/2 possible graphs, only
on a subset where each class under isomorphism
has at least one member (ideally: just one).
An idea is to keep enumerating all possible graphs,
but filter out those found isomorphic to an earlier
one after some simple reordering heuristic.
Another approach is to enumerate the graphs with
some restrictive property, like "nodes listed in
order such that the number of neighbors is
nondecreasing", which allows a convenient
pruning of the enumerated space.


François Grieu

Ben Livengood

unread,
Oct 3, 2006, 8:25:06 PM10/3/06
to
Mike Amling wrote:
> > I think it is sufficient to represent the graph over the equivalence
> > class of its vertices. Let N be the number of equivalence classes over
> > the vertices in the graph.
>
> Equivalence under what relationship? Is this related to automorphism?

As far as I can tell, the vertices that generate the same final hash
(before the vertex hashes are combinded to make the final graph hash)
are the vertices that are part of the permutation group of the graph's
automorphism group. Clearly any automorphism of the graph will generate
the same final hash, but my assumption is that for each automorphism
the vertices with equivalent hashes will stay within their own
equivalence class, but in a different order. This seems pretty
straightforward, since if two different permutations of a graph's
vertices produce different hash results for any vertex, the hash would
not preserve isomorphisms.

I am pretty sure that François Grieu's counterexample proves that the
equivalence classes alone are not sufficient to represent the entire
graph. However, I still think the equivalence classes can be used in
conjunction with a sorted graph representation to generate a mapping
between the vertices of two graphs. If this can be done deterministicly
it would constitute a proof that the algorithm cannot generate false
positives, since if the algorithm could always generate a vertex
mapping between graphs with equivalent hashes then those graphs must be
isomorphic by definition.

You mentioned in another post that sorting the graph matrix lexically
produced a unique representation for every graph under isomorphism,
which would be much more useful than trying to work with equivalence
classes. How were you determining the lexical order? Did automorphisms
cause any difficulty?

Ben Livengood

Ben Livengood

unread,
Oct 3, 2006, 8:43:16 PM10/3/06
to
Ben Livengood wrote:
> You mentioned in another post that sorting the graph matrix lexically
> produced a unique representation for every graph under isomorphism,
> which would be much more useful than trying to work with equivalence
> classes. How were you determining the lexical order? Did automorphisms
> cause any difficulty?

Sorry, next time I'll think a few minutes longer before I post... I
finally remembered that automorphisms of the graph will result in the
same matrix.

Ben

Mike Amling

unread,
Oct 3, 2006, 9:56:42 PM10/3/06
to
Ben Livengood wrote:
> You mentioned in another post that sorting the graph matrix lexically
> produced a unique representation for every graph under isomorphism,
> which would be much more useful than trying to work with equivalence
> classes. How were you determining the lexical order?

I used a 64-bit integer to represent the 8x8 connection matrix. Its
lexical order was just the natural comparison of two's complement integers.

> Did automorphisms
> cause any difficulty?

The first comparison field was the generated hash. The 64-bit
integers were only sorted within sets of graphs that had the same hash.

--Mike Amling

Ben Livengood

unread,
Oct 4, 2006, 2:07:43 AM10/4/06
to
fgrieu wrote:
> Mike Amling wrote:
>
> > CNN's next headline may well be 'Graph Isomorphism in P'.
>
> As soon as we get a proof ; 8 is shy from infinity :-)

I found two strongly regular graphs for which my implementation of your
algorithm fails, at: http://www.maths.gla.ac.uk/~es/16.vertices

I picked the two (16,6,2,2) graphs at the top of the page and got the
same hash for both graphs, but I don't think the graphs are isomorphic.
I listed the two graphs at the end of this post. When I run the
algorithm each vertex has the exact same hash, so as far as I can tell
there is no way to sort the vertices to see if the two graphs are
actually isomorphisms without trying all 16! permutations. The way they
were listed on that page led me to believe they weren't isomorphic,
though.

Ben Livengood


Graph 1
0111111000000000
1011000111000000
1101000000111000
1110000000000111
1000011100100100
1000101010010010
1000110001001001
0100100011100100
0100010101010010
0100001110001001
0010100100011100
0010010010101010
0010001001110001
0001100100100011
0001010010010101
0001001001001110

Graph 2
0111111000000000
1011000111000000
1100100100110000
1100010010001100
1010001000101010
1001001000010101
1000110001000011
0110000001010110
0101000001101001
0100001110000011
0010100010011001
0010010100100101
0001100010100110
0001010100011010
0000101101001100
0000011011110000

http://www.maths.gla.ac.uk/~es/srgraphs.html has a large collection of
other strongly regular graphs as well.

Bill Cox

unread,
Oct 5, 2006, 9:09:05 AM10/5/06
to

Ben Livengood wrote:
> fgrieu wrote:
> I found two strongly regular graphs for which my implementation of your
> algorithm fails, at: http://www.maths.gla.ac.uk/~es/16.vertices
>
> I picked the two (16,6,2,2) graphs at the top of the page and got the
> same hash for both graphs, but I don't think the graphs are isomorphic.
> I listed the two graphs at the end of this post. When I run the
> algorithm each vertex has the exact same hash, so as far as I can tell
> there is no way to sort the vertices to see if the two graphs are
> actually isomorphisms without trying all 16! permutations. The way they
> were listed on that page led me to believe they weren't isomorphic,
> though.
>
> Ben Livengood

My code found these graphs to be different. "Strongly regular" graphs
are easily differentiated by the algorithm. Be sure to use 2*N (number
of nodes in the graph), rather than my original 2*D (diameter). I also
still hash the iteration step number at each node, though it remains to
be proven that this helps.

Bill

Bill Cox

unread,
Oct 5, 2006, 9:17:32 AM10/5/06
to

fgrieu wrote:
> I notice a fun thing: when using any variant of the algorithm
> to compare two graphs, we can detect with certainty when the
> algorithm fails, as follow.
>
> We use the algorithm to produce a hash for each vertex.
>
> If these hashes are not identical within order, we have
> proved the graphs are not isomorphic.
>
> Else, we reorder both graph's matrix according to
> nondecreasing hashes. If the reordered graphs are equal,
> we have proved the graphs are isomorphic.
>
> Else, the algorithm failed.
>
>
> François Grieu

Almost, but not quite true... If there are nodes in an graph with
identical hashes, then they are automorphic (symmetric?). Comparing
two automorphic graphs requires what I'm calling "resolution". The
algorithm I use is to pick two arbitrary nodes (one in each graph) that
have identical hashes. Assign them to each other by setting the hash
value of both nodes to a new big pseudo-random number. No other node
will have the same hash. Then, take do another hash iteration on each
graph with these nodes as the start-nodes.

I'll describe why this is needed with a simple example. Comparing two
rings of 4 nodes, every node in each graph will have the same hash
value. Obviously, there are random mappings from the first graphs
nodes to the second which are not valid isomorphisms. The resolution
step solves this problem.

Bill

Bill Cox

unread,
Oct 5, 2006, 9:49:22 AM10/5/06
to

Sorry for not reading all the recent posts before replying... you guys
are already way ahead of me. I went on a sales trip for a week, and
now my brain hurts just thinking about graph isomorphism. Somehow,
doing sales makes me want to play golf and stop reading /.

Bill Cox

unread,
Oct 5, 2006, 2:58:42 PM10/5/06
to
Here's an interesting paper on "strongly regular graphs", which so far,
don't seem too difficult.

http://citeseer.ist.psu.edu/cameron01strongly.html

Quoting from the paper:

"Another role of strongly regular graphs is aa test cases for graph
isomorphism testing algorithms. The global uniformity ensured by the
definition makes it harder to find a cononical labelling, while the
superexpodential number of graphs means that they cannot be processed
as exceptions."

All I can says is: we pass the test.

Bill Cox

unread,
Oct 7, 2006, 5:58:30 AM10/7/06
to

I ran the code between all 15 25-node graphs on this page:

http://www.maths.gla.ac.uk/~es/25.vertices

It differentiated them all.

Bill Cox

unread,
Oct 8, 2006, 2:23:50 PM10/8/06
to

I also finished running all 167 64 node strongly regular graphs against
each other. No counter-examples, all were differentiated.

Phil Carmody

unread,
Oct 8, 2006, 3:15:25 PM10/8/06
to

You are checking that it actually detects isomorphisms too, aren't you?
You've not explicitly said so. Else here's an O(1) algorithm which
correctly differentiates all of the examples you've listed so far.

are_isomorphic(M1, M2) = are_identical(M1,M2)

Phil
--
"Home taping is killing big business profits. We left this side blank
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.

Mike Amling

unread,
Oct 8, 2006, 9:17:01 PM10/8/06
to
Phil Carmody wrote:
> You are checking that it actually detects isomorphisms too, aren't you?
> You've not explicitly said so. Else here's an O(1) algorithm which
> correctly differentiates all of the examples you've listed so far.
>
> are_isomorphic(M1, M2) = are_identical(M1,M2)

Not to worry. Francois Grieu has proved that there are no false
negatives. (Je ne comprends pas la preuve. Elle implique beaucoup
d'onduler de main.)

--Mike Amling

Message has been deleted
Message has been deleted
Message has been deleted
Message has been deleted
Message has been deleted
Message has been deleted
Message has been deleted
Message has been deleted
Message has been deleted
Message has been deleted
Message has been deleted
Message has been deleted
Message has been deleted

Francois Grieu

unread,
Oct 9, 2006, 9:20:28 AM10/9/06
to
In article <egc7ud$o...@dispatch.concentric.net>,
Mike Amling <nos...@foobaz.com> wrote:

The proof that isomorphic graphs give identical hash, with less
handwaving (gestes des mains), runs as follows.

Suppose two graphs G, G' with n nodes are isomorhic. That is,
there exists a permutation P on {0..n-1} such that
for all i,j G[i,j] = G'[P(i),P(j)].

We'll be running (say) this version of the algorithm,

0. Associated to each node we need
- a W-bit "value";
- a W-bit "temp";
- a W-bit "result".
1. Repeat 2..5 with each node (s) the special node in turn.
2. Assign the special node the value "all bits clear
except the last".
Assign all the other nodes the value "all bits clear".
Repeat 3..4 some number of times, say for t in {0..N-1}.
3. For each node (u), assign to temp the hash of
[the node's value, followed by its neighbor's
values in sorted order (duplicates kept)].
4. For each node, copy temp to value.
5. Set the special node's result to the hash of
[values of each node in sorted order (duplicates kept)].
6. Produce the overall result as the hash of
[outputs of all nodes in sorted order (duplicates kept)].


Note value[u]@(s,t) the value for node u at step 3 when
computing result for node s (variable of step 1), and at
iteration t (variable in step 2), for graph G; and note
value'[u]@(s,t) the same for graph G'.

We'll proove by induction that
value[u]@(s,t) = value'[P[u]]@(P[s],t)

This is true when t==0, since
value[u]@(u,0) == 1, value[u]@(s,0) == 0 when u!=s,
value'[u']@(u',0) == 1, value'[u']@(s',0) == 0 when u'!=s',
we choose u' = P[u], s' = P[s'] and get
value'[P[u]]@(P[u],0) == 1, value'[P[u]]@(P[s],0) == 0 when u!=s
thus value[u]@(s,0) = value'[P[u]]@(P[s],0).

If the induction property is true up to t, when performing
step 3 for node u and graph G, the node's value is the same
as the value when running step 3 for node P[u] and graph G'.
And the neighbor's values being sorted are the same for
graph G and G", save for order. Because the values are ordered
before hash, thus the hash insentitive to order, it follows
that at iteration t, after step 3, for all u,
temp[u] = temp'[P[u]]
and thus after step 4, the induction property is valid for
the next t.

Because the hash used at step 5 is insensitive to order, it
follows that at step 5, result[s] = result'[P[s]]

Because the hash used at step 6 is isensitive to order, it
follows that the overall result is identical for G and G'.


I which that a proof that equal graph hashes implies isomorphism
would be that simple.


Note: it's silly, but I stil do not have a running version
of the code, and have not checked the "strongly similar" graphs
by myself (seems we have two contributors in disagrement here).


François Grieu

Bill Cox

unread,
Oct 9, 2006, 3:32:15 PM10/9/06
to

Yes, I've compared many of them against eachother, though I've done the
randomization by hand until now. I'll try to find time to randomize
them automatically and do full self-comparisons.

Francois Grieu

unread,
Oct 10, 2006, 5:40:07 AM10/10/06
to
In article <egc7ud$o...@dispatch.concentric.net>,
Mike Amling <nos...@foobaz.com> wrote:

The proof that isomorphic graphs give identical hash, with less


handwaving (gestes des mains), runs as follows.

Suppose two graphs G, G' with n nodes are isomorphic. That is,


there exists a permutation P on {0..n-1} such that
for all i,j G[i,j] = G'[P(i),P(j)].

We'll be running (say) this version of the algorithm,

0. Associated to each node we need
- a W-bit "value";
- a W-bit "temp";
- a W-bit "result".
1. Repeat 2..5 with each node (s) the special node in turn.
2. Assign the special node the value "all bits clear
except the last".
Assign all the other nodes the value "all bits clear".
Repeat 3..4 some number of times, say for t in {0..N-1}.
3. For each node (u), assign to temp the hash of
[the node's value, followed by its neighbor's
values in sorted order (duplicates kept)].
4. For each node, copy temp to value.
5. Set the special node's result to the hash of
[values of each node in sorted order (duplicates kept)].
6. Produce the overall result as the hash of
[outputs of all nodes in sorted order (duplicates kept)].


We'll note value[u]@(s,t) for the value for node u at


step 3 when computing result for node s (variable of step 1),
and at iteration t (variable in step 2), for graph G; and
note value'[u]@(s,t) the same for graph G'.

We'll prove by induction that for all u,s and t in {0..n-1}
value[u]@(s,t) = value'[P(u)]@(P(s),t)

This is true when t==0, since
value[u]@(u,0) == 1, value[u]@(s,0) == 0 when u!=s,
value'[u']@(u',0) == 1, value'[u']@(s',0) == 0 when u'!=s',

we choose u' = P(u), s' = P(s) and get
value'[P(u)]@(P(u),0) == 1, value'[P(u)]@(P(s),0) == 0 when u!=s
thus value[u]@(s,0) = value'[P(u)]@(P(s),0).

If the induction property is true up to t, then when performing


step 3 for node u and graph G, the node's value is the same

as the value when running step 3 for node P(u) and graph G'.


And the neighbor's values being sorted are the same for
graph G and G', save for order. Because the values are ordered

before hash, making the hash insensitive to order, it follows


that at iteration t, after step 3, for all u,

temp[u] = temp'[P(u)]


and thus after step 4, the induction property is valid for
the next t.

Because the hash used at step 5 is insensitive to order, it

follows that at step 5, result[s] = result'[P(s)]

Because the hash used at step 6 is insensitive to order, it


follows that the overall result is identical for G and G'.


I wish that a proof that equal graph hashes implies isomorphism
would be that simple, or even possible.


Note: it is silly, but I still do not have a running version


of the code, and have not checked the "strongly similar" graphs

by myself (seems we have two contributors in disagreement here).


François Grieu
[reposted with corrections]

Booted Cat

unread,
Oct 10, 2006, 5:29:29 PM10/10/06
to
Hi Bill,

Call Robert Sedgewick at http://www.cs.princeton.edu/~rs/

Francois Grieu

unread,
Oct 11, 2006, 2:46:44 AM10/11/06
to
Well, I finally implemented this version of the algorithm

0. Associated to each node we need
- a W-bit "value";
- a W-bit "temp";
- a W-bit "result".

1. Repeat 2..5 with each node the special node in turn.


2. Assign the special node the value "all bits clear
except the last".
Assign all the other nodes the value "all bits clear".

Repeat 3..4 for 2n times (n is the number of nodes)
3. For each node, assign to temp the hash of


[the node's value, followed by its neighbor's
values in sorted order (duplicates kept)].
4. For each node, copy temp to value.
5. Set the special node's result to the hash of
[values of each node in sorted order (duplicates kept)].
6. Produce the overall result as the hash of
[outputs of all nodes in sorted order (duplicates kept)].

My implementation consistently produce identical hashes
for isomorhic graphs, and discriminates many small
non-isomorhic graphs, including those that I posted.
But it fails to discriminate the "strongly regular"
graphs proposed as counterexamples by Ben Livengood.

This match Ben Livengood's results, and does not match
Bill Cox's results.

Among several other possibilities to explain this
dispredancy, I lean towards:
- a probem with Bill Cox's results
- that I failed to capture Bill Cox's algorithm.


François Grieu

Bill Cox

unread,
Oct 11, 2006, 12:15:04 PM10/11/06
to

You are both right. I had a bug, which now that I've fixed, matches
your results. My hashing algorithm fails to differentiate the 16-node
strongly regular graphs Ben Livengood posted, even with many hash
iterations.

Looks like this algorithm doesn't pan out after all... :-(

Oh, well. It's been interesting, and I've enjoyed hashing through it
with you guys. Thanks for finding a counter example!

Ben Livengood

unread,
Oct 11, 2006, 2:43:16 PM10/11/06
to
Bill Cox wrote:

> You are both right. I had a bug, which now that I've fixed, matches
> your results. My hashing algorithm fails to differentiate the 16-node
> strongly regular graphs Ben Livengood posted, even with many hash
> iterations.

Perhaps that was a feature, not a bug? ;)

> Looks like this algorithm doesn't pan out after all... :-(
>
> Oh, well. It's been interesting, and I've enjoyed hashing through it
> with you guys. Thanks for finding a counter example!

I have tried a few other things to salvage the algorithm, including
calculating the transitive closure of the graph and marking the nodes'
initial values according to when they're visited during the traversal
in order to try to capture more of the connection structure of the
graph, but that didn't work either. The interesting thing is that the
algorithm apparently works perfectly for all graphs up to 8 nodes. From
http://mathworld.wolfram.com/StronglyRegularGraph.html, there are
1,2,2,4,3,6,2,6 SRGs respectively for 1 through 8 nodes. However, I
don't think there are any non-isomorphic SRGs with the same parameters.
It would be interesting to find the minimum number of nodes necessary
for two graphs that cause the algorithm to fail, and whether the graphs
are also the smallest non-isomorphic SRGs with the same parameters.
It's possible that the algorithm only fails to distinguish between SRGs
with the same parameters, and that might even have an easy proof since
any difference in the number of connections the graph has will affect
the hashes.

Ben Livengood

0 new messages