Slowdown in is_isomorphic?

88 views
Skip to first unread message

mark mcclure

unread,
Feb 26, 2009, 8:53:45 PM2/26/09
to sage-devel
There appears to have been a slow down in the is_isomorphic
method of the Graph class. The code segment below (just
the part timed with %time) runs in 0.7 seconds in Sage 3.1.1
and 2.2 seconds in Sage 3.2.3 and Sage 3.3. Both times are
on my MacBook Pro at 2.1 GHz.

The code simply downloads some graphs and checks that they
are all distinct. Here are the graphs:

import urllib2
f = urllib2.urlopen('http://cs.anu.edu.au/~bdm/data/graph6.g6')
graph_strings = f.readlines()
f.close()
some_graphs = [Graph(graph_string) for graph_string in graph_strings]


And here is the timed code:

%time
result = false
for i in xrange(0,len(some_graphs)-1):
for j in xrange(i+1, len(some_graphs)):
if some_graphs[i].is_isomorphic(some_graphs[j]):
result = true
break
if result:
break
result

Mark McClure

Robert Miller

unread,
Feb 26, 2009, 9:35:33 PM2/26/09
to sage-devel
Mark,

Between Sage 3.1.1 and 3.2.3, the background implementation of graph
isomorphism was completely switched over. The old implementation
simply computed canonical labels, but it always constructed a C graph
first. Now, it actually uses an algorithm specifically designed to
test for isomorphism between two inputs, but it uses the underlying
implementation of the input graph. Therefore, if the input graphs are
based on NetworkX graphs, it will go much more slowly than it used to.
For maximum speed in is_isomorphic, you should construct c_graphs. If
G is a graph, you can do

sage: H = G.copy(implementation='c_graph')

Here are some benchmarks:

sage: G = graphs.PetersenGraph()
sage: S = SymmetricGroup(10)
sage: H = G.relabel(S.random_element(), inplace=False)
sage: G_C = G.copy(implementation='c_graph', sparse=False)
sage: H_C = H.copy(implementation='c_graph', sparse=False)
sage: G_C_S = G.copy(implementation='c_graph', sparse=True)
sage: H_C_S = H.copy(implementation='c_graph', sparse=True)

sage: timeit('G.is_isomorphic(H)')
625 loops, best of 3: 1.2 ms per loop
sage: timeit('G_C.is_isomorphic(H_C)')
625 loops, best of 3: 206 µs per loop
sage: timeit('G_C_S.is_isomorphic(H_C_S)')
625 loops, best of 3: 208 µs per loop

sage: D = graphs.DodecahedralGraph()
sage: S = SymmetricGroup(20)
sage: H = D.relabel(S.random_element(), inplace=False)
sage: D_C = D.copy(implementation='c_graph', sparse=False)
sage: H_C = H.copy(implementation='c_graph', sparse=False)
sage: D_C_S = D.copy(implementation='c_graph', sparse=True)
sage: H_C_S = H.copy(implementation='c_graph', sparse=True)

sage: timeit('D.is_isomorphic(H)')
125 loops, best of 3: 2.06 ms per loop
sage: timeit('D_C.is_isomorphic(H_C)')
625 loops, best of 3: 305 µs per loop
sage: timeit('D_C_S.is_isomorphic(H_C_S)')
625 loops, best of 3: 312 µs per loop


Does this modification bring you back up to old speed, or hopefully,
even better?


RLM

Jason Grout

unread,
Feb 26, 2009, 9:46:09 PM2/26/09
to sage-...@googlegroups.com
Robert Miller wrote:
> Mark,
>
> Between Sage 3.1.1 and 3.2.3, the background implementation of graph
> isomorphism was completely switched over. The old implementation
> simply computed canonical labels, but it always constructed a C graph
> first. Now, it actually uses an algorithm specifically designed to
> test for isomorphism between two inputs, but it uses the underlying
> implementation of the input graph. Therefore, if the input graphs are
> based on NetworkX graphs, it will go much more slowly than it used to.
> For maximum speed in is_isomorphic, you should construct c_graphs. If
> G is a graph, you can do
>
> sage: H = G.copy(implementation='c_graph')
>
> Here are some benchmarks:
>
> sage: G = graphs.PetersenGraph()
> sage: S = SymmetricGroup(10)
> sage: H = G.relabel(S.random_element(), inplace=False)
> sage: G_C = G.copy(implementation='c_graph', sparse=False)
> sage: H_C = H.copy(implementation='c_graph', sparse=False)
> sage: G_C_S = G.copy(implementation='c_graph', sparse=True)
> sage: H_C_S = H.copy(implementation='c_graph', sparse=True)
>
> sage: timeit('G.is_isomorphic(H)')
> 625 loops, best of 3: 1.2 ms per loop
> sage: timeit('G_C.is_isomorphic(H_C)')
> 625 loops, best of 3: 206 盜 per loop
> sage: timeit('G_C_S.is_isomorphic(H_C_S)')
> 625 loops, best of 3: 208 盜 per loop

>
> sage: D = graphs.DodecahedralGraph()
> sage: S = SymmetricGroup(20)
> sage: H = D.relabel(S.random_element(), inplace=False)
> sage: D_C = D.copy(implementation='c_graph', sparse=False)
> sage: H_C = H.copy(implementation='c_graph', sparse=False)
> sage: D_C_S = D.copy(implementation='c_graph', sparse=True)
> sage: H_C_S = H.copy(implementation='c_graph', sparse=True)
>
> sage: timeit('D.is_isomorphic(H)')
> 125 loops, best of 3: 2.06 ms per loop
> sage: timeit('D_C.is_isomorphic(H_C)')
> 625 loops, best of 3: 305 盜 per loop
> sage: timeit('D_C_S.is_isomorphic(H_C_S)')
> 625 loops, best of 3: 312 盜 per loop

>
>
> Does this modification bring you back up to old speed, or hopefully,
> even better?


Is there a reason that you don't convert to C graphs by default anymore?
I can see this preventing at least some people from getting fast
answers and having a bad opinion of Sage :(.


Jason

Robert Miller

unread,
Feb 26, 2009, 9:55:07 PM2/26/09
to sage-devel
> Is there a reason that you don't convert to C graphs by default anymore?
>   I can see this preventing at least some people from getting fast
> answers and having a bad opinion of Sage :(.

If there were evidence that things were *always* faster when
converting to C graphs automatically, I would definitely want to
implement that. What I was afraid of were the cases that would
terminate quickly in the isomorph stage, and hence copying the graph
over would be a waste of time. Granted, isomorph testing itself can
probably be expected to dominate, so I wouldn't be surprised if
converting first was just about always faster.

Thoughts?

Carl Witty

unread,
Feb 26, 2009, 10:11:33 PM2/26/09
to sage-...@googlegroups.com

I would think that the default method should be the one that's usually
faster. If you want to allow not converting, you could add an extra
argument convert_to_c=True, so the user could explicitly skip the
conversion step if there was reason to expect it wasn't a good idea in
this case.

Carl

mark mcclure

unread,
Feb 26, 2009, 10:31:32 PM2/26/09
to sage-devel
On Feb 26, 9:35 pm, Robert Miller <rlmills...@gmail.com> wrote:
> Between Sage 3.1.1 and 3.2.3, the background implementation
> of graph isomorphism was completely switched over. ...
> if the input graphs are based on NetworkX graphs, it will
> go much more slowly than it used to. For maximum speed in
> is_isomorphic, you should construct c_graphs.

Thanks Robert, that definitely helps a lot. The results of
my timing tests are below.

Timings:
Graphs7
NetworkX: 26s
Sage 3.1.1: 32s
Sage 3.3: 37s
(c_graphs)
Graphs6
Networkx: 0.55s
Sage 3.1.1: 0.72s
Sage 3.3: 0.79s
(c_graphs)

Again, I imported a collection of graphs and checked that
they were indeed all distinct. Graphs7 is the collection
of all 1044 connected graphs on 7 vertices and Graphs6 is
the collection of all 156 graphs on 6 vertices. I did not
include the Sage 3.3 timings using the old NetworkX graphs
but those timings appear to be about a factor of 4 slower.
I did experiment with sparse=True or sparse=False but the
timings appeared to be unaffected. Of course, these aren't
large matrices.

For clarity, my slightly revised code is below.

import urllib2
f = urllib2.urlopen('http://cs.anu.edu.au/~bdm/data/graph6.g6') # Or
graph7.g6
graph_strings = f.readlines()
f.close()
some_graphs = [Graph(graph_string) for graph_string in graph_strings]
# Switch to the new c_graph implementation.
better_graphs = [g.copy(implementation='c_graph', sparse=False) for g
in some_graphs]
n = len(some_graphs)


%time
# Start with result = false.
# That is, we haven't found a pair of isomorphic graphs.
result = false
for i in xrange(0,n-1):
for j in xrange(i+1, n):
if better_graphs[i].is_isomorphic(better_graphs[j]):
# Change better to some for 3.1.1.

Jason Grout

unread,
Feb 26, 2009, 11:35:39 PM2/26/09
to sage-...@googlegroups.com
mark mcclure wrote:
> On Feb 26, 9:35 pm, Robert Miller <rlmills...@gmail.com> wrote:
>> Between Sage 3.1.1 and 3.2.3, the background implementation
>> of graph isomorphism was completely switched over. ...
>> if the input graphs are based on NetworkX graphs, it will
>> go much more slowly than it used to. For maximum speed in
>> is_isomorphic, you should construct c_graphs.
>
> Thanks Robert, that definitely helps a lot. The results of
> my timing tests are below.
>
> Timings:
> Graphs7
> NetworkX: 26s
> Sage 3.1.1: 32s
> Sage 3.3: 37s
> (c_graphs)
> Graphs6
> Networkx: 0.55s
> Sage 3.1.1: 0.72s
> Sage 3.3: 0.79s
> (c_graphs)


What is the Networkx timing? That seems like the best.

Jason


mark mcclure

unread,
Feb 27, 2009, 7:10:41 AM2/27/09
to sage-devel
On Feb 26, 11:35 pm, Jason Grout <jason-s...@creativetrax.com> wrote:
> What is the Networkx timing?  That seems like the best.

That's just straight up NetworkX run independently of Sage.
Of course, the code is almost identical.

Mark

Jason Grout

unread,
Feb 27, 2009, 7:44:21 AM2/27/09
to sage-...@googlegroups.com


So it seems that your timings indicate that Networkx's isomorphism
checker is faster than the Sage one, even if we convert to c_graphs. Is
that right?

That's embarrassing; I thought we had the "fastest isomorphism checker
in the west".

Jason

mabshoff

unread,
Feb 27, 2009, 7:57:05 AM2/27/09
to sage-devel
Well, I don't know if Graphs7 is that interesting of a problem set
size wise (I assume it isn't if 1000+ tests can be done in less than
30 seconds) and I would like to see what happens when you use both
code bases on "large problems". But obviously being faster for the
small stuff would also be nice :)

> Jason

Cheers,

Michael

mark mcclure

unread,
Feb 27, 2009, 8:07:33 AM2/27/09
to sage-devel
On Feb 27, 7:57 am, mabshoff <mabsh...@googlemail.com> wrote:
> On Feb 27, 4:44 am, Jason Grout <jason-s...@creativetrax.com> wrote:
> > So it seems that your timings indicate that Networkx's isomorphism
> > checker is faster than the Sage one, even if we convert to c_graphs.  Is
> > that right?
>
> > That's embarrassing; I thought we had the "fastest isomorphism checker
> > in the west".
>
> Well, I don't know if Graphs7 is that interesting of a problem set
> size wise.

I would agree with Michael on this. I'll be examining graphs on < 10
vertices (and hopefully = 10 vertices) for a particular application
that
I have in mind. That's why I ran the particular test I ran. But it's
quite
possible that the C_Graph implementation works on moderately large
and sparse graphs. I'll probably look into this over the weekend.

Mark


Jason Grout

unread,
Feb 27, 2009, 10:45:54 AM2/27/09
to sage-...@googlegroups.com


Sure, I hope the c_graph code is better for bigger graphs. I was just
thinking that for smaller graphs, maybe we could use a faster method...

Jason

Robert Miller

unread,
Mar 2, 2009, 12:09:26 PM3/2/09
to sage-devel
Mark,

My timings are showing different results than yours.

(I'm running these tests on my idle MacBook, OS X 10.5.6, 2.4 GHz Core
2 Duo)

----------------------------------------------------------------------
| Sage Version 3.3, Release Date: 2009-02-21 |
| Type notebook() for the GUI, and license() for information. |
----------------------------------------------------------------------

sage: def go(graphs):
....: result = False
....: for i in xrange(len(graphs)):
....: for j in xrange(i+1,len(graphs)):
....: if graphs[i].is_isomorphic(graphs[j]):
....: result = True
....: break
....: if result: break
....: return result

sage: G = list(graphs(6))
sage: G = [g.copy(implementation='c_graph') for g in G]
sage: %time go(G)
CPU times: user 0.64 s, sys: 0.00 s, total: 0.65 s
Wall time: 0.66 s
False

sage: G = list(graphs(7))
sage: G = [g.copy(implementation='c_graph') for g in G]
sage: %time go(G)
CPU times: user 30.70 s, sys: 0.11 s, total: 30.81 s
Wall time: 31.11 s
False

sage: import networkx
sage: def go(graphs):
....: result = False
....: for i in xrange(len(graphs)):
....: for j in xrange(i+1,len(graphs)):
....: GM = networkx.isomorphvf2.GraphMatcher(graphs
[i],graphs[j])
....: if GM.is_isomorphic():
....: result = True; break
....: if result: break
....: return result

sage: G = list(graphs(6))
sage: G = [g.networkx_graph() for g in G]
sage: %time go(G)
CPU times: user 0.63 s, sys: 0.00 s, total: 0.63 s
Wall time: 0.63 s
False

sage: G = list(graphs(7))
sage: G = [g.networkx_graph() for g in G]
sage: %time go(G)
CPU times: user 29.81 s, sys: 0.10 s, total: 29.91 s
Wall time: 30.12 s
False

Essentially the same time. NetworkX is faster but by a ratio of 1.048
for graphs on 6 vertices and of 1.033 for graphs on 7 vertices.

What platform were you running your tests on?

Robert Miller

unread,
Mar 2, 2009, 1:45:09 PM3/2/09
to sage-devel
I have implemented some speedups at #5421.

First, let me remark that switching to c_graphs when possible is
advisable, even for graphs as small as 7 vertices. I tested the
is_isomorphic function on all pairs of such graphs, and used the
longest one as a case study. In particular, with c_graph already as
the underlying implementation, the timing of isomorph testing itself
is about (max) 0.000172s. Switching from NetworkX implementation to
c_graph takes about (max) 0.000796s. The sum is 0.000968s, which is
still faster than sending NICE the NetworkX graph, which takes about
(max) 0.001330s.

On to the speedups: the old benchmarks--

sage: G = list(graphs(6))
sage: G = [g.copy(implementation='c_graph') for g in G]
sage: %time go(G)
CPU times: user 0.64 s, sys: 0.00 s, total: 0.65 s
Wall time: 0.66 s
False

sage: G = list(graphs(7))
sage: G = [g.copy(implementation='c_graph') for g in G]
sage: %time go(G)
CPU times: user 30.70 s, sys: 0.11 s, total: 30.81 s
Wall time: 31.11 s
False

And the new ones--

sage: G = list(graphs(6))
sage: G = [g.copy(implementation='c_graph') for g in G]
sage: %time go(G)
CPU times: user 0.24 s, sys: 0.00 s, total: 0.24 s
Wall time: 0.25 s
False
sage: G = list(graphs(7))
sage: G = [g.copy(implementation='c_graph') for g in G]
sage: %time go(G)
CPU times: user 10.62 s, sys: 0.03 s, total: 10.65 s
Wall time: 10.71 s
False

So now we're beating NetworkX three to one (at least on my machine).
I'm still curious why Mark's results are different. Perhaps Cython
runs slower on his machine...

Finally, when we simply pass the Python (NX) data structure to NICE,
the analogous results are:

sage: G = list(graphs(6))
sage: %time go(G)
CPU times: user 0.88 s, sys: 0.00 s, total: 0.88 s
Wall time: 0.89 s
False

sage: G = list(graphs(7))
sage: %time go(G)
CPU times: user 41.88 s, sys: 0.14 s, total: 42.02 s
Wall time: 42.27 s
False

Robert Miller

unread,
Mar 2, 2009, 2:02:01 PM3/2/09
to sage-devel
> Well, I don't know if Graphs7 is that interesting of a problem set
> size wise (I assume it isn't if 1000+ tests can be done in less than
> 30 seconds)

Actually, it's 1000+ choose 2, or about 544K.

mabshoff

unread,
Mar 2, 2009, 2:10:06 PM3/2/09
to sage-devel
Yeah, I see that now since I guess it does the isomorphism testing for
each graph on the list for any other graph :). Judging from the patch
the overhead was killing you there to some extend. If someone reviews
the patch in the next hour it will go into 3.4.rc0.

My plan was to do 3.4.rc0 last night once we went from the UGA campus
to Jon's house, but the snow storm on the east coast killed power in
many places in Athens due to trees crashing down all over the place
and cutting power lines and hence the net was off and I just went to
sleep. This is the best excuse ever for not doing a Sage release on
time :)

Cheers,

Michael

Robert Miller

unread,
Mar 2, 2009, 2:36:04 PM3/2/09
to sage-devel
> ... Judging from the patch the overhead was killing you there...

Entirely.

The new patch also allows the following notation:

sage: G = list(graphs(7, implementation='c_graph'))
sage: def test(G):
....: for i in xrange(len(G)):
....: for j in xrange(i+1,len(G)):
....: if G[i].is_isomorphic(G[j]): print "FAIL"; return
....:
sage: %time test(G)
CPU times: user 10.52 s, sys: 0.07 s, total: 10.59 s
Wall time: 10.76 s

Robert Bradshaw

unread,
Mar 2, 2009, 2:55:16 PM3/2/09
to sage-...@googlegroups.com
On Mar 2, 2009, at 10:45 AM, Robert Miller wrote:

> I have implemented some speedups at #5421.
>
> First, let me remark that switching to c_graphs when possible is
> advisable, even for graphs as small as 7 vertices.

Then I'd say let's do this by default. (Perhaps add a "don't change"
implementation option.)

- Robert

mark mcclure

unread,
Mar 2, 2009, 6:30:49 PM3/2/09
to sage-devel
On Mar 2, 12:09 pm, Robert Miller <rlmills...@gmail.com> wrote:
> My timings are showing different results than yours.
> ...
> What platform were you running your tests on?

Hey,

Just got back from a day of fun in the snow.

I can confirm your timing comparisons, when I run your code
on my machine. Makes me wonder if I'm doing anything wrong
in my code. The exact code I used to produce my originally
posted timing results is below. My machine is a MacBook Pro
running Sage 3.3 on OSX 10.4.11.

I haven't tried your patch, but I'm looking forward to it.

Mark


--------------------- NetworkX ----------------------
-- The interpreter box at the top is set to Python --

import networkx as nx
import time as time

import urllib2
f = urllib2.urlopen('http://cs.anu.edu.au/~bdm/data/graph7.g6')
graph_strings = f.readlines()
f.close()
graphs = [nx.parse_graph6(s.strip()) for s in graph_strings]

t = time.clock()
result = False
for i in xrange(0, len(graphs)-1):
for j in xrange(i+1, len(graphs)-1):
if nx.isomorphvf2.GraphMatcher(graphs[i],graphs
[j]).is_isomorphic():
result=True
break
if result:
break
result
time.clock()-t

CPU time: 27.39 s


----------------- Sage code --------------
import urllib2
f = urllib2.urlopen('http://cs.anu.edu.au/~bdm/data/graph7.g6') # Or
graph7.g6
graph_strings = f.readlines()
f.close()
some_graphs = [Graph(graph_string) for graph_string in graph_strings]
# Switch to the new c_graph implementation.
better_graphs = [g.copy(implementation='c_graph', sparse=False) for g
in some_graphs]
n = len(some_graphs)

%time
result = false
for i in xrange(0,n):
for j in xrange(i+1, n):
if better_graphs[i].is_isomorphic(better_graphs[j]):
result = true
break
if result:
break
result

CPU time: 38.12 s
Reply all
Reply to author
Forward
0 new messages