Suggestion to speed up nauty_geng()?

86 views
Skip to first unread message

Ai Bo

unread,
Mar 21, 2019, 12:33:26 AM3/21/19
to sage-devel
I am running a program with these lines:
 i =12
 for G in graphs.nauty_geng(str(i) + " -C"):

It is very slow. I know the returned generator is very large. Is there a way to speed this up?

Is there a way to "random access"? For example, access the nth element in the "generator", instead of one by one?

Thank you!

Jori Mäntysalo (TAU)

unread,
Mar 21, 2019, 2:48:38 AM3/21/19
to sage-devel
On Thu, 21 Mar 2019, Ai Bo wrote:

> Is there a way to "random access"? For example, access the nth element
> in the "generator", instead of one by one?

Kind of. As a most time is propably spent by creating Python data
structures for SageMath, you can use nautygen directly to generate huge
number of graphs.

As an example, it takes below 5 seconds to generate all biconnected graphs
on 10 vertices, and I took third last one:

$ nauty26r7/geng 10 -C | tail -3 | head -1
>A /home/jm58660/lat-koe/nauty26r7/geng -Cd2D9 n=10 e=10-45
>Z 9743542 graphs generated in 4.59 sec
I]~~~~~~w

and now

sage: g = Graph('I]~~~~~~w', format='graph6')
sage: g.is_biconnected()
True


--
Jori Mäntysalo

Tampereen yliopisto - Ihminen ratkaisee

Simon King

unread,
Mar 21, 2019, 9:09:05 AM3/21/19
to sage-...@googlegroups.com
Hi!

Does either of you plan to open a ticket and make the functionality
available, that according to Jori is present in nauty but according
to Ai isn't wrapped in Sage?

Best regards,
Simon

Ai Bo

unread,
Mar 21, 2019, 11:03:06 AM3/21/19
to sage-devel
is this "nauty26r7/geng" a program available?
Also, as Python is slow, any part of the nautygen can be written in other language, such as C/C++?

Thanks,
Laura

Ai Bo

unread,
Mar 21, 2019, 11:06:51 AM3/21/19
to sage-...@googlegroups.com
Another question: If I use  nautygen and generate huge number of graph, how do I load into sage?
Because in my for loop, I also use other functions such as Sandpile on the graph generated by nautygen.
Thanks,

--
You received this message because you are subscribed to a topic in the Google Groups "sage-devel" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/sage-devel/QoqD5Ka6068/unsubscribe.
To unsubscribe from this group and all its topics, send an email to sage-devel+...@googlegroups.com.
To post to this group, send email to sage-...@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.

Dima Pasechnik

unread,
Mar 21, 2019, 11:46:56 AM3/21/19
to sage-devel
On Thu, Mar 21, 2019 at 3:03 PM Ai Bo <boai...@gmail.com> wrote:

> is this "nauty26r7/geng" a program available?

geng is installed in local/bin/ sub-directory of your Sage
installation, as a part of Sage's standard package nauty.

> Also, as Python is slow, any part of the nautygen can be written in other language, such as C/C++?

it is written in C, so it's quite fast in this sense.

>
> Thanks,
> Laura
>
> On Wednesday, March 20, 2019 at 11:48:38 PM UTC-7, Jori Mäntysalo (TAU) wrote:
>>
>> On Thu, 21 Mar 2019, Ai Bo wrote:
>>
>> > Is there a way to "random access"? For example, access the nth element
>> > in the "generator", instead of one by one?
>>
>> Kind of. As a most time is propably spent by creating Python data
>> structures for SageMath, you can use nautygen directly to generate huge
>> number of graphs.
>>
>> As an example, it takes below 5 seconds to generate all biconnected graphs
>> on 10 vertices, and I took third last one:
>>
>> $ nauty26r7/geng 10 -C | tail -3 | head -1
>> >A /home/jm58660/lat-koe/nauty26r7/geng -Cd2D9 n=10 e=10-45
>> >Z 9743542 graphs generated in 4.59 sec
>> I]~~~~~~w
>>
>> and now
>>
>> sage: g = Graph('I]~~~~~~w', format='graph6')
>> sage: g.is_biconnected()
>> True
>>
>>
>> --
>> Jori Mäntysalo
>>
>> Tampereen yliopisto - Ihminen ratkaisee
>
> --
> You received this message because you are subscribed to the Google Groups "sage-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.

Ai Bo

unread,
Mar 21, 2019, 2:41:31 PM3/21/19
to sage-devel
I found it. Thank you.
I also tried the command listed above.
I am confused. Where is this "I]~~~~~~w"?
Is it a file? How did Graph load this?

In my program, my code looks like this:
i=12
for G in graphs.nauty_geng(str(i) + " -C"):     
        q = True
        for j in range (0,i):
            S = Sandpile(G,j)
            if S.identity() != S.max_stable():
                q = False
                break



If I use geng to generate graphs, how should I load them in my for loop so I can check with Sandpile?

Sorry for being new in Sagemath. Thank you all for the help.

Ai Bo

unread,
Mar 21, 2019, 2:42:06 PM3/21/19
to sage-devel
I found it. Thank you.
I also tried the command listed above.
I am confused. Where is this "I]~~~~~~w"?
Is it a file? How did Graph load this?

In my program, my code looks like this:
i=12
for G in graphs.nauty_geng(str(i) + " -C"):     
        q = True
        for j in range (0,i):
            S = Sandpile(G,j)
            if S.identity() != S.max_stable():
                q = False
                break



If I use geng to generate graphs, how should I load them in my for loop so I can check with Sandpile?

Dima Pasechnik

unread,
Mar 21, 2019, 2:47:37 PM3/21/19
to sage-devel


On Thu, 21 Mar 2019 18:42 Ai Bo, <boai...@gmail.com> wrote:
I found it. Thank you.
I also tried the command listed above.
I am confused. Where is this "I]~~~~~~w"?

This is a particular way to encode graph as a string of characters (8 bits per character).
Read Sage docs on Graph for details.

Is it a file? How did Graph load this?

In my program, my code looks like this:
i=12
for G in graphs.nauty_geng(str(i) + " -C"):     
        q = True
        for j in range (0,i):
            S = Sandpile(G,j)
            if S.identity() != S.max_stable():
                q = False
                break



If I use geng to generate graphs, how should I load them in my for loop so I can check with Sandpile?

Ideally one should create a fast interface from Python to geng, probably using Cython, if it is not already done.

David Coudert

unread,
Mar 22, 2019, 3:28:59 AM3/22/19
to sage-devel
Our interface to nauty geng is in Python, but the difficulty is not here.
  • It takes time to build graphs from graph6 strings, and also to build Sandpiles (12 for each graph)
  • The number of biconnected graphs with 12 nodes is huge:  153.620.333.545  See https://oeis.org/A002218
Some parallelism could certainly help here, but there will be no miracle. A faster method for the tests on Sandpile is needed.

Nico Van Cleemput

unread,
Mar 22, 2019, 3:36:03 AM3/22/19
to sage-devel
If you want to use the command that Jori gave in Sage, you can use the fact that nauty_geng just runs geng and appends the option string after the command. So if you want to generate the last three graphs that are returned by geng for biconnected graphs on 10 vertices you can do this:

for g in graphs.nauty_geng("10 -C | tail -3"):
    print g


Cheers
Nico

Op vr 22 mrt. 2019 om 08:29 schreef David Coudert <david....@gmail.com>:

Samuel Lelievre

unread,
Mar 22, 2019, 3:49:56 AM3/22/19
to sage-devel
Thu 2019-03-21 18:47:37 UTC, Dima Pasechnik:

>
> Ideally one should create a fast interface from Python
> to geng, probably using Cython, if it is not already done.

PyNauty is discussed at

- Sage Trac ticket 25506:
  Nauty interface for isomorphism checking and automorphism group computing
  https://trac.sagemath.org/ticket/25506
  (needs review)

Dima Pasechnik

unread,
Mar 22, 2019, 4:11:38 AM3/22/19
to sage-...@googlegroups.com
On Thu, 21 Mar 2019 at 18:42, Ai Bo <boai...@gmail.com> wrote:
I found it. Thank you.
I also tried the command listed above.
I am confused. Where is this "I]~~~~~~w"?
Is it a file? How did Graph load this?

In my program, my code looks like this:
i=12
for G in graphs.nauty_geng(str(i) + " -C"):     
        q = True
        for j in range (0,i):
            S = Sandpile(G,j)
            if S.identity() != S.max_stable():
                q = False
                break



If I use geng to generate graphs, how should I load them in my for loop so I can check with Sandpile?

I think the bottleneck in you computation is not geng, but Sandpile: you spend much more time computing Sandpile compared with asking geng for the next graph.

Jori Mäntysalo (TAU)

unread,
Mar 22, 2019, 4:37:42 AM3/22/19
to sage-devel
OK, more explanation.

* * *

First I compare time for generating graphs in Nauty and in Sage. As plain
graphs(n) uses nauty, I have test.sage containing

print(sum(1 for _ in graphs(9)))

It takes about 11½ seconds to run. I tested this with

time ./sage test.sage

Then,

./local/bin/geng 9 > /dev/null

says "274668 graphs generated in 0.12 sec". So, if we just want to sample
few graphs, we can get the speedup of ~50x. In other words, it is slow to
convert data to Python internal format.

OTOH number of many finite structures up to isomorphism grows very fast.
So you can for example test some hypothesis to n=10 on a mobile phone,
n=11 on a desktop computer and n=13 on a supercomputer. Same happens when
you optimize code.

* * *

Next, does geng use multiple cpu cores? No. There is no difference between

time taskset -c 1 ./local/bin/geng 9 > /dev/null
time taskset -c 1-4 ./local/bin/geng 9 > /dev/null

(You could also use "top" to see cpu usage.)

* * *

Now, how to use geng, make a sample, and then get them to Sage? First I
generated all graphs (here to n=9 for speed):

$ ./local/bin/geng 9 > g9
>A ./local/bin/geng -d0D8 n=9 e=0-36
>Z 274668 graphs generated in 0.12 sec

OK, now I have a big list of strings:

$ head -3 g9 ; tail -3 g9
H??????
H????A?
H????B?
H]~~~~~
H^~~~~~
H~~~~~~

Every line is an encoded graph. I want to make a sample, lets say every
1000:th line. Every line is (HERE, not when n=12) 8 bytes long. So,

$ i=0; while [[ i -lt 274668 ]]; do dd if=g9 bs=8 skip=$i count=1 >> g9sample 2> /dev/null; i=$((i+1000)); done

will give you a file of 275 lines:

$ wc -l g9sample
275 g9sample

And now I did a test2.sage -file:

with open('g9sample', 'r') as fp:
c = 0
n = 0
for line in fp:
g = Graph(line, format='graph6')
n += 1
if g.is_connected():
c += 1
print("About %s percent are connected" % round(100.0*c/n))

and

$ ./sage test2.sage
About 95 percent are connected

Of course there are many other ways for this. For example you could read
the whole file with Python and just skip 99,9% of lines, or skip every
line with propability of 0.999 etc. Hopefully you get the idea from this.

Simon King

unread,
Mar 22, 2019, 4:37:53 AM3/22/19
to sage-...@googlegroups.com
Hi David,

On 2019-03-22, David Coudert <david....@gmail.com> wrote:
> Our interface to nauty geng is in Python, but the difficulty is not here.
>
> - It takes time to build graphs from graph6 strings, and also to build
> Sandpiles (12 for each graph)
> - The number of biconnected graphs with 12 nodes is huge: 153.620.333.545
> See https://oeis.org/A002218
>
> Some parallelism could certainly help here, but there will be no miracle.

If I understand correctly, this thread is about how to create an
iterator for these graphs that does not try to generate all of them
at once (and if I understand correctly, nauty/geng provides such
iterator, it just isn't wrapped, but could easily be, since the output
of geng, which is a stand-alone executable, can be read as a string [by
a wrapper in Sage] and then translated into a graph).

The miracle would be that in a doc test you could probably iterate over
the first 10 graphs, demonstrating that you do not need to create the
remaining 153,620,333,535.

Best regards,
Simon

Jori Mäntysalo (TAU)

unread,
Mar 22, 2019, 8:10:14 AM3/22/19
to sage-...@googlegroups.com
On Thu, 21 Mar 2019, Simon King wrote:

> Does either of you plan to open a ticket and make the functionality
> available, that according to Jori is present in nauty but according
> to Ai isn't wrapped in Sage?

At least I do not.

Ai Bo

unread,
Mar 22, 2019, 11:33:17 AM3/22/19
to sage-devel
Thank you!
I have resolved for up to "geng 11". It is the 12 that I need a solution. I am aware how many graphs it generates.

I used multi-core manually, i.e., I launched multiple(24) geng program on a 24-core machine.

There is no doubt that geng is much faster than using sagemath's nauty_geng(python). Thank you for the suggestion.

With 12, I can't write to a file, it is at least 1500G. I can write to a file up to around 300G at most.

That is why I am thinking how to divide the output of "geng 12".   So far, I don't have any idea. Any suggestion?

Jori Mäntysalo (TAU)

unread,
Mar 22, 2019, 12:06:49 PM3/22/19
to sage-devel
On Fri, 22 Mar 2019, Ai Bo wrote:

> With 12, I can't write to a file, it is at least 1500G. I can write to a
> file up to around 300G at most.

> That is why I am thinking how to divide the output of "geng 12".   So
> far, I don't have any idea. Any suggestion?

You can (and should) use A/B -notation to geng:

$ ./local/bin/geng 9 > /dev/null
>A ./local/bin/geng -d0D8 n=9 e=0-36
>Z 274668 graphs generated in 0.15 sec

$ ./local/bin/geng 9 0/10 > /dev/null
>A ./local/bin/geng -X0x200d0D8 n=9 e=0-36 class=0/10
>Z 30682 graphs generated in 0.02 sec

$ ./local/bin/geng 9 1/10 > /dev/null
>A ./local/bin/geng -X0x200d0D8 n=9 e=0-36 class=1/10
>Z 26300 graphs generated in 0.02 sec

and to verify this:

./local/bin/geng 9 | wc -l

outputs 274668, just like

for x in $(seq 0 9); do ./local/bin/geng 9 $x/10; done | wc -l

(But I guess you must use splitting number bigger than 10, maybe ~100.)

I think that you can run this in parallel and get you computation done in
a day or two. But n=13 might need a supercomputer.
Reply all
Reply to author
Forward
0 new messages