Implementation of Graphs, c_graphs, and NetworkX

114 views
Skip to first unread message

Nathann Cohen

unread,
Nov 27, 2009, 9:57:46 AM11/27/09
to Sage devel
Hello everybody !!!

Reviewing patch #7533, Rob Beezer noticed that the Graph function to
compute the distances between all the pairs of vertices in a Graph (
shortest_path_all_pairs(), written in Python ) is much slower than
computing independently the distances between all the pairs of
vertices in the Graph (which uses NetworkX), which is a bit
worrying...

I created ticket #7540 to eventually change this, to which Yann
replied by mentionning that NetworkX also contained a function to
compute all the distances in the graph. Here is one comparision
between these functions :

sage: g=graphs.RandomGNP(300,.1)
sage: time a=[g.distance(i,j) for (i,j) in Subsets(g,2)]
CPU times: user 2.67 s, sys: 0.01 s, total: 2.68 s
Wall time: 2.68 s

sage: time e=g.shortest_path_all_pairs()
CPU times: user 37.95 s, sys: 0.13 s, total: 38.08 s
Wall time: 38.23 s

sage: time e=networkx.all_pairs_shortest_path(g)
CPU times: user 1.93 s, sys: 0.02 s, total: 1.94 s
Wall time: 1.96 s

As you can see, the differences are crystal clear...

It then seems we should replace the Floyd-Marshall algorithm in Sage's
function shortest_path_all_pairs() by a direct call to NetworkX, and
call it a deal. It also seems to be for me the moment to ask a
question about the implementation of graphs in Sage :

If I make no mistake, Robert Miller rewrote the Graph class in C,
which sounds like we are trying to remove networkX from Sage and use
our own version of graphs instead. If this is the case, we will have a
C class for graphs, meaning that some Graph functions will be written
in C instead of Python, thus gaining a lot of speed in Sage. At this
point, what should we do ? Rewrite the Python implementation of the
Floyd Marshall algorithm in C ( or whatever was written in NetworkX if
more efficient ), just call NetworkX, something different ?

I think Sage's graph class needs to be rewritten to be a bit more
efficient..... Efficiency is a problem I have sometimes in Sage, on
instances for which this should not be a problem.... So I expect a lot
from the c_graphs that are currently somewhere in Sage ( are they used
? How, when ? )

In the end : What is going on with the C graphs in Sage, can we expect
them to eb available soon ? What are we trying to do with NetworkX (
leave it, keep it, nothing special ? )

Thank you for your answers !

Nathann Cohen

YannLC

unread,
Nov 27, 2009, 10:34:00 AM11/27/09
to sage-devel
No answer to the main question, but just completing the timings:

sage: G=graphs.RandomGNP(300,.1)
sage: time d=[G.distance(i,j) for (i,j) in Subsets(G.vertices(),2)]
CPU times: user 3.67 s, sys: 0.00 s, total: 3.67 s
Wall time: 3.72 s
sage: time d=networkx.all_pairs_shortest_path(G)
CPU times: user 2.59 s, sys: 0.00 s, total: 2.59 s
Wall time: 2.61 s
sage: time e=networkx.all_pairs_shortest_path_length(G.networkx_graph
(copy=False))
CPU times: user 1.34 s, sys: 0.00 s, total: 1.34 s
Wall time: 1.36 s

YannLC

unread,
Nov 27, 2009, 11:06:27 AM11/27/09
to sage-devel
You can obtain "some" c_grah in Sage with

cG = G.copy(implementation="c_graph")

I don't know how far the development is though.

Minh Nguyen

unread,
Nov 29, 2009, 8:22:35 AM11/29/09
to sage-...@googlegroups.com, Robert Miller
Hi Nathann,

On Sat, Nov 28, 2009 at 1:57 AM, Nathann Cohen <nathan...@gmail.com> wrote:

<SNIP>

> If I make no mistake, Robert Miller rewrote the Graph class in C,
> which sounds like we are trying to remove networkX from Sage and use
> our own version of graphs instead. If this is the case, we will have a
> C class for graphs, meaning that some Graph functions will be written
> in C instead of Python, thus gaining a lot of speed in Sage. At this
> point, what should we do ? Rewrite the Python implementation of the
> Floyd Marshall algorithm in C ( or whatever was written in NetworkX if
> more efficient ), just call NetworkX, something different ?
>
> I think Sage's graph class needs to be rewritten to be a bit more
> efficient..... Efficiency is a problem I have sometimes in Sage, on
> instances for which this should not be a problem.... So I expect a lot
> from the c_graphs that are currently somewhere in Sage ( are they used
> ? How, when ? )
>
> In the end : What is going on with the C graphs in Sage, can we expect
> them to eb available soon ? What are we trying to do with NetworkX (
> leave it, keep it, nothing special ? )

I share some of your concerns as expressed above. In the last few
days, I started writing an introductory chapter for a book on
algorithmic graph theory using Sage. In doing so, I found that Sage
lacks some basic features. For example, I have been unable to find a
function/method to compute the degree sequence of a graph.

What is required is a statement from Robert Miller, or people involved
in the development of C graph, about the current state of C graph. The
statement should include the state where C graph is at in terms of
development, other things that need to be implemented, as well as how
to navigate around the graph theory module.

I was reading through graph.py only to discover that it's very, very
long. Can that module be split up and organized along topical lines?
I'm interested in helping out with developing Sage's graph theory
module. That's a reason why I started writing the small book I
mentioned above. In writing that book, I want to develop an
understanding of the theory and algorithmic aspect of graph theory.
And I plan to use that understanding to contribute to
developing/maintaining the graph theory module.

--
Regards
Minh Van Nguyen

Nathann Cohen

unread,
Nov 29, 2009, 11:27:31 AM11/29/09
to sage-devel
HMmmm...

I started creating new modules, and I wanted to split it piece by
piece, with time. Ticket #7365 creates a module named
graph_decomposition which I intend to fill ( but I will begin to write
these functions when this patch will be merged and the file
created ).

Is it possible in Python to create a module containing a function X,
then to write in the body of the graph class "from my_module import X"
so that the X function can be used as any method in the graph class ?
I am asking this question because I thought it was possible, but I was
not able to do it with several functions defined in graph_coloring,
perhaps because of a cyclic dependance. This would enable us to split
this file efficiently, and to avoid having to copy docstrings and
create useless wrappers as it is the case for the moment with
coloring functions.

I also created ticket #7369 but it was refused as a blocker and I do
not really know how to produce such a patch without wiping out all the
other modifications. If you know how to write it, plllleaaaaaase tell
me :-)

If I can help you with your book, I'd gladly do it !!! I plan great
things for Graphs in Sage, but these days I am really stuck with
reviews... Just look at the state of this section on the Trac
Server !

Even though, most of this is not really "urgent" for Sage ( short of
the two LP tickets.. I madly need them to be in the standard Sage by
the end of december ) and can easily wait. The most important thing
for the graph library, methinks, is to be rewritten efficiently in C
so that I will be able to rewrite the most useful functions from
Python to Cython and speed up the rest of them...

For the degree sequence, I think g.degree() should do :-)

Nathann

kcrisman

unread,
Nov 29, 2009, 6:42:30 PM11/29/09
to sage-devel

> For example, I have been unable to find a
> function/method to compute the degree sequence of a graph.
>

That is odd; I am pretty sure there used to be such a method, maybe
two years ago?

- kcrisman

Minh Nguyen

unread,
Nov 29, 2009, 9:17:20 PM11/29/09
to sage-...@googlegroups.com
Hi kcrisman,

On Mon, Nov 30, 2009 at 10:42 AM, kcrisman <kcri...@gmail.com> wrote:

<SNIP>

> That is odd; I am pretty sure there used to be such a method, maybe
> two years ago?

It turns out there is such a function. One could use
Graph.degree_iterator() as suggested by mhansen, or Graph.degree() as
suggested by ncohen. But the result must be sorted in non-increasing
order:

[mvngu@sage ~]$ sage
----------------------------------------------------------------------
| Sage Version 4.2.1, Release Date: 2009-11-14 |
| Type notebook() for the GUI, and license() for information. |
----------------------------------------------------------------------
sage: g = Graph({"a": ["b", "e"], "b": ["a", "g", "e", "c"], \
....: "c": ["b", "e", "d"], "d": ["c", "f"], "e": ["f", "a", "b", "c"], \
....: "f": ["g", "d", "e"], "g": ["b", "f"]})
sage: sorted(g.degree(), reverse=True)
[4, 4, 3, 3, 2, 2, 2]
sage: sorted(g.degree_iterator(), reverse=True)
[4, 4, 3, 3, 2, 2, 2]

kcrisman

unread,
Nov 30, 2009, 1:09:17 PM11/30/09
to sage-devel

> sage: sorted(g.degree(), reverse=True)
> [4, 4, 3, 3, 2, 2, 2]

This is probably what I was thinking of; I didn't see sample code.
Probably in the situations I used, I just used g.degree() all by its
lonesome self, unsorted, because the cases I needed could be done "by
hand".

- kcrisman

Minh Nguyen

unread,
Nov 30, 2009, 4:50:27 PM11/30/09
to sage-...@googlegroups.com
Hi kcrisman,

On Mon, Nov 30, 2009 at 10:42 AM, kcrisman <kcri...@gmail.com> wrote:

<SNIP>

> That is odd; I am pretty sure there used to be such a method, maybe
> two years ago?

Ticket #7564 [1] improves the documentation of the method
GenericGraph.degree() by adding two examples showing how one could use
that method to obtain the degree sequence of a graph. The ticket needs
some reviewing love :-)

[1] http://trac.sagemath.org/sage_trac/ticket/7564

Robert Bradshaw

unread,
Dec 1, 2009, 12:01:17 AM12/1/09
to sage-...@googlegroups.com
On Nov 29, 2009, at 5:22 AM, Minh Nguyen wrote:

> Hi Nathann,
>
> On Sat, Nov 28, 2009 at 1:57 AM, Nathann Cohen <nathan...@gmail.com
> > wrote:
>
> <SNIP>
>
>> If I make no mistake, Robert Miller rewrote the Graph class in C,
>> which sounds like we are trying to remove networkX from Sage and use
>> our own version of graphs instead. If this is the case, we will
>> have a
>> C class for graphs, meaning that some Graph functions will be written
>> in C instead of Python, thus gaining a lot of speed in Sage. At this
>> point, what should we do ? Rewrite the Python implementation of the
>> Floyd Marshall algorithm in C ( or whatever was written in NetworkX
>> if
>> more efficient ), just call NetworkX, something different ?

I think the basic idea was that one could write a faster (sparse and
dense) graph "core," and then run all the NetworkX algorithms on top
of it as long as it supported the interface (for manipulating and
querying vertices and edges). If some code was still too slow then it
would be moved down to C (hopefully it would be sufficient to declare
the graphs as c-graphs and compile with Cython to remove all the
Python function call overhead). I don't know how well this works at
the moment.

>> I think Sage's graph class needs to be rewritten to be a bit more
>> efficient..... Efficiency is a problem I have sometimes in Sage, on
>> instances for which this should not be a problem.... So I expect a
>> lot
>> from the c_graphs that are currently somewhere in Sage ( are they
>> used
>> ? How, when ? )

They are used for graph isomorphism, for example.

>> In the end : What is going on with the C graphs in Sage, can we
>> expect
>> them to eb available soon ?

You can use them right now. They're just not as feature full.

>> What are we trying to do with NetworkX (leave it, keep it, nothing
>> special ? )

We're certainly keeping it for the time being, but there's a limit to
speed if one sticks with pure Python. I don't think there's any goal
to replace NetworkX altogether. It could be like a lot of other things
were we handle some things ourselves (where we needed absolute speed)
and call off to NetworkX for the rest.

> I share some of your concerns as expressed above. In the last few
> days, I started writing an introductory chapter for a book on
> algorithmic graph theory using Sage. In doing so, I found that Sage
> lacks some basic features. For example, I have been unable to find a
> function/method to compute the degree sequence of a graph.

Though much could be added of course, I'm sure something this basic is
a lack of documentation rather than functionality.

> What is required is a statement from Robert Miller, or people involved
> in the development of C graph, about the current state of C graph. The
> statement should include the state where C graph is at in terms of
> development, other things that need to be implemented, as well as how
> to navigate around the graph theory module.

That would be helpful, but it's not needed. Robert Miller is busy
studying p-descent and other BSD related topics for his thesis right
now, not graphs, so I'd imagine that there's not much going on
development wise with them unless you want to take up the banner.

> I was reading through graph.py only to discover that it's very, very
> long. Can that module be split up and organized along topical lines?

As a first pass, the Graph and DiGraph class could be split out into
their own files.

- Robert

Robert Bradshaw

unread,
Dec 1, 2009, 12:24:18 AM12/1/09
to sage-...@googlegroups.com
On Nov 29, 2009, at 8:27 AM, Nathann Cohen wrote:

> HMmmm...
>
> I started creating new modules, and I wanted to split it piece by
> piece, with time. Ticket #7365 creates a module named
> graph_decomposition which I intend to fill ( but I will begin to write
> these functions when this patch will be merged and the file
> created ).
>
> Is it possible in Python to create a module containing a function X,
> then to write in the body of the graph class "from my_module import X"
> so that the X function can be used as any method in the graph class ?
> I am asking this question because I thought it was possible, but I was
> not able to do it with several functions defined in graph_coloring,
> perhaps because of a cyclic dependance. This would enable us to split
> this file efficiently, and to avoid having to copy docstrings and
> create useless wrappers as it is the case for the moment with
> coloring functions.

I think something like this is possible, but it does make it harder to
figure out how to navigate the source. Big files are a pain, but so
are classes that get split among a dozen small files.

> I also created ticket #7369 but it was refused as a blocker and I do
> not really know how to produce such a patch without wiping out all the
> other modifications. If you know how to write it, plllleaaaaaase tell
> me :-)

Look up hg copy and hg rename. This works if you patch is merged with
the other changes (not just applied on top). The criteria for a
blocker is really high (like giving blatantly wrong answers or not
starting up).

> If I can help you with your book, I'd gladly do it !!! I plan great
> things for Graphs in Sage, but these days I am really stuck with
> reviews... Just look at the state of this section on the Trac
> Server !

If you're having trouble to getting patches reviewed fast enough, one
trick is to try to get people to reciprocate after reviewing some of
theirs.

> Even though, most of this is not really "urgent" for Sage ( short of
> the two LP tickets.. I madly need them to be in the standard Sage by
> the end of december )
>
> and can easily wait. The most important thing
> for the graph library, methinks, is to be rewritten efficiently in C
> so that I will be able to rewrite the most useful functions from
> Python to Cython and speed up the rest of them...
>
> For the degree sequence, I think g.degree() should do :-)
>
> Nathann
>
> --
> To post to this group, send an email to sage-...@googlegroups.com
> To unsubscribe from this group, send an email to sage-devel-...@googlegroups.com
> For more options, visit this group at http://groups.google.com/group/sage-devel
> URL: http://www.sagemath.org

Fernando Perez

unread,
Dec 1, 2009, 5:59:29 AM12/1/09
to sage-devel
On Mon, Nov 30, 2009 at 9:01 PM, Robert Bradshaw
<robe...@math.washington.edu> wrote:
> I think the basic idea was that one could write a faster (sparse and
> dense) graph "core," and then run all the NetworkX algorithms on top
> of it as long as it supported the interface (for manipulating and
> querying vertices and edges). If some code was still too slow then it
> would be moved down to C (hopefully it would be sufficient to declare
> the graphs as c-graphs and compile with Cython to remove all the
> Python function call overhead). I don't know how well this works at
> the moment.
>

Is there any particular reason why this code couldn't be contributed
upstream to Networkx? I use networkx outside of Sage, and it would be
great to have speed improvements made there as well; since Sage
already ships nx it would obviously benefit regardless.

There may be a good reason for not doing so, I'm just curious. From
my perspective, the more Sage work also benefits upstream projects,
the more impact it has and more people benefit from it, both those who
use Sage directly and those who may use those projects outside of
Sage. Obviously in some cases this isn't viable (dependencies on core
Sage code are an obvious one), hence my question.

Cheers,

f

William Stein

unread,
Dec 1, 2009, 7:40:13 AM12/1/09
to sage-devel
One potential problem is that all the Sage graph theory code is GPL'd,
but Networkx is now BSD licensed (it used to be GPL'd). Given that
graph theory in Sage is an area with a lot of possibly unfriendly
competition with Magma and Mathematica, the GPL license is appropriate
for us for that part of Sage.

A second potential problem is that all the code we're talking about is
written in Cython, whereas Networkx is a pure Python library (I
think?).

-- William

William Stein

unread,
Dec 1, 2009, 7:45:03 AM12/1/09
to sage-devel
On Tue, Dec 1, 2009 at 5:59 AM, Fernando Perez <fpere...@gmail.com> wrote:
One potential problem is that all the Sage graph theory code is GPL'd,
but Networkx is now BSD licensed (it used to be LGPL'd). Given that
graph theory in Sage is an area with a lot of possibly unfriendly
competition with Magma and Mathematica, the GPL license is appropriate
for us for that part of Sage.

I also recall that the way Networkx switched from LGPL to BSD was odd.
They sent an email to their list and said something like: "we're
switching from the LGPL to the BSD license; if you have a problem with
that please respond within 5 days, otherwise we're making the switch".
That wasn't the exact wording, but it was something like that. I
would personally be uncomfortable contributing code to a project that
switches licenses like that.

A second potential problem is that all the code we're talking about is
written in Cython, whereas Networkx is a pure Python library (I
think?).

The biggest problem we have wrt Networkx right now is that Sage still
ships a very old version, because Networkx's API changed a huge amount
and it was so far too difficult to change Sage to work with the new
API. I wish somebody would do so...

-- William

William Stein

unread,
Dec 1, 2009, 7:48:34 AM12/1/09
to sage-devel
I also recall that the way Networkx switched from LGPL to BSD made me
uncomfortable. They sent an email to their list and said something
like: "we're switching from the LGPL to the BSD license; if you have a
problem with that please respond within 10 days". That wasn't the
exact wording, but it was something like that.
http://www.mail-archive.com/sage-...@googlegroups.com/msg27341.html

The biggest problem we have wrt Networkx right now is on our end --
Sage still ships a very old version, because Networkx's API changed a
huge amount and it was so far too difficult to change Sage to work
with the new API. I hope somebody will fix this soon, since the
longer we wait the harder it becomes.

William

Robert Bradshaw

unread,
Dec 1, 2009, 9:50:38 AM12/1/09
to sage-...@googlegroups.com
On Dec 1, 2009, at 2:59 AM, Fernando Perez wrote:

> On Mon, Nov 30, 2009 at 9:01 PM, Robert Bradshaw
> <robe...@math.washington.edu> wrote:
>> I think the basic idea was that one could write a faster (sparse and
>> dense) graph "core," and then run all the NetworkX algorithms on top
>> of it as long as it supported the interface (for manipulating and
>> querying vertices and edges). If some code was still too slow then it
>> would be moved down to C (hopefully it would be sufficient to declare
>> the graphs as c-graphs and compile with Cython to remove all the
>> Python function call overhead). I don't know how well this works at
>> the moment.
>>
>
> Is there any particular reason why this code couldn't be contributed
> upstream to Networkx? I use networkx outside of Sage, and it would be
> great to have speed improvements made there as well; since Sage
> already ships nx it would obviously benefit regardless.

I think, at least for cgraphs, it's an issue of NetworkX wanting to
stay pure Python. They have few(er) dependancies, and we're happy to
contribute upstream.

- Robert

Robert Bradshaw

unread,
Dec 1, 2009, 9:54:19 AM12/1/09
to sage-...@googlegroups.com
I missed this--yes, this is an issue as well (though we still like
helping upstream BSD projects as well).

- Robert


Nathann Cohen

unread,
Dec 1, 2009, 11:56:51 AM12/1/09
to sage-devel
Hmmm... If we can make Sage's C graphs as fast as NetworkX's , I
assure you it will be very hard for them to compete with LP on our
side and so many features around in Sage... As most of NetworkX's
functions are not very hard to rewrite once the basis is set ( graph
structure, neighbors, etc ) I am more set to try to move away from
this library, especially if they changed to a Sage-uncompatible
license.

The hardest things to rewrite that I can see on their web page at the
moment is the stuff related to max cliques, which is way better in
Cliquer now merged natively in Sage. The isomorphism test has already
been rewritten using Nice ( and does not need them anymore -- or at
least I believe it is ). Last one is the matching problem, which has
already been rewritten in LP and is waiting for reviews.

From the point of view of algorithms, I do not think we really need
them anymore. What we can not do natively is Sage can be rewritten
without too much pain ( tell me what needs to be or create a Trac
ticket, I'll take a look at it :-) )

Nathann

Fernando Perez

unread,
Dec 1, 2009, 2:56:42 PM12/1/09
to sage-devel
On Tue, Dec 1, 2009 at 4:40 AM, William Stein <wst...@gmail.com> wrote:
> One potential problem is that all the Sage graph theory code is GPL'd,
> but Networkx is now BSD licensed (it used to be GPL'd).    Given that
> graph theory in Sage is an area with a lot of possibly unfriendly
> competition with Magma and Mathematica, the GPL license is appropriate
> for us for that part of Sage.

Aha, that makes perfect sense for Sage then. Thanks for the clarification.

> A second potential problem is that all the code we're talking about is
> written in Cython, whereas Networkx is a pure Python library (I
> think?).

Actually I've heard Aric Hagberg (the nx project lead) say that he's
very interested in Cythonizing parts of nx for speed reasons, so I
doubt this would be a problem; they might actually like to see
contributions from people with Cython expertise. But your point on
licensing stands.

Cheers,

f

Fernando Perez

unread,
Dec 1, 2009, 3:03:44 PM12/1/09
to sage-devel
On Tue, Dec 1, 2009 at 8:56 AM, Nathann Cohen <nathan...@gmail.com> wrote:
> Hmmm... If we can make Sage's C graphs as fast as NetworkX's , I
> assure you it will be very hard for them to compete with LP on our
> side and so many features around in Sage... As most of NetworkX's
> functions are not very hard to rewrite once the basis is set ( graph
> structure, neighbors, etc ) I am more set to try to move away from
> this library, especially if they changed to a Sage-uncompatible
> license.

An important clarification:

The BSD license is *not* incompatible with Sage, as Sage ships a
massive amount of BSD (and similarly licensed) code already. What
William said was that for this particular topic, using the GPL was
better aligned with the goals and needs of Sage, not that the NetworkX
license was incompatible.

Cheers,

f

YannLC

unread,
Dec 5, 2009, 6:08:50 PM12/5/09
to sage-devel
I have a patch making sage work with Netwotkx 1.0rc1 based on sage
4.3.alpha0, but it will probably conflict with many other graph
related patches...
If anyone wants to improve it, please do!

It's ticket #7608 now.

YannLC

unread,
Dec 5, 2009, 7:02:14 PM12/5/09
to sage-devel
Just to be clear, my patch is based on 4.3alpha0 and won't apply
cleanly to 4.3alpha1, that's why it's still marked as needs work.

YannLC

unread,
Dec 5, 2009, 8:55:50 PM12/5/09
to sage-devel
patch updated. It should apply fine to sage 4.3alpha1 now.

Yann

Nathann Cohen

unread,
Dec 6, 2009, 2:56:28 AM12/6/09
to sage-devel
Your patch exposes the function maximum_weight matching from
NertworkX !!!
There is a high chance that I may have written the patch for matching
using LP for naught, but an even higher chance that NetworkX's
function could be even more efficient (at least for small cases) !!!
Thank you very much :-)

I hope it can be merged into next release !!!

Nathann

Robert Miller

unread,
Dec 8, 2009, 8:20:33 PM12/8/09
to sage-...@googlegroups.com
On Tue, Dec 1, 2009 at 8:56 AM, Nathann Cohen <nathan...@gmail.com> wrote:
> Hmmm... If we can make Sage's C graphs as fast as NetworkX's , I
> assure you it will be very hard for them to compete with LP on our
> side and so many features around in Sage... As most of NetworkX's
> functions are not very hard to rewrite once the basis is set ( graph
> structure, neighbors, etc ) I am more set to try to move away from
> this library, especially if they changed to a Sage-uncompatible
> license.

Here is the short story with Sage's c_graphs: I have done the work
necessary for Sage to use them for just about everything in sight. I
think there are a few functions which still depend on NetworkX, but
this patch ( http://sage.math.washington.edu/home/rlmill/switch_on_cgraphs.patch
) switches the default for all graphs to use c_graphs instead of
NetworkX -- it's old, and may need rebase. With this patch, almost
nothing still uses NetworkX, but some things slow down a little.
Although the underlying structure is faster, the translation in the
middle is slowing things down. What someone ambitious should do is
apply that patch, run their favorite function, and see whether it
slows down or speeds up, and if it slows down, vet out why it does in
the new backend code, to see what can be optimized. Everything is
ready to switch, the only reason we haven't have been because of
certain speed regressions (despite other functions getting faster ;)
-- we'll just have to see what happens.)

--
Robert L. Miller
http://www.rlmiller.org/

Robert Miller

unread,
Dec 8, 2009, 8:27:15 PM12/8/09
to sage-...@googlegroups.com
>>> In the end : What is going on with the C graphs in Sage, can we
>>> expect
>>> them to eb available soon ?
>
> You can use them right now. They're just not as feature full.

Actually, that's no longer true. As of the closing of #6085, they are
just as full of features as NX-based Sage graphs. The caveat is that a
few functions in the graph library actually call
self.networkx_graph().some_function(), so they still use NX.

Anyone interested in helping to get Sage using C graphs to completely
replace NetworkX will have my immediate attention, and is encouraged
to email me directly. It's actually very close to finished!

Robert Miller

unread,
Dec 8, 2009, 9:09:38 PM12/8/09
to sage-...@googlegroups.com
I have posted the relevant code here:

http://trac.sagemath.org/sage_trac/ticket/7634

Robert Miller

unread,
Dec 9, 2009, 2:46:34 PM12/9/09
to sage-devel
Responding to a comment from trac:
> One problem IMHO with `c_graph` is that as is (correct me if I'm wrong)
> we won't be able to have a fast `in_neighbors`.

This is certainly true, if you're using a SparseGraph to represent a
DiGraph. In this case, I think there should actually be two
SparseGraphs in the backend, representing arcs and reversed arcs. That
way we never need to call in_neighbors, only out_neighbors of the
other SparseGraph. This would double the time of many operations, but
those operations are much much faster than NetworkX so that might not
be so bad.

Other suggestions are welcome!

Nathann Cohen

unread,
Dec 10, 2009, 1:14:18 AM12/10/09
to sage-devel
Well, in my use of directed graphs I can swear I need to talk about
out-neighbors at least as often as I need to talk about out-
neighbors.... Storing them two time would not be a waste in opinion,
and anyway we can not afford to have only the out-neighbors
available... I have not read the implementation of c_graphs yet ( and
I intend to ) but this should not be too hard to go back if we are
not satisfied with the speed ( which I hope will not happen ) :-)

Nathann

Nathann Cohen

unread,
Dec 10, 2009, 1:23:24 AM12/10/09
to sage-devel
Just a random thought : wouldn't it be way more efficient to write the
definitions of a Graph ( and perhaps the basic functions too )
directly in C, then to wrap them through Cython ?

Nathann

Robert Miller

unread,
Dec 10, 2009, 1:23:51 AM12/10/09
to sage-...@googlegroups.com
OK, these are both tickets now:

http://trac.sagemath.org/sage_trac/ticket/7640
http://trac.sagemath.org/sage_trac/ticket/7651

These will need to be dealt with before we can switch. I'm optimistic,
given that this is still a pretty short list. I hope everyone reading
this will try out the patch here and see if their favorite functions
slow down:

http://trac.sagemath.org/sage_trac/ticket/7634

Robert Miller

unread,
Dec 10, 2009, 1:41:11 AM12/10/09
to sage-...@googlegroups.com
Re: Cython vs C, there isn't a speed difference between C and Cython,
so it is advantageous for everything to be written in Cython. There
you get Python syntax, object oriented programming, speed and the
Cython part of the Sage library all in one place.

The c_graph structures themselves are written to entirely use C
objects from start to finish, so they're very fast.

Some code needs to stay in Python. Think about it: if it's essentially
Python anyway, when you change the code would you rather wait one
second to compile, or five? Now multiply that one vs. five seconds by
the number of files you are thinking about Cythonizing. It could take
much longer if your file is long or the code is complicated. In some
cases, i.e. functions which themselves are called frequently from
other functions, it makes perfect sense to use Cython. But if your
function is mainly being called by the user, or takes any real amount
of time at all to execute (I mean, e.g. calling other functions), it
doesn't hurt to leave it in Python.

The graph backends (including c_graph backends) are Python objects. As
far as the graph backends are concerned, I can mention a few things
which are blocking progress toward pure Cython backends:
1. In order for this to matter, graph.py needs to be split up, into
probably more than just generic_graph, graph, and digraph. I'm
thinking probably a Python and Cython file for graph and for digraph.
So that would be five new files, three Python, two Cython. That way
the relevant parts of the Sage graph class could call Cython quickly,
and the rest could stay in a .py file.
2. It would really be much more efficient for the backends to make use
of genuine iterators. There are two ways to do this, the
procrastination way and the hard way. Sooner or later, Cython will
support things like yield statements, and if we just wait for this to
happen, it will be nearly trivial to switch them over. That's the easy
way. The hard way is to create a bunch of new classes, with __iter__
and __next__ methods, for each of the iterator functions in each of
the backends. This is sure to introduce new bugs and waste developer
time, too.
3. My thesis ;)

Given these issues, I should note that these are some obstructions to
the *optimal* speed. I still think the switchover at #7634 will be
majorly beneficial to speed.
Reply all
Reply to author
Forward
0 new messages