112 views

Skip to first unread message

Nov 27, 2009, 9:57:46 AM11/27/09

to Sage devel

Hello everybody !!!

Reviewing patch #7533, Rob Beezer noticed that the Graph function to

compute the distances between all the pairs of vertices in a Graph (

shortest_path_all_pairs(), written in Python ) is much slower than

computing independently the distances between all the pairs of

vertices in the Graph (which uses NetworkX), which is a bit

worrying...

I created ticket #7540 to eventually change this, to which Yann

replied by mentionning that NetworkX also contained a function to

compute all the distances in the graph. Here is one comparision

between these functions :

sage: g=graphs.RandomGNP(300,.1)

sage: time a=[g.distance(i,j) for (i,j) in Subsets(g,2)]

CPU times: user 2.67 s, sys: 0.01 s, total: 2.68 s

Wall time: 2.68 s

sage: time e=g.shortest_path_all_pairs()

CPU times: user 37.95 s, sys: 0.13 s, total: 38.08 s

Wall time: 38.23 s

sage: time e=networkx.all_pairs_shortest_path(g)

CPU times: user 1.93 s, sys: 0.02 s, total: 1.94 s

Wall time: 1.96 s

As you can see, the differences are crystal clear...

It then seems we should replace the Floyd-Marshall algorithm in Sage's

function shortest_path_all_pairs() by a direct call to NetworkX, and

call it a deal. It also seems to be for me the moment to ask a

question about the implementation of graphs in Sage :

If I make no mistake, Robert Miller rewrote the Graph class in C,

which sounds like we are trying to remove networkX from Sage and use

our own version of graphs instead. If this is the case, we will have a

C class for graphs, meaning that some Graph functions will be written

in C instead of Python, thus gaining a lot of speed in Sage. At this

point, what should we do ? Rewrite the Python implementation of the

Floyd Marshall algorithm in C ( or whatever was written in NetworkX if

more efficient ), just call NetworkX, something different ?

I think Sage's graph class needs to be rewritten to be a bit more

efficient..... Efficiency is a problem I have sometimes in Sage, on

instances for which this should not be a problem.... So I expect a lot

from the c_graphs that are currently somewhere in Sage ( are they used

? How, when ? )

In the end : What is going on with the C graphs in Sage, can we expect

them to eb available soon ? What are we trying to do with NetworkX (

leave it, keep it, nothing special ? )

Thank you for your answers !

Nathann Cohen

Reviewing patch #7533, Rob Beezer noticed that the Graph function to

compute the distances between all the pairs of vertices in a Graph (

shortest_path_all_pairs(), written in Python ) is much slower than

computing independently the distances between all the pairs of

vertices in the Graph (which uses NetworkX), which is a bit

worrying...

I created ticket #7540 to eventually change this, to which Yann

replied by mentionning that NetworkX also contained a function to

compute all the distances in the graph. Here is one comparision

between these functions :

sage: g=graphs.RandomGNP(300,.1)

sage: time a=[g.distance(i,j) for (i,j) in Subsets(g,2)]

CPU times: user 2.67 s, sys: 0.01 s, total: 2.68 s

Wall time: 2.68 s

sage: time e=g.shortest_path_all_pairs()

CPU times: user 37.95 s, sys: 0.13 s, total: 38.08 s

Wall time: 38.23 s

sage: time e=networkx.all_pairs_shortest_path(g)

CPU times: user 1.93 s, sys: 0.02 s, total: 1.94 s

Wall time: 1.96 s

As you can see, the differences are crystal clear...

It then seems we should replace the Floyd-Marshall algorithm in Sage's

function shortest_path_all_pairs() by a direct call to NetworkX, and

call it a deal. It also seems to be for me the moment to ask a

question about the implementation of graphs in Sage :

If I make no mistake, Robert Miller rewrote the Graph class in C,

which sounds like we are trying to remove networkX from Sage and use

our own version of graphs instead. If this is the case, we will have a

C class for graphs, meaning that some Graph functions will be written

in C instead of Python, thus gaining a lot of speed in Sage. At this

point, what should we do ? Rewrite the Python implementation of the

Floyd Marshall algorithm in C ( or whatever was written in NetworkX if

more efficient ), just call NetworkX, something different ?

I think Sage's graph class needs to be rewritten to be a bit more

efficient..... Efficiency is a problem I have sometimes in Sage, on

instances for which this should not be a problem.... So I expect a lot

from the c_graphs that are currently somewhere in Sage ( are they used

? How, when ? )

In the end : What is going on with the C graphs in Sage, can we expect

them to eb available soon ? What are we trying to do with NetworkX (

leave it, keep it, nothing special ? )

Thank you for your answers !

Nathann Cohen

Nov 27, 2009, 10:34:00 AM11/27/09

to sage-devel

No answer to the main question, but just completing the timings:

sage: G=graphs.RandomGNP(300,.1)

sage: time d=[G.distance(i,j) for (i,j) in Subsets(G.vertices(),2)]

CPU times: user 3.67 s, sys: 0.00 s, total: 3.67 s

Wall time: 3.72 s

sage: time d=networkx.all_pairs_shortest_path(G)

CPU times: user 2.59 s, sys: 0.00 s, total: 2.59 s

Wall time: 2.61 s

sage: time e=networkx.all_pairs_shortest_path_length(G.networkx_graph

(copy=False))

CPU times: user 1.34 s, sys: 0.00 s, total: 1.34 s

Wall time: 1.36 s

sage: G=graphs.RandomGNP(300,.1)

sage: time d=[G.distance(i,j) for (i,j) in Subsets(G.vertices(),2)]

CPU times: user 3.67 s, sys: 0.00 s, total: 3.67 s

Wall time: 3.72 s

sage: time d=networkx.all_pairs_shortest_path(G)

CPU times: user 2.59 s, sys: 0.00 s, total: 2.59 s

Wall time: 2.61 s

sage: time e=networkx.all_pairs_shortest_path_length(G.networkx_graph

(copy=False))

CPU times: user 1.34 s, sys: 0.00 s, total: 1.34 s

Wall time: 1.36 s

Nov 27, 2009, 11:06:27 AM11/27/09

to sage-devel

You can obtain "some" c_grah in Sage with

cG = G.copy(implementation="c_graph")

I don't know how far the development is though.

cG = G.copy(implementation="c_graph")

I don't know how far the development is though.

Nov 29, 2009, 8:22:35 AM11/29/09

to sage-...@googlegroups.com, Robert Miller

Hi Nathann,

On Sat, Nov 28, 2009 at 1:57 AM, Nathann Cohen <nathan...@gmail.com> wrote:

<SNIP>

> If I make no mistake, Robert Miller rewrote the Graph class in C,

> which sounds like we are trying to remove networkX from Sage and use

> our own version of graphs instead. If this is the case, we will have a

> C class for graphs, meaning that some Graph functions will be written

> in C instead of Python, thus gaining a lot of speed in Sage. At this

> point, what should we do ? Rewrite the Python implementation of the

> Floyd Marshall algorithm in C ( or whatever was written in NetworkX if

> more efficient ), just call NetworkX, something different ?

>

> I think Sage's graph class needs to be rewritten to be a bit more

> efficient..... Efficiency is a problem I have sometimes in Sage, on

> instances for which this should not be a problem.... So I expect a lot

> from the c_graphs that are currently somewhere in Sage ( are they used

> ? How, when ? )

>

> In the end : What is going on with the C graphs in Sage, can we expect

> them to eb available soon ? What are we trying to do with NetworkX (

> leave it, keep it, nothing special ? )

I share some of your concerns as expressed above. In the last few

days, I started writing an introductory chapter for a book on

algorithmic graph theory using Sage. In doing so, I found that Sage

lacks some basic features. For example, I have been unable to find a

function/method to compute the degree sequence of a graph.

What is required is a statement from Robert Miller, or people involved

in the development of C graph, about the current state of C graph. The

statement should include the state where C graph is at in terms of

development, other things that need to be implemented, as well as how

to navigate around the graph theory module.

I was reading through graph.py only to discover that it's very, very

long. Can that module be split up and organized along topical lines?

I'm interested in helping out with developing Sage's graph theory

module. That's a reason why I started writing the small book I

mentioned above. In writing that book, I want to develop an

understanding of the theory and algorithmic aspect of graph theory.

And I plan to use that understanding to contribute to

developing/maintaining the graph theory module.

--

Regards

Minh Van Nguyen

On Sat, Nov 28, 2009 at 1:57 AM, Nathann Cohen <nathan...@gmail.com> wrote:

<SNIP>

> If I make no mistake, Robert Miller rewrote the Graph class in C,

> which sounds like we are trying to remove networkX from Sage and use

> our own version of graphs instead. If this is the case, we will have a

> C class for graphs, meaning that some Graph functions will be written

> in C instead of Python, thus gaining a lot of speed in Sage. At this

> point, what should we do ? Rewrite the Python implementation of the

> Floyd Marshall algorithm in C ( or whatever was written in NetworkX if

> more efficient ), just call NetworkX, something different ?

>

> I think Sage's graph class needs to be rewritten to be a bit more

> efficient..... Efficiency is a problem I have sometimes in Sage, on

> instances for which this should not be a problem.... So I expect a lot

> from the c_graphs that are currently somewhere in Sage ( are they used

> ? How, when ? )

>

> In the end : What is going on with the C graphs in Sage, can we expect

> them to eb available soon ? What are we trying to do with NetworkX (

> leave it, keep it, nothing special ? )

days, I started writing an introductory chapter for a book on

algorithmic graph theory using Sage. In doing so, I found that Sage

lacks some basic features. For example, I have been unable to find a

function/method to compute the degree sequence of a graph.

What is required is a statement from Robert Miller, or people involved

in the development of C graph, about the current state of C graph. The

statement should include the state where C graph is at in terms of

development, other things that need to be implemented, as well as how

to navigate around the graph theory module.

I was reading through graph.py only to discover that it's very, very

long. Can that module be split up and organized along topical lines?

I'm interested in helping out with developing Sage's graph theory

module. That's a reason why I started writing the small book I

mentioned above. In writing that book, I want to develop an

understanding of the theory and algorithmic aspect of graph theory.

And I plan to use that understanding to contribute to

developing/maintaining the graph theory module.

--

Regards

Minh Van Nguyen

Nov 29, 2009, 11:27:31 AM11/29/09

to sage-devel

HMmmm...

I started creating new modules, and I wanted to split it piece by

piece, with time. Ticket #7365 creates a module named

graph_decomposition which I intend to fill ( but I will begin to write

these functions when this patch will be merged and the file

created ).

Is it possible in Python to create a module containing a function X,

then to write in the body of the graph class "from my_module import X"

so that the X function can be used as any method in the graph class ?

I am asking this question because I thought it was possible, but I was

not able to do it with several functions defined in graph_coloring,

perhaps because of a cyclic dependance. This would enable us to split

this file efficiently, and to avoid having to copy docstrings and

create useless wrappers as it is the case for the moment with

coloring functions.

I also created ticket #7369 but it was refused as a blocker and I do

not really know how to produce such a patch without wiping out all the

other modifications. If you know how to write it, plllleaaaaaase tell

me :-)

If I can help you with your book, I'd gladly do it !!! I plan great

things for Graphs in Sage, but these days I am really stuck with

reviews... Just look at the state of this section on the Trac

Server !

Even though, most of this is not really "urgent" for Sage ( short of

the two LP tickets.. I madly need them to be in the standard Sage by

the end of december ) and can easily wait. The most important thing

for the graph library, methinks, is to be rewritten efficiently in C

so that I will be able to rewrite the most useful functions from

Python to Cython and speed up the rest of them...

For the degree sequence, I think g.degree() should do :-)

Nathann

I started creating new modules, and I wanted to split it piece by

piece, with time. Ticket #7365 creates a module named

graph_decomposition which I intend to fill ( but I will begin to write

these functions when this patch will be merged and the file

created ).

Is it possible in Python to create a module containing a function X,

then to write in the body of the graph class "from my_module import X"

so that the X function can be used as any method in the graph class ?

I am asking this question because I thought it was possible, but I was

not able to do it with several functions defined in graph_coloring,

perhaps because of a cyclic dependance. This would enable us to split

this file efficiently, and to avoid having to copy docstrings and

create useless wrappers as it is the case for the moment with

coloring functions.

I also created ticket #7369 but it was refused as a blocker and I do

not really know how to produce such a patch without wiping out all the

other modifications. If you know how to write it, plllleaaaaaase tell

me :-)

If I can help you with your book, I'd gladly do it !!! I plan great

things for Graphs in Sage, but these days I am really stuck with

reviews... Just look at the state of this section on the Trac

Server !

Even though, most of this is not really "urgent" for Sage ( short of

the two LP tickets.. I madly need them to be in the standard Sage by

the end of december ) and can easily wait. The most important thing

for the graph library, methinks, is to be rewritten efficiently in C

so that I will be able to rewrite the most useful functions from

Python to Cython and speed up the rest of them...

For the degree sequence, I think g.degree() should do :-)

Nathann

Nov 29, 2009, 6:42:30 PM11/29/09

to sage-devel

> For example, I have been unable to find a

> function/method to compute the degree sequence of a graph.

>

two years ago?

- kcrisman

Nov 29, 2009, 9:17:20 PM11/29/09

to sage-...@googlegroups.com

Hi kcrisman,

On Mon, Nov 30, 2009 at 10:42 AM, kcrisman <kcri...@gmail.com> wrote:

<SNIP>

> That is odd; I am pretty sure there used to be such a method, maybe

> two years ago?

It turns out there is such a function. One could use

Graph.degree_iterator() as suggested by mhansen, or Graph.degree() as

suggested by ncohen. But the result must be sorted in non-increasing

order:

[mvngu@sage ~]$ sage

----------------------------------------------------------------------

| Sage Version 4.2.1, Release Date: 2009-11-14 |

| Type notebook() for the GUI, and license() for information. |

----------------------------------------------------------------------

sage: g = Graph({"a": ["b", "e"], "b": ["a", "g", "e", "c"], \

....: "c": ["b", "e", "d"], "d": ["c", "f"], "e": ["f", "a", "b", "c"], \

....: "f": ["g", "d", "e"], "g": ["b", "f"]})

sage: sorted(g.degree(), reverse=True)

[4, 4, 3, 3, 2, 2, 2]

sage: sorted(g.degree_iterator(), reverse=True)

[4, 4, 3, 3, 2, 2, 2]

On Mon, Nov 30, 2009 at 10:42 AM, kcrisman <kcri...@gmail.com> wrote:

<SNIP>

> That is odd; I am pretty sure there used to be such a method, maybe

> two years ago?

Graph.degree_iterator() as suggested by mhansen, or Graph.degree() as

suggested by ncohen. But the result must be sorted in non-increasing

order:

[mvngu@sage ~]$ sage

----------------------------------------------------------------------

| Sage Version 4.2.1, Release Date: 2009-11-14 |

| Type notebook() for the GUI, and license() for information. |

----------------------------------------------------------------------

sage: g = Graph({"a": ["b", "e"], "b": ["a", "g", "e", "c"], \

....: "c": ["b", "e", "d"], "d": ["c", "f"], "e": ["f", "a", "b", "c"], \

....: "f": ["g", "d", "e"], "g": ["b", "f"]})

sage: sorted(g.degree(), reverse=True)

[4, 4, 3, 3, 2, 2, 2]

sage: sorted(g.degree_iterator(), reverse=True)

[4, 4, 3, 3, 2, 2, 2]

Nov 30, 2009, 1:09:17 PM11/30/09

to sage-devel

> sage: sorted(g.degree(), reverse=True)

> [4, 4, 3, 3, 2, 2, 2]

Probably in the situations I used, I just used g.degree() all by its

lonesome self, unsorted, because the cases I needed could be done "by

hand".

- kcrisman

Nov 30, 2009, 4:50:27 PM11/30/09

to sage-...@googlegroups.com

> That is odd; I am pretty sure there used to be such a method, maybe

> two years ago?

Ticket #7564 [1] improves the documentation of the method
> two years ago?

GenericGraph.degree() by adding two examples showing how one could use

that method to obtain the degree sequence of a graph. The ticket needs

some reviewing love :-)

[1] http://trac.sagemath.org/sage_trac/ticket/7564

Dec 1, 2009, 12:01:17 AM12/1/09

to sage-...@googlegroups.com

On Nov 29, 2009, at 5:22 AM, Minh Nguyen wrote:

> Hi Nathann,

>

> On Sat, Nov 28, 2009 at 1:57 AM, Nathann Cohen <nathan...@gmail.com

> > wrote:

>

> <SNIP>

>

>> If I make no mistake, Robert Miller rewrote the Graph class in C,

>> which sounds like we are trying to remove networkX from Sage and use

>> our own version of graphs instead. If this is the case, we will

>> have a

>> C class for graphs, meaning that some Graph functions will be written

>> in C instead of Python, thus gaining a lot of speed in Sage. At this

>> point, what should we do ? Rewrite the Python implementation of the

>> Floyd Marshall algorithm in C ( or whatever was written in NetworkX

>> if

>> more efficient ), just call NetworkX, something different ?

I think the basic idea was that one could write a faster (sparse and
> Hi Nathann,

>

> On Sat, Nov 28, 2009 at 1:57 AM, Nathann Cohen <nathan...@gmail.com

> > wrote:

>

> <SNIP>

>

>> If I make no mistake, Robert Miller rewrote the Graph class in C,

>> which sounds like we are trying to remove networkX from Sage and use

>> our own version of graphs instead. If this is the case, we will

>> have a

>> C class for graphs, meaning that some Graph functions will be written

>> in C instead of Python, thus gaining a lot of speed in Sage. At this

>> point, what should we do ? Rewrite the Python implementation of the

>> Floyd Marshall algorithm in C ( or whatever was written in NetworkX

>> if

>> more efficient ), just call NetworkX, something different ?

dense) graph "core," and then run all the NetworkX algorithms on top

of it as long as it supported the interface (for manipulating and

querying vertices and edges). If some code was still too slow then it

would be moved down to C (hopefully it would be sufficient to declare

the graphs as c-graphs and compile with Cython to remove all the

Python function call overhead). I don't know how well this works at

the moment.

>> I think Sage's graph class needs to be rewritten to be a bit more

>> efficient..... Efficiency is a problem I have sometimes in Sage, on

>> instances for which this should not be a problem.... So I expect a

>> lot

>> from the c_graphs that are currently somewhere in Sage ( are they

>> used

>> ? How, when ? )

>> In the end : What is going on with the C graphs in Sage, can we

>> expect

>> them to eb available soon ?

>> What are we trying to do with NetworkX (leave it, keep it, nothing

>> special ? )

We're certainly keeping it for the time being, but there's a limit to

speed if one sticks with pure Python. I don't think there's any goal

to replace NetworkX altogether. It could be like a lot of other things

were we handle some things ourselves (where we needed absolute speed)

and call off to NetworkX for the rest.

> I share some of your concerns as expressed above. In the last few

> days, I started writing an introductory chapter for a book on

> algorithmic graph theory using Sage. In doing so, I found that Sage

> lacks some basic features. For example, I have been unable to find a

> function/method to compute the degree sequence of a graph.

a lack of documentation rather than functionality.

> What is required is a statement from Robert Miller, or people involved

> in the development of C graph, about the current state of C graph. The

> statement should include the state where C graph is at in terms of

> development, other things that need to be implemented, as well as how

> to navigate around the graph theory module.

studying p-descent and other BSD related topics for his thesis right

now, not graphs, so I'd imagine that there's not much going on

development wise with them unless you want to take up the banner.

> I was reading through graph.py only to discover that it's very, very

> long. Can that module be split up and organized along topical lines?

their own files.

- Robert

Dec 1, 2009, 12:24:18 AM12/1/09

to sage-...@googlegroups.com

On Nov 29, 2009, at 8:27 AM, Nathann Cohen wrote:

> HMmmm...

>

> I started creating new modules, and I wanted to split it piece by

> piece, with time. Ticket #7365 creates a module named

> graph_decomposition which I intend to fill ( but I will begin to write

> these functions when this patch will be merged and the file

> created ).

>

> Is it possible in Python to create a module containing a function X,

> then to write in the body of the graph class "from my_module import X"

> so that the X function can be used as any method in the graph class ?

> I am asking this question because I thought it was possible, but I was

> not able to do it with several functions defined in graph_coloring,

> perhaps because of a cyclic dependance. This would enable us to split

> this file efficiently, and to avoid having to copy docstrings and

> create useless wrappers as it is the case for the moment with

> coloring functions.

I think something like this is possible, but it does make it harder to
> HMmmm...

>

> I started creating new modules, and I wanted to split it piece by

> piece, with time. Ticket #7365 creates a module named

> graph_decomposition which I intend to fill ( but I will begin to write

> these functions when this patch will be merged and the file

> created ).

>

> Is it possible in Python to create a module containing a function X,

> then to write in the body of the graph class "from my_module import X"

> so that the X function can be used as any method in the graph class ?

> I am asking this question because I thought it was possible, but I was

> not able to do it with several functions defined in graph_coloring,

> perhaps because of a cyclic dependance. This would enable us to split

> this file efficiently, and to avoid having to copy docstrings and

> create useless wrappers as it is the case for the moment with

> coloring functions.

figure out how to navigate the source. Big files are a pain, but so

are classes that get split among a dozen small files.

> I also created ticket #7369 but it was refused as a blocker and I do

> not really know how to produce such a patch without wiping out all the

> other modifications. If you know how to write it, plllleaaaaaase tell

> me :-)

the other changes (not just applied on top). The criteria for a

blocker is really high (like giving blatantly wrong answers or not

starting up).

> If I can help you with your book, I'd gladly do it !!! I plan great

> things for Graphs in Sage, but these days I am really stuck with

> reviews... Just look at the state of this section on the Trac

> Server !

trick is to try to get people to reciprocate after reviewing some of

theirs.

> Even though, most of this is not really "urgent" for Sage ( short of

> the two LP tickets.. I madly need them to be in the standard Sage by

> the end of december )

>

> and can easily wait. The most important thing

> for the graph library, methinks, is to be rewritten efficiently in C

> so that I will be able to rewrite the most useful functions from

> Python to Cython and speed up the rest of them...

>

> For the degree sequence, I think g.degree() should do :-)

>

> Nathann

>

> To post to this group, send an email to sage-...@googlegroups.com

> To unsubscribe from this group, send an email to sage-devel-...@googlegroups.com

> For more options, visit this group at http://groups.google.com/group/sage-devel

> URL: http://www.sagemath.org

Dec 1, 2009, 5:59:29 AM12/1/09

to sage-devel

On Mon, Nov 30, 2009 at 9:01 PM, Robert Bradshaw

<robe...@math.washington.edu> wrote:

> I think the basic idea was that one could write a faster (sparse and

> dense) graph "core," and then run all the NetworkX algorithms on top

> of it as long as it supported the interface (for manipulating and

> querying vertices and edges). If some code was still too slow then it

> would be moved down to C (hopefully it would be sufficient to declare

> the graphs as c-graphs and compile with Cython to remove all the

> Python function call overhead). I don't know how well this works at

> the moment.

>

Is there any particular reason why this code couldn't be contributed
<robe...@math.washington.edu> wrote:

> I think the basic idea was that one could write a faster (sparse and

> dense) graph "core," and then run all the NetworkX algorithms on top

> of it as long as it supported the interface (for manipulating and

> querying vertices and edges). If some code was still too slow then it

> would be moved down to C (hopefully it would be sufficient to declare

> the graphs as c-graphs and compile with Cython to remove all the

> Python function call overhead). I don't know how well this works at

> the moment.

>

upstream to Networkx? I use networkx outside of Sage, and it would be

great to have speed improvements made there as well; since Sage

already ships nx it would obviously benefit regardless.

There may be a good reason for not doing so, I'm just curious. From

my perspective, the more Sage work also benefits upstream projects,

the more impact it has and more people benefit from it, both those who

use Sage directly and those who may use those projects outside of

Sage. Obviously in some cases this isn't viable (dependencies on core

Sage code are an obvious one), hence my question.

Cheers,

f

Dec 1, 2009, 7:40:13 AM12/1/09

to sage-devel

but Networkx is now BSD licensed (it used to be GPL'd). Given that

graph theory in Sage is an area with a lot of possibly unfriendly

competition with Magma and Mathematica, the GPL license is appropriate

for us for that part of Sage.

A second potential problem is that all the code we're talking about is

written in Cython, whereas Networkx is a pure Python library (I

think?).

-- William

Dec 1, 2009, 7:45:03 AM12/1/09

to sage-devel

On Tue, Dec 1, 2009 at 5:59 AM, Fernando Perez <fpere...@gmail.com> wrote:

One potential problem is that all the Sage graph theory code is GPL'd,

but Networkx is now BSD licensed (it used to be LGPL'd). Given that
graph theory in Sage is an area with a lot of possibly unfriendly

competition with Magma and Mathematica, the GPL license is appropriate

for us for that part of Sage.

I also recall that the way Networkx switched from LGPL to BSD was odd.
competition with Magma and Mathematica, the GPL license is appropriate

for us for that part of Sage.

They sent an email to their list and said something like: "we're

switching from the LGPL to the BSD license; if you have a problem with

that please respond within 5 days, otherwise we're making the switch".

That wasn't the exact wording, but it was something like that. I

would personally be uncomfortable contributing code to a project that

switches licenses like that.

A second potential problem is that all the code we're talking about is

written in Cython, whereas Networkx is a pure Python library (I

think?).

ships a very old version, because Networkx's API changed a huge amount

and it was so far too difficult to change Sage to work with the new

API. I wish somebody would do so...

-- William

Dec 1, 2009, 7:48:34 AM12/1/09

to sage-devel

uncomfortable. They sent an email to their list and said something

like: "we're switching from the LGPL to the BSD license; if you have a

problem with that please respond within 10 days". That wasn't the
exact wording, but it was something like that.

http://www.mail-archive.com/sage-...@googlegroups.com/msg27341.html
The biggest problem we have wrt Networkx right now is on our end --

Sage still ships a very old version, because Networkx's API changed a

huge amount and it was so far too difficult to change Sage to work

with the new API. I hope somebody will fix this soon, since the
huge amount and it was so far too difficult to change Sage to work

longer we wait the harder it becomes.

William

Dec 1, 2009, 9:50:38 AM12/1/09

to sage-...@googlegroups.com

On Dec 1, 2009, at 2:59 AM, Fernando Perez wrote:

> On Mon, Nov 30, 2009 at 9:01 PM, Robert Bradshaw

> <robe...@math.washington.edu> wrote:

>> I think the basic idea was that one could write a faster (sparse and

>> dense) graph "core," and then run all the NetworkX algorithms on top

>> of it as long as it supported the interface (for manipulating and

>> querying vertices and edges). If some code was still too slow then it

>> would be moved down to C (hopefully it would be sufficient to declare

>> the graphs as c-graphs and compile with Cython to remove all the

>> Python function call overhead). I don't know how well this works at

>> the moment.

>>

>

> Is there any particular reason why this code couldn't be contributed

> upstream to Networkx? I use networkx outside of Sage, and it would be

> great to have speed improvements made there as well; since Sage

> already ships nx it would obviously benefit regardless.

I think, at least for cgraphs, it's an issue of NetworkX wanting to
> On Mon, Nov 30, 2009 at 9:01 PM, Robert Bradshaw

> <robe...@math.washington.edu> wrote:

>> I think the basic idea was that one could write a faster (sparse and

>> dense) graph "core," and then run all the NetworkX algorithms on top

>> of it as long as it supported the interface (for manipulating and

>> querying vertices and edges). If some code was still too slow then it

>> would be moved down to C (hopefully it would be sufficient to declare

>> the graphs as c-graphs and compile with Cython to remove all the

>> Python function call overhead). I don't know how well this works at

>> the moment.

>>

>

> Is there any particular reason why this code couldn't be contributed

> upstream to Networkx? I use networkx outside of Sage, and it would be

> great to have speed improvements made there as well; since Sage

> already ships nx it would obviously benefit regardless.

stay pure Python. They have few(er) dependancies, and we're happy to

contribute upstream.

- Robert

Dec 1, 2009, 9:54:19 AM12/1/09

to sage-...@googlegroups.com

helping upstream BSD projects as well).

- Robert

Dec 1, 2009, 11:56:51 AM12/1/09

to sage-devel

Hmmm... If we can make Sage's C graphs as fast as NetworkX's , I

assure you it will be very hard for them to compete with LP on our

side and so many features around in Sage... As most of NetworkX's

functions are not very hard to rewrite once the basis is set ( graph

structure, neighbors, etc ) I am more set to try to move away from

this library, especially if they changed to a Sage-uncompatible

license.

The hardest things to rewrite that I can see on their web page at the

moment is the stuff related to max cliques, which is way better in

Cliquer now merged natively in Sage. The isomorphism test has already

been rewritten using Nice ( and does not need them anymore -- or at

least I believe it is ). Last one is the matching problem, which has

already been rewritten in LP and is waiting for reviews.

From the point of view of algorithms, I do not think we really need

them anymore. What we can not do natively is Sage can be rewritten

without too much pain ( tell me what needs to be or create a Trac

ticket, I'll take a look at it :-) )

Nathann

assure you it will be very hard for them to compete with LP on our

side and so many features around in Sage... As most of NetworkX's

functions are not very hard to rewrite once the basis is set ( graph

structure, neighbors, etc ) I am more set to try to move away from

this library, especially if they changed to a Sage-uncompatible

license.

The hardest things to rewrite that I can see on their web page at the

moment is the stuff related to max cliques, which is way better in

Cliquer now merged natively in Sage. The isomorphism test has already

been rewritten using Nice ( and does not need them anymore -- or at

least I believe it is ). Last one is the matching problem, which has

already been rewritten in LP and is waiting for reviews.

From the point of view of algorithms, I do not think we really need

them anymore. What we can not do natively is Sage can be rewritten

without too much pain ( tell me what needs to be or create a Trac

ticket, I'll take a look at it :-) )

Nathann

Dec 1, 2009, 2:56:42 PM12/1/09

to sage-devel

On Tue, Dec 1, 2009 at 4:40 AM, William Stein <wst...@gmail.com> wrote:

> One potential problem is that all the Sage graph theory code is GPL'd,

> but Networkx is now BSD licensed (it used to be GPL'd). Given that

> graph theory in Sage is an area with a lot of possibly unfriendly

> competition with Magma and Mathematica, the GPL license is appropriate

> for us for that part of Sage.

Aha, that makes perfect sense for Sage then. Thanks for the clarification.
> One potential problem is that all the Sage graph theory code is GPL'd,

> but Networkx is now BSD licensed (it used to be GPL'd). Given that

> graph theory in Sage is an area with a lot of possibly unfriendly

> competition with Magma and Mathematica, the GPL license is appropriate

> for us for that part of Sage.

> A second potential problem is that all the code we're talking about is

> written in Cython, whereas Networkx is a pure Python library (I

> think?).

very interested in Cythonizing parts of nx for speed reasons, so I

doubt this would be a problem; they might actually like to see

contributions from people with Cython expertise. But your point on

licensing stands.

Cheers,

f

Dec 1, 2009, 3:03:44 PM12/1/09

to sage-devel

On Tue, Dec 1, 2009 at 8:56 AM, Nathann Cohen <nathan...@gmail.com> wrote:

> Hmmm... If we can make Sage's C graphs as fast as NetworkX's , I

> assure you it will be very hard for them to compete with LP on our

> side and so many features around in Sage... As most of NetworkX's

> functions are not very hard to rewrite once the basis is set ( graph

> structure, neighbors, etc ) I am more set to try to move away from

> this library, especially if they changed to a Sage-uncompatible

> license.

An important clarification:
> Hmmm... If we can make Sage's C graphs as fast as NetworkX's , I

> assure you it will be very hard for them to compete with LP on our

> side and so many features around in Sage... As most of NetworkX's

> functions are not very hard to rewrite once the basis is set ( graph

> structure, neighbors, etc ) I am more set to try to move away from

> this library, especially if they changed to a Sage-uncompatible

> license.

The BSD license is *not* incompatible with Sage, as Sage ships a

massive amount of BSD (and similarly licensed) code already. What

William said was that for this particular topic, using the GPL was

better aligned with the goals and needs of Sage, not that the NetworkX

license was incompatible.

Cheers,

f

Dec 5, 2009, 6:08:50 PM12/5/09

to sage-devel

I have a patch making sage work with Netwotkx 1.0rc1 based on sage

4.3.alpha0, but it will probably conflict with many other graph

related patches...

If anyone wants to improve it, please do!

It's ticket #7608 now.

4.3.alpha0, but it will probably conflict with many other graph

related patches...

If anyone wants to improve it, please do!

It's ticket #7608 now.

Dec 5, 2009, 7:02:14 PM12/5/09

to sage-devel

Just to be clear, my patch is based on 4.3alpha0 and won't apply

cleanly to 4.3alpha1, that's why it's still marked as needs work.

cleanly to 4.3alpha1, that's why it's still marked as needs work.

Dec 5, 2009, 8:55:50 PM12/5/09

to sage-devel

patch updated. It should apply fine to sage 4.3alpha1 now.

Yann

Yann

Dec 6, 2009, 2:56:28 AM12/6/09

to sage-devel

Your patch exposes the function maximum_weight matching from

NertworkX !!!

There is a high chance that I may have written the patch for matching

using LP for naught, but an even higher chance that NetworkX's

function could be even more efficient (at least for small cases) !!!

Thank you very much :-)

I hope it can be merged into next release !!!

Nathann

NertworkX !!!

There is a high chance that I may have written the patch for matching

using LP for naught, but an even higher chance that NetworkX's

function could be even more efficient (at least for small cases) !!!

Thank you very much :-)

I hope it can be merged into next release !!!

Nathann

Dec 8, 2009, 8:20:33 PM12/8/09

to sage-...@googlegroups.com

On Tue, Dec 1, 2009 at 8:56 AM, Nathann Cohen <nathan...@gmail.com> wrote:

> Hmmm... If we can make Sage's C graphs as fast as NetworkX's , I

> assure you it will be very hard for them to compete with LP on our

> side and so many features around in Sage... As most of NetworkX's

> functions are not very hard to rewrite once the basis is set ( graph

> structure, neighbors, etc ) I am more set to try to move away from

> this library, especially if they changed to a Sage-uncompatible

> license.

Here is the short story with Sage's c_graphs: I have done the work
> Hmmm... If we can make Sage's C graphs as fast as NetworkX's , I

> assure you it will be very hard for them to compete with LP on our

> side and so many features around in Sage... As most of NetworkX's

> functions are not very hard to rewrite once the basis is set ( graph

> structure, neighbors, etc ) I am more set to try to move away from

> this library, especially if they changed to a Sage-uncompatible

> license.

necessary for Sage to use them for just about everything in sight. I

think there are a few functions which still depend on NetworkX, but

this patch ( http://sage.math.washington.edu/home/rlmill/switch_on_cgraphs.patch

) switches the default for all graphs to use c_graphs instead of

NetworkX -- it's old, and may need rebase. With this patch, almost

nothing still uses NetworkX, but some things slow down a little.

Although the underlying structure is faster, the translation in the

middle is slowing things down. What someone ambitious should do is

apply that patch, run their favorite function, and see whether it

slows down or speeds up, and if it slows down, vet out why it does in

the new backend code, to see what can be optimized. Everything is

ready to switch, the only reason we haven't have been because of

certain speed regressions (despite other functions getting faster ;)

-- we'll just have to see what happens.)

--

Robert L. Miller

http://www.rlmiller.org/

Dec 8, 2009, 8:27:15 PM12/8/09

to sage-...@googlegroups.com

>>> In the end : What is going on with the C graphs in Sage, can we

>>> expect

>>> them to eb available soon ?

>

> You can use them right now. They're just not as feature full.

Actually, that's no longer true. As of the closing of #6085, they are
>>> expect

>>> them to eb available soon ?

>

> You can use them right now. They're just not as feature full.

just as full of features as NX-based Sage graphs. The caveat is that a

few functions in the graph library actually call

self.networkx_graph().some_function(), so they still use NX.

Anyone interested in helping to get Sage using C graphs to completely

replace NetworkX will have my immediate attention, and is encouraged

to email me directly. It's actually very close to finished!

Dec 8, 2009, 9:09:38 PM12/8/09

to sage-...@googlegroups.com

Dec 9, 2009, 2:46:34 PM12/9/09

to sage-devel

Responding to a comment from trac:

> One problem IMHO with `c_graph` is that as is (correct me if I'm wrong)

> we won't be able to have a fast `in_neighbors`.

This is certainly true, if you're using a SparseGraph to represent a

DiGraph. In this case, I think there should actually be two

SparseGraphs in the backend, representing arcs and reversed arcs. That

way we never need to call in_neighbors, only out_neighbors of the

other SparseGraph. This would double the time of many operations, but

those operations are much much faster than NetworkX so that might not

be so bad.

Other suggestions are welcome!

> One problem IMHO with `c_graph` is that as is (correct me if I'm wrong)

> we won't be able to have a fast `in_neighbors`.

This is certainly true, if you're using a SparseGraph to represent a

DiGraph. In this case, I think there should actually be two

SparseGraphs in the backend, representing arcs and reversed arcs. That

way we never need to call in_neighbors, only out_neighbors of the

other SparseGraph. This would double the time of many operations, but

those operations are much much faster than NetworkX so that might not

be so bad.

Other suggestions are welcome!

Dec 10, 2009, 1:14:18 AM12/10/09

to sage-devel

Well, in my use of directed graphs I can swear I need to talk about

out-neighbors at least as often as I need to talk about out-

neighbors.... Storing them two time would not be a waste in opinion,

and anyway we can not afford to have only the out-neighbors

available... I have not read the implementation of c_graphs yet ( and

I intend to ) but this should not be too hard to go back if we are

not satisfied with the speed ( which I hope will not happen ) :-)

Nathann

out-neighbors at least as often as I need to talk about out-

neighbors.... Storing them two time would not be a waste in opinion,

and anyway we can not afford to have only the out-neighbors

available... I have not read the implementation of c_graphs yet ( and

I intend to ) but this should not be too hard to go back if we are

not satisfied with the speed ( which I hope will not happen ) :-)

Nathann

Dec 10, 2009, 1:23:24 AM12/10/09

to sage-devel

Just a random thought : wouldn't it be way more efficient to write the

definitions of a Graph ( and perhaps the basic functions too )

directly in C, then to wrap them through Cython ?

Nathann

definitions of a Graph ( and perhaps the basic functions too )

directly in C, then to wrap them through Cython ?

Nathann

Dec 10, 2009, 1:23:51 AM12/10/09

to sage-...@googlegroups.com

OK, these are both tickets now:

http://trac.sagemath.org/sage_trac/ticket/7640

http://trac.sagemath.org/sage_trac/ticket/7651

These will need to be dealt with before we can switch. I'm optimistic,

given that this is still a pretty short list. I hope everyone reading

this will try out the patch here and see if their favorite functions

slow down:

http://trac.sagemath.org/sage_trac/ticket/7634

http://trac.sagemath.org/sage_trac/ticket/7640

http://trac.sagemath.org/sage_trac/ticket/7651

These will need to be dealt with before we can switch. I'm optimistic,

given that this is still a pretty short list. I hope everyone reading

this will try out the patch here and see if their favorite functions

slow down:

http://trac.sagemath.org/sage_trac/ticket/7634

Dec 10, 2009, 1:41:11 AM12/10/09

to sage-...@googlegroups.com

so it is advantageous for everything to be written in Cython. There

you get Python syntax, object oriented programming, speed and the

Cython part of the Sage library all in one place.

The c_graph structures themselves are written to entirely use C

objects from start to finish, so they're very fast.

Some code needs to stay in Python. Think about it: if it's essentially

Python anyway, when you change the code would you rather wait one

second to compile, or five? Now multiply that one vs. five seconds by

the number of files you are thinking about Cythonizing. It could take

much longer if your file is long or the code is complicated. In some

cases, i.e. functions which themselves are called frequently from

other functions, it makes perfect sense to use Cython. But if your

function is mainly being called by the user, or takes any real amount

of time at all to execute (I mean, e.g. calling other functions), it

doesn't hurt to leave it in Python.

The graph backends (including c_graph backends) are Python objects. As

far as the graph backends are concerned, I can mention a few things

which are blocking progress toward pure Cython backends:

1. In order for this to matter, graph.py needs to be split up, into

probably more than just generic_graph, graph, and digraph. I'm

thinking probably a Python and Cython file for graph and for digraph.

So that would be five new files, three Python, two Cython. That way

the relevant parts of the Sage graph class could call Cython quickly,

and the rest could stay in a .py file.

2. It would really be much more efficient for the backends to make use

of genuine iterators. There are two ways to do this, the

procrastination way and the hard way. Sooner or later, Cython will

support things like yield statements, and if we just wait for this to

happen, it will be nearly trivial to switch them over. That's the easy

way. The hard way is to create a bunch of new classes, with __iter__

and __next__ methods, for each of the iterator functions in each of

the backends. This is sure to introduce new bugs and waste developer

time, too.

3. My thesis ;)

Given these issues, I should note that these are some obstructions to

the *optimal* speed. I still think the switchover at #7634 will be

majorly beneficial to speed.

Reply all

Reply to author

Forward

0 new messages