Re: C graph backend

Skip to first unread message

Robert Miller

Jul 23, 2009, 11:57:04 AM7/23/09

> Do you have a todo list of what needs to be done to get the C graph backend
> as the default backend?

Technically nothing -- if you switch all the default function
arguments from implementation='networkx' to implementation='c_graph',
all the doctests pass (I think you need sage-4.1.1.alpha0). This took
over a month of nonstop work-- I was in Europe too, so we're talking
frequent twelve hour days, no family...

> Also, I noticed that there is a huge difference between getting the in and
> out neighbors of a vertex in a digraph, and the incoming edges for sparse C
> graphs take 1.5 times as long as either the outgoing edges or networkx
> (either direction).

This is one of the reasons that they're not default now-- several
functions are still slower than NetworkX. First of all, in_neighbors
for SparseGraphs are their own issue (see #2):

1. The SparseGraph and DenseGraph datastructures themselves are
significantly faster than the corresponding backends:

sage: a=graphs.PathGraph(1000).to_directed().copy(implementation='c_graph')
sage: b=a._backend
sage: c=b._cg
sage: %timeit L = list(b.iterator_out_nbrs(30))
100000 loops, best of 3: 5.67 µs per loop
sage: %timeit L = list(c.out_neighbors(30))
1000000 loops, best of 3: 1.47 µs per loop

This is because the current state of the backends are *working,* not *fast.*

2. SparseGraphs are by their nature directed -- this is part of the
data structure. Probably the best way to have fast in_neighbors
functions is to store two SparseGraphs, but this would double the run
time of every other operation. You reviewed the original SparseGraph
patch, so you probably understand why this is the case-- all edges
coming out of a vertex are stored together, but edges coming into a
given vertex could be all over the place.

3. If you (or anyone else interested in getting faster graphs in Sage)
wanted to put in some work to get things ready to switch over, one
thing that would be very useful is to grep through the
SparseGraphBackend and DenseGraphBackend for "iter": this is likely
the worst bottleneck, and should be the first to get fixed. Since
Cython doesn't yet support the "yield" statement, I had to use the
"iter" command to quickly get working code. What would be much better
would be to implement Cython classes just for iterating through the
different things that happen in backends. Even SparseGraphs and
DenseGraphs themselves could use some "yield/iter - love" -- I bet you
could get some significant speedups in those classes too.

I've made this last one a trac ticket, and I see this as the next big step:

Robert L. Miller

Reply all
Reply to author
0 new messages