Getting size of predecessor and successor graphs for node in a DAG quickly

548 views
Skip to first unread message

Zane Selvans

unread,
Sep 15, 2009, 1:41:31 AM9/15/09
to networkx-discuss
I'm using NX to represent the stratigraphic relationships between a
bunch of tectonic features on Europa, whose relative times of
formation I'm trying to untangle. To do this, I've encoded their
crosscutting relationships as a DAG, using nx.MultiDiGraph. For a
given feature, I want to know how many in the overall set of features
is above it, and how many are below it, to place bounds on where in
the stratigraphic stack it could possibly be.

Right now, I'm doing it this way:

stratsort = {}
for lin,n in zip(sub_gsn.nodes(), range(len(sub_gsn))):
ub = len(sub_gsn) - len(nx.dfs_tree(sub_gsn, source=lin))
lb = len(nx.dfs_tree(sub_gsn, source=lin, reverse_graph=True))-1
stratsort[lin] = (lb, ub)

But it's painfully slow. What's the likely hangup? Is there a much
faster way to get the size of the predecessor and successor graphs for
a given node in a DAG? My graphs aren't all that big. Order 100s of
nodes, and 1000s of edges.

Thanks!

Aric Hagberg

unread,
Sep 15, 2009, 11:33:46 AM9/15/09
to networkx...@googlegroups.com

Maybe something like this? (If I understand correctly what you want).

import networkx as nx
G = nx.gnc_graph(100) # a DAG
R = G.reverse()
for n in G:
print n,\
len(nx.single_source_shortest_path_length(G,n))-1,\ # successors
len(nx.single_source_shortest_path_length(R,n))-1 # predecessors


Aric

Zane Selvans

unread,
Sep 15, 2009, 5:02:56 PM9/15/09
to networkx-discuss


On Sep 15, 8:33 am, Aric Hagberg <ahagb...@gmail.com> wrote:
> On Mon, Sep 14, 2009 at 11:41 PM, Zane Selvans <zane.selv...@gmail.com> wrote:
>
> > Right now, I'm doing it this way:
>
> > stratsort = {}
> > for lin,n in zip(sub_gsn.nodes(), range(len(sub_gsn))):
> >    ub = len(sub_gsn) - len(nx.dfs_tree(sub_gsn, source=lin))
> >    lb = len(nx.dfs_tree(sub_gsn, source=lin, reverse_graph=True))-1
> >    stratsort[lin] = (lb, ub)
>
> > But it's painfully slow.
>
> Maybe something like this? (If I understand correctly what you want).
>
> import networkx as nx
> G = nx.gnc_graph(100) # a DAG
> R = G.reverse()
> for n in G:
>     print n,\
>           len(nx.single_source_shortest_path_length(G,n))-1,\ # successors
>           len(nx.single_source_shortest_path_length(R,n))-1   # predecessors

Great! That sped things up by a factor of 3, and caching the __hash__
() values of my node objects upon instantiation sped it up by another
factor of 2. Unfortunately the remaining time sink (which scales as |
V|^2) is calculating the intersections between the geographic
features, and I don't think I can do a whole lot about that.

Zane Selvans

unread,
Sep 18, 2009, 8:01:52 PM9/18/09
to networkx-discuss
> > import networkx as nx
> > G = nx.gnc_graph(100) # a DAG
> > R = G.reverse()
> > for n in G:
> >     print n,\
> >           len(nx.single_source_shortest_path_length(G,n))-1,\ # successors
> >           len(nx.single_source_shortest_path_length(R,n))-1   # predecessors
>
> Great!  That sped things up by a factor of 3, and caching the __hash__
> () values of my node objects upon instantiation sped it up by another
> factor of 2.

Hmm. Now I realize I need to do this many more times than I
originally thought, for a Monte Carlo simulation. It seems like there
ought to be a recursive successors function, that just tells you what
nodes are reachable from a source, without going to the extra
computational trouble of finding the shortest paths.

Dan Schult

unread,
Sep 18, 2009, 9:59:03 PM9/18/09
to networkx...@googlegroups.com

On Sep 18, 2009, at 8:01 PM, Zane Selvans wrote:

Hmm.  Now I realize I need to do this many more times than I
originally thought, for a Monte Carlo simulation.  It seems like there
ought to be a recursive successors function, that just tells you what
nodes are reachable from a source, without going to the extra
computational trouble of finding the shortest paths.

Can you give an example?
By successors do you mean the successors along the shortest paths?
(have to compute paths)
Or successors meaning neighbors moving along outward edges?
(use nested successors functions)
e.g. to find the neighbors that are 2 steps from the current node try:
>>> set(nbrnbr for nbrnbr in G[nbr] for nbr in G[node] if nbrnbr is not node)

If you just want all nodes reachable and don't care what edges you go through,
start by computing the connected components and work on each separately.
Hope this helps....Dan
Reply all
Reply to author
Forward
0 new messages