Is there any interest in a (hypothetical) standard graph API (with 'graph' meaning a network, consisting of nodes and edges)? Yes, we have the standard ways of implementing graphs through (e.g.) dicts mapping nodes to neighbor-sets, but if one wants a graph that's implemented in some other way, this may not be the most convenient (or abstract) interface to emulate. It might be nice to have the kind of polymorphic freedom that one has with, e.g, with the DB-API. One could always develop factories or adaptors (such as for PyProtocols) to/from the dict-of-sets version...
So, any interest? Or am I just a lone nut in wanting this?
-- Magnus Lie Hetland "Canned Bread: The greatest thing since sliced http://hetland.org bread!" [from a can in Spongebob Squarepants]
> Is there any interest in a (hypothetical) standard graph API (with > 'graph' meaning a network, consisting of nodes and edges)?
1) Yes! 2) Only if it's in C. I don't mind (re)writing pure Python graph containers, it's the speed of a pure Python graph that's the bigger issue to me (mostly object inspection and/or decorate-insert-retrieve-undecorate cycles). If it started in a pure-Python package, then went into the stdlib, and then got a C implementation (like sets.py did), that would be fine. 3) It would have to accept arbitrary objects. No "make your class a subclass of GraphNode" garbage. If someone wanted to make a subclass of Graph called StringGraph, which was optimized for strings, or IntGraph optimized for ints, that would be fine. But the base class (I assume a class implementation?) should handle instances of Object.
I'm sure there are many more details, but those are the big ones IMO.
Robert Brewer MIS Amor Ministries fuman...@amor.org
Magnus Lie Hetland <mlh <at> furu.idi.ntnu.no> writes:
> Is there any interest in a (hypothetical) standard graph API (with > 'graph' meaning a network, consisting of nodes and edges)?
I don't need one right now, but I know I have a few times in the past. Certainly seems like a good idea to me. We've got sets as builtins now, no reason we shouldn't have a simple graph API, at least in the library.
Magnus Lie Hetland wrote: > Is there any interest in a (hypothetical) standard graph API (with > 'graph' meaning a network, consisting of nodes and edges)? Yes, we > have the standard ways of implementing graphs through (e.g.) dicts > mapping nodes to neighbor-sets, but if one wants a graph that's > implemented in some other way, this may not be the most convenient (or > abstract) interface to emulate. It might be nice to have the kind of > polymorphic freedom that one has with, e.g, with the DB-API. One could > always develop factories or adaptors (such as for PyProtocols) to/from > the dict-of-sets version...
> So, any interest? Or am I just a lone nut in wanting this?
Magnus, A know I'd appreciate it. It could be used to configure neural nets and logic networks; where this api would make it easy to build an abstraction then "compile" it into a faster representation for execution - or just run the tree/graph in "interpreted" mode. I don't think it would get a lot of use, but the use would be high end. wes
In article <slrncik8tm.4g....@furu.idi.ntnu.no>, m...@furu.idi.ntnu.no (Magnus Lie Hetland) wrote:
> Yes, we have the standard ways of implementing graphs through (e.g.) > dicts mapping nodes to neighbor-sets, but if one wants a graph that's > implemented in some other way, this may not be the most convenient > (or abstract) interface to emulate.
Actually, my interpretation of this standard way is as a fairly abstract interface, rather than a specific instantiation such as dict-of-sets: Most of the time, I merely require that iter(G) produces a sequence of the vertices of graph G, and iter(G[v]) produces a sequence of neighbors of vertex v. I also sometimes use "v in G" and "w in G[v]" to test existence of vertices or edges.
Pros and cons of this approach:
- You can use a list instead of a set in the adjacency list part of the representation. This may be faster and more space efficient when the vertex degrees are small.
- It's easy to create test graphs as code literals
- Any indexable object can be a vertex. The vertex identities can be something meaningful to your program. On the other hand, that means (unless you know where your graph came from) you can't rely on the vertices being special vertex objects with nice properties and you can't use objects like None as flag values unless you're sure they won't be vertices.
- It doesn't provide an abstract way of changing the graph (although that's relatively easy if G is e.g. a dict of sets)
- It doesn't directly represent multigraphs
- It doesn't directly represent undirected graphs (instead you have to replace an undirected edge by two directed edges and hope your callers don't give you a directed graph by mistake).
- There isn't an explicit object representing an edge, although you can create one by using a tuple (v,w) or (for undirected edges) a set. This can be an advantage in terms of memory usage but a disadvantage in terms of number of object creations. Also it means that if you want to store information on the edges you have to use a dict indexed by the edge instead of attributes on an edge object (probably better style anyway since it prevents different algorithms on the same graph from colliding with each other's attributes).
I don't have a "high-end" use for it, but I did write a program which graphs the revision history of a software repository. It would have been nice to have most of that code in a library, and if such a library existed, it would probably implement operations I was too lazy to implement, such as coloring.
On Mon, Aug 23, 2004 at 06:43:35PM +0000, wes weston wrote: > Magnus Lie Hetland wrote: > >Is there any interest in a (hypothetical) standard graph API (with > >'graph' meaning a network, consisting of nodes and edges)? Yes, we > >have the standard ways of implementing graphs through (e.g.) dicts > >mapping nodes to neighbor-sets, but if one wants a graph that's > >implemented in some other way, this may not be the most convenient (or > >abstract) interface to emulate. It might be nice to have the kind of > >polymorphic freedom that one has with, e.g, with the DB-API. One could > >always develop factories or adaptors (such as for PyProtocols) to/from > >the dict-of-sets version...
> >So, any interest? Or am I just a lone nut in wanting this?
> Magnus, > A know I'd appreciate it. It could be used to configure > neural nets and logic networks; where this api would make > it easy to build an abstraction then "compile" it into a > faster representation for execution - or just run the > tree/graph in "interpreted" mode. > I don't think it would get a lot of use, but the use > would be high end. > wes
"Robert Brewer" <fuman...@amor.org> writes: >> Is there any interest in a (hypothetical) standard graph API (with >> 'graph' meaning a network, consisting of nodes and edges)?
> 1) Yes! > 2) Only if it's in C. I don't mind (re)writing pure Python graph > containers, it's the speed of a pure Python graph that's the bigger > issue to me (mostly object inspection and/or > decorate-insert-retrieve-undecorate cycles). If it started in a > pure-Python package, then went into the stdlib, and then got a C > implementation (like sets.py did), that would be fine. > 3) It would have to accept arbitrary objects. No "make your class a > subclass of GraphNode" garbage. If someone wanted to make a subclass of > Graph called StringGraph, which was optimized for strings, or IntGraph > optimized for ints, that would be fine. But the base class (I assume a > class implementation?) should handle instances of Object.
> I'm sure there are many more details, but those are the big ones IMO.
1) Yes, from me as well. I've not used graphs in Python much, but mainly because I found that I got sidetracked from my original project into implementing graph algorithms, and never completed the original job :-)
2) I don't see a need for C at the start. prototype in Python, and migrate to C if the speed is needed. Like the set datatype did. But aim for fast Python, so that a C implementation can be avoided if it's unnecessary.
3) Definitely arbitrary objects. I'm not sure about a Graph class, even - given that the dictionary of adjacency lists implementation is so easy to build, it may be worth writing algorithms that just depend on a graph "protocol". Something like: for n in g iterates through all nodes in g, for m in g[n] iterates through all nodes adjacent to n, and for cases where edge weights are needed, g[n][m] is the weight of edge n->m. You can get a long way on just those primitives. If we had adaptation in Python (PEP 246) I'd suggest an IGraph protocol, plus adapters for common implementation methods.
In my personal graph library, I found that one of the nastiest issues was writing suitably general DFS/BFS algorithms which had "hooks" at relevant points (node visited in preorder, inorder, and postorder, for example) without swamping the runtime of the algorithm with calls to possibly dummy hook functions. I'm not sure that a C implementation would help here, but good design and fine tuning certainly would.
(If anyone wants to see my current code, I'd be happy to share it).
Paul. -- In theory, there is no difference between theory and practice; In practice, there is. -- Chuck Reid
In article <mailman.2240.1093287844.5135.python-l...@python.org>, Phil Frost <ind...@bitglue.com> wrote:
> +1 for standard graph API!
> I don't have a "high-end" use for it, but I did write a program which > graphs the revision history of a software repository. It would have been > nice to have most of that code in a library, and if such a library > existed, it would probably implement operations I was too lazy to > implement, such as coloring.
I use the existing Guido-standard graph representation, that is: iter(G) and iter(G[v]) list vertices of G and neighbors of v in G v in G and w in G[v] test existence of vertices and edges in G
It includes both simple basic graph algorithm stuff (copying a graph, a DFS implementation that works non-recursively so it doesn't run into Python's recursion limit) and some much more advanced algorithms (e.g. non-bipartite maximum matching).
> Is there any interest in a (hypothetical) standard graph API (with > 'graph' meaning a network, consisting of nodes and edges)? Yes, we > have the standard ways of implementing graphs through (e.g.) dicts > mapping nodes to neighbor-sets, but if one wants a graph that's > implemented in some other way, this may not be the most convenient (or > abstract) interface to emulate. It might be nice to have the kind of > polymorphic freedom that one has with, e.g, with the DB-API. One could > always develop factories or adaptors (such as for PyProtocols) to/from > the dict-of-sets version...
> So, any interest? Or am I just a lone nut in wanting this?
> -- > Magnus Lie Hetland "Canned Bread: The greatest thing since sliced > http://hetland.org bread!" [from a can in Spongebob Squarepants]
Not sure if this falls under the category of an API, but it may be relevant to what you are doing.
Magnus Lie Hetland wrote: > So, any interest? Or am I just a lone nut in wanting this?
I have often needed to use simple graph concepts and wrote a bunch of code, then at some point I have started to unify it and (slowly) put together a consistent model. When my research brings me back to graphs I'll try to finish it.
I do have a lot functionality working, you can associate arbitrary data with the nodes and edges, bfs, dfs, topological sort,graph generation, David Epstein's python dijkstra algorithm, graphviz visualization.
On Mon, 23 Aug 2004 11:58:15 -0700, David Eppstein wrote: > - It doesn't directly represent multigraphs
> - It doesn't directly represent undirected graphs (instead you have to > replace an undirected edge by two directed edges and hope your callers > don't give you a directed graph by mistake).
> - There isn't an explicit object representing an edge, although you can > create one by using a tuple (v,w) or (for undirected edges) a set.
I think these three things speak to why there isn't a graph type and probably won't be one any time soon; unlike "Sets", there are just too many types of "graphs" in use, all fundamentally different in implementation, and with all differences having massive performance implications. As you indirectly point out, each of the following is an independent dimension:
* Directed, undirected * Multi or non-multi * Explicit edges/explicit nodes with links/node and edge objects * Simple and fast implementation of nodes/nodes and edges that take attributes
That's a good 24 possible types of graph library, each with implications w.r.t. algorithms and performance.
While the abstract idea of a standard graph library is appealing to some people, any actual concrete implementation will likely leave the majority of people who want to use it out in the cold, resulting either in something only useful in the simplest of cases, or suffering from major feeping creaturitis as it tries to cover too many bases at once.
Magnus Lie Hetland wrote: > Is there any interest in a (hypothetical) standard graph API (with > 'graph' meaning a network, consisting of nodes and edges)? Yes, we > have the standard ways of implementing graphs through (e.g.) dicts > mapping nodes to neighbor-sets, but if one wants a graph that's > implemented in some other way, this may not be the most convenient (or > abstract) interface to emulate. It might be nice to have the kind of > polymorphic freedom that one has with, e.g, with the DB-API. One could > always develop factories or adaptors (such as for PyProtocols) to/from > the dict-of-sets version...
> So, any interest? Or am I just a lone nut in wanting this?
Magnus, Do you have any design thoughts. It would be good to have weighted, directed graphs and depth first traversal. wes
>> Yes, we have the standard ways of implementing graphs through (e.g.) >> dicts mapping nodes to neighbor-sets, but if one wants a graph that's >> implemented in some other way, this may not be the most convenient >> (or abstract) interface to emulate.
>Actually, my interpretation of this standard way is as a fairly abstract >interface, rather than a specific instantiation such as dict-of-sets: >Most of the time, I merely require that iter(G) produces a sequence of >the vertices of graph G, and iter(G[v]) produces a sequence of neighbors >of vertex v. I also sometimes use "v in G" and "w in G[v]" to test >existence of vertices or edges.
Yes, I agree, to some extent. I guess the problems start when you want to manipulate the graph. I think it would be nice to be able to use an empty graph object to build a given graph without knowing the implementation. I guess you could do that in this implementation too (if all the neighbor sets were initialized).
But if this does turn out to be an acceptable API, I'm all for it. I just think it would be nice to have a Recommended Standard(tm), to create interoperability.
[snip]
>- It doesn't provide an abstract way of changing the graph (although >that's relatively easy if G is e.g. a dict of sets)
Right.
>- It doesn't directly represent multigraphs
Unless you insist on having neighbor-sets, it does, doesn't it? Neighbor-lists can be used for this...?
[snip]
-- Magnus Lie Hetland "Canned Bread: The greatest thing since sliced http://hetland.org bread!" [from a can in Spongebob Squarepants]
In article <pan.2004.08.23.16.22.34.771...@jerf.org>, Jeremy Bowers wrote: [snip]
As I tried to state in the original post (I probably wasn't clear enough) I'm not talking about a standard *implementation*, just a standard *API*, like the DB-API. This could easily cover all kinds of strange beasts such as directed or undirected, weighted or unweighted (etc.) graphs; multigraphs, chain graphs, hypergraphs, who knows.
I'm basically just suggesting that it might be useful to have a "standard" interface for these things. It may be that the simple de facto standard that David cites is sufficient (although it certainly doesn't cover hypergraphs -- but that's possibly going a bit too far anyway.)
-- Magnus Lie Hetland "Canned Bread: The greatest thing since sliced http://hetland.org bread!" [from a can in Spongebob Squarepants]
In article <YssWc.503921$Gx4.282...@bgtnsc04-news.ops.worldnet.att.net>, wes
weston wrote:
[snip]
>Magnus, > Do you have any design thoughts. It would be good to have weighted, >directed graphs and depth first traversal.
I've thought of several alternatives; basically, I just thought about defining the "standard" API for the basic abstract data type (including weights, direction, labels, colours etc.). Concrete implementations and algorithms would be a separate issue.
>wes
-- Magnus Lie Hetland "Canned Bread: The greatest thing since sliced http://hetland.org bread!" [from a can in Spongebob Squarepants]
In article <mailman.2228.1093282813.5135.python-l...@python.org>,
Robert Brewer wrote: >> Is there any interest in a (hypothetical) standard graph API (with >> 'graph' meaning a network, consisting of nodes and edges)?
>1) Yes!
:)
>2) Only if it's in C.
That would be up to the implementer. I'm only talking about defining an API -- not the implementation. (Cf. the DB-API.)
[snip]
>3) It would have to accept arbitrary objects. No "make your class a >subclass of GraphNode" garbage.
Of course. Signature-based polymorphism and protocols all the way :)
-- Magnus Lie Hetland "Canned Bread: The greatest thing since sliced http://hetland.org bread!" [from a can in Spongebob Squarepants]
In article <slrncikldl.e8q....@furu.idi.ntnu.no>, m...@furu.idi.ntnu.no (Magnus Lie Hetland) wrote:
> >- It doesn't directly represent multigraphs
> Unless you insist on having neighbor-sets, it does, doesn't it? > Neighbor-lists can be used for this...?
If you're doing anything serious with a multigraph you need to have some way of distinguishing different edges between the same pair of vertices. For instance, an edge object for each edge, that you can use as an index to store information about that edge. A neighbor list that has multiple copies of the same neighbor won't let you do that, you can iterate through the edges but not distinguish one from another.
Another possibility, which fits into the same general abstract API but is more specialized, would be to represent a multigraph by a dict of dicts, where the outer dict maps each vertex to its neighbors and the inner dict maps each neighbor to the number of edges; then you could represent each edge by a tuple (v,w,index) with index in range(G[v][w]).
In article <eklxfytr....@yahoo.co.uk>, Paul Moore wrote:
[snip]
> If we had adaptation in Python (PEP 246) I'd suggest an > IGraph protocol, plus adapters for common implementation methods.
This is exactly what I was thinking about -- one could use PyProtocols, for example.
The (as of now quite hypothetical) standard would, however, not have to include such an interface; one could easily implement that (and adapters) based on the standard description.
I'm thinking along the lines of an informational PEP.
>In my personal graph library, I found that one of the nastiest issues >was writing suitably general DFS/BFS algorithms which had "hooks" at >relevant points
Yes... I've been thinking about this -- it might be useful to allow some form of traversal (where you could supply your own queue object, for example, giving you bfs, dfs, dijkstra, prim, whatever) and have the traversal take the form of an iterator (similar to the Boost graph library). But if access to each node and its neighbors is available in the interface, such traversals wouldn't really have to be part of the standard API...
-- Magnus Lie Hetland "Canned Bread: The greatest thing since sliced http://hetland.org bread!" [from a can in Spongebob Squarepants]
In article <slrnciklv0.e8q....@furu.idi.ntnu.no>, m...@furu.idi.ntnu.no (Magnus Lie Hetland) wrote:
> > Do you have any design thoughts. It would be good to have weighted, > >directed graphs and depth first traversal.
> I've thought of several alternatives; basically, I just thought about > defining the "standard" API for the basic abstract data type > (including weights, direction, labels, colours etc.). Concrete > implementations and algorithms would be a separate issue.
I would strongly prefer not to have weights or similar attributes as part of a graph API. I would rather have the weights be a separate dict or function or whatever passed to the graph algorithm. The main reason is that I might want the same algorithm to be applied to the same graph with a different set of weights.
A secondary reason is that we already have in Python a good general mechanism (dicts) for associating arbitrary information with objects, I don't see a need for reinventing a more specific mechanism for doing so when the objects are pieces of graphs and the information is some list of weight, label, etc that some graph API designer thinks is sufficient.
I think this may contradict some things I said a year or two ago about using a dict-of-dicts representation in which G[v][w] provides the weight; I've changed my mind.
On Mon, 23 Aug 2004 20:41:53 +0000, Magnus Lie Hetland wrote: > enough) I'm not talking about a standard *implementation*, just a > standard *API*, like the DB-API. This could easily cover all kinds of > strange beasts such as directed or undirected, weighted or unweighted > (etc.) graphs; multigraphs, chain graphs, hypergraphs, who knows.
I believe the equivalent thing in the C++ world is the Boost Graph Library described here:
I tried using it once, and it was so horrendously complicated that I gave up. Some of the complexity came from having to abstract all the different kinds of graphs that were supported, but a lot of it was also a result of the static nature of C++.
Still, it may be useful as a source of ideas and/or warning of problems.
In article <slrncikma7.e8q....@furu.idi.ntnu.no>, m...@furu.idi.ntnu.no (Magnus Lie Hetland) wrote:
> >In my personal graph library, I found that one of the nastiest issues > >was writing suitably general DFS/BFS algorithms which had "hooks" at > >relevant points
> Yes... I've been thinking about this -- it might be useful to allow > some form of traversal (where you could supply your own queue object, > for example, giving you bfs, dfs, dijkstra, prim, whatever) and have > the traversal take the form of an iterator (similar to the Boost graph > library). But if access to each node and its neighbors is available in > the interface, such traversals wouldn't really have to be part of the > standard API...
The first way is to generate a sequence of triples (v,w,edgetype): (v,v,forward) = start of new DFS tree with root v (v,w,forward) = first time DFS reaches w from parent v (v,w,nontree) = edge from v to already-visited vertex w (v,w,reverse) = DFS returns from v to parent w (v,v,reverse) = finish DFS tree with root v
The second way is a class with shadowable calls preorder(parent,child), postorder(parent,child), and backedge(source,destination). The arguments to the calls are basically the same as the first and second items in the triples described above, although I suppose I could have made separate calls for the DFS tree roots. The class is instantiated with a graph argument and as part of its initialization performs a DFS on that graph, making the calls as it goes.
The first way is lower level and I think uglier, but on the other hand it's usable to make iterators in a way that the second isn't; see e.g. the code in the same DFS.py file for iterating through the vertices of a graph in DFS postorder.
On Mon, 23 Aug 2004 20:41:53 +0000, Magnus Lie Hetland wrote: > In article <pan.2004.08.23.16.22.34.771...@jerf.org>, Jeremy Bowers wrote: > [snip]
> As I tried to state in the original post (I probably wasn't clear enough) > I'm not talking about a standard *implementation*, just a standard *API*, > like the DB-API. This could easily cover all kinds of strange beasts such > as directed or undirected, weighted or unweighted (etc.) graphs; > multigraphs, chain graphs, hypergraphs, who knows.
Point conceded about API and not a library, but I'm not sure that changes my point much. Your API is going to assume something about how edges are represented (which will conflict with somebody), *or* it will be so vague as to not have any particular advantage over the nothing we have now. And so on for most of the other dimensions.
In article <eppstein-4A56C8.14042623082...@news.service.uci.edu>,
David Eppstein wrote:
[snip]
>I would strongly prefer not to have weights or similar attributes as >part of a graph API. I would rather have the weights be a separate dict >or function or whatever passed to the graph algorithm. The main reason >is that I might want the same algorithm to be applied to the same graph >with a different set of weights.
I can see that.
[snip stuff about using dicts]
This can be said about all objects, really; no reason to have attributes as long as we can associate values with the objects using dicts. This is where things start to look implementation-specific, even though it is *possible* to keep it abstract using this interface.
One of my motivations is allowing arbitrary structures behind the scenes, e.g. the graph may be a front-end for something that is computed on-the-fly using specialized hardware (in fact a very real possibilit in my case). I could have something like this be represented by several distinct objects (e.g. one for the topological structure, one for the weights), of course, but I'd certainly have to implement the weight mapping myself, and not use built-in dicts.
I do think your API is nice in that it is simple, but I also have the feeling that using it with other implementations would be sort of unnatural; one would be trying to emulate the "dict-of-lists with dicts for properties" implementation, because that implementation *would* have been simple -- had one used it.
Also, again, this doesn't lend itself very well to manipulating graphs. If I set one weight to infinity, I might expect (perhaps) the corresponding edge to disappear (otherwise the graph would have to be complete in the first place) or similar things; there may also be other dependencies between properties. Not easily handled with a separate object for each property. And using functions for everything that needs calculating doesn't easily lead to polymorphism...
It's not horribly inconvenient, of course (just a matter of defining a few objects referring to the same underlying mechanism, each with a different __getitem__ method). I'm just airing my thoughts about why it *might* be useful to have something else -- possibly in addition.
Perhaps one could have something like two levels? The Level 1 Graph API would support the "graph as mapping from nodes to neighbors" with "properties as separate mappings" and the Level 2 Graph API could add some convenience methods/properties for encapsulation and manipulation?
[snip]
> I think this may contradict some things I said a year or two ago > about using a dict-of-dicts representation in which G[v][w] provides > the weight;
Yeah, I remember you saying that :)
> I've changed my mind.
Fair enough. FWIW, I agree with your new position when it comes to the simple dict-based implementation; this is basically how it's done in pseudocode, usually (e.g. having pi[v] represent the predecessor of v and so forth).
-- Magnus Lie Hetland "Canned Bread: The greatest thing since sliced http://hetland.org bread!" [from a can in Spongebob Squarepants]