Efficient way to store/traverse an Undirected Graph?

575 views
Skip to first unread message

Count Vajhula

unread,
Mar 16, 2011, 6:05:25 PM3/16/11
to orient-...@googlegroups.com
Hi all,
I am trying to store an undirected graph in OrientDB, and I have a few questions on this:

1. I looked through the documentation, and I see APIs for "in edges" and "out edges" which seem to assume a directed graph. If I am trying to store an undirected graph, then would I need to add two edges every time I want to add a shared edge between two vertices? I.e.
i. Add an out-edge to vertex 1, going to vertex 2
ii. Add an out-edge to vertex 2, going to vertex 1
(Ignore in-edges? Or do in-edges get added automatically when an out-edge is added?)

Is there a better way to accomplish this?

2. Is there a fast way to get all the "neighbors" of a vertex? Right now the way that I see to do this, is to get all the out-edges from a vertex (assuming I follow the method I described in (1) of creating the reverse out-edges for every edge added), and then get the "outVertex"es from those edges. But in this case, I would need to loop through every edge one by one, which seems like it would be slow. Is there some optimized way to just get all the immediate neighbors? (And in future, I may want to get neighbors-of-neighbors, too. Is there an efficient way of doing that?)


Any help would be appreciated, thanks!

-Sid

Luca Garulli

unread,
Mar 17, 2011, 9:14:44 AM3/17/11
to orient-database, Count Vajhula
Hi,
by the graph is always bidirectional, so when you create an edge you have always the symmetric arrow back.

To traverse the graph you could query the root vertex and then cross the relationship via API or via GREMLIN. Take a look here to start using Gremlin: https://github.com/tinkerpop/gremlin/wiki/Basic-Graph-Traversals

Lvc@

Marko Rodriguez

unread,
Mar 17, 2011, 11:11:50 AM3/17/11
to orient-...@googlegroups.com, Count Vajhula
Hi,

Adding to what Luca said: You will be interested in the bothE step in Gremlin. It gets both incoming and outgoing edges from a vertex.


See ya,
Marko.

Count Vajhula

unread,
Mar 17, 2011, 3:51:04 PM3/17/11
to Marko Rodriguez, orient-...@googlegroups.com
Thanks guys, that sounds like just what I needed. I'll look into it, but just to clarify, I also found this on http://code.google.com/p/orient/wiki/GraphDatabaseTinkerpop :

"Since TinkerPop Blueprints API is quite raw and doesn't provide ad-hoc methods for very common use cases you could need to access to the underlying ODatabaseGraphTx object to better use the graph-engine under the hood. Commons operations are:

  • Count incoming and outgoing edges without browsing them all
  • Get incoming and outgoing vertexes without browsing the edges
  • ...
"

So Gremlin is an ~alternate~ way of achieving the same goals, but without going down into the raw API. Is this correct?

Thanks,
-Sid

Marko Rodriguez

unread,
Mar 17, 2011, 6:37:25 PM3/17/11
to Count Vajhula, orient-...@googlegroups.com
Hi,

Blueprints is a generic API that binds to various graph databases including TinkerGraph, Neo4j, OrientDB, Sail RDF stores, Dex, and soon InfiniteGraph. Each of these databases has their own particular graph APIs as well. The benefits of using Blueprints are discussed here:


Take care,
Marko.

Count Vajhula

unread,
Mar 17, 2011, 6:45:00 PM3/17/11
to Marko Rodriguez, orient-...@googlegroups.com
Thanks, yes I meant that using Gremlin is an alternative to going down into the "underlying ODatabaseGraphTx" API, right? Blueprints benefits are clear to me :)

Marko Rodriguez

unread,
Mar 17, 2011, 6:46:29 PM3/17/11
to Count Vajhula, orient-...@googlegroups.com
Hey,

Gremlin talks to Blueprints. So the benefits of Blueprints trickle to Gremlin. But then, with Gremlin, you get a traversal language. :)

See ya,
Marko.

Count Vajhula

unread,
Mar 17, 2011, 6:49:23 PM3/17/11
to Marko Rodriguez, orient-...@googlegroups.com
Got it, thanks!

Count Vajhula

unread,
Mar 18, 2011, 4:00:54 PM3/18/11
to Marko Rodriguez, orient-...@googlegroups.com
Luca, Marko,
I'm trying to get the neighbors of a vertex in a graph. Right now, using bothE, this seems to work to get the neighbors of v:

v.bothE.bothV.filter { it.id != v.id }

Is this an efficient method? I wonder if, since the undirected graph is a different data structure from a directed graph, and would benefit from a different set of optimizations and traversal techniques -- would it make sense to support the Undirected Graph as a different data model in Orientdb as well as in Gremlin/Blueprints? For example undirected traversals in Gremlin could look like this:

v.neighborV
v.neighborV('colleagues')
v.neighborV('friends').neighborV('friends')

And there could be some underlying optimizations for these operations if the underlying graph db supports undirected graph, otherwise it could just be mapped to the equivalent of something like the first query, which gets bothE and applies a filter.

Just a suggestion, what do you think? In any case, if there is another way to get neighbors using bothE or some other method which is better, then I'd be happy to know about it.

Thanks!
-Sid

Count Vajhula

unread,
Mar 18, 2011, 8:11:01 PM3/18/11
to Marko Rodriguez, orient-...@googlegroups.com
Hi Marko,
To build on my previous question, I'm having some trouble getting the Gremlin call to work correctly (could be a bug). Hope you can help me figure it out!

So as I mentioned earlier, I am trying to get the neighbors of a vertex as an ArrayList of property values. For example, every vertex in the graph may have a property 'name', and I would like to get all the neighbors of a vertex as a list of 'names'. Right now, I am able to get the output as a PropertyPipe, but to convert that to a list I need to call a collect {} on it. Then, when that executes, the data in the resulting list seems to be corrupted.

Here, with just the PropertyPipe, the output looks fine:

gremlin> a.bothE.bothV.filter { it.id != a.id }.name.getPipes()[-1] 
==>Mike
==>Dan
gremlin> _.class
==>class com.tinkerpop.pipes.pgm.PropertyPipe

(The -1 index in getPipes()[-1] is to get the PropertyPipe out of the returned pipes).

But when I add the collect {}, the resulting list is corrupted, "[Mikf, Dao]":

gremlin> a.bothE.bothV.filter { it.id != a.id }.name.getPipes()[-1].collect { it -> it.next() }
==>Mikf
==>Dao
gremlin> _.class
==>class java.util.ArrayList


Any idea what may be causing this? And is this the correct way to get neighbors of a vertex as a list? Seems a bit complex..

Thanks a ton, really appreciate your time,
-Sid

Marko Rodriguez

unread,
Mar 19, 2011, 11:12:31 AM3/19/11
to Count Vajhula, orient-...@googlegroups.com
Hi Sid,

But when I add the collect {}, the resulting list is corrupted, "[Mikf, Dao]":

collect{} is not a Pipe. Its a closure method in Groovy. Here are the pipes used in Gremlin:

Count Vajhula

unread,
Mar 19, 2011, 1:57:13 PM3/19/11
to Marko Rodriguez, orient-...@googlegroups.com
Thanks, the 'aggregate' step did the trick:

gremlin> li = []                                                                 
gremlin> a.bothE.bothV.filter{it.id != a.id}.name.getPipes()[-1].aggregate(li)   
==>Mike
==>Dan
gremlin> li
==>Mike
==>Dan
gremlin> li.class
==>class java.util.ArrayList

Marko Rodriguez

unread,
Mar 19, 2011, 2:49:12 PM3/19/11
to Count Vajhula, orient-...@googlegroups.com
Cool. One note, be wary of != ... I would do:

!it.id.equals(a.id)

just to be safe.

Marko.

Count Vajhula

unread,
Mar 19, 2011, 2:56:06 PM3/19/11
to Marko Rodriguez, orient-...@googlegroups.com
Good call, I'll do that.
Reply all
Reply to author
Forward
0 new messages