BFS on an entire graph with cycles/loops

1,370 views
Skip to first unread message

Ankit Singhal

unread,
Jul 10, 2014, 9:21:44 AM7/10/14
to gremli...@googlegroups.com
Hi everyone,

I want to perform BFS on the entire graph using gremlin. The graph contains cycles and loops. I went through the link https://github.com/tinkerpop/gremlin/wiki/Depth-First-vs.-Breadth-First .The example doesn't illustrate how to deal with cycles/loops during BFS.

Let's take this simple case :-

A graph has 3 vertices A, B, C, D and 4 edges A -> B, A  -> C, C -> D and D -> C.

As you can notice, there is a cycle C -> D -> C.

I want to traverse this entire graph through BFS algorithm starting from vertex A. Traversing this, I should get the following output in the same order:-

A->B
A->C
C->D
D->C

I tried something like : g.v(0).outE.inV.simplePath.loop(3){true}{true}.path. But it didn't work and went into infinite loop.

Any help would be greatly appreciated.

Thanks

Marko Rodriguez

unread,
Jul 10, 2014, 10:28:42 AM7/10/14
to gremli...@googlegroups.com
Hi,

There is a bug in Gremlin2 that is not easily solved (though rectified in Gremlin3). You don't have access to the true path within an loop construct. You can do this:

g.(0).out.loop{it.path.isSimple()}{true}.path

Note that isSimple() is a recent addition which is simply a function like this if you are running a version of Gremlin that doesn't have it:

def boolean isSimple(List path) {
return path.size() == new HashSet(path).size();
}

HTH,
Marko.
--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

kp

unread,
Jul 24, 2014, 7:15:53 AM7/24/14
to gremli...@googlegroups.com
Thanks a lot Marko..

One more question..

Will out step return vertices with all properties or vertices with just ids (i.e. light weight vertices). I don't want to load all the properties of the vertex during traversal. Just the id will do fine. Later if I want I can load its vertices but not during traversal.

Is there a way to do it?

Marko Rodriguez

unread,
Jul 24, 2014, 9:31:21 AM7/24/14
to gremli...@googlegroups.com
Hi kp,

Will out step return vertices with all properties or vertices with just ids (i.e. light weight vertices). I don't want to load all the properties of the vertex during traversal. Just the id will do fine. Later if I want I can load its vertices but not during traversal.

This is something you could very easily test on your own in a fraction of the time it took you to write your email, then me read it, have me open the Gremlin console, demonstrate the fact, cut/paste the result, indent it, apply courier font, and type the rest of the body of the email, with salutation and pick which signature I should go with -- http://markorodriguez.com or http://thinkaurelius.com (hmmm.. I think http://markorodriguez.com cause it looks good on me in summer).

gremlin> g = TinkerGraphFactory.createTinkerGraph()
==>tinkergraph[vertices:6 edges:6]
gremlin> g.V.out.transform{it.map()}
==>{age=27, name=vadas}
==>{age=32, name=josh}
==>{name=lop, lang=java}
==>{name=lop, lang=java}
==>{name=ripple, lang=java}
==>{name=lop, lang=java}
gremlin>
gremlin> g.V.out.transform{it.id}
==>2
==>4
==>3
==>3
==>5
==>3

kp

unread,
Jul 25, 2014, 3:04:05 AM7/25/14
to gremli...@googlegroups.com
Hi Marko,

Thanks for taking your time to reply. I tried it on my own before posting the question and read your reply on this post https://groups.google.com/forum/#!topic/gremlin-users/weg1S2sySOA. But I didn't think the solution addressed my concern.

Let me try to be clear.

When we do g.V, it returns list of vertices in the graphs, right?. I want to know if these vertices are loaded in memory along with all their properties or just the id of the vertex is loaded.
When I debug the java code, I find that the vertices have all properties loaded. The problem is I have graph with millions of nodes and edges and when I do traversal, it loads the entire vertex even though I don't need the properties of vertices during traversal.

Marko Rodriguez

unread,
Jul 25, 2014, 11:52:56 AM7/25/14
to gremli...@googlegroups.com
Hi KP,

When we do g.V, it returns list of vertices in the graphs, right?. I want to know if these vertices are loaded in memory along with all their properties or just the id of the vertex is loaded.
When I debug the java code, I find that the vertices have all properties loaded. The problem is I have graph with millions of nodes and edges and when I do traversal, it loads the entire vertex even though I don't need the properties of vertices during traversal.

It depends on the backend implementation. 

* TinkerGraph is fully in-memory so everything "is loaded." 
* Neo4j maintains the graph in-memory as well, but when pulling from disk, the properties are grabbed only when requested (I believe). 
* Titan will retrieve properties when requested (but again, if the vertex/properties are cached, then "its loaded.")
* Giraph (in TP3) will have all the properties loaded as the entire graph is stored in-memory across the cluster.
* Faunus will fetch all the vertex data (id, properties, edges, and edge properties) when that particular vertex is touched.
* … not sure about other vendors.

In short, where the data is is a function of the underlying storage engine. You will have to review the specifics of the graph engine you are using.

Finally, if you are doing traversals over millions of vertices and edges, realize that OLTP systems like Titan/Cassandra, Neo4j, OrientDB, etc. are probably not the best solution as these are intended for short transactions touching little data during a single transaction. Can you speak to your use case?

HTH,
Marko.

http://markorodriguez.com <--- still summer.



Reply all
Reply to author
Forward
0 new messages