Connected Component in JanusGraph

Antriksh Shah

Nov 28, 2017, 10:45:39 AM
to Gremlin-users
1) For an existing graph, I want to run a label propagation algorithm.
To simplify it further: for a given vertex V, update a specific property on all vertices connected to V, directly or transitively, via any edge.

Currently I am using this query:
graph.traversal().V(vertex).repeat(both()).until(cyclicPath()).both().dedup().property("label","new_label")
Is there a faster way to do this? 

2) Also, once I have updated the data, I want to get all vertices whose label was updated by the above command and write them back to a relational database as well.

For that I am doing:

Iterator iter = graph.traversal().V(vertex).repeat(both()).until(cyclicPath()).both().dedup().valueMap("property_key", "label");
Then, iterating over each vertex I receive, I write it back to the database.
At times, if the graph is too dense, I get a heap size exceeded exception while trying to hold all the results in the iterator.

Is there an optimised way to run this faster?
My code is in java.

Any help or suggestions would be greatly appreciated. Thank you in advance.
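
For the write-back step, one possible way to keep memory bounded is to consume the traversal lazily and flush rows in fixed-size batches instead of collecting everything first. The following is only a sketch in plain Java: `BatchWriter`, `writeInBatches`, and the `Consumer` that stands in for a relational batch insert are hypothetical names, not part of TinkerPop or JanusGraph. It helps only if the heap is being filled by collected results; if the traversal itself (path tracking, `dedup()` state) is what exhausts memory, batching the output will not fix that.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

class BatchWriter {
    // Drains the iterator in chunks of at most batchSize items and hands
    // each chunk to the sink. Returns the total number of items consumed.
    static <T> long writeInBatches(Iterator<T> results, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        long total = 0;
        while (results.hasNext()) {
            batch.add(results.next());
            total++;
            if (batch.size() == batchSize) {
                sink.accept(new ArrayList<>(batch)); // copy, so the sink may keep the list
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(new ArrayList<>(batch)); // flush the final partial batch
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for the traversal iterator; a real sink would run a batched insert.
        Iterator<Integer> fakeResults = List.of(1, 2, 3, 4, 5).iterator();
        long rows = writeInBatches(fakeResults, 2, b -> System.out.println("flush " + b));
        System.out.println(rows + " rows written");
    }
}
```

In this sketch at most `batchSize` items are buffered at any time, so the batch size becomes the knob for trading memory against the number of database round trips.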

Daniel Kuppitz

Nov 28, 2017, 12:56:20 PM
to gremli...@googlegroups.com
This can all be done in a single run; you don't even have to enable path computations.

g.V(vertex).
  repeat(both().dedup()).
    emit().
  or(hasNot("foo"), has("foo", neq("bar"))).
  property("foo", "bar")

The result will already contain all vertices that were updated. Just note that you can't change the label of a vertex, but if "label" is a property in your case, then it's all good.

Example:

gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(3).property("foo","bar").iterate() // now v[3] will not appear in the result as its 'foo' value won't be updated
gremlin> g.V(1).
           repeat(both().dedup()).
             emit().
           or(hasNot("foo"), has("foo", neq("bar"))).
           property("foo", "bar")
==>v[2]
==>v[4]
==>v[1]
==>v[6]
==>v[5]
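
To see what the traversal computes without a graph database, here is a plain-Java sketch of the same idea: a breadth-first walk over undirected adjacency that records each vertex once. The edge list mirrors the TinkerPop "modern" toy graph used above (1-2, 1-4, 1-3, 4-5, 4-3, 6-3); `ComponentSketch`, `undirected`, and `component` are illustrative names, not Gremlin API. Unlike the traversal, which reaches the start vertex again through a neighbour, this sketch marks the start as visited up front.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class ComponentSketch {
    // Builds an undirected adjacency map from {a, b} pairs (edge direction is ignored,
    // which corresponds to both() in the traversal).
    static Map<Integer, Set<Integer>> undirected(int[][] edges) {
        Map<Integer, Set<Integer>> adj = new TreeMap<>();
        for (int[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new TreeSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new TreeSet<>()).add(e[0]);
        }
        return adj;
    }

    // Returns every vertex reachable from start, including start itself.
    static Set<Integer> component(Map<Integer, Set<Integer>> adj, int start) {
        Set<Integer> seen = new LinkedHashSet<>(); // plays the role of dedup()
        Deque<Integer> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            int v = queue.poll();
            for (int n : adj.getOrDefault(v, Collections.<Integer>emptySet())) {
                if (seen.add(n)) { // add() returns false for already-visited vertices
                    queue.add(n);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        int[][] modern = {{1, 2}, {1, 4}, {1, 3}, {4, 5}, {4, 3}, {6, 3}};
        System.out.println(component(undirected(modern), 1));
    }
}
```

Because each vertex enters the queue at most once, the work is proportional to the number of vertices and edges in the component, which is why no path tracking is needed.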

Cheers,
Daniel


--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/e9e28834-f291-43fd-8d5d-f376a6b3a4a8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Antriksh Shah

Dec 2, 2017, 4:41:26 AM
to Gremlin-users
Hey Daniel,

I tried the above query, but it is not covering all the vertices of a given component.

Thanks for the help!


Also, if it is not too much trouble, could you have a look at this question as well? It is related to this one.
Somebody has replied saying it is not possible to find connected components for a fully connected graph of 35 vertices.

Thanks again!