I'm trying to find all the nodes in a connected component of a graph that contains roughly 130M vertices and 350M edges.
I'm using the following query to count the nodes in a connected component:
Input - starting vertex id/name
Output - count of nodes in the connected component.
Query - g.V().has("name", "driver1").repeat(where(without("a")).store("a").both().simplePath().dedup()).emit().hasLabel("driver").count().fold()
The above query takes about 52 sec.
The RepeatStep alone takes about 29 sec.
Is there a way to optimize the linear traversal in the RepeatStep, or the lookup in the WherePredicateStep?
Profile output of the above query:
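To make the intent concrete, here is a minimal sketch in plain Python (not Gremlin) of the result I expect: a breadth-first search that keeps a visited set, which is the role store("a") and without("a") play in the query. The adjacency dict, the vertex names, and the `component_size` helper are hypothetical and only for illustration. Unlike the query, the sketch also counts the start vertex.

```python
from collections import deque


def component_size(adjacency, start, is_counted=lambda v: True):
    """Count vertices reachable from `start`, following edges in both directions.

    `adjacency` maps each vertex to an iterable of its neighbours (hypothetical
    in-memory stand-in for the graph). `is_counted` filters which visited
    vertices contribute to the count, similar in spirit to hasLabel("driver").
    """
    visited = {start}          # plays the role of the side-effect collection "a"
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbour in adjacency.get(vertex, ()):
            if neighbour not in visited:   # the without("a") check
                visited.add(neighbour)     # the store("a") step
                queue.append(neighbour)
    return sum(1 for v in visited if is_counted(v))


# Toy graph: two drivers connected through one car, plus an isolated driver.
adjacency = {
    "driver1": ["car1"],
    "car1": ["driver1", "driver2"],
    "driver2": ["car1"],
    "driver3": [],
}
size = component_size(adjacency, "driver1", lambda v: v.startswith("driver"))
```

With a hash-based visited set, each vertex is expanded once and each membership check is constant time on average, which is the behaviour I'm hoping to get from the traversal.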
"dur": 29008.200345,
"counts": {"traverserCount": 13809, "elementCount": 13809},
"name": "RepeatStep([WherePredicateStep(without([a])), ProfileStep, StoreStep(a), ProfileStep, JanusGraphVertexStep(BOTH,vertex), ProfileStep, PathFilterStep(simple), ProfileStep, RepeatEndStep, ProfileStep],until(false),emit(true))",
"annotations":{
"percentDur": 52.557919400750215},
"id": "2.0.0()",
"metrics": [
{
"dur": 38.137699,
"counts": {
"traverserCount": 13810,
"elementCount": 13810},
"name": "WherePredicateStep(without([a]))",
"id": "0.1.0(2.0.0())"
},
{
"dur": 28628.594393,
"counts": {
"traverserCount": 252428,
"elementCount": 252428
},
"name": "JanusGraphVertexStep(BOTH,vertex)", "annotations"