How to filter the step 3 vertex list based on the step 1 vertex


Ronnie

Jul 9, 2021, 12:17:02 PM
to Gremlin-users
Hi,
Assume the following schema:
VertexA--edgeAB-->VertexB
VertexA--edgeAC-->VertexC
VertexC--edgeCB-->VertexB

Traversal
step 1: start with VertexB,
step 2: traverse edgeAB to find connected VertexA,
step 3: traverse edgeAC to find connected VertexC
step 4: how do I filter out the VertexC vertices that are not connected to the VertexB from step 1?

Gremlin queries that I tried:
g.V().hasLabel("VertexB").as("B").in("edgeAB").out("edgeAC").where(__.out("edgeCB").is("B"))
g.V().hasLabel("VertexB").as("B").in("edgeAB").out("edgeAC").where(__.out("edgeCB").hasId(__.select("B").id()))

Expected result: list of VertexC vertices which are connected back to the same VertexB vertex from step 1
Actual result: empty

Solution from the JanusGraph forum which works:
g.V().hasLabel("VertexB").as("B").in("edgeAB").out("edgeAC").as("C").out("edgeCB").as("B2").where("B", eq("B2")).select("C")

A couple of questions:
1. Is there a more efficient Gremlin query than the solution mentioned above?
2. Why does the where-traversal approach, e.g. where(__.out("edgeCB").is("B")), not work? Is it because "B" is treated as a literal instead of a step label?

Thanks,
Ronnie

HadoopMarc

Jul 21, 2021, 3:42:53 PM
to Gremlin-users
Hi Ronnie,

Regarding 1., I did not test it but expect the following to work:
g.V().hasLabel("VertexB").in("edgeAB").out("edgeAC").where(__.out("edgeCB").cyclicPath())

This is a less verbose query but not necessarily more efficient.
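
Since the query above is untested, one way to check it is against a tiny in-memory TinkerGraph in the Gremlin Console. This is only a sketch: the "name" property and the vertex names (a1, b1, b2, c1, c2) are made up for illustration and are not part of the original schema.

```groovy
// Throwaway in-memory graph (TinkerGraph ships with the Gremlin Console)
graph = TinkerGraph.open()
g = graph.traversal()

// a1 -> b1, a1 -> c1, a1 -> c2, c1 -> b1, c2 -> b2
// c1 links back to b1; c2 links to a different VertexB (b2)
g.addV("VertexB").property("name", "b1").as("b1").
  addV("VertexB").property("name", "b2").as("b2").
  addV("VertexA").property("name", "a1").as("a1").
  addV("VertexC").property("name", "c1").as("c1").
  addV("VertexC").property("name", "c2").as("c2").
  addE("edgeAB").from("a1").to("b1").
  addE("edgeAC").from("a1").to("c1").
  addE("edgeAC").from("a1").to("c2").
  addE("edgeCB").from("c1").to("b1").
  addE("edgeCB").from("c2").to("b2").iterate()

// The path b1 -> a1 -> c1 -> b1 repeats b1, so c1 should pass the filter;
// the path b1 -> a1 -> c2 -> b2 has no repeat, so c2 should be dropped.
g.V().hasLabel("VertexB").in("edgeAB").out("edgeAC").
  where(__.out("edgeCB").cyclicPath()).
  values("name")
```

Note that cyclicPath() passes on any repeated element in the path, not specifically on a return to the starting VertexB. With this schema only the VertexB can repeat, so the two conditions should coincide here.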

Your suggestion regarding 2. is correct. I do not know of any overview of all places where a string is interpreted as a key to a previous as() step.
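
One place where a string is resolved as a step label is a predicate passed to where(), in contrast to is(), which compares against the literal value. On that basis, a sketch of the original where-traversal with is("B") swapped for where(eq("B")) would look as follows. I did not run this against your data, so treat it as a starting point:

```groovy
// where(eq("B")) compares the current traverser (the VertexB reached
// via edgeCB) with the vertex labelled "B" earlier in the path.
g.V().hasLabel("VertexB").as("B").
  in("edgeAB").out("edgeAC").
  where(__.out("edgeCB").where(eq("B")))
```

The outer where() keeps the traverser on the VertexC, so no extra as("C") and select("C") are needed.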

I assume you also realized that the hasLabel() step triggers a table scan and does not scale (use the has() step against an index).
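
As a sketch of what an index-backed start could look like on JanusGraph: the property key "name", the index name "byName" and the value "b1" below are hypothetical, so substitute whatever uniquely identifies your VertexB vertices.

```groovy
// One-time schema setup with the JanusGraph management API.
// "name" and "byName" are made-up names for illustration.
mgmt = graph.openManagement()
name = mgmt.makePropertyKey("name").dataType(String.class).make()
mgmt.buildIndex("byName", Vertex.class).addKey(name).buildCompositeIndex()
mgmt.commit()

// Start from an indexed property lookup instead of hasLabel() alone
g.V().has("VertexB", "name", "b1").as("B").
  in("edgeAB").out("edgeAC").as("C").
  out("edgeCB").as("B2").
  where("B", eq("B2")).select("C")
```

If the property key already exists, mgmt.getPropertyKey("name") can be used in place of makePropertyKey(), and an index added after data was loaded has to be reindexed before queries can use it.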

Best wishes, Marc


Ronnie

Aug 18, 2021, 6:46:50 PM
to Gremlin-users
Hi Marc,
Sorry for the delayed response - got stuck with some other stuff.

Thanks for pointing out the cyclicPath() step as a filter. I verified that it works fine. In terms of performance, based on my limited data set, the profile() step seems to indicate that the verbose query is more efficient.
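
For anyone who wants to repeat the comparison, this is roughly how the two queries can be profiled side by side. Timings depend on the data set and backend, so no figures are shown here:

```groovy
// Verbose variant: label both VertexB occurrences and compare them
g.V().hasLabel("VertexB").as("B").
  in("edgeAB").out("edgeAC").as("C").
  out("edgeCB").as("B2").
  where("B", eq("B2")).select("C").
  profile()

// Shorter variant: filter on a repeated element in the path
g.V().hasLabel("VertexB").
  in("edgeAB").out("edgeAC").
  where(__.out("edgeCB").cyclicPath()).
  profile()
```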

Also, thanks for pointing out that hasLabel() does not scale! Much appreciated! It is unfortunate that indexing based on the label is not possible.

Thanks!
Ronnie