Collecting all visited vertices in traversal order

146 views
Skip to first unread message

m...@unsilo.ai

unread,
Feb 6, 2017, 7:12:17 AM2/6/17
to Gremlin-users
Hi,

I am new to Gremlin and I am still exploring how to do things, and I hope you don't mind me asking question that may seem trivial to experienced Gremlin users.

I like to collect all the visited vertices in traversal order rather than just outputting the paths.

E.g. given the following example graph:

ex = TinkerGraph.open()
v1 = ex.addVertex(T.label, 'n', 'text', 'cake', 'pos', 1)
v2 = ex.addVertex(T.label, 'n', 'text', 'pizza', 'pos', 2)
v3 = ex.addVertex(T.label, 'n', 'text', 'apple', 'pos', 3)
v4 = ex.addVertex(T.label, 'n', 'text', 'orange', 'pos', 4)
v5 = ex.addVertex(T.label, 'n', 'text', 'fish', 'pos', 5)
v6 = ex.addVertex(T.label, 'n', 'text', 'ham', 'pos', 6)
e1 = v2.addEdge('p', v1)
e2 = v3.addEdge('p', v2)
e2 = v4.addEdge('q', v2)
e3 = v5.addEdge('p', v3)
e3 = v6.addEdge('p', v3)
g = ex.traversal()

and the following traversal:

tx.V().hasLabel('n').repeat(__.in('p').hasLabel('n').order().by('pos', incr)).emit().dedup().simplePath().path().by('text')

I get the output:

==>[cake,pizza]
==>[cake,pizza,apple]
==>[cake,pizza,apple,fish]
==>[cake,pizza,apple,ham]

However, I want to find [cake,pizza,apple,fish,ham], which represents all visited vertices in traversal order of the largest subgraph when traversing tx.V().hasLabel('n').repeat(__.in('p').hasLabel('n').order().by('pos', incr)).emit(), i.e. [cake, pizza, apple] is considered subgraph that I like to filter out. The result I am looking for is not found above because its not really the "path" element I should output, and the dedup is probably also misplaced, but I can't seem to get my head around how to express this in a simple way with Gremlin. I somehow need to collapse the above output into [cake,pizza,apple,fish,ham]. I would appreciate if someone could guide me, and please let me know if the question is in any way unclear.

Thanks,
Mario

m...@unsilo.ai

unread,
Feb 6, 2017, 7:16:58 AM2/6/17
to Gremlin-users
Sorry, there was a small typo in the above example. The line g = ex.traversal() should have been tx = ex.traversal() for the example to work. The corrected example would look like this:

ex = TinkerGraph.open()
v1 = ex.addVertex(T.label, 'n', 'text', 'cake', 'pos', 1)
v2 = ex.addVertex(T.label, 'n', 'text', 'pizza', 'pos', 2)
v3 = ex.addVertex(T.label, 'n', 'text', 'apple', 'pos', 3)
v4 = ex.addVertex(T.label, 'n', 'text', 'orange', 'pos', 4)
v5 = ex.addVertex(T.label, 'n', 'text', 'fish', 'pos', 5)
v6 = ex.addVertex(T.label, 'n', 'text', 'ham', 'pos', 6)
e1 = v2.addEdge('p', v1)
e2 = v3.addEdge('p', v2)
e2 = v4.addEdge('q', v2)
e3 = v5.addEdge('p', v3)
e3 = v6.addEdge('p', v3)
tx = ex.traversal()
tx.V().hasLabel('n').repeat(__.in('p').hasLabel('n').order().by('pos', incr)).emit().dedup().simplePath().path().by('text')

Thanks,
Mario

Daniel Kuppitz

unread,
Feb 6, 2017, 8:54:46 AM2/6/17
to gremli...@googlegroups.com
Let's start with your original traversal:

gremlin> g.V().hasLabel('n').repeat(__.in('p').hasLabel('n').order().by('pos', incr)).emit().dedup().simplePath().path().by('text')
==>[cake,pizza]
==>[cake,pizza,apple]
==>[cake,pizza,apple,fish]
==>[cake,pizza,apple,ham]

If you don't want the paths, why do you use path() at all? Let's flatten the result:

gremlin> g.V().hasLabel('n').repeat(__.in('p').hasLabel('n').order().by('pos', incr)).emit().dedup().simplePath().values('text')
==>pizza
==>apple
==>fish
==>ham

Ok, we've lost the cake, that's because the traversal doesn't emit the first vertex.

gremlin> g.V().hasLabel('n').emit().repeat(__.in('p').hasLabel('n').order().by('pos', incr)).dedup().simplePath().values('text')
==>cake
==>pizza
==>apple
==>fish
==>ham
==>orange

There it is again, but we also got oranges.To exclude it we can filter only those vertices w/ incoming edges:

gremlin> g.V().hasLabel('n').filter(inE('p')).emit().repeat(__.in('p').hasLabel('n').order().by('pos', incr)).dedup().simplePath().values('text')
==>cake
==>pizza
==>apple
==>fish
==>ham

The final step is to fold() the result:

gremlin> g.V().hasLabel('n').filter(inE('p')).emit().repeat(__.in('p').hasLabel('n').order().by('pos', incr)).dedup().simplePath().values('text').fold()
==>[cake,pizza,apple,fish,ham]

Cheers,
Daniel


--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/84da7707-3441-4d9b-86fd-482965542c5d%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

m...@unsilo.ai

unread,
Feb 6, 2017, 3:10:02 PM2/6/17
to Gremlin-users
Thanks Daniel,

This was really helpful. I also played around with fold at some point but I didn't get it to work the way I wanted, but your educational walk through of your solution made me realise that my example above was also a bit too simple. I actually have multiple of these closures within the graph that I like to find, and fold puts them all in the same list. I really want a list of lists, one list for each of them. E.g. if you add the following additional edges to the graph

v7 = ex.addVertex(T.label, 'n', 'text', 'juice', 'pos', 7)
v8 = ex.addVertex(T.label, 'n', 'text', 'potato', 'pos', 8)
e6 = v7.addEdge('p', v4)
e7 = v8.addEdge('p', v4)

You would get [cake,pizza,apple,fish,ham,orange,juice,potato] but orange is not connected to pizza with p. Its instead connected with juice and potato through p, so it forms its own subgraph that I like to capture in a result like this:

[[cake,pizza,apple,fish,ham],[orange,juice,potato]]

Sorry for not bringing this additional requirement up earlier, but I actually didn't think about it before I saw your solution.

Best
Mario

Daniel Kuppitz

unread,
Feb 6, 2017, 4:57:55 PM2/6/17
to gremli...@googlegroups.com
Well, in that case let's use path() again, start only at those vertices that don't have an outgoing p-edge, group the paths by the first vertex and merge the grouped values.

gremlin> g.V().hasLabel('n').not(outE("p")).as("a").
           until(__.not(__.in("p").simplePath())).

             repeat(__.in('p').hasLabel('n').order().by('pos', incr)).
           path().by("text").group().by(select("a")).
           select(values).unfold().map(unfold().unfold().dedup().fold())
==>[cake,pizza,apple,fish,ham]
==>[orange,juice,potato]

Cheers,
Daniel


--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.

m...@unsilo.ai

unread,
Feb 7, 2017, 10:55:33 AM2/7/17
to Gremlin-users
Hi Daniel,

Thank you very much for giving some insights on how to do this with Gremlin. I must admit it took some work to fully understand the details, especially all the final unfolding and folding, which is somewhat cryptic and its hard to decipher the intention just by reading the statements. I wouldn't have solve it myself in a reasonable amount of time. I guess I need to spend more time to go through the code examples in the documentation, although I prefer learning on my own use cases.

I adapted it to the real graphs I have, and it seems to do what it is suppose to, although I need much more work in capturing all the different real-world variations and edge cases.

I have one more question though. Is there some important semantic difference between

until(__.not(__.in("p").simplePath())).repeat(__.in('p').hasLabel('n').order().by('pos', incr)).path()

and

repeat(__.in('p').hasLabel('n').order().by('pos', incr)).emit().simplePath().path()

since they seem to result in the same output in my tests?

Best
Mario


Daniel Kuppitz

unread,
Feb 7, 2017, 12:36:23 PM2/7/17
to gremli...@googlegroups.com
There's a big difference. The first statement only emits full paths, while the second statement also emits incomplete paths (MUCH more (unnecessary) data to process).

Cheers,
Daniel


--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages