Group by path


Robert Dale

Jul 18, 2017, 11:16:32 AM
to Gremlin-users

I would like to group some paths by their destination. However, it appears that only the last path is available in the resulting group.

So for the example graph below, I would like to get all paths taken to destinations a or b.

graph = TinkerGraph.open()
g = graph.traversal()

g.addV('start').property('name', 'start').as('start').
addV('level1').property('name', '1').as('1').
addV('level1').property('name', '2').as('2').
addV('level1').property('name', '3').as('3').
addV('level1').property('name', '4').as('4').
addV('level2').property('name', 'a').as('a').
addV('level2').property('name', 'b').as('b').
addE('next').from('start').to('1').
addE('next').from('start').to('2').
addE('next').from('start').to('3').
addE('next').from('start').to('4').
addE('next').from('1').to('a').
addE('next').from('2').to('a').
addE('next').from('3').to('b').
addE('next').from('4').to('b')

Here we can see that there are 2 paths for each destination.

g.V().hasLabel('start').out().as('level1').out().as('level2').path().by(values('name'))
==>[start,1,a]
==>[start,2,a]
==>[start,3,b]
==>[start,4,b]
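
For readers without a Gremlin console handy, the same four paths can be reproduced with a rough plain-Python sketch. This is only a model of the example graph, not TinkerPop code; the names EDGES and two_hop_paths are made up for illustration.

```python
# Adjacency list mirroring the 'next' edges of the example graph (names only).
EDGES = {
    "start": ["1", "2", "3", "4"],
    "1": ["a"],
    "2": ["a"],
    "3": ["b"],
    "4": ["b"],
}


def two_hop_paths(source):
    """Yield [source, level1, level2] name paths, similar to out().out().path()."""
    for level1 in EDGES.get(source, []):
        for level2 in EDGES.get(level1, []):
            yield [source, level1, level2]


PATHS = list(two_hop_paths("start"))
for path in PATHS:
    print(path)
```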

When grouped, only the last path of each group remains:

g.V().hasLabel('start').out().as('level1').out().as('level2').group().by('name').by(path().by(values('name')))
==>[a:[start,2,a],b:[start,4,b]]

I know all the paths are there, because unfolding them into a single list shows every element:

g.V().hasLabel('start').out().as('level1').out().as('level2').group().by('name').by(path().unfold().values('name').fold())
==>[a:[start,1,a,start,2,a],b:[start,3,b,start,4,b]]

In the end, I got it to work by putting the path in the main traversal and grouping on the last element of each path.

g.V().hasLabel('start').out().as('level1').out().as('level2').path().from('level1').to('level2').by('name').as('path').group().by(tail(local))
==>[a:[[1,a],[2,a]],b:[[3,b],[4,b]]]
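
The idea of grouping whole paths by their last element can be sketched in plain Python as follows. It is a model of what the query above returns, not of how TinkerPop computes it; PATHS and group_by_tail are illustrative names.

```python
from collections import defaultdict

# Paths from 'level1' to 'level2', as in path().from('level1').to('level2').by('name').
PATHS = [["1", "a"], ["2", "a"], ["3", "b"], ["4", "b"]]


def group_by_tail(paths):
    """Group whole paths by their last element, similar to group().by(tail(local))."""
    groups = defaultdict(list)
    for path in paths:
        groups[path[-1]].append(path)
    return dict(groups)


GROUPED = group_by_tail(PATHS)
print(GROUPED)
```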

Good results, but let's get rid of the duplicate destinations 'a' and 'b' in the values. I'll try to limit what's in each path.

g.V().hasLabel('start').out().as('level1').out().as('level2').path().from('level1').to('level2').by('name').as('path').group().by(tail(local)).by(limit(local, 1))
==>[a:2,b:4]

Doh! I tried a few things but couldn't get this one to work. Back to square 1!


So, the question is: Why don't these things work the way I expect them to?  :-)


Robert Dale

Jul 18, 2017, 3:27:04 PM
to Gremlin-users

I think I get it now.  If a traversal is provided as the value by() of group(), then it's left to that traversal to provide the fold(). Without it, only a single value ends up in each group.

These now work as expected.

g.V().hasLabel('start').out().as('level1').out().as('level2').group().by('name').by(path().from('level1').to('level2').by(values('name')).fold())
==>[a:[[1,a],[2,a]],b:[[3,b],[4,b]]]

g.V().hasLabel('start').out().as('level1').out().as('level2').path().from('level1').to('level2').by('name').as('path').group().by(tail(local)).by(limit(local, 1).fold())
==>[a:[1,2],b:[3,4]]

Yay.
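
The difference a trailing fold() makes can be illustrated with a small plain-Python model. It only mimics the outputs seen in this thread: the group helper and its fold flag are hypothetical, and the "last value wins" behaviour without fold is an assumption taken from the results above, not a statement about TinkerPop internals.

```python
# Paths from 'level1' to 'level2', as in path().from('level1').to('level2').by('name').
PATHS = [["1", "a"], ["2", "a"], ["3", "b"], ["4", "b"]]


def last(path):
    """Stands in for tail(local): the destination name."""
    return path[-1]


def first(path):
    """Stands in for limit(local, 1): the level1 name."""
    return path[0]


def group(items, key, value, fold):
    """Group items by key(item).

    With fold=True every mapped value is collected into a list.
    With fold=False each new value overwrites the previous one
    (assumed 'last one wins', matching the output in this thread).
    """
    result = {}
    for item in items:
        k, v = key(item), value(item)
        if fold:
            result.setdefault(k, []).append(v)
        else:
            result[k] = v
    return result


WITHOUT_FOLD = group(PATHS, last, first, fold=False)
WITH_FOLD = group(PATHS, last, first, fold=True)
print(WITHOUT_FOLD)  # {'a': '2', 'b': '4'}
print(WITH_FOLD)     # {'a': ['1', '2'], 'b': ['3', '4']}
```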