[TinkerPop] The by()-projection and its use with "post-processing" steps.

Marko Rodriguez

Dec 16, 2014, 12:21:04 PM
to gremli...@googlegroups.com

A big change was made to TinkerPop3 that greatly simplifies the syntax and removes the need for lambdas in 95% of use cases: the by()-step was introduced. Like the as()-step, it is not a true step; instead, it modulates the step that precedes it.

Notice how we can now just access the property values of an element in a by()-projection without using a lambda. This is so much cleaner than before.

gremlin> g.E().groupCount().by('weight')
==>[0.5:1, 1.0:2, 0.4:2, 0.2:1]

// before: g.E().groupCount{it.get().value('weight')}
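To make the idea concrete, here is a rough plain-Java sketch (not TinkerPop code) of what a key-based by() saves you from writing. The class GroupCountSketch and its helpers by() and groupCount() are hypothetical names, elements are modeled as simple maps, and the edge ordering in main() is illustrative.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class GroupCountSketch {
    // Hypothetical helper: turns a property key into a projection function,
    // so callers pass a key instead of writing the lambda themselves.
    static Function<Map<String, Object>, Object> by(String key) {
        return element -> element.get(key);
    }

    // Counts how many elements share each projected value, in first-seen order.
    static Map<Object, Long> groupCount(List<Map<String, Object>> elements,
                                        Function<Map<String, Object>, Object> projection) {
        Map<Object, Long> counts = new LinkedHashMap<>();
        for (Map<String, Object> element : elements) {
            counts.merge(projection.apply(element), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Illustrative edges carrying only a weight property.
        List<Map<String, Object>> edges = List.of(
            Map.of("weight", 0.5), Map.of("weight", 1.0), Map.of("weight", 0.4),
            Map.of("weight", 1.0), Map.of("weight", 0.4), Map.of("weight", 0.2));
        System.out.println(groupCount(edges, by("weight"))); // {0.5=1, 1.0=2, 0.4=2, 0.2=1}
    }
}
```
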

What is nice about select() is that it now takes varargs of step labels -- no longer a List<String> -- so it looks much prettier.

gremlin> g.V().as('a').out('created').as('b').select('a','b')  // no projection
==>[a:v[1], b:v[3]]
==>[a:v[4], b:v[5]]
==>[a:v[4], b:v[3]]
==>[a:v[6], b:v[3]]
gremlin> g.V().as('a').out('created').as('b').select('a','b').by('name') // name projection
==>[a:marko, b:lop]
==>[a:josh, b:ripple]
==>[a:josh, b:lop]
==>[a:peter, b:lop]
gremlin> g.V().as('a').out('created').as('b').select('a','b').by(id).by('name') // id and name projection
==>[a:1, b:lop]
==>[a:4, b:ripple]
==>[a:4, b:lop]
==>[a:6, b:lop]
gremlin> g.V().as('a').out('created').as('b').select('a','b').by{it.value('name')[2]}  // arbitrary function projection
==>[a:r, b:p]
==>[a:s, b:p]
==>[a:s, b:p]
==>[a:t, b:p]

Now there is no longer a difference between order() and orderBy(). It's all order() with by()-projections. Notice that if you do need something more complex than just a property value comparison, you can provide a 2-arg lambda.

gremlin> g.V().values('name').order()
gremlin> g.V().order().by('name',incr).values('name')
gremlin> g.V().has('age').order().by('name'){a,b->a[1]<=>b[1]}.values('name')
gremlin> g.V().has('age').order().by('name'){a,b->a[1]<=>b[1]}.by('age',incr).values('name')
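For readers who think in Java, chaining by()-projections on order() is loosely comparable to chaining comparators. The sketch below is plain Java rather than TinkerPop code; OrderBySketch and its methods are hypothetical names, vertices are modeled as maps, and the ages are illustrative values.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

class OrderBySketch {
    // First compare on the second letter of the name, then break ties by ascending age.
    static Comparator<Map<String, Object>> secondLetterThenAge() {
        Comparator<Map<String, Object>> bySecondLetter =
            Comparator.comparing((Map<String, Object> v) -> ((String) v.get("name")).charAt(1));
        return bySecondLetter.thenComparing((Map<String, Object> v) -> (Integer) v.get("age"));
    }

    // Sorts a copy of the input and returns only the names.
    static List<String> orderedNames(List<Map<String, Object>> vertices) {
        List<Map<String, Object>> sorted = new ArrayList<>(vertices);
        sorted.sort(secondLetterThenAge());
        List<String> names = new ArrayList<>();
        for (Map<String, Object> v : sorted) {
            names.add((String) v.get("name"));
        }
        return names;
    }

    public static void main(String[] args) {
        // Illustrative person vertices; only name and age are modeled.
        List<Map<String, Object>> people = List.of(
            Map.of("name", "marko", "age", 29), Map.of("name", "vadas", "age", 27),
            Map.of("name", "josh", "age", 32), Map.of("name", "peter", "age", 35));
        System.out.println(orderedNames(people)); // [vadas, marko, peter, josh]
    }
}
```
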

The path()-step is analogous to the select()-step in that it takes an arbitrary number of by()-projections and evaluates them in a round-robin fashion.

gremlin> g.V().outE().inV().outE().inV().path()
==>[v[1], e[8][1-knows->4], v[4], e[10][4-created->5], v[5]]
==>[v[1], e[8][1-knows->4], v[4], e[11][4-created->3], v[3]]
gremlin> g.V().outE().inV().outE().inV().path().by('name').by{it.value('weight') + 10}
==>[marko, 11.0, josh, 11.0, ripple]
==>[marko, 11.0, josh, 10.400000005960464, lop]
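The round-robin behavior can be pictured as indexing the projections modulo their count. This is only a plain-Java sketch of that idea, not how TinkerPop implements it; PathBySketch and project() are hypothetical names, path elements are modeled as maps, and the weights are simplified to 1.0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class PathBySketch {
    // Applies projection (i mod n) to the i-th path element; with no
    // projections the elements are passed through unchanged.
    static List<Object> project(List<Map<String, Object>> path,
                                List<Function<Map<String, Object>, Object>> projections) {
        List<Object> result = new ArrayList<>();
        for (int i = 0; i < path.size(); i++) {
            Map<String, Object> element = path.get(i);
            if (projections.isEmpty()) {
                result.add(element);
            } else {
                result.add(projections.get(i % projections.size()).apply(element));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // vertex, edge, vertex, edge, vertex (illustrative properties only)
        List<Map<String, Object>> path = List.of(
            Map.of("name", "marko"), Map.of("weight", 1.0), Map.of("name", "josh"),
            Map.of("weight", 1.0), Map.of("name", "ripple"));
        List<Function<Map<String, Object>, Object>> projections = List.of(
            e -> e.get("name"),                  // used for positions 0, 2, 4
            e -> (Double) e.get("weight") + 10); // used for positions 1, 3
        System.out.println(project(path, projections)); // [marko, 11.0, josh, 11.0, ripple]
    }
}
```

With two projections and a vertex/edge/vertex path, the first projection always lands on vertices and the second on edges, which is why by('name') followed by a weight lambda lines up in the console output above.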

Finally, perhaps the biggest change is that groupBy() has been renamed group() and you can provide it up to three by() projections: key projection, value projection, and collection reduction projection.

gremlin> g.V().group().by{it.value('name')[1]}
==>[a:[v[1], v[2]], e:[v[6]], i:[v[5]], o:[v[3], v[4]]]
gremlin> g.V().group().by{it.value('name')[1]}.by('name')
==>[a:[marko, vadas], e:[peter], i:[ripple], o:[lop, josh]]
gremlin> g.V().group().by{it.value('name')[1]}.by('name').by{it.size()}
==>[a:2, e:1, i:1, o:2]
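The three projections map naturally onto three functions: one picks the key, one picks the value stored under that key, and one reduces each collected list. Here is a plain-Java sketch under those assumptions (not TinkerPop code; GroupSketch and group() are hypothetical names, and a TreeMap is used only to get sorted keys):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

class GroupSketch {
    // keyBy picks the bucket, valueBy picks what is stored in it,
    // reduceBy collapses each bucket's list into a final value.
    static <E, K extends Comparable<K>, V, R> Map<K, R> group(
            List<E> elements,
            Function<E, K> keyBy,
            Function<E, V> valueBy,
            Function<List<V>, R> reduceBy) {
        Map<K, List<V>> buckets = new TreeMap<>();
        for (E element : elements) {
            buckets.computeIfAbsent(keyBy.apply(element), k -> new ArrayList<>())
                   .add(valueBy.apply(element));
        }
        Map<K, R> reduced = new TreeMap<>();
        for (Map.Entry<K, List<V>> entry : buckets.entrySet()) {
            reduced.put(entry.getKey(), reduceBy.apply(entry.getValue()));
        }
        return reduced;
    }

    public static void main(String[] args) {
        List<String> names = List.of("marko", "vadas", "lop", "josh", "ripple", "peter");
        // Key: second letter of the name; value: the name; reduction: bucket size.
        Map<Character, Integer> sizes = group(names, n -> n.charAt(1), n -> n, List::size);
        System.out.println(sizes); // {a=2, e=1, i=1, o=2}
    }
}
```

Passing an identity function as the reduction (vs -> vs) leaves the collected lists intact, which corresponds to the two-projection form.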

Pretty neat eh? I updated all the docs so you can see more examples:

Finally, you will be happy to know that the functions are no longer Function<Traverser<S>,?>, but instead Function<S,?>. In an effort to make things clear and simple for 95% of use cases, most traversals with lambdas will no longer need an it.get() call to unwrap the traverser. However, it.get() will still exist for branching steps like jump(), choose(), etc., as well as the "core steps" map(), flatMap(), filter(), and sideEffect(). The general goal with the addition of by() is to remove the need for lambdas as much as possible from typical traversals. Hopefully, we will be able to add enough steps that it becomes rare for someone to need map(), flatMap(), etc.
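The signature change can be pictured as the framework doing the get() call on the user's behalf. The plain-Java sketch below uses a hypothetical SimpleTraverser stand-in (it is not TinkerPop's Traverser interface) and a hypothetical unwrapping() adapter to show the idea.

```java
import java.util.function.Function;

class TraverserSketch {
    // Hypothetical stand-in for a traverser: it only wraps the current object.
    static final class SimpleTraverser<S> {
        private final S current;

        SimpleTraverser(S current) {
            this.current = current;
        }

        S get() {
            return current;
        }
    }

    // Lifts a user-supplied Function<S, R> into one that accepts the wrapper,
    // so the user's lambda never has to call get() itself.
    static <S, R> Function<SimpleTraverser<S>, R> unwrapping(Function<S, R> userFunction) {
        return traverser -> userFunction.apply(traverser.get());
    }

    public static void main(String[] args) {
        // The user writes a function over the plain value...
        Function<String, Integer> nameLength = String::length;
        // ...and the adapter handles the unwrapping.
        Function<SimpleTraverser<String>, Integer> lifted = unwrapping(nameLength);
        System.out.println(lifted.apply(new SimpleTraverser<>("marko"))); // 5
    }
}
```
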
