local step vs flatMap step (Tinkerpop3)

Laverne Schrock

Jul 11, 2016, 7:07:08 AM
to Gremlin-users
Hi,

I'm wondering if there is any difference between the local and flatMap steps.

Example with the Modern demo graph

graph = TinkerFactory.createModern()
g = graph.traversal()
g.V().hasLabel('person').outE().count() // total number of edges leaving a 'person' vertex
==>6
g.V().hasLabel('person').local(outE().count()).mean() // average number of edges leaving a 'person' vertex
==>1.5
g.V().hasLabel('person').flatMap(outE().count()).mean() // ALSO the average number of edges leaving a 'person' vertex
==>1.5

Since flatMap and local seem to have exactly the same behavior, is there any reason to use one over the other? I'm using TinkerPop 3.0.1-incubating, as distributed with Titan 1.0.0.

Thanks,
Laverne Schrock


Stephen Mallette

Jul 13, 2016, 11:50:56 AM
to Gremlin-users
I've often wondered this myself - Marko, any insight on this one?

Marko Rodriguez

Jul 13, 2016, 1:15:24 PM
to gremli...@googlegroups.com
Hello,

This is a good question.

flatMap(traversal) came late in the game, when local(traversal) already existed. There is a subtle difference in behavior:

gremlin> g.V().both().barrier().flatMap(both().groupCount("m")).cap("m")
==>[v[1]:3, v[2]:1, v[3]:3, v[4]:3, v[5]:1, v[6]:1]
gremlin> g.V().both().barrier().local(both().groupCount("m")).cap("m")
==>[v[1]:7, v[2]:3, v[3]:7, v[4]:7, v[5]:3, v[6]:3]

TraversalFlatMapStep will project all incoming traversers to a split() with bulk 1. LocalStep will simply pass the traverser through to its wrapped traversal untouched.
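
To make the bulk arithmetic concrete, here is a toy model in plain Python. It is not TinkerPop code: the function names and the keep_bulk flag are made up for illustration, and the model only assumes what is stated above, namely that flatMap() hands the child traversal a bulk of 1 while local() hands it the incoming bulk untouched. Under that assumption it reproduces the two groupCount("m") results shown above.

```python
from collections import Counter

# Edges of the Modern toy graph as (out_vertex_id, in_vertex_id) pairs.
EDGES = [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3), (6, 3)]
VERTICES = [1, 2, 3, 4, 5, 6]


def both(v):
    """Neighbors of v over both edge directions, standing in for both()."""
    return [b for a, b in EDGES if a == v] + [a for a, b in EDGES if b == v]


def barrier(vertices):
    """Merge identical traversers into (vertex, bulk) pairs, like barrier()."""
    return sorted(Counter(vertices).items())


def group_count_children(bulked, keep_bulk):
    """Run both().groupCount('m') once per bulked traverser.

    keep_bulk=True models local(): the child sees the incoming bulk.
    keep_bulk=False models flatMap(): the child sees a split with bulk 1.
    """
    m = Counter()
    for vertex, bulk in bulked:
        child_bulk = bulk if keep_bulk else 1
        for neighbor in both(vertex):
            m[neighbor] += child_bulk
    return dict(sorted(m.items()))


# Stand-in for g.V().both().barrier()
bulked = barrier([n for v in VERTICES for n in both(v)])

flat_map_counts = group_count_children(bulked, keep_bulk=False)
local_counts = group_count_children(bulked, keep_bulk=True)

print(flat_map_counts)  # {1: 3, 2: 1, 3: 3, 4: 3, 5: 1, 6: 1}
print(local_counts)     # {1: 7, 2: 3, 3: 7, 4: 7, 5: 3, 6: 3}
```

For example, v[1] arrives with bulk 3. Under local() each of its three neighbors is counted 3 times. Under flatMap() each is counted once.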

It is a subtle difference that should not exist. I think we should really just get rid of LocalStep but keep local(), as I like talking at that higher level. It is similar to how where(traversal) is identical to filter(traversal) when the child traversal does not have as() starts/ends. I just like saying “where” vs. “filter.”

The reason why we do projection to a split() with bulk 1 is for situations like this:

gremlin> g.V().both().barrier().groupCount().by(outE().count())
==>[0:5, 1:1, 2:3, 3:3]
gremlin> g.V().both().barrier().flatMap(outE().count()).groupCount()
==>[0:5, 1:1, 2:3, 3:3]
gremlin> g.V().both().barrier().local(outE().count()).groupCount()
==>[0:3, 1:1, 6:1, 9:1]

See how local() is not really what you want? You don’t want to “double count” edges. This is why all by()-modulators project to a split() with bulk 1. LocalStep just isn’t like that… bad, good? …. I never ran into a reason why I would not want that projection save…. :/
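
The double counting can be reproduced with a small Python model. Again, this is a sketch and not TinkerPop code; the helper names are hypothetical. It assumes that under local() every out-edge traverser inherits the incoming bulk, so count() sums that bulk once per edge and emits a single result. It also assumes that under flatMap() the child counts with bulk 1 and the result is emitted with the original bulk, which is what the printed outputs above imply.

```python
from collections import Counter

# Edges of the Modern toy graph as (out_vertex_id, in_vertex_id) pairs.
EDGES = [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3), (6, 3)]
VERTICES = [1, 2, 3, 4, 5, 6]


def both(v):
    """Neighbors of v over both edge directions, standing in for both()."""
    return [b for a, b in EDGES if a == v] + [a for a, b in EDGES if b == v]


def out_degree(v):
    """Number of outgoing edges of v, standing in for outE().count()."""
    return sum(1 for a, _ in EDGES if a == v)


def count_then_group(bulked, keep_bulk):
    """Model <step>(outE().count()).groupCount() over bulked traversers."""
    result = Counter()
    for vertex, bulk in bulked:
        if keep_bulk:
            # local(): each edge traverser carries the incoming bulk, so the
            # count is inflated; one count traverser (bulk 1) comes out.
            result[out_degree(vertex) * bulk] += 1
        else:
            # flatMap(): the child counts with bulk 1; the true out-degree
            # is then weighted by the incoming bulk.
            result[out_degree(vertex)] += bulk
    return dict(sorted(result.items()))


# Stand-in for g.V().both().barrier(): (vertex, bulk) pairs.
bulked = sorted(Counter(n for v in VERTICES for n in both(v)).items())

flat_map_groups = count_then_group(bulked, keep_bulk=False)
local_groups = count_then_group(bulked, keep_bulk=True)

print(flat_map_groups)  # {0: 5, 1: 1, 2: 3, 3: 3}
print(local_groups)     # {0: 3, 1: 1, 6: 1, 9: 1}
```

In the local() case v[1] has 3 outgoing edges but arrives with bulk 3, so it reports 9. No vertex in the Modern graph has 9 outgoing edges, which makes the inflation easy to spot.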

Marko.

