sack behaviour with flatMap (coalesce) and branch (choose) steps

Michael Pollmeier

unread,

Jul 5, 2017, 6:27:50 AM7/5/17

to Gremlin-users

Can someone explain the split/merge behaviour of sack in more detail? Given a simple graph:

graph = TinkerGraph.open()
v0 = graph.addVertex("0")
v1 = graph.addVertex("1")
v0.addEdge("aaa", v1)
g = graph.traversal().withSack(0d, {a -> println("splitting sack"); a}, {a, b -> println("merging sack") a+b})

// coalesce splits the traverser and sack. However it never calls the sack's mergeOperator, so the sack changes inside the coalesce options don't propagate through to the main traverser.

g.V(v0.id).sack{s, v -> 1d}.coalesce(outE("aaa").sack{s, e -> println("aaa"); 2d}).sack()

// output:
// splitting sack
// splitting sack
// aaa
// splitting sack
// splitting sack
// ==>1.0

// In contrast: `choose` also splits the traverser and sack, and the mergeOperator is also never called, but somehow the sack modifications propagate through to the main traverser. This seems to be true for all branch steps (coalesce is a FlatMapStep).

g.V(v0.id).sack{s, v -> 1d}.choose(outE().label()).option("aaa", __.outE("aaa").sack{s, e -> println("aaa"); 2d}).sack()

// output:
// splitting sack
// splitting sack
// splitting sack
// splitting sack
// splitting sack
// aaa
// splitting sack
// ==>2.0

Is there a way to propagate the sack from within coalesce back to the main traverser? I tried a few combinations of `withBulk` and `local` steps, but the results don't change. Do I misunderstand the concept of the mergeOperator?

Michael Pollmeier

unread,

Jul 6, 2017, 2:44:39 PM7/6/17

to Gremlin-users

The same non-propagation is true when using `as` inside coalesce btw. That's really limiting :(

Any ideas, help or pointers are very much appreciated.

As a side node: coalesce is listed under `BRANCH STEPS` in GraphTraversal.java, however it's a flatMap step and therefor has a different behaviour (as described above). Should we move it to it's brothers and sisters under `MAP STEPS`?

Marko Rodriguez

unread,

Jul 10, 2017, 9:55:51 AM7/10/17

to gremli...@googlegroups.com

Hello,

Can someone explain the split/merge behaviour of sack in more detail? Given a simple graph:

When a traverser is split, the sack is split.

When traversers are merged, the sacks are merged.

When does traverser splitting happen? It happens whenever a traverser is cloned which is basically all steps EXCEPT filter() and sideEfffect() steps.

When does traverser merging happen? It happens whenever there is a barrier of some sort. When two traversers become one.

Thus, merging is less “predictable” than splitting. If you want to force a merge, simply use barrier().

Now some branch steps will have a “switch check” traversal and then its respective branch traversals (e.g. coalesce(), choose(), branch(), repeat(), ..). The “switch check” traversers die once they calculate the branch function. Then, the traverser parent that was used to create the the “switch check” child is cloned again and sent down the respective branch traversal.

So, to your examples ---

graph = TinkerGraph.open()
v0 = graph.addVertex("0")
v1 = graph.addVertex("1")
v0.addEdge("aaa", v1)
g = graph.traversal().withSack(0d, {a -> println("splitting sack"); a}, {a, b -> println("merging sack”); a+b})

// coalesce splits the traverser and sack. However it never calls the sack's mergeOperator, so the sack changes inside the coalesce options don't propagate through to the main traverser.

g.V(v0.id).sack{s, v -> 1d}.coalesce(outE("aaa").sack{s, e -> println("aaa"); 2d}).sack()

// output:
// splitting sack
// splitting sack
// aaa
// splitting sack
// splitting sack
// ==>1.0

What is there to merge? There is only one edge. Add another edge and watch.

First off, you have a ; bug in your merge operator.

g = graph.traversal().withSack(0d, {a -> println("splitting sack"); a}, {a, b -> println("merging sack"); a+b})

gremlin> g.V(v0.id).sack{s, v -> 1d}.coalesce(outE("aaa").sack{s, e -> println("aaa"); 2d}).sack().barrier()
splitting sack
splitting sack
aaa
splitting sack
splitting sack
splitting sack
aaa
splitting sack
splitting sack
merging sack
==>1.0
==>1.0
gremlin>

// In contrast: `choose` also splits the traverser and sack, and the mergeOperator is also never called, but somehow the sack modifications propagate through to the main traverser. This seems to be true for all branch steps (coalesce is a FlatMapStep).

g.V(v0.id).sack{s, v -> 1d}.choose(outE().label()).option("aaa", __.outE("aaa").sack{s, e -> println("aaa"); 2d}).sack()

// output:
// splitting sack
// splitting sack
// splitting sack
// splitting sack
// splitting sack
// aaa
// splitting sack
// ==>2.0

Is there a way to propagate the sack from within coalesce back to the main traverser? I tried a few combinations of `withBulk` and `local` steps, but the results don't change. Do I misunderstand the concept of the mergeOperator?

gremlin> g.V(v0.id).sack{s, v -> 1d}.choose(outE().label()).option("aaa", __.outE("aaa").sack{s, e -> println("aaa"); 2d}).sack().barrier()
splitting sack
splitting sack
splitting sack
splitting sack
splitting sack
aaa
splitting sack
splitting sack
aaa
splitting sack
merging sack
==>2.0
==>2.0

HTH,

Marko.

Michael Pollmeier

unread,

Jul 10, 2017, 12:31:05 PM7/10/17

to Gremlin-users

I was a bit overambitious when I simplified the example.. as you pointed out, there wasn't actually anything to merge with only one edge. Barrier is the answer. Additional benefit: labels inside coalesce work as well now. Thank you!

For the next person: here's a complete example for coalesce and sack that does call the merge operator:

graph = TinkerGraph.open()
v0 = graph.addVertex("0")
v1 = graph.addVertex("1")

v2 = graph.addVertex("2")
v0.addEdge("aaa", v1)
v0.addEdge("aaa", v2)

g = graph.traversal().withSack(0d, {a -> println("splitting sack"); a}, {a, b -> println("merging sack"); a+b})

g.V(v0.id).sack{s, v -> 1d}.choose(outE().label()).option("aaa", __.as("a").outE("aaa").sack{s, e -> println("aaa"); 2d}).sack().barrier().select("a")

Reply all

Reply to author

Forward