[TinkerPop3] Gremlin Sacks -- Traverser Local Data Structures

Marko Rodriguez

Nov 24, 2014, 8:59:00 AM

to gremli...@googlegroups.com

Hi,

I've noticed numerous situations where developers write Gremlin traversals that aggregate data along the traverser's path. Typically, people call path() and then "reduce" the data in the path to get the specific result they want. Unfortunately, this is inefficient: path calculations are expensive and unmergeable, and the reduction happens "post traversal" rather than being innate to the act of traversing. What does all that mean?

gremlin> g.V().as('x').outE().inV().jump('x',2).path{1.0}{it.value('weight')}
==>[1.0, 1.0, 1.0, 1.0, 1.0]
==>[1.0, 1.0, 1.0, 0.4, 1.0]
-------------------------------
gremlin> g.V().as('x').outE().inV().jump('x',2).path{1.0}{it.value('weight')}.map{it.get().objects().inject(1.0){a,b -> a * b}} // OLD WAY
==>1.0
==>0.4
gremlin> g.V().withSack{1.0f}.as('x').outE().sack(mult,'weight').inV().jump('x',2).sack() // NEW WAY
==>1.0
==>0.4

In the first example, we walk a path, get that path, then get data from the path elements to reduce to some single result -- i.e. the multiplied weights of the edges traversed.

In the second example, as we walk, we multiply each edge's weight value into the traverser's current "sack", which was initialized to 1.0f via withSack() -- i.e. on-the-fly reduction.
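To make the contrast concrete, here is a minimal toy model in plain Python (not the TinkerPop API). It assumes the console session above ran against TinkerPop's "classic" toy graph, whose edge weights are hard-coded below; the function names via_path and via_sack are made up for illustration.

```python
# Toy model only: plain Python, not the TinkerPop API.
# Assumption: edge weights of TinkerPop's "classic" toy graph,
# stored as {source vertex id: [(target vertex id, weight), ...]}.
EDGES = {
    1: [(2, 0.5), (4, 1.0), (3, 0.4)],
    4: [(5, 1.0), (3, 0.4)],
    6: [(3, 0.2)],
}
VERTICES = (1, 2, 3, 4, 5, 6)


def via_path(hops=2):
    """Old way: carry the whole path, then reduce it after the walk."""
    frontier = [[v] for v in VERTICES]
    for _ in range(hops):
        frontier = [path + [w, dst]
                    for path in frontier
                    for dst, w in EDGES.get(path[-1], [])]
    results = []
    for path in frontier:
        product = 1.0
        for w in path[1::2]:  # edge weights sit at the odd positions
            product *= w
        results.append(product)
    return results


def via_sack(hops=2):
    """New way: each traverser carries only (vertex, sack)."""
    frontier = [(v, 1.0) for v in VERTICES]
    for _ in range(hops):
        frontier = [(dst, sack * w)
                    for v, sack in frontier
                    for dst, w in EDGES.get(v, [])]
    return [sack for _, sack in frontier]


print(via_path())  # [1.0, 0.4]
print(via_sack())  # [1.0, 0.4]
```

Both functions return the same products, but via_sack never stores more than one number per traverser, while via_path keeps every vertex and weight it has touched until the final reduction.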

One of the primary boons of using sack() over path() is lower memory usage and, with merge operators, scalable path analysis in OLAP situations.
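One way to picture the merging point is the toy model below. It is plain Python rather than TinkerPop code, the classic toy graph weights are assumed, and the choice of addition as the merge operator is purely illustrative. Traversers that sit at the same vertex and carry only a sack can be collapsed into one; traversers that carry distinct paths cannot.

```python
# Toy model only: plain Python, not the TinkerPop API.
# Assumption: edge weights of TinkerPop's "classic" toy graph.
EDGES = {
    1: [(2, 0.5), (4, 1.0), (3, 0.4)],
    4: [(5, 1.0), (3, 0.4)],
    6: [(3, 0.2)],
}
START = [(v, 1.0) for v in range(1, 7)]  # one traverser per vertex, sack 1.0


def step(frontier):
    """Move every traverser across each outgoing edge, multiplying its sack."""
    return [(dst, sack * w)
            for v, sack in frontier
            for dst, w in EDGES.get(v, [])]


def merge_by_vertex(frontier, merge):
    """Collapse traversers at the same vertex into one, combining their sacks."""
    merged = {}
    for v, sack in frontier:
        merged[v] = merge(merged[v], sack) if v in merged else sack
    return list(merged.items())


unmerged = step(START)
merged = merge_by_vertex(unmerged, lambda a, b: a + b)  # hypothetical merge operator
print(len(unmerged), len(merged))  # 6 4
```

Six traversers shrink to four after one hop. On a large graph that collapsing is what keeps the traverser population bounded by the number of vertices rather than the number of paths.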

You can read some more examples on the SNAPSHOT docs:

http://www.tinkerpop.com/docs/3.0.0-SNAPSHOT/#sack-step

Use cases:

1. Decaying energy algorithms (Gremlins are no longer discrete but can be modulated by an "energy sack").

2. Graph data harvesting (Gremlins can pick up data as they walk).

3. In-process path analysis -- as paths are walked, statistics can be gleaned and processed.
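The first use case could look roughly like the following toy model. It is plain Python, not TinkerPop code; the classic toy graph weights, the function name spread, and the threshold value are all assumptions made for illustration. Each traverser's sack holds its remaining energy, and a traverser dies once that energy drops below the threshold.

```python
# Toy model only: plain Python, not the TinkerPop API.
# Assumption: edge weights of TinkerPop's "classic" toy graph.
EDGES = {
    1: [(2, 0.5), (4, 1.0), (3, 0.4)],
    4: [(5, 1.0), (3, 0.4)],
    6: [(3, 0.2)],
}


def spread(start, threshold, max_hops):
    """Walk outward from start; the sack is the traverser's remaining energy.

    Crossing an edge multiplies the energy by the edge weight. A traverser
    whose energy would fall below threshold is dropped instead of moved.
    """
    frontier = [(start, 1.0)]
    reached = []
    for _ in range(max_hops):
        frontier = [(dst, energy * w)
                    for v, energy in frontier
                    for dst, w in EDGES.get(v, [])
                    if energy * w >= threshold]
        reached.extend(frontier)
    return reached


print(spread(1, 0.45, 2))  # [(2, 0.5), (4, 1.0), (5, 1.0)]
```

Vertex 3 is never reached here because every edge into it carries a weight below the 0.45 cutoff.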

Finally, for those wanting to know the difference between sideEffects and sacks:

Sacks are traverser local data structures.

SideEffects are traversal global data structures.
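A rough way to see the local/global distinction, again as a plain-Python toy model rather than TinkerPop code (classic toy graph weights assumed, names hypothetical): every traverser owns its sack value, while the sideEffect stand-in is a single list that all traversers write to.

```python
# Toy model only: plain Python, not the TinkerPop API.
# Assumption: edge weights of TinkerPop's "classic" toy graph.
EDGES = {
    1: [(2, 0.5), (4, 1.0), (3, 0.4)],
    4: [(5, 1.0), (3, 0.4)],
    6: [(3, 0.2)],
}


def walk(start, hops):
    seen_weights = []          # sideEffect stand-in: shared by all traversers
    frontier = [(start, 0.0)]  # (vertex, sack): each traverser owns its sack
    for _ in range(hops):
        nxt = []
        for v, sack in frontier:
            for dst, w in EDGES.get(v, []):
                seen_weights.append(w)       # global: everyone appends here
                nxt.append((dst, sack + w))  # local: only this traverser's sum
        frontier = nxt
    return [sack for _, sack in frontier], seen_weights


sacks, seen = walk(1, 2)
print(len(sacks), len(seen))  # 2 5
```

The two surviving traversers each report only the weights along their own route, while the shared list also contains weights seen by traversers that later died.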

Enjoy!

Marko.

http://markorodriguez.com

Geoffroy Cowan

Sep 30, 2015, 11:37:48 PM

to Gremlin-users

Is it possible to use sacks inside until()? Something like the following:

g.withSack(0.0f).V(4).until(sack()>0.3f).emit().repeat(outE().sack(sum,'weight').inV()).times(2).path().by().by('weight')

Marko Rodriguez

Oct 1, 2015, 10:10:20 AM

to gremli...@googlegroups.com

Hello,

Just use:

until(sack().is(gt(0.3f)))

SIDENOTE: You have both times() and until() on the same repeat(). times(2) is shorthand for until(loops().is(eq(2))), and a repeat() can only have one (or no) until() and one emit().
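The relationship between times() and until() can be sketched with a toy loop in plain Python. This is not the TinkerPop API: the helper names repeat, times, sack_gt and either are made up, the classic toy graph weights are assumed, and the model always steps first and tests afterwards, so it does not capture the difference between placing until() before or after repeat().

```python
# Toy model only: plain Python, not the TinkerPop API.
# Assumption: edge weights of TinkerPop's "classic" toy graph.
EDGES = {
    1: [(2, 0.5), (4, 1.0), (3, 0.4)],
    4: [(5, 1.0), (3, 0.4)],
    6: [(3, 0.2)],
}


def repeat(frontier, until):
    """Step every traverser, summing weights into its sack, then test until.

    Traversers that satisfy until leave the loop; traversers that run out
    of edges without satisfying it simply die.
    """
    done, loops = [], 0
    while frontier:
        loops += 1
        nxt = []
        for v, sack in frontier:
            for dst, w in EDGES.get(v, []):
                t = (dst, sack + w)
                (done if until(t, loops) else nxt).append(t)
        frontier = nxt
    return done


def times(n):
    """times(n) is just an until() on the loop counter."""
    return lambda t, loops: loops == n


def sack_gt(x):
    """An until() on the traverser's sack value."""
    return lambda t, loops: t[1] > x


def either(p, q):
    """One until() predicate that covers both conditions."""
    return lambda t, loops: p(t, loops) or q(t, loops)


print(repeat([(4, 0.0)], sack_gt(0.3)))  # [(5, 1.0), (3, 0.4)]
print(len(repeat([(1, 0.0)], times(2))))  # 2
```

Since the loop accepts a single predicate, wanting both conditions means combining them, as in repeat([(4, 0.0)], either(sack_gt(0.3), times(2))).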

HTH,

Marko.

http://markorodriguez.com

Marko Rodriguez

Oct 1, 2015, 10:17:41 AM

to gremli...@googlegroups.com

Hi,

I just added this IllegalStateException for 3.0.2+ so people don't get confused.

https://github.com/apache/incubator-tinkerpop/commit/590f5b797592dd51a71110d4b0e4afe41f486dfa