Controlling parallel execution of gremlin queries


Ümit Akkuş

Jan 8, 2017, 8:06:25 PM
to Gremlin-users
Hello everyone,

I'm trying to understand whether and how parallelization of a given Gremlin query happens, and how I can control the concurrency.

For example, given a query like

g.V().out().out()

how does TinkerPop decide how many parallel traversal objects are created?

My guess is that only one is created and that, given the lazy evaluation via Iterator, every time hasNext() (or next()) is called, the traverser takes the next step.

Regardless of whether what I said above is correct, is there a way to get TinkerPop to continue evaluating the query while I'm working on the first result returned from next()? If not, is it advisable (or completely wrong) to perform

g.V().out()

first, and then create multiple threads to execute the second out()?

Thanks

Marko Rodriguez

Jan 9, 2017, 9:34:01 AM
to gremli...@googlegroups.com
Hello,

> I'm trying to understand whether and how parallelization of a given Gremlin query happens, and how I can control the concurrency.
>
> For example, given a query like
>
> g.V().out().out()
>
> how does TinkerPop decide how many parallel traversal objects are created?

Gremlin OLTP’s VertexStep will create one traverser for each outgoing edge — and so forth for the next out(). However, different strategies can alter this basic form. For instance:

1. LazyBarrierStrategy will try and "bulk" all traversers after the first out() and after the second out(). This way, if 15 traversers are at the same vertex after the first out(), there is no need to compute out() 15 times, just once, because they have been bulked.
2. DSEVertexStepStrategy is a DSEGraph specific strategy that will evaluate out() in batch parallel.
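The bulking idea in point 1 can be sketched in plain Java. This is a toy model, not TinkerPop code: the adjacency map and class names are made up, and a traverser bulk is modeled as a simple count per vertex.

```java
import java.util.*;

// Toy illustration of bulking: traversers sitting at the same vertex are
// merged into one traverser with a "bulk" count, so the next out() is
// computed once per distinct vertex instead of once per path.
public class BulkingSketch {
    // A tiny adjacency map standing in for a graph: vertex -> outgoing neighbors.
    static final Map<String, List<String>> OUT = Map.of(
        "a", List.of("b", "c"),
        "b", List.of("d"),
        "c", List.of("d"),
        "d", List.of());

    // One hop with bulking: a count per vertex instead of one object per path.
    static Map<String, Long> out(Map<String, Long> bulked) {
        Map<String, Long> next = new HashMap<>();
        bulked.forEach((v, bulk) ->
            OUT.get(v).forEach(n -> next.merge(n, bulk, Long::sum)));
        return next;
    }

    public static void main(String[] args) {
        Map<String, Long> start = Map.of("a", 1L);
        Map<String, Long> hop1 = out(start); // {b=1, c=1}
        // Both b and c lead to d, so d ends up as a single traverser of bulk 2;
        // a further hop would expand d only once.
        Map<String, Long> hop2 = out(hop1);
        System.out.println(hop2); // prints {d=2}
    }
}
```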

> My guess is that only one is created and that, given the lazy evaluation via Iterator, every time hasNext() (or next()) is called, the traverser takes the next step.

It all depends on strategies and compilation. You can always do an explain() to see compilation or a profile() to see how traversers get bulked/grouped.

gremlin> g.V().both().both().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TinkerGraphStep(vertex,[])                                             6           6           0.053    23.37
VertexStep(BOTH,vertex)                                               12          12           0.052    23.01
VertexStep(BOTH,vertex)                                               30          30           0.121    53.62
                                            >TOTAL                     -           -           0.227        -
gremlin> g.V().both().barrier().both().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TinkerGraphStep(vertex,[])                                             6           6           0.049    20.93
VertexStep(BOTH,vertex)                                               12          12           0.071    30.01
NoOpBarrierStep                                                       12           6           0.048    20.32
VertexStep(BOTH,vertex)                                               30          12           0.068    28.75
                                            >TOTAL                     -           -           0.238        -
gremlin>

Realize, too, that this is all “up to the compiler,” because Gremlin does NOT guarantee traverser order and thus you will get the “same” results regardless of whether traversers are bulked or not.

> Regardless of whether what I said above is correct, is there a way to get TinkerPop to continue evaluating the query while I'm working on the first result returned from next()? If not, is it advisable (or completely wrong) to perform
>
> g.V().out()
>
> first, and then create multiple threads to execute the second out()?

In terms of threading, OLAP (GraphComputer and GraphActors (in development)) will thread as you mention, but OLTP will not (unless there is a provider-specific strategy that does so). The reason is that there is so little computation being done in each thread (it's basically just pointer chasing) that there is too much overhead in spawning/managing threads. Bulking (discussed previously) is a much better way to speed up graph traversals (a memoization technique).
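The "materialize the first hop, then thread the second hop" idea from the question could look like the sketch below. This is plain Java, not TinkerPop: the adjacency map stands in for a (potentially high-latency) store, and all names are hypothetical. As noted above, this only pays off when each hop is expensive.

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch: evaluate g.V().out() eagerly, then fan the second out() across a
// thread pool, one task per first-hop result.
public class ParallelSecondHop {
    static final Map<String, List<String>> OUT = Map.of(
        "a", List.of("b", "c"),
        "b", List.of("d"),
        "c", List.of("d", "e"),
        "d", List.of(),
        "e", List.of());

    // Stand-in for a high-latency neighbor fetch (e.g. a remote database call).
    static List<String> out(String v) {
        return OUT.get(v);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<String> hop1 = out("a"); // first out(), evaluated eagerly
            // Submit the second out() for each traverser; the futures list
            // preserves submission order even though execution is concurrent.
            List<Future<List<String>>> futures = new ArrayList<>();
            for (String v : hop1) {
                futures.add(pool.submit(() -> out(v)));
            }
            List<String> hop2 = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                hop2.addAll(f.get());
            }
            System.out.println(hop2); // prints [d, d, e]
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that the per-task work here is trivial, which is exactly the case where thread overhead dominates; with a real remote fetch per call, the parallel fan-out would hide latency.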

HTH,
Marko.


Ümit Akkuş

Jan 9, 2017, 2:41:44 PM
to Gremlin-users
Thank you Marko for the detailed answer. My question came with the assumption that database access is high latency, and therefore we would want to run database accesses in parallel as much as possible. Is there a way to achieve this type of parallelization in Gremlin?

Ümit Akkuş

Jan 9, 2017, 2:42:02 PM
to Gremlin-users
The question is for OLTP.

Marko Rodriguez

Jan 9, 2017, 2:46:40 PM
to gremli...@googlegroups.com
Hi,

I believe you are talking about “batch gets” and in that case, it depends on the provider. For instance, Titan and DSEGraph will batch get out().out() type traversals to reduce latency.
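A toy sketch of what such a batch get buys you: instead of one lookup (one round trip) per frontier vertex, the whole frontier is fetched in a single call. The class and the one-call multiGet are hypothetical stand-ins for a provider's batched fetch, not any real driver API.

```java
import java.util.*;

// Toy "batch get": fetch the adjacency of many vertices in one round trip,
// the way some providers batch the second hop of out().out().
public class MultiGetSketch {
    static int roundTrips = 0; // counts simulated database calls

    static final Map<String, List<String>> STORE = Map.of(
        "a", List.of("b", "c"),
        "b", List.of("d"),
        "c", List.of("d"));

    // One round trip returns the adjacency of every requested vertex.
    static Map<String, List<String>> multiGet(Collection<String> ids) {
        roundTrips++;
        Map<String, List<String>> result = new HashMap<>();
        for (String id : ids) result.put(id, STORE.get(id));
        return result;
    }

    public static void main(String[] args) {
        // out().out() from "a" using two batched calls instead of 1 + |hop1| calls.
        List<String> hop1 = multiGet(List.of("a")).get("a");
        Map<String, List<String>> batch = multiGet(hop1);
        List<String> hop2 = new ArrayList<>();
        for (String v : hop1) hop2.addAll(batch.get(v));
        System.out.println(hop2 + " in " + roundTrips + " round trips"); // [d, d] in 2 round trips
    }
}
```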

Marko.
-- 
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/c6237470-8f01-4596-8be2-2f299f072148%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

pieter-gmail

Jan 10, 2017, 1:07:05 AM
to gremli...@googlegroups.com
Hi,

This is the same concern I have.
Some (parts of) Gremlin traversals are not optimizable.

g.traversal().V().this().that().out().out()

If `this().that()` is unoptimizable then the `out().out()` could be done
in parallel for every incoming traverser.

Can your recent work with actors help in this scenario?

Thanks
Pieter


Marko Rodriguez

Jan 10, 2017, 7:18:42 AM
to gremli...@googlegroups.com
Hello,

I think people are confusing different types of “parallel execution.” Here are the 3 types of parallel execution mentioned in this thread:

1. Threads — multiple threads pulling traversers through the pipeline.
2. Multi-get — fetching batch elements from the underlying database.
3. Bulking — merging multiple traversers into a single traverser.

Each of the above is a different form of “parallelization.” Here are the pros/cons of the items above:

1. Thread management typically kills your performance. GraphActors is all about threading, but you will want a “big” traversal.
- With “big” traversals you are able to load balance your traverser processing across the cluster and you get data local processing.
2. You have to be smart about the effects of a multi-get when it comes to the semantics of barriers. Thus, getting your compilation strategy correct is hard.
- If you don’t have aspects of your graph hot in cache, then this is a huge win.
3. You have to use barriers to try and unify as many traversers as possible (space). Moreover, if you have no bulks, you just barrier’d for no reason (time).
- LazyBarrierStrategy knows about your memory and does its best to barrier until it starts using too much and then drains itself.
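Point 3's "barrier until memory pressure builds, then drain" behavior can be sketched as a capped buffer that merges bulks. This is a toy model with hypothetical names, not the actual LazyBarrierStrategy code; the cap here counts distinct vertices rather than bytes of memory.

```java
import java.util.*;

// Toy capped lazy barrier: buffer traversers and merge their bulks until a
// size cap is hit, then drain, trading memory for fewer downstream computations.
public class CappedBarrier {
    private final int maxSize;
    private final Map<String, Long> buffer = new LinkedHashMap<>();

    CappedBarrier(int maxSize) { this.maxSize = maxSize; }

    // Add a traverser. Merging into an existing bulk costs no extra space.
    // Returns a drained batch once the number of distinct vertices hits the cap.
    Optional<Map<String, Long>> add(String vertex, long bulk) {
        buffer.merge(vertex, bulk, Long::sum);
        if (buffer.size() >= maxSize) {
            Map<String, Long> drained = new LinkedHashMap<>(buffer);
            buffer.clear();
            return Optional.of(drained);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        CappedBarrier barrier = new CappedBarrier(2);
        System.out.println(barrier.add("a", 1)); // Optional.empty — still buffering
        System.out.println(barrier.add("a", 1)); // Optional.empty — merged into bulk 2, no growth
        System.out.println(barrier.add("b", 1)); // Optional[{a=2, b=1}] — cap hit, drained
    }
}
```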

HTH,
Marko.

http://markorodriguez.com

Ramzi Oueslati

Feb 17, 2017, 12:06:08 PM
to Gremlin-users
Hello Marko,

I worked a little bit on the Gremlin driver a few months ago, and lately I have been working on a server-side implementation with a database.
I believe we have the same issue as Umit Akkus (database latency). We would like to make database accesses run in parallel in the out().out() scenario. What would be the best approach, in your view? A specific strategy?

Thanks
Ramzi

Marko Rodriguez

Feb 17, 2017, 4:46:56 PM
to gremli...@googlegroups.com

Ramzi Oueslati

Feb 20, 2017, 5:27:07 AM
to Gremlin-users
Thank you Marko,

I will look into it and let you know.

Ramzi

Tunay Gur

Mar 24, 2017, 7:11:57 PM
to Gremlin-users
An optimized strategy to parallelize data store queries is exactly what I need, so I just wanted to check whether someone has successfully achieved this as well.

Ultimately I'm interested in doing N-hop subgraph queries (given a vertex, find the subgraph that includes all vertices at most N hops away from the source). I'm aware of multiget_slice, which can slice over a list of keys, so I thought of putting a barrier after each iteration in my loop with the expectation that TP would make one multiget_slice call with the list of vertices from the previous iteration. However, nothing changed in the number of data store calls.

Am I completely off base here, or am I just not getting the query syntax right?


I tried something along these lines: 

g.V().has("User", "uuid", "11111111-1111-1111-1111-111111111101").repeat(__.bothE().subgraph("subGraph").otherV().barrier()).times(2).cap("subGraph")

Qiu Xiafei

Mar 31, 2017, 8:22:59 AM
to Gremlin-users
Hi, Marko!

Since you mentioned 3 types of parallel execution, shouldn't TraversalVertexProgram be a 4th option? Gremlin OLAP executes in a BFS and also parallel manner. I've seen code that translates OLTP traversal steps into OLAP steps in Gremlin's code base. By the way, is TraversalVertexProgram production-ready?


Marko Rodriguez

Mar 31, 2017, 11:29:48 AM
to gremli...@googlegroups.com
Hello,

TraversalVertexProgram *is* Gremlin OLAP. The GraphComputer system of TinkerPop executes VertexPrograms. TraversalVertexProgram is a type of VertexProgram that knows how to execute a Gremlin traversal.

HTH,
Marko.

Qiu Xiafei

Mar 31, 2017, 9:58:29 PM
to Gremlin-users
Hi,

Does TraversalVertexProgram completely support all OLTP steps, or are there some restrictions?


Maatary Okouya

Mar 1, 2019, 8:52:48 PM
to Gremlin-users
Marko, 

I am interested in points 1 and 2, just for the sake of understanding how Gremlin is supposed to work in OLTP mode. I have always wanted to understand whether Gremlin in OLTP mode parallelizes its traversal. From everything I have read, the answer is no; it is always a depth-first search strategy. The traversers are processed in a serial fashion; more abstractly, one legal path is traversed at a time.

However, in the following blog from DataStax, http://www.doanduyhai.com/blog/?p=13439, the following is stated:

>>>>
Evaluation Strategy in Gremlin:
  1. depth-first: this strategy traverses down the entire path as specified by the steps before turning to the next legal path. In theory there is a single traverser that explores each path, but in practice, for optimisation purposes, vendors can implement concurrent traversers for distinct paths. The depth-first strategy is the default strategy in Gremlin unless specified otherwise.
  2. >>>
The "concurrent traversers" here sounds really strange to me. First, a traverser is an object, so it would more accurately be concurrent traversals. That said, I am very confused by this, because in my opinion, without prior knowledge of what the traversal is supposed to do, we can easily get into situations where multiple nodes are visited multiple times. If your algorithm requires that this not happen, it becomes hard to track which nodes have been visited. Indeed, bulking would not work here, because traversers will reach a common node at different points in time.

That said, it might be that I am missing certain things.

1 - Threads would not work because of what I explained above. Why would you say that it is just a matter of processing the traversers? A fork-join pool could easily handle that if that were the issue. I think it is an issue of lack of prior knowledge. If you disagree, would you explain?

2 - What's the logic behind the detection of when to bulk? That could further help me understand. Is there something that tracks the location of traversers in the graph?

3 - Do you really consider multi-threading (let's say properly done, that is, with a thread-pool executor) and bulking to be similar forms of parallelization?

4 - I tried to check DSEVertexStepStrategy, but could not find its code.

Best, 
M

Maatary Okouya

Mar 2, 2019, 3:11:01 AM
to Gremlin-users
This explanation from the TinkerPop tutorial sounds right to me: https://github.com/tinkerpop/gremlin/wiki/Depth-First-vs.-Breadth-First

Maatary Okouya

Mar 2, 2019, 9:02:01 AM
to Gremlin-users
Note that I am well aware of VertexProgram, but it is precisely because I am comparing the BSP model/Pregel/VertexProgram with OLTP processing in the context of a single-node machine that I want to understand the limitations of the OLTP engine.

Marko Rodriguez

Mar 2, 2019, 3:48:25 PM
to gremli...@googlegroups.com
Hello Maatary,

TinkerPop3 OLTP is not threaded. This is one of the big pushes for TinkerPop4. In TinkerPop4, we really need to blur the distinction between what is OLTP and what is OLAP and use execution engines that can be easily serialized, parallelized, distributed, distributed parallelized, etc. The farthest I got with parallel/threaded OLTP in TinkerPop3 is gremlin-akka/, but unfortunately, the branch was never merged as it required a pretty intense overhaul of the core TinkerPop3 concepts.

I wouldn’t expect parallel OLTP in TinkerPop3, but look forward to it in TinkerPop4 (which just started development today :).

Take care,
Marko.