JavaTranslator vs. GroovyTranslator and Gremlin bytecode compilation speeds.

Marko Rodriguez

Sep 28, 2016, 10:56:00 AM
to gremli...@googlegroups.com, d...@tinkerpop.apache.org
Hello,

Gremlin bytecode provides a language-agnostic way of sending Gremlin traversals between machines, whether physical or virtual. For instance, it is possible to send bytecode from one JVM to another or from CPython to the JVM across the network. Once bytecode is received, it needs to be translated into a representation that the processing VM can then evaluate.

GremlinServer is smart in that when bytecode is received it will analyze it for lambdas. If there are lambdas, written in language X, then it will use XTranslator and XScriptEngine to evaluate the bytecode and create a Traversal for evaluation. However, if there are no lambdas, then it will use JavaTranslator to create a Traversal for evaluation.
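The decision described above can be sketched roughly as follows. This is only an illustration in plain Java: Lambda, Instruction, Bytecode, lambdaLanguage and chooseTranslator are hypothetical stand-ins invented for this sketch, not TinkerPop classes, and the real server logic may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified model of Gremlin bytecode; these are not TinkerPop classes.
final class TranslatorChoiceSketch {

    /** A lambda argument tagged with the language it was written in. */
    static final class Lambda {
        final String language;
        final String script;
        Lambda(String language, String script) {
            this.language = language;
            this.script = script;
        }
    }

    /** One step: an operator name plus arguments, which may include nested Bytecode. */
    static final class Instruction {
        final String operator;
        final List<Object> arguments;
        Instruction(String operator, Object... arguments) {
            this.operator = operator;
            this.arguments = Arrays.asList(arguments);
        }
    }

    static final class Bytecode {
        final List<Instruction> instructions = new ArrayList<>();
        Bytecode add(String operator, Object... arguments) {
            instructions.add(new Instruction(operator, arguments));
            return this;
        }
    }

    /** Depth-first search for the first lambda; empty means there are no lambdas at all. */
    static Optional<String> lambdaLanguage(Bytecode bytecode) {
        for (Instruction instruction : bytecode.instructions) {
            for (Object argument : instruction.arguments) {
                if (argument instanceof Lambda) {
                    return Optional.of(((Lambda) argument).language);
                }
                if (argument instanceof Bytecode) {
                    Optional<String> nested = lambdaLanguage((Bytecode) argument);
                    if (nested.isPresent()) return nested;
                }
            }
        }
        return Optional.empty();
    }

    /** Lambdas in language X go to X's script engine; otherwise reflection is enough. */
    static String chooseTranslator(Bytecode bytecode) {
        return lambdaLanguage(bytecode)
                .map(language -> "script engine for " + language)
                .orElse("JavaTranslator");
    }

    public static void main(String[] args) {
        Bytecode noLambdas = new Bytecode().add("V").add("has", "name", "marko")
                .add("repeat", new Bytecode().add("out")).add("times", 2);
        Bytecode withLambda = new Bytecode().add("V")
                .add("map", new Lambda("gremlin-groovy", "{ it.get().label() }"));
        System.out.println(chooseTranslator(noLambdas));   // JavaTranslator
        System.out.println(chooseTranslator(withLambda));  // script engine for gremlin-groovy
    }
}
```

The search recurses into nested bytecode because a lambda can sit inside a child traversal such as the argument of repeat().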

So, the question for me is:

Is JavaTranslator (which uses Java reflection to convert bytecode to Traversal) faster than GroovyTranslator/GroovyScriptEngine (which creates a String script and evaluates it in the ScriptEngine)?

Let's see. Here is our script in full.

import org.apache.tinkerpop.gremlin.jsr223.JavaTranslator
import org.apache.tinkerpop.gremlin.groovy.jsr223.GroovyTranslator

//// EXECUTED LOCALLY (e.g. CLIENT APPLICATION) ////

g = EmptyGraph.instance().traversal()

t = g.V().has('name','marko').
      repeat(out()).times(2).
      groupCount().by('name'); []
bytecode = t.bytecode
// send the bytecode over the wire

//// EXECUTED REMOTELY (e.g. GREMLIN SERVER) ////

groovy = new GremlinGroovyScriptEngine()
bindings = groovy.createBindings()
bindings.put('g',g)
compiled = groovy.compile(GroovyTranslator.of('g').translate(bytecode))

x = JavaTranslator.of(g).translate(bytecode); []
y = compiled.eval(bindings); []
z = groovy.eval(GroovyTranslator.of('g').translate(bytecode), bindings); []
x == y
y == z
z == x
x.toString()

clock(1000){ JavaTranslator.of(g).translate(bytecode) }
clock(1000){ compiled.eval(bindings) } // caching
clock(1000){ groovy.reset(); groovy.eval(GroovyTranslator.of('g').translate(bytecode), bindings) } // no caching

First, let's make sure they all return the same traversal:

gremlin> x = JavaTranslator.of(g).translate(bytecode); []
gremlin> y = compiled.eval(bindings); []
gremlin> z = groovy.eval(GroovyTranslator.of('g').translate(bytecode), bindings); []
gremlin> x == y
==>true
gremlin> y == z
==>true
gremlin> z == x
==>true
gremlin> x.toString()
==>[GraphStep(vertex,[]), HasStep([name.eq(marko)]), RepeatStep([VertexStep(OUT,vertex), RepeatEndStep],until(loops(2)),emit(false)), GroupCountStep(value(name))]
gremlin>

Great. They do. Now let's see how fast they are.

gremlin> clock(1000){ JavaTranslator.of(g).translate(bytecode) }
==>0.004768085
gremlin> clock(1000){ compiled.eval(bindings) } // caching
==>0.015168259
gremlin> clock(1000){ groovy.reset(); groovy.eval(GroovyTranslator.of('g').translate(bytecode), bindings) } // no caching
==>40.790075693
gremlin>

Cool. JavaTranslator is about 8,500x faster than evaluating an uncached String script and about 3x faster than evaluating a compiled script. JavaTranslator takes about 5 microseconds to translate the bytecode, while an uncached String script takes about 40 milliseconds.
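To double-check the units, here is the arithmetic on the three clock() results above as a small Java sketch. It assumes clock() reports the mean time per iteration in milliseconds, which matches reading 40.79 as about 40 milliseconds. The class and method names are only for illustration.

```java
// Worked arithmetic on the three clock() results quoted in this post.
// Assumption: clock(n){...} returns the mean time per iteration in milliseconds.
final class TranslatorTimings {
    static final double JAVA_TRANSLATOR_MS = 0.004768085;
    static final double COMPILED_SCRIPT_MS = 0.015168259;
    static final double UNCACHED_SCRIPT_MS = 40.790075693;

    static double toMicros(double millis) {
        return millis * 1000.0;
    }

    static double ratio(double slowerMs, double fasterMs) {
        return slowerMs / fasterMs;
    }

    public static void main(String[] args) {
        System.out.printf("JavaTranslator: %.2f microseconds per translation%n",
                toMicros(JAVA_TRANSLATOR_MS));
        System.out.printf("compiled script vs JavaTranslator: %.1fx%n",
                ratio(COMPILED_SCRIPT_MS, JAVA_TRANSLATOR_MS));
        System.out.printf("uncached script vs JavaTranslator: %.0fx%n",
                ratio(UNCACHED_SCRIPT_MS, JAVA_TRANSLATOR_MS));
        System.out.printf("uncached script vs compiled script: %.0fx%n",
                ratio(UNCACHED_SCRIPT_MS, COMPILED_SCRIPT_MS));
    }
}
```

Under that assumption the reflection-based translation costs a few microseconds, the cached script is roughly three times that, and the uncached script is several thousand times slower than either.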

So, what did we learn?

1. Bytecode is slick in that we don’t have to use Gremlin-Groovy to evaluate it (if there are no lambdas) and thus, can do everything in Java and fast!
2. It is very important to always use parameterized queries with GremlinServer/etc. as you can see how costly it is to evaluate a String script repeatedly.
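Why parameterization matters for point 2 can be shown with a toy model. The sketch below assumes a script engine that caches compiled scripts by their exact text; ScriptCacheSketch and its methods are hypothetical and are not Gremlin Server or TinkerPop API, and the "compile" step is faked and merely counted.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for a script engine with a compilation cache keyed by
// script text. None of these names are TinkerPop or Gremlin Server API.
final class ScriptCacheSketch {
    private final Map<String, Function<Map<String, Object>, String>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger compilations = new AtomicInteger();

    // Stands in for the expensive parse/compile step; here it is only counted.
    private Function<Map<String, Object>, String> compile(String script) {
        compilations.incrementAndGet();
        return bindings -> script + " " + bindings;
    }

    // Compiles on the first sighting of a script text, reuses the result afterwards.
    String eval(String script, Map<String, Object> bindings) {
        return cache.computeIfAbsent(script, this::compile).apply(bindings);
    }

    // Runs n requests, either with one parameterized script or with the value inlined.
    static int compilationsFor(boolean parameterized, int n) {
        ScriptCacheSketch engine = new ScriptCacheSketch();
        for (int i = 0; i < n; i++) {
            String name = "user" + i;
            if (parameterized) {
                engine.eval("g.V().has('name', name)", Collections.singletonMap("name", name));
            } else {
                engine.eval("g.V().has('name', '" + name + "')", Collections.emptyMap());
            }
        }
        return engine.compilations.get();
    }

    public static void main(String[] args) {
        System.out.println("parameterized: " + compilationsFor(true, 1000));  // 1 compilation
        System.out.println("inlined:       " + compilationsFor(false, 1000)); // 1000 compilations
    }
}
```

With the value passed as a binding, the script text never changes, so the expensive step runs once. With the value inlined, every request looks like a new script.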

What is crazy is that my JavaTranslator code is gheeeeeetto.


If anyone wants to submit a PR to make JavaTranslator more efficient, please do. However, we are still doing well with what we have regardless.

Take care,
Marko.

Robert Dale

Sep 28, 2016, 11:20:07 AM
to gremli...@googlegroups.com, d...@tinkerpop.apache.org
It would be good to understand when each is used. My assumptions:

REST: Groovy
Client.submit(String): Groovy
Client.submit(Traversal): Java
Client.submit(Bytecode): Java
traversal().withRemote(DriverRemoteConnection): Java

??

--
Robert Dale

Vladyslav Kosulin

Apr 27, 2017, 4:04:55 PM
to Gremlin-users, d...@tinkerpop.apache.org
I love the idea, but here are my concerns:
How can a difference in client and server JVM compiler versions affect bytecode evaluation by the server? What if the client was compiled with a newer compiler and the server does not recognize the version of the bytecode sent by the client? I would assume it will fail. And what happens when serialVersionUID differs because the client and server are on different versions of the application (not the TinkerPop version)? There will be no problem with a Groovy string in either case, but the Java bytecode evaluation may fail, right?

Stephen Mallette

Apr 28, 2017, 7:47:29 AM
to Gremlin-users, d...@tinkerpop.apache.org
> How can difference in client and server JVM compiler versions affect bytecode evaluation by a server?

JVM versions won't matter. Gremlin bytecode is evaluated by Gremlin, so if there were versioning issues they would be specific to TinkerPop versioning. A client on TinkerPop 3.2.5 sending bytecode to TinkerPop 3.2.4 could potentially have a problem if it used some new feature of 3.2.5 that wasn't available at the time of 3.2.4. As we look forward with bytecode and serialization/protocol issues, we will need to do more with our versioning of it so that 3.2.5 can communicate with older versions without hassle. This is a problem more generalized to remoting than just something specific to bytecode.

Sharon OConnor

May 8, 2017, 1:37:07 PM
to Gremlin-users
I am curious how you send the "bytecode over the wire" in Java with DSE Graph using the Java driver. I am new to all of this and still trying to figure out the best way to implement my social graph traversals.

Stephen Mallette

May 9, 2017, 9:03:09 AM
to Gremlin-users


Erick Johnson

Feb 25, 2018, 9:37:07 PM
to Gremlin-users
Sorry to reopen an old thread, but this discussion, along with another thread (https://groups.google.com/forum/#!topic/gremlin-users/UuxzWQTl1x4), was the most complete information I could find about executing traversals from Gremlin strings. I was also curious about Robert Dale's comment in the other thread that there is no obvious benefit to bytecode execution, which seemed to conflict with the findings Marko posted above.

To verify the info from above, I wrote a quick and dirty benchmark to compare executing native Java traversals, gremlin-groovy scripts, and bytecode:

https://gist.github.com/erickj/30aa179a73ab1e2a2b7355b63a34b487#file-traversalbench-java

Using the example given above (as best I could reproduce it), and depending on the query (I sampled 4 random examples on the TinkerFactory.createModern graph), I found that bytecode and native Java are roughly comparable in execution time. Compiled Groovy scripts are generally on par once warmed up, and usually no more than 3-4x slower from a cold start.

For example (from https://gist.github.com/erickj/30aa179a73ab1e2a2b7355b63a34b487#file-traversalbench-java):

g.V().has('name','marko').repeat(out()).times(2).groupCount().by('name')

Running 10, 100, and 1000 queries gave these times:

Java
  • @10 - 0.577540ms / query
  • @100 - 0.336397ms / query
  • @1000 - 0.104268ms / query

Groovy
  • @10 - 0.976204ms / query
  • @100 - 0.244138ms / query
  • @1000 - 0.159212ms / query

Bytecode
  • @10 - 0.375211ms / query
  • @100 - 0.295963ms / query
  • @1000 - 0.123809ms / query

There's no controlling for GC or anything here, so take these numbers as a rough estimate.
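For anyone repeating this kind of measurement, a minimal timing loop with an untimed warm-up phase might look like the sketch below. It is not the code from the gist. ClockSketch and meanMillis are hypothetical names, and it only reduces (does not remove) JIT and GC noise; a dedicated harness such as JMH would be the safer choice for numbers that matter.

```java
import java.util.function.Supplier;

// Minimal timing sketch: warm up first, then report mean milliseconds per call.
final class ClockSketch {
    // Written on every call so the JIT cannot discard the work as dead code.
    static volatile Object sink;

    static double meanMillis(int warmup, int iterations, Supplier<?> work) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            sink = work.get();          // untimed: lets the JIT compile the hot path
        }
        System.gc();                    // only a hint; the JVM is free to ignore it
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = work.get();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / iterations;
    }

    public static void main(String[] args) {
        // Placeholder workload; substitute the traversal or script evaluation under test.
        double ms = meanMillis(1_000, 10_000, () -> Long.toString(System.nanoTime()));
        System.out.printf("%.6f ms / call%n", ms);
    }
}
```

The decreasing per-query times from @10 to @1000 in the list above are consistent with warm-up effects, which is why the warm-up iterations are kept outside the timed section here.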


However, once compilation time for gremlin-groovy and GraphSON-to-bytecode deserialization are accounted for, the numbers become a bit different. Groovy may see some slowdown, but it's not obviously significant. However, GraphSON -> bytecode deserialization slows down the total bytecode execution by 4-5x.


For the same case as above, GraphSON/bytecode consistently executes at 0.45 - 0.8 ms / query during the 1000 iteration sample. This is roughly 3-5x slower than native Java and Groovy (accounting for script compilation).


https://gist.github.com/erickj/30aa179a73ab1e2a2b7355b63a34b487#file-results-timed_compile-true


Re: executing the Groovy string script vs. precompiled scripts - I chose not to call reset() as Marko did above, since it seems like a single ScriptEngine instance wouldn't be reset between each execution in production, but would simply run with a new context. However, correct me if I'm wrong on this.


Soooo.... what's my point? I'm not really sure. To be honest, all queries executed a fair bit faster than I expected. Even though bytecode execution is significantly slower when GraphSON parsing is taken into account, sub-millisecond execution time still seems very fast. Maybe Gremlin execution isn't as slow as what was billed above.


Thanks for the great work you guys do on this, especially the documentation, it's really top notch.

Stephen Mallette

Feb 26, 2018, 6:36:57 AM
to Gremlin-users
Thanks for doing that analysis. Generally speaking, I'd say that it's an accurate picture of where things are now. I've heard that GraphSON deserialization introduces some overhead to bytecode traversals. I think that it might be advantageous for Gremlin Server to have some form of prepared traversal cache similar to the compilation caching that Groovy scripts have. Then we could skip the deserialization process for traversals executed repeatedly.




Robert Dale

Feb 26, 2018, 6:40:31 AM
to gremli...@googlegroups.com
My comment is taken out of context. It appeared that the OP wanted to convert a Gremlin Groovy String to Bytecode just for the sake of being able to use JavaTranslator.of(g). Without a stated use case, there is no clear benefit to translating 2x when it can be done just 1x.

Robert Dale


Erick Johnson

Feb 26, 2018, 5:30:22 PM
to Gremlin-users
Thanks for the clarification Robert, sorry about the confusion.


Bryan B. Thompson

Feb 27, 2018, 10:16:27 AM
to Gremlin-users
There are effective memory leaks associated with Groovy compilation. The soft reference cache in front of the Groovy compilation stage is essentially a band-aid over a process that leaks the generated classes.

Note that these classes are not leaked in a true sense. They are eventually recoverable. BUT the recovery depends on the Java heap filling up to the point where soft reference collection is triggered (for soft references inside of the generated java.lang.Class objects).  Default GC policies (including G1) are defeated by this under a sustained workload with large numbers of distinct (different) queries.  (This is not a problem if you have only a few parameterized queries since you are then generating only a limited number of compiled queries and the cache does not "leak".)
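As a rough illustration of the mechanism, here is a toy soft-reference cache in Java. It is a deliberate simplification: in the real case the soft references live inside the generated java.lang.Class objects, whereas SoftCacheSketch is a hypothetical class that just keeps SoftReference values in a map, with a byte array standing in for a generated class.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Toy cache whose values are only softly reachable. Hypothetical; not Groovy or TinkerPop code.
final class SoftCacheSketch<K, V> {
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {                 // never cached, or cleared by the GC
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    // Counts map entries, including those whose referent has already been cleared.
    int size() {
        return entries.size();
    }

    static int entriesAfter(int requests, boolean parameterized) {
        SoftCacheSketch<String, byte[]> cache = new SoftCacheSketch<>();
        for (int i = 0; i < requests; i++) {
            String script = parameterized
                    ? "g.V().has('name', name)"
                    : "g.V().has('name', 'user" + i + "')";
            cache.get(script, key -> new byte[1024]);   // stands in for a generated class
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct scripts:     " + entriesAfter(10_000, false)); // 10000
        System.out.println("parameterized script: " + entriesAfter(10_000, true));  // 1
    }
}
```

Softly reachable values are only reclaimed when the JVM comes under memory pressure, so a stream of distinct scripts keeps growing the heap until that point, while a handful of parameterized scripts stays bounded.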

Groovy also opens up a can of worms in terms of server side security.

So, +1 on the move away from Groovy compilation in the backend server.

One option to keep Groovy compilation would be to have a Groovy => bytecode proxy in front of the server JVM. Since that proxy would be stateless, the opportunity for a memory leak (as described above) is mitigated and you can always restart the proxy (or load balance across proxies) as necessary. But you really do not want an opportunity for a memory leak on the server.

Thanks,
Bryan