Gremlin Server not responding to large traversals


Florian Cäsar

Apr 8, 2021, 5:46:10 AM
to Gremlin-users
Hi all,

I'm using the Gremlin Python library to run traversals against a JanusGraph deployment of Gremlin Server (the same also happens with plain TinkerGraph). Some long traversals (with thousands of instructions) never get a response: no errors, no timeouts, and no log entries on the server or the client. Nothing.

The conditions for this silent treatment aren't clear; it doesn't simply depend on the number of bytes or instructions. For instance, this code hangs forever for me:

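from gremlin_python.process.anonymous_traversal import traversal

g = traversal().withRemote(...)
# Start from a single injected value, then chain 8000 constant() steps.
g = g.inject("")
for i in range(8000):
    g = g.constant("test")
print("submitting traversal")
result = g.next()
print(f"done, got: {result}")  # this is never reached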

I say it doesn't depend on the number of bytes in the request because the number of instructions beyond which I get no response doesn't change, even with very large constant values in place of just "test". For instance, injecting 7000 values with many paragraphs of Lorem Ipsum works as expected and returns in a few milliseconds.

While it shouldn't matter (since I should be getting a proper error instead of nothing), I've already increased server-side maxHeaderSize, maxChunkSize, maxContentLength etc. to very high numbers. Changing the serialization format (e.g. from GraphSONMessageSerializerV3d0 to GraphBinaryMessageSerializerV1) doesn't help either.

What is going on here?

Note: I know that very long traversals are an anti-pattern in Gremlin, but sometimes it's not possible or very inefficient to structure traversals such that they can use injected values instead (e.g. when adding many edges).

Stephen Mallette

Apr 9, 2021, 5:45:40 AM
to gremli...@googlegroups.com
The issue is less with bytes and string lengths and more with the length of the traversal chain (i.e. the number of steps in your traversal). You end up hitting the JVM's stack size limit on the server. You can increase the stack size by raising the JVM's -Xss value, which should allow a longer traversal. That will likely come with the need to re-examine other JVM settings like -Xmx and perhaps garbage collection options.
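
As a rough analogy only, the difference between "many bytes" and "many steps" can be reproduced in plain Python, where a recursive walk over a chain uses one stack frame per link no matter how large each link's payload is. This sketch runs on Python's recursion limit, not the JVM stack, and the names Step, build_chain, count_steps and survives are made up for illustration; it does not show how Gremlin Server processes a traversal internally.

```python
import sys


class Step:
    """One link in a chain; `payload` stands in for a step argument."""

    def __init__(self, payload, previous=None):
        self.payload = payload
        self.previous = previous


def build_chain(length, payload):
    """Build a chain of `length` steps iteratively, like the loop in the question."""
    step = None
    for _ in range(length):
        step = Step(payload, step)
    return step


def count_steps(step):
    """Walk the chain recursively: one stack frame per step, whatever the payload."""
    if step is None:
        return 0
    return 1 + count_steps(step.previous)


def survives(length, payload):
    """Return True if the recursive walk finishes, False if the stack limit is hit."""
    try:
        return count_steps(build_chain(length, payload)) == length
    except RecursionError:
        return False


if __name__ == "__main__":
    limit = sys.getrecursionlimit()
    # Short chain, huge payload: depth is small, so the walk completes.
    print(survives(100, "x" * 1_000_000))
    # Long chain, tiny payload: depth exceeds the limit, so the walk fails.
    print(survives(limit * 2, "test"))
```

Raising the limit with sys.setrecursionlimit plays the role that -Xss plays for the JVM in this analogy: it moves the threshold but does not remove it.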

I do find it interesting that you don't get any error messages though; you should see a StackOverflowError somewhere, unless the server is just wholly bogged down by your request. I'd consider throwing more -Xmx at it to see if you can get it to respond with an error at least, or keeping an eye on the server logs to see it surface there.

We could probably modify the anti-pattern documentation to include this sort of information. 
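
Since the limit is on the number of steps per traversal, one possible client-side workaround is to split the work into several shorter traversals. The following is a minimal sketch under that assumption: batches and submit_in_batches are hypothetical helpers, submit_batch is a caller-supplied callback that builds and submits one traversal for its batch, and a safe batch size is not known from this thread and would have to be found by experiment.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def submit_in_batches(
    items: Sequence[T],
    batch_size: int,
    submit_batch: Callable[[List[T]], None],
) -> int:
    """Call `submit_batch` once per batch and return the number of batches.

    `submit_batch` is a placeholder supplied by the caller; it is expected to
    chain the steps for its batch onto a fresh traversal and submit it.
    """
    submitted = 0
    for batch in batches(items, batch_size):
        submit_batch(batch)
        submitted += 1
    return submitted


if __name__ == "__main__":
    # Stand-in callback that only records what it would have submitted.
    seen = []
    count = submit_in_batches(list(range(10)), 4, seen.append)
    print(count, seen)
```

The trade-off is that each batch becomes its own request, so the whole set of changes is presumably no longer applied as a single traversal; whether that matters depends on the use case.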


Florian Cäsar

Apr 9, 2021, 7:00:12 AM
to Gremlin-users
Thanks very much, this does seem to be the culprit. In the interest of searchability, and so as not to duplicate the discussion, I'll continue in the Stack Overflow post: https://stackoverflow.com/questions/66987730/why-is-gremlin-server-janusgraph-ignoring-some-of-my-requests