Gremlin server performance


Gerald Vasend

Sep 28, 2016, 3:59:51 PM
to Gremlin-users
I am currently using aiogremlin to run queries through Gremlin Server, and performance fluctuates a lot. I structured the program to allow a user-defined number of concurrent queries. More concurrent queries mostly correlate with higher effective throughput of processed query results. However, I have noticed significant variations in performance, to the point where it sometimes seems to hang. Sometimes when things slow down I set the number of concurrent queries to 1, and sometimes that helps. The question is: what facilities exist to evaluate Gremlin Server performance? Are there any guidelines or recommendations regarding concurrent Gremlin queries?

Thanks!

Jerry

Gerald Vasend

Sep 28, 2016, 4:00:47 PM
to Gremlin-users
FYI, I am using Gremlin Server with DataStax.

Stephen Mallette

Sep 30, 2016, 10:44:51 AM
to Gremlin-users
Not sure what could be wrong. I've not seen those problems, but I don't test aiogremlin and Python, as I usually work with the Java driver. From the server's perspective, I wouldn't expect any problems with issuing concurrent requests, with the exception of concurrent requests on a single session. There was an issue there at one point with long-running scripts, but I think that was recently resolved (maybe in 3.2.2 iirc, but if not then definitely in the upcoming 3.2.3). But if you're not using a session, that shouldn't apply.

Maybe David Brown has some ideas from the aiogremlin perspective?

--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/907856de-3cdc-48a1-81b3-5a9f8f74c1f5%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Gerald Vasend

Sep 30, 2016, 11:10:40 AM
to Gremlin-users
I may have a "suspect". According to the person who configured the cluster, there is an issue with the disk space set aside for Cassandra temporary files. Without knowing the details, I suspect this could be impacting performance, and the effect could fluctuate from one run to the next depending on how much space is available. Is it plausible that running out of temporary file space could impact a query? Symptoms range from running very slowly to seemingly hanging.



Gerald Vasend

Sep 30, 2016, 11:48:11 AM
to Gremlin-users
Also, some additional information on the nature of the query. The total number of vertices that I am processing is in the hundreds of millions. I "chunk" the queries via another type of vertex called a "block". Each block has edges to 1000-2000 of the vertices of interest. One query then returns computed feature data for a few thousand vertices (one block's worth), so the query results can be fairly large. Using Python aio I kick off multiple block-driven queries. When it is running fast it processes up to 700 vertices per second, including the Gremlin Server and the Python client writing out the data. When it gets really bad it just hangs.
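
One way to tell "slow" from "hung" on the client side is to time each block query and put a timeout around it. This is only a rough sketch using plain asyncio (Python 3.7+): `fetch_block` is a hypothetical stand-in for the real block-driven query, not an aiogremlin call, and the sleep and row counts are made up for illustration.

```python
import asyncio
import time


async def fetch_block(block_id):
    # Hypothetical placeholder for the real query that returns feature data
    # for one block's worth of vertices.
    await asyncio.sleep(0.01)
    return [block_id] * 3  # pretend the block produced 3 rows


async def timed_fetch(block_id, timeout_s):
    # Returns (block_id, elapsed_seconds, row_count), with row_count set to
    # None when the query did not finish within timeout_s.
    start = time.perf_counter()
    try:
        rows = await asyncio.wait_for(fetch_block(block_id), timeout=timeout_s)
        count = len(rows)
    except asyncio.TimeoutError:
        count = None
    return block_id, time.perf_counter() - start, count


if __name__ == "__main__":
    print(asyncio.run(timed_fetch(7, timeout_s=1.0)))
```

Logging these tuples per block would at least show whether the slowdown is gradual or whether individual queries stop returning altogether.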

David Brown

Oct 1, 2016, 10:39:24 AM
to Gremlin-users
Hi Gerald,

Well, I'm not sure why this is happening, as I haven't really seen anything like this in testing or benchmarking, albeit the vast majority of testing of concurrent functionality has been against TinkerGraph, not DSE Graph. aiogremlin will issue as many concurrent queries as you tell it to, regardless of the number of unresolved queries in process. I've heard this can overload the server. I don't know if this is the cause of your problem, but it may be something to consider. The Java driver controls the max number of in-process queries, and maybe doing something like this could help your case.

Unfortunately aiogremlin doesn't support this. Could I suggest that you move to the Goblin driver? The implementation is better overall, plus it provides a bunch of functionality (like controlling the max number of inflight requests), including an asynchronous Gremlin Language Variant. I know it can be a pain making the switch, but as ZEROFAIL is now backing the Goblin project, I am considering dropping maintenance support for aiogremlin.
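
Until a switch is practical, the cap on in-flight requests described above can be approximated on the client side with a semaphore. A minimal sketch, assuming plain asyncio (Python 3.7+): `run_query` is a hypothetical placeholder for whatever coroutine submits one script to the server, and `run_bounded` and `max_in_flight` are names made up for this example rather than part of any driver.

```python
import asyncio


async def run_query(block_id):
    # Hypothetical placeholder for the coroutine that submits one block
    # query to Gremlin Server and collects its results.
    await asyncio.sleep(0.01)
    return block_id


async def run_bounded(block_ids, max_in_flight):
    # The semaphore allows at most max_in_flight queries to be outstanding
    # at any moment; the rest wait their turn on the client.
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(block_id):
        async with semaphore:
            return await run_query(block_id)

    # gather returns results in the same order as block_ids.
    return await asyncio.gather(*(guarded(b) for b in block_ids))


if __name__ == "__main__":
    results = asyncio.run(run_bounded(range(20), max_in_flight=4))
    print(len(results))
```

This only limits what one client process sends. It does not say anything about what the server itself can sustain, so the right value for `max_in_flight` would have to be found by experiment.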


Best,

Dave

Stephen Mallette

Oct 1, 2016, 12:03:19 PM
to Gremlin-users
Thanks Dave - I think this underscores the point I tend to keep making about how we have to get the GLV drivers like gremlin-python to be feature-consistent with the Java driver (at least, as much as we can). That way, everything will work the same.
