Concurrent TimeoutException on connection to gremlin server remotely

sarthak...@gmail.com

Oct 19, 2019, 5:55:45 AM
to JanusGraph users
Hi,
I have a Gremlin Server running v3.3.3.
I connect to it remotely to run my Gremlin queries from Java, but recently I keep getting this error:

`org.apache.tinkerpop.gremlin.process.remote.RemoteConnectionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persists`

Initially, when I hit this issue, restarting the Gremlin service made it work again, but that no longer solves the problem. I'm not sure what the issue is.

Here is my remote-objects.yaml file
```
hosts: [fci-graph-writer-gremlin]
port: 8182
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry,org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
connectionPool: {
  channelizer: Channelizer.WebSocketChannelizer,
  maxContentLength: 81928192
}
```

gremlin-server.yaml
```
host: 0
port: 8182
scriptEvaluationTimeout: 120000
threadPoolWorker: 4
gremlinPool: 16
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  fci: conf/janusgraph-hbase.properties,
  insights: conf/janusgraph-insights-hbase.properties
}
scriptEngines: {
  gremlin-groovy: {
    plugins: { org.janusgraph.graphdb.tinkerpop.plugin.JanusGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [scripts/empty-sample.groovy]}},
    scripts: [scripts/empty-sample.groovy], 
    staticImports: ['org.opencypher.gremlin.process.traversal.CustomPredicates.*']}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry,org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  # Older serialization versions for backwards compatibility:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
  - { className: org.opencypher.gremlin.server.op.cypher.CypherOpProcessor, config: { sessionTimeout: 28800000}}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: false, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: false},
  slf4jReporter: {enabled: false, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 81928192
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
enabled: false}
```

pom.xml
```
<dependency>
  <groupId>org.janusgraph</groupId>
  <artifactId>janusgraph-all</artifactId>
  <version>0.3.1</version>
</dependency>
<dependency>
  <groupId>org.apache.tinkerpop</groupId>
  <artifactId>gremlin-driver</artifactId>
  <version>3.3.3</version>
</dependency>
<dependency>
  <groupId>org.apache.tinkerpop</groupId>
  <artifactId>tinkergraph-gremlin</artifactId>
  <version>3.3.3</version>
</dependency>
```

I'm really stuck here. Any help is appreciated. Thanks!!

Stephen Mallette

Oct 21, 2019, 7:52:42 AM
to janusgra...@googlegroups.com
It's hard to say what the problem could be given your description. All I can say given the information you've provided is that the driver has marked all of your hosts as "dead" for some reason. That could have happened for any number of reasons. Assuming you knew the server was running and are certain of network stability, then I guess I'd next look at server logs to see what kinds of errors were occurring just prior to the driver returning this error. 
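
To rule out basic connectivity before digging through the server logs, a plain TCP check from the client machine can help. The following is a minimal sketch using only the JDK; the class name `PortCheck` and the 3-second timeout are illustrative assumptions, not part of the TinkerPop driver. It only shows that the port accepts connections, not that Gremlin Server is healthy behind it.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not a TinkerPop API): checks whether host:port accepts TCP connections.
public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers connection refused, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are placeholders; pass the host and port from remote-objects.yaml.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8182;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 3000));
    }
}
```

If this reports the port as reachable while the driver still times out, the problem is more likely in the client configuration than in the network.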

As an aside, I see that you're using 3.3.3 for the TinkerPop driver and I don't remember exactly what conditions triggered a "dead" host back then. I'm pretty sure there have been some refinements to that decision making in more recent releases.

--
You received this message because you are subscribed to the Google Groups "JanusGraph users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/aae648cf-32fd-4579-8f92-9e89f4e17d8a%40googlegroups.com.

sarthak...@gmail.com

Oct 21, 2019, 8:59:52 AM
to JanusGraph users
Hi Stephen,
Thanks for replying. I was able to solve this issue, but I'm not certain why it was causing an error.

I load the `remote-objects.yaml` file from my Java code, and I had recently added this property:
```
connectionPool: {
channelizer: Channelizer.WebSocketChannelizer
}
```

After removing this, the system is back to normal. I got this value from the docs: http://tinkerpop.apache.org/docs/3.2.9/reference/#connecting-via-remotegraph
Topic: Connecting via Java > Configuration

I don't understand how this value makes the system unavailable even though my Gremlin Server is in a working state. It's as if the request doesn't even reach Gremlin Server.

Stephen Mallette

Oct 21, 2019, 9:23:15 AM
to janusgra...@googlegroups.com
Yeah....I think the documentation misled you a bit.


You perhaps took that "Default" written there very literally without considering the "Description" which describes that field as:

"The fully qualified classname of the client Channelizer that defines how to connect to the server."

So, "Channelizer.WebSocketChannelizer" is really just the class name, not the FQCN. There wasn't enough room to put that whole thing in that "Default" column. If you'd done:

connectionPool: { channelizer: org.apache.tinkerpop.gremlin.driver.Channelizer$WebSocketChannelizer }

it would be working. What isn't so good is that you didn't get an error message for that configuration problem.
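
The difference between the short name and the FQCN can be reproduced with any nested class in the JDK. The sketch below assumes the driver resolves the `channelizer` setting reflectively by class name, which is what the "fully qualified classname" description suggests; it uses `java.util.Map.Entry` as a stand-in for `Channelizer.WebSocketChannelizer`, and the class name `BinaryNameDemo` is purely illustrative.

```java
// Illustrative only: shows how nested class names behave under reflection.
public class BinaryNameDemo {

    /** Returns true if the given name can be loaded with Class.forName. */
    public static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Short, source-style name (like "Channelizer.WebSocketChannelizer"): not loadable.
        System.out.println("Map.Entry -> " + canLoad("Map.Entry"));
        // Package-qualified but with a dot before the nested class: still not loadable.
        System.out.println("java.util.Map.Entry -> " + canLoad("java.util.Map.Entry"));
        // Binary name with '$' before the nested class: loadable.
        System.out.println("java.util.Map$Entry -> " + canLoad("java.util.Map$Entry"));
        // The name reflection expects can always be read off the class itself.
        System.out.println(java.util.Map.Entry.class.getName());
    }
}
```

This is why the working value needs both the package prefix and the `$` separator before the nested class name.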





sarthak...@gmail.com

Oct 21, 2019, 10:35:41 AM
to JanusGraph users
Ok. Got it. Thanks Stephen.

I just have another query. Maybe you can help.

I have multiple graphs defined in my `gremlin-server.yaml`:

```
graphs: {
  graph1: conf/janusgraph-hbase.properties,
  graph2: conf/janusgraph-insights-hbase.properties
}
scriptEngines: {
  gremlin-groovy: {
    scripts: [scripts/empty-sample.groovy],
    plugins: { org.janusgraph.graphdb.tinkerpop.plugin.JanusGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [scripts/empty-sample.groovy]}}
  }
}
```

To initialise them at run time, I reference these graphs in my empty-sample.groovy:

```
// define the TraversalSources to bind queries to - these will be named "g1" and "g2".
globals << [g2: graph2.traversal(),g1:graph1.traversal()]
```

The issue is that sometimes g2 doesn't get initialised at runtime, and if I restart the server, it works. The only difference between the two properties files is storage.hbase.table.

What could be the reason behind this random behaviour? 

Stephen Mallette

Oct 21, 2019, 10:36:21 AM
to janusgra...@googlegroups.com
I'm reading this as a JanusGraph-hbase specific sort of question and I'm not sure what the issue might be there. When you say "g2" doesn't get initialized, do you get an error in the server startup? or is it some other kind of error?

sarthak...@gmail.com

Oct 21, 2019, 11:05:43 AM
to JanusGraph users
Well, I don't have the error message right now. I'll post it when I run into that scenario again. The thing is, the Gremlin Server starts, but the graph isn't initialised, i.e. g2 isn't available for querying data. After restarting the service with the same properties, it works fine.

So, I can't understand this random behaviour.

Also, should the table named in `storage.hbase.table` in the properties file given to gremlin-server.yaml already exist in HBase?

We first start the service, then insert the data into HBase (which creates the table), and then query the results.

sarthak...@gmail.com

Oct 21, 2019, 2:38:10 PM
to JanusGraph users
Hi Stephen,
Below is the error I get for g2

```
globals << [g2:graph2.traversal(),g1:graph1.traversal()] took 894ms
08:14:59.098 [gremlin-server-exec-2] ERROR o.a.t.g.j.DefaultGremlinScriptEngineManager - Could not create GremlinScriptEngine for gremlin-groovy
java.lang.IllegalStateException: javax.script.ScriptException: javax.script.ScriptException: groovy.lang.MissingPropertyException: No such property: g2 for class: Script12902
    at org.apache.tinkerpop.gremlin.jsr223.DefaultGremlinScriptEngineManager.lambda$createGremlinScriptEngine$16(DefaultGremlinScriptEngineManager.java:464) ~[gremlin-core-3.3.3.jar:3.3.3]
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[na:1.8.0_222]
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[na:1.8.0_222]
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) ~[na:1.8.0_222]
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647) ~[na:1.8.0_222]
    at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272) ~[na:1.8.0_222]
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) ~[na:1.8.0_222]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[na:1.8.0_222]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[na:1.8.0_222]
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[na:1.8.0_222]
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[na:1.8.0_222]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0_222]
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485) ~[na:1.8.0_222]
    at org.apache.tinkerpop.gremlin.jsr223.DefaultGremlinScriptEngineManager.createGremlinScriptEngine(DefaultGremlinScriptEngineManager.java:450) ~[gremlin-core-3.3.3.jar:3.3.3]
    at org.apache.tinkerpop.gremlin.jsr223.DefaultGremlinScriptEngineManager.getEngineByName(DefaultGremlinScriptEngineManager.java:219) ~[gremlin-core-3.3.3.jar:3.3.3]
    at org.apache.tinkerpop.gremlin.jsr223.CachedGremlinScriptEngineManager.lambda$getEngineByName$0(CachedGremlinScriptEngineManager.java:57) [gremlin-core-3.3.3.jar:3.3.3]
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660) ~[na:1.8.0_222]
    at org.apache.tinkerpop.gremlin.jsr223.CachedGremlinScriptEngineManager.getEngineByName(CachedGremlinScriptEngineManager.java:57) [gremlin-core-3.3.3.jar:3.3.3]
    at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:263) ~[gremlin-groovy-3.3.3.jar:3.3.3]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_222]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_222]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_222]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_222]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_222]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_222]
Caused by: javax.script.ScriptException: javax.script.ScriptException: groovy.lang.MissingPropertyException: No such property: g2 for class: Script12902
    at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:397) ~[gremlin-groovy-3.3.3.jar:3.3.3]
    at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:264) ~[na:1.8.0_222]
    at org.apache.tinkerpop.gremlin.jsr223.DefaultGremlinScriptEngineManager.lambda$createGremlinScriptEngine$16(DefaultGremlinScriptEngineManager.java:460) ~[gremlin-core-3.3.3.jar:3.3.3]
    ... 24 common frames omitted
Caused by: javax.script.ScriptException: groovy.lang.MissingPropertyException: No such property: g2 for class: Script12902
    at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:713) ~[gremlin-groovy-3.3.3.jar:3.3.3]
    at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:395) ~[gremlin-groovy-3.3.3.jar:3.3.3]
    ... 26 common frames omitted
Caused by: groovy.lang.MissingPropertyException: No such property: g2 for class: Script12902
    at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:66) ~[groovy-2.4.15-indy.jar:2.4.15]
    at org.codehaus.groovy.runtime.callsite.PogoGetPropertySite.getProperty(PogoGetPropertySite.java:51) ~[groovy-2.4.15-indy.jar:2.4.15]
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callGroovyObjectGetProperty(AbstractCallSite.java:310) ~[groovy-2.4.15-indy.jar:2.4.15]
    at Script12902.run(Script12902.groovy:40) ~[na:na]
    at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:690) ~[gremlin-groovy-3.3.3.jar:3.3.3]
    ... 27 common frames omitted
08:14:59.099 [gremlin-server-exec-2] WARN o.a.t.g.s.h.HttpGremlinEndpointHandler - Invalid request - responding with 500 Internal Server Error and gremlin-groovy is not an available GremlinScriptEngine
java.lang.IllegalArgumentException: gremlin-groovy is not an available GremlinScriptEngine
    at org.apache.tinkerpop.gremlin.jsr223.CachedGremlinScriptEngineManager.registerLookUpInfo(CachedGremlinScriptEngineManager.java:95) ~[gremlin-core-3.3.3.jar:3.3.3]
    at org.apache.tinkerpop.gremlin.jsr223.CachedGremlinScriptEngineManager.getEngineByName(CachedGremlinScriptEngineManager.java:58) ~[gremlin-core-3.3.3.jar:3.3.3]
    at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:263) ~[gremlin-groovy-3.3.3.jar:3.3.3]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_222]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_222]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_222]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_222]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_222]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_222]
08:14:59.099 [gremlin-server-worker-1] DEBUG log-aggregator-encoder - [id: 0x2216578e, L:/127.0.0.1:8182 - R:/127.0.0.1:36244] WRITE: 70B
```

gremlin-server.yaml
```
graphs: {
  graph1: conf/janusgraph-hbase.properties,
  graph2: conf/janusgraph-insights-hbase.properties
}
```

Stephen Mallette

Oct 21, 2019, 2:47:56 PM
to janusgra...@googlegroups.com
That seems to be the failure as a result of a request - is that right?  I'm wondering if there is an error at server startup when the script executes that you're missing? Or do the startup logs look clean?

sarthak...@gmail.com

Oct 22, 2019, 8:50:55 AM
to JanusGraph users
That's all the log I have right now. The startup looks clean to me. And this isn't the failure of a request, at least not a request from our (client) side. If Gremlin or HBase sends any request of its own, like a ping to verify the connection, then I'm not sure about that.

But what could be the reason for this random behaviour? If the properties were wrong, then it should never start or configure. But it is just random.

Stephen Mallette

Oct 22, 2019, 10:20:30 AM
to janusgra...@googlegroups.com
hmm, i'm not sure how the HttpGremlinEndpointHandler would log an error if it didn't get a request. nothing in TinkerPop that I can think of would issue a request to that. we don't even recommend that folks use that really. 

anyway, that aside, i'm not aware of situations where Gremlin Server will have a successful init script run only to later lose a global binding. it's actually "hard" to get rid of a reference bound to the ScriptEngine once it's in there. I guess I would try to do some debugging in the init script to try to figure out what's happening. Perhaps, verify that "g2" actually works at time of init? Like, maybe:

g =  graph2.traversal()
ctx.logger.info("found a vertex: " + g.V().limit(1).next())
globals << [g2: g,g1:graph1.traversal()]


