com.datastax.driver.core.exceptions.BusyPoolException with CQL backend

scott_p...@persistent.co.in

Aug 30, 2017, 12:29:36 PM
to JanusGraph users
I'm trying out the new storage.backend=cql with Cassandra from the master branch (0.2.0), and I'm consistently hitting this error after a few hours of CRUD operations. The same workload ran with no problems on 0.1.0 using the storage.backend=cassandra driver.

Is this a known issue that will be addressed in the final 0.2.0 release? Any recommendations on configuration options to adjust the pool or connection sizes?

org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:57)
at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.persist(CacheTransaction.java:95)
at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.flushInternal(CacheTransaction.java:137)
at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.commit(CacheTransaction.java:200)
at org.janusgraph.diskstorage.BackendTransaction.commitStorage(BackendTransaction.java:133)
at org.janusgraph.graphdb.database.StandardJanusGraph.commit(StandardJanusGraph.java:729)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.commit(StandardJanusGraphTx.java:1374)
at org.janusgraph.graphdb.tinkerpop.JanusGraphBlueprintsGraph$GraphTransaction.doCommit(JanusGraphBlueprintsGraph.java:272)
at org.apache.tinkerpop.gremlin.structure.util.AbstractTransaction.commit(AbstractTransaction.java:105)
... 13 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Could not successfully complete backend operation due to repeated temporary exceptions after PT1M40S
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:101)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
... 21 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore.lambda$null$2(CQLKeyColumnValueStore.java:123)
at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore$$Lambda$184.00000000BC05D670.apply(Unknown Source)
at io.vavr.API$Match$Case0.apply(API.java:3174)
at io.vavr.API$Match.of(API.java:3137)
at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore.lambda$static$3(CQLKeyColumnValueStore.java:120)
at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore$$Lambda$65.00000000C02B77E0.apply(Unknown Source)
at org.janusgraph.diskstorage.cql.CQLStoreManager.mutateManyUnlogged(CQLStoreManager.java:415)
at org.janusgraph.diskstorage.cql.CQLStoreManager.mutateMany(CQLStoreManager.java:346)
at org.janusgraph.diskstorage.locking.consistentkey.ExpectedValueCheckingStoreManager.mutateMany(ExpectedValueCheckingStoreManager.java:79)
at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:98)
at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:95)
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
... 22 more
Caused by: java.util.concurrent.ExecutionException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /x.x.x.x:9042 (com.datastax.driver.core.exceptions.BusyPoolException: [/x.x.x.x] Pool is busy (no available connection and the queue has reached its max size 256)))
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
at io.vavr.concurrent.Future$$Lambda$71.00000000C00F2990.apply(Unknown Source)
at io.vavr.control.Try.of(Try.java:62)
at io.vavr.concurrent.FutureImpl.lambda$run$2(FutureImpl.java:199)
at io.vavr.concurrent.FutureImpl$$Lambda$72.00000000C00F3670.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
... 4 more
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /x.x.x.x:9042 (com.datastax.driver.core.exceptions.BusyPoolException: [/x.x.x.x] Pool is busy (no available connection and the queue has reached its max size 256)))
at com.datastax.driver.core.RequestHandler.reportNoMoreHosts(RequestHandler.java:211)
at com.datastax.driver.core.RequestHandler.access$1000(RequestHandler.java:46)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.findNextHostAndQuery(RequestHandler.java:275)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution$1.onFailure(RequestHandler.java:338)
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
at com.google.common.util.concurrent.Futures$ImmediateFuture.addListener(Futures.java:106)
at com.google.common.util.concurrent.Futures.addCallback(Futures.java:1322)
at com.google.common.util.concurrent.Futures.addCallback(Futures.java:1258)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.query(RequestHandler.java:297)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.findNextHostAndQuery(RequestHandler.java:272)
at com.datastax.driver.core.RequestHandler.startNewExecution(RequestHandler.java:115)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:95)
at com.datastax.driver.core.SessionManager.executeAsync(SessionManager.java:132)
at org.janusgraph.diskstorage.cql.CQLStoreManager.lambda$null$24(CQLStoreManager.java:406)
at org.janusgraph.diskstorage.cql.CQLStoreManager$$Lambda$106.00000000C0938C00.apply(Unknown Source)
at io.vavr.collection.Iterator$35.getNext(Iterator.java:1632)
at io.vavr.collection.AbstractIterator.next(AbstractIterator.java:34)
at io.vavr.collection.Iterator$34.getNext(Iterator.java:1510)
at io.vavr.collection.AbstractIterator.next(AbstractIterator.java:34)
at io.vavr.collection.Iterator$34.getNext(Iterator.java:1510)
at io.vavr.collection.AbstractIterator.next(AbstractIterator.java:34)
at io.vavr.collection.Traversable.foldLeft(Traversable.java:471)
at io.vavr.concurrent.Future.sequence(Future.java:549)
at org.janusgraph.diskstorage.cql.CQLStoreManager.mutateManyUnlogged(CQLStoreManager.java:388)
at org.janusgraph.diskstorage.cql.CQLStoreManager.mutateMany(CQLStoreManager.java:346)
at org.janusgraph.diskstorage.locking.consistentkey.ExpectedValueCheckingStoreManager.mutateMany(ExpectedValueCheckingStoreManager.java:79)
at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:98)
at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:95)
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.persist(CacheTransaction.java:95)
at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.flushInternal(CacheTransaction.java:137)
at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.commit(CacheTransaction.java:200)
at org.janusgraph.diskstorage.BackendTransaction.commitStorage(BackendTransaction.java:133)
at org.janusgraph.graphdb.database.StandardJanusGraph.commit(StandardJanusGraph.java:729)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.commit(StandardJanusGraphTx.java:1374)
at org.janusgraph.graphdb.tinkerpop.JanusGraphBlueprintsGraph$GraphTransaction.doCommit(JanusGraphBlueprintsGraph.java:272)
at org.apache.tinkerpop.gremlin.structure.util.AbstractTransaction.commit(AbstractTransaction.java:105)
... 13 more

Jason Plurad

Aug 30, 2017, 12:52:52 PM
to JanusGraph users
Looks like you can set a batch size, and the default is 20. Let us know if it helps.

storage.cql.batch-statement-size=20

Related reading on the Cassandra Java Driver FAQ.
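For anyone hitting the same thing, here is a sketch of what the relevant knobs could look like in a JanusGraph properties file. Only batch-statement-size is confirmed above; the pool-tuning option names are assumptions based on CQLConfigOptions and the Java driver's PoolingOptions, so verify they exist in your build before relying on them.

storage.backend=cql
storage.hostname=127.0.0.1

# Confirmed above: statements per async batch, default 20.
storage.cql.batch-statement-size=20

# Assumed option names (check CQLConfigOptions in your JanusGraph version):
# more connections per host and a deeper per-connection request limit give
# the driver more headroom before it throws BusyPoolException.
storage.cql.local-max-connections-per-host=2
storage.cql.local-max-requests-per-connection=1024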

Robert Dale

Aug 30, 2017, 1:21:03 PM
to Jason Plurad, JanusGraph users
Are you iterating your results all the way out?

Robert Dale

Scott P

Aug 30, 2017, 2:25:22 PM
to JanusGraph users, plu...@gmail.com
I think I am iterating all the results, but I'm not certain. I looked through my traversal code and categorized it into these three patterns.

a) Most of my traversals end with toList() or iterate(), such as these:
JanusGraph.traversal().V().has("propA", "value1").has("propB", graphUrl).hasLabel("label1").toList()
JanusGraph.traversal().V().has("propB", "value2").hasLabel("label2").property("propC", "value3").drop().iterate()

b) There are a couple of snippets where I already have a Vertex object and I do g.V(v).next(). I was expecting that a lookup by a Vertex object would have exactly one result, so it should be okay to just call next():
Vertex v = input;
JanusGraph.traversal().V(v).next()

c) And I found a couple of suspect snippets where I end the traversal with hasNext() to check whether something exists:
JanusGraph.traversal().V().has("propD", "value4").hasLabel("label3").hasNext()

Could any of these be leaking a connection?
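For pattern (c), one defensive option (a sketch, not from this thread) is to close the traversal explicitly after the existence check, assuming your TinkerPop version's Traversal implements AutoCloseable:

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Vertex;

// Hypothetical helper: try-with-resources calls close() on the traversal
// even though hasNext() never iterates it to the end.
static boolean exists(final GraphTraversalSource g) throws Exception {
    try (GraphTraversal<Vertex, Vertex> t =
            g.V().has("propD", "value4").hasLabel("label3")) {
        return t.hasNext();
    }
}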

Scott P

Aug 30, 2017, 2:35:01 PM
to JanusGraph users
A single thread is doing all CRUD operations, and it commits after roughly every 10,000 element touches. Could the number of deltas in the transaction be too large?
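For reference, a minimal sketch of that kind of loop (the labels, properties, and helper itself are hypothetical), with the commit interval pulled out as a constant so it can be lowered if 10,000 turns out to be too many deltas per transaction:

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.janusgraph.core.JanusGraph;

// All mutations accumulated in the transaction are flushed on commit, so
// COMMIT_EVERY bounds the burst of work sent to the backend at once.
static final int COMMIT_EVERY = 10_000;

static void load(final JanusGraph graph, final Iterable<String> ids) {
    final GraphTraversalSource g = graph.traversal();
    int pending = 0;
    for (final String id : ids) {
        g.addV("label1").property("propA", id).iterate();
        if (++pending >= COMMIT_EVERY) {
            g.tx().commit();  // flush this transaction's deltas
            pending = 0;
        }
    }
    g.tx().commit();  // commit the remainder
}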

Robert Dale

Aug 31, 2017, 6:37:16 PM
to Scott P, JanusGraph users
To be honest, I'm not sure how the backend handles open iterators and other open resources. `next()` and `hasNext()` may not necessarily know that nothing follows and that they should clean up after themselves. I'm not even sure what requires closing in an embedded graph traversal at this point, since I typically work with the remote client.

However, looking at the code in the stack trace, it's creating the batches for adds/deletes and submitting them in parallel. Looking at the FAQ that Jason linked, I agree it points to batch size. Thus, if you are committing every 10k mutations, the batch size should probably be around 40 (10,000 / 256 ≈ 39). Or your transaction size should be about 5k (256 × 20 = 5,120). Let us know if these changes make sense and work for you.
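Spelling out that arithmetic with the numbers from the stack trace (the driver queues at most 256 requests per host), the two options look like this; the values are illustrative, not tested recommendations:

# Option A: keep 10k-mutation commits but use bigger batches, so that
# 10000 mutations / 40 statements per batch = 250 batches <= 256 queue slots.
storage.cql.batch-statement-size=40

# Option B: keep the default batch size of 20 and commit smaller transactions
# in application code: 256 queue slots * 20 statements = 5120 mutations.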

Robert Dale
