Re: Best driver configuration for aws lambda / cassandra single node setup

Andy Tolbert

Feb 23, 2018, 2:01:35 PM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi Jens,


> The aws lambda container is somewhat a black box, I don't think threads are allowed to run between actual invocations of the lambda so I suspect that ControlConnection is not allowed to keep the connection alive.


I think that should be allowed by Lambda. As long as the container is active, I would expect the Cluster and its connections to persist.

I haven't done much experimentation with AWS Lambda and the Java driver, but I would expect it to be able to maintain the connections within the context of the containers created by AWS Lambda. The driver also sends heartbeat messages on connections to keep them active, so I wouldn't expect something in AWS to detect an inactive socket and close it.
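
For reference, here is a minimal sketch of where the heartbeat interval mentioned above is configured, assuming the 3.x driver that appears in the stack trace. The class name `ClusterConfigSketch`, the contact point argument, and all numeric values are placeholders for illustration, not recommendations. If the JVM really is frozen between invocations, as suggested later in the thread, no heartbeat can be sent during that time, so these settings alone may not be enough.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PoolingOptions;
import com.datastax.driver.core.SocketOptions;
import com.datastax.driver.core.policies.ConstantReconnectionPolicy;

// Hypothetical helper class; requires the DataStax Java driver 3.x on the classpath.
final class ClusterConfigSketch {

    static Cluster buildCluster(String contactPoint) {
        // Interval between driver-level heartbeats on idle connections.
        // 30 is an illustrative value only.
        PoolingOptions pooling = new PoolingOptions()
                .setHeartbeatIntervalSeconds(30);

        // Ask the OS to use TCP keep-alive on the driver's sockets as well.
        SocketOptions socket = new SocketOptions()
                .setKeepAlive(true);

        return Cluster.builder()
                .addContactPoint(contactPoint)
                .withPoolingOptions(pooling)
                .withSocketOptions(socket)
                // With a single node there is nothing to fail over to, so a
                // fixed retry delay (in milliseconds) is one possible choice.
                .withReconnectionPolicy(new ConstantReconnectionPolicy(1000))
                .build();
    }
}
```

A handler would typically call `buildCluster(...)` once, keep the resulting `Cluster` and `Session` in static fields, and reuse them across invocations.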

Are there any earlier log entries that indicate why the pool was shut down? That may help us understand what is going on.

Thanks,
Andy


On Fri, Feb 23, 2018 at 4:54 AM, Jens Teglhus Møller <j...@mostlyharmless.dk> wrote:
Hi

We have a setup with a single Cassandra node.

We have AWS Lambdas connecting to the Cassandra node, which runs on an EC2 instance, and once in a while (mostly when the lambda has been idle for a while with an open connection) we get exceptions like the following (IP address replaced with x.y.z.w):

The lambda was not invoked for 90 minutes, then this:

ERROR c.d.d.c.RequestHandler - Unexpected error while querying /x.y.z.w
com.datastax.driver.core.exceptions.ConnectionException: [/x.y.z.w:9042] Pool is shutdown
at com.datastax.driver.core.HostConnectionPool.closeAsync(HostConnectionPool.java:613)
at com.datastax.driver.core.SessionManager.removePool(SessionManager.java:400)
at com.datastax.driver.core.SessionManager.onDown(SessionManager.java:485)
at com.datastax.driver.core.Cluster$Manager.onDown(Cluster.java:1910)
at com.datastax.driver.core.Cluster$Manager.access$1200(Cluster.java:1354)
at com.datastax.driver.core.Cluster$Manager$5.runMayThrow(Cluster.java:1867)
at com.datastax.driver.core.ExceptionCatchingRunnable.run(ExceptionCatchingRunnable.java:32)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)

ERROR c.d.d.c.ControlConnection - [Control connection] Cannot connect to any host, scheduling retry in 1000 milliseconds
All host(s) tried for query failed (tried: /x.y.z.w:9042 (com.datastax.driver.core.exceptions.ConnectionException: [/x.y.z.w:9042] Pool is shutdown)): com.datastax.driver.core.exceptions.NoHostAvailableException
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /x.y.z.w:9042 (com.datastax.driver.core.exceptions.ConnectionException: [/x.y.z.w:9042] Pool is shutdown))
at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:37)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:68)
at dk.danskespil.safe.cassandra.CassandraConnector.execute(CassandraConnector.java:76)
at dk.danskespil.safe.cassandra.repository.BaseRepository.find(BaseRepository.java:29)
at dk.danskespil.safe.metricscollector.collector.DrawingStartStructureReceivedService.findTodayDrawingStartStructure(DrawingStartStructureReceivedService.java:83)
at dk.danskespil.safe.metricscollector.collector.DrawingStartStructureReceivedService.check(DrawingStartStructureReceivedService.java:51)
at dk.danskespil.safe.metricscollector.MetricCollectorsHandler.handleRequest(MetricCollectorsHandler.java:32)
at dk.danskespil.safe.metricscollector.MetricCollectorsHandler.handleRequest(MetricCollectorsHandler.java:16)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /x.y.z.w:9042 (com.datastax.driver.core.exceptions.ConnectionException: [/x.y.z.w:9042] Pool is shutdown))
at com.datastax.driver.core.RequestHandler.reportNoMoreHosts(RequestHandler.java:230)
at com.datastax.driver.core.RequestHandler.access$1000(RequestHandler.java:50)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.findNextHostAndQuery(RequestHandler.java:301)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution$1.onFailure(RequestHandler.java:368)
at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1228)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:399)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:911)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:822)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:686)
at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:54)
at com.datastax.driver.core.HostConnectionPool$PendingBorrow.setException(HostConnectionPool.java:707)
at com.datastax.driver.core.HostConnectionPool.closeAsync(HostConnectionPool.java:613)
at com.datastax.driver.core.SessionManager.removePool(SessionManager.java:400)
at com.datastax.driver.core.SessionManager.onDown(SessionManager.java:485)
at com.datastax.driver.core.Cluster$Manager.onDown(Cluster.java:1910)
at com.datastax.driver.core.Cluster$Manager.access$1200(Cluster.java:1354)
at com.datastax.driver.core.Cluster$Manager$5.runMayThrow(Cluster.java:1867)
at com.datastax.driver.core.ExceptionCatchingRunnable.run(ExceptionCatchingRunnable.java:32)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)

The AWS Lambda container is something of a black box. I don't think threads are allowed to run between actual invocations of the lambda, so I suspect that the ControlConnection is not allowed to keep the connection alive.

What is the best driver configuration (LoadBalancingPolicy/ReconnectionPolicy/PoolingOptions/RetryPolicy) for this case?

At the moment we have some ugly hacks that catch NoHostAvailableException and reconnect (create a new Cluster and connect to it). The driver seems to have a lot of features that should help us handle the situation, but we are not sure which solution is best.
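
To make the workaround described above concrete, here is a minimal, driver-independent sketch of "catch the failure, rebuild, retry once". `Reconnecting`, `FakeSession`, and `ReconnectingDemo` are hypothetical names introduced for illustration. In real code the factory would build a new Cluster and connect to it, and the catch clause would be narrowed to the connectivity exception actually seen (NoHostAvailableException in the trace above).

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of the driver): holds one long-lived
 * resource and, if an operation on it fails, discards it, builds a fresh
 * one, and retries the operation exactly once.
 */
final class Reconnecting<T extends AutoCloseable> {
    private final Supplier<T> factory;
    private T current; // created lazily, then reused across calls

    Reconnecting(Supplier<T> factory) {
        this.factory = factory;
    }

    synchronized <R> R call(Function<T, R> operation) {
        if (current == null) {
            current = factory.get();
        }
        try {
            return operation.apply(current);
        } catch (RuntimeException firstFailure) {
            // Assumption: any runtime failure means the resource is unusable.
            closeQuietly(current);
            current = null;
            current = factory.get();
            return operation.apply(current); // a second failure propagates
        }
    }

    private static void closeQuietly(AutoCloseable resource) {
        try {
            resource.close();
        } catch (Exception ignored) {
            // The resource is already considered broken.
        }
    }
}

/** Stand-in for a driver session: fails once it has been marked stale. */
final class FakeSession implements AutoCloseable {
    final int id;
    boolean stale = false;

    FakeSession(int id) {
        this.id = id;
    }

    String query(String cql) {
        if (stale) {
            throw new IllegalStateException("Pool is shutdown");
        }
        return "session " + id + " ran: " + cql;
    }

    @Override
    public void close() {
        // Nothing to release in the stand-in.
    }
}

final class ReconnectingDemo {
    public static void main(String[] args) {
        AtomicInteger built = new AtomicInteger();
        FakeSession[] latest = new FakeSession[1];
        Reconnecting<FakeSession> holder = new Reconnecting<FakeSession>(() -> {
            latest[0] = new FakeSession(built.incrementAndGet());
            return latest[0];
        });

        String first = holder.call(s -> s.query("SELECT 1"));
        System.out.println(first);  // session 1 ran: SELECT 1

        latest[0].stale = true;     // simulate the pool dying while idle
        String second = holder.call(s -> s.query("SELECT 1"));
        System.out.println(second); // session 2 ran: SELECT 1
    }
}
```

Retrying only once keeps a genuinely unreachable node from turning a single request into an endless loop; whether one retry is enough in this setup is an open question.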

Best regards Jens

--
You received this message because you are subscribed to the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-user+unsubscribe@lists.datastax.com.

Jens Teglhus Møller

Feb 24, 2018, 1:56:55 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi Andy

There is no activity in the log file between the two invocations. Lambdas are billed by CPU seconds spent during an active request, so I expect that AWS somehow freezes the JVM between invocations so that you cannot cheat and use CPU resources outside an invocation. I think the logs indicate that. But a lot of things could be going on, and I should probably ask on an AWS board whether such heartbeat/keepalive code is allowed to run (I expect all database connection pools to have similar functionality, and AWS recommends keeping connections between invocations: <https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html>).
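
If the JVM really is frozen between invocations, one option is to check at the start of each invocation how long the connection has been idle and rebuild it before use. Below is a minimal sketch of that idea. `IdleAwareHolder`, `ManualClock`, `FakeConnection`, and the 10-minute threshold are all hypothetical, and the sketch assumes that wall-clock time keeps advancing while the container is frozen.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Hypothetical holder, meant to live in a static field of the handler class
 * so it survives between invocations of a warm container. It rebuilds the
 * resource when the previous use was longer ago than maxIdle.
 */
final class IdleAwareHolder<T extends AutoCloseable> {
    private final Supplier<T> factory;
    private final Duration maxIdle;
    private final Clock clock;
    private T current;
    private Instant lastUsed;

    IdleAwareHolder(Supplier<T> factory, Duration maxIdle, Clock clock) {
        this.factory = factory;
        this.maxIdle = maxIdle;
        this.clock = clock;
    }

    synchronized T get() {
        Instant now = clock.instant();
        boolean idleTooLong = current != null
                && Duration.between(lastUsed, now).compareTo(maxIdle) > 0;
        if (idleTooLong) {
            try {
                current.close();
            } catch (Exception ignored) {
                // The old resource is being thrown away anyway.
            }
            current = null;
        }
        if (current == null) {
            current = factory.get();
        }
        lastUsed = now;
        return current;
    }
}

/** Test clock that only moves when told to. */
final class ManualClock extends Clock {
    private Instant now = Instant.EPOCH;

    void advance(Duration d) { now = now.plus(d); }

    @Override public Instant instant() { return now; }
    @Override public ZoneId getZone() { return ZoneOffset.UTC; }
    @Override public Clock withZone(ZoneId zone) { return this; }
}

/** Stand-in for a Cluster/Session pair. */
final class FakeConnection implements AutoCloseable {
    final int id;
    boolean closed = false;

    FakeConnection(int id) { this.id = id; }

    @Override public void close() { closed = true; }
}

final class IdleAwareDemo {
    public static void main(String[] args) {
        ManualClock clock = new ManualClock();
        AtomicInteger built = new AtomicInteger();
        IdleAwareHolder<FakeConnection> holder = new IdleAwareHolder<FakeConnection>(
                () -> new FakeConnection(built.incrementAndGet()),
                Duration.ofMinutes(10), clock);

        System.out.println(holder.get().id); // 1: first invocation builds
        clock.advance(Duration.ofMinutes(5));
        System.out.println(holder.get().id); // 1: still fresh, reused
        clock.advance(Duration.ofMinutes(90));
        System.out.println(holder.get().id); // 2: idle too long, rebuilt
    }
}
```

In a real handler the clock would be `Clock.systemUTC()` and the factory would create the Cluster and Session. The trade-off is an extra connection setup after every long idle period, even when the old connection would still have worked.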

I tried to work around the issue by implementing a RetryPolicy that would allow the driver to recover (I am not sure whether it can recover from a shut-down pool, and testing this scenario is rather complicated). When I run a loop doing queries against a database and then restart the database, the policy is never called. Is that expected behaviour?

Best regards Jens