Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108) ~[stormjar.jar:na]
at com.datastax.driver.core.SessionManager.execute(SessionManager.java:530) ~[stormjar.jar:na]
at com.datastax.driver.core.SessionManager.executeQuery(SessionManager.java:566) ~[stormjar.jar:na]
at com.datastax.driver.core.SessionManager.executeAsync(SessionManager.java:119) ~[stormjar.jar:na]
We are having this issue in production, so would love to get some advice on how to triage this. What we know for sure:
* When the errors start happening, the Cassandra cluster isn't under unusual load.
* We are able to reach the Cassandra hosts (cluster) from the client side, so it is not a connection issue.
To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-us...@lists.datastax.com.
2015-02-27 16:47:11,636 Hashed wheel timer #1 DEBUG com.datastax.driver.core.Connection - Defuncting connection to /x.x.x.x:9042
com.datastax.driver.core.exceptions.DriverException: Timed out waiting for server response
at com.datastax.driver.core.RequestHandler.onTimeout(RequestHandler.java:570)
at com.datastax.driver.core.Connection$ResponseHandler$1.run(Connection.java:893)
at com.datastax.shaded.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:546)
at com.datastax.shaded.netty.util.HashedWheelTimer$Worker.notifyExpiredTimeouts(HashedWheelTimer.java:446)
at com.datastax.shaded.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:395)
at com.datastax.shaded.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at java.lang.Thread.run(Thread.java:745)
2015-02-27 16:47:11,838 Cassandra Java Driver worker-1311 DEBUG com.datastax.driver.core.Connection - Connection[/x.x.x.x:9042-98290, inFlight=0, closed=true] closing connection
2015-02-27 16:47:11,838 Cassandra Java Driver worker-1311 DEBUG com.datastax.driver.core.Connection - Connection[/x.x.x.x:9042-98290, inFlight=0, closed=true] closing connection
2015-02-27 16:47:11,838 Cassandra Java Driver worker-1311 DEBUG com.datastax.driver.core.Connection - Connection[/x.x.x.x:9042-98291, inFlight=0, closed=true] closing connection
2015-02-27 16:47:11,838 Cassandra Java Driver worker-1311 DEBUG com.datastax.driver.core.Connection - Connection[/x.x.x.x:9042-98291, inFlight=0, closed=true] closing connection
2015-02-27 16:47:11,838 Cassandra Java Driver worker-1311 DEBUG com.datastax.driver.core.Connection - Connection[/x.x.x.x:9042-98292, inFlight=0, closed=true] closing connection
2015-02-27 16:47:11,838 Cassandra Java Driver worker-1311 DEBUG com.datastax.driver.core.Connection - Connection[/x.x.x.x:9042-98292, inFlight=0, closed=true] closing connection
2015-02-27 16:47:11,838 Cassandra Java Driver worker-1311 DEBUG com.datastax.driver.core.Connection - Connection[/x.x.x.x:9042-98293, inFlight=0, closed=true] closing connection
2015-02-27 16:47:11,868 ...
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108)
at com.datastax.driver.core.SessionManager.execute(SessionManager.java:530)
at com.datastax.driver.core.SessionManager.executeQuery(SessionManager.java:566)
at com.datastax.driver.core.SessionManager.executeAsync(SessionManager.java:119)
2015-03-03 17:54:49,188 Hashed wheel timer #1 DEBUG com.datastax.driver.core.Connection - Defuncting connection to /x.x.x.x:9042
com.datastax.driver.core.exceptions.DriverException: Timed out waiting for server response
at com.datastax.driver.core.RequestHandler.onTimeout(RequestHandler.java:570)
at com.datastax.driver.core.Connection$ResponseHandler$1.run(Connection.java:893)
at com.datastax.shaded.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:546)
at com.datastax.shaded.netty.util.HashedWheelTimer$Worker.notifyExpiredTimeouts(HashedWheelTimer.java:446)
at com.datastax.shaded.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:395)
at com.datastax.shaded.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at java.lang.Thread.run(Thread.java:745)
C* rejects big batches right off the bat by default (more precisely, any batch bigger than batch_size_fail_threshold_in_kb, which is 50 KB by default).
It's not always completely clear which query caused the driver timeouts, but one suspect is this:
SELECT id, lastUpdatedAt FROM myTable WHERE id IN (?)
with a list of ~500-1000 ids, on this table:
CREATE TABLE myTable (
    id text,
    data text,
    lastupdatedat timestamp,
    PRIMARY KEY ((id))
)
I'm gathering from the documentation that we should probably be doing this query differently, but regardless the question remains: why does the driver time out instead of the server?
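On the IN query with ~500-1000 ids: a commonly suggested alternative is to split the key list into smaller groups (down to one key per request) and issue them separately, so that no single request asks one coordinator for hundreds of partitions. The sketch below shows only the splitting step in plain Java. The class name KeyChunker and the chunk size are hypothetical, and running each chunk through the driver is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the driver): splits a large key list into smaller chunks. */
final class KeyChunker {
    private KeyChunker() {}

    /** Returns consecutive sub-lists of at most chunkSize elements each, in the original order. */
    static <T> List<List<T>> chunk(List<T> keys, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, keys.size());
            // Copy the slice so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(keys.subList(start, end)));
        }
        return chunks;
    }
}
```

Each chunk (or each single id) would then be bound to its own statement. What chunk size works well depends on the cluster and is something to measure rather than assume.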
But given how it does work currently, it seems like the recommendation is "don't write slow queries". While we certainly strive for that, we're still learning, and it's frustrating that the consequence of a mistake is that all queries fail.
Is there a better way to self-monitor for slow queries like this, so that the errors don't cascade into these opaque "no host was tried" errors that are difficult to trace?
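One way to self-monitor, sketched here under assumptions: time each request on the client side and record those that exceed a threshold, so a slow query shows up by name before it turns into connection timeouts. The class SlowQueryMonitor and its methods are hypothetical and use only the JDK. The start and finish calls would go around the request, for example in the callback attached to the future returned by executeAsync.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical client-side monitor (not a driver class): records requests slower than a threshold. */
final class SlowQueryMonitor {
    private final long thresholdNanos;
    private final ConcurrentLinkedQueue<String> slow = new ConcurrentLinkedQueue<>();

    SlowQueryMonitor(long thresholdMillis) {
        this.thresholdNanos = TimeUnit.MILLISECONDS.toNanos(thresholdMillis);
    }

    /** Call just before sending the request; returns the start timestamp. */
    long start() {
        return System.nanoTime();
    }

    /** Call from the completion callback with the value returned by start(). */
    void finish(String label, long startNanos) {
        record(label, System.nanoTime() - startNanos);
    }

    /** Records the label if the elapsed time is above the threshold. */
    void record(String label, long elapsedNanos) {
        if (elapsedNanos > thresholdNanos) {
            slow.add(label + " took " + TimeUnit.NANOSECONDS.toMillis(elapsedNanos) + " ms");
        }
    }

    /** Snapshot of the slow requests recorded so far. */
    List<String> slowQueries() {
        return new ArrayList<>(slow);
    }
}
```

The label would typically be the query string or a short name for it. In a real application the recorded entries would go to a log or a metrics system rather than an in-memory queue.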
Would it be advisable to increase the driver read timeout to something larger, so that slow queries don't blow up the whole driver?
Forgive my ignorance, but are timeouts on a per-query/per-request basis? If not, they definitely should be…
We would definitely like to get some feedback from the DataStax driver team on this (it seems like a serious bug), since, as far as we can tell, we are following all the prescribed patterns.
--
Olivier Michallat
Driver & tools engineer, DataStax
row_cache_size_in_mb: 2000 (previously the default value), plus a row cache property update on a single table.
counter_cache_size_in_mb: 500 (previously the default value)
concurrent_reads: 64 (previously 32)
The servers have enough memory to hold such a configuration.
And according to the comments in cassandra.yaml, concurrent_reads could/should be much greater than 32 in our configuration (12 cores).
We tried rolling back and it seems OK now.
I don't understand why those settings bring timeouts.
And I don't understand why the driver ends up considering nodes down when they are all up and I have no timeout issue with cqlsh.
-- Thomas
PS: By the way (even if not related to this issue), we get some NullPointerExceptions from the driver (the stack trace is attached to the email). Should I create a JIRA issue?
For another table, I get: nodetool: Unable to compute when histogram overflowed
The JMX measures on the driver side show higher latencies (please see the attached jconsole screenshot). The maximum time is about 23 seconds.
I did not understand why the C* nodes could not come back soon after the request timeouts.
Is there a way to speed up the return of C* nodes (I am not sure I have seen any C* node return to the UP state)? Is there a driver API that makes it possible to "force" the state of nodes (we would notice by ourselves if a C* node were really DOWN)?
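On how long a node can stay marked down: the driver schedules reconnection attempts according to its reconnection policy. The sketch below only computes the delays of an exponential schedule (base delay doubled on each attempt, capped at a maximum). The figures used in the usage note, a 1 second base and a 10 minute cap, are an assumption about the default policy and should be checked against the driver version in use. The class name is hypothetical and nothing here calls the driver.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: computes the delays of an exponential reconnection schedule. */
final class ReconnectionSchedule {
    private ReconnectionSchedule() {}

    /** Delay before each of the first `attempts` reconnection attempts, in milliseconds. */
    static List<Long> exponentialDelaysMs(long baseDelayMs, long maxDelayMs, int attempts) {
        List<Long> delays = new ArrayList<>();
        long delay = baseDelayMs;
        for (int i = 0; i < attempts; i++) {
            delays.add(Math.min(delay, maxDelayMs));
            // Keep doubling only until the cap is reached, which also avoids overflow.
            if (delay < maxDelayMs) {
                delay *= 2;
            }
        }
        return delays;
    }
}
```

With exponentialDelaysMs(1000, 600000, 12), the delays grow from 1 second to the 10 minute cap by the eleventh attempt, so a node that fails several attempts in a row is retried less and less often. If that is too slow, the driver's ConstantReconnectionPolicy, set through Cluster.Builder.withReconnectionPolicy, keeps the retry interval fixed.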
"Cassandra Java Driver worker-0" nid=36 state=TIMED_WAITING
- waiting on <0x1b072aaf> (a com.google.common.util.concurrent.ListenableFutureTask)
- locked <0x1b072aaf> (a com.google.common.util.concurrent.ListenableFutureTask)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:422)
at java.util.concurrent.FutureTask.get(FutureTask.java:199)
at com.datastax.driver.core.policies.DCAwareRoundRobinPolicy.waitOnReconnection(DCAwareRoundRobinPolicy.java:368)
at com.datastax.driver.core.policies.DCAwareRoundRobinPolicy.access$100(DCAwareRoundRobinPolicy.java:56)
at com.datastax.driver.core.policies.DCAwareRoundRobinPolicy$1.computeNext(DCAwareRoundRobinPolicy.java:310)
at com.datastax.driver.core.policies.DCAwareRoundRobinPolicy$1.computeNext(DCAwareRoundRobinPolicy.java:279)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at com.datastax.driver.core.policies.TokenAwarePolicy$1.computeNext(TokenAwarePolicy.java:157)
at com.datastax.driver.core.policies.TokenAwarePolicy$1.computeNext(TokenAwarePolicy.java:142)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:102)
at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:179)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
- java.util.concurrent.ThreadPoolExecutor$Worker@e1e012a
We experience it a few times per day on each node. We only use prepared statements. One statement for which it occurred is: "SELECT valCol FROM ks.table WHERE keyCol = ?;" where keyCol is a BIGINT and valCol a UUID. (This table is used as an 'index table'.)
However, it appears that the timer for this metric starts before the driver has chosen the node to query.
--
Olivier Michallat
Driver & tools engineer, DataStax
As for the refreshConnectedHosts() method, we are going to expose it from our application through JMX so that we can test it.
As you suspected, we currently have many "Cassandra Java Driver worker-x" threads waiting with that stack trace. So we will try to lower the retry count.
We currently use the DowngradingConsistencyRetryPolicy class and will use AlwaysIgnoreRetryPolicy instead (note: AlwaysIgnoreRetryPolicy is a test class in driver version 2.1.5, so I copy-pasted the code).
We set up one instance of our application with far less load on the C* driver, but connected to the same C* cluster as the other application instances (which still send many more requests). After some hours, we observed that C* requests were getting slower on that instance as well. I think the test tends to confirm that the issue is indeed about the handling of node status (and not about the driver's request capacity).
Hi Guys,
--
Olivier Michallat
Driver & tools engineer, DataStax
Hi Olivier,
How about the 2.1.X version, will it be fixed there?
Hi Olivier,
--
Olivier Michallat
Driver & tools engineer, DataStax
We recently upgraded to version 2.1.4 of the driver (from 2.1.1) and started seeing the "NoHostAvailableException: All host(s) tried for query failed (no host was tried)" errors, not immediately, but maybe eight to 36 hours after application startup:
I am trying to write 2 GB of data (the limit for a single Cassandra key/value) into a single column (or many columns) using the DataStax driver and CQL3 on a single-node Windows machine. I can barely write 100 MB into a single column, and only after hitting almost every kind of exception and making config changes. To write 100 MB I have to set "commitlog_segment_size_in_mb: 200", which works; beyond that, Cassandra kills itself. Is there any way I can insert up to 2 GB of data into one (at least) or many columns and measure the timing?
1. I am able to write up to 100 MB (in a single column of a table) with commitlog_segment_size_in_mb: 200.
2. When I try to write more, I get: nohostavailableexception: all host(s) tried for query failed waiting for server; operation time out waiting for server response.
3. I have looked at the system and debug logs; there is nothing in there.
4. I tried it on both Linux and Windows machine setups. On Windows the error is: ERROR [HintsWriteExecutor:1] 2016-06-10 12:59:40,239 CassandraDaemon.java:195 - Exception in thread Thread[HintsWriteExecutor:1,5,main]
The question is: can we write more than 1 GB of data in one column of a table or not?
Any help on the above is appreciated.
Thanks,
deepal
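On the large-value question: in practice, single values far smaller than the 2 GB hard limit are recommended, and a workaround that is often suggested is to split the payload into fixed-size chunks and store each chunk in its own row (for example keyed by an object id plus a chunk index). The sketch below shows only the splitting step in plain Java. The class name, the chunk size and the table layout are assumptions, not something taken from this thread.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits a large payload into fixed-size chunks, one per row. */
final class BlobChunker {
    private BlobChunker() {}

    /** Returns read views over consecutive slices of payload, each at most chunkBytes long. */
    static List<ByteBuffer> split(byte[] payload, int chunkBytes) {
        if (chunkBytes <= 0) {
            throw new IllegalArgumentException("chunkBytes must be positive");
        }
        List<ByteBuffer> chunks = new ArrayList<>();
        int offset = 0;
        while (offset < payload.length) {
            int len = Math.min(chunkBytes, payload.length - offset);
            // slice() yields a buffer that starts at position 0 and holds exactly len bytes.
            chunks.add(ByteBuffer.wrap(payload, offset, len).slice());
            offset += len;
        }
        return chunks;
    }
}
```

Each ByteBuffer could then be bound to a blob column in its own insert, with the chunk's list index stored alongside it so the payload can be reassembled in order on read.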
On Fri, Feb 27, 2015 at 9:26 AM, Ankush Goyal <ank...@gmail.com> wrote:
Hi Guys,
We recently (this week) upgraded our Java driver from 2.0.4 to 2.1.4 (we are running Cassandra version 2.0.12). We had been using 2.0.4 in production for a long time (more than a year) and did not have this issue.
But after upgrading, we started seeing intermittent NoHostAvailableExceptions (that fail all the requests on the client side):
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108) ~[stormjar.jar:na]
at com.datastax.driver.core.SessionManager.execute(SessionManager.java:530) ~[stormjar.jar:na]
at com.datastax.driver.core.SessionManager.executeQuery(SessionManager.java:566) ~[stormjar.jar:na]
at com.datastax.driver.core.SessionManager.executeAsync(SessionManager.java:119) ~[stormjar.jar:na]
We are having this issue in production, so would love to get some advice on how to triage this. What we know for sure:
* When the errors start happening, the Cassandra cluster isn't under unusual load.
* We are able to reach the Cassandra hosts (cluster) from the client side, so it is not a connection issue.
Cheers,
-Ankush
--
You received this message because you are subscribed to the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
I have configured a Cassandra cluster with 3 nodes:
Node1 (192.168.0.2), Node2 (192.168.0.3), Node3 (192.168.0.4)
I created a keyspace 'test' with a replication factor of 2:
CREATE KEYSPACE test WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 2}
When I stop either Node2 or Node3 (one at a time, or both at once), I am still able to do CRUD operations on the keyspace.table.
When I stop Node1 and try to update/create a row from Node4 or Node3, I get the following error, although Node3 and Node4 are up and running:
All host(s) tried for query failed (tried: /192.168.0.4:9042 (com.datastax.driver.core.exceptions.DriverException: Timeout while trying to acquire available connection (you may want to increase the driver number of per-host connections))) com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /192.168.0.4:9042 (com.datastax.driver.core.exceptions.DriverException: Timeout while trying to acquire available connection (you may want to increase the driver number of per-host connections)))
I am not sure how Cassandra elects a leader if a leader node dies.
Thanks
Uttam Anand
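On the leader question: Cassandra has no leader node. Any node can coordinate a request, and whether a request succeeds with a node down depends on the replication factor and the consistency level. The arithmetic below is a small sketch of that relationship. The class and method names are hypothetical, and since the message above does not say which consistency level is used, this is not a diagnosis of the connection-acquisition timeout shown in the error.

```java
/** Hypothetical helper: replica arithmetic for a given replication factor. */
final class ReplicaMath {
    private ReplicaMath() {}

    /** Number of replicas that must respond at QUORUM: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** How many replicas of a row may be down while `required` replicas can still respond. */
    static int tolerableReplicaFailures(int replicationFactor, int required) {
        return replicationFactor - required;
    }
}
```

For replication_factor 2, quorum(2) is 2, so a QUORUM request tolerates no replica being down, while a request at consistency level ONE tolerates one. With a replication factor of 3, quorum is still 2 and one replica may be down.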