Naresh,
The WriteTimeoutException is triggered by the Cassandra node itself because it is not able to answer your write query within the configured rpc_timeout. As Alex mentioned, you have likely reached some IO limits in the environment you're working on. Since writes only use sequential IO, you shouldn't see any WriteTimeout except maybe during compactions. If you do see such timeouts, it's probably because you have a mixed workload with reads. If that is the case, you should:
- Make sure that you use a separate disk for the commitlog. This reduces the pressure on the SSTable disk, since most of the IO on that disk will then be dedicated to reads, memtable flushes and SSTable compactions (there is a configuration sketch after this list illustrating the first two points).
- Try tuning concurrent_reads in cassandra.yaml to reduce the pressure on the disk. The comments in cassandra.yaml are fairly explicit about it:
# For workloads with more data than can fit in memory, Cassandra's
# bottleneck will be reads that need to fetch data from
# disk. "concurrent_reads" should be set to (16 * number_of_drives) in
# order to allow the operations to enqueue low enough in the stack
# that the OS and drives can reorder them.
#
# On the other hand, since writes are almost never IO bound, the ideal
# number of "concurrent_writes" is dependent on the number of cores in
# your system; (8 * number_of_cores) is a good rule of thumb.
concurrent_reads: 32
concurrent_writes: 32
- If the above doesn't help, or if you just have a laptop with a single disk, try increasing the timeouts in cassandra.yaml, for example:
write_request_timeout_in_ms: 10000
# How long a coordinator should continue to retry a CAS operation
# that contends with other proposals for the same row
cas_contention_timeout_in_ms: 1000
# How long the coordinator should wait for truncates to complete
# (This can be much longer, because unless auto_snapshot is disabled
# we need to flush first so we can snapshot before removing the data.)
truncate_request_timeout_in_ms: 60000
# The default timeout for other, miscellaneous operations
request_timeout_in_ms: 10000
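To illustrate the first two points, here is a minimal cassandra.yaml sketch. The paths and the reduced concurrent_reads value are only assumptions (a 4-core box with one dedicated commitlog drive and one data drive), so adapt them to your own hardware following the 16 * number_of_drives and 8 * number_of_cores rules quoted above:

# Hypothetical layout: commitlog on its own drive, data on another
# (adjust the paths to your environment)
commitlog_directory: /mnt/disk1/cassandra/commitlog
data_file_directories:
    - /mnt/disk2/cassandra/data

# Example only: with a single data drive, 16 * 1 drive = 16
concurrent_reads: 16
# 8 * number_of_cores; assuming a 4-core machine here
concurrent_writes: 32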
You don't have any timeout to increase on the driver side, as the driver will wait until the coordinator node (your single node, in your case) times out.
Michael