On May 25, 1:46 am, Dan Washusen <d...@reactive.org> wrote:
> I'm also unclear as to what you're trying to achieve/what your issue is. I can take down nodes and as long as the cluster can maintain quorum (which my code requires for certain reads and writes) everything works as expected without any further modification to Pelops. It does log warning messages every once in a while but it doesn't throw exceptions. I do this on a semi-regular basis in production when the Cassandra team releases new versions...
This is just what I want to achieve :)
I wrote a simple test and noticed that if I write a single column at
QUORUM with one node down, Pelops logs the 10s suspension but the write
succeeds. As soon as I write 2 columns in a batch, it breaks:
2011-05-25 11:50:59,453 WARN [main] org.scale7.cassandra.pelops.Operand - Operation failed as result of network exception. Connection is being marked as corrupt (and will probably be be destroyed). See cause for details...
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
    at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:906)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:890)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:67)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:63)
    at org.scale7.cassandra.pelops.Operand.tryOperation(Operand.java:82)
    at org.scale7.cassandra.pelops.Mutator.execute(Mutator.java:72)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
    ... 14 more
2011-05-25 11:51:03,455 WARN [main] org.scale7.cassandra.pelops.Operand - Operation failed as result of network exception. Connection is being marked as corrupt (and will probably be be destroyed). See cause for details...
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
    at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:906)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:890)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:67)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:63)
    at org.scale7.cassandra.pelops.Operand.tryOperation(Operand.java:82)
    at org.scale7.cassandra.pelops.Mutator.execute(Mutator.java:72)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
    ... 14 more
Same with Hector. Now I'm completely confused. Why does it matter how
many columns I write? I'm not supposed to call execute after each
column, am I?
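For reference, here is a minimal sketch of the quorum arithmetic this expectation rests on. The class and method names are made up for illustration, and the replication factor of 3 is an assumption (it is not stated anywhere in this thread). It only shows that the number of replicas needed for QUORUM depends on the replication factor, not on how many columns a mutation contains; it does not explain the read timeouts above.

```java
public class QuorumSketch {
    /** Replicas that must acknowledge a QUORUM operation: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True if QUORUM can still be met with the given number of replicas down. */
    static boolean quorumReachable(int replicationFactor, int replicasDown) {
        return replicationFactor - replicasDown >= quorum(replicationFactor);
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor, not taken from this thread
        System.out.println("quorum for RF=" + rf + ": " + quorum(rf));
        System.out.println("reachable with 1 replica down: " + quorumReachable(rf, 1));
        System.out.println("reachable with 2 replicas down: " + quorumReachable(rf, 2));
    }
}
```

Under that assumption, one downed replica still leaves 2 of 3 available, which is why a QUORUM write would be expected to succeed whether the batch holds one column or two.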
> > if (!pool.getConnectionValidator().validate(c)) {
> > final long suspendedUntil = System.currentTimeMillis() + 60000L;
> > node.setSuspensionState(new CommonsBackedPool.INodeSuspensionState() {
> > @Override