seed node failure crash the whole cluster (cont.)

wing

Feb 7, 2011, 10:06:01 PM

to Scale 7 - Libraries and systems for scalable computing

I have some problems when a seed node fails. I originally sent an
email here asking for help:

http://www.mail-archive.com/u...@cassandra.apache.org/msg09731.html

so I will not repeat much of the background here; instead I will
provide more details on the problem. Thanks.

3 machines are in the cluster:
10.1.4.221
10.1.4.223
10.1.4.224

with 10.1.4.221 and 10.1.4.223 marked as "seeds" in /conf/cassandra.yaml as:

- 10.1.4.221
- 10.1.4.223

I use Pelops to initialize the cluster with:

new Cluster("10.1.4.221,10.1.4.223,10.1.4.224" /* nodeIps */,
9160 /* nodePort */, true /* dynamicNodeDiscovery */)

Below I describe the different problem situations I ran into. It is a
bit long, please forgive me.

situation 1)
- all 3 machines are up, with the keyspace and column family already
created
- the client repeatedly does the following in a loop: get a mutator,
write an update to the same row/column at consistency level ONE, then
get a selector and query the column value to print it out
- while the loop is running, 10.1.4.221 goes down
- exceptions occur in the client and the client quits:

Determining which node is the least loaded
Node '10.1.4.221' has 0 active connections
Node '10.1.4.224' has 0 active connections
Node '10.1.4.223' has 0 active connections
Chose node '10.1.4.221'...
Attempting to borrow free connection for node '10.1.4.221'
Borrowing connection 'Connection[Keyspace1][10.1.4.221:9160][17977639]'
Operation failed as result of network exception. Connection is being marked as corrupt (and will probably be be destroyed). See cause for details...
org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
    at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:906)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:890)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:46)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:42)
    at org.scale7.cassandra.pelops.Operand.tryOperation(Operand.java:56)
    at org.scale7.cassandra.pelops.Mutator.execute(Mutator.java:51)
    at ...
Returned connection 'Connection[Keyspace1][10.1.4.221:9160][17977639]' has been closed or is marked as corrupt
Destroying connection 'Connection[Keyspace1][10.1.4.221:9160][17977639]'
Determining which node is the least loaded
Node '10.1.4.221' has 0 active connections
Node '10.1.4.224' has 0 active connections
Node '10.1.4.223' has 0 active connections
Attempting to honor the notNodeHint '10.1.4.221', skipping node
Chose node '10.1.4.224'...
Attempting to borrow free connection for node '10.1.4.224'
Borrowing connection 'Connection[Keyspace1][10.1.4.224:9160][32477527]'
Operation failed as result of network exception. Connection is being marked as corrupt (and will probably be be destroyed). See cause for details...
TimedOutException()
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:16493)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:916)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:890)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:46)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:42)
    at org.scale7.cassandra.pelops.Operand.tryOperation(Operand.java:56)
    at org.scale7.cassandra.pelops.Mutator.execute(Mutator.java:51)
    at ...
Returned connection 'Connection[Keyspace1][10.1.4.224:9160][32477527]' has been closed or is marked as corrupt
Destroying connection 'Connection[Keyspace1][10.1.4.224:9160][32477527]'
Determining which node is the least loaded
Node '10.1.4.221' has 0 active connections
Node '10.1.4.224' has 0 active connections
Node '10.1.4.223' has 0 active connections
Chose node '10.1.4.221'...
Attempting to borrow free connection for node '10.1.4.221'
Borrowing connection 'Connection[Keyspace1][10.1.4.221:9160][16805237]'
Operation failed as result of network exception. Connection is being marked as corrupt (and will probably be be destroyed). See cause for details...
org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
    at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:906)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:890)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:46)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:42)
    at org.scale7.cassandra.pelops.Operand.tryOperation(Operand.java:56)
    at org.scale7.cassandra.pelops.Mutator.execute(Mutator.java:51)
    at ...
Returned connection 'Connection[Keyspace1][10.1.4.221:9160][16805237]' has been closed or is marked as corrupt
Destroying connection 'Connection[Keyspace1][10.1.4.221:9160][16805237]'

situation 2)
- similar to situation 1, but I shut down 10.1.4.223 and 10.1.4.224
alternately; this does not affect the client
- the client just keeps "choosing" 10.1.4.221:

Chose node '10.1.4.221'...
Attempting to borrow free connection for node '10.1.4.221'
Borrowing connection 'Connection[Keyspace1][10.1.4.221:9160][17977639]'
Returning connection 'Connection[Keyspace1][10.1.4.221:9160][17977639]'
Determining which node is the least loaded
Node '10.1.4.221' has 0 active connections
Node '10.1.4.224' has 0 active connections
Node '10.1.4.223' has 0 active connections
Chose node '10.1.4.221'...
Attempting to borrow free connection for node '10.1.4.221'
Borrowing connection 'Connection[Keyspace1][10.1.4.221:9160][17977639]'
Returning connection 'Connection[Keyspace1][10.1.4.221:9160][17977639]'
Determining which node is the least loaded
Node '10.1.4.221' has 0 active connections
Node '10.1.4.224' has 0 active connections
Node '10.1.4.223' has 0 active connections
Chose node '10.1.4.221'...
Attempting to borrow free connection for node '10.1.4.221'
Borrowing connection 'Connection[Keyspace1][10.1.4.221:9160][17977639]'
Returning connection 'Connection[Keyspace1][10.1.4.221:9160][17977639]'
Determining which node is the least loaded
Node '10.1.4.221' has 0 active connections
Node '10.1.4.224' has 0 active connections
Node '10.1.4.223' has 0 active connections
Chose node '10.1.4.221'...

situation 3)
- before starting the client, I shut down 10.1.4.221
- I then start the client, which produces the following exception stack
traces (with custom and JUnit frames removed) and quits:

Dynamic node discovery is enabled, detecting initial list of nodes from [10.1.4.221, 10.1.4.223, 10.1.4.224]
Failed to open transport. See cause for details...
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
    at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
    at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.scale7.cassandra.pelops.Connection.open(Connection.java:70)
    at org.scale7.cassandra.pelops.ManagerOperand.openClient(ManagerOperand.java:54)
    at org.scale7.cassandra.pelops.ManagerOperand.tryOperation(ManagerOperand.java:94)
    at org.scale7.cassandra.pelops.KeyspaceManager.getKeyspaceNames(KeyspaceManager.java:35)
    at org.scale7.cassandra.pelops.Cluster.refreshInternal(Cluster.java:145)
    at org.scale7.cassandra.pelops.Cluster.refresh(Cluster.java:118)
    at org.scale7.cassandra.pelops.Cluster.refresh(Cluster.java:136)
    at org.scale7.cassandra.pelops.Cluster.<init>(Cluster.java:56)
    at org.scale7.cassandra.pelops.Cluster.<init>(Cluster.java:42)
    at org.scale7.cassandra.pelops.Cluster.<init>(Cluster.java:34)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:525)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
    ... 31 more
Failed to discover nodes dynamically, using existing list of nodes. See cause for details...
org.apache.thrift.transport.TTransportException: Cannot write to null outputStream
    at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
    at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
    at org.apache.cassandra.thrift.Cassandra$Client.send_describe_keyspaces(Cassandra.java:1019)
    at org.apache.cassandra.thrift.Cassandra$Client.describe_keyspaces(Cassandra.java:1009)
    at org.scale7.cassandra.pelops.KeyspaceManager$1.execute(KeyspaceManager.java:32)
    at org.scale7.cassandra.pelops.KeyspaceManager$1.execute(KeyspaceManager.java:29)
    at org.scale7.cassandra.pelops.ManagerOperand.tryOperation(ManagerOperand.java:97)
    at org.scale7.cassandra.pelops.KeyspaceManager.getKeyspaceNames(KeyspaceManager.java:35)
    at org.scale7.cassandra.pelops.Cluster.refreshInternal(Cluster.java:145)
    at org.scale7.cassandra.pelops.Cluster.refresh(Cluster.java:118)
    at org.scale7.cassandra.pelops.Cluster.refresh(Cluster.java:136)
    at org.scale7.cassandra.pelops.Cluster.<init>(Cluster.java:56)
    at org.scale7.cassandra.pelops.Cluster.<init>(Cluster.java:42)
    at org.scale7.cassandra.pelops.Cluster.<init>(Cluster.java:34)
Failed to open transport. See cause for details...
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
    at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
    at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.scale7.cassandra.pelops.Connection.open(Connection.java:70)
    at org.scale7.cassandra.pelops.ManagerOperand.openClient(ManagerOperand.java:54)
    at org.scale7.cassandra.pelops.ManagerOperand.tryOperation(ManagerOperand.java:94)
    at org.scale7.cassandra.pelops.KeyspaceManager.getKeyspaceSchema(KeyspaceManager.java:55)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:525)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
    ... 32 more
Failed to open transport. See cause for details...
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
    at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
    at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.scale7.cassandra.pelops.Connection.open(Connection.java:70)
    at org.scale7.cassandra.pelops.ManagerOperand.openClient(ManagerOperand.java:54)
    at org.scale7.cassandra.pelops.ManagerOperand.tryOperation(ManagerOperand.java:94)
    at org.scale7.cassandra.pelops.KeyspaceManager.getKeyspaceSchema(KeyspaceManager.java:55)

situation 4)
- before starting the client, I shut down either 10.1.4.223 or
10.1.4.224
- the client shows exceptions connecting to the node that is down, but
it eventually starts, runs, and connects to 10.1.4.221 and whichever of
223 or 224 is still up
- the exceptions during startup:

Adding node '10.1.4.223' to the pool...
MBean 'com.scale7.cassandra.pelops.pool:type=CommonsBackedPoolPooledNode-Keyspace1' is already registered, removing...
Registering MBean 'com.scale7.cassandra.pelops.pool:type=CommonsBackedPoolPooledNode-Keyspace1'...
Made new connection 'Connection[Keyspace1][10.1.4.223:9160][2773808]'
Failed to open transport. See cause for details...
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
    at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
    at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.scale7.cassandra.pelops.Connection.open(Connection.java:70)
    at org.scale7.cassandra.pelops.pool.CommonsBackedPool$ConnectionFactory.makeObject(CommonsBackedPool.java:785)
    at org.apache.commons.pool.impl.GenericKeyedObjectPool.addObject(GenericKeyedObjectPool.java:1685)
    at org.apache.commons.pool.impl.GenericKeyedObjectPool.ensureMinIdle(GenericKeyedObjectPool.java:2058)
    at org.apache.commons.pool.impl.GenericKeyedObjectPool.preparePool(GenericKeyedObjectPool.java:1722)
    at org.scale7.cassandra.pelops.pool.CommonsBackedPool.addNode(CommonsBackedPool.java:373)
    at org.scale7.cassandra.pelops.pool.CommonsBackedPool.<init>(CommonsBackedPool.java:104)
    at org.scale7.cassandra.pelops.pool.CommonsBackedPool.<init>(CommonsBackedPool.java:64)
    at org.scale7.cassandra.pelops.pool.CommonsBackedPool.<init>(CommonsBackedPool.java:52)
    at org.scale7.cassandra.pelops.Pelops.addPool(Pelops.java:24)

to sum up:
- if the cluster does not have any keyspace, we can't use "auto
discover = true" (the dynamicNodeDiscovery flag)
- in fact, setting "auto discover = true" does not help with
automatic failover
- it seems that Pelops always uses the first IP in the node list/
cluster?
- it seems that Pelops always fails when the seed node is down, in the
call to:
org.scale7.cassandra.pelops.KeyspaceManager.getKeyspaceNames
- how can we use Pelops to get transparent, automatic failover as long
as at least one (non-seed) machine in the cluster is up?
- I am new to Cassandra and Pelops, so if there is anything I am
doing wrong, please feel free to let me know; I would be happy
to provide more details

thanks for your time

Dan Washusen

Feb 7, 2011, 11:12:38 PM

to sca...@googlegroups.com

Hi Wing,

I've added comments inline.

Cheers,
Dan

On 8 February 2011 14:03, wing <ywt...@gmail.com> wrote:

> i have some problems when seed node failure happen and i originally
> submit an email here for help:
>
> http://www.mail-archive.com/us...@cassandra.apache.org/msg09731.html
>
> <snip>

In this scenario Pelops is trying three times:

1. Against 10.1.4.221, which is down.
2. Against 10.1.4.224, which then throws a TimedOutException (see TimedOutException on http://wiki.apache.org/cassandra/API).
3. Against 10.1.4.221 again, because LeastLoadedNodeSelectionStrategy just chooses the first least loaded node.

A few interesting points:

- Why is node 10.1.4.224 throwing a TimedOutException?
- Pelops is just asking the pool to give it a connection that doesn't include the last node that failed. In your scenario two nodes have failed and the first node hasn't been taken out of the picture yet. Pelops could be a little smarter here and use a list of nodes that it should try to avoid; that way the pool should return 10.1.4.223 for the third attempt...
- Another improvement to LeastLoadedNodeSelectionStrategy could be that if all nodes are equally loaded (in your case all zero) it picks one at random (or in a round-robin style) (https://github.com/s7/scale7-pelops/issues/issue/30).
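
The two suggestions above (skip every node that has already failed for this operation, and break ties between equally loaded nodes at random) can be sketched roughly as follows. This is a minimal, self-contained illustration and not Pelops code: the class name LeastLoadedChooser, its choose method, and the map of active connection counts are all hypothetical stand-ins for whatever the pool tracks internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

/** Hypothetical stand-in for a pool's node selection; not part of Pelops. */
class LeastLoadedChooser {
    private final Random random;

    LeastLoadedChooser(Random random) {
        this.random = random;
    }

    /**
     * Returns a least-loaded node that is not in avoidNodes, picking at
     * random among equally loaded candidates. Returns null when every
     * node is being avoided.
     */
    String choose(Map<String, Integer> activeConnections, Set<String> avoidNodes) {
        int lowest = Integer.MAX_VALUE;
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : activeConnections.entrySet()) {
            if (avoidNodes.contains(entry.getKey())) {
                continue; // this node already failed for the current operation
            }
            int load = entry.getValue();
            if (load < lowest) {
                lowest = load;
                candidates.clear(); // found a strictly less loaded node
            }
            if (load == lowest) {
                candidates.add(entry.getKey());
            }
        }
        if (candidates.isEmpty()) {
            return null;
        }
        return candidates.get(random.nextInt(candidates.size()));
    }
}
```

With 10.1.4.221 and 10.1.4.224 both in the avoid set, the only candidate left for the third attempt is 10.1.4.223, which is the outcome described above.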

> situation 2)
> - similar to situation 1, i close 10.1.4.223 and 10.1.4.224
> alternatively, but it does not affect the client
> - the client just keep "choosing" 10.1.4.221:

As above, the LeastLoadedNodeSelectionStrategy could be improved to pick a node at random.

This looks like it's caused by the "SAFE_NODE_CHANGE_DELAY" in ManagerOperand that was added a while back. I've created https://github.com/s7/scale7-pelops/issues/issue/29 to address this...

Dan Washusen

Feb 8, 2011, 6:44:58 PM

to sca...@googlegroups.com

Hi Wing,

I've just published a new version of the 1.0-RC1-0.7.0-SNAPSHOT Pelops jar (scale7-pelops-1.0-RC1-0.7.0-20110208.234304-5.jar) that should address the issues you've mentioned below. I've done some testing here but it would be very much appreciated if you could repeat your tests and report back.

Cheers,

Dan

wing

Feb 9, 2011, 9:58:53 PM

to Scale 7 - Libraries and systems for scalable computing

My Maven build downloaded the latest Pelops:
scale7-pelops-1.0-RC1-0.7.0-20110210.021053-6.jar

I can see that Pelops has been changed to retry against 3 different
machines,

but another problem now causes an exception to be thrown:

whenever any one machine is down "forever", Pelops retries and, after
a moment, complains that "max-retry" has been reached and just throws
the exception

so it will:
- prevent the client from starting if any one of the machines is down
before the client is started
- abort the running client if any one of the machines goes down after
the client is started
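
For comparison, a retry loop that tolerates one permanently failed machine would be expected to behave roughly like the sketch below: each failed node is skipped for the rest of the operation, and the operation fails only when the attempts run out or no untried node remains. The AvoidingRetry class, its run method, and the maxRetries parameter are hypothetical names for illustration; this is not how Pelops implements its retries.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical retry helper; an illustration only, not part of Pelops. */
class AvoidingRetry {
    /**
     * Runs op against the first node that has not failed yet, making at
     * most maxRetries attempts. Nodes that throw are added to an avoid
     * set so the same operation never tries them twice.
     */
    static <T> T run(List<String> nodes, int maxRetries, Function<String, T> op) {
        Set<String> avoid = new HashSet<>();
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            String node = null;
            for (String candidate : nodes) {
                if (!avoid.contains(candidate)) {
                    node = candidate;
                    break;
                }
            }
            if (node == null) {
                break; // every node has already failed for this operation
            }
            try {
                return op.apply(node);
            } catch (RuntimeException e) {
                avoid.add(node);
                last = e;
            }
        }
        throw new IllegalStateException("no node could serve the operation", last);
    }
}
```

With three nodes and three attempts, two down machines still leave the third attempt for the one node that is up.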

some stacktraces:
Node '10.1.4.224' has 0 active connections, 27 borrowed connections and 10 corrupted connections
Node '10.1.4.223' has 0 active connections, 27 borrowed connections and 0 corrupted connections
Chose node '10.1.4.223'...
Attempting to borrow free connection for node '10.1.4.223'
Borrowing connection 'Connection[Keyspace1][10.1.4.223:9160][16250176]'
Returning connection 'Connection[Keyspace1][10.1.4.223:9160][16250176]'
Determining which node is the least loaded
Node '10.1.4.221' has 0 active connections, 28 borrowed connections and 0 corrupted connections
Node '10.1.4.224' has 0 active connections, 27 borrowed connections and 10 corrupted connections
Node '10.1.4.223' has 0 active connections, 28 borrowed connections and 0 corrupted connections
Chose node '10.1.4.224'...
Attempting to borrow free connection for node '10.1.4.224'
Made new connection 'Connection[Keyspace1][10.1.4.224:9160][8789796]'
An exception was thrown while attempting to create a connection to '10.1.4.224', trying another node...
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
    at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
    at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.scale7.cassandra.pelops.Connection.open(Connection.java:69)
    at org.scale7.cassandra.pelops.pool.CommonsBackedPool$ConnectionFactory.makeObject(CommonsBackedPool.java:784)
    at org.apache.commons.pool.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:1190)
    at org.scale7.cassandra.pelops.pool.CommonsBackedPool.getConnectionExcept(CommonsBackedPool.java:277)
    at org.scale7.cassandra.pelops.Operand.tryOperation(Operand.java:49)
    at org.scale7.cassandra.pelops.Selector.getColumnFromRow(Selector.java:305)
    at org.scale7.cassandra.pelops.Selector.getColumnFromRow(Selector.java:282)
    at org.scale7.cassandra.pelops.Selector.getColumnFromRow(Selector.java:268)

=======================================================================

another stack trace:

Determining which node is the least loaded
Node '10.1.4.221' has 0 active connections, 7 borrowed connections and 0 corrupted connections
Node '10.1.4.224' has 0 active connections, 7 borrowed connections and 0 corrupted connections
Node '10.1.4.223' has 0 active connections, 6 borrowed connections and 0 corrupted connections
Chose node '10.1.4.223'...
Attempting to borrow free connection for node '10.1.4.223'
Borrowing connection 'Connection[Keyspace1][10.1.4.223:9160][17192413]'
Operation failed as result of network exception. Connection is being marked as corrupt (and will probably be be destroyed). See cause for details...
TimedOutException()
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:16493)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:916)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:890)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:46)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:42)
    at org.scale7.cassandra.pelops.Operand.tryOperation(Operand.java:58)
    at org.scale7.cassandra.pelops.Mutator.execute(Mutator.java:51)
    at com.yesasia.cassandra.pelops.PelopsCassandraKeyspaceSupport.writeCassandraColumns(PelopsCassandraKeyspaceSupport.java:83)
    at com.yesasia.cassandra.pelops.Try1Test.test1(Try1Test.java:41)
    at com.yesasia.cassandra.pelops.Try1Test.test1_repeat1000Times(Try1Test.java:59)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.internal.runners.TestMethodRunner.executeMethodBody(TestMethodRunner.java:99)
    at org.junit.internal.runners.TestMethodRunner.runUnprotected(TestMethodRunner.java:81)
    at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
    at org.junit.internal.runners.TestMethodRunner.runMethod(TestMethodRunner.java:75)
    at org.junit.internal.runners.TestMethodRunner.run(TestMethodRunner.java:45)
    at org.junit.internal.runners.TestClassMethodsRunner.invokeTestMethod(TestClassMethodsRunner.java:71)
    at org.junit.internal.runners.TestClassMethodsRunner.run(TestClassMethodsRunner.java:35)
    at org.junit.internal.runners.TestClassRunner$1.runUnprotected(TestClassRunner.java:42)
    at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
    at org.junit.internal.runners.TestClassRunner.run(TestClassRunner.java:52)
    at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
    at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
    at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:127)
    at org.apache.maven.surefire.Surefire.run(Surefire.java:177)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:345)
    at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1009)
Returned connection 'Connection[Keyspace1][10.1.4.223:9160][17192413]' has been closed or is marked as corrupt
Destroying connection 'Connection[Keyspace1][10.1.4.223:9160][17192413]'

Determining which node is the least loaded
Node '10.1.4.221' has 0 active connections, 7 borrowed connections and 0 corrupted connections
Node '10.1.4.224' has 0 active connections, 7 borrowed connections and 0 corrupted connections
Node '10.1.4.223' has 0 active connections, 7 borrowed connections and 1 corrupted connections
Chose node '10.1.4.221'...
Attempting to borrow free connection for node '10.1.4.221'
Borrowing connection 'Connection[Keyspace1][10.1.4.221:9160][22971385]'
Operation failed as result of network exception. Connection is being marked as corrupt (and will probably be be destroyed). See cause for details...
org.apache.thrift.transport.TTransportException
    at org.scale7.cassandra.pelops.Operand.tryOperation(Operand.java:58)
    at org.scale7.cassandra.pelops.Mutator.execute(Mutator.java:51)
    at com.yesasia.cassandra.pelops.PelopsCassandraKeyspaceSupport.writeCassandraColumns(PelopsCassandraKeyspaceSupport.java:83)
    at com.yesasia.cassandra.pelops.Try1Test.test1(Try1Test.java:41)
    at com.yesasia.cassandra.pelops.Try1Test.test1_repeat1000Times(Try1Test.java:59)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.internal.runners.TestMethodRunner.executeMethodBody(TestMethodRunner.java:99)
    at org.junit.internal.runners.TestMethodRunner.runUnprotected(TestMethodRunner.java:81)
    at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
    at org.junit.internal.runners.TestMethodRunner.runMethod(TestMethodRunner.java:75)
    at org.junit.internal.runners.TestMethodRunner.run(TestMethodRunner.java:45)
    at org.junit.internal.runners.TestClassMethodsRunner.invokeTestMethod(TestClassMethodsRunner.java:71)
    at org.junit.internal.runners.TestClassMethodsRunner.run(TestClassMethodsRunner.java:35)
    at org.junit.internal.runners.TestClassRunner$1.runUnprotected(TestClassRunner.java:42)
    at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
    at org.junit.internal.runners.TestClassRunner.run(TestClassRunner.java:52)
    at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
    at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
    at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:127)
    at org.apache.maven.surefire.Surefire.run(Surefire.java:177)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:345)
    at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1009)
Returned connection 'Connection[Keyspace1][10.1.4.221:9160][22971385]' has been closed or is marked as corrupt
Destroying connection 'Connection[Keyspace1][10.1.4.221:9160][22971385]'

Determining which node is the least loaded
Node '10.1.4.221' has 0 active connections, 8 borrowed connections and 1 corrupted connections
Node '10.1.4.224' has 0 active connections, 7 borrowed connections and 0 corrupted connections
Node '10.1.4.223' has 0 active connections, 7 borrowed connections and 1 corrupted connections
Chose node '10.1.4.224'...
Attempting to borrow free connection for node '10.1.4.224'
Borrowing connection 'Connection[Keyspace1][10.1.4.224:9160][18820833]'
Operation failed as result of network exception. Connection is being marked as corrupt (and will probably be be destroyed). See cause for details...
UnavailableException()
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:16485)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:916)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:890)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:46)
    at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:42)
    at org.scale7.cassandra.pelops.Operand.tryOperation(Operand.java:58)
    at org.scale7.cassandra.pelops.Mutator.execute(Mutator.java:51)
Returned connection 'Connection[Keyspace1][10.1.4.224:9160][18820833]' has been closed or is marked as corrupt
Destroying connection 'Connection[Keyspace1][10.1.4.224:9160][18820833]'

Dan Washusen

Feb 10, 2011, 1:06:26 AM

to sca...@googlegroups.com

I've committed a series of fixes that should address this issue:

- When an error occurs while trying to obtain a connection to a node, increment its corrupt counter.
- When a node is detected as down while attempting to open a connection to it (java.net.ConnectException), suspend the node for 10 seconds (configurable).

In both of those cases the node is added to the "avoidNodes" set so the same operation should not attempt to use those nodes again.

Pelops also provides a hook to suspend nodes based on whatever criteria you feel works (see org.scale7.cassandra.pelops.pool.CommonsBackedPool.INodeSuspensionStrategy). Pelops will invoke the configured INodeSuspensionStrategy each time the maintenance tasks run (every minute by default). Unfortunately there isn't a default implementation yet. Feel free to contribute one if you come up with something good! At a minimum we really need an impl of INodeSuspensionStrategy that could be invoked via JMX. This would allow an administrator to take a node out of the picture for scheduled downtime or whatever...
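
The 10-second suspension described above could look something like this sketch. It assumes a hypothetical NodeSuspensions helper that takes the current time as a parameter so the behaviour is easy to test; it is not the Pelops implementation and does not use the real INodeSuspensionStrategy signature, which is not shown in this thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical suspension tracker; an illustration only, not part of Pelops. */
class NodeSuspensions {
    private final long suspensionMillis;
    private final Map<String, Long> suspendedUntil = new HashMap<>();

    NodeSuspensions(long suspensionMillis) {
        this.suspensionMillis = suspensionMillis;
    }

    /** Call when opening a connection to the node fails with a ConnectException. */
    void suspend(String node, long nowMillis) {
        suspendedUntil.put(node, nowMillis + suspensionMillis);
    }

    /** True while the node should be skipped; the suspension lapses on its own. */
    boolean isSuspended(String node, long nowMillis) {
        Long until = suspendedUntil.get(node);
        if (until == null) {
            return false;
        }
        if (nowMillis >= until) {
            suspendedUntil.remove(node); // suspension expired, node is eligible again
            return false;
        }
        return true;
    }
}
```

A caller would pass System.currentTimeMillis() for nowMillis and consult isSuspended before handing a node to the selection strategy.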

wing

Feb 10, 2011, 5:13:08 AM

to Scale 7 - Libraries and systems for scalable computing

I have created my own implementation of INodeSuspensionStrategy and
can successfully initialize the pool with it.

I can verify this from the following output:

"Initialising pool node suspension strategy: XXXXX"

but Pelops never calls my INodeSuspensionStrategy when:
1) one of the machines in the cluster is down before the client is started
2) one of the machines in the cluster goes down after the client is
started and working

Also, when implementing INodeSuspensionStrategy, the evaluate method
takes a PooledNode parameter, which has "default" (package-private)
access, so I cannot implement this interface unless I put my class in
the same package as the interface. Would it be possible to make
PooledNode public?

thanks

Dan Washusen

Feb 10, 2011, 5:48:55 PM

to sca...@googlegroups.com

Could you please clarify what you mean by "but pelops never call my INodeSuspensionStrategy when"? Previously the configured INodeSuspensionStrategy was called each time the maintenance tasks ran (every minute by default). Are you saying that even when the maintenance tasks ran, your INodeSuspensionStrategy implementation was never called?

I've also made some more improvements, including a call to the suspension code during init...

wing

Feb 15, 2011, 6:41:07 AM

to Scale 7 - Libraries and systems for scalable computing

sorry for coming back late. i have tested again using
scale7-pelops-1.0-RC1-0.7.0-20110214.210743-9.jar

now i can see that my own INodeSuspensionStrategy is called

my latest problem is:

with the same setup (10.1.4.221, 10.1.4.223, 10.1.4.224)

if 10.1.4.221 is down before or after the client is started, the
connections to 10.1.4.223 and 10.1.4.224 are marked as "corrupted"
with these errors:

Attempting to borrow free connection for node '10.1.4.224'
Borrowing connection 'Connection[Keyspace1][10.1.4.224:9160][11502424]'
Operation failed as result of network exception. Connection is being marked as corrupt (and will probably be be destroyed). See cause for details...
UnavailableException()
at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:16485)
at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:916)
at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:890)

Chose node '10.1.4.223'...
Attempting to borrow free connection for node '10.1.4.223'
Borrowing connection 'Connection[Keyspace1][10.1.4.223:9160][7804298]'
Operation failed as result of network exception. Connection is being marked as corrupt (and will probably be be destroyed). See cause for details...
UnavailableException()

Attempting to honor the avoidNodesHint '[10.1.4.223, 10.1.4.224]', skipping node '10.1.4.224'
Attempting to honor the avoidNodesHint '[10.1.4.223, 10.1.4.224]', skipping node '10.1.4.223'

whether or not my own INodeSuspensionStrategy has suspended
10.1.4.221, the above corruption still happens

in fact, i can see that 10.1.4.221 is reported as

Excluding node '10.1.4.221' because it's either been removed from the
pool or has been suspended

but if 10.1.4.221 is up and either 10.1.4.223 or 10.1.4.224 is down,
the client is not affected and keeps running

could this be a cassandra problem rather than a pelops one?


Dan Washusen

Feb 15, 2011, 5:07:13 PM

to sca...@googlegroups.com

According to http://wiki.apache.org/cassandra/API, UnavailableException is thrown when "Not all the replicas required could be created and/or read".

You mentioned you were using a consistency level of ONE to perform reads and writes; what replication factor are you using?

--
Dan Washusen


wing

Feb 15, 2011, 8:41:49 PM

to Scale 7 - Libraries and systems for scalable computing

replication_factor = 1

strategy_class = org.apache.cassandra.locator.SimpleStrategy

i also downloaded the latest cassandra 0.7.1, changed to a different
seed and started from scratch, but similar problems still happen


Dan Washusen

Feb 15, 2011, 8:54:16 PM

to sca...@googlegroups.com

With replication_factor = 1 each bit of data is only stored on one node; if you take down a node then its data is unavailable (thus UnavailableException is thrown).

If you want a three node cluster to tolerate a missing node then you'll need to increase the replication factor to at least two (enough at consistency level ONE); with three, every node holds a full copy of the data...

The 'corrupted' flag just indicates that an error occurred while attempting to perform an operation against that node. Pelops could probably avoid closing connections when it detects an UnavailableException, but generally your cluster is in a bad way if you see UnavailableExceptions being thrown...
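
To make the replication factor arithmetic concrete, here is a minimal sketch. It is not Cassandra code: ReplicaAvailabilitySketch and allRangesAvailable are hypothetical names, and the model assumes SimpleStrategy-style placement where the replicas of a key range live on its primary node and the next (rf - 1) nodes around the ring. The parameter required is the number of live replicas the consistency level needs (1 for ONE).

```java
import java.util.List;
import java.util.Set;

/** Hypothetical model of replica availability; not part of Cassandra or Pelops. */
class ReplicaAvailabilitySketch {

    /**
     * Returns true if every key range still has at least `required` live replicas.
     * Assumes each range is stored on its primary node plus the next (rf - 1)
     * nodes on the ring, which is how SimpleStrategy places replicas.
     */
    static boolean allRangesAvailable(List<String> ring, Set<String> down, int rf, int required) {
        if (rf < 1 || rf > ring.size()) {
            throw new IllegalArgumentException("rf must be between 1 and the number of nodes");
        }
        for (int primary = 0; primary < ring.size(); primary++) {
            int live = 0;
            for (int i = 0; i < rf; i++) {
                String replica = ring.get((primary + i) % ring.size());
                if (!down.contains(replica)) {
                    live++;
                }
            }
            if (live < required) {
                return false; // requests for this range would see UnavailableException
            }
        }
        return true;
    }
}
```

With the three nodes from this thread and 10.1.4.221 down, allRangesAvailable(ring, down, 1, 1) is false, because the range owned by 10.1.4.221 has no live replica, while the same call with rf = 2 is true. Note that the check covers all ranges; a real client only fails on requests for keys whose replicas are down, which is why a loop that keeps writing the same row can fail every time.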

--
Dan Washusen

TSANG Yiu Wing

Feb 15, 2011, 10:44:03 PM

to sca...@googlegroups.com

thanks for your patient answers

i missed the replication factor as the cause: i had checked that the
data was replicated to all the machines in the cluster, so i assumed
that the data would be available from every node

i have increased the replication factor to 2 and verified that as long
as at least 2 out of 3 nodes are up in the cluster, the client is ok

Dan Washusen

Feb 17, 2011, 1:02:49 AM

to sca...@googlegroups.com

My pleasure! Your help in testing these scenarios is very much appreciated...

--
Dan Washusen
