Hi,
Setup: 9 servers, 3 shards with 3 replica set members each, running MongoDB 2.2.
One member (member1) of a replica set went down because its server crashed. That server also runs 1 of the 3 config servers. After that, another member (member2) of the same set went down, apparently because it couldn't reach the crashed server. This is the log from member2:
Tue Sep 25 13:34:56 [conn538] DBClientCursor::init call() failed
Tue Sep 25 13:34:56 [conn538] scoped connection to config1:27019,config2:27019,config3:27019 not being returned to the pool
Tue Sep 25 13:34:56 [conn538] warning: 13104 SyncClusterConnection::findOne prepare failed: 10276 DBClientBase::findN: transport error: config3:27019 ns: admin.$cmd query: { fsync: 1 } config3:27019:{}
Tue Sep 25 13:34:56 [conn538] warning: moveChunk commit outcome ongoing: { applyOps: [ { op: "u", b: false, ns: "config.chunks", o: { _id: "db.coll1-uuid_"38f9dbbe-86ec-444b-9e6a-483eab0f9bb2"_id_ObjectId('50444151e4b0c4a3a8c5cf74')", lastmod: Timest$
Tue Sep 25 13:34:57 [rsHealthPoll] couldn't connect to member1:27018: couldn't connect to server member1:27018
Tue Sep 25 13:34:59 [rsHealthPoll] couldn't connect to member1:27018: couldn't connect to server member1:27018
Tue Sep 25 13:35:01 [rsHealthPoll] couldn't connect to member1:27018: couldn't connect to server member1:27018
Tue Sep 25 13:35:01 [rsHealthPoll] couldn't connect to member1:27018: couldn't connect to server member1:27018
Tue Sep 25 13:35:01 [rsHealthPoll] couldn't connect to member1:27018: couldn't connect to server member1:27018
Tue Sep 25 13:35:03 [rsHealthPoll] couldn't connect to member1:27018: couldn't connect to server member1:27018
Tue Sep 25 13:35:05 [rsHealthPoll] couldn't connect to member1:27018: couldn't connect to server member1:27018
Tue Sep 25 13:35:06 [conn538] ERROR: moveChunk commit failed: version is at907|1||000000000000000000000000 instead of 908|1||50604b9fb961dd917fdc2316
Tue Sep 25 13:35:06 [conn538] ERROR: TERMINATING
Tue Sep 25 13:35:06 dbexit:
Tue Sep 25 13:35:06 [conn538] shutdown: going to close listening sockets...
Tue Sep 25 13:35:06 [conn538] closing listening socket: 6
Tue Sep 25 13:35:06 [conn538] closing listening socket: 7
Tue Sep 25 13:35:06 [conn538] shutdown: going to flush diaglog...
Tue Sep 25 13:35:06 [conn538] shutdown: going to close sockets...
Tue Sep 25 13:35:06 [conn538] shutdown: waiting for fs preallocator...
Tue Sep 25 13:35:06 [conn538] shutdown: lock for final commit...
Tue Sep 25 13:35:06 [conn538] shutdown: final commit...
Tue Sep 25 13:35:06 [conn1] end connection member2_IP:41925 (21 connections now open)
Tue Sep 25 13:35:06 [initandlisten] now exiting
Tue Sep 25 13:35:06 dbexit: ; exiting immediately
This is really weird, because the redundancy of 3 servers should provide some kind of failover, right? But if one member going down takes another member with it, then that's really ugly.
Is this a bug?
Thanks & regards
Daniel