We just updated our secondary to 2.2, and while performing the initial sync (near the end, while replaying the oplog) we get an fassert:
Mon Sep 3 15:00:38 [repl writer worker 7] ERROR: writer worker caught exception: E11000 duplicate key error index: [collection].$_id_ dup key: { : ObjectId('5044fe2d293c0b02a893576b') } on: { ts: Timestamp 1346698527000|3315, h: -2145615260524114526, op: "u", ns: "[collection]", o2: { _id: ObjectId('5044fe2d293c0b02a893576b'), pc: { $size: 35 } }, o: { $push: { pc: { u: ObjectId('0000000000000000002283a9'), c: [ 12, 20, 43, 51, 73, 5, 24, 31, 48, 67, 1, 23, 34, 58, 69, 3, 30, 36, 53, 71, 11, 19, 39, 59, 68 ] } } } }
Mon Sep 3 15:00:38 [repl writer worker 7] Fatal Assertion 16360
0xade6e1 0x802e03 0x64f77d 0x77d3dd 0x7c3659 0x7f4a859c1efc 0x7f4a84d5359d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xade6e1]
/usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xa3) [0x802e03]
/usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12d) [0x64f77d]
/usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x26d) [0x77d3dd]
/usr/bin/mongod() [0x7c3659]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7efc) [0x7f4a859c1efc]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f4a84d5359d]
Mon Sep 3 15:00:38 [repl writer worker 7]***aborting after fassert() failure
Is there any way to work around this problem? Manually deleting the record (on the secondary) does nothing. I'm not sure how to proceed except resyncing and hoping it doesn't happen again.
This collection is updated frequently. I wonder if it is this case: https://jira.mongodb.org/browse/SERVER-6671
Experiencing the exact same problem. Can anyone give us a resolution?
0xade6e1 0x802e03 0x5e29f6 0x6538aa 0x65394a 0x653d58 0x7c3659 0x39fec07851 0x39fe8e811d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xade6e1]
/usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xa3) [0x802e03]
/usr/bin/mongod(_ZN5mongo11ReplSetImpl17syncDoInitialSyncEv+0x46) [0x5e29f6]
/usr/bin/mongod(_ZN5mongo11ReplSetImpl11_syncThreadEv+0x6a) [0x6538aa]
/usr/bin/mongod(_ZN5mongo11ReplSetImpl10syncThreadEv+0x2a) [0x65394a]
/usr/bin/mongod(_ZN5mongo15startSyncThreadEv+0xa8) [0x653d58]
/usr/bin/mongod() [0x7c3659]
/lib64/libpthread.so.0() [0x39fec07851]
/lib64/libc.so.6(clone+0x6d) [0x39fe8e811d]
Wed Sep 12 15:43:35 [rsSync]***aborting after fassert() failure
Bug fixes regularly get backported to earlier versions. I wouldn't be surprised if this one does as well.
-William
To do so, we have some options:
1- stop writes on the primary while the secondary is cloning databases (after the first clone stage, writes can proceed again)
2- restart all members of the replica set onto 2.2.1 after taking a backup
3- start a secondary using a copy of the datafiles from the primary, rather than allowing initial sync to clone the data
There is also one fix in 2.2.2 that makes this upgrade more likely to succeed, as it restores some of the 2.0.x behavior if a secondary is syncing from a primary of that version.
-Eric
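For option 2, it can help to see at a glance which members are still below 2.2.1 before restarting anything. Here is a minimal Python sketch of that check. It assumes the version strings have already been collected by hand (for example by running db.version() in the mongo shell on each member). The host names and the helper names parse_version and members_below are hypothetical, made up for illustration, and the 2.2.1 threshold is simply the version named in option 2.

```python
def parse_version(version):
    """Turn a version string such as '2.2.1' into a tuple of ints."""
    # Drop any suffix such as '-rc0' so that '2.2.1-rc0' parses like '2.2.1'.
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def members_below(versions, minimum="2.2.1"):
    """Return the sorted hosts whose version is older than `minimum`."""
    floor = parse_version(minimum)
    return sorted(host for host, v in versions.items() if parse_version(v) < floor)


# Hypothetical hosts and versions, for illustration only.
versions = {
    "db1.example.net:27017": "2.0.7",
    "db2.example.net:27017": "2.2.0",
    "db3.example.net:27017": "2.2.2",
}
print(members_below(versions))
```

Comparing tuples of integers rather than raw strings avoids mistakes such as "2.10.0" sorting before "2.2.1".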