We are attempting to migrate ~300GB of data and ~100GB of indexes to another replica set. We previously had a machine on HDDs that could sync without errors but was perpetually 24 hours behind. We upgraded to a server with SSDs and attempted an initial sync twice; both times it shut down unexpectedly.
Fri Sep 28 04:06:04 [initandlisten] connection accepted from :37317 #1748 (18 connections now open)
Fri Sep 28 04:06:06 [rsSync] 40000000/374169652 10%
Fri Sep 28 04:06:18 [rsSync] 42000000/374169652 11%
***** SERVER RESTARTED *****
Fri Sep 28 04:09:35 [initandlisten] MongoDB starting : pid=898 port=27018 dbpath=/var/lib/mongodb 64-bit
Fri Sep 28 04:09:35 [initandlisten] db version v2.2.0, pdfile version 4.5
Fri Sep 28 04:09:35 [initandlisten] git version: f5e83eae9cfbec7fb7a071321928f00d1b0c5207
The cause of the shutdown is still unknown, but on its own it would not be a serious issue. The real problem is that after the restart the machine came up as a SECONDARY, eligible to become primary, with no indexes. We weren't aware of this at first, so we stepped down our other machine, and the new machine having no indexes (except _id) caused a lot of trouble.
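A check along the following lines, run before stepping down the primary, would flag a member that is missing indexes. This is only a sketch: the function name `missing_indexes`, the namespaces, and the index names are hypothetical, and it assumes the index names per collection have already been collected from each member.

```python
from typing import Dict, Set

# Maps a namespace ("db.collection") to the set of index names on that member.
IndexMap = Dict[str, Set[str]]


def missing_indexes(reference: IndexMap, candidate: IndexMap) -> IndexMap:
    """Return, per namespace, the index names present on the reference
    member but absent on the candidate member."""
    gaps: IndexMap = {}
    for namespace, ref_names in reference.items():
        absent = ref_names - candidate.get(namespace, set())
        if absent:
            gaps[namespace] = absent
    return gaps


# Hypothetical data: the new member has only the default _id index.
primary = {"app.users": {"_id_", "email_1"}, "app.events": {"_id_", "ts_-1"}}
new_member = {"app.users": {"_id_"}, "app.events": {"_id_"}}

gaps = missing_indexes(primary, new_member)
for namespace, names in sorted(gaps.items()):
    print(namespace, "is missing", sorted(names))
```

One way to build the two maps is PyMongo's `Collection.index_information()`, whose keys are index names, called against each member in turn. An empty result from `missing_indexes` would suggest the new member is safe to promote as far as indexes are concerned.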
This may be unrelated, but we also saw a lot of assertion failures in our logs:
Fri Sep 28 14:25:10 [rsSync] local.oplog.rs warning assertion failure _intents.size() < 2000000 src/mongo/db/dur_commitjob.h 101 0xade6e1 0x802c5a 0x78c4a0 0x78c4ff 0x78c7d2 0x78c8ed 0x78c95b 0xa07c1a 0x626166 0x62de4b 0x73954c 0xb6708a 0x64b5eb 0x65345e 0x6538f8 0x65394a 0x653d58 0x7c3659 0x7fe12f2a39ca 0x7fe12e64acdd
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xade6e1]
/usr/bin/mongod(_ZN5mongo9wassertedEPKcS1_j+0x11a) [0x802c5a]
/usr/bin/mongod(_ZN5mongo3dur9CommitJob4noteEPvi+0x280) [0x78c4a0]
/usr/bin/mongod(_ZN5mongo3dur18ThreadLocalIntents8_unspoolEv+0x4f) [0x78c4ff]
/usr/bin/mongod(_ZN5mongo3dur18ThreadLocalIntents7unspoolEv+0x52) [0x78c7d2]
/usr/bin/mongod(_ZN5mongo3dur18ThreadLocalIntents4pushERKNS0_11WriteIntentE+0x6d) [0x78c8ed]
/usr/bin/mongod(_ZN5mongo3dur11DurableImpl18declareWriteIntentEPvj+0x6b) [0x78c95b]
/usr/bin/mongod(_ZN5mongo3dur11DurableImpl10writingPtrEPvj+0xa) [0xa07c1a]
/usr/bin/mongod(_ZN5mongo16NamespaceDetails13addDeletedRecEPNS_13DeletedRecordENS_7DiskLocE+0x1a6) [0x626166]
/usr/bin/mongod(_ZN5mongo16NamespaceDetails5allocEPKciRNS_7DiskLocE+0x1eb) [0x62de4b]
/usr/bin/mongod(_ZN5mongo11DataFileMgr17fast_oplog_insertEPNS_16NamespaceDetailsEPKci+0x6c) [0x73954c]
/usr/bin/mongod(_ZN5mongo11_logOpObjRSERKNS_7BSONObjE+0x27a) [0xb6708a]
/usr/bin/mongod(_ZN5mongo7replset8SyncTail15applyOpsToOplogEPSt5dequeINS_7BSONObjESaIS3_EE+0x4b) [0x64b5eb]
/usr/bin/mongod(_ZN5mongo7replset8SyncTail16oplogApplicationEv+0x48e) [0x65345e]
/usr/bin/mongod(_ZN5mongo11ReplSetImpl11_syncThreadEv+0xb8) [0x6538f8]
/usr/bin/mongod(_ZN5mongo11ReplSetImpl10syncThreadEv+0x2a) [0x65394a]
/usr/bin/mongod(_ZN5mongo15startSyncThreadEv+0xa8) [0x653d58]
/usr/bin/mongod() [0x7c3659]
/lib/libpthread.so.0(+0x69ca) [0x7fe12f2a39ca]
/lib/libc.so.6(clone+0x6d) [0x7fe12e64acdd]
I searched Google and found reports that this assertion is harmless, but it ended up crashing MongoDB here:
Fri Sep 28 15:20:42 [rsSyncNotifier] dbexception in groupCommit causing immediate shutdown: 13524 out of memory AlignedBuilder
Fri Sep 28 15:20:42 gc1
Fri Sep 28 15:20:42 Got signal: 6 (Aborted).
Fri Sep 28 15:20:42 Backtrace:
0xade6e1 0x5582d9 0x7fe12e597af0 0x7fe12e597a75 0x7fe12e59b5c0 0xb503f7 0xa0a61a 0xa0a83a 0xa0818f 0xa0835c 0xad8d87 0xad9673 0xadb1b8 0x94ff58 0x95250c 0x9556cb 0x7c3659 0x7fe12f2a39ca 0x7fe12e64acdd
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xade6e1]
/usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x5582d9]
/lib/libc.so.6(+0x33af0) [0x7fe12e597af0]
/lib/libc.so.6(gsignal+0x35) [0x7fe12e597a75]
/lib/libc.so.6(abort+0x180) [0x7fe12e59b5c0]
/usr/bin/mongod(_ZN5mongo10mongoAbortEPKc+0x47) [0xb503f7]
/usr/bin/mongod() [0xa0a61a]
/usr/bin/mongod(_ZN5mongo3dur11DurableImpl9commitNowEv+0x1a) [0xa0a83a]
/usr/bin/mongod(_ZN5mongo3dur11DurableImpl16_aCommitIsNeededEv+0x3f) [0xa0818f]
/usr/bin/mongod(_ZN5mongo3dur11DurableImpl14commitIfNeededEb+0x4c) [0xa0835c]
/usr/bin/mongod(_ZN5mongo4Lock7DBWrite7lockTopERNS_9LockStateE+0x57) [0xad8d87]
/usr/bin/mongod(_ZN5mongo4Lock7DBWrite6lockDBERKSs+0xe3) [0xad9673]
/usr/bin/mongod(_ZN5mongo4Lock7DBWriteC1ERKNS_10StringDataE+0x58) [0xadb1b8]
/usr/bin/mongod(_ZN5mongo7replset14BackgroundSync9hasCursorEv+0x68) [0x94ff58]
/usr/bin/mongod(_ZN5mongo7replset14BackgroundSync9markOplogEv+0x2c) [0x95250c]
/usr/bin/mongod(_ZN5mongo7replset14BackgroundSync14notifierThreadEv+0x10b) [0x9556cb]
/usr/bin/mongod() [0x7c3659]
/lib/libpthread.so.0(+0x69ca) [0x7fe12f2a39ca]
/lib/libc.so.6(clone+0x6d) [0x7fe12e64acdd]
I am out of ideas on how to solve this problem. As stated above, this was our second initial sync attempt.