Problems with upgrading from 2.2.2 to 2.4

Tecbot

Mar 20, 2013, 5:07:23 PM

to mongod...@googlegroups.com

Hello guys,

Today I added two new servers running version 2.4 to my replica set, which has three servers running 2.2.2.

The two new servers sync fresh data from the master, which runs 2.2.2.

The sync works fine until step 3 of 3, the oplog step, is reached.

After this, the log has many entries like this:

Wed Mar 20 21:30:53.216 [repl writer worker 1] unindex failed (key too big?) ...

At the end, it crashed with this stack trace:

Wed Mar 20 21:30:53.689 [repl writer worker 1] ERROR: writer worker caught exception: btree: key+recloc already in index on ...
Wed Mar 20 21:30:53.689 [repl writer worker 1] Fatal Assertion 16360
0xdcae01 0xd8ab83 0xc1e12c 0xd986b1 0xe13709 0x7fbd98b068ca 0x7fbd97eb9b6d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdcae01]
/usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xa3) [0xd8ab83]
/usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12c) [0xc1e12c]
/usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xd986b1]
/usr/bin/mongod() [0xe13709]
/lib/libpthread.so.0(+0x68ca) [0x7fbd98b068ca]
/lib/libc.so.6(clone+0x6d) [0x7fbd97eb9b6d]
Wed Mar 20 21:30:53.692 [repl writer worker 1]

***aborting after fassert() failure

Wed Mar 20 21:30:53.692 Got signal: 6 (Aborted).

Wed Mar 20 21:30:53.695 Backtrace:
0xdcae01 0x6ce879 0x7fbd97e1c230 0x7fbd97e1c1b5 0x7fbd97e1efc0 0xd8abbe 0xc1e12c 0xd986b1 0xe13709 0x7fbd98b068ca 0x7fbd97eb9b6d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdcae01]
/usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x6ce879]
/lib/libc.so.6(+0x32230) [0x7fbd97e1c230]
/lib/libc.so.6(gsignal+0x35) [0x7fbd97e1c1b5]
/lib/libc.so.6(abort+0x180) [0x7fbd97e1efc0]
/usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xde) [0xd8abbe]
/usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12c) [0xc1e12c]
/usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xd986b1]
/usr/bin/mongod() [0xe13709]
/lib/libpthread.so.0(+0x68ca) [0x7fbd98b068ca]
/lib/libc.so.6(clone+0x6d) [0x7fbd97eb9b6d]

The funny thing is that one server finished the sync, but it now keeps logging this message:

[repl writer worker 1] unindex failed (key too big?) ...
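To get a feel for how often these messages occur, the log can be summarized with a short script. The following is only a minimal sketch: it assumes the log line prefix shown in the excerpts above (timestamp, then thread name in brackets), and the function name `summarize` and the pattern labels are hypothetical, chosen for illustration.

```python
import re
from collections import Counter

# Matches the log prefix seen in this thread, e.g.
# "Wed Mar 20 21:30:53.216 [repl writer worker 1] message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<thread>[^\]]+)\] ?(?P<msg>.*)$"
)

# Substrings copied from the log excerpts in this thread.
# The dictionary keys are hypothetical labels, not mongod terms.
PATTERNS = {
    "unindex_failed": "unindex failed (key too big?)",
    "key_already_in_index": "btree: key+recloc already in index",
    "fatal_assertion": "Fatal Assertion",
}


def summarize(lines):
    """Count matching messages per (thread name, pattern label)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # lines without a [thread] prefix are skipped
        for label, needle in PATTERNS.items():
            if needle in match.group("msg"):
                counts[(match.group("thread"), label)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py /path/to/mongod.log
    with open(sys.argv[1], errors="replace") as handle:
        for (thread, label), n in sorted(summarize(handle).items()):
            print(f"{thread}\t{label}\t{n}")
```

Lines that do not carry a bracketed thread name, such as the raw backtrace frames, are ignored on purpose.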

Can anyone explain what is happening? This prevents us from upgrading to the new version.

Thanks & Regards

Thomas

Thomas Rueckstiess

Mar 20, 2013, 8:43:17 PM

to mongod...@googlegroups.com

Hi Thomas,

To diagnose this issue, I'd like to have a look at the full mongod primary log file as well. However, since the Google group is public and you may not want to share all the information contained in the log file, are you able to open a ticket in our Jira system under the "Community Private" project? The link to create the ticket is this: https://jira.mongodb.org/secure/CreateIssue.jspa?pid=10020&issuetype=7. You may have to create a Jira user account if you don't have one yet.

Once you have opened the ticket, my colleagues and I will be able to assist you and keep all the files and information needed to diagnose this case confidential.

Regards,

Thomas

Tecbot

Mar 21, 2013, 4:28:22 AM

to mongod...@googlegroups.com

Hi Thomas,

Thanks for the reply. I created a new ticket, https://jira.mongodb.org/browse/SUPPORT-508, and attached the requested logs. I hope we can fix this problem ASAP.

Thanks & Regards

Thomas

Jason Rassi

Mar 25, 2013, 1:41:52 PM

to mongod...@googlegroups.com

The root cause turned out to be SERVER-9087 [1]. A workaround (resync by means other than initial sync) was used to resolve the issue.

[1] https://jira.mongodb.org/browse/SERVER-9087 (fixed in 2.4.1)
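Since the fix is tied to a specific release, a quick version comparison can show whether a given build is at or past 2.4.1. This is a rough sketch only: the helper names `parse_version` and `includes_fix` are hypothetical, the check handles plain dotted versions (not suffixes such as release candidates), and it says nothing about whether older series such as 2.2.x are affected.

```python
# Release named in the reply above as containing the SERVER-9087 fix.
FIXED_IN = (2, 4, 1)


def parse_version(text):
    """Turn '2.4.1' into (2, 4, 1); missing parts count as 0."""
    parts = [int(p) for p in text.strip().split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def includes_fix(version_text):
    """True if the version is 2.4.1 or later (plain dotted versions only)."""
    return parse_version(version_text) >= FIXED_IN


if __name__ == "__main__":
    for version in ("2.4", "2.4.1"):
        print(version, includes_fix(version))
```

The version string of a running server can be read in the mongo shell with `db.version()`.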
