Problems with upgrading from 2.2.2 to 2.4


Tecbot

Mar 20, 2013, 5:07:23 PM
to mongod...@googlegroups.com
Hello guys,

Today I added two new servers running version 2.4 to my replica set, which has three servers running 2.2.2.

The two new servers sync fresh data from the primary, which runs 2.2.2.
The sync works fine until oplog step 3 of 3 is reached.
After that, the log fills with many entries like this:

Wed Mar 20 21:30:53.216 [repl writer worker 1] unindex failed (key too big?) ... 

In the end it crashed with this stack trace:

Wed Mar 20 21:30:53.689 [repl writer worker 1] ERROR: writer worker caught exception: btree: key+recloc already in index on ...
Wed Mar 20 21:30:53.689 [repl writer worker 1]   Fatal Assertion 16360
0xdcae01 0xd8ab83 0xc1e12c 0xd986b1 0xe13709 0x7fbd98b068ca 0x7fbd97eb9b6d
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdcae01]
 /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xa3) [0xd8ab83]
 /usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12c) [0xc1e12c]
 /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xd986b1]
 /usr/bin/mongod() [0xe13709]
 /lib/libpthread.so.0(+0x68ca) [0x7fbd98b068ca]
 /lib/libc.so.6(clone+0x6d) [0x7fbd97eb9b6d]
Wed Mar 20 21:30:53.692 [repl writer worker 1]

***aborting after fassert() failure


Wed Mar 20 21:30:53.692 Got signal: 6 (Aborted).

Wed Mar 20 21:30:53.695 Backtrace:
0xdcae01 0x6ce879 0x7fbd97e1c230 0x7fbd97e1c1b5 0x7fbd97e1efc0 0xd8abbe 0xc1e12c 0xd986b1 0xe13709 0x7fbd98b068ca 0x7fbd97eb9b6d
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdcae01]
 /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x6ce879]
 /lib/libc.so.6(+0x32230) [0x7fbd97e1c230]
 /lib/libc.so.6(gsignal+0x35) [0x7fbd97e1c1b5]
 /lib/libc.so.6(abort+0x180) [0x7fbd97e1efc0]
 /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xde) [0xd8abbe]
 /usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12c) [0xc1e12c]
 /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xd986b1]
 /usr/bin/mongod() [0xe13709]
 /lib/libpthread.so.0(+0x68ca) [0x7fbd98b068ca]
 /lib/libc.so.6(clone+0x6d) [0x7fbd97eb9b6d]

The funny thing is that one of the servers did finish the sync, but it now keeps logging this:

[repl writer worker 1] unindex failed (key too big?) ... 

Can anyone explain what is happening? This prevents us from upgrading to the new version.
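
In case it helps anyone comparing the two new members, here is a minimal sketch for tallying how often each of the messages quoted above shows up in a mongod log. It assumes the 2.4-style line layout seen in the excerpts (weekday, month, day, time with milliseconds, then the thread name in square brackets). The function name `tally` and the category labels are hypothetical, chosen only for illustration, and the match strings are taken from the excerpts in this post, not from any official list of mongod messages.

```python
import re
from collections import Counter

# Assumed line layout, as seen in the excerpts in this post, e.g.
# "Wed Mar 20 21:30:53.216 [repl writer worker 1] unindex failed (key too big?) ..."
LINE_RE = re.compile(
    r"^\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d{3} "
    r"\[(?P<thread>[^\]]+)\]\s*(?P<msg>.*)$"
)

# Substrings copied from the log excerpts; the labels are hypothetical.
PATTERNS = {
    "unindex_failed": "unindex failed",
    "duplicate_key_in_btree": "key+recloc already in index",
    "fatal_assertion": "Fatal Assertion",
}


def tally(lines):
    """Count log lines per category; lines without a [thread] tag are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # stack frames, blank lines, signal messages, etc.
        msg = match.group("msg")
        for label, needle in PATTERNS.items():
            if needle in msg:
                counts[label] += 1
    return counts


# Sample input built from the lines quoted in this post.
SAMPLE = [
    "Wed Mar 20 21:30:53.216 [repl writer worker 1] unindex failed (key too big?) ...",
    "Wed Mar 20 21:30:53.689 [repl writer worker 1] ERROR: writer worker caught exception: btree: key+recloc already in index on ...",
    "Wed Mar 20 21:30:53.689 [repl writer worker 1]   Fatal Assertion 16360",
    " /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xa3) [0xd8ab83]",
    "Wed Mar 20 21:30:53.692 Got signal: 6 (Aborted).",
]

print(dict(tally(SAMPLE)))
```

To run it against a real log, pass an open file object instead of `SAMPLE`, for example `tally(open(path))` with `path` pointing at the member's log file.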

Thanks & Regards
Thomas

Thomas Rueckstiess

Mar 20, 2013, 8:43:17 PM
to mongod...@googlegroups.com
Hi Thomas,

To diagnose this issue, I'd like to have a look at the full mongod primary log file as well. However, since the Google group is public and you may not want to share all the information contained in the log file, are you able to open a ticket in our Jira system under the "Community Private" project? The link to create the ticket is this: https://jira.mongodb.org/secure/CreateIssue.jspa?pid=10020&issuetype=7. You may have to create a Jira user account if you don't have one yet. 

Once you have opened the ticket, my colleagues and I will be able to assist you and keep all the files and information needed to diagnose this case confidential.

Regards,
Thomas

Tecbot

Mar 21, 2013, 4:28:22 AM
to mongod...@googlegroups.com
Hi Thomas,

Thanks for the reply. I created a new ticket, https://jira.mongodb.org/browse/SUPPORT-508, and attached the requested logs. I hope we can fix this problem ASAP.

Thanks & Regards
Thomas

Jason Rassi

Mar 25, 2013, 1:41:52 PM
to mongod...@googlegroups.com
The root cause turned out to be SERVER-9087 [1].  A workaround (resync by means other than initial sync) was used to resolve the issue.
