13623 DR102 too much data written uncommitted


yosi botzer

Mar 22, 2011, 10:44:11 AM
to mongodb-user, yo...@amobee.com
Hi,

I have set up a MongoDB replica set with 3 instances (one of them
is an arbiter).

I killed the primary while it was under stress.

The secondary quickly became primary and took over the reads and
writes.

However, after I restarted the instance that was originally the
primary, it cannot get out of the RECOVERING state.

I keep seeing the following lines in the log of this instance:


Tue Mar 22 16:37:45 [replica set sync] replSet rollback 2 FindCommonPoint
Tue Mar 22 16:37:45 [replica set sync] replSet info rollback our last optime: Mar 22 16:06:51:699
Tue Mar 22 16:37:45 [replica set sync] replSet info rollback their last optime: Mar 22 16:23:07:17d
Tue Mar 22 16:37:45 [replica set sync] replSet info rollback diff in end of log times: -976 seconds
Tue Mar 22 16:38:10 [dur] lsn set 1739529
Tue Mar 22 16:38:24 [replica set sync] replSet rollback found matching events at Mar 22 15:50:42:22
Tue Mar 22 16:38:24 [replica set sync] replSet rollback findcommonpoint scanned : 3540891
Tue Mar 22 16:38:24 [replica set sync] replSet replSet rollback 3 fixup
Tue Mar 22 16:38:24 [replica set sync] replSet rollback 3.5
Tue Mar 22 16:38:24 [replica set sync] replSet rollback 4 n:300
Tue Mar 22 16:38:24 [replica set sync] replSet minvalid=Mar 22 16:23:46 4d88b0f2:3b8
Tue Mar 22 16:38:24 [replica set sync] replSet rollback 4.6
Tue Mar 22 16:38:24 [replica set sync] replSet rollback 4.7
Tue Mar 22 16:38:24 [replica set sync] replSet rollback 5 d:0 u:300
Tue Mar 22 16:38:24 [replica set sync] replSet rollback 6
Tue Mar 22 16:38:25 [replica set sync] replSet syncThread: 13623 DR102 too much data written uncommitted
Tue Mar 22 16:38:25 [dur] lsn set 1754179
Tue Mar 22 16:39:03 [replica set sync] replSet our last op time written: Mar 22 16:08:49:2f3
Tue Mar 22 16:39:03 [replica set sync] replset source's GTE: Mar 22 16:08:49:2f3
Tue Mar 22 16:39:03 [replica set sync] replSet rollback 0
Tue Mar 22 16:39:03 [replica set sync] replSet rollback 1
Tue Mar 22 16:39:03 [replica set sync] replSet rollback 2 FindCommonPoint
Tue Mar 22 16:39:03 [replica set sync] replSet info rollback our last optime: Mar 22 16:06:47:6bd
Tue Mar 22 16:39:03 [replica set sync] replSet info rollback their last optime: Mar 22 16:24:25:c9
Tue Mar 22 16:39:03 [replica set sync] replSet info rollback diff in end of log times: -1058 seconds
Tue Mar 22 16:39:10 [dur] lsn set 1799519
Tue Mar 22 16:39:42 [replica set sync] replSet rollback found matching events at Mar 22 15:50:42:22
Tue Mar 22 16:39:42 [replica set sync] replSet rollback findcommonpoint scanned : 3670940
Tue Mar 22 16:39:42 [replica set sync] replSet replSet rollback 3 fixup
Tue Mar 22 16:39:42 [replica set sync] replSet rollback 3.5
Tue Mar 22 16:39:42 [replica set sync] replSet rollback 4 n:300
Tue Mar 22 16:39:42 [replica set sync] replSet minvalid=Mar 22 16:25:04 4d88b140:32b
Tue Mar 22 16:39:42 [replica set sync] replSet rollback 4.6
Tue Mar 22 16:39:42 [replica set sync] replSet rollback 4.7
Tue Mar 22 16:39:42 [replica set sync] replSet rollback 5 d:0 u:300
Tue Mar 22 16:39:42 [replica set sync] replSet rollback 6
Tue Mar 22 16:39:42 [replica set sync] replSet syncThread: 13623 DR102 too much data written uncommitted
Tue Mar 22 16:39:42 [dur] lsn set 1832049
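
The "diff in end of log times" figure in the log above appears to be this member's last optime minus the sync source's last optime, in seconds. As a quick sanity check, the small sketch below recomputes it from the two timestamps. The function names are made up for illustration, and it only handles same-day HH:MM:SS values (the trailing hex counter such as :699 is dropped).

```shell
#!/bin/sh
# Convert HH:MM:SS to seconds since midnight.
# A single leading zero is stripped so values like "08" are not read as octal.
to_secs() {
    h=${1%%:*}
    rest=${1#*:}
    m=${rest%%:*}
    s=${rest#*:}
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Difference "ours - theirs" in seconds; negative means we are behind.
optime_diff() {
    echo $(( $(to_secs "$1") - $(to_secs "$2") ))
}

optime_diff 16:06:51 16:23:07   # prints -976
optime_diff 16:06:47 16:24:25   # prints -1058
```

Both values match the log, and the gap grows between the two attempts because the new primary keeps accepting writes while this member retries the rollback.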


This is how I started my 3 instances:

On one machine:

./mongod --replSet as --fork --logpath /home/siteop/mongo/mongodblog/mongo.log --port 27017 --dbpath /data/db/ --journal --rest
./mongod --replSet as --fork --logpath /home/siteop/mongo/mongodblog/mongoA.log --port 27018 --dbpath /data/dbA/

On the other machine:

./mongod --replSet as --fork --logpath /home/siteop/mongo/mongodblog/mongo.log --port 27017 --dbpath /data/db/ --journal --rest

Please note that my stress test generates something like 5000 updates
per second.


Does anyone have any idea what I should do to help my instance get
out of the RECOVERING state and become a normal set member (either
PRIMARY or SECONDARY)?


Thanks
Yosi




Eliot Horowitz

Mar 22, 2011, 12:44:22 PM
to mongod...@googlegroups.com, yo...@amobee.com
What version of mongo is this with?


yosi botzer

Mar 22, 2011, 12:49:57 PM
to mongodb-user
The latest (and impressive) 1.8.

BTW, I am thinking about reducing --syncdelay to 20 and increasing
--oplogSize to 4096.

Do you think it might help?

On Mar 22, 6:44 pm, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> What version of mongo is this with?
>

Dwight Merriman

Mar 22, 2011, 2:17:33 PM
to mongod...@googlegroups.com, yosi botzer
hi,

this is the problem: http://jira.mongodb.org/browse/SERVER-2737

we will be fixing. in the meantime, the workaround would be to start the mongod without --journal, let it roll back, shut down cleanly, then go back to --journal mode.

sorry about the issue.

yosi botzer

Mar 23, 2011, 5:09:01 AM
to mongodb-user
Thanks Dwight for the quick response.

I'll try it post 1.8.2 and will let you know if it works for me.

yosi botzer

Mar 23, 2011, 6:44:58 AM
to mongodb-user
BTW, I have tried the workaround that you suggested.

I also had to delete the journal directory; otherwise the process
refused to start without --journal.
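
For anyone following along, the workaround steps can be sketched roughly as below. This is only an outline under assumptions: the paths, port and set name are the ones from the first post, the stuck mongod is assumed to be stopped already, and the journal directory is moved aside rather than deleted (a cautious variation, not what was actually done in this thread). The run and pause helpers are made up for the sketch. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Values taken from the original post; adjust for your own deployment.
DBPATH=${DBPATH:-/data/db}
PORT=${PORT:-27017}
LOGPATH=${LOGPATH:-/home/siteop/mongo/mongodblog/mongo.log}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Manual checkpoint: the operator decides when the rollback has finished.
pause() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ (wait: $1)"
    else
        printf '%s, then press Enter: ' "$1"
        read -r reply
    fi
}

workaround() {
    # 1. Get the journal directory out of the way so mongod starts without --journal.
    run mv "$DBPATH/journal" "$DBPATH/journal.bak"
    # 2. Start the member without --journal and let it roll back.
    run ./mongod --replSet as --fork --logpath "$LOGPATH" --port "$PORT" --dbpath "$DBPATH/"
    pause "check rs.status() until this member has left RECOVERING"
    # 3. Shut down cleanly.
    run ./mongo --port "$PORT" admin --eval "db.shutdownServer()"
    # 4. Go back to --journal mode with the original options.
    run ./mongod --replSet as --fork --logpath "$LOGPATH" --port "$PORT" --dbpath "$DBPATH/" --journal --rest
}

workaround
```

The pause step is deliberate: nothing in the thread says how long the rollback takes, so the sketch leaves that judgment to whoever is watching the log.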




On Mar 23, 11:09 am, yosi botzer <yosi.bot...@gmail.com> wrote:
> Thanks Dwight for the quick response.
>
> I'll try it post 1.8.2 and will let you know if it works for me.
>
> On Mar 22, 8:17 pm, Dwight Merriman <dwi...@10gen.com> wrote:
>
> > hi,
> >
> > this is the problem: http://jira.mongodb.org/browse/SERVER-2737
> >
> > we will be fixing.  in the
> > meantime, the workaround would be to start the mongod without --journal, let
> > it roll back, shut down cleanly, then go back to --journal mode.
> >
> > sorry about the issue.