How can I tell if my oplog is big enough for a mongorestore on the primary?


davidmo

Aug 16, 2017, 3:34:55 PM8/16/17
to mongodb-user
Hi folks,

We are on 3.0.7 with a 3-member replica set.

I need to run mongorestore on the primary, but I do not want to blow up the oplog. The oplog is 70 GB and the mongodump files are 35 GB. I want to run this when the oplog headroom is 17-20 hrs. Is there anything I can do to make this less of a guess?

Thanks

Kevin Adistambha

Sep 5, 2017, 1:42:38 AM9/5/17
to mongodb-user

Hi

I need to run mongorestore on the primary, but I do not want to blow up the oplog.

You can get a summary of the oplog using rs.printReplicationInfo(), which shows the details of your oplog in terms of size and time (i.e. the difference in timestamp between the earliest and the latest oplog entries). Based on the oplog size, you can then approximate how much time it will take for the latest oplog entry to be overwritten. Of course, this is an approximation, and the actual time also depends on how busy your deployment is, which may change from day to day.
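
For example, here is a rough way to put numbers on this in the mongo shell. This is only a sketch: it uses db.getReplicationInfo() (the helper behind rs.printReplicationInfo()) and assumes the recent write rate is representative of what the oplog will see going forward.

// Estimate the oplog window from recent activity (approximation only)
var info = db.getReplicationInfo();                  // reads local.oplog.rs
var mbPerHour = info.usedMB / info.timeDiffHours;    // average oplog churn rate
print("oplog size (MB):         " + info.logSizeMB);
print("current window (hours):  " + info.timeDiffHours);
print("approx churn (MB/hour):  " + mbPerHour.toFixed(1));
// Rough time before a new entry would be overwritten, at the current rate:
print("approx headroom (hours): " + (info.logSizeMB / mbPerHour).toFixed(1));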

Since the oplog was designed to perform this kind of operation, do you have a specific concern regarding filling the oplog due to mongorestore?

Best regards,
Kevin

davidmo

Sep 6, 2017, 4:06:53 PM9/6/17
to mongodb-user
I do. For example, I have to mongorestore 3 new databases that are approximately 50 GB each to my replica set.

I ran this:
rs0:SECONDARY> rs.printReplicationInfo();
configured oplog size:   4348.386688232422MB
log length start to end: 43553secs (12.1hrs)
oplog first event time:  Wed Sep 06 2017 03:51:31 GMT-0400 (EDT)
oplog last event time:   Wed Sep 06 2017 15:57:24 GMT-0400 (EDT)
now:                     Wed Sep 06 2017 15:57:24 GMT-0400 (EDT)

I don't understand what I can get from this that makes me feel confident (or not) that when I restore the first file I will not completely fill the oplog and cause a secondary to get out of sync. Let's assume there is no other processing on the primary while the restore is running. Do you have a way to calculate when it is OK?

This has been bothering me for a long time.

Thanks,

David

Kevin Adistambha

Sep 7, 2017, 10:02:38 PM9/7/17
to mongodb-user

Hi David,

configured oplog size: 4348.386688232422MB

From the output you provided, it appears that your oplog is about 4 GB in size, not 70 GB. Is this correct? If so, you are restoring far more data than the oplog can hold, and if your secondaries cannot keep up they could fall off the back of the oplog and go stale.
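
To put rough numbers on that (purely illustrative, using the figures from your two messages): roughly 50 GB of restore traffic has to pass through a ~4.3 GB oplog, so the oplog will roll over many times during the restore, and a secondary only has to fall about one oplog's worth of writes behind at any point to go stale. For example, in the shell:

// Back-of-the-envelope check (illustrative only)
var oplogMB   = db.getReplicationInfo().logSizeMB;   // ~4348 MB in your output
var restoreMB = 50 * 1024;                           // ~50 GB per restored database
print("oplog rolls over roughly " + (restoreMB / oplogMB).toFixed(1) + " times during the restore");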

Having said that:

I restore the first file I will not completely fill the oplog and cause a secondary to get out of sync

By default, mongorestore performs the restore with a write concern of majority if you tell it that it is restoring to a replica set via the --host parameter.

You can of course specify a stronger write concern to require that the restore be replicated to all members of the replica set; this ensures that none of your secondaries is left behind. See write concern for more information on the different write concern settings and how they affect replicated writes.
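
For example, the invocation could look something like this (the host names and dump path below are placeholders, and you should check the exact --writeConcern syntax against the documentation for your version of the tools):

mongorestore --host "rs0/host1:27017,host2:27017,host3:27017" \
             --writeConcern '{w: 3}' \
             /path/to/dump

With w: 3 on a 3-member replica set, every write has to be acknowledged by all three members, which is the "replicated to all members" setting described above.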

On another note, your MongoDB 3.0.7 is quite old as it was released in 2015. You might want to consider upgrading to the latest in the 3.2 series or even better, the 3.4 series. There have been many significant improvements and bug fixes since then.

Best regards,
Kevin

davidmo

Sep 8, 2017, 1:41:58 PM9/8/17
to mongodb-user
Thanks for your reply!

I have the same concern with all of our replica sets. I just happened to have another cluster where we needed to do a mongorestore, so it was a great real-time example. Not a typo - the oplogs are sized differently among the replica sets, but they are the same size for all members of the same set.

If I understand what you are saying, then a mongorestore with the write concern set to all members will never cause the oplog to fill up: it will not issue the next write until it has confirmation from all the secondaries that the current write has completed. w: "majority" is not as safe, but it is probably safe enough if the network speed is equal among the cluster members.

About the version, we are working on it. We have one cluster that is at 3.4.7 and are trying to get the others there.

