reclaim space with mmapv1 storage

davidmo

Jun 22, 2017, 2:11:12 PM

to mongodb-user

hi all. i have a 3.0.7 mongo replica set with a primary and a secondary in one datacenter and a second secondary in another datacenter.
i desperately need to shrink some databases and reclaim space, from approx 260 GB to 75 GB. what is the best way to do this?

scenario 1:
shutdown the secondary
bring it back up without replset, on a different port
mongodump the database
mongorestore the database
shutdown and bring it back up with replset and original port
rotate through the cluster
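
A minimal sketch of the dump/restore part of scenario 1, assuming the member has already been restarted standalone. The database name, port, and dump path are hypothetical. One assumption worth calling out: the sketch adds a `db.dropDatabase()` step between dump and restore, because under MMAPv1 the data files are only removed from disk when the database itself is dropped.

```python
import shlex

def scenario_one_steps(db_name, port, dump_dir, host="localhost"):
    """Build the dump, drop, and restore commands for one database.

    All arguments are placeholders; nothing is executed here.
    """
    target = ["--host", host, "--port", str(port)]
    # 1. Dump the database to <dump_dir>/<db_name>/
    dump = ["mongodump"] + target + ["--db", db_name, "--out", dump_dir]
    # 2. Drop the database so MMAPv1 deletes its oversized data files
    drop = ["mongo"] + target + [db_name, "--eval", "db.dropDatabase()"]
    # 3. Restore into freshly allocated (compact) files
    restore = ["mongorestore"] + target + ["--db", db_name, f"{dump_dir}/{db_name}"]
    return [dump, drop, restore]

# Hypothetical values: standalone mongod on port 37017, dump under /backup/dump
for cmd in scenario_one_steps("mydb", 37017, "/backup/dump"):
    print(shlex.join(cmd))
```

Printing the commands rather than running them makes it easy to review each step on a test member before touching production.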

scenario 2:
shutdown the secondary
bring it back up without replset, on a different port
run db.repairDatabase()
shutdown and bring it back up with replset and original port
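
One gotcha for scenario 2, sketched below: as far as the MongoDB documentation for `repairDatabase` on MMAPv1 states, the repair needs free disk space equal to the current data set plus 2 GB. The function name and the disk figures are hypothetical; only the 260 GB figure comes from this thread.

```python
def repair_has_room(data_set_gb, free_disk_gb, overhead_gb=2.0):
    """Return True if there is enough free disk to attempt repairDatabase.

    Assumes the documented MMAPv1 rule of thumb: free space must be at
    least the current data set size plus 2 GB.
    """
    return free_disk_gb >= data_set_gb + overhead_gb

# 260 GB database; the free-space figures are made up for illustration
print(repair_has_room(260, 200))  # not enough room
print(repair_has_room(260, 300))  # enough room
```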

thanks all, and please give any suggestions/comments/gotchas you have.

Kevin Adistambha

Jun 27, 2017, 10:07:10 PM

to mongodb-user

Hi

> i desperately need to shrink some databases and reclaim space, from approx 260 GB to 75 GB. what is the best way to do this?

The best way to reclaim space in your replica set is to perform a rolling initial sync. The general procedure can be found in Perform Maintenance on Replica Set Members. Please make sure that your oplog is large enough to cover the duration of an initial sync, given your data size and your network speed. It’s also best to ensure that your backups are current before doing any maintenance work on your deployment.
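
A rough way to reason about "sufficient oplog size": compare the time span the oplog currently covers (the window that `rs.printReplicationInfo()` reports in the mongo shell) with the expected duration of the initial sync. The sketch below is only an illustration; the timestamps, the 6-hour sync estimate, and the safety factor of 2 are all assumptions.

```python
from datetime import datetime, timedelta

def oplog_window(first_entry, last_entry):
    """Time span covered by the oplog, from oldest to newest entry."""
    return last_entry - first_entry

def sync_fits(window, estimated_sync, safety_factor=2.0):
    """True if the oplog window covers the sync with some headroom.

    safety_factor is an arbitrary margin, not an official recommendation.
    """
    return window >= estimated_sync * safety_factor

# Hypothetical oplog boundaries: the oplog spans two days
window = oplog_window(datetime(2017, 6, 20, 8, 0), datetime(2017, 6, 22, 8, 0))
print(window)
print(sync_fits(window, timedelta(hours=6)))
```

If the check fails, the syncing member could fall off the end of the oplog before it catches up and would have to start over.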

Please be very careful about running the repairDatabase process in a replica set. This process is similar to fsck, in that it will attempt to remove corrupt documents from a single mongod. If one of the secondaries has a corrupt copy of a document (e.g. due to storage corruption), repairDatabase will remove this document, leaving this secondary with different content from the rest of the replica set. This could lead to further issues in the future, since MongoDB assumes that all nodes in a replica set contain identical data.

Also, please note that the latest version in the MongoDB 3.0 series is 3.0.15. 3.0.7 was released almost two years ago, and there have been many bug fixes and improvements since then.

Best regards,
Kevin

davidmo

Jun 28, 2017, 9:47:37 AM

to mongodb-user

thanks for your reply. i can bring down another server so there will be no writes to the oplog (no primary). that eliminates the oplog worry.
are people really doing rolling initial syncs with > terabyte servers? it takes me 20 minutes to ftp 50 GB.
i would like to do 1 database at a time, as the many small databases have no wasted space. would scenario 1 work? are there any folks out there who
do 1 database at a time with dump/restore?
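
To decide which databases are worth doing one at a time, the output of `db.stats()` can be compared per database. On MMAPv1 it reports `dataSize`, `indexSize`, and `fileSize` in bytes. The sketch below treats `fileSize - dataSize - indexSize` as a rough upper bound on reclaimable space; that formula is an assumption that ignores preallocation and namespace files, and the sample numbers are invented to resemble the 260 GB / 75 GB case.

```python
GB = 1024 ** 3

def reclaimable_bytes(stats):
    """Rough estimate of wasted space from an MMAPv1 db.stats() document."""
    used = stats["dataSize"] + stats["indexSize"]
    return max(stats["fileSize"] - used, 0)

def rank_by_waste(all_stats):
    """Return (db_name, reclaimable_bytes) pairs, most wasteful first."""
    ranked = [(name, reclaimable_bytes(s)) for name, s in all_stats.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Hypothetical db.stats() values, keyed by database name
sample = {
    "small_db": {"dataSize": 1 * GB, "indexSize": 0, "fileSize": 2 * GB},
    "big_db": {"dataSize": 65 * GB, "indexSize": 10 * GB, "fileSize": 260 * GB},
}
for name, wasted in rank_by_waste(sample):
    print(name, round(wasted / GB, 1), "GB")
```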

about the version - i know. i am trying to get to 3.4. very hard to set up testing resources.

Kevin Adistambha

Jul 7, 2017, 2:14:55 AM

to mongodb-user

Hi

> are people really doing rolling initial syncs with > terabyte servers? it takes me 20 minutes to ftp 50 GB.

The feasibility of using a rolling initial sync will of course depend on your situation: data size, downtime requirements, and the provisioning details of the deployment. It may be feasible for a deployment provisioned with the best hardware, but less so with lesser hardware. Also, a long initial sync process may be tolerable in some cases, but not in others.
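
As a back-of-the-envelope illustration, the figure quoted above (50 GB in 20 minutes over FTP) can be extrapolated linearly to larger data sizes. This is only a lower bound under that assumption: an initial sync also rebuilds indexes, so the real duration would likely be longer than a raw file transfer.

```python
def transfer_minutes(size_gb, sample_gb=50.0, sample_minutes=20.0):
    """Linear extrapolation from one measured transfer (defaults from the thread)."""
    rate_gb_per_min = sample_gb / sample_minutes
    return size_gb / rate_gb_per_min

for size_gb in (260, 1000):
    minutes = transfer_minutes(size_gb)
    print(f"{size_gb} GB: about {minutes:.0f} min ({minutes / 60:.1f} h)")
```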

One reason why initial sync is the recommended method in most cases is that it is the least risky way to recover space in an MMAPv1 deployment, requires relatively little planning, and involves no downtime. However, if you determine that doing an initial sync is not feasible for you, please ensure that the alternative method is thoroughly tested and verified before proceeding in a production environment.

Best regards,
Kevin
