Background
I'm looking for an incremental backup solution for MongoDB. While point-in-time restore functionality is not a requirement, it would be nice to have. I'm running a replica set without sharding.
Some options I've tried include just taking a daily snapshot with mongodump, or storing incremental diffs of this with duplicity. However, with my data set being a few hundred GB, these approaches are too slow.
MMS Backups would be the perfect solution. Unfortunately I can't store the backups in the US because of privacy regulations, so I'm looking for something that runs on-premise. I am considering on-premise MMS as an option, but also want to look at other options.
Oplog approach
I'm looking into scripting a solution that backs up periodic dumps / snapshots, and then continuously saves the oplog. Would an approach like this work?
Backup:
1. Take a weekly snapshot with mongodump --oplog.
2. Every hour (assume the oplog always has more than an hour of changes), dump all new data from the oplog with something like this:
mongodump -d local -c oplog.rs --query '{ "ts" : { "$gt" : { "$timestamp" : { "t" : (timestamp here), "i" : (increment here) } } } }'
Some care will have to be taken to keep track of the last seen timestamp, and to make sure there are no gaps between consecutive oplog dumps.
3. Move these files to some offsite backup location.
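A rough sketch of how step 2 might be scripted, to make the idea concrete. The state file, the output folder layout and the function names are my own placeholders, not anything mongodump provides. The sketch assumes the last saved timestamp is kept in a file as "<t> <i>"; updating that file after each dump (from the last entry actually written) is the part that needs care and is not shown here.

```shell
#!/usr/bin/env bash
# Sketch only: STATE_FILE, OUT_ROOT and the function names are hypothetical.

STATE_FILE="${STATE_FILE:-./last_oplog_ts}"   # holds "<t> <i>" of the last saved entry
OUT_ROOT="${OUT_ROOT:-./oplog-dumps}"         # one sub-folder per hourly dump

# Print the --query document selecting oplog entries newer than timestamp (t, i).
oplog_query() {
  printf '{"ts": {"$gt": {"$timestamp": {"t": %d, "i": %d}}}}' "$1" "$2"
}

# Dump every oplog entry after the timestamp recorded in STATE_FILE.
# The folder name is a UTC timestamp so that folders sort chronologically.
dump_new_oplog() {
  local t="" i="" out
  read -r t i < "$STATE_FILE" || [ -n "$t" ] || return 1
  out="$OUT_ROOT/$(date -u +%Y%m%dT%H%M%SZ)"
  mongodump -d local -c oplog.rs --query "$(oplog_query "$t" "$i")" -o "$out"
}
```

Running dump_new_oplog from cron every hour would leave the new entries in a fresh folder under OUT_ROOT, as local/oplog.rs.bson, ready to be moved offsite.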
Restore:
1. Take the latest full dump, and restore with mongorestore --oplogReplay.
2. For each oplog dump after that (in order):
a. Place the oplog in an empty folder, say "oplog-n", as oplog.bson (instead of local/oplog.rs.bson)
b. Restore the oplog with mongorestore --oplogReplay "oplog-n".
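Step 2 of the restore could look roughly like the sketch below. It assumes each hourly dump sits in its own sub-folder whose name sorts chronologically and contains local/oplog.rs.bson; that layout and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: the folder layout and the function names are hypothetical.

# Copy one hourly dump's oplog into a new, empty folder as oplog.bson.
# mkdir (without -p) fails if the folder already exists, so it is always empty.
stage_oplog() {
  local dump_dir="$1" stage_dir="$2"
  mkdir "$stage_dir" || return 1
  cp "$dump_dir/local/oplog.rs.bson" "$stage_dir/oplog.bson"
}

# Replay every hourly dump under $1 in name order; stop at the first failure.
replay_all() {
  local root="$1" dump n=0
  for dump in "$root"/*/; do
    n=$((n + 1))
    stage_oplog "$dump" "oplog-$n" || return 1
    mongorestore --oplogReplay "oplog-$n" || return 1
  done
}
```

Stopping at the first failure matters here, since replaying a later oplog after skipping an earlier one would leave a gap.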
Notes:
* Point-in-time restore can be achieved by using the --oplogLimit option.
* The restore can be done with a single mongorestore command by concatenating the oplog.bson files (may need some care to make sure the format is valid).
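A sketch of the concatenation idea, assuming (as far as I understand the format) that a .bson dump file is just a sequence of BSON documents with no header or footer, so the files can be joined with cat. The function name and folder names are placeholders, and the files must be passed in chronological order with no overlap between them.

```shell
#!/usr/bin/env bash
# Sketch only: concat_oplogs and the folder names are hypothetical.

# Join oplog files (given in chronological order) into <out_dir>/oplog.bson.
concat_oplogs() {
  local out_dir="$1"; shift
  mkdir -p "$out_dir"
  cat "$@" > "$out_dir/oplog.bson"
}

# Example (not run here): replay the combined oplog up to a point in time.
# --oplogLimit takes <seconds>[:ordinal]; the value below is only illustrative.
#   concat_oplogs combined dumps/*/local/oplog.rs.bson
#   mongorestore --oplogReplay --oplogLimit 1400000000:1 combined
```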
Would this approach work? It feels like it's not the intended use case for the --oplogReplay option, but mongorestore doesn't complain when doing it.
I'm aware of the Tayra project, which does something similar. However, I'd be more comfortable using only standard MongoDB commands and some bash-like scripting.
Regards,
Ralf Kistner