Backing up leveldb instances


Jay Kreps

Jun 16, 2012, 5:50:15 PM
to lev...@googlegroups.com
I am interested in being able to take point-in-time backups of a running leveldb instance. I am okay if there is a short pause while this occurs. In theory this should be relatively easy to do, I think:
1. Temporarily block writes
2. Append any new data since the last backup to the old backup

If the backups are frequent, then the diff that needs to be copied may be quite small (e.g. if 1MB of new writes occurred, you just copy 1MB).
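
A minimal sketch of the "copy only what was appended" idea, in Python with the standard library only. It assumes the source file only ever grows by appending and that writes are blocked while it runs; the function name `append_new_bytes` is made up for illustration and is not part of leveldb.

```python
import os


def append_new_bytes(src_path, backup_path, chunk_size=1 << 20):
    """Append to backup_path the bytes of src_path that it does not hold yet.

    Assumes src_path only ever grows by appending and that writers are
    paused while this runs. Returns the number of bytes copied.
    """
    already = os.path.getsize(backup_path) if os.path.exists(backup_path) else 0
    if os.path.getsize(src_path) < already:
        raise ValueError("source is shorter than the backup; it is not append-only")
    copied = 0
    with open(src_path, "rb") as src, open(backup_path, "ab") as dst:
        src.seek(already)  # skip what the previous backup already copied
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied
```

Called repeatedly on the same pair of paths, each call copies only the bytes written since the previous call, so the cost tracks the volume of new writes rather than the size of the file.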

However, this is defeated by the background compaction thread, which will be writing and deleting files in the meantime.

Does anyone have a good solution to this? A solution that has been proposed is to close the database and then do the backup. However this is not a great solution as closing the database would drop all the cache. Doing this regularly effectively defeats the purpose of caching...

I see a fork that seems to implement an enable/disable option for compaction, which would probably do the trick. Any chance this will make it into an official release? http://code.google.com/r/tfengjun-leveldb-2/source/detail?r=a8b82005ab9aa5734a280f8bc351e486ee6cab8e#

Does anyone else have a recipe for taking an incremental backup (e.g. which files to copy, etc.)?

-Jay


Dave Smith

Jun 16, 2012, 6:02:53 PM
to lev...@googlegroups.com
I think you could just use a snapshot and traverse that, no? Maybe
there is some subtlety I'm missing... :)

D.

Jay Kreps

Jun 16, 2012, 10:48:41 PM
to lev...@googlegroups.com
What I want is to back up the .sst, .log, and MANIFEST files so I can have a copy of the DB elsewhere. I think what you are suggesting is using a snapshot, iterating over the entire dataset, and writing it out in some other format. That is not appealing. I want to copy just the difference of what has been written (imagine this backup runs every few minutes), not the entire data set, and I want to retain the leveldb file formats so the data is immediately formatted for serving. Intuitively I should be able to just take the new .sst and .log files since the last backup. Does that make sense?

-Jay

Dhruba Borthakur

Jun 17, 2012, 2:33:30 AM
to lev...@googlegroups.com
Since leveldb is an LSM engine, it never overwrites data. Does it make sense to do something like this:

1. Stall new writes
2. Add an API, DB::listAllFiles(), that lists the names and lengths of all existing files in a DB. This list is already part of the table_cache and will be very quick to retrieve, so new writes are stalled for only a minuscule amount of time.
3. Enhance posix_env.cc so that PosixEnv::Delete actually moves files to an "archival directory" instead of deleting them.

Now, when leveldb compaction deletes files, they actually move to the archival directory. The backup process first invokes DB::listAllFiles() and then copies out whichever files it needs to copy from either the leveldb directory or the archival directory. The backup process can delete files from the archival directory when they are not needed. I plan to implement such an algorithm.

This approach has minimal impact on performance as well as compaction.
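
A rough sketch of the archival-directory idea, written in Python rather than as a patch to posix_env.cc. All three helper names are hypothetical, and the sketch assumes the live and archival directories are on the same filesystem so the move is a plain rename.

```python
import os


def archive_instead_of_delete(path, archive_dir):
    """Stand-in for a delete hook: move the file into archive_dir, keeping its name."""
    os.makedirs(archive_dir, exist_ok=True)
    target = os.path.join(archive_dir, os.path.basename(path))
    os.replace(path, target)  # a rename; fails if the paths are on different filesystems
    return target


def locate(name, db_dir, archive_dir):
    """Return the path currently holding `name`: the live directory first, then the archive."""
    for directory in (db_dir, archive_dir):
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate
    raise FileNotFoundError(name)


def purge_archive(archive_dir, still_needed):
    """Delete archived files that no pending backup needs; return the removed names."""
    removed = []
    for name in sorted(os.listdir(archive_dir)):
        if name not in still_needed:
            os.remove(os.path.join(archive_dir, name))
            removed.append(name)
    return removed
```

A backup process would call `locate` for each file in its listing and call `purge_archive` once the copy has finished. A file can move between the existence check and the copy, so a real implementation would need to retry the lookup on failure.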

thanks.
dhruba

--
Subscribe to my posts at http://www.facebook.com/dhruba

Theo Schlossnagle

Jun 17, 2012, 8:46:12 AM
to lev...@googlegroups.com
Given that compaction is a rather critical function for long-term health in most leveldb databases (at least in all our different uses), it would seem that this problem needs a more general solution.

FWIW, we just use file system snapshots and block-level incremental replays to keep copies current. In terms of performance and cost, this method achieves almost exactly what you want.

My two cents.

-- 
Theo Schlossnagle

Jay Kreps

Jun 17, 2012, 12:10:00 PM
to lev...@googlegroups.com
Yes, if I understand what you are saying right I think that would definitely work for what I want to do.

To recap, since there is still some confusion, what I want to do is keep a replica of the DB files somewhere else and frequently sync this replica by applying a diff (basically an online backup). ActiveMQ was trying to do a similar thing, I think, but they have to do all kinds of hijinks to get a consistent snapshot of the files (https://github.com/fusesource/fuse-extra/tree/master/fusemq-leveldb), so this is perhaps a relatively common need. BDB JE also supports some functionality to pause compaction for this very reason.

One detail of my case is that I actually don't care if I block writes for a few seconds while the backup occurs, as long as the backup is incremental. Most people, though, would probably want a low-latency snapshot capability.

A filesystem-level snapshot would work, but it has a lot of other downsides. Since leveldb is append-only, snapshots at the application level should be really straightforward.

So Dhruba, to flesh out what you are saying, I think to do incremental backups using your functionality I would do the following:
1. Pause new writes
2. List the current .sst and .log files and their exact lengths; this constitutes an exact point-in-time snapshot of the data
3. Compute the diff from the last backup (e.g. new files and newly appended data on existing files)
4. Apply this diff to the backup location, grabbing the data either from the main data directory or from the archival directory (if the files have been deleted).
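
Steps 2 and 3 could look roughly like the following Python sketch. It is only an illustration under assumptions: both function names are invented, the listing is built by scanning the directory rather than by asking leveldb, and MANIFEST files are included because they were mentioned earlier in the thread. Files that are replaced rather than appended to, such as CURRENT, are left out and would need to be copied whole.

```python
import os


def list_files_with_lengths(db_dir):
    """Record name -> length for the data files; call this while writes are paused."""
    listing = {}
    for name in os.listdir(db_dir):
        if name.endswith((".sst", ".log")) or name.startswith("MANIFEST"):
            listing[name] = os.path.getsize(os.path.join(db_dir, name))
    return listing


def compute_diff(previous, current):
    """Compare two name -> length listings.

    Returns (to_copy, to_drop). to_copy is a sorted list of
    (name, offset, length) byte ranges that the backup is missing.
    to_drop is a sorted list of names the new listing no longer contains.
    """
    to_copy = []
    for name, length in current.items():
        have = previous.get(name, 0)
        if length < have:
            raise ValueError(f"{name} shrank from {have} to {length} bytes")
        if name not in previous or length > have:
            to_copy.append((name, have, length - have))
    to_drop = sorted(set(previous) - set(current))
    return sorted(to_copy), to_drop
```

The first backup passes an empty dict as `previous`, which yields every file in full. Each later backup passes the listing saved by the one before it, and step 4 then copies each `(name, offset, length)` range from whichever directory still holds the file.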

This would also work for me, though it seems a bit more complex than just pausing compaction for 30 seconds while the backup occurs. Do you have a fork where you are doing this? I would love to check it out.

Also, any hope of any of these things getting back into mainline?

-Jay

Dhruba Borthakur

Jun 18, 2012, 1:30:39 PM
to lev...@googlegroups.com
Hi Jay, 

I have not yet built the backup approach that I described earlier, but I will build it in the next few weeks. I will post a response to this email thread when I have that code.

thanks for theoretically verifying that the approach is correct,
dhruba

Linu Raj

Apr 1, 2015, 1:00:07 AM
to lev...@googlegroups.com
Hi Dhruba,

I know this is an old thread. 

At this point, what is your advice on taking leveldb backups and restoring from them? We are planning to use leveldb along with ActiveMQ.

Appreciate your help on this.

Thanks!

Dhruba Borthakur

Apr 2, 2015, 8:18:59 PM
to lev...@googlegroups.com
Hi Linu,

Here are some links on how to do backups/incremental-backups with rocksdb
If you have more questions about this approach, please send me direct-email 
(since the above link does not have anything to do with leveldb).

thanks
dhruba


--
You received this message because you are subscribed to the Google Groups "leveldb" group.
To unsubscribe from this group and stop receiving emails from it, send an email to leveldb+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Linu Raj

Apr 3, 2015, 11:58:48 PM
to lev...@googlegroups.com
Thanks Dhruba. I will definitely give it a shot using leveldbjni API. 

Once again thank you for your guidance. I'll let you know the outcome.


Regards,
Raj, Linu.
