parse rdb files?

Tim Lossen

Jan 16, 2012, 10:14:11 AM
to redi...@googlegroups.com
hello list,

i have a large number of rdb files, and i want to inspect
them. what is the best way to do this -- without loading
them into redis?

is there any stand-alone tool to read and parse rdb files?

(the rdb files are not dumps, but redis diskstore files,
and each contains just a single redis hash. i would like
to extract some values from this hash.)

any hints appreciated ...

thanks
tim


--
http://tim.lossen.de

Salvatore Sanfilippo

Jan 16, 2012, 10:23:29 AM
to redi...@googlegroups.com
Hello Tim,

you can easily create a Ruby/Python/Whatever script to parse large
quantities of RDB files, in theory.
In practice the file format is currently only documented in the source
code of Redis itself...

If you are going to do it, you can use this thread to ask whatever
questions you need answered and I'll make my best effort to
reply promptly, so that the thread can later be turned into documentation
(or used to integrate documentation) of the RDB format.

I look forward to properly documenting the RDB format, but unfortunately
right now I can't because there are other priorities regarding the 2.6
release.

Thanks,
Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - VMware

http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele
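
For readers following along, here is a minimal Python sketch of the very first parsing step. It assumes the standard dump layout, in which a file opens with the five ASCII bytes REDIS followed by a four-digit version number; whether diskstore files share this header is not confirmed anywhere in this thread. The name read_rdb_header is made up for illustration and is not part of Redis.

```python
import io


def read_rdb_header(stream):
    """Return the RDB version number from the 9-byte file header.

    Assumes the header is b"REDIS" followed by four ASCII digits.
    """
    header = stream.read(9)
    if len(header) != 9 or header[:5] != b"REDIS":
        raise ValueError("not an RDB file: missing 'REDIS' magic")
    version = header[5:9]
    if not version.isdigit():
        raise ValueError("malformed RDB version field: %r" % version)
    return int(version.decode("ascii"))


# Usage with an in-memory stand-in for a real file opened in "rb" mode.
print(read_rdb_header(io.BytesIO(b"REDIS0002")))  # prints 2
```

Checking the version up front lets a script skip files it does not know how to read instead of failing halfway through.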

Dvir Volk

Jan 16, 2012, 11:05:40 AM
to redi...@googlegroups.com
Salvatore, a related question - is it technically possible with a reasonable amount of effort, to enable redis to explicitly "hot load" an rdb file from a location, optionally *without* resetting the data (just overriding whatever colliding key, or defining a collision strategy per user preference)?

Dvir Volk
System Architect, DoAT, http://doat.com

Salvatore Sanfilippo

Jan 16, 2012, 11:26:40 AM
to redi...@googlegroups.com
On Mon, Jan 16, 2012 at 5:05 PM, Dvir Volk <dv...@doat.com> wrote:
> Salvatore, a related question - is it technically possible with a reasonable
> amount of effort, to enable redis to explicitly "hot load" an rdb file from
> a location, optionally *without* resetting the data (just overriding

Hot loading is already possible with DEBUG RELOAD, but this will flush
the old dataset, and even if it works is not supported officially...
Another trick is to implement a fake slave, that will just send the
RDB file after the SYNC request... and finally will close the
connection: it is a trick but this is somewhat officially supported,
because it is at the base of replication.

However collision strategies are currently not an option, it's just
remove the old, add the new. The place where this could be done maybe
is in our famous redis-cli feature to handle the RDB format, so other
than dumping the RDB into CSV/JSON this wonder-redis-cli would also be
able to merge different RDB files with different strategies, or even
compute the intersection or difference between RDB files.

Cheers,
Salvatore
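
A rough Python sketch of the framing such a fake replication peer would have to produce. It assumes the 2.x-era behaviour in which the reply to SYNC is a bulk-style preamble ($, the payload length in decimal, CRLF) followed by the raw RDB bytes with no trailing CRLF; that detail comes from the replication code rather than from this thread, so verify it against the Redis version in use. The function name frame_rdb_for_sync is hypothetical.

```python
def frame_rdb_for_sync(rdb_bytes):
    """Wrap raw RDB bytes in the preamble assumed to follow a SYNC request.

    Assumption: "$<length>\\r\\n" followed by the payload, nothing after it.
    """
    preamble = b"$" + str(len(rdb_bytes)).encode("ascii") + b"\r\n"
    return preamble + rdb_bytes


# Usage: the result is what the fake peer would write to the socket
# before closing the connection.
framed = frame_rdb_for_sync(b"REDIS0002\xff")
print(framed[:5])
```

Only the framing is shown here; the socket handling and the wait for the SYNC request are left out on purpose.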

Yiftach Shoolman

Jan 16, 2012, 11:31:46 AM
to redi...@googlegroups.com
A utility for merging different RDB files could also be very helpful for Redis cluster/sharding with data persistence.
Yiftach Shoolman
+972-54-7634621

Tim Lossen

Jan 16, 2012, 11:55:18 AM
to redi...@googlegroups.com
> you can easily create a Ruby/Python/Whatever script to parse large
> quantities of RDB files, in theory.

hmmm ..... i was kinda hoping somebody had done that already ;)

is redis-check-dump.c a good place to start?

--
http://tim.lossen.de

ivan babrou

Jan 16, 2012, 12:13:12 PM
to redi...@googlegroups.com
there is nodejs module for this: https://github.com/pconstr/rdb-parser
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou

Salvatore Sanfilippo

Jan 17, 2012, 5:10:14 AM
to redi...@googlegroups.com
On Mon, Jan 16, 2012 at 5:55 PM, Tim Lossen <t...@lossen.de> wrote:
>> you can easily create a Ruby/Python/Whatever script to parse large
>> quantities of RDB files, in theory.
>
> hmmm ..... i was kinda hoping somebody had done that already ;)
>
> is redis-check-dump.c a good place to start?

Actually the node.js module to parse RDB, or the Redis source code
itself (rdb.c), is a better place to start, since redis-check-dump is
designed to check for inconsistencies and does not directly handle many
of the things needed to really extract all the information from an RDB file.

Salvatore
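
As an illustration of the kind of detail rdb.c handles, here is a Python sketch of decoding one length field. It assumes the 2.4-era encoding: the top two bits of the first byte select a 6-bit length, a 14-bit length, a 32-bit big-endian length, or a "special encoding" marker. The name read_length and the (value, is_encoded) return shape are choices made for this sketch, not the Redis API.

```python
import io
import struct


def read_length(stream):
    """Decode one RDB length field and return (value, is_encoded).

    is_encoded is True when the top two bits are 11, in which case the
    value names a special string encoding rather than a length.
    """
    first = stream.read(1)
    if len(first) != 1:
        raise EOFError("unexpected end of file in length field")
    byte = first[0]
    kind = byte >> 6
    if kind == 0:
        # 00xxxxxx: the length is the low six bits.
        return byte & 0x3F, False
    if kind == 1:
        # 01xxxxxx yyyyyyyy: 14-bit length, high bits first.
        nxt = stream.read(1)
        if len(nxt) != 1:
            raise EOFError("unexpected end of file in 14-bit length")
        return ((byte & 0x3F) << 8) | nxt[0], False
    if kind == 2:
        # 10......: the next four bytes hold a big-endian 32-bit length.
        raw = stream.read(4)
        if len(raw) != 4:
            raise EOFError("unexpected end of file in 32-bit length")
        return struct.unpack(">I", raw)[0], False
    # 11xxxxxx: special encoding, identified by the low six bits.
    return byte & 0x3F, True


# Usage: 0x41 0x00 is a 14-bit length with value 256.
print(read_length(io.BytesIO(b"\x41\x00")))
```

Strings, list sizes, and hash field counts are all prefixed with a field of this kind, so a parser typically calls a helper like this before every read.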

Salvatore Sanfilippo

Jan 17, 2012, 5:11:10 AM
to redi...@googlegroups.com
On Mon, Jan 16, 2012 at 6:13 PM, ivan babrou <ibo...@gmail.com> wrote:

> there is nodejs module for this: https://github.com/pconstr/rdb-parser

Cool, not bad at all. I read the source code for a few minutes and it
seems pretty good, with support for everything in 2.4.

Thanks for the info,
Salvatore

Dvir Volk

Jan 17, 2012, 6:13:58 AM
to redi...@googlegroups.com
I might get a couple of work days next week to write an rdb handling tool.
the main objective would be to hot sync data between different redis instances without down time.

I first thought of implementing it inside redis, but I think I'll do it as an external tool.
Here's what I think I'll do:

1. Wrap the rdb.c calls inside a python module to make the rest easier, and create a raw parser.
This will allow me to easily maintain compatibility with future redis versions.

2. read an rdb with that parser, allowing:
  • Filtering of keys to act upon (by pattern, db number or type)
  • Push the keys to a redis instance using the python redis client.
  • set collision strategy (overwrite | ignore | abort | reset )
  • set locking strategy
3. allow this tool to be extended for other uses, such as creating a full text index of string values or keys, creating a prefix tree of keys, or whatever you may want.
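
The filtering and collision handling in step 2 could be sketched in Python roughly as follows. Everything here is an assumption made for illustration: records are (db, key, type, value) tuples as a hypothetical parser might yield them, a plain dict stands in for the destination Redis instance, and "reset" is read as "empty the target before loading". The names select, merge, and CollisionAbort are invented for this sketch.

```python
from fnmatch import fnmatchcase

STRATEGIES = ("overwrite", "ignore", "abort", "reset")


class CollisionAbort(Exception):
    """Raised by the 'abort' strategy when a key already exists."""


def select(records, pattern="*", db=None, types=None):
    """Yield only the records matching a key pattern, db number, and type set."""
    for rec_db, key, rec_type, value in records:
        if db is not None and rec_db != db:
            continue
        if types is not None and rec_type not in types:
            continue
        if fnmatchcase(key, pattern):
            yield rec_db, key, rec_type, value


def merge(target, records, strategy="overwrite"):
    """Apply records to target (a dict keyed by (db, key)) and return it."""
    if strategy not in STRATEGIES:
        raise ValueError("unknown collision strategy: %r" % strategy)
    if strategy == "reset":
        # Assumed meaning: drop existing data, then load everything.
        target.clear()
    for rec_db, key, rec_type, value in records:
        slot = (rec_db, key)
        if slot in target:
            if strategy == "ignore":
                continue
            if strategy == "abort":
                # Records applied before this point stay in the target.
                raise CollisionAbort("key exists: %r" % (slot,))
        target[slot] = (rec_type, value)
    return target


# Usage: load only string keys matching user:* and keep existing values.
records = [(0, "user:1", "string", "new"), (0, "job:1", "list", ["x"])]
target = {(0, "user:1"): ("string", "old")}
merge(target, select(records, pattern="user:*", types={"string"}), "ignore")
print(target)
```

Keeping the filter as a generator means the records can be streamed from the parser without holding the whole file in memory.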

ivan babrou

Jan 17, 2012, 6:30:34 AM
to redi...@googlegroups.com
What about replication? You don't need rdb to sync data with no downtime, really.

Dvir Volk

Jan 17, 2012, 6:34:41 AM
to redi...@googlegroups.com
it's difficult because there's no way to get all the keys from all the databases in the source instance, without blocking it for what can be a long time.
plus in my case the source server is not on the same LAN as the destination.

it's easier to just get the source's rdb and load it into the destination (for my use case at least).

ivan babrou

Jan 17, 2012, 6:47:40 AM
to redi...@googlegroups.com
It's not easier. Replication is a non-blocking operation.

If the two servers are on different networks, use ssh tunneling to provide a direct channel between master and slave for replication.

Dvir Volk

Jan 17, 2012, 7:21:48 AM
to redi...@googlegroups.com
doing this with the slave protocol seems to me more complex than just writing the rdb parser. plus it requires a working source (what if you just want to merge 2 rdbs and don't have enough RAM to do this? what if, as in my case, the latency and bandwidth between the source (the master) and the destination make replication slower than just sending a compressed rdb?)

Tim Lossen

Jan 17, 2012, 9:31:06 AM
to redi...@googlegroups.com
excellent, that would certainly solve my problem as well.
please keep us posted on your progress, dvir!

tim

Tim Lossen

Jan 24, 2012, 10:38:52 AM
to redi...@googlegroups.com
ok, so i went ahead and (started to) implement a parser in ruby:

https://gist.github.com/1670711

it only deals with my current problem (= redis diskstore files
which contain a single hash value), and does not handle
anything else yet, but it could of course be generalized ...

cheers
tim


--
http://tim.lossen.de

Dvir Volk

Jan 24, 2012, 10:44:13 AM
to redi...@googlegroups.com
cool!
I'm still in for making the solution I described, but I won't get the time to do it in the next week or so.

If you care to write yours in python I'd be glad to collaborate ;)
Dvir Volk
System Architect, The Everything Project (formerly DoAT)

Tim Lossen

Jan 24, 2012, 10:45:56 AM
to redi...@googlegroups.com
shouldn't be too hard to convert it to python, i guess ...

Dvir Volk

Jan 24, 2012, 10:51:33 AM
to redi...@googlegroups.com
actually your code can be converted to python with a few search&replaces. it's quite different from the ruby code I'm used to seeing (and facepalming at), which is Chef recipes :)
maybe I should give ruby a try after all.

Tim Lossen

Jan 24, 2012, 11:16:16 AM
to redi...@googlegroups.com
you definitely should -- ruby is a lovely language.

Julien Ammous

Oct 16, 2012, 5:45:28 AM
to redi...@googlegroups.com
I just found this project: https://github.com/sripathikrishnan/redis-rdb-tools
It is written in python but might be helpful, especially https://github.com/sripathikrishnan/redis-rdb-tools/wiki/Redis-RDB-Dump-File-Format

On Monday, 15 October 2012 13:39:55 UTC+2, flygoast wrote:
i implemented a parser in perl

https://github.com/flygoast/Redis-RdbParser 

CPAN address

http://search.cpan.org/~flygoast/Redis-RdbParser-0.03/ 

cheers

flygoast
