Recovering from corrupted dump file

Alan W

May 21, 2011, 10:41:26 PM
to Redis DB
We had an OOM event on our filesystem, and in an attempt to recover I
stopped Redis and moved the log files to a different volume. When I
tried to restart the server I got the message "Short read or OOM
loading DB. Unrecoverable error, aborting now." I recognize now that
it was foolish to stop Redis, since that lost all the in-memory data,
but what's done is done. Now I need to recover what I can from dump.rdb.

Running redis-check-dump on dump.rdb gives me the following:

==== Processed 1000682 valid opcodes (in 81027054 bytes) =======================
==== Error trace (STRING: (unknown)) ===========================================
0x04d45ffd - Error reading string object
0x04d45ffd - Error reading entry key
0x04d45ffd - Error for type STRING
==== Processed 0 valid opcodes (in 0 bytes) ====================================
==== Error trace ===============================================================
0x04d46000 - Expected EOF, got

Total unprocessable opcodes: 2


There seems to be a lot of salvageable data in the file, but how can I
get it? Do I need to tweak hex values in the file, or is there a tool
somewhere that will help me recover the uncorrupted data?

Salvatore Sanfilippo

May 22, 2011, 5:57:10 AM
to redi...@googlegroups.com
Hey Alan,

are you sure the file was generated with the same Redis version that
is trying to load it?
.rdb files should be impossible to corrupt, as they are written once
to a temp file and then moved onto the dump.rdb file name with rename(2).
However, I think even rename(2) is not perfectly safe, even though we
fsync() the file before renaming, as there are the disk buffers and so
forth. Still... this is the first report of this kind we have gotten
in two years.
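
To make the pattern concrete, the save path is roughly this (a minimal
sketch in Python, not the actual rdb.c code; the temp file naming here
is illustrative):

    import os

    def save_rdb(payload, path="dump.rdb"):
        tmp = "temp-%d.rdb" % os.getpid()
        with open(tmp, "wb") as f:
            f.write(payload)          # write the whole dump to a temp file
            f.flush()
            os.fsync(f.fileno())      # push the data to disk before renaming
        os.rename(tmp, path)          # atomically replace the old dump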

Now, since 2.4 and unstable save in a different format, and we changed
the .rdb version header field only recently, it is possible that you
need a newer Redis version.

If you don't have sensitive data in this file you can upload it
somewhere and send me and Pieter the link.

Cheers,
Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - VMware

http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotle

Pieter Noordhuis

May 22, 2011, 6:08:06 AM
to redi...@googlegroups.com
Hi Alan, Salvatore,

Not sure how this could happen with the rename(2) trick and all, but I don't think this is a problem with RDB versions. Rather, Redis expects to read an opcode but reaches EOF. To solve this, you can use the "truncate" tool to truncate the RDB dump to the reported mark (81027054). Make sure to create a backup of the file before doing so (just to be sure). After truncating, you need to append the EOF opcode so Redis knows it has reached the *valid* end of the file when reading it. You can do so with "echo -n '\xff' > mynewdump.rdb". If all is well, Redis should be able to load the valid portion of your dump again.

Cheers,
Pieter

Pieter Noordhuis

May 22, 2011, 10:48:40 AM
to redi...@googlegroups.com
The echo snippet would overwrite the dump, hardly what you need when recovering ;-).

Correction: appending the EOF byte to the dump is done with: "echo -n '\xff' >> mynewdump.rdb" (note >>).
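
Putting the whole procedure together, here is a minimal sketch in
Python, using the byte offset that redis-check-dump reported above.
(One more caveat: in some shells "echo -n '\xff'" writes the literal
backslash escape rather than a 0xFF byte, so "printf '\xff'" or a small
script like this is safer.)

    import shutil

    VALID_BYTES = 81027054  # from "Processed ... (in 81027054 bytes)"

    shutil.copyfile("dump.rdb", "dump.rdb.bak")  # back up the original first
    with open("dump.rdb", "r+b") as f:
        f.truncate(VALID_BYTES)  # drop the partially written trailing object
    with open("dump.rdb", "ab") as f:
        f.write(b"\xff")         # append the RDB EOF opcode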

Cheers,
Pieter

Didier Spezia

May 22, 2011, 2:00:38 PM
to Redis DB

There is also a small chance that the file gets corrupted in the very
last steps of the dump process, just before the rename(2).
See issue 417.

Seems unlikely, though.

Regards,
Didier.

Alan W

May 24, 2011, 1:34:48 PM
to Redis DB
Thanks to all for the replies. I was able to successfully recover the
snapshot.

This was caused not by an OOM error as I originally reported, but by a
disk-full error. Apologies for any confusion that may have caused.

It turns out that a key-value object was only partially written when
the volume ran out of space, which left the rdb file in an unreadable
state. I was able to use the source code on GitHub to figure out
where the last valid object ended, then used essentially the procedure
recommended by Pieter to truncate the file at the correct position and
append the EOF opcode. Redis happily started up after that.

I don't know if atomicity of write operations to the snapshot has been
considered as a feature, but in this case it would have prevented the
corruption from occurring. I imagine there would be some trade-off in
write speed to accomplish this, and perhaps that is why write
operations are non-atomic at present.

Anyhow, thanks again to all for your input, and thanks especially to
antirez for providing the code through github to make it possible to
look at the internals to understand what was happening.

Alan

Michael Frolov

Aug 5, 2013, 9:01:21 AM
to redi...@googlegroups.com
Can you share the details of your recovery process, please? I have the same error and possibly the same cause, and I couldn't find a working recovery solution: a simple truncate plus EOF write doesn't work in my case. What interests me most is the source code on GitHub that helped you figure out where the last valid object ended, and how to use it. I tried redis-rdb-tools, but it couldn't dump my rdb to JSON or text format, failing with a "bytearray index out of range" exception.

On Tuesday, May 24, 2011 at 21:34:48 UTC+4, Alan W wrote:

Josiah Carlson

Aug 5, 2013, 11:52:42 AM
to redi...@googlegroups.com
The simplest solution is to edit redis-rdb-tools to catch the exception and stop processing.
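
Something along these lines, for example (a rough sketch, assuming a
recent redis-rdb-tools where the RdbParser and JSONCallback names still
apply):

    import sys
    from rdbtools import RdbParser, JSONCallback

    parser = RdbParser(JSONCallback(sys.stdout))
    try:
        parser.parse("dump.rdb")
    except Exception as exc:
        # Stop at the first corrupt entry, keeping everything parsed so far.
        sys.stderr.write("stopped at corrupt entry: %r\n" % (exc,))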

 - Josiah


Michael Frolov

Aug 5, 2013, 1:07:34 PM
to redi...@googlegroups.com
Hmm, thanks. I'll try it tomorrow. 


--
  Mikhail Frolov.

Sripathi Krishnan

Aug 5, 2013, 5:00:40 PM
to redi...@googlegroups.com
Redis-rdb-tools parses whatever it can and fails at the first error. When you do encounter an error, you can start skipping a few bytes till you find a valid opcode. From that point onwards, you can resume normal parsing. 

The main loop for redis-rdb-tools is over here - https://github.com/sripathikrishnan/redis-rdb-tools/blob/master/rdbtools/parser.py#L279. An example of what I am describing can be found in redis-check-dump source code over here - https://github.com/antirez/redis/blob/unstable/src/redis-check-dump.c#L618

The Python equivalent of the loadEntry() method would be either read_object() or skip_object().

Give it a try - hopefully you will be able to recover a lot more data using this process.
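
In pseudocode, the resync step could look roughly like this (a sketch
only; the type and opcode values below are from my reading of the RDB
format and redis-check-dump, so double-check them against the source):

    # After a parse error, scan forward one byte at a time until we hit
    # something that could plausibly start a new entry, then resume parsing.
    KNOWN_TYPES = {0, 1, 2, 3, 4}       # string, list, set, zset, hash
    OPCODES = {0xFD, 0xFC, 0xFE, 0xFF}  # expiry (s/ms), SELECTDB, EOF

    def resync(data, pos):
        """Return the next offset whose byte could start a valid entry."""
        while pos < len(data):
            if data[pos] in KNOWN_TYPES or data[pos] in OPCODES:
                return pos
            pos += 1
        return pos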

--Sri

Michael Frolov

Aug 6, 2013, 11:35:16 AM
to redi...@googlegroups.com
Thanks for your help. I created a bootstrap script from rdb.py and am modifying the parser to try your solution.

On Tuesday, August 6, 2013 at 1:00:40 UTC+4, Sripathi Krishnan wrote:

Michael Frolov

Aug 9, 2013, 11:25:39 AM
to redi...@googlegroups.com
To my great sorrow, the error when dumping to JSON appears at the last byte of the file. The most sensitive data is permanently lost... a power loss seems to have caused all of this.

On Tuesday, August 6, 2013 at 19:35:16 UTC+4, Michael Frolov wrote:

Bigi Lui

Jan 29, 2018, 5:43:45 AM
to Redis DB
Hello,

Sorry to revive an old thread, but this is the closest thing in my googling to my problem.

I have a corrupted .rdb file. It's entirely my fault and not Redis's -- I believe what happened is that I misconfigured a different Redis instance (I run multiple Redis instances on different ports on the same server, because these are personal projects on a low budget), and at one point their config files had some of those other instances writing to the same rdb file as the first Redis instance.

Anyway, it came time to reboot my server recently, and I was devastated to find out that my first Redis instance didn't have the data I needed.

I inspected the rdb file by opening it up in a text editor. I understand it's a binary file, but I can see that the data I need is still in there in some form and looks like it could be salvaged. There are multiple copies of the rdb magic string, like "REDIS0006" and "REDIS0007", in there, which leads me to believe that multiple dumps were written to this same file.

I've tried various things, like using "dd" to shave off the first X bytes to skip past a partial rdb and get to the next appearance of "REDIS0006", for example. This actually worked to get me to the data of one or two other (old) Redis instances, but alas, it couldn't get me to the one I wanted.
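
For reference, what I've been doing is roughly equivalent to this Python sketch (the file names are made up; it splits the file at every RDB magic header into separate candidate dumps):

    import re

    data = open("mixed.rdb", "rb").read()
    offsets = [m.start() for m in re.finditer(rb"REDIS\d{4}", data)]
    for i, start in enumerate(offsets):
        end = offsets[i + 1] if i + 1 < len(offsets) else len(data)
        with open("carved_%d.rdb" % i, "wb") as out:
            out.write(data[start:end])  # one candidate dump per magic header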

At this point, if Salvatore or anyone on the Redis team can help, I'm even willing to send you the entire rdb file for debugging. There's some sensitive user data in there, but the bottom line is that these are side projects with no financial information involved (none of my apps/sites are paid services), and I'd much rather be able to bring these accounts back so their owners can use my services again.