Recovering from a "Bad file format" AOF error when the AOF file is 5.4 GB

Luis Lavena

Sep 13, 2010, 8:46:31 PM
to redi...@googlegroups.com
Hello guys,

Due to a swapping condition, our server crashed during a BGREWRITEAOF,
and both the forked child and the main redis-server process died.

Since we store a lot of information in Redis (HTML), we ended up with a
5.4GB file that we believe holds close to all of our data; however,
we are unable to load it back.

[28061] 13 Sep 17:23:24 * Server started, Redis version 2.0.1
[28061] 13 Sep 17:24:29 # Bad file format reading the append only file

I found this thread:

http://groups.google.com/group/redis-db/browse_thread/thread/d04777e6d68aed95

But even though we found that the last MULTI command was incomplete and
removed it, we still get the bad format message, so there is a chance
something else is happening.

Every attempt to enable "loglevel debug" and get more information
failed; the log only got noisy when VM was enabled, but then loading
this file takes a very long time.

Normally, before it crashes, it loads 8.4GB into RAM in a few minutes,
but with VM enabled and default values it takes a lot longer and
produces a log file of around 350MB...

Any suggestions on tracing this and recovering the AOF contents?

Thank you.
--
Luis Lavena
AREA 17
-
Perfection in design is achieved not when there is nothing more to add,
but rather when there is nothing more to take away.
Antoine de Saint-Exupéry

Salvatore Sanfilippo

Sep 13, 2010, 9:09:23 PM
to redi...@googlegroups.com
Sorry for the short reply, but it is 3 AM here and I'm going to sleep ;)
But I guess this can be pretty urgent, so:

1) Make a backup copy of your AOF file, then..
2) Just use the ./redis-check-aof tool in the Redis distribution
(compiled with the usual "make")

Cheers,
Salvatore

--
Salvatore 'antirez' Sanfilippo
http://invece.org

"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele

Luis Lavena

Sep 14, 2010, 8:39:05 AM
to redi...@googlegroups.com
On Mon, Sep 13, 2010 at 10:09 PM, Salvatore Sanfilippo
<ant...@gmail.com> wrote:
> Sorry for the short reply, but it is 3 AM here and I'm going to sleep ;)

Thank you Salvatore, I gave up trying around 11pm (GMT-3).

> But I guess this can be pretty urgent, so:
>
> 1) Make a backup copy of your AOF file, then..
> 2) Just use the ./redis-check-aof tool in the Redis distribution
> (compiled with the usual "make")

0x1576d9ef8: Expected to read 7 bytes, got 0 bytes
AOF is not valid

I already knew the AOF was not valid, but I presume the hex
position (byte 5761769208) could lead me to the problem, right?

Thank you in advance for your help.
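
The hex offset printed by redis-check-aof can be checked against the decimal figure quoted above with a couple of lines of Python. This is only an arithmetic sanity check; the variable name is illustrative and not something the tool exposes.

```python
# Offset reported by redis-check-aof in the message above.
reported = int("0x1576d9ef8", 16)

print(reported)                    # decimal byte offset: 5761769208
print(round(reported / 2**30, 2))  # the same offset expressed in GiB
```

If that number equals the size of the file on disk, the reader ran out of data exactly at end of file, which points to a truncated last command rather than damage in the middle.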

Konstantin Merenkov

Sep 14, 2010, 8:45:02 AM
to redi...@googlegroups.com
On Tue, Sep 14, 2010 at 4:39 PM, Luis Lavena <luisl...@gmail.com> wrote:
> 0x1576d9ef8: Expected to read 7 bytes, got 0 bytes
> AOF is not valid

Even if you get it fixed in your file, I bet that it is an important
thing to handle in redis itself.

--
Best Regards,
Konstantin Merenkov

Salvatore Sanfilippo

Sep 14, 2010, 8:52:54 AM
to redi...@googlegroups.com
Sorry I forgot to tell you the most important bit:

./redis-check-aof --fix <filename>

With --fix it will fix your file.

Cheers,
Salvatore

Luis Lavena

Sep 14, 2010, 8:53:00 AM
to redi...@googlegroups.com
On Tue, Sep 14, 2010 at 9:39 AM, Luis Lavena <luisl...@gmail.com> wrote:
> 0x1576d9ef8: Expected to read 7 bytes, got 0 bytes
> AOF is not valid
>
> I already knew the AOF was not valid, but I presume the hex
> position (byte 5761769208) could lead me to the problem, right?
>

Hmm:

appendonly.aof is 5761769208 bytes, and the offset reported by
redis-check-aof seems to be precisely the EOF.

The last command is:

$ tail -n 5 appendonly.aof
*1
$4
exec
*1
$5

According to the protocol, it is expecting a 5-byte argument after that
last $5 line.

Should I truncate it? Since the file is 5.4GB, it will take a long time
to re-save it just for a test, so I want to make sure first ;-)

Thank you.
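
The "Expected to read 7 bytes, got 0 bytes" message is consistent with the framing visible in the tail output: a *<argc> line, then for each argument a $<length> line followed by that many bytes plus a trailing CRLF, so $5 promises 5 + 2 = 7 more bytes. The following is a minimal Python sketch of how the end of the last complete command could be located, assuming only that framing. The helper name last_valid_offset is hypothetical, this is not the redis-check-aof implementation, and it deliberately ignores MULTI/EXEC blocks, which a real repair tool would need to consider.

```python
import io


def last_valid_offset(stream):
    """Return the byte offset just past the last complete command.

    `stream` is a binary file-like object positioned at the start of an
    AOF-style sequence of "*<argc>" / "$<len>" framed commands.
    """
    good = 0  # offset after the last fully parsed command
    while True:
        header = stream.readline()
        if not header:
            return good  # clean end of file on a command boundary
        if not header.startswith(b"*") or not header.endswith(b"\r\n"):
            return good
        try:
            argc = int(header[1:-2])
        except ValueError:
            return good
        for _ in range(argc):
            size_line = stream.readline()
            if not size_line.startswith(b"$") or not size_line.endswith(b"\r\n"):
                return good
            try:
                size = int(size_line[1:-2])
            except ValueError:
                return good
            # The argument itself plus the trailing CRLF.
            payload = stream.read(size + 2)
            if len(payload) != size + 2 or not payload.endswith(b"\r\n"):
                return good
        good = stream.tell()


# The tail of the file as shown above: a complete "exec" followed by a
# command that stops right after its "$5" length line.
tail = b"*1\r\n$4\r\nexec\r\n*1\r\n$5\r\n"
print(last_valid_offset(io.BytesIO(tail)))
```

On a real file one would pass open("appendonly.aof", "rb") instead of the in-memory sample and, after making a backup, truncate a copy at the returned offset. Scanning from the start keeps the sketch simple, at the cost of reading the whole file once.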

Salvatore Sanfilippo

Sep 14, 2010, 8:55:51 AM
to redi...@googlegroups.com
I guess you are reading my --fix message now, but make sure to do a
backup of the old file, so you can recover the old file if something
bad happens but especially you can run diff -u against the two files
so you can see what changed.

Cheers,
Salvatore

Salvatore Sanfilippo

Sep 14, 2010, 8:54:39 AM
to redi...@googlegroups.com
On Tue, Sep 14, 2010 at 2:45 PM, Konstantin Merenkov
<kmer...@gmail.com> wrote:

> Even if you get it fixed in your file, I bet that it is an important
> thing to handle in redis itself.

We do this already, as the --fix option of the utility shipped with
the Redis distribution can check and fix the file.

But I think it is important that this is a two-step process: the user
needs to understand that the file is corrupted and use an external
tool to fix it, making a backup first and possibly checking the
difference between the two files with diff.

A corrupted AOF is not something to handle more or less silently, I
think, but for sure we need to provide the tools to deal with these
problems :)

Cheers,
Salvatore

Luis Lavena

Sep 14, 2010, 9:04:52 AM
to redi...@googlegroups.com
On Tue, Sep 14, 2010 at 9:55 AM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> I guess you are reading my --fix message now, but make sure to do a
> backup of the old file, so you can recover the old file if something
> bad happens but especially you can run diff -u against the two files
> so you can see what changed.
>

Yes, thank you.

$ diff -u appendonly.aof.backup appendonly.aof
--- appendonly.aof.backup 2010-09-14 05:32:55.000000000 -0700
+++ appendonly.aof 2010-09-14 05:59:56.000000000 -0700
@@ -208222098,5 +208222098,3 @@
 *1
 $4
 exec
-*1
-$5

Pretty trivial fix.
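
Since the fix here only dropped bytes from the end of the file, the same check can be sketched without diff: confirm that the fixed file is a byte-for-byte prefix of the backup, and return whatever was cut off. The helper name removed_tail is hypothetical, and the sketch assumes the repair was a pure truncation; it raises an error for any other kind of change.

```python
import os


def removed_tail(backup_path, fixed_path, chunk_size=1 << 20):
    """Return the bytes at the end of the backup that the fixed file lacks.

    Raises ValueError if the fixed file is not a byte-for-byte prefix of
    the backup, i.e. if the repair did more than truncate.
    """
    fixed_size = os.path.getsize(fixed_path)
    if fixed_size > os.path.getsize(backup_path):
        raise ValueError("fixed file is larger than the backup")
    with open(backup_path, "rb") as backup, open(fixed_path, "rb") as fixed:
        remaining = fixed_size
        while remaining > 0:
            n = min(chunk_size, remaining)
            a = backup.read(n)
            b = fixed.read(n)
            # An empty read means a file shrank while we were comparing.
            if not a or a != b:
                raise ValueError("files differ before the truncation point")
            remaining -= len(a)
        # Everything left in the backup is what the fix removed.
        return backup.read()


# Usage with the file names from the diff above:
#   print(removed_tail("appendonly.aof.backup", "appendonly.aof"))
```

The removed tail is read into memory in one go, which is fine when only a partial command was dropped, as in this thread, but would need chunking if a large region had been cut.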

I will restore the original file, since I had truncated it manually
following the previous thread that I mentioned.

Salvatore, thank you so much.

Salvatore Sanfilippo

Sep 14, 2010, 9:11:02 AM
to redi...@googlegroups.com
On Tue, Sep 14, 2010 at 3:04 PM, Luis Lavena <luisl...@gmail.com> wrote:

> Pretty trivial fix.
>
> I will restore the original file, since I had truncated it manually
> following the previous thread that I mentioned.
>
> Salvatore, thank you so much.

You are welcome!

Btw the last commit in Redis master changes the error message to the following:

"Bad file format reading the append only file: make a backup of your
AOF file, then use ./redis-check-dump --fix <filename>"

Cheers,
Salvatore

Santiago Perez

Sep 14, 2010, 9:17:07 AM
to redi...@googlegroups.com
> "Bad file format reading the append only file: make a backup of your
> AOF file, then use ./redis-check-dump --fix <filename>"

Shouldn't it be ./redis-check-aof ?

Salvatore Sanfilippo

Sep 14, 2010, 9:18:04 AM
to redi...@googlegroups.com
Sorry.. fixing :)

Salvatore Sanfilippo

Sep 14, 2010, 9:21:43 AM
to redi...@googlegroups.com
Hello again Luis,

Could you please send the output of tail -50 <filename> run against
your corrupted AOF file?
I want to check in more detail how it was fixed. Thanks!

Cheers,
Salvatore

Luis Lavena

Sep 14, 2010, 9:29:40 AM
to redi...@googlegroups.com
On Tue, Sep 14, 2010 at 10:21 AM, Salvatore Sanfilippo
<ant...@gmail.com> wrote:
> Hello again Luis,
>
> Could you please send the output of tail -50 <filename> run against
> your corrupted AOF file?
> I want to check in more detail how it was fixed. Thanks!
>

Will email you the tail contents and the diff directly.

Thank you.

Salvatore Sanfilippo

Sep 16, 2010, 12:51:46 PM
to redi...@googlegroups.com
Thank you very much Luis,

I can confirm the tool is doing the right thing :)

Cheers,
Salvatore
