aof-load-truncated not working as expected


Jay Rolette

Dec 3, 2014, 4:12:49 PM
to Redis Db
I'm periodically seeing cases where a system loses power (or otherwise has a hard crash where we don't get a chance to do a clean shutdown or reboot) and Redis won't load due to AOF corruption.

I'm using Redis 2.8.16 with the aof-load-truncated option enabled, but the auto-truncate code doesn't seem to be triggering. Instead, I get the same warning about needing to use "redis-check-aof --fix".

Couple of other particularly relevant redis.conf configuration parameters:

appendonly yes
appendfsync no
save 900 1
aof-load-truncated yes

Setting appendfsync to no is on purpose. This particular instance just holds stats data, so it only writes it out every 15 minutes, and it's not a big deal if we lose it.

Looking at the AOF file, what I'm seeing typically looks like this:

<snip>
$4
HSET
$19
stats:5s:1412823570
$10
resmon:mem
$8
1:46.000
*4
$4
HSET
$19
stats:1m:1412823540
$10
resmon:mem
$8
7:46.000
*4
$4
HSET
$19
stats:1h:1412820000
$10
resmon:mem
$10
611:45.871
*4
$4
HSET
$23
stats:sliced:1412823570
$15
resmon:mem:mem0
$8
1:46.000
<~4K of 0x00 bytes>

None of the commands in the AOF file are ever truncated or malformed. There are just some extra NULLs at the end of the file.
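
For anyone wanting to confirm the same pattern on their own file, here is a minimal sketch in Python that counts the 0x00 bytes at the very end of a file. The function name and chunk size are illustrative assumptions, not part of Redis or its tools.

```python
import os

def count_trailing_nuls(path, chunk_size=4096):
    """Return how many 0x00 bytes sit at the very end of the file.

    Hypothetical helper for inspection only; it never modifies the file.
    """
    size = os.path.getsize(path)
    count = 0
    with open(path, "rb") as f:
        pos = size
        # Walk backwards from the end of the file, one chunk at a time.
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            chunk = f.read(step)
            stripped = chunk.rstrip(b"\x00")
            count += len(chunk) - len(stripped)
            if stripped:
                # Found a non-NUL byte, so the padding starts after it.
                break
    return count
```

A non-zero result on an AOF file that otherwise ends on a complete command matches the symptom described above.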

I would have expected this case to be covered by aof-load-truncated, but looking at aof.c, it only does the auto truncate if we were in the middle of a MULTI/EXEC transaction. Any chance it can be called in cases like this?

I'm also curious about the block of NULLs at the end. I've started having my guys ping me to look at the AOF file when this happens and it's relatively consistent. No partially written commands, just a bunch of NULLs (and sometimes a newline) at the end of the AOF file.

Side-effect of how Redis writes to the AOF file maybe? I haven't had a chance to look at that part of the code, but it smells like maybe it is writing out a fixed size buffer and then calling truncate() to trim the file back to the correct size.

Thanks,
Jay

Salvatore Sanfilippo

Dec 3, 2014, 4:50:34 PM
to Redis DB
Hello Jay,

On Wed, Dec 3, 2014 at 10:12 PM, Jay Rolette <rol...@infiniteio.com> wrote:
> I would have expected this case to be covered by aof-load-truncated, but
> looking at aof.c, it only does the auto truncate if we were in the middle of
> a MULTI/EXEC transaction. Any chance it can be called in cases like this?

Actually the code behavior is a bit different. I understand that it's
a bit convoluted and at first glance it looks as you said. But
actually it always loads the AOF file if it is truncated without
garbage at the end.

> I'm also curious about the block of NULLs at the end. I've started having my
> guys ping me to look at the AOF file when this happens and it's relatively
> consistent. No partially written commands, just a bunch of NULLs (and
> sometimes a newline) at the end of the AOF file.

The NULLs at the end are the reason Redis is not able to load the
file, basically: it detects them as garbage and so refuses to start.

> Side-effect of how Redis writes to the AOF file maybe? I haven't had a
> chance to look at that part of the code, but it smells like maybe it is
> writing out a fixed size buffer and then calling truncate() to trim the file
> back to the correct size.

No... Redis's only business with the AOF is to append via write(2). I
believe the problem is that the file's metadata has no chance to get
updated, so somehow we see a partially empty block at the end of the
file.
This should be filesystem implementation specific. So basically the
problem to fix is this one... one time it could be zero-padded, but
other times it could be full of garbage maybe, not sure.
To make aof loading safe we can allow truncations but not malformed data.
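
The distinction between a truncated tail and a malformed one can be illustrated with a small Python sketch. This is not the loader in aof.c; it is an assumption-laden approximation that only understands plain command entries of the form shown in the excerpt above (`*N`, then N `$len` / payload pairs, each terminated by CRLF). The names `scan_aof`, `_read_header` and `_parse_command` are hypothetical.

```python
def _read_header(data, pos, marker):
    """Parse b'<marker><digits>\\r\\n' at pos; return (value, next_pos, status)."""
    if pos >= len(data):
        return None, pos, "truncated"
    if data[pos:pos + 1] != marker:
        return None, pos, "malformed"
    eol = data.find(b"\r\n", pos)
    if eol == -1:
        # No terminator yet: acceptable only if what we have is digits so far.
        rest = data[pos + 1:]
        if rest.endswith(b"\r"):
            rest = rest[:-1]
        ok_prefix = rest == b"" or rest.isdigit()
        return None, pos, "truncated" if ok_prefix else "malformed"
    digits = data[pos + 1:eol]
    if not digits.isdigit():
        return None, pos, "malformed"
    return int(digits), eol + 2, "ok"

def _parse_command(data, pos):
    """Parse one command starting at pos; return (end_pos, status)."""
    argc, pos, status = _read_header(data, pos, b"*")
    if status != "ok":
        return pos, status
    for _ in range(argc):
        length, pos, status = _read_header(data, pos, b"$")
        if status != "ok":
            return pos, status
        end = pos + length + 2
        if end > len(data):
            # Ran out of bytes; a lone byte after the payload must be '\r'.
            extra = data[pos + length:]
            return pos, "truncated" if extra in (b"", b"\r") else "malformed"
        if data[pos + length:end] != b"\r\n":
            return pos, "malformed"
        pos = end
    return pos, "ok"

def scan_aof(data):
    """Return (offset, status); offset is the end of the last complete command.

    status is "clean", "truncated" (data stops mid-command) or
    "malformed" (bytes that cannot be part of a command).
    """
    pos = 0
    while pos < len(data):
        end, status = _parse_command(data, pos)
        if status != "ok":
            return pos, status
        pos = end
    return pos, "clean"
```

Under these assumptions, a file that simply stops in the middle of a command is reported as "truncated", while a run of zero bytes after the last complete command is reported as "malformed", because 0x00 is not a valid start of a command.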

Cheers,
Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

"Fear makes the wolf bigger than he is."
— German proverb

Jay Rolette

Dec 3, 2014, 5:12:58 PM
to Redis Db
Thanks for the speedy response, Salvatore.

I'll adjust my startup scripts to deal with this scenario. I've got a lot more flexibility since it isn't critical if I lose some stats. Just need to have the instance up and running if possible.

Regards,
Jay
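
A startup-script workaround along these lines might look like the following Python sketch. It is only one possible approach, assuming it runs before Redis starts, that losing the padded tail is acceptable, and that a backup of the file is taken separately if needed. The function name `strip_trailing_nuls` and the example path are hypothetical.

```python
import os

def strip_trailing_nuls(path, chunk_size=65536):
    """Truncate `path` just after its last non-NUL byte; return bytes removed.

    Hypothetical pre-start helper. It modifies the file in place.
    """
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        keep = size
        # Scan backwards until a chunk contains something other than 0x00.
        while keep > 0:
            start = max(0, keep - chunk_size)
            f.seek(start)
            chunk = f.read(keep - start)
            stripped = chunk.rstrip(b"\x00")
            if stripped:
                keep = start + len(stripped)
                break
            keep = start
        if keep < size:
            f.truncate(keep)
            f.flush()
            os.fsync(f.fileno())
    return size - keep

# Example call from a startup script (the path is an assumption):
# removed = strip_trailing_nuls("/var/lib/redis/appendonly.aof")
```

Since every command in the AOF ends with CRLF, a well-formed file should never end in a zero byte, so removing the trailing run should not touch real data. The sketch does not handle the stray newline that sometimes follows the NULLs, and a file that ends mid-command after stripping would still depend on aof-load-truncated.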


Salvatore Sanfilippo

Dec 3, 2014, 5:18:07 PM
to Redis DB
You are welcome Jay, are you using ext4? I've some feeling that
mounting the filesystem with "data=ordered" option may avoid the
"blank bytes at end" issue.

Salvatore

Jay Rolette

Dec 3, 2014, 5:37:39 PM
to Redis Db
On Wed, Dec 3, 2014 at 4:17 PM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> You are welcome Jay, are you using ext4? I've some feeling that
> mounting the filesystem with "data=ordered" option may avoid the
> "blank bytes at end" issue.

We are right now, but we're in the process of switching over to Btrfs as we speak.