S3, Glacier, and WAL-E

Christophe Pettus

Sep 15, 2014, 6:06:18 PM
to wal-e
So, an interesting problem has presented itself, and I'm not quite sure how to solve it.

One of the (theoretically) nice things about using WAL-E to archive to S3 is that you can set up a lifecycle rule to move stuff to Glacier over time, so that old backups remain available while not paying the full S3 price. (And, of course, they can be deleted entirely after a while.)

This works great... except for the timeline's .history file. Since S3 -> Glacier rules are based on creation date, eventually that file will be migrated to Glacier, and WAL-E restores start failing because it can't get at it. Since S3 lifecycle rules are path-prefix-based, there's no way to set a different rule for that file in particular.

So, some thoughts:

1. I'm misunderstanding how this works, and there's an easy way that I've missed.
2. WAL-E could have an option to put the .history file in a different prefix, so different lifecycle rules could apply. Feels hackish.
3. A different solution that I'm missing.

Thoughts?
--
-- Christophe Pettus
x...@thebuild.com
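
To make the problem concrete, here is a minimal Python sketch of a prefix-based lifecycle rule. It assumes, for illustration only, that WAL segments and timeline .history files share one `wal_005/` prefix; the bucket path, rule ID, key names, and day count are all made up. The rule dict follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts, but nothing here talks to S3.

```python
# Hypothetical layout: WAL segments and timeline .history files are assumed
# to live under the same "wal_005/" prefix. All names and numbers below are
# illustrative, not taken from a real installation.
WAL_PREFIX = "backups/wal_005/"


def glacier_rule(prefix, days):
    """Build one lifecycle rule that moves objects under `prefix` to Glacier."""
    return {
        "ID": "archive-old-wal",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
    }


def rule_applies(rule, key):
    """A prefix-filtered rule selects every key that starts with the prefix."""
    return key.startswith(rule["Filter"]["Prefix"])


rule = glacier_rule(WAL_PREFIX, days=30)
segment_key = WAL_PREFIX + "000000010000000000000001.lzo"
history_key = WAL_PREFIX + "00000002.history.lzo"

for key in (segment_key, history_key):
    verdict = "transitions" if rule_applies(rule, key) else "stays in S3"
    print(key, "->", verdict)
```

Because both keys start with the same prefix, the rule cannot tell the .history file apart from an ordinary WAL segment, which is the limitation described above.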

Daniel Farina

Sep 15, 2014, 6:38:29 PM
to Christophe Pettus, wal-e
On Mon, Sep 15, 2014 at 3:06 PM, Christophe Pettus <x...@thebuild.com> wrote:
> So, an interesting problem has presented itself, and I'm not quite sure how to solve it.
>
> One of the (theoretically) nice things about using WAL-E to archive to S3 is that you can set up a lifecycle rule to move stuff to Glacier over time, so that old backups remain available while not paying the full S3 price. (And, of course, they can be deleted entirely after a while.)
>
> This works great... except for the timeline's .history file. Since S3 -> Glacier rules are based on creation date, eventually that file will be migrated to Glacier, and WAL-E restores start failing because it can't get at it. Since S3 lifecycle rules are path-prefix-based, there's no way to set a different rule for that file in particular.
>
> So, some thoughts:
>
> 1. I'm misunderstanding how this works, and there's an easy way that I've missed.

I don't use Glacier, so my expertise is limited, but I don't think
you've missed anything obvious.

> 2. WAL-E could have an option to put the .history file in a different prefix, so different lifecycle rules could apply. Feels hackish.

I have considered putting history metadata elsewhere for rapid recall
of the members of a timeline, as prefixes aid that in S3's indexing
scheme. Right now I have no downstream features that would benefit from
such a storage-format upgrade, but this would be one... admittedly, a
bit minor.

Other entities that should get the same treatment are the "sentinel"
(metadata) files for backups.

> 3. A different solution that I'm missing.

Maybe think a bit more, but there are some reasons to cordon off history
files into a prefix.

The thing that is most worrisome about this is how to deal with
breaking storage compatibility. It's probably not insurmountable but
it's a bit of a thing.

Christophe Pettus

Sep 16, 2014, 1:00:09 PM
to wal-e

On Sep 15, 2014, at 3:37 PM, Daniel Farina <dan...@heroku.com> wrote:

> The thing that is most worrisome about this is how to deal with
> breaking storage compatibility. It's probably not insurmountable but
> it's a bit of a thing.

My thought was something like --history-prefix=..., with the default being the same path as the rest of the files. Thus, existing installations don't have to change anything.

Daniel Farina

Sep 16, 2014, 1:42:05 PM
to Christophe Pettus, wal-e
I am not inclined to add any options to tweak the storage layout. I'd
rather have a fallback for backwards compatibility and otherwise have
new installations write things in the new format.

Christophe Pettus

Sep 16, 2014, 1:43:39 PM
to Daniel Farina, wal-e

On Sep 16, 2014, at 10:41 AM, Daniel Farina <dan...@heroku.com> wrote:

> I am not inclined to add any options to tweak the storage layout. I'd
> rather have a fallback for backwards compatibility and otherwise have
> new installations write things in the new format.

No strong feelings on my end. It's my ox that's being gored, so I'm happy to take a pass at it. Would WAL-E just probe both locations for the metadata?

Daniel Farina

Sep 16, 2014, 1:45:04 PM
to Christophe Pettus, wal-e
Yeah. Probably preferring the new location and falling back to the
old. I think that's practical.
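
The "prefer the new location, fall back to the old" lookup could be sketched roughly as below. This is not WAL-E code: the `history_005/` prefix, the key format, and the `fetch_history` helper are all hypothetical, and a plain dict stands in for the bucket.

```python
from typing import Mapping, Tuple

# Hypothetical prefixes: "history_005/" is an invented name for the new
# location; "wal_005/" is assumed to be where .history files sit today.
NEW_PREFIX = "history_005/"
OLD_PREFIX = "wal_005/"


def fetch_history(store: Mapping[str, bytes], timeline: str) -> Tuple[str, bytes]:
    """Return (key, body) for a timeline's .history file.

    Probes the new location first and falls back to the old one, so that
    archives written before the layout change stay readable.
    """
    name = f"{timeline}.history.lzo"
    for prefix in (NEW_PREFIX, OLD_PREFIX):
        key = prefix + name
        if key in store:
            return key, store[key]
    raise KeyError(f"no .history file found for timeline {timeline}")


# A dict stands in for the bucket: one old-layout and one new-layout archive.
old_layout = {"wal_005/00000002.history.lzo": b"old"}
new_layout = {"history_005/00000002.history.lzo": b"new"}

print(fetch_history(old_layout, "00000002")[0])
print(fetch_history(new_layout, "00000002")[0])
```

With this ordering, new installations never touch the old prefix, and existing ones pay one extra lookup per restore.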

jor...@notablepdf.com

Oct 28, 2014, 5:49:32 PM
to wa...@googlegroups.com, dan...@heroku.com
Hi Christophe

What I'm doing for this is to enable versioning on the bucket and set up the lifecycle rule so that only old versions get moved to Glacier (they released that feature recently). Then I've got wal-e set up to delete old backups after a period. When a file is deleted from S3, it is actually still there because of the versioning, and the old version then gets moved over to Glacier after a while.

This way the .history file never gets moved to glacier, since it's not deleted by wal-e, but the logs and old base backups are.

Jordan
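
As a rough illustration of the setup described above, the two pieces of bucket configuration might look like this in Python. The dicts follow the shapes accepted by boto3's `put_bucket_versioning` and `put_bucket_lifecycle_configuration`; the rule ID, the empty prefix (whole bucket), and the 30-day figure are assumptions made for the example, not values from this thread.

```python
def versioning_config():
    """Versioning must be on so that a delete leaves a noncurrent version."""
    return {"Status": "Enabled"}


def noncurrent_only_rule(prefix, noncurrent_days):
    """Lifecycle rule that moves only noncurrent (deleted or overwritten)
    versions to Glacier. It has no "Transitions" entry, so current versions
    are left in S3."""
    return {
        "ID": "archive-deleted-backups",  # illustrative name
        "Filter": {"Prefix": prefix},     # "" applies to the whole bucket
        "Status": "Enabled",
        "NoncurrentVersionTransitions": [
            {"NoncurrentDays": noncurrent_days, "StorageClass": "GLACIER"}
        ],
    }


lifecycle_config = {"Rules": [noncurrent_only_rule("", noncurrent_days=30)]}

print(versioning_config())
print(lifecycle_config)
```

These dicts would be passed as `VersioningConfiguration` and `LifecycleConfiguration` respectively. Objects that wal-e never deletes, such as the .history file, keep a current version and so never match the rule.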