On Sat, Jun 15, 2013 at 7:31 PM, <epic...@gmail.com> wrote:
> That shows ...
>
> $ sudo envdir /etc/wal-e.d/env wal-e backup-list
> name last_modified expanded_size_bytes wal_segment_backup_start
> wal_segment_offset_backup_start wal_segment_backup_stop
> wal_segment_offset_backup_stop
> base_000000010000000400000087_00000032 2013-06-16T01:56:37.000Z
> 000000010000000400000087 00000032
>
> But what I'm meaning is how do you verify that it keeps working? Like
> shouldn't I audit things every once in awhile? I'm seeing a new file (e.g.
> 0000000100000004000000B4.lzo) created in the wal_005 directory every minute.
To check that archiving is continuing unimpeded, I currently monitor
the number of 'ready' segments (those still waiting to be archived)
and make sure that number stays small.
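
A minimal sketch of such a check in Python, assuming the 'ready' segments are the *.ready marker files that PostgreSQL keeps in the archive_status directory under pg_xlog. The function names and the threshold of 10 are illustrative assumptions, not part of wal-e:

```python
import os


def count_ready_segments(archive_status_dir):
    """Count WAL segments marked ready for archiving but not yet archived.

    Assumption: archive_status_dir is the cluster's
    pg_xlog/archive_status directory, where each pending segment
    has a marker file ending in ".ready".
    """
    return sum(
        1 for name in os.listdir(archive_status_dir)
        if name.endswith(".ready")
    )


def archiving_is_healthy(archive_status_dir, max_ready=10):
    """Return True while the backlog stays at or below max_ready.

    max_ready=10 is an arbitrary illustrative threshold; pick one
    that matches how quickly your server produces WAL.
    """
    return count_ready_segments(archive_status_dir) <= max_ready
```

Run from cron or a monitoring agent, a backlog that keeps growing would suggest that the archive_command is failing or falling behind.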
I also scan the backup-list data to make sure that a new backup is
taken frequently enough. My recent spate of work on reducing wal-e
memory use for certain kinds of workloads was instigated by this
monitoring.
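
As a rough sketch of the backup-list scan in Python: the code below only assumes that the listing contains last_modified timestamps in the form shown above (2013-06-16T01:56:37.000Z) and does not depend on how the columns are delimited. The function names and the max_age parameter are hypothetical, chosen for illustration:

```python
import re
from datetime import datetime, timedelta, timezone

# Matches timestamps such as 2013-06-16T01:56:37.000Z
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z")


def newest_backup_time(listing):
    """Return the most recent timestamp found in the listing, or None."""
    stamps = [
        datetime.strptime(s, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
        for s in TIMESTAMP.findall(listing)
    ]
    return max(stamps) if stamps else None


def backup_is_fresh(listing, max_age, now=None):
    """Return True if the newest backup is no older than max_age."""
    if now is None:
        now = datetime.now(timezone.utc)
    newest = newest_backup_time(listing)
    return newest is not None and now - newest <= max_age
```

The listing text would come from capturing the output of the backup-list command; for example, backup_is_fresh(listing, timedelta(days=1)) would flag a gap of more than a day between base backups.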
You can also test restoration. After performing a restore, I like to
run pg_dumpall > /dev/null, which touches all the relation heaps and
forms their tuples at least (indexes are not verified that way). If the
server crashes or returns an error suggesting a media defect, then you
are most likely dealing with corruption, which in theory could have
been introduced by wal-e. Check the primary too in that case: to date,
wal-e has no reported cases of mangling any database.
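
The dump step could be automated along these lines. This is a sketch only: the function name is hypothetical, and it assumes pg_dumpall is on the PATH and can connect to the restored server without further options (pass a longer command tuple if it needs them):

```python
import subprocess


def restored_cluster_is_readable(command=("pg_dumpall",)):
    """Run a full dump, discard the output, and report success.

    Equivalent in spirit to `pg_dumpall > /dev/null`: the dump data
    is thrown away and only the exit status and stderr are kept.
    Returns (ok, stderr_text).
    """
    result = subprocess.run(
        list(command),
        stdout=subprocess.DEVNULL,  # the dump itself is not needed
        stderr=subprocess.PIPE,     # keep error messages for inspection
        text=True,
    )
    return result.returncode == 0, result.stderr
```

A non-zero exit status only says that something went wrong; the captured stderr and the server log are where to look for signs of a media defect.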