Ma.gnolia's 'backup' was a filesystem level sync, over firewire, to a
different machine. What was being copied were the straight up
mysql binary files. It seems that no historical snapshots were kept, so
as the original data became more and more corrupted, so did the synced
copy. By the time the main db filesystem failed completely, the
sync'ed snapshot was unusable.
What happens at pinboard is currently quite simple - the data gets
sql-dump-ed and fired to the Amazon S3 orbital storage facilities,
nightly. The old snapshots just sit there for now - the service is
young and the dataset manageable, so keeping every snapshot is not a
significant burden.
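For the curious, a nightly dump-and-upload job along these lines could be
sketched roughly as below. This is not pinboard's actual script, just an
illustration under assumptions: the bucket, the key prefix and the /tmp
path are made up, the upload goes through the AWS command line client,
and --single-transaction only gives a consistent snapshot for
transactional (InnoDB) tables.

```python
import datetime
import subprocess


def snapshot_key(day, prefix="pinboard-backups"):
    """Date-stamped S3 key, so every night's dump is kept as its own object.

    The prefix is a hypothetical name chosen for this example.
    """
    return f"{prefix}/{day.isoformat()}.sql"


def dump_command(database, out_path):
    """mysqldump invocation that writes a plain SQL dump to out_path."""
    return [
        "mysqldump",
        "--single-transaction",  # consistent snapshot for InnoDB tables
        f"--result-file={out_path}",
        database,
    ]


def upload_command(local_path, bucket, key):
    """AWS CLI invocation that copies the dump file to S3."""
    return ["aws", "s3", "cp", local_path, f"s3://{bucket}/{key}"]


def nightly_backup(database, bucket, day=None):
    """Dump the database and upload it under a fresh, dated key."""
    day = day or datetime.date.today()
    out_path = f"/tmp/{database}-{day.isoformat()}.sql"  # assumed scratch path
    subprocess.run(dump_command(database, out_path), check=True)
    subprocess.run(upload_command(out_path, bucket, snapshot_key(day)), check=True)
```

Because the key changes every day, a bad dump tonight cannot overwrite a
good one from last week, which is the property the ma.gnolia sync lacked.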
Is this a foolproof backup strategy? Probably not. There are probably
many obvious and non-obvious ways it can go quite wrong and lose data,
and the procedure will be improved along with the rest of the
service. It is, however, likely to be reasonably resilient to the kind
of failure that ma.gnolia experienced. The major differences at
pinboard are:
- there is a sequence of historical backups
- the data format of the backup is less vulnerable to non-recoverable corruption
- backups are maintained offsite at a distributed storage service with
a reliability and availability record better than that of a mac mini
and a firewire cable.
Another unfortunate omission at ma.gnolia is that they never actually
tried to see if their backup worked until they needed it and it
didn't.
Yesterday, Maciej and I donned hazmat suits, directed an asteroid
strike on the pinboard colocation facility and then had the nanobots
restore service at the failover bunker.
Ok, no, he gave me one of the S3 sql dumps and I imported it into the
mysql db running on my laptop. It worked. Data was there.
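A restore drill like that one can be partly scripted. The sketch below
is an assumption-laden illustration, not what we actually ran: it checks
that a dump ends with the "-- Dump completed" comment that mysqldump
writes by default (it is absent if comments were disabled), and then
feeds the file to the mysql client. The function names are invented for
the example.

```python
import os
import subprocess


def dump_looks_complete(path, tail_bytes=1024):
    """Return True if the end of the file has mysqldump's completion comment.

    A truncated dump (e.g. from a full disk) will be missing this trailer.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - tail_bytes))
        tail = f.read()
    return b"-- Dump completed" in tail


def restore_command(database):
    """mysql client invocation; the dump itself is supplied on stdin."""
    return ["mysql", database]


def restore(path, database):
    """Refuse obviously truncated dumps, then import into the given database."""
    if not dump_looks_complete(path):
        raise ValueError(f"{path} does not end with a mysqldump trailer")
    with open(path, "rb") as dump:
        subprocess.run(restore_command(database), stdin=dump, check=True)
```

The trailer check only catches truncation. The real test is still the
import itself, followed by looking at whether the data is there.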
-pvg