Debugging data loss synching to s3

Daniel Heath

Oct 7, 2016, 1:58:07 AM
to Camlistore
I have a camlistore instance at home which syncs to an s3 target for backups.

I decided to try out the backups and make sure they still work.

On a new VPS, I installed camlistore and set up the server config:
{
    "listen": ":3179",
    "identity": "00ABA6E4",
    "identitySecretRing": "/home/ec2-user/.config/camlistore/identity-secring.gpg",
    "blobPath": "/home/ec2-user/var/camlistore/blobs",
    "packRelated": true,
    "levelDB": "/home/ec2-user/var/camlistore/index.leveldb",
    "auth": "<redacted>",
    "s3": "<redacted>",
    "dbNames": null
}

Then I ran:

./camtool sync --all

...
Destination needs blob: [sha1-000e795be15c30a04b8f51b8e92e62b8bd064e1d; 1424 bytes]
2016/10/07 05:47:34 Upload of sha1-000e795be15c30a04b8f51b8e92e62b8bd064e1d to destination blobserver failed: Server didn't receive blob.
Destination needs blob: [sha1-0010c726455975f0bd4cd7da20b9defc73932dd7; 740 bytes]
...
Error: sync all failed: 2 errors during sync


Over in the window running camlistored, I get:

./camlistored
...
...
...
2016/10/07 05:47:34 error fetching public key blob sha1-65682c91f928bcf81741ccc8ab4b0efac33a986e: blob was not fully indexed because of a missing dependency
...


At this point, I stopped my home camlistore server, copied the leveldb directory and blobs onto the VPS, pointed the VPS instance of camlistore at the copy and restarted it.

This resulted in a completely blank 'no results found' UI.

Running camtool sync --all again just gave me hundreds of pages of 'error fetching <sha>: file does not exist' and 'Destination needs blob: <sha>'.

I am running the latest release from github:
[ec2-user@ip-172-31-39-15 camlistore]$ ./camtool -version
./camtool version: 7b78c50007

Mathieu Lonjaret

Oct 7, 2016, 10:45:12 AM
to camli...@googlegroups.com
On 7 October 2016 at 07:58, Daniel Heath <daniel....@gmail.com> wrote:
> I have a camlistore instance at home which syncs to an s3 target for
> backups.
>
> I decided to try out the backups and make sure they still work.
>
> On a new VPS, I installed camlistore and set up the server config:
> {
> "listen": ":3179",
> "identity": "00ABA6E4",
> "identitySecretRing":
> "/home/ec2-user/.config/camlistore/identity-secring.gpg",
> "blobPath": "/home/ec2-user/var/camlistore/blobs",
> "packRelated": true,
> "levelDB": "/home/ec2-user/var/camlistore/index.leveldb",
> "auth": "<redacted>",
> "s3": "<redacted>",
> "dbNames": null
> }
>
> Then I ran:
>
> ./camtool sync --all

Let me make sure I understand the config above, regarding what you're
trying to do.
You start with an empty /home/ec2-user/var/camlistore/blobs , "s3" is
configured to where your backup blobs are, and you're expecting
camtool sync to fill "/home/ec2-user/var/camlistore/blobs" as a
destination, from s3 as the source, is that it?

> ...
> Destination needs blob: [sha1-000e795be15c30a04b8f51b8e92e62b8bd064e1d; 1424
> bytes]
> 2016/10/07 05:47:34 Upload of sha1-000e795be15c30a04b8f51b8e92e62b8bd064e1d
> to destination blobserver failed: Server didn't receive blob.
> Destination needs blob: [sha1-0010c726455975f0bd4cd7da20b9defc73932dd7; 740
> bytes]
> ...
> Error: sync all failed: 2 errors during sync

Just to be sure, have you checked that the problem isn't related to
https://github.com/camlistore/camlistore/issues/681 ? i.e. have you
tried setting "packRelated": false in the new VPS configuration?

Daniel Heath

Oct 7, 2016, 11:00:09 PM
to camli...@googlegroups.com
On Oct 8, 2016, at 1:44 AM, Mathieu Lonjaret <mathieu....@gmail.com> wrote:

> Let me make sure I understand the config above, regarding what you're
> trying to do.
> You start with an empty /home/ec2-user/var/camlistore/blobs , "s3" is
> configured to where your backup blobs are, and you're expecting
> camtool sync to fill "/home/ec2-user/var/camlistore/blobs" as a
> destination, from s3 as the source, is that it?


Yes, that's correct.

> > ...
> > Destination needs blob: [sha1-000e795be15c30a04b8f51b8e92e62b8bd064e1d; 1424 bytes]
> > 2016/10/07 05:47:34 Upload of sha1-000e795be15c30a04b8f51b8e92e62b8bd064e1d to destination blobserver failed: Server didn't receive blob.
> > Destination needs blob: [sha1-0010c726455975f0bd4cd7da20b9defc73932dd7; 740 bytes]
> > ...
> > Error: sync all failed: 2 errors during sync
>
> Just to be sure, have you checked that the problem isn't related to
> https://github.com/camlistore/camlistore/issues/681 ? i.e. have you
> tried setting "packRelated": false in the new VPS configuration?


I had not; wiping the server and retrying with packRelated: false resulted in the same error.

Daniel Heath

Oct 8, 2016, 6:28:54 AM
to camli...@googlegroups.com
So taking one failure as an example:

error fetching public key blob sha1-65682c91f928bcf81741ccc8ab4b0efac33a986e: blob was not fully indexed because of a missing dependency

sha1-65682c91f928bcf81741ccc8ab4b0efac33a986e is a public key blob; the camlistore version that created it was: 2016-08-22-0367de7

Could syncing old -> new affect it?

From the web interface for the blob:

Blob content

No data

Indexer metadata

{
  "blobRef": "sha1-65682c91f928bcf81741ccc8ab4b0efac33a986e",
  "size": 449
}

Mutation claims

{
  "claims": null
}

Referenced by
No references

Mathieu Lonjaret

Oct 19, 2016, 12:43:22 PM
to camli...@googlegroups.com
Hi,

To rule out out-of-order indexing errors, could you please try just
syncing from one blobserver to another, instead of using --all?

To help with that, you could define the targets in your client config
file, e.g.:

"servers": {
    "src": {
        "server": "http://ec2server:3179/bs/",
        "auth": "userpass:foo:bar"
    },
    "dest": {
        "server": "http://localhost:3179/bs/",
        "auth": "userpass:foo:bar"
    }
},

(/bs-packed/ instead of /bs/ if using blobpacked)

And then

camtool sync -src src -dest dest

If that goes well, you can then start your dest server with a -reindex
and see if things look ok.

Daniel Heath

Nov 1, 2016, 9:07:06 PM
to Camlistore
That sync has been running for a few days and will take some time yet (30+ GB on a 0.2 mb/s uplink).

The blobs being uploaded are already present in S3, and have been for some time.

Separately, when I run:

./camtool sync --verbose --src http://localhost:3179/sto-s3/ --dest http://localhost:3179/bs

I get:

2016/10/30 21:51:14 At source blob sha1-00003e60f806230ce742737809021a427867ebf1 (1 blobs, 65682 bytes)
2016/10/30 21:51:14 Total blobs: 987, 57103774 bytes

I have 13 'other' files in that bucket. Since 987 + 13 = 1000, that makes me think it's only reading the first 1000 objects in the s3 bucket and not syncing any after that.

Daniel Heath

Nov 2, 2016, 12:56:52 AM
to Camlistore
It is indeed the presence of non-blob files in the bucket.

In pkg/blobserver/s3/enumerate.go line 76 there's a comment asking whether we should error out if unexpected files are encountered.

I assume somewhere there's an assumption that if you ask for 1000 items and get back fewer than 1000, you've read all of them; I suspect that's a better place to fix the issue.

- Daniel

Mathieu Lonjaret

Nov 2, 2016, 7:00:48 PM
to camli...@googlegroups.com
What do you mean by "non-blob file"?

Daniel Heath

Nov 2, 2016, 7:06:54 PM
to camli...@googlegroups.com
I put a copy of the camlistore binaries and GPG key in the s3 bucket so that
I had everything needed to restore my backups in one place. Removing them from the
bucket fixed the issue, but I still wanted to investigate.

I found that the s3 driver starts listing files from "" (instead of, say, "sha1-000000000000000000"),
so when it asked for the first 1000 files, only 987 were blobs.

For some reason that caused the sync to hang after the first batch (I suspect something
assumed that asking for 1000 files and getting back fewer than 1000 meant you had all the files).
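
To illustrate the suspicion, here is a minimal sketch of that failure mode. It is written in Python rather than Camlistore's Go, it is not the actual code in pkg/blobserver/s3/enumerate.go, and every name in it (BUCKET, list_page, enumerate_buggy, enumerate_fixed) is hypothetical. The bucket is an in-memory list holding 13 made-up non-blob objects, as in my case, plus an arbitrary 2500 fake blobs. The "fixed" variant assumes the listing reports whether more keys remain, which is what the IsTruncated flag in S3's list response is for.

```python
# Hypothetical in-memory model of the suspected bug; not Camlistore's real code.
BUCKET = sorted(
    [f"extra-file-{i:02d}" for i in range(13)]    # non-blob objects (made up)
    + [f"sha1-{i:040x}" for i in range(2500)]      # fake blob names
)


def list_page(after, limit):
    """Return up to `limit` keys strictly greater than `after`,
    plus a flag saying whether more keys remain after this page."""
    keys = [k for k in BUCKET if k > after][:limit]
    truncated = bool(keys) and keys[-1] != BUCKET[-1]
    return keys, truncated


def enumerate_buggy(limit=1000):
    """Stop as soon as a page holds fewer than `limit` blobs after filtering."""
    blobs, after = [], ""
    while True:
        keys, _ = list_page(after, limit)
        page = [k for k in keys if k.startswith("sha1-")]  # drop non-blob objects
        blobs.extend(page)
        if len(page) < limit:  # wrong: a short page after filtering is not the end
            return blobs
        after = page[-1]


def enumerate_fixed(limit=1000):
    """Keep paging until the listing itself says there is nothing left."""
    blobs, after = [], ""
    while True:
        keys, truncated = list_page(after, limit)
        blobs.extend(k for k in keys if k.startswith("sha1-"))
        if not truncated:
            return blobs
        after = keys[-1]  # resume from the last key seen, blob or not


if __name__ == "__main__":
    print(len(enumerate_buggy()), len(enumerate_fixed()))
```

Under these assumptions the buggy loop returns 987 blobs, matching the "Total blobs: 987" line from my sync output, while the fixed loop returns all 2500.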

Mathieu Lonjaret

Nov 2, 2016, 7:17:16 PM
to camli...@googlegroups.com
oh, I see, thanks.

It sounds to me like even though it might be ok to _not_ support
"unknown objects" in the bucket (as we already do), we should indeed
make that case fail in a more obvious way (which would result in e.g.
an obvious error in the sync log messages).

Can you file an issue about it please? (And of course propose a CL
afterwards if that interests you).