creating a new replica

Sebastien Binet

Jul 14, 2022, 8:44:14 AM
to Perkeep
hi there,

apologies for the burst of questions.

I am trying to create a new replica for my blobs, on some remote
storage.

I have the following client config on the "main" blobstorage:

{
  "servers": {
    "localhost": {
      "server": "http://localhost:3179",
      "auth": "localhost",
      "default": true
    },
    "backup": {
      "server": "https://remote-storage.org",
      "auth": "userpass:user:pass"
    }
  },
  "identity": "me",
  "ignoredFiles": [
    ".DS_Store"
  ]
}

after starting the remote-storage's perkeep server, I then tried, on the
"main" blobstorage:

$> pk sync -src http://localhost:3179/bs -dest https://remote-storage.org/bs

but I got the following errors on the remote-storage:

Received blob [sha224-XX1; 76937 bytes]
Received blob [sha224-XX2; 66332 bytes]
blobpacked: Packing file sha224-XX3 ...
blobpacked: Error packing file sha224-XX3: file does not exist
Received blob [sha224-XXX3; 961 bytes]

according to #681 and #1073

- https://github.com/perkeep/perkeep/issues/681
- https://github.com/perkeep/perkeep/issues/1073

it seemed like one could get away with first replicating /bs-packed and
then /bs-loose, but I still got the above errors (although fewer of
them).

is the only way to create a new replica to scp-copy the blobs?
(or implement the correct blob-order enumeration? I could try my hand at
it, if pointed a bit at the correct way)

cheers,
-s

Sebastien Binet

Jul 17, 2022, 10:43:50 AM
to per...@googlegroups.com
hi,
I have tried the ssh-copy avenue.
AFAICT, I didn't get the packing errors anymore but for some reason,
starting perkeepd on the remote-storage always failed with:

blobpacked: 0 large blobs found in index, 44723 missing from index
sample missing large blob: sha224-0001b1...xxx
sample missing large blob: sha224-0001bf...xxx
sample missing large blob: sha224-000576...xxx
sample missing large blob: sha224-000a2b...xxx
sample missing large blob: sha224-000d89...xxx
sample missing large blob: sha224-001086...xxx
sample missing large blob: sha224-0011d3...xxx
sample missing large blob: sha224-001399...xxx
sample missing large blob: sha224-0014ca...xxx
sample missing large blob: sha224-00153e...xxx
Error: 44723 large blobs missing from index. Please re-start in recovery mode with -recovery=1

even after restarting with -recovery=1 and/or -reindex (and/or
-recovery=2).

not sure whether that's expected.

my offer of trying to work on a "better" blob-enumeration still stands.
but in the meantime, I'll try to set up the following scaffolding:
- pk list -type=file > list.txt (on localhost)
- for sha224 in list.txt; do pk-get -content $sha224 > filename
- for filename in `ls -1`; do pk-upload remote-storage $filename
(where pk-upload is just pk-put with remote-storage as the end-point and
the following options: "file --filenodes --exiftime")

would that work to extract all files (photos in my case) from
"localhost" and upload them to "remote-storage"?
(albeit, perhaps, in a not-completely-optimal way)
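
a rough sketch of that scaffolding, under several assumptions: list.txt has one blobref per line as its first whitespace-separated field, the flags (-content, --filenodes, --exiftime) are copied verbatim from the steps above and not checked against the tools' -help output, and -server=backup refers to the "backup" entry of the client config. plan_copy is a hypothetical helper that only prints the commands, so nothing gets downloaded or uploaded.

```shell
#!/bin/sh
# plan_copy LIST [OUTDIR]: print the export and re-upload commands for
# every blobref listed in LIST (first field of each line).
# Flags are taken as-is from the thread; verify them before running.
plan_copy() {
    list="$1"
    outdir="${2:-export}"
    while read -r ref rest; do
        # skip blank lines
        [ -n "$ref" ] || continue
        echo "pk-get -content $ref > $outdir/$ref"
        echo "pk-put -server=backup file --filenodes --exiftime $outdir/$ref"
    done < "$list"
}
```

note that writing the content to a file named after the blobref drops the original file name, so the uploaded files would carry the blobref as their name.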

cheers,
-s

Sebastien Binet

Jul 17, 2022, 5:55:25 PM
to per...@googlegroups.com
On Sun Jul 17, 2022 at 16:43 CET, 'Sebastien Binet' via Perkeep wrote:

[...]
ok, perhaps unsurprisingly, this didn't pan out: even with a small
subsample (~30 images), I get the same error when restarting the
"remote-storage" server:

Starting perkeepd version master, 2022-07-10-c501f90cd0; Go go1.18.3 (linux/arm64)
Starting to listen on http://localhost:3179
blobpacked: checking integrity of packed blobs against index...
blobpacked: 0 large blobs found in index, 29 missing from index
sample missing large blob: sha224-098b...xxx
sample missing large blob: sha224-1185...xxx
...

but, then, I tried something else:
- on "localhost", keep the same upload scaffolding as described above
- on "remote-storage", use the "default" server-config.json (ie: with
sqlite instead of postgres)
and the "sample missing large blob: ...." errors disappeared.

what am I doing wrong?
looking at the output of \l in psql, I do see the "perkeep",
"pk_<id-lowercase>_blobpacked" and "pk_<id-lowercase>_syncto_index"
databases.

I have attached the relevant parts of the psql-based server (redacted) config.

please let me know what's amiss.

cheers,
-s
pk-psql-config.json

Sebastien Binet

Jul 18, 2022, 5:34:16 AM
to per...@googlegroups.com
On Sun Jul 17, 2022 at 23:54 CET, 'Sebastien Binet' via Perkeep wrote:
[...]
> what am I doing wrong?
> looking at the output of \l in psql, I do see the "perkeep",
> "pk_<id-lowercase>_blobpacked" and "pk_<id-lowercase>_syncto_index"
> databases.

I have switched to mariadb.
and everything seems to be working.

looking at the list of databases, I've noticed an extra one compared to
the postgresql setup: "pk_<id-lowercase>_index".
that one wasn't present, as far as I could tell, in the postgresql case.
from the name, I guess its absence explains why -recovery=1 + reindexing
didn't fix the errors reported earlier.
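
a small check for that situation could look like the sketch below. it assumes the three database names reported in this thread (pk_<id>_index, pk_<id>_blobpacked, pk_<id>_syncto_index) are the ones to expect; check_pk_dbs is a hypothetical helper that reads database names on stdin, one per line, and reports the missing ones.

```shell
#!/bin/sh
# check_pk_dbs ID: read database names (one per line) on stdin and print
# a "missing:" line for each expected pk_<ID>_* database not found.
# The expected suffixes are the ones seen in this thread (an assumption).
check_pk_dbs() {
    id="$1"
    names="$(cat)"
    for suffix in index blobpacked syncto_index; do
        want="pk_${id}_${suffix}"
        # -x: whole-line match, -F: fixed string, so that
        # pk_<id>_syncto_index does not count as pk_<id>_index
        printf '%s\n' "$names" | grep -qxF -- "$want" || echo "missing: $want"
    done
}
```

with postgresql, the names could be fed in with something like `psql -lqt | cut -d '|' -f 1 | tr -d ' ' | check_pk_dbs <id-lowercase>`.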

not sure whether the error is coming from my end or from perkeep.

cheers,
-s