Fsck cycling metadata causes crash when one is missing


Chris Davies

Oct 9, 2019, 4:45:33 PM
to s3ql
I can't be sure of the upgrade path here, but it's probably Debian "stretch" S3QL 3.x to the latest in the Debian "buster" backports, 3.3+dfsg-1~bpo10+1. (It's definitely not an upgrade from the earlier S3QL filesystem 2.x version.) Everything had been working happily until earlier today, when my client system crashed. The S3QL filesystem was in use at the time against an S3 backend, and unsurprisingly when it all came back up I needed to run fsck.

Here's where it's gone pear-shaped. The fsck fails during the cycling of the metadata. I've added some debug to help identify what's going on, but I can't see where the HTTPError is escaping without being rewritten as NoSuchObject:

2019-10-09 21:23:01.976 3725:MainThread s3ql.backends.swift._detect_features: Detected Swift features for XXX: copy via COPY, Bulk delete 1000 keys at a time, maximum meta value length is 255 bytes
2019-10-09 21:23:03.320 3725:MainThread s3ql.fsck.main: Starting fsck of swift://YYY/ZZZ/
2019-10-09 21:23:04.218 3725:MainThread s3ql.fsck.main: Using cached metadata.
2019-10-09 21:23:04.431 3725:MainThread s3ql.fsck.main: Remote metadata is outdated.
...
2019-10-09 21:24:08.019 3725:MainThread s3ql.metadata.upload_metadata: Compressing and uploading metadata...
2019-10-09 21:24:41.005 3725:MainThread s3ql.metadata.upload_metadata: Wrote 10.2 MiB of compressed metadata.
2019-10-09 21:24:41.005 3725:MainThread s3ql.metadata.upload_metadata: Cycling metadata backups...
2019-10-09 21:24:41.006 3725:MainThread s3ql.metadata.cycle_metadata: Backing up old metadata...
2019-10-09 21:24:41.006 3725:MainThread s3ql.metadata.cycle_metadata: - [CJD] copy old metadata 9...
2019-10-09 21:24:52.037 3725:MainThread s3ql.metadata.cycle_metadata: - [CJD] copy old metadata 8...
2019-10-09 21:25:07.689 3725:MainThread s3ql.metadata.cycle_metadata: - [CJD] copy old metadata 7...
2019-10-09 21:25:08.686 3725:MainThread root.excepthook: Uncaught top-level exception:
Traceback (most recent call last):
  File "/usr/bin/fsck.s3ql", line 11, in <module>
    load_entry_point('s3ql==3.3', 'console_scripts', 'fsck.s3ql')()
  File "/usr/lib/s3ql/s3ql/fsck.py", line 1298, in main
    dump_and_upload_metadata(backend, db, param)
  File "/usr/lib/s3ql/s3ql/metadata.py", line 319, in dump_and_upload_metadata
    upload_metadata(backend, fh, param)
  File "/usr/lib/s3ql/s3ql/metadata.py", line 333, in upload_metadata
    cycle_metadata(backend)
  File "/usr/lib/s3ql/s3ql/metadata.py", line 126, in cycle_metadata
    cycle_fn("s3ql_metadata_bak_%d" % i, "s3ql_metadata_bak_%d" % (i + 1))
  File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 310, in copy
    self._copy_or_rename(src, dest, rename=False, metadata=metadata)
  File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 343, in _copy_or_rename
    self.backend.copy(src, dest, metadata=meta_raw)
  File "/usr/lib/s3ql/s3ql/backends/swift.py", line 656, in copy
    self._copy_via_copy(src, dest, metadata=metadata)
  File "/usr/lib/s3ql/s3ql/backends/common.py", line 108, in wrapped
    return method(*a, **kw)
  File "/usr/lib/s3ql/s3ql/backends/swift.py", line 646, in _copy_via_copy
    resp = self._do_request('COPY', '/%s%s' % (self.prefix, src), headers=headers)
  File "/usr/lib/s3ql/s3ql/backends/swift.py", line 267, in _do_request
    raise HTTPError(resp.status, resp.reason, resp.headers)
s3ql.backends.s3c.HTTPError: 404 Not Found



Suggestions gratefully appreciated.

Thanks,
Chris

Chris Davies

Oct 10, 2019, 3:43:14 AM
to s3ql

On Wednesday, 9 October 2019 21:45:33 UTC+1, Chris Davies wrote:
> Here's where it's gone pear-shaped. The fsck fails during the cycling of the metadata. I've added some debug to help identify what's going on, but I can't see where the HTTPError is escaping without being rewritten as NoSuchObject:

As far as getting my data back online goes, I've worked around the problem. After several iterations I was able to add an "except HTTPError // if 404 then pass" type construct into the relevant part of metadata.py, which I've now removed again. I'm sure this isn't the correct solution, as HTTPError is almost always converted to NoSuchObject lower down in the code, but I can't see the failure path here to address it directly for you.
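For anyone hitting the same crash, here is a minimal sketch of the kind of temporary workaround described above. It is not the code from metadata.py: HTTPError, FakeBackend and cycle_backups are simplified stand-ins invented for illustration, and only the s3ql_metadata_bak_%d key pattern comes from the traceback.

```python
class HTTPError(Exception):
    """Simplified stand-in for the HTTPError seen in the traceback."""
    def __init__(self, status, reason):
        super().__init__('%d %s' % (status, reason))
        self.status = status
        self.reason = reason


class FakeBackend:
    """Hypothetical in-memory backend whose copy() raises a raw 404."""
    def __init__(self, objects):
        self.objects = dict(objects)

    def copy(self, src, dest):
        if src not in self.objects:
            raise HTTPError(404, 'Not Found')
        self.objects[dest] = self.objects[src]


def cycle_backups(backend, keep=10):
    """Shift each backup i to slot i + 1, tolerating missing backups."""
    skipped = []
    for i in reversed(range(keep)):
        src = 's3ql_metadata_bak_%d' % i
        dest = 's3ql_metadata_bak_%d' % (i + 1)
        try:
            backend.copy(src, dest)
        except HTTPError as exc:
            # Only a missing source object is ignored; anything else
            # (authentication, server errors) still propagates.
            if exc.status != 404:
                raise
            skipped.append(src)
    return skipped


# Backups 2 to 7 are missing, as in a backup chain with a gap.
backend = FakeBackend({
    's3ql_metadata_bak_0': b'a',
    's3ql_metadata_bak_1': b'b',
    's3ql_metadata_bak_8': b'x',
    's3ql_metadata_bak_9': b'y',
})
skipped = cycle_backups(backend)
```

Checking the status code before swallowing the exception keeps the workaround narrow, so it hides only the "object is missing" case.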

Regards,
Chris

Daniel Jagszent

Oct 10, 2019, 8:34:34 AM
to s3ql
Hello Chris,

> I'm sure this isn't the correct solution, as HTTPError is almost
> always converted to NoSuchObject lower down in the code, but I can't
> see the failure path here to address it directly for you.
This should be the place that was missing:
https://github.com/s3ql/s3ql/pull/126
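
The general pattern being discussed, translating a raw 404 into NoSuchObject at the backend layer so that callers prepared for NoSuchObject can skip the gap, might look roughly like the sketch below. This is not the diff from the pull request; do_request, copy_object and both exception classes are hypothetical stand-ins.

```python
class HTTPError(Exception):
    """Stand-in for the raw HTTP error raised by the request layer."""
    def __init__(self, status, reason):
        super().__init__('%d %s' % (status, reason))
        self.status = status
        self.reason = reason


class NoSuchObject(Exception):
    """Stand-in for the backend-neutral 'key does not exist' error."""
    def __init__(self, key):
        super().__init__('No object stored under key %r' % key)
        self.key = key


def do_request(method, key, store):
    """Hypothetical request helper: raises HTTPError for unknown keys."""
    if key not in store:
        raise HTTPError(404, 'Not Found')
    return store[key]


def copy_object(src, dest, store):
    """Copy src to dest, translating a 404 into NoSuchObject."""
    try:
        data = do_request('COPY', src, store)
    except HTTPError as exc:
        if exc.status == 404:
            # Report the missing source key in backend-neutral terms.
            raise NoSuchObject(src) from None
        raise
    store[dest] = data


store = {'s3ql_metadata_bak_8': b'x'}
copy_object('s3ql_metadata_bak_8', 's3ql_metadata_bak_9', store)
try:
    copy_object('s3ql_metadata_bak_7', 's3ql_metadata_bak_8', store)
    missing = None
except NoSuchObject as exc:
    missing = exc.key
```

Doing the translation where the request is issued means higher-level code never needs to know about HTTP status codes.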



Chris Davies

Oct 10, 2019, 3:37:26 PM
to s3ql
On Thursday, 10 October 2019 13:34:34 UTC+1, Daniel Jagszent wrote:
> This should be the place that was missing:
> https://github.com/s3ql/s3ql/pull/126


That was fast - thank you very much!
Chris
