Bad remote metadata


Johan Larsson

Aug 21, 2025, 5:58:44 AM
to s3...@googlegroups.com
Yesterday I unmounted an s3ql backup mount (normally I never do that). So today I had an error in my backup. Fine, I thought, I'll just remove and re-copy, but now I can't mount it again.

FSCK:
Starting fsck of s3c://mystorage.com/dbsqlbackup/
Downloading metadata...
Downloaded 21298/21298 metadata blocks (100%)
ERROR: Uncaught top-level exception:
Traceback (most recent call last):
  File "/usr/local/bin/fsck.s3ql", line 33, in <module>
    sys.exit(load_entry_point('s3ql==5.2.3', 'console_scripts', 'fsck.s3ql')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/s3ql-5.2.3-py3.12-linux-x86_64.egg/s3ql/fsck.py", line 1319, in main
    db = download_metadata(backend, cachepath + '.db', param, failsafe=param.is_mounted)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/s3ql-5.2.3-py3.12-linux-x86_64.egg/s3ql/database.py", line 561, in download_metadata
    raise DatabaseChecksumError(db_file, params.db_md5, digest)
s3ql.database.DatabaseChecksumError: File /tmp/s3ql-cache-dbsqlbackup/s3c:=2F=2Fmystorage.com=2Fdbsqlbackup=2F.db has checksum 3f2ddad6a04f721baab513bb9d588f8dd3353be41b236342c87278aaa9a275ef, expected 09b150d865d20ffec8a6b0c36202bdaddd159bde7137f3027bb723276b814f53

The umount.s3ql yesterday was clean. Unfortunately I have tested a bit too much, so the mount log has rotated too many times and I don't have it to share. But it all looked good.

I installed sqlite3 and ran PRAGMA integrity_check on the local db, and it returned 100 lines like:
row 6064046 missing from index sqlite_autoindex_objects_1
row 6065240 missing from index sqlite_autoindex_objects_1
row 6092815 missing from index sqlite_autoindex_objects_1
row 6108290 missing from index sqlite_autoindex_objects_1
row 6117131 missing from index sqlite_autoindex_objects_1
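
For reference, this is roughly what I ran (the cache file path is the one from the fsck traceback above):

sqlite3 '/tmp/s3ql-cache-dbsqlbackup/s3c:=2F=2Fmystorage.com=2Fdbsqlbackup=2F.db' 'PRAGMA integrity_check;'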

I'm not sure what to do next. I was looking at using older remote metadata, but the docs say I should try a repair first, without saying how to repair.

BR
/Johan

Daniel Jagszent

Aug 22, 2025, 11:24:08 AM
to Johan Larsson, s3...@googlegroups.com

Hello Johan,

It looks like the most recent remote sqlite database and the corresponding .params file do not match. That is bad, and you should probably figure out how it happened: it might be a bug in S3QL's incremental database backup logic, your S3 storage provider might not have persisted the data correctly, or something else that is special to your setup.

Your fsck.s3ql output suggests that you moved the .db, .db-wal, and .params files away, used another cache directory, or deleted them.

Hopefully you just moved them away or used another cache directory. In that case you could try the recovery procedure that https://www.sqlite.org/recovery.html describes.

I would start with:

Read https://www.rath.org/s3ql-docs/durability.html – especially rules 4 and 5.

Update S3QL to 5.3.0 – its fsck.s3ql command has a new "--fast" option that will help you, since it skips the remote metadata consistency check (which would fail in your case).
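
How exactly depends on how you installed S3QL; your traceback suggests a source/egg install, so the safest route is the same one you used for 5.2.3. If you installed from PyPI, something like this would do:

pip install --upgrade 's3ql==5.3.0'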

Recover your local sqlite database (run this inside your S3QL cache directory – /tmp/s3ql-cache-dbsqlbackup according to your traceback):
# dump everything sqlite can still read out of the corrupt database
sqlite3 s3c:=2F=2Fmystorage.com=2Fdbsqlbackup=2F.db ".recover --ignore-freelist" > recovered.sql
# rebuild a fresh database from that dump
sqlite3 recovered.db < recovered.sql
# keep the corrupt files around, just in case
mv s3c:=2F=2Fmystorage.com=2Fdbsqlbackup=2F.db s3c:=2F=2Fmystorage.com=2Fdbsqlbackup=2F.corrupt.db
mv s3c:=2F=2Fmystorage.com=2Fdbsqlbackup=2F.db-wal s3c:=2F=2Fmystorage.com=2Fdbsqlbackup=2F.corrupt.db-wal # this file might not exist; skip this step then
# put the recovered database in place and make the .params file match it
mv recovered.db s3c:=2F=2Fmystorage.com=2Fdbsqlbackup=2F.db
shasum -a 256 s3c:=2F=2Fmystorage.com=2Fdbsqlbackup=2F.db
nano s3c:=2F=2Fmystorage.com=2Fdbsqlbackup=2F.params # <- change the "db_md5" value to the checksum printed by shasum

Run fsck.s3ql with the correct cache directory and the --fast option. If this works, it will probably report many inconsistencies that it found and corrected, but you should be able to mount the filesystem afterwards.
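
For example (assuming the default cache directory from your traceback; adjust if yours differs):

fsck.s3ql --cachedir /tmp/s3ql-cache-dbsqlbackup --fast s3c://mystorage.com/dbsqlbackup/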

While the filesystem is mounted, make some insignificant change to it so that the metadata changes (e.g., touch a file) and trigger a manual metadata backup (https://www.rath.org/s3ql-docs/man/ctrl.html). Do this five times. The five newest metadata backups on the remote should then be consistent. (fsck.s3ql only checks the last five metadata backups.)
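
Something like this, assuming the filesystem is mounted at /mnt/dbsqlbackup (a made-up mountpoint – use yours) and that your s3qlctrl version calls the action "backup-metadata" (check the man page linked above):

for i in 1 2 3 4 5; do
    touch /mnt/dbsqlbackup/some-file              # insignificant metadata change (updates the mtime)
    s3qlctrl backup-metadata /mnt/dbsqlbackup     # trigger a manual metadata backup
done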

Unmount the filesystem.

Run fsck.s3ql --force (without --fast) on the filesystem. The remote metadata check should be OK now.
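
I.e., with the same cache directory and storage URL as before:

fsck.s3ql --force --cachedir /tmp/s3ql-cache-dbsqlbackup s3c://mystorage.com/dbsqlbackup/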

Consider running s3ql_verify --data (https://www.rath.org/s3ql-docs/man/verify.html) on your filesystem if you suspect that your S3 storage might be flaky. This will probably take a long time, since it downloads every object from the object storage and verifies its integrity locally.
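
E.g.:

s3ql_verify --data s3c://mystorage.com/dbsqlbackup/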

