fsck is not able to fix it

Radek AD24 Šalomon

Mar 23, 2022, 8:56:35 AM
to s3ql
Hello,

For some unknown reason, my S3QL file system cannot repair itself. The volume was mounted and then unmounted successfully, but since then I can neither mount it again nor complete an fsck run. Could you please help me? Thanks a lot.

```
Enter "continue, I know what I am doing" to use the outdated data anyway:
continue, I know what I am doing
WARNING: Renaming outdated cache directory /opt/s3ql-temp/s3c:=2F=2Fs3.eu-central-1.wasabisys.com:443=2Fbucket1-workspace-ad24=2F-cache to .bak0
WARNING: You should delete this directory once you are sure that everything is in order.
Downloading and decompressing metadata...
Reading metadata...
..objects..
..blocks..
..inodes..
..inode_blocks..
..symlink_targets..
..names..
..contents..
..ext_attributes..
Creating temporary extra indices...
Checking lost+found...
Checking for dirty cache objects...
Checking names (refcounts)...
Checking contents (names)...
WARNING: Content entry for inode 3 refers to non-existing name with id 1, moving to /lost+found/-3
Dropping temporary indices...
> ERROR: Uncaught top-level exception:
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/s3ql/database.py", line 143, in get_row
    row = next(res)
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/s3ql/common.py", line 117, in inode_for_path
    inode = conn.get_val("SELECT inode FROM contents_v WHERE name=? AND parent_inode=?",
  File "/usr/lib/python3.10/site-packages/s3ql/database.py", line 127, in get_val
    return self.get_row(*a, **kw)[0]
  File "/usr/lib/python3.10/site-packages/s3ql/database.py", line 145, in get_row
    raise NoSuchRowError()
s3ql.database.NoSuchRowError: Query produced 0 result rows

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/fsck.s3ql", line 33, in <module>
    sys.exit(load_entry_point('s3ql==3.8.1', 'console_scripts', 'fsck.s3ql')())
  File "/usr/lib/python3.10/site-packages/s3ql/fsck.py", line 1289, in main
    fsck.check(check_cache)
  File "/usr/lib/python3.10/site-packages/s3ql/fsck.py", line 86, in check
    self.check_contents_name()
  File "/usr/lib/python3.10/site-packages/s3ql/fsck.py", line 323, in check_contents_name
    (id_p_new, newname) = self.resolve_free(b"/lost+found", newname)
  File "/usr/lib/python3.10/site-packages/s3ql/fsck.py", line 1068, in resolve_free
    inode_p = inode_for_path(path, self.conn)
  File "/usr/lib/python3.10/site-packages/s3ql/common.py", line 120, in inode_for_path
    raise KeyError('Path %s does not exist' % path)
```

Daniel Jagszent

Mar 23, 2022, 10:35:59 AM
to s3ql
Hello,

[...]
Checking lost+found...
[...]

Checking contents (names)...
WARNING: Content entry for inode 3 refers to non-existing name with id 1, moving to /lost+found/-3
[...]
> ERROR: Uncaught top-level exception:
[...]

  File "/usr/lib/python3.10/site-packages/s3ql/common.py", line 117, in inode_for_path
    inode = conn.get_val("SELECT inode FROM contents_v WHERE name=? AND parent_inode=?",
  File "/usr/lib/python3.10/site-packages/s3ql/database.py", line 127, in get_val
    return self.get_row(*a, **kw)[0]
  File "/usr/lib/python3.10/site-packages/s3ql/database.py", line 145, in get_row
    raise NoSuchRowError()
s3ql.database.NoSuchRowError: Query produced 0 result rows
[...]
This is a strange error. It happens because fsck cannot find the lost+found folder at the moment it wants to move an inode into it. Strangely, it could find the folder in an earlier step ("Checking lost+found...").

The only explanation I can see is that the "lost+found" folder itself is inode 3, so fsck tries to move it into itself.
As far as I can see, this can only happen when your SQLite database is in an unusual state (for example, contents_v might be a table rather than a view).
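
To illustrate how a directory entry can be present in one step and gone in the next, here is a minimal sketch using Python's sqlite3 module. It builds a cut-down version of the names/contents/contents_v layout and assumes, purely for illustration, that lost+found is inode 3 with name id 1 and that the names row has gone missing. The `lookup` helper is hypothetical; it only mirrors the query visible in the traceback and is not S3QL's own code.

```python
import sqlite3

# Cut-down schema: only the columns needed for the lookup (an assumption;
# the real S3QL tables carry more columns and constraints).
SCHEMA = """
CREATE TABLE names (id INTEGER PRIMARY KEY, name BLOB NOT NULL);
CREATE TABLE contents (
    rowid INTEGER PRIMARY KEY AUTOINCREMENT,
    name_id INT NOT NULL,
    inode INT NOT NULL,
    parent_inode INT NOT NULL
);
CREATE VIEW contents_v AS
    SELECT * FROM contents JOIN names ON names.id = name_id;
"""

def lookup(conn, name, parent_inode):
    """Hypothetical helper that mirrors the query shown in the traceback."""
    row = conn.execute(
        "SELECT inode FROM contents_v WHERE name=? AND parent_inode=?",
        (name, parent_inode),
    ).fetchone()
    return None if row is None else row[0]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Illustrative rows: name id 1 is "lost+found", stored as bytes.
conn.execute("INSERT INTO names (id, name) VALUES (1, ?)", (b"lost+found",))
conn.execute(
    "INSERT INTO contents (name_id, inode, parent_inode) VALUES (1, 3, 1)"
)

found_before = lookup(conn, b"lost+found", 1)  # entry is visible via the view

# Simulate the damage: the names row disappears, the contents row stays.
conn.execute("DELETE FROM names WHERE id = 1")
found_after = lookup(conn, b"lost+found", 1)   # the inner join hides the entry

print(found_before, found_after)
```

Because contents_v is an inner join, a contents row whose name_id has no partner in names simply drops out of the view. In the traceback, that empty result is what ends up as the "Path ... does not exist" KeyError.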

Open your S3QL SQLite database (/opt/s3ql-temp/s3c:=2F=2Fs3.eu-central-1.wasabisys.com:443=2Fbucket1-workspace-ad24=2F.db) directly with the SQLite CLI and send us the output of the following commands:
.schema
SELECT * FROM contents_v WHERE name='lost+found' AND parent_inode=1;
SELECT * FROM contents LEFT JOIN names ON name_id = names.id WHERE names.name='lost+found' AND contents.parent_inode=1;

The output of the following query might also be interesting (it may return many rows if your database is seriously broken):
SELECT contents.rowid, name_id, parent_inode, inode FROM contents LEFT JOIN names ON name_id = names.id WHERE names.id IS NULL;
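
If that query returns a lot of rows, grouping them by name_id can show how many distinct names are actually missing. The following is a rough sketch with Python's sqlite3 module; `dangling_by_name_id` is a made-up helper name, and the in-memory tables and rows are invented stand-ins for the real metadata.

```python
import sqlite3

def dangling_by_name_id(conn):
    """Count contents rows per name_id that has no matching names row."""
    return conn.execute(
        """
        SELECT name_id, COUNT(*) FROM contents
        LEFT JOIN names ON name_id = names.id
        WHERE names.id IS NULL
        GROUP BY name_id
        ORDER BY name_id
        """
    ).fetchall()

# Tiny in-memory stand-in for the real metadata file (made-up rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE names (id INTEGER PRIMARY KEY, name BLOB NOT NULL);
CREATE TABLE contents (
    rowid INTEGER PRIMARY KEY AUTOINCREMENT,
    name_id INT NOT NULL,
    inode INT NOT NULL,
    parent_inode INT NOT NULL
);
INSERT INTO names (id, name) VALUES (2, x'61');
INSERT INTO contents (name_id, inode, parent_inode) VALUES
    (2, 10, 1),   -- healthy entry, name id 2 exists
    (7, 11, 1),   -- dangling, name id 7 is missing
    (7, 12, 10),  -- dangling, same missing name
    (9, 13, 1);   -- dangling, name id 9 is missing
""")

summary = dangling_by_name_id(conn)
print(summary)
```

To try this on a copy of the real metadata file, a read-only connection such as `sqlite3.connect("file:copy.db?mode=ro", uri=True)` avoids accidental writes; `copy.db` is a placeholder name.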


If your local SQLite database is broken beyond repair (and ONLY in that case), you might need to download a backup via s3qladm download-metadata and execute the commands on that backup.


Radek AD24 Šalomon

Mar 23, 2022, 11:47:32 AM
to s3ql
Thanks for the reply. (I accidentally sent my first reply only to you.)

On Wednesday, 23 March 2022 at 15:35:59 UTC+1 dan...@jagszent.de wrote:
> [...]
> Open your S3QL SQLite database (/opt/s3ql-temp/s3c:=2F=2Fs3.eu-central-1.wasabisys.com:443=2Fbucket1-workspace-ad24=2F.db) directly with the SQLite CLI and tell us the output of the following commands:
> .schema
> SELECT * FROM contents_v WHERE name='lost+found' AND parent_inode=1;

Empty output.

> SELECT * FROM contents LEFT JOIN names ON name_id = names.id WHERE names.name='lost+found' AND contents.parent_inode=1;

Empty output.

> The output of the following might also be interesting (it might be many rows when your database is seriously broken):
> SELECT contents.rowid, name_id, parent_inode, inode FROM contents LEFT JOIN names ON name_id = names.id WHERE names.id IS NULL;

Plenty (hundreds) of rows like these:
111506|73767|109628|111980
111507|73768|74828|111981
111508|73769|108875|111982
111509|38816|109630|111983
111510|72572|111983|111984
111511|38816|109635|111985
111512|72572|111985|111986
111513|66691|74914|111987
111514|38816|111987|111988
111515|72572|111988|111989
111516|38816|109638|111990
111517|72572|111990|111991
111518|66700|74914|111992
111519|38888|111992|111993
111520|67252|111993|111994
111521|66701|74914|111995
111522|38816|111995|111996
111523|72568|111996|111997
111524|38816|111992|111998
111525|72572|111998|111999

> If your local SQLite database is broken beyond repair (and ONLY in that case) you might need to download a backup via s3qladm download-metadata and execute the commands on this backup.

All previous metadata backups have the same issue and give the same result. But S3QL had been running without any problems until now.

Radek AD24 Šalomon

Mar 23, 2022, 12:03:14 PM
to s3ql
And here is the .schema output:

sqlite> .schema
CREATE TABLE objects (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,
        refcount  INT NOT NULL,
        size      INT NOT NULL
    );
CREATE TABLE sqlite_sequence(name,seq);
CREATE TABLE blocks (
        id        INTEGER PRIMARY KEY,
        hash      BLOB(32) UNIQUE,
        refcount  INT,
        size      INT NOT NULL,
        obj_id    INTEGER NOT NULL REFERENCES objects(id)
    );
CREATE TABLE inodes (
        -- id has to specified *exactly* as follows to become
        -- an alias for the rowid.
        id        INTEGER PRIMARY KEY AUTOINCREMENT,
        uid       INT NOT NULL,
        gid       INT NOT NULL,
        mode      INT NOT NULL,
        mtime_ns  INT NOT NULL,
        atime_ns  INT NOT NULL,
        ctime_ns  INT NOT NULL,
        refcount  INT NOT NULL,
        size      INT NOT NULL DEFAULT 0,
        rdev      INT NOT NULL DEFAULT 0,
        locked    BOOLEAN NOT NULL DEFAULT 0
    );
CREATE TABLE inode_blocks (
        inode     INTEGER NOT NULL REFERENCES inodes(id),
        blockno   INT NOT NULL,
        block_id    INTEGER NOT NULL REFERENCES blocks(id),
        PRIMARY KEY (inode, blockno)
    );
CREATE TABLE symlink_targets (
        inode     INTEGER PRIMARY KEY REFERENCES inodes(id),
        target    BLOB NOT NULL
    );
CREATE TABLE names (
        id     INTEGER PRIMARY KEY,
        name   BLOB NOT NULL,
        refcount  INT NOT NULL,
        UNIQUE (name)
    );
CREATE TABLE contents (
        rowid     INTEGER PRIMARY KEY AUTOINCREMENT,
        name_id   INT NOT NULL REFERENCES names(id),
        inode     INT NOT NULL REFERENCES inodes(id),
        parent_inode INT NOT NULL REFERENCES inodes(id),

        UNIQUE (parent_inode, name_id)
    );
CREATE TABLE ext_attributes (
        inode     INTEGER NOT NULL REFERENCES inodes(id),
        name_id   INTEGER NOT NULL REFERENCES names(id),
        value     BLOB NOT NULL,

        PRIMARY KEY (inode, name_id)
    );
CREATE VIEW contents_v AS
    SELECT * FROM contents JOIN names ON names.id = name_id
/* contents_v(rowid,name_id,inode,parent_inode,id,name,refcount) */;
CREATE VIEW ext_attributes_v AS
    SELECT * FROM ext_attributes JOIN names ON names.id = name_id
/* ext_attributes_v(inode,name_id,value,id,name,refcount) */;
CREATE TABLE sqlite_stat1(tbl,idx,stat);
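
One detail in this schema may matter when reading the empty results for the two lost+found queries: names.name is declared as BLOB. SQLite does not treat a TEXT literal as equal to a BLOB value, so if the names are stored as bytes (which the b"/lost+found" argument in the traceback suggests, though that is an assumption here), a comparison against 'lost+found' would return nothing even for a healthy entry. A small sketch of that behaviour with Python's sqlite3 module, using a one-table stand-in rather than the real database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name BLOB NOT NULL)")
# Store the name as bytes, so SQLite keeps it as a BLOB value.
conn.execute("INSERT INTO names (name) VALUES (?)", (b"lost+found",))

def count(sql, params=()):
    """Return the single COUNT(*) value produced by the query."""
    return conn.execute(sql, params).fetchone()[0]

# TEXT literal compared with a BLOB value: never equal.
text_hits = count("SELECT COUNT(*) FROM names WHERE name = 'lost+found'")
# Casting the literal to BLOB makes the comparison byte-wise.
cast_hits = count(
    "SELECT COUNT(*) FROM names WHERE name = CAST('lost+found' AS BLOB)"
)
# Binding Python bytes also compares BLOB with BLOB.
bytes_hits = count("SELECT COUNT(*) FROM names WHERE name = ?", (b"lost+found",))

print(text_hits, cast_hits, bytes_hits)
```

Under that assumption, rerunning the lookups with `name = CAST('lost+found' AS BLOB)` in the SQLite CLI would give a more reliable answer. The query for contents rows without a matching names row does not compare names at all, so its output is unaffected.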