Could not find an inode for extended directory ...


Carlo Wood Mar 16, 2008 12:40 PM
Posted in group: ext3grep
Undeleting with ext3grep relies entirely on finding block pointers
in old copies of inodes in the journal. If those don't exist anymore,
then it will fail. That is no reason to give up, but new code
will have to be written and new research will have to be done; I'm
afraid that unless someone pays me, it's not going to happen that
I code/add that.

Therefore, let's assume that the needed old inode copies exist.

Also, as has been said several times before, ext3grep is currently
not taking into account file system corruptions of certain types.
Support for that would also have to be added.

Therefore, let's assume the files were deleted with rm -rf or
something like that and the file system is intact.

There are furthermore a few things that I know don't work:
- ext3grep currently only supports undeleting regular files
  and symbolic links. Special files will simply be skipped.
- In the special case that a journal transaction wraps
  around (at the end of the journal) and its commit block
  appears as the first block, it is disregarded.

[ In fact -- this is exactly what happened on my firewall
  machine when I had a power outage (while I was writing
  ext3grep to recover my home directory!) and resulted
  in e2fsck NOT being able to recover(!). In fact,
  I blame e2fsck that I lost the entire /etc directory and
  could not boot my firewall anymore (and thus had no
  internet!). I took the harddisk out of the firewall and
  put it in another machine that supported parallel IDE,
  copied the partition to my work horse (which uses SATA)
  and used ext3grep to manually figure out why /etc was gone,
  recovered (manually, using dd in the end) its directory
  inode and dir entry blocks and was then able to run
  e2fsck to make everything consistent again. I then dd-ed
  the repaired partition back to the harddisk, moved the
  harddisk back into the firewall, and everything worked
  again. Recovery that way took me (only) 12 hours. ]

- As pointed out in the HOWTO, there are plenty of (old)
  directory blocks whose dir entries refer to inodes that
  have since been reused. Those reused inodes have an
  unrelated delete time that can easily fall into the time
  window that you are undeleting. As a result, those ancient
  dir entries are "hard linked" to another file: they
  use the same inode. I did not add any (heuristic) code
  to try and determine if that is a mistake or not, if
  at all possible; and the result is a lot of wrong
  hardlinks.

However, the reason for this post is what I think is MOST
broken, and what I will attempt to fix:

ext3grep starts with finding all directory blocks and
then tries to reassemble the whole tree. It does this
without using the journal (which, I think, is a good
thing). It seems to do a good job for directory blocks
that contain a dir entry with the name ".", because
that dir entry has the inode number of the directory (path)
that this block belongs to, and it will also contain
a dir entry for ".." with the inode of the parent
directory, as such allowing us to build a directory
structure.
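
To make the "." / ".." reasoning concrete, here is a rough Python
sketch. It is not ext3grep's code (ext3grep is C++); representing a
directory block as a dict of name -> inode, the function name
build_paths and the sample blocks are all assumptions made for the
illustration. Inode 2 is the root directory on ext2/ext3.

```python
def build_paths(start_blocks, root_inode=2):
    """Map each directory inode to a path, using "." and ".." entries.

    start_blocks -- hypothetical list of first directory blocks,
                    each a dict mapping entry name -> inode number.
    """
    parent = {}  # directory inode -> inode of its parent
    name = {}    # directory inode -> name it has in some dir entry

    for block in start_blocks:
        parent[block["."]] = block[".."]

    for block in start_blocks:
        for entry, inode in block.items():
            if entry not in (".", "..") and inode in parent:
                name[inode] = entry

    paths = {}
    for inode in parent:
        parts, cur, seen = [], inode, set()
        # Walk up via ".." until we reach the root (or get stuck).
        while cur != root_inode and cur in parent and cur not in seen:
            seen.add(cur)
            parts.append(name.get(cur, "<inode %d>" % cur))
            cur = parent[cur]
        paths[inode] = "/".join(reversed(parts))
    return paths


# Made-up sample data, loosely following a root/photos/subdir layout.
blocks = [
    {".": 2, "..": 2, "photos": 123},
    {".": 123, "..": 2, "subdir": 444},
    {".": 444, "..": 123},
]
paths = build_paths(blocks)
print(paths[444])  # photos/subdir
```

The sketch only works for blocks that carry "." and "..", which is
exactly the limitation discussed next.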

But extended directory blocks (if a directory is so large
that there are more dir entries than fit in one block, more
blocks are used; those extra blocks then do not contain
the dir entries for "." and ".." of course) have no inode
of themselves to tell ext3grep what they belong to.
I solved that as follows: if an extended directory block
contains one or more dir entries for directories, I find
the directory block for that inode and then use the inode
of ".." in that block as the inode of the extended
directory block.

Um, graphical example:

Suppose we have this tree:

root/photos/file1
root/photos/file2
...
root/photos/fileN
root/photos/subdir/

Suppose that the directory root/photos contains
so many dir entries that they don't fit in one
block, but use two blocks:

Block1:
123        .
2        ..
501        file1
502        file2
...

Block2:
600        fileN
444        subdir

Then, finding block 1, we know that it belongs to the
directory with inode 123, whose parent is inode 2.

Finding block 2 we know nothing.

However, somewhere else we find a directory block
with this content:

Block3:
444        .
123        ..

Since Block2 tells us it contains a directory with
inode 444, we now know that Block2 belongs to the
directory with inode 123.
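
The lookup in this example can be sketched in a few lines of Python.
Again, this is only an illustration and not ext3grep's actual
implementation; the dict representation of a block and the function
name owner_of_extended_block are made up for the example.

```python
def owner_of_extended_block(ext_block, start_blocks):
    """Guess the inode of the directory an extended block belongs to.

    ext_block    -- entries of the extended block (no "." or ".."),
                    as a dict mapping name -> inode
    start_blocks -- first blocks of directories, keyed by the inode
                    in their "." entry (hypothetical index)
    """
    for name, inode in ext_block.items():
        first = start_blocks.get(inode)
        if first is None:
            continue  # not a known directory; tells us nothing
        if ".." in first:
            # The parent of that subdirectory owns this block.
            return first[".."]
    return None  # no subdirectory entries: the hard case


block1 = {".": 123, "..": 2, "file1": 501, "file2": 502}
block2 = {"fileN": 600, "subdir": 444}
block3 = {".": 444, "..": 123}

start_blocks = {b["."]: b for b in (block1, block3)}
owner = owner_of_extended_block(block2, start_blocks)
print(owner)  # 123
```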

The REAL problem exists when we find extended directory
blocks that do not contain any subdirectory themselves.

ext3grep solves that currently by using an (old) locate
dump. Suppose we find an extended directory with
the following content:

Block4:
1000        xyz4925
1001        xyz4926
1002        xyz4927
...

it then looks in the locate dump for paths with those
filenames, finding, for example:

photos/xyz4925
photos/xyz4926
photos/xyz4927
photos.old/xyz4925
photos.old/xyz4926

it then guesses that Block4 belongs to the directory
"photos" because it contains "photos/xyz4927".

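A rough Python sketch of such a guess follows. The post does not say
how ext3grep ranks the candidates, so picking the directory that
matches the most filenames (and giving up on a tie) is an assumption
of this sketch, as is the function name guess_parent_from_locate.

```python
from collections import Counter
import posixpath


def guess_parent_from_locate(filenames, locate_paths):
    """Guess the parent directory of an extended block.

    filenames    -- names found in the extended directory block
    locate_paths -- paths from an (old) locate dump
    Assumption: the directory matching the most names wins.
    """
    wanted = set(filenames)
    hits = Counter()
    for path in locate_paths:
        directory, name = posixpath.split(path)
        if name in wanted:
            hits[directory] += 1
    if not hits:
        return None  # nothing in the dump matches
    ranked = hits.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # two directories match equally well
    return ranked[0][0]


locate_paths = [
    "photos/xyz4925",
    "photos/xyz4926",
    "photos/xyz4927",
    "photos.old/xyz4925",
    "photos.old/xyz4926",
]
guess = guess_parent_from_locate(
    ["xyz4925", "xyz4926", "xyz4927"], locate_paths)
print(guess)  # photos
```
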
If that also fails because they are not in the locate
database, it matches all files in the extended directory
block against a table with regular expressions. When
either 10% or 70% (depends on whether or not...) of
all the files match a given regular expression, the
corresponding parent directory is returned. You need
to edit locate.cc and add those regular expressions
yourself!
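
The idea of the regular expression fallback can be sketched like
this in Python. It is not the code in locate.cc: the table contents,
the function name guess_parent_from_regex and the use of a full
match are assumptions, and the threshold is left as a parameter
because the post does not spell out when 10% or 70% applies.

```python
import re

# Hypothetical table of (pattern, parent directory) pairs.
REGEX_TABLE = [
    (re.compile(r"xyz\d+"), "photos"),
]


def guess_parent_from_regex(filenames, table, threshold):
    """Return the parent directory of the first pattern that matches
    at least `threshold` (a fraction, e.g. 0.7) of the filenames."""
    if not filenames:
        return None
    for pattern, parent in table:
        matched = sum(1 for n in filenames if pattern.fullmatch(n))
        if matched / len(filenames) >= threshold:
            return parent
    return None


names = ["xyz4925", "xyz4926", "notes.txt"]
strict = guess_parent_from_regex(names, REGEX_TABLE, 0.7)
loose = guess_parent_from_regex(names, REGEX_TABLE, 0.1)
print(strict)  # None (2 of 3 names match, below 70%)
print(loose)   # photos
```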

Until you provide a recent locate database dump and
created a correct table with regular expressions
you will run into a (long?) list of the following messages:

Could not find an inode for extended directory at BLOCKNR, disregarding it's contents.

This is BAD. You don't want to see those messages.

Currently, what you want to do is edit locate.cc and
set the variable test_blocknr to that number. Change all
#if 0 into #if 1. Recompile ext3grep and run the command again.
Interpret the extra output and fix/add a regular expression,
if possible. If that is not possible, you can also just
add the block number in the if() statement at the beginning
of parent_directory() in locate.cc and return some non-empty
directory there.

--
Carlo Wood <ca...@alinoe.com>