On Thursday, April 12, 2012 3:59:52 PM UTC+5:30, Harry wrote:
> Hello,
> I posted my question to superuser.com
> (http://superuser.com/questions/410796/unable-to-repair-an-ext4-filesystem-with-bad-superblock)
> but haven't got any response yet.
>
> In summary, not realizing what I was doing, I overwrote the first 446
> bytes of the MBR with the `dd` command. I would greatly, *GREATLY*
> appreciate it if someone could help me salvage my disk!
>
>
> =================
> Details of what I did:
> =================
>
> Using the `dd` command, I was hoping that I would be able to copy over
> the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
> make Disk A bootable just like Disk B. I issued the command:
>
> dd if=/dev/sdb of=/dev/sda bs=446 count=1
>
> But when I could not boot up from `sda`, I rebooted from `sdb` to see
> what was going on. To my horror, `sda` was being reported to have a
> bad superblock, now.
>
> Worse, I was **unable** to repair it via the backup superblocks stored
> on the ext4 filesystem. This is what I did. I first got the backup
> superblock addresses, like so:
>
> [root@localhost liveuser]# mke2fs -n /dev/sda
> mke2fs 1.41.14 (22-Dec-2010)
> /dev/sda is entire device, not just one partition!
> Proceed anyway? (y,n) y
> Filesystem label=
> OS type: Linux
> Block size=4096 (log=2)
> Fragment size=4096 (log=2)
> Stride=0 blocks, Stripe width=0 blocks
> 4890624 inodes, 19537686 blocks
> 976884 blocks (5.00%) reserved for the super user
> First data block=0
> Maximum filesystem blocks=0
> 597 block groups
> 32768 blocks per group, 32768 fragments per group
> 8192 inodes per group
> Superblock backups stored on blocks:
> 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632,
> 2654208, 4096000, 7962624, 11239424
>
> Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
> `SUPERBLOCK` values listed above, like so:
>
> [root@localhost liveuser]# e2fsck -b 32768 /dev/sda
> e2fsck 1.41.14 (22-Dec-2010)
> e2fsck: Bad magic number in super-block while trying to open /dev/sda
>
> The superblock could not be read or does not describe a correct ext2
> filesystem. If the device is valid and it really contains an ext2
> filesystem (and not swap or ufs or something else), then the superblock
> is corrupt, and you might try running e2fsck with an alternate superblock:
> e2fsck -b 8193 <device>
>
> I tried every single value, but each gave the above message!
>
> **Is there anything that I could do NOW to salvage my precious disk?**
> This is an 80G disk with 2 partitions. The `/dev/sda1` partition is
> clean and is mountable; it is the `/dev/sda2` partition that is
> failing to work with commands like `mount`, `debugfs`, `dumpe2fs`,
> etc.
>
> Running `mke2fs -n` for the individual partitions gave me this (notice
> how the **First Data Block** and **Maximum filesystem blocks** both
> show **0** as their value):
>
> [root@localhost liveuser]# mke2fs -n /dev/sda1
> mke2fs 1.41.14 (22-Dec-2010)
> Filesystem label=
> OS type: Linux
> Block size=1024 (log=0)
> Fragment size=1024 (log=0)
> Stride=0 blocks, Stripe width=0 blocks
> 128016 inodes, 512000 blocks
> 25600 blocks (5.00%) reserved for the super user
> First data block=1
> Maximum filesystem blocks=67633152
> 63 block groups
> 8192 blocks per group, 8192 fragments per group
> 2032 inodes per group
> Superblock backups stored on blocks:
> 8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409
>
> [root@localhost liveuser]# mke2fs -n /dev/sda2
> mke2fs 1.41.14 (22-Dec-2010)
> Filesystem label=
> OS type: Linux
> Block size=4096 (log=2)
> Fragment size=4096 (log=2)
> Stride=0 blocks, Stripe width=0 blocks
> 4857856 inodes, 19409408 blocks
> 970470 blocks (5.00%) reserved for the super user
> First data block=0
> Maximum filesystem blocks=0
> 593 block groups
> 32768 blocks per group, 32768 fragments per group
> 8192 inodes per group
> Superblock backups stored on blocks:
> 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632,
> 2654208, 4096000, 7962624, 11239424
>
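A side note on the "Superblock backups stored on blocks" lists above: they appear to follow the usual sparse_super rule, where backups go into block group 1 and into groups whose number is a power of 3, 5 or 7. The sketch below is only a cross-check of that arithmetic; the function names are made up for illustration. It reproduces what `mke2fs -n` would print for a device of a given geometry and says nothing about whether a filesystem really starts at block 0 of that device.

```python
from __future__ import annotations


def is_power_of(n: int, base: int) -> bool:
    """Return True if n == base**k for some integer k >= 0."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1


def backup_superblock_blocks(block_groups: int, blocks_per_group: int,
                             first_data_block: int) -> list[int]:
    """Block numbers of backup superblocks under the sparse_super layout.

    Group 0 holds the primary superblock, so only groups >= 1 are listed.
    Group 1 qualifies because 1 == 3**0.
    """
    groups = [g for g in range(1, block_groups)
              if any(is_power_of(g, base) for base in (3, 5, 7))]
    return [first_data_block + g * blocks_per_group for g in groups]


# Geometry taken from the `mke2fs -n /dev/sda2` dry run above.
print(backup_superblock_blocks(593, 32768, 0))
# Geometry taken from the `mke2fs -n /dev/sda1` dry run above.
print(backup_superblock_blocks(63, 8192, 1))
```

Both calls give the same block numbers as the two dry runs, which suggests the lists are a pure function of device size and block size rather than something read from the disk.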
> I still don't know what was wrong with my `dd` command that corrupted my
> ext4 superblock. You cannot imagine how happy I will be if someone
> could help me recover my disk... since, except for this bad
> superblock, all the data is just sitting right there!
>
> PLEASE HELP, my soul is crying as I write this :-((
>
> PS: If I can get a response faster/better from another forum on the
> Net, please do point me to it.
After about 70 posts in this thread by some very helpful and knowledgeable members of this forum, I came across this article:
http://www.microdevsys.com/WordPress/2011/09/19/linux-lvm-recovering-a-lost-volume/
Though the specifics of my situation and the causes leading to it were different from the author's, I did manage to notice the 'vgcfgrestore' command that had been missing all along in all the various suggestions made. I decided to give it a try - and lo and behold - it worked!
While I'm VERY happy now at the prospect of a full data recovery (assuming no data was lost due to e2fsck's fixes), I'd still like to seek some final help from you folks in reconstructing the 'crime scene': basically, deducing from the following sequence of steps what must have gotten corrupted, and how and why. I have already told my story enough times in this thread and on superuser.com, but is it consistent with these additional data points from the solution?
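As one data point for that reconstruction, here is a minimal Python model of what `dd if=/dev/sdb of=/dev/sda bs=446 count=1` should touch if it did only what it says. It assumes the classic MBR layout (boot code in bytes 0..445, partition table in bytes 446..509, boot signature in bytes 510..511) and uses synthetic 512-byte sectors, not data from the real disks; `copy_boot_code` is a made-up name for illustration.

```python
SECTOR_SIZE = 512
BOOT_CODE_END = 446   # bytes 0..445: boot code (disk signature sits at 440..443)
TABLE_END = 510       # bytes 446..509: four 16-byte partition entries
                      # bytes 510..511: 0x55 0xAA boot signature


def copy_boot_code(src_sector: bytes, dst_sector: bytes) -> bytes:
    """Model `dd bs=446 count=1`: overwrite only the first 446 bytes of dst."""
    if len(src_sector) != SECTOR_SIZE or len(dst_sector) != SECTOR_SIZE:
        raise ValueError("expected two 512-byte sectors")
    return src_sector[:BOOT_CODE_END] + dst_sector[BOOT_CODE_END:]


# Synthetic stand-ins for sector 0 of Disk B (source) and Disk A (target).
disk_b = b"B" * 446 + b"b" * 64 + b"\x55\xaa"
disk_a = b"A" * 446 + b"a" * 64 + b"\x55\xaa"

result = copy_boot_code(disk_b, disk_a)

print(result[:BOOT_CODE_END] == disk_b[:BOOT_CODE_END])                    # True
print(result[BOOT_CODE_END:TABLE_END] == disk_a[BOOT_CODE_END:TABLE_END])  # True
print(result[TABLE_END:] == b"\x55\xaa")                                   # True
```

Under this model the partition table of the target disk survives, and nothing beyond the first sector is written, so the command as quoted would not by itself reach a filesystem superblock.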
================
Solution:
================
a) I noticed that the string /dev/sda2 was written in the image of the partition. The LVM backup file marked the device path as a hint only:
device = "/dev/sda2" # Hint only
Even so, I decided to be a little *superstitious* and switched the data cables in my system once again, so that the messed up disk became /dev/sda again.
$ fdisk -l
Disk /dev/sda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     1026047      512000   83  Linux
/dev/sda2         1026048   156301311    77637632   8e  Linux LVM
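The numbers in this listing can be cross-checked with a little arithmetic. The sketch below hard-codes the start/end sectors and type IDs from the fdisk output and derives each partition's byte offset and its size in 1 KiB blocks; `describe` is a made-up helper name. It also flags type 8e, which fdisk labels "Linux LVM". On such a partition the ext4 filesystem would normally live inside a logical volume, not directly at the start of /dev/sda2, which would be consistent with `dumpe2fs` and friends finding no superblock there.

```python
SECTOR_BYTES = 512  # "Units = sectors of 1 * 512 = 512 bytes"
LVM_TYPE_ID = 0x8E  # shown by fdisk as "Linux LVM"

# (start sector, end sector, partition type ID), copied from the fdisk listing.
partitions = {
    "/dev/sda1": (2048, 1026047, 0x83),
    "/dev/sda2": (1026048, 156301311, 0x8E),
}


def describe(start: int, end: int, type_id: int) -> dict:
    """Derive byte offset and size in 1 KiB blocks for one partition entry."""
    sectors = end - start + 1  # the end sector is inclusive
    return {
        "offset_bytes": start * SECTOR_BYTES,
        "blocks_1k": sectors * SECTOR_BYTES // 1024,
        "is_lvm": type_id == LVM_TYPE_ID,
    }


for name, (start, end, type_id) in partitions.items():
    print(name, describe(start, end, type_id))
```

The derived block counts match the "Blocks" column (512000 and 77637632), so the partition table itself looks intact.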
b) I then restored the /etc/lvm directory.
$ mv /etc/lvm.off /etc/lvm
$ ls -l /etc/lvm/backup/
total 4
-rw-------. 1 root root 1456 Apr 19 09:16 vg_XYZ
c) I then issued an `lvm vgcfgrestore` and -- wow, unlike for the author of the above article -- the command worked for me!
$ lvm vgcfgrestore vg_XYZ
Restored volume group vg_XYZ
$ lvm vgscan
Reading all physical volumes. This may take a while...
Found volume group "vg_XYZ" using metadata type lvm2
d) At this point, I decided to repair the filesystem, relying on the strength of my backed up image of the messed up disk. In other words, even if my e2fsck'ing were to wreak further havoc on the disk, I could always return to Step (c) above.
$ e2fsck /dev/mapper/vg_XYZ-lv_root
I had to answer 'y' to so many
Directories count wrong for group #<NNN>
prompts that I had almost given up hope.
e) Verifying that the filesystem was clean.
$ e2fsck /dev/mapper/vg_XYZ-lv_root
e2fsck 1.41.14 (22-Dec-2010)
/dev/mapper/vg_XYZ-lv_root: clean, 912765/4595712 files, 14271569/18374656 blocks
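The summary line also gives a rough size check. The Python sketch below parses the used/total counts and compares the size of lv_root with the size of the /dev/sda2 partition from the fdisk listing. The 4096-byte block size is an assumption carried over from the `mke2fs -n /dev/sda2` dry run, not something e2fsck reported here, and `parse_summary` is a made-up helper name.

```python
import re

# Summary line copied from the e2fsck run above.
SUMMARY = ("/dev/mapper/vg_XYZ-lv_root: clean, "
           "912765/4595712 files, 14271569/18374656 blocks")

BLOCK_KIB = 4             # ASSUMPTION: 4096-byte blocks, as in the mke2fs -n dry run
PARTITION_KIB = 77637632  # "Blocks" column for /dev/sda2 in the fdisk listing


def parse_summary(line: str) -> dict:
    """Extract used/total inode and block counts from an e2fsck summary line."""
    m = re.search(r"(\d+)/(\d+) files, (\d+)/(\d+) blocks", line)
    if m is None:
        raise ValueError("not an e2fsck summary line")
    used_inodes, total_inodes, used_blocks, total_blocks = map(int, m.groups())
    return {
        "used_inodes": used_inodes,
        "total_inodes": total_inodes,
        "used_blocks": used_blocks,
        "total_blocks": total_blocks,
    }


stats = parse_summary(SUMMARY)
lv_root_kib = stats["total_blocks"] * BLOCK_KIB
unaccounted_kib = PARTITION_KIB - lv_root_kib

print(f"blocks in use: {100 * stats['used_blocks'] / stats['total_blocks']:.1f}%")
print(f"lv_root size : {lv_root_kib} KiB")
print(f"rest of sda2 : {unaccounted_kib} KiB")
```

If the block-size assumption holds, lv_root is about 4 GiB smaller than the partition. The remainder would have to hold the LVM metadata and any other logical volumes, which again points to the filesystem not starting at the beginning of /dev/sda2.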
f) Mounting.
$ mount /dev/mapper/vg_XYZ-lv_root /mnt/x
$ ls -l /mnt/x
total 144
dr-xr-xr-x. 2 root root 4096 Apr 8 09:39 bin
drwxr-xr-x. 2 root root 4096 Oct 25 18:30 boot
drwxr-xr-x. 2 root root 4096 Mar 3 2011 cgroup
drwxr-xr-x. 2 root root 4096 Oct 25 18:30 dev
drwxr-xr-x. 189 root root 12288 Apr 10 12:44 etc
drwxr-xr-x. 8 root root 4096 Jan 5 10:11 home
dr-xr-xr-x. 23 root root 12288 Mar 14 03:23 lib
drwx------. 2 root root 16384 Oct 25 18:30 lost+found
drwxr-xr-x. 2 root root 4096 Jul 29 2011 media
drwxr-xr-x. 3 root root 4096 Jul 29 2011 mnt
drwxr-xr-x. 5 root root 4096 Jan 30 22:00 opt
drwxr-xr-x. 2 root root 4096 Oct 25 18:30 proc
dr-xr-x---. 25 root root 4096 Apr 10 12:43 root
drwxr-xr-x. 36 root root 4096 Nov 20 09:09 run
dr-xr-xr-x. 2 root root 12288 Mar 13 06:32 sbin
drwxr-xr-x. 2 root root 4096 Dec 31 08:29 selinux
drwxr-xr-x. 2 root root 4096 Jul 29 2011 srv
drwxr-xr-x. 2 root root 4096 Oct 25 18:30 sys
drwxrwxrwt. 49 root root 28672 Apr 10 12:44 tmp
drwxr-xr-x. 12 root root 4096 Nov 20 07:18 usr
drwxr-xr-x. 22 root root 4096 Jan 4 10:15 var