No success repairing my ext4 file system so far, PLEASE HELP!


Harry

Apr 12, 2012, 6:29:52 AM
Hello,
I posted my question to superuser.com
(http://superuser.com/questions/410796/unable-to-repair-an-ext4-filesystem-with-bad-superblock)
but haven't got any response yet.

In summary, not realizing what I was doing, I overwrote the first 446
bytes of the MBR with the `dd` command. I would greatly, *GREATLY*
appreciate it if someone could help me salvage my disk!


=================
Details of what I did:
=================

Using the `dd` command, I was hoping that I would be able to copy over
the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
make Disk A bootable just like Disk B. I issued the command:

dd if=/dev/sdb of=/dev/sda bs=446 count=1

But when I could not boot from `sda`, I rebooted from `sdb` to see
what was going on. To my horror, `sda` was now being reported as having
a bad superblock.
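For anyone repeating this experiment: a minimal sketch of the "back up first" step, run against scratch image files rather than real disks. The file names (`disk_a.img`, `disk_b.img`, `disk_a.mbr.bak`) are made up for illustration, and the sketch assumes the classic MBR layout in which bytes 0-445 hold boot code and bytes 446-511 hold the partition table and signature.

```shell
#!/bin/sh
# Sketch on scratch image files only; nothing here touches a real disk.
set -eu
work=$(mktemp -d)
src="$work/disk_b.img"   # stands in for the bootable disk
dst="$work/disk_a.img"   # stands in for the disk being modified

# Two 1 KiB scratch "disks", filled with 'B' and 'A' so changes are visible.
dd if=/dev/zero bs=1024 count=1 2>/dev/null | tr '\0' 'B' > "$src"
dd if=/dev/zero bs=1024 count=1 2>/dev/null | tr '\0' 'A' > "$dst"

# 1. Save the target's whole first sector before writing anything to it.
dd if="$dst" of="$work/disk_a.mbr.bak" bs=512 count=1 2>/dev/null

# 2. Copy only the 446-byte boot code area. conv=notrunc stops dd from
#    truncating a regular file; on a block device it makes no difference.
dd if="$src" of="$dst" bs=446 count=1 conv=notrunc 2>/dev/null

# 3. Bytes 446-511 of the target should still match the backup.
boot_after=$(dd if="$dst" bs=1 count=446 2>/dev/null)
table_after=$(dd if="$dst" bs=1 skip=446 count=66 2>/dev/null)
table_backup=$(dd if="$work/disk_a.mbr.bak" bs=1 skip=446 count=66 2>/dev/null)
dst_size=$(wc -c < "$dst" | tr -d ' ')

if [ "$table_after" = "$table_backup" ]; then
    echo "bytes 446-511 unchanged"
fi
rm -rf "$work"
```

With the saved sector at hand, undoing the change would be a matter of writing the backup file back with `bs=512 count=1`.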

Worse, I was **unable** to repair it via the backup superblocks stored
on the ext4 filesystem. This is what I did. I first got the backup
superblock addresses, like so:

[root@localhost liveuser]# mke2fs -n /dev/sda
mke2fs 1.41.14 (22-Dec-2010)
/dev/sda is entire device, not just one partition!
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
4890624 inodes, 19537686 blocks
976884 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
597 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424
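The block numbers above are not arbitrary. Assuming the filesystem uses the `sparse_super` feature (the usual default, but an assumption here), backup superblocks sit at the start of block group 1 and of every group whose number is a power of 3, 5 or 7. A short sketch that reproduces the list from the numbers `mke2fs -n` printed; the function names are made up for this example:

```shell
#!/bin/sh
# Succeeds when $1 is a power of $2 (1 counts as the zeroth power).
is_power_of() {
    n=$1
    while [ $((n % $2)) -eq 0 ]; do
        n=$((n / $2))
    done
    [ "$n" -eq 1 ]
}

# Arguments: first data block, blocks per group, number of block groups.
backup_superblocks() {
    first=$1 per_group=$2 groups=$3
    list=""
    g=1
    while [ "$g" -lt "$groups" ]; do
        if is_power_of "$g" 3 || is_power_of "$g" 5 || is_power_of "$g" 7; then
            list="${list:+$list }$((first + g * per_group))"
        fi
        g=$((g + 1))
    done
    echo "$list"
}

# Values taken from the mke2fs -n output above (4096-byte blocks).
backup_superblocks 0 32768 597
```

For a filesystem with 1024-byte blocks the first data block is 1, so every backup location shifts by one block (for example `backup_superblocks 1 8192 63`).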

Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
`SUPERBLOCK` values listed above, like so:

[root@localhost liveuser]# e2fsck -b 32768 /dev/sda
e2fsck 1.41.14 (22-Dec-2010)
e2fsck: Bad magic number in super-block while trying to open /dev/sda

The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>

I tried every single value, but each gave the above message!

**Is there anything that I could do NOW to salvage my precious disk?**
This is an 80G disk with 2 partitions. The `/dev/sda1` partition is
clean and is mountable; it is the `/dev/sda2` partition that is
failing to work with commands like `mount`, `debugfs`, `dumpe2fs`,
etc.

Running `mke2fs -n` on the individual partitions gave me this (notice
how, for `/dev/sda2`, **First data block** and **Maximum filesystem
blocks** both show **0** as their value):

[root@localhost liveuser]# mke2fs -n /dev/sda1
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=0 blocks, Stripe width=0 blocks
128016 inodes, 512000 blocks
25600 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=67633152
63 block groups
8192 blocks per group, 8192 fragments per group
2032 inodes per group
Superblock backups stored on blocks:
8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409

[root@localhost liveuser]# mke2fs -n /dev/sda2
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
4857856 inodes, 19409408 blocks
970470 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
593 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424

I still don't know what was wrong with my `dd` command that corrupted my
ext4 superblock. You cannot imagine how happy I will be if someone
can help me recover my disk... since, except for this bad
superblock, all the data is just sitting right there!

PLEASE HELP, my soul is crying as I write this :-((

PS: If I can get a response faster/better from another forum on the
Net, please do point me to it.

Richard Kettlewell

Apr 12, 2012, 6:46:15 AM
Harry <simon...@gmail.com> writes:
> Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
> `SUPERBLOCK` values listed above, like so:
>
> [root@localhost liveuser]# e2fsck -b 32768 /dev/sda

Did you mean sda or sda1 here? (And similarly elsewhere.)

> I still don't know what was wrong in my `dd` command that corrupted my
> ext4 superblock.

Next time make a backup first.

--
http://www.greenend.org.uk/rjk/

Harry

Apr 12, 2012, 7:08:26 AM
On Apr 12, 3:46 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
> Harry <simonsha...@gmail.com> writes:
> > Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
> > `SUPERBLOCK` values listed above, like so:
>
> >         [root@localhost liveuser]# e2fsck -b 32768 /dev/sda
>
> Did you mean sda or sda1 here?  (And similarly elsewhere.)

I meant sda2 throughout; the /dev/sda1 partition is just fine.

FYI, `fdisk -l` prints this:
Disk /dev/sda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     1026047      512000   83  Linux
/dev/sda2         1026048   156301311    77637632   8e  Linux LVM

Can you/somebody please help me? I have not modified the state of the
disk in any way other than overwriting the first 446 bytes of the MBR
via the `dd` command, so all the data should literally be sitting there.

> Next time make a backup first.
Yes, I have learnt the lesson the VERY PAINFUL way. I thought I was
smart enough for the 'simple-enough' operation I was doing.

Richard Kettlewell

Apr 12, 2012, 7:36:03 AM
Harry <simon...@gmail.com> writes:
> Richard Kettlewell <r...@greenend.org.uk> wrote:
>> Harry <simonsha...@gmail.com> writes:

>>> Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
>>> `SUPERBLOCK` values listed above, like so:
>>>
>>>         [root@localhost liveuser]# e2fsck -b 32768 /dev/sda
>>
>> Did you mean sda or sda1 here?  (And similarly elsewhere.)
>
> I meant sda2 throughout; the /dev/sda1 partition is just fine.

Passing the right device name to e2fsck might help, then.
(How do you expect anyone to help you if you don't describe the
situation accurately?)

> Fyi, the `fdisk -l` prints this:
> Disk /dev/sda: 80.0 GB, 80026361856 bytes
> 255 heads, 63 sectors/track, 9729 cylinders, total 156301488
> sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00000000
>
> Device Boot Start End Blocks Id System
> /dev/sda1 * 2048 1026047 512000 83 Linux
> /dev/sda2 1026048 156301311 77637632 8e Linux LVM

If those IDs are accurate then there's no filesystem in sda2, but rather
an LVM PV, which would explain why attempts to use ext4 tools on it
don't work.
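The Id column comes straight from the partition table. As a rough sketch (on a scratch 512-byte file, not a real disk): each of the four primary entries is 16 bytes long, the first entry starts at offset 446, and the type byte is the fifth byte of the entry. The helper name `part_type` is made up for this example.

```shell
#!/bin/sh
set -eu
img=$(mktemp)

# 512 zero bytes standing in for sector 0 of a disk.
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null

# Plant two type bytes where fdisk would look for them.
printf '\203' | dd of="$img" bs=1 seek=450 conv=notrunc 2>/dev/null   # 0x83, Linux
printf '\216' | dd of="$img" bs=1 seek=466 conv=notrunc 2>/dev/null   # 0x8e, Linux LVM

# Print the type byte (two hex digits) of primary partition $2 (1-4) in image $1.
part_type() {
    od -An -tx1 -j $((446 + 16 * ($2 - 1) + 4)) -N 1 "$1" | tr -d ' \n'
}

type1=$(part_type "$img" 1)
type2=$(part_type "$img" 2)
echo "partition 1 type: 0x$type1"
echo "partition 2 type: 0x$type2"
rm -f "$img"
```

The type byte is only a hint left by whoever created the partition; it does not prove what the partition actually contains.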

> Can you/somebody please help me? I have not modified the state of the
> disk in any other way other than first 446 bytes of MBR via the `dd`
> command, so all the data should literally be sitting there.

If that's what you actually did then it won't have touched the contents
of any partitions, and certainly not sda2 which is 501MB into the disk.
However, you seem to have a habit of getting device names wrong, which
might explain what really happened.
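A quick sanity check of the numbers, as a sketch using the values from the `fdisk -l` output quoted above (the variable names are illustrative). It also shows why a 446-byte write stays inside the boot code area of a classic MBR, assuming the usual 446 + 64 + 2 byte layout:

```shell
#!/bin/sh
sector_size=512
part2_start_sector=1026048        # "Start" column of the second partition

offset_bytes=$((part2_start_sector * sector_size))
offset_mib=$((offset_bytes / 1048576))

boot_code_bytes=446               # bytes 0-445
partition_table_bytes=64          # bytes 446-509, four 16-byte entries
signature_bytes=2                 # bytes 510-511
mbr_bytes=$((boot_code_bytes + partition_table_bytes + signature_bytes))

echo "second partition starts at byte $offset_bytes ($offset_mib MiB)"
echo "MBR sector size: $mbr_bytes bytes"
```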

--
http://www.greenend.org.uk/rjk/

Harry

Apr 12, 2012, 8:25:16 AM
Richard, let me first come clean with the whole truth.

Since my original post to superuser.com, I have switched the location
of the problem disk: it was 'sda' originally and is 'sdb' now.
This is because the original 'sdb' had a freshly installed, bootable
OS (Fedora 16) and it was bigger (250 GB) - so I thought I'd make it
my 'primary' disk (or, 'sda'), and keep the older 80 GB disk as the
'secondary' (or, 'sdb') disk.

However, just before pasting the `fdisk -l` output for you (in my
previous post to you), I simply searched and replaced 'sdb' with 'sda'
in the `fdisk -l` output to make my message on this forum look
consistent with what I posted yesterday on superuser.com.

Here is the unedited output of `fdisk -l` on the system that has this
problem disk currently:

Disk /dev/sdb: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *        2048     1026047      512000   83  Linux
/dev/sdb2         1026048   156301311    77637632   8e  Linux LVM

> > Can you/somebody please help me? I have not modified the state of the
> > disk in any other way other than first 446 bytes of MBR via the `dd`
> > command, so all the data should literally be sitting there.
>
> If that's what you actually did then it won't have touched the contents
> of any partitions, and certainly not sda2 which is 501MB into the disk.
> However, you seem to have a habit of getting device names wrong, which
> might explain what really happened.
>

I have a very, /VERY/ basic idea of partitions; other than a 1-line
description of the benefits of the LVM feature / concept, I don't really
know how to handle them, especially when something (like this) goes
wrong. Would you be able to (*please*) guide me through the necessary
steps that I need to carry out *now* to fix this situation? I can't even
begin to tell you how much I would appreciate your (or any other
reader's) help at this point. You can reprimand me as much as you like
on the way, I most certainly deserve it.

Aside: The basic problem here has been this: I've been using Linux for
a while but only as a black-box user - (1) to avoid using Windows, and
(2) because I love it, as it has tons of great and free software. I tend
to go with the defaults during installs and I am learning the systems
side slowly in a background thread. Though I have not mastered these
details fully, this time I somehow thought I had learned
'enough' (about the `dd` command and the MBR) when this disaster
happened. I was also very lazy in not taking a backup... as it was a
mere 446-byte chunk.

Richard Kettlewell

Apr 12, 2012, 8:49:07 AM
Harry <simon...@gmail.com> writes:
> I have a very, /VERY/ basic idea of partitions; other than a 1-line
> description of benefits of the LVM feature / concept, I don't really
> know how to handle them, especially when something (like this) goes
> wrong. Would you be able to (*please*) guide me with the necessary
> steps that i need to carry out *now* to fix this situation? Can't even
> begin to tell you, how much I would appreciate yours (or any other
> reader's) help at this point. You can reprimand me as much as you like
> on the way, I most certainly deserve it.

At a guess what's going on is that you're attempting to mount /dev/sdb2
when in fact your filesystem is not on sdb2 at all but in a logical
volume somewhere within it.

If that's the case then you can use 'lvdisplay' to get a list of logical
volumes, which should include the device name(s) you need.
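A sketch of what that inspection might look like, using only read-only LVM reporting commands. It assumes the lvm2 tools (`pvs`, `vgs`, `lvs`) are installed and that it is run as root; the wrapper function `lvm_report` is made up for this example.

```shell
#!/bin/sh
# Read-only LVM inspection: none of these commands modify the disk.
lvm_report() {
    for tool in pvs vgs lvs; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "LVM tool not found: $tool"
            return 1
        fi
    done
    pvs                                      # physical volumes, e.g. /dev/sdb2
    vgs                                      # volume groups built on them
    lvs -o vg_name,lv_name,lv_path,lv_size   # logical volumes and device paths
}

lvm_report || echo "LVM report incomplete"
```

If a logical volume is listed, the path in the `lv_path` column (rather than the partition itself) is the device that filesystem tools would be pointed at. A volume group that is found but inactive would first need activating, typically with `vgchange -ay`.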

--
http://www.greenend.org.uk/rjk/

Harry

Apr 12, 2012, 9:09:31 AM
On Apr 12, 5:49 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
> Harry <simonsha...@gmail.com> writes:
> > I have a very,  /VERY/ basic idea of partitions; other than a 1-line
> > description of benefits of the LVM feature / concept, I don't really
> > know how to handle them, especially when something (like this) goes
> > wrong. Would you be able to (*please*) guide me with the necessary
> > steps that i need to carry out *now* to fix this situation? Can't even
> > begin to tell you, how much I would appreciate yours (or any other
> > reader's) help at this point. You can reprimand me as much as you like
> > on the way, I most certainly deserve it.
>
> At a guess what's going on is that you're attempting to mount /dev/sdb2
> when in fact your filesystem is not on sdb2 at all but in a logical
> volume somewhere within it.

Oh!

> If that's the case then you can use 'lvdisplay' to get a list of logical
> volumes, which should include the device name(s) you need.
>

I issued `lvdisplay -vvv` as root and this is what I got:
(Note: /dev/sdb is the device whose first 446 bytes got messed up).

Processing: lvdisplay -vvv
O_DIRECT will be used
Setting global/locking_type to 1
Setting global/wait_for_locks to 1
File-based locking selected.
Setting global/locking_dir to /var/lock/lvm
Preparing SELinux context for /var/lock/lvm to
system_u:object_r:lvm_lock_t:s0.
Resetting SELinux context to default value.
Finding all logical volumes
/dev/sr0: Added to device cache
/dev/scd0: Aliased to /dev/sr0 in device cache (preferred
name)
/dev/disk/by-id/ata-HL-DT-ST_DVDRAM_GH22NS50_K00ABGG0301:
Aliased to /dev/scd0 in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0: Aliased to /
dev/scd0 in device cache
/dev/disk/by-label/Fedora\x2016\x20i386\x20DVD: Aliased to /
dev/scd0 in device cache
/dev/disk/by-id/wwn-0x5001480000000000: Aliased to /dev/scd0
in device cache
/dev/cdrom: Aliased to /dev/scd0 in device cache (preferred
name)
/dev/cdrw: Aliased to /dev/cdrom in device cache
/dev/dvd: Aliased to /dev/cdrom in device cache
/dev/dvdrw: Aliased to /dev/cdrom in device cache
/dev/sda: Added to device cache
/dev/disk/by-id/ata-ST250DM000-1BD141_9VYF4LT4: Aliased to /
dev/sda in device cache
/dev/disk/by-id/scsi-SATA_ST250DM000-1BD1_9VYF4LT4: Aliased
to /dev/sda in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:0:0: Aliased to /
dev/sda in device cache
/dev/disk/by-id/wwn-0x5000c5003f95ddd9: Aliased to /dev/sda in
device cache
/dev/sda1: Added to device cache
/dev/disk/by-id/ata-ST250DM000-1BD141_9VYF4LT4-part1: Aliased
to /dev/sda1 in device cache
/dev/disk/by-id/scsi-SATA_ST250DM000-1BD1_9VYF4LT4-part1:
Aliased to /dev/sda1 in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:0:0-part1: Aliased
to /dev/sda1 in device cache
/dev/disk/by-uuid/9c08e3e9-e9aa-43fc-a76b-36179552271d:
Aliased to /dev/sda1 in device cache
/dev/disk/by-label/_Fedora-16-i686-: Aliased to /dev/sda1 in
device cache
/dev/disk/by-id/wwn-0x5000c5003f95ddd9-part1: Aliased to /dev/
sda1 in device cache
/dev/sda2: Added to device cache
/dev/disk/by-id/ata-ST250DM000-1BD141_9VYF4LT4-part2: Aliased
to /dev/sda2 in device cache
/dev/disk/by-id/scsi-SATA_ST250DM000-1BD1_9VYF4LT4-part2:
Aliased to /dev/sda2 in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:0:0-part2: Aliased
to /dev/sda2 in device cache
/dev/disk/by-uuid/c3d258fd-5cf3-4da7-abab-c5903f940c7a:
Aliased to /dev/sda2 in device cache
/dev/disk/by-id/wwn-0x5000c5003f95ddd9-part2: Aliased to /dev/
sda2 in device cache
/dev/sdb: Added to device cache
/dev/disk/by-id/ata-ST380211AS_6PS0ND8D: Aliased to /dev/sdb
in device cache
/dev/disk/by-id/scsi-SATA_ST380211AS_6PS0ND8D: Aliased to /dev/
sdb in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:1:0: Aliased to /
dev/sdb in device cache
/dev/sdb1: Added to device cache
/dev/disk/by-id/ata-ST380211AS_6PS0ND8D-part1: Aliased to /dev/
sdb1 in device cache
/dev/disk/by-id/scsi-SATA_ST380211AS_6PS0ND8D-part1: Aliased
to /dev/sdb1 in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:1:0-part1: Aliased
to /dev/sdb1 in device cache
/dev/disk/by-uuid/27e07bc8-a9f3-4a0c-bed9-e73ac1fc95f8:
Aliased to /dev/sdb1 in device cache
/dev/sdb2: Added to device cache
/dev/disk/by-id/ata-ST380211AS_6PS0ND8D-part2: Aliased to /dev/
sdb2 in device cache
/dev/disk/by-id/scsi-SATA_ST380211AS_6PS0ND8D-part2: Aliased
to /dev/sdb2 in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:1:0-part2: Aliased
to /dev/sdb2 in device cache
/dev/loop0: Added to device cache
/dev/disk/by-label/Fedora-16-i686-Live-Desktop.iso: Aliased
to /dev/loop0 in device cache
/dev/loop1: Added to device cache
/dev/loop2: Added to device cache
/dev/loop3: Added to device cache
/dev/loop4: Added to device cache
/dev/loop5: Added to device cache
/dev/loop6: Added to device cache
/dev/loop7: Added to device cache
Opened /dev/loop0 RO O_DIRECT
/dev/loop0: size is 0 sectors
/dev/loop0: Skipping: Too small to hold a PV
Closed /dev/loop0
Opened /dev/sda RO O_DIRECT
/dev/sda: size is 488397168 sectors
/dev/sda: block size is 4096 bytes
/dev/sda: Skipping: Partition table signature found
Closed /dev/sda
/dev/cdrom: Skipping: Unrecognised LVM device type 11
Opened /dev/loop1 RO O_DIRECT
/dev/loop1: size is 0 sectors
/dev/loop1: Skipping: Too small to hold a PV
Closed /dev/loop1
Opened /dev/sda1 RO O_DIRECT
/dev/sda1: size is 475811840 sectors
Closed /dev/sda1
/dev/sda1: size is 475811840 sectors
Opened /dev/sda1 RO O_DIRECT
/dev/sda1: block size is 4096 bytes
Closed /dev/sda1
Using /dev/sda1
Opened /dev/sda1 RO O_DIRECT
/dev/sda1: block size is 4096 bytes
/dev/sda1: No label detected
Closed /dev/sda1
Opened /dev/loop2 RO O_DIRECT
/dev/loop2: size is 0 sectors
/dev/loop2: Skipping: Too small to hold a PV
Closed /dev/loop2
Opened /dev/sda2 RO O_DIRECT
/dev/sda2: size is 12582912 sectors
Closed /dev/sda2
/dev/sda2: size is 12582912 sectors
Opened /dev/sda2 RO O_DIRECT
/dev/sda2: block size is 4096 bytes
Closed /dev/sda2
Using /dev/sda2
Opened /dev/sda2 RO O_DIRECT
/dev/sda2: block size is 4096 bytes
/dev/sda2: No label detected
Closed /dev/sda2
Opened /dev/loop3 RO O_DIRECT
/dev/loop3: size is 0 sectors
/dev/loop3: Skipping: Too small to hold a PV
Closed /dev/loop3
Opened /dev/loop4 RO O_DIRECT
/dev/loop4: size is 0 sectors
/dev/loop4: Skipping: Too small to hold a PV
Closed /dev/loop4
Opened /dev/loop5 RO O_DIRECT
/dev/loop5: size is 0 sectors
/dev/loop5: Skipping: Too small to hold a PV
Closed /dev/loop5
Opened /dev/loop6 RO O_DIRECT
/dev/loop6: size is 0 sectors
/dev/loop6: Skipping: Too small to hold a PV
Closed /dev/loop6
Opened /dev/loop7 RO O_DIRECT
/dev/loop7: size is 0 sectors
/dev/loop7: Skipping: Too small to hold a PV
Closed /dev/loop7
Opened /dev/sdb RO O_DIRECT
/dev/sdb: size is 156301488 sectors
/dev/sdb: block size is 4096 bytes
/dev/sdb: Skipping: Partition table signature found
Closed /dev/sdb
Opened /dev/sdb1 RO O_DIRECT
/dev/sdb1: size is 1024000 sectors
Closed /dev/sdb1
/dev/sdb1: size is 1024000 sectors
Opened /dev/sdb1 RO O_DIRECT
/dev/sdb1: block size is 4096 bytes
Closed /dev/sdb1
Using /dev/sdb1
Opened /dev/sdb1 RO O_DIRECT
/dev/sdb1: block size is 4096 bytes
/dev/sdb1: No label detected
Closed /dev/sdb1
Opened /dev/sdb2 RO O_DIRECT
/dev/sdb2: size is 155275264 sectors
Closed /dev/sdb2
/dev/sdb2: size is 155275264 sectors
Opened /dev/sdb2 RO O_DIRECT
/dev/sdb2: block size is 4096 bytes
Closed /dev/sdb2
Using /dev/sdb2
Opened /dev/sdb2 RO O_DIRECT
/dev/sdb2: block size is 4096 bytes
/dev/sdb2: lvm2 label detected at sector 1
lvmcache: /dev/sdb2: now in VG #orphans_lvm2 (#orphans_lvm2)
Closed /dev/sdb2
No volume groups found
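The line `lvm2 label detected at sector 1` is the notable part of this output: LVM2 marks a physical volume with a label that begins with the ASCII string `LABELONE`, normally in the second 512-byte sector of the partition, and the tools look for it in the first four sectors. A sketch of that check on a scratch file; the helper `find_lvm_label` is hypothetical, and it only looks for the 8-byte signature, not the rest of the label or the volume group metadata:

```shell
#!/bin/sh
set -eu
pv=$(mktemp)

# Four zeroed sectors standing in for the start of a physical volume,
# with the label signature planted at the start of sector 1.
dd if=/dev/zero of="$pv" bs=512 count=4 2>/dev/null
printf 'LABELONE' | dd of="$pv" bs=1 seek=512 conv=notrunc 2>/dev/null

# Print the number of the first sector (0-3) of file $1 that starts with LABELONE.
find_lvm_label() {
    s=0
    while [ "$s" -lt 4 ]; do
        sig=$(od -An -tx1 -j $((s * 512)) -N 8 "$1" | tr -d ' \n')
        if [ "$sig" = "4c4142454c4f4e45" ]; then   # "LABELONE" in hex
            echo "$s"
            return 0
        fi
        s=$((s + 1))
    done
    return 1
}

label_sector=$(find_lvm_label "$pv")
echo "LVM2 label signature found at sector $label_sector"
rm -f "$pv"
```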


Does this look right to you?
What should I do now?

unruh

Apr 12, 2012, 11:46:41 AM
On 2012-04-12, Harry <simon...@gmail.com> wrote:
> On Apr 12, 3:46 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
>> Harry <simonsha...@gmail.com> writes:
>> > Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
>> > `SUPERBLOCK` values listed above, like so:
>>
>> >         [root@localhost liveuser]# e2fsck -b 32768 /dev/sda
>>
>> Did you mean sda or sda1 here?  (And similarly elsewhere.)
>
> I meant sda2 throughout; the /dev/sda1 partition is just fine.

I am very confused. Why in the world would you be writing to /dev/sda2
in order to make it bootable? Booting is usually from the MBR of the
disk, not from the partitions. So, what did you actually do? Please tell
us exactly, without any misprints this time.


>
> Fyi, the `fdisk -l` prints this:
> Disk /dev/sda: 80.0 GB, 80026361856 bytes
> 255 heads, 63 sectors/track, 9729 cylinders, total 156301488
> sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00000000
>
> Device Boot Start End Blocks Id System
> /dev/sda1 * 2048 1026047 512000 83 Linux
> /dev/sda2 1026048 156301311 77637632 8e Linux LVM

Is sda2 really LVM or is this another misprint or evidence of the
destruction you caused?


>
> Can you/somebody please help me? I have not modified the state of the
> disk in any other way other than first 446 bytes of MBR via the `dd`
> command, so all the data should literally be sitting there.

Of the MBR or of the second partition? Which?

>
>> Next time make a backup first.
> Yes, I have learnt the lesson a VERY PAINFUL way. I thought I was
> smart enough for the 'simple-enough' operation I was doing.

I guess that is usually the way people learn. If you really value the
data on that disk, buy another disk, do a dd copy of this disk to that
new disk, and do all your experiments on the new disk.
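A minimal sketch of that "clone first, experiment on the clone" workflow, shown on scratch files so it is safe to run. The file names are made up; with real hardware the input would be the damaged disk and the output a spare disk or an image file on a larger one.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
orig="$work/original.img"   # stands in for the damaged disk
clone="$work/clone.img"     # stands in for the spare disk or image file

# A 256 KiB scratch "disk" filled with a visible pattern.
dd if=/dev/zero bs=65536 count=4 2>/dev/null | tr '\0' 'x' > "$orig"

# Copy everything. conv=noerror,sync keeps going past read errors and pads
# short reads, so offsets in the copy still line up with the original.
dd if="$orig" of="$clone" bs=65536 conv=noerror,sync 2>/dev/null

# Verify the copy before experimenting on it.
if cmp -s "$orig" "$clone"; then
    clone_ok=yes
else
    clone_ok=no
fi
echo "clone identical to original: $clone_ok"
rm -rf "$work"
```

All repair attempts would then be made against the clone, leaving the original untouched as a fallback.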


unruh

Apr 12, 2012, 11:50:11 AM
On 2012-04-12, Harry <simon...@gmail.com> wrote:
> On Apr 12, 4:36 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
>> Harry <simonsha...@gmail.com> writes:
>> > Richard Kettlewell <r...@greenend.org.uk> wrote:
>> >> Harry <simonsha...@gmail.com> writes:
>> >>> Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
>> >>> `SUPERBLOCK` values listed above, like so:
>>
>> >>>         [root@localhost liveuser]# e2fsck -b 32768 /dev/sda
>>
>> >> Did you mean sda or sda1 here?  (And similarly elsewhere.)
>>
>> > I meant sda2 throughout; the /dev/sda1 partition is just fine.
>>
>> Passing the right device name to e2fsck might help, then.
>> (How do you expect anyone to help you if you don't describe the
>> situation accurately?)
>>
>> > Fyi, the `fdisk -l` prints this:
>> >     Disk /dev/sda: 80.0 GB, 80026361856 bytes
>> >     255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
>> >     Units = sectors of 1 * 512 = 512 bytes
>> >     Sector size (logical/physical): 512 bytes / 512 bytes
>> >     I/O size (minimum/optimal): 512 bytes / 512 bytes
>> >     Disk identifier: 0x00000000
>>
>> >        Device Boot      Start         End      Blocks   Id  System
>> >     /dev/sda1   *        2048     1026047      512000   83  Linux
>> >     /dev/sda2         1026048   156301311    77637632   8e  Linux LVM
>>
>> If those IDs are accurate then there's no filesystem in sda2, but rather
>> an LVM PV, which would explain why attempts to use ext4 tools on it
>> don't work.
>
> Richard, let me first come out clean with the whole truth.
>
> Since my original post to superuser.com, I had switched the location
> of the problem disk from it being 'sda' originally to being 'sdb' now.
> This is because the original 'sdb' had a freshly installed, bootable
> OS (Fedora 16) and it was bigger (250 GB) - so I thought I'd make it
> my 'primary' disk (or, 'sda'), and keep the older 80 G disk as the
> 'secondary' (or, 'sdb') disk.
>
> However, just before pasting the `fdisk -l` output for you (in my
> previous post to you), I simply searched and replaced 'sdb' with 'sda'
> in the `fdisk -l` output to make my message on this forum look
> consistent with what I posted yesterday on superuser.com.

And you expect help? Wi
But officer, how was I to know that shooting him in the head would kill
him. It was only a very tiny hole!

Doug Freyburger

Apr 12, 2012, 11:59:07 AM
Harry wrote:
>
> =================
> Details of what I did:
> =================
>
> Using the `dd` command, I was hoping that I would be able to copy over
> the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
> make Disk A bootable just like Disk B. I issued the command:
>
> dd if=/dev/sdb of=/dev/sda bs=446 count=1
>
> But when I could not boot up from `sda`, I rebooted from `sdb` to see
> what was going on. To my horror, `sda` was being reported to have a
> bad superblock, now.

Okay, let's step back and think about what you did compared to what you
have been trying to do since.

What you did - Take the partition table of disk sdb and write it to the
partition table of disk sda. That means nothing on sdb partitions is
affected. That means the partitions that used to be on sda have been
destroyed but none of the data inside those former partitions has been
touched.

What you have been trying to do since - Work on invalid partitions
that got cloned from disk sdb. No amount of effort on this front can
ever possibly work. Either the partition tables started out identical
and you saw no effect and none of this ever happened or they are
different and you are now trying to work on incorrect partition data.
That can't ever help.

So what do you need to do? You need to stop working on the invalid
partitions in the table and start working on restoring the correct
partitions. Nothing else is going to help.

Do you have a print out of the partition table that used to be on disk
sda? If you do, your situation is promising. If you don't, it's time to
start guessing.

Get the print out. Do "fdisk /dev/sda". By hand delete all of the
partitions that exist on it - They were copied in place and are not
valid. Then by hand create the partitions in the sizes and locations
they used to exist. Save it and reboot. Try the fsck again. If it
worked you're done with the debugging. Back to the drawing board of how
to mark a drive bootable - You have learned that's not the way.

If you got it wrong, your saving grace is that so far all you have written to
is the partition table. All of the data in the former partitions is
still there. So start using "fdisk /dev/sda" and start guessing at how
many partitions there used to be and what sizes they used to be. Hint -
Start by guessing one partition on the whole disk. Then one partition
on half of it. Then one partition on 3/4ths of it or whatever. Do a
binary descent. Eventually you'll know the exact size and location of
the first partition.

If there's a second partition it will start one cylinder after the
first. Initial guess is the rest of the drive. Lather rinse repeat
until you have it figured out.
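As a possible shortcut to pure guessing (a different technique from the binary descent described above, offered only as a sketch): an ext2/3/4 filesystem keeps its primary superblock 1024 bytes after its own start, and the superblock's magic number 0xEF53 sits 56 bytes into it, stored as the bytes `53 ef`. Scanning candidate start sectors for that pattern can suggest where a filesystem begins. The example below plants the magic in a scratch file and finds it again; the variable names and the tiny 64-sector image are assumptions for illustration.

```shell
#!/bin/sh
set -eu
img=$(mktemp)
total_sectors=64
fs_start=8      # hypothetical sector where the "lost" filesystem begins

# Zeroed scratch image with the ext magic planted at fs_start*512 + 1024 + 56.
dd if=/dev/zero of="$img" bs=512 count="$total_sectors" 2>/dev/null
printf '\123\357' | dd of="$img" bs=1 seek=$((fs_start * 512 + 1080)) conv=notrunc 2>/dev/null

# Try every sector as a possible filesystem start and test the two magic bytes.
candidates=""
s=0
while [ $((s * 512 + 1082)) -le $((total_sectors * 512)) ]; do
    magic=$(od -An -tx1 -j $((s * 512 + 1080)) -N 2 "$img" | tr -d ' \n')
    if [ "$magic" = "53ef" ]; then
        candidates="${candidates:+$candidates }$s"
    fi
    s=$((s + 1))
done
echo "possible filesystem start sectors: $candidates"
rm -f "$img"
```

On a real disk a two-byte pattern will produce false positives, and backup superblocks match as well, so any hit is only a candidate to be checked read-only (and ideally on a clone of the disk).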

The hard part will be figuring out the size of any swap partition
because fsck won't help. You hope that drive only had filesystems.

That's your strategy. Don't bother with any work on data inside the
partitions that are there now because the partition table is not correct.

Harry

Apr 12, 2012, 12:35:01 PM
On Apr 12, 8:46 pm, unruh <un...@invalid.ca> wrote:
unruh, I do expect to be reprimanded, so I cannot and won't fire
back at you...
I'm clarifying what I did in my reply to Doug's post. Otherwise, I'll have
to repeat the same info in multiple responses. If you can, please be around.

unruh

Apr 12, 2012, 12:39:17 PM
On 2012-04-12, Doug Freyburger <dfre...@yahoo.com> wrote:
> Harry wrote:
>>
>> =================
>> Details of what I did:
>> =================
>>
>> Using the `dd` command, I was hoping that I would be able to copy over
>> the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
>> make Disk A bootable just like Disk B. I issued the command:
>>
>> dd if=/dev/sdb of=/dev/sda bs=446 count=1
>>
>> But when I could not boot up from `sda`, I rebooted from `sdb` to see
>> what was going on. To my horror, `sda` was being reported to have a
>> bad superblock, now.
>
> Okay, let's step back and think about what you did compared to what you
> have been trying to do since.

Of course we have no idea if that is actually what he did, since he
keeps revising his statements. For example, maybe that was really
bs=446K. He first said everywhere that sda was written he meant sda2. Is
this one of those cases? It seems like it since he also says that sda1
partition is fine, and the partition table is readable by fdisk.


>
> What you did - Take the partition table of disk sdb and write it to the
> partition table of disk sda. That means nothing on sdb partitions is
> affected. That means the partitions that used to be on sda have been
> destroyed but none of the data inside those former partitions has been
> touched.

It is also unclear what the partitions were. It would seem, but who
knows, that sdb2 (since he has also told us he lied about the sda and
sdb labeling) is an LVM partition. (Why? Oh well.)

>
> What you have been trying to do since - Work on invalid partitions
> that got cloned from disk sdb. No amount of effort on this front can
> ever possibly work. Either the partition tables started out identical
> and you saw no effect and none of this ever happened or they are
> different and you are now trying to work on incorrect partition data.
> That can't ever help.
>
> So what do you need to do? You need to stop working on the invalid
> partitions in the table and start working on restoring the correct
> partitions. Nothing else is going to help.
>
> Do you have a print out of the partition table that used to be on disk
> sda? If you do, your situation is promising. If you don't, it's time to
> start guessing.
>
> Get the print out. Do "fdisk /dev/sda". By hand delete all of the
> partitions that exist on it - They were copied in place and are not
> valid. Then by hand create the partitions in the sizes and locations
> they used to exist. Save it and reboot. Try the fsck again. If it
> worked you're done with the debugging. Back to the drawing board of how
> to mark a drive bootable - You have learned that's not the way.

As I said, he should clone the disk and work on the clone only. He is
liable to mess things up still further by trying to fix things.

Harry

Apr 12, 2012, 12:48:29 PM
On Apr 12, 8:59 pm, Doug Freyburger <dfrey...@yahoo.com> wrote:
> Harry wrote:
>
> > =================
> > Details of what I did:
> > =================
>
> > Using the `dd` command, I was hoping that I would be able to copy over
> > the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
> > make Disk A bootable just like Disk B. I issued the command:
>
> > dd if=/dev/sdb of=/dev/sda bs=446 count=1
>
> > But when I could not boot up from `sda`, I rebooted from `sdb` to see
> > what was going on. To my horror, `sda` was being reported to have a
> > bad superblock, now.
>
> Okay, let's step back and think about what you did compared to what you
> have been trying to do since.
>
> What you did - Take the partition table of disk sdb and write it to the
> partition table of disk sda. That means nothing on sdb partitions is
> affected. That means the partitions that used to be on sda have been
> destroyed but none of the data inside those former partitions has been
> touched.

That's correct. With you so far.

> What you have been trying to do since - Work on invalid partitions
> that got cloned from disk sdb. No amount of effort on this front can
> ever possibly work. Either the partition tables started out identical
> and you saw no effect and none of this ever happened or they are
> different and you are now trying to work on incorrect partition data.
> That can't ever help.
>
> So what do you need to do? You need to stop working on the invalid
> partitions in the table and start working on restoring the correct
> partitions. Nothing else is going to help.

Yes, all I am interested in right now is to *somehow* be able to
recover the filesystem (ext4) and the data on it.

> Do you have a print out of the partition table that used to be on disk
> sda? If you do, your situation is promising. If you don't, it's time to
> start guessing.

No, I don't have a printout of the partition table of the former sda.

> Get the print out. Do "fdisk /dev/sda". By hand delete all of the
> partitions that exist on it - They were copied in place and are not
> valid. Then by hand create the partitions in the sizes and locations
> they used to exist. Save it and reboot. Try the fsck again. If it
> worked you're done with the debugging. Back to the drawing board of how
> to mark a drive bootable - You have learned that's not the way.
>
> If you got it wrong your saving grace is so far all you have written to
> is the partition table. All of the data in the former partitions is
> still there. So start using "fdisk /dev/sda" and start guessing at how
> many partitions there used to be and what sizes they used to be. Hint -
> Start by guessing one partition on the whole disk. Then one partition
> on half of it. Then one partition on 3/4ths of it or whatever. Do a
> binary descent. Eventually you'll know the exact size and location of
> the first partition.
>
> If there's a second partition it will start one cylinder after the
> first. Initial guess is the rest of the drive. Lather rinse repeat
> until you have it figured out.
>
> The hard part will be figuring out the size of any swap partition
> because fsck won't help. You hope that drive only had filesystems.
>
> That's your strategy. Don't bother with any work on data inside the
> partitions that are there now because the partition table is not correct.

Doug, after getting some revelations from Richard above (on LVM and
the futility of using ext4 tools directly on it), I have described my
situation more fully (and hopefully better) at the following superuser.com
link. Not sure if it would be rude on my part to ask you this, but if
you don't mind the inconvenience of clicking this link, you will see a
more succinct and helpful description of my problem.

Superuser.com link:
http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how-to-mount-and-recover-data-from-it

Before I try the binary descent approach you suggest above, may I
request you to go over this new superuser.com link and let me know if
you still think nothing simpler is possible to salvage the file
system sitting in the logical volume somewhere.

Many, many thanks to unruh and yourself for responding. I will be
forever grateful to you guys if you can help me salvage my data.


Harry

Apr 12, 2012, 12:53:43 PM
On Apr 12, 9:39 pm, unruh <un...@invalid.ca> wrote:
unruh, I indeed have a cloned image of the original messed up device
(/dev/sda).

Please forgive my typos in my original post. This doesn't mean that
everything I wrote had a typo in it. The `dd` command that I issued,
e.g., has no typos in the original post.

I know there is a way to mount the image of the clone device
(partition?) in loopback mode. If you/Doug/sb can provide step by step
instructions, then I would be very grateful to you.

Richard Kettlewell

Apr 12, 2012, 1:43:02 PM
Doug Freyburger <dfre...@yahoo.com> writes:
> Harry wrote:

>> =================
>> Details of what I did:
>> =================
>>
>> Using the `dd` command, I was hoping that I would be able to copy over
>> the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
>> make Disk A bootable just like Disk B. I issued the command:
>>
>> dd if=/dev/sdb of=/dev/sda bs=446 count=1
>>
>> But when I could not boot up from `sda`, I rebooted from `sdb` to see
>> what was going on. To my horror, `sda` was being reported to have a
>> bad superblock, now.
>
> Okay, let's step back and think about what you did compared to what you
> have been trying to do since.
>
> What you did - Take the partition table of disk sdb and write it to the
> partition table of disk sda.

That's not correct. 446 is precisely the value you use to avoid
modifying the partition table of the target. Since the partition table
quoted is consistent with an 80GB disk and not a 250GB disk, it's a safe
bet that the partition table on the target wasn't modified.

--
http://www.greenend.org.uk/rjk/

Harry

Apr 12, 2012, 1:46:46 PM
On Apr 12, 8:59 pm, Doug Freyburger <dfrey...@yahoo.com> wrote:
> Harry wrote:
>
> If you got it wrong your saving grace is so far all you have written to
> is the partition table.  All of the data in the former partitions is
> still there.  So start using "fdisk /dev/sda" and start guessing at how
> many partitions there used to be and what sizes they used to be.  Hint -
> Start by guessing one partition on the whole disk.  Then one partition
> on half of it.  Then one partition on 3/4ths of it or whatever.  Do a
> binary descent.  Eventually you'll know the exact size and location of
> the first partition.
>
> If there's a second partition it will start one cylinder after the
> first.  Initial guess is the rest of the drive.  Lather rinse repeat
> until you have it figured out.

Doug, is there any way to avoid having to reboot after each edit of
the partition table? The messed up disk is sitting as sdb (or, a
secondary disk) in my current system, which means I am not booting
from it. For example, given the fact that right now sdb is
unmountable, can I repeatedly try the mount command (or any
LVM equivalent of mount) to check whether or not my edits to the
partition table were correct? I have included a copy of a backup of
the LVM setup here:

http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how-to-mount-and-recover-data-from-it

Harry

Apr 12, 2012, 1:55:21 PM
On Apr 12, 10:43 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
I read from a Wikipedia entry on disk partitioning that the first 446
bytes of the MBR is 'pure code' -- i.e. no data. Which is why I
ventured in the first place to boldly attempt -- without a backup of
the target MBR -- the copying of the 446-byte MBR code from the former
(250 GB) sdb to the former (80G) sda.

But something went wrong, and I still don't understand it. Right now,
my priority is to get my data back. Immediately after this is done,
I'd also like to understand what the heck I did.

Doug Freyburger

Apr 12, 2012, 2:07:29 PM
Richard Kettlewell wrote:
> Doug Freyburger <dfre...@yahoo.com> writes:
>> Harry wrote:
>
>>> dd if=/dev/sdb of=/dev/sda bs=446 count=1
>
>> What you did - Take the partition table of disk sdb and write it to the
>> partition table of disk sda.
>
> That's not correct. 446 is precisely the value you use to avoid
> modifying the partition table of the target. Since the partition table
> quoted is consistent with an 80GB disk and not a 250GB disk, it's a safe
> bet that the partition table on the target wasn't modified.

I wonder if "dd" has an underlying block size or if it starts with an
uninitialized block.

I suspect it did not read the first sector, change the first 446 bytes
in the buffer, then write the first sector back out again. I fear it
wrote 512 bytes of data (maybe more in 512 byte increments) where the
first 446 bytes were good and the rest was nonsense. It would explain
the missing partition data.
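
One way to test that suspicion without touching a real disk is a
scratch-file experiment. This is only a sketch: it assumes a regular
file is a fair stand-in for the disk (conv=notrunc is added because a
block device, unlike a file, cannot be truncated), and the file names
are made up for the illustration. As far as I know, a sub-sector write
to a real block device goes through the kernel's page cache, which does
the read-modify-write, but the sketch does not prove that.

```shell
#!/bin/sh
# Scratch-file experiment: does dd with bs=446 count=1 touch bytes 446-511?
# Assumption: a regular file stands in for the target disk.
tmp=$(mktemp -d)

# "Target disk": two 512-byte sectors filled with the letter T.
head -c 1024 /dev/zero | tr '\0' 'T' > "$tmp/target.img"
# "Source disk": one 512-byte sector filled with the letter S.
head -c 512 /dev/zero | tr '\0' 'S' > "$tmp/source.img"

# Same block size and count as the command in the original post.
dd if="$tmp/source.img" of="$tmp/target.img" bs=446 count=1 conv=notrunc 2>/dev/null

# Count how many bytes were overwritten and how many survived.
changed=$(tr -cd 'S' < "$tmp/target.img" | wc -c | tr -d ' ')
kept=$(tr -cd 'T' < "$tmp/target.img" | wc -c | tr -d ' ')

# Bytes 446-511 are where the partition table and boot signature would be.
tail_of_sector=$(dd if="$tmp/target.img" bs=1 skip=446 count=66 2>/dev/null)

echo "changed=$changed kept=$kept"
rm -rf "$tmp"
```

On the scratch file this prints changed=446 kept=578, i.e. only the
first 446 bytes are replaced and the 66 bytes that would hold the
partition table are left alone.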

I'll reread the thread then glance at that forum.

Richard Kettlewell

Apr 12, 2012, 2:18:02 PM
Harry <simon...@gmail.com> writes:
> lvmcache: /dev/sdb2: now in VG #orphans_lvm2 (#orphans_lvm2)
> Closed /dev/sdb2
> No volume groups found

You MIGHT be able to use dmsetup to create a block device corresponding
to the lv_root volume described in the backup file you quote over on
superuser. See my response there.

--
http://www.greenend.org.uk/rjk/

Gernot Fink

Apr 12, 2012, 2:20:01 PM
In article <81fb3fa8-62ab-43be...@h4g2000pbe.googlegroups.com>,
Harry <simon...@gmail.com> writes:
> dd if=/dev/sdb of=/dev/sda bs=446 count=1

This should not shred the partition table, but it looks like it did.

Check or repair the partition table as the first step.
If you find nothing or a bad table, use testdisk to scan for
partitions.

After this, use dumpe2fs to find alternate superblocks.
http://www.cyberciti.biz/faq/linux-find-alternative-superblocks/

As the next step, make an initial boot with supergrubdisk.

--
MFG Gernot

Richard Kettlewell

Apr 12, 2012, 2:20:06 PM
Doug Freyburger <dfre...@yahoo.com> writes:
> I wonder if "dd" has an underlying block size or if it starts with an
> uninitialized block.
>
> I suspect it did not read the first sector, change the first 446 bytes
> in the buffer, then write the first sector back out again. I fear it
> wrote 512 bytes of data (maybe more in 512 byte increments) where the
> first 446 bytes were good and the rest was nonsense. It would explain
> the missing partition data.
>
> I'll reread the thread then glance at that forum.

The partition table is fine. You're chasing a red herring.

--
http://www.greenend.org.uk/rjk/

Doug Freyburger

Apr 12, 2012, 2:27:30 PM
Harry wrote:
>
> I issued `lvdisplay -vvv` as root and this is what I got:
> (Note: /dev/sdb is the device whose first 446 bytes got messed up).
> ...
> /dev/sdb: size is 156301488 sectors
> /dev/sdb: block size is 4096 bytes
> /dev/sdb: Skipping: Partition table signature found
> Closed /dev/sdb

There's a partition table for sdb. That's hopeful.

> /dev/sdb1: size is 1024000 sectors
> Closed /dev/sdb1
> /dev/sdb1: size is 1024000 sectors
> Opened /dev/sdb1 RO O_DIRECT
> /dev/sdb1: block size is 4096 bytes

That's 250 MB, right? It's a typical size for /boot.

> /dev/sdb2: size is 155275264 sectors
> Opened /dev/sdb2 RO O_DIRECT
> /dev/sdb2: block size is 4096 bytes

Does that match the 37 GB I calculated from the fdisk output? Seems
like only half of the drive was used. Most likely that's my arithmetic,
not what the numbers really say.
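
A quick way to check the arithmetic is to convert the sector counts
directly. This is a small sketch that assumes 512-byte sectors, which
is what the fdisk -l output quoted elsewhere in the thread reports; the
variable names are made up for the illustration.

```shell
#!/bin/sh
# Convert the sector counts reported for sdb1 and sdb2 into sizes.
# Assumption: 512-byte sectors, as in the fdisk -l output in this thread.
SECTOR_SIZE=512
SDB1_SECTORS=1024000
SDB2_SECTORS=155275264

# 1 MiB = 1048576 bytes, 1 GiB = 1024 MiB.
SDB1_MIB=$((SDB1_SECTORS * SECTOR_SIZE / 1048576))
SDB2_MIB=$((SDB2_SECTORS * SECTOR_SIZE / 1048576))
SDB2_GIB=$((SDB2_MIB / 1024))

echo "sdb1: $SDB1_MIB MiB"
echo "sdb2: $SDB2_MIB MiB (about $SDB2_GIB GiB)"
```

With these inputs sdb1 comes out at 500 MiB and sdb2 at 75818 MiB
(about 74 GiB), so the two partitions together would cover nearly all
of an 80 GB disk rather than half of it.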

> /dev/sdb2: lvm2 label detected at sector 1

That's a bingo.

> lvmcache: /dev/sdb2: now in VG #orphans_lvm2 (#orphans_lvm2)
> Closed /dev/sdb2
> No volume groups found

It says it found a volume group.

> Does it look good to you?
> What to do now?

Variations on "vgscan" and "vgimport". I've done more LVM work on HPUX
than on Linux recently so I can't rattle vgscan and vgimport lines off
the top of my head.

Doug Freyburger

Apr 12, 2012, 2:51:21 PM
Harry wrote:
>
> /dev/sdb2: lvm2 label detected at sector 1
> lvmcache: /dev/sdb2: now in VG #orphans_lvm2 (#orphans_lvm2)

After reading the forum response by Steven Monday I think this gives a
hint as to how to proceed next. It says there's a volume group there.
It says it does not know the name of that volume group (probably because
it was active not exported when it went down).

The first Linux host I could get to was Red Hat so it may behave differently.

>more /etc/redhat-release
Red Hat Enterprise Linux Server release 5.7 (Tikanga)

Try "vgs". It just might tell you there is now a volume group
"#orphans_lvm2" available for import. Pain in the neck having a hash in
the name but that's certainly deliberate.

Try "vgimport -a" to see if it brings it in as "#orphans_lvm2" or
"vg_XYZ".

Then try "vgimport -v vg_XYZ", yeah again, shrug.

Then "vgimport -v \#orphans_lvm2" escaping the hash.

If any of those work do "vgrename \#orphans_lvm2 vg_XYZ" and
"vgexport vg_XYZ".

If all of that fails try

vgimport WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw

or

vgrename WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw xg_XYZ

I pasted that ID from your forum post so if it's edited use the real
one.

Then all the "-a Y" stuff on the VG and LVs in it. Then fsck. Make
sure to vgexport it before removing it back to the previous host.

unruh

Apr 12, 2012, 2:57:08 PM
On 2012-04-12, Harry <simon...@gmail.com> wrote:
> On Apr 12, 8:46 pm, unruh <un...@invalid.ca> wrote:
>> On 2012-04-12, Harry <simonsha...@gmail.com> wrote:
>>
>> > On Apr 12, 3:46 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
>> >> Harry <simonsha...@gmail.com> writes:
>> >> > Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
>> >> > `SUPERBLOCK` values listed above, like so:
>>
>> >> >         [root@localhost liveuser]# e2fsck -b 32768 /dev/sda
>>
>> >> Did you mean sda or sda1 here?  (And similarly elsewhere.)
>>
>> > I meant sda2 throughout; the /dev/sda1 partition is just fine.
>>
>> I am very confused. Why in the world would you be writing to /dev/sda2
>> in order to make it bootable? Booting is usually from the MBR of the
>> disk, not from the partitions. So, what did you actually do? Please tell
>> us exactly without any misprints this time.
>>
>>
>>
>> > Fyi, the `fdisk -l` prints this:
>> >     Disk /dev/sda: 80.0 GB, 80026361856 bytes
>> >     255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
>> >     Units = sectors of 1 * 512 = 512 bytes
>> >     Sector size (logical/physical): 512 bytes / 512 bytes
>> >     I/O size (minimum/optimal): 512 bytes / 512 bytes
>> >     Disk identifier: 0x00000000
>>
>> >        Device Boot      Start         End      Blocks   Id  System
>> >     /dev/sda1   *        2048     1026047      512000   83  Linux
>> >     /dev/sda2         1026048   156301311    77637632   8e  Linux LVM
>>
>> Is sda2 really LVM or is this another misprint or evidence of the
>> destruction you caused?
>>
>>
>>
>> > Can you/somebody please help me? I have not modified the state of the
>> > disk in any other way other than first 446 bytes of MBR via the `dd`
>> > command, so all the data should literally be sitting there.
>>
>> Of the MBR or of the second partition? Which?
>>
>>
>>
>> >> Next time make a backup first.
>> > Yes, I have learnt the lesson a VERY PAINFUL way. I thought I was
>> > smart enough for the 'simple-enough' operation I was doing.
>>
>> I guess that is usually the way people learn. If you really value the
>> data on that disk, buy another disk, do a dd copy of this disk to that
>> new disk, and do all your experiments on the new disk.
>
> unruh, I do expect to be reprimanded so cannot and thus won't fire
> back at you...
> I'm clarifying what I did in Doug's post. Otherwise, I'll have to repeat
> the same info in multiple responses. If you can, please be around.

The reprimand is for lack of clarity. However my advice still stands. Do
not try to fix the original. Buy a new disk and do a dd copy from the
first to that, and then fix that copy. If your data is not worth $100,
then format the disk and start over. You will waste more than $100 of
time.


unruh

Apr 12, 2012, 3:03:10 PM
On 2012-04-12, Harry <simon...@gmail.com> wrote:
> On Apr 12, 9:39 pm, unruh <un...@invalid.ca> wrote:
>> On 2012-04-12, Doug Freyburger <dfrey...@yahoo.com> wrote:
>>
>> > Harry wrote:
>>
>> >> =================
>> >> Details of what I did:
>> >> =================
>>
>> >> Using the `dd` command, I was hoping that I would be able to copy over
>> >> the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
>> >> make Disk A bootable just like Disk B. I issued the command:
>>
>> >>     dd if=/dev/sdb of=/dev/sda bs=446 count=1
>>
>> >> But when I could not boot up from `sda`, I rebooted from `sdb` to see
>> >> what was going on. To my horror, `sda` was being reported to have a
>> >> bad superblock, now.
>>
>> > Okay, let's step back and think about what you did compared to what you
>> > have been trying to do since.
>>
>> Of course we have no idea if that is actually what he did, since he
>> keeps revising his statements. For example, maybe that was really
>> bs=446K. He first said everywhere that sda was written he meant sda2. Is
>> this one of those cases? It seems like it since he also says that sda1
>> partition is fine, and the partition table is readable by fdisk.
>>
>>
>>
>> > What you did - Take the partition table of disk sdb and write it to the
>> > partition table of disk sda.  That means nothing on sdb partitions is
>> > affected.  That means the partitions that used to be on sda have been
>> > destroyed but none of the data inside those former partitions has been
>> > touched.
>>
>> It is also unclear what the partitions were. It would seem, but who
>> knows, that sdb2 ( since he has also told us he lied about the sda and
>> sdb labeling) is an LVM partition. (Why? Oh well.)
>>
>> > What you have been trying to do since - Work on invalid partitions
>> > that got cloned from disk sdb.  No amount of effort on this front can
>> > ever possibly work.  Either the partition tables started out identical
>> > and you saw no effect and none of this ever happened or they are
>> > different and you are now trying to work on incorrect partition data.
>> > That can't ever help.
>>
>> > So what do you need to do?  You need to stop working on the invalid
>> > partitions in the table and start working on restoring the correct
>> > partitions.  Nothing else is going to help.
>>
>> > Do you have a print out of the partition table that used to be on disk
>> > sda?  If you do your situation is promising.  If you don't it's time to
>> > start guessing.
>>
>> > Get the print out.  Do "fdisk /dev/sda".  By hand delete all of the
>> > partitions that exist on it - They were copied in place and are not
>> > valid.  Then by hand create the partitions in the sizes and locations
>> > they used to exist.  Save it and reboot.  Try the fsck again.  If it
>> > worked you're done with the debugging.  Back to the drawing board of how
>> > to mark a drive bootable - You have learned that's not the way.
>>
>> As I said, he should clone the disk and work on the clone only. He is
>> liable to mess things up still further by trying to fix things.
>>
>> > If you got it wrong your saving grace is so far all you have written to
>> > is the partition table.  All of the data in the former partitions is
>> > still there.  So start using "fdisk /dev/sda" and start guessing at how
>> > many partitions there used to be and what sizes they used to be.  Hint -
>> > Start by guessing one partition on the whole disk.  Then one partition
>> > on half of it.  Then one partition on 3/4ths of it or whatever.  Do a
>> > binary descent.  Eventually you'll know the exact size and location of
>> > the first partition.
>>
>> > If there's a second partition it will start one cylinder after the
>> > first.  Initial guess is the rest of the drive.  Lather rinse repeat
>> > until you have it figured out.
>>
>> > The hard part will be figuring out the size of any swap partition
>> > because fsck won't help.  You hope that drive only had filesystems.
>>
>> > That's your strategy.  Don't bother with any work on data inside the
>> > partitions that are there now because the partition table is not correct.
>
> unruh, I indeed have a cloned image of the original messed up device (/
> dev/sda).
>
> Please forgive my typos in my original post. This doesn't mean that
> everything I wrote had a typo in it. The `dd` command that I issued,
> e.g., has no typos in the original post.
>
> I know there is a way to mount the image of the clone device
> (partition?) in loopback mode. If you/Doug/sb can provide step by step
> instructions, then I would be very grateful to you.

OK, so you have put away the original disk in a safe place, and will not
touch it. It is the clone you are working on. Then, as Doug says, your
partition table is probably messed up. Thus you have no reason to trust
anything it says (fdisk -l), and as he says, you have to try to
reconstruct it. I.e., you have to repartition the clone so that its
partitions are the same as they were before you made your mistake.
Do you have any information about how they were partitioned? When you
partitioned it originally, how did you do it? Did you accept the defaults
or did you round up (e.g. the first partition has 10GB and the second the
rest)? If you really cannot remember, I do not think that there is any
way of finding out; at least I do not know of any. There may well be
something in the structure of the disk that tells you from the structure
of the data where the second partition starts. How well used is the
disk? Is there liable to be a huge blank space before the next
partition because that space was never used?



Harry

Apr 12, 2012, 2:49:01 PM
On Apr 12, 11:18 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
Richard, please see my comment there. The command is syntactically
invalid. Did you try this on Fedora 16 or some other OS (or another
version of dmsetup)?

unruh

Apr 12, 2012, 3:07:43 PM
On 2012-04-12, Harry <simon...@gmail.com> wrote:
The file system is there. It is the partition table that is messed up,
probably. I.e., if you can figure out where the boundaries of the
partitions are, you can just reconstruct those and everything else will
be there, including ext4.

>
>> Do you have a print out of the partition table that used to be on disk
>> sda? If you do your situtation is promising. If you don't it's time to
>> start guessing.
>
> No, I don't have a printout of the partition table of the former sda.

That makes it hard. How did you originally partition it? How do you
choose your partition sizes?
The partition table on there now comes (probably) from the other drive
whose MBR you cloned. I.e., it is totally untrustworthy, including the LVM
comments.


>
> Superuser.com link:
> http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how-to-mount-and-recover-data-from-it
>
> Before I try the binary descent approach you suggest above, may I
> request you to go over this new superuser.com link and let me know if
> you still thing nothing else simpler is possible to salvage the file
> system sitting in the logical volume somewhere.

You have no idea if you have a logical volume there. The partition table
is completely untrustworthy. It is like trying to find your way to Rome
while using a map of Switzerland.

unruh

Apr 12, 2012, 3:09:29 PM
You may be right. On the other hand, the evidence is otherwise. He
cannot read the data on the disk, which he should be able to if the
partition table is unmodified.



unruh

Apr 12, 2012, 3:11:43 PM
On 2012-04-12, Richard Kettlewell <r...@greenend.org.uk> wrote:
You know that how? Which piece of data that he reports do you base this
on? And how does your theory explain the other symptoms? I.e., if he is
chasing a red herring, what is the real trail he should be following?



Harry

Apr 12, 2012, 3:12:45 PM
On Apr 12, 11:51 pm, Doug Freyburger <dfrey...@yahoo.com> wrote:
> Harry wrote:
>
> >       /dev/sdb2: lvm2 label detected at sector 1
> >         lvmcache: /dev/sdb2: now in VG #orphans_lvm2 (#orphans_lvm2)
>
> After reading the forum response by Steven Monday I think this gives a
> hint as to how to proceed next.  It says there's a volume group there.
> It says it does not know the name of that volume group (probably because
> it was active not exported when it went down).
>
> The first Linux host I could get to was Red Hat so it may behave differently.
>
> >more /etc/redhat-release
>
> Red Hat Enterprise Linux Server release 5.7 (Tikanga)
>
> Try "vgs".  It just might tell you there is now a volume group
> "#orphans_lvm2" available for import.  Pain in the neck having a hash in
> the name but that's certainly deliberate.
>
> Try "vgimport -a" to see if it brings it in as "#orphans_lvm2" or
> "vg_XYZ".
>
> Then try "vgimport -v vg_XYZ", yeah again, shrug.
>
> Then "vgimport -v \#orphans_lvm2" escaping the hash.

$ vgs
No volume groups found

$ vgimport -a
No volume groups found

$ vgimport -v vg_XYZ
Using volume group(s) on command line
Finding volume group "vg_XYZ"
Volume group "vg_XYZ" not found

$ vgimport -v \#orphans_lvm2
Using volume group(s) on command line

$ vgs -a
No volume groups found

$ vgs -a -d



>
> If any of those work do "vgrename  \#orphans_lvm2 vg_XYZ" and
> "vgexport vg_XYZ".

Didn't try vgexport.



> If all of that fails try
>
> vgimport WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw
>
> or
>
> vgrename WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw xg_XYZ

$ vgimport WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw
Volume group "WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw" not found

$ vgrename WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw xg_XYZ
No complete volume groups found



> Then all the "-a Y" stuff on the VG and LVs in it.  Then fsck.  Make
> sure to vgexport it before removing it back to the previous host.

Didn't come this far.

unruh

Apr 12, 2012, 3:13:27 PM
On 2012-04-12, Harry <simon...@gmail.com> wrote:
> On Apr 12, 8:59 pm, Doug Freyburger <dfrey...@yahoo.com> wrote:
>> Harry wrote:
>>
>> If you got it wrong your saving grace is so far all you have written to
>> is the partition table.  All of the data in the former partitions is
>> still there.  So start using "fdisk /dev/sda" and start guessing at how
>> many partitions there used to be and what sizes they used to be.  Hint -
>> Start by guessing one partition on the whole disk.  Then one partition
>> on half of it.  Then one partition on 3/4ths of it or whatever.  Do a
>> binary descent.  Eventually you'll know the exact size and location of
>> the first partition.
>>
>> If there's a second partition it will start one cylinder after the
>> first.  Initial guess is the rest of the drive.  Lather rinse repeat
>> until you have it figured out.
>
> Doug, is there any way to avoid having to reboot after each edit of
> the partition table? The messed up disk is sitting as sdb (or, a

The messed up disk should be nowhere around your computer. It should be
unplugged and stored in a closet. You should be working ONLY with a
clone of it, which you claim to have.

Harry

Apr 12, 2012, 3:21:07 PM
On Apr 13, 12:13 am, unruh <un...@invalid.ca> wrote:
> >  http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how...

I have an image of the messed up disk (the former sda), which I
created thus:
dd if=/dev/sda of=/path/to/messedup.imsg

Isn't this sufficient, unruh? If something goes wrong, I can always
copy the image back, can't I?

Richard Kettlewell

Apr 12, 2012, 3:39:53 PM
unruh <un...@invalid.ca> writes:
> Richard Kettlewell <r...@greenend.org.uk> wrote:

>> The partition table is fine. You're chasing a red herring.
> You know that how? which piece of data that he reports do you base
> this on?

The consistency between the physical disk, the partition table and the
lvm backup data.

--
http://www.greenend.org.uk/rjk/

unruh

Apr 12, 2012, 4:12:02 PM
On 2012-04-12, Harry <simon...@gmail.com> wrote:
> On Apr 13, 12:13 am, unruh <un...@invalid.ca> wrote:
>> On 2012-04-12, Harry <simonsha...@gmail.com> wrote:
>>
>> > On Apr 12, 8:59 pm, Doug Freyburger <dfrey...@yahoo.com> wrote:
>> >> Harry wrote:
>>
>> >> If you got it wrong your saving grace is so far all you have written to
>> >> is the partition table.  All of the data in the former partitions is
>> >> still there.  So start using "fdisk /dev/sda" and start guessing at how
>> >> many partitions there used to be and what sizes they used to be.  Hint -
>> >> Start by guessing one partition on the whole disk.  Then one partition
>> >> on half of it.  Then one partition on 3/4ths of it or whatever.  Do a
>> >> binary descent.  Eventually you'll know the exact size and location of
>> >> the first partition.
>>
>> >> If there's a second partition it will start one cylinder after the
>> >> first.  Initial guess is the rest of the drive.  Lather rinse repeat
>> >> until you have it figured out.
>>
>> > Doug, is there any way to avoid having to reboot after each edit of
>> > the partition table? The messed up disk is sitting as sdb (or, a
>>
>> The messed up disk should be nowhere around your computer. It should be
>> unplugged and stored in a closet. You should be working ONLY with a
>> clone of it, which you claim to have.
>>
>> > secondary disk) in my current system, which means I am not booting
>> > from it.  For example, given the fact that right now sdb is
>> > unmountable, can I repeatedly try the mount command (or, any LVM-
>> > equivalent of mount) to check whether or not my edits to the partition
>> > table were correct. I have included a copy of a backup of the LVM
>> > setup here:
>>
>> >  http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how...
>
> I have an image of the messed up disk (the former sda), which I
> created thus:
> dd if=/dev/sda of=/path/to/messedup.imsg
>
> Isn't this sufficient, unruh? If something goes wrong, I can always
> copy the image back, isn't it?

Perhaps, but I would far rather play with a copy than the original.
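
For what it is worth, working on a copy of the image through a
read-only loop device might look roughly like the sketch below. It is a
dry run that only prints the commands. The image path is a hypothetical
working copy, and the start and end sectors are the /dev/sda2 values
from the fdisk -l output quoted earlier, so they are only right if that
partition table is right.

```shell
#!/bin/sh
# Dry run: print the commands for attaching the second partition of a
# disk image as a read-only loop device. Nothing is executed.
# Assumptions: IMAGE is a hypothetical working copy of the dd image, and
# the sector numbers are the /dev/sda2 values from fdisk -l in this thread.
IMAGE=/path/to/work-copy.img
SECTOR_SIZE=512
START_SECTOR=1026048
END_SECTOR=156301311

# Byte offset of the partition inside the image, and its length in bytes.
OFFSET=$((START_SECTOR * SECTOR_SIZE))
SIZE=$(( (END_SECTOR - START_SECTOR + 1) * SECTOR_SIZE ))

echo "losetup --find --show --read-only --offset $OFFSET --sizelimit $SIZE $IMAGE"
echo "pvs    # as root afterwards: does LVM see a physical volume on the loop device?"
```

Attaching read-only keeps the experiment from changing the image, and
the printed losetup line can be run by hand as root once the numbers
look sane.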

The Natural Philosopher

Apr 12, 2012, 4:35:31 PM
Harry wrote:
> Hello,
> I posted my question to superuser.com (http://superuser.com/questions/
> 410796/unable-to-repair-an-ext4-filesystem-with-bad-superblock) but
> haven't got any response yet.
>
> In summary, not realizing what I was doing, I overwrote the first 446
> bytes of MBR via the DD command. Would greatly, *GREATLY* appreciate
> if someone could help me salvage my disk!
>

I bet you would. My cursory response is that you have probably - if it
wasn't backed up - fucked it right royally and completely.


If it mounts at all, and you can get data off it, back it up and start
again.

If it doesn't mount...well. Good luck.

IF you know the format of the superblock you MIGHT try patching that in..

so if you have another identical drive you COULD rip that off the one
for the other.

I'd dd the raw disk off first as is and save that somewhere on something
else as a possible backup.

Then you might try a repartition on it with ..fdisk? to restore the data
structures, but that tends to wipe directories as well? Or does it? Not sure.

But a repartition would at least establish the sort of formats you
need for a superblock, and then rolling the backed up disk, all except
that block, back would maybe fix it.

Moral: don't go down one way streets unless you know exactly where they
lead.




--
To people who know nothing, anything is possible.
To people who know too much, it is a sad fact
that they know how little is really possible -
and how hard it is to achieve it.

The Natural Philosopher

Apr 12, 2012, 4:37:19 PM
unruh wrote:
>
>
> But officer, how was I to know that shooting him in the head would kill
> him. It was only a very tiny hole!
>
:-)

David W. Hodgins

Apr 12, 2012, 5:15:17 PM
On Thu, 12 Apr 2012 15:12:45 -0400, Harry <simon...@gmail.com> wrote:

> $ vgs -a
> No volume groups found

Try vgscan followed by "vgchange -a y", then lvscan.

Regards, Dave Hodgins

--
Change nomail.afraid.org to ody.ca to reply by email.
(nomail.afraid.org has been set up specifically for
use in usenet. Feel free to use it yourself.)

Harry

Apr 12, 2012, 11:30:09 PM
On Apr 13, 2:15 am, "David W. Hodgins" <dwhodg...@nomail.afraid.org>
wrote:
> On Thu, 12 Apr 2012 15:12:45 -0400, Harry <simonsha...@gmail.com> wrote:
> > $ vgs -a
> >   No volume groups found
>
> Try vgscan followed by "vgchange -a y", then lvscan.
>
> Regards, Dave Hodgins
>
> --
> Change nomail.afraid.org to ody.ca to reply by email.
> (nomail.afraid.org has been set up specifically for
> use in usenet. Feel free to use it yourself.)

$ vgscan
Reading all physical volumes. This may take a while...

Harry

Apr 13, 2012, 9:58:19 AM
On Apr 13, 8:30 am, Harry <simonsha...@gmail.com> wrote:

Guys, I am still somewhat hopeful but don't know what to do or who to
ask now.

Is there any other forum (LVM-related) where I could try asking?

This is not my area of expertise at all; I use Linux only as an
applications user.

Please help your fallen comrade out...

Harry

Apr 13, 2012, 10:13:36 AM
Based on the contents of /etc/lvm/backup/vg_XYZ (which I provide here,
http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how-to-mount-and-recover-data-from-it
), is there any way I could programmatically conduct a binary search
for the correct start of the lv_root? For example,

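StartLocation=$1    # some 'good' value, in 512-byte sectors
Increment=$2        # some 'good' increment, in sectors
# The loop stops when the mapping can no longer be created,
# e.g. because the range runs past the end of /dev/sdb2.
while dmsetup create foo --table "0 146997248 linear /dev/sdb2 $StartLocation"
do
    if mount -o ro /dev/mapper/foo /mnt; then
        echo "BINGO! lv_root starts at sector $StartLocation"
        break
    fi
    dmsetup remove foo
    StartLocation=$((StartLocation + Increment))
done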

Based on your recommendations I could try some good values of
StartLocation and Increment, instead of a brute-force search of the
entire 80G pv.

unruh

Apr 13, 2012, 11:22:37 AM
On 2012-04-13, Harry <simon...@gmail.com> wrote:
We have tried. If you want to look for a company which specialises in
recovering data from hard disks, it may be time to do that.

Doug Freyburger

Apr 13, 2012, 12:50:57 PM
Richard Kettlewell wrote:
> unruh <un...@invalid.ca> writes:
>> Richard Kettlewell <r...@greenend.org.uk> wrote:
>
>>> The partition table is fine. You're chasing a red herring.
>
>> You know that how? which piece of data that he reports do you base
>> this on?
>
> The consistency between the physical disk, the partition table and the
> lvm backup data.

The problem is: if only the boot portion of the MBR were written to, then
why won't the volume group on the second partition activate? It remains
the most likely suspect based on the "vgimport -vvv" he posted, but where
are our guarantees?

According to the "fdisk -l" output there is a 250 MB partition in Linux
format marked bootable. Clearly /boot. It does not fsck nor does it
mount as /mnt/boot. If only the boot code of the MBR were written, that
partition would fsck and mount.

According to the "vgimport -vvv" output posted here and the "pvscan"
output posted on the forum there is a 79 GB partition in Linux LVM
format that "should" contain the volume group vg_XYZ. Neither vgimport
nor vgscan works.

Both of these results tell me that the partition table probably was
trashed. We started with a base assumption that the partition table was
valid and worked based on that premise. No progress was made using
tools suggested by the partition table contents. To me that says the
partition table is in fact bad. But how much else was written to?

Doing "fsck /dev/sdb1" did not show a filesystem there. Either more was
overwritten than the partition table or there was no filesystem there on
the original.

I go back to my suggestion of looking for contents. Does fdisk work on
loopback files? I've never tried that. I would rather do dd to a new
device. Put a single partition for the whole drive. Look for a
filesystem with fsck (done) and a volume group with vgscan. If it finds
one, look at the size - a partition that's too big will show data smaller
than the whole drive. Narrow down the size by halving. A failure means
the half was too small, success means on target or too big.

The problem with the method is what if there was a swap partition at the
beginning. Then we don't know where to start other than the beginning.
Make a partition 1 cylinder then the rest in the second one. Try that.
Keep cycling 1 more cylinder at a time. Way too much work unless that
could be automated. Can PartitionMagic do something like that?

J G Miller

Apr 13, 2012, 1:13:34 PM
On Friday, April 13th, 2012, at 06:58:19h -0700, Harry explained:

> Guys, I am still somewhat hopeful ...

Have you tried any of these tools?


<http://www.sleuthkit.ORG/sleuthkit/>

<http://www.sleuthkit.ORG/autopsy/>


<http://www.digital-forensic.ORG/framework/download/>

Robert Nichols

Apr 13, 2012, 8:29:41 PM
On 04/13/2012 09:13 AM, Harry wrote:
> Based on the contents of /etc/lvm/backup/vg_XYZ (which I provide here,
> http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how-to-mount-and-recover-data-from-it
> ), is there any way I could programmatically conduct a binary search
> for the correct start of the lv_root? For example,
>
> StartLocation = some 'good' value
> dmsetup create foo --table "0 146997248 linear /dev/sdb2
> $StartLocation"
> if mount -o ro /dev/mapper/foo /mnt succeeds,
> BINGO!
> else
> StartLocation = StartLocation + some 'good' Increment
> dmsetup remove foo
> fi
>
> Based on your recommendations I could try some good values of
> StartLocation and Increment, instead of a brute-force search of the
> entire 80G pv.

You previously reported "/dev/sdb2: lvm2 label detected at sector 1", and
that confirms that the start of the physical volume is correctly located
in the partition. What does "pvck -vv /dev/sdb2" have to say? It should
report the location of the metadata records for your volume group(s).

--
Bob Nichols AT comcast.net I am "RNichols42"

Harry

Apr 14, 2012, 12:24:08 AM
On Apr 13, 9:50 pm, Doug Freyburger <dfrey...@yahoo.com> wrote:
> Richard Kettlewell wrote:
> > unruh <un...@invalid.ca> writes:
> >> Richard Kettlewell <r...@greenend.org.uk> wrote:
>
> >>> The partition table is fine. You're chasing a red herring.
>
> >> You know that how? which piece of data that he reports do you base
> >> this on?
>
> > The consistency between the physical disk, the partition table and the
> > lvm backup data.
>
> The problem is: if only the boot portion of the MBR were written to, then
> why won't the volume group on the second partition activate? It remains
> the most likely suspect based on the "vgimport -vvv" he posted, but where
> are our guarantees?
>
> According to the "fdisk -l" output there is a 250 MB partition in Linux
> format marked bootable. Clearly /boot. It does not fsck nor does it
> mount as /mnt/boot. If only the boot code of the MBR were written, that
> partition would fsck and mount.

No, actually, I /can/ mount the boot partition sdb1.

$ fdisk -l
<snip>
Disk /dev/sdb: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *        2048     1026047      512000   83  Linux
/dev/sdb2         1026048   156301311    77637632   8e  Linux LVM

$ mount /dev/sdb1 /mnt/x

$ ls /mnt/x
config-3.2.5-3.fc16.i686.PAE
config-3.2.9-1.fc16.i686.PAE
config-3.2.9-2.fc16.i686.PAE
config.mk-compat-wireless-3.3-rc1-2-3.2.5-3.fc16.i686.PAE
config.mk-compat-wireless-3.3-rc1-2-3.2.9-1.fc16.i686.PAE
config.mk-compat-wireless-3.3-rc1-2-3.2.9-2.fc16.i686.PAE
efi
grub
grub2
initramfs-3.2.5-3.fc16.i686.PAE.img
initramfs-3.2.9-1.fc16.i686.PAE.img
initramfs-3.2.9-2.fc16.i686.PAE.img
initrd-plymouth.img
lost+found
System.map-3.2.5-3.fc16.i686.PAE
System.map-3.2.9-1.fc16.i686.PAE
System.map-3.2.9-2.fc16.i686.PAE
vmlinuz-3.2.5-3.fc16.i686.PAE
vmlinuz-3.2.9-1.fc16.i686.PAE
vmlinuz-3.2.9-2.fc16.i686.PAE

$ df -h /mnt/x
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 485M 85M 376M 19% /mnt/x

> According to the "vgimport -vvv" output posted here and the "pvscan"
> output posted on the forum there is a 79 GB partition in Linux LVM
> format that "should" contain the volume group vg_XYZ. Neither vgimport
> nor vgscan works.
>
> Both of these results tell me that the partition table probably was
> trashed. We started with a base assumption that the partition table was
> valid and worked based on that premise. No progress was made using
> tools suggested by the partition table contents. To me that says the
> partition table is in fact bad. But how much else was written to?
>
> Doing "fsck /dev/sdb1" did not show a filesystem there. Either more was
> overwritten than the partition table or there was no filesystem there on
> the original.

Here's what fsck is showing for sdb1.

$ umount /mnt/x

$ fsck -n /dev/sdb1
fsck from util-linux 2.20.1
e2fsck 1.41.14 (22-Dec-2010)
/dev/sdb1: clean, 248/128016 files, 102228/512000 blocks

> I go back to my suggestion of looking for contents. Does fdisk work on
> loopback files? I've never tried that. I would rather do dd to a new
> device. Put a single partition for the whole drive. Look for a
> filesystem with fsck (done) and a volume group with vgscan. If it finds
> one, look at the size - a partition that's too big will show data smaller
> than the whole drive. Narrow down the size by halving. A failure means
> the half was too small, success means on target or too big.

Richard, I'd like to try what you're suggesting... but, I'm afraid,
I'm not following you. Would you elaborate just a little bit more? I
have a cloned image of the bad sdb. Now what do I do with this image
using dd? So far, I am able to fdisk bad.img as follows:

$ losetup /dev/loop1 bad.img

$ # If /dev/loop1 is not specified on the next line,
$ # then fdisk can't see it.
$ fdisk -l /dev/loop1

Disk /dev/loop1: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

      Device Boot      Start         End      Blocks   Id  System
/dev/loop1p1   *        2048     1026047      512000   83  Linux
/dev/loop1p2         1026048   156301311    77637632   8e  Linux LVM

vgscan reports no volumes.

$ vgscan
Reading all physical volumes. This may take a while...
No volume groups found

Now, what to do next?

> The problem with the method is what if there was a swap partition at the
> beginning. Then we don't know where to start other than the beginning.
> Make a partition 1 cylinder then the rest in the second one. Try that.
> Keep cycling 1 more cylinder at a time. Way too much work unless that
> could be automated. Can PartitionMagic do something like that?

From the volume group backup file which I have shared over
superuser.com, it seems there is indeed a swap partition in the
beginning. However, as I said above, I'm not fully understanding the
method you're suggesting. Could you spell out the steps along with
names of the programs to use in those steps and possibly some other
details, as I'm not a partitioning expert at all?

Harry

Apr 14, 2012, 12:27:27 AM
On Apr 14, 5:29 am, Robert Nichols
<SEE_SIGNAT...@localhost.localdomain.invalid> wrote:
> On 04/13/2012 09:13 AM, Harry wrote:
>
> > Based on the contents of /etc/lvm/backup/vg_XYZ (which I provide here,
> >http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how...
> > ), is there any way I could programmatically conduct a binary search
> > for the correct start of the lv_root? For example,
>
> >     StartLocation = some 'good' value
> >     dmsetup create foo --table "0 146997248 linear /dev/sdb2
> > $StartLocation"
> >     if mount -o ro /dev/mapper/foo /mnt succeeds,
> >          BINGO!
> >     else
> >        StartLocation = StartLocation + some 'good' Increment
> >        dmsetup remove foo
> >     fi
>
> > Based on your recommendations I could try some good values of
> > StartLocation and Increment, instead of a brute-force search of the
> > entire 80G pv.
>
> You previously reported "/dev/sdb2: lvm2 label detected at sector 1", and
> that confirms that the start of the physical volume is correctly located
> in the partition.  What does "pvck -vv /dev/sdb2" have to say?  It should
> report the location of the metadata records for your volume group(s).
>
> --
> Bob Nichols         AT comcast.net I am "RNichols42"

$ pvck -vv /dev/sdb2
Setting global/locking_type to 1
Setting global/wait_for_locks to 1
File-based locking selected.
Setting global/locking_dir to /var/lock/lvm
Scanning /dev/sdb2
/dev/sdb2: size is 155275264 sectors
/dev/sdb2: size is 155275264 sectors
/dev/sdb2: lvm2 label detected at sector 1
Found label on /dev/sdb2, sector 1, type=LVM2 001
Found text metadata area: offset=4096, size=1044480
Found LVM2 metadata record at offset=1006080, size=42496, offset2=0 size2=0
Found LVM2 metadata record at offset=991232, size=14848, offset2=0 size2=0
Found LVM2 metadata record at offset=979456, size=11776, offset2=0 size2=0
Found LVM2 metadata record at offset=967168, size=12288, offset2=0 size2=0
Found LVM2 metadata record at offset=966144, size=1024, offset2=0 size2=0

Please note that, in the above output, there is a non-graphic trailing
character (after "LVM2 001") on this line:
...
Found label on /dev/sdb2, sector 1, type=LVM2 001
...
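
For reference, the text metadata area that pvck reports (offset=4096, size=1044480) appears to be given in bytes from the start of the PV, so it can be copied out read-only and searched for older copies of the volume group description. A minimal sketch, assuming those two numbers are right for this disk; the helper name dump_lvm_metadata and the file name metadata.bin are made up for illustration:

```shell
#!/bin/bash
# Sketch only: copy the LVM2 text metadata area out of a PV (or an image
# of it) without writing to the source. Offsets are in bytes, as printed
# by "pvck -vv". dump_lvm_metadata is a hypothetical helper name.
dump_lvm_metadata() {
    local src=$1 offset=$2 size=$3
    # Both values reported above are multiples of 512, so work in sectors.
    if [ $(( offset % 512 )) -ne 0 ] || [ $(( size % 512 )) -ne 0 ]; then
        echo "offset and size must be multiples of 512" >&2
        return 1
    fi
    dd if="$src" bs=512 skip=$(( offset / 512 )) count=$(( size / 512 )) 2>/dev/null
}

# Example (read-only on the source; metadata.bin is an arbitrary name):
#   dump_lvm_metadata /dev/sdb2 4096 1044480 > metadata.bin
#   strings metadata.bin | grep -n 'seqno\|vg_XYZ'
```

Running it against the cloned image rather than the disk itself would be the safer choice.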


Harry

Apr 14, 2012, 12:56:06 AM
On Apr 13, 7:13 pm, Harry <simonsha...@gmail.com> wrote:
> On Apr 13, 6:58 pm, Harry <simonsha...@gmail.com> wrote:
>
> > On Apr 13, 8:30 am, Harry <simonsha...@gmail.com> wrote:
>
> > Guys, I am still somewhat hopeful but don't know what to do or who to
> > ask now.
>
> > Is there any other forum (LVM-related) where I could try asking?
>
> > This is not my area of expertise at all; I use Linux only as an
> > applications user.
>
> > Please help your fallen comrade out...
>
> Based on the contents of /etc/lvm/backup/vg_XYZ (which I provide here,http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how...
> ), is there any way I could programmatically conduct a binary search
> for the correct start of the lv_root? For example,
>
>    StartLocation = some 'good' value
>    dmsetup create foo --table "0 146997248 linear /dev/sdb2
> $StartLocation"
>    if mount -o ro /dev/mapper/foo /mnt succeeds,
>         BINGO!
>    else
>       StartLocation = StartLocation + some 'good' Increment
>       dmsetup remove foo
>    fi
>
> Based on your recommendations I could try some good values of
> StartLocation and Increment, instead of a brute-force search of the
> entire 80G pv.

-----------------------------------------------
NOTE:
I deleted one of my messages (which I posted a few minutes back)
showing the output of the script I wrote. Please ignore that message
and consider this message instead.
-----------------------------------------------

Based on Richard's response (at
http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how...
), I wrote the following script.

------------------------------------------------------
#!/bin/bash
set -u

(( extent_size = 65536 ))
(( pe_start = 2048 ))

# current extent index (not a sector number)
(( curr = 0 ))

(( Increment = 1 ))
(( maxExtentCount = 126 + 2243 ))

# Clean up any previously existing foo.
if [ -b /dev/mapper/foo ]; then
    dmsetup remove foo
fi

while : ; do

    (( StartLocation = pe_start + curr * extent_size ))

    echo -n "Trying curr = $curr ,StartLocation = $StartLocation ..."

    if dmsetup create foo --table "0 146997248 linear /dev/sdb2 $StartLocation" &>/dev/null ; then
        if mount -o ro /dev/mapper/foo /mnt &>/dev/null ; then
            echo
            echo 'BINGO'
            break
        else
            echo " mount failed"
        fi

        dmsetup remove foo
    else
        echo " dmsetup failed"
    fi

    (( curr = curr + Increment ))

    if [ $curr -gt $maxExtentCount ]; then
        echo "reached maxExtentCount, quitting"
        break
    fi
done

# Clean up
if [ -b /dev/mapper/foo ]; then
    dmsetup remove foo
fi
------------------------------------------------------

No success, though. The above loop ran as follows. Notice how
   for curr = 0 thru 126, dmsetup succeeds but mount fails
   for curr = 127 thru 2369, dmsetup itself fails.

Does that tell you anything else interesting that is not obvious to
me?

Trying curr = 0 ,StartLocation = 2048 ...mount: you must specify the filesystem type
 mount failed
Trying curr = 1 ,StartLocation = 67584 ...mount: you must specify the filesystem type
 mount failed
Trying curr = 2 ,StartLocation = 133120 ...mount: you must specify the filesystem type
 mount failed
Trying curr = 3 ,StartLocation = 198656 ...mount: you must specify the filesystem type
 mount failed
Trying curr = 4 ,StartLocation = 264192 ...mount: you must specify the filesystem type
 mount failed
Trying curr = 5 ,StartLocation = 329728 ...mount: you must specify the filesystem type
 mount failed

...

Trying curr = 125 ,StartLocation = 8194048 ...mount: you must specify the filesystem type
 mount failed
Trying curr = 126 ,StartLocation = 8259584 ...mount: you must specify the filesystem type
 mount failed
Trying curr = 127 ,StartLocation = 8325120 ...device-mapper: resume ioctl failed: Invalid argument
Command failed
 dmsetup failed
Trying curr = 128 ,StartLocation = 8390656 ...device-mapper: resume ioctl failed: Invalid argument
Command failed
 dmsetup failed
Trying curr = 129 ,StartLocation = 8456192 ...device-mapper: resume ioctl failed: Invalid argument
Command failed
 dmsetup failed
Trying curr = 130 ,StartLocation = 8521728 ...device-mapper: resume ioctl failed: Invalid argument
Command failed
 dmsetup failed

...

Trying curr = 2367 ,StartLocation = 155125760 ...device-mapper: resume ioctl failed: Invalid argument
Command failed
 dmsetup failed
Trying curr = 2368 ,StartLocation = 155191296 ...device-mapper: resume ioctl failed: Invalid argument
Command failed
 dmsetup failed
Trying curr = 2369 ,StartLocation = 155256832 ...device-mapper: resume ioctl failed: Invalid argument
Command failed
 dmsetup failed
reached maxExtentCount, quitting


I am not sure if I'm incrementing 'curr' in proper units (sectors/
blocks/extent_size/bytes). Where is this volume backup file
documented, btw? (Googled for it, couldn't find it.)
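
On the units question: as far as can be told from the numbers in this thread, the dmsetup linear table counts in 512-byte sectors, and extent_size and pe_start in the backup file are in sectors too, so the multiplication in the script looks consistent. The dmsetup failures from curr = 127 onward may simply be arithmetic rather than damage: a 146997248-sector mapping no longer fits inside the 155275264-sector partition reported by pvck once the start passes a certain sector. A small sketch of that calculation, using only numbers quoted in this thread (the function names are made up for illustration):

```shell
#!/bin/bash
# Sketch: which candidate starts can hold a mapping of a given length?
# All quantities are in 512-byte sectors.

# Largest start sector at which "length" sectors still fit on the device.
max_start_sector() {
    local dev_sectors=$1 length=$2
    echo $(( dev_sectors - length ))
}

# Largest extent index whose start sector (pe_start + n * extent_size)
# does not exceed that limit.
last_extent_that_fits() {
    local dev_sectors=$1 length=$2 pe_start=$3 extent_size=$4
    echo $(( (dev_sectors - length - pe_start) / extent_size ))
}

# Numbers from the pvck output and the script above:
max_start_sector 155275264 146997248                   # prints 8278016
last_extent_that_fits 155275264 146997248 2048 65536   # prints 126
```

So 126 is the last index at which a 2243-extent volume can fit at all, which matches the point where "mount failed" turns into "dmsetup failed".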

I would like to understand LVM and even regular partitioning more
thoroughly. Would someone kindly suggest a small set of resources
(books, links to online articles/tutorials, etc.) that would teach me
not just the commands but the concepts behind them? I would like to
understand anything and everything above the BIOS and assembly-language
level. For example, I don't need to understand the boot code in the
first 446 bytes of the MBR, but I do want to learn everything else you
guys know about partitioning and LVM.

Harry

Apr 14, 2012, 12:58:11 AM
No, I haven't. But I did try TestDisk in non-modifying mode and it
didn't help.

Robert Nichols

Apr 14, 2012, 8:58:31 AM
That is all perfectly normal. It's hard to understand why it's not working.
Try setting aside the /etc/lvm directory (rename it to /etc/xxlvm) to force
the system to work only with the physical devices and not rely on previously
cached data. Then see what 'pvscan', 'vgscan', and 'lvscan' can find. The
vgscan should repopulate the /etc/lvm directory with good data.

Harry

Apr 14, 2012, 9:35:30 AM
On Apr 14, 5:58 pm, Robert Nichols
$ mv /etc/lvm /etc/lvm.off

$ pvscan
PV /dev/sdb2 lvm2 [74.04 GiB]
Total: 1 [74.04 GiB] / in use: 0 [0 ] / in no VG: 1 [74.04 GiB]

$ vgscan
Reading all physical volumes. This may take a while...
No volume groups found

$ lvscan
No volume groups found

$ ls -l /etc/lvm
total 0

Robert Nichols

Apr 14, 2012, 8:59:40 PM
On 04/14/2012 08:35 AM, Harry wrote:
>
> $ mv /etc/lvm /etc/lvm.off
>
> $ pvscan
> PV /dev/sdb2 lvm2 [74.04 GiB]
> Total: 1 [74.04 GiB] / in use: 0 [0   ] / in no VG: 1 [74.04 GiB]
                         ^^^^^^^^^^^^^^^^
I'm having a hard time imagining what could have happened to make it
appear that there was nothing in use inside that PV, especially since
'pvck' did find metadata records in there. Had this system been
running for a long time without rebooting prior to your escapade?
That would raise the possibility that the structures on disk had
been damaged for quite a while, and the system couldn't have survived
a reboot.

Now, I'm wondering what your chances would be of reconstructing the
VGs and LVs from the data in the backup files. If those volumes had
ever been resized or rearranged since they were created, I fear your
chances of success would be just about nil, but I don't know what
else to suggest. I think I'd want to play around with that on newly
created structures on another disk before trying it for real.

Harry

Apr 14, 2012, 11:13:09 PM
This is what, I believe, I did.

The 80G sdb that I'm trying to recover now used to be sda in my system earlier. I then installed a 250G sdb in the system, installed Fedora 16 on it, made sdb bootable, and then booted my system.

Now, my system was booting just fine from sdb. In my ignorance (of the low, system-level details of MBR, partitioning, BIOS, etc), I thought that if I copied the first 446 bytes of MBR of sdb (which is working fine right now for me) to sda, then I might be able to boot from sda also if I were to place sda into another system. Since these 446 bytes would be pure boot code which would be common to both sda and sdb (esp, since both had Fedora 16 installs on them) and with no data in it, I thought, I had nothing to lose in trying this operation out: at worst, sda would continue to remain non-bootable, at which point I'd look for some other solution.

So, I fearlessly issued a

dd if=/dev/sdb of=/dev/sda bs=446 count=1

and, I believe, sda continued to remain non-bootable.

Since, mentally, I tend to feel more comfortable with sda being the primary/bootable disk in my system, at this point I physically switched sda and sdb cables in my system.

I still didn't think anything was seriously wrong; I thought, I should be able to fix things now. Now, because I didn't understand LVM *at all* (I only have a very high-level idea, even now!), I incorrectly thought that the sdb2 partition (shown by fdisk -l) was really an ext4 filesystem (and 'not' an LVM volume!), and 'all' I had to do in order to be able to mount it was to use the standard methods suggested on the Net. (sdb was being reported as having a bad superblock and ext4 supposedly has redundant backups of this stored in it which one could use to repair it.) When I couldn't mount/repair, I posted the question here,
http://superuser.com/questions/410796/unable-to-repair-an-ext4-filesystem-with-bad-superblock .

---------------
Note: When this ext4 repair attempt didn't succeed on sdb2 (the partition), I even (incorrectly) tried the commands on sdb (the device) using various superblock offsets. The e2fsck man page has this statement in it:

"The location of the backup superblock is dependent on the filesystem's blocksize.
For filesystems with 1k blocksizes, a backup superblock can be found at block 8193; for
filesystems with 2k blocksizes, at block 16384; and for 4k blocksizes, at block 32768."

So I even tried this command assuming different blocksizes, since I thought I wouldn't be able to accurately and easily find out the actual blocksize of the filesystem, given that it is in a 'broken' state at this point.
---------------

At this point, someone on this forum enlightened me that, because it was an LVM and not an ext4, ext4 repair tools were not working.

I *hope* these ext4 repair commands didn't mess up the LVM; each superblock offset I would specify would fail to get recognized as a valid superblock by e2fsck and so the repair operation would fail. Had the e2fsck command really succeeded by incorrectly identifying some random junk as a valid superblock, then I could understand (*now*) how it may have corrupted the LVM that I'm now struggling to bring back to life.

Only at this point and not any earlier, did I begin to anticipate serious trouble ahead and decided to clone /dev/sdb. Therefore, strictly speaking, the cloned sdb image I have now, does NOT reflect the state of things immediately after the 'MBR 446-byte overwrite' operation. I'm still hoping that none of my repair attempts (via e2fsck on both the device-sdb and the partition-sdb2!) corrupted the LVM on sdb2.

Robert Nichols

Apr 15, 2012, 4:14:31 PM
On 04/14/2012 10:13 PM, Harry wrote:
>
> This is what, I believe, I did.
[recap snipped]
I don't see any way that anything you reported doing could cause the problem
you are seeing, and frankly I can't come up with any likely typos in those
commands that would do it either. If you had mistyped the destination
device as "sda1" or "sda2" instead of "sda", nothing would have been
affected (neither ext2/3/4 nor LVM2 store anything in the first 512 bytes).
If you had mistyped the blocksize as "446k" or the count as "1k", you would
have overwritten the partition table and possibly part of the ext3 file
system on that first partition. You've still got what appears to be a good
partition table, a good file system on the first partition, and an LVM2
header at the appropriate location in the second partition. e2fsck won't do
any writing if it doesn't find something that looks like a valid superblock.
Something else had to have happened, but I can't imagine what.

What is the history of that LVM2 PV? Is it likely that the LVs were
allocated in single extents? As a last resort you could do a brute force
search for a 16-bit integer 0xEF53 (little-endian, the byte order is 53 EF)
at offset 56 (0x48) in a sector, making that a possible superblock, then
use dmsetup to define a virtual device that begins 1024 bytes prior to that
sector and see what e2fsck has to say about that (use the "-n" option so
that it will open the device read-only). Sounds like it might take a long
time, but probably no more than you've already spent.

Harry

Apr 16, 2012, 5:39:27 AM
On Monday, April 16, 2012 1:44:31 AM UTC+5:30, Robert Nichols wrote:
> On 04/14/2012 10:13 PM, Harry wrote:
> >
> > This is what, I believe, I did.
> [recap snipped]
> I don't see any way that anything you reported doing could cause the problem
> you are seeing, and frankly I can't come up with any likely typos in those
> commands that would do it either. If you had mistyped the destination
> device as "sda1" or "sda2" instead of "sda", nothing would have been
> affected (neither ext2/3/4 nor LVM2 store anything in the first 512 bytes).
> If you had mistyped the blocksize as "446k" or the count as "1k", you would
> have overwritten the partition table and possibly part of the ext3 file
> system on that first partition. You've still got what appears to be a good
> partition table, a good file system on the first partition, and an LVM2
> header at the appropriate location in the second partition. e2fsck won't do
> any writing if it doesn't find something that looks like a valid superblock.
> Something else had to have happened, but I can't imagine what.

I'm happy to hear that.

> What is the history of that LVM2 PV? Is it likely that the LVs were
> allocated in single extents?

Don't know. I recall just going with the Fedora 16 installation-time defaults, and only changing the sizes of the swap and root partitions.

> As a last resort you could do a brute force
> search for a 16-bit integer 0xEF53 (little-endian, the byte order is 53 EF)
> at offset 56 (0x48) in a sector, making that a possible superblock, then
> use dmsetup to define a virtual device that begins 1024 bytes prior to that
> sector and see what e2fsck has to say about that (use the "-n" option so
> that it will open the device read-only). Sounds like it might take a long
> time, but probably no more than you've already spent.


I think, you meant "offset 56 (0x38)".

I have launched the script listed below. It doesn't do the virtual device mapping *yet*; for now, it just looks for the signature. I will add the device mapper later.

After 43 minutes of running, I printed the progress (kill -10) and got this:
Trying sector 948135 of 146997248 .645001 %
ETA: 230 hours ( 9.58 days)

Any way to speed it up... say, by skipping regions of the device, considering interesting regions first?


----------------------------------------------------------------
#!/bin/bash
set -u

function printProgress {
    local timeCurr=
    local timeTaken=
    local eta=

    echo "Trying sector $currSector of $lastSector " $(echo "scale=6; $currSector * 100.0 / $lastSector " | bc) " %"

    timeCurr=$(date +%s) # seconds
    (( timeTaken = timeCurr - timeStart ))
    (( eta = timeTaken * (lastSector - currSector) / (currSector - startSector) / 3600 ))
    echo " ETA: $eta hours" "(" $(echo "scale=2; $eta / 24" | bc) " days)"
}

timeStart=$(date +%s)

trap "printProgress" 10

extent_count=2243
extent_size=65536
lastSector=$((extent_count * extent_size))

# Start by skipping these many sectors.
#
# Use either the first arg to this script or
# a default value of 2.
skipSectors=${1-2}
((startSector = skipSectors + 1 ))
echo "Started, by skipping $skipSectors sectors."

# ---------------------
# Main loop
# ---------------------
while :; do
    ((currSector = skipSectors + 1 ))

    # Get the signature 'sig'. I have tested that this incantation
    # does indeed give the signature 53ef for a device having a
    # valid ext4 partition in the beginning.
    sig=$(dd if=/dev/sdb2 bs=512 skip=$skipSectors |
          xxd -c 10 | head -109 | tail -1 | cut -d ' ' -f2)

    if [ "$sig" = "53ef" ]; then
        echo "Found an ext4 fs at sector $currSector , quitting."
        break
    fi

    # Did not find ext4 sig at the above location, try the next sector
    ((skipSectors += 1))

    if [ $skipSectors -eq $lastSector ]; then
        # No more sectors to skip.
        echo "FAILED to find an ext4 fs."
        break
    fi
done
----------------------------------------------------------------
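
One possible way to speed this up without leaving the shell: the loop above starts a fresh dd/xxd/head/tail/cut pipeline for every sector, which is probably where most of the time goes. A single streaming pass that prints one sector per line and checks bytes 56 and 57 of each line should flag the same candidates. This is a sketch that has not been run on the real disk; it assumes GNU od (for the -w option), the name find_ext_magic is made up, and a hit is only a candidate, since the bytes 53 ef also turn up by chance in ordinary data:

```shell
#!/bin/bash
# Sketch: list sectors whose bytes 56-57 are 53 ef (the ext2/3/4 magic,
# little-endian 0xEF53). The superblock starts 1024 bytes into the file
# system, so the file system itself would begin 2 sectors earlier.
find_ext_magic() {
    local src=$1
    # od prints each byte as " xx" (3 characters), 512 bytes per line,
    # so byte 56 sits at columns 170-171 and byte 57 at columns 173-174.
    # -v stops od from collapsing repeated lines, which keeps NR in step
    # with the sector number. NR > 2 skips sectors 0 and 1, which are too
    # early to hold the superblock of a file system on this device.
    od -An -v -tx1 -w512 "$src" |
        awk 'NR > 2 && substr($0, 170, 2) == "53" && substr($0, 173, 2) == "ef" {
                 printf "magic in sector %d, fs would start at sector %d\n", NR - 1, NR - 3
             }'
}

# Example (read-only): find_ext_magic /dev/sdb2
```

Each reported start sector could then be fed to the dmsetup step and checked with "e2fsck -n".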

Robert Nichols

Apr 16, 2012, 9:50:30 AM
On 04/16/2012 04:39 AM, Harry wrote:
> On Monday, April 16, 2012 1:44:31 AM UTC+5:30, Robert Nichols wrote:
[SNIP]
>> As a last resort you could do a brute force
>> search for a 16-bit integer 0xEF53 (little-endian, the byte order is 53 EF)
>> at offset 56 (0x48) in a sector, making that a possible superblock, then
>> use dmsetup to define a virtual device that begins 1024 bytes prior to that
>> sector and see what e2fsck has to say about that (use the "-n" option so
>> that it will open the device read-only). Sounds like it might take a long
>> time, but probably no more than you've already spent.
>
>
> I think, you meant "offset 56 (0x38)".

Indeed! Sorry about that.

> I have launched the script listed below. It doesn't do the virtual device mapping *yet*; for now, it just looks for the signature. I will add the device mapper later.
>
> After 43 minutes of running, I printed the progress (kill -10) and got this:
> Trying sector 948135 of 146997248 .645001 %
> ETA: 230 hours ( 9.58 days)

Yipe! I've attached a C program that looks for a sector with the EF53
magic number plus a valid blocksize and a label string that is either
empty or contains valid ASCII characters. It searched a 160GB partition
in about 45 minutes.

Note that you will need to subtract 1024 bytes from this offset when
setting up the virtual device.
e2finder.uue
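
To spell out the arithmetic for turning a reported offset into a device-mapper table, here is a sketch. It assumes the offset is a byte offset of the superblock measured from the start of /dev/sdb2, and it reuses the 146997248-sector length from the earlier dmsetup attempts; the name sb_offset_to_table and the sample offset 1049600 are invented for illustration:

```shell
#!/bin/bash
# Sketch: build a dmsetup "linear" table line from the byte offset of a
# superblock. dmsetup counts in 512-byte sectors.
sb_offset_to_table() {
    local dev=$1 sb_offset=$2 length=$3
    local fs_start=$(( sb_offset - 1024 ))   # superblock sits 1024 bytes in
    if [ "$fs_start" -lt 0 ] || [ $(( fs_start % 512 )) -ne 0 ]; then
        echo "offset $sb_offset does not give a sector-aligned start" >&2
        return 1
    fi
    echo "0 $length linear $dev $(( fs_start / 512 ))"
}

# Example with an invented offset (sector 2048 plus 1024 bytes):
#   table=$(sb_offset_to_table /dev/sdb2 1049600 146997248)
#   dmsetup create foo --readonly --table "$table"
#   e2fsck -n /dev/mapper/foo
```

Creating the mapping read-only and running e2fsck with -n keeps the check from writing anything to the disk.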

Harry

Apr 16, 2012, 11:02:21 AM