No success repairing my ext4 file system so far, PLEASE HELP!

Harry

Apr 12, 2012, 6:29:52 AM
Hello,
I posted my question to superuser.com
(http://superuser.com/questions/410796/unable-to-repair-an-ext4-filesystem-with-bad-superblock)
but haven't got any response yet.

In summary, not realizing what I was doing, I overwrote the first 446
bytes of the MBR with the `dd` command. I would greatly, *GREATLY*
appreciate it if someone could help me salvage my disk!


=================
Details of what I did:
=================

Using the `dd` command, I was hoping that I would be able to copy over
the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
make Disk A bootable just like Disk B. I issued the command:

dd if=/dev/sdb of=/dev/sda bs=446 count=1

But when I could not boot up from `sda`, I rebooted from `sdb` to see
what was going on. To my horror, `sda` was being reported to have a
bad superblock, now.

Worse, I was **unable** to repair it via the backup superblocks stored
on the ext4 filesystem. This is what I did. I first got the backup
superblock addresses, like so:

[root@localhost liveuser]# mke2fs -n /dev/sda
mke2fs 1.41.14 (22-Dec-2010)
/dev/sda is entire device, not just one partition!
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
4890624 inodes, 19537686 blocks
976884 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
597 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424

Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
`SUPERBLOCK` values listed above, like so:

[root@localhost liveuser]# e2fsck -b 32768 /dev/sda
e2fsck 1.41.14 (22-Dec-2010)
e2fsck: Bad magic number in super-block while trying to open /dev/sda

The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

I tried every single value, but each gave the above message!
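
For reference, the block numbers that `mke2fs -n` lists are not arbitrary. Assuming the usual `sparse_super` layout, backup superblocks sit in block group 1 and in groups whose numbers are powers of 3, 5 and 7. The sketch below recomputes the list from the block-group geometry; `backup_superblocks` is a hypothetical helper written for this illustration, not part of e2fsprogs.

```shell
#!/bin/sh
# Sketch: list backup superblock locations for a sparse_super ext2/3/4 layout.
# Backups live in group 1 and in groups that are powers of 3, 5 and 7.
# backup_superblocks is a hypothetical helper, not an e2fsprogs command.
backup_superblocks() {
    blocks_per_group=$1   # e.g. 32768 with 4 KiB blocks
    first_data_block=$2   # 0 with 4 KiB blocks, 1 with 1 KiB blocks
    group_count=$3        # "block groups" as reported by mke2fs -n

    grp_list=""
    if [ "$group_count" -gt 1 ]; then
        grp_list="1"
    fi
    for base in 3 5 7; do
        g=$base
        while [ "$g" -lt "$group_count" ]; do
            grp_list="$grp_list $g"
            g=$(( g * base ))
        done
    done

    # Sort the group numbers, then convert each one to a block number.
    for g in $(printf '%s\n' $grp_list | sort -n); do
        printf '%s\n' $(( first_data_block + g * blocks_per_group ))
    done
}

# Geometry reported above: 32768 blocks per group, first data block 0, 597 groups.
backup_superblocks 32768 0 597
```

With that geometry the output starts at 32768 and ends at 11239424, the same twelve values `mke2fs -n` printed. The numbers are only meaningful when the device handed to `e2fsck -b` really is the one that holds the filesystem.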

**Is there anything that I could do NOW to salvage my precious disk?**
This is an 80G disk with 2 partitions. The `/dev/sda1` partition is
clean and is mountable; it is the `/dev/sda2` partition that is
failing to work with commands like `mount`, `debugfs`, `dumpe2fs`,
etc.

Running `mke2fs -n` for the individual partitions gave me this (notice
how, for `/dev/sda2`, the **First data block** and **Maximum filesystem
blocks** both show **0** as their value):

[root@localhost liveuser]# mke2fs -n /dev/sda1
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=0 blocks, Stripe width=0 blocks
128016 inodes, 512000 blocks
25600 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=67633152
63 block groups
8192 blocks per group, 8192 fragments per group
2032 inodes per group
Superblock backups stored on blocks:
8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409

[root@localhost liveuser]# mke2fs -n /dev/sda2
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
4857856 inodes, 19409408 blocks
970470 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
593 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424

I still don't know what was wrong with my `dd` command that corrupted my
ext4 superblock. You cannot imagine how happy I will be if someone
can help me recover my disk, since, except for this bad
superblock, all the data is just sitting right there!

PLEASE HELP, my soul is crying as I write this :-((

PS: If I can get a response faster/better from another forum on the
Net, please do point me to it.

Richard Kettlewell

Apr 12, 2012, 6:46:15 AM
Harry <simon...@gmail.com> writes:
> Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
> `SUPERBLOCK` values listed above, like so:
>
> [root@localhost liveuser]# e2fsck -b 32768 /dev/sda

Did you mean sda or sda1 here? (And similarly elsewhere.)

> I still don't know what was wrong in my `dd` command that corrupted my
> ext4 superblock.

Next time make a backup first.

--
http://www.greenend.org.uk/rjk/

Harry

Apr 12, 2012, 7:08:26 AM
On Apr 12, 3:46 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
> Harry <simonsha...@gmail.com> writes:
> > Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
> > `SUPERBLOCK` values listed above, like so:
>
> >         [root@localhost liveuser]# e2fsck -b 32768 /dev/sda
>
> Did you mean sda or sda1 here?  (And similarly elsewhere.)

I meant sda2 throughout; the /dev/sda1 partition is just fine.

Fyi, the `fdisk -l` prints this:
Disk /dev/sda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     1026047      512000   83  Linux
/dev/sda2         1026048   156301311    77637632   8e  Linux LVM

Can you/somebody please help me? I have not modified the state of the
disk in any way other than overwriting the first 446 bytes of the MBR
with the `dd` command, so all the data should literally be sitting there.

> Next time make a backup first.
Yes, I have learnt the lesson a VERY PAINFUL way. I thought I was
smart enough for the 'simple-enough' operation I was doing.

Richard Kettlewell

Apr 12, 2012, 7:36:03 AM
Harry <simon...@gmail.com> writes:
> Richard Kettlewell <r...@greenend.org.uk> wrote:
>> Harry <simonsha...@gmail.com> writes:

>>> Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
>>> `SUPERBLOCK` values listed above, like so:
>>>
>>>         [root@localhost liveuser]# e2fsck -b 32768 /dev/sda
>>
>> Did you mean sda or sda1 here?  (And similarly elsewhere.)
>
> I meant sda2 throughout; the /dev/sda1 partition is just fine.

Passing the right device name to e2fsck might help, then.
(How do you expect anyone to help you if you don't describe the
situation accurately?)

> Fyi, the `fdisk -l` prints this:
> Disk /dev/sda: 80.0 GB, 80026361856 bytes
> 255 heads, 63 sectors/track, 9729 cylinders, total 156301488
> sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00000000
>
> Device Boot Start End Blocks Id System
> /dev/sda1 * 2048 1026047 512000 83 Linux
> /dev/sda2 1026048 156301311 77637632 8e Linux LVM

If those IDs are accurate then there's no filesystem in sda2, but rather
an LVM PV, which would explain why attempts to use ext4 tools on it
don't work.
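
One read-only way to check this, offered as a sketch rather than a definitive test: LVM2 marks a physical volume with the ASCII signature `LABELONE` at the start of one of the first four 512-byte sectors (normally sector 1). The helper name `has_lvm_label` is made up for this illustration.

```shell
#!/bin/sh
# Sketch: report whether a device or image carries an LVM2 physical-volume label.
# has_lvm_label is a hypothetical helper; it only reads, never writes.
has_lvm_label() {
    target=$1
    for sector in 0 1 2 3; do
        # Dump the first 8 bytes of this sector as a plain hex string.
        sig=$(od -An -tx1 -j $(( sector * 512 )) -N 8 "$target" 2>/dev/null | tr -d ' \n')
        # 4c4142454c4f4e45 is "LABELONE" in hex.
        if [ "$sig" = "4c4142454c4f4e45" ]; then
            return 0
        fi
    done
    return 1
}

# Example (read-only, needs root for a real device):
#   has_lvm_label /dev/sda2 && echo "LVM2 PV label found"
```

If the label is present, ext4 tools pointed at the partition itself will not find a superblock, because any filesystem lives inside a logical volume rather than directly on the partition.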

> Can you/somebody please help me? I have not modified the state of the
> disk in any other way other than first 446 bytes of MBR via the `dd`
> command, so all the data should literally be sitting there.

If that's what you actually did then it won't have touched the contents
of any partitions, and certainly not sda2 which is 501MB into the disk.
However, you seem to have a habit of getting device names wrong, which
might explain what really happened.
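
To illustrate the point on scratch files only (no real disk involved): in a classic MBR, bytes 0-445 hold boot code, bytes 446-509 hold the partition table, and bytes 510-511 hold the signature. The sketch below runs a `dd` of the same shape as the one quoted in this thread against two 512-byte dummy sectors and checks which bytes changed. The file names are placeholders.

```shell
#!/bin/sh
# Sketch on scratch image files only; no real disk is touched.
# Copies the first 446 bytes of one 512-byte "MBR" over another and checks
# that bytes 446-511 (partition table and signature) stay unchanged.
set -e
work=$(mktemp -d)

# Two dummy sectors filled with different bytes so changes are visible.
head -c 512 /dev/zero | tr '\0' 'A' > "$work/src.img"
head -c 512 /dev/zero | tr '\0' 'B' > "$work/dst.img"

# Same shape as the command in the thread. conv=notrunc is needed here only
# because the output is a regular file, which dd would otherwise truncate.
dd if="$work/src.img" of="$work/dst.img" bs=446 count=1 conv=notrunc 2>/dev/null

# Count bytes that differ from what each region should contain.
size=$(wc -c < "$work/dst.img")
boot_code=$(head -c 446 "$work/dst.img" | tr -d 'A' | wc -c)
table=$(tail -c 66 "$work/dst.img" | tr -d 'B' | wc -c)
echo "size of destination sector: $size"
echo "non-A bytes in boot code area: $boot_code"
echo "non-B bytes in partition table area: $table"

rm -rf "$work"
```

Both counts come out as zero: the boot code area is replaced and the last 66 bytes are left alone. As for the distance quoted above, 1026048 sectors of 512 bytes is 525,336,576 bytes, i.e. 501 MiB from the start of the disk.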

--
http://www.greenend.org.uk/rjk/

Harry

Apr 12, 2012, 8:25:16 AM
Richard, let me first come clean with the whole truth.

Since my original post to superuser.com, I had switched the location
of the problem disk from it being 'sda' originally to being 'sdb' now.
This is because the original 'sdb' had a freshly installed, bootable
OS (Fedora 16) and it was bigger (250 GB) - so I thought I'd make it
my 'primary' disk (or, 'sda'), and keep the older 80 G disk as the
'secondary' (or, 'sdb') disk.

However, just before pasting the `fdisk -l` output for you (in my
previous post to you), I simply searched and replaced 'sdb' with 'sda'
in the `fdisk -l` output to make my message on this forum look
consistent with what I posted yesterday on superuser.com.

Here is the unedited output of `fdisk -l` on the system that has this
problem disk currently:

Disk /dev/sdb: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *        2048     1026047      512000   83  Linux
/dev/sdb2         1026048   156301311    77637632   8e  Linux LVM

> > Can you/somebody please help me? I have not modified the state of the
> > disk in any other way other than first 446 bytes of MBR via the `dd`
> > command, so all the data should literally be sitting there.
>
> If that's what you actually did then it won't have touched the contents
> of any partitions, and certainly not sda2 which is 501MB into the disk.
> However, you seem to have a habit of getting device names wrong, which
> might explain what really happened.
>

I have only a very, /VERY/ basic idea of partitions; beyond a one-line
description of the benefits of LVM, I don't really know how to handle
LVM volumes, especially when something (like this) goes wrong. Would
you be able to (*please*) guide me through the steps that I need to
carry out *now* to fix this situation? I can't even begin to tell you
how much I would appreciate your (or any other reader's) help at this
point. You can reprimand me as much as you like on the way; I most
certainly deserve it.

Aside: The basic problem here has been this: I've been using Linux for
a while, but only as a black-box user, (1) to avoid using Windows, and
(2) because I love it, as it has tons of great and free software. I tend
to go with the defaults during installs, and I am learning the systems
side slowly in the background. Though I have not mastered these
details fully, this time I somehow thought I had learned
'enough' (about the `dd` command and the MBR) when this disaster
happened. I was also very lazy in not taking a backup, as it was a
mere 446-byte chunk.

Richard Kettlewell

Apr 12, 2012, 8:49:07 AM
Harry <simon...@gmail.com> writes:
> I have a very, /VERY/ basic idea of partitions; other than a 1-line
> description of benefits of the LVM feature / concept, I don't really
> know how to handle them, especially when something (like this) goes
> wrong. Would you be able to (*please*) guide me with the necessary
> steps that i need to carry out *now* to fix this situation? Can't even
> begin to tell you, how much I would appreciate yours (or any other
> reader's) help at this point. You can reprimand me as much as you like
> on the way, I most certainly deserve it.

At a guess what's going on is that you're attempting to mount /dev/sdb2
when in fact your filesystem is not on sdb2 at all but in a logical
volume somewhere within it.

If that's the case then you can use 'lvdisplay' to get a list of logical
volumes, which should include the device name(s) you need.
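
As a rough sketch of how that usually fits together, assuming the volume group metadata is intact: scan for physical volumes and volume groups, activate the group, list its logical volumes, then mount the one that holds the filesystem read-only. The names `my_vg`, `my_lv` and `/mnt/recovery` are placeholders, and the `step` wrapper is a hypothetical helper that only prints each command unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch: read-only route from an LVM physical volume to a mounted filesystem.
# VG and LV are placeholders; substitute whatever vgscan and lvdisplay report.
VG=${VG:-my_vg}
LV=${LV:-my_lv}

# Print the command by default; execute it only when RUN=1 is set.
step() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

lvm_mount_plan() {
    step pvscan                                     # look for physical volumes
    step vgscan                                     # look for volume groups on them
    step vgchange -ay "$VG"                         # activate the group's logical volumes
    step lvdisplay "$VG"                            # list logical volumes and their paths
    step mkdir -p /mnt/recovery
    step mount -o ro "/dev/$VG/$LV" /mnt/recovery   # mount read-only
}

lvm_mount_plan
```

Mounting read-only keeps the filesystem untouched while checking whether the data is reachable.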

--
http://www.greenend.org.uk/rjk/

Harry

Apr 12, 2012, 9:09:31 AM
On Apr 12, 5:49 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
> Harry <simonsha...@gmail.com> writes:
> > I have a very,  /VERY/ basic idea of partitions; other than a 1-line
> > description of benefits of the LVM feature / concept, I don't really
> > know how to handle them, especially when something (like this) goes
> > wrong. Would you be able to (*please*) guide me with the necessary
> > steps that i need to carry out *now* to fix this situation? Can't even
> > begin to tell you, how much I would appreciate yours (or any other
> > reader's) help at this point. You can reprimand me as much as you like
> > on the way, I most certainly deserve it.
>
> At a guess what's going on is that you're attempting to mount /dev/sdb2
> when in fact your filesystem is not on sdb2 at all but in a logical
> volume somewhere within it.

Oh!

> If that's the case then you can use 'lvdisplay' to get a list of logical
> volumes, which should include the device name(s) you need.
>

I issued `lvdisplay -vvv` as root and this is what I got:
(Note: /dev/sdb is the device whose first 446 bytes got messed up).

Processing: lvdisplay -vvv
O_DIRECT will be used
Setting global/locking_type to 1
Setting global/wait_for_locks to 1
File-based locking selected.
Setting global/locking_dir to /var/lock/lvm
Preparing SELinux context for /var/lock/lvm to system_u:object_r:lvm_lock_t:s0.
Resetting SELinux context to default value.
Finding all logical volumes
/dev/sr0: Added to device cache
/dev/scd0: Aliased to /dev/sr0 in device cache (preferred name)
/dev/disk/by-id/ata-HL-DT-ST_DVDRAM_GH22NS50_K00ABGG0301: Aliased to /dev/scd0 in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0: Aliased to /dev/scd0 in device cache
/dev/disk/by-label/Fedora\x2016\x20i386\x20DVD: Aliased to /dev/scd0 in device cache
/dev/disk/by-id/wwn-0x5001480000000000: Aliased to /dev/scd0 in device cache
/dev/cdrom: Aliased to /dev/scd0 in device cache (preferred name)
/dev/cdrw: Aliased to /dev/cdrom in device cache
/dev/dvd: Aliased to /dev/cdrom in device cache
/dev/dvdrw: Aliased to /dev/cdrom in device cache
/dev/sda: Added to device cache
/dev/disk/by-id/ata-ST250DM000-1BD141_9VYF4LT4: Aliased to /dev/sda in device cache
/dev/disk/by-id/scsi-SATA_ST250DM000-1BD1_9VYF4LT4: Aliased to /dev/sda in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:0:0: Aliased to /dev/sda in device cache
/dev/disk/by-id/wwn-0x5000c5003f95ddd9: Aliased to /dev/sda in device cache
/dev/sda1: Added to device cache
/dev/disk/by-id/ata-ST250DM000-1BD141_9VYF4LT4-part1: Aliased to /dev/sda1 in device cache
/dev/disk/by-id/scsi-SATA_ST250DM000-1BD1_9VYF4LT4-part1: Aliased to /dev/sda1 in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:0:0-part1: Aliased to /dev/sda1 in device cache
/dev/disk/by-uuid/9c08e3e9-e9aa-43fc-a76b-36179552271d: Aliased to /dev/sda1 in device cache
/dev/disk/by-label/_Fedora-16-i686-: Aliased to /dev/sda1 in device cache
/dev/disk/by-id/wwn-0x5000c5003f95ddd9-part1: Aliased to /dev/sda1 in device cache
/dev/sda2: Added to device cache
/dev/disk/by-id/ata-ST250DM000-1BD141_9VYF4LT4-part2: Aliased to /dev/sda2 in device cache
/dev/disk/by-id/scsi-SATA_ST250DM000-1BD1_9VYF4LT4-part2: Aliased to /dev/sda2 in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:0:0-part2: Aliased to /dev/sda2 in device cache
/dev/disk/by-uuid/c3d258fd-5cf3-4da7-abab-c5903f940c7a: Aliased to /dev/sda2 in device cache
/dev/disk/by-id/wwn-0x5000c5003f95ddd9-part2: Aliased to /dev/sda2 in device cache
/dev/sdb: Added to device cache
/dev/disk/by-id/ata-ST380211AS_6PS0ND8D: Aliased to /dev/sdb in device cache
/dev/disk/by-id/scsi-SATA_ST380211AS_6PS0ND8D: Aliased to /dev/sdb in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:1:0: Aliased to /dev/sdb in device cache
/dev/sdb1: Added to device cache
/dev/disk/by-id/ata-ST380211AS_6PS0ND8D-part1: Aliased to /dev/sdb1 in device cache
/dev/disk/by-id/scsi-SATA_ST380211AS_6PS0ND8D-part1: Aliased to /dev/sdb1 in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:1:0-part1: Aliased to /dev/sdb1 in device cache
/dev/disk/by-uuid/27e07bc8-a9f3-4a0c-bed9-e73ac1fc95f8: Aliased to /dev/sdb1 in device cache
/dev/sdb2: Added to device cache
/dev/disk/by-id/ata-ST380211AS_6PS0ND8D-part2: Aliased to /dev/sdb2 in device cache
/dev/disk/by-id/scsi-SATA_ST380211AS_6PS0ND8D-part2: Aliased to /dev/sdb2 in device cache
/dev/disk/by-path/pci-0000:00:1f.2-scsi-1:0:1:0-part2: Aliased to /dev/sdb2 in device cache
/dev/loop0: Added to device cache
/dev/disk/by-label/Fedora-16-i686-Live-Desktop.iso: Aliased to /dev/loop0 in device cache
/dev/loop1: Added to device cache
/dev/loop2: Added to device cache
/dev/loop3: Added to device cache
/dev/loop4: Added to device cache
/dev/loop5: Added to device cache
/dev/loop6: Added to device cache
/dev/loop7: Added to device cache
Opened /dev/loop0 RO O_DIRECT
/dev/loop0: size is 0 sectors
/dev/loop0: Skipping: Too small to hold a PV
Closed /dev/loop0
Opened /dev/sda RO O_DIRECT
/dev/sda: size is 488397168 sectors
/dev/sda: block size is 4096 bytes
/dev/sda: Skipping: Partition table signature found
Closed /dev/sda
/dev/cdrom: Skipping: Unrecognised LVM device type 11
Opened /dev/loop1 RO O_DIRECT
/dev/loop1: size is 0 sectors
/dev/loop1: Skipping: Too small to hold a PV
Closed /dev/loop1
Opened /dev/sda1 RO O_DIRECT
/dev/sda1: size is 475811840 sectors
Closed /dev/sda1
/dev/sda1: size is 475811840 sectors
Opened /dev/sda1 RO O_DIRECT
/dev/sda1: block size is 4096 bytes
Closed /dev/sda1
Using /dev/sda1
Opened /dev/sda1 RO O_DIRECT
/dev/sda1: block size is 4096 bytes
/dev/sda1: No label detected
Closed /dev/sda1
Opened /dev/loop2 RO O_DIRECT
/dev/loop2: size is 0 sectors
/dev/loop2: Skipping: Too small to hold a PV
Closed /dev/loop2
Opened /dev/sda2 RO O_DIRECT
/dev/sda2: size is 12582912 sectors
Closed /dev/sda2
/dev/sda2: size is 12582912 sectors
Opened /dev/sda2 RO O_DIRECT
/dev/sda2: block size is 4096 bytes
Closed /dev/sda2
Using /dev/sda2
Opened /dev/sda2 RO O_DIRECT
/dev/sda2: block size is 4096 bytes
/dev/sda2: No label detected
Closed /dev/sda2
Opened /dev/loop3 RO O_DIRECT
/dev/loop3: size is 0 sectors
/dev/loop3: Skipping: Too small to hold a PV
Closed /dev/loop3
Opened /dev/loop4 RO O_DIRECT
/dev/loop4: size is 0 sectors
/dev/loop4: Skipping: Too small to hold a PV
Closed /dev/loop4
Opened /dev/loop5 RO O_DIRECT
/dev/loop5: size is 0 sectors
/dev/loop5: Skipping: Too small to hold a PV
Closed /dev/loop5
Opened /dev/loop6 RO O_DIRECT
/dev/loop6: size is 0 sectors
/dev/loop6: Skipping: Too small to hold a PV
Closed /dev/loop6
Opened /dev/loop7 RO O_DIRECT
/dev/loop7: size is 0 sectors
/dev/loop7: Skipping: Too small to hold a PV
Closed /dev/loop7
Opened /dev/sdb RO O_DIRECT
/dev/sdb: size is 156301488 sectors
/dev/sdb: block size is 4096 bytes
/dev/sdb: Skipping: Partition table signature found
Closed /dev/sdb
Opened /dev/sdb1 RO O_DIRECT
/dev/sdb1: size is 1024000 sectors
Closed /dev/sdb1
/dev/sdb1: size is 1024000 sectors
Opened /dev/sdb1 RO O_DIRECT
/dev/sdb1: block size is 4096 bytes
Closed /dev/sdb1
Using /dev/sdb1
Opened /dev/sdb1 RO O_DIRECT
/dev/sdb1: block size is 4096 bytes
/dev/sdb1: No label detected
Closed /dev/sdb1
Opened /dev/sdb2 RO O_DIRECT
/dev/sdb2: size is 155275264 sectors
Closed /dev/sdb2
/dev/sdb2: size is 155275264 sectors
Opened /dev/sdb2 RO O_DIRECT
/dev/sdb2: block size is 4096 bytes
Closed /dev/sdb2
Using /dev/sdb2
Opened /dev/sdb2 RO O_DIRECT
/dev/sdb2: block size is 4096 bytes
/dev/sdb2: lvm2 label detected at sector 1
lvmcache: /dev/sdb2: now in VG #orphans_lvm2 (#orphans_lvm2)
Closed /dev/sdb2
No volume groups found


Does it look good to you?
What to do now?

unruh

Apr 12, 2012, 11:46:41 AM
On 2012-04-12, Harry <simon...@gmail.com> wrote:
> On Apr 12, 3:46 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
>> Harry <simonsha...@gmail.com> writes:
>> > Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
>> > `SUPERBLOCK` values listed above, like so:
>>
>> >         [root@localhost liveuser]# e2fsck -b 32768 /dev/sda
>>
>> Did you mean sda or sda1 here?  (And similarly elsewhere.)
>
> I meant sda2 throughout; the /dev/sda1 partition is just fine.

I am very confused. Why in the world would you be writing to /dev/sda2
in order to make it bootable? Booting is usually from the MBR of the
disk, not from the partitions. So, what did you actually do? Please tell
us exactly, without any misprints this time.


>
> Fyi, the `fdisk -l` prints this:
> Disk /dev/sda: 80.0 GB, 80026361856 bytes
> 255 heads, 63 sectors/track, 9729 cylinders, total 156301488
> sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00000000
>
> Device Boot Start End Blocks Id System
> /dev/sda1 * 2048 1026047 512000 83 Linux
> /dev/sda2 1026048 156301311 77637632 8e Linux LVM

Is sda2 really LVM or is this another misprint or evidence of the
destruction you caused?


>
> Can you/somebody please help me? I have not modified the state of the
> disk in any other way other than first 446 bytes of MBR via the `dd`
> command, so all the data should literally be sitting there.

Of the MBR or of the second partition? Which?

>
>> Next time make a backup first.
> Yes, I have learnt the lesson a VERY PAINFUL way. I thought I was
> smart enough for the 'simple-enough' operation I was doing.

I guess that is usually the way people learn. If you really value the
data on that disk, buy another disk, do a dd copy of this disk to that
new disk, and do all your experiments on the new disk.
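
A minimal sketch of that clone-first approach, assuming a spare disk or image file at least as large as the original. `clone_and_verify` is a hypothetical helper; the paths passed to it are whatever the source and destination really are, and they need double-checking because `dd` overwrites the destination without asking.

```shell
#!/bin/sh
# Sketch: make a raw copy of a disk (or image) and verify it before experimenting.
# clone_and_verify is a hypothetical helper. SRC and DST may be block devices
# or image files; DST is overwritten.
clone_and_verify() {
    src=$1
    dst=$2
    # Copy in 64 KiB blocks; dd's status output on stderr is discarded.
    dd if="$src" of="$dst" bs=65536 2>/dev/null || return 1
    # Compare only as many bytes as the source holds, since a spare disk is
    # often larger than the original. Counting reads the source once more.
    src_bytes=$(( $(wc -c < "$src") ))
    cmp -n "$src_bytes" "$src" "$dst"
}

# Example (placeholder device names, run as root):
#   clone_and_verify /dev/OLD_DISK /dev/SPARE_DISK && echo "clone verified"
```

Once the copy verifies, every further repair attempt can be made on the clone while the original stays untouched.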


unruh

Apr 12, 2012, 11:50:11 AM
On 2012-04-12, Harry <simon...@gmail.com> wrote:
> On Apr 12, 4:36 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
>> Harry <simonsha...@gmail.com> writes:
>> > Richard Kettlewell <r...@greenend.org.uk> wrote:
>> >> Harry <simonsha...@gmail.com> writes:
>> >>> Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
>> >>> `SUPERBLOCK` values listed above, like so:
>>
>> >>>         [root@localhost liveuser]# e2fsck -b 32768 /dev/sda
>>
>> >> Did you mean sda or sda1 here?  (And similarly elsewhere.)
>>
>> > I meant sda2 throughout; the /dev/sda1 partition is just fine.
>>
>> Passing the right device name to e2fsck might help, then.
>> (How do you expect anyone to help you if you don't describe the
>> situation accurately?)
>>
>> > Fyi, the `fdisk -l` prints this:
>> >     Disk /dev/sda: 80.0 GB, 80026361856 bytes
>> >     255 heads, 63 sectors/track, 9729 cylinders, total 156301488
>> > sectors
>> >     Units = sectors of 1 * 512 = 512 bytes
>> >     Sector size (logical/physical): 512 bytes / 512 bytes
>> >     I/O size (minimum/optimal): 512 bytes / 512 bytes
>> >     Disk identifier: 0x00000000
>>
>> >        Device Boot      Start         End      Blocks   Id  System
>> >     /dev/sda1   *        2048     1026047      512000   83  Linux
>> >     /dev/sda2         1026048   156301311    77637632   8e  Linux LVM
>>
>> If those IDs are accurate then there's no filesystem in sda2, but rather
>> an LVM PV, which would explain why attempts to use ext4 tools on it
>> don't work.
>
> Richard, let me first come out clean with the whole truth.
>
> Since my original post to superuser.com, I had switched the location
> of the problem disk from it being 'sda' originally to being 'sdb' now.
> This is because the original 'sdb' had a freshly installed, bootable
> OS (Fedora 16) and it was bigger (250 GB) - so I thought I'd make it
> my 'primary' disk (or, 'sda'), and keep the older 80 G disk as the
> 'secondary' (or, 'sdb') disk.
>
> However, just before pasting the `fdisk -l` output for you (in my
> previous post to you), I simply searched and replaced 'sdb' with 'sda'
> in the `fdisk -l` output to make my message on this forum look
> consistent with what I posted yesterday on superuser.com.

And you expect help? Wi
But officer, how was I to know that shooting him in the head would kill
him. It was only a very tiny hole!

Doug Freyburger

Apr 12, 2012, 11:59:07 AM
Harry wrote:
>
> =================
> Details of what I did:
> =================
>
> Using the `dd` command, I was hoping that I would be able to copy over
> the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
> make Disk A bootable just like Disk B. I issued the command:
>
> dd if=/dev/sdb of=/dev/sda bs=446 count=1
>
> But when I could not boot up from `sda`, I rebooted from `sdb` to see
> what was going on. To my horror, `sda` was being reported to have a
> bad superblock, now.

Okay, let's step back and think about what you did compared to what you
have been trying to do since.

What you did - Take the partition table of disk sdb and write it to the
partition table of disk sda. That means nothing on the sdb partitions is
affected. That means the partitions that used to be on sda have been
destroyed but none of the data inside those former partitions has been
touched.

What you have been trying to do since - Work on invalid partitions
that got cloned from disk sdb. No amount of effort on this front can
ever possibly work. Either the partition tables started out identical
and you saw no effect and none of this ever happened, or they are
different and you are now trying to work on incorrect partition data.
That can't ever help.

So what do you need to do? You need to stop working on the invalid
partitions in the table and start working on restoring the correct
partitions. Nothing else is going to help.

Do you have a printout of the partition table that used to be on disk
sda? If you do, your situation is promising. If you don't, it's time to
start guessing.

Get the print out. Do "fdisk /dev/sda". By hand delete all of the
partitions that exist on it - They were copied in place and are not
valid. Then by hand create the partitions in the sizes and locations
they used to exist. Save it and reboot. Try the fsck again. If it
worked you're done with the debugging. Back to the drawing board of how
to mark a drive bootable - You have learned that's not the way.

If you got it wrong your saving grace is so far all you have written to
is the partition table. All of the data in the former partitions is
still there. So start using "fdisk /dev/sda" and start guessing at how
many partitions there used to be and what sizes they used to be. Hint -
Start by guessing one partition on the whole disk. Then one partition
on half of it. Then one partition on 3/4ths of it or whatever. Do a
binary descent. Eventually you'll know the exact size and location of
the first partition.

If there's a second partition it will start one cylinder after the
first. Initial guess is the rest of the drive. Lather rinse repeat
until you have it figured out.

The hard part will be figuring out the size of any swap partition
because fsck won't help. You hope that drive only had filesystems.

That's your strategy. Don't bother with any work on data inside the
partitions that are there now because the partition table is not correct.

Harry

Apr 12, 2012, 12:35:01 PM
On Apr 12, 8:46 pm, unruh <un...@invalid.ca> wrote:
unruh, I do expect to be reprimanded, so I cannot and won't fire
back at you...
I'm clarifying what I did in my reply to Doug's post; otherwise, I'll
have to repeat the same info in multiple responses. If you can, please
stick around.

unruh

Apr 12, 2012, 12:39:17 PM
On 2012-04-12, Doug Freyburger <dfre...@yahoo.com> wrote:
> Harry wrote:
>>
>> =================
>> Details of what I did:
>> =================
>>
>> Using the `dd` command, I was hoping that I would be able to copy over
>> the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
>> make Disk A bootable just like Disk B. I issued the command:
>>
>> dd if=/dev/sdb of=/dev/sda bs=446 count=1
>>
>> But when I could not boot up from `sda`, I rebooted from `sdb` to see
>> what was going on. To my horror, `sda` was being reported to have a
>> bad superblock, now.
>
> Okay, let's step back and think about what you did compared to what you
> have been trying to do since.

Of course we have no idea if that is actually what he did, since he
keeps revising his statements. For example, maybe that was really
bs=446K. He first said everywhere that sda was written he meant sda2. Is
this one of those cases? It seems like it since he also says that sda1
partition is fine, and the partition table is readable by fdisk.


>
> What you did - Take the partition table of disk sdb and write it to the
> partition table of disk sda. That means nothing on the sdb partitions is
> affected. That means the partitions that used to be on sda have been
> destroyed but none of the data inside those former partitions has been
> touched.

It is also unclear what the partitions were. It would seem, but who
knows, that sdb2 (since he has also told us he lied about the sda and
sdb labeling) is an LVM partition. (Why? Oh well.)

>
> What you have been trying to do since - Work on invalid partitions
> that got cloned from disk sdb. No amount of effort on this front can
> ever possibly work. Either the partition tables started out identical
> and you saw no effect and none of this ever happened, or they are
> different and you are now trying to work on incorrect partition data.
> That can't ever help.
>
> So what do you need to do? You need to stop working on the invalid
> partitions in the table and start working on restoring the correct
> partitions. Nothing else is going to help.
>
> Do you have a printout of the partition table that used to be on disk
> sda? If you do, your situation is promising. If you don't, it's time to
> start guessing.
>
> Get the print out. Do "fdisk /dev/sda". By hand delete all of the
> partitions that exist on it - They were copied in place and are not
> valid. Then by hand create the partitions in the sizes and locations
> they used to exist. Save it and reboot. Try the fsck again. If it
> worked you're done with the debugging. Back to the drawing board of how
> to mark a drive bootable - You have learned that's not the way.

As I said, he should clone the disk and work on the clone only. He is
liable to mess things up still further by trying to fix things.

Harry

Apr 12, 2012, 12:48:29 PM
On Apr 12, 8:59 pm, Doug Freyburger <dfrey...@yahoo.com> wrote:
> Harry wrote:
>
> > =================
> > Details of what I did:
> > =================
>
> > Using the `dd` command, I was hoping that I would be able to copy over
> > the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
> > make Disk A bootable just like Disk B. I issued the command:
>
> > dd if=/dev/sdb of=/dev/sda bs=446 count=1
>
> > But when I could not boot up from `sda`, I rebooted from `sdb` to see
> > what was going on. To my horror, `sda` was being reported to have a
> > bad superblock, now.
>
> Okay, let's step back and think about what you did compared to what you
> have been trying to do since.
>
> What you did - Take the partition table of disk sdb and write it to the
> partition table of disk sda. That means nothing on the sdb partitions is
> affected. That means the partitions that used to be on sda have been
> destroyed but none of the data inside those former partitions has been
> touched.

That's correct. With you so far.

> What you have been trying to do since - Work on invalid partitions
> that got cloned from disk sdb. No amount of effort on this front can
> ever possibly work. Either the partition tables started out identical
> and you saw no effect and none of this ever happened, or they are
> different and you are now trying to work on incorrect partition data.
> That can't ever help.
>
> So what do you need to do? You need to stop working on the invalid
> partitions in the table and start working on restoring the correct
> partitions. Nothing else is going to help.

Yes, all I am interested in right now is to *somehow* be able to
recover the filesystem (ext4) and the data on it.

> Do you have a printout of the partition table that used to be on disk
> sda? If you do, your situation is promising. If you don't, it's time to
> start guessing.

No, I don't have a printout of the partition table of the former sda.

> Get the print out. Do "fdisk /dev/sda". By hand delete all of the
> partitions that exist on it - They were copied in place and are not
> valid. Then by hand create the partitions in the sizes and locations
> they used to exist. Save it and reboot. Try the fsck again. If it
> worked you're done with the debugging. Back to the drawing board of how
> to mark a drive bootable - You have learned that's not the way.
>
> If you got it wrong your saving grace is so far all you have written to
> is the partition table. All of the data in the former partitions is
> still there. So start using "fdisk /dev/sda" and start guessing at how
> many partitions there used to be and what sizes they used to be. Hint -
> Start by guessing one partition on the whole disk. Then one partition
> on half of it. Then one partition on 3/4ths of it or whatever. Do a
> binary descent. Eventually you'll know the exact size and location of
> the first partition.
>
> If there's a second partition it will start one cylinder after the
> first. Initial guess is the rest of the drive. Lather rinse repeat
> until you have it figured out.
>
> The hard part will be figuring out the size of any swap partition
> because fsck won't help. You hope that drive only had filesystems.
>
> That's your strategy. Don't bother with any work on data inside the
> partitions that are there now because the partition table is not correct.

Doug, after getting some revelations from Richard above (on LVM and
the futility of using ext4 tools directly on it), I have described my
situation more fully (and hopefully better) at the following superuser.com
link. Not sure if it would be rude on my part to ask you this, but if
you don't mind the inconvenience of clicking this link, you will see a
more succinct and helpful description of my problem.

Superuser.com link:
http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how-to-mount-and-recover-data-from-it

Before I try the binary descent approach you suggest above, may I
request you to go over this new superuser.com link and let me know if
you still think nothing else simpler is possible to salvage the file
system sitting in the logical volume somewhere.

Many, many thanks to unruh and yourself for responding. I will be
forever grateful to you guys if you can help me salvage my data.


Harry

unread,
Apr 12, 2012, 12:53:43 PM4/12/12
to
On Apr 12, 9:39 pm, unruh <un...@invalid.ca> wrote:
unruh, I indeed have a cloned image of the original messed up device (/
dev/sda).

Please forgive my typos in my original post. This doesn't mean that
everything I wrote had a typo in it. The `dd` command that I issued,
e.g., has no typos in the original post.

I know there is a way to mount the image of the clone device
(partition?) in loopback mode. If you/Doug/sb can provide step by step
instructions, then I would be very grateful to you.
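
For reference, the loopback approach asked about here can be sketched roughly as follows. This is a sketch under assumptions, not a tested recovery procedure: `disk.img` is a hypothetical name for the whole-disk image, and the start sectors (2048 and 1026048) are the ones in the `fdisk -l` output quoted elsewhere in this thread. The script only prints the `losetup` commands (they need root), so the byte offsets can be checked before anything is attached.

```shell
#!/bin/sh
SECTOR=512            # logical sector size reported by fdisk -l
IMG=disk.img          # hypothetical path to the whole-disk image

# Print the byte offset of a partition, given its start sector.
part_offset() {
    echo $(( $1 * SECTOR ))
}

# Dry run: show the read-only loop attachments without executing them.
echo "losetup -f --show -r -o $(part_offset 2048) $IMG"      # first partition
echo "losetup -f --show -r -o $(part_offset 1026048) $IMG"   # second partition (LVM)
```

When run as root, each printed command should report the loop device it attached (for example /dev/loop0). Attaching read-only (`-r`) keeps the image untouched while the LVM tools are pointed at the second device.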

Richard Kettlewell

unread,
Apr 12, 2012, 1:43:02 PM4/12/12
to
Doug Freyburger <dfre...@yahoo.com> writes:
> Harry wrote:

>> =================
>> Details of what I did:
>> =================
>>
>> Using the `dd` command, I was hoping that I would be able to copy over
>> the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
>> make Disk A bootable just like Disk B. I issued the command:
>>
>> dd if=/dev/sdb of=/dev/sda bs=446 count=1
>>
>> But when I could not boot up from `sda`, I rebooted from `sdb` to see
>> what was going on. To my horror, `sda` was being reported to have a
>> bad superblock, now.
>
> Okay, let's step back and think about what you did compared to what you
> have been trying to do since.
>
> What you did - Take the partition table of disk sdb and write it to the
> partition table of disk sda.

That's not correct. 446 is precisely the value you use to avoid
modifying the partition table of the target. Since the partition table
quoted is consistent with an 80GB disk and not a 250GB disk, it's a safe
bet that the partition table on the target wasn't modified.

--
http://www.greenend.org.uk/rjk/

Harry

unread,
Apr 12, 2012, 1:46:46 PM4/12/12
to
On Apr 12, 8:59 pm, Doug Freyburger <dfrey...@yahoo.com> wrote:
> Harry wrote:
>
> If you got it wrong your saving grace is so far all you have written to
> is the partition table.  All of the data in the former partitions is
> still there.  So start using "fdisk /dev/sda" and start guessing at how
> many partitions there used to be and what sizes they used to be.  Hint -
> Start by guessing one partition on the whole disk.  Then one partition
> on half of it.  Then one partition on 3/4ths of it or whatever.  Do a
> binary descent.  Eventually you'll know the exact size and location of
> the first partition.
>
> If there's a second partition it will start one cylinder after the
> first.  Initial guess is the rest of the drive.  Lather rinse repeat
> until you have it figured out.

Doug, is there any way to avoid having to reboot after each edit of
the partition table? The messed up disk is sitting as sdb (or, a
secondary disk) in my current system, which means I am not booting
from it. For example, given the fact that right now sdb is
unmountable, can I repeatedly try the mount command (or, any LVM-
equivalent of mount) to check whether or not my edits to the partition
table were correct. I have included a copy of a backup of the LVM
setup here:

http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how-to-mount-and-recover-data-from-it

Harry

unread,
Apr 12, 2012, 1:55:21 PM4/12/12
to
On Apr 12, 10:43 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
I read from a Wikipedia entry on disk partitioning that the first 446
bytes of the MBR is 'pure code' -- i.e. no data. Which is why I
ventured in the first place to boldly attempt -- without a backup of
the target MBR -- the copying of the 446-byte MBR code from the former
(250 GB) sdb to the former (80G) sda.

But something went wrong, and I still don't understand it. Right now,
my priority is to get my data back. Immediately after this is done,
I'd also like to understand what the heck I did.
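
For anyone reading this later, the backup that was skipped here could look something like the sketch below. It runs entirely on a scratch file (`disk.img` is a hypothetical stand-in for the real device), saves the full 512-byte first sector, simulates the risky 446-byte overwrite, and then restores the saved sector.

```shell
#!/bin/sh
# Scratch directory and files; nothing here touches a real disk.
work=$(mktemp -d)
head -c 4096 /dev/urandom > "$work/disk.img"
cp "$work/disk.img" "$work/pristine.img"   # kept only for the comparison below

# 1. Save the whole first sector (boot code, partition table, signature).
dd if="$work/disk.img" of="$work/mbr.bak" bs=512 count=1 2>/dev/null

# 2. Simulate the risky change: overwrite the first 446 bytes.
dd if=/dev/zero of="$work/disk.img" bs=446 count=1 conv=notrunc 2>/dev/null

# 3. Put the saved sector back.
dd if="$work/mbr.bak" of="$work/disk.img" bs=512 count=1 conv=notrunc 2>/dev/null

cmp -s "$work/disk.img" "$work/pristine.img" && echo "restored: identical to original"
```

`conv=notrunc` is only needed because the target is a regular file, which dd would otherwise truncate; a block device cannot be truncated.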

Doug Freyburger

unread,
Apr 12, 2012, 2:07:29 PM4/12/12
to
Richard Kettlewell wrote:
> Doug Freyburger <dfre...@yahoo.com> writes:
>> Harry wrote:
>
>>> dd if=/dev/sdb of=/dev/sda bs=446 count=1
>
>> What you did - Take the partition table of disk sdb and write it to the
>> partition table of disk sda.
>
> That's not correct. 446 is precisely the value you use to avoid
> modifying the partition table of the target. Since the partition table
> quoted is consistent with an 80GB disk and not a 250GB disk, it's a safe
> bet that the partition table on the target wasn't modified.

I wonder if "dd" has an underlying block size or if it starts with an
uninitialized block.

I suspect it did not read the first sector, change the first 446 bytes
in the buffer, then write the first sector back out again. I fear it
wrote 512 bytes of data (maybe more in 512 byte increments) where the
first 446 bytes were good and the rest was nonsense. It would explain
the missing partition data.
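
One way to check this suspicion without touching a real disk is to repeat the command on scratch files, as in the sketch below. The file names are hypothetical. It only shows what dd itself writes to a regular file; on a block device the kernel handles the partial-sector write, so treat the result as a hint rather than proof.

```shell
#!/bin/sh
work=$(mktemp -d)

# "Target" sector: 512 bytes of 0xFF standing in for the old MBR.
head -c 512 /dev/zero | LC_ALL=C tr '\000' '\377' > "$work/target.img"
# "Source" sector: 512 zero bytes standing in for the other disk's MBR.
head -c 512 /dev/zero > "$work/source.img"

# Same block size and count as the original command; conv=notrunc is added
# only because a regular file would otherwise be truncated to 446 bytes.
dd if="$work/source.img" of="$work/target.img" bs=446 count=1 conv=notrunc 2>/dev/null

# Count how many of the last 66 bytes (partition table + signature) are non-zero,
# i.e. still carry the original 0xFF fill.
tail_ff=$(tail -c 66 "$work/target.img" | LC_ALL=C tr -d '\000' | wc -c | tr -d ' ')
echo "bytes 446..511 still intact: $tail_ff of 66"
```

If all 66 trailing bytes survive, dd wrote exactly 446 bytes and did not pad the write out to a full sector.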

I'll reread the thread then glance at that forum.

Richard Kettlewell

unread,
Apr 12, 2012, 2:18:02 PM4/12/12
to
Harry <simon...@gmail.com> writes:
> lvmcache: /dev/sdb2: now in VG #orphans_lvm2 (#orphans_lvm2)
> Closed /dev/sdb2
> No volume groups found

You MIGHT be able to use dmsetup to create a block device corresponding
to the lv_root volume described in the backup file you quote over on
superuser. See my response there.

--
http://www.greenend.org.uk/rjk/

Gernot Fink

unread,
Apr 12, 2012, 2:20:01 PM4/12/12
to
In article <81fb3fa8-62ab-43be...@h4g2000pbe.googlegroups.com>,
Harry <simon...@gmail.com> writes:
> dd if=/dev/sdb of=/dev/sda bs=446 count=1

This should not shred the partition table, but it looks like it did.

Check or repair the partition table as a first step.
If you find nothing or a bad table, use testdisk to scan for
partitions.

After this, use dumpe2fs to find alternate superblocks.
http://www.cyberciti.biz/faq/linux-find-alternative-superblocks/

As the next step, make an initial boot with supergrubdisk.

--
MFG Gernot

Richard Kettlewell

unread,
Apr 12, 2012, 2:20:06 PM4/12/12
to
Doug Freyburger <dfre...@yahoo.com> writes:
> I wonder if "dd" has an underlying block size or if it starts with an
> uninitialized block.
>
> I suspect it did not read the first sector, change the first 446 bytes
> in the buffer, then write the first sector back out again. I fear it
> wrote 512 bytes of data (maybe more in 512 byte increments) where the
> first 446 bytes were good and the rest was nonsense. It would explain
> the missing partition data.
>
> I'll reread the thread then glance at that forum.

The partition table is fine. You're chasing a red herring.

--
http://www.greenend.org.uk/rjk/

Doug Freyburger

unread,
Apr 12, 2012, 2:27:30 PM4/12/12
to
Harry wrote:
>
> I issued `lvdisplay -vvv` as root and this is what I got:
> (Note: /dev/sdb is the device whose first 446 bytes got messed up).
> ...
> /dev/sdb: size is 156301488 sectors
> /dev/sdb: block size is 4096 bytes
> /dev/sdb: Skipping: Partition table signature found
> Closed /dev/sdb

There's a partition table for sdb. That's hopeful.

> /dev/sdb1: size is 1024000 sectors
> Closed /dev/sdb1
> /dev/sdb1: size is 1024000 sectors
> Opened /dev/sdb1 RO O_DIRECT
> /dev/sdb1: block size is 4096 bytes

That's 250 MB, right? It's a typical size for /boot.

> /dev/sdb2: size is 155275264 sectors
> Opened /dev/sdb2 RO O_DIRECT
> /dev/sdb2: block size is 4096 bytes

Does that match the 37 GB I calculated from the fdisk output? Seems
like only half of the drive was used. Most likely that's my arithmetic,
not what the numbers really say.
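
The arithmetic can be double-checked with a small sketch like the one below. The only assumption is the 512-byte sector size, which is what the `fdisk -l` output in this thread reports; the sector counts are the ones quoted above.

```shell
#!/bin/sh
SECTOR=512   # bytes per sector, per the fdisk -l output in this thread

# Convert a sector count to whole MiB (1 MiB = 1048576 bytes), rounding down.
# Dividing by sectors-per-MiB avoids large intermediate products.
sectors_to_mib() {
    echo $(( $1 / (1048576 / SECTOR) ))
}

echo "sdb1: $(sectors_to_mib 1024000) MiB"
echo "sdb2: $(sectors_to_mib 155275264) MiB"
echo "sdb:  $(sectors_to_mib 156301488) MiB"
```

With these inputs the first partition works out to 500 MiB and the second to 75818 MiB (about 74 GiB), so the two partitions together cover almost the whole 80 GB disk.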

> /dev/sdb2: lvm2 label detected at sector 1

That's a bingo.

> lvmcache: /dev/sdb2: now in VG #orphans_lvm2 (#orphans_lvm2)
> Closed /dev/sdb2
> No volume groups found

It says it found a volume group.

> Does it look good to you?
> What to do now?

Variations on "vgscan" and "vgimport". I've done more LVM work on HPUX
than on Linux recently so I can't rattle vgscan and vgimport lines off
the top of my head.

Doug Freyburger

unread,
Apr 12, 2012, 2:51:21 PM4/12/12
to
Harry wrote:
>
> /dev/sdb2: lvm2 label detected at sector 1
> lvmcache: /dev/sdb2: now in VG #orphans_lvm2 (#orphans_lvm2)

After reading the forum response by Steven Monday I think this gives a
hint as to how to proceed next. It says there's a volume group there.
It says it does not know the name of that volume group (probably because
it was active not exported when it went down).

The first Linux host I could get to was Red Hat so it may behave diff.

>more /etc/redhat-release
Red Hat Enterprise Linux Server release 5.7 (Tikanga)

Try "vgs". It just might tell you there is now a volume group
"#orphans_lvm2" available for import. Pain in the neck having a hash in
the name but that's certainly deliberate.

Try "vgimport -a" to see if it brings it in as "#orphans_lvm2" or
"vg_XYZ".

Then try "vgimport -v vg_XYZ", yeah again, shrug.

Then "vgimport -v \#orphans_lvm2" escaping the hash.

If any of those work do "vgrename \#orphans_lvm2 vg_XYZ" and
"vgexport vg_XYZ".

If all of that fails try

vgimport WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw

or

vgrename WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw xg_XYZ

I pasted that ID from your forum post so if it's edited use the real
one.

Then all the "-a Y" stuff on the VG and LVs in it. Then fsck. Make
sure to vgexport it before removing it back to the previous host.

unruh

unread,
Apr 12, 2012, 2:57:08 PM4/12/12
to
On 2012-04-12, Harry <simon...@gmail.com> wrote:
> On Apr 12, 8:46 pm, unruh <un...@invalid.ca> wrote:
>> On 2012-04-12, Harry <simonsha...@gmail.com> wrote:
>>
>> > On Apr 12, 3:46 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
>> >> Harry <simonsha...@gmail.com> writes:
>> >> > Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
>> >> > `SUPERBLOCK` values listed above, like so:
>>
>> >> >         [root@localhost liveuser]# e2fsck -b 32768 /dev/sda
>>
>> >> Did you mean sda or sda1 here? (And similarly elsewhere.)
>>
>> > I meant sda2 throughout; the /dev/sda1 partition is just fine.
>>
>> I am very confused. Why in the world would you be writing to /dev/sda2
>> in order to make it bootable? Booting is usually from the MBR of the
>> disk, not from the partitions. So, what did you actually do? Please tell
>> us exactly without any misprints this time.
>>
>> > Fyi, the `fdisk -l` prints this:
>> >     Disk /dev/sda: 80.0 GB, 80026361856 bytes
>> >     255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
>> >     Units = sectors of 1 * 512 = 512 bytes
>> >     Sector size (logical/physical): 512 bytes / 512 bytes
>> >     I/O size (minimum/optimal): 512 bytes / 512 bytes
>> >     Disk identifier: 0x00000000
>>
>> >        Device Boot      Start         End      Blocks   Id  System
>> >     /dev/sda1   *        2048     1026047      512000   83  Linux
>> >     /dev/sda2         1026048   156301311    77637632   8e  Linux LVM
>>
>> Is sda2 really LVM or is this another misprint or evidence of the
>> destruction you caused?
>>
>> > Can you/somebody please help me? I have not modified the state of the
>> > disk in any other way other than first 446 bytes of MBR via the `dd`
>> > command, so all the data should literally be sitting there.
>>
>> Of the MBR or of the second partition? Which?
>>
>>
>>
>> >> Next time make a backup first.
>> > Yes, I have learnt the lesson a VERY PAINFUL way. I thought I was
>> > smart enough for the 'simple-enough' operation I was doing.
>>
>> I guess that is usually the way people learn. If you really value the
>> data on that disk, buy another disk, do a dd copy of this disk to that
>> new disk, and do all your experiments on the new disk.
>
> unruh, I do expect to be reprimanded so cannot and thus won't fire
> back at you...
> I'm clarifying what I did in Doug's post. Otherwise, I'll have repeat
> the same info in multiple responses. If you can, please be around.

The reprimand is for lack of clarity. However my advice still stands. Do
not try to fix the original. Buy a new disk and do a dd copy from the
first to that, and then fix that copy. If your data is not worth $100,
then format the disk and start over. You will waste more than $100 of
time.


unruh

unread,
Apr 12, 2012, 3:03:10 PM4/12/12
to
On 2012-04-12, Harry <simon...@gmail.com> wrote:
> On Apr 12, 9:39 pm, unruh <un...@invalid.ca> wrote:
>> On 2012-04-12, Doug Freyburger <dfrey...@yahoo.com> wrote:
>>
>> > Harry wrote:
>>
>> >> =================
>> >> Details of what I did:
>> >> =================
>>
>> >> Using the `dd` command, I was hoping that I would be able to copy over
>> >> the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
>> >> make Disk A bootable just like Disk B. I issued the command:
>>
>> >>     dd if=/dev/sdb of=/dev/sda bs=446 count=1
>>
>> >> But when I could not boot up from `sda`, I rebooted from `sdb` to see
>> >> what was going on. To my horror, `sda` was being reported to have a
>> >> bad superblock, now.
>>
>> > Okay, let's step back and think about what you did compared to what you
>> > have been trying to do since.
>>
>> Of course we have no idea if that is actually what he did, since he
>> keeps revising his statements. For example, maybe that was really
>> bs=446K. He first said everywhere that sda was written he meant sda2. Is
>> this one of those cases? It seems like it since he also says that sda1
>> partition is fine, and the partition table is readable by fdisk.
>>
>> > What you did - Take the partition table of disk sdb and write it to the
>> > partition table of disk sda. That means nothing on sdb partitions is
>> > affected. That means the partitions that used to be on sda have been
>> > destroyed but none of the data inside those former partitions has been
>> > touched.
>>
>> It is also unclear what the partitions were. It would seem, but who
>> knows, that sdb2 ( since he has also told us he lied about the sda and
>> sdb labeling) is an LVM partition. (Why? Oh well.)
>>
>> > What you have been trying to do since - Work on invalid partitions
>> > that got cloned from disk sdb. No amount of effort on this front can
>> > ever possibly work. Either the partition tables started out identical
>> > and you saw no effect and none of this ever happened or they are
>> > different and you are now trying to work on incorrect partition data.
>> > That can't ever help.
>>
>> > So what do you need to do? You need to stop working on the invalid
>> > partitions in the table and start working on restoring the correct
>> > partitions. Nothing else is going to help.
>>
>> > Do you have a print out of the partition table that used to be on disk
>> > sda? If you do your situation is promising. If you don't it's time to
>> > start guessing.
>>
>> > Get the print out. Do "fdisk /dev/sda". By hand delete all of the
>> > partitions that exist on it - They were copied in place and are not
>> > valid. Then by hand create the partitions in the sizes and locations
>> > they used to exist. Save it and reboot. Try the fsck again. If it
>> > worked you're done with the debugging. Back to the drawing board of how
>> > to mark a drive bootable - You have learned that's not the way.
>>
>> As I said, he should clone the disk and work on the clone only. He is
>> liable to mess things up still further by trying to fix things.
>>
>> > If you got it wrong your saving grace is so far all you have written to
>> > is the partition table. All of the data in the former partitions is
>> > still there. So start using "fdisk /dev/sda" and start guessing at how
>> > many partitions there used to be and what sizes they used to be. Hint -
>> > Start by guessing one partition on the whole disk. Then one partition
>> > on half of it. Then one partition on 3/4ths of it or whatever. Do a
>> > binary descent. Eventually you'll know the exact size and location of
>> > the first partition.
>>
>> > If there's a second partition it will start one cylinder after the
>> > first. Initial guess is the rest of the drive. Lather rinse repeat
>> > until you have it figured out.
>>
>> > The hard part will be figuring out the size of any swap partition
>> > because fsck won't help. You hope that drive only had filesystems.
>>
>> > That's your strategy. Don't bother with any work on data inside the
>> > partitions that are there now because the partition table is not correct.
>
> unruh, I indeed have a cloned image of the original messed up device (/
> dev/sda).
>
> Please forgive my typos in my original post. This doesn't mean that
> everything I wrote had a typo in it. The `dd` command that I issued,
> e.g., has no typos in the original post.
>
> I know there is a way to mount the image of the clone device
> (partition?) in loopback mode. If you/Doug/sb can provide step by step
> instructions, then I would be very grateful to you.

OK, so you have put away the original disk in a safe place, and will not
touch it. It is the clone you are working on. Then as Doug says, your
partition table is probably messed up. Thus you have no reason to trust
anything it says (fdisk -l), and as he says, you have to try to
reconstruct it. Ie, you have to repartition the clone so that its
partitions are the same as they were before you made your mistake.
Do you have any information about how they were partitioned? When you
partitioned it originally, how did you do it? Did you accept defaults or
did you round up (eg the first partition has 10GB and the second the
rest)? If you really cannot remember, I do not think that there is any
way of finding out -- at least I do not know of any. There may well be
something in the structure of the disk that tells you from the structure
of the data where the second partition starts. How well used is the
disk? Is there liable to be a huge blank space before the next
partition because that space was never used?



Harry

unread,
Apr 12, 2012, 2:49:01 PM4/12/12
to
On Apr 12, 11:18 pm, Richard Kettlewell <r...@greenend.org.uk> wrote:
Richard, please see my comment there. The command is syntactically
invalid. Did you try this on Fedora 16 or some other OS (or another
version of dmsetup)?

unruh

unread,
Apr 12, 2012, 3:07:43 PM4/12/12
to
On 2012-04-12, Harry <simon...@gmail.com> wrote:
The file system is there. It is the partition table that is messed up,
probably. Ie, if you can figure out where the boundaries of the
partitions are, you can just reconstruct those and everything else will
be there, including ext4.

>
>> Do you have a print out of the partition table that used to be on disk
>> sda? If you do your situation is promising. If you don't it's time to
>> start guessing.
>
> No, I don't have a printout of the partition table of the former sda.

That makes it hard. How did you originally partition it? How do you
choose your partition sizes?
The partition table on there now comes (probably) from the other drive
whose MBR you cloned. Ie, it is totally untrustworthy, including the LVM
comments.


>
> Superuser.com link:
> http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how-to-mount-and-recover-data-from-it
>
> Before I try the binary descent approach you suggest above, may I
> request you to go over this new superuser.com link and let me know if
> you still think nothing else simpler is possible to salvage the file
> system sitting in the logical volume somewhere.

You have no idea if you have a logical volume there. The partition table
is completely untrustworthy. It is like trying to find your way to Rome
while using a map of Switzerland.

unruh

unread,
Apr 12, 2012, 3:09:29 PM4/12/12
to
You may be right. On the other hand, the evidence is otherwise. He
cannot read the data on the disk, which he should be able to if the
partition table is unmodified.


>

unruh

unread,
Apr 12, 2012, 3:11:43 PM4/12/12
to
On 2012-04-12, Richard Kettlewell <r...@greenend.org.uk> wrote:
You know that how? Which piece of data that he reports do you base this
on? And how does your theory explain the other symptoms? Ie, if he is
chasing a red herring, what is the real trail he should be following?


>

Harry

unread,
Apr 12, 2012, 3:12:45 PM4/12/12
to
On Apr 12, 11:51 pm, Doug Freyburger <dfrey...@yahoo.com> wrote:
> Harry wrote:
>
> >       /dev/sdb2: lvm2 label detected at sector 1
> >         lvmcache: /dev/sdb2: now in VG #orphans_lvm2 (#orphans_lvm2)
>
> After reading the forum response by Steven Monday I think this gives a
> hint as to how to proceed next.  It says there's a volume group there.
> It says it does not know the name of that volume group (probably because
> it was active not exported when it went down).
>
> The first Linux host I could get to was Red Hat so it may behave diff.
>
> >more /etc/redhat-release
>
> Red Hat Enterprise Linux Server release 5.7 (Tikanga)
>
> Try "vgs".  It just might tell you there is now a volume group
> "#orphans_lvm2" available for import.  Pain in the neck having a hash in
> the name but that's certainly deliberate.
>
> Try "vgimport -a" to see if it brings it in as "#orphans_lvm2" or
> "vg_XYZ".
>
> Then try "vgimport -v vg_XYZ", yeah again, shrug.
>
> Then "vgimport -v \#orphans_lvm2" escaping the hash.

$ vgs
No volume groups found

$ vgimport -a
No volume groups found

$ vgimport -v vg_XYZ
Using volume group(s) on command line
Finding volume group "vg_XYZ"
Volume group "vg_XYZ" not found

$ vgimport -v \#orphans_lvm2
Using volume group(s) on command line

$ vgs -a
No volume groups found

$ vgs -a -d



>
> If any of those work do "vgrename  \#orphans_lvm2 vg_XYZ" and
> "vgexport vg_XYZ".

Didn't try vgexport.



> If all of that fails try
>
> vgimport WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw
>
> or
>
> vgrename WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw xg_XYZ

$ vgimport WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw
Volume group "WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw" not found

$ vgrename WN8593-xRnx-dn29-rcpb-tRAm-Bs5R-93DGWw xg_XYZ
No complete volume groups found



> Then all the "-a Y" stuff on the VG and LVs in it.  Then fsck.  Make
> sure to vgexport it before removing it back to the previous host.

Didn't come this far.

unruh

unread,
Apr 12, 2012, 3:13:27 PM4/12/12
to
On 2012-04-12, Harry <simon...@gmail.com> wrote:
> On Apr 12, 8:59 pm, Doug Freyburger <dfrey...@yahoo.com> wrote:
>> Harry wrote:
>>
>> If you got it wrong your saving grace is so far all you have written to
>> is the partition table. All of the data in the former partitions is
>> still there. So start using "fdisk /dev/sda" and start guessing at how
>> many partitions there used to be and what sizes they used to be. Hint -
>> Start by guessing one partition on the whole disk. Then one partition
>> on half of it. Then one partition on 3/4ths of it or whatever. Do a
>> binary descent. Eventually you'll know the exact size and location of
>> the first partition.
>>
>> If there's a second partition it will start one cylinder after the
>> first. Initial guess is the rest of the drive. Lather rinse repeat
>> until you have it figured out.
>
> Doug, is there any way to avoid having to reboot after each edit of
> the partition table? The messed up disk is sitting as sdb (or, a

The messed up disk should be nowhere around your computer. It should be
unplugged and stored in a closet. You should be working ONLY with a
clone of it, which you claim to have.

Harry

unread,
Apr 12, 2012, 3:21:07 PM4/12/12
to
On Apr 13, 12:13 am, unruh <un...@invalid.ca> wrote:
> >  http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how...

I have an image of the messed up disk (the former sda), which I
created thus:
dd if=/dev/sda of=/path/to/messedup.imsg

Isn't this sufficient, unruh? If something goes wrong, I can always
copy the image back, can't I?

Richard Kettlewell

unread,
Apr 12, 2012, 3:39:53 PM4/12/12
to
unruh <un...@invalid.ca> writes:
> Richard Kettlewell <r...@greenend.org.uk> wrote:

>> The partition table is fine. You're chasing a red herring.
> You know that how? which piece of data that he reports do you base
> this on?

The consistency between the physical disk, the partition table and the
lvm backup data.

--
http://www.greenend.org.uk/rjk/

unruh

unread,
Apr 12, 2012, 4:12:02 PM4/12/12
to
On 2012-04-12, Harry <simon...@gmail.com> wrote:
> On Apr 13, 12:13 am, unruh <un...@invalid.ca> wrote:
>> On 2012-04-12, Harry <simonsha...@gmail.com> wrote:
>>
>> > On Apr 12, 8:59 pm, Doug Freyburger <dfrey...@yahoo.com> wrote:
>> >> Harry wrote:
>>
>> >> If you got it wrong your saving grace is so far all you have written to
>> >> is the partition table. All of the data in the former partitions is
>> >> still there. So start using "fdisk /dev/sda" and start guessing at how
>> >> many partitions there used to be and what sizes they used to be. Hint -
>> >> Start by guessing one partition on the whole disk. Then one partition
>> >> on half of it. Then one partition on 3/4ths of it or whatever. Do a
>> >> binary descent. Eventually you'll know the exact size and location of
>> >> the first partition.
>>
>> >> If there's a second partition it will start one cylinder after the
>> >> first. Initial guess is the rest of the drive. Lather rinse repeat
>> >> until you have it figured out.
>>
>> > Doug, is there any way to avoid having to reboot after each edit of
>> > the partition table? The messed up disk is sitting as sdb (or, a
>>
>> The messed up disk should be nowhere around your computer. It should be
>> unplugged and stored in a closet. You should be working ONLY with a
>> clone of it, which you claim to have.
>>
>> > secondary disk) in my current system, which means I am not booting
>> > from it. For example, given the fact that right now sdb is
>> > unmountable, can I repeatedly try the mount command (or, any LVM-
>> > equivalent of mount) to check whether or not my edits to the partition
>> > table were correct. I have included a copy of a backup of the LVM
>> > setup here:
>>
>> > http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how...
>
> I have an image of the messed up disk (the former sda), which I
> created thus:
> dd if=/dev/sda of=/path/to/messedup.imsg
>
> Isn't this sufficient, unruh? If something goes wrong, I can always
> copy the image back, isn't it?

Perhaps, but I would far rather play with a copy than the original.

The Natural Philosopher

unread,
Apr 12, 2012, 4:35:31 PM4/12/12
to
Harry wrote:
> Hello,
> I posted my question to superuser.com (http://superuser.com/questions/
> 410796/unable-to-repair-an-ext4-filesystem-with-bad-superblock) but
> haven't got any response yet.
>
> In summary, not realizing what I was doing, I overwrote the first 446
> bytes of MBR via the DD command. Would greatly, *GREATLY* appreciate
> if someone could help me salvage my disk!
>

I bet you would. My cursory response is that you have probably - if it
wasn't backed up - fucked it right royally and completely.


If it mounts at all, and you can get data off it, back it up and start
again.

If it doesn't mount...well. Good luck.

IF you know the format of the superblock you MIGHT try patching that in..

so if you have another identical drive you COULD rip that off the one
for the other.

I'd dd the raw disk off first as is and save that somewhere on something
else as a possible backup.

Then you might try a repartition on it with ..fdisk? to restore the data
structures, but that tends to wipe directories as well? Or does it? Not sure.

But a repartition would at least establish the sort of formats you
need for a superblock, and then rolling the backed up disk, all except
that block, back would maybe fix it.

Moral: don't go down one way streets unless you know exactly where they
lead.


>


--
To people who know nothing, anything is possible.
To people who know too much, it is a sad fact
that they know how little is really possible -
and how hard it is to achieve it.

The Natural Philosopher

unread,
Apr 12, 2012, 4:37:19 PM4/12/12
to
unruh wrote:
>
>
> But officer, how was I to know that shooting him in the head would kill
> him. It was only a very tiny hole!
>
:-)

David W. Hodgins

unread,
Apr 12, 2012, 5:15:17 PM4/12/12
to
On Thu, 12 Apr 2012 15:12:45 -0400, Harry <simon...@gmail.com> wrote:

> $ vgs -a
> No volume groups found

Try vgscan followed by "vgchange -a y", then lvscan.

Regards, Dave Hodgins

--
Change nomail.afraid.org to ody.ca to reply by email.
(nomail.afraid.org has been set up specifically for
use in usenet. Feel free to use it yourself.)

Harry

unread,
Apr 12, 2012, 11:30:09 PM4/12/12
to
On Apr 13, 2:15 am, "David W. Hodgins" <dwhodg...@nomail.afraid.org>
wrote:
> On Thu, 12 Apr 2012 15:12:45 -0400, Harry <simonsha...@gmail.com> wrote:
> > $ vgs -a
> >   No volume groups found
>
> Try vgscan followed by "vgchange -a y", then lvscan.
>
> Regards, Dave Hodgins
>
> --
> Change nomail.afraid.org to ody.ca to reply by email.
> (nomail.afraid.org has been set up specifically for
> use in usenet. Feel free to use it yourself.)

$ vgscan
Reading all physical volumes. This may take a while...

Harry

unread,
Apr 13, 2012, 9:58:19 AM4/13/12
to
On Apr 13, 8:30 am, Harry <simonsha...@gmail.com> wrote:

Guys, I am still somewhat hopeful but don't know what to do or who to
ask now.

Is there any other forum (LVM-related) where I could try asking?

This is not my area of expertise at all; I use Linux only as an
applications user.

Please help your fallen comrade out...

Harry

unread,
Apr 13, 2012, 10:13:36 AM4/13/12
to
Based on the contents of /etc/lvm/backup/vg_XYZ (which I provide here,
http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how-to-mount-and-recover-data-from-it
), is there any way I could programmatically conduct a binary search
for the correct start of the lv_root? For example,

StartLocation = some 'good' value
dmsetup create foo --table "0 146997248 linear /dev/sdb2
$StartLocation"
if mount -o ro /dev/mapper/foo /mnt succeeds,
BINGO!
else
StartLocation = StartLocation + some 'good' Increment
dmsetup remove foo
fi

Based on your recommendations I could try some good values of
StartLocation and Increment, instead of a brute-force search of the
entire 80G pv.

unruh

Apr 13, 2012, 11:22:37 AM
On 2012-04-13, Harry <simon...@gmail.com> wrote:
We have tried. If you want to look for a company which specialises in
recovering data from hard disks, it may be time to do that.

Doug Freyburger

Apr 13, 2012, 12:50:57 PM
Richard Kettlewell wrote:
> unruh <un...@invalid.ca> writes:
>> Richard Kettlewell <r...@greenend.org.uk> wrote:
>
>>> The partition table is fine. You're chasing a red herring.
>
>> You know that how? which piece of data that he reports do you base
>> this on?
>
> The consistency between the physical disk, the partition table and the
> lvm backup data.

The problem is if only the boot portion of the MBR were written to, then
why won't the volume group on the second partition activate? It remains
the most likely suspect based on the "vgimport -vvv" he posted but where
are our guarantees?

According to the "fdisk -l" output there is a 250 MB partition in Linux
format marked bootable. Clearly /boot. It does not fsck nor does it
mount as /mnt/boot. If only the boot code of the MBR were written that
partition would fsck and mount.

According to the "vgimport -vvv" output posted here and the "pvscan"
output posted on the forum there is a 79 GB partition in Linux LVM
format that "should" contain the volume group vg_XYZ. Neither vgimport
nor vgscan works.

Both of these results tell me that the partition table probably was
trashed. We started with a base assumption that the partition table was
valid and worked based on that premise. No progress was made using
tools suggested by the partition table contents. To me that says the
partition table is in fact bad. But how much else was written to?

Doing "fsck /dev/sdb1" did not show a filesystem there. Either more was
overwritten than the partition table or there was no filesystem there on
the original.

I go back to my suggestion of looking for contents. Does fdisk work on
loopback files? I've never tried that. I would rather do dd to a new
device. Put a single partition for the whole drive. Look for a
filesystem with fsck (done) and a volume group with vgscan. If it finds
one look at the size - A partition that's too big will show data smaller
than the whole drive. Narrow down the size by halving. A failure means
the half was too small, success means on target or too big.

The problem with the method is what if there was a swap partition at the
beginning. Then we don't know where to start other than the beginning.
Make a partition 1 cylinder then the rest in the second one. Try that.
Keep cycling 1 more cylinder at a time. Way too much work unless that
could be automated. Can PartitionMagic do something like that?

J G Miller

Apr 13, 2012, 1:13:34 PM
On Friday, April 13th, 2012, at 06:58:19h -0700, Harry explained:

> Guys, I am still somewhat hopeful ...

Have you tried any of these tools?


<http://www.sleuthkit.ORG/sleuthkit/>

<http://www.sleuthkit.ORG/autopsy/>


<http://www.digital-forensic.ORG/framework/download/>

Robert Nichols

Apr 13, 2012, 8:29:41 PM
On 04/13/2012 09:13 AM, Harry wrote:
> Based on the contents of /etc/lvm/backup/vg_XYZ (which I provide here,
> http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how-to-mount-and-recover-data-from-it
> ), is there any way I could programmatically conduct a binary search
> for the correct start of the lv_root? For example,
>
> StartLocation = some 'good' value
> dmsetup create foo --table "0 146997248 linear /dev/sdb2
> $StartLocation"
> if mount -o ro /dev/mapper/foo /mnt succeeds,
> BINGO!
> else
> StartLocation = StartLocation + some 'good' Increment
> dmsetup remove foo
> fi
>
> Based on your recommendations I could try some good values of
> StartLocation and Increment, instead of a brute-force search of the
> entire 80G pv.

You previously reported "/dev/sdb2: lvm2 label detected at sector 1", and
that confirms that the start of the physical volume is correctly located
in the partition. What does "pvck -vv /dev/sdb2" have to say? It should
report the location of the metadata records for your volume group(s).

--
Bob Nichols AT comcast.net I am "RNichols42"

Harry

Apr 14, 2012, 12:24:08 AM
On Apr 13, 9:50 pm, Doug Freyburger <dfrey...@yahoo.com> wrote:
> Richard Kettlewell wrote:
> > unruh <un...@invalid.ca> writes:
> >> Richard Kettlewell <r...@greenend.org.uk> wrote:
>
> >>> The partition table is fine. You're chasing a red herring.
>
> >> You know that how? which piece of data that he reports do you base
> >> this on?
>
> > The consistency between the physical disk, the partition table and the
> > lvm backup data.
>
> The problem is if only the boot portion of the MBR were written to, then
> why won't the volume group on the second partition activate? It remains
> the most likely suspect based on the "vgimport -vvv" he posted but where
> are our guarantees?
>
> According to the "fdisk -l" output there is a 250 MB partition in Linux
> format marked bootable. Clearly /boot. It does not fsck nor does it
> mount as /mnt/boot. If only the boot code of the MBR were written that
> partition would fsck and mount.

No, actually, I /can/ mount the boot partition sdb1.

$ fdisk -l
<snip>
Disk /dev/sdb: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *        2048     1026047      512000   83  Linux
/dev/sdb2         1026048   156301311    77637632   8e  Linux LVM

$ mount /dev/sdb1 /mnt/x

$ ls /mnt/x
config-3.2.5-3.fc16.i686.PAE
config-3.2.9-1.fc16.i686.PAE
config-3.2.9-2.fc16.i686.PAE
config.mk-compat-wireless-3.3-rc1-2-3.2.5-3.fc16.i686.PAE
config.mk-compat-wireless-3.3-rc1-2-3.2.9-1.fc16.i686.PAE
config.mk-compat-wireless-3.3-rc1-2-3.2.9-2.fc16.i686.PAE
efi
grub
grub2
initramfs-3.2.5-3.fc16.i686.PAE.img
initramfs-3.2.9-1.fc16.i686.PAE.img
initramfs-3.2.9-2.fc16.i686.PAE.img
initrd-plymouth.img
lost+found
System.map-3.2.5-3.fc16.i686.PAE
System.map-3.2.9-1.fc16.i686.PAE
System.map-3.2.9-2.fc16.i686.PAE
vmlinuz-3.2.5-3.fc16.i686.PAE
vmlinuz-3.2.9-1.fc16.i686.PAE
vmlinuz-3.2.9-2.fc16.i686.PAE

$ df -h /mnt/x
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       485M   85M  376M  19% /mnt/x

> According to the "vgimport -vvv" output posted here and the "pvscan"
> output posted on the forum there is a 79 GB partition in Linux LVM
> format that "should" contain the volume group vg_XYZ. Neither vgimport
> nor vgscan works.
>
> Both of these results tell me that the partition table probably was
> trashed. We started with a base assumption that the partition table was
> valid and worked based on that premise. No progress was made using
> tools suggested by the partition table contents. To me that says the
> partition table is in fact bad. But how much else was written to?
>
> Doing "fsck /dev/sdb1" did not show a filesystem there. Either more was
> overwritten than the partition table or there was no filesystem there on
> the original.

Here's what fsck is showing for sdb1.

$ umount /mnt/x

$ fsck -n /dev/sdb1
fsck from util-linux 2.20.1
e2fsck 1.41.14 (22-Dec-2010)
/dev/sdb1: clean, 248/128016 files, 102228/512000 blocks

> I go back to my suggestion of looking for contents. Does fdisk work on
> loopback files? I've never tried that. I would rather do dd to a new
> device. Put a single partition for the whole drive. Look for a
> filesystem with fsck (done) and a volume group with vgscan. If it finds
> one look at the size - A partition that's too big will show data smaller
> than the whole drive. Narrow down the size by halving. A failure means
> the half was too small, success means on target or too big.

Richard, I'd like to try what you're suggesting... but, I'm afraid,
I'm not following you. Would you elaborate just a little bit more? I
have a cloned image of the bad sdb. Now what do I do with this image
using dd? So far, I am able to fdisk bad.img as follows:

$ losetup /dev/loop1 bad.img

$ # If /dev/loop1 is not specified on the next line,
$ # then fdisk can't see it.
$ fdisk -l /dev/loop1

Disk /dev/loop1: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

       Device Boot      Start         End      Blocks   Id  System
/dev/loop1p1   *        2048     1026047      512000   83  Linux
/dev/loop1p2         1026048   156301311    77637632   8e  Linux LVM

vgscan reports no volumes.

$ vgscan
Reading all physical volumes. This may take a while...
No volume groups found

Now, what to do next?
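
One thing that may be worth checking here (a sketch, not something confirmed in this thread): fdisk prints names like /dev/loop1p2, but that alone does not guarantee such a device node exists for vgscan to look at. A common workaround is a second, read-only loop device that starts at the byte offset of partition 2. The snippet below only computes the numbers from the fdisk listing above and prints the command; bad.img is the image name used above, and the loop device that gets assigned depends on the system.

```shell
# Values copied from the fdisk listing of the image (512-byte sectors).
p2_start=1026048      # first sector of partition 2
p2_end=156301311      # last sector of partition 2
sector=512

offset=$(( p2_start * sector ))
size=$(( (p2_end - p2_start + 1) * sector ))

# Only print the command; actually running it needs root and the image file.
echo "losetup --read-only --find --show --offset $offset --sizelimit $size bad.img"
```

Run as root, the printed command should report the loop device it picked, and pvscan/vgscan can then be tried against the image rather than the real disk.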

> The problem with the method is what if there was a swap partition at the
> beginning. Then we don't know where to start other than the beginning.
> Make a partition 1 cylinder then the rest in the second one. Try that.
> Keep cycling 1 more cylinder at a time. Way too much work unless that
> could be automated. Can PartitionMagic do something like that?

From the volume group backup file which I have shared over
superuser.com, it seems there is indeed a swap partition in the
beginning. However, as I said above, I'm not fully understanding the
method you're suggesting. Could you spell out the steps along with
names of the programs to use in those steps and possibly some other
details, as I'm not a partitioning expert at all?

Harry

Apr 14, 2012, 12:27:27 AM
On Apr 14, 5:29 am, Robert Nichols
<SEE_SIGNAT...@localhost.localdomain.invalid> wrote:
> On 04/13/2012 09:13 AM, Harry wrote:
>
>
>
>
>
>
>
>
>
> > Based on the contents of /etc/lvm/backup/vg_XYZ (which I provide here,
> >http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how...
> > ), is there any way I could programmatically conduct a binary search
> > for the correct start of the lv_root? For example,
>
> >     StartLocation = some 'good' value
> >     dmsetup create foo --table "0 146997248 linear /dev/sdb2
> > $StartLocation"
> >     if mount -o ro /dev/mapper/foo /mnt succeeds,
> >          BINGO!
> >     else
> >        StartLocation = StartLocation + some 'good' Increment
> >        dmsetup remove foo
> >     fi
>
> > Based on your recommendations I could try some good values of
> > StartLocation and Increment, instead of a brute-force search of the
> > entire 80G pv.
>
> You previously reported "/dev/sdb2: lvm2 label detected at sector 1", and
> that confirms that the start of the physical volume is correctly located
> in the partition. What does "pvck -vv /dev/sdb2" have to say? It should
> report the location of the metadata records for your volume group(s).
>
> --
> Bob Nichols         AT comcast.net I am "RNichols42"

$ pvck -vv /dev/sdb2
Setting global/locking_type to 1
Setting global/wait_for_locks to 1
File-based locking selected.
Setting global/locking_dir to /var/lock/lvm
Scanning /dev/sdb2
/dev/sdb2: size is 155275264 sectors
/dev/sdb2: size is 155275264 sectors
/dev/sdb2: lvm2 label detected at sector 1
Found label on /dev/sdb2, sector 1, type=LVM2 001
Found text metadata area: offset=4096, size=1044480
Found LVM2 metadata record at offset=1006080, size=42496, offset2=0 size2=0
Found LVM2 metadata record at offset=991232, size=14848, offset2=0 size2=0
Found LVM2 metadata record at offset=979456, size=11776, offset2=0 size2=0
Found LVM2 metadata record at offset=967168, size=12288, offset2=0 size2=0
Found LVM2 metadata record at offset=966144, size=1024, offset2=0 size2=0

Please note that, in the above output, there is a non-graphic trailing
character (after "LVM2 001") on this line:
...
Found label on /dev/sdb2, sector 1, type=LVM2 001
...
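
As a quick sanity check on these numbers (a sketch only, using nothing but values already shown in the fdisk and pvck output): the partition length from fdisk should equal the size pvck reports, and the end of the text metadata area should land on the sector where the first physical extent begins.

```shell
# All values are copied from the fdisk and pvck output in this post.
part_start=1026048    # first sector of /dev/sdb2
part_end=156301311    # last sector of /dev/sdb2
mda_offset=4096       # bytes, from "Found text metadata area: offset=4096"
mda_size=1044480      # bytes, from "size=1044480"

part_sectors=$(( part_end - part_start + 1 ))
mda_end_bytes=$(( mda_offset + mda_size ))
mda_end_sectors=$(( mda_end_bytes / 512 ))

echo "partition size: $part_sectors sectors"
echo "metadata area ends at byte $mda_end_bytes (sector $mda_end_sectors)"
```

The first figure matches the "size is 155275264 sectors" line, and the second comes out at sector 2048, which suggests the on-disk header is at least self-consistent with the partition table.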


Harry

Apr 14, 2012, 12:56:06 AM
On Apr 13, 7:13 pm, Harry <simonsha...@gmail.com> wrote:
> On Apr 13, 6:58 pm, Harry <simonsha...@gmail.com> wrote:
>
> > On Apr 13, 8:30 am, Harry <simonsha...@gmail.com> wrote:
>
> > Guys, I am still somewhat hopeful but don't know what to do or who to
> > ask now.
>
> > Is there any other forum (LVM-related) where I could try asking?
>
> > This is not my area of expertise at all; I use Linux only as an
> > applications user.
>
> > Please help your fallen comrade out...
>
> Based on the contents of /etc/lvm/backup/vg_XYZ (which I provide here,http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how...
> ), is there any way I could programmatically conduct a binary search
> for the correct start of the lv_root? For example,
>
>    StartLocation = some 'good' value
>    dmsetup create foo --table "0 146997248 linear /dev/sdb2
> $StartLocation"
>    if mount -o ro /dev/mapper/foo /mnt succeeds,
>         BINGO!
>    else
>       StartLocation = StartLocation + some 'good' Increment
>       dmsetup remove foo
>    fi
>
> Based on your recommendations I could try some good values of
> StartLocation and Increment, instead of a brute-force search of the
> entire 80G pv.

-----------------------------------------------
NOTE:
I deleted one of my messages (which I posted a few minutes back)
showing the output of the script I wrote. Please ignore that message
and consider this message instead.
-----------------------------------------------

Based on Richard's response (at
http://superuser.com/questions/411697/lvm-volume-with-corrupt-mbr-how...
), I wrote the following script.

------------------------------------------------------
#!/bin/bash
set -u

(( extent_size = 65536 ))
(( pe_start = 2048 ))

# current extent index
(( curr = 0 ))

(( Increment = 1 ))
(( maxExtentCount = 126 + 2243 ))

# Clean up any previously existing foo.
if [ -b /dev/mapper/foo ]; then
    dmsetup remove foo
fi

while : ; do

    (( StartLocation = pe_start + curr * extent_size ))

    echo -n "Trying curr = $curr ,StartLocation = $StartLocation ..."

    if dmsetup create foo --table "0 146997248 linear /dev/sdb2 $StartLocation" &>/dev/null ; then
        if mount -o ro /dev/mapper/foo /mnt &>/dev/null ; then
            echo
            echo 'BINGO'
            break
        else
            echo " mount failed"
        fi

        dmsetup remove foo
    else
        echo " dmsetup failed"
    fi

    (( curr = curr + Increment ))

    if [ $curr -gt $maxExtentCount ]; then
        echo "reached maxExtentCount, quitting"
        break
    fi
done

# Clean up
if [ -b /dev/mapper/foo ]; then
    dmsetup remove foo
fi
------------------------------------------------------

No success, though. The above loop ran as follows. Notice, how...
for curr = 0 thru 126, dmsetup succeeds but mount fails
for curr = 127 thru 2369, dmsetup itself fails.

Does that tell you anything else interesting that is not obvious to
me?

Trying curr = 0 ,StartLocation = 2048 ...mount: you must specify the
filesystem type
mount failed
Trying curr = 1 ,StartLocation = 67584 ...mount: you must specify the
filesystem type
mount failed
Trying curr = 2 ,StartLocation = 133120 ...mount: you must specify the
filesystem type
mount failed
Trying curr = 3 ,StartLocation = 198656 ...mount: you must specify the
filesystem type
mount failed
Trying curr = 4 ,StartLocation = 264192 ...mount: you must specify the
filesystem type
mount failed
Trying curr = 5 ,StartLocation = 329728 ...mount: you must specify the
filesystem type
mount failed

...

Trying curr = 125 ,StartLocation = 8194048 ...mount: you must specify
the filesystem type
mount failed
Trying curr = 126 ,StartLocation = 8259584 ...mount: you must specify
the filesystem type
mount failed
Trying curr = 127 ,StartLocation = 8325120 ...device-mapper: resume
ioctl failed: Invalid argument
Command failed
dmsetup failed
Trying curr = 128 ,StartLocation = 8390656 ...device-mapper: resume
ioctl failed: Invalid argument
Command failed
dmsetup failed
Trying curr = 129 ,StartLocation = 8456192 ...device-mapper: resume
ioctl failed: Invalid argument
Command failed
dmsetup failed
Trying curr = 130 ,StartLocation = 8521728 ...device-mapper: resume
ioctl failed: Invalid argument
Command failed
dmsetup failed

...

Trying curr = 2367 ,StartLocation = 155125760 ...device-mapper: resume
ioctl failed: Invalid argument
Command failed
dmsetup failed
Trying curr = 2368 ,StartLocation = 155191296 ...device-mapper: resume
ioctl failed: Invalid argument
Command failed
dmsetup failed
Trying curr = 2369 ,StartLocation = 155256832 ...device-mapper: resume
ioctl failed: Invalid argument
Command failed
dmsetup failed
reached maxExtentCount, quitting


I am not sure if I'm incrementing 'curr' in proper units (sectors/
blocks/extent_size/bytes). Where is this volume backup file
documented, btw? (Googled for it, couldn't find it.)
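
On the units question, one reading (not confirmed by anyone in this thread): device-mapper tables count in 512-byte sectors, and extent_size in the LVM backup file is also given in sectors, so curr counts whole extents of 65536 sectors. Under that reading, the switch from "mount failed" to "dmsetup failed" after curr = 126 is what one would expect once start + length runs past the end of /dev/sdb2, whose size pvck reported as 155275264 sectors. A small arithmetic sketch:

```shell
pv_sectors=155275264    # "/dev/sdb2: size is 155275264 sectors" (pvck)
lv_sectors=146997248    # length used in the dmsetup table
pe_start=2048           # sectors
extent_size=65536       # sectors per extent

# Largest start sector for which start + length still fits in the partition.
max_start=$(( pv_sectors - lv_sectors ))
# Largest extent index whose start sector is at or below that limit.
max_curr=$(( (max_start - pe_start) / extent_size ))

echo "largest start sector that still fits: $max_start"
echo "largest curr that still fits: $max_curr"
```

So the dmsetup failures from curr = 127 onward probably say nothing about the data itself; they only show that a mapping of that fixed length no longer fits.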

I would like to understand LVM and even regular partitioning more
thoroughly. Would someone kindly suggest the smallest number of
resources (books, links to online articles/tutorials, etc) that would
teach me not just the commands but the concepts behind them. Anything
and everything above the BIOS and assembly-language I would like to
understand. For example, I don't need to understand the boot code in
the first 446 bytes of the MBR, but I do want to learn everything else
you guys know about partitioning and LVM.

Harry

Apr 14, 2012, 12:58:11 AM
No, I haven't. But I did try TestDisk in non-modifying mode and it
didn't help.

Robert Nichols

Apr 14, 2012, 8:58:31 AM
That is all perfectly normal. It's hard to understand why it's not working.
Try setting aside the /etc/lvm directory (rename it to /etc/xxlvm) to force
the system to work only with the physical devices and not rely on previously
cached data. Then see what 'pvscan', 'vgscan', and 'lvscan' can find. The
vgscan should repopulate the /etc/lvm directory with good data.

Harry

Apr 14, 2012, 9:35:30 AM
On Apr 14, 5:58 pm, Robert Nichols
$ mv /etc/lvm /etc/lvm.off

$ pvscan
PV /dev/sdb2 lvm2 [74.04 GiB]
Total: 1 [74.04 GiB] / in use: 0 [0 ] / in no VG: 1 [74.04 GiB]

$ vgscan
Reading all physical volumes. This may take a while...
No volume groups found

$ lvscan
No volume groups found

$ ls -l /etc/lvm
total 0

Robert Nichols

Apr 14, 2012, 8:59:40 PM
On 04/14/2012 08:35 AM, Harry wrote:
>
> $ mv /etc/lvm /etc/lvm.off
>
> $ pvscan
> PV /dev/sdb2 lvm2 [74.04 GiB]
> Total: 1 [74.04 GiB] / in use: 0 [0 ] / in no VG: 1 [74.04 GiB]
^^^^^^^^^^^^^^^^
I'm having a hard time imagining what could have happened to make it
appear that there was nothing in use inside that PV, especially since
'pvck' did find metadata records in there. Had this system been
running for a long time without rebooting prior to your escapade?
That would raise the possibility that the structures on disk had
been damaged for quite a while, and the system couldn't have survived
a reboot.

Now, I'm wondering what your chances would be of reconstructing the
VGs and LVs from the data in the backup files. If those volumes had
ever been resized or rearranged since they were created, I fear your
chances of success would be just about nil, but I don't know what
else to suggest. I think I'd want to play around with that on newly
created structures on another disk before trying it for real.

Harry

Apr 14, 2012, 11:13:09 PM
This is what, I believe, I did.

The 80G sdb that I'm trying to recover now used to be sda in my system earlier. I then installed a 250G sdb in the system, installed Fedora 16 on it, made sdb bootable, and then booted my system.

Now, my system was booting just fine from sdb. In my ignorance (of the low, system-level details of MBR, partitioning, BIOS, etc), I thought that if I copied the first 446 bytes of MBR of sdb (which is working fine right now for me) to sda, then I might be able to boot from sda also if I were to place sda into another system. Since these 446 bytes would be pure boot code which would be common to both sda and sdb (esp, since both had Fedora 16 installs on them) and with no data in it, I thought, I had nothing to lose in trying this operation out: at worst, sda would continue to remain non-bootable, at which point I'd look for some other solution.

So, I fearlessly issued a

dd if=/dev/sdb of=/dev/sda bs=446 count=1

and, I believe, sda continued to remain non-bootable.

Since, mentally, I tend to feel more comfortable with sda being the primary/bootable disk in my system, at this point I physically switched sda and sdb cables in my system.

I still didn't think anything was seriously wrong; I thought, I should be able to fix things now. Now, because I didn't understand LVM *at all* (I only have a very high-level idea, even now!), I incorrectly thought that the sdb2 partition (shown by fdisk -l) was really an ext4 filesystem (and 'not' an LVM volume!), and 'all' I had to do in order to be able to mount it was to use the standard methods suggested on the Net. (sdb was being reported as having a bad superblock and ext4 supposedly has redundant backups of this stored in it which one could use to repair it.) When I couldn't mount/repair, I posted the question here,
http://superuser.com/questions/410796/unable-to-repair-an-ext4-filesystem-with-bad-superblock .

---------------
Note: When this ext4 repair attempt didn't succeed on the sdb2 (partition), I even (incorrectly) tried the commands on sdb (the device) using various superblock offsets. Because the e2fsck man page has this statement in it

"The location of the backup superblock is dependent on the filesystem's blocksize.
For filesystems with 1k blocksizes, a backup superblock can be found at block 8193; for
filesystems with 2k blocksizes, at block 16384; and for 4k blocksizes, at block 32768."

, I even tried this command by assuming different blocksizes, for I thought I wouldn't be able to accurately and easily find out the actual blocksize of the filesystem given that it is in a 'broken' state at this point.
---------------
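
For what it's worth, the three block numbers in that man page excerpt follow from simple arithmetic. This is a sketch that assumes the default layout, where one bitmap block covers a whole block group (so a group holds 8 x blocksize blocks) and only 1k filesystems keep the primary superblock in block 1 rather than block 0. The function name first_backup is made up for illustration.

```shell
# first_backup BLOCKSIZE: block number of the first backup superblock,
# assuming default blocks-per-group (8 bits per byte of one bitmap block).
first_backup() {
    bs=$1
    blocks_per_group=$(( bs * 8 ))
    # With 1k blocks the first data block is 1, otherwise 0.
    first_data_block=$(( bs == 1024 ? 1 : 0 ))
    echo $(( first_data_block + blocks_per_group ))
}

for bs in 1024 2048 4096; do
    echo "blocksize $bs: first backup superblock at block $(first_backup "$bs")"
done
```

A filesystem created with a non-default blocks-per-group value would not follow this pattern, which is one more reason guessing block sizes is unreliable.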

At this point, someone on this forum enlightened me that, because it was an LVM and not an ext4, ext4 repair tools were not working.

I *hope* these ext4 repair commands didn't mess up the LVM; each superblock offset I would specify would fail to get recognized as a valid superblock by e2fsck and so the repair operation would fail. Had the e2fsck command really succeeded by incorrectly identifying some random junk as a valid superblock, then I could understand (*now*) how it may have corrupted the LVM that I'm now struggling to bring back to life.

Only at this point and not any earlier, did I begin to anticipate serious trouble ahead and decided to clone /dev/sdb. Therefore, strictly speaking, the cloned sdb image I have now, does NOT reflect the state of things immediately after the 'MBR 446-byte overwrite' operation. I'm still hoping that none of my repair attempts (via e2fsck on both the device-sdb and the partition-sdb2!) corrupted the LVM on sdb2.

Robert Nichols

Apr 15, 2012, 4:14:31 PM
On 04/14/2012 10:13 PM, Harry wrote:
>
> This is what, I believe, I did.
[recap snipped]
I don't see any way that anything you reported doing could cause the problem
you are seeing, and frankly I can't come up with any likely typos in those
commands that would do it either. If you had mistyped the destination
device as "sda1" or "sda2" instead of "sda", nothing would have been
affected (neither ext2/3/4 nor LVM2 store anything in the first 512 bytes).
If you had mistyped the blocksize as "446k" or the count as "1k", you would
have overwritten the partition table and possibly part of the ext3 file
system on that first partition. You've still got what appears to be a good
partition table, a good file system on the first partition, and an LVM2
header at the appropriate location in the second partition. e2fsck won't do
any writing if it doesn't find something that looks like a valid superblock.
Something else had to have happened, but I can't imagine what.

What is the history of that LVM2 PV? Is it likely that the LVs were
allocated in single extents? As a last resort you could do a brute force
search for a 16-bit integer 0xEF53 (little-endian, the byte order is 53 EF)
at offset 56 (0x48) in a sector, making that a possible superblock, then
use dmsetup to define a virtual device that begins 1024 bytes prior to that
sector and see what e2fsck has to say about that (use the "-n" option so
that it will open the device read-only). Sounds like it might take a long
time, but probably no more than you've already spent.
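
The brute-force search described above can be sketched as a single pass instead of one command pipeline per sector. This is only an illustration under assumptions: it relies on GNU od (the -w512 option prints one 512-byte sector per line) and awk, it checks nothing but the two magic bytes at offset 56, so false positives are to be expected, and the function name find_ext_magic is made up. The demo runs on a small scratch file; on the real disk the argument would be the partition device, opened read-only.

```shell
# find_ext_magic FILE: print the zero-based index of every 512-byte sector
# whose bytes 56 and 57 are 53 ef (0xEF53 stored little-endian).
find_ext_magic() {
    # -v keeps od from collapsing repeated lines, which would break the
    # line-number-to-sector mapping. Byte 56 is awk field 57.
    od -An -v -tx1 -w512 "$1" |
        awk '$57 == "53" && $58 == "ef" { print NR - 1 }'
}

# Demo: a 16-sector scratch file with the magic planted in sector 5.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=16 2>/dev/null
# \123\357 are the octal escapes for bytes 0x53 0xef.
printf '\123\357' | dd of="$img" bs=1 seek=$(( 5 * 512 + 56 )) conv=notrunc 2>/dev/null

hits=$(find_ext_magic "$img")
for s in $hits; do
    # The primary superblock sits 1024 bytes (2 sectors) into the filesystem.
    echo "magic in sector $s, candidate filesystem start: sector $(( s - 2 ))"
done
rm -f "$img"
```

Each candidate start sector would then go into a dmsetup linear mapping and be checked with "e2fsck -n", as suggested above.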

Harry

Apr 16, 2012, 5:39:27 AM
On Monday, April 16, 2012 1:44:31 AM UTC+5:30, Robert Nichols wrote:
> On 04/14/2012 10:13 PM, Harry wrote:
> >
> > This is what, I believe, I did.
> [recap snipped]
> I don't see any way that anything you reported doing could cause the problem
> you are seeing, and frankly I can't come up with any likely typos in those
> commands that would do it either. If you had mistyped the destination
> device as "sda1" or "sda2" instead of "sda", nothing would have been
> affected (neither ext2/3/4 nor LVM2 store anything in the first 512 bytes).
> If you had mistyped the blocksize as "446k" or the count as "1k", you would
> have overwritten the partition table and possibly part of the ext3 file
> system on that first partition. You've still got what appears to be a good
> partition table, a good file system on the first partition, and an LVM2
> header at the appropriate location in the second partition. e2fsck won't do
> any writing if it doesn't find something that looks like a valid superblock.
> Something else had to have happened, but I can't imagine what.

I'm happy to hear that.

> What is the history of that LVM2 PV? Is it likely that the LVs were
> allocated in single extents?

Don't know. I recall just going with the Fedora 16 installation-time defaults, and only changing the sizes of the swap and root partitions.

> As a last resort you could do a brute force
> search for a 16-bit integer 0xEF53 (little-endian, the byte order is 53 EF)
> at offset 56 (0x48) in a sector, making that a possible superblock, then
> use dmsetup to define a virtual device that begins 1024 bytes prior to that
> sector and see what e2fsck has to say about that (use the "-n" option so
> that it will open the device read-only). Sounds like it might take a long
> time, but probably no more than you've already spent.


I think, you meant "offset 56 (0x38)".

I have launched the script listed below. It doesn't do the virtual device mapping *yet*; for now, it just looks for the signature. I will add the device mapper later.

After 43 minutes of running, I printed the progress (kill -10) and got this:
Trying sector 948135 of 146997248 .645001 %
ETA: 230 hours ( 9.58 days)

Any way to speed it up... say, by skipping regions of the device, considering interesting regions first?


----------------------------------------------------------------
#!/bin/bash
set -u

function printProgress {
    local timeCurr=
    local timeTaken=
    local eta=

    echo "Trying sector $currSector of $lastSector " $(echo "scale=6; $currSector * 100.0 / $lastSector " | bc) " %"

    timeCurr=$(date +%s) # seconds
    (( timeTaken = timeCurr - timeStart ))
    (( eta = timeTaken * (lastSector - currSector) / (currSector - startSector) / 3600 ))
    echo " ETA: $eta hours" "(" $(echo "scale=2; $eta / 24" | bc) " days)"
}

timeStart=$(date +%s)

trap "printProgress" 10

extent_count=2243
extent_size=65536
lastSector=$((extent_count * extent_size))

# Start by skipping these many sectors.
#
# Use either the first arg to this script or
# a default value of 2.
skipSectors=${1-2}
((startSector = skipSectors + 1 ))
echo "Started, by skipping $skipSectors sectors."

# ---------------------
# Main loop
# ---------------------
while :; do
    ((currSector = skipSectors + 1 ))

    # Get the signature 'sig'. I have tested that this incantation
    # does indeed give the signature 53ef for a device having a
    # valid ext4 partition in the beginning.
    sig=$(dd if=/dev/sdb2 bs=512 skip=$skipSectors |
          xxd -c 10 | head -109 | tail -1 | cut -d ' ' -f2)

    if [ "$sig" = "53ef" ]; then
        echo "Found an ext4 fs at sector $currSector , quitting."
        break
    fi

    # Did not find ext4 sig at the above location, try the next sector
    ((skipSectors += 1))

    if [ $skipSectors -eq $lastSector ]; then
        # No more sectors to skip.
        echo "FAILED to find an ext4 fs."
        break
    fi
done
----------------------------------------------------------------

Robert Nichols

Apr 16, 2012, 9:50:30 AM
On 04/16/2012 04:39 AM, Harry wrote:
> On Monday, April 16, 2012 1:44:31 AM UTC+5:30, Robert Nichols wrote:
[SNIP]
>> As a last resort you could do a brute force
>> search for a 16-bit integer 0xEF53 (little-endian, the byte order is 53 EF)
>> at offset 56 (0x48) in a sector, making that a possible superblock, then
>> use dmsetup to define a virtual device that begins 1024 bytes prior to that
>> sector and see what e2fsck has to say about that (use the "-n" option so
>> that it will open the device read-only). Sounds like it might take a long
>> time, but probably no more than you've already spent.
>
>
> I think, you meant "offset 56 (0x38)".

Indeed! Sorry about that.

> I have launched the script listed below. It doesn't do the virtual device mapping *yet*; for now, it just looks for the signature. I will add the device mapper later.
>
> After 43 minutes of running, I printed the progress (kill -10) and got this:
> Trying sector 948135 of 146997248 .645001 %
> ETA: 230 hours ( 9.58 days)

Yipe! I've attached a C program that looks for a sector with the EF53
magic number plus a valid blocksize and a label string that is either
empty or contains valid ASCII characters. It searched a 160GB partition
in about 45 minutes.

Note that you will need to subtract 1024 bytes from this offset when
setting up the virtual device.
e2finder.uue

Harry

Apr 16, 2012, 11:02:21 AM
> begin 664 e2finder.c
> M(V1E9FEN92!?1DE,15]/1D93151?0DE44R`V-`HC:6YC;'5D92`\<W1D:6\N
> M:#X*(VEN8VQU9&4@/'-T9&QI8BYH/@HC:6YC;'5D92`\<W1R:6YG+F@^"B-I
> M;F-L=61E(#QE<G)N;RYH/@HC:6YC;'5D92`\8W1Y<&4N:#X*(VEN8VQU9&4@
> M/&QI;G5X+V9S+F@^"B-I;F-L=61E(#QL:6YU>"]E>'0R7V9S+F@^"@HC9&5F
> M:6YE('-P8B`H<VEZ96]F*&)U9BDO-3$R*0H*:6YT(&UA:6XH:6YT(&%R9V,L
> M(&-H87(@*F%R9W9;72D@>PH@("`@;&]N9R!B;&MC;W5N=#L*("`@(&EN="!S
> M96-T;W(L(&YS+"!P+"!O:SL*("`@('-T871I8R!C:&%R(&)U9ELT,#DV73L*
> M("`@($9)3$4@*FEN9CL*("`@(&-H87(@*F-M9"P@*FQA8F5L.PH@("`@<W1R
> M=6-T(&5X=#)?<W5P97)?8FQO8VL@*G-B7W!T<CL*"B`@("!I9B@H8VUD(#T@
> M<W1R<F-H<BAA<F=V6S!=+"`G+R<I*2`A/2!.54Q,*2`@*RMC;60["B`@("!E
> M;'-E("!C;60@/2!A<F=V6S!=.PH*("`@(&EF*&%R9V,@/"`R*2!["@EF<')I
> M;G1F*'-T9&5R<BP@(E5S86=E.B`E<R!D979I8V5<;B(L(&-M9"D["@ER971U
> M<FX@,3L*("`@('T*("`@(&EF*"AI;F8@/2!F;W!E;BAA<F=V6S%=+"`B<B(I
> M*2`]/2!.54Q,*2!["@EP97)R;W(H87)G=ELQ72D["@ER971U<FX@,3L*("`@
> M('T*"B`@("!P=71S*")">71E(&]F9G-E="`@<V5C=&]R("@U,3)"*2(I.PH@
> M("`@<'5T<R@B+2TM+2TM+2TM+2TM("`M+2TM+2TM+2(I.PH@("`@8FQK8V]U
> M;G0@/2`P.PH@("`@=VAI;&4H*&YS(#T@9G)E860H8G5F+"`U,3(L('-P8BP@
> M:6YF*2D@/B`P*2!["@EF;W(H<V5C=&]R(#T@,#L@<V5C=&]R(#P@;G,[("LK
> M('-E8W1O<BD@>PH)("`@('-B7W!T<B`]("AS=')U8W0@97AT,E]S=7!E<E]B
> M;&]C:R`J*2AB=68K-3$R*G-E8W1O<BD["@D@("`@:68H<V)?<'1R+3YS7VUA
> M9VEC(#T](#!X968U,RD@>PH)"6QA8F5L(#T@<V)?<'1R+3YS7W9O;'5M95]N
> M86UE.PH)"6]K(#T@,#L*"0EF;W(H<"`](#`[('`@/"!S:7IE;V8H<V)?<'1R
> M+3YS7W9O;'5M95]N86UE*3L@*RMP*2!["@D)("`@(&EF*&QA8F5L6W!=(#T]
> M("=<,"<I('LK*V]K.R`@8G)E86L[?0H)"2`@("!I9B@A*&ES87-C:6DH;&%B
> M96Q;<%TI*2D@(&)R96%K.PH)"2`@("!I9B@A*&ES9W)A<&@H;&%B96Q;<%TI
> M('Q\(&ES8FQA;FLH;&%B96Q;<%TI*2D@(&)R96%K.PH)"7T*"0EI9B@H=6YS
> M:6=N960I*'-B7W!T<BT^<U]L;V=?8FQO8VM?<VEZ92D@/B`R*2`@8V]N=&EN
> M=64["@D):68H;VL@?'P@<#X]."D@>PH)"2`@("!P<FEN=&8H(B4C,#$R;'@@
> M*"4X;&0I+"!L86)E;#U<(B4N,39S7"(L(&)S/25D:UQN(BP*"0D)("`@-3$R
> M*BAB;&MC;W5N="MS96-T;W(I+`H)"0D@("`H8FQK8V]U;G0K<V5C=&]R*2P*
> M"0D)("`@;&%B96PL"@D)"2`@(#$\/'-B7W!T<BT^<U]L;V=?8FQO8VM?<VEZ
> M92D["@D)("`@(&9F;'5S:"AS=&1O=70I.PH)"7T*"2`@("!]"@E]"@EB;&MC
> M;W5N="`K/2!N<SL*("`@('T*("`@(&EF*&9E;V8H:6YF*2D@>PH)9G!R:6YT
> M9BAS=&1E<G(L(")%3T8@;VX@)7-<;B(L(&%R9W9;,5TI.PH)<F5T=7)N(#`[
> M"B`@("!]"B`@("!E;'-E(&EF*&9E<G)O<BAI;F8I*2`@<&5R<F]R*&%R9W9;
> 5,5TI.PH@("`@<F5T=7)N(#$["GT*
> `
> end

Hey - first of all, THANKS for sharing your C program! I had written my own; though it finishes in under an hour (just like yours), it VERY LIKELY has bugs and I wouldn't trust it at all. So I would rather use yours.

I get this error when I try to compile it. Cursory googling tells me umode_t is deprecated. Any quick workaround for this?

$ gcc -o e2finder e2finder.c
In file included from e2finder.c:8:0:
/usr/include/linux/ext2_fs.h:178:41: error: unknown type name 'umode_t'

Harry

Apr 16, 2012, 1:39:08 PM
Bob,

I had to add this line before include <stdio.h> to get the program to compile:
#define umode_t mode_t

Because mode_t itself is defined as 'unsigned <something>', I took the quick liberty of doing the above without thinking deeply enough about all its ramifications.

There were 1601 entries printed. I extracted the base 10 number (from the parentheses), subtracted 2 (sectors) from it, and used the resulting number in device mapping code as follows:


# This just extracts the sector that has the e2 fs signature.
# Actual subtraction will be done later below, in the script.

$ cat e2finder.log | sed -r 's/\(|\)|,//g' | field 2 > tmp
$ cat tmp
8521728
...
98174976
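
The extraction step above depends on a `field` helper, which may not be installed everywhere. A minimal sketch of the same step using only `sed`, assuming the log lines have the form `0x0004100000 ( 8521728), label="", bs=4k`; the function name `extract_sectors` is made up for illustration:

```shell
# extract_sectors (hypothetical helper): read e2finder output on stdin and
# print only the decimal sector number found inside the parentheses.
# Header lines and anything else not starting with "0x" are skipped.
extract_sectors() {
    sed -n 's/^0x[0-9a-fA-F]* *( *\([0-9][0-9]*\)).*/\1/p'
}
```

It would be used as `extract_sectors < e2finder.log > tmp`.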


---------------------------------
#!/bin/bash
set -u

if [ -b /dev/mapper/foo ]; then
dmsetup remove foo
fi

bogusMax=100
cat tmp | while read x; do
# superblock starts 2 sectors earlier
(( x = x - 2 ))

echo "Trying superblock at sector $x ..."
if dmsetup create foo --table "0 $bogusMax linear /dev/sdb2 $x"; then
if e2fsck -n /dev/mapper/foo &>/dev/null; then
echo "Found superblock at sector $x ."
else
echo "FAILED to find superblock at sector $x ."
fi
dmsetup remove foo
else
echo "FAILED to dmsetup at sector $x"
fi
done
---------------------------------


Unfortunately, every single superblock offset failed!


Trying superblock at sector 8521726 ...
FAILED to find superblock at sector 8521726 .
...
Trying superblock at sector 98174974 ...
FAILED to find superblock at sector 98174974 .


Do I still have hope?!

Harry

Apr 16, 2012, 2:08:31 PM
If I allow e2fsck to print its error output, this is what I get for each value of the sector:

Trying superblock at sector 8521726 ...
e2fsck 1.41.14 (22-Dec-2010)
e2fsck: Group descriptors look bad... trying backup blocks...
e2fsck: Invalid argument when using the backup blocks
e2fsck: going back to original superblock
Error reading block 3062704254 (Invalid argument). Ignore error? no

Superblock has an invalid journal (inode 8).
Clear? no

e2fsck: Illegal inode number while checking ext3 journal for /dev/mapper/foo
FAILED to find superblock at sector 8521726 .

<repeated similarly for each superblock sector value>


Doug Freyburger

Apr 16, 2012, 2:47:52 PM
Harry wrote:
>
> I would like to understand LVM and even regular partitioning more
> thoroughly. Would someone kindly suggest the smallest number of
> resources (books, links to online articles/tutorials, etc) that would
> teach me not just the commands but the concepts behind them. Anything
> and everything above the BIOS and assembly-language I would like to
> understand. For example, I won't want to understand the boot code in
> the first 446 bytes of MBR but everything else you guys know about
> partitioning and LVM.

Partitioning takes an entire disk and divides it up into one or more
sub-disks. Then each sub-disk is used like it's a disk. You write an
MBR to make it bootable, a table of the partitions, then the first
partition starts. The software reads the partition tables and uses the
numbers as offsets to calculate block addresses inside each sub-disk.
There can be details about primary and extended partitions but that's
just more of the same. Those details don't change the simple concepts
and offset arithmetic of how partitions work.
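
The offset arithmetic can be sketched in a few lines of shell. This is only an illustration, assuming 512-byte sectors and a partition start sector as printed by `fdisk -l`; the function names are made up:

```shell
# abs_sector (hypothetical): sector N inside a partition lives at
# (partition start sector + N) on the whole disk.
#   $1 = partition start sector, $2 = sector number within the partition
abs_sector() {
    echo $(( $1 + $2 ))
}

# part_byte_offset (hypothetical): byte offset of a partition's first sector
# on the whole disk, e.g. the value one would pass to "losetup -o".
#   $1 = partition start sector, $2 = sector size in bytes (default 512)
part_byte_offset() {
    echo $(( $1 * ${2:-512} ))
}

# For a partition that starts at sector 1026048:
abs_sector 1026048 2        # prints 1026050
part_byte_offset 1026048    # prints 525336576
```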

Partitions are formatted to contain file systems. Files live in file
systems.

Logical volume manager takes one or more disks and puts them into a pool
rather like how virtual memory takes one or more memory modules and puts
them together into a pool. The entire pool is then divided into extents
rather like how memory is divided into pages. There's a table of the
extents early on the disk rather like there's a table of the pages early
in the main memory. Then virtual disks are created out of the pool
(logical volumes) rather like process address spaces are created out of
the pages in main memory.

Logical volumes are formatted to contain file systems. Files live in
file systems.

It's the same concept mapped as physical slices one way, as a pool of
virtual extents the other way.

The two layers do mix and match - It's possible to put either entire
drives or partitions into the volume group pool. Both levels work at
their own level. Your example showed a disk divided into a 250 MB
partition used to directly contain a file system (probably /boot) and
the rest in the second partition that formed a volume group.

Those are the concepts. Each layer has its own tools. fdisk for
partitions. vgcreate/vgscan/vgextend/vgreduce/vgchange for volume
groups. lvcreate and the other lv* commands because that's the extra
layer of virtualization. Then both types return to the file system
tools like mkfs.

Doug Freyburger

Apr 16, 2012, 3:04:18 PM
Harry wrote:
> Doug Freyburger <dfrey...@yahoo.com> wrote:
>
>> According to the "fdisk -l" output there is a 250 MB partition in Linux
>> format marked bootable. Clearly /boot. It does not fsck nor does it
>> mount as /mnt/boot. If only the boot code of the MBR were written that
>> partition would fsck and mount.
>
> No, actually, I /can/ mount the boot partition sdb1.
> ...
> $ ls /mnt/x
> config-3.2.5-3.fc16.i686.PAE
> initramfs-3.2.9-1.fc16.i686.PAE.img
> config-3.2.9-1.fc16.i686.PAE
> initramfs-3.2.9-2.fc16.i686.PAE.img
> config-3.2.9-2.fc16.i686.PAE initrd-
> plymouth.img
> config.mk-compat-wireless-3.3-rc1-2-3.2.5-3.fc16.i686.PAE lost+found
> config.mk-compat-wireless-3.3-rc1-2-3.2.9-1.fc16.i686.PAE
> System.map-3.2.5-3.fc16.i686.PAE
> config.mk-compat-wireless-3.3-rc1-2-3.2.9-2.fc16.i686.PAE
> System.map-3.2.9-1.fc16.i686.PAE
> efi
> System.map-3.2.9-2.fc16.i686.PAE
> grub
> vmlinuz-3.2.5-3.fc16.i686.PAE
> grub2
> vmlinuz-3.2.9-1.fc16.i686.PAE
> initramfs-3.2.5-3.fc16.i686.PAE.img
> vmlinuz-3.2.9-2.fc16.i686.PAE

That's clearly a /boot mount point. Conclusive evidence the partition
table was not trashed. No way did a "dd" copy all 250 MB.

>> According to the "vgimport -vvv" output posted here and the "pvscan"
>> output posted on the forum there is a 79 GB partition in Linux LVM
>> format that "should" contain the volume group vg_XYZ. Neither vgimport
>> nor vgscan works.

> I
> have a cloned image of the bad sdb. Now what do I do with this image
> using dd? So far, I am able to fdisk bad.img as follows:
>
> $ losetup /dev/loop1 bad.img
>
> $ # If /dev/loop1 is not specified on the next line,
> $ # then fdisk can't see it.
> $ fdisk -l /dev/loop1
>
> Disk /dev/loop1: 80.0 GB, 80026361856 bytes
> 255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00000000
>
> Device Boot Start End Blocks Id System
> /dev/loop1p1 * 2048 1026047 512000 83 Linux

Definitely /boot

> /dev/loop1p2 1026048 156301311 77637632 8e Linux LVM

And therefore this has to be what I would normally call VolGroup00 on the
many Red Hat systems I have built. In your posts you have called it
vg_XYZ.

> vgscan reports no volumes.
>
> $ vgscan
> Reading all physical volumes. This may take a while...
> No volume groups found
>
> Now, what to do next?

That's the puzzle we have gotten to. It's nowhere near where you did
the dd. Is there any chance the dd actually had the partition in its of
clause? "of=/dev/sdb2". If so that trashed the configuration blocks of
the volume group not the MBR and partition table. So I move on to the
next speculation. If you are positive it was "of=/dev/sdb" with no
number none of the rest applies.

That is consistent with the results - It won't boot because there's no /
because there's no vg_XYZ because the volume group table had the first
446 bytes overwritten.

Volume groups do have configuration data and it can be recovered.
Maybe. When you started the thread you were looking for alternate
superblocks. Volume groups do have tables that work sort of like that.
I had hoped that "vgscan" would look for alternate copies.

I take it there was not a second drive in vg_XYZ? Alternate copies go
on every drive. Not sure what other tools to use if there was a single
disk. I build commercial systems with mirrored boot for reasons like
this. Or at least bootable kickstart images on DVD-ROM. So at this
point I've run to the end of my rope.

Tools like PartitionMagic look into partitions. You need a tool like
VolumeGroupMagic. If there is such a tool. If there is I want one to
add to my war chest marked "Just because you're paranoid doesn't mean
they are out to get you".

Doug Freyburger

Apr 16, 2012, 3:15:46 PM
Robert Nichols wrote:
> Harry wrote:
>> $ pvck -vv /dev/sdb2
>
>> Found label on /dev/sdb2, sector 1, type=LVM2 001
>> Found text metadata area: offset=4096, size=1044480
>> Found LVM2 metadata record at offset=1006080, size=42496,
>> offset2=0 size2=0
>> Found LVM2 metadata record at offset=991232, size=14848, offset2=0
>> size2=0
>
> That is all perfectly normal. It's hard to understand why it's not working.
> Try setting aside the /etc/lvm directory (rename it to /etc/xxlvm) to force
> the system to work only with the physical devices and not rely on previously
> cached data. Then see what 'pvscan', 'vgscan', and 'lvscan' can find. The
> vgscan should repopulate the /etc/lvm directory with good data.

If it was the beginning of /dev/sdb2 that was overwritten not the
beginning of the disk then the first LVM label was trashed. It's a huge
if but maybe that's what happened.

I see in the man page for the Red Hat version of "pvck" that it
supports "--labelsector sector". Above is a list of those sectors.
They should act something like the alternative super blocks discussed
earlier when we were thinking about an ext4 filesystem.

I don't see that switch in the man page for any of the other related
commands. Sigh.

Have you tried "lvscan" in the hopes that it uses those alternate labels?

Robert Nichols

Apr 16, 2012, 4:58:07 PM
On 04/16/2012 12:39 PM, Harry wrote:

>
> I had to add this line before include <stdio.h> to get the program to compile:
> #define umode_t mode_t
>
> Because mode_t itself is defined as 'unsigned <something>', I took the quick liberty of doing the above without thinking deeply enough about all its ramifications.
>
> There were 1601 entries printed. I extracted the base 10 number (from the parentheses), subtracted 2 (sectors) from it, and used the resulting number in device mapping code as follows:
>
>
> # This just extracts the sector that has the e2 fs signature.
> # Actual subtraction will be done later below, in the script.
>
> $ cat e2finder.log | sed -r 's/\(|\)|,//g' | field 2 > tmp
> $ cat tmp
> 8521728
> ...
> 98174976
>
>
> ---------------------------------
> #!/bin/bash
> set -u
>
> if [ -b /dev/mapper/foo ]; then
> dmsetup remove foo
> fi
>
> bogusMax=100
> cat tmp | while read x; do
> # superblock starts 2 sectors earlier
> (( x = x - 2 ))
>
> echo "Trying superblock at sector $x ..."
> if dmsetup create foo --table "0 $bogusMax linear /dev/sdb2 $x"; then
> if e2fsck -n /dev/mapper/foo &>/dev/null; then
> echo "Found superblock at sector $x ."
> else
> echo "FAILED to find superblock at sector $x ."
> fi
> dmsetup remove foo
> else
> echo "FAILED to dmsetup at sector $x"
> fi
> done
> ---------------------------------
>
>
> Unfortunately, every single superblock offset failed!
>
>
> Trying superblock at sector 8521726 ...
> FAILED to find superblock at sector 8521726 .
> ...
> Trying superblock at sector 98174974 ...
> FAILED to find superblock at sector 98174974 .

First, for this program it doesn't matter how you define umode_t since
the program never uses the one macro where that name appears. Anything
that keeps the compiler happy should be fine.

I ran the program on an LVM device where I had some known file systems
and was able to do a successful e2fsck. But, ...

1. You need to make the "num_sectors" arg to dmsetup at least large enough
to contain the file system or the fsck will surely fail.

2. It is not sufficient to begin the virtual device 2 sectors before the
candidate superblock. Remember that an alternate superblock at
offset XXX still expects the file system to begin at its real starting
point, not just 2 blocks before this alternate superblock. Consider:

0x0000100400 ( 2050), label="SL6-var", bs=4k # Real SB
0x0008100000 ( 264192), label="SL6-var", bs=4k # 1st alternate
0x0018100000 ( 788480), label="SL6-var", bs=4k # 2nd alternate

All of these expect the file system to begin at sector offset 2048,
and would need you to run "e2fsck -b 32768" or "e2fsck -b 98304" for
the alternate SBs.

WARNING: I get a ton of group descriptor checksum errors and block bitmap
differences whenever I try to use an alternate super block, and
this also happens for a regular file system where I have not
been playing dmsetup games. Seems dangerous, but I ran e2fsck
with the "-y" option to let it fix everything, and the file
system still checked OK from the primary SB. Be careful, and
run e2fsck with the "-n" option until you're reasonably sure
you've got things set up right.

Here's what I used to set up the mapper:

Find=2050; dmsetup create foo --table \
"0 $((159260346*2-(Find-2) )) linear /dev/sda12 $((Find-2))"

where that "159260346" is the total number of 1K blocks in the partition,
as reported by "fdisk -l".
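
A minimal sketch of that arithmetic as a shell function, assuming 512-byte sectors and a partition size given in 1K blocks. The name `dm_linear_table` is made up for illustration, and the function only prints the table so it can be inspected before anything is handed to dmsetup:

```shell
# dm_linear_table (hypothetical helper)
#   $1 = partition size in 1K blocks (the "Blocks" column of fdisk -l)
#   $2 = sector of the presumed primary superblock within the partition
#   $3 = partition device
# The file system starts 2 sectors (1024 bytes) before its primary
# superblock; the mapping runs from there to the end of the partition.
dm_linear_table() {
    start=$(( $2 - 2 ))
    length=$(( $1 * 2 - start ))
    printf '0 %s linear %s %s\n' "$length" "$3" "$start"
}

dm_linear_table 159260346 2050 /dev/sda12
# prints: 0 318518644 linear /dev/sda12 2048
```

The printed line is what would go inside the quotes after `--table`.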

I hope this helps.

Harry

Apr 16, 2012, 11:19:37 PM
Thanks, Doug. What you wrote above is definitely a good high-level description and should serve as a good conceptual overview.

However, I'd still want to read a book or an online resource that explains things both conceptually and at a practitioner's level. Basically, I'd like to be able to troubleshoot my own problems.

Here's one resource I found:
http://docs.fedoraproject.org/en-US/Fedora/14/html/Storage_Administration_Guide/index.html

But skimming through it tells me that it won't cover the concepts in detail; it also relies on the GUI versions of various tools. I tend to like the O'Reilly style of presentation and coverage of a topic, and it seems there is no (current) O'Reilly book on LVM2.

If you/others know of a good O'Reilly-type self-learning resource, I'd like to hear about it.

Also, I'm hoping I'd be able to do hands on exercises in a virtualized environment (VirtualBox, Xen, etc). If you know of any good tutorial that uses such a setup to teach you LVM, please do share.

Harry

Apr 17, 2012, 12:20:45 AM
On Tuesday, April 17, 2012 12:45:46 AM UTC+5:30, Doug Freyburger wrote:
> Robert Nichols wrote:
> > Harry wrote:
> >> $ pvck -vv /dev/sdb2
> >
> >> Found label on /dev/sdb2, sector 1, type=LVM2 001
> >> Found text metadata area: offset=4096, size=1044480
> >> Found LVM2 metadata record at offset=1006080, size=42496,
> >> offset2=0 size2=0
> >> Found LVM2 metadata record at offset=991232, size=14848, offset2=0
> >> size2=0
> >
> > That is all perfectly normal. It's hard to understand why it's not working.
> > Try setting aside the /etc/lvm directory (rename it to /etc/xxlvm) to force
> > the system to work only with the physical devices and not rely on previously
> > cached data. Then see what 'pvscan', 'vgscan', and 'lvscan' can find. The
> > vgscan should repopulate the /etc/lvm directory with good data.
>
> If it was the beginning of /dev/sdb2 that was overwritten not the
> beginning of the disk then the first LVM label was trashed. It's a huge
> if but maybe that's what happened.

No, I can assure you the dd command I had issued was:

$ dd if=/dev/sdb of=/dev/sda bs=446 count=1

At the time this command was issued,
a) sdb was the 250G disk with a freshly installed F16;
b) sda was the 80G disk
Because I switched the cables in my system later, it is sdb that is the 80G disk that we're trying to fix now.

> I see in the man page for the Red Hat version of "pvck" that it
> supports "--labelsector sector". Above is a list of those sectors.
> They should act something like the alternative super blocks discussed
> earlier when we were thinking about an ext4 filesystem.
>
> I don't see that switch in the man page for any of the other related
> commands. Sigh.
>
> Have you tried "lvscan" in the hopes that it uses those alternate labels?

$ lvscan --all
No volume groups found

Harry

Apr 17, 2012, 12:44:51 AM
Well, XYZ is a placeholder for my real 3-letter hostname. I thought I would use XYZ instead of the actual hostname to keep the interaction as objective as possible. I don't mind revealing it if it would help solve this problem; there is nothing secretive or personal about it.

For the same reason, though my shell prompt is also customized (via PS1) (it's actually a 2-line prompt!), I've been choosing to use only the plain, vanilla '$ ' in all my interactions so far.

> > vgscan reports no volumes.
> >
> > $ vgscan
> > Reading all physical volumes. This may take a while...
> > No volume groups found
> >
> > Now, what to do next?
>
> That's the puzzle we have gotten to. It's nowhere near where you did
> the dd. Is there any chance the dd actually had the partition in its of
> clause? "of=/dev/sdb2". If so that trashed the configuration blocks of
> the volume group not the MBR and partition table. So I move on to the
> next speculation. If you are positive it was "of=/dev/sdb" with no
> number none of the rest applies.

I am absolutely positive that I issued the

$ dd if=/dev/sdb of=/dev/sda bs=446 count=1

command. (Recap Note: What is sdb now, was sda earlier... at the time the dd command was issued.)

I was well aware of the dangers of playing with 'dd', and so was extra, extra careful in constructing it before issuing it. Though I didn't (and still don't deeply) understand partitioning and LVM, esp the way all you folks do, when issuing the dd command I knew at least things like device vs partition, if= vs of=, bs, count, skip, 446-byte MBR code, etc. As I said earlier, I was so confident of what I was doing that I didn't think it necessary to backup the disk!

Even during Fedora 16 install, when I came to this step
http://docs.fedoraproject.org/en-US/Fedora/16/html/Installation_Guide/Assign_Storage_Devices-x86.html
, I remember VERY clearly:
1. leaving this (currently messed up 80G) disk in the Data Storage Devices listbox; and
2. including the new 250 G disk in the Install Target Devices listbox.

Then, a few steps later, at http://docs.fedoraproject.org/en-US/Fedora/16/html/Installation_Guide/s1-diskpartitioning-x86.html, I remember VERY clearly /not/ having the 80G disk selected for formatting.

Could any of this have possibly messed up my disk? Probably not.

Later, I did incorrectly and unsuccessfully try various e2* commands to repair the LVM partition mistaking it for an ext4 fs. Only this part I don't remember fully well; I think, I did use the '-n' option in these commands which would have left the disk intact. Also, because I was simply copy-pasting commands from the Net without really understanding them (relying on the assurance of '-n') and because I tried various permutations of device/partition and offset numbers, I didn't really note down what all I was doing. Thus, except for these various e2* command sequences that I don't fully recall now, I'm absolutely sure of everything else.

Harry

Apr 17, 2012, 2:02:28 AM
Ok.

> 2. It is not sufficient to begin the virtual device 2 sectors before the
> candidate superblock. Remember that an alternate superblock at
> offset XXX still expects the file system to begin at its real starting
> point, not just 2 blocks before this alternate superblock. Consider:
>
> 0x0000100400 ( 2050), label="SL6-var", bs=4k # Real SB
> 0x0008100000 ( 264192), label="SL6-var", bs=4k # 1st alternate
> 0x0018100000 ( 788480), label="SL6-var", bs=4k # 2nd alternate
>
> All of these expect the file system to begin at sector offset 2048,
> and would need you to run "e2fsck -b 32768" or "e2fsck -b 98304" for
> the alternate SBs.
>
> WARNING: I get a ton of group descriptor checksum errors and block bitmap
> differences whenever I try to use an alternate super block, and
> this also happens for a regular file system where I have not
> been playing dmsetup games. Seems dangerous, but I ran e2fsck
> with the "-y" option to let it fix everything, and the file
> system still checked OK from the primary SB. Be careful, and
> run e2fsck with the "-n" option until you're reasonably sure
> you've got things set up right.
>
> Here's what I used to set up the mapper:
>
> Find=2050; dmsetup create foo --table \
> "0 $((159260346*2-(Find-2) )) linear /dev/sda12 $((Find-2))"
>
> where that "159260346" is the total number of 1K blocks in the partition,
> as reported by "fdisk -l".


Bob, would you please confirm or clarify the following:

1. My fdisk -l shows this:

$ fdisk -l
...
Disk /dev/sdb: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device Boot Start End Blocks Id System
/dev/sdb1 * 2048 1026047 512000 83 Linux
/dev/sdb2 1026048 156301311 77637632 8e Linux LVM

Question: Given this, I should be issuing `e2finder /dev/sdb2`, shouldn't I? Based on what I understand about LVMs, /dev/sdb2 is like a device or a disk - only logical instead of physical. Would you please confirm this (or deny, with a sentence or two of explanation)?

2. Now, when I do issue `e2finder /dev/sdb2`, the list starts out as follows:

$ ./e2finder /dev/sdb2
Byte offset sector (512B)
------------ --------
0x0004100000 ( 8521728), label="", bs=4k
0x0014100000 ( 9046016), label="", bs=4k
...

Given this list and given your example just above, would the value of 'Find' for me be: 8521728?

3. In your example, does 'Find' progressively take on values 2050, 264192, 788480... or, does it remain fixed at 2050?

If it's the former, then I wonder what would be the use of printing 2nd and subsequent entries?

And if it's the latter, then why in setting up the device mapping table would you use a varying 'num_sectors' -- namely, '$((159260346*2-(Find-2) ))' -- instead of a fixed value that is 'some, fixed transformation' of the Blocks value reported by fdisk? In other words, based on this statement of yours above,

"Remember that an alternate superblock at offset XXX still
expects the file system to begin at its real starting point,
not just 2 blocks before this alternate superblock."

(which I understand and appreciate, btw), why should the dmsetup invocation be parameterized on 'Find'?

4. You say above,

"All of these expect the file system to begin at sector offset 2048,
and would need you to run "e2fsck -b 32768" or "e2fsck -b 98304" for
the alternate SBs."

Is the concept here that only the first entry printed is *always* the Real, and any and all subsequent ones just 'alternates' or backups of the first one? Would you please confirm this?

Further, in your example, what is the correlation between the -b argument values of 32768, 98304, ... and the corresponding SBs? The reason I ask this is: The man page of e2fsck says that '-b 32768' may be specified when the blocksize is 4k. Mine looks to be 4k, indeed. But then, shouldn't I always use 32768 instead of various numbers? Also, e2fsck suggests using 32768 only when you're relying on its own notion of things to carry out the repair. But, here, we are using a custom program (e2finder) and ought to be using the number *it* is printing out. Even in our example, I fail to find any correlation between e2finder numbers and those that you specify in e2fsck -b.

Would /greatly/ appreciate if you could clarify these few points of confusion for me.

Robert Nichols

Apr 17, 2012, 11:58:44 AM
On 04/17/2012 01:02 AM, Harry wrote:
> On Tuesday, April 17, 2012 2:28:07 AM UTC+5:30, Robert Nichols wrote:
...
Yes, you should be running e2finder on /dev/sdb2.

At this point you aren't yet into LVMs, just an ordinary disk that has been
divided into two physical partitions. The ID on the second partition indicates
that the space therein is managed by the Logical Volume Manager. The LV
manager would, if things were working correctly, identify data structures
within that partition and set up virtual devices to access the space.

> 2. Now, when I do issue `e2finder /dev/sdb2`, the list starts out as follows:
>
> $ ./e2finder /dev/sdb2
> Byte offset sector (512B)
> ------------ --------
> 0x0004100000 ( 8521728), label="", bs=4k
> 0x0014100000 ( 9046016), label="", bs=4k
> ...
>
> Given this list and given your example just above, would the value of 'Find'
> for me be: 8521728?

Yes, that command line:

dmsetup create foo --table \
"0 $((159260346*2-(Find-2) )) linear /dev/sda12 $((Find-2))"

does the necessary arithmetic so that you can just plug in the sector offset
for the primary superblock.

> 3. In your example, does 'Find' progressively take on values 2050, 264192,
> 788480... or, does it remain fixed at 2050?

For all cases the file system actually starts at sector 2048 with the
primary superblock at sector 2050.

> If it's the former, then I wonder what would be the use of printing 2nd and
> subsequent entries?

Either you've got "former" and "latter" backwards, or I'm completely
confused by what you are asking.

Consider what you would see if the primary superblock were damaged. There
would be a number of alternate superblocks that you would have to identify
by looking at the repeated low-order hex digits of the byte offset, make an
educated guess about where the primary superblock would have been, and then
plug that sector offset in for "Find=xxxx".

The key is to find several potential superblocks where the low-order 6 hex
digits of the byte offset repeat, except that the first one has 0x400 added.
If you can find that pattern, that very likely identifies the first
superblock. If you don't find one with that 0x400 offset, then perhaps
that primary superblock got overwritten, and you'll have to calculate the
likely location for that primary.
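
That pattern search could be sketched in shell as below. It is only a rough illustration: it assumes e2finder's output format, treats two or more later matches as enough to count as "several", and the name `find_primary_candidates` is made up.

```shell
# find_primary_candidates (hypothetical helper): read e2finder output on
# stdin and print each byte offset P for which at least two later entries
# share the low-order 6 hex digits of (P - 0x400).
find_primary_candidates() {
    # Keep only the first column of lines that start with a hex offset.
    offsets=$(awk '$1 ~ /^0x/ { print $1 }')
    for p in $offsets; do
        base=$(( ($p - 0x400) & 0xFFFFFF ))
        n=0
        for q in $offsets; do
            if [ $(( $q > $p )) -eq 1 ] && [ $(( $q & 0xFFFFFF )) -eq "$base" ]; then
                n=$(( n + 1 ))
            fi
        done
        if [ "$n" -ge 2 ]; then
            echo "$p"
        fi
    done
}
```

It would be run as `find_primary_candidates < e2finder.log`. The nested loop is slow for logs with thousands of entries, and a printed offset is still only a candidate for the "Find=" value once converted to a sector number.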

> And if it's the latter, then why in setting up the device mapping table
> would you use a varying 'num_sectors' -- namely, '$((159260346*2-(Find-2) ))' --
> instead of a fixed value that is 'some, fixed transformation' of the Blocks
> value reported by fdisk? In other words, based on this statement of yours above,
>
> "Remember that an alternate superblock at offset XXX still
> expects the file system to begin at its real starting point,
> not just 2 blocks before this alternate superblock."
>
> (which I understand and appreciate, btw), why should the dmsetup invocation
> be parameterized on 'Find'?

Since I don't know the correct size, and the only constraint is that the
size must be at least large enough to contain the file system, I just
calculate the largest possible size, i.e., the number of sectors between
the starting point and the physical end of the partition.

> 4. You say above,
>
> "All of these expect the file system to begin at sector offset 2048,
> and would need you to run "e2fsck -b 32768" or "e2fsck -b 98304" for
> the alternate SBs."
>
> Is the concept here that only the first entry printed is *always* the Real,
> and any and all subsequent ones just 'alternates' or backups of the first
> one? Would you please confirm this?

The first entry might be the real primary superblock, but it could be just
a copy of the superblock that got written to some random location for
reasons unknown. I've got disk partitions that have been used for several
test installations, and there are copies of what look like valid superblocks
scattered all over.

> Further, in your example, what is the correlation between the -b argument
> values of 32768, 98304, ... and the corresponding SBs? The reason I ask this
> is: The man page of e2fsck says that '-b 32768' may be specified when the
> blocksize is 4k. Mine looks to be 4k, indeed. But then, shouldn't I always
> use 32768 instead of various numbers? Also, e2fsck suggests using 32768 only
> when you're relying on its own notion of things to carry out the repair. But,
> here, we are using a custom program (e2finder) and ought to be using the
> number *it* is printing out. Even in our example, I fail to find any
> correlation between e2finder numbers and those that you specify in e2fsck
> -b.

Recognize that the argument to "-b" represents 4k-sized blocks. So,

0x0000100400 # Real SB
- 0x400 # byte offset of real SB
------------
0x0000100000 # actual start of file system

0x0008100000 # 1st alternate
-0x0000100000 # start of file system
------------
0x0008000000 # byte offset of 1st alternate

Divide that number by the block size (4k = 0x1000) and you get
0x8000, or 32768 decimal.

For the 2nd alternate at 0x0018100000, you get a byte offset of 0x18000000,
or 98304 blocks of 4k.
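
The same calculation as a small shell function, as a sketch only: it assumes the byte offsets come from e2finder and that the primary superblock sits 1024 bytes into the file system. The name `backup_sb_block` is made up for illustration:

```shell
# backup_sb_block (hypothetical helper)
#   $1 = byte offset of the primary superblock
#   $2 = byte offset of an alternate superblock
#   $3 = file system block size in bytes (default 4096)
# Prints the block number to give to "e2fsck -b".
backup_sb_block() {
    fs_start=$(( $1 - 1024 ))
    echo $(( ($2 - fs_start) / ${3:-4096} ))
}

backup_sb_block 0x0000100400 0x0008100000    # prints 32768
backup_sb_block 0x0000100400 0x0018100000    # prints 98304
```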

unruh

Apr 17, 2012, 1:18:39 PM
On 2012-04-17, Harry <simon...@gmail.com> wrote:
> On Tuesday, April 17, 2012 12:34:18 AM UTC+5:30, Doug Freyburger wrote:
>> Harry wrote:
>> > Doug Freyburger <dfrey...@yahoo.com> wrote:
>> >
>> >> According to the "fdisk -l" output there is a 250 MB partition in Linux
>> >> format marked bootable. Clearly /boot. It does not fsck nor does it
>> >> mount as /mnt/boot. If only the boot code of the MBR were written that
>> >> partition would fsck and mount.
>> >
>> > No, actually, I /can/ mount the boot partition sdb1.

So what. Partition 1 starts in the same place even if the table was
overwritten.


>> > ...
>> > $ ls /mnt/x
>> > config-3.2.5-3.fc16.i686.PAE
>> > initramfs-3.2.9-1.fc16.i686.PAE.img
>> > config-3.2.9-1.fc16.i686.PAE
>> > initramfs-3.2.9-2.fc16.i686.PAE.img
>> > config-3.2.9-2.fc16.i686.PAE initrd-
>> > plymouth.img
>> > config.mk-compat-wireless-3.3-rc1-2-3.2.5-3.fc16.i686.PAE lost+found
>> > config.mk-compat-wireless-3.3-rc1-2-3.2.9-1.fc16.i686.PAE
>> > System.map-3.2.5-3.fc16.i686.PAE
>> > config.mk-compat-wireless-3.3-rc1-2-3.2.9-2.fc16.i686.PAE
>> > System.map-3.2.9-1.fc16.i686.PAE
>> > efi
>> > System.map-3.2.9-2.fc16.i686.PAE
>> > grub
>> > vmlinuz-3.2.5-3.fc16.i686.PAE
>> > grub2
>> > vmlinuz-3.2.9-1.fc16.i686.PAE
>> > initramfs-3.2.5-3.fc16.i686.PAE.img
>> > vmlinuz-3.2.9-2.fc16.i686.PAE
>>
>> That's clearly a /boot mount point. Conclusive evidence the partition
>> table was not trashed. No way did a "dd" copy all 250 MB.

Hardly.

Why would it need to have copied all 250MB?
And of what value is your "absolutely positive"? People have a wonderful
ability to remember what should have happened, rather than what did
happen.


>
> command. (Recap Note: What is sdb now, was sda earlier... at the time the dd command was issued.)
>
> I was well aware of the dangers of playing with 'dd', and so was extra, extra careful in constructing it before issuing it. Though I didn't (and still don't deeply) understand partitioning and LVM, esp the way all you folks do, when issuing the dd command I knew at least things like device vs partition, if= vs of=, bs, count, skip, 446-byte MBR code, etc. As I said earlier, I was so confident of what I was doing that I didn't think it necessary to backup the disk!
>

> Even during Fedora 16 install, when I came to this step
> http://docs.fedoraproject.org/en-US/Fedora/16/html/Installation_Guide/Assign_Storage_Devices-x86.html
> , I remember VERY clearly:
> 1. leaving this (currently messed up 80G) disk in the Data Storage Devices listbox; and
> 2. including the new 250 G disk in the Install Target Devices listbox.
>
> Then, a few steps later, at http://docs.fedoraproject.org/en-US/Fedora/16/html/Installation_Guide/s1-diskpartitioning-x86.html, I remember VERY clearly /not/ having the 80G disk selected for formatting.
>
> Could any of this have possibly messed up my disk? Probably not.
>
> Later, I did incorrectly and unsuccessfully try various e2* commands to repair the LVM partition mistaking it for an ext4 fs. Only this part I don't remember fully well; I think, I did use the '-n' option in these commands which would have left the disk intact. Also, because I was simply copy-pasting commands from the Net without really understanding them (relying on the assurance of '-n') and because I tried various permutations of device/partition and offset numbers, I didn't really note down what all I was doing. Thus, except for these various e2* command sequences that I don't fully recall now, I'm absolutely sure of everything else.
>

And so you could well have totally messed it up.

Harry

Apr 17, 2012, 9:41:09 PM
On Tuesday, April 17, 2012 10:48:39 PM UTC+5:30, unruh wrote:
> And of what value is your "absolutely positive"? People have a wonderful
> ability to remember what should have happened, rather than what did
> happen.
>
> And so you could well have totally messed it up.

Can you just shut the fsck -y up?

Harry

Apr 19, 2012, 12:53:37 AM
Bob, I understood the 'gist' of what you said above, but not every detail. It is probably because I'm still quite tense. A revisit to your post later may perhaps find me in a happier and more receptive state than I am now to understand all that detail.

So, what I did was simply paste your parameterized dmsetup command, relying on the fact that it would cause no persistent changes. In a nutshell, it did not work. I got these messages:

Trying superblock at sector 8521728 ...
e2fsck 1.41.14 (22-Dec-2010)
e2fsck: Group descriptors look bad... trying backup blocks...
e2fsck: Bad magic number in super-block when using the backup blocks
e2fsck: going back to original superblock
Error reading block 3062704254 (Invalid argument). Ignore error? no

Superblock has an invalid journal (inode 8).
Clear? no

e2fsck: Illegal inode number while checking ext3 journal for /dev/mapper/foo
FAILED to find superblock at sector 8521726 .

Trying superblock at sector 9046016 ...
...

Meanwhile, frustrated and desperate, I kept googling and managed to run into this article:
http://www.microdevsys.com/WordPress/2011/09/19/linux-lvm-recovering-a-lost-volume/


Good news and Bad news:

1. The good news is that I *was* able to get the various LVM commands to succeed. I think vgcfgrestore may have been the key. I don't pretend to understand everything about it or why it worked, but, yes, it did work. Here's what I did:

a) I noticed that in the image of the partition, the string /dev/sda2 was written. Even though the LVM backup file said
device = "/dev/sda2" # Hint only
, I decided to be a little superstitious and switched the data cables in my system once again so that the messed up disk once again became /dev/sda.

$ fdisk -l
Disk /dev/sda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     1026047      512000   83  Linux
/dev/sda2         1026048   156301311    77637632   8e  Linux LVM

b) Restored the /etc/lvm directory.

$ mv /etc/lvm.off /etc/lvm
$ ls -l /etc/lvm/backup/
total 4
-rw-------. 1 root root 1456 Apr 19 09:16 vg_XYZ

c) I then issued lvm vgcfgrestore and -- wow, unlike for the author of the above article -- the command worked for me!

$ lvm vgcfgrestore vg_XYZ
Restored volume group vg_XYZ

$ lvm vgscan
Reading all physical volumes. This may take a while...
Found volume group "vg_XYZ" using metadata type lvm2

$ lvm vgscan
Reading all physical volumes. This may take a while...
Found volume group "vg_XYZ" using metadata type lvm2

$ lvm pvscan
PV /dev/sda2 VG vg_XYZ lvm2 [74.03 GiB / 0 free]
Total: 1 [74.03 GiB] / in use: 1 [74.03 GiB] / in no VG: 0 [0 ]

$ lvm vgchange -a y vg_XYZ
2 logical volume(s) in volume group "vg_XYZ" now active

$ lvm pvs
  PV         VG     Fmt  Attr PSize  PFree
  /dev/sda2  vg_XYZ lvm2 a--  74.03g    0

$ lvm vgs
  VG     #PV #LV #SN Attr   VSize  VFree
  vg_XYZ   1   2   0 wz--n- 74.03g    0

$ lvm lvs
  LV      VG     Attr   LSize  Origin Snap%  Move Log Copy%  Convert
  lv_root vg_XYZ -wi-a- 70.09g
  lv_swap vg_XYZ -wi-a-  3.94g

$ mount /dev/vg_XYZ/lv_root /mnt/x/
mount: you must specify the filesystem type

$ mount -vvvvvv /dev/vg_XYZ/lv_root /mnt/x
mount: fstab path: "/etc/fstab"
mount: mtab path: "/etc/mtab"
mount: lock path: "/etc/mtab~"
mount: temp path: "/etc/mtab.tmp"
mount: UID: 0
mount: eUID: 0
mount: spec: "/dev/mapper/vg_XYZ-lv_root"
mount: node: "/mnt/x"
mount: types: "(null)"
mount: opts: "(null)"
mount: you didn't specify a filesystem type for /dev/mapper/vg_XYZ-lv_root
I will try all types mentioned in /etc/filesystems or /proc/filesystems
Trying ext4
mount: mount(2) syscall: source: "/dev/mapper/vg_XYZ-lv_root", target: "/mnt/x", filesystemtype: "ext4", mountflags: -1058209792, data: (null)
Trying ext3
mount: mount(2) syscall: source: "/dev/mapper/vg_XYZ-lv_root", target: "/mnt/x", filesystemtype: "ext3", mountflags: -1058209792, data: (null)
Trying ext2
mount: mount(2) syscall: source: "/dev/mapper/vg_XYZ-lv_root", target: "/mnt/x", filesystemtype: "ext2", mountflags: -1058209792, data: (null)
Trying iso9660
mount: mount(2) syscall: source: "/dev/mapper/vg_XYZ-lv_root", target: "/mnt/x", filesystemtype: "iso9660", mountflags: -1058209792, data: (null)
Trying vfat
mount: mount(2) syscall: source: "/dev/mapper/vg_XYZ-lv_root", target: "/mnt/x", filesystemtype: "vfat", mountflags: -1058209792, data: (null)
Trying hfs
mount: mount(2) syscall: source: "/dev/mapper/vg_XYZ-lv_root", target: "/mnt/x", filesystemtype: "hfs", mountflags: -1058209792, data: (null)
Trying hfsplus
mount: mount(2) syscall: source: "/dev/mapper/vg_XYZ-lv_root", target: "/mnt/x", filesystemtype: "hfsplus", mountflags: -1058209792, data: (null)
mount: you must specify the filesystem type

2. The bad news is that, as you can see above, I don't know what to do now to get the mount to succeed. Because the article author's situation isn't exactly the same as mine, and also because I don't fully understand either his situation or mine, I don't feel safe issuing any more of the commands (pvcreate, vgextend, pvmove, vgreduce, pvremove) that he mentions after the 'mount failure' point in his article.

So, I would greatly appreciate it if you or someone could kindly tell me what to do *now* to get the partitions to mount (especially the lv_root partition), so that I can back it up first thing. Once a backup is made, I can play with this LVM stuff, learn and explore it to my heart's content, and read all the posts at a slower and more relaxed pace.

Many, many thanks to all those who have taken interest in this thread, have stood by my side all this time, and have been very patient with my newbie questions.

Harry

Apr 19, 2012, 9:43:48 AM
On Thursday, April 12, 2012 3:59:52 PM UTC+5:30, Harry wrote:
> Hello,
> I posted my question to superuser.com (http://superuser.com/questions/
> 410796/unable-to-repair-an-ext4-filesystem-with-bad-superblock) but
> haven't got any response yet.
>
> In summary, not realizing what I was doing, I overwrote the first 446
> bytes of MBR via the DD command. Would greatly, *GREATLY* appreciate
> if someone could help me salvage my disk!
>
>
> =================
> Details of what I did:
> =================
>
> Using the `dd` command, I was hoping that I would be able to copy over
> the first 446 bytes from Disk B (250GB) to Disk A (80GB), in order to
> make Disk A bootable just like Disk B. I issued the command:
>
> dd if=/dev/sdb of=/dev/sda bs=446 count=1
>
> But when I could not boot up from `sda`, I rebooted from `sdb` to see
> what was going on. To my horror, `sda` was being reported to have a
> bad superblock, now.
>
> Worse, I was **unable** to repair it via the backup superblocks stored
> on the ext4 filesystem. This is what I did. I first got the backup
> superblock addresses, like so:
>
> [root@localhost liveuser]# mke2fs -n /dev/sda
> mke2fs 1.41.14 (22-Dec-2010)
> /dev/sda is entire device, not just one partition!
> Proceed anyway? (y,n) y
> Filesystem label=
> OS type: Linux
> Block size=4096 (log=2)
> Fragment size=4096 (log=2)
> Stride=0 blocks, Stripe width=0 blocks
> 4890624 inodes, 19537686 blocks
> 976884 blocks (5.00%) reserved for the super user
> First data block=0
> Maximum filesystem blocks=0
> 597 block groups
> 32768 blocks per group, 32768 fragments per group
> 8192 inodes per group
> Superblock backups stored on blocks:
> 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632,
> 2654208,
> 4096000, 7962624, 11239424
>
> Then, I used `e2fsck -b SUPERBLOCK /dev/sda`, with each of the
> `SUPERBLOCK` values listed above, like so:
>
> [root@localhost liveuser]# e2fsck -b 32768 /dev/sda
> e2fsck 1.41.14 (22-Dec-2010)
> e2fsck: Bad magic number in super-block while trying to open /dev/sda
>
> The superblock could not be read or does not describe a correct ext2
> filesystem. If the device is valid and it really contains an ext2
> filesystem (and not swap or ufs or something else), then the
> superblock
> is corrupt, and you might try running e2fsck with an alternate
> superblock:
> e2fsck -b 8193 <device>
>
> I tried every single value, but each gave the above message!
>
> **Is there anything that I could do NOW to salvage my precious disk?**
> This is an 80G disk with 2 partitions. The `/dev/sda1` partition is
> clean and is mountable; it is the `/dev/sda2` partition that is
> failing to work with commands like `mount`, `debugfs`, `dumpe2fs`,
> etc.
>
> Running `mke2fs -n` for the individual partitions gave me this (notice
> how the **First Data Block** and **Maximum filesystem blocks** both
> show **0** as their value):
>
> [root@localhost liveuser]# mke2fs -n /dev/sda1
> mke2fs 1.41.14 (22-Dec-2010)
> Filesystem label=
> OS type: Linux
> Block size=1024 (log=0)
> Fragment size=1024 (log=0)
> Stride=0 blocks, Stripe width=0 blocks
> 128016 inodes, 512000 blocks
> 25600 blocks (5.00%) reserved for the super user
> First data block=1
> Maximum filesystem blocks=67633152
> 63 block groups
> 8192 blocks per group, 8192 fragments per group
> 2032 inodes per group
> Superblock backups stored on blocks:
> 8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409
>
> [root@localhost liveuser]# mke2fs -n /dev/sda2
> mke2fs 1.41.14 (22-Dec-2010)
> Filesystem label=
> OS type: Linux
> Block size=4096 (log=2)
> Fragment size=4096 (log=2)
> Stride=0 blocks, Stripe width=0 blocks
> 4857856 inodes, 19409408 blocks
> 970470 blocks (5.00%) reserved for the super user
> First data block=0
> Maximum filesystem blocks=0
> 593 block groups
> 32768 blocks per group, 32768 fragments per group
> 8192 inodes per group
> Superblock backups stored on blocks:
> 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632,
> 2654208,
> 4096000, 7962624, 11239424
>
> I still don't know what was wrong in my `dd` command that corrupted my
> ext4 superblock. You cannot imagine how happy I will be if someone
> could help me recover my disk back... since, except for this bad
> superblock, all the data is just sitting right there!
>
> PLEASE HELP, my soul is crying as I write this :-((
>
> PS: If I can get a response faster/better from another forum on the
> Net, please do point me to it.

After about 70 posts in this thread by some very helpful and knowledgeable members of this forum, I managed to run into this article:

http://www.microdevsys.com/WordPress/2011/09/19/linux-lvm-recovering-a-lost-volume/

Though the specifics of my situation and the causes leading to it were different from the author's, I did manage to notice the 'vgcfgrestore' command that had been missing all along in all the various suggestions made. I decided to give it a try - and lo and behold - it worked!

While I'm VERY happy now at the prospect of a full data recovery (assuming no data was lost due to e2fsck's fixes), I'd still like to seek some final help from you folks in reconstructing the 'crime scene': basically, deducing from the following sequence of steps in the solution what must have gotten corrupted, and how and why. I have already told my story enough times in this thread and on superuser.com, but is it consistent with these additional data points from the solution?
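
One piece of the puzzle can be checked without going near a real disk: what a dd of that shape writes. The following is only a sketch on scratch files, assuming GNU coreutils. The file names are made up, and conv=notrunc is added solely because the target here is a regular file (without it dd would truncate the file, which does not happen on a block device). It says nothing about what else may have been run against the disk.

```shell
# Scratch-file demo only: nothing here touches a real disk.
work=$(mktemp -d)

# Two fake 512-byte "first sectors": the source is all 'A', the target all 'B'.
head -c 512 /dev/zero | tr '\0' 'A' > "$work/src.img"
head -c 512 /dev/zero | tr '\0' 'B' > "$work/dst.img"

# Same shape as the command in the original post, but on files.
# conv=notrunc is an assumption needed only for regular files.
dd if="$work/src.img" of="$work/dst.img" bs=446 count=1 conv=notrunc 2>/dev/null

# Count bytes that are NOT what we expect in each region of the target.
boot_code=$(head -c 446 "$work/dst.img" | tr -d 'A' | wc -c | tr -d ' ')
tail_part=$(tail -c 66 "$work/dst.img" | tr -d 'B' | wc -c | tr -d ' ')
size=$(wc -c < "$work/dst.img" | tr -d ' ')

echo "bytes 0-445 not overwritten: $boot_code"
echo "bytes 446-511 changed:       $tail_part"
echo "target size:                 $size"
rm -rf "$work"
```

Both counts come out as 0 and the size stays at 512, i.e. the 64-byte partition table and the 2-byte signature that follow the boot code are left alone by a command of this form.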

================
Solution:
================

a) I noticed that in the image of the partition, the string /dev/sda2 was written. Even though the LVM backup file said

device = "/dev/sda2" # Hint only

, I decided to be a little *superstitious* and switched the data cables in my system once again so that the messed up disk once again became /dev/sda.

$ fdisk -l
Disk /dev/sda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     1026047      512000   83  Linux
/dev/sda2         1026048   156301311    77637632   8e  Linux LVM

b) I then restored the /etc/lvm directory.

$ mv /etc/lvm.off /etc/lvm
$ ls -l /etc/lvm/backup/
total 4
-rw-------. 1 root root 1456 Apr 19 09:16 vg_XYZ

c) I then issued an `lvm vgcfgrestore` and -- wow, unlike for the author of the above article -- the command worked for me!

$ lvm vgcfgrestore vg_XYZ
Restored volume group vg_XYZ

$ lvm vgscan
Reading all physical volumes. This may take a while...
Found volume group "vg_XYZ" using metadata type lvm2

d) At this point, I decided to repair the filesystem, relying on the strength of my backed-up image of the messed up disk. In other words, even if my e2fsck'ing were to wreak further havoc on the disk, I could always return to Step (c) above.

$ e2fsck /dev/mapper/vg_XYZ-lv_root

I had to say 'y' to so many tons and tons of

Directories count wrong for group #<NNN>

prompts that I had almost given up hope.

e) Verified that the partition was clean.

$ e2fsck /dev/mapper/vg_XYZ-lv_root
e2fsck 1.41.14 (22-Dec-2010)
/dev/mapper/vg_XYZ-lv_root: clean, 912765/4595712 files, 14271569/18374656 blocks

f) Mounting.

$ mount /dev/mapper/vg_XYZ-lv_root /mnt/x

$ ls -l /mnt/x
total 144
dr-xr-xr-x. 2 root root 4096 Apr 8 09:39 bin
drwxr-xr-x. 2 root root 4096 Oct 25 18:30 boot
drwxr-xr-x. 2 root root 4096 Mar 3 2011 cgroup
drwxr-xr-x. 2 root root 4096 Oct 25 18:30 dev
drwxr-xr-x. 189 root root 12288 Apr 10 12:44 etc
drwxr-xr-x. 8 root root 4096 Jan 5 10:11 home
dr-xr-xr-x. 23 root root 12288 Mar 14 03:23 lib
drwx------. 2 root root 16384 Oct 25 18:30 lost+found
drwxr-xr-x. 2 root root 4096 Jul 29 2011 media
drwxr-xr-x. 3 root root 4096 Jul 29 2011 mnt
drwxr-xr-x. 5 root root 4096 Jan 30 22:00 opt
drwxr-xr-x. 2 root root 4096 Oct 25 18:30 proc
dr-xr-x---. 25 root root 4096 Apr 10 12:43 root
drwxr-xr-x. 36 root root 4096 Nov 20 09:09 run
dr-xr-xr-x. 2 root root 12288 Mar 13 06:32 sbin
drwxr-xr-x. 2 root root 4096 Dec 31 08:29 selinux
drwxr-xr-x. 2 root root 4096 Jul 29 2011 srv
drwxr-xr-x. 2 root root 4096 Oct 25 18:30 sys
drwxrwxrwt. 49 root root 28672 Apr 10 12:44 tmp
drwxr-xr-x. 12 root root 4096 Nov 20 07:18 usr
drwxr-xr-x. 22 root root 4096 Jan 4 10:15 var

Bob

Apr 19, 2012, 12:29:41 PM
On 04/19/2012 08:43 AM, Harry wrote:
[SNIP]
Looks like you finally got there. Good going! I still have no clue about
what might have happened to cause that mess. Next step will be to see if
the files you want to save are all intact.

Doug Freyburger

Apr 19, 2012, 12:45:50 PM
Harry wrote:
>
> /dev/sda1 * 2048 1026047 512000 83 Linux

You will want to run fsck on this partition separately. You've already
been able to get it to mount if I recall. It used to be /boot so it
might not be of interest to you. Easily rebuilt.

> $ lvm vgs
> VG #PV #LV #SN Attr VSize VFree
> vg_XYZ 1 2 0 wz--n- 74.03g 0
>
> $ lvm lvs
> LV VG Attr LSize Origin Snap% Move Log Copy% Convert
> lv_root vg_XYZ -wi-a- 70.09g
> lv_swap vg_XYZ -wi-a- 3.94g
> ...
> So, would greatly appreciate if you or someone could kindly tell me
> what to do *now* to get the partitions to mount

The easy one first - Based on the name "lv_swap" don't bother with that
one. It will contain paged out virtual memory pages in a format that is
not useful once the system has shut down.

For "lv_root", do you remember its format? I'm in the middle of similar
conversations on group and off and I don't remember which one used ext3
and which one used ext4. With the volume group still active, so that
"vgdisplay -v" and/or lvs shows its logical volumes as above, do -

fsck -t ext3 /dev/vg_XYZ/lv_root

or maybe

fsck -o full -t ext3 /dev/vg_XYZ/lv_root

It will tell you how much data is intact. Might be all of it. After
fsck runs clean (might take several passes) you should be able to mount
it by specifying its format -

mount -t ext3 /dev/vg_XYZ/lv_root /mnt/root

Then you'll be able to see files with names you expect and/or files
under /mnt/root/lost+found whose names are based on the inode numbers.
With a small number of files in lost+found it's practical to start
guessing. With a large number not so much.
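
If the format is in doubt, a read-only peek at the superblock signature is possible before any fsck. A minimal sketch, with a made-up function name: the primary ext2/3/4 superblock starts at byte 1024 and its 16-bit magic 0xEF53 sits 56 bytes into it, stored little-endian. ext3 and ext4 share that magic, so this only says "ext family or not", and a damaged primary superblock will fail the check even when backups exist. The demo runs on a scratch file rather than a real volume.

```shell
# has_ext_magic FILE_OR_DEVICE (hypothetical helper)
# Read-only: succeeds if bytes 1080-1081 are 53 ef, the ext2/3/4 magic.
has_ext_magic() {
    magic=$(dd if="$1" bs=1 skip=1080 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "53ef" ]
}

# Scratch-image demo; the image is an assumption, not a real logical volume.
img=$(mktemp)
head -c 4096 /dev/zero > "$img"
has_ext_magic "$img" && before=yes || before=no

# Plant the two magic bytes (octal 123 357 = hex 53 ef) at offset 1080.
printf '\123\357' | dd of="$img" bs=1 seek=1080 conv=notrunc 2>/dev/null
has_ext_magic "$img" && after=yes || after=no

echo "before: $before, after: $after"
rm -f "$img"
```

On a real system the argument would be the logical volume, e.g. /dev/vg_XYZ/lv_root, run as root.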

Harry

Apr 19, 2012, 9:34:03 PM
On Thursday, April 19, 2012 10:15:50 PM UTC+5:30, Doug Freyburger wrote:
> Harry wrote:
> >
> > /dev/sda1 * 2048 1026047 512000 83 Linux
>
> You will want to run fsck on this partition separately. You've already
> been able to get it to mount if I recall. It used to be /boot so it
> might not be of interest to you. Easily rebuilt.

I completely did not realize that!
(Ran the check, it is clean.)
Thanks for reminding.

> For "lv_root", do you remember its format? I'm in the middle of similar
> conversations on group and off and I don't remember which one used ext3
> and which one used ext4. With the volume group still active, so that
> "vgdisplay -v" and/or lvs shows its logical volumes as above, do -
>
> fsck -t ext3 /dev/vg_XYZ/lv_root

It is ext4.
I noticed that fsck called e2fsck internally and didn't require the -t ext4 argument. I tried both with and without the argument, same result: clean.

> or maybe
>
> fsck -o full -t ext3 /dev/vg_XYZ/lv_root
>
> It will tell you how much data is intact. Might be all of it. After
> fsck runs clean (might take several passes) you should be able to mount
> it by specifying its format -

I'm using the fsck that comes packaged with util-linux-2.20.1-2.2.fc16.i686 on Fedora 16. It doesn't have the -o option.

> mount -t ext3 /dev/vg_XYZ/lv_root /mnt/root
> Then you'll be able to see files with names you expect and/or files
> under /mnt/root/lost+found whose names are based on the inode numbers.
> With a small number of files in lost+found it's practical to start
> guessing. With a large number not so much.

Didn't see any under /mnt/lost+found - Thank God!
I'm assuming this means all files under lv_root got recovered successfully.

Thanks, Doug.

Harry

Apr 19, 2012, 9:50:43 PM
On Thursday, April 19, 2012 9:59:41 PM UTC+5:30, Bob wrote:
> Looks like you finally got there. Good going!
Can't thank you and other posters (Doug, Richard, David, J G Miller, Gernot) enough for helping this LVM newbie.

> Next step will be to see if
> the files you want to save are all intact.

I verified that I was able to load Thunderbird (mail client) based on the recovered .thunderbird folder, with no mail missing. A random and brief manual inspection of my other recovered folders and their files seemed to tell me things were back to normal again. I never md5sum the content being backed up, nor do I use any specialized backup tools that would create a binary, possibly compressed, md5summed backup image; I simply use rsync to 'dump' the contents to my backup drive.
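
A checksum pass on top of rsync would look roughly like the following. This is a minimal sketch, assuming GNU coreutils md5sum and find. The function names and the scratch directories are made up for illustration, file names containing newlines are not handled, and an empty source tree is not handled either.

```shell
# make_manifest DIR (hypothetical helper)
# Print md5sums of every regular file under DIR, with paths relative to DIR.
make_manifest() {
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k 2 )
}

# verify_copy SRC DST (hypothetical helper)
# Succeeds only if every file listed for SRC exists under DST with the same sum.
verify_copy() {
    manifest=$(mktemp)
    make_manifest "$1" > "$manifest"
    ( cd "$2" && md5sum -c "$manifest" >/dev/null 2>&1 )
    status=$?
    rm -f "$manifest"
    return "$status"
}

# Scratch demo: an intact copy, then the same copy after tampering.
src=$(mktemp -d); dst=$(mktemp -d)
echo "mail" > "$src/inbox"
cp "$src/inbox" "$dst/inbox"
verify_copy "$src" "$dst" && good=match || good=mismatch
echo "tampered" > "$dst/inbox"
verify_copy "$src" "$dst" && bad=match || bad=mismatch

echo "intact copy: $good, altered copy: $bad"
rm -rf "$src" "$dst"
```

The check only goes one way: files that exist in the backup but not in the source are not reported.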

Doug Freyburger

Apr 20, 2012, 1:19:46 PM
Harry wrote:
> Doug Freyburger wrote:
>> Harry wrote:
>> >
>> > /dev/sda1 * 2048 1026047 512000 83 Linux
>>
>>It used to be /boot so it
>> might not be of interest to you. Easily rebuilt.
>
> Thanks for reminding.

Automatically built data.

>> or maybe
>> fsck -o full -t ext3 /dev/vg_XYZ/lv_root
>
> I'm using the fsck that comes packaged with util-linux-2.20.1-2.2.fc16.i686 on Fedora 16. It doesn't have the -o option.

It is from one of the many other systems I work on then. Probably the
VXFS version.

>> mount -t ext3 /dev/vg_XYZ/lv_root /mnt/root
>
> Didn't see any under /mnt/lost+found - Thank God!
> I'm assuming this means all files under lv_root got recovered successfully.

Yes that is what it means. Recovery project complete. Backup project
begins.

Harry

Apr 21, 2012, 11:37:48 PM
On Friday, April 20, 2012 10:49:46 PM UTC+5:30, Doug Freyburger wrote:
> Harry wrote:
> > Doug Freyburger wrote:
> >> Harry wrote:
> >> >
> >> > /dev/sda1 * 2048 1026047 512000 83 Linux
> >>
> >>It used to be /boot so it
> >> might not be of interest to you. Easily rebuilt.
> >
> > Thanks for reminding.
>
> Automatically built data.

Didn't understand.

> >> or maybe
> >> fsck -o full -t ext3 /dev/vg_XYZ/lv_root
> >
> > I'm using the fsck that comes packaged with util-linux-2.20.1-2.2.fc16.i686 on Fedora 16. It doesn't have the -o option.
>
> It is from one of the many other systems I work on then. Probably the
> VXFS version.

Ok, will just ignore it then.

> >> mount -t ext3 /dev/vg_XYZ/lv_root /mnt/root
> >
> > Didn't see any under /mnt/lost+found - Thank God!
> > I'm assuming this means all files under lv_root got recovered successfully.
>
> Yes that is what it means. Recovery project complete. Backup project
> begins.

I'll post another question shortly to invite some tips and best practices on the organization of partitions and their backup.

Doug Freyburger

Apr 25, 2012, 12:01:58 PM
Harry wrote:
> Doug Freyburger wrote:
>> Harry wrote:
>> > Doug Freyburger wrote:
>> >> Harry wrote:
>
>> >> > /dev/sda1 * 2048 1026047 512000 83 Linux
>
>> >>It used to be /boot so it
>> >> might not be of interest to you. Easily rebuilt.
>
>> > Thanks for reminding.
>
>> Automatically built data.
>
> Didn't understand.

Checking back to see if this point ended up clear. You wanted to
recover user data from that drive. Partition sda1 contains /boot that
does not contain any user data. That's why I said to ignore it for your
purposes.

The mount point /boot contains new and old versions of the kernel and
other boot support data. Its contents are automatically generated at
system build time and later when kernel upgrades/patches are applied.
It's very interesting when your purpose is recovering a system to
bootable status. Since your purpose was not to recover the system to
bootable status you could ignore it. A matter of context, what you
wanted to do.

Given all of the other discussion I suspect its contents are intact and
you would have been able to render the system bootable at some point
using its contents. You have a different boot disk so that was not your
goal.

Harry

Apr 28, 2012, 7:32:52 PM
On Wednesday, April 25, 2012 9:31:58 PM UTC+5:30, Doug Freyburger wrote:
> Harry wrote:
> > Doug Freyburger wrote:
> >> Harry wrote:
> >> > Doug Freyburger wrote:
> >> >> Harry wrote:
> >
> >> >> > /dev/sda1 * 2048 1026047 512000 83 Linux
> >
> >> >>It used to be /boot so it
> >> >> might not be of interest to you. Easily rebuilt.
> >
> >> > Thanks for reminding.
> >
> >> Automatically built data.
> >
> > Didn't understand.
>
> Checking back to see if this point ended up clear. You wanted to
> recover user data from that drive. Partition sda1 contains /boot that
> does not contain any user data. That's why I said to ignore it for your
> purposes.

Yes, I had understood this part fine.

> The mount point /boot contains new and old versions of the kernel and
> other boot support data. Its contents are automatically generated at
> system build time and later when kernel upgrades/patches are applied.
> It's very interesting when your purpose is recovering a system to
> bootable status. Since your purpose was not to recover the system to
> bootable status you could ignore it. A matter of context, what you
> wanted to do.

In your last post, I was not sure what you meant by 'data' in "Automatically built data". Now, I understand.

> Given all of the other discussion I suspect its contents are intact and
> you would have been able to render the system bootable at some point
> using its contents. You have a different boot disk so that was not your
> goal.

Yes, that's correct.

Thanks a ton, Doug, for being patient with my questions!