the superblock or the partition table is likely to be corrupt!


Moa

Apr 7, 2013, 8:18:19 PM
to al...@googlegroups.com
Hi 

Exhausted by the D-Link firmware's performance, I just changed to Alt-F RC3.
I get the error "the superblock or the partition table is likely to be corrupt!" just after installation; it is unable to mount my ext3 filesystem read-write.

I tried:
1 - the Check function in the web interface -> it fails.
2 - the Force Fix function in the web interface -> it fails.
3 - the command e2fsck -f /dev/md0 over the ssh interface.

Am I using the right method, or am I destroying my last chance of repairing the system? :-)

Thanks

Joao Cardoso

Apr 7, 2013, 9:42:01 PM
to al...@googlegroups.com


On Monday, April 8, 2013 1:18:19 AM UTC+1, Moa wrote:
Hi 

Exhausted by the D-Link firmware's performance, I just changed to Alt-F RC3.
I get the error "the superblock or the partition table is likely to be corrupt!"

Where do you see this error?
Please attach the whole system and kernel log (System->Utilities)

 
just after installation; it is unable to mount my ext3 filesystem read-write.

I tried:
1 - the Check function in the web interface -> it fails.
2 - the Force Fix function in the web interface -> it fails.
 
Error messages? (Full messages!) Only the above "the superblock or the partition table is likely to be corrupt!"?
 
3 - the command e2fsck -f /dev/md0 over the ssh interface.


The "Force Fix" already does a 'e2fsck -yf'.

I'm afraid that the error is not from the filesystem but from the disk itself; only the full log will tell.
What does 'sfdisk -luS /dev/sd?' output?


Am I using the right method, or am I destroying my last chance of repairing the system? :-)

It depends on what the problem is. Full logs and error messages, please
 

Thanks

Moa

Apr 8, 2013, 1:32:11 AM
to al...@googlegroups.com
Hi Joao, thanks for your quick reply.

Here are some elements:



On Sunday, April 7, 2013 9:42:01 PM UTC-4, Joao Cardoso wrote:


On Monday, April 8, 2013 1:18:19 AM UTC+1, Moa wrote:
Hi 

Exhausted by the D-Link firmware's performance, I just changed to Alt-F RC3.
I get the error "the superblock or the partition table is likely to be corrupt!"

Where do you see this error? /var/log/systemerror.log
Please attach the whole system and kernel log (System->Utilities) Done

 
just after installation; it is unable to mount my ext3 filesystem read-write.

I tried:
1 - the Check function in the web interface -> it fails.
Unable to automatically fix md0, mounting Read Only: fsck 1.41.14 (22-Dec-2010)
/dev/md0: The filesystem size (according to the superblock) is 488116951 blocks
The physical size of the device is 488116928 blocks
Either the superblock or the partition table is likely to be corrupt!


/dev/md0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
	(i.e., without -a or -p options)
2 - the Force Fix function in the web interface -> it fails.
Checking md0 finished with status code 8: fsck 1.41.14 (22-Dec-2010)
e2fsck 1.41.14 (22-Dec-2010)
The filesystem size (according to the superblock) is 488116951 blocks
The physical size of the device is 488116928 blocks
Either the superblock or the partition table is likely to be corrupt! 
Abort? yes
 
 
Error messages? (Full messages!) Only the above "the superblock or the partition table is likely to be corrupt!"?
 
3 - the command e2fsck -f /dev/md0 over the ssh interface.


The "Force Fix" already does a 'e2fsck -yf'. Yes but it abort automaticaly when executed from web interface see text highligted in red

I'm afraid that the error is not from the filesystem but from the disk itself; only the full log will tell.
What does 'sfdisk -luS /dev/sd?' output?
 
# sfdisk -luS /dev/md0

Disk /dev/md0: 488116928 cylinders, 2 heads, 4 sectors/track
No partitions found



# sfdisk -luS /dev/sda2
Warning: start=2088450 - this looks like a partition rather than
the entire disk. Using fdisk on it is probably meaningless.
[Use the --force option if you really want this]

# sfdisk -luS /dev/sdb2
Warning: start=2088450 - this looks like a partition rather than
the entire disk. Using fdisk on it is probably meaningless.
[Use the --force option if you really want this]



Am I using the right method, or am I destroying my last chance of repairing the system? :-)

Joao Cardoso

Apr 8, 2013, 11:38:53 AM
to al...@googlegroups.com


On Monday, April 8, 2013 6:32:11 AM UTC+1, Moa wrote:
Hi Joao, thanks for your quick reply.

Here are some elements:



On Sunday, April 7, 2013 9:42:01 PM UTC-4, Joao Cardoso wrote:


On Monday, April 8, 2013 1:18:19 AM UTC+1, Moa wrote:
Hi 

Exhausted by the D-Link firmware's performance, I just changed to Alt-F RC3.
I get the error "the superblock or the partition table is likely to be corrupt!"

Where do you see this error? /var/log/systemerror.log
Please attach the whole system and kernel log (System->Utilities) Done

 
just after installation; it is unable to mount my ext3 filesystem read-write.

I tried:
1 - the Check function in the web interface -> it fails.
Unable to automatically fix md0, mounting Read Only: fsck 1.41.14 (22-Dec-2010)
/dev/md0: The filesystem size (according to the superblock) is 488116951 blocks
The physical size of the device is 488116928 blocks
Either the superblock or the partition table is likely to be corrupt!

So the disk and your data are OK. At least mostly.

As you can see, the filesystem thinks that it has 488116951 blocks available for its use on the disk, but the disk itself has only allocated 488116928 blocks for the filesystem, i.e., the filesystem thinks it has 23 more blocks (roughly 100KB) than it really has.

The simplest way to fix this is to use Disk->Filesystem->FS Operations, Enlarge.
This will resize the filesystem to the disk's available capacity.
This is generally used in the reverse situation, when the filesystem is smaller than the disk partition holding it, but I think that it will also work in this situation.
There is the very remote possibility of you losing some data, especially if the disk is (or has been in the past) nearly full of data, say 99% full.

Please try it and report back.
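For reference, the rough command-line counterpart of that webUI operation (an assumption on my part; the webUI may run additional checks) is a plain resize2fs on the unmounted device:

umount /dev/md0       # the filesystem must not be mounted read-write
resize2fs -p /dev/md0 # with no size argument, resize2fs resizes the fs to the device's size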

How did this happen? I don't know, but Alt-F didn't change anything on your disks; they are as the D-Link fw left them.
And because the D-Link fw didn't do any consistency checks on your filesystems, the problem might have been lying around for years.
This was the main reason why Alt-F was written in the first place: to fix this carelessness in the D-Link fw.




/dev/md0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
	(i.e., without -a or -p options)
2 - the Force Fix function in the web interface -> it fails.
Checking md0 finished with status code 8: fsck 1.41.14 (22-Dec-2010)
e2fsck 1.41.14 (22-Dec-2010)
The filesystem size (according to the superblock) is 488116951 blocks
The physical size of the device is 488116928 blocks
Either the superblock or the partition table is likely to be corrupt! 
Abort? yes
 
 
Error messages? (Full messages!) Only the above "the superblock or the partition table is likely to be corrupt!"?
 
3 - the command e2fsck -f /dev/md0 over the ssh interface.


The "Force Fix" already does a 'e2fsck -yf'. Yes but it abort automaticaly when executed from web interface see text highligted in red

I'm afraid that the error is not from the filesystem but from the disk itself; only the full log will tell.
What does 'sfdisk -luS /dev/sd?' output?
 
# sfdisk -luS /dev/md0

No, I said "sda" or "sdb", not md0.
Physical disks are labeled 'sd?', where the ? will be 'a' for the first disk, 'b' for the second, etc.
Each disk can be further split into several partitions, sda1, the first partition, sda2, etc.
md0 or md1 are logical devices, not physical devices.
Read the Disk Partitioner and Disk Filesystem online help pages.

But they are OK, no need to post the results.

Moa

Apr 8, 2013, 7:07:40 PM
to
Hi Joao,

Thanks, your help is much appreciated.

I tried it twice and the result is:

Checking md0 failed with error code 8: fsck 1.41.14 (22-Dec-2010)
e2fsck 1.41.14 (22-Dec-2010)
The filesystem size (according to the superblock) is 488116951 blocks
The physical size of the device is 488116928 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort? yes

Here is the system log:

Apr  8 18:53:47 myDNS-323 user.notice hot: Start fscking md0
Apr  8 18:53:48 myDNS-323 user.notice hot: Unable to automatically fix md0, mounting Read Only: fsck 1.41.14 (22-Dec-2010) /dev/md0: The filesystem size (according to the superblock) is 488116951 blocks The physical size of the device is 488116928 blocks Either the superbl
Apr  8 18:53:48 myDNS-323 user.info kernel: EXT3-fs: barriers not enabled
Apr  8 18:53:48 myDNS-323 user.info kernel: kjournald starting.  Commit interval 5 seconds
Apr  8 18:53:48 myDNS-323 user.info kernel: EXT3-fs (md0): mounted filesystem with ordered data mode
Apr  8 18:55:40 myDNS-323 authpriv.info dropbear[2433]: Exit (root): Exited normally
Apr  8 18:55:44 myDNS-323 user.notice root: Checking md0 failed with error code 8: fsck 1.41.14 (22-Dec-2010) e2fsck 1.41.14 (22-Dec-2010) The filesystem size (according to the superblock) is 488116951 blocks The physical size of the device is 488116928 blocks Either the s
Apr  8 19:01:56 myDNS-323 user.notice hot: Start fscking md0
Apr  8 19:01:57 myDNS-323 user.notice hot: Unable to automatically fix md0, mounting Read Only: fsck 1.41.14 (22-Dec-2010) /dev/md0: The filesystem size (according to the superblock) is 488116951 blocks The physical size of the device is 488116928 blocks Either the superbl
Apr  8 19:01:57 myDNS-323 user.info kernel: EXT3-fs: barriers not enabled
Apr  8 19:01:57 myDNS-323 user.info kernel: kjournald starting.  Commit interval 5 seconds
Apr  8 19:01:57 myDNS-323 user.info kernel: EXT3-fs (md0): mounted filesystem with ordered data mode
Apr  8 19:04:20 myDNS-323 daemon.info sysctrl: temp=39.1	 fan=400

Joao Cardoso

Apr 8, 2013, 7:59:47 PM
to al...@googlegroups.com


On Tuesday, April 9, 2013 12:04:03 AM UTC+1, Moa wrote:
Hi Joao,

Thanks, your help is much appreciated.

I tried it twice and the result is:

Checking md0 failed with error code 8: fsck 1.41.14 (22-Dec-2010)
e2fsck 1.41.14 (22-Dec-2010)
The filesystem size (according to the superblock) is 488116951 blocks
The physical size of the device is 488116928 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort? yes

My mistake.

Before the resize operation is executed a fsck is run, which triggers the above error.
I'm afraid that you will have to resort to the command line to fix this.
You have to telnet or ssh into the box and log in as the 'root' user, using the same password as the webUI password.

Nobody likes to play with other people's data, mainly because the course of action depends on the commands' output, and nobody can foresee all possibilities.

What I would do, if it were my disk (depending on the data, I would try hard to make a backup first):

I would unmount the filesystem first using the webUI: Disk->Filesystem, select md0, Operation "unmount"

I would try to resize the filesystem using the command

resize2fs -p /dev/md0

If it says to run fsck first, which is very likely to happen, I would execute

fsck -f /dev/md0

and answer "no" to the

  Either the superblock or the partition table is likely to be corrupt!
  Abort?

question. As the size discrepancy is small I believe that no problem would arise.

If other questions were asked afterwards, then depending on the questions I would stop the program by typing "CTRL+C" and repeat the command, to see whether the size inconsistency was fixed. If it was not, I would have to research a bit more.
Otherwise I would repeat the command, but now answering "yes" to all questions by default:

fsck -fy /dev/md0

The '-fy' options to the fsck command above (the "Force Fix" Operation) mean answering yes to all questions during the fsck operation.
This is needed because most of the time hundreds of cryptic questions are asked, and as we do trust fsck we want to avoid them.
But this also prevents fsck from fixing the size inconsistency, so it can't be used in the first place.

If this succeeds (a further 'fsck -f /dev/md0' should produce no more frightening messages), I would now use the

resize2fs -p /dev/md0

command.

But this is what I would do if I were dealing with my own data, after doing a backup.

Moa

Apr 9, 2013, 8:05:28 AM
to al...@googlegroups.com
Hi Joao

I actually have a backup of my precious data stored in the NAS.

Here is the result; the resize aborts before ending:


# resize2fs -p /dev/md0
resize2fs 1.41.14 (22-Dec-2010)
Please run 'e2fsck -f /dev/md0' first.

# fsck -f /dev/md0
fsck 1.41.14 (22-Dec-2010)
e2fsck 1.41.14 (22-Dec-2010)
The filesystem size (according to the superblock) is 488116951 blocks
The physical size of the device is 488116928 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>? no

Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/md0: 428261/122036224 files (15.8% non-contiguous), 484798352/488116951 blocks
# resize2fs -p /dev/md0
resize2fs 1.41.14 (22-Dec-2010)
Resizing the filesystem on /dev/md0 to 488116928 (4k) blocks.
Begin pass 2 (max = 23)
Relocating blocks             resize2fs: Attempt to read block from filesystem resulted in short read while trying to resize /dev/md0
Please run 'e2fsck -fy /dev/md0' to fix the filesystem
after the aborted resize operation.

# e2fsck -fy /dev/md0
e2fsck 1.41.14 (22-Dec-2010)
The filesystem size (according to the superblock) is 488116951 blocks
The physical size of the device is 488116928 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort? yes



Joao Cardoso

Apr 9, 2013, 10:33:28 AM
to al...@googlegroups.com


On Tuesday, April 9, 2013 1:05:28 PM UTC+1, Moa wrote:
Hi Joao

I actually have a backup of my precious data stored in the NAS.

Here is the result; the resize aborts before ending:


# resize2fs -p /dev/md0
resize2fs 1.41.14 (22-Dec-2010)
Please run 'e2fsck -f /dev/md0' first.

# fsck -f /dev/md0
fsck 1.41.14 (22-Dec-2010)
e2fsck 1.41.14 (22-Dec-2010)
The filesystem size (according to the superblock) is 488116951 blocks
The physical size of the device is 488116928 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>? no

Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/md0: 428261/122036224 files (15.8% non-contiguous), 484798352/488116951 blocks

This is good, no inconsistencies were reported.
 
# resize2fs -p /dev/md0
resize2fs 1.41.14 (22-Dec-2010)
Resizing the filesystem on /dev/md0 to 488116928 (4k) blocks.
Begin pass 2 (max = 23)
Relocating blocks             resize2fs: Attempt to read block from filesystem resulted in short read while trying to resize /dev/md0

hmmm, the "short read" is problematic.
It might indicate one of two things:
 1-bad(s) sector(s) in the disk, or
 2-attempt to read past the disk (or partition) end. I beat on this

You can get an idea of the second hypothesis' validity by going to the Disk Partitioner and seeing if the free space is negative. I believe that it is.
This means that the RAID/filesystem/partition sizes do not match each other. The differences are small, but that's enough.

Combined with the use of RAID... well, that makes the recovery a bit more difficult.
There are two possibilities:
 -update your backup if needed, create a new RAID and filesystem (I recommend using the Disk Wizard) and refill it from the backup.
 -continue with this recovery attempt, one step a day.

Continuing:
Detecting bad blocks on the 2TB disks is going to take several days of continuous operation, so let's start with the second approach.
I need more system diagnostics. Run the following commands. They will not change anything on disk, only supply information.

sfdisk -luS /dev/sd? # obtain the disks partition table sizes

mke2fs -n /dev/md0 # obtain location of superblock backups

dumpe2fs /dev/md0 | grep superblock # another, more reliable, way to obtain location of superblock backups

mdadm --examine /dev/sda? # get details of the existing RAID components

mdadm --detail /dev/md0 # get details of the RAID device

dmesg | grep 'end of device'  # see if kernel complained about disk access. Better do after the fsck step

Moa

Apr 9, 2013, 6:01:49 PM
to
Let's have some fun and a challenge with the second approach (the first one is too basic) :-)

sfdisk -luS /dev/sd? # obtain the disks partition table sizes
----
# sfdisk -luS /dev/sd?

Disk /dev/sda: 243201 cylinders, 255 heads, 63 sectors/track
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sda1            63   1060289    1060227  82  Linux swap
/dev/sda2       2088450 3907024064 3904935615  83  Linux
/dev/sda3             0         -          0   0  Empty
/dev/sda4       1060290   2088449    1028160  83  Linux

Disk /dev/sdb: 243201 cylinders, 255 heads, 63 sectors/track
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sdb1            63   1060289    1060227  82  Linux swap
/dev/sdb2       2088450 3907024064 3904935615  83  Linux
/dev/sdb3             0         -          0   0  Empty
/dev/sdb4       1060290   2088449    1028160  83  Linux
----
mke2fs -n /dev/md0 # obtain location of superblock backups
----
# mke2fs -n /dev/md0
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
122036224 inodes, 488116928 blocks
24405846 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
14897 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848
----
dumpe2fs /dev/md0 | grep superblock # another, more reliable, way to obtain location of superblock backups
----
# dumpe2fs /dev/md0 | grep superblock
dumpe2fs 1.41.14 (22-Dec-2010)
  Primary superblock at 0, Group descriptors at 1-117
  Backup superblock at 32768, Group descriptors at 32769-32885
  Backup superblock at 98304, Group descriptors at 98305-98421
  Backup superblock at 163840, Group descriptors at 163841-163957
  Backup superblock at 229376, Group descriptors at 229377-229493
  Backup superblock at 294912, Group descriptors at 294913-295029
  Backup superblock at 819200, Group descriptors at 819201-819317
  Backup superblock at 884736, Group descriptors at 884737-884853
  Backup superblock at 1605632, Group descriptors at 1605633-1605749
  Backup superblock at 2654208, Group descriptors at 2654209-2654325
  Backup superblock at 4096000, Group descriptors at 4096001-4096117
  Backup superblock at 7962624, Group descriptors at 7962625-7962741
  Backup superblock at 11239424, Group descriptors at 11239425-11239541
  Backup superblock at 20480000, Group descriptors at 20480001-20480117
  Backup superblock at 23887872, Group descriptors at 23887873-23887989
  Backup superblock at 71663616, Group descriptors at 71663617-71663733
  Backup superblock at 78675968, Group descriptors at 78675969-78676085
  Backup superblock at 102400000, Group descriptors at 102400001-102400117
  Backup superblock at 214990848, Group descriptors at 214990849-214990965
----
mdadm --examine /dev/sda? # get details of the existing RAID components
----
# mdadm --examine /dev/sda?
mdadm: No md superblock detected on /dev/sda1.
/dev/sda2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 70c213df:be6dfb90:c0366bea:3e75f93a
  Creation Time : Sun Feb 27 13:54:03 2011
     Raid Level : raid1
  Used Dev Size : 1952467712 (1862.02 GiB 1999.33 GB)
     Array Size : 1952467712 (1862.02 GiB 1999.33 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Tue Apr  9 10:48:39 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : eb9e60ac - correct
         Events : 11781296


      Number   Major   Minor   RaidDevice State
this     1       8        2        1      active sync   /dev/sda2

   0     0       8       18        0      active sync   /dev/sdb2
   1     1       8        2        1      active sync   /dev/sda2
mdadm: No md superblock detected on /dev/sda4.
----
mdadm --detail /dev/md0 # get details of the RAID device
----
# mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Sun Feb 27 13:54:03 2011
     Raid Level : raid1
     Array Size : 1952467712 (1862.02 GiB 1999.33 GB)
  Used Dev Size : 1952467712 (1862.02 GiB 1999.33 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Apr  9 10:48:39 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 70c213df:be6dfb90:c0366bea:3e75f93a
         Events : 0.11781296

    Number   Major   Minor   RaidDevice State
       0       8       18        0      active sync   /dev/sdb2
       1       8        2        1      active sync   /dev/sda2
----
dmesg | grep 'end of device'  # see if kernel complained about disk access. Better do after the fsck step
----
# no result after fsck -f /dev/md0 ... Abort<y>? no
----

Joao Cardoso

Apr 10, 2013, 2:24:06 PM
to al...@googlegroups.com
The disk partition table and the RAID seem to be OK.
Some elementary math follows. Big numbers are involved, but the math is simple.

-The disk has 243201 * 255 * 63 = 3907024065 sectors (each with 512 bytes)

-The disk partition which holds the RAID component (sda2) ends on sector 3907024064, the disk last sector.

-The sda2 partition has 3904935615 sectors,
  i.e., 3904935615 / 8 = 488116951 blocks (each has 4096 bytes, i.e., 8 sectors),
  or 3904935615 * 512 = 1999327034880 bytes

-The RAID information stored in sda2 says that it has 1999326937088 bytes (1952467712 * 1024)

There are 1999327034880 - 1999326937088 = 97792 bytes in the partition that are used for RAID maintenance

-The md0 filesystem superblock says it has 488116951 * 4096 = 1999327031296 bytes

This can't be: it uses more space (1999327031296 - 1999326937088 = 94208 bytes) than the RAID makes available!

We already knew this from the fsck "Either the superblock or the partition table is likely to be corrupt!" message, but now we know that the RAID is OK for the existing disk partition table.
Just to check the fsck numbers: 

The filesystem size (according to the superblock) is 488116951 blocks
The physical size of the device is 488116928 blocks

Thus (488116951 - 488116928) * 4096 = 94208 bytes, exactly what we have determined independently. fsck is right! :-)
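Just as a cross-check, the same arithmetic can be reproduced with plain shell arithmetic (read-only, and assuming the shell does 64-bit math):

echo $(( 243201 * 255 * 63 ))                     # 3907024065 sectors on each disk
echo $(( 3904935615 / 8 ))                        # 488116951 4KiB blocks fit in sda2
echo $(( 3904935615 * 512 - 1952467712 * 1024 ))  # 97792 bytes kept back by the RAID
echo $(( (488116951 - 488116928) * 4096 ))        # 94208 bytes the fs claims beyond md0's end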

When resize2fs tries to access those extra 94208 bytes it receives the "short read" error, because it is attempting to access data outside the (virtual) md0 device.

So you had a filesystem on sda2 (or sdb2), then you created a RAID using sda2 (or sdb2) as one of its components, and the existing filesystem was detected and used there. Whether this was yours or a D-Link firmware error, I don't know.

Now, how to fix?

The first hypothesis is to assume that the primary superblock is in error and to try one of the backup superblocks.
You already have a list of the backup superblocks; try them one at a time.

Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848

Using the first backup superblock, try running fsck with it, and see if the "Either the superblock or the partition table is likely to be corrupt!" reappears:

e2fsck -b 32768 -f /dev/md0

If it does, abort and try the next backup superblock.
If after trying a couple of backup superblocks the same message appears, you will have to resort to the force option to resize2fs:

resize2fs -fp /dev/md0

If it succeeds, use a plain

e2fsck -f /dev/md0

at the end.

Solved?

If not, you might have to fool fdisk and mdadm, and I don't know if that is possible.

The disk partition tools sometimes report a size different from the real disk size, and it might be possible to grow the last partition by a small amount (184 sectors, 94208 bytes), then grow the RAID, and then resize2fs would not complain about a "short read".
But this is very uncertain, and as you have backups, the right course of action is to use the Disk Wizard and recover the data from the backups.
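For completeness, a very rough and untested sketch of what that uncertain path would look like (the partition-growing step is hypothetical and only possible if the disk really has spare sectors past sector 3907024064):

# 1. grow sda2 and sdb2 by ~184 sectors each, keeping the same start sector (fdisk/sfdisk)
# 2. let the RAID use the enlarged partitions:
mdadm --grow /dev/md0 --size=max
# 3. the filesystem resize should then no longer hit the "short read":
resize2fs -p /dev/md0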

Just to see if it would be possible, what is the output of

fdisk -lu /dev/sda /dev/sdb

cat /sys/block/sda/size /sys/block/sdb/size






Moa

Apr 10, 2013, 10:29:23 PM
to al...@googlegroups.com
Hi Joao,

Firstly, thank you very much for the explanations; they are very instructive and I really appreciate understanding what is going on. I don't know either if this was my error or a D-Link firmware error: I just bought two hard disks, plugged them into the DNS-323 and followed the instructions; meanwhile I installed funplug and some packages, but I never played with the system management tools.

I ran fsck with each backup superblock; for each try I got the message "Either the superblock or the partition table is likely to be corrupt!"

I tried "resize2fs -fp /dev/md0" and it fails with "Attempt to read block from filesystem resulted in short read while trying to resize /dev/md0"

Information needed for the next step:
# fdisk -lu /dev/sda /dev/sdb

Disk /dev/sda: 2000.3 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sda1              63     1060289      530113+ 82 Linux swap
/dev/sda2         2088450  3907024064  1952467807+ 83 Linux
/dev/sda4         1060290     2088449      514080  83 Linux

Partition table entries are not in disk order

Disk /dev/sdb: 2000.3 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sdb1              63     1060289      530113+ 82 Linux swap
/dev/sdb2         2088450  3907024064  1952467807+ 83 Linux
/dev/sdb4         1060290     2088449      514080  83 Linux

Partition table entries are not in disk order
# cat /sys/block/sda/size /sys/block/sdb/size
3907029168
3907029168

Joao Cardoso

Apr 11, 2013, 11:39:31 AM
to al...@googlegroups.com


On Thursday, April 11, 2013 3:29:23 AM UTC+1, Moa wrote:
Hi Joao,

Firstly, thank you very much for the explanations; they are very instructive and I really appreciate understanding what is going on. I don't know either if this was my error or a D-Link firmware error: I just bought two hard disks, plugged them into the DNS-323 and followed the instructions; meanwhile I installed funplug and some packages, but I never played with the system management tools.

I ran fsck with each backup superblock; for each try I got the message "Either the superblock or the partition table is likely to be corrupt!"

I tried "resize2fs -fp /dev/md0" and it fails with "Attempt to read block from filesystem resulted in short read while trying to resize /dev/md0"

Information needed for the next step:

We will try a simpler way.

To summarize:
-The existing filesystem on md0 has more space allocated than the md0 (virtual) device has
-The difference is very small but enough to prevent the standard recovery/fix commands from working
-The filesystem has the same allocated space as the sda2 (and sdb2) physical devices (that md0 is made upon)

So,
-unmount the filesystem, stop the raid,
-do the fsck on the physical device, do the resize on the physical device
-reassemble the raid, remount the filesystem.

With this approach your data will be available and OK on sda2, a regular physical device, not on a RAID.
I'm not sure what the resize step will do to the RAID information stored in sda2, so reassembling the RAID might be impossible and recreating it necessary. In any case, when sdb2 is re-added to the RAID, a resync will happen, and that can take some tens of hours (15?).

Procedure:

-Unmount md0 using the webUI
-stop the md0 RAID using the webUI
-At the command line, do the following (if any of the usual errors appears at any step, use the Disk Wizard to create a new RAID and use the backup to restore the data)

e2fsck -fy /dev/sda2 # if the error appears, stop, and use your backups
resize2fs -p /dev/sda2 488116928 # resize to the size of md0; if error, stop and use your backups
mdadm --zero-superblock /dev/sdb2 # remove RAID information from sdb2; data will still be there
sfdisk --id /dev/sda 2 da # take the opportunity to change the partition type to RAID
sfdisk --id /dev/sdb 2 da # take the opportunity to change the partition type to RAID
mdadm --assemble /dev/md0 # md0 should start in degraded mode


If an error appears at this point, such as "no recogniseable superblock", the resize step destroyed the RAID info and the array needs to be re-created.
In this case, use the command (you could also use the RAID web page for the same purpose -- just *don't* create a new filesystem on the RAID afterwards):

mdadm --zero-superblock /dev/sda2
mdadm --create /dev/md0 --run --level=1 --metadata=0.9 --bitmap=internal --raid-devices=2 /dev/sda2 missing

At this point you can use the webUI to RW mount the filesystem, check it, verify that all your data is there and accessible, then add sdb2 to the RAID. You can use the RAID webUI for this or use the 'mdadm /dev/md0 --add /dev/sdb2' command.
This will start a lengthy resync/rebuild procedure, synchronizing sda2 and sdb2 contents.
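The resync/rebuild progress can be watched at any time with a read-only command:

cat /proc/mdstat   # shows the rebuild percentage and the estimated time to finish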

As is obvious, I have never faced a problem like this one, and the reason why recovering from it should be possible is that the difference between the filesystem and RAID sizes is so small.

The webUI Filesystem and RAID pages both have provisions to enlarge/shrink filesystems/RAIDs, but always within the device/partition bounds.
Reading the online help on these pages, as well as on the Partitioner page, might give you more insight on the subject.

João

Moa

Apr 11, 2013, 11:52:42 PM
to al...@googlegroups.com
Hi Joao

Just for information, I will not be available for the next three days; I will continue on Monday.

Have a nice weekend.

Thanks

Moa

Apr 15, 2013, 8:46:20 PM
to al...@googlegroups.com
Hi Joao,

Your solution works perfectly; my RAID is at 36% recovery at the moment.

Thank you very much for your help,

jpbaril

Mar 3, 2014, 10:58:23 PM
to
Hi, I have a similar problem.

I just bought a 3 TB drive (sda). I partitioned it as GPT in 3 partitions: 0.5 GB (swap), 2000 GB (sda2) and 1000.093 GB (sda3).
I already had a 2TB drive (sdb) that was partitioned in two 1 TB partitions, so I copied the content of it to the 2TB partition on the 3 TB drive. Then I erased the 2 TB drive and converted it to GPT.
I created 2 partitions on it: 0.399 GB (swap) and 2000 GB (sdb2).

I then created a Raid1 array with sda2 and sdb2. It looked to be working and resyncing.
When I looked again the next day, I had the message that it could not automatically fix md0... etc.

So my problem looks very similar to this one, so I will try to give you all the same outputs you asked for in the thread.


# sfdisk -luS /dev/sd?

Disk /dev/sda: 97451 cylinders, 255 heads, 63 sectors/track
Warning: The partition table looks like it was made
  for C/H/S=*/256/63 (instead of 97451/255/63).
For this listing I'll assume that geometry.

Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sda1             1         - 4294967295  ee  Intel EFI GUID Partition Table
/dev/sda2             0         -          0   0  Empty

/dev/sda3             0         -          0   0  Empty
/dev/sda4             0         -          0   0  Empty


Disk /dev/sdb: 243201 cylinders, 255 heads, 63 sectors/track
Warning: The partition table looks like it was made
  for C/H/S=*/256/63 (instead of 243201/255/63).
For this listing I'll assume that geometry.

Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sdb1             1 3907029167 3907029167  ee  Intel EFI GUID Partition Table
/dev/sdb2             0         -          0   0  Empty

/dev/sdb3             0         -          0   0  Empty
/dev/sdb4             0         -          0   0  Empty



# mke2fs -n /dev/md0
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
122077184 inodes, 488281200 blocks
24414060 blocks (5.00%) reserved for the super user

First data block=0
Maximum filesystem blocks=0
14902 block groups

32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
    4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
    102400000, 214990848


# dumpe2fs /dev/md0 | grep superblock
dumpe2fs 1.41.14 (22-Dec-2010)
  Primary superblock at 0, Group descriptors at 1-117
  Backup superblock at 32768, Group descriptors at 32769-32885
  Backup superblock at 98304, Group descriptors at 98305-98421
  Backup superblock at 163840, Group descriptors at 163841-163957
  Backup superblock at 229376, Group descriptors at 229377-229493
  Backup superblock at 294912, Group descriptors at 294913-295029
  Backup superblock at 819200, Group descriptors at 819201-819317
  Backup superblock at 884736, Group descriptors at 884737-884853
  Backup superblock at 1605632, Group descriptors at 1605633-1605749
  Backup superblock at 2654208, Group descriptors at 2654209-2654325
  Backup superblock at 4096000, Group descriptors at 4096001-4096117
  Backup superblock at 7962624, Group descriptors at 7962625-7962741
  Backup superblock at 11239424, Group descriptors at 11239425-11239541
  Backup superblock at 20480000, Group descriptors at 20480001-20480117
  Backup superblock at 23887872, Group descriptors at 23887873-23887989
  Backup superblock at 71663616, Group descriptors at 71663617-71663733
  Backup superblock at 78675968, Group descriptors at 78675969-78676085
  Backup superblock at 102400000, Group descriptors at 102400001-102400117
  Backup superblock at 214990848, Group descriptors at 214990849-214990965


# mdadm --examine /dev/sda?
mdadm: No md superblock detected on /dev/sda1.
/dev/sda2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 1d3c6728:4cb8b5d6:ee6f82e5:6e2ef93f
  Creation Time : Mon Mar  3 09:41:50 2014
     Raid Level : raid1
  Used Dev Size : 1953124800 (1862.65 GiB 2000.00 GB)
     Array Size : 1953124800 (1862.65 GiB 2000.00 GB)

   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Mon Mar  3 20:20:40 2014
          State : clean
Internal Bitmap : present

 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 8a531eb1 - correct
         Events : 4618



      Number   Major   Minor   RaidDevice State
this     0       8        2        0      active sync   /dev/sda2

   0     0       8        2        0      active sync   /dev/sda2
   1     1       8       18        1      active sync   /dev/sdb2
mdadm: No md superblock detected on /dev/sda3.



# mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Mon Mar  3 09:41:50 2014
     Raid Level : raid1
     Array Size : 1953124800 (1862.65 GiB 2000.00 GB)
  Used Dev Size : 1953124800 (1862.65 GiB 2000.00 GB)

   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Mon Mar  3 20:20:40 2014
          State : active

 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 1d3c6728:4cb8b5d6:ee6f82e5:6e2ef93f (local to host zebrick)
         Events : 0.4618


    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2



Thank you very much!
KernelLog.log
SystemLog.log
systemerror.log

João Cardoso

Mar 5, 2014, 2:29:17 PM
to al...@googlegroups.com


On Tuesday, March 4, 2014 3:55:36 AM UTC, jpbaril wrote:
Hi, I have a similar problem.

I just bought a 3 TB drive (sda). I partitioned it as GPT in 3 partitions: 0.5 GB (swap), 2000 GB (sda2) and 1000.093 GB (sda3).
I already had a 2TB drive (sdb) that was partitioned in two 1 TB partitions, so I copied the content of it to the 2TB partition on the 3 TB drive. Then I erased the 2 TB drive and converted it to GPT.
I created 2 partitions on it: 0.399 GB (swap) and 2000 GB (sdb2).

I then created a Raid1 array with sda2 and sdb2.

That was the problem. You created a RAID over an existing fs.
RAID needs some disk space for managing itself, which means that it offers less space for the fs than its own partition's space.
 
It looked to be working and resyncing.

RAID is not a fs, it is like a disk; resyncing only means that all bytes on the two disks/partitions will be equal, independently of their contents -- be it empty, having a fs on it, or whatever.
 
When I looked again the next day, I had the message that it could not automatically fix md0... etc.

Yes, because the fs still thinks that it has the whole space of the partition on which it was first built (sdb2, e.g.), but now it lies on a RAID, md0, which offers less space than the original disk partition it was built on.

1-|oooooooooooooooo| disk partition
2-|ooooooooooooooyy| fs over partition, yy is for fs management, oo for data
3-|ooooooooooooooxx| raid over partition, xx is for raid management
4-|ooooooooooooyyxx| fs over raid, needs yy and xx, less oo space available than in 2

The fs still thinks it's in situation 2, but it is really in situation 4, so the sizes disagree.
The way to correct it is the same as depicted in the previous posts, broadly: destroy the RAID, shrink the fs in its sda2 (or sdb2) partition, create the RAID, enlarge the fs on the RAID.
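A rough, untested command-line sketch of those steps, assuming metadata 0.9, an unmounted sda2, and taking the target size from the "Array Size : 1953124800" KiB shown above (1953124800 / 4 = 488281200 4KiB blocks):

mdadm --stop /dev/md0                 # stop the array
mdadm --zero-superblock /dev/sdb2     # sdb2 will be rebuilt from sda2 later
e2fsck -f /dev/sda2                   # the fs must be clean before shrinking
resize2fs -p /dev/sda2 488281200      # shrink the fs to the md0 size
mdadm --create /dev/md0 --run --level=1 --metadata=0.9 --bitmap=internal --raid-devices=2 /dev/sda2 missing
resize2fs -p /dev/md0                 # enlarge the fs to fill md0
mdadm /dev/md0 --add /dev/sdb2        # re-add the second disk; a resync follows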

Your situation is similar but not identical to the one in the post, as your RAID is on a 3TB disk, not on a 2TB disk. If the disk partition where the RAID lies has less than 2TB, the situation will be identical -- but it has to be less than 2^31 (2147483648) bytes; 2TB is a very broad term (and I might have found a bug in Alt-F).
 
The difference is the place where RAID reserves space for itself, either at the partition beginning or at the partition end, and that has implications for whether the fs that already exists on the partition can be used or not.

In any case, you have to check the RAID metadata, the disk partitions (for GPT you must use fdisk and not sfdisk)

Ah, and you never posted the output I asked for in your other topic regarding the Disk Partitioner.

jpbaril

Mar 5, 2014, 11:16:41 PM
to al...@googlegroups.com

That was the problem. You created a RAID over an existing fs.
RAID needs some disk space for managing itself, which means that it offers less space for the fs than its own partition's space.

What should I have done instead?
 
 The way to correct it is the same as depicted in the previous posts, broadly: destroy the RAID, shrink the fs in its sda2 (or sdb2) partition, create the RAID, enlarge the fs on the RAID.

I tried to shrink sda2 (3 TB drive), but I got this error message:

"df: /dev/sda2: can't find mount point HTTP/1.1 303 Content-Type: text/html; charset=UTF-8 Location: /cgi-bin/diskmaint.cgi"

I'm on RC3.
I don't know if it shrunk anyway or not.
Going back to the filesystem page, the partition is now "checking".

In any case, you have to check the RAID metadata, the disk partitions (for GPT you must use fdisk and not sfdisk)

How do I do that?

Thanks again!

jpbaril

Mar 6, 2014, 12:14:32 AM
to al...@googlegroups.com

I tried to shrink sda2 (3 TB drive), but I got this error message:

"df: /dev/sda2: can't find mount point HTTP/1.1 303 Content-Type: text/html; charset=UTF-8 Location: /cgi-bin/diskmaint.cgi"

I still don't know if it shrunk sda2, but it checked it.
I also tried to shrink sdb2 and got the same message, but the FS is not being checked and it stays unmounted.

The same message as with md0:

Unable to automatically fix sdb2, mounting Read Only: fsck 1.41.14 (22-Dec-2010)
/dev/sdb2: The filesystem size (according to the superblock) is 488281250 blocks
The physical size of the device is 488281221 blocks
Either the superblock or the partition table is likely to be corrupt!

João Cardoso

Mar 6, 2014, 12:51:17 PM
to al...@googlegroups.com


On Thursday, March 6, 2014 4:16:41 AM UTC, jpbaril wrote:

That was the problem. You created a RAID over an existing fs.
RAID needs some disk space for managing itself, which means that it offers less space for the fs than its own partition's space.

What should I have done instead?


Elaborating a bit more (sorry, I have been a teacher for over 22 years, can't avoid the lecture)

From the RAID online help:
A RAID device, as all other devices, can't be used directly, you have to create a filesystem on it, see Filesystem Maintenance.
 
And from the Filesystems online help:
In order to access and manipulate disk data as files and folders, a device needs to have a filesystem on it.
A device can be a disk partition, a RAID array, a LVM device, etc, and a filesystem can be ext2/3/4, NTFS, FAT, etc.

So, the canonical way to proceed is to first create a RAID device and afterwards create a filesystem on it. Order matters!
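A minimal command-line sketch of that order, assuming two empty RAID-type partitions sda2 and sdb2 (the webUI RAID and Filesystem pages do the equivalent):

mdadm --create /dev/md0 --run --level=1 --metadata=0.9 --raid-devices=2 /dev/sda2 /dev/sdb2   # RAID first
mke2fs -t ext4 /dev/md0   # then the filesystem, on md0, never on sda2/sdb2 directly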

There is a catch: creating a filesystem on a device destroys all existing device data, so if you want to preserve existing device data you can't create a filesystem on the new device. The above-mentioned wiki describes how to create a RAID while preserving its data. But it was only tested on filesystems smaller than 2TB, using RAID metadata version 0.9.

 
 The way to correct it is the same as depicted in the previous posts, broadly: destroy the RAID, shrink the fs in its sda2 (or sdb2) partition, create the RAID, enlarge the fs on the RAID.

I tried to shrink sda2 (3 TB drive), but I got this error message:

"df: /dev/sda2: can't find mount point HTTP/1.1 303 Content-Type: text/html; charset=UTF-8 Location: /cgi-bin/diskmaint.cgi"

ah, a bug? You got that error when doing what? On the "Filesystem maintenance" page itself or after selecting the "Shrink" operation? I have examined the code and couldn't see how that could happen.

And for that to happen, did…

Nevertheless: Have you destroyed the RAID first? Was sda2 mounted? There is also a wiki about this, "How to convert RAID1 to a "standard" filesystem without losing your data".
 
From your first post in this topic you are using RAID metadata 0.9, so you should have no problem following the wikis.

What you can't do is have the RAID components mounted as filesystems while still being part of an existing RAID.
What I advise you to do is to make sure that your data is safe and start from a fresh beginning.

What *I* would do:
-stop all services
-stop the RAID
-destroy the RAID
-the above is following the convert RAID to "normal" wiki
-a reboot here. The RAID shouldn't show up, verify data exists on both sda2 and sdb2
-remove one of the disks, to keep my data safe,
-follow the convert from standard to RAID wiki


I'm on RC3.
I don't know if it shrunk anyway or not.

When shrinking finishes and the fs is mounted, it should show almost no free space, and its capacity will be lower than initially.
But it shouldn't be a component of an existing RAID!

Please always report *all* your steps, order matters!
I know that you have tried to shrink, but I don't know whether you have destroyed the RAID and verified that the data is available on both disks.

Going back to the filesystem page, the partition is now "checking".

In any case, you have to check the RAID metadata, the disk partitions (for GPT you must use fdisk and not sfdisk)

How do I do that?

What do you mean by "that"? RAID metadata or disk partitions?
You already did that (verify disk partitions) in your last post in the  "Partitioner results shown as html code" topic.


Thanks again!

jpbaril

Mar 9, 2014, 5:44:20 PM
to

On Thursday, March 6, 2014 12:51:17 PM UTC-5, João Cardoso wrote:


On Thursday, March 6, 2014 4:16:41 AM UTC, jpbaril wrote:

That was the problem. You created a RAID over an existing fs.
RAID needs some disk space for managing itself, which means that it offers less space for the fs than its own partition's space.

What should I have done instead?


In those instructions, I read:

'-Change the "standard" partition type to RAID (2) using Disk->Partitioner'

Well, I partitioned my drive with RAID type from the beginning.
 

Elaborating a bit more (sorry, I have been a teacher for over 22 years, can't avoid the lecture)

From the RAID online help:
A RAID device, as all other devices, can't be used directly, you have to create a filesystem on it, see Filesystem Maintenance.
 
And from the Filesystems online help:
In order to access and manipulate disk data as files and folders, a device needs to have a filesystem on it.
A device can be a disk partition, a RAID array, a LVM device, etc, and a filesystem can be ext2/3/4, NTFS, FAT, etc.

So, the canonical way to proceed is to first create a RAID device and afterwards create a filesystem on it. Order matters!

Oh, I think I know what I did.

When I partitioned my drive, I took good care to choose RAID type instead of Linux or LVM.
But then I created an ext4 FS on each drive BEFORE adding them to the RAID array.
I think I should have added both non-formatted partitions to the RAID1 array, and THEN formatted md0 as ext4. Right?
 
 I tried to shrink sda2 (3 TB drive), but I got this error message:

"df: /dev/sda2: can't find mount point HTTP/1.1 303 Content-Type: text/html; charset=UTF-8 Location: /cgi-bin/diskmaint.cgi"

ah, a bug? You got that error when doing what? On the "Filesystem maintenance" page itself or after selecting the "Shrink" operation? I have examined the code and couldn't see how that could happen.

After selecting Shrink I was brought to a blank page (in the main page frame) only showing that message.

And for that to happen, did…

Nevertheless: Have you destroyed the RAID first? Was sda2 mounted? There is also a wiki about this, "How to convert RAID1 to a "standard" filesystem without losing your data".
 
From your first post in this topic you are using RAID metadata 0.9, so you should have no problem following the wikis.

What you can't do is have the RAID components mounted as filesystems while still being part of an existing RAID.
What I advise you to do is to make sure that your data is safe and start from a fresh beginning.

What *I* would do:
-stop all services
-stop the RAID
-destroy the RAID
-the above is following the convert RAID to "normal" wiki
-a reboot here. The RAID shouldn't show up, verify data exists on both sda2 and sdb2
-remove one of the disks, to keep my data safe,
-follow the convert from standard to RAID wiki

Tell me if it's ok, but here is what I'm trying right now (sorry if it's something stupid, because I'm not totally sure I understand everything you are trying to explain to me).

1. I repartitioned my 2TB drive (sdb), as everything is already on the 2TB partition (sda2) of the 3TB drive and because, anyway, sdb2 was not showing any files as it should have.
2. I created a RAID1 array with only one component: a non-formatted sdb2.
3. I created an ext4 FS on md0.
4. I'm right now copying all the files from sda2 to md0.
5. I will erase the sda2 partition with the partitioner tool, add that now non-formatted component to md0 and let it resync. Or I will maybe try your instructions at "How to convert a "standard" filesystem to RAID1 keeping all your data" to see if I can spare myself the time (and NAS cpu load / drive spins) of re-copying all the files from sdb to sda in the resyncing process.
 
Going back to the filesystem page, the partition is now "checking".

In any case, you have to check the RAID metadata, the disk partitions (for GPT you must use fdisk and not sfdisk)

How do I do that?

What do you mean by "that"? RAID metadata or disk partitions?
You already did that (verify disk partitions) in your last post in the  "Partitioner results shown as html code" topic.

I was referring to checking RAID metadata.

João Cardoso

Mar 10, 2014, 12:58:27 PM
to al...@googlegroups.com
(...)

From the RAID online help:
A RAID device, as all other devices, can't be used directly, you have to create a filesystem on it, see Filesystem Maintenance.
 
And from the Filesystems online help:
In order to access and manipulate disk data as files and folders, a device needs to have a filesystem on it.
A device can be a disk partition, a RAID array, a LVM device, etc, and a filesystem can be ext2/3/4, NTFS, FAT, etc.

So, the canonical way to proceed is to first create a RAID device and afterwards create a filesystem on it. Order matters!
Oh, I think I know what I did.

When I partitioned my drive, I took good care to choose RAID type instead of Linux or LVM.
But then I created an ext4 FS on each drive BEFORE adding them to the RAID array.
I think I should have added both non-formatted partitions to the RAID1 array, and THEN formatted md0 as ext4. Right?

Right.

Imagine what would happen if you put different data on sda2 and sdb2 (you could do it, as there are fs on them) and then created the RAID with sda2 and sdb2 -- what data would be in the RAID? The one from sda2? From sdb2?
 
 
 I tried to shrink sda2 (3 TB drive), but I got this error message:

"df: /dev/sda2: can't find mount point HTTP/1.1 303 Content-Type: text/html; charset=UTF-8 Location: /cgi-bin/diskmaint.cgi"

ah, a bug? You got that error when doing what? On the "Filesystem maintenance" page itself or after selecting the "Shrink" operation? I have examined the code and couldn't see how that could happen.
After selecting Shrink I was brought to a blank page (in the main page frame) only showing that message.

I can't reproduce that.
 
(...)

 
From your first post in this topic you are using RAID metadata 0.9, so you should have no problem following the wikis.

What you can't do is have the RAID components mounted as filesystems while still being part of an existing RAID.
What I advise you to do is to make sure that your data is safe and start from a fresh beginning.

What *I* would do:
-stop all services
-stop the RAID
-destroy the RAID
-the above is following the convert RAID to "normal" wiki
-a reboot here. The RAID shouldn't show up, verify data exists on both sda2 and sdb2
-remove one of the disks, to keep my data safe,
-follow the convert from standard to RAID wiki
Tell me if it's ok, but here is what I'm trying right now (sorry if it's something stupid, because I'm not totally sure I understand everything you are trying to explain to me).

You should start by:

0 - Destroy the RAID array. RAID info is stored on the disk (sda2/sdb2), and it can mess up a later setup, e.g. several RAIDs appearing, md0, md1... which can lead to confusion
 

1. I repartitioned my 2TB drive (sdb), as everything is already on the 2TB partition (sda2) of the 3TB drive and because, anyway, sdb2 was not showing any files as it should have.
2. I created a RAID1 array with only one component: a non-formatted sdb2.
3. I created an ext4 FS on md0.
4. I'm right now copying all the files from sda2 to md0.

That's OK, you are taking the safest approach. The only inconvenience is the extra time needed to copy all the data.
Before proceeding, verify that all the data is on the RAID and do a reboot, to be on the safe side.
 
5. I will erase the sda2 partition with the partitioner tool,

It must have the same size as sdb2, i.e., all RAID component partition sizes must be within 1% of each other
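A quick, read-only way to compare them, assuming the usual sysfs layout (sizes in 512-byte sectors):

cat /sys/block/sda/sda2/size /sys/block/sdb/sdb2/size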
 
add that now non-formatted component to md0 and let it resync.

Yes.

But first verify that sdb2 does not appear as mounted, and if it does, unmount it first.
You might think that re-partitioning a disk makes its data disappear, but that is not true. Re-partitioning an MBR disk only changes 512 bytes on it, and if the start and size of each partition are the same as before, a filesystem will survive re-partitioning (and Alt-F will fsck and auto-mount it)
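A quick, read-only way to check whether a filesystem signature survived the re-partitioning (assuming blkid is available on the box):

blkid /dev/sdb2   # prints the filesystem type and UUID if a signature is still there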
 
Or maybe I will try your instructions at "How to convert a "standard" filesystem to RAID1 keeping all your data" to see if I can spare myself the time (and NAS CPU load / drive spins) of re-copying all the files from sdb to sda in the resync process.

You already started, keep going as you depicted above.

João

jpbaril

unread,
Mar 10, 2014, 4:04:05 PM3/10/14
to al...@googlegroups.com


On Monday, March 10, 2014 12:58:27 UTC-4, João Cardoso wrote:
(...)

Oh, I think I know what I did.

When I partitioned my drive, I took good care to choose the RAID type instead of Linux or LVM.
But then I created an ext4 FS on each drive BEFORE adding them to the RAID array.
I think I should have added both unformatted partitions to the RAID1 array, and THEN formatted md0 as ext4. Right?

Right.

Imagine what would happen if you put different data on sda2 and sdb2 (you could do it, as there are filesystems on them) and then create the RAID from sda2 and sdb2 -- what data would be in the RAID? The data from sda2? From sdb2?

Well, I would have thought it would have merged both drives, keeping the most recently modified file in case the same file existed on both drives. But now I know that's not how it works. :)
 
Tell me if it's OK, but here is what I'm trying right now (sorry if it's something stupid, because I'm not totally sure I understand everything you are trying to explain to me).

You should start by:

0 - Destroy the RAID array. RAID info is stored on the disk (sda2/sdb2), and it can mess up later setups, e.g. several RAIDs appearing (md0, md1, ...), which can lead to confusion.

Yes, I had already done that during tests just before trying to solve my problem.

 
1. I repartitioned my 2 TB drive (sdb), as everything is already on the 2 TB partition (sda2) of the 3 TB drive, and because sdb2 was not showing any files as it should have anyway.
2. I created a RAID1 array with only one component: an unformatted sdb2.
3. I created an ext4 FS on md0.
4. I'm now copying all the files from sda2 to md0.

That's OK, you are taking the safest approach. The only inconvenience is the extra time needed to copy all the data.
Before proceeding, verify that all the data is on the RAID and do a reboot, to be on the safe side.

Ok, I will.
 
 
5. I will erase the sda2 partition with the partitioner tool,

It must have the same size as sdb2, i.e., all RAID component partition sizes must be within 1% of each other

Well, I partitioned both drives with a 2000 GB partition. I hope ALT-F created them equal in size.

 
add that now unformatted component to md0 and let it resync.

Yes.

But first verify that sdb2 does not appear as mounted, and if it does, unmount it first.
You might think that re-partitioning a disk makes its data disappear, but that is not true. Re-partitioning an MBR disk only changes 512 bytes on it, and if the start and size of each partition are the same as before, a filesystem will survive re-partitioning (and Alt-F will fsck and auto-mount it)

When I created my RAID1 array in my step #2, on the Filesystems page I saw a non-mounted, unformatted sdb2 partition. After adding it to the RAID array, I saw an md0 device, which I then formatted as ext4 (my step #3). So everything seems to be OK so far.

 
Or maybe I will try your instructions at "How to convert a "standard" filesystem to RAID1 keeping all your data" to see if I can spare myself the time (and NAS CPU load / drive spins) of re-copying all the files from sdb to sda in the resync process.

You already started, keep going as you depicted above.

I'm still at #4, so could I try not deleting sda2 and integrating it into md0? I think not, as the files would already be on md0 (and as said at the top of this reply).
So yes, I will probably end up removing the FS from sda2, adding it to md0 and letting it resync.

João Cardoso

unread,
Mar 12, 2014, 9:35:21 AM3/12/14
to al...@googlegroups.com


On Monday, March 10, 2014 8:04:05 PM UTC, jpbaril wrote:


On Monday, March 10, 2014 12:58:27 UTC-4, João Cardoso wrote:
(...)


Well, I would have thought it would have merged both drives, keeping the most recently modified file in case the same file existed on both drives. But now I know that's not how it works. :)

No, it does not work like that. RAID doesn't even know what a file is! Its only job is to keep the disk sectors synchronized and redundant, independently of what is there, be it FAT, NTFS, ext2/3/4, crypto data...
 
 
(...)

I hope that by now you have it done.
This topic has drifted from an fsck error to a standard-to-RAID1 conversion issue, which by itself is already covered in other topics.

Joao
 

jpbaril

unread,
Mar 12, 2014, 8:03:03 PM3/12/14
to al...@googlegroups.com
On Monday, March 10, 2014 12:58:27 UTC-4, João Cardoso wrote:
You might think that re-partitioning a disk makes its data disappear, but that is not true. Re-partitioning an MBR disk only changes 512 bytes on it, and if the start and size of each partition are the same as before, a filesystem will survive re-partitioning (and Alt-F will fsck and auto-mount it)

That's what is happening.

After copying all the data from sda2 to md0, I wanted to remove the FS on sda2 so I could then add it to the RAID array.
I even tried to change the partition type of sda2 to empty and then back to RAID, but it kept its ext4 FS.

How can I remove the FS on sda2 without messing with the structure of the whole drive? (I want to keep the data on another partition: sda3)

Thanks

João Cardoso

unread,
Mar 13, 2014, 11:37:48 AM3/13/14
to al...@googlegroups.com


On Thursday, March 13, 2014 12:03:03 AM UTC, jpbaril wrote:
On Monday, March 10, 2014 12:58:27 UTC-4, João Cardoso wrote:
You might think that re-partitioning a disk makes its data disappear, but that is not true. Re-partitioning an MBR disk only changes 512 bytes on it, and if the start and size of each partition are the same as before, a filesystem will survive re-partitioning (and Alt-F will fsck and auto-mount it)

That's what is happening.

In the previous paragraph of the above quoted text I wrote:

But first verify that sdb2 does not appear as mounted, and if it does, unmount it first.

So,  "unmount it first": Disk->Filesystems->FS Operations, unmount
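(The ssh equivalent is just an unmount by device name -- a sketch, using sda2, the partition actually involved in the next message:)

grep sda2 /proc/mounts    # check whether it is mounted
umount /dev/sda2          # unmount it before adding it to the array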

jpbaril

unread,
Mar 13, 2014, 7:58:17 PM3/13/14
to


On Thursday, March 13, 2014 11:37:48 UTC-4, João Cardoso wrote:


(...)

In the previous paragraph of the above quoted text I wrote:

But first verify that sdb2 does not appear as mounted, and if it does, unmount it first.

So,  "unmount it first": Disk->Filesystems->FS Operations, unmount

I tried to umount sda2 (which had an ext4 FS). Then, on the partitioner page, I just unchecked the checkbox of sda2 so as not to keep it, without making any changes. That did not work; it remounted sda2 with ext4.

What I did was reduce the swap partition ("sda1") by 0.01 GB and increase the sda2 partition by the same amount, just to change the start/size of each partition as you explained. That worked! sda2 now showed as not formatted!
I then reduced sda2 back to its previous size and was finally able to add it to the RAID array.

I was not able to bring the swap partition back to its previous size; I got some error messages.

I hope I did not mess with sda3 by doing all that, because on the Status page I got:
  • Unable to automatically fix sda3, mounting Read Only: fsck 1.41.14 (22-Dec-2010)
    /dev/sda3 is mounted.  e2fsck: Cannot continue, aborting.
  • Checking sda3 finished with status code 8: fsck 1.41.14 (22-Dec-2010)
    /dev/sda3: Error allocating block bitmap (1): Memory allocation failed
    e2fsck: aborted
After some time I was still able to mount sda3.

João Cardoso

unread,
Mar 14, 2014, 2:09:36 PM3/14/14
to al...@googlegroups.com


On Thursday, March 13, 2014 11:57:01 PM UTC, jpbaril wrote:


(...)
I tried to umount sda2 (which had an ext4 FS). Then, on the partitioner page,

You shouldn't need to repartition it again; you can add a partition to an existing RAID even if the partition has a fs on it. When the RAID recovery starts, the contents of the existing RAID will be "copied" to the new partition, even if a fs exists on it (and its contents will be lost).
This can only be done if the fs is not mounted; that's why I told you to "unmount before adding it to the array".

It is different to create a RAID and to assemble a RAID.
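(A minimal sketch of that distinction, with the device names used in this thread -- illustrative commands only:)

# --create builds a brand-new array and writes fresh RAID metadata to its members
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb2 missing
# --assemble re-activates an array that already exists, reading the metadata already on its members
mdadm --assemble /dev/md0 /dev/sda2 /dev/sdb2
# --add puts a new member into an existing, running array (what was wanted here)
mdadm --add /dev/md0 /dev/sda2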

I just unchecked the checkbox of sda2 so as not to keep it, without making any changes. That did not work; it remounted sda2 with ext4.

Yes, as I said, a fs is not necessarily destroyed when re-partitioning a disk or changing the partition type; when the partitioner finishes its work the partitions will reappear, and if a fs is recognized on them they will be mounted. Again, I said "unmount before adding it to the array".
 

What I did was reduce the swap partition ("sda1") by 0.01 GB and increase the sda2 partition by the same amount, just to change the start/size of each partition as you explained.
That worked! sda2 now showed as not formatted!

Yes, that works, but it was not needed. And playing with partitioning while wanting to preserve existing disk data is... well, risky.
 
I then reduced sda2 back to its previous size and was finally able to add it to the RAID array.

I was not able to bring the swap partition back to its previous size; I got some error messages.

The swap was probably destroyed during your manipulation.
Search the forum for how to diagnose and correct this -- keywords: mkswap, swapon (a minimal sketch follows below).
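(Something along these lines -- a sketch, assuming sda1 is still the swap-type partition it was before:)

mkswap /dev/sda1     # recreate the swap signature
swapon /dev/sda1     # enable it
cat /proc/swaps      # verify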
 

I hope I did not mess with sda3 by doing all that, because on the Status page I got:

It depends on whether you *always* kept the sda3 Keep checkbox checked. As the partitioner online help says:

Keep will disable changes to the checked partition. (...). Un-checking it, making any change and re-checking it again does not guarantee that the resulting partition will be valid, and the original partition might not be preserved -- if you don't want to change a partition, never uncheck it.
 
  • Unable to automatically fix sda3, mounting Read Only: fsck 1.41.14 (22-Dec-2010)
    /dev/sda3 is mounted.  e2fsck: Cannot continue, aborting.
  • Checking sda3 finished with status code 8: fsck 1.41.14 (22-Dec-2010)
    /dev/sda3: Error allocating block bitmap (1): Memory allocation failed
    e2fsck: aborted

This message is a symptom that no swap is available, i.e., fsck does not have enough memory (physical + swap = virtual) to work.

After some time I was still able to mount sda3.

I don't understand: were you or were you not able to mount sda3 in read-write mode?

Bottom line:
I avoid giving recipes, as they only work in a very specific situation, which I don't fully know. I prefer people to understand what they are doing, so they can do it again under different circumstances. That's why the online help and wikis always have a short "theoretical" explanation.
What did you read on the wiki or online help that made you take the wrong track? How could they be improved?


jpbaril

unread,
Mar 16, 2014, 3:21:34 AM3/16/14
to al...@googlegroups.com


On Friday, March 14, 2014 14:09:36 UTC-4, João Cardoso wrote:

You shouldn't need to repartition it again; you can add a partition to an existing RAID even if the partition has a fs on it. When the RAID recovery starts, the contents of the existing RAID will be "copied" to the new partition, even if a fs exists on it (and its contents will be lost).
This can only be done if the fs is not mounted; that's why I told you to "unmount before adding it to the array".

Oh, I thought it meant to unmount it before trying to partition.
 
It is different to create a RAID and to assemble a RAID.

Well, at least now I know. I just hope I will remember.

  • Unable to automatically fix sda3, mounting Read Only: fsck 1.41.14 (22-Dec-2010)
    /dev/sda3 is mounted.  e2fsck: Cannot continue, aborting.
  • Checking sda3 finished with status code 8: fsck 1.41.14 (22-Dec-2010)
    /dev/sda3: Error allocating block bitmap (1): Memory allocation failed
    e2fsck: aborted

This message is a symptom that no swap is available, i.e., fsck does not have enough memory (physical + swap = virtual) to work.

After some time I was still able to mount sda3.

I don't understand: were you or were you not able to mount sda3 in read-write mode?

For a few minutes sda3 did not want to fsck or mount, but then it mounted.
 
What did you read on the wiki or online help that made you take the wrong track? How could they be improved?

I did not read anything at first because in the past I had once done something similar. Well, that's what I thought. The problem is, it was not exactly the same thing, and even then, I don't think I remembered exactly the appropriate steps I had followed.

The previous time, I just added a new drive to an existing degraded array. This time, I had to delete the array, remove one of the old drives, move the data to the new drive, erase the partitions on the remaining old drive, repartition that drive, and move the data back from the new drive to the old drive as part of a RAID array.

The mistake I made was to create a FS on the new drive instead of just assembling a RAID array with it.

The tricky thing to know is: after creating a RAID-type partition you have to add it to a RAID array directly, without creating a FS on the partition, and only then create the FS on md0.

Now, everything seems okay, all my data is there and RAID array seems to work.

The only thing I cannot understand is that the amber LEDs sometimes blink both at the same time, and sometimes they blink alternately. It looks like what you describe in the wiki under the "for developers" bullet, but I don't understand what it means, and it was not doing that before, so I don't know what could have changed.

Anyway, I wanted to express my gratitude for all your help (not just in this thread, but also in others over the last years) and for the great work you have done for the DNS-323. If there is some way I can donate to you, please tell me and I will be glad to give you that modest gift.

Thank you very much.