Two Amber LEDS with RAID1


Evan Venn

Dec 19, 2013, 4:22:25 AM
to al...@googlegroups.com
I have two amber LEDS with a RAID1/EXT2 config.

The RAID is degraded.  I know this bit. :-)

The NAS is operational, but how do I re-sync the two drives?  I have been through all the menu options and I cannot figure out how to remove sdb2 from the array so that I can re-join it to sda2.  Both drives are 3 TB.

Thank you.

 

João Cardoso

Dec 19, 2013, 2:43:30 PM


On Thursday, December 19, 2013 9:22:25 AM UTC, Evan Venn wrote:
I have two amber LEDS with a RAID1/EXT2 config.

The RAID is degraded.  I know this bit. :-)

The NAS is operational but how do I re-sync the two drives?

In general, when a RAID component fails, the whole disk needs to be replaced.
This means that if that disk also holds data in other filesystems that are not part of the RAID, you need to back up that non-RAID data first.



Summary: Rebuilding a RAID1 means taking the following steps:

A-Identify the failed RAID component and disk
B-remove the failed component from the RAID
C-remove the failed disk from the box
D-insert a new disk with identical or bigger capacity than the failed one
E-create on the new disk a partition of type RAID with the same capacity as the existing RAID
F-add that new RAID partition as a new  RAID component
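
For readers who prefer the command line, steps B and F of the summary can be sketched as a short command plan. This is only a sketch under assumptions: the thread itself uses the Alt-F web pages, and the array name /dev/md0 and the component /dev/sdb2 are taken from the example in this thread, not from your box. The hypothetical helper `rebuild_plan` only builds the `mdadm` command lines (manage-mode `--fail`, `--remove`, `--add`); it does not run anything, and steps C to E (swapping and partitioning the disk) still have to happen between removing and adding.

```python
def rebuild_plan(array, failed, new):
    """Return the mdadm command lines for steps B and F as argument lists.

    Nothing is executed here. The caller decides whether and when to run
    each command, and must replace and partition the disk (steps C to E)
    between the "remove" and the "add".
    """
    return [
        # Mark the component as faulty so that it can be removed.
        ["mdadm", array, "--fail", failed],
        # Step B: remove the failed component from the array.
        ["mdadm", array, "--remove", failed],
        # Step F: add the RAID partition of the new disk to the array.
        ["mdadm", array, "--add", new],
    ]


if __name__ == "__main__":
    # Names below follow the thread's example and are assumptions.
    for cmd in rebuild_plan("/dev/md0", "/dev/sdb2", "/dev/sdb2"):
        print(" ".join(cmd))
```

Returning argument lists instead of running the commands keeps the sketch safe to try on any machine.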


 
  I have been through all the menu options and I cannot figure out how to remove sdb2 from the array so that I can re-join it to sda2.  Both drives are 3 TB.

If the drives are 3TB that means that you have used Alt-F to build the RAID, right? If the RAID was created under D-Link firmware, the procedure is different.

If the RAID was built under Alt-F:

A-Identify the failed RAID component and disk
A.1-go to Disk->RAID, to the "RAID Maintenance" section.
A.2-There, under "Components", you should see two disk partitions, e.g., sda2 and sdb2, one of them in red: that is the failed RAID component (from now on assumed to be sdb2).
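
The same information that the RAID page shows in red can usually be read from /proc/mdstat, where the Linux md driver marks a failed component with "(F)". The following is a minimal sketch, assuming shell access to the box; the sample text is hypothetical (it was not posted in this thread) and only mimics the usual layout of that file, and `failed_components` is an illustrative helper name.

```python
import re

# Hypothetical sample, shaped like typical /proc/mdstat output for a
# two-disk RAID1 whose sdb2 component has failed.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb2[1](F) sda2[0]
      2929740112 blocks super 1.2 [2/1] [U_]
unused devices: <none>
"""


def failed_components(mdstat_text):
    """Map each active md array name to its components flagged with (F)."""
    result = {}
    for line in mdstat_text.splitlines():
        # Array lines look like: "md0 : active raid1 sdb2[1](F) sda2[0]"
        m = re.match(r"^(md\d+) : \S+ \S+ (.*)$", line)
        if not m:
            continue
        name, members = m.groups()
        # Each member is "<partition>[<slot>]" followed by optional flags.
        comps = re.findall(r"(\w+)\[\d+\]((?:\(\w\))*)", members)
        result[name] = [dev for dev, flags in comps if "(F)" in flags]
    return result


if __name__ == "__main__":
    print(failed_components(SAMPLE_MDSTAT))
```

On a real box you would pass the contents of /proc/mdstat instead of the sample string.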

B-remove the failed component from the RAID
B-Under "Component Operations",
B.1-under "Partitions" select the failed component, sdb2 in the example;
B.2-then under "Operation" select "Remove"

The RAID should continue in the degraded state, but no component will be red.

A RAID component is a disk partition, so the failed component's name tells you which disk needs to be replaced: drop the trailing number from the component name and you get the disk device name. E.g., if sdb2 is the failed component, it belongs to the sdb disk, and the sdb disk is the one that needs to be replaced.
In the Status page, under the Disks section, you can see which box slot/bay (left or right) each disk is in.
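
The naming rule above (drop the trailing number) can be written down as a tiny function. This is just an illustration; `disk_of` is a made-up name, and it assumes the sdX naming used throughout this thread.

```python
import re


def disk_of(component):
    """Return the disk device name for a partition name such as 'sdb2'."""
    m = re.fullmatch(r"(sd[a-z]+)(\d+)", component)
    if m is None:
        raise ValueError(f"not an sdXN partition name: {component!r}")
    return m.group(1)


if __name__ == "__main__":
    print(disk_of("sdb2"))
```

Rejecting names that do not end in a number avoids silently treating a whole disk as if it were a partition.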

C-remove the failed disk from the box
You can now:
C.1-either power off the box and remove the failed disk,
C.2-or go to Disk->Utilities and hit the "Eject" button in the line corresponding to the failed disk:
C.2.1-after the confirmation popup, remove the disk without powering off;
C.2.2-if you get an error popup instead, you have to power off the box before removing the disk.

D-insert a new disk with identical or bigger capacity than the failed one
D-Now you need to plug in a new disk to replace the failed one.
After plugging in the new disk, go to the Status and RAID pages and write down on paper which disk is in which bay/slot, as disks might have changed names!
Assuming that the disks have not changed names, and following the example above, sdb will be the new disk and sda the old (good) disk.

The new disk can have the same or a bigger capacity than the failed disk, but *not* a smaller one.
If the new disk is bigger than the failed one, only part of it will be used to rebuild the RAID, and you can later create a "standard" filesystem in the unused part and use it.

E-create on the new disk a partition of type RAID with the same capacity as the existing RAID
E-In any case, when both disks are plugged in, go to Disk->Partitioner
E.1-in the "Select the disk you want to partition" section check the old good disk, sda in the example, and
E.2-under "Partition Table", under "CopyTo", select the new disk, sdb in the example. Read the popup message carefully, to be sure that you are not doing the wrong operation!!!
This copies the partition table of the good old disk to the new disk and is a fast operation. After it finishes, if you select either the sda or the sdb disk you should see similar output in the lower section; if the new disk is bigger, you will see some free space on it (that you can use later).
E.3-You don't need to hit the Partition button.

The disks now have similar partition tables and, most importantly, the new disk has a RAID partition of the same capacity and type as the RAID partition of the old disk. This is the key factor and is all that is needed to rebuild the degraded RAID array.
It can be accomplished in several ways; another is to fill in for the new disk, in the lower section of the Disk Partitioner, the same information as for the old disk and then hit the Partition button. But using the CopyTo option is simpler.

F-add that new RAID partition as a new  RAID component
F- You can now go to Disk->RAID and
F.1-under "Component Operations", under "Partition" select the new partition, sdb2 in the example,
F.2-then under "Operation" select "add".
The rebuild should start immediately and should take some 15 hours to complete; you can watch its progress in the Status page.
You can start using the RAID immediately. As a matter of fact, if you opted for ejecting the bad disk with power applied, the RAID was available for use the whole time, and that is what RAID means: 24/7 *availability*, *not* backup.
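
To get a feeling for where a figure like 15 hours comes from, rebuild time is roughly the array size divided by the average resync speed. The sketch below uses the md0 size that the kernel reports in the log posted later in this thread (3000053874688 bytes); the speed of about 55.6 MB/s is not stated anywhere in the thread, it is simply what 15 hours implies for that size, so treat it as an assumption. Real speed varies with the disks and with how busy the box is.

```python
def rebuild_hours(capacity_bytes, rate_bytes_per_s):
    """Estimate the resync time in hours for a given average speed."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return capacity_bytes / rate_bytes_per_s / 3600


if __name__ == "__main__":
    MD0_BYTES = 3000053874688   # md0 capacity from the kernel log in this thread
    ASSUMED_RATE = 55.6e6       # bytes per second, an assumed average
    print(f"{rebuild_hours(MD0_BYTES, ASSUMED_RATE):.1f} hours")
```

Halving the average speed doubles the estimate, which is why a busy NAS can take noticeably longer.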

Please read this twice, making sure you understand everything that is being done, and if any information does not seem to fit, please ask.

This can become one of the sections of the missing "Rebuilding a degraded RAID1 array" page, so please help us by proofreading it and suggesting improvements.

As the procedure is complex and requires manual intervention, there is no safe automatic way of doing it, although I have thought about it several times.

Hoping that there are no errors,
Joao




 

Evan Venn

Dec 22, 2013, 12:23:03 PM
I am making progress.  Drive is up and I am online again.  Thank you, so much.

I have scanned the log and I am still a little concerned.  I still have md0: unknown partition table (see below).

Dec 22 16:30:01 DNS-323 user.info kernel: md: bind
Dec 22 16:30:01 DNS-323 user.info kernel: md/raid1:md0: active with 2 out of 2 mirrors
Dec 22 16:30:01 DNS-323 user.info kernel: md0: bitmap initialized from disk: read 2/2 pages, set 0 bits
Dec 22 16:30:01 DNS-323 user.info kernel: created bitmap (22 pages) for device md0
Dec 22 16:30:01 DNS-323 user.info kernel: md0: detected capacity change from 0 to 3000053874688
Dec 22 16:30:01 DNS-323 user.info kernel:  md0: unknown partition table
Dec 22 16:30:01 DNS-323 user.info kernel: eth0: link up, 100 Mb/s, full duplex, flow control disabled
Dec 22 16:30:01 DNS-323 user.notice root: Starting urandom: OK.


Should I worry?

Most grateful for response.

João Cardoso

Dec 22, 2013, 1:01:44 PM
to al...@googlegroups.com


On Sunday, December 22, 2013 5:22:20 PM UTC, Evan Venn wrote:
I am making progress.  Drive is up and I am online again.  Thank you, so much.

I would appreciate any feedback on the instructions, which will become a wiki page of their own.
Does it apply fully? Have you followed it without difficulty? Any less clear wording? Something that could be better explained or skipped?
 

Should I worry?

No.
A RAID array is a device, just like a disk, and so it can be partitioned; the message just says that there is no partition table on it, which is OK.
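
The line in that log that does say something about the health of the array is the "active with 2 out of 2 mirrors" one. As a small illustration, the sketch below pulls the numbers out of that line, which is copied from the log posted above; `mirror_status` is a made-up helper name, and the pattern only covers this particular RAID1 message.

```python
import re

# Copied from the log posted in this thread.
LOG_LINE = ("Dec 22 16:30:01 DNS-323 user.info kernel: "
            "md/raid1:md0: active with 2 out of 2 mirrors")


def mirror_status(line):
    """Return (array, active, total, degraded), or None if no match."""
    m = re.search(
        r"md/raid1:(md\d+): active with (\d+) out of (\d+) mirrors", line)
    if m is None:
        return None
    active, total = int(m.group(2)), int(m.group(3))
    # The array is degraded when fewer mirrors are active than configured.
    return m.group(1), active, total, active < total


if __name__ == "__main__":
    print(mirror_status(LOG_LINE))
```

With both mirrors active the last value is False, i.e. the array is not degraded.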
 


Dillon Gilhooley

Oct 23, 2015, 12:15:29 PM
to Alt-F
These are really helpful instructions. I'm not through them yet. I have been trying to rebuild this Raid, but I am not able to re-add the second disk. I get this error:
mdadm: cannot open /dev/sda2: device or resource busy

Any ideas?

João Cardoso

Oct 23, 2015, 1:27:09 PM
to Alt-F


On Friday, 23 October 2015 17:15:29 UTC+1, Dillon Gilhooley wrote:
These are really helpful instructions. I'm not through them yet. I have been trying to rebuild this Raid,

You should give more context -- is sda a brand new disk? How is the sda disk partitioned? How did the RAID become degraded?... This is a complex matter that puts your data at risk, and not supplying enough info (OR LOGS!) might lead to the wrong answer.
 
but I am not able to re-add the second disk. I get this error:
mdadm: cannot open /dev/sda2: device or resource busy

Probably sda2 has a mounted filesystem on it; you have to unmount it using Disk->Filesystems, FS Operations, Unmount.
Remember that adding sda2 to the RAID will destroy all existing data on sda2.
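
If you have shell access, one way to check whether a mounted filesystem is what keeps the partition busy is to look for it in /proc/mounts, where each line starts with the device followed by its mount point. A minimal sketch follows; the sample text and the /mnt/sda2 mount point are hypothetical, and `mount_point` is an illustrative helper name. A partition can also be busy for other reasons, so an empty result does not prove that it is free.

```python
# Hypothetical sample in the usual /proc/mounts layout:
# device, mount point, filesystem type, options, dump, pass.
SAMPLE_MOUNTS = """\
rootfs / rootfs rw 0 0
/dev/sda2 /mnt/sda2 ext4 rw,relatime 0 0
/dev/md0 /mnt/md0 ext2 rw,relatime 0 0
"""


def mount_point(device, mounts_text):
    """Return where `device` is mounted, or None if it is not listed."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == device:
            return fields[1]
    return None


if __name__ == "__main__":
    print(mount_point("/dev/sda2", SAMPLE_MOUNTS))
```

On a real box you would pass the contents of /proc/mounts instead of the sample string.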

-You might get another error when adding, something like "already a member, ...spare, ...zero superblock"; in that case you have to clear any existing RAID info in sda2: Disk->RAID, Component Operations, select sda2 under "Partitions", then "Clear" under "Operation".

-It might happen that after clearing the RAID info from sda2 in the previous step, sda2 no longer appears in the list of partitions -- that's because the sda2 partition is not of RAID type. Use Disk->Partitioner to set the sda2 Type to RAID, and uncheck only the sda2 "Keep" checkbox before hitting Partition.

-It is possible that the Partitioner will refuse to work, with an "out of order" partition message. That's because you might be using a disk partitioned by the D-Link firmware; D-Link uses an odd (to be kind) partition scheme. If that is the case, repartition the whole disk (and lose all its data).

There are other possible issues, and unfortunately the Disk Wizard does not have an "Add disk to RAID" option.




Dillon Gilhooley

Oct 23, 2015, 4:13:28 PM
to Alt-F
Thank you so much. I took your steps in pieces. I didn't know that I needed to unmount, so I started with that. Instead of trying to do more, I simply tried re-adding the component drive to the Raid. It is currently in that operation -- "recovering". I will let you know what happens. I truly appreciate the quick reply as well as your help.