DNS-320-Ax RAID failure recovery


Michael Brice

Aug 10, 2022, 6:41:37 AM
to Alt-F

The box is currently running Alt-F 1.0 with kernel 4.4.86, and is flashed with "Alt-F-1.0, initrd" and kernel "Alt-F-1.0, kernel 4.4.86"

 

I had a power outage and I believe one of the two drives in my DNS-320 box failed, or at least the system believes it to be so. I believe the sda (left) disk failed and the sdb (right) disk is OK. However when I remove the left disk and connect it to my Windows desktop and use a data recovery tool to inspect it, all data is visible and a surface check shows no errors.

 

I am trying to recover but things appear to be screwy. The Status page is showing three mounted filesystems (see screenshot StatusPage.jpg)

 

However, the filesystem Maintenance screen is showing something different (see screenshot FilessystemMaintenancePage.jpg)

The Disk Partitioner screen shows equivalent configurations for both left (sda) and right (sdb) disks (see screenshots DiskPartitionerLeftPage.jpg and DiskPartitionerRightPage.jpg).

And the RAID maintenance screen shows the following (see screenshot RaidMaintenancePage.jpg).

 

It seems to me that there is total confusion with a variety of device names. The content is no longer accessible from my Windows desktop, however before I started to ‘fix things’ I managed to copy the content to a spare drive on my Windows desktop.

 

Whilst I would like to understand what is going on, I really don’t care, I just want to get a RAID configuration back again. The only devices RAID maintenance sees are sda1, sda2, sdb2.

 

I am at a loss to know where to go from here, so my question is: what are my options?

FilessystemMaintenancePage.jpg
RaidMaintenancePage.jpg
DiskPartitionerRightPage.jpg
DiskPartitionerLeftPage.jpg
StatusPage.jpg

Joao Cardoso

Aug 10, 2022, 7:46:31 PM
to Alt-F
On Wednesday, August 10, 2022 at 11:41:37 AM UTC+1 mikebr...@gmail.com wrote:

> The box is currently running Alt-F 1.0 with kernel 4.4.86, and is flashed with "Alt-F-1.0, initrd" and kernel "Alt-F-1.0, kernel 4.4.86"
>
> I had a power outage and I believe one of the two drives in my DNS-320 box failed, or at least the system believes it to be so. I believe the sda (left) disk failed and the sdb (right) disk is OK. However when I remove the left disk and connect it to my Windows desktop and use a data recovery tool to inspect it, all data is visible and a surface check shows no errors.
>
> I am trying to recover but things appear to be screwy. The Status page is showing three mounted filesystems (see screenshot StatusPage.jpg)
>
> However, the filesystem Maintenance screen is showing something different (see screenshot FilessystemMaintenancePage.jpg)

It's saying the same thing. On the Filesystem Maintenance page, the Mtd. column means "Mounted", and md1, sda4 and sdb4 appear as mounted. sdb2 appears as having an ext3 filesystem that is not mounted (no '*').
sdb2 was part of the md1 RAID1 array, and is not being added to it for some unknown reason.
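
For anyone comfortable with a shell, the same degraded state can be read from /proc/mdstat, where a RAID1 array with a missing member shows a status such as [U_] instead of [UU]. The following is only a minimal sketch and is not taken from Alt-F itself; the function name degraded_arrays is made up for illustration.

```shell
# Print the name of every md array whose member-status field (for example
# [U_]) contains an underscore, i.e. has a missing member.
# Reads /proc/mdstat-style text on standard input.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/        { name = $1 }    # remember which array this block describes
        /\[[U_]*_[U_]*\]/    { print name }   # status such as [U_] or [_U]
    '
}

# On the box itself (assumption: logged in as root over ssh or telnet):
#   degraded_arrays < /proc/mdstat
```

An empty result means no array reports a missing member.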

> The Disk Partitioner screen shows equivalent configurations for both left (sda) and right (sdb) disks (see screenshots DiskPartitionerLeftPage.jpg and DiskPartitionerRightPage.jpg).

You still have the original D-Link partition layout. That's OK. 

> And the RAID maintenance screen shows the following (see screenshot RaidMaintenancePage.jpg).

The RAID page only lets you manipulate partitions that are of type RAID (or already assembled RAID devices), but the D-Link format tool marked sda2/sdb2 as being of type "linux".
To be able to add sdb2 to the md1 array on the RAID page you need to change its type to RAID. On the sdb partition page, uncheck the "Keep" checkbox on the sdb2 line and change its type to RAID without changing anything else. Before hitting the Partition button, verify that the sdb2 start/length/size numbers are the same as before.

After doing the above, sdb2 should appear as a Component on the RAID page and you should be able to add it (Operation) to the md1 array.
To prevent the same from happening to sda2, you can change its partition type to RAID as well (after having success with the above).
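
For those who prefer a shell to reading the numbers off the web page, one way to confirm that the start and size did not move is to dump the partition table with sfdisk -d before and after the change and compare only those two fields. This is a sketch under the assumption that the sfdisk on the box supports -d and prints lines of the form "/dev/sdb2 : start= ..., size= ..., Id=83"; layout_only and same_layout are illustrative names, not existing tools.

```shell
# Reduce an `sfdisk -d` dump to "partition start size" lines, dropping the
# type field so that a type change alone does not count as a difference.
layout_only() {
    sed -n 's|^\(/dev/[^ ]*\) *: *start= *\([0-9]*\), *size= *\([0-9]*\).*|\1 \2 \3|p'
}

# Succeed only if both dump files describe the same start and size values.
same_layout() {
    [ "$(layout_only < "$1")" = "$(layout_only < "$2")" ]
}

# Usage on the box (assumption: logged in as root):
#   sfdisk -d /dev/sdb > /tmp/before.txt
#   ... change the partition type ...
#   sfdisk -d /dev/sdb > /tmp/after.txt
#   same_layout /tmp/before.txt /tmp/after.txt && echo "start and size unchanged"
```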

Why sdb2 lost the ability to be automatically added to its md1 array is unknown to me.

What I find odd in your description is that you can't see the md1 data under Windows. md1 is mounted and its data should be available and readable (verify that the Samba page still has the shares correctly defined).

Michael Brice

Aug 11, 2022, 10:19:54 AM
to Alt-F
First, thanks for the guidance, it is much appreciated.

I tried to do as you suggested, but after pressing the Partition button I was presented with an error message box (see attached screenshot).

Is it possible to simply start again with the RAID definition, or is that a can of worms best left alone? What would be the consequence of performing an Erase operation on the sdb disk?

DiskPartitionFailure.jpg

Joao Cardoso

Aug 11, 2022, 8:20:48 PM
to Alt-F
On Thursday, August 11, 2022 at 3:19:54 PM UTC+1 mikebr...@gmail.com wrote:
> First, thanks for the guidance, it is much appreciated.
>
> I tried to do as you suggested, but after pressing the Partition button I was presented with an error message box (see attached screenshot).

Ah, the partition layout inherited from D-Link does not align partitions "correctly", so Alt-F complains and refuses to write it (misaligned partitions hurt write performance on 4K-sector disks).

It is possible to use the command line to change a disk partition type from "linux" to "RAID", but I will only have access to my boxes late next week to confirm the exact command. So, given that you have a backup, I suggest you use the procedure below.

> Is it possible to simply start again with the RAID definition, or is that a can of worms best left alone? What would be the consequence of performing an Erase operation on the sdb disk?

As you have already made a backup, the best option is to use the Disk Wizard, selecting both disks, with RAID1 as the desired disk layout and ext4 as the desired filesystem. This will reformat both disks and destroy *all* data on them (but not the flashed Alt-F firmware). Note that the RAID1 *and* the 500MB sda4 *and* 500MB sdb4 filesystems will be destroyed.
You can afterwards restore the data from the backup over the network to the newly created RAID.

Jeremy Laidman

Aug 12, 2022, 1:04:50 AM
to al...@googlegroups.com
Joao Cardoso said:

> It is possible to use the command line to change a disk partition type from "linux" to "RAID", but only late next week I will have access to my boxes to confirm the exact command. So, I suggest you to use the procedure below, given that you have a backup.

It might be worth trying to change the partition type first, if you're comfortable doing so.

You can change the partition type from command-line using one of several tools. My system has fdisk (actually a link to busybox, which doesn't have a complete implementation), sfdisk, sgdisk and gdisk. I can't remember which of these came with my NAS and which I installed as packages. My sfdisk doesn't seem to like my disk's geometry, so I avoid that.

If you have sgdisk installed, use it like so:

First make a backup and check that the backup file looks OK:

# sgdisk --backup sdb.backup /dev/sdb
# file sdb.backup
sdb.backup: DOS/MBR boot sector; partition 1 : ID=0xee, start-CHS (0x0,0,1), end-CHS (0x3ff,254,63), startsector 1, 1953525167 sectors, extended partition table (last)

Then list the partitions and find the one that needs changing (sdb2 in your case?):

# sgdisk --print /dev/sdb
Disk /dev/sdb: 1953525168 sectors, 931.5 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 3BF9CB11-98A3-4C5E-A0F7-1A9B4D29F5F0
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1953525134
Partitions will be aligned on 2048-sector boundaries
Total free space is 3437 sectors (1.7 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048         1050623   512.0 MiB   8200  Linux swap
   2         1050624      1953523711   931.0 GiB   FD00


Note that in my case, sdb2 (partition 2) is already set to FD00 (Linux RAID).
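
If you would rather check the type code in a script than by eye, the Code column can be pulled out of the listing with awk. This is a rough sketch that assumes the column layout shown above (Number, Start, End, a two-word Size, then Code); part_typecode is a made-up helper name, not part of sgdisk.

```shell
# Print the type code of one partition from `sgdisk --print` output read on
# standard input. $1 is the partition number.
part_typecode() {
    awk -v n="$1" '
        # Partition rows start with the number followed by two sector counts;
        # the size takes two fields ("931.0 GiB"), so the code is field 6.
        $1 == n && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { print $6 }
    '
}

# Usage (assumption: sgdisk is installed and /dev/sdb is the disk in question):
#   sgdisk --print /dev/sdb | part_typecode 2
```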

You can change the type code like so - here I'm changing the code for partition 1 (my swap partition) from 8200 to 8301 (Linux reserved) - the format is --typecode=partnum:newcode

# sgdisk --typecode=1:8301 /dev/sdb
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot or after you
run partprobe(8) or kpartx(8)
The operation has completed successfully.


and check that it has changed:

# sgdisk --print /dev/sdb
Disk /dev/sdb: 1953525168 sectors, 931.5 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 3BF9CB11-98A3-4C5E-A0F7-1A9B4D29F5F0
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1953525134
Partitions will be aligned on 2048-sector boundaries
Total free space is 3437 sectors (1.7 MiB)


Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048         1050623   512.0 MiB   8301  Linux reserved
   2         1050624      1953523711   931.0 GiB   FD00


In my case, I need to set it back to swap (8200) again with:

# sgdisk --typecode=1:8200 /dev/sdb

An alternative is to use the interactive "gdisk":

# gdisk /dev/sdb

When run, it loads the partition table into memory. Use "p" to print the in-memory partition table, "t" to change type code (it will prompt you for the partition number and type code), "p" again to print the modified partition table, and if it's all good, "w" to write it to disk.

J


--
You received this message because you are subscribed to the Google Groups "Alt-F" group.
To unsubscribe from this group and stop receiving emails from it, send an email to alt-f+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/alt-f/48590610-fe99-4f29-abf8-e09d227ca3e8n%40googlegroups.com.

Joao Cardoso

Aug 12, 2022, 9:48:59 PM
to Alt-F
I don't yet have access to my boxes, but I found a post that addresses the same issue: https://groups.google.com/g/alt-f/c/ed0bBd1fJDU/m/cC7yDJlOco0J

So, to change the sdb2 type to RAID, ssh or telnet into the box, log in as the 'root' user (same password as the webUI), and type the following commands (assuming that "sdb" is still the disk with issues! Disks sometimes change their names after a reboot):

sfdisk -c /dev/sdb 2 da # change disk sdb partition 2 type to RAID (DA is the mdadm author's suggestion for RAID)
sfdisk -R /dev/sdb # inform kernel of the change

sdb2 should now appear as a Component on the RAID page, and you can add it to the degraded md1 array. A full resync will follow, copying everything from sda2 to sdb2.
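
If the web page is not convenient, adding the partition back can also be done with mdadm from the same root shell. A hedged sketch: it assumes mdadm is available on the box and that the array and partition are still named /dev/md1 and /dev/sdb2. The names run, readd_member and DRY_RUN are invented here, and by default the function only prints what it would run.

```shell
# Run a command, or just print it when DRY_RUN is 1 (the default).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Add a partition back into a degraded array, then show /proc/mdstat,
# where the resync progress appears.
readd_member() {
    array=$1
    part=$2
    run mdadm --manage "$array" --add "$part"
    run cat /proc/mdstat
}

# Preview:  readd_member /dev/md1 /dev/sdb2
# Execute:  DRY_RUN=0 readd_member /dev/md1 /dev/sdb2
```

Previewing first makes it easy to confirm the device names before anything is written, which matters because disk names can change after a reboot.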

Michael Brice

Aug 17, 2022, 8:29:40 AM
to al...@googlegroups.com
Thank you for the advice and help. I'm a Windows person and the last thing I want is to have to learn how to access the box at the Linux level and attempt 'repairs' using Linux command level stuff.

I have reinitialised the disks and recreated the RAID. It was a little more complicated than I had hoped, with Samba configuration and related permission issues to sort out. The restore took forever, but it is an old and small box, so I guess that is to be expected. But I am back 'online'.

I still have one issue, no Alt-F packages. When I attempt to install them there is another error message, but that will be a story for another discussion. 
