Raid 5 USB drive keeps crashing


Dominic Russell

Feb 27, 2013, 12:01:05 AM
to al...@googlegroups.com
Hello,

I have two DNS-323 units that I'm trying to set up in a RAID 5 configuration. On both, the USB drive keeps disconnecting, or disappearing from the list of drives. Because of this, the recovery has been running for weeks now; it restarts after every reboot.

Any suggestions? I tried swapping the USB drives between the DNS-323 units, but got the same results. I tried two different kinds of external SATA USB drives.
On one unit, the USB drive lies on top of the DNS box, and on the other it lies beneath. Could that have an effect?

Joao Cardoso

Feb 27, 2013, 9:56:13 AM


On Wednesday, February 27, 2013 5:01:05 AM UTC, Dominic Russell wrote:
Hello,

I have two DNS-323 units that I'm trying to set up in a RAID 5 configuration. On both, the USB drive keeps disconnecting, or disappearing from the list of drives. Because of this, the recovery has been running for weeks now; it restarts after every reboot.

We need more RAID5 reports, both successes and failures, as I have only tested it under development conditions, not real-life usage (and with tiny 80GB disks).

If you are seeing USB errors (System->Utilities->View Logs, kernel log), that is a bad sign, and I wouldn't use the USB drive under those circumstances (especially for a RAID5 setup, where data movement, especially during recovery, is heavy).

Any suggestions? I tried swapping the USB drives between the DNS-323 units, but got the same results. I tried two different kinds of external SATA USB drives.

Do the USB errors occur with different USB drives? Without USB hubs?
Again, my USB disk usage was mainly for development, not real usage. We need more reports on this, both successes and failures.
My tests were performed using desktop 3-1/2" drives with two kinds of sata/ata/USB adapters.

What is your typical application usage? Mainly file serving?
Or are you using system stressing services, such as couchpotato/sickbeard/sabnzbget/owncloud?
 
On one unit, the USB drive lies on top of the DNS box, and on the other it lies beneath. Could that have an effect?

I don't think so.

Has the setup worked for a while and only now started showing problems, or has it had problems since the very beginning?

[Added: have you set up RAID5 using the webUI or the command line? Does it have the write-intent bitmap active? This is only relevant to the long recovery time, not to the USB errors issue]
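
On the write-intent bitmap question: with a bitmap, a member that drops out briefly and is re-added only needs the regions written in the meantime resynced, instead of the whole disk. Below is a minimal sketch for checking and enabling it from the command line. The array name /dev/md0 and the helper names are assumptions for illustration; check /proc/mdstat for the real array name, and note that mdadm may refuse to change the bitmap while a recovery is running.

```shell
#!/bin/sh
# Sketch only: helper names are made up, /dev/md0 is an assumed array name.

# Succeeds if mdstat-format text on stdin contains a "bitmap:" line.
# With several arrays this matches a bitmap on ANY of them.
has_bitmap() {
    grep -q 'bitmap:'
}

# Print (rather than run) the mdadm command that adds an internal bitmap,
# so it can be reviewed before being executed as root.
bitmap_enable_cmd() {
    echo "mdadm --grow --bitmap=internal $1"
}
```

Typical use would be `has_bitmap < /proc/mdstat || bitmap_enable_cmd /dev/md0`, then running the printed command by hand once the array is idle.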


Joao Cardoso

Mar 6, 2013, 2:59:23 PM
to al...@googlegroups.com
[On behalf of Dominic Russell]

Hello,

If you are seeing USB errors (System->Utilities->View Logs, kernel log), that is a bad sign, and I wouldn't use the USB drive under those circumstances (especially for a RAID5 setup, where data movement, especially during recovery, is heavy).
There is no error in the log. The USB drive just disconnects, and the kernel reconnects to it under a new sdX name. During the first recovery this often happens before the recovery finishes, so I reboot the DNS and let it start again. I have an array of three 1.5TB drives, and it takes over two days to recover. This week it finally completed, but two days later the USB drive disconnected again. I rebooted, but the RAID partition was broken: the three partitions showed green, but the only option available was destroy. So I destroyed and recreated it. It sees the filesystem on it, but it restarted the recovery, and since then the USB drive has disconnected every day, so the RAID 5 is still unprotected.

Do the USB errors occur with different USB drives? Without USB hubs?
I tried two external USB cases. One is a single closed black unit, and the other is a multi-connector USB unit with a built-in hub. With both, it disconnects regularly; I did not notice that one was better than the other. Could it be the USB cable not sustaining that load? I use another multi-port unit of the same kind on my laptop for Windows 7 backups, and it always stays connected, so my conclusion is that the unit itself is not at fault, but Windows 7 backup does not seem to stress the external USB drive as much.

What is your typical application usage? Mainly file serving?
For one, I would like to use it as a file server, but I am not using it yet, since I'm waiting to see whether it can be stabilized before putting my company files on it; otherwise I am too worried about losing the files. In fact, I tried using it for one week, and it crashed badly and lost all the files on it; the filesystem could not be repaired. It has three 1TB drives.
The other one I use as an iSCSI target server for nightly backups of two servers. It has three 1.5TB drives.

Has the setup worked for a while and only now started showing problems, or has it had problems since the very beginning?
Those issues have been happening since day 1.

Have you set up RAID5 using the webUI or the command line? Does it have the write-intent bitmap active?
Yes, I used the WebUI to build the RAID partitions. I don't know what a write-intent bitmap is, so it must be using the default setting.

Joao Cardoso

Mar 8, 2013, 1:57:28 PM
to al...@googlegroups.com
Given your description, I wouldn't put my data on that setup.

Has anybody had similar RAID5 experiences? Reports of good experiences are also useful.
If RAID5 proves to be an issue, I will remove it from the Disk Wizard, but I need real usage feedback.

I have set up a small (150GB) RAID5 array and I'm currently filling it with data; I will set up a script to exercise continuous data movement in the array for a couple of days, and I will report the results.

From your report (the sdc USB device going missing and later reappearing as sdd), that should appear in the Kernel or System log, and it indicates a USB problem.
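
To look for those events from a shell rather than the webUI, a filter along these lines could help. It is only a sketch: the function name is made up, and the patterns are based on message wording commonly printed by Linux kernels ("USB disconnect", "new ... USB device", "Attached SCSI disk"), which can differ between kernel versions.

```shell
#!/bin/sh
# Sketch: keep only kernel-log lines that look like USB drops, USB
# re-enumerations, or a disk being attached under a (possibly new) sdX name.
# usb_events is a hypothetical helper name; it reads log text on stdin.
usb_events() {
    grep -iE 'usb disconnect|new .*usb device|attached scsi disk'
}
```

For example, `dmesg | usb_events` shows whether each disappearance is preceded by a disconnect and followed by the disk being attached under a different name.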

Does the USB unit put the drive into standby/spindown? That could explain the problem if the wakeup time from standby is longer than the RAID5 timeout; in that case the RAID software will mark the disk as failed.

I was not able to disable disk spindown on my USB disks, so the only way to avoid that is to create a cron script that keeps the USB disk awake.
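
A keep-awake script of that kind could look roughly like the sketch below. It reads one block from a changing offset on the raw device, so that the read is unlikely to be served from cache. The script name, the function names and the device path are all assumptions, and the device letter has to be checked first, since it changes after a reconnect.

```shell
#!/bin/sh
# keepawake.sh (hypothetical name): read one 4 KiB block from the given
# device at an offset that changes with the clock.
# Usage: keepawake.sh DEVICE    e.g. /dev/sdc (assumed; verify it first)

# Block offset in 0..999999, derived from the current epoch time.
# At 4 KiB per block this stays within the first ~4 GB of the disk.
pick_offset() {
    echo $(( $(date +%s) % 1000000 ))
}

# Read a single block from device $1 and discard it.
keep_awake() {
    dd if="$1" of=/dev/null bs=4096 count=1 skip="$(pick_offset)" 2>/dev/null
}

# Only act when a device was given on the command line.
if [ $# -gt 0 ]; then
    keep_awake "$1"
fi
```

A crontab entry such as `*/5 * * * * /path/to/keepawake.sh /dev/sdc` (path and interval are placeholders) would run it every five minutes; the interval needs to be shorter than the enclosure's idle timer.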

Joao Cardoso

Mar 11, 2013, 10:32:22 AM
to al...@googlegroups.com


On Friday, March 8, 2013 6:57:28 PM UTC, Joao Cardoso wrote:
Given your description, I wouldn't put my data on that setup.

Has anybody had similar RAID5 experiences? Reports of good experiences are also useful.
If RAID5 proves to be an issue, I will remove it from the Disk Wizard, but I need real usage feedback.

I have set up a small (150GB) RAID5 array and I'm currently filling it with data; I will set up a script to exercise continuous data movement in the array for a couple of days, and I will report the results.

After several days and several TB of uninterrupted RAID5 data creation/copy/deletion, no USB issue was detected. I'm using a rev-B1 board.

Not being able to reproduce your issue, I'm again confident in RAID5 when using an external USB drive.

-Try using a better or shorter USB cable.
-For diagnostic purposes, disable the USB disk spindown, either directly if possible or by regularly changing data on the RAID.

Dominic Russell

Sep 10, 2013, 3:39:31 PM
to al...@googlegroups.com
Hello,

Since I've put the two NAS units on a UPS, they have been very stable; the USB drive almost never disconnects! Maybe the USB connection on those D-Link NAS units is very sensitive...

Thanks,
Dominic Russell
MSI Bureautique inc.
--
You received this message because you are subscribed to a topic in the Google Groups "Alt-F" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/alt-f/2Whc_3_p5FU/unsubscribe?hl=en.
To unsubscribe from this group and all its topics, send an email to alt-f+un...@googlegroups.com.
To post to this group, send email to al...@googlegroups.com.
Visit this group at http://groups.google.com/group/alt-f?hl=en.
To view this discussion on the web visit https://groups.google.com/d/msg/alt-f/-/DfnyA4vma1EJ.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

dicky...@gmail.com

Jan 27, 2015, 5:47:35 AM
to al...@googlegroups.com
FYI, I have the same problem as this on my 320L rev B running version

I can't spin down the drive. When I try, I get an error:
[root@dns320l-4TB]# hdparm -B  /dev/sde
hdparm: invalid number '/dev/sde'
[root@dns320l-4TB]# hdparm -B 255  /dev/sde

/dev/sde:
 setting APM level to disabled 0xFF (255)

So I will try a shorter cable first.
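
A note on the hdparm calls above: `-B` sets the Advanced Power Management level and expects a number, which is why the first call complains. The standby (spindown) timer is a separate setting, `-S`, and `-C` reports the current power state. The sketch below wraps these with a dry-run switch; the helper names and the `DRY_RUN` variable are made up for illustration, and many USB bridges ignore or block these commands, so this may have no effect on a given caddy.

```shell
#!/bin/sh
# Sketch: helper names and DRY_RUN are illustrative assumptions.

# With DRY_RUN set, print the command instead of running it.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# Ask the drive not to spin down by itself.
disable_spindown() {
    run hdparm -B 255 "$1"   # disable Advanced Power Management
    run hdparm -S 0 "$1"     # disable the standby (spindown) timeout
}

# Report whether the drive is currently active/idle or in standby.
power_state() {
    run hdparm -C "$1"
}
```

Running `DRY_RUN=1 disable_spindown /dev/sde` prints the two hdparm commands without touching the disk.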

I have tried 3 USB caddies. The failure has happened both during a rebuild and a day or so after a rebuild completed, and the drives were constantly copying data through rsync when it happened.

Dicky

dicky...@gmail.com

Jan 27, 2015, 3:15:48 PM
Latest update.

Changed the cable for a brand-new Compaq-supplied cable and this didn't help; it still failed during the RAID 5 rebuild. What I noticed in the logs was that the USB disk would simply disconnect, and when I unplugged and replugged it, it showed up as the next disk: before it was /dev/sdc, now it's /dev/sde, and when it goes wrong again it's /dev/sdf (as in the uploaded RTF doc).

I also ran mdadm /dev/md0 --remove failed to remove the failed drive from the md0 config:

 Number   Major   Minor   RaidDevice State
       0       8       18        0      active sync   /dev/sdb2
       1       8       34        1      active sync   /dev/sdc2
       2       0        0        2      removed

       3       8       81        -      faulty spare
[root@dns320l-4TB]# mdadm /dev/md0 --remove failed
mdadm: hot removed 8:81 from /dev/md0

Then I added the USB disk again and the rebuild started.
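
For anyone following along, the remove-and-add sequence can be sketched as below. The post does not show the exact add command, so this is an assumption: it tries `--re-add` first (which can use a write-intent bitmap, if one exists, to avoid a full rebuild) and falls back to a plain `--add`. The helper names, `DRY_RUN` and the `/dev/sdX1` partition are placeholders.

```shell
#!/bin/sh
# Sketch: helper names, DRY_RUN and the example device are assumptions.

# With DRY_RUN set, print the command instead of running it.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# $1 = md array, $2 = member partition on the reconnected USB disk.
readd_member() {
    md=$1
    part=$2
    run mdadm "$md" --remove failed          # drop members marked faulty
    run mdadm "$md" --re-add "$part" ||      # quick resync when possible
        run mdadm "$md" --add "$part"        # otherwise a full rebuild
}
```

`DRY_RUN=1 readd_member /dev/md0 /dev/sdX1` prints the commands so the device name can be double-checked before running them for real.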

I have gone back to a previously used SATA dock and changed the cable on that too, but I don't have a really short one.

The dock that gave me the most issues worked flawlessly when attached to a hub and a laptop as my server.

I have uploaded the log files and pictures of the issues, if needed!
usb issues.zip

Dominic Russell

Jan 27, 2015, 4:19:44 PM
to al...@googlegroups.com
Hello,
 
Since I installed the NAS and the external USB drive on a UPS, it is not crashing anymore... Whether the USB was affected by variations in the power supply, I cannot say, but since it is now stable, I kept the configuration as is :).
 
Best regards,
Dominic
 

João Cardoso

Jan 28, 2015, 10:39:58 AM
to al...@googlegroups.com


On Tuesday, January 27, 2015 at 8:15:48 PM UTC, dicky...@gmail.com wrote:
Latest update.

Changed the cable for a brand-new Compaq-supplied cable and this didn't help; it still failed during the RAID 5 rebuild.

A word of caution: it is a well-known fact that RAID5 heavily stresses the disks during a rebuild, and if another disk fails during that step all your data will be lost.
 
What I noticed in the logs was that the USB disk would simply disconnect, and when I unplugged and replugged it, it showed up as the next disk

Yes, the old device name will not be reused, as there are still references to it inside the kernel. Only after a clean "eject" will the same name be reused. I hope you are not using swap on the external USB disk.
 
  - i.e. before it was /dev/sdc, now it's /dev/sde, and when it goes wrong again it's /dev/sdf (as in the uploaded RTF doc).

I also ran mdadm /dev/md0 --remove failed to remove the failed drive from the md0 config -

 Number   Major   Minor   RaidDevice State
       0       8       18        0      active sync   /dev/sdb2
       1       8       34        1      active sync   /dev/sdc2
       2       0        0        2      removed

       3       8       81        -      faulty spare
[root@dns320l-4TB]# mdadm /dev/md0 --remove failed
mdadm: hot removed 8:81 from /dev/md0

then added the usb disk again and the rebuild starts.

I have gone back to a previously used SATA dock and changed the cable on that too, but I don't have a really short one.

The dock that gave me the most issues worked flawlessly when attached to a hub and a laptop as my server.

On a RAID 5 setup? At least the whole 3x disk capacity is read/rewritten during a rebuild. Your flawless everyday car wouldn't survive the 24 Hours of Le Mans race ;-(

A single bit error in a 2x6x10^12 bit transfer (~10^13) is enough. And disks typically have bit error rates of 1 in 10^15, which means that the probability of a single bit error is about 0.1 to 1%. That's the reason why RAID5 is now considered obsolete by some and not used with modern high-capacity disks...
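
The arithmetic behind that estimate can be checked with a short sketch. It assumes independent bit errors, so the chance of at least one error in N bits at error rate BER is approximately 1 - exp(-N x BER). The function name is made up, and the inputs are the figures quoted above taken at face value. If the 6x10^12 is really a capacity in bytes, N is 8 times larger, and many consumer disks are specified at 1 in 10^14 rather than 10^15, so the real risk may be noticeably higher.

```shell
#!/bin/sh
# Sketch: probability of at least one bit error in a transfer,
# assuming independent errors (hypothetical helper name).
# $1 = number of bits transferred, $2 = per-bit error rate
bit_error_prob() {
    LC_ALL=C awk -v n="$1" -v ber="$2" \
        'BEGIN { printf "%.4f\n", 1 - exp(-n * ber) }'
}
```

`bit_error_prob 1.2e13 1e-15` prints 0.0119, i.e. about 1.2%, in line with the upper end of the estimate above; with a rate of 1e-14 the same transfer gives 0.1131, about 11%.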

 

I have uploaded the log files and issues pictures if needed!

Dicky

dicky...@gmail.com

Jan 29, 2015, 10:33:13 AM
to al...@googlegroups.com
Hi Joao,

I wanted RAID 5 because I need around 3TB of space: I have about 2.5TB of videos, music and photos and didn't want to have 2 NASes. But I may have to concede and have 2 NASes set up as JBOD with your firmware and rsync between the 2! That way I can keep 1 in the house and 1 in the garage, and all my eggs are not in 1 basket. It would also perform much faster with no USB, but speed is not critical, only storage space!

The DNS320L is only 40 UK pounds, but a 4-bay NAS with no disks is over 100 pounds, so part of the reason for using RAID 5 was cost, especially as I already had 2 x 2TB disks!

Regarding Swap, I have not hit the "enable swap on usb drives" button and there is none in use at the moment, but, yes I believe swap is across all 3 drives.....

[root@dns320l-4TB]# cat /proc/swaps
Filename                                Type            Size    Used    Priority
/dev/sda1\040(deleted)                  partition       524284  0       1
/dev/sdb1                               partition       524284  0       1
/dev/sdc1                               partition       524284  0       1


João Cardoso

Jan 29, 2015, 11:24:29 AM
to al...@googlegroups.com


On Thursday, January 29, 2015 at 3:33:13 PM UTC, dicky...@gmail.com wrote:
Hi Joao,

I wanted RAID 5 because I need around 3TB of space: I have about 2.5TB of videos, music and photos and didn't want to have 2 NASes. But I may have to concede and have 2 NASes set up as JBOD

No data security!
 
with your firmware and rsync between the 2! That way I can keep 1 in the house and 1 in the garage and all my eggs are not in 1 basket - also it would perform much faster with no USB but speed is not critical, only storage space!

What USB disk are you using? 2.5" or 3.5"? Does it have its own power supply or is it a 2.5" disk powered from the box? From Dominic's experience that might be important.


The DNS320L is only 40 UK pounds, but a 4-bay NAS with no disks is over 100 pounds, so part of the reason for using RAID 5 was cost, especially as I already had 2 x 2TB disks!

Regarding Swap, I have not hit the "enable swap on usb drives" button and there is none in use at the moment, but, yes I believe swap is across all 3 drives.....

It looks like swap is being used in all drives. An Alt-F bug? 
 

[root@dns320l-4TB]# cat /proc/swaps
Filename                                Type            Size    Used    Priority
/dev/sda1\040(deleted)                  partition       524284  0       1

Is that the USB device?
That might be one of the reasons why the USB disk takes a new device name -- sda1 is still in use by the kernel, so the sda name will not be used again.
Even if you use the "swapoff" command, the "deleted" entry will not disappear.
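
To spot such leftover entries quickly, something like the sketch below can be run against /proc/swaps. It relies on the "\040(deleted)" suffix seen in the listing above; the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the device names of swap entries whose backing device
# has disappeared, i.e. those shown with a "\040(deleted)" suffix.
# stale_swaps is a hypothetical helper; it reads /proc/swaps text on stdin.
stale_swaps() {
    awk 'NR > 1 && $1 ~ /\(deleted\)$/ {
        sub(/\\040\(deleted\)$/, "", $1)   # strip the suffix
        print $1
    }'
}
```

`stale_swaps < /proc/swaps` would print /dev/sda1 for the listing above. Running `swapoff` on the USB partition before unplugging the disk avoids creating such entries in the first place.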

dicky...@gmail.com

Jan 31, 2015, 7:19:34 AM
to al...@googlegroups.com
Hi.

In answer to your questions below -

1 What USB disk are you using? 2.5" or 3.5"? Does it have its own power supply or is it a 2.5" disk powered from the box? From Dominic's experience that might be important.
answer: All 3 I tried were 3.5" external caddies all with their own power supplies plugged into the mains.

2 It looks like swap is being used in all drives. An Alt-F bug?
answer: I don't know. Maybe you could partition just the internal disks for swap, and use the 2nd partition for RAID?

3 [root@dns320l-4TB]# cat /proc/swaps

Filename                                Type            Size    Used    Priority
/dev/sda1\040(deleted)                  partition       524284  0       1

Is that the USB device?
That might be one of the reasons why the USB disk takes a new device name -- sda1 is still in use by the kernel, so the sda name will not be used again.
Even if you use the "swapoff" command, the "deleted" entry will not disappear.
answer: Actually, yes, at one point the USB disk was 'sda' and the internals were sdc and sdb, in that order.

I have now blown away the RAID 5 as I can't trust it and have a JBOD setup with 1 partition on each disk. I plan to replicate to my second 320L in the future (I am clearing the house as I am moving soon!). Yes, it's not secure, but disk space is what I need, and replicating the data between the house and garage will give me 2 copies.

Will have to look into your backup feature at some point!

Cheers,

dicky

