Problem partitioning 3TB drive for RAID 5 array


Al K

Mar 23, 2016, 11:28:15 AM
to Alt-F
Hi Joao,

I am trying to create a RAID5 system with three 3TB drives on a DNS-320 A1A2.

Per my prior experience with creating a RAID5 array, I started with two 3TB drives in the DNS320 and created a degraded RAID5.  When my external USB 3TB drive arrived, my plan was to simply partition it and add it to the degraded RAID5 array to complete it with a resync.

When I tried to partition the USB 3TB drive, I ran into a problem.  Please see the attached before and after screenshots of the Disk Partitioner.  Basically, I was unable to create a 3TB partition after the swap space.

I found that the problem I have encountered is very similar to (if not the same as) the one in this thread:

DNS-323, current Alt-F firmware, and 3TB drives - ScottK


I have attached details of my DNS320 drives, the system log (the date and time were not synced yet), and the kernel log, plus the Disk Partitioner screenshots.

There were a number of sgdisk commands listed in the prior thread. If my situation is the same, should I execute those sgdisk commands as well?

Thank you for your time.

Al

==========

Firmware version:  
Alt-F 0.1RC4.1

Hard drives:
left sda WDC WD30EZRX-19D8PB0 3.0TB
right sdb WDC WD30EZRX-19D8PB0 3.0TB
USB sdc My Passport 082A 3.0TB

==========


Partition left disk, 3.0TB, WDC WD30EZRX-19D8PB0
Using GPT partitioning.

Dev   Start sector  Length      Size (GB)  Type
sda1  64            585936      0.300      swap
sda2  586000        5859947128  3000.293   RAID

Free:
0.000

Partition right disk, 3.0TB, WDC WD30EZRX-19D8PB0
Using GPT partitioning.

Dev   Start sector  Length      Size (GB)  Type
sdb1  64            585936      0.300      swap
sdb2  586000        5859947128  3000.293   RAID


==========

Error in Partitioning USB drive???:

expr: syntax error
expr: syntax error
HTTP/1.0 200 OK
Content-Type: text/html; charset=UTF-8

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" 

<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="icon" type="image/png" href="../dns-323.png"/>
<style type="text/css">
html { height: 100%; }
body { height: 100%; font-family: arial,verdana; }
</style>
<link rel="stylesheet" type="text/css" href="/scripts/dull/dull.css">
<script type="text/javascript" src="/scripts/dull/dull.js"></script>
<title></title></head>
<body>
<h2 class=title>Disk Partitioner</h2>
<script type="text/javascript">
document.body.style.cursor = 'wait';
</script>
<p>Stopping disk sdc...
 done.</p>
<p>Partitioning disk sdc...
 done</p><p>Setting up partitions details...
 done</p><p>Reloading all disks...
 done</p>
<script type="text/javascript">
document.body.style.cursor = '';
</script>
<p><strong>Success</strong></p>
<script type="text/javascript">
function err() {
window.location.assign(document.referrer)
}
setTimeout("err()", 3000);
</script>
</body></html>
==========
3TB USB after Partition.png
3TB USB before Partition.png
Alt-F Status Page.png
KernelLog.log
SystemLog.log

João Cardoso

Mar 23, 2016, 4:16:19 PM
to Alt-F


On Wednesday, 23 March 2016 15:28:15 UTC, Al K wrote:
Hi Joao,

I am trying to create a RAID5 system with three 3TB drives on a DNS-320 A1A2.

Per my prior experience with creating a RAID5 array, I started with two 3TB drives in the DNS320 and created a degraded RAID5.

Do you remember how you created it? Using the Disk Wizard, the Disk Partitioner, or manually?

I *might* have the bug (the "expr: syntax error") fixed in my sources now, so I will not try to catch it.

You have a few options to solve that:

1-Copy the partition table on one of the 3TB internal disks to the new USB disk:
Disk->Partitioner, in the upper "Select the disk you want to partition" section, under "Partition Table", on the line for one of the internal drives select "CopyTo" and then select the USB disk device. Be sure you are copying to the right device: probably sdc, if sda and sdb are your internal disks.

This will only work if the total number of sectors of the USB disk is equal to or greater than that of the source drive. Don't be misled by the "3TB" label; you have to use the real number of sectors. That info appears in the "System Configuration Log", search for "Disks (GPT)". E.g.:

Disk /dev/sdb: 3907029168 sectors, 1.8 TiB

If the USB disk doesn't appear in the System Configuration log (it is generated at boot time, when the USB disk was not yet attached), recreate the log by hitting "StartNow" under Services->User->user.

If that works (and the best way to be sure is to replug the USB disk and recheck under the Disk Partitioner), you should be able to add sdc2 to the running RAID array.
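The size check for option 1 can be sketched as a simple comparison. This is only an illustration: `can_copy_table` is a made-up helper name, not an Alt-F command, and it assumes both sector counts are expressed in the same sector size.

```shell
# can_copy_table SRC_SECTORS DST_SECTORS
# Succeeds when the destination disk has at least as many sectors as the source.
# Hypothetical helper; both counts must use the same sector size.
can_copy_table() {
    [ "$2" -ge "$1" ]
}

# Example with the sector count from the log line quoted above.
if can_copy_table 3907029168 3907029168; then
    echo "destination is large enough"
else
    echo "destination is too small"
fi
```

The real sector counts would come from the "Disks (GPT)" section of the System Configuration log.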

2-Use the Disk Wizard to create a degraded RAID1 using *only* the USB disk (deselecting the other drives). For extra safety, you might want to power off the box, remove the internal disks, and use the wizard with only the USB disk.
I don't know if this will work, but the Disk Wizard is simpler than the Disk Partitioner, so it is less likely to have bugs (for greater than 2TB disks!)
This will create a partition of type RAID, which is what we want. The (big) downside is that it will also create a filesystem on it, which takes a long time, and you will have to destroy the created RAID1 before being able to add the partition to your existing RAID5 (using Disk->RAID, "RAID Operations", select Destroy).
So the next option is better and faster if you are not afraid of the command line.

3-Use the command line, with similar commands as in the thread you refer to, and assuming that sdc is the USB disk:

sgdisk --zap-all /dev/sdc # destroy the disk partition table, both MBR and GPT
sgdisk --set-alignment=8 --new=1:64:+512M --typecode=1:8200 /dev/sdc # 1st partition of 512MB, type swap
sgdisk --set-alignment=8 --new=2:0:0 --typecode=2:fd00 /dev/sdc # the rest of the disk, type RAID
sgdisk -p /dev/sdc # print partition table
sfdisk -R /dev/sdc # make the kernel reread the partition table

You don't have to create and activate a swap "filesystem" on the swap partition, the above command creates it just "because".
And you can ignore intermediate messages about "reboot"; at the end you might however want to replug the disk.
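How many sectors a size such as +512M turns into depends on the logical sector size of the disk. A rough sketch of that arithmetic, assuming the partition starts at sector 64 as in the commands above and a shell with 64-bit arithmetic (`end_sector` is a hypothetical helper name; sgdisk does this computation itself):

```shell
# end_sector START SIZE_MIB SECTOR_BYTES
# Last sector of a partition that begins at START and spans SIZE_MIB mebibytes.
# Hypothetical helper for illustration only.
end_sector() {
    sectors=$(( $2 * 1024 * 1024 / $3 ))   # partition length in sectors
    echo $(( $1 + sectors - 1 ))
}

end_sector 64 512 512     # 512 MiB on a disk with 512-byte logical sectors
end_sector 64 512 4096    # 512 MiB on a disk with 4096-byte logical sectors
```

The same size occupies eight times fewer sectors on a 4096-byte disk, so sector numbers from one disk cannot be compared directly with those from the other.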

Did it work? Which one of the three?

Al K

Mar 24, 2016, 4:48:44 AM
to Alt-F
Hi Joao,

Thanks for your concise response.  When I first created the degraded raid 5 with just the internal disks, I did it manually because I wanted to allocate only 300MB on each internal drive as swap.

As for the three possible solutions below, all have failed.  I have recorded as much detail as I can for each step.

Solution 1 - Copy Partition table from sda to sdc.  Failed.  Drive sizes are different, however, I did it just to see what would happen.  First I erased sdc partition table. See "Erase Partition Table sdc 2016-03-24_15-34-57"

Then I used the copy partition function and it indicated success when I ran it, see "Copy Partition Table sda to sdc 2016-03-24_15-36-00"  

However, when I went to Disk Partitioner again, it showed something completely different, see "After Copy Parittion Table sda to sdc 2016-03-24_15-36-54"  


Solution 2 - Use Disk Wizard on just the USB drive to create a degraded raid 1.  Failed.  See attached "Disk Wizard Fail 2016-03-24_16-03-01" image capture.  I checked Disk Partitioner after the wizard failure and captured the screen.  Please see "Disk Partitioner after Disk Wizard failure 2016-03-24_16-31-38"  Some actions were performed, but not exactly as expected.

Solution 3 - Commandline sgdisk.  Commands successful, see below.  However, when I go back to Disk Partitioner, it reads completely different.  See "Disk Partitioner vs sgdisk -p 2016-03-24_15-52-55"  
I executed the commands, replacing 512M with 200M for the size of the first partition.  When I compare what I see from Disk Partitioner and sgdisk -p output, the drive size is completely different.  I do not understand how or why.

[root@NAS3]# sgdisk --zap-all /dev/sdc
Creating new GPT entries.
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.

[root@NAS3]# sgdisk --set-alignment=8 --new=1:64:+200M --typecode=1:8200 /dev/sdc
Creating new GPT entries.
The operation has completed successfully.

[root@NAS3]# sgdisk --set-alignment=8 --new=2:0:0 --typecode=2:fd00 /dev/sdc
The operation has completed successfully.

[root@NAS3]# sgdisk -p

[root@NAS3]# sgdisk -p /dev/sdc
Disk /dev/sdc: 732558336 sectors, 2.7 TiB
Logical sector size: 4096 bytes
Disk identifier (GUID): 22464954-D485-4FD2-9F43-2C9DA8704069
Partition table holds up to 128 entries
First usable sector is 6, last usable sector is 732558330
Partitions will be aligned on 64-sector boundaries
Total free space is 58 sectors (232.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              64           51263   200.0 MiB   8200
   2           51264       732558330   2.7 TiB     FD00

[root@NAS3]# sfdisk -R /dev/sdc



After attempting all three solutions, I can see that Disk Partitioner reports 3000.559 GB available, but some other programs (maybe sgdisk?) do not see that same amount of space available.  When I attempt to manually create partition 1 with 500M swap in Disk Partitioner, it reports success, however, upon returning to disk partitioner, it reports only 375.07 swap created.  Next I attempted to create a 1000M raid partition 2, Disk Partitioner reported failure (see below).  

Stopping disk sda... done.

Partitioning disk sda... failed

Could not create partition 2 from 732558328 to 732558327
Could not change partition 2's type code to fd00!
Error encountered; not saving changes.
Error  


If there are other logs you need, let me know.  I currently have an abundance of free time to work out this issue.   Thanks! Al

================
After Copy Parittion Table sda to sdc 2016-03-24_15-36-54.png
Copy Partition Table sda to sdc 2016-03-24_15-36-00.png
Disk Partitioner Size vs sgdisk -p size difference 2016-03-24_15-58-45.png
Disk Partitioner vs sgdisk -p 2016-03-24_15-52-55.png
Disk Wizard Fail 2016-03-24_16-03-01.png
Erase Partition Table sdc 2016-03-24_15-34-57.png
sgdisk commands.txt
Disk Partitioner after Disk Wizard failure 2016-03-24_16-31-38.png

João Cardoso

Mar 24, 2016, 1:59:04 PM
to Alt-F


On Thursday, 24 March 2016 08:48:44 UTC, Al K wrote:
Hi Joao,

Thanks for your concise response.  When I first created the degraded raid 5 with just the internal disks, I did it manually because I wanted to allocate only 300MB on each internal drive as swap.

The amount of swap is critical: it compensates for the scarce RAM and allows fsck to check filesystems.
The exact amount can't be predicted, and it depends on the number of files and how deep the folder structure is. 
If it happens that you receive a fsck memory error message, you will have to increase the swap partition, which has its own risks, or temporarily add an external USB swap area.
This is particularly critical for the DNS-323, which has only 64MB of memory; the DNS-320/320L/325 have more RAM and are less prone to such problems.

300MB is just 0.01% of a 3TB disk, can't you spare a little bit more?
 

As for the three possible solutions below, all have failed.  I have recorded as much detail as I can for each step.

Just to be sure that you are not using some leftover from previous Alt-F versions, go to System->Utilities->Fixes and hit "RemoveAll".
 

Solution 1 - Copy Partition table from sda to sdc.  Failed.  Drive sizes are different, however, I did it just to see what would happen.  First I erased sdc partition table.

Not really needed; when you copy over, you destroy the previous content.
 
See "Erase Partition Table sdc 2016-03-24_15-34-57"

Then I used the copy partition function and it indicated success when I ran it, see "Copy Partition Table sda to sdc 2016-03-24_15-36-00"  

However, when I went to Disk Partitioner again, it showed something completely different, see "After Copy Parittion Table sda to sdc 2016-03-24_15-36-54"  

Actually it "kind of" succeeded: it created an MBR partition of type GPT spanning the whole disk (up to the 2TB MBR limit); that is called a "GPT compatible MBR" (also known as a protective MBR). But it either failed to activate the GPT partition table itself or the Partitioner failed to detect it.
That might be a bug. My tests didn't use a greater than 2TB disk.
 


Solution 2 - Use Disk Wizard on just the USB drive to create a degraded raid 1.  Failed.  See attached "Disk Wizard Fail 2016-03-24_16-03-01" image capture.  I checked Disk Partitioner after the wizard failure and captured the screen.  Please see "Disk Partitioner after Disk Wizard failure 2016-03-24_16-31-38"  Some actions were performed, but not exactly as expected.

That is a bug, see below.


Solution 3 - Commandline sgdisk.  Commands successful, see below.  However, when I go back to Disk Partitioner, it reads completely different.  See "Disk Partitioner vs sgdisk -p 2016-03-24_15-52-55"  
I executed the commands, replacing 512M with 200M for the size of the first partition.  When I compare what I see from Disk Partitioner and sgdisk -p output, the drive size is completely different.

Yes, but the number of sectors is the same.
The issue is that Alt-F always assumes that a sector has 512 bytes, but it seems that your drive reports a sector size of 4096 bytes, 8 times more.
So if you multiply the 375GB the webUI reports by 8, you get 3TB!

When 2TB disks first appeared, for compatibility with existing software they lied, saying that their sector size was 512 bytes when in reality it was 4KB; internally, the drives mapped every eight 512-byte sectors the user addressed onto one 4KB sector, and things worked "fine".
That is the reason for the 4K-sector alignment issues with 2TB disk drives a few years ago.
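The arithmetic can be checked directly from the sector count that sgdisk printed. A minimal sketch, assuming a shell with 64-bit arithmetic (`disk_bytes` is a made-up helper name for illustration, not part of Alt-F):

```shell
# disk_bytes SECTORS SECTOR_BYTES
# Capacity in bytes = number of logical sectors * logical sector size.
# Hypothetical helper for illustration only.
disk_bytes() {
    echo $(( $1 * $2 ))
}

sectors=732558336                       # from "sgdisk -p /dev/sdc"

real=$(disk_bytes "$sectors" 4096)      # using the sector size the drive reports
assumed=$(disk_bytes "$sectors" 512)    # using a hard-coded 512-byte sector

echo "4096-byte sectors: $real bytes"     # 3000558944256, about 3000.559 GB
echo " 512-byte sectors: $assumed bytes"  # 375069868032, about 375.07 GB
echo "ratio: $(( real / assumed ))"       # 8
```

Both figures match what was reported in this thread: 3000.559 GB for the whole disk and 375.07 for the size computed with 512-byte sectors.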
 
 I do not understand how or why.

Now you understand, see below
 

[root@NAS3]# sgdisk --zap-all /dev/sdc
Creating new GPT entries.
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.

[root@NAS3]# sgdisk --set-alignment=8 --new=1:64:+200M --typecode=1:8200 /dev/sdc
Creating new GPT entries.
The operation has completed successfully.

[root@NAS3]# sgdisk --set-alignment=8 --new=2:0:0 --typecode=2:fd00 /dev/sdc
The operation has completed successfully.

[root@NAS3]# sgdisk -p

[root@NAS3]# sgdisk -p /dev/sdc
Disk /dev/sdc: 732558336 sectors, 2.7 TiB
Logical sector size: 4096 bytes

ah!

What do your other 3TB disks report if you print their partition tables the same way?

sgdisk -p /dev/sda | grep Logical
sgdisk -p /dev/sdb | grep Logical

And for completeness, what is reported by the following commands:

cat /sys/block/sd*/queue/hw_sector_size
cat /sys/block/sd*/queue/physical_block_size
cat /sys/block/sd*/queue/logical_block_size

On all my 80GB/500GB/1TB/2TB disks "512" is always printed.
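Because the plain cat commands print the values without saying which disk they belong to, a small loop can label them. This is only a sketch: `show_sector_sizes` is a hypothetical helper name, and it assumes the usual Linux sysfs layout under /sys/block.

```shell
# show_sector_sizes [DIR]
# Print "disk logical physical" for every sd* disk under a sysfs block
# directory (defaults to /sys/block). Hypothetical helper, not an Alt-F command.
show_sector_sizes() {
    base=${1:-/sys/block}
    for q in "$base"/sd*/queue; do
        # Skip when the glob matched nothing or the file is missing.
        [ -r "$q/logical_block_size" ] || continue
        disk=${q%/queue}
        echo "${disk##*/} $(cat "$q/logical_block_size") $(cat "$q/physical_block_size")"
    done
}

show_sector_sizes
```

Each output line then names the disk, followed by its logical and physical sector sizes in bytes.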

And please attach the "System Configuration" log from System->Utilities, View Logs, with all three disks attached (go to the end of the page and download it to your computer). You can attach all of it (no confidential info there), although I'm mainly interested in the Disks MBR and Disks GPT sections, as well as the kernel log lines that report disks, such as 
  kernel: sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)

So, to conclude, 'sgdisk' did it right, the issue seems to be the Partitioner reporting the wrong size in GB.
Yet to see how to fix that (with your help, I hope)

 
Thanks!

Al K

Mar 25, 2016, 2:26:52 AM
to Alt-F
Hi Joao,

Regarding the size of the swap, my drive file structure is not very deep and I do not plan to run any other applications on the DNS320...it will only serve one video stream at any given time.  If I HAVE to increase the swap, I can, and I will do that as and when it is needed.

I went to System->Utilities->Fixes and hit "RemoveAll".  If I remember correctly, Alt-F 0.1RC4.1 is the only version I have ever installed on this NAS, so I believe the system is relatively clean.

Thank you for explaining the math with regards to the sector sizes.  It is clear now.  

See bottom of email for the output of the commands you have requested.  The internal 3TB drives are 512 logical/4096 physical whereas the external USB drive is 4096/4096.  

I have also attached the system configuration log output.

The NAS is available to run any sort of tests/commands (hopefully it does not impact the data on the internal drives...I have backup of the data, just takes a LONG time to copy over again).  I am happy to help out where I can with ALT-F so just let me know what to do and I'll provide the detailed output.

Thank you!
Al


sgdisk output below  
==================================
[root@NAS3]# sgdisk -p /dev/sda | grep Logical
Logical sector size: 512 bytes
[root@NAS3]# sgdisk -p /dev/sdb | grep Logical
Logical sector size: 512 bytes
[root@NAS3]# cat /sys/block/sd*/queue/hw_sector_size
512
512
4096
[root@NAS3]# cat /sys/block/sd*/queue/physical_block_size
4096
4096
4096
[root@NAS3]# cat /sys/block/sd*/queue/logical_block_size
512
512
4096
[root@NAS3]#
SystemConf.log

Al K

Mar 25, 2016, 10:17:05 AM
to Alt-F
Hi Joao,

I have some updates which I'd like to share with you.

I started reading about the 4k sector sizes and came across the WD Quick Formatter utility.  WD's description stated that the tool would reformat the drive to NTFS to work with Vista and above.  

So I used the tool, reformatted the drive to NTFS and then attached it to the NAS.  I did the following on the CLI:

[root@NAS3]# sgdisk -p /dev/sdc
Disk /dev/sdc: 5860466688 sectors, 2.7 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 57F79DE4-EBE8-11E5-9488-ACFDCE95C111
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 5860466654
Partitions will be aligned on 8-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              34          262177   128.0 MiB   0C01  Microsoft reserved part
   2          264192      5860466654   2.7 TiB     0700  Basic data partition
[root@NAS3]#
[root@NAS3]#
[root@NAS3]# cat /sys/block/sd*/queue/hw_sector_size
512
512
512
[root@NAS3]#
[root@NAS3]# cat /sys/block/sd*/queue/physical_block_size
4096
4096
4096
[root@NAS3]# cat /sys/block/sd*/queue/logical_block_size
512
512
512
[root@NAS3]#
[root@NAS3]#


I saw that the logical sector size had now been converted to 512 bytes and thought: good!  Now I can try to partition it for use with the existing RAID5 array and hopefully add it in.  Now back to the CLI:

  
[root@NAS3]#
[root@NAS3]# sgdisk --zap-all /dev/sdc
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.

[root@NAS3]# sgdisk --set-alignment=8 --new=1:64:+253M --typecode=1:8200 /dev/sdc
Creating new GPT entries.
The operation has completed successfully.

[root@NAS3]# sgdisk --set-alignment=8 --new=2:0:0 --typecode=2:fd00 /dev/sdc
The operation has completed successfully.

[root@NAS3]# sgdisk -p /dev/sdc # print partition table
Disk /dev/sdc: 5860466688 sectors, 2.7 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): CBD6F0E3-B557-4D07-A645-FACC0CCBDBFF
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 5860466654
Partitions will be aligned on 64-sector boundaries
Total free space is 30 sectors (15.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              64          518207   253.0 MiB   8200
   2          518208      5860466654   2.7 TiB     FD00

[root@NAS3]# sfdisk -R /dev/sdc
[root@NAS3]#

After doing the above, I could see in Disk Partitioner the proper sizes for the partitions made via sgdisk.

Next I went to DISK - RAID and added sdc2 to md0.  I think the system started to perform the resync, and then stopped.  I saw that the light on the USB drive had stopped blinking.  

I went to the CLI and tried:

[root@NAS3]# sfdisk -R /dev/sdc
/dev/sdc: No such file or directory

cannot open /dev/sdc for reading

So I downloaded all the logs via SYSTEM - UTILITIES and they are attached to this email.

I have two of these 3TB WD Passport Ultra drives (Drive A was reformatted with WD Quick Formatter which changed the 4k sectors to 512 byte sectors and Drive B is original state with 4k sectors). 

I am not sure how to proceed, as it appears I have TWO problems for my one requirement (partition and add 3TB to my RAID5 array).  The original problem with 4k sector size external 3TB drive and the new problem which I described in this email.

I am available to work on either or both with you.  Just let me know how you'd like to proceed.

Thanks for spending time on my issues.
Al
KernelLog.log
SystemConf.log
SystemLog.log

João Cardoso

Mar 25, 2016, 6:28:03 PM
to Alt-F


On Friday, 25 March 2016 14:17:05 UTC, Al K wrote:
Hi Joao,

I have some updates which I'd like to share with you.

Thanks for all your cooperation

From your previous post it is evident that the Disk Partitioner (and perhaps the Disk Wizard) has to work with the logical sector size the drive reports instead of the 512 bytes default that it now uses. That's fair and OK.

I also found that other greater than 2TB drives in the market now report a logical sector size of 4096.

You can also be sure that when using sgdisk to partition the drive from the command line it does the right thing, i.e., it takes into account the real sector size, and it is the Disk Partitioner that displays it wrongly. The Linux kernel is also aware of the real sector size and I'm confident that other tools will also take that into account -- but only experimenting will tell.
This means that you can try to use your 4096 bytes drive (not WD Quick Formatted) -- partition it using sgdisk, and attach it to the running RAID.


I started reading about the 4k sector sizes and came across the WD Quick Formatter utility.  WD's description stated that the tool would reformat the drive to NTFS to work with Vista and above.  

On their site they also mention a Windows-XP compatible format option -- did you select that? I suspect that the XP mode means 512 bytes per sector (and Advanced Format), while the "Factory Default" means 4096. Not important.
 
Their "format" utility does more than just format; I suspect that the drive or USB adapter firmware is also updated, as that is the only way to make the drive report a 512 or 4096 logical sector size. That does not depend on the partition table model or contents.

I now remember a similar question from another user some years ago, that time involving an internal drive and a USB enclosure.
It happened that the drive, when directly SATA-attached to the NAS or a PC, reported 512-byte sectors, but when put in a USB enclosure and attached via USB to the PC or the NAS, it reported 4096-byte sectors! At the time I found that other users reported the same issue with some other SATA-USB adapters.
Anyway, moving on...

So you now have two different drives to test with, one with a 512-byte and the other with a 4096-byte sector size.
And you know that you can't use the Partitioner when using the one with 4096 bytes/sector.
Right.
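But I would expect that the 4096 bytes/sector drive, after being partitioned with sgdisk, could also be added to the RAID. 

Next I went to DISK - RAID and added sdc2 to md0.  I think the system started to perform the resync,

Yes, the logs show that.
 
and then stopped.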

Yes, the logs show that the USB drive stopped responding and was reset and disconnected. So sdc disappeared.

This happens sometimes when using USB drives, and I can't give a definitive answer why. The classical answer is that the box USB connector can't supply enough power to the drive. You have to research that by yourself.
You can try to use either drive with a standard filesystem, use it to read and write several files for several tens of minutes, and see if it works fine and is reliable.
I have a 1TB WD Passport drive myself that has no issues when attached to either a DNS-323/320L/325/327, and I remember another user successfully using a 2TB WD Pass (if I recollect correctly also under RAID5)


 I saw that the light on the USB drive had stopped blinking.  

I went to the CLI and tried:

[root@NAS3]# sfdisk -R /dev/sdc
/dev/sdc: No such file or directory

cannot open /dev/sdc for reading

So I downloaded all the logs via SYSTEM - UTILITIES and they are attached to this email.

I have two of these 3TB WD Passport Ultra drives (Drive A was reformatted with WD Quick Formatter which changed the 4k sectors to 512 byte sectors and Drive B is original state with 4k sectors). 

I am not sure how to proceed, as it appears I have TWO problems for my one requirement (partition and add 3TB to my RAID5 array).  The original problem with 4k sector size external 3TB drive and the new problem which I described in this email.

The sgdisk command line partitioning *or* the WD utility solved one of the issues.
To get clues about the second, test the drives in a non-RAID setup, and even on a standard computer -- you will then know which to blame: the drive or the box USB power.

A few users reported working RAID5 with USB drives (I don't remember the capacity), either USB-powered or self-powered, but one user reported that he could work reliably with RAID5 only when a UPS was used (too many brownouts there?)


I am available to work on either or both with you.  Just let me know how you'd like to proceed.

As I already said above, try to rule out the drive or the NAS USB supplied power.
And please report back your findings.

Thanks

PS-I would keep one of the drives out of the WD Formatter utility

Al K

Mar 26, 2016, 1:34:04 AM
to Alt-F
Hi Joao,

FYI, currently, I have two DNS-323 running Alt-F with Raid5.  First one is 3 x 1TB RAID5 and the second one is a 3 x 2TB RAID5.  Both of them use WD Elements drives that are only USB powered.  I have had ZERO issues with the RAID5 arrays, the arrays remain clean even after sudden power outages.  The DNS-320 will be my third RAID5 installation with ALT-F.


So when I created the degraded RAID5 on the DNS-320 with the two 3TB internal drives, I restored the data from the 3TB USB drives (both of them) to the DNS-320 via the NAS USB port.  I was successful in restoring the data (it took about 5 days of operation, but completed without problems).  Hopefully this gives you some hints regarding the drive disconnecting from the DNS-320 during RAID resync.

Both drives have been used successfully on my computer for a few days without any problems prior to connecting to the DNS-320.

I am not sure if that helps in narrowing down where the problem lies.

FYI, I will refer to DriveA as the WD Quick Formatted Drive (512 sectors) and DriveB as the original factory condition (4096 sectors).

I wanted to attempt to partition DriveB and add it to the array via CLI.  I know how to partition it correctly now, but I do not know how to properly add it to the array.  
I found this "mdadm --add /dev/md0 /dev/sdc2"   on a wiki, is this sufficient to add DriveB to the array and start a resync? 

Thanks
Al

João Cardoso

Mar 26, 2016, 12:42:32 PM
to Alt-F


On Saturday, 26 March 2016 05:34:04 UTC, Al K wrote:
Hi Joao,

FYI, currently, I have two DNS-323 running Alt-F with Raid5.  First one is 3 x 1TB RAID5 and the second one is a 3 x 2TB RAID5.  Both of them use WD Elements drives that are only USB powered.  I have had ZERO issues with the RAID5 arrays, the arrays remain clean even after sudden power outages.  The DNS-320 will be my third RAID5 installation with ALT-F.


So when I created the degraded RAID5 on the DNS-320 with the two 3TB internal drives, I restored the data from the 3TB USB drives (both of them) to the DNS-320 via the NAS USB port.  I was successful in restoring the data (it took about 5 days of operation, but completed without problems).  Hopefully this gives you some hints regarding the drive disconnecting from the DNS-320 during RAID resync.

Both drives have been used successfully on my computer for a few days without any problems prior to connecting to the DNS-320.

So, the only new thing that you are trying to do is to use a different box (320 instead of 323) and new disks (3TB instead of 1 and 2TB).
You have to use the new disks for a while on the final box you intend to use, not on a PC.
You can even use it formatted as NTFS, as you use it on your PC; it will be slow but it will work. Save/read/erase a couple of large files on the drive.
 

I am not sure if that helps in narrowing down where the problem lies.

All relevant info can help.


FYI, I will refer to DriveA as the WD Quick Formatted Drive (512 sectors) and DriveB as the original factory condition (4096 sectors).

I wanted to attempt to partition DriveB and add it to the array via CLI.  I know how to partition it correctly now, but I do not know how to properly add it to the array.  
I found this "mdadm --add /dev/md0 /dev/sdc2"   on a wiki, is this sufficient to add DriveB to the array and start a resync? 

Yes, that's all that the webUI does (actually it does 'mdadm /dev/$mdev --add /dev/$rdev')
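Once the partition has been added, the rebuild can be followed from the command line as well. A minimal sketch, assuming the kernel exposes /proc/mdstat as standard Linux md does (`resync_status` is a hypothetical helper name, not an Alt-F command):

```shell
# resync_status [FILE]
# Print the progress line of any running resync/recovery from an mdstat-style
# file (defaults to /proc/mdstat). Hypothetical helper for illustration only.
resync_status() {
    grep -E 'recovery|resync' "${1:-/proc/mdstat}" 2>/dev/null \
        || echo "no resync in progress"
}

resync_status
```

If the USB disk drops out during the rebuild, the progress line disappears and the array shows up as degraded again.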

Al K

Mar 26, 2016, 11:28:09 PM
to Alt-F
Yesterday I thought maybe this new 3TB USB drive has an aggressive spindown policy for power saving.
Since I had two partitions on DriveA512, I formatted partition 1 (sdc1) with ext4 and made it available as a network share.  Subsequently, I ran a program on my laptop to write a file to partition1 every 1 minute so that the drive will continue spinning.

Next via the WebUI, I went to DISK - RAID and added sdc2 to my degraded RAID5 array.  The resync started and ran for 20-30 min.  After that, I noticed that the LED light on DriveA512 did not blink anymore and was a steady state ON.  The lights on the DNS-320 were also steady state ON orange (both drive lights and USB light were orange no blink).  

I could still access the NAS WebUI and CLI.  In the WebUI, under Status, sdc had disappeared.  I have attached some logs which may help.

The problem can be reproduced quite easily.  I determined it was not the DriveA512 spindown policy that is causing the disconnect.

My next step will be to take one of my DNS-323 off line and move the 3TB drives over and attempt to get the RAID5 running there. If it works, I suspect there is a problem with the DNS-320 and how it talks to USB drives in a RAID 5 resync.  
KernelLog with KeepAliveHD running.log
SystemLog with KeepAliveHD running.log

Al K

Mar 27, 2016, 1:34:20 AM
to Alt-F
Hi Joao,

Today, I swapped the drives from one of my DNS-323 with the drives from the DNS-320.

DNS-323 has 3 x 3TB drives

DNS-320 has 3 x 2TB drives

The DNS-320 read the 3x2TB RAID5 volume fine.  See attached Status Page.

The DNS-323 read the 3x3TB drives.  I went to DISK - RAID and added sdc2 to the degraded RAID5 array and the resync process started.  It has been running for over 40 minutes already.  See attached Status Page.

The System Config, Kernel Log and System Log for both DNS-323 and DNS-320 are attached as well.  Perhaps some telling details may emerge from their review.

At this point, I suspect there is some sort of bug or incompatibility between the WD 3TB Passport Ultra and the DNS-320 when it comes to resync/recovery on RAID5.

Joao, please have a look at the logs and if there are some further pieces of information you need, let me know.  I will let the 3x3TB drives complete the resync on the DNS-323 (hopefully successfully...only 3544 min to go).  :)

Thanks
Al
3x2TB Drives in DNS-320 Status 2016-03-27_13-06-42.png
3x3TB Drives in DNS-323 Status 2016-03-27_13-08-22.png
DNS-320 with 3x2TB Drives KernelLog.log
DNS-320 with 3x2TB Drives SystemConf.log
DNS-320 with 3x2TB Drives SystemLog.log
DNS-323 with 3x3TB Drives KernelLog.log
DNS-323 with 3x3TB Drives SystemConf.log
DNS-323 with 3x3TB Drives SystemLog.log

João Cardoso

Mar 27, 2016, 12:33:25 PM
to Alt-F


On Sunday, 27 March 2016 04:28:09 UTC+1, Al K wrote:
Yesterday I thought maybe this new 3TB USB drive has an aggressive spindown policy for power saving.
Since I had two partitions on DriveA512, I formatted partition 1 (sdc1) with ext4 and made it available as a network share.  I then ran a program on my laptop that wrote a file to partition 1 once a minute to keep the drive spinning.

Next via the WebUI, I went to DISK - RAID and added sdc2 to my degraded RAID5 array.  The resync started and ran for 20-30 min.  After that, I noticed that the LED light on DriveA512 did not blink anymore and was a steady state ON.

The error is identical to the one you posted earlier, namely: 

RAID recovery started,
18:06:48 NAS3 user.info kernel: md: recovery of RAID array md0
...

and ~20 minutes later
18:24:51 NAS3 user.info kernel: usb 1-1: reset high-speed USB device number 3 using orion-ehci
...
18:27:31 NAS3 user.err kernel: usb 1-1: device not accepting address 3, error -110
18:27:31 NAS3 user.info kernel: usb 1-1: USB disconnect, device number 3
18:27:31 NAS3 user.info kernel: sd 3:0:0:0: Device offlined - not ready after error recovery
...
And the driver error code:
18:27:31 NAS3 user.info kernel: sd 3:0:0:0: [sdc] Unhandled error code
18:27:31 NAS3 user.warn kernel: Result: hostbyte=0x01 driverbyte=0x00
18:27:31 NAS3 user.info kernel: sd 3:0:0:0: [sdc] CDB: 
18:27:31 NAS3 user.warn kernel: cdb[0]=0x8a: 8a 00 00 00 00 00 03 3d c9 f0 00 00 00 f0 00 00
18:27:31 NAS3 user.err kernel: end_request: I/O error, dev sdc, sector 54381040
...
and the RAID acknowledges the drive error and continues:
18:27:31 NAS3 user.alert kernel: md/raid:md0: Disk failure on sdc2, disabling device.
18:27:31 NAS3 user.alert kernel: md/raid:md0: Operation continuing on 2 devices.
...
later on, the drive reappears, and the kernel tries to use it, but it does not accept commands:
18:27:32 NAS3 user.info kernel: usb 1-1: new high-speed USB device number 4
18:28:02 NAS3 user.err kernel: usb 1-1: device descriptor read/64, error -110
18:28:02 NAS3 user.info kernel: usb 1-1: new high-speed USB device number 5
18:28:33 NAS3 user.err kernel: usb 1-1: device descriptor read/64, error -110
18:28:33 NAS3 user.info kernel: usb 1-1: new high-speed USB device number 6 
18:28:43 NAS3 user.err kernel: usb 1-1: device not accepting address 6, error -110
18:28:43 NAS3 user.info kernel: usb 1-1: new high-speed USB device number 7
18:28:54 NAS3 user.err kernel: usb 1-1: device not accepting address 7, error -110
18:28:54 NAS3 user.err kernel: hub 1-0:1.0: unable to enumerate USB device on port 1
and at this point the kernel gives up.
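
The CDB logged with the I/O error can be decoded by hand to see which command actually failed. Below is a minimal sketch in Python, assuming the standard 16-byte SCSI layout (opcode 0x8a is WRITE(16), bytes 2-9 hold the big-endian LBA, bytes 10-13 the transfer length in blocks). The function name `decode_cdb16` is made up for illustration and is not part of Alt-F.

```python
def decode_cdb16(hex_bytes: str) -> dict:
    """Decode a 16-byte SCSI READ(16)/WRITE(16) CDB given as space-separated hex."""
    cdb = bytes.fromhex(hex_bytes)  # fromhex ignores the spaces between bytes
    if len(cdb) != 16:
        raise ValueError("expected a 16-byte CDB")
    opcodes = {0x88: "READ(16)", 0x8A: "WRITE(16)"}
    return {
        "command": opcodes.get(cdb[0], hex(cdb[0])),
        "lba": int.from_bytes(cdb[2:10], "big"),      # starting sector
        "blocks": int.from_bytes(cdb[10:14], "big"),  # transfer length
    }

# CDB copied from the kernel log line quoted above
info = decode_cdb16("8a 00 00 00 00 00 03 3d c9 f0 00 00 00 f0 00 00")
print(info)
```

The decoded LBA is 54381040, which matches the sector in the end_request line, and the opcode says the failed request was a 240-block write to sdc. That is what one would expect while the recovery is writing to the newly added member.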

I also found the successive "Very big device. Trying to use READ CAPACITY(16)." messages at device detection time suspicious. That led me to an old 2011 issue with 3TB USB drives, http://marc.info/?l=linux-usb&m=131947167807847&w=2, but it does not seem to be your case, as no error is associated with them.


 The lights on the DNS-320 were also steady state ON orange (both drive lights and USB light were orange no blink).  
 
Degraded RAID


I could still access the NAS WebUI and CLI.  In the WebUI, under Status, sdc had disappeared.  I have attached some logs which may help.

The problem can be reproduced quite easily.  I determined that the DriveA512 spindown policy is not what causes the disconnect.

And 20 minutes of RAID5 recovery also rules out USB power issues.
BTW, you can stop Alt-F from trying to force the drives to spin down by setting Spindown to 0 under Disk Utilities (that is where the "sysctrl: force spindown" message comes from).

You are using the USB cable supplied with the drive, right? I had similar USB disconnect/reset errors when using cheap external SATA-USB adapters, but always assumed they were caused by bad or loose cables/connectors.

There is nothing new in the logs from the 323/320 2TB/3TB experiment you posted next. Let's wait and see if the RAID5 rebuild succeeds.

RAID5 recovery reads all and writes almost all sectors on the partitions assigned to the RAID5 (that's why it is so lengthy, and it does not matter whether there is any user data), so it is a tough drive test, and drives often fail during that step. Thus, writing a couple of files to the sdc1 ext4 filesystem you now have is not comparable in any way.
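
To put a rough number on "so lengthy", here is a back-of-the-envelope sketch in Python. It assumes that sdc2 has the same length as sda2/sdb2 in the partitioner listing from the first post (5859947128 sectors of 512 bytes) and that the "3544 min to go" figure reported during the DNS-323 resync approximates the total duration. Both are assumptions, so the result is only an order of magnitude.

```python
SECTOR_BYTES = 512                 # assumption: the partitioner counts 512-byte sectors
PARTITION_SECTORS = 5_859_947_128  # length of sda2/sdb2 in the partitioner listing
REMAINING_MIN = 3544               # "min to go" reported during the DNS-323 resync

def rebuild_rate_mb_s(sectors: int, minutes: float, sector_bytes: int = SECTOR_BYTES) -> float:
    """Average rate in MB/s (decimal) needed to cover `sectors` in `minutes`."""
    return sectors * sector_bytes / (minutes * 60) / 1e6

size_gb = PARTITION_SECTORS * SECTOR_BYTES / 1e9
rate = rebuild_rate_mb_s(PARTITION_SECTORS, REMAINING_MIN)
print(f"{size_gb:.3f} GB per member, about {rate:.1f} MB/s sustained")
```

Under those assumptions each member is about 3000.293 GB, and the USB drive has to sustain roughly 14 MB/s of writes for about two and a half days, which is a very different load from an occasional keep-alive file.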

WD is clear regarding its desktop drives: they say that they are not intended to be used in RAID setups -- for that, you must buy more expensive server drives ;-).
It looks like that issue is related to some response timing required by RAID. I investigated it in the past but forgot the details; all I remember is that the RAID marks drives as failed when they are only a bit slow to respond. But that does not seem to be related to your problem.


My next step will be to take one of my DNS-323 off line and move the 3TB drives over and attempt to get the RAID5 running there. If it works, I suspect there is a problem with the DNS-320 and how it talks to USB drives in a RAID 5 resync.  

Alt-F binaries are identical for both boxes. The only difference is a tiny part of the kernel that is built differently for each CPU. The USB infrastructure, which seems to be the issue, is identical for both boxes. Of course, the boxes' hardware is completely different.

Please keep us informed.

Al K

Mar 29, 2016, 12:25:30 PM
to Alt-F
Hi Joao,

My DNS-323 finally finished the RAID recovery this evening and it completed successfully.  I have taken the liberty of attaching the DNS323 logs (Kernel and System), but I think they indicate that all functions performed normally.  

Tomorrow I will move the 3TB drives back to the DNS-320.  

I believe there is some sort of communication bug in the DNS-320 when it comes to working with RAID5.  I have attached DriveB4096 to the DNS-320 and copied over 1TB of files back and forth without any comms problems popping up.  DriveB4096 never gets disconnected, it goes to sleep when idle, it wakes up when requested, and continues to function normally. 

Do you have any suggestion on how to monitor for the communications problem on the DNS-320 with RAID5?

Thanks
Al
DNS323 with 3TB drives - KernelLog.log
DNS323 with 3TB drives - SystemLog.log

Joao Cardoso

Mar 29, 2016, 8:46:56 PM
to Alt-F


On Tuesday, March 29, 2016 at 5:25:30 PM UTC+1, Al K wrote:
Hi Joao,

My DNS-323 finally finished the RAID recovery this evening and it completed successfully.  I have taken the liberty of attaching the DNS323 logs (Kernel and System), but I think they indicate that all functions performed normally.  

Yes, everything is normal.
 

Tomorrow I will move the 3TB drives back to the DNS-320.  

Use the same USB cable also, just to exclude that as a variable 
 

I believe there is some sort of communication bug in the DNS-320 when it comes to working with RAID5.

Yes, something like that. Or at least your particular DNS-320-rev-A1/A2.
 
 I have attached DriveB4096 to the DNS-320 and copied over 1TB of files back and forth without any comms problems popping up.  DriveB4096 never gets disconnected, it goes to sleep when idle, it wakes up when requested, and continues to function normally. 

That is also noteworthy, but it is nothing like the data movement that occurs during a RAID5 reconstruction.
As that data movement has already happened in the DNS-323 during the rebuild, the RAID5 might be assembled without any issue in the DNS-320 and it might look as if the problem has gone away.
If no issues arise after you move the disks (and USB cable :-) to the DNS-320, you might want to stress test the setup using the webUI RAID test option, which is read only.

What filesystem were you using with DriveB4096? Original partitioning with NTFS?


Do you have any suggestion on how to monitor for the communications problem on the DNS-320 with RAID5?

No, I don't. RAID, USB and the device drivers are built into the kernel. We got a kernel USB error message, just as we could have got a disk SATA error message; only in the latter case we would immediately suspect the disk, not the SATA interface.
Your test on the DNS-323 proved that the drive inside the USB enclosure and its SATA/USB interface are working fine (at the DNS-323 speeds).
I think that this is different than USB snooping, but I'm not an expert.


 

Thanks
Al

Thanks to you for sharing your experiences and experiments. Please continue reporting.
 

Al K

Apr 2, 2016, 5:42:13 AM
to Alt-F
Hi Joao,

This morning, I placed the 3x3TB RAID5 into the DNS-320 and booted the system.  The DNS-320 booted up fine, I do not think you'll find anything unusual in the logs here but I have attached them anyways.  

Subsequently, I started adding a bunch of files to the DNS-320.  Midway through, sdc stopped responding again.  I have attached the logs also. 

Once I start seeing these messages in the log (user.info kernel: usb 1-1: reset high-speed USB device number 2 using orion-ehci) the USB drive will become disconnected soon.

 I can only think there is something hardware wise wrong with my DNS-320, perhaps the USB interface is defective in some way.  

The 3x3TB RAID5 worked flawlessly in my DNS323 so I do not believe there is a problem with the USB drive.

At this point I am completely out of ideas on how to diagnose the problem further.  
If anyone has suggestions, I'm open to hearing them.

Thanks
Al

3x3TB RAID5 DNS-320 KernelLog.log
3x3TB RAID5 DNS-320 SystemConf.log
3x3TB RAID5 DNS-320 SystemLog.log
DNS320 sdc stopped - KernelLog.log
DNS320 sdc stopped - SystemConf.log
DNS320 sdc stopped - SystemLog.log

João Cardoso

Apr 2, 2016, 3:00:53 PM
to Alt-F


On Saturday, 2 April 2016 10:42:13 UTC+1, Al K wrote:
Hi Joao,

This morning, I placed the 3x3TB RAID5 into the DNS-320 and booted the system.  The DNS-320 booted up fine, I do not think you'll find anything unusual in the logs here but I have attached them anyways.  

Subsequently, I started adding a bunch of files to the DNS-320.  Midway through, sdc stopped responding again.  I have attached the logs also. 

The logs reveal the same problem as the ones you already sent. Only this time it took one hour for the reset/disconnect error to appear.
 

Once I start seeing these messages in the log (user.info kernel: usb 1-1: reset high-speed USB device number 2 using orion-ehci) the USB drive will become disconnected soon.

 I can only think there is something hardware wise wrong with my DNS-320, perhaps the USB interface is defective in some way.  

I wouldn't say defective, as it works for a "long" while; and you have used one of the drives on the 320 out of the RAID without issues. Under "normal" usage you might never have any issue.
So it might be that your 320's USB electronics are marginal, only just within the specs, and that electrical noise triggers the failure?

All other variables being identical (same OS/kernel/drives), the DNS-320 is the only variable left. As Sherlock would say, "when you have eliminated the impossible, whatever remains, however improbable, must be the truth".
 

The 3x3TB RAID5 worked flawlessly in my DNS323 so I do not believe there is a problem with the USB drive.

At this point I am completely out of ideas on how to diagnose the problem further.  

I don't think there is anything more to diagnose. And I don't know how to fix it, if at all possible.
 
If anyone has suggestions, I'm open to hearing them.

Your only option, given the hardware you have, is to move one of your 3x1TB or 3x2TB RAID5 setups from the DNS-323 to the DNS-320, use the new 3x3TB on a DNS-323, and hope that everything works OK (the 3TB WD Passport is USB-3; aren't the 1/2TB drives USB-2? Whether that is a good sign, I don't know).

Another option is to test the setup on a 320L, which is cheaper than a 3TB USB drive (but has a noisy fan), or even a 327L (which already has USB-3 and which Alt-F will "soon" support -- it's working fine next to me as I write).

Luck,
João



Al K

Apr 3, 2016, 10:18:16 AM
to Alt-F
Hi Joao,

To complete the picture, I moved my old 3x1TB drive setup to the DNS-320 this morning.  It booted fine and I have been moving files in and out of the DNS-320 all day.  As far as I can tell, it has been quite stable and has not disconnected the 1TB USB drive at all.  This 1TB USB drive is a WD Elements drive, about 3 years old now.  What this tells me is that this 1TB USB drive communicates and handles signals differently from the WD Passport Ultra 3TB drive, which is not compatible with my DNS-320.

It looks like I will keep the DNS-320 as 3x1TB, and the two DNS-323 as 3x2TB and 3x3TB.   Uncanny that the older DNS-323 is more compatible with new hardware than the newer DNS-320.  

Once I have managed to completely move my data around, I will post a final update (hopefully without further problems from the DNS-320) and mark this as complete.

Thanks for your help!
Al 

Al K

Apr 7, 2016, 1:32:36 PM
to Alt-F
Hi Joao,

Interesting update here.  So the DNS-320 has 3x1TB raid5 (using an older WD Elements 1TB USB drive).  I finished moving around a bunch of files to accommodate the new size for the DNS-320 (5 days straight of moving files in and out of the DNS-320).

I took a dump of the systemconfig log and noticed a whole bunch of resets on the USB bus and the old 1TB Elements drive handled all the resets without disconnecting from the DNS-320.  I think it is weird that I have so many of these messages in the log...maybe the DNS-320 has some problem?  

During the past 5 days, I have monitored the status of the 3x1TB raid5 via the webUI and it has always remained clean and stable (no rebuilds at all).

FYI, the 3x3TB raid 5 has been sitting on a DNS-323 and that has been rock solid without any errors reported at all and the raid5 array has maintained stability without any rebuilds.

Since the 3x1TB raid5 continues to function on the DNS-320 and the 3x3TB raid5 works well on the DNS-323, I will leave the setup as is.

If you have any final thoughts on this matter, I'm all ears.

Thanks
Al


SystemConf.log

João Cardoso

Apr 8, 2016, 12:49:55 PM
to Alt-F


On Thursday, 7 April 2016 18:32:36 UTC+1, Al K wrote:
Hi Joao,

Interesting update here.  So the DNS-320 has 3x1TB raid5 (using an older WD Elements 1TB USB drive).  I finished moving around a bunch of files to accommodate the new size for the DNS-320 (5 days straight of moving files in and out of the DNS-320).

I took a dump of the systemconfig log and noticed a whole bunch of resets on the USB bus and the old 1TB Elements drive handled all the resets without disconnecting from the DNS-320.

Yes, but that is not good in itself. You have those resets every ~15 minutes, and you should have none. There are none on the DNS-323, right?
Do the resets always occur, or only when you are moving files around?

BTW, it looks like your RAID doesn't have a "write intent bitmap" (RAID->RAID Operations, Create/Remove Bitmap). It makes resyncing after an unclean shutdown, or after re-adding a drive that dropped out, much faster.
 
 I think it is weird that I have so many of these messages in the log...maybe the DNS-320 has some problem?  

Yes, I think so. Whether that is an issue with your particular box or with all DNS-320 rev-Ax boxes, I can't tell.
 

During the past 5 days, I have monitored the status of the 3x1TB raid5 via the webUI and it has always remained clean and stable (no rebuilds at all).

FYI, the 3x3TB raid 5 has been sitting on a DNS-323 and that has been rock solid without any errors reported at all and the raid5 array has maintained stability without any rebuilds.

Since the 3x1TB raid5 continues to function on the DNS-320 and the 3x3TB raid5 works well on the DNS-323, I will leave the setup as is.

If you have any final thoughts on this matter, I'm all ears.

With so many USB resets I wouldn't be confident in the system's reliability. That shouldn't happen!
See if you have resets even when not accessing files. (The System Log timestamps the entries from the Kernel Log, which is where the USB reset errors come from, so you can see when they appear.)
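
One way to check this is to pull the reset timestamps out of a saved System Log and look at the gaps between them. The following is a minimal sketch in Python, assuming log lines shaped like the ones quoted earlier in this thread (time of day, host, facility, then the kernel message). The sample lines are invented for illustration and are not taken from the attached logs; the function names are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Matches e.g. "18:24:51 NAS3 user.info kernel: usb 1-1: reset high-speed USB device ..."
RESET_RE = re.compile(r"(\d{1,2}:\d{2}:\d{2}) .*kernel: usb \S+: reset \S+ USB device")

def reset_times(lines):
    """Return the time of day of every USB reset message found in `lines`."""
    times = []
    for line in lines:
        m = RESET_RE.search(line)
        if m:
            times.append(datetime.strptime(m.group(1), "%H:%M:%S"))
    return times

def intervals(times):
    """Gaps between consecutive resets; a negative gap is treated as a midnight wrap."""
    gaps = []
    for earlier, later in zip(times, times[1:]):
        gap = later - earlier
        if gap < timedelta(0):
            gap += timedelta(days=1)
        gaps.append(gap)
    return gaps

# Invented sample lines, modelled on the log format quoted in this thread.
sample = [
    "10:00:05 NAS3 user.info kernel: usb 1-1: reset high-speed USB device number 2 using orion-ehci",
    "10:07:41 NAS3 user.info kernel: md: recovery of RAID array md0",
    "10:15:20 NAS3 user.info kernel: usb 1-1: reset high-speed USB device number 2 using orion-ehci",
    "10:30:02 NAS3 user.info kernel: usb 1-1: reset high-speed USB device number 2 using orion-ehci",
]

times = reset_times(sample)
gaps = intervals(times)
for gap in gaps:
    print(f"{gap.total_seconds() / 60:.1f} min between resets")
```

Comparing the reset times with the times at which file copies were running should show whether the resets only happen under load or also when the box is idle.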

 

Al K

Apr 9, 2016, 4:20:37 AM
to Alt-F
Hi Joao,

I've performed a few tests with the DNS-320.  The USB reset messages appear only when writing files; when reading from the DNS-320 they do not appear.  However, as far as I can tell, the data I have on the DNS-320 is intact: the files I have compared are 100% bit-for-bit identical.  
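
For anyone who wants to repeat this kind of bit-for-bit check, here is a minimal sketch in Python that compares two copies of a file by their SHA-256 digests. It is only one possible approach, not necessarily the tool used for the comparison described above, and the paths in the usage comment are hypothetical.

```python
import hashlib

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a, path_b):
    """True if both files have identical contents (compared via SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)

# Usage with hypothetical paths (local original vs. copy on the NAS share):
# same_content("/home/al/original.iso", "/mnt/nas/original.iso")
```

Note that a file read back right after being written may be served from a cache rather than from the disks, so the comparison is more meaningful when done some time after the copy.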

With regards to either of my DNS-323s, no such messages appear.

I aim to use the DNS-320 as normal and monitor its reliability, but so far so good, other than the USB reset messages.  If the RAID breaks down, I'll let you know.

Thanks

Al
