Software RAID 1 problem


Jay L

Dec 28, 2003, 9:18:03 AM

Hi,

I have a 1U server with dual Western Digital 80 GB HDs both on IDE
Channel 1. I would like to set both of these drives up in a RAID 1
configuration. I am using Gentoo Linux and have been going through
the configuration process. Previously, the server was running RH8
with RAID1 and I noticed that one drive was down. Before doing the
full install of Gentoo, I ran a battery of tests on the drives with
Western Digital's tools. Both hard drives passed without a single
problem.

Now I am configuring Gentoo and am finding that during the initial
RAID1 sync, I get an error on one of the drives. I am stumped at
this point because as far as I can tell the drives are good and should
not be giving me this problem. If it helps, the MoBo is an MSI-6378
and the machine has 1 GB of RAM and is using Athlon-XP 2000+ CPU.
Does anyone have any ideas how to resolve this?

TIA for any advice

JL

John-Paul Stewart

Dec 28, 2003, 11:40:16 AM

Jay L wrote:
>
> Hi,
>
> I have a 1U server with dual Western Digital 80 GB HDs both on IDE
> Channel 1. I would like to set both of these drives up in a RAID 1
> configuration.

Be aware that having both disks on the same IDE channel will seriously
degrade performance.

> Now I am configuring Gentoo and am finding that during the initial
> RAID1 sync, I get an error on one of the drives.

Be more specific: exactly what is the error you're getting and when?

Jay L

Dec 28, 2003, 6:35:36 PM

Here are the exact details. I re-ran this this afternoon, so this is
fresh off the machine. I am installing the new OS on a machine with
dual 80 GB HDs. I ran fdisk on both drives and gave them exactly
matching partitions. Here is what the config looks like:

Disk /dev/hda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End     Blocks   Id  System
/dev/hda1   *         1         5      40131   fd  Linux raid autodetect
/dev/hda2             6        68     506047+  82  Linux swap
/dev/hda3            69      9729   77601982+  fd  Linux raid autodetect

Command (m for help):

(It is the same for both so all you need to do is replace hda with
hdb)
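
The Blocks column can be cross-checked against the cylinder ranges. The following is only a sketch: it assumes the usual DOS-style layout in which the first partition starts one track (63 sectors) into cylinder 1, and the helper name `blocks_label` is made up for illustration.

```python
# Geometry reported by fdisk in the listing above.
SECTORS_PER_TRACK = 63
SECTORS_PER_CYL = 255 * SECTORS_PER_TRACK   # 16065 sectors per cylinder


def blocks_label(start_cyl, end_cyl, first_partition=False):
    """Return an fdisk-style Blocks value for an inclusive cylinder range.

    Assumption: the first partition begins one track into cylinder 1,
    so it loses 63 sectors compared with a full cylinder range.
    """
    sectors = (end_cyl - start_cyl + 1) * SECTORS_PER_CYL
    if first_partition:
        sectors -= SECTORS_PER_TRACK
    # fdisk counts 1 KiB blocks and appends "+" when one 512-byte
    # sector is left over.
    return f"{sectors // 2}{'+' if sectors % 2 else ''}"


for name, start, end, first in [
    ("hda1", 1, 5, True),
    ("hda2", 6, 68, False),
    ("hda3", 69, 9729, False),
]:
    print(name, blocks_label(start, end, first))
```

Under that assumption the three values come out the same as the Blocks column in the fdisk listing, so the partition table itself looks consistent.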

I loaded the RAID Kernel module and then created the following
raidtab:

# /boot (RAID 1)
raiddev /dev/md0
    raid-level              1
    nr-raid-disks           2
    chunk-size              32
    persistent-superblock   1
    device                  /dev/hda1
    raid-disk               0
    device                  /dev/hdb1
    raid-disk               1

# / (RAID 1)
raiddev /dev/md2
    raid-level              1
    nr-raid-disks           2
    chunk-size              32
    persistent-superblock   1
    device                  /dev/hda3
    raid-disk               0
    device                  /dev/hdb3
    raid-disk               1
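
A raidtab like this can be sanity-checked before running mkraid. The sketch below is not part of raidtools; `parse_raidtab` and `device_count_problems` are hypothetical helpers that only compare nr-raid-disks with the number of device lines in each stanza.

```python
RAIDTAB = """\
# /boot (RAID 1)
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/hda1
raid-disk 0
device /dev/hdb1
raid-disk 1

# / (RAID 1)
raiddev /dev/md2
raid-level 1
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/hda3
raid-disk 0
device /dev/hdb3
raid-disk 1
"""


def parse_raidtab(text):
    """Collect the settings and member devices of each raiddev stanza."""
    arrays = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and padding
        if not line:
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        if key == "raiddev":
            current = arrays.setdefault(value, {"devices": []})
        elif current is None:
            continue                          # setting outside any stanza
        elif key == "device":
            current["devices"].append(value)
        else:
            # Repeated keys such as raid-disk keep only the last value;
            # that is enough for the count check below.
            current[key] = value
    return arrays


def device_count_problems(arrays):
    """Return a message for every stanza whose device count is off."""
    problems = []
    for name, cfg in arrays.items():
        declared = int(cfg.get("nr-raid-disks", 0))
        listed = len(cfg["devices"])
        if declared != listed:
            problems.append(
                f"{name}: nr-raid-disks is {declared} but {listed} devices are listed"
            )
    return problems


arrays = parse_raidtab(RAIDTAB)
print(arrays["/dev/md2"]["devices"])
print(device_count_problems(arrays))
```

For the raidtab above the check finds nothing to complain about, which fits with md0 building cleanly from the same file.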

I then started the sync process using the command mkraid /dev/md*,
where * is either 0 or 2. The initial sync of md0 goes without a hitch,
though I have to use mkraid -R since there are remnants from the last
sync on the disks. I then begin to sync md2. Here is the output of
/proc/mdstat during the md2 sync:


Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1] ide/host0/bus0/target0/lun0/part3[0]
      77601856 blocks [2/2] [UU]
      [=======>.............]  resync = 37.5% (29110720/77601856) finish=36.7min speed=21993K/sec
md0 : active raid1 ide/host0/bus0/target1/lun0/part1[1] ide/host0/bus0/target0/lun0/part1[0]
      40064 blocks [2/2] [UU]

unused devices: <none>


During the sync of md2, the following errors appear (I do not know
exactly when, but I know it was after at least 80% of the drive had
synced):

hdb: dma_timer_expiry: dma status == 0x61
hdb: timeout waiting for DMA (repeats again)
hdb: (__ide_dma_test_irq) called while not waiting
hda: status timeout: status=0xd0 { Busy }

hda: drive not ready for command
ide0: reset: success
hdb: irq timeout: status=0xd0 { Busy } (2 more times)

end_request: I/O error, dev 03:43 (hdb), sector 138982783
raid1: Disk failure on ide/host0/bus0/target1/lun0/part3, disabling device
Operation continuing on 1 devices
hdb: status timeout: status=0xd0 { Busy }

hdb: drive not ready for command
ide0: reset: success
md2: no spare disk to reconstruct array! -- continuing in degraded mode
hdb: irq timeout: status=0xd0 { Busy }

ide0: reset: success
hdb: irq timeout: status=0xd0 { Busy } (these two lines are repeated once more)

end_request: I/O error, dev 03:43 (hdb), sector 138982911
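
The sector numbers in the log can be related to the partition layout with a little arithmetic. This is a sketch under stated assumptions: the device number is taken to be hexadecimal major:minor, with hdb starting at minor 64, and since it is not certain whether the reported sector is relative to the partition or to the whole disk, both readings are computed. The function names are made up for illustration.

```python
SECTORS_PER_CYL = 255 * 63                       # from the fdisk geometry
PART3_START = (69 - 1) * SECTORS_PER_CYL         # first sector of cylinder 69
PART3_SECTORS = (9729 - 69 + 1) * SECTORS_PER_CYL


def decode_ide0_minor(minor):
    """Split an IDE minor number into (drive index, partition number).

    Assumption: 64 minors per drive, so minor 0 is hda and 64 is hdb.
    """
    return divmod(minor, 64)


def percent_through(sector, partition_relative):
    """How far into partition 3 a failing sector lies, in percent."""
    if not partition_relative:
        sector -= PART3_START                    # treat as whole-disk sector
    return 100.0 * sector / PART3_SECTORS


drive, part = decode_ide0_minor(0x43)
print("drive index", drive, "partition", part)   # index 1 is hdb

for sector in (138982783, 138982911):
    print(
        sector,
        round(percent_through(sector, True), 1),
        round(percent_through(sector, False), 1),
    )
```

Either reading puts both errors a little under 90% of the way through the partition, in line with the estimate that the failure came after at least 80% had synced. The two failing sectors are also only 128 sectors apart.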

And now cat /proc/mdstat looks like this:

Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1](F) ide/host0/bus0/target0/lun0/part3[0]
      77601856 blocks [2/1] [U_]

md0 : active raid1 ide/host0/bus0/target1/lun0/part1[1] ide/host0/bus0/target0/lun0/part1[0]
      40064 blocks [2/2] [UU]

unused devices: <none>
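
Spotting this state does not have to be done by eye. As a minimal sketch, assuming the 2.4-style /proc/mdstat layout shown above, a degraded array can be recognised from the [total/active] counter and the (F) marker. The function name `degraded_arrays` is invented for this example.

```python
import re

SAMPLE = """\
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1](F) ide/host0/bus0/target0/lun0/part3[0]
      77601856 blocks [2/1] [U_]

md0 : active raid1 ide/host0/bus0/target1/lun0/part1[1] ide/host0/bus0/target0/lun0/part1[0]
      40064 blocks [2/2] [UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return details for every md array that is missing a member."""
    # First glue each array's lines together, so wrapped output still parses.
    stanzas = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            stanzas[current] = line
        elif line.startswith(("unused devices", "Personalities", "read_ahead")):
            current = None
        elif current is not None:
            stanzas[current] += " " + line.strip()

    degraded = {}
    for name, text in stanzas.items():
        counts = re.search(r"\[(\d+)/(\d+)\]", text)   # e.g. [2/1]
        if not counts:
            continue
        expected, active = int(counts.group(1)), int(counts.group(2))
        failed = re.findall(r"(\S+)\[\d+\]\(F\)", text)  # members marked (F)
        if active < expected or failed:
            degraded[name] = {
                "expected": expected,
                "active": active,
                "failed": failed,
            }
    return degraded


print(degraded_arrays(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/mdstat instead of the pasted sample.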


Thanks in advance for any thoughts on this.

JL



Adrian Inman

Jan 4, 2004, 8:00:46 AM

Hi Jay,

I'd get hold of smartsuite (not to be confused with Lotus SmartSuite).

I'm a Debian man myself, so I just run apt-get install smartsuite.

The command you then want is smartctl -a /dev/hda (and likewise for
/dev/hdb).

This asks the drive whether it thinks it's faulty or not, which is
usually more informative than the naff utils you get from the drive
manufacturers.

Good luck.


Bruce Allen

Jan 5, 2004, 6:47:23 AM

> I'd get hold of smartsuite - not to be confused with Lotus SmartSuite.
> I'm a debian man myself so I just go apt-get install smartsuite
> the command you want to then use is smartctl -a /dev/hda etc.

The smartmontools package (which you can also get with apt) provides
additional functionality beyond smartsuite. Please see
http://smartmontools.sf.net/ for details.
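
For anyone wanting to script the check, here is a rough sketch. It assumes the smartmontools version of smartctl, whose -H option prints an overall-health line for ATA drives; output from the older smartsuite tool may look different. The helper names are invented for this example.

```python
import re
import subprocess


def overall_health(smartctl_output):
    """Pull the verdict out of `smartctl -H` (or `-a`) output.

    Returns e.g. "PASSED", or None if the expected line is missing.
    """
    match = re.search(
        r"overall-health self-assessment test result:\s*(\S+)",
        smartctl_output,
    )
    return match.group(1) if match else None


def check_drive(device):
    """Run smartctl on one device; needs root and smartctl on PATH."""
    proc = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
    )
    return overall_health(proc.stdout)


# Example use on the two drives from this thread (not run here):
#     for dev in ("/dev/hda", "/dev/hdb"):
#         print(dev, check_drive(dev))
```

A PASSED result only reflects the drive's own self-assessment, so it is worth reading the full smartctl -a output as well.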

Jay L

Jan 8, 2004, 2:00:56 PM

Adrian and others,

Believe it or not, I think I have found a solution to the problem. I
pulled the drives, checked the jumpers, and found that both were set to
"Cable Select". According to the people who sold me the machine, this
is the normal configuration and lets the system assign master and slave
automatically, depending on each drive's position on the IDE cable. I
changed this and set master and slave manually (master for the last
drive on the cable and slave for the middle drive). After making this
change, my problems disappeared! I will keep an eye on it in case it is
not a permanent solution, but so far so good.

I wanted to post this here in case anyone else experiences this issue.

JL


