MD-RAID1 and iSCSI with multipathd: some experience

Ulrich Windl

Oct 14, 2010, 8:45:03 AM

to open-iscsi

Hi,

I was investigating whether a RAID1 can be built over iSCSI-connected devices managed by multipathd (the SLES10 SP3 Release Notes say it won't work). Here are some of my findings:

1) The multipath devices cannot be opened exclusively by mdadm:
# mdadm --verbose --create /dev/md0 --raid-devices=2 --level=raid1 --bitmap=internal /dev/disk/by-id/scsi-3600508b4001085dd0001100002260000 /dev/disk/by-id/scsi-3600508b4001085dd0001100002290000
mdadm: Cannot open /dev/disk/by-id/scsi-3600508b4001085dd0001100002260000: Device or resource busy
mdadm: Cannot open /dev/disk/by-id/scsi-3600508b4001085dd0001100002290000: Device or resource busy
mdadm: create aborted

open("/dev/disk/by-id/scsi-3600508b4001085dd0001100002260000", O_RDONLY|O_EXCL) = -1 EBUSY (Device or resource busy)

2) The device-mapper device files do not seem to be SCSI devices:
# mdadm --verbose --create /dev/md0 --raid-devices=2 --level=raid1 --bitmap=internal /dev/dm-18 /dev/dm-19
mdadm: /dev/dm-18 is too small: 0K
mdadm: create aborted
rkdvmso1:~ # sdparm -a /dev/dm-18
unable to access /dev/dm-18, ATA disk?
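
The "too small: 0K" message means mdadm saw a size of zero for /dev/dm-18; why that is so cannot be told from the output above. A minimal sketch for checking which size the kernel itself reports, assuming the usual sysfs layout in which /sys/block/<name>/size holds the size in 512-byte sectors. The function name blockdev_kib and its optional sysfs-root argument are made up for illustration:

```shell
# Print the size of a block device in KiB as reported by sysfs.
# Usage: blockdev_kib NAME [SYSFS_ROOT]
# NAME is the kernel name (e.g. dm-18). SYSFS_ROOT defaults to /sys and
# exists only so the function can be tried against a fake tree.
blockdev_kib() {
    name=$1
    root=${2:-/sys}
    size_file="$root/block/$name/size"
    if [ ! -r "$size_file" ]; then
        echo "no such block device: $name" >&2
        return 1
    fi
    # sysfs reports the size in 512-byte sectors, so two sectors are one KiB.
    sectors=$(cat "$size_file")
    echo $((sectors / 2))
}
```

Running "blockdev_kib dm-18" and comparing the result with the size of the underlying LUN would show whether the zero comes from the device-mapper device itself or from the way mdadm queries it.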

3) The iSCSI devices are SCSI devices, but they are busy:
# sdparm -a /dev/sdax
/dev/sdax: HP HSV200 5000
Read write error recovery mode page:
AWRE 1 [cha: n, def: 1]
ARRE 1 [cha: n, def: 1]
TB 1 [cha: n, def: 1]
RC 0 [cha: n, def: 0]
[...]
# mdadm --verbose --create /dev/md0 --raid-devices=2 --level=raid1 --bitmap=internal /dev/sdax /dev/sdbo
mdadm: Cannot open /dev/sdax: Device or resource busy
mdadm: Cannot open /dev/sdbo: Device or resource busy
mdadm: create aborted
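
EBUSY on an O_EXCL open is consistent with something inside the kernel having already claimed the device; here that is presumably the multipath map sitting on top of the path devices, though the output above does not prove it. A minimal sketch for checking this, assuming the usual sysfs layout in which /sys/block/<name>/holders lists the devices stacked on top of <name>. The function name list_holders and its optional sysfs-root argument are made up for illustration:

```shell
# List the devices that hold a block device from inside the kernel.
# Usage: list_holders NAME [SYSFS_ROOT]
# NAME is the kernel name (e.g. sdax). SYSFS_ROOT defaults to /sys and
# exists only so the function can be tried against a fake tree.
list_holders() {
    name=$1
    root=${2:-/sys}
    dir="$root/block/$name/holders"
    if [ ! -d "$dir" ]; then
        echo "no holders directory for $name" >&2
        return 1
    fi
    for entry in "$dir"/*; do
        # An empty directory leaves the glob unexpanded; skip that case.
        [ -e "$entry" ] || continue
        basename "$entry"
    done
    return 0
}
```

If "list_holders sdax" prints a dm-NN name, the path device is owned by a device-mapper map and a second exclusive open by mdadm would be expected to fail.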

I'm not a specialist on mdadm, so if I did something wrong, please tell me.

Regards,
Ulrich

Fubo Chen

Jan 1, 2011, 12:53:40 PM

to Ulrich Windl, open-...@googlegroups.com

On Oct 14 2010, 1:45 pm, "Ulrich Windl" <Ulrich.Wi...@rz.uni-

Hi,

I have been looking at a related but not identical question: how to
replicate a local disk to another server via iSCSI and md mirroring
(RAID1, no multipath). While making that setup I noticed that
open-iscsi times out SCSI commands if the network falls away long
enough. Why does the open-iscsi initiator make SCSI commands fail
instead of reporting disk removal?

$ sg_inq /dev/disk/by-path/ip-192.168.3.114\:3260-iscsi-...:tgt-lun-0
| grep RMB
PQual=0 Device_type=0 RMB=0 version=0x05 [SPC-3]

Fubo.

Fubo Chen

Jan 1, 2011, 1:50:17 PM

to open-...@googlegroups.com, Ulrich...@rz.uni-regensburg.de

Hi,

I have been looking at a related but not identical problem. I'm trying
to use md to replicate a local disk to a remote server via iSCSI and
mirroring (RAID1). But I noticed that iSCSI commands fail if a network
outage lasts longer than the iSCSI command timeout. I also noticed that
the block device created by open-iscsi is marked as non-removable
(RMB=0). Why does open-iscsi behave this way, and why does it not
report a disk removal event if the network connection fails?

# mdadm --query --detail /dev/md4 | tail -n 3
Number Major Minor RaidDevice State
0 8 32 0 active sync /dev/sdc
1 8 64 1 active sync /dev/sde

# sg_inq /dev/disk/by-path/ip-192.168.3.114\:3260-iscsi-iqn\:tgt-lun-0

Mike Christie

Jan 3, 2011, 4:02:08 PM

to open-...@googlegroups.com, Fubo Chen, Ulrich Windl

On 01/01/2011 11:53 AM, Fubo Chen wrote:
>
> I have been looking at a related but not identical question: how to
> replicate a local disk to another server via iSCSI and md mirroring
> (RAID1, no multipath). While making that setup I noticed that
> open-iscsi times out SCSI commands if the network falls away long
> enough. Why does the open-iscsi initiator make SCSI commands fail
> instead of reporting disk removal?
>

I just sent a patch to make this configurable:
http://groups.google.com/group/open-iscsi/browse_thread/thread/6737d1038ea56454?fwc=2

We did it because many apps could not handle hot add/remove at the time.
So if we removed the device, apps could not do anything and were stuck
referencing a bad pointer to a stale device struct, and if we later
added it back, apps would get really messed up and not be able to add it
to the multipath device or RAID device or whatever device you are using.

But you know that with the disk removal method, as in my patch and in
what the FC/SAS drivers do, commands are still failed, right? You get
both the I/O failures and the hotplug remove events.
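
How long the initiator waits before failing I/O after a connection loss can be inspected per session. A minimal sketch, assuming the timeout in question is the session recovery timeout (set through node.session.timeo.replacement_timeout in iscsid.conf and exposed as recovery_tmo under /sys/class/iscsi_session); the thread does not name the setting, so that mapping is an assumption. The function name show_recovery_tmo and its optional sysfs-root argument are made up for illustration:

```shell
# Print "SESSION SECONDS" for every iSCSI session found in sysfs.
# Usage: show_recovery_tmo [SYSFS_ROOT]
# SYSFS_ROOT defaults to /sys and exists only so the function can be
# tried against a fake tree.
show_recovery_tmo() {
    root=${1:-/sys}
    for dir in "$root"/class/iscsi_session/session*; do
        # Skip the unexpanded glob (no sessions) and unreadable entries.
        [ -r "$dir/recovery_tmo" ] || continue
        echo "$(basename "$dir") $(cat "$dir/recovery_tmo")"
    done
    return 0
}
```

For an md mirror over iSCSI, a shorter value would let md see the failure and drop the remote leg sooner, at the cost of tolerating shorter network outages.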
