iscsi HA

joby xavier

Apr 10, 2012, 6:21:50 AM

to open-iscsi

Hi,

I want to set up iSCSI high availability on top of Sheepdog distributed
storage.

Here is my system setup. OS: Ubuntu. Four nodes run Sheepdog
distributed storage, and I am exporting this storage over iSCSI from
two of the nodes, which share a virtual IP set up with ucarp. Both
nodes use the same IQN. On the initiator, the iSCSI disk (sdc) is used
as an LVM partition.

node a
node b
node c
node d
node x is the initiator
Nodes a and b share a common virtual IP so that if 'node a' fails,
'node b' takes over as the iSCSI target; both have the same IQN.

Problem: when a failover happens, i.e. the iSCSI target switches from
the first node to the second, the iSCSI disk fails on the initiator,
'node x'.

Code:
root@prox1:~# pvdisplay
/dev/sdc: read failed after 0 of 4096 at 0: Input/output error
/dev/sdc: read failed after 0 of 4096 at 104792064: Input/output error

And here are the errors from my /var/log/messages:

Code:
Apr 10 13:08:39 prox1 kernel: sd 30:0:0:1: [sdc] Add. Sense: Unrecovered read error
Apr 10 13:08:39 prox1 kernel: sd 30:0:0:1: [sdc] CDB: Read(10): 28 00 00 03 1f 80 00 00 08 00
Apr 10 13:08:39 prox1 kernel: sd 30:0:0:1: [sdc] Unhandled sense code
Apr 10 13:08:39 prox1 kernel: sd 30:0:0:1: [sdc] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
Apr 10 13:08:39 prox1 kernel: sd 30:0:0:1: [sdc] Sense Key : Medium Error [current]
Apr 10 13:08:39 prox1 kernel: sd 30:0:0:1: [sdc] Add. Sense: Unrecovered read error
Apr 10 13:08:39 prox1 kernel: sd 30:0:0:1: [sdc] Add. Sense: Unrecovered read error
Apr 10 13:08:39 prox1 kernel: sd 30:0:0:1: [sdc] CDB: Read(10): 28 00 00 03 1f f0 00 00 08 00
Apr 10 13:08:39 prox1 kernel: sd 30:0:0:1: [sdc] Unhandled sense code

Can anyone give me some ideas on this? Should I change anything in
lvm.conf? Should I use multipath-tools? Is this the right procedure?

Mike Christie

Apr 10, 2012, 3:43:41 PM

to open-...@googlegroups.com, joby xavier

IO is making it to the target/device ok, but the target/device is
returning a failure. Look at the box running the target. Is there some
more info in those logs?
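
For readers following the log above: the failing request can be read straight out of the CDB bytes the kernel prints. The sketch below decodes the first Read(10) CDB from the original post using the standard Read(10) layout (opcode 0x28, 4-byte big-endian LBA in bytes 2-5, 2-byte transfer length in bytes 7-8). The 512-byte block size is an assumption, and `decode_read10` is a name made up for this illustration.

```python
# Decode a SCSI Read(10) CDB as printed in the kernel log.
# Layout: byte 0 = opcode (0x28), bytes 2-5 = LBA (big-endian),
# bytes 7-8 = transfer length in logical blocks (big-endian).

BLOCK_SIZE = 512  # assumption: the disk uses 512-byte logical blocks


def decode_read10(cdb_hex):
    """Return (lba, block_count) for a Read(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")
    blocks = int.from_bytes(cdb[7:9], "big")
    return lba, blocks


# First failing CDB from the log in the original post.
lba, blocks = decode_read10("28 00 00 03 1f 80 00 00 08 00")
print(lba, blocks, lba * BLOCK_SIZE, blocks * BLOCK_SIZE)
# prints: 204672 8 104792064 4096
```

Under that block-size assumption, the byte offset 104792064 and length 4096 line up with the pvdisplay message "read failed after 0 of 4096 at 104792064", so the Medium Error is being returned for the same 4 KiB read that LVM reports as failed.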

joby xavier

Apr 11, 2012, 8:52:43 AM

to open-iscsi

There is no more info in the logs; the same lines keep repeating in
/var/log/messages. Should I use multipathing for this?

Mike Christie

Apr 11, 2012, 1:03:33 PM

to open-...@googlegroups.com, joby xavier

On 04/11/2012 07:52 AM, joby xavier wrote:
> no more info on logs,same lines are repeating on var/log/messages.
> should i use multipathing for this?
>

I am not sure multipath will help because you are getting Medium Errors.
What target are you using?

joby xavier

Apr 12, 2012, 3:24:43 AM

to open-iscsi

I am using tgt
(https://github.com/collie/sheepdog/wiki/General-protocol-support) and
open-iscsi on my Ubuntu boxes.

Mike Christie

Apr 12, 2012, 12:46:54 PM

to joby xavier, open-...@googlegroups.com

On 04/11/2012 09:15 PM, joby xavier wrote:
> I am using tgt (
> https://github.com/collie/sheepdog/wiki/General-protocol-support) and
> open-iscsi on my Ubuntu boxes.

When the failover happens do you see the iscsi initiator drop one
connection and reconnect in /var/log/messages? You should see something
like conn error 1011 then a msg about being reconnected in N retries.
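
One way to check for this without reading the whole file by eye is to filter the log for the two kinds of lines involved. This is only a sketch: the "detected conn error (N)" pattern matches the kernel line that appears later in this thread, while the "is operational after recovery" wording for the iscsid reconnect message is an assumption that may differ between open-iscsi versions. `failover_events` is a made-up helper name.

```python
import re

# Kernel line reporting a dropped connection, e.g.
#   "connection2:0: detected conn error (1020)"
CONN_ERROR = re.compile(r"(connection\d+:\d+): detected conn error \((\d+)\)")
# Assumed wording of the iscsid message logged after a successful relogin.
RECOVERED = re.compile(r"(connection\d+:\d+) is operational after recovery")


def failover_events(lines):
    """Return (kind, connection, error_code) tuples in log order."""
    events = []
    for line in lines:
        m = CONN_ERROR.search(line)
        if m:
            events.append(("error", m.group(1), int(m.group(2))))
            continue
        m = RECOVERED.search(line)
        if m:
            events.append(("recovered", m.group(1), None))
    return events


# Two lines taken from the log posted later in this thread.
sample = [
    "Apr 16 10:59:47 prox1 kernel: connection2:0: detected conn error (1020)",
    "Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Unhandled sense code",
]
print(failover_events(sample))
# prints: [('error', 'connection2:0', 1020)]
```

To run it against a real log, pass an open file instead of the sample list, for example `failover_events(open("/var/log/messages"))`. An "error" event with no later "recovered" event for the same connection would suggest the iscsid output is missing from the log or the session never came back.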

joby xavier

Apr 16, 2012, 1:42:12 AM

to open-iscsi

Sorry for the delayed response.

Here is my /var/log/messages from when the virtual IP moves to the
other server during a failover:

Apr 16 10:57:14 prox1 kernel: scsi7 : iSCSI Initiator over TCP/IP
Apr 16 10:57:14 prox1 kernel: scsi 7:0:0:0: RAID IET Controller 0001 PQ: 0 ANSI: 5
Apr 16 10:57:14 prox1 kernel: scsi 7:0:0:1: Direct-Access IET VIRTUAL-DISK 0001 PQ: 0 ANSI: 5
Apr 16 10:57:14 prox1 kernel: sd 7:0:0:1: [sdc] 2252800 512-byte logical blocks: (1.15 GB/1.07 GiB)
Apr 16 10:57:14 prox1 kernel: sd 7:0:0:1: [sdc] Write Protect is off
Apr 16 10:57:14 prox1 kernel: sd 7:0:0:1: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr 16 10:57:14 prox1 kernel: sdc: unknown partition table
Apr 16 10:57:14 prox1 kernel: sd 7:0:0:1: [sdc] Attached SCSI disk

Apr 16 10:59:47 prox1 kernel: connection2:0: detected conn error (1020)
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Unhandled sense code
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Sense Key : Medium Error [current]
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Add. Sense: Unrecovered read error
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Unhandled sense code
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Sense Key : Medium Error [current]
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Add. Sense: Unrecovered read error
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Unhandled sense code
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Sense Key : Medium Error [current]
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Add. Sense: Unrecovered read error
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] CDB: Read(10): 28 00 00 00 00 08 00 00 08 00
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Unhandled sense code
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Sense Key : Medium Error [current]
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Add. Sense: Unrecovered read error
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Unhandled sense code
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
Apr 16 10:59:51 prox1 kernel: sd 7:0:0:1: [sdc] Sense Key : Medium Error [current]

This pattern keeps repeating.

root@prox1:~# pvdisplay
/dev/sdc: read failed after 0 of 4096 at 0: Input/output error

/dev/sdc: read failed after 0 of 4096 at 4096: Input/output error
--- Physical volume ---
PV Name /dev/sda2
VG Name pve
PV Size 232.39 GiB / not usable 3.00 MiB
Allocatable yes
PE Size 4.00 MiB
Total PE 59490
Free PE 4095
Allocated PE 55395
PV UUID qr1b2t-zLXv-WhWh-ZKm2-2dKX-dmtO-BaADAw

Mike Christie

Apr 16, 2012, 11:44:05 PM

to open-...@googlegroups.com, joby xavier

On 04/16/2012 12:42 AM, joby xavier wrote:
> sorry for the delayed response...
>
> here is my /var/log/messages when Virtual IP points to other server
> when a failover happens
>

Could you send all of the /var/log/messages?

Mike Christie

Apr 17, 2012, 12:56:31 PM

to open-...@googlegroups.com, joby xavier

The log seems to be missing the iscsid output, but it looks like the
initiator detects the failover: we drop the connection and then relogin.
When we relogin, though, the target is either failing IO with that
MEDIUM_ERROR or just dropping IO (we see the 1021 errors, which mean an
IO timed out and we had to run the SCSI error handler).

I think you need to contact the sheepdog developers or the people that
made your target to make sure your config is supported, because it looks
like on the initiator side there is not anything more we can do. The
device is just failing IO we send it. You need to ask the target people
why it is doing that.

joby xavier

Apr 18, 2012, 12:18:46 AM

to Mike Christie, open-...@googlegroups.com

Mike,

We really appreciate your help on this issue. We will definitely contact the sheepdog team and will let you know the results.

Many Thanks,
Joby Xavier

GProc...@symcor.com

Apr 19, 2012, 4:12:53 PM

to open-...@googlegroups.com, Mike Christie, open-...@googlegroups.com

Generally speaking, you want to avoid using things like cluster VIPs that
float between iSCSI portals. A more elegant solution is to use clustered
storage management, e.g. Red Hat CLVM, have the initiator log into both
nodes, and then create multipath maps.

The key here is that on the iSCSI target you need to access your storage
with synchronous direct IO, so that if an iSCSI portal goes down while IO
is in flight over the wire, you don't lose data.

This setup scales quite well and allows you to run active/active.

We implemented this solution using stgt/open-iscsi and it works really
well, if you can excuse the unexplained poor read performance of
open-iscsi. We were able to write to our storage at around 900MB/s over
10Gbit Ethernet but only read at around 300MB/s.
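
As a rough sketch of the initiator side of this approach, the snippet below only builds and prints the commands for logging into both portals and then listing the multipath maps; it does not execute anything. The portal addresses and the IQN are hypothetical placeholders, and `login_commands` is a name made up for this illustration. The `iscsiadm` discovery/login options and `multipath -ll` are the standard open-iscsi and multipath-tools invocations.

```python
# Hypothetical values: substitute the real portal addresses and target IQN.
PORTALS = ["192.0.2.11:3260", "192.0.2.12:3260"]
TARGET_IQN = "iqn.2012-04.com.example:sheepdog.vol1"


def login_commands(portals, iqn):
    """Build the command lines for one iSCSI session per portal."""
    cmds = []
    for portal in portals:
        # Discover the targets offered by this portal, then log in to it.
        cmds.append(["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal])
        cmds.append(["iscsiadm", "-m", "node", "-T", iqn, "-p", portal, "--login"])
    # Afterwards, list the multipath maps built from the resulting disks.
    cmds.append(["multipath", "-ll"])
    return cmds


# Dry run: print the commands instead of executing them.
for cmd in login_commands(PORTALS, TARGET_IQN):
    print(" ".join(cmd))
```

Note that multipath only merges the two sessions into a single map if both portals present the LUN with the same identity (WWID); whether that holds depends on how the targets are configured.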

--

Greg Procunier, RHCSA, RHCE
UNIX Administrator III - Enterprise Servers and Storage
1 Robert Speck Parkway, Suite 400, Mississauga, Ontario L4Z 4E7
Office: 416-673-3320
Mobile: 647-465-9752
Email: gproc...@symcor.com
