Error while backing up multipath devices using CDP

jeet2k123

Jun 25, 2009, 7:06:45 AM
to open-iscsi
Hi all,

We are using the R1Soft CDP tool to back up Linux files on CentOS 5.2.

The backup works fine for normal disks, but for multipath devices it
gives an error while taking the backup.

I am using 11 iSCSI partitions on a SAN device configured in multipath.
I am using 8 paths in multipath.
My multipath is using 2 Intel NICs, which are connected to a dedicated
managed switch (HP).

I have configured multipath as per this doc.
http://wiki.r1soft.com/display/kb/Linux+Agent+Multipath+Workaround

My mount point is
/dev/dm-0 on /home11 type ext3 (rw)

The moment I start the backup for the above mount point, I get this error:
{code}
Jun 23 12:21:24 mss-us2 kernel: connection8:0: ping timeout of 5 secs expired, last rx 7739207452, last ping 7739212452, now 7739217452
Jun 23 12:21:24 mss-us2 kernel: connection8:0: detected conn error (1011)
Jun 23 12:21:24 mss-us2 kernel: sd 10:0:0:11: SCSI error: return code = 0x00020000
Jun 23 12:21:24 mss-us2 kernel: end_request: I/O error, dev sdi, sector 2362696
Jun 23 12:21:24 mss-us2 kernel: device-mapper: multipath: Failing path 8:128.
Jun 23 12:21:24 mss-us2 multipathd: sdab: tur checker reports path is down
Jun 23 12:21:24 mss-us2 multipathd: checker failed path 65:176 in map home13
Jun 23 12:21:24 mss-us2 kernel: device-mapper: multipath: Failing path 65:176.
Jun 23 12:21:24 mss-us2 multipathd: home13: remaining active paths: 7
Jun 23 12:21:25 mss-us2 iscsid: Kernel reported iSCSI connection 8:0 error (1011) state (3)
Jun 23 12:21:28 mss-us2 iscsid: connection8:0 is operational after recovery (1 attempts)
Jun 23 12:22:00 mss-us2 kernel: connection2:0: ping timeout of 5 secs expired, last rx 7739243726, last ping 7739248726, now 7739253726
Jun 23 12:22:00 mss-us2 kernel: connection2:0: detected conn error (1011)
Jun 23 12:22:00 mss-us2 kernel: sd 4:0:0:11: SCSI error: return code = 0x00020000
Jun 23 12:22:00 mss-us2 kernel: end_request: I/O error, dev sdg, sector 3146136
Jun 23 12:22:00 mss-us2 kernel: device-mapper: multipath: Failing path 8:96.
Jun 23 12:22:00 mss-us2 kernel: device-mapper: multipath: Failing path 8:208.
Jun 23 12:22:00 mss-us2 multipathd: sdn: tur checker reports path is down
Jun 23 12:22:00 mss-us2 multipathd: checker failed path 8:208 in map home12
Jun 23 12:22:00 mss-us2 multipathd: home12: remaining active paths: 7
Jun 23 12:22:01 mss-us2 iscsid: Kernel reported iSCSI connection 2:0 error (1011) state (3)
Jun 23 12:22:04 mss-us2 iscsid: connection2:0 is operational after recovery (1 attempts)
Jun 23 12:22:08 mss-us2 multipathd: dm-0: add map (uevent)
Jun 23 12:22:08 mss-us2 multipathd: dm-0: devmap already registered
Jun 23 12:22:08 mss-us2 multipathd: 8:96: mark as failed
Jun 23 12:22:08 mss-us2 multipathd: home11: remaining active paths: 7
Jun 23 12:22:08 mss-us2 multipathd: 8:128: mark as failed
Jun 23 12:22:08 mss-us2 multipathd: home11: remaining active paths: 6
Jun 23 12:22:08 mss-us2 multipathd: dm-4: add map (uevent)
Jun 23 12:22:08 mss-us2 multipathd: dm-4: devmap already registered
Jun 23 12:22:08 mss-us2 multipathd: dm-0: add map (uevent)
Jun 23 12:22:08 mss-us2 multipathd: dm-0: devmap already registered
Jun 23 12:22:08 mss-us2 multipathd: dm-1: add map (uevent)
Jun 23 12:22:08 mss-us2 multipathd: dm-1: devmap already registered
Jun 23 12:22:13 mss-us2 multipathd: sdab: tur checker reports path is up
Jun 23 12:22:13 mss-us2 multipathd: 65:176: reinstated
Jun 23 12:22:13 mss-us2 multipathd: home13: remaining active paths: 8
Jun 23 12:22:14 mss-us2 multipathd: sdg: tur checker reports path is up
Jun 23 12:22:14 mss-us2 multipathd: 8:96: reinstated
Jun 23 12:22:14 mss-us2 multipathd: home11: remaining active paths: 7
Jun 23 12:22:14 mss-us2 multipathd: sdi: tur checker reports path is up
Jun 23 12:22:14 mss-us2 multipathd: 8:128: reinstated
Jun 23 12:22:14 mss-us2 multipathd: home11: remaining active paths: 8
Jun 23 12:22:15 mss-us2 multipathd: sdn: tur checker reports path is up
Jun 23 12:22:15 mss-us2 multipathd: 8:208: reinstated
{code}

R1Soft support says there could be a problem with the iSCSI
device: the CDP agent may be overwhelming it with too
many requests. If that's the case, it may be resolved with a
configuration change to the iSCSI driver.

Any suggestions/tips on what I might be doing wrong?

Regards
Jeetendra

Mike Christie

Jun 25, 2009, 11:07:53 AM
to open-...@googlegroups.com
On 06/25/2009 06:06 AM, jeet2k123 wrote:
> Hi all,
>
> We are using the R1Soft CDP tool to back up Linux files on CentOS 5.2.
>
> The backup works fine for normal disks, but for multipath devices it
> gives an error while taking the backup.
>
> I am using 11 iSCSI partitions on a SAN device configured in multipath.
> I am using 8 paths in multipath.
> My multipath is using 2 Intel NICs, which are connected to a dedicated
> managed switch (HP).
>
> I have configured multipath as per this doc.
> http://wiki.r1soft.com/display/kb/Linux+Agent+Multipath+Workaround
>
> My mount point is
> /dev/dm-0 on /home11 type ext3 (rw)
>
> The moment I start the backup for the above mount point, I get this error:
> {code}
> Jun 23 12:21:24 mss-us2 kernel: connection8:0: ping timeout of 5 secs expired, last rx 7739207452, last ping 7739212452, now 7739217452
> Jun 23 12:21:24 mss-us2 kernel: connection8:0: detected conn error (1011)

What kernel are you using?

jeet2k123

Jun 25, 2009, 12:05:07 PM
to open-iscsi
2.6.18-128.1.6.el5

dushy

Jun 29, 2009, 3:31:43 AM
to open-iscsi
What about your config file and portal settings?

jeet2k123

Jun 29, 2009, 7:25:31 AM
to open-iscsi
My iscsid.conf is
#######################
node.startup = manual
node.session.timeo.replacement_timeout = 5
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 20
node.session.initial_login_retry_max = 8
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].iscsi.MaxRecvDataSegmentLength = 131072
discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 32768
node.session.iscsi.FastAbort = Yes
######################

My portal setting is
######################
172.16.1.200:3260,1 iqn.2002-10.com.san:raid.sn7617293.001
172.16.1.201:3260,1 iqn.2002-10.com.san:raid.sn7617293.101
172.16.1.221:3260,1 iqn.2002-10.com.san:raid.sn7617247.112
172.16.1.220:3260,1 iqn.2002-10.com.san:raid.sn7617247.012
172.16.2.200:3260,1 iqn.2002-10.com.san:raid.sn7617293.201
172.16.2.220:3260,1 iqn.2002-10.com.san:raid.sn7617247.212
172.16.2.221:3260,1 iqn.2002-10.com.san:raid.sn7617247.312
172.16.2.201:3260,1 iqn.2002-10.com.san:raid.sn7617293.301
######################
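
A quick arithmetic check ties the noop_out_interval = 5 and noop_out_timeout = 5 settings in the iscsid.conf above to the first ping timeout line in the original log. This is only a sketch: it assumes the "last rx", "last ping" and "now" counters are kernel jiffies and that this kernel ticks 1000 times per second, neither of which is stated in the thread. The function name ping_gaps is made up for the example.

```shell
#!/bin/bash
# Sketch: compute the gaps between the three counters in a "ping timeout" line.
# Assumption (not stated in the thread): the counters are jiffies at HZ=1000,
# so 1000 ticks correspond to one second.

ping_gaps() {
    # Reads kernel log lines on stdin; prints "<rx-to-ping>s <ping-to-now>s"
    # for each line that carries the three counters.
    sed -n 's/.*last rx \([0-9]*\), last ping \([0-9]*\), now \([0-9]*\).*/\1 \2 \3/p' |
    while read -r rx ping now; do
        echo "$(( (ping - rx) / 1000 ))s $(( (now - ping) / 1000 ))s"
    done
}

line='Jun 23 12:21:24 mss-us2 kernel: connection8:0: ping timeout of 5 secs expired, last rx 7739207452, last ping 7739212452, now 7739217452'
printf '%s\n' "$line" | ping_gaps
```

Under those assumptions this prints "5s 5s": about 5 seconds without receiving anything before the ping was sent, then another 5 seconds without a reply, which is consistent with both timers being set to 5.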

Mike Christie

Jun 30, 2009, 11:10:53 PM
to open-...@googlegroups.com
On 06/29/2009 06:25 AM, jeet2k123 wrote:
> node.conn[0].timeo.noop_out_interval = 5
> node.conn[0].timeo.noop_out_timeout = 5

If you set these to zero, or set them higher, does it help? You have to
rerun the discovery command so the new iscsid.conf values are picked up,
or, if you have the tools from CentOS 5.3, you can just run
iscsiadm -m node -o update -n node.conn[0].timeo.noop_out_interval -v 0
iscsiadm -m node -o update -n node.conn[0].timeo.noop_out_timeout -v 0

Or set them to 25:
iscsiadm -m node -o update -n node.conn[0].timeo.noop_out_interval -v 25
iscsiadm -m node -o update -n node.conn[0].timeo.noop_out_timeout -v 25
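
When trying several values, it can help to print the two update commands first and review them before running anything. A minimal sketch, assuming bash; noop_update_cmds is a hypothetical helper, and it only echoes the iscsiadm invocations shown above rather than executing them.

```shell
#!/bin/bash
# Sketch: print (do not run) the iscsiadm commands that set both NOP-Out
# timers to one value. noop_update_cmds is an illustrative name, not a tool.

noop_update_cmds() {
    local value=$1
    local key
    for key in noop_out_interval noop_out_timeout; do
        # Same commands as above, with the setting name single-quoted so the
        # shell cannot treat "[0]" as a glob pattern.
        printf "iscsiadm -m node -o update -n 'node.conn[0].timeo.%s' -v %s\n" "$key" "$value"
    done
}

noop_update_cmds 25
```

Calling it with 0 instead of 25 prints the commands for the first option.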

jeet2k123

Jul 16, 2009, 12:26:56 PM
to open-iscsi
I tried both options, but I'm still getting the same error :(

Regards
Jeetendra

Mike Christie

Jul 16, 2009, 1:59:36 PM
to open-...@googlegroups.com

If you set the values to 0 you should not get the ping timeout error
message. When you use that value do you see:

Jun 23 12:21:24 mss-us2 kernel: connection8:0: ping timeout of 5 secs expired, last rx 7739207452, last ping 7739212452, now 7739217452

If you do then run

iscsiadm -m node -T your_target -p your_portal | grep noop

and make sure those values are 0.
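
That check can be scripted. The sketch below assumes the node record is printed as "name = value" lines, the same layout as the iscsid.conf excerpt earlier in the thread. noop_all_equal is a hypothetical helper name, and the sample text fed to it is illustrative, not real output from this system.

```shell
#!/bin/bash
# Sketch: succeed only if every noop_out_* setting on stdin has the wanted value.
# Assumes "name = value" lines, as in the iscsid.conf excerpt in this thread.

noop_all_equal() {
    awk -v want="$1" '
        /noop_out_(interval|timeout)/ {
            seen++
            if ($NF != want) bad++
        }
        # Exit status 0 only if at least one setting matched and none differed.
        END { exit !(seen > 0 && bad == 0) }
    '
}

# Illustrative input; on a real system it would come from
#   iscsiadm -m node -T your_target -p your_portal
sample='node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0'

if printf '%s\n' "$sample" | noop_all_equal 0; then
    echo "noop timers are 0"
else
    echo "noop timers differ"
fi
```

The function also fails when no noop_out_* line is present at all, so an empty or mistyped query is not mistaken for a pass.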

jeet2k123

Jul 20, 2009, 5:51:58 AM
to open-iscsi
> If you set the values to 0 you should not get the ping timeout error
> message. When you use that value do you see:
>
> Jun 23 12:21:24 mss-us2 kernel: connection8:0: ping timeout of 5 secs expired, last rx 7739207452, last ping 7739212452, now 7739217452
>
> If you do then run
>
> iscsiadm -m node -T your_target -p your_portal | grep noop
>
> and make sure those values are 0.

Hi Mike,

Thanks for your help.
Setting the node.conn[0].timeo.noop_out_* values to 0 made the backup work.

-Jeetendra

dushy

Aug 10, 2009, 8:45:34 AM
to open-iscsi
Wouldn't this cause issues with path failover, since multipath is
being used here? I am guessing that with noop_out_timeout set to 0
the path will never be failed.

Dushyanth

jeet2k123

Aug 12, 2009, 11:41:41 AM
to open-iscsi

> Wouldn't this cause issues with path failover, since multipath is
> being used here? I am guessing that with noop_out_timeout set to 0
> the path will never be failed.

Setting node.conn[0].timeo.noop_out_interval and
node.conn[0].timeo.noop_out_timeout to 25 also works fine.