We randomly get Open-iSCSI "conn errors" when connecting to an
Infortrend A16E-G2130-4 storage array. We discussed this earlier on
the list, see:
http://tr.im/DVQm
http://tr.im/DVQp
Open-iSCSI logs this:
===============================================
Nov 2 18:34:02 vz-17 kernel: ping timeout of 5 secs expired, last rx
408250499, last ping 408249467, now 408254467
Nov 2 18:34:02 vz-17 kernel: connection1:0: iscsi: detected conn
error (1011)
Nov 2 18:34:03 vz-17 iscsid: Kernel reported iSCSI connection 1:0
error (1011) state (3)
Nov 2 18:34:07 vz-17 iscsid: connection1:0 is operational after
recovery (1 attempts)
Nov 2 18:34:52 vz-17 kernel: ping timeout of 5 secs expired, last rx
408294833, last ping 408299833, now 408304833
Nov 2 18:34:52 vz-17 kernel: connection1:0: iscsi: detected conn
error (1011)
Nov 2 18:34:53 vz-17 iscsid: Kernel reported iSCSI connection 1:0
error (1011) state (3)
Nov 2 18:34:57 vz-17 iscsid: connection1:0 is operational after
recovery (1 attempts)
===============================================
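For reference, the timer values in the "ping timeout" message are kernel
jiffies. Assuming HZ=1000 on this CentOS 5 kernel (one jiffy per
millisecond; an assumption, the log does not state HZ), the arithmetic
matches the 5-second nop-out timeout:

```shell
# Jiffy values copied from the first log entry above (HZ=1000 assumed,
# so one jiffy is one millisecond):
last_rx=408250499
last_ping=408249467
now=408254467
echo "ms since last ping: $(( now - last_ping ))"  # 5000 ms = the 5 s timeout
echo "ms since last rx:   $(( now - last_rx ))"    # no PDU received for ~4 s
```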
Running on CentOS 5.4 with "iscsi-initiator-utils-6.2.0.871-0.10.el5";
I think it's not an Open-iSCSI bug, as Mike suggested at:
http://groups.google.com/group/open-iscsi/msg/fe37156096b2955f
I get this error only when connecting to the Infortrend storage, and not
with NetApp, Nexsan, etc. *connected to the same SAN*.
Using Wireshark I see a lot of "TCP Dup ACK", "TCP ACKed lost
segment", etc., and the iSCSI session finally ends in a timeout; see a
screenshot here:
Using Wireshark IO graphs I get this strange report about TCP/IP errors:
And this is another report from the same SAN, connecting to a NetApp:
These TCP/IP errors occur only when connecting to the Infortrend
storage, and not with other targets in the same SAN (using the same
switch infrastructure); is there any way to deal with this using
Open-iSCSI? From what I see on the Internet, many Infortrend users are
suffering from this behavior.
Thanks!
P.S.: speed and duplex configuration is correct at all points, and there
are no CRC errors on the switch.
--
Santi Saez
http://woop.es
> We randomly get Open-iSCSI "conn errors" when connecting to an
> Infortrend A16E-G2130-4 storage array. We discussed this earlier on
> the list, see:
> Nov 2 18:34:02 vz-17 kernel: ping timeout of 5 secs expired, last rx 408250499, last ping 408249467, now 408254467
> Nov 2 18:34:02 vz-17 kernel: connection1:0: iscsi: detected conn error (1011)
> Nov 2 18:34:03 vz-17 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
> Nov 2 18:34:07 vz-17 iscsid: connection1:0 is operational after recovery (1 attempts)
> Nov 2 18:34:52 vz-17 kernel: ping timeout of 5 secs expired, last rx
That looks vaguely familiar, although I think mine was a nop-out timeout
(it might be reported in another log file). Does it mostly happen when you
do long sequential reads from the Infortrend unit? In my case it turned out
to be a very low level of packet drops caused by a Cisco 2960G when
'mls qos' was enabled (which, due to an IOS bug, didn't increment the drop
counter). I'm not sure if the loss when 'mls qos' is enabled is by design,
as part of WRED, or a function of the port buffers being divided up into
chunks smaller than optimal.
Having TCP window scaling enabled made the problem an order of magnitude
worse; try disabling it and see if you still have the same problem.
(I suggest something like dd if=/dev/sdc of=/dev/null bs=1048576 count=10
to see if that triggers it, assuming it was the same problem I was
suffering from.)
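A sketch of that test with the window-scaling toggle included (/dev/sdc is
only an example device name here; point it at the Infortrend-backed disk,
and note the sysctl change only affects newly established connections):

```shell
# Disable TCP window scaling (takes effect for new TCP connections):
sysctl -w net.ipv4.tcp_window_scaling=0

# Sequential read from the suspect device (example name /dev/sdc):
dd if=/dev/sdc of=/dev/null bs=1048576 count=10

# In another terminal, watch for conn errors while the read runs:
tail -f /var/log/messages | grep "detected conn error"
```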
Every other iSCSI target I've tried recovered pretty gracefully from this,
but not the Infortrend; I suspect their TCP retransmit algorithm needs a
lot of love, and that it's pathologically broken when window scaling is
enabled.
Sadly, when I opened a ticket with Infortrend, enclosing tcpdumps and
analysis, they were no more useful than to let me know that they don't
support Debian (despite having instructions for Debian iSCSI on their
website), and that they don't support Western Digital drives in redundant
controller configurations (nice of them to have this somewhere public).
Hope that might be somewhat useful to you. Please let me / the list
know how you get on. There was sadly little information on this when I was
tearing my hair out over it.
Best wishes
James
--
James Rice
Jump Networks Ltd
You can turn off ping/nops by setting
node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0
(set that in iscsid.conf and then re-discover the target, or run "iscsiadm -m
node -T your_target -o update -n name_of_param_above -v 0")
Or you might want to set them higher.
This might just work around the problem: you would no longer see the
nop/ping and conn errors, and instead would just see a slowdown in the
workloads being run.
If you guys can get hold of any Infortrend people, let me know, because
I would be happy to work with them on this.
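Putting Mike's two options together as one sketch (the target name
"your_target" is a placeholder; the parameter names are the standard
open-iscsi ones quoted above):

```shell
# Option 1: set in /etc/iscsi/iscsid.conf before (re)discovering the target:
#   node.conn[0].timeo.noop_out_interval = 0
#   node.conn[0].timeo.noop_out_timeout = 0

# Option 2: update an already-discovered node record in place:
iscsiadm -m node -T your_target -o update \
  -n node.conn[0].timeo.noop_out_interval -v 0
iscsiadm -m node -T your_target -o update \
  -n node.conn[0].timeo.noop_out_timeout -v 0
```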
Dear James,
> That looks vaguely familiar, although I think mine was a nop-out timeout
> (it might be reported in another log file). Does it mostly happen when you
> do long sequential reads from the Infortrend unit? In my case it turned out
> to be a very low level of packet drops caused by a Cisco 2960G when
> 'mls qos' was enabled (which, due to an IOS bug, didn't increment the drop
> counter). I'm not sure if the loss when 'mls qos' is enabled is by design,
> as part of WRED, or a function of the port buffers being divided up into
> chunks smaller than optimal.
>
> Having TCP window scaling enabled made the problem an order of magnitude
> worse; try disabling it and see if you still have the same problem.
> (I suggest something like dd if=/dev/sdc of=/dev/null bs=1048576 count=10
> to see if that triggers it, assuming it was the same problem I was
> suffering from.)
>
> Every other iSCSI target I've tried recovered pretty gracefully from this,
> but not the Infortrend; I suspect their TCP retransmit algorithm needs a
> lot of love, and that it's pathologically broken when window scaling is
> enabled.
Disabling TCP window scaling [1] on Linux solves the nop-out problem; we
no longer get "iscsi: detected conn error" messages, and performance
improves :)
It's very strange: we have 3 Cisco 2960G switches in the SAN, and this
behavior only occurs on two of them; we're looking into this problem in
depth.
The nop-out issue has been solved, but we still see a lot of "duplicate
ACKs" on all machines. I will update this post with more info. James,
thanks a lot for the help.
Regards,
[1] # echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
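The echo above only lasts until reboot; to make the change persistent, the
usual sysctl mechanism applies (a sketch; standard on CentOS 5):

```shell
# Runtime change (equivalent to the echo in [1]):
sysctl -w net.ipv4.tcp_window_scaling=0

# Persist across reboots:
echo "net.ipv4.tcp_window_scaling = 0" >> /etc/sysctl.conf
sysctl -p

# Verify:
cat /proc/sys/net/ipv4/tcp_window_scaling
```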
Dear Mike,
> You can turn off ping/nops by setting
>
> node.conn[0].timeo.noop_out_interval = 0
> node.conn[0].timeo.noop_out_timeout = 0
>
> (set that in iscsid.conf then rediscovery the target or run "iscsiadm -m
> node -T your_target -o update -n name_of_param_above -v 0"
Thanks!! As I said to James in the previous email, disabling TCP window
scaling *partially solves* this problem; we still keep the nop pings in the
configuration. But we still have too many "TCP Dup ACKs" on the network :-S
> This might just work around. What might happen is that you will not see
> the nop/ping and conn errors and instead would just see a slow down in
> the workloads being run.
I have passed your contact details to the Infortrend developers; an
engineer will contact you, thanks!
Regards,
Could you send more of the log? Do you see a message like
"connection1:0 is operational after recovery (1 attempts)"
after you see the conn errors (and how many attempts)?
Try disabling nops by setting
>
> Could you send more of the log? Do you see a message like
> "connection1:0 is operational after recovery (1 attempts)"
> after you see the conn errors (and how many attempts)?
Here's one particular connection:
Nov 4 05:12:14 backup kernel: connection22:0: ping timeout of 5 secs
expired, recv timeout 5, last rx 4321648393, last ping 4321653393, now
4321658393
Nov 4 05:12:14 backup kernel: connection22:0: detected conn error (1011)
Nov 4 05:12:21 backup iscsid: connection22:0 is operational after recovery
(1 attempts)
Nov 4 05:12:46 backup kernel: connection22:0: ping timeout of 5 secs
expired, recv timeout 5, last rx 4321680691, last ping 4321685691, now
4321690691
Nov 4 05:12:46 backup kernel: connection22:0: detected conn error (1011)
Nov 4 05:12:58 backup iscsid: connection22:0 is operational after recovery
(1 attempts)
Nov 4 07:46:03 backup kernel: connection22:0: ping timeout of 5 secs
expired, recv timeout 5, last rx 4330877890, last ping 4330882890, now
4330887890
Nov 4 07:46:03 backup kernel: connection22:0: detected conn error (1011)
Nov 4 07:46:10 backup iscsid: connection22:0 is operational after recovery
(1 attempts)
Nov 4 07:46:27 backup kernel: connection22:0: ping timeout of 5 secs
expired, recv timeout 5, last rx 4330901733, last ping 4330906733, now
4330911733
Nov 4 07:46:27 backup kernel: connection22:0: detected conn error (1011)
Nov 4 07:46:32 backup iscsid: connection22:0 is operational after recovery
(1 attempts)
Nov 4 07:47:21 backup kernel: connection22:0: ping timeout of 5 secs
expired, recv timeout 5, last rx 4330955414, last ping 4330960414, now
4330965414
Nov 4 07:47:21 backup kernel: connection22:0: detected conn error (1011)
Nov 4 07:47:28 backup iscsid: connection22:0 is operational after recovery
(1 attempts)
Nov 4 07:48:28 backup kernel: connection22:0: ping timeout of 5 secs
expired, recv timeout 5, last rx 4331023213, last ping 4331028213, now
4331033213
Nov 4 07:48:28 backup kernel: connection22:0: detected conn error (1011)
Nov 4 07:48:35 backup iscsid: connection22:0 is operational after recovery
(1 attempts)
FWIW:
[root@backup ~]# cat /var/log/messages | grep "after recovery" | awk '{print
$11" "$12}' | sort | uniq
(113 attempts)
(1 attempts)
(24 attempts)
(2 attempts)
(3 attempts)
(4 attempts)
(5 attempts)
(66 attempts)
(68 attempts)
(6 attempts)
(7 attempts)
(8 attempts)
(9 attempts)
>
> Try disabling nops by setting
>
> node.conn[0].timeo.noop_out_interval = 0
> node.conn[0].timeo.noop_out_timeout = 0
Ok, I'll let you know how it pans out.
Thanks,
Matthew
Hi mdaitc,
> I’m seeing similar TCP “weirdness” as the other posts mention as well
> as the below errors.
(..)
> Nov 2 08:15:14 backup kernel: connection33:0: detected conn error
> The performance isn’t what I’d expect:
(..)
What happens if you disable the TCP window scaling option on the RHEL
servers?
# echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
In our case, the iSCSI "conn errors" stopped after disabling it, but we
still have a lot of TCP "weirdness" on the network, mainly dup ACK packets.
What is the sendwait patch? Is it a patch for open-iscsi or for the
kernel network code?
I think I have the source they are using. Could you run uname -r, so I
can see what kernel they are using?
>
>
>>
>> Try disabling nops by setting
>>
>> node.conn[0].timeo.noop_out_interval = 0
>> node.conn[0].timeo.noop_out_timeout = 0
>
I'm still getting errors:
Nov 10 09:08:04 backup kernel: connection12:0: detected conn error (1011)
Nov 10 09:08:05 backup iscsid: Kernel reported iSCSI connection 12:0 error
(1011) state (3)
Nov 10 09:08:08 backup iscsid: connection12:0 is operational after recovery
(1 attempts)
Nov 10 09:09:43 backup kernel: connection11:0: detected conn error (1011)
Nov 10 09:09:43 backup kernel: connection12:0: detected conn error (1011)
Nov 10 09:09:44 backup kernel: connection11:0: detected conn error (1011)
Nov 10 09:09:44 backup iscsid: Kernel reported iSCSI connection 11:0 error
(1011) state (3)
Nov 10 09:09:44 backup iscsid: Kernel reported iSCSI connection 12:0 error
(1011) state (3)
Nov 10 09:09:44 backup iscsid: Kernel reported iSCSI connection 11:0 error
(1011) state (1)
Nov 10 09:09:46 backup kernel: session11: target reset succeeded
Nov 10 09:09:47 backup iscsid: connection11:0 is operational after recovery
(1 attempts)
Nov 10 09:09:47 backup iscsid: connection12:0 is operational after recovery
(1 attempts)
Nov 10 09:09:56 backup kernel: sd 18:0:0:2: SCSI error: return code =
0x000e0000
Nov 10 09:09:56 backup kernel: end_request: I/O error, dev sdv, sector
60721248
Nov 10 09:09:56 backup kernel: device-mapper: multipath: Failing path 65:80.
Nov 10 09:09:56 backup kernel: sd 18:0:0:2: SCSI error: return code =
0x000e0000
Nov 10 09:09:56 backup kernel: end_request: I/O error, dev sdv, sector
60727648
Nov 10 09:09:56 backup kernel: sd 18:0:0:2: SCSI error: return code =
0x000e0000
Nov 10 09:10:31 backup kernel: device-mapper: multipath: Failing path
65:112.
Interestingly, I tried a Windows Server 2008 R2 box talking over a single
connection to the storage unit, configured to access it via just one
interface, and I was able to sustain 20 MB/s, so it would "appear" to be a
Linux-related issue - I'm only able to get 9 MB/s out of Linux even when
using 8 interfaces on both controllers.
Matthew
What version of open-iscsi were you using, and what kernel? And were you
using the iscsi kernel modules from the open-iscsi.org tarball or from the
kernel?
It looks like we are sending more IO than the target can handle. In one
of those cases it took more than 30 or 60 seconds (depending on your
timeout value).
What is the value of
cat /sys/block/sdXYZ/device/timeout
?
If it is 30 or 60 could you increase it to 360? After you login to the
target do
echo 360 > /sys/block/sdXYZ/device/timeout
And what is the value of:
iscsiadm -m node -T your_target | grep node.session.cmds_max
If that is 128, then could you decrease that to 32 or 16?
Run
iscsiadm -m node -T your_target -u
iscsiadm -m node -T your_target -o update -n node.session.cmds_max -v 32
iscsiadm -m node -T your_target -l
And if those prevent the IO errors, then could you do
echo noop > /sys/block/sdXYZ/queue/scheduler
to see if performance increases with a different scheduler?
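The whole sequence as one sketch ("sdXYZ" and "your_target" are
placeholders, as above):

```shell
# 1. Give the SCSI layer more time before failing commands:
cat /sys/block/sdXYZ/device/timeout          # typically 30 or 60
echo 360 > /sys/block/sdXYZ/device/timeout   # re-apply after each login

# 2. Lower the queue depth so we don't overrun the target:
iscsiadm -m node -T your_target | grep node.session.cmds_max
iscsiadm -m node -T your_target -u           # log out first
iscsiadm -m node -T your_target -o update -n node.session.cmds_max -v 32
iscsiadm -m node -T your_target -l           # log back in

# 3. If the IO errors stop, try the noop elevator for performance:
echo noop > /sys/block/sdXYZ/queue/scheduler
```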