ping timeout of 5 secs expired, recv timeout 5, last rx 4393885396, last ping 4393886646, now 4393887896


Matt

Nov 21, 2011, 9:48:53 PM
to open-iscsi

Greetings,

I have seen this error covered elsewhere in this group, but none of
the solutions I have found seem to apply to the issue I am facing.

I get errors like the one below at random. As you can see, almost
immediately the system says it is back up.

This causes any attached virtual machines to go read-only.

Nov 21 09:16:13 hv1 kernel: connection4:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4393885396, last ping 4393886646, now 4393887896
Nov 21 09:16:13 hv1 kernel: connection4:0: detected conn error (1011)
Nov 21 09:16:13 hv1 iscsid: Kernel reported iSCSI connection 4:0 error (1011) state (3)
Nov 21 09:16:14 hv1 kernel: connection3:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4393885656, last ping 4393886906, now 4393888156
Nov 21 09:16:14 hv1 kernel: connection3:0: detected conn error (1011)
Nov 21 09:16:14 hv1 multipathd: sda: readsector0 checker reports path is down
Nov 21 09:16:14 hv1 multipathd: checker failed path 8:0 in map mpath5
Nov 21 09:16:14 hv1 kernel: device-mapper: multipath: Failing path 8:0.
Nov 21 09:16:14 hv1 multipathd: mpath5: remaining active paths: 0
Nov 21 09:16:14 hv1 multipathd: dm-2: add map (uevent)
Nov 21 09:16:14 hv1 multipathd: dm-2: devmap already registered
Nov 21 09:16:14 hv1 iscsid: Kernel reported iSCSI connection 3:0 error (1011) state (3)
Nov 21 09:16:17 hv1 iscsid: connection4:0 is operational after recovery (1 attempts)
Nov 21 09:16:18 hv1 iscsid: connection3:0 is operational after recovery (1 attempts)
Nov 21 09:16:29 hv1 multipathd: sda: readsector0 checker reports path is up
Nov 21 09:16:29 hv1 multipathd: 8:0: reinstated
Nov 21 09:16:29 hv1 multipathd: mpath5: remaining active paths: 1
Nov 21 09:16:29 hv1 multipathd: dm-2: add map (uevent)
Nov 21 09:16:29 hv1 multipathd: dm-2: devmap already registered

I am running the following kernel:
Linux hv1 2.6.18-238.9.1.el5xen

This is CentOS 5.6 64 bit.

Can someone point me in the right direction? Perhaps the ping timeout
needs to be adjusted, but I have a feeling this is software related.

Thank you.

Matt
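
For reference, the three counters in the ping timeout line look like kernel jiffies values. A minimal sketch of the arithmetic follows, assuming a tick rate of HZ=250. That rate is not stated anywhere in the thread; it is inferred because it makes the deltas line up with the logged "5 secs". The names ASSUMED_HZ and jiffies_to_seconds are illustrative only.

```python
# Values copied from the first "ping timeout" log line in the post above.
LAST_RX = 4393885396
LAST_PING = 4393886646
NOW = 4393887896

# Assumption: the kernel tick rate (HZ) is 250. The thread never states it;
# it is inferred because it makes both deltas equal the logged 5 seconds.
ASSUMED_HZ = 250


def jiffies_to_seconds(delta: int, hz: int = ASSUMED_HZ) -> float:
    """Convert a difference between two jiffies counters to seconds."""
    return delta / hz


# Time with nothing received before the initiator sent a ping.
idle_before_ping = jiffies_to_seconds(LAST_PING - LAST_RX)
# Time spent waiting for the ping reply before the connection was failed.
wait_after_ping = jiffies_to_seconds(NOW - LAST_PING)

print(f"idle before ping was sent: {idle_before_ping:.1f} s")
print(f"waited for ping reply:     {wait_after_ping:.1f} s")
```

Read this way, the line would mean that nothing arrived from the target for the 5 second receive timeout, a ping was sent, and no reply arrived within the 5 second ping timeout. The connection3:0 line in the same log has the same two deltas.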

Mike Christie

Nov 22, 2011, 5:33:21 PM
to open-...@googlegroups.com, Matt

If you just turn nops/pings off, do you still see the "conn error" and
"is operational" messages in the logs?
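
For anyone unsure where nops/pings are configured: in open-iscsi they are controlled by the node settings node.conn[0].timeo.noop_out_interval and node.conn[0].timeo.noop_out_timeout, and a value of 0 turns them off. The sketch below only builds and prints the iscsiadm commands that would update an existing node record; it does not run them. The target IQN and portal address are hypothetical placeholders, and noop_off_commands is an illustrative helper, not part of any tool.

```python
import shlex
from typing import List

# The two open-iscsi node settings that control NOP-Out pings.
NOOP_SETTINGS = (
    "node.conn[0].timeo.noop_out_interval",
    "node.conn[0].timeo.noop_out_timeout",
)


def noop_off_commands(target: str, portal: str) -> List[List[str]]:
    """Build (but do not run) iscsiadm commands that set both values to 0."""
    return [
        ["iscsiadm", "-m", "node", "-T", target, "-p", portal,
         "-o", "update", "-n", name, "-v", "0"]
        for name in NOOP_SETTINGS
    ]


# Hypothetical target and portal; substitute the values for your own setup.
EXAMPLE_TARGET = "iqn.2011-11.com.example:storage.lun1"
EXAMPLE_PORTAL = "192.0.2.10:3260"

for argv in noop_off_commands(EXAMPLE_TARGET, EXAMPLE_PORTAL):
    print(shlex.join(argv))
```

The same two settings also appear in /etc/iscsi/iscsid.conf, where they act as defaults for newly discovered targets. As far as I know, a change to an existing node record only applies to a session after it logs out and back in, so treat this as a test on a non-critical path first.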

Mike Christie

Nov 27, 2011, 11:51:55 PM
to Matt Lundstrom, open-...@googlegroups.com
On 11/27/2011 06:06 PM, Matt Lundstrom wrote:
> Hi Mike,
>
> Well, the log above was just a sample. The underlying 1011 error sometimes
> appears without a ping error, for example:
>

Could you try the kernel I sent in the other mail:

http://people.redhat.com/mchristi/kernel-2.6.18-274.el5.iscsidebug1.i686.rpm

http://people.redhat.com/mchristi/kernel-PAE-2.6.18-274.el5.iscsidebug1.i686.rpm

http://people.redhat.com/mchristi/kernel-2.6.18-274.el5.iscsidebug1.x86_64.rpm

It is just the rhel 5.7 kernel but with the iscsi eh debugging on and
with some printks in some of the other iscsi network eh paths. I had to
make a special kernel because the iscsi modules shipped in rhel 5 do not
allow you to turn on debugging for just the eh paths.

If you need to run the rhel 5.6 kernel then you should be able to take
the iscsi modules from the kernel above and run them in 5.6.

Matt Lundstrom

Nov 27, 2011, 7:06:51 PM
to Mike Christie, open-...@googlegroups.com
Hi Mike,

Well, the log above was just a sample. The underlying 1011 error sometimes appears without a ping error, for example:

Nov 27 17:11:47 hv4 kernel:  connection4:0: detected conn error (1011)
Nov 27 17:11:48 hv4 iscsid: Kernel reported iSCSI connection 4:0 error (1011) state (3)
Nov 27 17:11:54 hv4 iscsid: connection4:0 is operational after recovery (1 attempts)
Nov 27 17:11:54 hv4 kernel:  connection2:0: detected conn error (1011)
Nov 27 17:11:55 hv4 iscsid: Kernel reported iSCSI connection 2:0 error (1011) state (3)
Nov 27 17:11:56 hv4 kernel: nfs: server 10.0.43.30 not responding, still trying
Nov 27 17:11:59 hv4 iscsid: connection2:0 is operational after recovery (1 attempts)

[root@hv4 log]# uname -r
2.6.18-274.7.1.el5xen

The connections just fall off the face of the earth at random, and then are picked up a few seconds later.

I was thrown into this project so I am not sure where nops/pings are set or unset.

Kind Regards,

Matt
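
To put numbers on "picked up a few seconds later", the error and recovery lines can be paired per connection. This is a minimal sketch that assumes the syslog format shown in the excerpt above and takes the year 2011 from the thread dates, since syslog timestamps carry no year. LINE_RE and outage_durations are illustrative names.

```python
import re
from datetime import datetime

# Log excerpt copied from the post above.
SAMPLE_LOG = """\
Nov 27 17:11:47 hv4 kernel:  connection4:0: detected conn error (1011)
Nov 27 17:11:48 hv4 iscsid: Kernel reported iSCSI connection 4:0 error (1011) state (3)
Nov 27 17:11:54 hv4 iscsid: connection4:0 is operational after recovery (1 attempts)
Nov 27 17:11:54 hv4 kernel:  connection2:0: detected conn error (1011)
Nov 27 17:11:55 hv4 iscsid: Kernel reported iSCSI connection 2:0 error (1011) state (3)
Nov 27 17:11:56 hv4 kernel: nfs: server 10.0.43.30 not responding, still trying
Nov 27 17:11:59 hv4 iscsid: connection2:0 is operational after recovery (1 attempts)
"""

# Matches only the kernel "detected conn error" and iscsid "is operational" lines.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ \S+:\s+"
    r"connection(?P<conn>\d+:\d+):? (?P<event>detected conn error|is operational)"
)


def outage_durations(log_text, year=2011):
    """Return (connection, seconds) from each conn error to its recovery."""
    down_since = {}
    outages = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        # Syslog omits the year, so one is supplied (assumed: 2011).
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        conn = m["conn"]
        if m["event"] == "detected conn error":
            # Keep the first error time if several errors precede a recovery.
            down_since.setdefault(conn, when)
        elif conn in down_since:
            outages.append((conn, (when - down_since.pop(conn)).total_seconds()))
    return outages


for conn, seconds in outage_durations(SAMPLE_LOG):
    print(f"connection {conn}: down for {seconds:.0f} s")
```

Running the same function over a full /var/log/messages would show whether the outages cluster at particular times or on particular connections, which may help separate a network problem from a target-side one.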

Matt

Dec 6, 2011, 10:58:37 AM
to open-iscsi
Thanks Mike.

These are production boxes so sorry for the delay in getting back to
this thread. We have a little more prep work to do before we can
install one of these.

Hope to update soon.

Regards,

Matt

On Nov 27, 11:51 pm, Mike Christie <micha...@cs.wisc.edu> wrote:
> On 11/27/2011 06:06 PM, Matt Lundstrom wrote:
>
> > Hi Mike,
>
> > Well, the log above was just a sample. The underlying 1011 error sometimes
> > appears without a ping error, for example:
>
> Could you try the kernel I sent in the other mail:
>

> http://people.redhat.com/mchristi/kernel-2.6.18-274.el5.iscsidebug1.i...
>
> http://people.redhat.com/mchristi/kernel-PAE-2.6.18-274.el5.iscsidebu...
>
> http://people.redhat.com/mchristi/kernel-2.6.18-274.el5.iscsidebug1.x...

Matt

Dec 6, 2011, 8:07:21 PM
to open-iscsi
On Nov 27, 11:51 pm, Mike Christie <micha...@cs.wisc.edu> wrote:
> On 11/27/2011 06:06 PM, Matt Lundstrom wrote:
>
> > Hi Mike,
>
> > Well, the log above was just a sample. The underlying 1011 error sometimes
> > appears without a ping error, for example:
>
> Could you try the kernel I sent in the other mail:
>
> It is just the rhel 5.7 kernel but with the iscsi eh debugging on and
> with some printks in some of the other iscsi network eh paths. I had to
> make a special kernel because the iscsi modules shipped in rhel 5 do not
> allow you to turn on debugging for just the eh paths.
>
> If you need to run the rhel 5.6 kernel then you should be able to take
> the iscsi modules from the kernel above and run them in 5.6.

Well, we got things juggled around a lot sooner than expected, and got
the kernel installed.

But our VMs would not start under it.

[root@hv4 ~]# tail -f /var/log/messages
Dec 6 14:25:15 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth3)]Illegal configuration detected for Max BW - using 100 instead
Dec 6 14:25:15 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth4)]Illegal configuration detected for Max BW - using 100 instead
Dec 6 14:25:15 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth5)]Illegal configuration detected for Max BW - using 100 instead
Dec 6 14:25:15 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth6)]Illegal configuration detected for Max BW - using 100 instead
Dec 6 14:25:15 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth7)]Illegal configuration detected for Max BW - using 100 instead
Dec 6 14:25:45 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth3)]Illegal configuration detected for Max BW - using 100 instead
Dec 6 14:25:45 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth4)]Illegal configuration detected for Max BW - using 100 instead
Dec 6 14:25:45 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth5)]Illegal configuration detected for Max BW - using 100 instead
Dec 6 14:25:45 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth6)]Illegal configuration detected for Max BW - using 100 instead
Dec 6 14:25:45 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth7)]Illegal configuration detected for Max BW - using 100 instead

Looks like this kernel doesn't like the NICs on the blade servers.

Regards,

Matt

Mike Christie

Dec 7, 2011, 6:45:59 PM
to open-...@googlegroups.com, Matt
On 12/06/2011 07:07 PM, Matt wrote:
> Looks like this kernel doesn't like the NICs on the blade servers.
>

Doh. My fault. I see you are using the xen kernel 2.6.18-238.9.1.el5xen.
I will rebuild a kernel based on that.

Matt

Dec 7, 2011, 6:41:41 PM
to open-iscsi

Thanks Mike, I will keep an eye out for it. Much appreciated.
