I have seen this error covered elsewhere in this group, but none of
the solutions I have found seem to apply to the issue I am facing.
I get errors like the ones below at random. As you can see, the system
reports the connection as operational again almost immediately.
This causes any attached virtual machines to go read-only.
Nov 21 09:16:13 hv1 kernel: connection4:0: ping timeout of 5 secs
expired, recv timeout 5, last rx 4393885396, last ping 4393886646, now
4393887896
Nov 21 09:16:13 hv1 kernel: connection4:0: detected conn error (1011)
Nov 21 09:16:13 hv1 iscsid: Kernel reported iSCSI connection 4:0 error
(1011) state (3)
Nov 21 09:16:14 hv1 kernel: connection3:0: ping timeout of 5 secs
expired, recv timeout 5, last rx 4393885656, last ping 4393886906, now
4393888156
Nov 21 09:16:14 hv1 kernel: connection3:0: detected conn error (1011)
Nov 21 09:16:14 hv1 multipathd: sda: readsector0 checker reports path
is down
Nov 21 09:16:14 hv1 multipathd: checker failed path 8:0 in map mpath5
Nov 21 09:16:14 hv1 kernel: device-mapper: multipath: Failing path
8:0.
Nov 21 09:16:14 hv1 multipathd: mpath5: remaining active paths: 0
Nov 21 09:16:14 hv1 multipathd: dm-2: add map (uevent)
Nov 21 09:16:14 hv1 multipathd: dm-2: devmap already registered
Nov 21 09:16:14 hv1 iscsid: Kernel reported iSCSI connection 3:0 error
(1011) state (3)
Nov 21 09:16:17 hv1 iscsid: connection4:0 is operational after
recovery (1 attempts)
Nov 21 09:16:18 hv1 iscsid: connection3:0 is operational after
recovery (1 attempts)
Nov 21 09:16:29 hv1 multipathd: sda: readsector0 checker reports path
is up
Nov 21 09:16:29 hv1 multipathd: 8:0: reinstated
Nov 21 09:16:29 hv1 multipathd: mpath5: remaining active paths: 1
Nov 21 09:16:29 hv1 multipathd: dm-2: add map (uevent)
Nov 21 09:16:29 hv1 multipathd: dm-2: devmap already registered
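A note on reading the "ping timeout" lines above: as far as I know, the
"last rx", "last ping" and "now" values are kernel tick counters
(jiffies), not seconds. The rough Python sketch below just subtracts
them. The function name and the returned keys are made up for
illustration, and the tick rate is only inferred from the numbers in
this log, not read from the kernel configuration.

```python
import re

# Matches the kernel "ping timeout" message quoted in the log above.
PING_RE = re.compile(
    r"ping timeout of (\d+) secs expired, recv timeout (\d+), "
    r"last rx (\d+), last ping (\d+), now (\d+)"
)

def ping_gaps(message):
    """Return the tick gaps in one 'ping timeout' message (hypothetical helper)."""
    m = PING_RE.search(message)
    if m is None:
        raise ValueError("not a ping timeout message")
    ping_secs, recv_secs, last_rx, last_ping, now = map(int, m.groups())
    return {
        # ticks between the last received data and the ping being sent
        "rx_to_ping_ticks": last_ping - last_rx,
        # ticks between the ping being sent and the timeout firing
        "ping_to_now_ticks": now - last_ping,
        # ticks per second implied by the ping timeout (an inference)
        "implied_hz": (now - last_ping) / ping_secs,
    }

MESSAGE = (
    "connection4:0: ping timeout of 5 secs expired, recv timeout 5, "
    "last rx 4393885396, last ping 4393886646, now 4393887896"
)
print(ping_gaps(MESSAGE))
```

Both gaps come out at 1250 ticks, which matches the two 5 second
timeouts if the kernel ticks 250 times per second. If that reading is
right, nothing was received for about 5 seconds before the ping went
out, and the ping then went unanswered for another 5 seconds.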
I am running the following kernel:
Linux hv1 2.6.18-238.9.1.el5xen
This is CentOS 5.6 64 bit.
Can someone point me in the right direction? Perhaps the ping timeout
needs to be adjusted, but I have a feeling this is software related.
Thank you.
Matt
If you just turn nops/pings off, do you still see the "conn error" and
"is operational" messages in the logs?
Could you try the kernel I sent in the other mail:
http://people.redhat.com/mchristi/kernel-2.6.18-274.el5.iscsidebug1.i686.rpm
http://people.redhat.com/mchristi/kernel-PAE-2.6.18-274.el5.iscsidebug1.i686.rpm
http://people.redhat.com/mchristi/kernel-2.6.18-274.el5.iscsidebug1.x86_64.rpm
It is just the RHEL 5.7 kernel but with the iscsi eh (error handler)
debugging on and with some printks in some of the other iscsi network
eh paths. I had to make a special kernel because the iscsi modules
shipped in RHEL 5 do not allow you to turn on debugging for just the
eh paths.
If you need to run the RHEL 5.6 kernel then you should be able to take
the iscsi modules from the kernel above and run them in 5.6.
Well, the log above was just a sample. The underlying 1011 error
sometimes appears without a ping error, for example:
Nov 27 17:11:47 hv4 kernel: connection4:0: detected conn error (1011)
Nov 27 17:11:48 hv4 iscsid: Kernel reported iSCSI connection 4:0 error
(1011) state (3)
Nov 27 17:11:54 hv4 iscsid: connection4:0 is operational after
recovery (1 attempts)
Nov 27 17:11:54 hv4 kernel: connection2:0: detected conn error (1011)
Nov 27 17:11:55 hv4 iscsid: Kernel reported iSCSI connection 2:0 error
(1011) state (3)
Nov 27 17:11:56 hv4 kernel: nfs: server 10.0.43.30 not responding,
still trying
Nov 27 17:11:59 hv4 iscsid: connection2:0 is operational after
recovery (1 attempts)
[root@hv4 log]# uname -r
2.6.18-274.7.1.el5xen
The connections just fall off the face of the earth at random, and
then are picked up a few seconds later.
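To put a number on "a few seconds", here is a small Python sketch that
pairs each "detected conn error" line with the next "is operational
after recovery" line for the same connection and reports the gap. The
function name is made up, the log lines are assumed to be unwrapped
(one syslog entry per line), and the year is passed in because syslog
timestamps do not carry one.

```python
import re
from datetime import datetime

# "Nov 27 17:11:47 hv4 kernel: <message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+(\w+):\s+(.*)$")
ERROR_RE = re.compile(r"connection(\d+:\d+): detected conn error \((\d+)\)")
RECOVER_RE = re.compile(r"connection(\d+:\d+) is operational after recovery")

def outage_durations(lines, year=2011):
    """Return (host, connection, seconds) for each error/recovery pair."""
    pending = {}   # (host, connection) -> time of the first unrecovered error
    outages = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        stamp, host, _prog, msg = m.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        err = ERROR_RE.search(msg)
        if err:
            pending.setdefault((host, err.group(1)), when)
            continue
        rec = RECOVER_RE.search(msg)
        if rec:
            start = pending.pop((host, rec.group(1)), None)
            if start is not None:
                seconds = (when - start).total_seconds()
                outages.append((host, rec.group(1), seconds))
    return outages

SAMPLE = [
    "Nov 27 17:11:47 hv4 kernel: connection4:0: detected conn error (1011)",
    "Nov 27 17:11:54 hv4 iscsid: connection4:0 is operational after recovery (1 attempts)",
    "Nov 27 17:11:54 hv4 kernel: connection2:0: detected conn error (1011)",
    "Nov 27 17:11:59 hv4 iscsid: connection2:0 is operational after recovery (1 attempts)",
]
for host, conn, seconds in outage_durations(SAMPLE):
    print(host, conn, seconds)
```

On the sample above this gives 7 seconds for connection 4:0 and 5
seconds for connection 2:0. Syslog timestamps only have one-second
resolution, so treat the figures as approximate.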
I was thrown into this project so I am not sure where nops/pings are
set or unset.
Kind Regards,
Matt
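For reference, with open-iscsi the nop/ping settings normally live in
/etc/iscsi/iscsid.conf as node.conn[0].timeo.noop_out_interval and
node.conn[0].timeo.noop_out_timeout, and setting both to 0 turns the
pings off. The Python sketch below only reads those two values out of
the file's text. The helper names and the sample excerpt are
hypothetical, so check them against the file on your own hosts.

```python
# The two iscsid.conf settings that control nop-out pings.
NOOP_KEYS = (
    "node.conn[0].timeo.noop_out_interval",
    "node.conn[0].timeo.noop_out_timeout",
)

def read_noop_settings(conf_text):
    """Return the nop settings found in iscsid.conf-style text."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in NOOP_KEYS:
            settings[key] = int(value)
    return settings

def nops_disabled(settings):
    """True only if both values are present and set to 0."""
    return all(settings.get(key) == 0 for key in NOOP_KEYS)

# Hypothetical excerpt; the values match the 5 second timeouts in the log.
SAMPLE_CONF = """
# Time to wait before sending a ping.
node.conn[0].timeo.noop_out_interval = 5
# Time to wait for a Nop-out response before failing the connection.
node.conn[0].timeo.noop_out_timeout = 5
"""
settings = read_noop_settings(SAMPLE_CONF)
print(settings, nops_disabled(settings))
```

To check a real host, pass in the text read from
/etc/iscsi/iscsid.conf. As far as I know, node records that were
already discovered keep their own copy of these values, so they may
need to be updated with iscsiadm and the session logged in again
before a change takes effect.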
These are production boxes so sorry for the delay in getting back to
this thread. We have a little more prep work to do before we can
install one of these.
Hope to update soon.
Regards,
Matt
On Nov 27, 11:51 pm, Mike Christie <micha...@cs.wisc.edu> wrote:
> On 11/27/2011 06:06 PM, Matt Lundstrom wrote:
>
> > Hi Mike,
>
> > Well, the log above was just a sample. The underlying 1011 error sometimes
> > appears without a ping error, for example:
>
> Could you try the kernel I sent in the other mail:
>
> http://people.redhat.com/mchristi/kernel-2.6.18-274.el5.iscsidebug1.i...
>
> http://people.redhat.com/mchristi/kernel-PAE-2.6.18-274.el5.iscsidebu...
>
> http://people.redhat.com/mchristi/kernel-2.6.18-274.el5.iscsidebug1.x...
Well, we got things juggled around a lot sooner than expected, and got
the kernel installed.
But our VMs would not start under it.
[root@hv4 ~]# tail -f /var/log/messages
Dec 6 14:25:15 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth3)]Illegal
configuration detected for Max BW - using 100 instead
Dec 6 14:25:15 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth4)]Illegal
configuration detected for Max BW - using 100 instead
Dec 6 14:25:15 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth5)]Illegal
configuration detected for Max BW - using 100 instead
Dec 6 14:25:15 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth6)]Illegal
configuration detected for Max BW - using 100 instead
Dec 6 14:25:15 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth7)]Illegal
configuration detected for Max BW - using 100 instead
Dec 6 14:25:45 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth3)]Illegal
configuration detected for Max BW - using 100 instead
Dec 6 14:25:45 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth4)]Illegal
configuration detected for Max BW - using 100 instead
Dec 6 14:25:45 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth5)]Illegal
configuration detected for Max BW - using 100 instead
Dec 6 14:25:45 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth6)]Illegal
configuration detected for Max BW - using 100 instead
Dec 6 14:25:45 hv4 kernel: [bnx2x_extract_max_cfg:1074(eth7)]Illegal
configuration detected for Max BW - using 100 instead
Looks like this kernel doesn't like the NICs on the blade servers.
Regards,
Matt
Doh. My fault. I see you are using the xen kernel 2.6.18-238.9.1.el5xen.
I will rebuild a kernel based on that.
Thanks Mike, I will keep an eye out for it. Much appreciated.