Logging hanging. Invalid session

244 views
Skip to first unread message

Nicolas Ecarnot

unread,
Feb 4, 2015, 8:44:34 AM2/4/15
to open-...@googlegroups.com
Hi list,

I'm facing a similar problem as what has been discussed here in 2011 :
https://groups.google.com/d/msg/open-iscsi/pQQ8_ernzgQ/7ZVkoDmY2qYJ
but versions have since raised, so I don't know what to look at.

TL;DR :
iscsiadm -m node -T iqn.1992-04.com.emc:cx.ckm00094900174.a2 -l
...
...
-> sleeping there until I ctrl-c interrupt it.
The only error msg I get is on the console :
iscsi: invalid session 0
Specifying the SAN ip does not improve anything.

Long version :

We have a Dell PE R610 in 64bits with 4 NICs :
- em1 + em2 are bonded and dedicated to the management on a specific VLAN
- em3 and em4 are not bonded, and dedicated to the iSCSI specific VLAN

ping -I from em3 and em4 towards our EMC Cx4-120 SAN is working, towards its two different IPs.
And the ping from the SAN is also working.

In /etc/sysctl.conf, I have :
net.ipv4.conf.em3.arp_ignore = 1
net.ipv4.conf.em4.arp_ignore = 1
net.ipv4.conf.em3.arp_announce = 2
net.ipv4.conf.em4.arp_announce = 2
net.ipv4.conf.em3.rp_filter = 0
net.ipv4.conf.em4.rp_filter = 0


iptables and selinux are off.

LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID:    OracleServer
Description:    Oracle Linux Server release 6.6


The discovery sounds good :
root@serv3:/var/lib/iscsi# iscsiadm -m discovery -p 192.168.29.1 -t st -I default
192.168.29.1:3260,1 iqn.1992-04.com.emc:cx.ckm00094900174.a2
192.168.29.2:3260,2 iqn.1992-04.com.emc:cx.ckm00094900174.b2


The 2 NICs are NetXtreme II BCM5709 Gigabit Ethernet. They support offloading, but I'm not comfortable enough with that yet to play with it. So at present, I'd prefer just use a classical tcp transport.

Iface scanning sounds good too :
root@serv3:/var/lib/iscsi# iscsiadm -m iface
default tcp,<empty>,<empty>,<empty>,<empty>
iser iser,<empty>,<empty>,<empty>,<empty>
bnx2i.f0:4d:a2:07:59:91 bnx2i,f0:4d:a2:07:59:91,<empty>,<empty>,<empty>
bnx2i.00:00:00:00:00:00 bnx2i,00:00:00:00:00:00,<empty>,<empty>,<empty>
bnx2i.f0:4d:a2:07:59:95 bnx2i,f0:4d:a2:07:59:95,<empty>,<empty>,<empty>
bnx2i.f0:4d:a2:07:59:93 bnx2i,f0:4d:a2:07:59:93,<empty>,<empty>,<empty>


Other informations :
root@serv3:/var/lib/iscsi# rpm -qa|grep -i iscsi
iscsi-initiator-utils-6.2.0.873-2.0.2.el6.i686
root@serv3:/var/lib/iscsi# iscsiuio -v
iscsiuio: Version '0.7.6.1', Build Date: 'Wed May 29 14:08:11 PDT 2013'
root@serv3:/var/lib/iscsi# uname -a
Linux serv3 3.8.13-55.1.5.el6uek.x86_64 #2 SMP Wed Jan 28 17:03:28 PST 2015 x86_64 x86_64 x86_64 GNU/Linux


I tried to disable em4, reboot, and try a connection from a single NIC (em3), but that did nothing better.

I don't know if the fact that these NICs support offloading may disturb the simplest tcp transport way of connecting?
I must admit I'm a little blocked here, as this is not the first server I'm connecting using iscsi, but as the pings are OK, what else could I check?

Thank you.

-- 
Nicolas Ecarnot

Mike Christie

unread,
Feb 4, 2015, 5:59:54 PM2/4/15
to open-...@googlegroups.com
On 2/4/15, 7:44 AM, Nicolas Ecarnot wrote:
> Hi list,
>
> I'm facing a similar problem as what has been discussed here in 2011 :
> https://groups.google.com/d/msg/open-iscsi/pQQ8_ernzgQ/7ZVkoDmY2qYJ
> but versions have since raised, so I don't know what to look at.

I think you are hitting a different issue. Could you include your
/var/log/messages, so I can see if iscsid is throwing errors in there?

>
> *TL;DR : *
> iscsiadm -m node -T iqn.1992-04.com.emc:cx.ckm00094900174.a2 -l

Could you set

node.session.initial_login_retry_max = 1

in /etc/iscsi/iscsid.conf? Set that then rerun the discovery command. It
will speed up failures so we do not have to wait a long time while testing.


> ...
> ...
> -> sleeping there until I ctrl-c interrupt it.
> The only error msg I get is on the console :
> iscsi: invalid session 0
> Specifying the SAN ip does not improve anything.
>
> *Long version :*
>
> We have a Dell PE R610 in 64bits with 4 NICs :
> - em1 + em2 are bonded and dedicated to the management on a specific VLAN
> - em3 and em4 are not bonded, and dedicated to the iSCSI specific VLAN
>
> ping -I from em3 and em4 towards our EMC Cx4-120 SAN is working, towards
> its two different IPs.
> And the ping from the SAN is also working.
>
> In /etc/sysctl.conf, I have :
> net.ipv4.conf.em3.arp_ignore = 1
> net.ipv4.conf.em4.arp_ignore = 1
> net.ipv4.conf.em3.arp_announce = 2
> net.ipv4.conf.em4.arp_announce = 2
> net.ipv4.conf.em3.rp_filter = 0
> net.ipv4.conf.em4.rp_filter = 0
>
> iptables and selinux are off.
>
> LSB Version:
> :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
>
> Distributor ID: OracleServer
> Description: Oracle Linux Server release 6.6
>
> The discovery sounds good :
> root@serv3:/var/lib/iscsi# iscsiadm -m discovery -p 192.168.29.1 -t st -I
> default
> 192.168.29.1:3260,1 iqn.1992-04.com.emc:cx.ckm00094900174.a2
> 192.168.29.2:3260,2 iqn.1992-04.com.emc:cx.ckm00094900174.b2

em3 and em4 on the same subnet, right?

Is em1/em2 on the same subnet as em3 and em4?

>
> The 2 NICs are NetXtreme II BCM5709 Gigabit Ethernet. They support
> offloading, but I'm not comfortable enough with that yet to play with it.
> So at present, I'd prefer just use a classical tcp transport.
>
> Iface scanning sounds good too :
> root@serv3:/var/lib/iscsi# iscsiadm -m iface
> default tcp,<empty>,<empty>,<empty>,<empty>
> iser iser,<empty>,<empty>,<empty>,<empty>
> bnx2i.f0:4d:a2:07:59:91 bnx2i,f0:4d:a2:07:59:91,<empty>,<empty>,<empty>
> bnx2i.00:00:00:00:00:00 bnx2i,00:00:00:00:00:00,<empty>,<empty>,<empty>
> bnx2i.f0:4d:a2:07:59:95 bnx2i,f0:4d:a2:07:59:95,<empty>,<empty>,<empty>
> bnx2i.f0:4d:a2:07:59:93 bnx2i,f0:4d:a2:07:59:93,<empty>,<empty>,<empty>
>
> Other informations :
> root@serv3:/var/lib/iscsi# rpm -qa|grep -i iscsi
> iscsi-initiator-utils-6.2.0.873-2.0.2.el6.i686
> root@serv3:/var/lib/iscsi# iscsiuio -v
> iscsiuio: Version '0.7.6.1', Build Date: 'Wed May 29 14:08:11 PDT 2013'
> root@serv3:/var/lib/iscsi# uname -a
> Linux serv3 3.8.13-55.1.5.el6uek.x86_64 #2 SMP Wed Jan 28 17:03:28 PST 2015
> x86_64 x86_64 x86_64 GNU/Linux
>
> I tried to disable em4, reboot, and try a connection from a single NIC
> (em3), but that did nothing better.

Could you take a tcpdump/wireshark trace when you run this experiment?
Also, include the /var/log/messages.

>
> I don't know if the fact that these NICs support offloading may disturb the
> simplest tcp transport way of connecting?


It should not be a issue. I can use either here.

Nicolas Ecarnot

unread,
Feb 5, 2015, 4:11:17 AM2/5/15
to open-...@googlegroups.com
Hello Mike,

Thank you for your answer.
Please see below  some details.


Le mercredi 4 février 2015 23:59:54 UTC+1, Mike Christie a écrit :
I think you are hitting a different issue. Could you include your
/var/log/messages, so I can see if iscsid is throwing errors in there?

I changed the initial login retry value as you adviced, flushed every previous setting (nodes, ifaces, sessions) then rebooted.

And tried again from scratch. Here is the /var/log/messages, with some user-added markup comments :

Feb  5 09:43:33 serv-sauv-adms3 root: ____________________________ Just after reboot, about to run discovery

Now, I'm running :
root@serv-sauv-adms3:/etc# iscsiadm -m discovery -t st -p 192.168.29.1
Démarrage de iscsid :                                      [  OK  ]

192.168.29.1:3260,1 iqn.1992-04.com.emc:cx.ckm00094900174.a2
192.168.29.2:3260,2 iqn.1992-04.com.emc:cx.ckm00094900174.b2



Feb  5 09:44:54 serv-sauv-adms3 kernel: Loading iSCSI transport class v2.0-870.
Feb  5 09:44:54 serv-sauv-adms3 kernel: iscsi: registered transport (tcp)
Feb  5 09:44:54 serv-sauv-adms3 kernel: iscsi: registered transport (iser)
Feb  5 09:44:54 serv-sauv-adms3 kernel: libcxgbi:libcxgbi_init_module: tag itt 0x1fff, 13 bits, age 0xf, 4 bits.
Feb  5 09:44:54 serv-sauv-adms3 kernel: libcxgbi:ddp_setup_host_page_size: system PAGE 4096, ddp idx 0.
Feb  5 09:44:54 serv-sauv-adms3 kernel: Chelsio T3 iSCSI Driver cxgb3i v2.0.0 (Jun. 2010)
Feb  5 09:44:54 serv-sauv-adms3 kernel: iscsi: registered transport (cxgb3i)
Feb  5 09:44:54 serv-sauv-adms3 kernel: Chelsio T4 iSCSI Driver cxgb4i v0.9.1 (Aug. 2010)
Feb  5 09:44:54 serv-sauv-adms3 kernel: iscsi: registered transport (cxgb4i)
Feb  5 09:44:54 serv-sauv-adms3 kernel: cnic: Broadcom NetXtreme II CNIC Driver cnic v2.5.20b (July 22, 2014)
Feb  5 09:44:54 serv-sauv-adms3 kernel: bnx2 0000:01:00.0 em1: Added CNIC device
Feb  5 09:44:54 serv-sauv-adms3 kernel: bnx2 0000:01:00.1 em2: Added CNIC device
Feb  5 09:44:54 serv-sauv-adms3 kernel: bnx2 0000:02:00.0 em3: Added CNIC device
Feb  5 09:44:54 serv-sauv-adms3 kernel: bnx2 0000:02:00.1 em4: Added CNIC device
Feb  5 09:44:54 serv-sauv-adms3 kernel: Broadcom NetXtreme II iSCSI Driver bnx2i v2.7.10.31d1 (August 6, 2014)
Feb  5 09:44:54 serv-sauv-adms3 kernel: iscsi: registered transport (bnx2i)
Feb  5 09:44:54 serv-sauv-adms3 kernel: scsi9 : Broadcom Offload iSCSI Initiator
Feb  5 09:44:54 serv-sauv-adms3 kernel: bnx2i [02:00.01]: ISCSI_INIT passed
Feb  5 09:44:54 serv-sauv-adms3 kernel: scsi10 : Broadcom Offload iSCSI Initiator
Feb  5 09:44:54 serv-sauv-adms3 kernel: bnx2i [02:00.00]: ISCSI_INIT passed
Feb  5 09:44:54 serv-sauv-adms3 kernel: scsi11 : Broadcom Offload iSCSI Initiator
Feb  5 09:44:54 serv-sauv-adms3 kernel: bnx2i [01:00.01]: ISCSI_INIT passed
Feb  5 09:44:54 serv-sauv-adms3 kernel: scsi12 : Broadcom Offload iSCSI Initiator
Feb  5 09:44:54 serv-sauv-adms3 kernel: bnx2i [01:00.00]: ISCSI_INIT passed
Feb  5 09:44:54 serv-sauv-adms3 kernel: iscsi: registered transport (be2iscsi)
Feb  5 09:44:54 serv-sauv-adms3 kernel: In beiscsi_module_init, tt=ffffffffa05df980
Feb  5 09:44:54 serv-sauv-adms3 iscsid: iSCSI logger with pid=8333 started!
Feb  5 09:44:55 serv-sauv-adms3 iscsid: iSCSI daemon with pid=8334 started!
Feb  5 09:45:17 serv-sauv-adms3 root: ____________________________ Discovery completed

At this point, I'm running :

root@serv-sauv-adms3:/var/lib/iscsi# iscsiadm -m node -p 192.168.29.1 -l -T iqn.1992-04.com.emc:cx.ckm00094900174.a2
Logging in to [iface: default, target: iqn.1992-04.com.emc:cx.ckm00094900174.a2, portal: 192.168.29.1,3260] (multiple)




Feb  5 09:45:58 serv-sauv-adms3 root: ____________________________ Trying to login...
Feb  5 09:46:41 serv-sauv-adms3 kernel: scsi13 : iSCSI Initiator over TCP/IP
Feb  5 09:46:41 serv-sauv-adms3 kernel: iscsi: invalid session 0.
Feb  5 09:46:41 serv-sauv-adms3 iscsid: Could not read proc_name for host1 rc 5.
Feb  5 09:46:41 serv-sauv-adms3 iscsid: Could not set session0 priority. READ/WRITE throughout and latency could be affected.
Feb  5 09:46:41 serv-sauv-adms3 iscsid: Received iferror -22: Invalid argument.
Feb  5 09:46:41 serv-sauv-adms3 iscsid: Can't create connection.
Feb  5 09:46:41 serv-sauv-adms3 iscsid: Received iferror -22: Invalid argument.
Feb  5 09:46:41 serv-sauv-adms3 iscsid: can not safely destroy session 0
Feb  5 09:47:49 serv-sauv-adms3 root: ____________________________ Logging hanging and sleeping...
Feb  5 09:48:13 serv-sauv-adms3 root: ____________________________ Logging about to be manually interrupted...
Feb  5 09:48:40 serv-sauv-adms3 root: ____________________________ Logging manually interrupted...


At this point, I have to manually interrupt the command, and no error appear in /var/log/messages.

When running :
root@serv-sauv-adms3:/var/lib/iscsi# iscsiadm -m session
iscsiadm: could not read session targetname: 5
iscsiadm: could not find session info for session1
iscsiadm: No active sessions.


these "sessions" are stacking as I make additional attempts, like the OP in the thread cited previously.
I have no idea how to flush these sessions, because stopping the daemon does not work - it seems stuck waiting for the sessions to end. And I can not unload the appropriate modules easily as they take part of a big modules hierarchy. Rebooting is the lazy way (I know).

Could you set

node.session.initial_login_retry_max = 1

in /etc/iscsi/iscsid.conf? Set that then rerun the discovery command. It
will speed up failures so we do not have to wait a long time while testing.

As I wrote above, I followed your advice before rebooting the server and running the tests.
 
> -> sleeping there until I ctrl-c interrupt it.
> The only error msg I get is on the console :
> iscsi: invalid session 0
> The discovery sounds good :
> root@serv3:/var/lib/iscsi# iscsiadm -m discovery -p 192.168.29.1 -t st -I
> default
> 192.168.29.1:3260,1 iqn.1992-04.com.emc:cx.ckm00094900174.a2
> 192.168.29.2:3260,2 iqn.1992-04.com.emc:cx.ckm00094900174.b2

em3 and em4 on the same subnet, right?

Right.
 

Is em1/em2 on the same subnet as em3 and em4?

No.

Please see below the network config. Please notice that I tried with the default MTU at 1500 before trying the present one at 9000 (that is the actual setting of our SAN), with no improvement.

root@serv-sauv-adms3:/var/lib/iscsi# ip addr show
[... loopback...]
2: em1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether f0:4d:a2:07:59:90 brd ff:ff:ff:ff:ff:ff
3: em2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether f0:4d:a2:07:59:92 brd ff:ff:ff:ff:ff:ff
4: em3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP qlen 1000
    link/ether f0:4d:a2:07:59:94 brd ff:ff:ff:ff:ff:ff
    inet 192.168.29.5/24 brd 192.168.29.255 scope global em3
    inet6 fe80::f24d:a2ff:fe07:5994/64 scope link
       valid_lft forever preferred_lft forever
5: em4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP qlen 1000
    link/ether f0:4d:a2:07:59:96 brd ff:ff:ff:ff:ff:ff
    inet 192.168.29.6/24 brd 192.168.29.255 scope global em4
    inet6 fe80::f24d:a2ff:fe07:5996/64 scope link
       valid_lft forever preferred_lft forever
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether f0:4d:a2:07:59:90 brd ff:ff:ff:ff:ff:ff
    inet 192.168.12.14/24 brd 192.168.12.255 scope global bond0
    inet6 fe80::f24d:a2ff:fe07:5990/64 scope link
       valid_lft forever preferred_lft forever


And here is the routing table :
root@serv-sauv-adms3:/var/lib/iscsi# netstat -rn
Destination     Passerelle      Genmask         Indic   MSS Fenêtre irtt Iface
0.0.0.0         192.168.12.254  0.0.0.0         UG        0 0          0 bond0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 em3
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 em4
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 bond0
192.168.12.0    0.0.0.0         255.255.255.0   U         0 0          0 bond0
192.168.29.0    0.0.0.0         255.255.255.0   U         0 0          0 em3
192.168.29.0    0.0.0.0         255.255.255.0   U         0 0          0 em4


I double checked that both em3 and em4 could ping both of the 2 ip addresses of the SAN.
I double checked that neither em1 nor em2 could ping any of the 2 ip addresses of the SAN.

Could you take a tcpdump/wireshark trace when you run this experiment?

OK, this is what I'm doing at present (Stay tuned).
But I'm not a rocket scientist in tcpdumping, my usage with it is pretty basic usually.
Do I have to restrict the witnessing to the port 3260, or more?

--
Nicolas Ecarnot

Nicolas Ecarnot

unread,
Feb 5, 2015, 10:56:39 AM2/5/15
to open-...@googlegroups.com
Le jeudi 5 février 2015 10:11:17 UTC+1, Nicolas Ecarnot a écrit :

Could you take a tcpdump/wireshark trace when you run this experiment?

OK, this is what I'm doing at present (Stay tuned).
But I'm not a rocket scientist in tcpdumping, my usage with it is pretty basic usually.

OK, I did the exact same tests, while recording a trace with tcpdump and reading it with wireshark (I told you I wasn't comfortable with it...).

I first run the discovery, and I can see some tcp syn/ack going well, then the iSCSI protocol being used, and the login + answer contains informations that are looking good (both IQNs of the SAN, correct).
I then wait for a moment (just to ease searching through the timestamps) then run the login.
At this point, I see some SYN/ACK activity, but NO iSCSI protocol used anymore. It sounds strange.

Then I interrupt the action.

You may find the tcpdump trace here : http://www.ecarnot.net/tmp/20150205-01-em3.tcpdump

Next step for me : I'll try the same from a server that is working good, and compare the traces.

--
Nicolas Ecarnot

Nicolas Ecarnot

unread,
Feb 6, 2015, 2:42:03 AM2/6/15
to open-...@googlegroups.com
SOLVED by Mike.

I'm sorry I did not see it at first time, but I accidentally installed the i686 version of the iscsiadm-utils package.
What was weird though was that using this 32bits tool was successful for discovering the targets.
But it wasn't able to login.

I made a good cleanup into my packages, re-installed the 64 bits version, according to my arch, and the login is doing fine.

Thank you Mike for the time you took to answer me.

Have a nice day.

--
Nicolas Ecarnot
Reply all
Reply to author
Forward
0 new messages