open-iscsi on 2.6..33.4

312 views
Skip to first unread message

Alessandro Baggi

unread,
Mar 12, 2011, 6:55:16 AM3/12/11
to open-iscsi
Hi list. I'm trying to install open-iscsi on Slackware Linux with a
kernel 2.6.33.4, but there are problem with compilation, also with
adding to kernel/makefile section for 2.6.33.4, when it try to compile,
get error on kfifo_init, it's normal because in this kernel this
function get 3 parameters instead of 4.

There is a way to use open-iscsi on newer kernel of 2.6.30?

Thanks in advance

Gurkburk

unread,
Apr 12, 2011, 12:49:52 PM4/12/11
to open-iscsi
Exakt same problem here, any info on this issue?

dfdsf sdfdsf

unread,
Apr 12, 2011, 12:45:55 PM4/12/11
to open-iscsi
Having the exakt same problem, bump on the subject.

On 12 mar, 13:55, Alessandro Baggi <alessandro.ba...@gmail.com> wrote:

Mike Christie

unread,
Apr 12, 2011, 1:28:26 PM4/12/11
to open-...@googlegroups.com, Gurkburk

The current version should work on .30 and .33
http://kernel.org/pub/linux/kernel/people/mnc/open-iscsi/releases/open-iscsi-2.0-872.tar.gz

If the kernel modules are not enabled for your kernel just use the
modules that come with your kernel by just building the updated tools:

make user
make install_user

Joe Hoot

unread,
Apr 12, 2011, 1:43:30 PM4/12/11
to open-...@googlegroups.com
I'm trying to understand the following messages:

Apr 12 13:23:52 oim60025001 tgtd: conn_close(88) connection closed 0x94d80c4 1
.... lots of the above messages...
Apr 12 13:37:19 oim60025001 tgtd: abort_task_set(979) found 271 0
.... lots of the above messages...
Apr 12 13:37:27 oim60025001 tgtd: abort_cmd(955) found e9 e
.... lots of the above messages...
Apr 12 13:39:08 oim60025001 tgtd: conn_close(88) connection closed 0xa9ab8ec 1

Does that typically mean that the target has closed the connection or the initiator?

Mike Christie

unread,
Apr 14, 2011, 12:02:15 AM4/14/11
to open-...@googlegroups.com, Joe Hoot

Might be best to ask the tgtd list, but I think due to the abort
messsages it is probably the initiator if you are using open-iscsi for
the initiator and if you see the abort messages before conn_close ones.

If you see abort ones first, then see conn close ones, then probably a
scsi command is timing out. This causes the scsi layer to have the
initiator abort the command. If the abort fails, the initiator could try
a lun reset or target reset. If we cannot reset or abort the problem
away we drop the connection/session.

On the initiator if you did

echo 1 > /sys/module/libiscsi/parameters/debug_libiscsi_eh

you would see a "wait for relogin" message in /var/log/messages then a
"session reset succeeded" or "failing session reset: Could not log back
into" message. The wait for relogin would match the conn close messages
on the target. Then the success or failed messages indicated if we were
able to relogin.

Joe Hoot

unread,
Apr 14, 2011, 6:12:30 AM4/14/11
to open-...@googlegroups.com
Mike,

I'm still digging into this "iSCSI Disconnect" issue that we've been dealing with for about two years here in SUNY. The iscsi-initiator that I'm using is this:

iscsi-initiator-utils-6.2.0.872-6.0.2.el5

I've been running dt tests to the equallogic in our Oracle VM environment with the following set:

echo 1 > /sys/module/libiscsi2/parameters/debug_libiscsi
echo 1 > /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
echo 1 > /sys/module/iscsi_tcp/parameters/debug_iscsi_tcp

When I try to look for "echo 1 > /sys/module/libiscsi/parameters/debug_libiscsi_eh" I don't see that as an available option. Are the debug lines above correct? Are there other debug messages that I should be gathering?

[root@oim61024001 src]# ls -l /sys/module/libiscsi
libiscsi2/ libiscsi_tcp/


We have done numerous tests and different hardware. I have eliminated OVM and have been testing this now on OEL5.6.

The scenario that seems to break connections quickly is the following:

- cluster of 5 nodes using OCFS2 (I will be trying to rule OCFS2 and dm-multipath out shortly)
- 5-7 iSCSI volumes connected to each of those 5 nodes
- 4 threads of dt running against each of 5-7 volumes from each host = 28 threads of dt slamming the volumes per host = 140 threads of dt per cluster.
- test only lasts about 2-5 minutes before I start seeing ping timeouts and disconnects.

The issue that we've seen thus far is mainly with EqualLogic and open-iscsi. From what EQL is telling us, the initiator is "aborting" the connection. But from the initiator-side, we just see "ping timeout" messages (and then the connection eventually goes away).

We recently saw a thread (Apr 4) regarding cfq scheduler. So we quickly tested noop and deadline, just to see if that would change anything-- it didn't.

So my most recent test was to try out a different target, just to see if we could rule out the EqualLogic. Each time I changed from EQLX to the tgtd, I would reset (and rescan in my volumes) the iscsid.conf's "FastAbort = No", or yes (if I was testing tgtd), to conform with EQLX's best practices.

So at this point, after I get dm-multipath and OCFS2 out of the equation, it will be down to a tartget + kernel/initiator + I/O scheduler and I want to make sure that I'm getting all the debug information that I might need to analyze what is going on.

Are there any other debug tunables that you might recommend adding to my script?

> --
> You received this message because you are subscribed to the Google Groups "open-iscsi" group.
> To post to this group, send email to open-...@googlegroups.com.
> To unsubscribe from this group, send email to open-iscsi+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.
>

Mike Christie

unread,
Apr 14, 2011, 6:45:53 AM4/14/11
to open-...@googlegroups.com, Joe Hoot
On 04/14/2011 05:12 AM, Joe Hoot wrote:
> Mike,
>
> I'm still digging into this "iSCSI Disconnect" issue that we've been dealing with for about two years here in SUNY. The iscsi-initiator that I'm using is this:
>
> iscsi-initiator-utils-6.2.0.872-6.0.2.el5
>
> I've been running dt tests to the equallogic in our Oracle VM environment with the following set:
>
> echo 1> /sys/module/libiscsi2/parameters/debug_libiscsi
> echo 1> /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
> echo 1> /sys/module/iscsi_tcp/parameters/debug_iscsi_tcp
>
> When I try to look for "echo 1> /sys/module/libiscsi/parameters/debug_libiscsi_eh" I don't see that as an available option. Are the debug lines above correct? Are there other debug messages that I should be gathering?
>

It is only in newer kernels. debug_libiscsi will print out what
debug_libiscsi_eh does plus more.


> [root@oim61024001 src]# ls -l /sys/module/libiscsi
> libiscsi2/ libiscsi_tcp/
>
>
> We have done numerous tests and different hardware. I have eliminated OVM and have been testing this now on OEL5.6.
>
> The scenario that seems to break connections quickly is the following:
>
> - cluster of 5 nodes using OCFS2 (I will be trying to rule OCFS2 and dm-multipath out shortly)
> - 5-7 iSCSI volumes connected to each of those 5 nodes
> - 4 threads of dt running against each of 5-7 volumes from each host = 28 threads of dt slamming the volumes per host = 140 threads of dt per cluster.
> - test only lasts about 2-5 minutes before I start seeing ping timeouts and disconnects.
>
> The issue that we've seen thus far is mainly with EqualLogic and open-iscsi. From what EQL is telling us, the initiator is "aborting" the connection. But from the initiator-side, we just see "ping timeout" messages (and then the connection eventually goes away).
>

When the ping times out, the initiator will abort/drop the connection.
It will then try to relogin. So you should probably be seeing the
ping/nop timeout message, then a conn error 1011, then a message about
reconnected in X retries, right?


The thing is that below with tgtd we seem to be seeing scsi commands
timing out. On the initiator logs did you see ping/nop timeout messages
when you hit the problem below with tgtd?

With tgtd let's take take the disks out of it and just test iscsi.

So do something like this:

tgtadm --op new --mode target --tid 1 -T iqn.2001-04.com.pertest
tgtadm --op new --mode logicalunit --tid 1 --lun 1 --bstype null -b
/dev/null
tgtadm --op bind --mode target --tid 1 -I ALL

This of course will not do any real IO so you cannot do tests that
verify day. Just do really harsh heavy read/write tests.


When you do IO to fake disks do you see any ping/nop or scsi timeout errors?


> We recently saw a thread (Apr 4) regarding cfq scheduler. So we quickly tested noop and deadline, just to see if that would change anything-- it didn't.
>
> So my most recent test was to try out a different target, just to see if we could rule out the EqualLogic. Each time I changed from EQLX to the tgtd, I would reset (and rescan in my volumes) the iscsid.conf's "FastAbort = No", or yes (if I was testing tgtd), to conform with EQLX's best practices.
>

The abort setting is not going to help at all for any target. That
basically kicks in after you have already hit the problem you are
hitting. It just speeds up the handling of the problem.


> So at this point, after I get dm-multipath and OCFS2 out of the equation, it will be down to a tartget + kernel/initiator + I/O scheduler and I want to make sure that I'm getting all the debug information that I might need to analyze what is going on.
>
> Are there any other debug tunables that you might recommend adding to my script?

No, the ones you have above are all there is. Maybe tcpdump -w iscsi.out
-i ethXYZ

Did you have the initiator /var/log/messages with debugging on when you
hit the problem below? If so send them.

Joe Hoot

unread,
Apr 14, 2011, 7:39:57 AM4/14/11
to open-...@googlegroups.com
Here's what I am seeing (roughly... with lines scraped in between.. and I'm only grepping for connect7/session7):

Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
...lots of the above types of messages...
Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_check_transport_timeouts Setting next tmo 2279563
Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
...lots of the above types of messages...
Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_check_transport_timeouts Setting next tmo 2279563
Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
...lots of the above types of messages...
Apr 11 13:22:25 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
Apr 11 13:22:25 oim61024009 kernel: connection7:0: iscsi_check_transport_timeouts Sending nopout as ping
Apr 11 13:22:25 oim61024009 kernel: connection7:0: iscsi_check_transport_timeouts Setting next tmo 2280813
Apr 11 13:22:25 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
...lots of the above types of messages...
Apr 11 13:22:30 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
Apr 11 13:22:30 oim61024009 kernel: connection7:0: ping timeout of 5 secs expired, recv timeout 5, last rx 2278313, last ping 2279563, now 2280813
Apr 11 13:22:30 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
Apr 11 13:22:30 oim61024009 kernel: connection7:0: detected conn error (1011)
Apr 11 13:22:30 oim61024009 kernel: session7: iscsi_data_xmit Tx suspended!
Apr 11 13:22:30 oim61024009 kernel: session7: iscsi_start_session_recovery blocking session
Apr 11 13:22:30 oim61024009 kernel: session7: fail_all_commands failing pending sc df493a80 itt 0x108
...etc..etc..
Apr 11 13:22:30 oim61024009 kernel: session7: fail_all_commands failing pending sc d81de300 itt 0x159
Apr 11 13:22:30 oim61024009 kernel: session7: fail_all_commands failing requeued sc d89b6800 itt 0xd8
Apr 11 13:22:30 oim61024009 kernel: connection7:0: iscsi_tcp_cleanup_task pending r2t dropped
Apr 11 13:22:30 oim61024009 kernel: session7: fail_all_commands failing requeued sc ca305ac0 itt 0xd9
Apr 11 13:22:30 oim61024009 kernel: connection7:0: iscsi_tcp_cleanup_task pending r2t dropped
Apr 11 13:22:30 oim61024009 kernel: session7: fail_all_commands failing requeued sc d81de440 itt 0xdb
Apr 11 13:22:30 oim61024009 kernel: connection7:0: iscsi_tcp_cleanup_task pending r2t dropped
Apr 11 13:22:30 oim61024009 kernel: session7: fail_all_commands failing requeued sc ca305700 itt 0xd7
...pattern continues..
Apr 11 13:22:30 oim61024009 kernel: session7: fail_all_commands failing requeued sc d847ee40 itt 0xfa
Apr 11 13:22:30 oim61024009 kernel: connection7:0: iscsi_tcp_cleanup_task pending r2t dropped
Apr 11 13:22:30 oim61024009 kernel: session7: fail_all_commands failing in progress sc c9ac8840 itt 0xfb
Apr 11 13:22:30 oim61024009 kernel: session7: fail_all_commands failing in progress sc d1e04800 itt 0xfc
Apr 11 13:22:30 oim61024009 kernel: session7: fail_all_commands failing in progress sc df0cdc00 itt 0xfd
...pattern continues...
Apr 11 13:22:30 oim61024009 kernel: session7: fail_all_commands failing in progress sc d8811700 itt 0x107
Apr 11 13:22:30 oim61024009 kernel: session7: flush_control_queues flushing pending mgmt task itt 0x15a
Apr 11 13:22:30 oim61024009 kernel: session7: iscsi2_queuecommand cmd 0x2a rejected (7)
Apr 11 13:22:32 oim61024009 kernel: connection7:0: iscsi_tcp_hdr_recv_prep (digest disabled)
Apr 11 13:22:32 oim61024009 kernel: connection7:0: iscsi_tcp_task_init mtask deq [itt 0x0]
Apr 11 13:22:32 oim61024009 kernel: connection7:0: iscsi_sw_tcp_send_hdr_prep digest disabled
Apr 11 13:22:32 oim61024009 kernel: connection7:0: iscsi_sw_tcp_send_linear_data_prep datalen=208 digest disabled
Apr 11 13:22:32 oim61024009 kernel: session7: iscsi_prep_mgmt_task mgmtpdu [op 0x3 hdr->itt 0x10000000 datalen 208]
Apr 11 13:22:32 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 0 0 size 48 xmit

Joe Hoot

unread,
Apr 14, 2011, 7:50:30 AM4/14/11
to open-...@googlegroups.com
sorry... I'm confusing the issue... the messages that I showed below were from a iSCSI Disconnect test that we had done while connecting to the EqualLogic. I will re-setup my test tot he tgtd to see what it and the initiator show. I will take you suggestion of writting to dev/null.

Mike Christie

unread,
Apr 14, 2011, 3:10:09 PM4/14/11
to open-...@googlegroups.com, Joe Hoot
On 04/14/2011 06:39 AM, Joe Hoot wrote:
> Here's what I am seeing (roughly... with lines scraped in between.. and I'm only grepping for connect7/session7):
>
> Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
> ...lots of the above types of messages...
> Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
> Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_check_transport_timeouts Setting next tmo 2279563
> Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
> ...lots of the above types of messages...
> Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
> Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_check_transport_timeouts Setting next tmo 2279563
> Apr 11 13:22:24 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
> ...lots of the above types of messages...
> Apr 11 13:22:25 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
> Apr 11 13:22:25 oim61024009 kernel: connection7:0: iscsi_check_transport_timeouts Sending nopout as ping
> Apr 11 13:22:25 oim61024009 kernel: connection7:0: iscsi_check_transport_timeouts Setting next tmo 2280813
> Apr 11 13:22:25 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
> ...lots of the above types of messages...
> Apr 11 13:22:30 oim61024009 kernel: connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit

Does it always say "xmit" in the iscsi_tcp_segment_done messages from
13:22:20 - 13:22:30 or does it sometimes say "recv"? Just send the full
logs actually.

And are you running with that kernel fix (I think oracle merged if for
you right). This fix:
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=4c48a82935f833d94fcf44c2b0c5d2922acfc77a;hp=d1acfae514425d680912907c6554852f1e258551


If there is no "recv"s in there then you need the attached patch. No
idea if this applies to the oracle kernel you are using.

The weird thing though would be that it means for 10 seconds we just
send data and never get a response to any of the pdus we have sent.

What is the output of iscsiadm -m session -P 2 when you are logged into
the target?

acct-for-all-traffic.patch

Mike Christie

unread,
Apr 14, 2011, 3:17:29 PM4/14/11
to open-...@googlegroups.com, Joe Hoot

Ignore the previous mail. Your mail was the first one I read today and
my eyes are not wide open yet :)

Send the full logs. And were you using that patch to send the
sk_sndtimeo lower? It looks like you were.

Apr 11 13:22:24 oim61024009 kernel: connection7:0:
iscsi_tcp_segment_done copied 2592 0 size 4096 xmit

This message means we have sent 2592 bytes out of 4096. The 0 means that
we are trying to send the rest of the data, but we are not able to sent
anything (0 is the number of bytes we were able to send when we did
sendpage/sendmsg). So send the full logs so I can see what else is in there.

Joe Hoot

unread,
Apr 14, 2011, 3:28:57 PM4/14/11
to open-...@googlegroups.com
This test was done on an OEL5.6 system, using the latest released RHEL kernel + 872 code.

Mike, we just discovered that our Cisco switch didn't have FlowControl enabled properly. It looks like it _should_ have autonegotiated to on, but may not have. We've manually forced it on and are running another test at the moment. I don't want to eat up too much of your time, so let me gather the new set of info from a new test fist. Then I'll attach the full logs.

Thanks,
Joe

> --
> You received this message because you are subscribed to the Google Groups "open-iscsi" group.
> To post to this group, send email to open-...@googlegroups.com.
> To unsubscribe from this group, send email to open-iscsi+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.
>

> <acct-for-all-traffic.patch>

Joe Hoot

unread,
Apr 14, 2011, 4:12:52 PM4/14/11
to open-...@googlegroups.com
sorry.. ignore my last email this time :P We've been doing a ton of tests, so I just confused my self. That test is from a default OVM system that had its default kernel and iscsi tools. We also had disconnects in an OEL5.6 environment with latest released 872 as well.

Do you know if the iscsi-initiator-utils-6.2.0.872-6.0.2.el5 rpm contains that fix patch as well? If so, I have the dt tests running as of 16:04 in that OEL5.6 environment (2.6.18-238.el5).

I figured instead of testing in OVM, if I can reproduce it with OEL5.6, it probably has a larger audience.

Mike Christie

unread,
Apr 14, 2011, 7:57:26 PM4/14/11
to open-...@googlegroups.com, Joe Hoot
On 04/14/2011 03:12 PM, Joe Hoot wrote:
> sorry.. ignore my last email this time :P We've been doing a ton of tests, so I just confused my self. That test is from a default OVM system that had its default kernel and iscsi tools. We also had disconnects in an OEL5.6 environment with latest released 872 as well.
>
> Do you know if the iscsi-initiator-utils-6.2.0.872-6.0.2.el5 rpm contains that fix patch as well? If so, I have the dt tests running as of 16:04 in that OEL5.6 environment (2.6.18-238.el5).
>

The patches are kernel patches. Red Hat's 2.6.18-238.el5 kernel has the
fix pointed to by the http://git.kernel.org link in the other mail.

I do not think you are hitting that though. In the logs you sent we just
cannot send data. Send the logs and we will know for sure.

Mike Christie

unread,
Apr 15, 2011, 2:31:06 PM4/15/11
to open-...@googlegroups.com, Joe Hoot
On 04/14/2011 02:17 PM, Mike Christie wrote:
> This message means we have sent 2592 bytes out of 4096. The 0 means that
> we are trying to send the rest of the data, but we are not able to sent
> anything (0 is the number of bytes we were able to send when we did
> sendpage/sendmsg). So send the full logs so I can see what else is in
> there.
>

Ok in your current logs we have

Apr 13 11:07:23 oim61024008 kernel: connection3:0: ping timeout of 5
secs expired, recv timeout 5, last rx 559699, last ping 560949, now 562199


Then looking over session3/connection3:0 we have

connection3:0: iscsi_tcp_segment_done copied 2632 0 size 4096 xmit

from Apr 13 11:07:23 to the middle of Apr 13 11:07:13.

So for 10 seconds we are trying to do sendpage but it keeps returning 0.
For some reason the network layer is just not sending new IO. The
network layer is probably retransmitting and so the network layer runs
out of writespace.

Did you by any chance also take a network trace (tcpdump or
ethereal/wireshark)? Could you send it?

Can you run a rhel kernel? I can add some debugging in to the kernel.

Joe Hoot

unread,
Apr 28, 2011, 3:42:59 PM4/28/11
to open-...@googlegroups.com
Mike or others,

I would like to enable a small (yes, feel free to snicker here...) bit of debugging on one of our systems to try to capture tcp window sizes (so that I can attempt to trend some behaviors). Mike, recall awhile ago you had mentioned that the following will help us gather more information:

echo 1 > /sys/module/libiscsi2/parameters/debug_libiscsi
echo 1 > /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
echo 1 > /sys/module/iscsi_tcp/parameters/debug_iscsi_tcp

I believe that the only one that I really need to turn on to get more information about size of data being sent would be this one:

echo 1 > /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp

as it gives me something like this when I run this command:
[root@oim61024006 ~]# echo 1 > /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp; sleep 1; dd if=/dev/zero of=/var/ovs/mount/DCA44DBDD7944327A7945E96BF8F7CCC/joetest.dd.img bs=1K count=1; echo 0 > /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 3.7e-05 seconds, 27.7 MB/s
[root@oim61024006 ~]#

That produced this in /var/log/messages:
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_task_init task deq [itt 0x1c6 imm 0 unsol 0]
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 0 size 48 xmit
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 48 size 48 xmit
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done total copied 48 total size 48
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb in 560 bytes
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb skb deace200 ptr=dd84c800 avail=560
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 0 size 48 recv
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_recv copying 48
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 48 size 48 recv
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done total copied 48 total size 48
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb segment done

---> and I'm thinking that this is actually where my dd starts going:
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_hdr_dissect opcode 0x25 ahslen 0 datalen 512
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_hdr_dissect iscsi_tcp_begin_data_in( offset=0, datalen=512)
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb in 512 bytes
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb skb deace200 ptr=dd84c830 avail=512
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 0 size 512 recv
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_recv copying 512
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 512 size 512 recv
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done total copied 512 total size 512
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb segment done
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_hdr_recv_prep (digest disabled)
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb in 0 bytes
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb no more data avail. Consumed 0
--> and this is where it had finished?

Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 0 size 0 xmit
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done total copied 0 total size 0

I'm assuming that these "size 48 recv" might be some keepalive thing (nop pings, dm-multipath readsector 0, etc..)?


Would someone on the list be able to help me understand what these messages mean?

For example, in the first one above:

1) "iscsi_tcp_hdr_dissect opcode 0x25 ahslen 0 datalen 512"

0x25 = I don't need to know exactly, but I'm assuming this is telling the target to get ready for a write
ahslen = some type of send length?
datalen 512 = amount that the initiator will be sending in this segment?

2) "iscsi_tcp_hdr_dissect iscsi_tcp_begin_data_in( offset=0, datalen=512)"

basically tells disk to start a 0 and write 512bytes?

3) "iscsi_tcp_recv_skb in 512 bytes"

skb = send kb's?

etc...

But what I'm really trying to pull out of this information is trying to trend how often the initiator is trying to xmit data, but is not allowed to. Maybe there is also a way (outside of just tcpdumping... or even with tcpdump, with a tweaked filter so I don't get gigs of pcaps) to understand what the tcp window size is currently being set to.

I would then try to parse it after the fact to put the data in rrd graphs.

If there are any sites that explain this in more detail, I would happy to dig. The only issue I have is that I'm not a programmer... I haven't reviewed or coded in C in about 10 years (and even then I was just learning it in school). So if the site is too in depth, I may still be asking questions... but regardless, I am more than happy to hit the ground running and at least do the due diligence of reviewing the info if someone can point me in the right direction.

Thanks in advance to anyone for the help... it is certainly appreciated!

Thanks,
Joe

Mike Christie

unread,
Apr 28, 2011, 6:29:24 PM4/28/11
to open-...@googlegroups.com, Joe Hoot
On 04/28/2011 02:42 PM, Joe Hoot wrote:
> Mike or others,
>
> I would like to enable a small (yes, feel free to snicker here...) bit of debugging on one of our systems to try to capture tcp window sizes (so that I can attempt to trend some behaviors). Mike, recall awhile ago you had mentioned that the following will help us gather more information:
>

How are you setting the window sizes?

> echo 1> /sys/module/libiscsi2/parameters/debug_libiscsi
> echo 1> /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
> echo 1> /sys/module/iscsi_tcp/parameters/debug_iscsi_tcp
>
> I believe that the only one that I really need to turn on to get more information about size of data being sent would be this one:
>
> echo 1> /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
>
> as it gives me something like this when I run this command:
> [root@oim61024006 ~]# echo 1> /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp; sleep 1; dd if=/dev/zero of=/var/ovs/mount/DCA44DBDD7944327A7945E96BF8F7CCC/joetest.dd.img bs=1K count=1; echo 0> /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
> 1+0 records in
> 1+0 records out
> 1024 bytes (1.0 kB) copied, 3.7e-05 seconds, 27.7 MB/s
> [root@oim61024006 ~]#
>
> That produced this in /var/log/messages:
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_task_init task deq [itt 0x1c6 imm 0 unsol 0]
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 0 size 48 xmit
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 48 size 48 xmit
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done total copied 48 total size 48

Need libiscsi debugging on, but this looks like it might have been a
iscsi pdus related to scsi READ commands. On the next line we see some
data coming in from the target in response to some command.


> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb in 560 bytes
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb skb deace200 ptr=dd84c800 avail=560
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 0 size 48 recv
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_recv copying 48
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 48 size 48 recv
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done total copied 48 total size 48
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb segment done
>
> ---> and I'm thinking that this is actually where my dd starts going:
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_hdr_dissect opcode 0x25 ahslen 0 datalen 512

opcode 25 is a iSCSI SCSI DATA_IN. Basically the target sends data to
the initiator in this type of pdu in response to a read command. Below
we see more data getting read in.

> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_hdr_dissect iscsi_tcp_begin_data_in( offset=0, datalen=512)

so here datalen means that the DATA_IN pdu has 512 bytes of data that we
are going to read in.

> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb in 512 bytes
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb skb deace200 ptr=dd84c830 avail=512
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 0 size 512 recv
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_recv copying 512
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 512 size 512 recv
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done total copied 512 total size 512
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb segment done

So the above blob just means we got the 512 bytes read in successfully.

> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_hdr_recv_prep (digest disabled)
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb in 0 bytes
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb no more data avail. Consumed 0
> --> and this is where it had finished?
>
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done copied 0 0 size 0 xmit
> Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done total copied 0 total size 0
>
> I'm assuming that these "size 48 recv" might be some keepalive thing (nop pings, dm-multipath readsector 0, etc..)?
>

Is a iscsi header. In a iscsi packet there is basically a header that
tells us what type of command it is (ping, scsi command, ready to
transfer, login, etc), how much data is going to be transferred in the
data segment after the header, etc.

For the xmit path you need libiscsi debugging on as well as libiscsi_tcp
to determine what type of iscsi header it is.

See above for info the packets.

> 3) "iscsi_tcp_recv_skb in 512 bytes"
>
> skb = send kb's?

skb is a networking thing. It is just short for "struct sk_buff" which
just represents a buffer that we could send/recv to/from.


>
> etc...
>
> But what I'm really trying to pull out of this information is trying to trend how often the initiator is trying to xmit data, but is not allowed to. Maybe there is also a way (outside of just tcpdumping... or even with tcpdump, with a tweaked filter so I don't get gigs of pcaps) to understand what the tcp window size is currently being set to.
>

There is no nice printk for when sendpage/sendmsg has failed to send
data. In the past logs I could tell it was happening because in the xmit
messages we would repeat messages like this:

connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit

which comes from here:

ISCSI_DBG_TCP(tcp_conn->iscsi_conn, "copied %u %u size %u %s\n",
segment->copied, copied, segment->size,
recv ? "recv" : "xmit");

And so the first value is how much we have copied so far. The second
values is how much got copied when we just called sendpage/sendmsg and
the last value size is the total amount of data we want to send.

In the logs you sent it kept repeating the same values and we never saw
the copied values get incremented and we never saw this message
indicating we sent everything.

/* Do we have more scatterlist entries? */
ISCSI_DBG_TCP(tcp_conn->iscsi_conn, "total copied %u total size
%u\n",
segment->total_copied, segment->total_size);


> I would then try to parse it after the fact to put the data in rrd graphs.
>
> If there are any sites that explain this in more detail, I would happy to dig. The only issue I have is that I'm not a programmer... I haven't reviewed or coded in C in about 10 years (and even then I was just learning it in school). So if the site is too in depth, I may still be asking questions... but regardless, I am more than happy to hit the ground running and at least do the due diligence of reviewing the info if someone can point me in the right direction.
>

If you can build I kernel I can make this a lot easier. I can just add a
printk to the sendpage/sendmsg call, and so it will be really easy to
pinpoint.

Joe Hoot

unread,
Apr 28, 2011, 8:07:56 PM4/28/11
to open-...@googlegroups.com
Thanks for the quick response (as always) Mike.

Ok. I see where you're going. I should be able to build a kernel, but I have already veered a little bit away from where Oracle already is with their OVM kernel. I don't want to veer to far (even with printk statements)-- but maybe... let me think about possibly doing something with just tcpdump to see if I can see tcp windows scaling real low.

We have a concern at ITEC that the EqualLogic is setting TCP windows towards zero. And when it does that we are concerned that if it does this too much, even noop pings aren't getting through.

So... let think about this and see if I can respond again to you tomorrow morning.

Thanks again for jumping on this.

Joe

Shyam...@dell.com

unread,
Apr 29, 2011, 10:46:05 AM4/29/11
to open-...@googlegroups.com

> -----Original Message-----
> From: open-...@googlegroups.com [mailto:open-...@googlegroups.com]
> On Behalf Of Joe Hoot
> Sent: Thursday, April 28, 2011 8:08 PM
> To: open-...@googlegroups.com
> Subject: Re: determining window size
>
> Thanks for the quick response (as always) Mike.
>
> Ok. I see where you're going. I should be able to build a kernel, but
> I have already veered a little bit away from where Oracle already is
> with their OVM kernel. I don't want to veer to far (even with printk
> statements)-- but maybe... let me think about possibly doing something
> with just tcpdump to see if I can see tcp windows scaling real low.
>
> We have a concern at ITEC that the EqualLogic is setting TCP windows
> towards zero. And when it does that we are concerned that if it does
> this too much, even noop pings aren't getting through.

What is the NIC at the initiator..? Is it an offload session ?

I have seen a TCP zero window issue recently.

> iscsi+un...@googlegroups.com.


> > For more options, visit this group at
> http://groups.google.com/group/open-iscsi?hl=en.
> >
>
> --
> You received this message because you are subscribed to the Google
> Groups "open-iscsi" group.
> To post to this group, send email to open-...@googlegroups.com.
> To unsubscribe from this group, send email to open-

> iscsi+un...@googlegroups.com.

Joe Hoot

unread,
Apr 29, 2011, 11:47:08 AM4/29/11
to open-...@googlegroups.com
We have a few environments that we're testing this with right now. We are mostly using bnx2 (the nics are capable of offloading, but we aren't using that portion). Since you can't disable the nic from doing offloading from the BIOS, we do ti when the system comes only with the following:

for ETH in eth0 eth1 eth2 eth3; do
ethtool -K $ETH tx off
ethtool -K $ETH rx off
ethtool -K $ETH sg off
ethtool -K $ETH tso off
ethtool -K $ETH gso off
done

For some reason, bnx2i is still getting loaded in the OS as a module, but (I believe, because I tested this awhile ago) even if I rmmod bnx2i, it will automatically get loaded again once iscsid starts back up.

> To unsubscribe from this group, send email to open-iscsi+...@googlegroups.com.

Mike Christie

unread,
Apr 29, 2011, 2:15:21 PM4/29/11
to open-...@googlegroups.com, Joe Hoot
On 04/29/2011 10:47 AM, Joe Hoot wrote:
>
> For some reason, bnx2i is still getting loaded in the OS as a module, but (I believe, because I tested this awhile ago) even if I rmmod bnx2i, it will automatically get loaded again once iscsid starts back up.
>

The iscsi init scripts are lazy and instead of loading modules when they
are used, they just load all of them. The bnx2i module would only be
used if you set it up to be used by passing a iface with bnx2i as the
transport into a iscsiadm discovery or node command.

Shyam...@dell.com

unread,
Apr 29, 2011, 4:15:36 PM4/29/11
to open-...@googlegroups.com

> -----Original Message-----
> From: open-...@googlegroups.com [mailto:open-...@googlegroups.com]
> On Behalf Of Joe Hoot
> Sent: Friday, April 29, 2011 11:47 AM
> To: open-...@googlegroups.com
> Subject: Re: determining window size
>

> We have a few environments that we're testing this with right now. We
> are mostly using bnx2 (the nics are capable of offloading, but we
> aren't using that portion). Since you can't disable the nic from doing
> offloading from the BIOS, we do ti when the system comes only with the
> following:
>

Not really.. You are not using the offload stack when you configure your iscsi sessions to use TCP in the OS. Once the BIOS passes control to the OS, the OS can create sessions through either stack. In fact even if you enabled BIOS(offload mode) you can't really offload iSCSI unless there is a LICENSE key plugged into the motherboard.

I presume that is the case unless you created session using the iface created by bnx2i..


> for ETH in eth0 eth1 eth2 eth3; do
> ethtool -K $ETH tx off
> ethtool -K $ETH rx off
> ethtool -K $ETH sg off
> ethtool -K $ETH tso off
> ethtool -K $ETH gso off
> done

This is not necessary to offload infact you may have a performance hit since you do this..

>
> For some reason, bnx2i is still getting loaded in the OS as a module,
> but (I believe, because I tested this awhile ago) even if I rmmod
> bnx2i, it will automatically get loaded again once iscsid starts back
> up.

Even this is not necessary.. Since even if it is loaded it is not getting used..


I guess.. the trace will exactly point to the tcp window sizes here...

Look for resets either by initiator or target a zero window could be pretty near to a reset..

Sometimes if either initiator or target advertizes a zero window it results in TCP resets and therefore an error messages in the kernel. So watch for those kernel messages as well to profile where you need to search in the trace.

Reply all
Reply to author
Forward
0 new messages