I have already set these parameters:
sysctl -w net.ipv4.tcp_window_scaling=0
/etc/iscsi/iscsid.conf
node.session.timeo.replacement_timeout = 86400
node.conn[0].timeo.noop_out_interval = 0
I do not see the partitions in /proc/partitions, and the multipath daemon does
not create any devices under /dev/mapper (checked with ls -lrt /dev/mapper/*).
dmesg says:
sd 22:0:0:0: SCSI error: return code = 0x060e0000
end_request: I/O error, dev sdm, sector 0
connection3:0: detected conn error (1011)
session3: target reset succeeded
connection12:0: detected conn error (1011)
session12: target reset succeeded
connection3:0: detected conn error (1011)
session3: target reset succeeded
connection12:0: detected conn error (1011)
session12: target reset succeeded
sd 13:0:0:0: timing out command, waited 360s
sd 13:0:0:0: SCSI error: return code = 0x060e0000
end_request: I/O error, dev sdd, sector 63
printk: 11 messages suppressed.
Buffer I/O error on device sdd1, logical block 0
Buffer I/O error on device sdd1, logical block 1
Buffer I/O error on device sdd1, logical block 2
Buffer I/O error on device sdd1, logical block 3
Buffer I/O error on device sdd1, logical block 4
Buffer I/O error on device sdd1, logical block 5
Buffer I/O error on device sdd1, logical block 6
Buffer I/O error on device sdd1, logical block 7
Buffer I/O error on device sdd1, logical block 8
Buffer I/O error on device sdd1, logical block 9
sd 22:0:0:0: timing out command, waited 360s
sd 22:0:0:0: SCSI error: return code = 0x060e0000
end_request: I/O error, dev sdm, sector 63
printk: 22 messages suppressed.
Buffer I/O error on device sdm1, logical block 0
Relevant portions of /var/log/messages are:
Jan 31 14:11:44 tptrac1 iscsid: Kernel reported iSCSI connection 6:0 error (1011) state (3)
Jan 31 14:11:44 tptrac1 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Jan 31 14:11:45 tptrac1 iscsid: Kernel reported iSCSI connection 7:0 error (1011) state (3)
Jan 31 14:11:45 tptrac1 iscsid: Kernel reported iSCSI connection 9:0 error (1011) state (3)
Jan 31 14:11:45 tptrac1 iscsid: Kernel reported iSCSI connection 8:0 error (1011) state (3)
Jan 31 14:11:47 tptrac1 kernel: connection4:0: detected conn error (1011)
Jan 31 14:11:47 tptrac1 kernel: connection2:0: detected conn error (1011)
Jan 31 14:11:47 tptrac1 kernel: connection3:0: detected conn error (1011)
Jan 31 14:11:47 tptrac1 kernel: session1: target reset succeeded
Jan 31 14:11:47 tptrac1 kernel: session6: target reset succeeded
Jan 31 14:11:47 tptrac1 kernel: connection5:0: detected conn error (1011)
Jan 31 14:11:47 tptrac1 iscsid: connection6:0 is operational after recovery (1 attempts)
Jan 31 14:11:47 tptrac1 iscsid: connection1:0 is operational after recovery (1 attempts)
Something might be wrong with your storage. 0x060e0000 means the scsi
command is timing out. We had to drop the connection and relogin to fix
the problem. We tried to execute the IO for 360 seconds (scsi command
timeout * (scsi allowed retries + 1)), but it did not complete, so the
scsi layer ended up failing it.
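To make the 360 second figure concrete, here is a minimal sketch of the arithmetic, reading the formula as timeout * (retries + 1). The 60 second timeout and 5 retries are assumed values that happen to produce 360; they are not taken from the logs above, and the function name total_wait is only for illustration.

```shell
#!/bin/sh
# Total time the SCSI layer keeps trying one command before failing it:
# per-attempt timeout multiplied by (allowed retries + 1 initial attempt).
total_wait() {
    timeout="$1"   # seconds per attempt
    retries="$2"   # allowed retries after the first attempt
    echo $(( timeout * (retries + 1) ))
}

# Assumed values, not from the log: 60 s timeout, 5 retries.
total_wait 60 5

# The per-device timeout can be read from sysfs if the device exists.
dev=sdm
if [ -r "/sys/block/$dev/device/timeout" ]; then
    echo "$dev timeout: $(cat "/sys/block/$dev/device/timeout") s"
fi
```

With the assumed values the first command prints 360. Comparing that against the actual value in /sys/block/<dev>/device/timeout shows whether your setup matches this assumption.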
What version of open-iscsi are you using? Could you turn on debugging?
It might be slightly different on your kernel but something like this:
for f in /sys/module/libiscsi/parameters/*debug*; do echo 1 > "$f"; done
for f in /sys/module/libiscsi_tcp/parameters/*debug*; do echo 1 > "$f"; done
for f in /sys/module/iscsi_tcp/parameters/*debug*; do echo 1 > "$f"; done
Could you also send a tcpdump/wireshark trace? That way we can see whether
the IO is making some progress but very slowly, or whether we just do not see
any IO on the initiator side at all.
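A capture along these lines might be enough for such a trace. This is only a sketch: the interface name eth0 and the output path are assumptions, port 3260 is the default iSCSI port and may differ on your target, and the helper name iscsi_capture_cmd is hypothetical. It only prints the command so it can be checked before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: build a tcpdump command line for iSCSI traffic.
# -s 0 captures full packets, -w writes a pcap file that wireshark can open.
iscsi_capture_cmd() {
    iface="${1:-eth0}"          # assumed interface name
    outfile="${2:-iscsi.pcap}"  # assumed output file
    echo "tcpdump -i $iface -s 0 -w $outfile tcp port 3260"
}

# Print the command for review; run it as root while reproducing the problem.
iscsi_capture_cmd eth0 /tmp/iscsi.pcap
```

Start the capture before logging in to the target and stop it once the conn error (1011) messages show up, so the trace covers the failing IO.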
In my very limited experience with huge packets, I think that even that value is rather big. We are running with 9000 here. About 20 years ago we had a printing problem where some packet buffer was a few bytes too small: small print jobs would work, but larger ones would not. Maybe you are seeing something similar with iSCSI. Also: have a _matching_ MTU along the whole path.
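One way to check for a matching MTU along the path is to ping with fragmentation forbidden. The sketch below assumes IPv4 without header options (20 bytes of IP header plus 8 bytes of ICMP header) and the Linux iputils ping; the helper name icmp_payload_for_mtu and passing the portal address as the first argument are illustrative choices only.

```shell
#!/bin/sh
# Largest ICMP payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes IPv4 header minus 8 bytes ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

mtu=9000            # the MTU mentioned above
target="${1:-}"     # hypothetical: iSCSI portal address as first argument
size=$(icmp_payload_for_mtu "$mtu")
echo "payload for MTU $mtu: $size bytes"

# -M do forbids fragmentation; a failure here suggests that some hop
# on the path has a smaller MTU than the one configured locally.
if [ -n "$target" ]; then
    ping -M do -s "$size" -c 3 "$target"
fi
```

If the large ping fails while a small one (for example -s 1472, matching a 1500 byte MTU) succeeds, the MTU is probably not consistent on every hop between initiator and target.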
Regards,
Ulrich
>>> Gopesh Sharma <gopesh.s...@gmail.com> wrote on 01.02.2012 at 17:05 in message
<CAHOahhyEvOptX5=ExHM_OYa_4uv24h_d...@mail.gmail.com>: