Hi all...

I have four CentOS 5.4 (2.6.18-164.11.1.el5) servers with iscsid version 2.0-871. Two are misbehaving despite identical configuration. They all connect to an Enhance Tech RS8-IP4 array the same way: directly NIC-to-NIC without a switch, physically separate from the LAN. I created four targets, one per port, and four separate volumes/LUNs.

Pasted below are the config and error log. About a minute after a successful login, the timeouts/errors begin and recur roughly every minute for as long as the session is logged in, regardless of mount state. The problematic units are also often very slow at logging in, mounting, and at times even listing a directory. They also sometimes time out and remount the fs read-only in the middle of a large backup run.

The other two servers exhibit no such problems whatsoever.

I'm very new to iSCSI and not sure where to start looking. I would be grateful if someone could point me in the right direction...

**CONFIG**
node.startup = automatic
node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 20
node.session.initial_login_retry_max = 8
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 32768
node.conn[0].iscsi.HeaderDigest = None
node.session.iscsi.FastAbort = Yes

**/VAR/LOG/MESSAGES**
Dec 13 17:10:57 db4 kernel: Loading iSCSI transport class v2.0-871.
Dec 13 17:10:57 db4 kernel: cxgb3i: tag itt 0x1fff, 13 bits, age 0xf, 4 bits.
Dec 13 17:10:57 db4 kernel: iscsi: registered transport (cxgb3i)
Dec 13 17:10:57 db4 kernel: Broadcom NetXtreme II CNIC Driver cnic v2.0.1 (Oct 01, 2009)
Dec 13 17:10:57 db4 kernel: Broadcom NetXtreme II iSCSI Driver bnx2i v2.0.1e (June 22, 2009)
Dec 13 17:10:57 db4 kernel: iscsi: registered transport (bnx2i)
Dec 13 17:10:58 db4 kernel: iscsi: registered transport (tcp)
Dec 13 17:10:58 db4 kernel: iscsi: registered transport (iser)
Dec 13 17:10:58 db4 iscsid: iSCSI logger with pid=24781 started!
Dec 13 17:10:59 db4 iscsid: transport class version 2.0-871. iscsid version 2.0-871
Dec 13 17:10:59 db4 iscsid: iSCSI daemon with pid=24782 started!
Dec 13 17:11:07 db4 kernel: scsi15 : iSCSI Initiator over TCP/IP
Dec 13 17:11:07 db4 kernel: Vendor: ETIUSA Model: UltraStorRS8IP4 Rev: 1.1.
Dec 13 17:11:07 db4 kernel: Type: Direct-Access ANSI SCSI revision: 04
Dec 13 17:11:07 db4 kernel: SCSI device sdc: 288374784 4096-byte hdwr sectors (1181183 MB)
Dec 13 17:11:07 db4 kernel: sdc: Write Protect is off
Dec 13 17:11:07 db4 kernel: SCSI device sdc: drive cache: write back
Dec 13 17:11:07 db4 kernel: SCSI device sdc: 288374784 4096-byte hdwr sectors (1181183 MB)
Dec 13 17:11:07 db4 kernel: sdc: Write Protect is off
Dec 13 17:11:07 db4 kernel: SCSI device sdc: drive cache: write back
Dec 13 17:11:07 db4 kernel: sdc: sdc1
Dec 13 17:11:07 db4 kernel: sd 15:0:0:0: Attached scsi disk sdc
Dec 13 17:11:07 db4 kernel: sd 15:0:0:0: Attached scsi generic sg2 type 0
Dec 13 17:11:08 db4 iscsid: connection1:0 is operational now
Dec 13 17:11:59 db4 kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 21989742817, last ping 21989747817, now 21989752817
Dec 13 17:11:59 db4 kernel: connection1:0: detected conn error (1011)
Dec 13 17:12:00 db4 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Dec 13 17:12:17 db4 iscsid: connection1:0 is operational after recovery (2 attempts)
Dec 13 17:13:06 db4 kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 21989810632, last ping 21989815632, now 21989820632
Dec 13 17:13:06 db4 kernel: connection1:0: detected conn error (1011)
Dec 13 17:13:07 db4 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Dec 13 17:13:25 db4 iscsid: connection1:0 is operational after recovery (2 attempts)
*(mounting now)
Dec 13 17:13:34 db4 kernel: kjournald starting. Commit interval 5 seconds
Dec 13 17:13:34 db4 kernel: EXT3 FS on sdc1, internal journal
Dec 13 17:13:34 db4 kernel: EXT3-fs: mounted filesystem with ordered data mode.
*(mount successful)
Dec 13 17:14:14 db4 kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 21989877855, last ping 21989882855, now 21989887855
Dec 13 17:14:14 db4 kernel: connection1:0: detected conn error (1011)
Dec 13 17:14:14 db4 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Dec 13 17:14:32 db4 iscsid: connection1:0 is operational after recovery (2 attempts)
Dec 13 17:15:02 db4 kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 21989925928, last ping 21989930928, now 21989935928
Dec 13 17:15:02 db4 kernel: connection1:0: detected conn error (1011)
Dec 13 17:15:02 db4 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Dec 13 17:15:20 db4 iscsid: connection1:0 is operational after recovery (2 attempts)
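
In case it helps to quantify the "pretty much every minute" pattern, here is a rough Python sketch that pulls the ping-timeout entries out of a messages file and prints the gap between them. It assumes one syslog entry per line in the format shown above; the function names and the placeholder year are my own (syslog lines carry no year), so treat it as an illustration rather than a proper tool.

```python
import re
from datetime import datetime

# Matches kernel "ping timeout" entries in the format seen in the log above.
PING_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<conn>connection\d+:\d+): ping timeout of (?P<secs>\d+) secs expired"
)

# Assumption: syslog timestamps have no year, so an arbitrary one is supplied.
# It only matters for computing differences between entries.
PLACEHOLDER_YEAR = 2000


def ping_timeouts(lines):
    """Return (timestamp, connection) for every ping-timeout entry."""
    events = []
    for line in lines:
        m = PING_RE.match(line)
        if m:
            ts = datetime.strptime(
                f"{PLACEHOLDER_YEAR} {m.group('stamp')}", "%Y %b %d %H:%M:%S"
            )
            events.append((ts, m.group("conn")))
    return events


def gaps_in_seconds(events):
    """Seconds elapsed between consecutive ping timeouts."""
    times = [ts for ts, _ in events]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    # The four ping-timeout entries from the log above.
    sample = [
        "Dec 13 17:11:59 db4 kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 21989742817, last ping 21989747817, now 21989752817",
        "Dec 13 17:13:06 db4 kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 21989810632, last ping 21989815632, now 21989820632",
        "Dec 13 17:14:14 db4 kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 21989877855, last ping 21989882855, now 21989887855",
        "Dec 13 17:15:02 db4 kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 21989925928, last ping 21989930928, now 21989935928",
    ]
    events = ping_timeouts(sample)
    print(len(events), gaps_in_seconds(events))  # 4 [67, 68, 48]
```

To run it against the real file, the `sample` list could be swapped for `open("/var/log/messages")`, since the functions only need an iterable of lines.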

Thanks guys,
-Paul