I am trying to use an iSCSI target over iSER (InfiniBand), but the filesystem becomes unstable once I mount it.
I am using OFED-1.2.5.1, scsi-target-utils-0.2-20071227_1 and iscsi-initiator-utils-2.0-865.
Below are the logs from the iSCSI initiator server at the point where the filesystem turns read-only.
[root@OSS1 ~]# mount
/dev/sda1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/sdd1 on /mnt/sdd1 type ext3 (rw)
[root@OSS1 ~]# time dd if=/dev/zero of=/mnt/sdd1/1G bs=1024K count=1000
dd: writing `/mnt/sdd1/1G': Read-only file system
126+0 records in
125+0 records out
132059136 bytes (132 MB) copied, 0.472386 seconds, 280 MB/s
real 0m0.519s
user 0m0.000s
sys 0m0.335s
[root@OSS1 ~]#
[root@OSS1 ~]#dmesg
SCSI device sdd: drive cache: write back
sdd: sdd1
sd 3:0:0:2: Attached scsi disk sdd
sd 3:0:0:2: Attached scsi generic sg5 type 0
kjournald starting. Commit interval 5 seconds
EXT3-fs warning (device sdd1): ext3_clear_journal_err: Filesystem error recorded from previous mount: IO failure
EXT3-fs warning (device sdd1): ext3_clear_journal_err: Marking fs in need of filesystem check.
EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
EXT3 FS on sdd1, internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
EXT3-fs error (device sdd1): ext3_new_block: Allocating block in system zone - blocks from 1212424, length 1
Aborting journal on device sdd1.
ext3_abort called.
EXT3-fs error (device sdd1): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
EXT3-fs error (device sdd1): ext3_free_blocks: Freeing blocks in system zones - Block = 1212424, count = 1
EXT3-fs error (device sdd1) in ext3_free_blocks_sb: Journal has aborted
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
[root@OSS1 ~]#
[root@OSS1 ~]#tailf /var/log/messages
Apr 16 20:45:42 OSS1 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 16 20:49:43 OSS1 kernel: connection2:0: iscsi: detected conn error (1011)
Apr 16 20:49:43 OSS1 kernel: iser: iscsi_iser_ep_disconnect:ib conn ffff8101be1323c0 state 2
Apr 16 20:49:43 OSS1 kernel: iser: iser_cq_tasklet_fn:comp w. error op 0 status 5
Apr 16 20:49:43 OSS1 last message repeated 4 times
Apr 16 20:49:43 OSS1 kernel: iser: iser_cma_handler:event 10 conn ffff8101be1323c0 id ffff81019ea9d800
Apr 16 20:49:43 OSS1 kernel: iser: iser_free_ib_conn_res:freeing conn ffff8101be1323c0 cma_id ffff81019ea9d800 fmr pool ffff81019f09f9c0 qp ffff81019ea9d200
Apr 16 20:49:43 OSS1 kernel: iser: iser_device_try_release:device ffff8101afa84840 refcount 1
Apr 16 20:49:44 OSS1 iscsid: Kernel reported iSCSI connection 2:0 error (1011) state (3)
Apr 16 20:49:47 OSS1 kernel: iser: iscsi_iser_ib_conn_lookup:no conn exists for eph ffffffffffffffff
Apr 16 20:49:47 OSS1 kernel: iser: iser_connect:connecting to: 10.1.1.146, port 0xbc0c
Apr 16 20:49:47 OSS1 kernel: iser: iser_cma_handler:event 0 conn ffff8101be1323c0 id ffff81019ea9d800
Apr 16 20:49:47 OSS1 kernel: iser: iser_cma_handler:event 2 conn ffff8101be1323c0 id ffff81019ea9d800
Apr 16 20:49:47 OSS1 kernel: iser: iser_create_ib_conn_res:setting conn ffff8101be1323c0 cma_id ffff81019ea9d800: fmr_pool ffff81019eb5ecc0 qp ffff81019e266600
Apr 16 20:49:47 OSS1 kernel: iser: iser_cma_handler:event 9 conn ffff8101be1323c0 id ffff81019ea9d800
Apr 16 20:49:48 OSS1 kernel: iser: iscsi_iser_ep_poll:ib conn ffff8101be1323c0 rc = 1
Apr 16 20:49:48 OSS1 kernel: iser: iscsi_iser_conn_bind:binding iscsi conn ffff81019f597290 to iser_conn ffff8101be1323c0
Apr 16 20:49:49 OSS1 iscsid: received iferror -38
Apr 16 20:49:49 OSS1 iscsid: received iferror -22
Apr 16 20:49:49 OSS1 last message repeated 2 times
Apr 16 20:49:49 OSS1 iscsid: connection2:0 is operational after recovery (2 attempts)
Apr 16 20:49:53 OSS1 kernel: iscsi: host reset succeeded
Apr 16 21:24:22 OSS1 kernel: iser: iser_connect:connecting to: 10.1.1.146, port 0xbc0c
Apr 16 21:24:22 OSS1 kernel: iser: iser_cma_handler:event 0 conn ffff8101b22215c0 id ffff8101a283be00
Apr 16 21:24:22 OSS1 kernel: iser: iser_cma_handler:event 2 conn ffff8101b22215c0 id ffff8101a283be00
Apr 16 21:24:22 OSS1 kernel: iser: iser_create_ib_conn_res:setting conn ffff8101b22215c0 cma_id ffff8101a283be00: fmr_pool ffff8101a2db9d40 qp ffff8101a2f85000
Apr 16 21:24:22 OSS1 kernel: iser: iser_cma_handler:event 9 conn ffff8101b22215c0 id ffff8101a283be00
Apr 16 21:24:23 OSS1 iscsid: transport class version 2.0-724. iscsid version 2.0-865
Apr 16 21:24:23 OSS1 iscsid: iSCSI daemon with pid=4995 started!
Apr 16 21:24:23 OSS1 iscsid: received iferror -38
Apr 16 21:24:23 OSS1 iscsid: received iferror -22
Apr 16 21:24:23 OSS1 last message repeated 2 times
Apr 16 21:24:23 OSS1 iscsid: connection1:0 is operational now
Apr 16 21:24:23 OSS1 kernel: iser: iscsi_iser_ep_poll:ib conn ffff8101b22215c0 rc = 1
Apr 16 21:24:23 OSS1 kernel: scsi3 : iSCSI Initiator over iSER, v.0.1
Apr 16 21:24:23 OSS1 kernel: iser: iscsi_iser_conn_bind:binding iscsi conn ffff8101a2eea290 to iser_conn ffff8101b22215c0
Apr 16 21:24:23 OSS1 kernel: Vendor: IET Model: Controller Rev: 0001
Apr 16 21:24:23 OSS1 kernel: Type: RAID ANSI SCSI revision: 05
Apr 16 21:24:23 OSS1 kernel: scsi 3:0:0:0: Attached scsi generic sg4 type 12
Apr 16 21:24:23 OSS1 kernel: Vendor: IET Model: VIRTUAL-DISK Rev: 0001
Apr 16 21:24:23 OSS1 kernel: Type: Direct-Access ANSI SCSI revision: 05
Apr 16 21:24:23 OSS1 kernel: SCSI device sdd: 390636477 512-byte hdwr sectors (200006 MB)
Apr 16 21:24:23 OSS1 kernel: sdd: Write Protect is off
Apr 16 21:24:23 OSS1 kernel: SCSI device sdd: drive cache: write back
Apr 16 21:24:23 OSS1 kernel: SCSI device sdd: 390636477 512-byte hdwr sectors (200006 MB)
Apr 16 21:24:23 OSS1 kernel: sdd: Write Protect is off
Apr 16 21:24:23 OSS1 kernel: SCSI device sdd: drive cache: write back
Apr 16 21:24:23 OSS1 kernel: sdd: sdd1
Apr 16 21:24:23 OSS1 kernel: sd 3:0:0:2: Attached scsi disk sdd
Apr 16 21:24:23 OSS1 kernel: sd 3:0:0:2: Attached scsi generic sg5 type 0
Apr 16 21:24:24 OSS1 iscsid: received iferror -38
Apr 16 21:24:24 OSS1 iscsid: received iferror -22
Apr 16 21:24:24 OSS1 last message repeated 2 times
Apr 16 21:24:24 OSS1 iscsid: connection2:0 is operational now
Apr 16 21:25:59 OSS1 kernel: kjournald starting. Commit interval 5 seconds
Apr 16 21:25:59 OSS1 kernel: EXT3-fs warning (device sdd1): ext3_clear_journal_err: Filesystem error recorded from previous mount: IO failure
Apr 16 21:25:59 OSS1 kernel: EXT3-fs warning (device sdd1): ext3_clear_journal_err: Marking fs in need of filesystem check.
Apr 16 21:25:59 OSS1 kernel: EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
Apr 16 21:25:59 OSS1 kernel: EXT3 FS on sdd1, internal journal
Apr 16 21:25:59 OSS1 kernel: EXT3-fs: recovery complete.
Apr 16 21:25:59 OSS1 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 16 21:26:26 OSS1 kernel: EXT3-fs error (device sdd1): ext3_new_block: Allocating block in system zone - blocks from 1212424, length 1
Apr 16 21:26:26 OSS1 kernel: Aborting journal on device sdd1.
Apr 16 21:26:26 OSS1 kernel: ext3_abort called.
Apr 16 21:26:26 OSS1 kernel: EXT3-fs error (device sdd1): ext3_journal_start_sb: Detected aborted journal
Apr 16 21:26:26 OSS1 kernel: Remounting filesystem read-only
Apr 16 21:26:26 OSS1 kernel: EXT3-fs error (device sdd1): ext3_free_blocks: Freeing blocks in system zones - Block = 1212424, count = 1
Apr 16 21:26:26 OSS1 kernel: EXT3-fs error (device sdd1) in ext3_free_blocks_sb: Journal has aborted
Apr 16 21:26:26 OSS1 kernel: __journal_remove_journal_head: freeing b_committed_data
This eph looks weird. Erez, have you seen this before?
>
> Apr 16 20:49:49 OSS1 iscsid: connection2:0 is operational after recovery (2 attempts)
>
> Apr 16 20:49:53 OSS1 kernel: iscsi: host reset succeeded
>
Erez can tell you more about the iser messages. From this message it
looks like a SCSI command timed out, we could not abort it and could not
reset the LUN, so we dropped the session. I am not sure whether that
caused the iSCSI connection error or whether the connection error caused
the command to time out.
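To help sort out which came first, a small script can pull the relevant events out of /var/log/messages and list them in order. The following is only a sketch: the patterns are copied from the log lines quoted in this thread, while the labels and the timeline() helper are names made up here for illustration and are not part of open-iscsi.

```python
import re

# Patterns copied from messages quoted in this thread; the labels on the
# left are hypothetical names chosen for this sketch.
PATTERNS = [
    ("conn_error", re.compile(r"iscsi: detected conn error \(\d+\)")),
    ("recovered", re.compile(r"is operational after recovery")),
    ("host_reset", re.compile(r"iscsi: host reset succeeded")),
    ("ext3_error", re.compile(r"EXT3-fs error")),
    ("remount_ro", re.compile(r"Remounting filesystem read-only")),
]

# Classic syslog prefix: "Mon DD HH:MM:SS host message"
SYSLOG_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) (.*)$")


def timeline(lines):
    """Return (timestamp, label) pairs for the lines that match a pattern."""
    events = []
    for line in lines:
        m = SYSLOG_RE.match(line)
        if not m:
            continue  # not a syslog-formatted line
        stamp, _host, msg = m.groups()
        for label, pattern in PATTERNS:
            if pattern.search(msg):
                events.append((stamp, label))
                break  # first matching label wins
    return events


# Lines taken from the /var/log/messages excerpt earlier in the thread.
SAMPLE = [
    "Apr 16 20:49:43 OSS1 kernel: connection2:0: iscsi: detected conn error (1011)",
    "Apr 16 20:49:49 OSS1 iscsid: connection2:0 is operational after recovery (2 attempts)",
    "Apr 16 20:49:53 OSS1 kernel: iscsi: host reset succeeded",
    "Apr 16 21:26:26 OSS1 kernel: Remounting filesystem read-only",
]

events = timeline(SAMPLE)
for stamp, label in events:
    print(stamp, label)
```

On a live system the same function could be fed open("/var/log/messages"). Note that syslog timestamps have one-second resolution, so events logged within the same second cannot be ordered this way.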
I'll try to take a look at it, but it may take me some time (currently
busy with other stuff).
Erez
> I am trying to use iscsi target with iser(Infiniband), but when I
> mount it became unstable.
>
> I use OFED-1.2.5.1 , scsi-target-utils-0.2-20071227_1 and
> iscsi-initiator-utils-2.0-865.
>
> I got logs from iscsi-initiator server like below when it turn into
> Read-only file system.
>
Amir from our team in Voltaire will try to help you with that.
Erez
Sean, hello.
I tried to reproduce what you described. This is what I installed:
Initiator: (Redhat 5) OFED 2.1.5.1 with iscsi-initiator-utils-2.0-865
Target: (Sles 10) OFED 2.1.5.1 and scsi-target-utils-0.2-20071227_1
I discovered the target, created an ext3 fs on the device and mounted it
(see mount) and then ran the dd command which did not result in any
error as it did in your case. The fs stayed as it was before.
[root@sage tmp]# mount
/dev/sda1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/sdj on /mnt/sdj type ext3 (rw)
[root@sage tmp]#
[root@sage tmp]# time dd if=/dev/zero of=/mnt/sdj/1G bs=1024K count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 4.05476 seconds, 259 MB/s
real 0m4.549s
user 0m0.003s
sys 0m4.545s
[root@sage tmp]#
[root@sage tmp]# mount
/dev/sda1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/sdj on /mnt/sdj type ext3 (rw)
[root@sage tmp]#
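One caveat about using mount for the before/after comparison: on systems of this generation mount usually reports the contents of /etc/mtab, which may still say (rw) after the kernel has remounted a filesystem read-only because of an error. /proc/mounts reflects the kernel's view. A minimal sketch of checking it from a script follows; the function names and the sample line are made up for illustration and are not output from either machine in this thread.

```python
def mount_options(proc_mounts_text, mount_point):
    """Return the option list for mount_point, or None if it is not mounted."""
    options = None
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")  # a later entry overrides an earlier one
    return options


def is_read_only(proc_mounts_text, mount_point):
    """True if the kernel currently has mount_point mounted read-only."""
    options = mount_options(proc_mounts_text, mount_point)
    if options is None:
        raise ValueError("%s is not mounted" % mount_point)
    return "ro" in options


# Illustrative sample only, not captured from a real system.
SAMPLE_PROC_MOUNTS = (
    "/dev/sda1 / ext3 rw,data=ordered 0 0\n"
    "/dev/sdd1 /mnt/sdd1 ext3 ro,data=ordered 0 0\n"
)

print(is_read_only(SAMPLE_PROC_MOUNTS, "/mnt/sdd1"))
```

On a real initiator the text would come from open("/proc/mounts").read(), checked once before and once after the dd run.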
Please let me know if I left out any parameters that could affect the test.
In any case, we recommend that you move to OFED-1.3; it has additional bug
fixes and more features.
Amir M.
You meant to say 1.2.5.1. Mention that this is the open-iscsi version
that is shipped with 1.2.5.1. I suggest that you fix this error.
Erez