Hello all,
I am encountering a strange issue with open-iscsi. I have a test machine with 500 iSCSI volumes backed by an IP SAN. The test machine performs reads and writes with O_DIRECT on those 500 raw block devices. During the test I trigger a failure on the IP SAN so that some iSCSI connections break. The iSCSI client is able to reconnect and recover; however, immediately after recovery, some iSCSI reads return corrupted data.
This issue happens frequently. After a lot of tracing on the IP SAN server, we are sure that the corrupted read requests were never received by the iSCSI server on the IP SAN.
In the following timeline, the client issues the read around time t1, when the connections are cut. The iSCSI connection recovers at time t2. The time between t1 and t2 is about 15 to 20 seconds. The read returns several seconds after t2.
                cut iSCSI connections                       iSCSI connections recovered
                          |                                               |
------------------------- t1 ------------------------------------------- t2 ---------------------------------->
The client machine uses Linux libaio to perform the reads and writes, as follows:
- Block devices are opened with O_DIRECT; the I/O buffer and the I/O offset are both 4K-aligned.
- Call io_submit() to submit requests to the block device.
- Call io_getevents() to wait for completion events.
  * If the status is “N bytes done”, assume the I/O was successful.
  * If the status is “-1”, assume the I/O failed.
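To make the completion check concrete, here is a minimal sketch of a stricter variant that also flags short completions, including 0 bytes done. It is not the code our test runs. The names classify_completion and io_outcome are hypothetical, and it assumes that res is the res field of the io_event returned by io_getevents(), cast to long, with failures reported as negative errno values.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libaio. */
enum io_outcome {
    IO_OK,      /* exactly the requested number of bytes completed */
    IO_SHORT,   /* fewer bytes than requested, including 0 */
    IO_FAILED   /* negative result, assumed to be -errno */
};

/*
 * Classify one completion.
 * res:      io_event.res cast to long (assumption: negative means -errno)
 * expected: number of bytes requested in the matching iocb
 */
static enum io_outcome classify_completion(long res, size_t expected)
{
    if (res < 0)
        return IO_FAILED;
    if ((size_t)res != expected)
        return IO_SHORT;   /* buffer is only partly filled, or not at all */
    return IO_OK;
}
```

With a check like this, a 4K read that completes with 0 bytes done would be reported as IO_SHORT instead of being counted as a success with the buffer left untouched.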
Is it possible that the iSCSI layer completes a block read or write with 0 bytes done because the connection is unavailable, so that the upper layer receives a completion with 0 bytes as its result?
Thank you for reading.
-Shawn