What do you mean by "never gets available"? Can you attach the full dmesg? Is it
the block device (/dev/sdX) that is unavailable, or the multipath device (/dev/dm-XX)?
Is it all of them or just one or two?
>
> I got the tcpdump from the host and it shows that the host repeatedly
> sends "Test Unit Ready" requests and receives "Logical Unit Not Ready,
> Initializing Cmd. Required (0x0402)" in response. "Mode Sense" and
> "Read Capacity" get meaningful responses. Then the host does a "Read"
> request and gets "Logical Unit Not Ready, Cause Not Reportable
> (0x0400)".
Is that for each of the block disks that showed up when you logged in,
or just for some of them? If it is the latter, then that is expected with
Active-Passive targets, which you might have.
.. snip..
> Having very little prior experience with iSCSI I am not sure what
> Initializing Cmd. Required is supposed to mean.
This looks more like a multipath configuration issue, or the lack of
any multipath configuration.
So the block disks show up fine. That means iSCSI works just fine.
> However
> # fdisk /dev/sdb
> Returns with
> Unable to read /dev/sdb
> Finally,
> # iscsiadm -m node -p <target portal> -T <target> -u
> returns.
>
> I am attaching the relevant part of the dmesg log. Hope it can clarify
> the problem
You only see one disk? Either way, are you running multipath on your machine?
If you are, what is the multipath -ll output?
>
> Dear Konrad, I find it strange that the host does not send "Request
> Sense" after receiving an error
> for "Test Unit Ready". Is it a legit SCSI behaviour?
The kernel interrogates the disks and that is what it gets back. It reports
these errors to you. There is no need to send a Request Sense command, as the
Test Unit Ready response has already provided the SCSI error, along with the
ASC/ASCQ values, as you can see:
> sd 6:0:0:1: Device not ready: <6>: Current: sense key: Not Ready
> Additional sense: Logical unit not ready, cause not reportable
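To make the sense key and ASC/ASCQ values concrete, here is a minimal Python sketch that decodes a fixed-format sense buffer (response code 0x70 or 0x71), which is the kind of data returned alongside a failed Test Unit Ready. The function names and the example_sense() helper are hypothetical and only for illustration; the lookup table covers just the four "not ready" codes relevant to this thread and is not a complete ASC/ASCQ list.

```python
# Sketch only: decode fixed-format SCSI sense data (response code 0x70/0x71).
# All names here are illustrative, not taken from the kernel or multipath-tools.

SENSE_KEYS = {0x02: "Not Ready"}

# Small excerpt of ASC 0x04 ("logical unit not ready") codes.
ASC_ASCQ_TEXT = {
    (0x04, 0x00): "Logical unit not ready, cause not reportable",
    (0x04, 0x01): "Logical unit is in process of becoming ready",
    (0x04, 0x02): "Logical unit not ready, initializing command required",
    (0x04, 0x03): "Logical unit not ready, manual intervention required",
}


def decode_fixed_sense(sense: bytes):
    """Return (sense_key, asc, ascq) from a fixed-format sense buffer."""
    if len(sense) < 14:
        raise ValueError("sense buffer too short for ASC/ASCQ")
    response_code = sense[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    sense_key = sense[2] & 0x0F       # low nibble of byte 2
    return sense_key, sense[12], sense[13]


def describe(sense: bytes) -> str:
    """Render the sense data roughly the way the kernel log phrases it."""
    key, asc, ascq = decode_fixed_sense(sense)
    key_text = SENSE_KEYS.get(key, f"sense key 0x{key:x}")
    extra = ASC_ASCQ_TEXT.get((asc, ascq), f"ASC/ASCQ 0x{asc:02x}{ascq:02x}")
    return f"{key_text}: {extra}"


def example_sense(key: int, asc: int, ascq: int) -> bytes:
    """Build a synthetic 18-byte sense buffer for experimenting (hypothetical)."""
    buf = bytearray(18)
    buf[0] = 0x70      # current error, fixed format
    buf[2] = key
    buf[7] = 10        # additional sense length
    buf[12] = asc
    buf[13] = ascq
    return bytes(buf)


if __name__ == "__main__":
    print(describe(example_sense(0x02, 0x04, 0x02)))  # the 0x0402 case from the tcpdump
    print(describe(example_sense(0x02, 0x04, 0x00)))  # the 0x0400 case from the tcpdump
```

The two calls at the bottom correspond to the 0x0402 and 0x0400 responses seen in the tcpdump.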
The next step for enterprise storage arrays, such as the HP StorageWorks,
is to issue a SCSI START command (START STOP UNIT with the start bit set),
which is what you should see if you are using multipath. The path checker
would start the LUN.
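As a rough illustration of the commands involved, the following Python sketch builds the 6-byte CDBs for Test Unit Ready (opcode 0x00) and START STOP UNIT (opcode 0x1B, START bit set). It does not send anything to a device. The helper names are hypothetical, and needs_start() reflects only the assumption that a "Not Ready, initializing command required" response (0x02 / 0x04 / 0x02) is the case where a start command helps; the real hp-sw handler may apply additional logic.

```python
# Sketch only: build the CDBs discussed above; nothing is sent to a device.
# Helper names are illustrative, not taken from multipath-tools or the kernel.

TEST_UNIT_READY = 0x00
START_STOP_UNIT = 0x1B


def test_unit_ready_cdb() -> bytes:
    """6-byte TEST UNIT READY CDB: opcode 0x00, all other bytes zero."""
    return bytes([TEST_UNIT_READY, 0, 0, 0, 0, 0])


def start_unit_cdb(immed: bool = False) -> bytes:
    """6-byte START STOP UNIT CDB asking the logical unit to start."""
    cdb = bytearray(6)
    cdb[0] = START_STOP_UNIT
    cdb[1] = 0x01 if immed else 0x00   # IMMED: return before the start completes
    cdb[4] = 0x01                      # START bit set, LOEJ clear
    return bytes(cdb)


def needs_start(sense_key: int, asc: int, ascq: int) -> bool:
    """Assumption: only 'initializing command required' calls for a start."""
    return (sense_key, asc, ascq) == (0x02, 0x04, 0x02)


if __name__ == "__main__":
    print(test_unit_ready_cdb().hex())
    if needs_start(0x02, 0x04, 0x02):
        print(start_unit_cdb().hex())
```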
That is expected. Either the configuration set in the .conf file by your vendor
or the built-in one should be used.
These are the built-in values:
{
	/* MSA 1000/MSA1500 EVA 3000/5000 with old firmware */
	.vendor        = "(COMPAQ|HP)",
	.product       = "(MSA|HSV)1.0.*",
	.getuid        = DEFAULT_GETUID,
	.features      = "1 queue_if_no_path",
	.hwhandler     = "1 hp-sw",
	.selector      = DEFAULT_SELECTOR,
	.pgpolicy      = GROUP_BY_PRIO,
	.pgfailback    = FAILBACK_UNDEF,
	.rr_weight     = RR_WEIGHT_NONE,
	.no_path_retry = 12,
	.minio         = DEFAULT_MINIO,
	.checker_name  = HP_SW,
	.prio_name     = PRIO_HP_SW,
},
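The vendor and product fields above are regular expressions that are matched against the strings the device reports. The Python sketch below shows roughly how such a match decides whether this built-in entry applies. It assumes an unanchored search, uses Python's re module rather than the POSIX regex engine multipath-tools uses, and the device strings in the usage lines are made-up examples, not values from your array.

```python
import re

# Sketch only: the vendor/product patterns are copied from the built-in entry;
# the dict layout and function name are illustrative.
BUILTIN_ENTRY = {
    "vendor": r"(COMPAQ|HP)",
    "product": r"(MSA|HSV)1.0.*",
    "hwhandler": "1 hp-sw",
}


def entry_matches(entry: dict, vendor: str, product: str) -> bool:
    """True if both the vendor and product patterns are found in the strings."""
    return (
        re.search(entry["vendor"], vendor) is not None
        and re.search(entry["product"], product) is not None
    )


if __name__ == "__main__":
    # Hypothetical inquiry strings, for illustration only.
    for vendor, product in [("HP", "MSA1000 VOLUME"),
                            ("COMPAQ", "HSV110"),
                            ("HP", "HSV200")]:
        print(vendor, product, entry_matches(BUILTIN_ENTRY, vendor, product))
```

Note that the "." in "1.0" matches any single character, so both "MSA1000" and "HSV110" fit the pattern, while a product string starting with "HSV2" does not.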
> May 17 19:42:07 <hostname> multipathd:
> 3600508b3009203503b2fc2c200040011: stop event checker thread
> May 17 19:42:07 <hostname> kernel: multipathd[13092]: segfault at
> 0000000000000012 rip 00002b0b549d541d rsp 00007fff5682d1b0 error 4
You might want to file a bug with the vendor for that.
.. snip ..
> Thank you for the advice.
You are welcome.
>
> I have multipath 0.4.7 running the default configuration (no /etc/
> multipath.conf)
Good. That should use the built-in default values.