iscsi seems to work, but no device node?


agshekeloh

Aug 24, 2010, 12:43:18 PM
to open-iscsi
Hi,

I'm running open-iscsi 2.0-871, on Ubuntu 10.04 amd64, straight from
the operating system package. Mounting an iSCSI drive works perfectly
when booting from a hard drive, but not when booting diskless via
NFS. The iSCSI server is OpenSolaris.

The diskless node can discover the iSCSI node and log in:

# iscsiadm -m session -P1
Target: iqn.1986-03.com.sun:02:2293f889-f1bb-e764-e45f-b7931c6c86a5
Current Portal: XXX.XXX.64.168:3260,1
Persistent Portal: XXX.XXX.64.168:3260,1
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1993-08.org.debian:01:cd00198dd976
Iface IPaddress: XXX.XXX.199.20
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 1
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE

I would expect to see /dev/sda at this point, but it doesn't appear.
There are no nodes starting with "ip" in /dev/disk/by-path, either.

I ran iscsid -fd8 by hand and looked for errors, but it looked much
like the output from the working machine (at least to my untrained
eye). Any suggestions on how to debug this further?

Thanks,
==ml

Mike Christie

Aug 24, 2010, 8:02:59 PM
to open-...@googlegroups.com, agshekeloh


What does iscsiadm -m session -P 3 print out? Could you also send the
/var/log/messages output from when you run the login command that fails
to find disks?

Also, try running

echo - - - > /sys/class/scsi_host/hostX/scan

where X is the host number shown in the -P 3 output.

Does that find disks? Check /var/log/messages and /dev after the echo
has completed.
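
The manual rescan can also be scripted. The following is a minimal sketch, assuming Python 3 and root privileges on the initiator; the function name `rescan_scsi_host` and the `sysfs_root` parameter are illustrative and not part of open-iscsi.

```python
import os


def rescan_scsi_host(host_number, sysfs_root="/sys/class/scsi_host"):
    """Ask the kernel to rescan one SCSI host.

    Writes the wildcard "- - -" (all channels, all targets, all LUNs)
    to <sysfs_root>/host<N>/scan, which is what the echo command does.
    Returns the path that was written to.
    """
    scan_path = os.path.join(sysfs_root, f"host{host_number}", "scan")
    with open(scan_path, "w") as scan_file:
        scan_file.write("- - -\n")
    return scan_path
```

For a session attached to host 7, `rescan_scsi_host(7)` run as root should behave like the echo above. As with the echo, any result shows up in the kernel log and in /dev rather than in a return value.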

mwlucas

Aug 25, 2010, 10:16:52 AM
to open-iscsi
(I'm the original requester; I just changed Google to use my primary
email address.)

Thanks for responding.

> What does iscsiadm -m session -P 3 print out?

# iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-870
version 2.0-871
Target: iqn.1986-03.com.sun:02:2293f889-f1bb-e764-e45f-b7931c6c86a5
Current Portal: XXX.XXX.64.168:3260,1
Persistent Portal: XXX.XXX.64.168:3260,1
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1993-08.org.debian:01:cd00198dd976
Iface IPaddress: XXX.XXX.199.20
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 1
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
************************
Negotiated iSCSI params:
************************
HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 262144
MaxXmitDataSegmentLength: 262144
FirstBurstLength: 262144
MaxBurstLength: 16776192
ImmediateData: Yes
InitialR2T: Yes
MaxOutstandingR2T: 1
************************
Attached SCSI devices:
************************
Host Number: 7 State: running
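
The listing ends at the host line, with no disk entries beneath it. When the host number is needed in a script (for example, to build the sysfs scan path), it can be pulled out of this output. This is a rough sketch, assuming the "Host Number: N" wording shown here stays the same across versions; the helper name `host_numbers` is hypothetical.

```python
import re


def host_numbers(session_output):
    """Return every SCSI host number found in `iscsiadm -m session -P 3` text."""
    return [int(n) for n in re.findall(r"Host Number:\s*(\d+)", session_output)]
```

Feeding it the text above would give a one-element list containing 7. The host number changes after each logout and login, so it should be read again from fresh output each time.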


> Could you also send the
> /var/log/messages output when you run the login command that fails to
> find disks.

Running this command:

# iscsiadm -m node --target iqn.1986-03.com.sun:02:2293f889-f1bb-e764-e45f-b7931c6c86a5 --login
Logging in to [iface: default, target: iqn.1986-03.com.sun:02:2293f889-f1bb-e764-e45f-b7931c6c86a5, portal: XXX.XXX.64.168,3260]
Login to [iface: default, target: iqn.1986-03.com.sun:02:2293f889-f1bb-e764-e45f-b7931c6c86a5, portal: XXX.XXX.64.168,3260]: successful

generates this in /var/log/messages:

Aug 25 10:09:59 onvm1 kernel: [85593.458276] scsi8 : iSCSI Initiator over TCP/IP

> Also if you do
>
> echo - - - > /sys/class/scsi_host/hostX/scan
>
> X is the host number you will see from the -P3 command.

I ran "echo - - - > /sys/class/scsi_host/host8/scan" (the host number
incremented after logout/login).

> Does that find disks (check /var/log/messages and /dev after the echo
> has completed).

Nope, no change. No additional entry in /var/log/messages, no
additional /dev/sd entries, no changes in /dev/disk/by-path.

Thanks,
==ml

Mike Christie

Aug 25, 2010, 3:04:29 PM
to open-...@googlegroups.com, agshekeloh
On 08/24/2010 11:43 AM, agshekeloh wrote:
> Hi,
>
> I'm running open-iscsi 2.0-871, on Ubuntu 10.04 amd64, straight from
> the operating system package. Mounting an iSCSI drive works perfectly
> when booting from a hard drive, but not when booting diskless via
> NFS.

For the Solaris target, do you have to set up ACLs by initiator name? For
these two root filesystems, does /etc/iscsi/initiatorname.iscsi have the
same value? Or does the NFS one have
iqn.1993-08.org.debian:01:cd00198dd976 while the hard drive boot one has
a different name?

mwlucas

Aug 25, 2010, 4:04:34 PM
to open-iscsi
The Ubuntu nodes have different iSCSI initiator names. (The diskless
node was copied from the hard drive install, but I manually changed a
few characters in /etc/iscsi/initiatorname.iscsi.) I've removed the
ACL on all Solaris targets during troubleshooting.

The hard-drive-based install is targeting a different OpenSolaris
instance. (Both of those hosts are running on my ESXi crash box.)
The diskless server and its target are in a different datacenter.

Just to check things, I tried to mount an iSCSI disk from each Ubuntu
machine to the remote Solaris box. Both Ubuntu machines can log into
the remote iSCSI server, but neither gets a device node for
this iSCSI session in /dev/disk/by-path. Could this be due to the lag
and very limited bandwidth (1.5 Mb/s) between the sites, or is something
else weird going on here? If this were a straight Solaris problem, I
would expect the diskless node to get a device node from the test
server, just like the hard-drive-based install does.

Thanks,
==ml

Mike Christie

Aug 25, 2010, 4:25:33 PM
to open-...@googlegroups.com, mwlucas
On 08/25/2010 03:04 PM, mwlucas wrote:
> On Aug 25, 3:04 pm, Mike Christie<micha...@cs.wisc.edu> wrote:
>> On 08/24/2010 11:43 AM, agshekeloh wrote:
>>
>>> Hi,
>>
>>> I'm running open-iscsi 2.0-871, on Ubuntu 10.04 amd64, straight from
>>> the operating system package. Mounting an iSCSI drive works perfectly
>>> when booting from a hard drive, but not when booting diskless via
>>> NFS.
>>
>> For the solaris target do you have to setup ACLs by initiator name? For
>> these 2 root FSs is the /etc/iscsi/initiatorname.iscsi have the same
>> value? Or does the NFS one have iqn.1993-08.org.debian:01:cd00198dd976,
>> and then the hard drive boot one have a different name?
>
> The Ubuntu nodes have different iSCSI initiator names. (The diskless
> node was copied from the hard drive install, but I manually changed a
> few characters in /etc/iscsi/initiatorname.iscsi.) I've removed the
> ACL on all solaris targets during troubleshooting.
>
> The hard-drive based install is targeting a different opensolaris
> instance. (Both of those hosts are running on my ESXi crash box.)
> The diskless server and its target are in a different datacenter.
>
> Just to check things, I tried to mount an iSCSI disk from each Ubuntu
> machine to the remote solaris box. Both ubuntu machines can log into
> the remote iSCSI server, but neither server gets a device node for
> this iSCSI session in /dev/disk/by-path. Could this be due to the lag
> and very limited bandwidth (1.5Mbs) between the sites, or is something

If that were the case, we should be getting a bunch of I/O errors about
INQUIRY or REPORT LUNS commands failing in the initiator log when you
run the iscsiadm login command.

> else weird going on here?

On the initiator box, when you run the iscsiadm login command, can you
take a wireshark/ethereal/tcpdump trace? Then we can see whether the
initiator is sending the INQUIRY and REPORT LUNS commands OK, and what
the responses are.

Michael W. Lucas

Aug 26, 2010, 10:06:28 AM
to Mike Christie, open-...@googlegroups.com

I can provide the raw capture to devs upon request, but here are the
highlights.

I see four "Login Command"s, followed by "Login Response (Success)".
Then there's a "SCSI: Inquiry LUN: 0x00," followed by a "SCSI Response
(Check Condition) LUN:0x00"

Digging into that packet I see:

.111 0000 = SNS Error Type: Current Error (0x70)
Filemark: 0, EOM: 0, ILI: 0
.... 0100 = Sense Key: Hardware Error (0x04)

The target is a 400GB ZFS on a 1TB mirrored disk, and Solaris isn't
reporting any disk errors. Any suggestions, or do I go poke at the
Solaris folks?

Thanks,
==ml

--
Michael W. Lucas mwl...@BlackHelicopters.org
http://www.MichaelWLucas.com/, http://blather.MichaelWLucas.com/
New book available: Network Flow Analysis
http://www.networkflowanalysis.com/

Mike Christie

Aug 26, 2010, 4:30:17 PM
to Michael W. Lucas, open-...@googlegroups.com

iSCSI login goes ok. Those look ok.


> Then there's a "SCSI: Inquiry LUN: 0x00," followed by a "SCSI Response
> (Check Condition) LUN:0x00"
>
> Digging into that packet I see:
>
> .111 0000 = SNS Error Type: Current Error (0x70)
> Filemark: 0, EOM: 0, ILI: 0
> .... 0100 = Sense Key: Hardware Error (0x04)
>
> The target is a 400GB ZFS on a 1TB mirrored disk, and solaris isn't
> reporting any disk errors. Any suggestions, or do I go poke at the
> solaris folks?
>

I think you are going to have to ask them. I have no idea why the target
sends this. The Linux SCSI layer sends an INQUIRY, as you saw in the
trace. If that fails, the SCSI layer fails the scan and never goes on to
send the other commands, such as REPORT LUNS, that would normally find
all your devices and lead to setting them up in the kernel.

mwlucas

Aug 26, 2010, 5:12:26 PM
to open-iscsi
Thanks for the pointers and suggestions.

For anyone finding this discussion, I've followed up with the
OpenSolaris folks at
https://www.opensolaris.org/jive/thread.jspa?threadID=133389

==ml