connection1:0: detected conn error (1020)

10,405 views
Skip to first unread message

Per Jessen

unread,
Feb 16, 2012, 3:21:57 PM2/16/12
to open-...@googlegroups.com
I am trying to get a server to boot/run using PXE+iSCSI - the
iSCSI target is running ietd, the initiator is open-iscsi.
I'm running openSUSE Factory (bleeding edge). The initial
boot-up works fine, up until:

Starting Remount API VFS...
[ 30.141223] connection1:0: detected conn error (1020)
[ 30.456050] sd 2:0:0:0: [sda] Unhandled error code
[ 30.456565] EXT4-fs error (device sda1): __ext4_get_inode_loc:3650: inode #4025: block 924: comm udevd: unable to read itable block
30.654991] sd 2:0:0:0: [sda] Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
[ 30.689582] EXT4-fs (sda1): previous I/O error to superblock detected
[ 30.689612] EXT4-fs error (device sda1): __ext4_get_inode_loc:3650: inode #529270: block 2097495: comm boot.iscsid-ear: unable to read itable block


I've spent most of the day googling and trying out more
or less wild suggestions. ietd is from SVN, open-iscsi is very recent.
Does anyone have any suggestions or hints?

/Per Jessen, Zürich

Mike Christie

unread,
Feb 16, 2012, 4:37:35 PM2/16/12
to open-...@googlegroups.com, Per Jessen
On 02/16/2012 02:21 PM, Per Jessen wrote:
> I am trying to get a server to boot/run using PXE+iSCSI - the
> iSCSI target is running ietd, the initiator is open-iscsi.
> I'm running openSUSE Factory (bleeding edge). The initial
> boot-up works fine, up until:
>
> Starting Remount API VFS...
> [ 30.141223] connection1:0: detected conn error (1020)


Is there anything else before this in the log? 1020 means the target
dropped the connection. What do you see on the target side? Is there
something about a protocol violation or some unrecoverable error?


> [ 30.456050] sd 2:0:0:0: [sda] Unhandled error code
> [ 30.456565] EXT4-fs error (device sda1): __ext4_get_inode_loc:3650: inode #4025: block 924: comm udevd: unable to read itable block
> 30.654991] sd 2:0:0:0: [sda] Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK

Are you using dm-multipath? The upstream iscsi code would not normally
just fail the IO with that error code. You would normally see a message
about the replacement/recovery timeout firing first.

Do you know at the time this is running is iscsid running? opensuse
might have some patches where if it is not running the IO gets fast
failed and that is why we are getting the error instead of being able to
retry.

What is the kernel version of opensuse?


> [ 30.689582] EXT4-fs (sda1): previous I/O error to superblock detected
> [ 30.689612] EXT4-fs error (device sda1): __ext4_get_inode_loc:3650: inode #529270: block 2097495: comm boot.iscsid-ear: unable to read itable block
>
>
> I've spent most of the day googling and trying out more
> or less wild suggestions. ietd is from SVN, open-iscsi is very recent.
> Does anyone have any suggestions or hints?
>
>
>

> /Per Jessen, Z�rich
>

Per Jessen

unread,
Feb 17, 2012, 3:04:41 AM2/17/12
to open-...@googlegroups.com
Mike Christie wrote:

> On 02/16/2012 02:21 PM, Per Jessen wrote:
>> I am trying to get a server to boot/run using PXE+iSCSI - the
>> iSCSI target is running ietd, the initiator is open-iscsi.
>> I'm running openSUSE Factory (bleeding edge). The initial
>> boot-up works fine, up until:
>>
>> Starting Remount API VFS...
>> [ 30.141223] connection1:0: detected conn error (1020)
>
>
> Is there anything else before this in the log? 1020 means the target
> dropped the connection. What do you see on the target side? Is there
> something about a protocol violation or some unrecoverable error?

Hi Mike
I don't have much log, I'm half way into installing this server. I
have captured the console (serial) output, and the 1020 message is
the first indicator of something not working. At first the root
disk appears to be working fine, then I get to "Remount API VFS..."
after which I get the 1020.
Yesterday I didn't see any unusual messages on the target, I'll double
check.


>> [ 30.456050] sd 2:0:0:0: [sda] Unhandled error code
>> [ 30.456565] EXT4-fs error (device sda1):
>> [ __ext4_get_inode_loc:3650: inode #4025: block 924: comm udevd:
>> [ unable to read itable block
>> 30.654991] sd 2:0:0:0: [sda] Result:
>> hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
>
> Are you using dm-multipath? The upstream iscsi code would not normally
> just fail the IO with that error code. You would normally see a
> message about the replacement/recovery timeout firing first.

I've seen those mentioned when googling yesterday, but no, I don't see
those anywhere nor do I have any lengthy waits. I'm not using dm-path.

> Do you know at the time this is running is iscsid running? opensuse
> might have some patches where if it is not running the IO gets fast
> failed and that is why we are getting the error instead of being able
> to retry.

AFAICT, iscsid is running:

[ 22.918225] Loading iSCSI transport class v2.0-870.
[ 22.979289] iscsi: registered transport (tcp)
[ 23.040617] iscsid (242): /proc/242/oom_adj is deprecated, please use /proc/242/oom_score_adj instead.
FATAL: Error inserting crc32c_intel (/lib/modules/3.2.0-2-desktop/kernel/arch/x86/crypto/crc32c-intel.ko): No such device
Starting iSCSI daemon
Logging in to [iface: default, target: iqn.2012-02.net.local:44, portal: 192.168.2.181,3260]
[ 23.418046] scsi2 : iSCSI Initiator over TCP/IP
[ 23.724633] scsi 2:0:0:0: Direct-Access IET VIRTUAL-DISK 0 PQ: 0 ANSI: 4
Login to [iface:[ 23.824626] sd 2:0:0:0: [sda] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
default, target[ 23.929981] sd 2:0:0:0: [sda] Write Protect is off
: iqn.2012-02.ne[ 24.004034] sd 2:0:0:0: [sda] Write cache: disabled, read cache: disabled, doesn't support DPO or FUA
t.local:44, portal: 192.168.2.18[ 24.154349] sda: sda1
1,3260]: success[ 24.193444] sd 2:0:0:0: [sda] Attached SCSI disk
ful
Waiting for device /dev/disk/by-id/scsi-1494554000000000074616767617274370000000000000000-part1 to appear: ok
fsck from util-linux 2.20.1
[/sbin/fsck.ext4 (1) -- /] fsck.ext4 -a /dev/sda1
/dev/sda1: recovering journal
/dev/sda1: clean, 25083/655360 files, 232250/262[ 25.258943] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: acl,user_xattr
1184 blocks
fsck succeeded. Mounting root device read-write.
Mounting root /dev/disk/by-id/scsi-1494554000000000074616767617274370000000000000000-part1
mount -o rw,acl,user_xattr -t ext4 /dev/disk/by-id/scsi-1494554000000000074616767617274370000000000000000-part1 /root

(the log is a bit mixed up due to systemd).

> What is the kernel version of opensuse?

3.2.0-2


/Per Jessen, Zürich

Per Jessen

unread,
Feb 17, 2012, 3:51:11 AM2/17/12
to open-...@googlegroups.com
Per Jessen wrote:

> Mike Christie wrote:
>
>> On 02/16/2012 02:21 PM, Per Jessen wrote:
>>> I am trying to get a server to boot/run using PXE+iSCSI - the
>>> iSCSI target is running ietd, the initiator is open-iscsi.
>>> I'm running openSUSE Factory (bleeding edge). The initial
>>> boot-up works fine, up until:
>>>
>>> Starting Remount API VFS...
>>> [ 30.141223] connection1:0: detected conn error (1020)
>>
>>
>> Is there anything else before this in the log? 1020 means the target
>> dropped the connection. What do you see on the target side? Is there
>> something about a protocol violation or some unrecoverable error?
>
> Hi Mike
> I don't have much log, I'm half way into installing this server. I
> have captured the console (serial) output, and the 1020 message is
> the first indicator of something not working. At first the root
> disk appears to be working fine, then I get to "Remount API VFS..."
> after which I get the 1020.
> Yesterday I didn't see any unusual messages on the target, I'll double
> check.

No unusual messages, but I restarted ietd with debuglevel 9 and then
tried starting the new server again; this is the log:

http://files.jessen.ch/stork1-messages.txt

--
Per Jessen, Zürich (1.2°C)

Mike Christie

unread,
Feb 17, 2012, 12:31:48 PM2/17/12
to open-...@googlegroups.com, Per Jessen

Not 100% sure, but it looks like for some reason we disconnect from the
target. This might be due to iscsid starting (iscsid startup forces a
restart of the connections to make sure the kernel and iscsid are in the
same state). But then on the reconnect from that it looks like the login
fails because of a CHAP failure.

It looks like in the initial connection you are not using chap, but in
the retry/relogin connection's we are trying to use chap and that fails
because the target was not setup to use it.

Mike Christie

unread,
Feb 17, 2012, 12:39:13 PM2/17/12
to open-...@googlegroups.com, Per Jessen

Can you mount the iscsi root disk from another system? If so could you
look at /var/lib/iscsi/nodes. In there if you grep some file with the
username set, do you see anything? Something like

cd /var/lib/iscsi/nodes
grep -R username *

If so then you want to either remove the username and password fields
from those files if you did not intend to use chap or add the username
and password to the chap settings of the target if you wanted to use chap.

Per Jessen

unread,
Feb 17, 2012, 1:09:53 PM2/17/12
to open-...@googlegroups.com
Mike Christie wrote:

>> No unusual messages, but I restarted ietd with debuglevel 9 and then
>> tried starting the new server again; this is the log:
>>
>> http://files.jessen.ch/stork1-messages.txt
>
> Not 100% sure, but it looks like for some reason we disconnect from
> the target. This might be due to iscsid starting (iscsid startup
> forces a restart of the connections to make sure the kernel and iscsid
> are in the same state). But then on the reconnect from that it looks
> like the login fails because of a CHAP failure.
>
> It looks like in the initial connection you are not using chap, but in
> the retry/relogin connection's we are trying to use chap and that
> fails because the target was not setup to use it.

Correct, I'm not using any authentication for now - I'll have a look to
see if something has been set up the wrong way.


--
Per Jessen, Zürich (2.1°C)

Per Jessen

unread,
Feb 17, 2012, 1:15:16 PM2/17/12
to open-...@googlegroups.com
Mike Christie wrote:

> Can you mount the iscsi root disk from another system?

Yep, it's an LV, I can mount it locally on the storage server.

> If so could you look at /var/lib/iscsi/nodes. In there if you grep
> some file with the username set, do you see anything? Something like
>
> cd /var/lib/iscsi/nodes
> grep -R username *

openSUSE keeps that info in /etc/iscsi/nodes:

# cd /etc/iscsi/nodes/
# ls -la
total 12
drw------- 3 root root 4096 Feb 16 13:25 ./
drwxr-xr-x 5 root root 4096 Feb 16 17:20 ../
drw------- 3 root root 4096 Feb 16 13:25 iqn.2012-02.net.local:44/
# find . -type f | xargs grep -i user
(nothing found).

Mike Christie

unread,
Feb 17, 2012, 1:16:46 PM2/17/12
to open-...@googlegroups.com, Per Jessen
Can you tar up what is in there and send it to me?

Mike Christie

unread,
Feb 17, 2012, 1:19:24 PM2/17/12
to open-...@googlegroups.com, Per Jessen

Oh yeah, when you do your boot test would it be possible to run tcpdump
or wrieshark on the target box and send that trace so I can see what is
going on?

Per Jessen

unread,
Feb 17, 2012, 3:31:37 PM2/17/12
to open-...@googlegroups.com
Mike Christie wrote:

It's on it's way to you now.

--
Per Jessen, Zürich (0.5°C)

Per Jessen

unread,
Feb 17, 2012, 3:32:10 PM2/17/12
to open-...@googlegroups.com
Mike Christie wrote:

Sure, I'll do one of those later tonight.


--
Per Jessen, Zürich (0.6°C)

Mike Christie

unread,
Feb 17, 2012, 3:37:11 PM2/17/12
to open-...@googlegroups.com, Per Jessen

Nothing useful. Looks ok.

Is there any way you can modify the system to run iscsid with debugging
on and get that info?

iscsid -d 8

instead of just

iscsid

?

Do you know for pxe +iscsi boot support on suse do you put the iscsi
session/target info on the command line or do they store it in the
initramfs?

Per Jessen

unread,
Feb 17, 2012, 4:29:20 PM2/17/12
to open-...@googlegroups.com
Mike Christie wrote:

> On 02/17/2012 02:32 PM, Per Jessen wrote:
>>>
>>> Oh yeah, when you do your boot test would it be possible to run
>>> tcpdump or wrieshark on the target box and send that trace so I can
>>> see what is going on?
>>
>> Sure, I'll do one of those later tonight.
>>
>
> Nothing useful. Looks ok.

I've got a tcpdump now, its' about 29M when gzipped:

http://files.jessen.ch/stork1-trace2.gz (will take a few more minutes
to complete the upload).

> Is there any way you can modify the system to run iscsid with
> debugging on and get that info?
>
> iscsid -d 8
>
> instead of just
>
> iscsid
>
> ?

Yes, that ought to be possible by amending the init-script on the root
fs. I'll try that tomorrow, it's getting a bit late here. Thank for
your help so far.

> Do you know for pxe +iscsi boot support on suse do you put the iscsi
> session/target info on the command line or do they store it in the
> initramfs?

I expect a bit of both :-( - I specify the target address on the command
(not sure if it is required) line, but the rest is no doubt in the
initramfs.


--
Per Jessen, Zürich (0.9°C)

Per Jessen

unread,
Feb 18, 2012, 4:52:28 AM2/18/12
to open-...@googlegroups.com
Per Jessen wrote:

> Mike Christie wrote:
>
>> Do you know for pxe +iscsi boot support on suse do you put the iscsi
>> session/target info on the command line or do they store it in the
>> initramfs?
>
> I expect a bit of both :-( - I specify the target address on the
> command (not sure if it is required) line, but the rest is no doubt in
> the initramfs.

The openSUSE initramfs has a complete iscsi config - afaict, the
following options may be overridden from the command line:

TargetAddress the address of the iscsi server
TargetPort the port of the iscsi server (defaults to 3260)
TargetName the iscsi target name (connect to all if empty)
iSCSI_ignoreNodes if set all stored nodes will be ignored

The initramfs config does not have any authentication info.

The openSUSE systemd setup has two iscsid scripts:

/etc/init.d/boot.iscsid-early
/etc/init.d/iscsid

More trial&error led to me removing boot.iscsid-early from the bootup
sequence, which for now has "fixed" the problem here (connection error
1020). Mike, it doesn't seem to be an iscsi issue at all, I'm sorry
about the noise.

Mike Christie

unread,
Feb 20, 2012, 2:20:51 AM2/20/12
to open-...@googlegroups.com, Per Jessen
On 02/18/2012 03:52 AM, Per Jessen wrote:
> More trial&error led to me removing boot.iscsid-early from the bootup
> sequence, which for now has "fixed" the problem here (connection error
> 1020). Mike, it doesn't seem to be an iscsi issue at all, I'm sorry
> about the noise.

No problem. Could you send me those 2 files, so I can make sure it is
not a issue on my side.

Reply all
Reply to author
Forward
0 new messages