udev events for iscsi


Gionatan Danti

unread,
Apr 21, 2020, 3:31:24 AM4/21/20
to open-iscsi
[reposting, as the previous one seems to be lost]

Hi all,
I have a question regarding udev events when using iscsi disks.

By using "udevadm monitor" I can see that events are generated when I log in and out of an iscsi portal/resource, creating/destroying the corresponding links under /dev/.
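For example, something like the following shows those events as the links appear and disappear (the target IQN and portal here are placeholders):

# udevadm monitor --udev --subsystem-match=block &
# iscsiadm -m node -T iqn.2003-01.org.example:target1 -p 192.168.1.10 --login
# iscsiadm -m node -T iqn.2003-01.org.example:target1 -p 192.168.1.10 --logout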

However, I cannot see anything when the remote machine simply dies/reboots/disconnects: while "dmesg" shows the iscsi timeout expiring, I don't see anything about a removed disk (and indeed, the links under /dev/ remain unaltered). At the same time, when the remote machine and disk become available again, no reconnection events happen.

I read here that, years ago, a patch was in progress to give better integration with udev when a device disconnects/reconnects. Did the patch get merged? Or does the behavior I described above remain the expected one? Can it be changed?

Thanks.

rob...@eyeconsultantspc.com

unread,
Apr 21, 2020, 11:20:23 AM4/21/20
to open-...@googlegroups.com
Wondering myself.

On Apr 21, 2020, at 2:31 AM, Gionatan Danti <gionata...@gmail.com> wrote:


[reposting, as the previous one seems to be lost]

Hi all,
I have a question regarding udev events when using iscsi disks.

By using "udevadm monitor" I can see that events are generated when I log in and out of an iscsi portal/resource, creating/destroying the corresponding links under /dev/.

So by running "udevadm monitor" on the initiator, you can see when a block device becomes available locally.



However, I cannot see anything when the remote machine simply dies/reboots/disconnects: while "dmesg" shows the iscsi timeout expiring, I don't see anything about a removed disk (and indeed, the links under /dev/ remain unaltered). At the same time, when the remote machine and disk become available again, no reconnection events happen.

As someone who has had an inordinate amount of experience with iSCSI connections breaking in the middle of production (power outage, network switch dies, wrong Ethernet cable pulled, the target server hardware crashes, ...), the more info the better. Udev event triggers would help. I wonder exactly how XenServer handles this, as it seemed more resilient itself.

XenServer host initiators do something right to recover, and I wonder how that compares to the normal iSCSI initiator.

But unfortunately, XenServer LVM-over-iSCSI does not pass the message along to its Linux virtual drives and VMs in the same way it does for Windows VMs.

When the target drives became available again, MS Windows virtual machines would gracefully recover on their own. All Linux VM filesystems went read-only, and those VMs required forceful rebooting; a remount would not work.



I read here that, years ago, a patch was in progress to give better integration with udev when a device disconnects/reconnects. Did the patch get merged? Or does the behavior I described above remain the expected one? Can it be changed?

Thanks.


gionata...@gmail.com

unread,
Apr 21, 2020, 11:20:45 AM4/21/20
to open-iscsi
Hi all,
I have a question regarding udev events when using iscsi disks.

By using "udevadm monitor" I can see that events are generated when I log in and out of an iscsi portal/resource, creating/destroying the corresponding links under /dev/.

However, I cannot see anything when the remote machine simply dies/reboots/disconnects: while "dmesg" shows the iscsi timeout expiring, I don't see anything about a removed disk (and indeed, the links under /dev/ remain unaltered). At the same time, when the remote machine and disk become available again, no reconnection events happen.

I read a quite old thread here where it was stated that a patch to better integrate iscsi with udev events was in progress. Has anything changed/happened during these years? Is the behavior I observed (and described above) to be expected?

Thanks.

Donald Williams

unread,
Apr 21, 2020, 12:06:24 PM4/21/20
to open-...@googlegroups.com
Hello, 
 
Re: XenServer. The initiator is the same, but I suspect your issue is with the disk timeout value on Linux. When the connection drops, Linux gets the error and the mount goes RO. In VMware, for example, the VMware Tools set the Windows disk timeout to 60 seconds so it doesn't give up so quickly.

I suspect that if you do the same in your Linux VM, i.e. increase the disk timeout, you will likely ride out transitory network issues and SAN controller failovers, which is where I see this occur all the time.

This is from a Dell PS Series document that shows one way to set the value: http://downloads.dell.com/solutions/storage-solution-resources/(3199-CD-L)RHEL-PSseries-Configuration.pdf

Starting on Page 14. 

Disk timeout values

The PS Series arrays can deliver more network I/O than an initiator can handle, resulting in dropped packets and retransmissions. Other momentary interruptions in network connectivity can also cause problems, such as a mount point becoming read-only. To mitigate unnecessary iSCSI resets during very brief network interruptions, change the timeout value the kernel uses.

The default setting for Linux is 30 seconds. This can be verified using the command:

 # for i in $(find /sys/devices/platform –name timeout ) ; do cat $i ; done 
30 30 

To increase the time it takes before an iSCSI connection is reset to 60 seconds, use the command:

 # for i in $(find /sys/devices/platform –name timeout ); do echo “60” > $i; done 

To verify the changes, re-run the first command. 

# for i in $(find /sys/devices/platform –name timeout ); do cat $i; done 
60 60 

When the system is rebooted, the timeout value will revert to 30 seconds, unless the appropriate udev rules file is created. 

Create a file named /lib/udev/rules.d/99-eqlsd.rules and add the following content:

ACTION!=”remove”, SUBSYSTEM==”block”, ENV{ID_VENDOR}==”EQLOGIC”, RUN+=”/bin/sh – c ‘echo 60 > /sys/%p/device/timeout’”

To test the efficacy of the new udev rule, reboot the system.

Test that the reboot occurred, and then run the “cat $i” command above. 

# uptime
 12:31:22 up 1 min, 1 user, load average: 0.78, 0.29, 0.10

# for i in $(find /sys/devices/platform –name timeout ) ; do cat $i ; done 
60 60 
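For anyone retyping the above: the PDF paste carries typographic dashes and quotes, so the commands fail verbatim (see the follow-ups downthread). A plain-ASCII rendering of the same loop and rule, with the same EQLOGIC vendor match and 60-second value, would be:

# for i in $(find /sys/devices/platform -name timeout); do echo 60 > $i; done

and, in /lib/udev/rules.d/99-eqlsd.rules:

ACTION!="remove", SUBSYSTEM=="block", ENV{ID_VENDOR}=="EQLOGIC", RUN+="/bin/sh -c 'echo 60 > /sys/%p/device/timeout'"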

 Regards, 

Don 



The Lee-Man

unread,
Apr 21, 2020, 2:44:22 PM4/21/20
to open-iscsi
On Tuesday, April 21, 2020 at 12:31:24 AM UTC-7, Gionatan Danti wrote:
[reposting, as the previous one seems to be lost]

Hi all,
I have a question regarding udev events when using iscsi disks.

By using "udevadm monitor" I can see that events are generated when I log in and out of an iscsi portal/resource, creating/destroying the corresponding links under /dev/.

However, I cannot see anything when the remote machine simply dies/reboots/disconnects: while "dmesg" shows the iscsi timeout expiring, I don't see anything about a removed disk (and indeed, the links under /dev/ remain unaltered). At the same time, when the remote machine and disk become available again, no reconnection events happen.

Because of the design of iSCSI, there is no way for the initiator to know the server has gone away. The only time an initiator might figure this out is when it tries to communicate with the target.

This assumes we are not using some sort of directory service, like iSNS, which can send asynchronous notifications. But even then, the iSNS server would have to somehow know that the target went down. If the target crashed, that might be difficult to ascertain.

So in the absence of some asynchronous notification, the initiator only knows the target is not responding if it tries to talk to that target.

Normally iscsid defaults to sending periodic NO-OPs to the target every 5 seconds. So if the target goes away, the initiator usually notices, even if no regular I/O is occurring.

But this is where the error recovery gets tricky, because iscsi tries to handle "lossy" connections. What if the server will be right back? Maybe it's rebooting? Maybe the cable will be plugged back in? So iscsi keeps trying to reconnect. As a matter of fact, if you stop iscsid and restart it, it sees the failed connection and retries it -- forever, by default. I actually added a configuration parameter called reopen_max, that can limit the number of retries. But there was pushback on changing the default value from 0, which is "retry forever".
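For reference, the knobs mentioned here live in iscsid.conf (or in the per-node records); a sketch with the defaults as described above, assuming a reasonably recent open-iscsi:

# Send a NO-OP every 5 seconds, and declare it failed after 5 more
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
# 0 = retry a failed connection forever
node.session.reopen_max = 0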

So what exactly do you think the system should do when a connection "goes away"? How long does it have to be gone to be considered gone for good? If the target comes back "later" should it get the same disc name? Should we retry, and if so how much before we give up? I'm interested in your views, since it seems like a non-trivial problem to me.

I read here that, years ago, a patch was in progress to give better integration with udev when a device disconnects/reconnects. Did the patch get merged? Or does the behavior I described above remain the expected one? Can it be changed?

So you're saying as soon as a bad connection is detected (perhaps by a NOOP), the device should go away?

Thanks.

The Lee-Man

unread,
Apr 21, 2020, 2:47:53 PM4/21/20
to open-iscsi
On Tuesday, April 21, 2020 at 8:20:23 AM UTC-7, Robert ECEO Townley wrote:
Wondering myself.

On Apr 21, 2020, at 2:31 AM, Gionatan Danti <gionata...@gmail.com> wrote:


[reposting, as the previous one seems to be lost]

Hi all,
I have a question regarding udev events when using iscsi disks.

By using "udevadm monitor" I can see that events are generated when I log in and out of an iscsi portal/resource, creating/destroying the corresponding links under /dev/.

So by running "udevadm monitor" on the initiator, you can see when a block device becomes available locally.



However, I cannot see anything when the remote machine simply dies/reboots/disconnects: while "dmesg" shows the iscsi timeout expiring, I don't see anything about a removed disk (and indeed, the links under /dev/ remain unaltered). At the same time, when the remote machine and disk become available again, no reconnection events happen.

As someone who has had an inordinate amount of experience with iSCSI connections breaking in the middle of production (power outage, network switch dies, wrong Ethernet cable pulled, the target server hardware crashes, ...), the more info the better. Udev event triggers would help. I wonder exactly how XenServer handles this, as it seemed more resilient itself.

XenServer host initiators do something right to recover, and I wonder how that compares to the normal iSCSI initiator.

I was under the impression that XenServer used open-iscsi.
 
But unfortunately, XenServer LVM-over-iSCSI does not pass the message along to its Linux virtual drives and VMs in the same way it does for Windows VMs.

When the target drives became available again, MS Windows virtual machines would gracefully recover on their own. All Linux VM filesystems went read-only, and those VMs required forceful rebooting; a remount would not work.

A filesystem going read-only means it was likely ext3, which does that if it gets I/O errors, I believe. (Disclaimer: I'm not a filesystem person.)
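If it is ext3/ext4, that reaction is governed by the errors= setting (continue, remount-ro, or panic); a quick way to check and set it, with /dev/sdb1 as a placeholder device:

# tune2fs -l /dev/sdb1 | grep -i 'errors behavior'
# mount -o errors=remount-ro /dev/sdb1 /mnt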


Donald Williams

unread,
Apr 21, 2020, 2:49:39 PM4/21/20
to open-...@googlegroups.com
Hello, 

If the loss exceeds the timeout value, yes. If the 'drive' doesn't come back in 30 to 60 seconds, it's not likely a transitory event like a cable pull.

NOOP-IN and NOOP-OUT are also known as KeepAlive. That's when the connection is up but the target or initiator isn't responding. If those time out, the connection will be dropped and a new connection attempt made.

 Don 



Gionatan Danti

unread,
Apr 21, 2020, 4:30:44 PM4/21/20
to open-iscsi

On Tuesday, April 21, 2020 at 8:44:22 PM UTC+2, The Lee-Man wrote:

Because of the design of iSCSI, there is no way for the initiator to know the server has gone away. The only time an initiator might figure this out is when it tries to communicate with the target.

This assumes we are not using some sort of directory service, like iSNS, which can send asynchronous notifications. But even then, the iSNS server would have to somehow know that the target went down. If the target crashed, that might be difficult to ascertain.

So in the absence of some asynchronous notification, the initiator only knows the target is not responding if it tries to talk to that target.

Normally iscsid defaults to sending periodic NO-OPs to the target every 5 seconds. So if the target goes away, the initiator usually notices, even if no regular I/O is occurring.

True.
 

But this is where the error recovery gets tricky, because iscsi tries to handle "lossy" connections. What if the server will be right back? Maybe it's rebooting? Maybe the cable will be plugged back in? So iscsi keeps trying to reconnect. As a matter of fact, if you stop iscsid and restart it, it sees the failed connection and retries it -- forever, by default. I actually added a configuration parameter called reopen_max, that can limit the number of retries. But there was pushback on changing the default value from 0, which is "retry forever".

So what exactly do you think the system should do when a connection "goes away"? How long does it have to be gone to be considered gone for good? If the target comes back "later" should it get the same disc name? Should we retry, and if so how much before we give up? I'm interested in your views, since it seems like a non-trivial problem to me.

Well, for short disconnections the retry approach is surely the better one. But I naively assumed that a longer disconnection, as described by the node.session.timeo.replacement_timeout parameter, would tear down the device with a corresponding udev event. Udev should have no problem assigning the device a sensible persistent name, right?
 

So you're saying as soon as a bad connection is detected (perhaps by a NOOP), the device should go away?

I would say that the device should go away not at the first failing NOOP, but when the replacement_timeout (or another sensible timeout) expires.

This opens the door to another question: from the iscsid.conf and README files I (wrongly?) understood that replacement_timeout comes into play only when the SCSI EH is running, while in the other cases different timeouts, such as node.session.err_timeo.lu_reset_timeout and node.session.err_timeo.tgt_reset_timeout, should affect the (dis)connection. However, in all my tests I only saw replacement_timeout being honored, yet I did not catch a single running instance of the SCSI EH via the suggested command iscsiadm -m session -P 3.
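For reference, these are the values in play; they can be read from (and written to) the node records with iscsiadm, the target name below being a placeholder:

# iscsiadm -m node -T iqn.2003-01.org.example:target1 | grep timeo
# iscsiadm -m node -T iqn.2003-01.org.example:target1 -o update -n node.session.timeo.replacement_timeout -v 60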

What am I missing?
Thanks.

Ulrich Windl

unread,
Apr 21, 2020, 7:24:41 PM4/21/20
to open-...@googlegroups.com
Hi!

Sorry for top-posting; the stupid MUA here can't quote...

On Linux and read-only: If the kernel wanted to write a block, but got an I/O error (after waiting and retrying), what should it do? The block to write wasn't written...
Setting the filesystem to read-only is like pulling the plug in panic, but you can still see what you had on disk...

If the disk becomes online again (and you detect that), the kernel would have to turn back time to retry the op that failed (and all subsequent ones)...

So a reboot (letting the user roll back time, i.e. redo everything) is the better choice IMHO.

There's a good reason for NFS hard-mounts, waiting "forever". Maybe iSCSI needs that, too ;-)

Regards,
Ulrich


Ulrich Windl

unread,
Apr 22, 2020, 2:09:09 AM4/22/20
to open-iscsi
>>> Donald Williams <don.e.w...@gmail.com> wrote on 21.04.2020 at 18:06:

[...]
>
> The default setting for Linux is 30 seconds. This can be verified using the
> command:
>
> # for i in $(find /sys/devices/platform –name timeout ) ; do cat $i ; done
> 30 30

Two remarks on the command above:
1) the command contains an en-dash instead of a minus, so you get funny error
messages like this:
find: ‘–iname’: No such file or directory
find: ‘timeout’: No such file or directory

2) Even with the correct command, I get no matches here (SLES12)

However, I see matches within /sys/devices/pci* and /sys/class/firmware/timeout.
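On such kernels the same per-disk timeout is usually reachable through the block layer instead; a sketch, assuming the iSCSI disk shows up as sdb:

# cat /sys/block/sdb/device/timeout
# echo 60 > /sys/block/sdb/device/timeout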

[...]

Regards,
Ulrich

Ulrich Windl

unread,
Apr 22, 2020, 2:56:23 AM4/22/20
to open-iscsi
>>> The Lee-Man <leeman...@gmail.com> wrote on 21.04.2020 at 20:44:
> On Tuesday, April 21, 2020 at 12:31:24 AM UTC-7, Gionatan Danti wrote:
>>
>> [reposting, as the previous one seems to be lost]
>>
>> Hi all,
>> I have a question regarding udev events when using iscsi disks.
>>
>> By using "udevadm monitor" I can see that events are generated when I
>> log in and out of an iscsi portal/resource, creating/destroying the
>> corresponding links under /dev/.
>>
>> However, I cannot see anything when the remote machine simply
>> dies/reboots/disconnects: while "dmesg" shows the iscsi timeout expiring, I
>> don't see anything about a removed disk (and indeed, the links under /dev/
>> remain unaltered). At the same time, when the remote machine and disk
>> become available again, no reconnection events happen.
>>
>
> Because of the design of iSCSI, there is no way for the initiator to know
> the server has gone away. The only time an initiator might figure this out
> is when it tries to communicate with the target.

My knowledge of the SCSI stack is quite poor, but I think the last revisions of parallel SCSI (like Ultra 320, or was it 160?) had a concept of "domain validation". AFAIK the latter meant measuring the quality of the wires and adjusting the transfer speed.
While basically SCSI assumes "the bus" won't go away magically, a future iSCSI standard might contain regular "bus checks" to trigger recovery actions if the "bus" (network transport connection) seems to be gone.

>
> This assumes we are not using some sort of directory service, like iSNS,
> which can send asynchronous notifications. But even then, the iSNS server
> would have to somehow know that the target went down. If the target
> crashed, that might be difficult to ascertain.

To be picky: if the target went down (like a classical failing SCSI disk), it could issue some attention message, but when the transport went down, no such message can be received. So I think there's a difference between "target down" (device not present, device fails to respond) and "bus down" (no communication possible any more). In the second case no assumptions can be made about the health of the target device.

>
> So in the absence of some asynchronous notification, the initiator only
> knows the target is not responding if it tries to talk to that target.
>
> Normally iscsid defaults to sending periodic NO-OPs to the target every 5
> seconds. So if the target goes away, the initiator usually notices, even if
> no regular I/O is occurring.

So the target went away, or the bus went down?

>
> But this is where the error recovery gets tricky, because iscsi tries to
> handle "lossy" connections. What if the server will be right back? Maybe
> it's rebooting? Maybe the cable will be plugged back in? So iscsi keeps
> trying to reconnect. As a matter of fact, if you stop iscsid and restart
> it, it sees the failed connection and retries it -- forever, by default. I
> actually added a configuration parameter called reopen_max, that can limit
> the number of retries. But there was pushback on changing the default value
> from 0, which is "retry forever".
>
> So what exactly do you think the system should do when a connection "goes
> away"? How long does it have to be gone to be considered gone for good? If
> the target comes back "later" should it get the same disc name? Should we
> retry, and if so how much before we give up? I'm interested in your views,
> since it seems like a non-trivial problem to me.

IMHO a "bus down" is a critical event affecting _all_ devices on that bus, not just a single target. Well, it might be some extra noise if those other targets have no I/O outstanding, but it's better to know that the bus is down before initiating a transfer rather than concluding seconds later that the target seems unreachable for some reasons unknown.

>
>>
>> I read here that, years ago, a patch was in progress to give better
>> integration with udev when a device disconnects/reconnects. Did the patch
>> get merged? Or does the behavior I described above remain the expected one?
>> Can it be changed?
>>
>
> So you're saying as soon as a bad connection is detected (perhaps by a
> NOOP), the device should go away?

Maybe the state should be similar to a device being in power-save mode: it's not accessible right now, but should be woken up ASAP. See my earlier comparison to NFS hard-mounts...

Regards,
Ulrich

>
>>
>> Thanks.
>>
>



Ulrich Windl

unread,
Apr 22, 2020, 3:04:51 AM4/22/20
to open-iscsi
>>> Donald Williams <don.e.w...@gmail.com> wrote on 21.04.2020 at 20:49:
> Hello,
>
> If the loss exceeds the timeout value yes. If the 'drive' doesn't come
> back in 30 to 60 seconds it's not likely a transitory event like a cable
> pull.
>
> NOOP-IN and NOOP-OUT are also known as KeepAlive. That's when the

Actually I think those are two different mechanisms: KeepAlive just prevents the connection from being discarded (some firewalls like to do that), while the NO-OP actually is an end-to-end (or almost) connection test.

> connection is up but the target or initiator isn't responding. If those
> time out, the connection will be dropped and a new connection attempt made.

I think the original intention for SCSI timeouts was to conclude that a device has failed if it does not respond in time (actually there are different timeouts depending on the operation, like the famous rewinding of a long tape). The next step for the OS would be to block I/O to the seemingly failed device. Recent operating systems like Linux have the choice to remove the device logically, requiring it to re-appear before it can be used. In some cases it seems preferable to keep the device, because otherwise there could be a cascading effect like killing processes that have the device open (UNIX processes do not like it when opened devices suddenly disappear).
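On Linux that logical removal (and the later rediscovery) can also be done by hand; a sketch, with sdb as a placeholder for the failed iSCSI disk:

# echo 1 > /sys/block/sdb/device/delete
# ...and once the target is reachable again, rescan the session:
# iscsiadm -m session --rescan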

Regards,
Ulrich

>
> Don
>



Donald Williams

unread,
Apr 22, 2020, 5:12:07 AM4/22/20
to open-...@googlegroups.com
Hello 

Re: errors: that's likely from a bad copy/paste. I referenced the source document I took that from. That was done against an older RHEL kernel.

 Don 


The Lee-Man

unread,
Apr 22, 2020, 12:30:13 PM4/22/20
to open-iscsi
The initiator does not know the difference. As you know, there are dozens of things (conservatively) that can go wrong, which is why I say the disk "goes away". It could be sleeping. It could be dead. The cable could be unplugged. The system could be rebooting. The switch could be down. The ACLs could have changed (which is how I simulate a target going away).

>
> But this is where the error recovery gets tricky, because iscsi tries to
> handle "lossy" connections. What if the server will be right back? Maybe
> it's rebooting? Maybe the cable will be plugged back in? So iscsi keeps
> trying to reconnect. As a matter of fact, if you stop iscsid and restart
> it, it sees the failed connection and retries it -- forever, by default. I
> actually added a configuration parameter called reopen_max, that can limit
> the number of retries. But there was pushback on changing the default value
> from 0, which is "retry forever".
>
> So what exactly do you think the system should do when a connection "goes
> away"? How long does it have to be gone to be considered gone for good? If
> the target comes back "later" should it get the same disc name? Should we
> retry, and if so how much before we give up? I'm interested in your views,
> since it seems like a non-trivial problem to me.

IMHO a "bus down" is a critical event affecting _all_ devices on that bus, not just a single target. Well, it might be some extra noise if those other targets have no I/O outstanding, but it's better to know that the bus is down before initiating a transfer rather than concluding seconds later that the target seems unreachable for some reasons unknown.

There are 3 error recovery levels built into the iSCSI protocol, and I think you'd need to change/augment the protocol to change this. They are ERL=[0|1|2]. Error recovery level 0 is the default, and the only one supported by open-iscsi; it just means we end the connection and reconnect. ERL=1 adds digest error handling, and ERL=2 adds session recovery on top of that, i.e. trying to recover the session before disconnecting and reconnecting.
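For what it's worth, the requested level is a node setting; a sketch, assuming your iscsid.conf exposes it (open-iscsi will only actually negotiate level 0):

node.session.iscsi.ERL = 0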

It is up to the transport (usually TCP/IP) to tell us of transport errors. At the open-iscsi level, the transport should either "just work", or it should fail and tell us it failed.

But perhaps I'm being redundant and you know all this.

>
>>
>> I read here that, years ago, a patch was in progress to give better
>> integration with udev when a device disconnects/reconnects. Did the patch
>> get merged? Or does the behavior I described above remain the expected one?
>> Can it be changed?
>>
>
> So you're saying as soon as a bad connection is detected (perhaps by a
> NOOP), the device should go away?

> Maybe the state should be similar to a device being in power-save mode: it's not accessible right now, but should be woken up ASAP. See my earlier comparison to NFS hard-mounts...

I think the current code works well enough when the target goes away for a "short" period of time, but again it depends on how it goes away. Not all disappearances are equal, though we really can't tell them apart very well.

> Regards,
> Ulrich


Gionatan Danti

unread,
Apr 28, 2020, 5:15:31 PM4/28/20
to open-iscsi
Hi all, any thoughts regarding the points above?
Thanks. 

Gionatan Danti

unread,
May 30, 2020, 4:59:32 AM5/30/20
to open-iscsi

On Tuesday, April 28, 2020 at 11:15:31 PM UTC+2, Gionatan Danti wrote:

Well, for short disconnections the retry approach is surely the better one. But I naively assumed that a longer disconnection, as described by the node.session.timeo.replacement_timeout parameter, would tear down the device with a corresponding udev event. Udev should have no problem assigning the device a sensible persistent name, right?
 
This opens the door to another question: from the iscsid.conf and README files I (wrongly?) understood that replacement_timeout comes into play only when the SCSI EH is running, while in the other cases different timeouts, such as node.session.err_timeo.lu_reset_timeout and node.session.err_timeo.tgt_reset_timeout, should affect the (dis)connection. However, in all my tests I only saw replacement_timeout being honored, yet I did not catch a single running instance of the SCSI EH via the suggested command iscsiadm -m session -P 3.


Hi all, and sorry for the bump, but I would really like to understand the two points above (especially the one regarding the various timeout values).
Can someone shed some light?
Thanks.