Hello,
Recently there has been a bug report in Debian about iscsiadm -U all
not logging out of all active iSCSI sessions on shutdown:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=838540
I've isolated the problem, and what happens is as follows:
1. An external tool sets up a node configuration manually [1], but
makes a small mistake, by not setting a target portal group tag
at all, so open-iscsi will default to 1 - instead of the correct
value of 257 for that target. (In the case of the bug reporter.)
2. The external tool logs in on that portal via iscsiadm.
3. The login succeeds, but either iscsid or the kernel (I'm not
deep enough in the code to know exactly where that happens)
figures out "hey, the actual target portal group tag is 257,
so use that". Hence, login succeeds, but the node database
references the wrong portal group.
4. On shutdown, iscsiadm --logoutall=all is called. The
__logout_by_startup callback tries to run idbm_rec_read() on
the session; that fails, so it skips that session entirely.
This behavior in (4) seems wrong to me in two ways:
a. If the tpgt can change after login, then the matching against
the tpgt should be relaxed.
i.e. first try to match the tpgt exactly, if that fails,
just try to find the same match but without fixing the tpgt.
That way, if a login changed the tpgt implicitly, this
would still match the right config with the right session.
b. Regardless of that, there could be active iSCSI sessions
without a corresponding configuration. For example, if the
configuration was wiped accidentally before shutting down
the session. Or someone accidentally used iscsistart on an
already booted system without knowing what they were doing.
In that case, would it not be better if --logoutall=all were
specified to not skip that session, even if there's no match
in the node database? After all, the manpage says -U all
will logout of all active sessions that are NOT onboot,
and it doesn't say "all sessions that have a config AND are
not onboot".
The counterpoint to that would be: if someone were to have a
setup with root on iSCSI, and the session of the root
filesystem were not configured in the node database at all (but
rather just in some separate config used by the initramfs
they were using), then the current behavior treats that as
if onboot were set in that case, whereas my proposed change
would treat that as automatic or manual and -U all
would also kill that. In Debian that wouldn't be a problem,
as we already have logic in the shutdown script to
dynamically detect the session on which the root filesystem
(and potentially /usr) is located, but other distros
probably don't.
But maybe we could add --logoutall=reallyall or something?
(Maybe with a better name?)
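To make the trade-off in (b) concrete, here is a rough model of the
decision in Python. It is not open-iscsi code: should_logout() and its
arguments are made up for illustration, "reallyall" is only the
placeholder name floated above, and keeping onboot sessions protected
under "reallyall" is my assumption.

```python
from typing import Optional


def should_logout(mode: str, record_startup: Optional[str]) -> bool:
    """Decide whether a logout-all run should log out of one session.

    mode           -- "all", or the hypothetical "reallyall" from this mail
    record_startup -- node.startup of the matching node record
                      ("manual", "automatic" or "onboot"), or None if the
                      session has no record in the node database
    """
    if record_startup is None:
        # Behavior described in (4): a session without a record is skipped.
        # Only the hypothetical "reallyall" mode would include it.
        return mode == "reallyall"
    # Sessions with a record: everything except onboot is logged out.
    # (Assumption: "reallyall" still leaves onboot sessions alone.)
    return record_startup != "onboot"
```

With that model, a root-on-iSCSI session that only the initramfs knows
about stays up under "all" and is only touched when the admin asks for
"reallyall" explicitly.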
I'm mostly interested in fixing (a), because that's what's
causing the immediate problem the bug reporter is experiencing.
But (b) deserves some consideration as well IMHO.
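For (a), the relaxed matching I have in mind looks roughly like this.
It is a Python sketch of the idea only, not a patch: NodeRecord,
Session and find_record() are invented names and do not mirror the
idbm structures, and falling back only when exactly one record exists
for the target and portal is an extra safety assumption on my part.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NodeRecord:
    """Hypothetical stand-in for a node database entry."""
    targetname: str
    address: str
    port: int
    tpgt: int


@dataclass(frozen=True)
class Session:
    """Hypothetical stand-in for an active session as seen after login."""
    targetname: str
    address: str
    port: int
    tpgt: int


def find_record(records: List[NodeRecord], session: Session) -> Optional[NodeRecord]:
    """Match a session to a node record, relaxing the tpgt if necessary."""
    same_portal = [
        r for r in records
        if (r.targetname, r.address, r.port)
        == (session.targetname, session.address, session.port)
    ]
    # Pass 1: exact match, including the tpgt.
    for r in same_portal:
        if r.tpgt == session.tpgt:
            return r
    # Pass 2: ignore the tpgt, but only if the result is unambiguous.
    if len(same_portal) == 1:
        return same_portal[0]
    return None
```

In the bug reporter's case the record carries tpgt 1 and the session
reports 257, so pass 1 fails and pass 2 returns the record that was
used for the login.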
Now of course, the original bug is the external tool generating
a static node config with the wrong tpgt - and if open-iscsi
were to simply refuse logins in the first place, I'd also be
fine with that behavior. But since that's not the case (and it's
very likely true that a lot of people are in some way relying on
the fact that the tpgt isn't that important at login, so it's
not going to be realistic to change that), and the login does
succeed anyway, I think the automated logout should as well.
I can work on a patch, but before I start, I'd like to hear
some thoughts on this first. Thanks!
Regards,
Christian
[1] iscsiadm --mode node --portal ... --targetname ... --op new
The portal here is specified by the external tool as only
IP:PORT.