
Bug#946645: KillUserProcesses=no disregarded for some cgroups


chrysn

Dec 12, 2019, 1:10:03 PM
Package: systemd
Version: 244-3
Severity: normal
File: /usr/share/man/man5/logind.conf.5.gz

The documentation about KillUserProcesses claims that processes will be
left alive after user logout when set to "no", and specifically mentions
tmux as one application.

This is only correct under some circumstances (apparently related to
cgroups, but I'm not familiar enough with them to give good details).
The observed issue is as follows:

* In a fresh sid installation, start Gnome under X11 (Wayland not
tested).
* Alt+F2, xterm
* tmux
* Ensure you recognize the screen again later on
* Log out
* Log back in, open xterm again
* tmux attach: "no sessions" -- the session was killed

In comparison, when `gnome-terminal` is used to start the tmux session,
it does survive the logout.

In drilling down to finding the difference at all (a process that seemed
very random to me in #945540, when I failed to take the terminal program
used into consideration), I noticed that the cgroups involved depended
on the terminal used: With xterm (or xfce4-terminal), the process seems
to be launched with cgroups pids, memory, name and "" in
/user.slice/user-$UID.slice/session-$SESSION.scope, whereas a surviving
tmux has its "", name and memory cgroups set to
/user.slice/user-$UID.slice/user@$UID.service/gnome-terminal-server.service
and the pids to /user.slice/user-$UID.slice/user@$UID.service.
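
The difference described above can be checked per process by reading
/proc/<pid>/cgroup. The following is a minimal sketch, not part of the
original report; the function names are made up for illustration, and it
assumes the usual "hierarchy-ID:controller-list:path" line format, where
the empty controller list is the "" entry mentioned above.

```python
import re

# Path fragments taken from the report: a logind session scope versus
# anything living below the per-user service manager.
SESSION_RE = re.compile(r"/session-[^/]+\.scope(/|$)")
USER_MANAGER_RE = re.compile(r"/user@\d+\.service(/|$)")


def parse_cgroup(text):
    """Map each controller list ('' for the unified hierarchy) to its path."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split only twice so that a ':' inside the path is preserved.
        _, controllers, path = line.split(":", 2)
        result[controllers] = path
    return result


def classify(path):
    """Return 'user-manager', 'session-scope' or 'other' for a cgroup path."""
    if USER_MANAGER_RE.search(path):
        return "user-manager"
    if SESSION_RE.search(path):
        return "session-scope"
    return "other"


def read_cgroup(pid="self"):
    """Read and parse /proc/<pid>/cgroup (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as handle:
        return parse_cgroup(handle.read())
```

Calling `read_cgroup()` from a shell inside tmux and passing each path to
`classify()` shows whether that tmux sits in the session scope or under
user@$UID.service.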

It seems to me that processes under session-$SESSION.scope do get reaped
(whether deliberately killed by logind or somehow implicitly by their
cgroups going away) at logout, and only those started from a program
that happens to switch cgroups survive.

Please change logind behavior to allow for all processes spawned in a
login session to survive, or if that is not possible, describe in the
KillUserProcesses documentation which processes have a chance to survive.


For now, a usable workaround for me is to `systemd-run --user --scope
tmux` instead of tmux, but that's a workaround nobody should need to
use, especially as finding the need to do so usually involves having
lost one's tmux sessions over an X server crash before.
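
For anyone scripting that workaround, a minimal sketch follows. It is not
part of the original report and the helper names are hypothetical; it only
prefixes a command with the same `systemd-run --user --scope` invocation
quoted above, and assumes systemd-run is on PATH and a user manager is
running.

```python
import subprocess


def scope_command(argv):
    """Prefix argv so it is started in its own transient user scope."""
    if not argv:
        raise ValueError("argv must not be empty")
    # Same flags as the workaround in the report; nothing else is added.
    return ["systemd-run", "--user", "--scope", *argv]


def run_in_scope(argv):
    """Run the command in a new scope and return its exit status."""
    return subprocess.call(scope_command(argv))
```

`run_in_scope(["tmux"])` then behaves like typing the workaround by hand.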

Thanks
chrysn


(Information down here relates to my production system where I've
reproduced the steps above with i3 instead of gnome).

-- System Information:
Debian Release: bullseye/sid
APT prefers unstable
APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 5.3.0-3-amd64 (SMP w/8 CPU cores)
Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages systemd depends on:
ii adduser 3.118
ii libacl1 2.2.53-5
ii libapparmor1 2.13.3-7
ii libaudit1 1:2.8.5-2+b1
ii libblkid1 2.34-0.1
ii libc6 2.29-6
ii libcap2 1:2.27-1
ii libcryptsetup12 2:2.2.2-1
ii libgcrypt20 1.8.5-3
ii libgnutls30 3.6.10-5
ii libgpg-error0 1.36-7
ii libidn2-0 2.2.0-2
ii libip4tc2 1.8.4-1
ii libkmod2 26-3
ii liblz4-1 1.9.2-2
ii liblzma5 5.2.4-1+b1
ii libmount1 2.34-0.1
ii libpam0g 1.3.1-5
ii libpcre2-8-0 10.34-7
ii libseccomp2 2.4.2-2
ii libselinux1 2.9-3+b1
ii libsystemd0 244-3
ii mount 2.34-0.1
ii util-linux 2.34-0.1

Versions of packages systemd recommends:
ii dbus 1.12.16-2

Versions of packages systemd suggests:
ii policykit-1 0.105-26
pn systemd-container <none>

Versions of packages systemd is related to:
pn dracut <none>
ii initramfs-tools 0.135
ii udev 244-3

-- no debconf information

chrysn

Feb 10, 2020, 4:10:03 AM
found 946645 244.2-1
stop

On Sun, Jan 26, 2020 at 01:29:38AM +0100, Michael Biebl wrote:
> This works fine for me, so I can not reproduce the issue with the given
> information.
>
> Do you have libpam-systemd installed and enabled?

Yes, libpam-systemd is installed, and the line `session optional
pam_systemd.so` in /etc/pam.d/common-session was not modified (I'm using
a virtual machine that had Buster's installer run and upgraded to sid,
otherwise nothing in /etc was changed manually).
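
The PAM check asked about above can be automated along these lines. This
is a rough sketch rather than a full PAM parser, and the function name is
made up for illustration: it ignores comments and looks for a session-type
line that names pam_systemd.so, without following any @include
directives.

```python
def has_pam_systemd(config_text):
    """Return True if a session line in the given PAM config loads pam_systemd.so."""
    for raw in config_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated fields.
        fields = raw.split("#", 1)[0].split()
        if len(fields) < 3:
            continue
        # PAM allows a leading '-' on the type field.
        if fields[0].lstrip("-") != "session":
            continue
        # The control field may be a bracketed list containing spaces, so
        # search all remaining fields for the module name.
        if any(tok.endswith("pam_systemd.so") for tok in fields[1:]):
            return True
    return False
```

Feeding it the contents of /etc/pam.d/common-session gives a quick yes/no
answer on a Debian system.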

To reproduce on a current system, I've upgraded the VM's systemd
from 243-8 to 244.2-1, and even the complete system to today's state.
The behavior is still as described in the original report. For sake of
completeness, it is (and has been) running gdm3.

Is there any particular VM implementation on which you tried the
reproduction, such that I can retry on that? (I don't really think it's
the VM itself, but it might make it easier to compare results.)

Kind regards
chrysn

chrysn

Feb 10, 2020, 8:50:03 AM
On Mon, Feb 10, 2020 at 11:25:29AM +0100, Michael Biebl wrote:
> I assume your GNOME session is managed by systemd --user, i.e.
> gnome-session is modelled around systemd --user.
>
> Might be that it is gnome-session that triggers the cleanup of the
> session/processes.
>
> You could try with a more minimal desktop session, like openbox, and
> start tmux in a xterm there and test if that survives a logout (it should)

In openbox, I could indeed not reproduce this behavior in the sid system
(Opening xterm via right-click / system / xterm, logging out via
right-click / ...) on a first attempt.

In the similarly simple i3wm (where I also saw it on my production
machine; in the sid VM I open the terminal using Win-D; logging out
using Win-Shift-E), the behavior happens on and off: I've seen it in 2
of 8 attempts (all with reboots in between).

Even on Gnome, when testing again after the first openbox failures, I
found the behavior to be on and off (seen in 2 out of 3 attempts), all
with no discernible pattern to it.

This is a bug in the bad area of happening often enough to not give up
on it, but erratically enough to place me at a loss on how to provide
better reproducibility to dig down :-/

Kind regards
chrysn

Ansgar

Feb 10, 2020, 9:10:03 AM
On Mon, 2020-02-10 at 14:37 +0100, chrysn wrote:
> On Mon, Feb 10, 2020 at 11:25:29AM +0100, Michael Biebl wrote:
> > I assume your GNOME session is managed by systemd --user, i.e.
> > gnome-session is modelled around systemd --user.
> >
> > Might be that it is gnome-session that triggers the cleanup of the
> > session/processes.
> >
> > You could try with a more minimal desktop session, like openbox, and
> > start tmux in a xterm there and test if that survives a logout (it should)
>
> In openbox, I could indeed not reproduce this behavior in the sid system
> (Opening xterm via right-click / system / xterm, logging out via
> right-click / ...) on a first attempt.

If I start an xterm via Alt-F2 in gnome, the xterm process runs in a
cgroup like gnome-launched-xterm-489694.scope. gnome-terminal or an
xterm started in gnome-terminal run in the
gnome-terminal-server.service cgroup. I expect openbox doesn't do this
and processes started by openbox run in the same cgroup as openbox itself.

`systemd-cgls` is useful to see how processes are organized into
cgroups.

The gnome-launched-*.scope has KillMode=control-group, so all processes
including background processes like tmux get killed when the unit gets
stopped (systemctl --user show -p KillMode gnome-launched-....scope).

gnome-terminal-server.service has KillMode=process which doesn't kill
background processes when the unit gets stopped.

I would guess that the unit openbox runs in also has KillMode=process.
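
To compare units quickly, the KillMode lookup can be wrapped in a small
script. This is a hedged sketch, not something from the thread: the
function names are hypothetical, it relies on `systemctl --user show -p`
printing "Key=value" lines, and it treats both control-group and mixed as
modes that take background processes down with the unit.

```python
import subprocess


def parse_properties(output):
    """Turn 'Key=value' lines from 'systemctl show' into a dict."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def stop_kills_whole_cgroup(props):
    """True if stopping the unit also kills non-main processes in its cgroup."""
    return props.get("KillMode") in ("control-group", "mixed")


def show_user_unit(unit, properties=("KillMode",)):
    """Query properties of a unit in the calling user's service manager."""
    cmd = ["systemctl", "--user", "show", "-p", ",".join(properties), unit]
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_properties(done.stdout)
```

For example, `stop_kills_whole_cgroup(show_user_unit("gnome-terminal-server.service"))`
should come out false if that unit really has KillMode=process as
described above.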

Ansgar

chrysn

Feb 10, 2020, 10:40:03 AM
On Mon, Feb 10, 2020 at 03:04:49PM +0100, Ansgar wrote:
> If I start an xterm via Alt-F2 in gnome, the xterm process runs in a
> cgroup like gnome-launched-xterm-489694.scope. gnome-terminal or an
> xterm started in gnome-terminal run in the
> gnome-terminal-server.service cgroup. I expect openbox doesn't do this
> and processes started by openbox run in the same cgroup as openbox itself.
>
> `systemd-cgls` is useful to see how processes are organized into
> cgroups.
>
> The gnome-launched-*.scope has KillMode=control-group, so all processes
> including background processes like tmux get killed when the unit gets
> stopped (systemctl --user show -p KillMode gnome-launched-....scope).

That does not explain the on-and-off behavior, but helps in
understanding why processes get killed in the first place.

Then, however, is KillUserProcesses completely obsolete? A user trying
to debug a situation like mine will currently wind up in the discussion
of #825394 and look into KillUserProcesses' documentation. Even if it's
not completely obsolete, a warning that desktop environment spawned
processes might also be subject to killing from their desktop
environment's KillMode would be helpful.

Kind regards
chrysn

chrysn

Feb 10, 2020, 11:00:03 AM
> Maybe we should first get to the bottom of your issue before jumping to
> conclusions and producing incorrect/incomplete documentation.

Happily so.

Right now I don't really know how to test this better (as in to get
usable data that'll get us on), let alone how to test this well (as in
without logging in and out of a virtual machine 2x5 times to get one or
two cases of observable behavior).

Any ideas on what more data I could pull out?

chrysn

Michael Biebl

Mar 10, 2021, 3:30:03 PM
On 09.03.21 at 03:55, Russell Stuart wrote:
> I have similar observations to chrysn after starting and detaching a
> tmux session with KillUserProcesses=no (the default):
>
> 1.  If I do the process in Gnome-3 using gnome-terminal and I log out,
>     wait for a bit and log in (60 seconds is what I used), the detached
>     tmux session is gone.
>
> 2.  If I do the process in Gnome-3 using gnome-terminal and I log out,
>     and log back in immediately, the tmux session usually survives.
>
> 3.  If in Gnome-3 using gnome-terminal I run "loginctl enable-linger",
>     do the process above and log out and in, the tmux session always
>     survives.
>
> 4.  If started on a virtual console (ctrl-alt-f3), and I log out, it
>     survives regardless of how long I wait before logging in again,
>     and regardless of how many Gnome3 sessions are started and stopped
>     on the same machine.
>
> I guess a workaround would be to globally set "loginctl enable-linger"
> for all logins, but if there is such a setting I can't find it.  (I
> thought KillUserProcesses=no was that setting, apparently not, but I'm
> guessing that's because it isn't systemd doing this.)  Alternatively, if
> someone can tell me how to get the Wayland version of KDE to run ...

Laney, Simon, isn't this an issue in gnome-session? WDYT?
