[lxc-users] Cgroup saga continues - lxc-autostart woes.

Ben Green

Nov 21, 2020, 4:25:15 PM11/21/20
to lxc-...@lists.linuxcontainers.org
Hi all,

I've been wrestling with cgroups for ages, trying to get limits applied
to containers. On the advice of Serge E. Hallyn, I've created directories
named 'forcontainers' which exist in /sys/fs/cgroup/*/forcontainers/. It
works perfectly well during normal operation, but I can't find a way to
get the autostarting servers to come up properly at boot. I find that
/sys/fs/cgroup/memory/forcontainers/lxc/<container name>/tasks is
completely empty. My .service file looks like this:

   [Unit]
   Description=Psand run script to add appropriate
   After=network.target lxccgroup-add-dirs.service
   RequiresMountsFor=/sys/fs/cgroup/memory

   [Service]
   ExecStart=/usr/local/sbin/lxc-start-lxcadmin-servers
   Type=oneshot
   RemainAfterExit=yes

   [Install]
   WantedBy=multi-user.target
   Alias=lxc-start-lxcadmin-servers.service

And the script basically runs this:

   #!/bin/bash

   /sbin/runuser -l lxcadmin -c 'lxc-autostart'

Am I missing something obvious? Should I be launching lxc-autostart in a
better way, perhaps wrapped in bash? I've written about this before, when
there was an associated kernel bug, but that bug doesn't affect my
current install. I can't believe it's taken so long to get this into
production.
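For anyone wanting to reproduce the check, this is roughly how I look at
where a container's tasks actually landed. The helper below is just a
sketch (cgroup v1 layout assumed; `memory_cgroup_of` is my own name for
it, and the container name is an example):

```shell
#!/bin/bash
# memory_cgroup_of FILE: print the memory-controller path from a file in
# /proc/<pid>/cgroup format (cgroup v1 layout assumed).
memory_cgroup_of() {
    awk -F: '$2 ~ /(^|,)memory($|,)/ {print $3}' "$1"
}

# Example usage against a running container (lxc-info -p -H prints the
# bare init PID):
#   memory_cgroup_of "/proc/$(lxc-info -n example_server -p -H)/cgroup"
```

If that prints a path under system.slice rather than one under
.../forcontainers/lxc/<name>, the tasks are sitting in the service's
cgroup instead of the one the limits were set on.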

--
Cheers,
Ben Green

Serge E. Hallyn

Nov 21, 2020, 4:54:05 PM11/21/20
to LXC users mailing-list
On Sat, Nov 21, 2020 at 09:25:15PM +0000, Ben Green wrote:
> Hi all,
>
> I've been wrestling with cgroups for ages, trying to get limits applied to
> containers. On the advice of Serge E. Hallyn, I've created directories named
> 'forcontainers' which exist in /sys/fs/cgroup/*/forcontainers/. It works
> perfectly well during normal operation, but I can't find a way to get the
> autostarting servers to come up properly at boot. I find that
> /sys/fs/cgroup/memory/forcontainers/lxc/<container name>/tasks is completely
> empty. My .service file looks like this:
>
>    [Unit]
>    Description=Psand run script to add appropriate
>    After=network.target lxccgroup-add-dirs.service
>    RequiresMountsFor=/sys/fs/cgroup/memory
>
>    [Service]
>    ExecStart=/usr/local/sbin/lxc-start-lxcadmin-servers
>    Type=oneshot
>    RemainAfterExit=yes
>
>    [Install]
>    WantedBy=multi-user.target
>    Alias=lxc-start-lxcadmin-servers.service
>
> And the script basically runs this:
>
>    #!/bin/bash
>
>    /sbin/runuser -l lxcadmin -c 'lxc-autostart'
>
> Am I missing something obvious? Should I be launching lxc-autostart in a

I've never used lxc-autostart, but looking at the manpage, do you
have lxc.start.auto set in the containers you want to start? What
do the configs look like?

Do the containers actually start but in the wrong cgroup? Or do
they just not start?

Ben Green

Nov 22, 2020, 8:41:14 AM11/22/20
to lxc-...@lists.linuxcontainers.org
On 21/11/2020 21:54, Serge E. Hallyn wrote:
> I've never used lxc-autostart, but looking at the manpage, do you
> have lxc.start.auto set in the containers you want to start? What
> do the configs look like?
>
> Do the containers actually start but in the wrong cgroup? Or do
> they just not start?

They start, but the tasks files of some cgroup filesystems are left
unpopulated with PID numbers. This is particularly troublesome for the
memory cgroup; for example,
/sys/fs/cgroup/memory/forcontainers/lxc/example_server/tasks would
contain nothing. This means the container ends up with the memory limits
assigned to the host.

I can get the memory limits to work again by copying the task pids
from /sys/fs/cgroup/cpuset/forcontainers/lxc/example_server/tasks into
/sys/fs/cgroup/memory/forcontainers/lxc/example_server/tasks. Then the
server reports the required memory allowance again.

Other cgroup filesystems are affected too, 'devices' and 'blkio' for
example. We aren't doing anything with those though, so I'm
concentrating on 'memory', which I really need to make use of.

Cheers,

Ben

Serge E. Hallyn

Nov 22, 2020, 10:09:11 AM11/22/20
to LXC users mailing-list
On Sun, Nov 22, 2020 at 01:41:14PM +0000, Ben Green wrote:
> On 21/11/2020 21:54, Serge E. Hallyn wrote:
> > I've never used lxc-autostart, but looking at the manpage, do you
> > have lxc.start.auto set in the containers you want to start? What
> > do the configs look like?
> >
> > Do the containers actually start but in the wrong cgroup? Or do
> > they just not start?
>
> They start, but the tasks files of some cgroup filesystems are left
> unpopulated with PID numbers. This is particularly troublesome for the memory
> cgroup; for example: /sys/fs/cgroup/memory/forcontainers/lxc/example_server/tasks
> would contain nothing. This means the container ends up with the memory
> limits assigned to the host.

Can you cat /proc/$pid/cgroup for one of the tasks in an autostarted container,
and show the /var/lib/lxc/container-name/config ?

> I can get the memory limits to work again by copying the task pids from
> /sys/fs/cgroup/cpuset/forcontainers/lxc/example_server/tasks into
> /sys/fs/cgroup/memory/forcontainers/lxc/example_server/tasks. Then the
> server reports the required memory allowance again.
>
> Other cgroup filesystems are affected too, 'devices' and 'blkio' for
> example. We aren't doing anything with those though, so I'm concentrating
> on 'memory', which I really need to make use of.
>
> Cheers,
>
> Ben
> _______________________________________________
> lxc-users mailing list
> lxc-...@lists.linuxcontainers.org
> http://lists.linuxcontainers.org/listinfo/lxc-users

Guido Jäkel

Nov 25, 2020, 1:18:48 AM11/25/20
to LXC users mailing-list
Dear Ben, (hi Serge,)

maybe you should also take a look at what happens if you play with namespaces using the userland tools like lsns, unshare and nsenter.

with greetings

Guido

Ben Green

Dec 8, 2020, 5:21:07 PM12/8/20
to lxc-...@lists.linuxcontainers.org
On 22/11/2020 15:09, Serge E. Hallyn wrote:
> Can you cat /proc/$pid/cgroup for one of the tasks in an autostarted container,
> and show the /var/lib/lxc/container-name/config ?

I must apologise for such a late response; this is the first time I've
been back to this problem since you responded.

I'm running as an unprivileged user, lxcadmin, so the configs are at
/home/lxcadmin/.local/share/lxc/slug/config :

# "Secure" mounting
lxc.mount.auto = proc:mixed sys:ro cgroup:mixed
lxc.rootfs.path = dir:/home/lxcadmin/.local/share/lxc/slug/rootfs

# Common configuration
lxc.include = /usr/share/lxc/config/debian.common.conf

# Container specific configuration
lxc.tty.max = 4
lxc.uts.name = slug
lxc.arch = amd64
lxc.pty.max = 1024

lxc.cgroup.memory.limit_in_bytes = 4G
lxc.cgroup.memory.memsw.limit_in_bytes = 6G
lxc.mount.entry = none tmp tmpfs size=1024m,mode=1777 0 0
lxc.cap.drop = mac_admin mac_override net_admin sys_admin sys_module sys_rawio sys_time syslog sys_resource setpcap

# Subuids and subgids mapping
lxc.idmap = u 0 1258512 65536
lxc.idmap = g 0 1258512 65536
lxc.start.auto = 1

I've omitted the interface settings. I've actually found that once
again, containers are jumping out of their cgroups:
/sys/fs/cgroup/memory/forcontainers/lxc/slug/tasks is again empty. I
don't know how or why this has come about. This is now an absolute
nightmare. We were hoping for this LXC setup to become a production
setup, and instead we've got one in which the containers get removed
from their cgroups on a regular yet seemingly motiveless basis. I've put
these containers into the forcontainers group as suggested. The output
below does seem very revealing though.

Here's the /proc/<pid>/cgroup of an example PID from inside the server,
which has just started:

11:net_cls,net_prio:/
10:cpu,cpuacct:/../../../system.slice/lxc-start-lxcadmin-servers.service
9:cpuset:/
8:rdma:/
7:perf_event:/
6:memory:/../../../system.slice/lxc-start-lxcadmin-servers.service
5:pids:/../../../system.slice/lxc-start-lxcadmin-servers.service
4:freezer:/
3:blkio:/../../../system.slice/lxc-start-lxcadmin-servers.service
2:devices:/../../../system.slice/lxc-start-lxcadmin-servers.service
1:name=systemd:/
0::/

Here's one which has lost its pids through some unknown process, which
happens seemingly at random:

11:rdma:/
10:freezer:/
9:perf_event:/
8:pids:/../../../user.slice/user-202.slice/session-59466.scope
7:blkio:/../../../user.slice
6:cpuset:/
5:cpu,cpuacct:/../../../user.slice
4:memory:/../../../user.slice/user-202.slice/session-59466.scope
3:net_cls,net_prio:/
2:devices:/../../../user.slice
1:name=systemd:/
0::/

This is one behaving as we would like, freshly started using lxc-start
from an ssh session of the user lxcadmin:

11:net_cls,net_prio:/
10:cpu,cpuacct:/
9:cpuset:/
8:rdma:/
7:perf_event:/
6:memory:/
5:pids:/
4:freezer:/
3:blkio:/
2:devices:/
1:name=systemd:/
0::/

It's that pesky systemd, isn't it? I've made a script to get things back
to how we want them, which looks like this for the guest ${1}:

#!/bin/bash
# Copy the container's task PIDs back from the cpuset hierarchy (which
# still holds them) into the hierarchies that have been emptied.
for j in pids blkio cpu,cpuacct memory devices; do
   for i in $(cat /sys/fs/cgroup/cpuset/forcontainers/lxc/${1}/tasks); do
      echo "${i}" >> /sys/fs/cgroup/${j}/forcontainers/lxc/${1}/tasks
   done
done
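A slightly more defensive sketch of the same idea (not what we run in
production: the cgroup root is parameterised so the copy logic can be
exercised outside /sys/fs/cgroup, hierarchies that aren't set up get
skipped, and PIDs that exit mid-copy are ignored):

```shell
#!/bin/bash
# Re-copy a guest's task PIDs from the cpuset hierarchy (which still
# holds them) into the hierarchies that have been emptied.
CGROOT="${CGROOT:-/sys/fs/cgroup}"

recopy_tasks() {
    local guest="$1" hier dst pid
    local src="${CGROOT}/cpuset/forcontainers/lxc/${guest}/tasks"
    for hier in pids blkio cpu,cpuacct memory devices; do
        dst="${CGROOT}/${hier}/forcontainers/lxc/${guest}/tasks"
        [ -w "$dst" ] || continue        # hierarchy not set up; skip it
        while read -r pid; do
            # a PID can exit between the read and the write; ignore that
            echo "$pid" >> "$dst" 2>/dev/null || true
        done < "$src"
    done
}

# Usage: recopy_tasks slug
```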

Cheers,

Ben

Ben Green

Dec 14, 2020, 9:57:42 AM12/14/20
to lxc-...@lists.linuxcontainers.org
On 08/12/2020 22:21, Ben Green wrote:
>
> It's that pesky systemd, isn't it? I've made a script to get things
> back to how we want them, which looks like this for the guest ${1}:
>
To add some info to this, I've confirmed that running 'systemctl
daemon-reload' causes systemd to move all pids into its own cgroup slice.

I've decided the simple approach, though very annoying to have to do, is
simply to overwrite systemd's cgroup pid movement on a regular basis.
Until systemd has either the flexibility not to interfere, or the
consistency to handle cgroups properly in a manner suitable for LXC
unprivileged containers, this seems a pragmatic approach. I'm really
deeply bored with this unfathomable, coercive and broken init system. If
anyone has any ways to stop systemd interfering with my plans, please
let me know.
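One avenue I haven't tried yet is systemd's own delegation mechanism:
according to systemd.resource-control(5), setting Delegate=yes on a unit
tells systemd that the service manages its own cgroup subtree and that
it should refrain from moving processes around inside it. Something like
this in the .service file (untested here, and whether it protects the
separate /forcontainers hierarchy is another question, so treat it as a
sketch):

```ini
[Service]
ExecStart=/usr/local/sbin/lxc-start-lxcadmin-servers
Type=oneshot
RemainAfterExit=yes
# Hand the cgroup subtree below this unit over to the service, so that
# systemd leaves the processes inside it alone.
Delegate=yes
```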

Thanks for everyone's help on this issue.

--
Cheers,
Ben Green
