S6-setuidgid

Nicol Allphin

Jul 27, 2024, 1:39:47 AM

to tingpetleuda

Now in my previous work, the usual policy was that you never ever run any process as root. You create a user/group for it and run from there. Of course, the system did run some things as root, but we could achieve all business logic processing without being root.

It seems that others have missed your point, which was not reasons why to use changed roots, which of course you clearly already know, nor what else you can do to place limits on dæmons, when you also clearly know about running under the aegides of unprivileged user accounts; but why to do this stuff inside the application. There's actually a fairly on-point example of why.

Consider the design of the httpd dæmon program in Daniel J. Bernstein's publicfile package. The first thing that it does is change root to the root directory that it was told to use with a command argument, then drop privileges to the unprivileged user ID and group ID that are passed in two environment variables.
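
The order of operations described above can be sketched in Python. This is only an illustration of the sequence, not publicfile's actual C code: the function names `read_ids` and `confine` and the injectable `osmod` parameter are hypothetical, and the variable names `UID` and `GID` are assumed to be the ones that daemontools' envuidgid sets. It has to be started as root for the real system calls to succeed.

```python
import os


def read_ids(env):
    """Parse the numeric user and group IDs from an environment mapping."""
    uid = int(env["UID"])
    gid = int(env["GID"])
    if uid == 0 or gid == 0:
        # Refusing ID 0 is a choice made for this sketch.
        raise ValueError("refusing to 'drop' privileges to ID 0")
    return uid, gid


def confine(root, env, osmod=os):
    """Change root to `root`, then drop to the IDs named in `env`."""
    uid, gid = read_ids(env)   # parse first, so bad input fails before any change
    osmod.chdir(root)
    osmod.chroot(".")          # from here on, nothing outside `root` can be named
    osmod.setgroups([gid])     # replace root's supplementary group list
    osmod.setgid(gid)          # group before user: after setuid() it is too late
    osmod.setuid(uid)          # last step, the process is now unprivileged
```

A program would call something like `confine(sys.argv[1], os.environ)` as its very first action, before touching any untrusted input. The `osmod` parameter exists only so that the sequence can be exercised without root.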

Dæmon management toolsets have dedicated tools for things like changing root directory and dropping to unprivileged user and group IDs. Gerrit Pape's runit has chpst. My nosh toolset has chroot and setuidgid-fromenv. Laurent Bercot's s6 has s6-chroot and s6-setuidgid. Wayne Marshall's Perp has runtool and runuid. And so forth. Indeed, they all have M. Bernstein's own daemontools toolset with setuidgid as an antecedent.

With Bernstein httpd as it stands, the only files and directories that are in the root directory tree are ones that are to be published to the world. There is nothing else in the tree at all. Moreover, there is no reason for any executable program image file to exist in that tree.

But move the root directory change out into a chain-loading program (or systemd), and suddenly the program image file for httpd, any shared libraries that it loads, and any special files in /etc, /run, and /dev that the program loader or C runtime library access during program initialization (which you might find quite surprising if you truss/strace a C or C++ program), also have to be present in the changed root. Otherwise httpd cannot be chained to and won't load/run.

Remember that this is an HTTP(S) content server. It can potentially serve up any (world-readable) file in the changed root. This now includes things like your shared libraries, your program loader, and copies of various loader/CRTL configuration files for your operating system. And if by some (accidental) means the content server has access to write stuff, a compromised server can possibly gain write access to the program image for httpd itself, or even your system's program loader. (Remember that you now have two parallel sets of /usr, /lib, /etc, /run, and /dev directories to keep secure.)

So you have traded having a small amount of privileged code, that is fairly easy to audit and that runs right at the start of the httpd program, running with superuser privileges; for having a greatly expanded attack surface of files and directories within the changed root.

Notice that this is nonetheless a bare minimum of functionality within httpd itself. All of the code that does things such as look in the operating system's account database for the user ID and group ID to put into those environment variables in the first place is external to the httpd program, in simple standalone auditable commands such as envuidgid. (And of course it is a UCSPI tool, so it contains none of the code to listen on the relevant TCP port(s) or to accept connections, those being the domain of commands such as tcpserver, tcp-socket-listen, tcp-socket-accept, s6-tcpserver4-socketbinder, s6-tcpserver4d, and so on.)
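
The division of labour with an envuidgid-style helper can be sketched as follows. This is a rough Python approximation, not the real tool: `env_with_ids` and `chain_load` are hypothetical names, and the `UID`/`GID` variable names are assumed to match what daemontools' envuidgid exports. The account database lookup happens here, in a small separate program, which then chain-loads the next program with the two variables set.

```python
import os
import pwd


def env_with_ids(account, env):
    """Return a copy of `env` with UID and GID taken from the account database."""
    entry = pwd.getpwnam(account)      # raises KeyError for an unknown account
    new_env = dict(env)
    new_env["UID"] = str(entry.pw_uid)
    new_env["GID"] = str(entry.pw_gid)
    return new_env


def chain_load(account, argv):
    """Look up `account`, then replace this process with `argv` (never returns)."""
    os.execvpe(argv[0], argv, env_with_ids(account, os.environ))
```

The chained-to program then needs no account database code at all; it only parses two decimal numbers from its environment.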

I think many details of your question could apply equally to avahi-daemon, which I looked at recently. (I might have missed another detail that differs though). Running avahi-daemon in a chroot has many advantages, in case avahi-daemon is compromised. These include:

Point 3 could be particularly nice when you're not using dbus or similar... I think avahi-daemon uses dbus, so it makes sure to keep access to the system dbus even from inside the chroot. If you don't need the ability to send messages on the system dbus, denying that ability might be quite a nice security feature.

Note that if avahi-daemon was re-written, it could potentially choose to rely on systemd for security, and use e.g. ProtectHome. I proposed a change to avahi-daemon to add these protections as an extra layer, along with some additional protections that are not guaranteed by chroot. You can see the full list of options I proposed here:

It looks like there are more restrictions which I could have used if avahi-daemon did not use chroot itself, some of which are mentioned in the commit message. I'm not sure how much this applies though.

Another approach would be to use SELinux. However you would kind of be tying your application to that sub-set of Linux distributions. The reason I thought of SELinux positively here, is that SELinux restricts the access that processes have on dbus, in a fine-grained way. For example, I think you could often expect that systemd would not be in the list of bus names that you needed to be able to send messages to :-).

If you think about point 3, using chroot provides more confinement. ProtectHome= and its friends don't even try to be as restrictive as chroot. (For example, none of the named systemd options blacklists /run, where we tend to put unix socket files).

chroot shows that restricting filesystem access can be very powerful, but not everything on Linux is a file :-). There are systemd options that can restrict other things, that are not files. This is useful if the program is compromised: you can reduce the kernel features available to it, in which it might otherwise try to exploit a vulnerability. For example avahi-daemon doesn't need bluetooth sockets and I guess your web server doesn't either :-). So don't give it access to the AF_BLUETOOTH address family. Just whitelist AF_INET, AF_INET6, and maybe AF_UNIX, using the RestrictAddressFamilies= option.
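
One way to check that such a whitelist is actually in effect is to probe from inside the service. The following is a small Python sketch with hypothetical helper names (`family_allowed`, `report`); it assumes a unit containing a line such as `RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX`, under which creating a socket of any other family is expected to fail. Note that a failure can also just mean the running kernel lacks support for that family, so a `False` is a hint rather than proof.

```python
import socket


def family_allowed(family, kind=socket.SOCK_STREAM):
    """Return True if this process can create a socket of the given family."""
    try:
        sock = socket.socket(family, kind)
    except OSError:
        # Refused by a sandbox, or simply not supported by this kernel.
        return False
    sock.close()
    return True


def report(names=("AF_INET", "AF_INET6", "AF_UNIX", "AF_BLUETOOTH")):
    """Map each family name to True/False, or None if Python lacks the constant."""
    result = {}
    for name in names:
        family = getattr(socket, name, None)
        result[name] = None if family is None else family_allowed(family)
    return result
```

Running `print(report())` once from a normal shell and once from within the sandboxed service shows which families the restriction removed.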

Please read the docs for each option you use. Some options are more effective in combination with others, and some are not available on all CPU architectures. (Not because the CPU is bad, but because the Linux port for that CPU wasn't as nicely designed, I think.)

(There's a general principle here. It's more secure if you can write lists of what you want to allow, not what you want to deny. For example, defining a chroot gives you a list of files you're allowed to access, and this is more robust than saying you want to block /home.)

In principle, you could apply all the same restrictions yourself before setuid(). It's all just code which you could copy from systemd. However, systemd unit options should be significantly easier to write, and since they are in a standard format they should be easier to read and review.

So I can highly recommend just reading through the sandboxing section of man systemd.exec on your target platform. But if you want the most secure design possible, I wouldn't be afraid to try chroot (and then drop root privileges) in your program as well. There is a tradeoff here. Using chroot imposes some constraints on your overall design. If you already have a design that uses chroot, and it seems to do what you need, that sounds pretty great.

If you can rely on systemd, then it is indeed safer (and simpler!) to leave the sandboxing to systemd. (Of course, the application can also detect if it has been launched sandboxed by systemd or not, and sandbox itself if it is still root.) The equivalent of the service you describe would be:

s6 is a package that provides a daemontools-inspired process supervision suite, a notification framework, a UNIX domain super-server, and tools for file descriptor holding and suidless privilege gain. It can be used as an init system component, and also as a helper for supervising OpenRC services. A high-level overview of s6 is available here. The package's documentation is primarily provided in HTML format, and can be read on a text user interface using, for example, www-client/links. However, a man page port of the s6 documentation, app-misc/s6-man, is available in the GURU repository.

The program that implements the supervisor features in s6 is s6-supervise, and just like daemontools' supervise, it takes the (absolute or relative to the working directory) pathname of a service directory (or servicedir) as an argument. An s6 service directory must contain at least an executable file named run, and can contain an optional, regular file named down, and an optional subdirectory or symbolic link to directory named log, all of which work like their daemontools counterparts. Like runit service directories, it can also contain an optional, executable file named finish, that can be used to perform cleanup actions each time the supervised process terminates, possibly depending on its exit status information. s6-supervise calls finish with two arguments: the first one is the supervised process' exit code, or 256 if it was killed by a signal, and the second one is the signal number if the supervised process was killed by a signal, or an undefined number otherwise. Unlike runit's runsv, s6-supervise sends the finish process a SIGKILL signal if it runs for too long. If using s6 version 2.2.0.0 or later, there can be an optional, regular file in the service directory, named timeout-finish, and containing an unsigned integer value that specifies how much time (in milliseconds) the finish process is allowed to run until being killed. If that file is absent, a default value of 5 seconds is used, which is the fixed value used by earlier versions. Like daemontools-encore, s6-supervise makes its child process the leader of a new session using the POSIX setsid() call, unless the servicedir contains a regular file named nosetsid (daemontools-encore's counterpart file is named no-setsid, though). In that case, the child process will run in s6-supervise's session instead. s6-supervise waits for a minimum of 1 second between two run spawns, so that it does not loop too quickly if the supervised process exits immediately.
If s6-supervise receives a SIGTERM signal, it behaves as if an s6-svc -dx command naming the corresponding service directory had been used (see later), and if it receives a SIGHUP signal, it behaves as if an s6-svc -x command naming the corresponding service directory had been used.
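
The servicedir layout described above can be illustrated with a short Python sketch that writes one out. Everything named here is an assumption for the sake of the example: the helper `write_servicedir`, the account `daemonuser`, and the command `mydaemon` are hypothetical. The run script uses s6-setuidgid to drop to the unprivileged account before executing the program, and the finish script reads the two arguments that s6-supervise passes to it.

```python
import os
import shlex

# finish receives: $1 = exit code (256 if killed by a signal), $2 = signal number
FINISH = """#!/bin/sh
if [ "$1" -eq 256 ]; then
  echo "killed by signal $2" >&2
else
  echo "exited with code $1" >&2
fi
"""


def _write_script(path, text):
    """Write `text` to `path` and mark the file executable."""
    with open(path, "w") as f:
        f.write(text)
    os.chmod(path, 0o755)


def write_servicedir(path, account, argv, finish_timeout_ms=None):
    """Create a minimal s6 service directory at `path`."""
    os.makedirs(path, exist_ok=True)
    command = " ".join(shlex.quote(a) for a in argv)
    # exec, so that the supervised process is the program itself, not a shell
    run = "#!/bin/sh\nexec s6-setuidgid {} {}\n".format(shlex.quote(account), command)
    _write_script(os.path.join(path, "run"), run)
    _write_script(os.path.join(path, "finish"), FINISH)
    if finish_timeout_ms is not None:
        # optional file, honoured by s6 2.2.0.0 and later; value in milliseconds
        with open(os.path.join(path, "timeout-finish"), "w") as f:
            f.write("{}\n".format(int(finish_timeout_ms)))
```

For instance, `write_servicedir("/tmp/mydaemon", "daemonuser", ["mydaemon", "--foreground"], finish_timeout_ms=3000)` produces a directory that could then be handed to s6-supervise, which needs to start as root for s6-setuidgid to be able to change IDs.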