Minimum Kernel Requirements for cgroups

brandon...@levvel.io

Mar 1, 2016, 11:04:45 AM
to Nomad
I've looked everywhere and I can't find any information on the minimum kernel requirements for Nomad. I know that there aren't hard requirements, but I'm referring to the requirements for features like cgroups. I'm thinking primarily about the Java driver and the exec driver. For the Java driver, the documentation says:

On Linux, Nomad will attempt to use cgroups, namespaces, and chroot to isolate the resources of a process. If the Nomad agent is not running as root many of these mechanisms cannot be used.
As a baseline, the Java jars will be run inside a Java Virtual Machine, providing a minimum amount of isolation.

Since there's no option for which user to run as, I'm assuming that if Nomad can't use cgroups, the Java process runs as the same user as Nomad (like the raw exec driver). I'm also assuming the same for the isolated exec driver. My understanding of cgroups is a little limited, but from what I've read, they were introduced in kernel version 2.6.24 and have changed greatly since then.

I'm pushing to get approval for RHEL 7, but right now the only thing I have access to is RHEL 6.6, which uses an older kernel (~2.6.32). We were originally going to use Docker, but without a newer kernel we can't. We'll have to use a combination of the Java driver and one of the exec drivers (for the Node frontend).

Does Nomad support cgroups all the way back to 2.6.24, or is there a cutoff? Am I correct that without cgroups, the Java driver will run as the user that Nomad is running under? Will the isolated exec driver run without cgroups and gracefully degrade to just a chroot, or will it just fail? And if it does run, will it run as the Nomad user as well?

Alex Dadgar

Mar 1, 2016, 12:43:30 PM
to brandon...@levvel.io, Nomad
Hey Brandon,

The exec and Java drivers, when running under Linux, will not gracefully degrade to just using a chroot. They are designed to provide isolation guarantees rather than best effort. We honestly have not tested older kernels for compatibility. A very easy way to find out is to launch an instance on your desired OS using `nomad agent -dev`, which launches a server and client as an easy way to test things. You can then just try to run an exec job! Even "/bin/sleep" will be enough to test!
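
Before running the test job, it can also help to see which cgroup controllers the kernel itself reports. This is only a rough sketch, assuming a Linux host that exposes /proc/cgroups; the helper name `enabled_controllers` is made up for illustration, and a non-empty list says nothing about whether Nomad has been tested on that kernel.

```shell
#!/usr/bin/env bash
# enabled_controllers [FILE]
# Print the cgroup controllers marked as enabled in a /proc/cgroups-style table.
# Columns are: subsys_name, hierarchy, num_cgroups, enabled. The header starts with '#'.
enabled_controllers() {
    local table="${1:-/proc/cgroups}"
    awk '!/^#/ && $4 == 1 { print $1 }' "$table"
}

echo "kernel: $(uname -r)"
if [ -r /proc/cgroups ]; then
    echo "enabled cgroup controllers:"
    enabled_controllers
else
    echo "/proc/cgroups is not readable; this kernel may not have cgroup support"
fi
```

If the list comes back empty, the exec job is unlikely to get as far as joining a cgroup.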

Thanks,
Alex

--
This mailing list is governed under the HashiCorp Community Guidelines - https://www.hashicorp.com/community-guidelines.html. Behavior in violation of those guidelines may result in your removal from this mailing list.
 
GitHub Issues: https://github.com/hashicorp/nomad/issues
IRC: #nomad-tool on Freenode
---
You received this message because you are subscribed to the Google Groups "Nomad" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nomad-tool+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nomad-tool/6d0978ab-68f1-4f71-89dd-590845ccf0e0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

brandon...@levvel.io

Mar 1, 2016, 1:54:57 PM
to Nomad
Simple enough.

Nomad: 0.2.3
RHEL 6.6 running in VMware
kernel: 2.6.32-504.el6.x86_64

I tried running "/bin/sleep 1" as an exec job. It failed to join spawn-daemon to the cgroup. I tried looking up the Red Hat docs on cgroups in 6.6 and followed along with them briefly, installing libcgroup and starting the service (not sure if that was needed, but it did create the /cgroup directory). I tried again and got the same error:

2016/03/01 12:40:32 [ERR] client: failed to start task 'sleep' for alloc '84edcf59-88c8-b5b9-4786-7ef1d0fabef4': failed to start command: 1 error(s) occurred:

* Failed to join spawn-daemon to the cgroup (&{Name:2d1179c7-34c7-8a75-51f1-1831aac11949 Parent: ScopePrefix: Resources:0xc820286480}): Error found less than 3 fields post '-' in "29 24 0:5 / /tmp/NomadClient428429800/3fbfff31-166f-6a5b-b2b5-07f953d080bc/sleep/dev ro,relatime - devtmpfs  rw,size=8154472k,nr_inodes=2038618,mode=755"

This appears to be the same error message as https://groups.google.com/forum/#!msg/nomad-tool/QSLtwQWK2Tc/N11u2BIoEQAJ but I'm not sure if it's related or not. Reading the discussion around that, I'm thinking it isn't related.

Is this just proof that cgroups-related stuff won't run on RHEL 6.6?

Alex Dadgar

Mar 1, 2016, 1:58:59 PM
to brandon...@levvel.io, Nomad
Hey,

It looks like you are running Nomad 0.2.3. Try Nomad 0.3.0, which checks that you have cgroups mounted: when Nomad first starts up, it logs if cgroups aren't there. That error message (Error found less than 3 fields post '-') is usually because you don't have cgroups mounted. You can check by calling `mount` and seeing if cgroups are listed.
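
The mount check can be scripted as well. The sketch below reads /proc/mounts (the same information `mount` prints) and keeps only entries whose filesystem type is cgroup; the function names `cgroup_mounts` and `has_cgroups` are hypothetical helpers, not part of Nomad.

```shell
#!/usr/bin/env bash
# cgroup_mounts [FILE]
# Print "mountpoint options" for every cgroup entry in a /proc/mounts-style table.
# Fields per line are: device, mount point, fs type, options, dump, pass.
cgroup_mounts() {
    local table="${1:-/proc/mounts}"
    awk '$3 == "cgroup" || $3 == "cgroup2" { print $2, $4 }' "$table" 2>/dev/null
}

# has_cgroups [FILE]
# Succeed if at least one cgroup hierarchy is mounted.
has_cgroups() {
    [ -n "$(cgroup_mounts "$@")" ]
}

if has_cgroups; then
    echo "cgroup hierarchies are mounted:"
    cgroup_mounts
else
    echo "no cgroup mounts found"
fi
```

If this reports no cgroup mounts, that would be consistent with the "less than 3 fields" error above.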

Further, your job spec should look something like:

task "foo" {
   driver = "exec"
   config {
     command = "/bin/sleep"
     args = ["10"]
   }
}

Brandon Dennis

Mar 1, 2016, 2:03:00 PM
to Alex Dadgar, Nomad
I just downloaded the version from the download page. Someone might want to update that then. I'll try your recommendations in a few. Where can I download 0.3 from?

Alex Dadgar

Mar 1, 2016, 2:05:41 PM
to Brandon Dennis, Nomad
Just looked at all the links, they all point to Nomad 0.3:

Alex Dadgar

Mar 1, 2016, 2:06:12 PM
to Brandon Dennis, Nomad
In case that doesn't work for you: https://releases.hashicorp.com/nomad/0.3.0/

Brandon Dennis

Mar 1, 2016, 2:08:18 PM
to Alex Dadgar, Nomad
Ok. Maybe it's cached in my browser. Thanks



brandon...@levvel.io

Mar 1, 2016, 2:54:28 PM
to Nomad
I'm cracking up a bit. I wanted something a little more obvious that it was successful. So I wrote a script that just echoes the date to a file (/home/s1463080/test_script).

#! /usr/bin/env bash

echo `date` > /home/s1463080/test_output.txt

Then I changed the job file

job "example" {
        datacenters = ["dc1"]
        type = "service"
        update {
                stagger = "10s"
                max_parallel = 1
        }
        group "testing" {
                task "what-time-is-it" {
                        driver = "exec"
                        config {
                                command = "/home/s1463080/test_script"
                        }
                        resources {
                                cpu = 500 # 500 MHz
                                memory = 256 # 256MB
                                network {
                                        mbits = 10
                                }
                        }
                }
        }
}


Now I get an error saying:

2016/03/01 13:47:43 [DEBUG] plugin: /usr/bin/nomad: plugin process exited
    2016/03/01 13:47:43 [ERR] client: failed to start task 'what-time-is-it' for alloc 'c8356d85-3121-0ae0-4f16-bd659c7b96cb': error starting process via the plugin: error starting command: fork/exec /home/s1463080/test_script: no such file or directory

I copied the path directly from the output to make sure I didn't typo something and tried cat'ing it.

# cat /home/s1463080/test_script
#! /usr/bin/env bash

echo `date` > /home/s1463080/test_output.txt

It's there, so I don't know what's going on. I pasted the path from the log in as the root user (nomad is running as root) and it runs without a problem.

brandon...@levvel.io

Mar 1, 2016, 2:55:14 PM
to Nomad
Oh and yes it was cached in my browser. The above was run with nomad 0.3.

Alex Dadgar

Mar 1, 2016, 3:15:29 PM
to Brandon Dennis, Nomad
Hey Brandon,

So the reason you are seeing that is that Nomad builds a chroot. Inside that chroot, the script you wrote does not exist. The chroot does not contain every file of the host operating system. Can you just run the following: http://pastebin.com/YYs5UnXh

You can see if it had output by running: nomad fs cat <ALLOC-ID> alloc/logs/redis.stdout.0
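
To make the chroot point concrete, here is a rough host-side check. It assumes the task's root is the per-task directory under the client's allocation directory (the `<client dir>/<alloc id>/<task name>` layout visible in the earlier log), which may differ between versions; `exists_in_root` and `shebang_interpreter` are hypothetical helpers, not Nomad commands. Note that a script needs its interpreter inside the chroot too, since a missing interpreter also produces "no such file or directory".

```shell
#!/usr/bin/env bash
# exists_in_root ROOT PATH
# Succeed if the absolute PATH from the job file exists under ROOT.
# Caveat: absolute symlinks are resolved against the host, not ROOT.
exists_in_root() {
    local root="${1%/}" path="$2"
    [ -e "${root}${path}" ]
}

# shebang_interpreter FILE
# Print the interpreter named on a script's "#!" line (empty if there is none).
shebang_interpreter() {
    head -n 1 "$1" | sed -n 's/^#![[:space:]]*\([^[:space:]]*\).*/\1/p'
}

# Example usage (TASK_ROOT is a placeholder for the real task directory):
#   exists_in_root "$TASK_ROOT" /home/s1463080/test_script || echo "script not in chroot"
#   exists_in_root "$TASK_ROOT" "$(shebang_interpreter /home/s1463080/test_script)" \
#       || echo "interpreter not in chroot"
```

A command under /bin such as /bin/sleep ran fine in the earlier test, which is why it is a safer first thing to try than a script in a home directory.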

Brandon Dennis

Mar 1, 2016, 4:21:18 PM
to Alex Dadgar, Nomad
Lol. Of course. I feel silly. I'll try it out tomorrow when I have access again. But if it created the chroot, then I think it's probably going to work.

brandon...@levvel.io

Mar 2, 2016, 9:39:39 AM
to Nomad
It works! Thank you so much for the help. You guys rock!

Alex Dadgar

Mar 2, 2016, 12:43:40 PM
to Brandon Dennis, Nomad
Awesome! Hope you enjoy it!
