kernel.shmmax and docker


Flavio Castelli

May 19, 2014, 3:15:59 PM
to docke...@googlegroups.com
Starting with the latest version of docker (0.11.0), my containers no
longer inherit the kernel.shmmax value of the host system. They all have
the following value: 33554432.

It looks like this value is not handled by cgroups; I thought it was set
by apparmor. Hence I disabled apparmor by adding 'apparmor=0 security=""'
to my kernel parameters. Unfortunately that didn't help (even though
apparmor is really turned off, according to the output of dmesg).

I didn't look into SELinux since it's not enabled on my system.

I tried running my containers with this lxc configuration:
"lxc.aa_profile=unconfined", but it didn't help, maybe because I'm using
libcontainer as a backend rather than lxc.

The only workaround I found is to run the container in privileged mode,
and then set the value of shmmax straight from inside of it. However
that doesn't look like an ideal solution to me...

I'm currently running my containers on openSUSE 13.1 x86_64 with docker
0.11.1.

Thanks in advance
Flavio

Jérôme Petazzoni

May 20, 2014, 2:09:07 AM
to Flavio Castelli, docker-dev
Hi Flavio,

I believe that this is not a feature of Docker, but of the kernel.
Did you just upgrade Docker to 0.11, or did you also upgrade your kernel?

 




--
You received this message because you are subscribed to the Google Groups "docker-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to docker-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.




Flavio Castelli

May 20, 2014, 9:29:32 AM
to docke...@googlegroups.com
On 05/20/2014 08:09 AM, Jérôme Petazzoni wrote:
> I believe that this is not a feature of Docker, but of the kernel.
> Did you just upgrade Docker to 0.11, or did you also upgrade your kernel?

I'm really confused: I'm no longer able to reproduce a configuration
that lets the container see the same value as the host.

I even tried installing an older version of docker on a fresh installation
of openSUSE 13.1 (no updates applied, hence the stock kernel and libs).


Do you have any idea about how to remove this limitation without having
to run the container in privileged mode?

Thanks
Flavio

Jason Stoops

May 22, 2014, 7:53:36 PM
to docke...@googlegroups.com, Flavio Castelli
Hi Jerome,

I am also seeing this issue using Ubuntu 14.04 Server (3.13.0-24-generic) and Docker 0.11.1.

I'm guessing it's related to Docker issue 5703 (the discussion of /proc and /sys now being read-only)? https://github.com/dotcloud/docker/issues/5703

The value for kernel.shmmax in the container is not inherited from the host, but is always 33554432.
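
A quick way to confirm the symptom is to read the value on the host and in a throwaway container, then compare the two. The sketch below is illustrative only: the helper name describe_shmmax and the busybox image in the usage comment are assumptions, not something taken from this thread.

```shell
#!/bin/sh
# 32 MiB: the value the containers in this thread report.
DEFAULT_SHMMAX=33554432

# describe_shmmax HOST_VALUE CONTAINER_VALUE
# (hypothetical helper) Prints how the container value relates to the host's.
# Note: if the host itself uses 33554432, "inherited" and "default"
# cannot be told apart and "inherited" is printed.
describe_shmmax() {
    host=$1
    container=$2
    if [ "$container" = "$host" ]; then
        echo "inherited"
    elif [ "$container" = "$DEFAULT_SHMMAX" ]; then
        echo "default"
    else
        echo "custom"
    fi
}

# Usage (needs Docker; the busybox image is an assumption):
#   host=$(cat /proc/sys/kernel/shmmax)
#   container=$(docker run --rm busybox cat /proc/sys/kernel/shmmax)
#   describe_shmmax "$host" "$container"
```

On the systems described in this thread, the expected answer would be "default".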

Thanks,
Jason



Jérôme Petazzoni

May 22, 2014, 9:55:45 PM
to Jason Stoops, docker-dev, Flavio Castelli
Right; the kernel just copies the default values (32M in older kernels, 4G in some very recent kernels).
Someone proposed a patch so that namespaces would inherit the value of their parent namespace:

Meanwhile, this might require a privileged container. Not ideal at all, I realize; but let's keep in mind that SHM is a very efficient way to DoS a machine (since it's not subject to the OOM killer!), so it's not completely ridiculous to demand elevated privileges before granting high SHM values to a container.

It could also be set by dockerinit and exposed through a driver option.

Upendra Sharma

May 23, 2014, 4:01:10 PM
to docke...@googlegroups.com, Jason Stoops, Flavio Castelli
Is there an easy way to fix this, for example by changing some configuration parameters?
I am using Ubuntu 12.04 and see this from inside the container:

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 32768
max total shared memory (kbytes) = 8388608
min seg size (bytes) = 1

Outside the container, I see this:
------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 2097152
max total shared memory (kbytes) = 8388608
min seg size (bytes) = 1
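
The two listings differ only in the maximum segment size. That figure is given in kilobytes, while kernel.shmmax is in bytes, so 32768 kB is the same 33554432-byte value reported earlier in the thread. A small sketch of the conversion, assuming the "kbytes" here are units of 1024 bytes (the function name is made up for illustration):

```shell
#!/bin/sh
# kib_to_bytes KIB
# Converts a kilobyte figure, as shown in the listings above, to bytes.
kib_to_bytes() {
    echo $(( $1 * 1024 ))
}

kib_to_bytes 32768      # inside the container: prints 33554432
kib_to_bytes 2097152    # outside the container: prints 2147483648
```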

thanks,
-Upendra

Upendra Sharma

May 23, 2014, 4:02:37 PM
to docke...@googlegroups.com, Jason Stoops, Flavio Castelli
I forgot to ask: which recent kernel has the default of 4G?
Is it available in the latest Ubuntu release?

Jérôme Petazzoni

May 23, 2014, 4:05:31 PM
to Upendra Sharma, docker-dev, Jason Stoops, Flavio Castelli
I'm not sure about the kernel version which has the 4G; I saw it was planned...

For now, you might want to work around this with nsenter? Nsenter would let you start a process in the container, but without associated restrictions (so it could tweak up the parameters). Short term hack, but if it helps...

Upendra Sharma

May 23, 2014, 5:42:57 PM
to docke...@googlegroups.com, Upendra Sharma, Jason Stoops, Flavio Castelli
Thanks Jérôme

I have managed to compile and install nsenter,
but it seems to me that it's a utility for entering a container that is not running sshd (or maybe I am missing something).
I entered the container using:
PID=$(docker inspect --format '{{ .State.Pid }}' bce24ad2ebb8)
nsenter --target $PID --mount --uts --ipc --net --pid bash

I was trying to set kernel.shmmax. I ran this, but it did not succeed:
sysctl -w kernel.shmmax=1073741824
error: "Read-only file system" setting key "kernel.shmmax"

Any ideas on how to achieve that?

Thanks,
-Upendra

Jérôme Petazzoni

May 23, 2014, 5:47:28 PM
to Upendra Sharma, docker-dev, Jason Stoops, Flavio Castelli
Sure! At that point, you can remount /proc read-write, change the value, and remount /proc again read-only.
Once again -- that's clunky but it should get you going while we figure a better way!

Jason Stoops

May 23, 2014, 7:48:08 PM
to docke...@googlegroups.com, Upendra Sharma, Jason Stoops, Flavio Castelli
Hi Jerome,

The nsenter strategy worked for me! Thanks for the quick advice!

Jason

Upendra Sharma

May 23, 2014, 10:40:05 PM
to docke...@googlegroups.com, Upendra Sharma, Jason Stoops, Flavio Castelli
Hi Jérôme and Jason,

Could you please share how you remounted the container's proc fs read-write?

thanks,
-upendra

Jérôme Petazzoni

May 27, 2014, 12:50:02 PM
to Upendra Sharma, docker-dev, Jason Stoops, Flavio Castelli
Sure!

The idea is to use nsenter (as explained here: http://jpetazzo.github.io/2014/03/23/lxc-attach-nsinit-nsenter-docker-0-9/) to "break into" the container.
This will create a new shell process inside the container, but not bounded by the same capability set.
In other words, that shell will be able to mount/remount filesystems.

Does that help?

Upendra Sharma

May 27, 2014, 5:44:44 PM
to docke...@googlegroups.com, Upendra Sharma, Jason Stoops, Flavio Castelli
Thanks Jerome,

I am not sure what I was doing wrong earlier, but remounting was not working for me. Today I disabled AppArmor and tried again, and it seems to have worked. I am not sure whether disabling AppArmor made the difference or whether I was doing something wrong when remounting the proc fs. Anyway, here is what worked for me:
1) I disabled AppArmor (not sure if this is required or not).
2) PID=$(docker inspect --format '{{ .State.Pid }}' bce24ad2ebb8)
3) nsenter --target $PID --mount --uts --ipc --net --pid bash
4) mount -o remount,rw -t proc /proc /proc
5) echo 1073741824 > /proc/sys/kernel/shmmax
6) mount -o remount,ro -t proc /proc /proc
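
Steps 2 to 6 can be collected into one script. This is a rough sketch under assumptions, not a tested tool: the function names are invented here, it assumes the container has a sh binary, it must run as root on the host, and as noted in step 1, AppArmor may still block the remount.

```shell
#!/bin/sh
# inner_cmd BYTES
# Prints steps 4-6 as a single command line. The final remount runs even
# if the write fails, so /proc is not left read-write.
inner_cmd() {
    case $1 in
        ''|*[!0-9]*) echo "shmmax must be a number of bytes" >&2; return 1 ;;
    esac
    echo "mount -o remount,rw -t proc /proc /proc && echo $1 > /proc/sys/kernel/shmmax; mount -o remount,ro -t proc /proc /proc"
}

# set_container_shmmax CONTAINER BYTES
# Steps 2-3: look up the container's PID, then run the commands above
# inside its namespaces with nsenter.
set_container_shmmax() {
    pid=$(docker inspect --format '{{ .State.Pid }}' "$1") || return 1
    cmd=$(inner_cmd "$2") || return 1
    nsenter --target "$pid" --mount --uts --ipc --net --pid -- sh -c "$cmd"
}

# Usage, with the container ID and value from the steps above:
#   set_container_shmmax bce24ad2ebb8 1073741824
```

Like the manual steps, this does not survive a container restart.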

thanks,
-Upendra

Michael Crosby

May 27, 2014, 5:45:53 PM
to Upendra Sharma, docke...@googlegroups.com, Jason Stoops, Flavio Castelli
Apparmor does prevent remounts for proc

-- 
Michael Crosby
@crosbymichael

Jérôme Petazzoni

May 28, 2014, 12:52:29 PM
to Michael Crosby, Upendra Sharma, docker-dev, Jason Stoops, Flavio Castelli
Yes, but I thought the process started by nsenter wouldn't be confined by AppArmor :/

Kundu Gangadharan

Jan 27, 2015, 7:01:40 PM
to docke...@googlegroups.com

I am running into the same problem. Creating small shared memory segments works fine, but doing anything useful with shared memory segments fails miserably within Docker. The issue points to the shmmax limit imposed in the container.

# sysctl -a | grep shm
kernel.shm_rmid_forced = 0
kernel.shmall = 2097152
kernel.shmmax = 33554432
kernel.shmmni = 4096
vm.hugetlb_shm_group = 0

I tried the workaround suggested at the bottom of the thread (tweaking /proc/sys/kernel/shmmax in the container), but I am not able to mount /proc in read/write mode.
Did anyone actually figure out a way to make shared memory segments work within Docker?

For example:
27ab3defe455:/ # echo 12884901888 >/proc/sys/kernel/shmmax
bash: /proc/sys/kernel/shmmax: Read-only file system
27ab3defe455:/ # mount -o remount,rw -t proc /proc /proc
27ab3defe455:/ # echo 12884901888 >/proc/sys/kernel/shmmax
bash: /proc/sys/kernel/shmmax: Read-only file system
27ab3defe455:/ # mount
/dev/mapper/docker-8:1-1213459-27ab3defe45507a40f28f0ee5ca43b748551c13f16e4a2dbc558ceedb0fbab4a on / type ext4 (rw,relatime,discard,stripe=16,data=ordered)
proc on /proc type proc (rw,relatime)
tmpfs on /dev type tmpfs (rw,nosuid,mode=755)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime)
/dev/sda1 on /etc/resolv.conf type ext3 (rw,relatime,data=ordered)
/dev/sda1 on /etc/hostname type ext3 (rw,relatime,data=ordered)
/dev/sda1 on /etc/hosts type ext3 (rw,relatime,data=ordered)
/dev/sdb1 on /gdat type xfs (rw,noatime,attr2,inode64,logbsize=256k,sunit=512,swidth=1024,noquota)
/dev/sdb1 on /lfnet type xfs (rw,noatime,attr2,inode64,logbsize=256k,sunit=512,swidth=1024,noquota)
devpts on /dev/console type devpts (rw,relatime,gid=5,mode=620,ptmxmode=000)
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sysrq-trigger type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/irq type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/bus type proc (ro,nosuid,nodev,noexec,relatime)
tmpfs on /proc/kcore type tmpfs (rw,nosuid,mode=755)
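
One thing the mount output above suggests: after the remount, /proc itself is listed as rw, but /proc/sys appears as a separate proc mount that is still ro, and that is the mount the shmmax file sits under. A small sketch for picking a mount point's options out of this kind of listing (the function name is made up; it only parses text in the format shown above):

```shell
#!/bin/sh
# mount_opts_for MOUNTPOINT
# Reads mount-style lines ("dev on /path type fs (opts)") on stdin and
# prints the options of the last entry for MOUNTPOINT, without parentheses.
mount_opts_for() {
    awk -v mp="$1" '
        $2 == "on" && $3 == mp { opts = $NF }
        END { if (opts != "") { gsub(/[()]/, "", opts); print opts } }
    '
}

# Usage inside the container:
#   mount | mount_opts_for /proc/sys
```

Run against the listing above, this would report the options of the /proc/sys entry rather than those of /proc.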

Running docker version 1.4.1 on SuSE Linux Enterprise ( SLES 12 ).

Thanks
Ganga

Nate Aune

Feb 2, 2015, 1:19:37 PM
to docke...@googlegroups.com
I'm having the same problem trying to run software that requires the shared memory to be 128MB instead of 32MB. This is the command I'm trying to run:
echo 134217728 >/proc/sys/kernel/shmmax
Which results in:
bash: /proc/sys/kernel/shmmax: Read-only file system

I tried the suggested workaround to use nsenter, but boot2docker doesn't have GCC, so I wasn't able to compile nsenter from source.

Any other ideas for how to get past this issue? Should I dump boot2docker and just launch a Ubuntu 14.04 VM and try it there instead?

thanks,
Nate

Nate Aune

Feb 2, 2015, 3:34:50 PM
to docke...@googlegroups.com
I found that you can use --privileged=true with the docker run
command, and then it's possible to increase the shared memory limit.
But this is only temporary: once you restart the container, you
have to run the command again.
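
Since the setting is lost on restart, one option is to apply it from a small wrapper that runs as the container's entry point each time it starts. This is only a sketch under assumptions: the script name and the SHMMAX variable are invented here, and the container still has to be started with --privileged=true for the write to succeed.

```shell
#!/bin/sh
# set-shmmax.sh (hypothetical name): raise kernel.shmmax, then run the
# container's real command.

# apply_shmmax BYTES [FILE]
# Writes BYTES to FILE (default: /proc/sys/kernel/shmmax). The optional
# FILE argument exists only so the function can be tried on a scratch file.
apply_shmmax() {
    echo "$1" > "${2:-/proc/sys/kernel/shmmax}"
}

# As an entry point, the script body would be something like:
#   apply_shmmax "${SHMMAX:-134217728}" || echo "could not set shmmax" >&2
#   exec "$@"
#
# and the container would be started along these lines (placeholders in <>):
#   docker run --privileged=true --entrypoint /set-shmmax.sh <image> <command>
```

The 134217728 default matches the 128MB value from the earlier message.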
Nate Aune - nate...@gmail.com
http://www.nateaune.com
+1 (617) 517-4953