shared mmap() of anon memory, broken?

s...@cs.stanford.edu

Sep 18, 1997, 3:00:00 AM

Matthias Buelow <m...@altair.mayn.de> wrote:

> For one of my programs, I decided to share memory between parent and
> child processes by mmapping some address space (mapping /dev/zero for
> System V and using MAP_ANON[YMOUS] for BSD-likes).
> So far so good. Tried it on DEC Unix, AIX, Free- and OpenBSD, works
> like a charm, as expected and described in Stevens, ``Advanced Programming
> in the Unix Environment'' (pp. 467-470). On the Linux based systems I
> tried it on, it didn't work.

Shared anonymous mmap doesn't work on Linux. Period. This is a limitation of
the current memory management/page cache data structures, and, judging by the
conversations on linux.dev.kernel, is unlikely to change for many months to
come. As far as I know, the only available choices are SysV shared memory,
mmap-ing an actual file, or clone (threads, etc.)
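
A minimal sketch of the "mmap an actual file" workaround follows; it is not from the original posts. The helper name map_shared_tmpfile and the /tmp location are assumptions made for illustration. The idea is to create a scratch file, unlink it right away so nothing is left behind, size it with ftruncate(), and map it MAP_SHARED before fork():

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return `len` bytes of zero-filled memory backed by
 * an unlinked scratch file, or NULL on failure (errno is left set).
 * A mapping created before fork() is shared between parent and child. */
void *map_shared_tmpfile(size_t len)
{
    assert(len > 0);

    char path[] = "/tmp/shmapXXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd == -1)
        return NULL;
    unlink(path);                       /* the name goes away, the file stays open */

    void *addr = MAP_FAILED;
    if (ftruncate(fd, (off_t)len) == 0) /* extend the file; new bytes read as zero */
        addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    int saved = errno;                  /* close() must not hide the real error */
    close(fd);                          /* the mapping outlives the descriptor */
    errno = saved;

    return addr == MAP_FAILED ? NULL : addr;
}
```

The region is released with munmap(addr, len). Unlike a true anonymous mapping, dirty pages may be written back to the scratch file's filesystem.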

Matthias Buelow

Sep 18, 1997, 3:00:00 AM

'lo folks,

while hacking tonight, I somehow seemed to have disgruntled the Linux
kernel, since it refused to provide me with a working mmap(2). Oh how
I struggled and begged, to no avail, it kept ignoring me, simply shoving
EINVALs down my throat, like someone would spit ``rtfm'' into a luser's
face who is just asking for the 4th time about how to unpack a tar file.

Well, here's my problem:

For one of my programs, I decided to share memory between parent and
child processes by mmapping some address space (mapping /dev/zero for
System V and using MAP_ANON[YMOUS] for BSD-likes).
So far so good. Tried it on DEC Unix, AIX, Free- and OpenBSD, works
like a charm, as expected and described in Stevens, ``Advanced Programming
in the Unix Environment'' (pp. 467-470). On the Linux based systems I
tried it on, it didn't work.

I tried it on systems with kernel revisions 2.0.19 and 2.0.27. Since I
personally don't run a Linux system, I couldn't verify it with 2.1.55,
the newest development snapshot, but I had a look at 2.1.55 mm/mmap.c
and it didn't show any difference where I suspect the culprit.

What I was going to do is the following:

On Unix, mmap(2) can be used to map anonymous memory into the address space;
the pages are automatically zero-filled when they are first touched.
With the MAP_SHARED flag, mmap() is supposed to share the allocated space
with all related processes (parent - children), unlike MAP_PRIVATE, which
marks the pages as copy-on-write.

On vanilla System V systems such as Solaris or Irix, the procedure is
to open /dev/zero, mmap() it and close it immediately afterwards. The
mapping survives the close() and is equivalent to an anonymous mapping. This
basically (without error checks) looks like the following:

fd = open("/dev/zero", O_RDWR);
addr = mmap(0, size_of_area, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
close(fd);
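
For completeness, here is the same /dev/zero sequence with the error checks put back in. It is only a sketch: the helper name map_shared_dev_zero is an assumption, and it presumes a system that permits MAP_SHARED mappings of /dev/zero.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: return `len` bytes of zero-filled memory that stays
 * shared across fork(), or NULL on failure (errno is left set). */
void *map_shared_dev_zero(size_t len)
{
    assert(len > 0);

    int fd = open("/dev/zero", O_RDWR);
    if (fd == -1)
        return NULL;

    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    int saved = errno;                  /* keep mmap()'s errno across close() */
    close(fd);                          /* the mapping outlives the descriptor */
    errno = saved;

    return addr == MAP_FAILED ? NULL : addr;
}
```

A caller would invoke it before fork() and release the region with munmap(addr, len) when done.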

On 4.3+BSD, the method is a bit different: you don't have to use /dev/zero.
Instead you specify MAP_ANON or MAP_ANONYMOUS, which in the end does the
same thing, namely map zeroed pages into the address space:

addr = mmap(0, size_of_area, PROT_READ | PROT_WRITE, MAP_ANON | MAP_SHARED,
            -1, 0);

Since the mmap() manpage on Linux explicitly states that it supports
MAP_ANON, I tried it with the BSD version first. I also tried it the SysV
way. I made sure size_of_area was page-aligned. I kept trying for
several hours until I gave up. Whenever MAP_SHARED was specified with an
anonymous mapping (be it MAP_ANON or /dev/zero), Linux returned -1/EINVAL.
It worked only when I used MAP_PRIVATE, but that's not what I wanted, since
I want to have _shared_ memory.

After a while, I had a look in the kernel code, at mm/mmap.c (since I'm
not very familiar with Linux in particular, I don't know exactly if this
is the right place but I guess it is).

IMHO where it breaks is in the function do_mmap(), around line 193 on
2.0.27. There it says:

...
        } else if ((flags & MAP_TYPE) != MAP_PRIVATE)
                return -EINVAL;
...

This seems to be the point where my EINVAL comes from.
I infer that from the fact that it works with MAP_PRIVATE and doesn't
with MAP_SHARED.
This portion of the file looks the same on 2.1.55, so I think the problem
is also in this version (I couldn't test it).
However, this anomaly is extremely annoying, since it effectively disables
the sharing of anonymously mapped pages. Perhaps someone with more insight
into the Linux kernel can shed some light here and perhaps give some tips on
how I can work around it (I don't want to use System V shmem, which is an
entirely different nuisance and not as portable).

Anyway, thanks in advance for answers.

--
--token * Boycott Micro$oft, see http://www.vcnet.com/bms/ *


Matthias Buelow

Sep 19, 1997, 3:00:00 AM

In article <5vseg4$2el$1...@nntp.Stanford.EDU>, <s...@cs.stanford.edu> wrote:
>
>Shared anonymous mmap doesn't work on Linux. Period. This is a limitation of
>the current memory management/page cache data structures, and, judging by the
>conversations on linux.dev.kernel, is unlikely to change for many months to
>come. As far as I know, the only available choices are SysV shared memory,
>mmap-ing an actual file, or clone (threads, etc.)

*sigh* Why doesn't anyone state this in the manpage...

Elliot Lee

Sep 19, 1997, 3:00:00 AM

On 18 Sep 1997 22:27:23 GMT, Matthias Buelow <m...@altair.mayn.de> wrote:

>For one of my programs, I decided to share memory between parent and
>child processes by mmapping some address space (mapping /dev/zero for
>System V and using MAP_ANON[YMOUS] for BSD-likes).

[snip]


>Only when I used MAP_PRIVATE, it worked, but that's not what I wanted, since
>I want to have _shared_ memory.

If you want shared memory, then use shared memory :) You know, the
standard SysV IPC shared memory... Linux doesn't support sharing of
anonymous maps (yet). Patches for this are probably being welcomed by
linux-kernel.
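
A minimal sketch of the SysV route, for comparison with the mmap() calls quoted above. It is not from the original post, and the helper name map_shared_sysv is hypothetical. It creates a private segment, attaches it, and marks it for removal so that it disappears once the last process detaches:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: return `len` bytes of zero-filled SysV shared
 * memory, or NULL on failure. Children created with fork() inherit the
 * attachment and see the same pages as the parent. */
void *map_shared_sysv(size_t len)
{
    assert(len > 0);

    int id = shmget(IPC_PRIVATE, len, IPC_CREAT | 0600);
    if (id == -1)
        return NULL;

    void *addr = shmat(id, NULL, 0);

    /* Mark the segment for removal now; it is actually destroyed only
     * after the last attached process detaches or exits, so nothing is
     * leaked if the program crashes later. */
    shmctl(id, IPC_RMID, NULL);

    return addr == (void *)-1 ? NULL : addr;
}
```

The region is released with shmdt(addr) rather than munmap().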

>entirely different nuisance and not as portable).

It's not that much different as far as operation goes; I can't comment on
portability, though.

-- Elliot - http://www.redhat.com/
What's nice about GUI is that you see what you manipulate.
What's bad about GUI is that you can only manipulate what you see.

| http://www.cauce.org/ | http://www.linuxnet.org/ |
