Does using shared memory in Linux, obtained in each process through shmget, carry a
significant performance penalty compared to using in-process memory shared by
several threads?
Thanks in advance.
Once the shared memory is set up, the remaining penalty is extremely
minor. The difference is that when switching between tasks that share
a vm, less work is done than when switching between tasks that do not.
But while the tasks are running, it's precisely the same.
DS
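A minimal sketch of that point (my own illustration, not code from the thread): once the
shmget/shmat setup has been paid for, the segment sits behind an ordinary pointer, so each
access costs the same as an access to a malloc'd buffer shared between threads.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    size_t size = 1 << 20;                   /* 1 MiB, arbitrary */
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id == -1) { perror("shmget"); return 1; }

    char *p = shmat(id, NULL, 0);            /* setup cost: system calls, page tables */
    if (p == (void *)-1) { perror("shmat"); return 1; }

    memset(p, 0, size);                      /* from here on: plain loads and stores, */
    p[42] = 'x';                             /* no kernel involvement per access      */

    shmdt(p);
    shmctl(id, IPC_RMID, NULL);              /* mark the segment for removal */
    return 0;
}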
Maybe yes and maybe no :-). First, this depends on what you are looking
at: memory allocation, or use of already allocated memory.

shmget is a system call, and the inherent overhead of a system call is
larger than the inherent overhead of a normal subroutine invocation.
Because of this, malloc is a library routine which uses system calls to
allocate 'large' chunks of virtual memory from the kernel; these are then
subdivided to satisfy the actual allocation requests of a program and
henceforth managed in userspace. But malloc imposes an overhead of its
own, and depending on how the allocator works and how the program has
been using it in the past, a single call to malloc isn't necessarily
always faster than any single system call. The performance of a
memory-allocating system call will vary for the same reason: the memory
allocator in the kernel isn't magic, and the problems it needs to solve
are identical, or at least very similar, to the problems which need to
be solved by malloc.
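To make that comparison concrete, here is a rough sketch of my own (only a sketch; the
numbers depend heavily on the allocator's state, which is exactly the point): time one
malloc call against one shmget+shmat pair.

#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/ipc.h>
#include <sys/shm.h>

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    size_t size = 4096;                  /* error checks omitted for brevity */

    double t0 = now();
    void *m = malloc(size);              /* may or may not enter the kernel */
    double t1 = now();

    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    void *s = shmat(id, NULL, 0);        /* always at least two system calls */
    double t2 = now();

    printf("malloc: %.9fs  shmget+shmat: %.9fs\n", t1 - t0, t2 - t1);

    free(m);
    shmdt(s);
    shmctl(id, IPC_RMID, NULL);
    return 0;
}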
Theoretically, the 'pedigree' of an area of memory doesn't matter
because it is always accessed by the CPU in the same way, so using
already allocated shared memory should not be slower than using already
allocated non-shared memory.

But this is again only partially true, because DRAM accesses themselves
are so slow that the CPU tries very hard to avoid them altogether and
uses a smaller amount of faster memory as cache (especially regarding
register-crippled 1970s CPU designs meanwhile emulated by a combination
of hardware and microcode, also known as x86-compatible CPUs; this is
the oversimplification of the 20th century :->). Because there is
usually a lot less cache memory than DRAM in a computer, programs which
reuse memory they have recently used will generally run faster than
programs whose memory accesses have little locality of reference and ...

... I could continue this for a while, but I fear the actual answer
is just that your question makes little sense :-).
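For what it's worth, the locality point can be made concrete with another small sketch
(mine, not the poster's): sweep a buffer that is larger than the caches once
sequentially and once with a page-sized stride. The total number of accesses is
identical; only the access pattern differs.

#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define N (64 * 1024 * 1024)        /* 64 MiB, larger than typical CPU caches */

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    unsigned char *buf = malloc(N);
    long sum = 0;
    if (!buf) return 1;
    memset(buf, 1, N);              /* fault all pages in before timing */

    double t0 = now();
    for (size_t i = 0; i < N; i++)  /* sequential: cache and prefetcher friendly */
        sum += buf[i];
    double t1 = now();

    for (size_t s = 0; s < 4096; s++)        /* strided: one byte per page,      */
        for (size_t i = s; i < N; i += 4096) /* nearly every access misses cache */
            sum += buf[i];
    double t2 = now();

    /* printing sum keeps the compiler from optimizing the loops away */
    printf("sequential %.3fs  strided %.3fs  (sum %ld)\n", t1 - t0, t2 - t1, sum);
    free(buf);
    return 0;
}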