
malloc() vs mmap()


bolta...@boltar.world

Oct 21, 2011, 9:38:43 AM
When just allocating memory (i.e. using MAP_ANON), what is the difference
between mmap() and malloc()? The man page says mmap allocates in the virtual
address space of the program, so does that mean it uses the heap like malloc,
or something else?

B2003

Måns Rullgård

Oct 21, 2011, 10:06:31 AM
mmap() is a (more or less) direct system call, requesting pages of
virtual memory from the kernel. malloc() allocates memory from the
kernel using some unspecified method (in practice a combination of brk()
and mmap()), which it then slices up and serves to the caller. A call
to free() might put the block into an internal pool for recycling by
malloc(), or it can be given back to the kernel (which can then free up
the underlying physical pages).

Unless you have specific reasons for doing otherwise, you should use
malloc() to allocate memory in your application.

--
Måns Rullgård
ma...@mansr.com

Rainer Weikusat

Oct 21, 2011, 10:11:55 AM
bolta...@boltar.world writes:
> When just allocating memory (ie using MAP_ANON) what is the difference
> between mmap and malloc?

'malloc' is the main 'memory allocation interface' of the 'heap
management' code in the C library. This heap management code, in turn,
uses available system facilities ((s)brk and mmap) to get memory
(address space, actually) from the kernel and uses this in line with
some 'memory management strategy' to satisfy memory allocation
requests from applications (typically, this will be something
hideously complicated to prevent people from coming up with simple
programs where the allocator exhibits spectacularly bad performance).

'mmap' is a system call which requests that the kernel find a
contiguous, unused region in the address space of the application
large enough to map a certain number of memory pages into it, create the
necessary virtual memory management structures to make accesses to this
area 'legal' (in the sense that they won't result in a segfault), and
return the start address of this area as the result of the system call.

bolta...@boltar.world

Oct 21, 2011, 10:13:51 AM
On Fri, 21 Oct 2011 15:06:31 +0100
Måns Rullgård <ma...@mansr.com> wrote:
>mmap() is a (more or less) direct system call, requesting pages of
>virtual memory from the kernel. malloc() allocates memory from the

That would explain why it seems to be faster than malloc.

>Unless you have specific reasons for doing otherwise, you should use
>malloc() to allocate memory in your application.

I am trying to find a way to speed up memory allocation in my program, even
if that means wasting some bytes by allocating a page at a time. That's
why I was considering mmap, but I wondered if I was missing some gotcha that
meant it wouldn't be suitable for this.

B2003

Måns Rullgård

Oct 21, 2011, 10:31:24 AM
bolta...@boltar.world writes:

> On Fri, 21 Oct 2011 15:06:31 +0100
> Måns Rullgård <ma...@mansr.com> wrote:
>>mmap() is a (more or less) direct system call, requesting pages of
>>virtual memory from the kernel. malloc() allocates memory from the
>
> That would explain why it seems to be faster than malloc.

Calling mmap() directly would only be faster than malloc() in cases
where malloc() would have to call mmap() itself. If the request can be
served without a syscall, malloc() is likely to be faster.

>>Unless you have specific reasons for doing otherwise, you should use
>>malloc() to allocate memory in your application.
>
> I am trying to find a way to speed up memory allocation in my program even
> if that means wasting some bytes by allocating a page at a time so thats
> why I was considering mmap but I wondered if I was missing some gotcha that
> meant it wouldn't be suitable for this.

If malloc() is using any appreciable amount of time in your program, you
should rethink your design.

--
Måns Rullgård
ma...@mansr.com

Eric Sosman

Oct 21, 2011, 10:51:36 AM
On 10/21/2011 10:13 AM, bolta...@boltar.world wrote:
> On Fri, 21 Oct 2011 15:06:31 +0100
Måns Rullgård <ma...@mansr.com> wrote:
>> mmap() is a (more or less) direct system call, requesting pages of
>> virtual memory from the kernel. malloc() allocates memory from the
>
> That would explain why it seems to be faster than malloc.

Your experience differs from mine. Perhaps your allocation
patterns are unusual.

Observe that malloc() can do all its work in the context of your
process, unless it needs to obtain more "bulk" memory from the O/S.
mmap(), on the other hand, involves a context switch into kernel-land.
Once there, the parameters must be validated (do you in fact have
access rights to any addresses that you mention, are you exceeding
any quota limitations, is the system as a whole short on memory so
the request must steal from other processes, ...). Then if all is OK,
the MMU hardware must be adjusted to reflect the new mapping, and the
kernel must update its own internal bookkeeping (very much as malloc()
does, though the granularity is probably coarser), and finally there's
another context switch to get control back to your process again.

Unless there's something very strange going on, this is probably
a lot more "heavy-weight" than having malloc() sling a few pointers
around inside the process' context.

>> Unless you have specific reasons for doing otherwise, you should use
>> malloc() to allocate memory in your application.
>
> I am trying to find a way to speed up memory allocation in my program even
> if that means wasting some bytes by allocating a page at a time so thats
> why I was considering mmap but I wondered if I was missing some gotcha that
> meant it wouldn't be suitable for this.

The first question to ask (maybe even the zero'th) is: "What
evidence exists to indicate that memory management is a significant
bottleneck in my program?" Measure your program's performance, and
measure memory management's contribution to whatever problems you
see. If you haven't measured it, then (1) you don't actually know
whether you have a problem, and (2) if you make changes you won't
know whether they helped or harmed.

If measurements do in fact show memory management as a problem,
I'd recommend studying the program's usage patterns. Gather data
about the sizes of allocations, and if possible about their lifetimes.
You may find that you're thrashing the same few allocations back and
forth between malloc() and free(), or that you're calling realloc()
to grow the same region one byte at a time a million times in a row.
If you find patterns of this kind, it may pay to make changes to the
way your program deals with memory. For example, if you find that you
always allocate a thousand 56-byte structs, use them, and then free
them all, it may make sense to allocate one 56000-byte array of structs
in one malloc() and release them all at once in one free(). Other
strategies may occur to you, once you know more about the usage
patterns.

If the program's usage patterns are not amenable to change, then
and only then should you consider customized allocators. Which, of
course, you will design and implement in light of what you've learned
about the program's usage patterns. And which, of course, you will
measure in the same way you measured the original code, to find out
whether you have or have not improved matters.

--
Eric Sosman
eso...@ieee-dot-org.invalid

Xavier Roche

Oct 21, 2011, 11:04:40 AM
On 10/21/2011 04:51 PM, Eric Sosman wrote:
>> That would explain why it seems to be faster than malloc.
> Your experience differs from mine. Perhaps your allocation
> patterns are unusual.
>
> Unless there's something very strange going on, this is probably
> a lot more "heavy-weight" than having malloc() sling a few pointers
> around inside the process' context.

Besides, malloc() returns a potentially dirty memory region, which is
good, because it is fast.

mmap(), on the other hand, MUST return a "clean" memory region, which
means a zero-filled region (and zeroing can be costly).

Also, the returned mmap'ed region may, depending on the operating
system, be "lazy", which means an additional cost the first time you
touch bytes within the allocated region, when real (mapped) physical
memory is actually allocated.

Rainer Weikusat

Oct 21, 2011, 11:38:23 AM
bolta...@boltar.world writes:
> On Fri, 21 Oct 2011 15:06:31 +0100

[...]

>>Unless you have specific reasons for doing otherwise, you should use
>>malloc() to allocate memory in your application.
>
> I am trying to find a way to speed up memory allocation in my program even
> if that means wasting some bytes by allocating a page at a time so thats
> why I was considering mmap but I wondered if I was missing some gotcha that
> meant it wouldn't be suitable for this.

The usual way to achieve this would be to use a radically simpler
scheme than the three-headed wolpertinger malloc tries to be, based on
the knowledge that 'an efficient solution to the general memory
management problem' (using only block sizes as 'user-provided
information') isn't really needed (and even if it is, an object-caching
allocator with different caches for different types of objects
would be a suitable replacement).

E.g., programs I write are usually supposed to keep running for an
indefinite amount of time. These programs need 'objects' of various
types, and I usually don't want to "reuse" the corresponding memory
areas for 'objects of other types'. This means that I usually allocate
'a large block of memory' (typically using mmap), allocate objects
from that until it is consumed, and keep freed objects on
type-specific free lists. This means that in the common case, allocation
and deallocation are simple, constant-time operations (remove object
from free list, add object to free list) which cannot fail.

Rainer Weikusat

Oct 21, 2011, 12:13:26 PM
Eric Sosman <eso...@ieee-dot-org.invalid> writes:

[...]

>> I am trying to find a way to speed up memory allocation in my program even
>> if that means wasting some bytes by allocating a page at a time so thats
>> why I was considering mmap but I wondered if I was missing some gotcha that
>> meant it wouldn't be suitable for this.
>
> The first question to ask (maybe even the zero'th) is: "What
> evidence exists to indicate that memory management is a significant
> bottleneck in my program?"

I'd like to propose another 'zeroth question' namely, 'What evidence
exists that the memory management algorithm which happens to be used
by the C library I happen to use will be suitable for my specific
problem?' and the answer to that is actually well known: 'Ehmm ... no
evidence, since nobody really understands the dynamic behaviour of
block size based general purpose allocators. If it fails
catastrophically, you'll know it'. And that's usually not an option
for software supposed to run unattended for an indefinite period of
time.

'malloc' isn't something which fell from the sky as a gift of the gods
but a library interface to 'a simple, general purpose memory
allocation package' introduced in UNIX V7. A large number of wildly
differing implementations exists, and it is common for every 'large
project' I've encountered so far to become thoroughly fed up with the
malloc interface at some point in time and design a custom one. IOW,
neither is there something like 'the general purpose memory management
algorithm', nor is there much reason to assume that this specific
interface to 'a memory management facility' was a particularly good
idea, meaning the concept is a complete failure: Nobody really knows
how to implement this facility. Nobody really wants to use it,
either.

"If you're in a hole, stop digging" ...

Eric Sosman

Oct 21, 2011, 10:08:56 PM
On 10/21/2011 12:13 PM, Rainer Weikusat wrote:
> Eric Sosman<eso...@ieee-dot-org.invalid> writes:
>
> [...]
>
>>> I am trying to find a way to speed up memory allocation in my program even
>>> if that means wasting some bytes by allocating a page at a time so thats
>>> why I was considering mmap but I wondered if I was missing some gotcha that
>>> meant it wouldn't be suitable for this.
>>
>> The first question to ask (maybe even the zero'th) is: "What
>> evidence exists to indicate that memory management is a significant
>> bottleneck in my program?"
>
> I'd like to propose another 'zeroth question' namely, 'What evidence
> exists that the memory management algorithm which happens to be used
> by the C library I happen to use will be suitable for my specific
> problem?' [...]

Pursue that line a little further, and you'll reject all software
that you didn't write yourself. Not just malloc(), but also the C
compiler, the linker, all the rest of the run-time library, the
operating system, and the assembler. You're back to entering programs
by flipping front-panel switches (if you can find a computer that still
has them), and even then you've got to ask "What evidence exists that
this machine's transistors are fabricated suitably for my problem?"

When I was a child I wondered what would happen if you lined up
two mirrors facing each other: How many images of images of images of
images of my own silly face would there be? You may be well on the
way to discovering an answer.

--
Eric Sosman
eso...@ieee-dot-org.invalid

Rainer Weikusat

Oct 22, 2011, 10:27:34 AM
Eric Sosman <eso...@ieee-dot-org.invalid> writes:
>>> The first question to ask (maybe even the zero'th) is: "What
>>> evidence exists to indicate that memory management is a significant
>>> bottleneck in my program?"
>>
>> I'd like to propose another 'zeroth question' namely, 'What evidence
>> exists that the memory management algorithm which happens to be used
>> by the C library I happen to use will be suitable for my specific
>> problem?' [...]
>
> Pursue that line a little further, and you'll reject all software
> that you didn't write yourself.

Every specific statement can be generalized until it becomes absurd.
But that's not an argument against the original statement, just
against this procedure.


loozadroog

Oct 23, 2011, 2:54:02 AM
One major pragmatic difference is that malloc will return memory of
any size (within the range of a size_t), but mmap/MAP_ANON only grants
multiples of the page size.

So you can't do this:
mmap(NULL, sizeof(mystructure), PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANON, -1, 0);

Ian Collins

Oct 23, 2011, 3:42:54 AM
Well yes, you can. You just end up wasting memory.

--
Ian Collins

Rainer Weikusat

Oct 23, 2011, 10:56:57 AM
Memory efficiency isn't as paramount a concern as it used to be. If
it were, nobody would be using C++ or Java :->. And it isn't necessary
to allocate a page of memory and just use a small area at the
beginning. Future 'memory needs' of the program can be satisfied from
still-unused parts of the allocated page, e.g., by using a moving
pointer to the 'still unused area' which is suitably incremented every
time a new memory block is requested. This can be combined with a
simple facility for memory reuse at the object level, as opposed to
'at the level of typeless areas of certain sizes' (IMO, the type of
an object is a much more important piece of 'usage information' than
its size), to make an allocator which is sufficiently general to
work for programs (or cases) where 'allocation requests' are usually
(or mostly) for rather small objects (compared to the page size) and
the exact number of objects that will be needed can't be (or can't
easily be) predicted at compile time.

For a single memory allocation, mmap is pretty much guaranteed to be
slower than malloc because of the system call alone.

Barry Margolin

Oct 23, 2011, 1:27:23 PM
And how do you know malloc() isn't wasting memory as well? It would be
valid, although quite inferior, for malloc() to simply call mmap() every
time.

Some malloc() implementations are better than others at optimizing
memory use and avoiding fragmentation.

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

Rainer Weikusat

Oct 23, 2011, 3:34:31 PM
Barry Margolin <bar...@alum.mit.edu> writes:

[...]

> Some malloc() implementations are better than others at optimizing
> memory use and avoiding fragmentation.

Leading remark: The text below has intentionally been simplified as
much as possible so that the basic idea is more readily understood.

I would go so far as to state that malloc creates fragmentation. Somewhat
simplified, 'fragmentation' is when you have lots of free memory you
can't use because it has been broken up into physically separated chunks
of inadequate sizes. Since malloc dedicates contiguous memory areas to
being used for allocation requests of a specific size, and the usage
patterns of objects whose sizes happen to be identical vary (a
structure of length 64 will usually live a lot longer than a 64-byte
character string which happened to be needed for some intermediate
'calculation'), each of these dedicated regions will end up holding
some long-lived objects and some 'free regions' which are lost unless
they happen to be large enough for future allocation requests of the
program. This mixing of different kinds of objects causes the problem,
and 'some malloc implementations' may be better at working around this
inherent deficiency than others.

The person who designed the interface invented something new and thus
didn't know about the problem yet, and in any case, he was targeting
the UNIX(*) system and applications of his time: an interactive
research OS whose workload was comprised of short-lived processes run
by human users.

Ian Collins

Oct 23, 2011, 3:39:58 PM
On 10/24/11 06:27 AM, Barry Margolin wrote:
> In article<9ghuru...@mid.individual.net>,
> Ian Collins<ian-...@hotmail.com> wrote:
>
>> On 10/23/11 07:54 PM, loozadroog wrote:
>>> On Oct 21, 8:38 am, boltar2...@boltar.world wrote:
>>>> When just allocating memory (ie using MAP_ANON) what is the difference
>>>> between mmap and malloc? In man it says mmap allocates in the virtual
>>>> address space of the program so does that mean it uses the heap like malloc
>>>> or something else?
>>>>
>>>> B2003
>>>
>>> One major pragmatic difference is the malloc will return memory for
>>> any size (within the range of a size_t); but mmap/MAP_ANON only grants
>>> multiples of the page size.
>>>
>>> So you can't do this:
>>> mmap(NULL, sizeof(mystructure), PROT_READ|PROT_WRITE,
>>> MAP_PRIVATE|MAP_ANON, -1, 0);
>>>
>>
>> Well yes, you can. You just end up wasting memory.
>
> And how do you know malloc() isn't wasting memory as well? It would be
> valid, although quite inferior, for malloc() to simply call mmap() every
> time.

I didn't make any claims for malloc, I simply disagreed with the
assertion that mmap couldn't be used in its place ("you can't do this").

--
Ian Collins

Ian Collins

Oct 23, 2011, 3:45:34 PM
On 10/24/11 03:56 AM, Rainer Weikusat wrote:
> Ian Collins<ian-...@hotmail.com> writes:
>> On 10/23/11 07:54 PM, loozadroog wrote:
>>> On Oct 21, 8:38 am, boltar2...@boltar.world wrote:
>>>> When just allocating memory (ie using MAP_ANON) what is the difference
>>>> between mmap and malloc? In man it says mmap allocates in the virtual
>>>> address space of the program so does that mean it uses the heap like malloc
>>>> or something else?
>>>>
>>>> B2003
>>>
>>> One major pragmatic difference is the malloc will return memory for
>>> any size (within the range of a size_t); but mmap/MAP_ANON only grants
>>> multiples of the page size.
>>>
>>> So you can't do this:
>>> mmap(NULL, sizeof(mystructure), PROT_READ|PROT_WRITE,
>>> MAP_PRIVATE|MAP_ANON, -1, 0);
>>>
>>
>> Well yes, you can. You just end up wasting memory.
>
> Memory efficiency isn't as paramount a concern as it used to be. If
> it was, nobody would be using C++ or Java :->.

C++ is no less memory efficient than C. Using the language support
for custom allocators, it is easier to improve the memory efficiency of
a C++ program than that of an equivalent C program.

<snip>

> For a single memory allocation, mmap is pretty much guaranteed to be
> slower than malloc because of the system call alone.

Did I claim otherwise?

--
Ian Collins

Volker Birk

Oct 23, 2011, 3:54:42 PM
Ian Collins <ian-...@hotmail.com> wrote:
> C++ is no less memory efficient than C.

Well, if you're not using common C++ libraries, but optimizing yourself.

> Using the the language support for custom allocators, it is easier to
> improve the memory efficiency of a C++ program than an equivalent C
> program.

I really doubt that. I have been using C since 1988 and C++ since 1992. And I must
say, "easy" is the last of all words which I would relate to C++.

Yours,
VB.
--
"If /dev/null is fast in web scale I will use it."

http://www.mongodb-is-web-scale.com/

Ian Collins

Oct 23, 2011, 4:12:05 PM
On 10/24/11 08:54 AM, Volker Birk wrote:
> Ian Collins<ian-...@hotmail.com> wrote:
>> C++ is no less memory efficient than C.
>
> Well, if you're not using common C++ libraries, but optimizing yourself.

I write a lot of embedded C++ code and I use the standard library. If a
solution requires dynamic memory, it will require dynamic memory in C or C++.
Yes, it is easy to unwittingly consume memory in C++ if you don't know
what you are doing, but in that case, you shouldn't be writing
production code!

>> Using the the language support for custom allocators, it is easier to
>> improve the memory efficiency of a C++ program than an equivalent C
>> program.
>
> I really doubt that. I'm using C since 1988, C++ since 1992. And I must
> say, "easy" ist the last of all words which I would relate to C++.

I was specifically referring to the ability to add a custom allocator to
a class/struct, or to specify one in a standard container template. For
example, if profiling shows a particular small struct is heavily used and
"wasting" a significant amount of memory (say it's 24 bytes and
malloc/new uses 16-byte chunks), adding a class-specific operator new is
much easier than optimising malloc for 24-byte allocations.

--
Ian Collins

Volker Birk

Oct 23, 2011, 4:23:17 PM
Ian Collins <ian-...@hotmail.com> wrote:
> On 10/24/11 08:54 AM, Volker Birk wrote:
>> Ian Collins<ian-...@hotmail.com> wrote:
>>> C++ is no less memory efficient than C.
>> Well, if you're not using common C++ libraries, but optimizing yourself.
> I write a lot of embedded C++ code and I use the standard library.

I'm not talking about libstdc++, but about libraries like libkdeinit4*
or anything like that.

> If a solution requires dynamic memory, it will require dynamic in C or
> C++. Yes it is easy to unwittingly consume memory in C++ if you don't
> know what you are doing, but in that case, you shouldn't be writing
> production code!

;-) Well, most people shouldn't, that's right. And they do.

>>> Using the the language support for custom allocators, it is easier to
>>> improve the memory efficiency of a C++ program than an equivalent C
>>> program.
>> I really doubt that. I'm using C since 1988, C++ since 1992. And I must
>> say, "easy" ist the last of all words which I would relate to C++.
> I was specifically referring to the ability to add a custom allocator to
> a class/struct, or to specify one in a standard container template. For
> example profiling shows a particular small struct is heavily used and
> "wasting" a significant amount of memory (say it's 24 bytes and
> malloc/new uses 16 byte chunks), adding a class specific operator new is
> much easier than optimising malloc for 24 byte allocations.

I don't want to say, that you're not right with these ideas about the
possibilities of C++ memory management. But again, "easy" is the last of
all words I would see related.

Rainer Weikusat

Oct 23, 2011, 4:38:08 PM
Ian Collins <ian-...@hotmail.com> writes:
> On 10/24/11 03:56 AM, Rainer Weikusat wrote:

[...]

>> Memory efficiency isn't as paramount a concern as it used to be. If
>> it was, nobody would be using C++ or Java :->.
>
> C++ is no less memory efficient than C.

This statement makes no sense: C and C++ are programming languages and
languages don't use memory.

Rainer Weikusat

Oct 23, 2011, 4:59:25 PM
Ian Collins <ian-...@hotmail.com> writes:
> On 10/24/11 08:54 AM, Volker Birk wrote:
>> Ian Collins<ian-...@hotmail.com> wrote:
>>> C++ is no less memory efficient than C.
>>
>> Well, if you're not using common C++ libraries, but optimizing yourself.
>
> I write a lot of embedded C++ code and I use the standard library. If
> a solution requires dynamic memory, it will require dynamic in C or
> C++. Yes it is easy to unwittingly consume memory in C++ if you don't
> know what you are doing, but in that case, you shouldn't be writing
> production code!

That's a nice theory, but nobody told this to all these people who do
write 'C++ production code'. E.g., an e-mail classification engine I
happen to know has a C++ API (it is itself written in C++) requiring
'e-mails' to be passed to it as std::string instances. This implies
that there is no way, short of using a custom C++ library, this API can
be used to scan 'e-mail files' in place.

>>> Using the the language support for custom allocators, it is easier to
>>> improve the memory efficiency of a C++ program than an equivalent C
>>> program.
>>
>> I really doubt that. I'm using C since 1988, C++ since 1992. And I must
>> say, "easy" ist the last of all words which I would relate to C++.
>
> I was specifically referring to the ability to add a custom allocator
> to a class/struct, or to specify one in a standard container template.
> For example profiling shows a particular small struct is heavily used
> and "wasting" a significant amount of memory (say it's 24 bytes and
> malloc/new uses 16 byte chunks), adding a class specific operator new
> is much easier than optimising malloc for 24 byte allocations.

There's no need to 'optimize malloc' and implementing custom
allocation and deallocation routines for a specific type is no more
difficult in C than in C++ (since they need to perform the exact same
tasks). But the C++-solution will be easier to use because it doesn't
require a new interface.

Scott Lurndal

Oct 23, 2011, 5:41:20 PM
Volker Birk <bum...@dingens.org> writes:
>Ian Collins <ian-...@hotmail.com> wrote:
>> C++ is no less memory efficient than C.
>
>Well, if you're not using common C++ libraries, but optimizing yourself.
>
>> Using the the language support for custom allocators, it is easier to
>> improve the memory efficiency of a C++ program than an equivalent C
>> program.
>
>I really doubt that. I'm using C since 1988, C++ since 1992. And I must
>say, "easy" ist the last of all words which I would relate to C++.
>

I've written two commercial hypervisors (one for SGI, one for 3Leaf Systems)
in C++. I've written an entire distributed, microkernel operating system (for Unisys) in
C++, for 128-core systems (in 1995!). I've written a mainframe simulator (instruction
level) + associated hardware device simulations in C++. The 3Leaf hypervisor ran
on 192-core AMD and 1024-core Intel systems, and supported as many guests as physical
cores, with less than 1/2 of 1 percent hypervisor overhead (a 192-core ccNUMA
shared memory system had 16 independent copies of the hypervisor sharing up to
1TB of memory across standard InfiniBand using the 3Leaf DSM ASIC; the hypervisors
worked together to present resources from all 16 nodes (cores and memory) to one
or more guests, so that to the guest it was a single system). We did some fun stuff with
1TB/192-core guests running big science apps.

C++ is just as efficient, when properly used, as is C. In OS/hypervisor work
one avoids templates (code bloat), run-time typing (RTTI), and C++ exception handling.

Using classes for data encapsulation, and pure virtual classes
for interfaces[*], makes for easily maintained, efficient code.

The ability to overload operator "new" on a per-class basis allows for very
efficient pool-based allocators to be used instead of malloc (required when
writing operating systems where the standard run-time library doesn't exist).

On the other hand, I find that the C++ STL makes for unreadable, ugly, bloated code.

[*] Every modern operating system written in C uses the equivalent of virtual
tables, but they have to code it explicitly (e.g. System V VFS, Cellular Irix
"behaviors", Linux *_ops vectors). The single indirection cost of a virtual
function call in C++ is no different.

Ian Collins

Oct 23, 2011, 5:49:05 PM
On 10/24/11 09:59 AM, Rainer Weikusat wrote:
> Ian Collins<ian-...@hotmail.com> writes:
>>
>> I was specifically referring to the ability to add a custom allocator
>> to a class/struct, or to specify one in a standard container template.
>> For example profiling shows a particular small struct is heavily used
>> and "wasting" a significant amount of memory (say it's 24 bytes and
>> malloc/new uses 16 byte chunks), adding a class specific operator new
>> is much easier than optimising malloc for 24 byte allocations.
>
> There's no need to 'optimize malloc' and implementing custom
> allocation and deallocation routines for a specific type is no more
> difficult in C than in C++ (since they need to perform the exact same
> tasks). But the C++-solution will be easier to use because it doesn't
> require a new interface.

That last sentence is my point in a nutshell.

--
Ian Collins

Ian Collins

Oct 23, 2011, 6:59:53 PM
Writing class specific new/delete operators is a lot easier than
changing all the calls to malloc/free for a struct. Using a custom
allocator with a standard library template usually involves little more
than changing a typedef.

--
Ian Collins

William Ahern

Oct 23, 2011, 7:41:31 PM
Ian Collins <ian-...@hotmail.com> wrote:
> On 10/24/11 09:23 AM, Volker Birk wrote:
<snip>
> > I don't want to say, that you're not right with these ideas about the
> > possibilities of C++ memory management. But again, "easy" is the last of
> > all words I would see related.

> Writing class specific new/delete operators is a lot easier than
> changing all the calls to malloc/free for a struct. Using a custom
> allocator with a standard library template usually involves little more
> than changing a typedef.

You're comparing apples and oranges. Well-encapsulated C code will use
object-specific initializers and destructors. Using C++ will at best save
you two lines of code: one for the explicit malloc and one for the
explicit destruction.

Poorly written C++ code may use malloc instead of new. There's no accounting
for incompetence.

Ian Collins

Oct 23, 2011, 8:13:57 PM
On 10/24/11 12:41 PM, William Ahern wrote:
> Ian Collins<ian-...@hotmail.com> wrote:
>> On 10/24/11 09:23 AM, Volker Birk wrote:
> <snip>
>>> I don't want to say, that you're not right with these ideas about the
>>> possibilities of C++ memory management. But again, "easy" is the last of
>>> all words I would see related.
>
>> Writing class specific new/delete operators is a lot easier than
>> changing all the calls to malloc/free for a struct. Using a custom
>> allocator with a standard library template usually involves little more
>> than changing a typedef.
>
> You're comparing apples and oranges. Well encapsulated C code will use
> object-specific initializers and destructors. Using C++ will at best gain
> you two fewer lines of code; one for the explicit malloc and one for the
> explicit destruction.

I may well be. But I'm posting from the perspective of one who has come
onto several not-so-well-designed C projects at the point where the
developers realise they are in a mess and have memory problems. So I've
spent a lot of time tidying up after the fact. I made a good living out
of writing specialised allocators!

> Poorly written C++ code may use malloc instead of new. There's no accounting
> for incompetence.

In any language! Any C++ programmer who used malloc without good reason
should be forced to read Perl listings while listening to Justin Bieber
until they repent.

--
Ian Collins

Barry Margolin

unread,
Oct 23, 2011, 10:09:56 PM10/23/11
to
In article <87hb2zy...@sapphire.mobileactivedefense.com>,
Rainer Weikusat <rwei...@mssgmbh.com> wrote:

> Barry Margolin <bar...@alum.mit.edu> writes:
>
> [...]
>
> > Some malloc() implementations are better than others at optimizing
> > memory use and avoiding fragmentation.
>
> Leading remark: The text below has intentionally been simplified as
> much as possible so that the basic idea is more readily understood.
>
> I would go so far to state malloc creates fragmentation.

What I meant was "minimizing fragmentation" -- it can't be avoided
completely, as you pointed out.

Volker Birk

unread,
Oct 23, 2011, 10:34:54 PM10/23/11
to
Ian Collins <ian-...@hotmail.com> wrote:
> In any language! Any C++ programmer who used malloc without good reason
> should be forced to read Perl listings while listening to Justin Bieber
> until they repent.

;-)

Well, I can't even imagine why someone would want to use malloc() in C++
(except for implementing some new operator).

And even think about the menace...

Volker Birk

unread,
Oct 23, 2011, 10:43:15 PM10/23/11
to
If we're nitpicking here:

Well, actually C++ implementations have to. It's not possible to
implement them without using some mutex implementation, and such
implementations are using memory.

And Java is more than just the language.

But even C requires memory in the entry (and exit) code.

bolta...@boltar.world

unread,
Oct 24, 2011, 4:32:27 AM10/24/11
to
On Mon, 24 Oct 2011 08:45:34 +1300
Ian Collins <ian-...@hotmail.com> wrote:
>C++ is no less memory efficient than C. Using the language support

For allocating memory you're correct, but the way a lot of people actually
use C++ leads to a lot of unnecessary temporary objects being created
and copying being done. Eg: Any use of std::string virtually guarantees wasted
CPU cycles and memory alloc/dealloc compared to using a char array, but that's
an argument for another group.

B2003

Rainer Weikusat

unread,
Oct 24, 2011, 6:12:06 AM10/24/11
to
Ian Collins <ian-...@hotmail.com> writes:

[...]

>> Poorly written C++ code may use malloc instead of new. There's no accounting
>> for incompetence.
>
> In any language! Any C++ programmer who used malloc without good
> reason should be forced to read Perl listings while listening to
> Justin Bieber until they repent.

The Perl listings or Justin Bieber?

Ian Collins

unread,
Oct 24, 2011, 6:16:17 AM10/24/11
to
Both are beyond redemption :)

--
Ian Collins

bolta...@boltar.world

unread,
Oct 24, 2011, 6:29:39 AM10/24/11
to
On Mon, 24 Oct 2011 13:13:57 +1300
Ian Collins <ian-...@hotmail.com> wrote:
>> Poorly written C++ code may use malloc instead of new. There's no accounting
>> for incompetence.
>
>In any language! Any C++ programmer who used malloc without good reason
>should be forced to read Perl listings while listening to Justin Bieber
>until they repent.

How about someone allocating a block of memory for non object data who doesn't
want to have to fart about catching bad_alloc when a simple return value test
will suffice?

B2003

Marc

unread,
Oct 24, 2011, 7:57:11 AM10/24/11
to
bolta...@boltar.world wrote:

> How about someone allocating a block of memory for non object data
> who doesn't want to have to fart about catching bad_alloc when a
> simple return value test will suffice?

They can use the nothrow version of new...

Using realloc can be a reason, although people will tell you to use a
std::vector instead.
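For reference, the nothrow form returns a null pointer on failure instead of throwing, so the familiar return-value test still works (a sketch; the helper names and buffer sizes are arbitrary):

```cpp
#include <cstdlib>  // std::malloc, std::free
#include <new>      // std::nothrow

// Two ways to get a raw buffer that both admit a simple null check.
char* via_new(std::size_t n) {
    return new (std::nothrow) char[n];   // nullptr on failure, no bad_alloc
}
char* via_malloc(std::size_t n) {
    return static_cast<char*>(std::malloc(n));   // classic C style
}
```

The pointers must still be released with the matching delete[] and free() respectively.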

Volker Birk

unread,
Oct 24, 2011, 8:12:53 AM10/24/11
to
Marc <marc....@gmail.com> wrote:
> Using realloc can be a reason, although people will tell you to use a
> std::vector instead.

Yes, and std::vector really is a good idea. It's convenient and it's
fast.

Rainer Weikusat

unread,
Oct 24, 2011, 8:41:59 AM10/24/11
to
Volker Birk <bum...@dingens.org> writes:
> Rainer Weikusat <rwei...@mssgmbh.com> wrote:
>> Ian Collins <ian-...@hotmail.com> writes:
>>> On 10/24/11 03:56 AM, Rainer Weikusat wrote:
>> [...]
>>>> Memory efficiency isn't as paramount a concern as it used to be. If
>>>> it was, nobody would be using C++ or Java :->.
>>> C++ is no less memory efficient than C.
>> This statement makes no sense: C and C++ are programming languages and
>> languages don't use memory.
>
> If we're nitpicking here:
>
> Well, actually C++ implementations have to. It's not possible to
> implement them without using some mutex implementation, and such
> implementations are using memory.
>
> And Java is more than just the language.
>
> But even C requires memory in the entry (and exit) code.

What I was trying to get at is that the language, including any
standard libraries, does not use memory: It's completely passive. Code
written in 'the language' uses memory upon execution and how exactly
this memory is being used depends on how the code was written. As far
as I can tell, for most people, the 'C++ killer feature' seems to be
that it is not necessary to spend any planning effort on deciding what
implementation of which data structures some application will use,
"I'll just use std::schnuddeldiewutz, that's The Right Thing To
Do[tm], and get on with my much more interesting life that much
faster!". And this attitude, combined with the address space straw
cutter named malloc, leads to processes with insanely large resident
segment sets defeating all caching strategies employed by the
hardware single-handedly.

It is, of course, possible to use C++ consciously, but the language
inventor himself actually recommends that people shouldn't do this:
The language features which make C++ interesting are supposed to be
used by 'the demigods' to implement 'libraries' and 'Crethi and
Plethi' programmers are advised to use these libraries instead of
running the risk of getting their fingers hacked off by the machinery
used to implement them.

Volker Birk

unread,
Oct 24, 2011, 8:48:01 AM10/24/11
to
Rainer Weikusat <rwei...@mssgmbh.com> wrote:
> What I was trying to get at is that the language, including any
> standard libraries, does not use memory: It's completely passive.

Are you sure in the ages of shared libraries? ;-) (beware: we're
nitpicking now)

> Code
> written in 'the language' uses memory upon execution and how exactly
> this memory is being used depends on how the code was written.

This is not true for entry and exit code.

> As far
> as I can tell, for most people, the 'C++ killer feature' seems to be
> that it is not necessary to spend any planning effort on deciding what
> implementation of which data structures some application will use,
> "I'll just use std::schnuddeldiewutz, that's The Right Thing To
> Do[tm], and get on with my much more interesting life that much
> faster!". And this attitude, combined with the address space straw
> cutter named malloc, leads to processes with insanely large resident
> segment sets defeating all caching strategies employed by the
> hardware single-handedly.

;-)

> It is, of course, possible to use C++ consciously, but the language
> inventor himself actually recommends that people shouldn't do this:
> The language features which make C++ interesting are supposed to be
> used by 'the demigods' to implement 'libraries' and 'Crethi and
> Plethi' programmers are advised to use these libraries instead of
> running the risk of getting their fingers hacked off by the machinery
> used to implement them.

Well, yes. And if you're getting such code, you better throw it away.

bolta...@boltar.world

unread,
Oct 24, 2011, 10:21:32 AM10/24/11
to
On Mon, 24 Oct 2011 11:57:11 +0000 (UTC)
Marc <marc....@gmail.com> wrote:
>bolta...@boltar.world wrote:
>
>> How about someone allocating a block of memory for non object data
>> who doesn't want to have to fart about catching bad_alloc when a
>> simple return value test will suffice?
>
>They can use the nothrow version of new...

And the advantage of using that over malloc is ....?

>Using realloc can be a reason, although people will tell you to use a
>std::vector instead.

And the advantage of using that over realloc is ....?

B2003

bolta...@boltar.world

unread,
Oct 24, 2011, 10:31:31 AM10/24/11
to
On Mon, 24 Oct 2011 12:12:53 +0000 (UTC)
Volker Birk <bum...@dingens.org> wrote:
>Marc <marc....@gmail.com> wrote:
>> Using realloc can be a reason, although people will tell you to use a
>> std::vector instead.
>
>Yes, and std::vector really is a good idea. It's convenient and it's
>fast.

It's unlikely to be faster than using the allocator directly.

B2003


Ian Collins

unread,
Oct 24, 2011, 2:44:51 PM10/24/11
to
On 10/25/11 03:21 AM, bolta...@boltar.world wrote:
> On Mon, 24 Oct 2011 11:57:11 +0000 (UTC)
> Marc<marc....@gmail.com> wrote:
>> bolta...@boltar.world wrote:
>>
>>> How about someone allocating a block of memory for non object data
>>> who doesn't want to have to fart about catching bad_alloc when a
>>> simple return value test will suffice?
>>
>> They can use the nothrow version of new...
>
> And the advantage of using that over malloc is ....?

You don't end up with some pointers that have to be freed and some that
have to be deleted. Besides, well written C++ does not use naked
new/malloc very much or at all. Resources (including dynamic memory)
are owned by an object to ensure they are properly managed. std::vector
is a good example.

>> Using realloc can be a reason, although people will tell you to use a
>> std::vector instead.
>
> And the advantage of using that over realloc is ....?

You don't have to bother. std::vector manages its own resources.

--
Ian Collins

Rainer Weikusat

unread,
Oct 24, 2011, 4:09:53 PM10/24/11
to
Barry Margolin <bar...@alum.mit.edu> writes:
> In article <87hb2zy...@sapphire.mobileactivedefense.com>,
> Rainer Weikusat <rwei...@mssgmbh.com> wrote:
>
>> Barry Margolin <bar...@alum.mit.edu> writes:
>>
>> [...]
>>
>> > Some malloc() implementations are better than others at optimizing
>> > memory use and avoiding fragmentation.
>>
>> Leading remark: The text below has intentionally been simplified as
>> much as possible so that the basic idea is more readily understood.
>>
>> I would go so far to state malloc creates fragmentation.
>
> What I meant was "minimizing fragmentation" -- it can't be avoided
> completely, as you pointed out.

malloc can't avoid it because it needs to make memory placement
decisions based on 'allocation request sizes'. And this is completely
insufficient.

Barry Margolin

unread,
Oct 24, 2011, 10:00:33 PM10/24/11
to
In article <87ty6yt...@sapphire.mobileactivedefense.com>,
I said it can't be avoided, what are you arguing about?

But different memory allocation algorithms may be able to reduce the
amount of fragmentation.

bolta...@boltar.world

unread,
Oct 25, 2011, 5:00:12 AM10/25/11
to
On Tue, 25 Oct 2011 07:44:51 +1300
Ian Collins <ian-...@hotmail.com> wrote:
>On 10/25/11 03:21 AM, bolta...@boltar.world wrote:
>> On Mon, 24 Oct 2011 11:57:11 +0000 (UTC)
>> Marc<marc....@gmail.com> wrote:
>>> bolta...@boltar.world wrote:
>>>
>>>> How about someone allocating a block of memory for non object data
>>>> who doesn't want to have to fart about catching bad_alloc when a
>>>> simple return value test will suffice?
>>>
>>> They can use the nothrow version of new...
>>
>> And the advantage of using that over malloc is ....?
>
>You don't end up with some pointers that have to be freed and some that
>have to be deleted.

What on earth are you talking about? A pointer is a pointer whether it
arrives via malloc or new.

>Besides, well written C++ does not use naked
>new/malloc very much or at all. Resources (including dynamic memory)

Well you're right about malloc since it doesn't call the constructor, but
I've yet to see a C++ program that didn't use new. If you want arrays of
objects constructed via something other than the default constructor you
don't have any choice unless you think creating throwaway objects in the
array, then creating temporary objects and going through the whole copy
business is an efficient way to code.
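For what it's worth, std::vector can grow an array of objects through a non-default constructor without the throwaway-then-copy dance, at least in C++11 with emplace_back (the Conn struct here is a made-up example):

```cpp
#include <string>
#include <vector>

// An element type with no default constructor at all.
struct Conn {
    int fd;
    std::string host;
    Conn(int f, std::string h) : fd(f), host(std::move(h)) {}
};

std::vector<Conn> make_conns() {
    std::vector<Conn> v;
    v.reserve(3);                       // one allocation up front
    for (int i = 0; i < 3; ++i)
        v.emplace_back(i, "example");   // constructed in place, no temporary
    return v;
}
```

In C++03 the equivalent would be reserve() plus push_back(Conn(i, "example")), which costs one copy per element.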

>>> Using realloc can be a reason, although people will tell you to use a
>>> std::vector instead.
>>
>> And the advantage of using that over realloc is ....?
>
>You don't have to bother. std::vector manages its own resources.

Realloc is more likely to be used to (re)allocate a flat block of memory,
for example when downloading data from a socket and mapping packet structures
onto it. I see no advantage in using a vector in that situation as the array
paradigm is not what you're after.

B2003

Ian Collins

unread,
Oct 25, 2011, 6:12:45 AM10/25/11
to
On 10/25/11 10:00 PM, bolta...@boltar.world wrote:
> On Tue, 25 Oct 2011 07:44:51 +1300
> Ian Collins<ian-...@hotmail.com> wrote:
>> On 10/25/11 03:21 AM, bolta...@boltar.world wrote:
>>> On Mon, 24 Oct 2011 11:57:11 +0000 (UTC)
>>> Marc<marc....@gmail.com> wrote:
>>>> bolta...@boltar.world wrote:
>>>>
>>>>> How about someone allocating a block of memory for non object data
>>>>> who doesn't want to have to fart about catching bad_alloc when a
>>>>> simple return value test will suffice?
>>>>
>>>> They can use the nothrow version of new...
>>>
>>> And the advantage of using that over malloc is ....?
>>
>> You don't end up with some pointers that have to be freed and some that
>> have to be deleted.
>
> What on earth are you talking about? A pointer is a pointer whether it
> arrives via malloc or new.

True, but just try deleting a pointer assigned by malloc or passing a
pointer assigned by new to free.
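One hedged way to keep that pairing straight in mixed C/C++ code is to bind the releaser to the pointer at allocation time, e.g. a std::unique_ptr with a custom deleter (C++11; the helper names are mine):

```cpp
#include <cstdlib>
#include <memory>

// A malloc'd buffer that can only ever be released with free():
// the deleter is part of the pointer's type.
using MallocBuf = std::unique_ptr<char, decltype(&std::free)>;

MallocBuf make_buf(std::size_t n) {
    return MallocBuf(static_cast<char*>(std::malloc(n)), &std::free);
}
```

The buffer can then never be deleted by mistake: the only release path compiled in is free(), and it runs automatically at scope exit.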

>>>> Using realloc can be a reason, although people will tell you to use a
>>>> std::vector instead.
>>>
>>> And the advantage of using that over realloc is ....?
>>
>> You don't have to bother. std::vector manages its own resources.
>
> Realloc is more likely to be used to (re)allocate a flat block of memory,
> for example when downloading data from a socket and mapping packet structures
> onto it. I see no advantage in using a vector in that situation as the array
> paradigm is not what you're after.

I see no advantage in using a spanner when a screwdriver is what I'm
after. Use the best tool for the job.

--
Ian Collins

bolta...@boltar.world

unread,
Oct 25, 2011, 6:29:16 AM10/25/11
to
On Tue, 25 Oct 2011 23:12:45 +1300
Ian Collins <ian-...@hotmail.com> wrote:
>True, but just try deleting a pointer assigned by malloc or passing a
>pointer assigned by new to free.

So you're saying that you lose track of the pointers in code you write and
just delete them and hope for the best? If you can't even find how they were
created then how on earth do you manage to find out what they're for?

>> Realloc is more likely to be used to (re)allocate a flat block of memory,
>> for example when downloading data from a socket and mapping packet structures
>> onto it. I see no advantage in using a vector in that situation as the array
>> paradigm is not what you're after.
>
>I see no advantage in using a spanner when a screwdriver is what I'm
>after. Use the best tool for the job.

Agreed. But you don't always need a screwdriver, sometimes you need that
spanner and I see no point in having the overhead of a wrapper class when
I just need a large lump of memory that I might need to grow at some point.

B2003

Volker Birk

unread,
Oct 25, 2011, 7:00:21 AM10/25/11
to
Ian Collins <ian-...@hotmail.com> wrote:
> True, but just try deleting a pointer assigned by malloc or passing a
> pointer assigned by new to free.

Not to mention delete[]...

Rainer Weikusat

unread,
Oct 25, 2011, 8:37:40 AM10/25/11
to
Barry Margolin <bar...@alum.mit.edu> writes:
> In article <87ty6yt...@sapphire.mobileactivedefense.com>,
> Rainer Weikusat <rwei...@mssgmbh.com> wrote:
>> Barry Margolin <bar...@alum.mit.edu> writes:
>> > In article <87hb2zy...@sapphire.mobileactivedefense.com>,
>> > Rainer Weikusat <rwei...@mssgmbh.com> wrote:
>> >
>> >> Barry Margolin <bar...@alum.mit.edu> writes:
>> >>
>> >> [...]
>> >>
>> >> > Some malloc() implementations are better than others at optimizing
>> >> > memory use and avoiding fragmentation.
>> >>
>> >> Leading remark: The text below has intentionally been simplified as
>> >> much as possible so that the basic idea is more readily understood.
>> >>
>> >> I would go so far to state malloc creates fragmentation.
>> >
>> > What I meant was "minimizing fragmentation" -- it can't be avoided
>> > completely, as you pointed out.
>>
>> malloc can't avoid it because it needs to make memory placement
>> decisions based on 'allocation request sizes'. And this is completely
>> insufficient.
>
> I said it can't be avoided, what are you arguing about?

malloc can't avoid fragmentation because it makes memory placement
decisions based on block sizes and this is insufficient information
for doing so: How an object is going to be used depends much more on
what kind of object it is (its type) than on how large it (presently)
happens to be.

I'm not 'arguing' about that. This is a piece of information I'm
trying to get accross.

Barry Margolin

unread,
Oct 26, 2011, 3:10:46 AM10/26/11
to
In article <87d3dln...@sapphire.mobileactivedefense.com>,
What does "how an object is going to be used" have to do with
fragmentation? Fragmentation is the result of a particular order of
allocating and freeing memory that results in lots of unused bits of
memory in the heap.

Malloc can't avoid fragmentation because it can't predict the future of
memory allocation. Did you mean "how long it's going to be used"?

Bjarni Juliusson

unread,
Oct 26, 2011, 9:19:22 AM10/26/11
to
What he's getting at is the pattern of use, that is, how long the block
is allocated and what else gets allocated and freed during that time.

The programmer knows more about how the memory should be allocated than
he can communicate to malloc(). The result is that even though he might
know that he's going to occasionally allocate lots of struct foo and
then free them a little while later, he might still end up getting a
more or less permanently allocated string in the middle of a page of
struct foos, and then that page is going to be there forever.
Alternatively, struct foo might be 129 bytes long, and the allocator
allocates a 256 byte block for it to avoid fragmentation.

If you can tell the allocator that you want a bit of memory that will
stay allocated for a long time, the allocator can put it together with
other bits that are going to stay allocated for a long time. If you tell
the allocator that you'll be allocating a lot of 129-byte structs, it
can pack them in together on the same pages, knowing that fragmentation
will not be a problem.

So, while I see no reason for a crusade against malloc, which works fine
most of the time, especially for programs that only run for a short
time, it is also true that you really should consider your memory
allocation patterns if you are having fragmentation problems, and
remember that there's nothing special about malloc and there are other
allocators that are better if you can provide them with more hints than
just how much memory you need.


Bjarni
--

INFORMATION WANTS TO BE FREE

Fritz Wuehler

unread,
Oct 26, 2011, 4:08:40 PM10/26/11
to
The unrepentant know-nothing socialist Bjarni Juliusson
<bja...@update.uu.se> wrote:

> What he's getting at is the pattern of use, that is, how long the block
> is allocated and what else gets allocated and freed during that time.

So what?

> If you can tell the allocator that you want a bit of memory that will
> stay allocated for a long time, the allocator can put it together with
> other bits that are going to stay allocated for a long time.

That doesn't mean anything. Only the size of the area has any effect. All
your idiotic scheme is going to do is nothing.

> If you tell the allocator that you'll be allocating a lot of 129-byte
> structs, it can pack them in together on the same pages, knowing that
> fragmentation will not be a problem.

There are already pieces of code that work like this from 40 years ago, but
not on UNIX. UNIX has shit for storage management, it's one of the lamest,
barely working storage management systems ever "designed". Even Windows does
better. You communists really need to get your heads out of your collective
ass and go read some doc on how storage should be managed. Because libc and
UNIX just don't deliver the goods. Yeah, yeah, I know. When all you know is
UNIX and Stallman you think everything's ok. I have a bulletin for you, It's
not.

Funny, the UNIX people stole everything but anything that made sense. They
ignored properly designed systems just to spite themselves. Real
individuals! Go morons!

> INFORMATION WANTS TO BE FREE

Horseshit. Communists want to cut and paste other people's code.

Bjarni Juliusson

unread,
Oct 26, 2011, 4:38:22 PM10/26/11
to
How the hell did you know I'm a communist?


Bjarni
--

Scott Lurndal

unread,
Oct 27, 2011, 12:11:54 PM10/27/11
to
Fritz Wuehler <fr...@spamexpire-201110.rodent.frell.theremailer.net> writes:

[ Uninformed rant omitted ]

>
>There are already pieces of code that work like this from 40 years ago, but
>not on UNIX. UNIX has shit for storage management, it's one of the lamest,
>barely working storage management systems ever "designed".

You really don't have a clue, do you? Have fun with your DD cards and
track allocation. The rest of us left that bullshit behind twenty years
ago (and IBM's mainframe competitors always knew IBM's storage allocation
was crap - note the programmer/operator friendliness of the Burroughs mainframes, for
example).

FWIW, fixed sized pool allocators can be implemented on Unix systems in a couple
dozen lines of code.
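To put a number on "a couple dozen lines", a sketch of such a fixed-size pool: grab one slab and thread a free list through the cells. Growth, thread safety, and freeing the slab are all elided, and cells are assumed to be at least pointer-sized and pointer-aligned:

```cpp
#include <cstddef>
#include <cstdlib>

// Minimal fixed-size pool: carves one malloc'd slab into equal cells
// and threads a free list through them. get() pops, put() pushes.
struct Pool {
    void* head = nullptr;

    Pool(std::size_t cell, std::size_t count) {
        char* slab = static_cast<char*>(std::malloc(cell * count));
        for (std::size_t i = 0; i < count; ++i) {   // link cells together
            void** p = reinterpret_cast<void**>(slab + i * cell);
            *p = head;
            head = p;
        }
    }
    void* get() {                    // pop one cell, or nullptr if empty
        void* p = head;
        if (p) head = *static_cast<void**>(p);
        return p;
    }
    void put(void* p) {              // push a cell back
        *static_cast<void**>(p) = head;
        head = p;
    }
};
```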

Nomen Nescio

unread,
Oct 27, 2011, 5:08:07 PM10/27/11
to
sq...@slp53.sl.home (Squat Turd-all) wrote:

> You really don't have a clue, do you? Have fun with your DD cards and
> track allocation.

Hahaha asshole. That's DOS. You never worked on a man's system, not then and
not now. Keep on cutting and pasting and thinking you're a developer. Yeah,
right. ;-)

Torsten Kirschner

unread,
Oct 28, 2011, 9:23:35 PM10/28/11
to
On 21.10.2011 16:13, bolta...@boltar.world wrote:
> On Fri, 21 Oct 2011 15:06:31 +0100
> Måns Rullgård <ma...@mansr.com> wrote:
>> mmap() is a (more or less) direct system call, requesting pages of
>> virtual memory from the kernel. malloc() allocates memory from the
>
> That would explain why it seems to be faster than malloc.
>
>> Unless you have specific reasons for doing otherwise, you should use
>> malloc() to allocate memory in your application.
>
> I am trying to find a way to speed up memory allocation in my program even
> if that means wasting some bytes by allocating a page at a time so thats
> why I was considering mmap but I wondered if I was missing some gotcha that
> meant it wouldn't be suitable for this.
>
> B2003
>

As others have stated, memory allocation can be tricky. Have a look at

http://www2.research.att.com/~gsf/download/ref/vmalloc/vmalloc.html
( http://www2.research.att.com/~gsf/download/ref/vmalloc/vmalloc-spe.pdf )

to see if it could help.

Regards
T

jgharston

unread,
Nov 1, 2011, 7:29:00 PM11/1/11
to
Torsten Kirschner wrote:
> As others have stated, memory allocation can be tricky. Have a look at
> http://www2.research.att.com/~gsf/download/ref/vmalloc/vmalloc.html
> (http://www2.research.att.com/~gsf/download/ref/vmalloc/vmalloc-spe.pdf)

Interesting few bits of test code. Bashed the hell out of my self-rolled
naive malloc library ;)

JGH

Ulrich Eckhardt

unread,
Nov 2, 2011, 4:24:29 PM11/2/11
to
bolta...@boltar.world wrote:
> If you want arrays of objects constructed via something other than
> the default constructor you don't have any choice unless you think
> creating throwaway objects in the array, then creating temporary
> objects and going through the whole copy business is an efficient
> way to code.

You can ease things a bit by creating a temporary and swapping that with a
clean object in the array^Wstd::vector. Or you could use placement new,
but then you would effectively be back to manual resource management
(or implementing a container rather than using an existing one)...
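Concretely, the swap trick looks something like this (C++03-friendly; std::string stands in for any swappable element type):

```cpp
#include <string>
#include <vector>

// Reset one element of a vector without assigning through a copy of a
// large temporary: build the new value once, then swap it into place.
void replace_at(std::vector<std::string>& v, std::size_t i,
                const char* text) {
    std::string fresh(text);   // constructed via the non-default ctor
    v[i].swap(fresh);          // cheap internal exchange, no element copy
}
```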

That said, using your solution _IS_ an efficient way to code, i.e. taking
the simplest and easiest way.

Uli


Ulrich Eckhardt

unread,
Nov 2, 2011, 4:28:22 PM11/2/11
to
Volker Birk wrote:
> Well, I even couldn't imagine, why someone wants to use malloc() in C++
> (except implementing some new operator).

Efficiency at a very low level? You can't get uninitialized memory out of
a std::vector, if you resize it, it will zero-fill the elements for you.
That said, there was some utility class in C++ that provides a buffer of
uninitialized objects, if I could only remember its name...

;)

Uli

Volker Birk

unread,
Nov 2, 2011, 5:47:33 PM11/2/11
to
Ulrich Eckhardt <doom...@knuut.de> wrote:
> Volker Birk wrote:
>> Well, I even couldn't imagine, why someone wants to use malloc() in C++
>> (except implementing some new operator).
> Efficiency at a very low level? You can't get uninitialized memory out of
> a std::vector, if you resize it, it will zero-fill the elements for you.

There is no zero fill. std::vector utilizes an allocator, which can be
overloaded.

Ulrich Eckhardt

unread,
Nov 3, 2011, 2:50:55 AM11/3/11
to
Volker Birk wrote:
> Ulrich Eckhardt <doom...@knuut.de> wrote:
>> Volker Birk wrote:
>>> Well, I even couldn't imagine, why someone wants to use malloc() in
>>> C++ (except implementing some new operator).
>> Efficiency at a very low level? You can't get uninitialized memory out
>> of a std::vector, if you resize it, it will zero-fill the elements for
>> you.
>
> There is no zero fill. std::vector utilizes an allocator, which can be
> overloaded.

{Disclaimer: I haven't tried to get uninitialized memory out of a vector,
so I don't completely rule out its possibility.}

AFAIK, vector uses the allocator to get memory and then constructs the
elements inside by using placement new. This latter will initialize the
elements similarly to

element_type x = element_type();

which will zero-initialize scalar types. Note that the construction will
not be like

element_type x;

which would leave POD types with random ("singular") values.
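A quick way to see that value-initialization: scalar elements of a resized vector come back zeroed, exactly as `int x = int();` would be:

```cpp
#include <vector>

// vector constructs elements by value-initialization, so scalar
// elements of a freshly resized vector are guaranteed zero.
int first_of_resized() {
    std::vector<int> v;
    v.resize(4);        // elements value-initialized: all zero
    return v[0];
}
```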

Uli

Rainer Weikusat

unread,
Nov 3, 2011, 6:45:54 PM11/3/11
to
Barry Margolin <bar...@alum.mit.edu> writes:
> Rainer Weikusat <rwei...@mssgmbh.com> wrote:

[...]

>> malloc can't avoid fragmentation because it makes memory placement
>> descisions based on block sizes and this is insufficient information
>> for doing so: How an object is going to be used depends much more on
>> what kind of object it is (its type) than on how large it (presently)
>> happens to be.
>
> What does "how an object is going to be used" have to do with
> fragmentation? Fragmentation is the result of a particular order of
> allocating and freeing memory that results in lots of unused bits of
> memory in the heap.

As written, this doesn't make sense: provided that everything which
has been allocated before time T has also been freed by time T,
fragmentation cannot exist (except for allocators incapable of reusing
entirely free chunks of memory originally dedicated to holding objects
of a specific size for objects of another size), and the order of
allocations and deallocations doesn't matter. Also, 'fragmentation' is
not 'lots of unused bits of memory in the heap' (every 'bit of memory'
'in the heap' is unused, and the purpose of malloc is to manage these
bits so that they can be used for future allocation requests) but
"lots of unusable memory areas in the heap", these areas being
unusable because their size is too small to satisfy all or most
allocation requests by the program, and unreclaimable because the chunk
of memory they reside in is held hostage by some long-lived objects
which were allocated in it 'by bad luck'.

I'm assuming that the content of this paper,

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.275

is at least generally known (it's been years since I read it myself
and I have certainly forgotten a lot of the details). It contains the
following statements

Objects allocated at about the same time are likely to die
together at the end of a phase; if consecutively-allocated
objects are allocated in contiguous memory, they will free
contiguous memory.

Objects of different types may be likely to serve different
        purposes and die at different times. Size is likely to be
related to type and purpose, so avoiding the intermingling of
        different sizes and likely types of objects may reduce the
scattering of long-lived objects among short-lived ones.

This suggests that objects allocated at about the same time
should be allocated adjacent to each other in memory, with the
possible amendment that different-sized objects should be
segregated
[p. 18]

and the problem with this is that each type of object has a specific
size but that many different types of objects can (and often will)
have identical sizes, hence, the approach to segregate different
types of objects based on assuming that their respective sizes (the
only information known to malloc) can be used for a type-based
classification cannot possibly work ...

> Malloc can't avoid fragmentation because

... and because of this, malloc cannot possibly avoid mixing
long-lived and short-lived objects in the same memory chunk.

William Ahern

unread,
Nov 3, 2011, 7:21:30 PM11/3/11
to
Rainer Weikusat <rwei...@mssgmbh.com> wrote:
> Barry Margolin <bar...@alum.mit.edu> writes:

> Objects allocated at about the same time are likely to die
> together at the end of a phase; if consecutively-allocated
> objects are allocated in contiguous memory, they will free
> contiguous memory.
>
> Objects of different types may be likely to serve different
> purposes and die at different times. Size is likely to be
> related to type and purpose, so avoiding the intermingling of
> different sizes and likely types of objects may reduce the
> scattering of long-lived objects among short-lived ones.

> This suggests that objects allocated at about the same time
> should be allocated adjacent to each other in memory, with the
> possible amendment that different-sized objects should be
> segregated
> [p. 18]

> and the problem with this is that each type of object has a specific
> size but that many different types of objects can (and often will)
> have identical sizes, hence, the approach to segregate different
> types of objects based on assuming that their respective sizes (the
> only information known to malloc) can be used for a type-based
> classification cannot possibly work ...

I don't see how you arrive at that conclusion. It seems odd to argue that
size correlation doesn't matter while runtime locality correlation does.
Both are basically empirical questions, and the latter has been, presumably,
tested and shown to be useful for generating inferences. Just because
objects of the same size aren't _necessarily_ related doesn't mean that, in
fact, they usually aren't.

> > Malloc can't avoid fragmentation because

> ... and because of this, malloc cannot possibly avoid mixing
> long-lived and short-lived objects in the same memory chunk.

Imagine a malloc() implementation which walked the stack and learned to
identify code which generated short-lived objects versus long-lived objects.

I think it'd be hard to prove that malloc() couldn't achieve the same
theoretical rate of fragmentation as custom allocators. Whether they
actually do is a much easier question, but I still wouldn't venture a guess
without real data. But in my experience malloc() plus a basic LIFO allocator
factory does a pretty good job. Most malloc() implementation already
implement object pools, anyhow. These days all I use is malloc() plus an
"object stack" allocator--similar to GNU obstack but not so ugly and painful
to use.

Rainer Weikusat

unread,
Nov 3, 2011, 10:57:08 PM11/3/11
to
William Ahern <wil...@wilbur.25thandClement.com> writes:
> Rainer Weikusat <rwei...@mssgmbh.com> wrote:
>> Barry Margolin <bar...@alum.mit.edu> writes:
>
>> Objects allocated at about the same time are likely to die
>> together at the end of a phase; if consecutively-allocated
>> objects are allocated in contiguous memory, they will free
>> contiguous memory.
>>
>> Objects of different types may be likely to serve different
>> purposes and die at different times. Size is likely to be
>> related to type and purpose, so avoiding the intermingling of
>> different sizes and likely types of objects may reduce the
>> scattering of long-lived objects among short-lived ones.
>
>> This suggests that objects allocated at about the same time
>> should be allocated adjacent to each other in memory, with the
>> possible amendment that different-sized objects should be
>> segregated
>> [p. 18]
>
>> and the problem with this is that each type of object has a specific
>> size but that many different types of objects can (and often will)
>> have identical sizes, hence, the approach to segregate different
>> types of objects based on assuming that their respective sizes (the
>> only information known to malloc) can be used for a type-based
>> classification cannot possibly work ...
>
> I don't see how you arrive at that conclusion.

So, where's the error in my statement? How are

char passwd[8]

and

struct passwd *pwd

the same thing on a 64-bit machine, and how can code which works in
an LP64 model because their sizes are identical (so the objects must
supposedly be interchangeable) also work for ILP32, where the sizes
are not identical?

William Ahern

unread,
Nov 4, 2011, 2:04:30 AM11/4/11
to
Rainer Weikusat <rwei...@mssgmbh.com> wrote:
> William Ahern <wil...@wilbur.25thandClement.com> writes:
> > Rainer Weikusat <rwei...@mssgmbh.com> wrote:
<snip>
> >> and the problem with this is that each type of object has a specific
> >> size but that many different types of objects can (and often will)
> >> have identical sizes, hence, the approach to segregate different
> >> types of objects based on assuming that their respective sizes (the
> >> only information known to malloc) can be used for a type-based
> >> classification cannot possibly work ...
> >
> > I don't see how you arrive at that conclusion.
>
> So, where's the error in my statement? How are
>
> char passwd[8]
>
> and
>
> struct passwd *pwd
>
> the same thing on a 64-bit machine, and how can code which works in
> an LP64 model because their sizes are identical (so the objects must
> supposedly be interchangeable) also work for ILP32, where the sizes
> are not identical?

I dunno. The same way it works (or doesn't work) if you were to have

struct passwd *pwd, *tmp;

and where the duration of pwd was different from tmp.

Rainer Weikusat

unread,
Nov 4, 2011, 5:19:56 PM11/4/11
to
Not possible for two identifiers defined by one definition. But that's
as beside the point you're trying to obscure as your statement
itself. I'm going to recapitulate this very briefly: 'Fragmentation'
is a phenomenon which occurs when long-lived objects and short-lived
objects which happen to be allocated at around the same time are
placed in the same memory area, because the long-lived (or longer
living) objects prevent this memory area from being reused for
allocation requests of a different size, even though the area itself
could be used to satisfy them (Please do me a favour and stop playing
the idiot here. There's ample room for intentional misunderstanding in
any non-trivial text and we all know that). Consequently, allocating
objects whose lifetimes differ wildly in the same chunk of memory
should be avoided. The problem someone implementing a general-purpose
allocator faces here is making an educated guess at the lifetime
of some object based on the available information, and that is really
only the size of the object, because 'at the same time' is not a
well-defined concept in a sequential process. Assuming the heap isn't
already fragmented, objects of identical sizes will end up being 'time
clustered' on their own. Consequently, allocators segregating objects
based on their sizes effectively employ this two-part strategy and
shouldn't exhibit significant external fragmentation if it worked. But
they do[*].

[*] I happen to know this from first-hand experience because I
    once had the displeasure of tending an instance of the Avira
    antivir daemon running on a 64M UTM device. While it lasted,
    this was a constant battle with this program's ever increasing
    memory requirements, and one of the things I did to deal with
    that was to modify the malloc implementation in the C library
    we were using to satisfy all allocation requests by handing
    out blocks whose size was a multiple of 32. This reduced the
    RAM consumption of the process instantly by about 1/3. Which
    suggests that segregating objects by size is decidedly not a
    good idea, at least not for small objects.

Now, why not take a step backward, assuming that the problem is not
'being damned to have to make malloc work', which cannot use anything
but the object size for making placement decisions, but 'devise a
generally useful memory management scheme', maybe even adding another
restriction, namely 'a scheme useful for typed objects of varying,
fixed sizes', purposely setting 'C strings of arbitrary sizes' aside
for the moment. Assuming the general idea is sound that objects of
identical types will tend to have similar lifetimes, IOW, will be
allocated and freed in batches, why not try to segregate objects
based on their actual type instead of based on the size this type
happens to have? It should also be noted that the size of a type can
(and often will) change during the development and maintenance of
some software, while the ways objects of this type are used,
especially where they are allocated and where they're freed again,
won't change just because some fields were added or removed. As it
turns out, that was an experiment someone named Jeff Bonwick
already conducted for the SunOS kernel in 1994, with success. The
basic strategy he devised is by now used at least in SunOS,
FreeBSD and Linux (and my guess would be that other animals living in
the BSD zoo use it as well). And the kernel is a much less forgiving
environment for a memory allocator than userspace.

The same strategy can also be used for userspace applications (but not
- dammit - by nailing the amyelencephalus named malloc on top of it --
maybe that will finally make it come alive!). Another program
where I used it was an HTTP interception proxy used for
content-filtering on the same appliance I already mentioned. And
although this program had a much more 'dynamic' life than the virus
scanner, being used by multiple people (for the deployments I could
observe, something like five to fifteen) concurrently for 'random web
surfing', its absolute memory usage remained frugal over extended
periods of time (measured in months).

amic...@perceiveinc.com

unread,
Jan 30, 2018, 9:09:43 AM1/30/18
to
On Sunday, October 23, 2011 at 10:56:57 AM UTC-4, Rainer Weikusat wrote:
> Ian Collins <ian-...@hotmail.com> writes:
> > On 10/23/11 07:54 PM, loozadroog wrote:
> >> On Oct 21, 8:38 am, boltar2...@boltar.world wrote:
> >>> When just allocating memory (ie using MAP_ANON) what is the difference
> >>> between mmap and malloc? In man it says mmap allocates in the virtual
> >>> address space of the program so does that mean it uses the heap like malloc
> >>> or something else?
> >>>
> >>> B2003
> >>
> >> One major pragmatic difference is the malloc will return memory for
> >> any size (within the range of a size_t); but mmap/MAP_ANON only grants
> >> multiples of the page size.
> >>
> >> So you can't do this:
> >> mmap(NULL, sizeof(mystructure), PROT_READ|PROT_WRITE,
> >> MAP_PRIVATE|MAP_ANON, -1, 0);
> >>
> >
> > Well yes, you can. You just end up wasting memory.
>
> Memory efficiency isn't as paramount a concern as it used to be. If
> it was, nobody would be using C++ or Java :->. And it isn't necessary
> to allocate a page of memory and just use a small area at the
> beginning. Future 'memory needs' of the program can be satisfied by
> using still unused parts of the allocated page, eg, by using a moving
> pointer to the 'still unused area' which is suitably incremented every
> time a new memory block was requested. This can be combined with a
> simple facility for memory reuse at the object level, as opposed to
> 'at the level of typeless areas of certain sizes' (IMO, the type of
> an object is a much more important piece of 'usage information' than
> its size) to make an allocator which is sufficiently general to
> work for programs (or cases) where 'allocation requests' are usually
> (or mostly) for rather small objects (compared to the page size) and
> the exact number of objects that will be needed can't be (or can't
> easily be) predicted at compile time.
>
> For a single memory allocation, mmap is pretty much guaranteed to be
> slower than malloc because of the system call alone.

Your modern CPU spends most of its time idle, waiting for memory to page in. Locality is key to good performance, which can mean... custom memory management.

Joe Pfeiffer

unread,
Jan 30, 2018, 9:46:15 AM1/30/18
to
Ah, no. If you've got anything resembling adequate memory, very little
time is spent waiting for memory to page in because there is almost no
paging.

Scott Lurndal

unread,
Jan 30, 2018, 10:13:02 AM1/30/18
to
Waiting for I/O, perhaps (network, sata, uart).

Many of my modern CPUs spends all of their time CPU-bound (all 24 cores).
Most modern machines have enough memory that they never page other than
the initial COW faults.

James K. Lowden

unread,
Jan 30, 2018, 1:25:33 PM1/30/18
to
On Tue, 30 Jan 2018 07:46:10 -0700
Joe Pfeiffer <pfei...@cs.nmsu.edu> wrote:

> > Your modern CPU spends most of it's time idle, waiting for memory to
> > page in. Locality is key to good performance, which can mean...
> > custom memory management.
>
> Ah, no. If you've got anything resembling adequate memory, very
> little time is spent waiting for memory to page in because there is
> almost no paging.

I didn't read the OP's statement as referring to a page file on disk.
His statement is correct if it's read simply as the "CPU spends most of
it's time idle, waiting for memory" (except for it's/its), except that
caches mitigate the problem.

To a modern CPU, memory is an I/O device, 100x slower than
computation. Avoiding a fetch from doddering old RAM is, as he says,
key to good performance. That's not controversial, and hasn't been for
at least a decade.

--jkl

Rainer Weikusat

unread,
Jan 30, 2018, 2:19:19 PM1/30/18
to
amic...@perceiveinc.com writes:
> On Sunday, October 23, 2011 at 10:56:57 AM UTC-4, Rainer Weikusat wrote:

[custom memory management/ malloc vs mmap]

> Your modern CPU spends most of it's time idle, waiting for memory to
> page in. Locality is key to good performance, which can mean... custom
> memory management.

I was arguing in favour of that. Caching is supposed to exploit both
spatial and temporal locality, though: both use of memory that's
physically (address-wise, actually) close to recently used memory,
and reuse of already-used, discontinuous address sets.

"Good performance" also depends very much on the nature of a certain
task. Eg, memory access latencies aren't going to matter if something
isn't CPU-bound.

Rainer Weikusat

unread,
Jan 30, 2018, 2:23:52 PM1/30/18
to
This is too simplistic: For instance, one wouldn't want CPUs/ cores
fighting for cachelines even if this means "more fetches from memory".

James K. Lowden

unread,
Jan 31, 2018, 1:09:15 AM1/31/18
to
On Tue, 30 Jan 2018 19:23:48 +0000
Rainer Weikusat <rwei...@talktalk.net> wrote:

> > His statement is correct if it's read simply as the "CPU spends
> > most of it's time idle, waiting for memory" (except for it's/its),
> > except that caches mitigate the problem.
> >
> > To a modern CPU, memory is an I/O device, 100x slower than
> > computation. Avoiding a fetch from doddering old RAM is, as he
> > says, key to good performance. That's not controversial, and
> > hasn't been for at least a decade.
>
> This is too simplistic: For instance, one wouldn't want CPUs/ cores
> fighting for cachelines even if this means "more fetches from memory".

I'm afraid I don't follow you. What "this" is too simplistic?

I'm not suggesting one would want CPUs/ cores fighting for cachelines.
I'm saying processors are faster than memory, and the less they wait for
memory, the faster they'll get their work done. Obvious, perhaps, but
not so simple that it doesn't describe reality.

--jkl

bol...@cylonhq.com

unread,
Jan 31, 2018, 4:42:20 AM1/31/18
to
A > 6 year gap between replies has got to be some sort of record! :)

Barry Margolin

unread,
Jan 31, 2018, 12:33:14 PM1/31/18
to
Not even close. I've seen quite a few necro'ed threads that are over a
decade old.

Rainer Weikusat

unread,
Jan 31, 2018, 1:03:41 PM1/31/18
to
"James K. Lowden" <jklo...@speakeasy.net> writes:
,----
| Avoiding a fetch from doddering old RAM is, as he
| says, key to good performance. That's not controversial, and
| hasn't been for at least a decade.
`----

Per-CPU variables specifically exist to force "fetches from memory" (on
first access, at least) in order to avoid bouncing cachelines among
CPUs.

Scott Lurndal

unread,
Jan 31, 2018, 1:08:41 PM1/31/18
to
Barry Margolin <bar...@alum.mit.edu> writes:
>In article <p4s31n$1s66$1...@gioia.aioe.org>, bol...@cylonHQ.com wrote:
>
>> On Tue, 30 Jan 2018 06:09:37 -0800 (PST)
>> amic...@perceiveinc.com wrote:
>> >On Sunday, October 23, 2011 at 10:56:57 AM UTC-4, Rainer Weikusat wrote:
>> >> For a single memory allocation, mmap is pretty much guaranteed to be
>> >> slower than malloc because of the system call alone.
>> >
>> >Your modern CPU spends most of it's time idle, waiting for memory to page in.
>> >Locality is key to good performance, which can mean... custom memory
>> >management.
>>
>> A > 6 year gap between replies has got to be some sort of record! :)
>
>Not even close. I've seen quite a few necro'ed threads that are over a
>decade old.

1992 was the oldest I've seen resurrected so far, via the USENET leech
site 'homeowners hub'.

Kenny McCormack

unread,
Feb 15, 2018, 11:40:39 AM2/15/18
to
In article <barmar-257413....@reader.eternal-september.org>,
Barry Margolin <bar...@alum.mit.edu> wrote:
>In article <p4s31n$1s66$1...@gioia.aioe.org>, bol...@cylonHQ.com wrote:
>
>> On Tue, 30 Jan 2018 06:09:37 -0800 (PST)
>> amic...@perceiveinc.com wrote:
>> >On Sunday, October 23, 2011 at 10:56:57 AM UTC-4, Rainer Weikusat wrote:
>> >> For a single memory allocation, mmap is pretty much guaranteed to be
>> >> slower than malloc because of the system call alone.
>> >
>> >Your modern CPU spends most of it's time idle, waiting for memory to page in.
>> >Locality is key to good performance, which can mean... custom memory
>> >management.
>>
>> A > 6 year gap between replies has got to be some sort of record! :)
>
>Not even close. I've seen quite a few necro'ed threads that are over a
>decade old.

You should check out alt.obituaries, for really long-ago-necro'd threads.
(Yes, given the subject matter, there must be some joke to be made about
this particular newsgroup and necro'ing...)

Anyway, what seems to happen is that someone will Google a dead relative,
and Google will lead them to the newsgroup (accessed, of course, via the
so-called "Google Groups" interface) and they will post a followup
(something along the lines of "I am so-and-so's great-grand-niece and ...").

The underlying point here is that when the subject is death, 20 years is
not that long a time period. I'd say anyone who died in the 21st century
can still be said to have died recently.

P.S. Speaking of which, another newsgroup with a tendency to necro really
old threads is misc.transport.road.

--
In the corner of the room on the ceiling is a large vampire bat who
is obviously deranged and holding his nose.