
manual memory management, vs an automatic gc...


Chris M. Thomasson

Sep 23, 2022, 3:57:23 PM
I have conducted a lot of experiments in the past with garbage collected
languages, and have come to a conclusion. Manual memory management
techniques can be a good thing, even in a GC environment. IIRC, I wrote
about it here before, many years ago.

Being able to create objects all over the place willy-nilly, and never
even having to think about destroying them, is a "convenience" that a GC
can help one out with...

However, I found that using a simple object stack can help to take
pressure off a GC.

So instead of (pseudo-code)...

"hey look at me, no free for GC is the everything":
_____________________
for (;;)
{
    foo = new foo();

    // use foo
}
_____________________

;^)

Btw, don't worry, the GC will try to make sure the program above does
not blow out memory...

However, what about this very simple manual memory management technique,
even in a pure GC setup:
_____________________
for (;;)
{
    foo = stack_try_pop();
    if (! foo)
    {
        // Stack empty condition...
        foo = new foo();
    }

    // use foo

    if (! stack_try_push(foo))
    {
        // Stack full condition...
        // let the gc handle it
    }
}
_____________________

stack_try_push fails when it's full. Since we are in a GC environment, we
just drop the reference and let the collector reclaim the object.
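
A minimal C++ sketch of the object-stack idea above, under some loud assumptions: C++ has no GC, so std::unique_ptr stands in for "let the gc handle it" (the object is destroyed when the pointer goes out of scope); the names Foo, ObjectStack and count_allocations are hypothetical; and the stack is single-threaded, with no attempt at a lock-free design.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical payload type; stands in for "foo" in the pseudo-code.
struct Foo {
    int value = 0;
};

// Bounded LIFO cache of reusable objects (single-threaded sketch).
class ObjectStack {
public:
    explicit ObjectStack(std::size_t capacity) : capacity_(capacity) {
        items_.reserve(capacity);
    }

    // Returns a cached object, or nullptr when the stack is empty.
    std::unique_ptr<Foo> try_pop() {
        if (items_.empty()) return nullptr;
        std::unique_ptr<Foo> obj = std::move(items_.back());
        items_.pop_back();
        return obj;
    }

    // Takes ownership and returns true if there is room; otherwise leaves
    // ownership with the caller and returns false.
    bool try_push(std::unique_ptr<Foo>& obj) {
        assert(obj != nullptr);
        if (items_.size() >= capacity_) return false;
        items_.push_back(std::move(obj));
        return true;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::size_t capacity_;
    std::vector<std::unique_ptr<Foo>> items_;
};

// Runs the loop a fixed number of times (instead of forever) and reports
// how many fresh allocations were needed.
std::size_t count_allocations(ObjectStack& stack, int iterations) {
    std::size_t allocations = 0;
    for (int i = 0; i < iterations; ++i) {
        std::unique_ptr<Foo> foo = stack.try_pop();
        if (!foo) {
            // Stack empty condition...
            foo = std::make_unique<Foo>();
            ++allocations;
        }

        foo->value = i;  // use foo

        if (!stack.try_push(foo)) {
            // Stack full condition: foo is destroyed at the end of this
            // iteration, which plays the role of the collector here.
        }
    }
    return allocations;
}
```

With a non-zero capacity the loop keeps recycling the same object, so only the first iteration allocates; with a capacity of zero every push fails and every iteration allocates, which is the plain "new every time" loop again.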

Juha Nieminen

Sep 26, 2022, 4:29:18 AM
Chris M. Thomasson <chris.m.t...@gmail.com> wrote:
> I have conducted a lot of experiments in the past with garbage collected
> languages, and have come to a conclusion. Manual memory management
> techniques can be a good thing, even in a GC environment. Iirc, I wrote
> about it here before, many years ago.
>
> Being able to create objects all over the place willy-nilly, and never
> even have to think about destroying them... Is a "convenience" that a GC
> can help one out with...

I think that as data-oriented design (as opposed to object-oriented design)
is gaining popularity, especially in certain fields of programming that
require extreme efficiency (such as game engines), the need for automatic
garbage collection is diminishing, at least in those fields.

The problem with automatic GC is that it's mostly needed when you
dynamically allocate individual objects (which is the case with most
GC'd languages). However, using individually allocated objects is a
performance killer. (In fact, using "objects" at all, i.e. class
instances, is a performance killer.)

DOD doesn't require individually allocated objects, as everything is
put into arrays. (And not as in arrays of objects. Arrays of individual
values, which would normally be class member variables.)

Since optimally all the dynamically allocated data is in arrays, and
no "object" refers to any other "object", the need for automatic GC
is significantly lessened.
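
To make the layout concrete, here is a small C++ sketch, assuming a hypothetical particle system (the names Particles, add and step are invented for illustration). Each would-be member variable gets its own array, and an "object" is nothing more than an index into those arrays, so there are no per-object allocations or inter-object references for a collector to trace.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle system stored as parallel arrays of plain values.
struct Particles {
    std::vector<float> x, y;    // positions
    std::vector<float> vx, vy;  // velocities

    // Appends one particle and returns its index, which is its identity.
    std::size_t add(float px, float py, float pvx, float pvy) {
        x.push_back(px);
        y.push_back(py);
        vx.push_back(pvx);
        vy.push_back(pvy);
        return x.size() - 1;
    }

    std::size_t size() const { return x.size(); }
};

// Advances every particle by dt; each array is read or written
// front to back.
void step(Particles& p, float dt) {
    assert(p.y.size() == p.x.size());
    assert(p.vx.size() == p.x.size() && p.vy.size() == p.x.size());
    for (std::size_t i = 0; i < p.size(); ++i) {
        p.x[i] += p.vx[i] * dt;
        p.y[i] += p.vy[i] * dt;
    }
}
```

All the storage is owned by four vectors, so releasing everything is just a matter of the Particles value going out of scope.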

In contrast, low-level control of what the compiler produces is
significantly more important.

Chris M. Thomasson

Sep 26, 2022, 8:10:12 PM
On 9/26/2022 1:29 AM, Juha Nieminen wrote:
> Chris M. Thomasson <chris.m.t...@gmail.com> wrote:
>> I have conducted a lot of experiments in the past with garbage collected
>> languages, and have come to a conclusion. Manual memory management
>> techniques can be a good thing, even in a GC environment. Iirc, I wrote
>> about it here before, many years ago.
>>
>> Being able to create objects all over the place willy-nilly, and never
>> even have to think about destroying them... Is a "convenience" that a GC
>> can help one out with...
>
> I think that as data-oriented design (as opposed to object-oriented design)
> is gaining popularity, especially in certain fields of programming that
> require extreme efficiency (such as game engines), the need for automatic
> garbage collection is diminishing, at least in those fields.
>
> The problem with automatic GC is that it's mostly needed when you
> allocate dynamically individual objects (which is the case with most
> GC'd languages). However, using individually allocated objects is a
> performance killer. (In fact, using "objects" at all, ie. class
> instances, is a performance killer.)

Quick note. Will get back to you.

Instead of allocating one "object" at a time, allocate a chunk of
contiguous (wrt memory representation) objects instead. Iterating
through the chunk will be good for the cache.
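
A rough C++ sketch of the chunk idea, assuming a hypothetical Foo type and helper names (make_chunk, sum_values): one std::vector<Foo> makes a single allocation that holds all the objects by value, and iterating over it walks memory linearly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical payload type.
struct Foo {
    int value = 0;
};

// Allocates `count` objects in one contiguous chunk instead of
// performing `count` separate allocations.
std::vector<Foo> make_chunk(std::size_t count) {
    std::vector<Foo> chunk(count);
    for (std::size_t i = 0; i < count; ++i) {
        chunk[i].value = static_cast<int>(i);
    }
    // std::vector guarantees its elements are stored contiguously.
    assert(count == 0 || &chunk[count - 1] == chunk.data() + (count - 1));
    return chunk;
}

// Visits the objects in memory order.
long long sum_values(const std::vector<Foo>& chunk) {
    long long total = 0;
    for (const Foo& foo : chunk) {
        total += foo.value;
    }
    return total;
}
```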

Juha Nieminen

Sep 27, 2022, 2:45:50 AM
Chris M. Thomasson <chris.m.t...@gmail.com> wrote:
> Instead of allocating one "object" at a time, allocate a chunk of
> contiguous (wrt memory representation) objects instead. Iterating
> through the chunk will be good for the cache.

AFAIK most GC'd languages (such as Java) do not support creating arrays
of objects. As in, the objects themselves being in the array by value
(rather than the array just containing references/pointers to individually
allocated objects).

Even when a language does (e.g. C++), putting the objects in an array
helps a bit, but not necessarily a lot, depending on the internal
structure of the objects.

When the objects are small and all of their member variables are
typically accessed at once (e.g. the object represents an RGBA pixel,
and contains nothing else), then it's very efficient.

However, the larger the objects are, and the more member variables they
have that aren't accessed during the array traversal, the smaller the
benefit of cache locality becomes. That's because if you traverse the
array accessing just one or two member variables out of a dozen, you'll
be making larger jumps, so each cache line you fetch carries less of the
data you actually use, and cache misses become more frequent.

A concrete example: A class representing a vertex, containing all the
data attached to that vertex. In other words, position, normal vector,
UV coordinates, color, and whatever other data may be attached to a
vertex.

If you have such a vertex class, and you just put all the instances
in a vector, and then you traverse the vector and e.g. access the position
member of each, you'll be making much larger jumps (and thus more frequent
cache misses) than if all the positions were on their own, in their own
array.
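
The difference in jump size can be sketched in C++ as follows. The field layout of Vertex is an assumption made up for illustration (three floats of position, three of normal, two of UV, four of color), as are the helper names. With 4-byte floats that makes each Vertex 48 bytes, so a traversal that only reads positions steps 48 bytes at a time, whereas an array holding only positions steps 12 bytes at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Position {
    float x, y, z;
};

// Interleaved layout: everything attached to a vertex lives together.
struct Vertex {
    Position position;
    float normal[3];
    float uv[2];
    float color[4];
};

// Copies just the positions out into their own array.
std::vector<Position> split_positions(const std::vector<Vertex>& vertices) {
    std::vector<Position> positions;
    positions.reserve(vertices.size());
    for (const Vertex& v : vertices) {
        positions.push_back(v.position);
    }
    assert(positions.size() == vertices.size());
    return positions;
}

// Touches only `position`, but strides over sizeof(Vertex) bytes per step.
float sum_x_interleaved(const std::vector<Vertex>& vertices) {
    float total = 0.0f;
    for (const Vertex& v : vertices) total += v.position.x;
    return total;
}

// Same computation, striding over sizeof(Position) bytes per step.
float sum_x_split(const std::vector<Position>& positions) {
    float total = 0.0f;
    for (const Position& p : positions) total += p.x;
    return total;
}
```

Going one step further and keeping x, y and z in three separate arrays would shrink the stride of this particular loop to a single float.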

Also, accessing consecutive values in an array helps the compiler with
autovectorization optimizations.

This is the basic idea of data-oriented design.