Chris M. Thomasson <chris.m.t...@gmail.com> wrote:
> Instead of allocating one "object" at a time, allocate a chunk of
> contiguous (wrt memory representation) objects instead. Iterating
> through the chunk will be good for the cache.
AFAIK most GC'd languages (such as Java) do not support creating arrays
of objects by value, ie. with the objects themselves stored in the array
rather than the array merely containing references/pointers to
individually allocated objects.
Even when some language does (eg. C++), putting the objects in an array
helps a bit, but not necessarily a lot, depending on the internal
structure of the objects.
When the objects are small and all of their member variables are
typically accessed at once (eg. the object represents an RGBA pixel,
and contains nothing else), then it's very efficient.
However, for larger objects the benefits of cache locality diminish:
the larger the objects are, and the more member variables they have
that aren't accessed in the array traversal, the less you gain. If you
traverse the array accessing just one or two member variables out of a
dozen, each step jumps over a larger stride, making cache misses more
frequent.
A concrete example: A class representing a vertex, containing all the
data attached to that vertex. In other words, position, normal vector,
UV coordinates, color, and whatever other data may be attached to a
vertex.
If you have such a vertex class and you put all the instances in a
vector, then traversing the vector and accessing eg. just the position
member of each will make much larger jumps (and thus more frequent
cache misses) than if all the positions were stored contiguously in
their own array.
Also, accessing consecutive values in an array helps the compiler with
autovectorization optimizations.
This is the basic idea of data-oriented design.