Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
> Right. But this is in part bad design. I blame the rather simplistic
> view of OO that gets pushed by online tutorials and so on. There is no
> reason to consider position to be an intrinsic property of an object.
> Object locations could be stored contiguously in an instance of a
> LocationArray class and linked (via pointers or indexes) or otherwise
> associated with the object or objects that have those locations.
I don't think it's a "simplistic view of OO". It's the standard view that
has always existed, since the very beginning of OOP.
The way that object-oriented programming (and modular programming) works
is rather logical and practical: Every object has an internal state
(usually in the form of member variables) and some member functions.
Such objects are easy to handle: you can pass them around, copy them,
query or modify their state, have objects manage other objects, and so
on. Coupled with the concept of a public/private interface division,
this makes even large programs manageable and maintainable, and the
code reusable.
Back when OOP was first developed, in the '70s and '80s, this design
didn't really have any negative impact on performance. After all, there
were no caches, no pipelines, no fancy-pants SIMD. Every machine code
instruction typically took the same number of clock cycles regardless of
anything else, so how your data was arranged in memory had pretty much
zero impact on the efficiency of the program. Conditionals and function
calls always took the same number of clock cycles and were thus
inconsequential. (You could make the code faster by reducing the number
of conditionals, but not because they were conditionals; they were
simply extra instructions, like everything else.)
Since the late '90s, however, this has been less and less the case. The
introduction of CPU pipelines brought a drastic increase in machine code
throughput (at first, instructions that had typically taken at least 3
clock cycles were reduced to an effective 1 clock cycle). On the flip
side, this introduced the possibility of the pipeline being flushed
(usually because of a mispredicted conditional jump, sometimes for other
reasons). As time passed, pipelines became longer and more complicated,
and consequently the clock-cycle penalty of a pipeline flush grew larger
and larger. Optimal code saw incredible instructions-per-second numbers,
but suboptimal code that constantly causes pipeline flushes suffers
tremendously.
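To illustrate the point about conditionals, here's a sketch (function
names are mine): two functions computing the same sum, one with a
data-dependent branch in the loop body, and one written so the compiler
can usually emit conditional moves or vector code instead of jumps.
Whether it actually does depends on the compiler and flags.

```cpp
#include <cstdint>
#include <vector>

// Branchy version: a data-dependent conditional per element. On a
// modern pipelined CPU, this branch mispredicts often on random data,
// flushing the pipeline each time.
int64_t sum_positive_branchy(const std::vector<int>& v) {
    int64_t sum = 0;
    for (int x : v)
        if (x > 0) sum += x;
    return sum;
}

// Branchless-friendly version: the same result expressed as a value
// selection, which compilers typically lower to a conditional move or
// a masked SIMD add rather than a jump.
int64_t sum_positive_branchless(const std::vector<int>& v) {
    int64_t sum = 0;
    for (int x : v)
        sum += (x > 0) ? x : 0;
    return sum;
}
```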
Likewise, the introduction of memory caches brought great improvements
to execution speed, as frequently-used data could be read from the cache
much faster than from RAM. But, of course, to take advantage of this a
program needs to cause as few cache misses as possible.
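The classic demonstration of this is traversal order (a made-up example,
not from the thread): both functions below compute the same sum over a
row-major N x N matrix, but one walks memory sequentially while the
other jumps N elements per step, which for large N misses the cache on
almost every access.

```cpp
#include <cstddef>
#include <vector>

// Row order: stride-1 accesses, sequential in memory, cache-friendly.
long sum_row_major(const std::vector<long>& m, std::size_t n) {
    long s = 0;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            s += m[i * n + j];
    return s;
}

// Column order: stride-n accesses; same result, same instruction count,
// but cache-hostile once n * sizeof(long) exceeds a cache line's reach.
long sum_col_major(const std::vector<long>& m, std::size_t n) {
    long s = 0;
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < n; ++i)
            s += m[i * n + j];
    return s;
}
```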
SIMD has also become quite a thing in modern CPUs, and compilers are
getting better and better at optimizing code to use it. For optimal
results, however, the code needs to be written in a way that allows the
compiler to vectorize it. If you write it the "wrong" way, the compiler
won't be able to take much advantage of SIMD.
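As a sketch of what "written the right way" tends to mean: a simple
counted loop over dense arrays, no branches in the body, no hidden
aliasing or indirection. Loops of this shape are what gcc and clang at
-O2/-O3 most readily turn into SIMD instructions (though that is up to
the compiler, not guaranteed).

```cpp
#include <cstddef>

// An auto-vectorization-friendly loop: contiguous data, a trivial
// counted iteration, and a body that is pure arithmetic. Pointer-chasing
// through scattered heap objects would defeat this.
void add_arrays(float* dst, const float* a, const float* b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```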
The problem with OOP is that, while really logical, practical and quite
awesome from a programmer's point of view (it makes very large programs
much easier to manage), it was never designed to be optimal for modern
CPUs. A typical OOP program causes lots of cache misses and lots of
pipeline flushes, and is typically hard for the compiler to optimize
for SIMD.
To circumvent that problem while still keeping the code more or less
object-oriented, one has to resort to certain design decisions that are
not traditional (and often not the best from an object-oriented design
point of view). For example: objects no longer have their own internal
state, separate from all the other objects; instead, the states of all
objects are grouped into arrays. Or certain things get no distinct
objects at all, just plain data arrays. Likewise, member functions are
eschewed for certain operations in favor of accessing the state arrays
directly (which bypasses the abstraction principles of OOP).
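The two layouts side by side, as a sketch (both types here are
illustrative, invented for this post):

```cpp
#include <cstddef>
#include <vector>

// Traditional OOP layout: each object carries all of its own state
// ("array of structures"). A pass over positions also drags the
// unrelated health field through the cache.
struct ParticleAoS {
    float x, y;    // position
    float health;  // unrelated state interleaved with position
};

// Data-oriented layout: the state of *all* objects grouped by field
// ("structure of arrays"). A pass over positions touches only dense
// position data: cache-friendly and easy for the compiler to vectorize.
struct ParticlesSoA {
    std::vector<float> x, y;
    std::vector<float> health;

    void move_all(float dx, float dy) {
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += dx;  // contiguous, stride-1, SIMD-friendly
            y[i] += dy;
        }
    }
};
```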
Many an efficiency-conscious expert programmer will therefore choose a
mixed approach: use "pure" OOP for the things that don't require
efficiency, and more of a Data-Oriented Design for the things that do.
(Often also encapsulating the DOD-style arrays inside classes, to make
them "more OO".)
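That last parenthetical might look something like this sketch (the
ParticleSystem name and index-as-handle scheme are my own illustration,
not a standard pattern from the thread):

```cpp
#include <cstddef>
#include <vector>

// DOD-style field arrays hidden behind a class interface, so callers
// still get encapsulation while the hot loop runs over dense data.
class ParticleSystem {
public:
    // Returns a handle (here simply an index into the arrays).
    std::size_t create(float x, float y) {
        xs_.push_back(x);
        ys_.push_back(y);
        return xs_.size() - 1;
    }
    // The performance-critical pass stays a tight loop over dense arrays.
    void move_all(float dx, float dy) {
        for (std::size_t i = 0; i < xs_.size(); ++i) {
            xs_[i] += dx;
            ys_[i] += dy;
        }
    }
    float x(std::size_t h) const { return xs_[h]; }
    float y(std::size_t h) const { return ys_[h]; }
private:
    std::vector<float> xs_, ys_;  // SoA storage, private as in classic OOP
};
```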