Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Garbage Collection and Games

12 views
Skip to first unread message

Philip Haddad

unread,
Dec 17, 2004, 5:12:45 PM12/17/04
to
Hi all,
I have recently been doing some research into Garbage Collectors, and
playing around with CMUCL's GC. Since there has been a lot of
disscussion about Lisp and games in here over the last few months, I
figured I'd continue the trend :)
I read the Naughty Dog Postmortem on Jak and Dexter, about some of the
pros and cons of using Lisp for games (well, GOAL anyways), and I
wondered
a) What type of GC would be the best to use in games? mark-sweep,
generational stop-and-copy, incremental tracing, etc. Would a real-time
GC work?
b) Java is used to write games frequently, and it works ok as far as I
know (coming from limited Java experiance here). Java of course has a
GC. How is Java's GC different from most Lisp GCs? Is it really that
different?
c) Is it possibly to use a non-real-time GC and only GC when the player
is at a menu screen or some such? I mean when you write games in C++
you don't have to worry about GCing (then again you have to declare
everything and clean up your mess), so is it possible not to worry
about GCing frequently in games? Or will you need to because of screen
refreshes etc.?
Just some thoughts and questions. I will probably be compiling all of
these ideas into some sort of essay when I get some of the answers, and
learn a little more about GC algorithms and the like.
--
Certum quod factum.
Philip Haddad

Philip Haddad

unread,
Dec 17, 2004, 5:13:29 PM12/17/04
to

Philip Haddad

unread,
Dec 17, 2004, 5:15:33 PM12/17/04
to

Svein Ove Aas

unread,
Dec 17, 2004, 6:00:30 PM12/17/04
to
begin quoting Philip Haddad :

> Hi all,
> I have recently been doing some research into Garbage Collectors, and
> playing around with CMUCL's GC. Since there has been a lot of
> disscussion about Lisp and games in here over the last few months, I
> figured I'd continue the trend :)
> I read the Naughty Dog Postmortem on Jak and Dexter, about some of the
> pros and cons of using Lisp for games (well, GOAL anyways), and I
> wondered
> a) What type of GC would be the best to use in games? mark-sweep,
> generational stop-and-copy, incremental tracing, etc. Would a real-time
> GC work?

I can only assume a combination would be best: Use incremental copying when
you have CPU time left over, and a generational something-or-other
stop-and-copy thing if you run out of both space and time.

There are of course ways to do incremental tracing that would eliminate this
possibility, but said ways also eat a lot of CPU time. It depends; do you
want absolutely no jitter, or would no jitter when the machine isn't
overloaded anyway be enough?

> b) Java is used to write games frequently, and it works ok as far as I
> know (coming from limited Java experiance here). Java of course has a
> GC. How is Java's GC different from most Lisp GCs? Is it really that
> different?

I don't know about Java, but Python has also been used to write games, and
it just happens to use an incremental GC. This is important; without some
sort of incremental GC, you *will* eventually get jitter.

> c) Is it possibly to use a non-real-time GC and only GC when the player
> is at a menu screen or some such? I mean when you write games in C++
> you don't have to worry about GCing (then again you have to declare
> everything and clean up your mess), so is it possible not to worry
> about GCing frequently in games? Or will you need to because of screen
> refreshes etc.?

Everything's possible. A Lisp with better support for declarations should be
able to do the exact same thing C++ does, and it's perfectly simple to turn
off the incremental GC if you're doing a hybrid approach already.

Chris Capel

unread,
Dec 17, 2004, 7:45:44 PM12/17/04
to
Philip Haddad wrote:

> Hi all,
> I have recently been doing some research into Garbage Collectors, and
> playing around with CMUCL's GC. Since there has been a lot of
> disscussion about Lisp and games in here over the last few months, I
> figured I'd continue the trend :)

> b) Java is used to write games frequently, and it works ok as far as I


> know (coming from limited Java experiance here). Java of course has a
> GC. How is Java's GC different from most Lisp GCs? Is it really that
> different?

I think one thing one must keep in mind is that the kind of games written in
java (unless you can prove me wrong) are the kind that use less than 50-80%
of the CPU most of the time. These kinds of games have /plenty/ of spare
time for garbage collection, and unless you have 100ms pauses for GC,
you're not going to see any jitter. I've done a simple game in ltk that's
set to run at 33 fps (I could probably set it higher--I haven't tried). I
don't see any framerate problems with it, but then it /is/ fairly simple.

The kinds of games talked about in the "C++ sucks for games" thread are
console and PC games that use absolutely every cycle available to them.
These are the kinds of games that I don't see written in Java, and I won't
expect to see written in any true HLL without a very optimized graphics
library available and a really good compiler, too. (I don't think Naughty
Dog counts, but I'm not really clear on the role of garbage collection in
GOAL during the game's runtime, nor about how much of the game was written
in GOAL and how much in other languages.)

That's half the discussion right there.

By the way, Jak 3 is a great game. My ltk game was taken right out of one of
their minigames. It's pretty addictive.

Chris Capel

Surendra Singhi

unread,
Dec 17, 2004, 8:39:43 PM12/17/04
to
Philip Haddad wrote:

Is there no simple way to do program controlled deallocation of memory,
and not requiring garbage collection at all?

--
Surendra Singhi

www.public.asu.edu/~sksinghi/

Philip Haddad

unread,
Dec 18, 2004, 9:19:09 AM12/18/04
to
> Everything's possible. A Lisp with better support for declarations
should be
> able to do the exact same thing C++ does, and it's perfectly simple
to turn
> off the incremental GC if you're doing a hybrid approach already.

So you think that writing a Lisp dialect is the best way to go?
Personally, I like the fact that Lisp doesn't use too many type
declarations, although I suppose that in game you can figure out pretty
simply how much space an object is going to take up.

Philip Haddad

unread,
Dec 18, 2004, 4:48:25 PM12/18/04
to

I just had an interesting idea. What if the GC was manually controlable
from the mutator (the running program)? Would this speed up storage
reaclimation? I think this way you could GC when you wanted to, which
is important. How about floating garbage, would this approach limit it,
or would there be more? I thought that it may be effective to detect
floating garbage and put as much of it as possible in a cache that the
mutator creates for the GC. Would this be to computationally expensive?

John Connors

unread,
Dec 18, 2004, 6:35:12 PM12/18/04
to
Surendra Singhi wrote:

>
>
> Is there no simple way to do program controlled deallocation of memory,
> and not requiring garbage collection at all?
>

With SBCL and CMCUL it's possible to dynamically allocate and free
chunks of alien types allocated by foreign code.

http://common-lisp.net/project/cmucl/doc/cmu-user/aliens.html#toc256

However GC is bit of a red herring. In commercial console games, ideally
no memory allocation or deallocation is done in the main game loop at
all, there just are not the cycles for anythng like malloc. Referencing,
intialising and modifying chunks out of pre-allocated static arrays
(read simple arrays in Lisp) is the order of the day.


--
Cyborg Animation Programmer
http://yagc.blogspot.com
http://badbyteblues.blogspot.com

Philip Haddad

unread,
Dec 18, 2004, 7:47:06 PM12/18/04
to

> However GC is bit of a red herring. In commercial console games,
ideally
> no memory allocation or deallocation is done in the main game loop at

> all, there just are not the cycles for anythng like malloc.

Really? I always thought that stuff was still being allocated in the
game loop. Pointers to objects are still active right?

Christopher C. Stacy

unread,
Dec 18, 2004, 9:05:11 PM12/18/04
to
"Philip Haddad" <philip...@gmail.com> writes:

Don't you just allocate the necessary objects in pool
at the beginning of each level of the game?

John Connors

unread,
Dec 19, 2004, 6:08:42 AM12/19/04
to

Usually what you do is you have a frontend / menus etc and a loading
cycle when going between frontend and the game (or from level to level
within the game). This is when allocation gets done, and buffers get
allocated - buffers for n models, n particles, n animated joints, n
whatever, and the pointers to objects you refer to are pointing into
those buffers..

A classical malloc() that walks a heap, and finds the right place in a
list of free chunks of memory is not something you want when you are
commmited to animating n objects, and plotting x zillion polys in 1/60th
sec - it's execution time is just too variable.

The game loop is a soft real-time system in effect (you can live with
dropping the odd frame if it's been architected correctly - but no more
than the odd frame). So allocation / deallocation in the game loop is
out of static arrays with a static index incremented per allocation.
Primitive, but effective. Without a GC that's been bulletproofed for
soft-real time usage (and I know there are some, although not which),
it's probably the best you are going to get.

John Connors

unread,
Dec 19, 2004, 6:20:39 AM12/19/04
to

In effect yes, I'm using the term allocation a bit sloppily. There are
some things we pretend to allocate/dealloctae by initalising/or
re-initalising data / objects held in those pools. Particles, props,
smashables, local timer objects, things like that.

Philip Haddad

unread,
Dec 19, 2004, 1:50:25 PM12/19/04
to

> > Really? I always thought that stuff was still being allocated in
the
> > game loop. Pointers to objects are still active right?

To carry this idea a little farther, what if you had a GC with a pool
that you allocated all of the programs (mutator) floating
garbage in. The GC can detect the floating garbage, but it cannot
reach it, so once it is detected, just put it in the floating garbage
pool. Once the mutator has terminated its process, reclaim all of the
space
taken by the pool (actaully, you might not even need to do that).
Obviously,
the pool would have to be contiguous to the spaces controlled by the
GC.

I also think that it might be useful to define GC operations that the
mutator
can use in itself. For example,

(defun foo (args)
(do ((stuff-with args))
(declaim (speed 2)))
;; clean up some other stuff
(bar (stuff-with args)
(gc-claim bar)))

gc-claim takes up the space allocated by bar explicitly in the program.

This is probably not the greatest example, but I couldn't think of a
more
general case off the top of my head. I call this "manually Garbage
Collecting".
Will some GC algorithm with this support be a viable option? Basically,
I'm
thinking about using an incremental tracing generational GC with
mutator
functions and a floating garbage pool. Good\bad idea??

Brian Downing

unread,
Dec 19, 2004, 9:06:16 PM12/19/04
to
In article <cq3ngn$mkg$1...@newsg2.svr.pol.co.uk>,

John Connors <"johnc at yagc dot ndo dot co dot uk"> wrote:
> Usually what you do is you have a frontend / menus etc and a loading
> cycle when going between frontend and the game (or from level to level
> within the game). This is when allocation gets done, and buffers get
> allocated - buffers for n models, n particles, n animated joints, n
> whatever, and the pointers to objects you refer to are pointing into
> those buffers..

[...]

> The game loop is a soft real-time system in effect (you can live with
> dropping the odd frame if it's been architected correctly - but no more
> than the odd frame). So allocation / deallocation in the game loop is
> out of static arrays with a static index incremented per allocation.
> Primitive, but effective. Without a GC that's been bulletproofed for
> soft-real time usage (and I know there are some, although not which),
> it's probably the best you are going to get.

This is great unless you want to craft a game like Metroid Prime 1 or 2,
where the graphic engine runs at 60fps from the moment the game turns
on, and loading new "levels" (sections of the world separated by doors)
is done on the fly, with the door not opening until the next section
loads. Nevertheless, the frame rate never drops below 60fps, and the
game is still playable in the room you're coming from.

I've used this as an example before. I wouldn't keep going back to it
except that I find it makes for a /very/ polished and immersive
experience for me (except for the occasional wait for a door to open or
a (60fps) elevator ride - far less jarring than a "LOADING" screen), and
impresses me greatly both from a gameplay perspective and a programming
one.

I would like to think that an incremental GC that had a low enough
maximum run time that it could run between 60fps frames and yet not use
up enough total CPU to prevent such a game from running. This, however,
may be wishful thinking. :)

-bcd
--
*** Brian Downing <bdowning at lavos dot net>

Christopher C. Stacy

unread,
Dec 19, 2004, 11:46:02 PM12/19/04
to
Is it really clear that garbage collection ever needs to happen?

Brian Downing

unread,
Dec 20, 2004, 12:39:39 AM12/20/04
to
In article <uwtvdz...@news.dtpq.com>,

Christopher C. Stacy <cst...@news.dtpq.com> wrote:
> Is it really clear that garbage collection ever needs to happen?

Well, no, of course, but writing consless Lisp in the general case is a
pain.

Christopher C. Stacy

unread,
Dec 20, 2004, 4:06:14 PM12/20/04
to
Brian Downing <see-si...@lavos.net> writes:

> In article <uwtvdz...@news.dtpq.com>,
> Christopher C. Stacy <cst...@news.dtpq.com> wrote:
> > Is it really clear that garbage collection ever needs to happen?
>
> Well, no, of course, but writing consless Lisp in the general case is a pain.

No, I'm suggesting that maybe you just won't exhaust memory, anyway.

Philip Haddad

unread,
Dec 21, 2004, 9:44:26 AM12/21/04
to
not if you run the app for a long enough time

Christopher C. Stacy

unread,
Dec 21, 2004, 11:01:38 AM12/21/04
to
> not if you run the app for a long enough time

Has anyone done measurements of the consing and how long the
application runs? Video games don't run for very long, and I
bet they don't need to cons very much either. Maybe people
are worrying about a problem that never happens.

The reason I ask is that historically, over a quarter century ago,
and more recently, people used to sometimes run hairy AI programs
without using the garbage collector. (This was especially true
for demos, when you didn't want the GC going off in the middle;
also true on systems where the GC was still too buggy.)

I am trying to point out that there are many different techniques
for garbage collection, and that there is also the option to just
turn it off, which often works out very well.

Matthias

unread,
Dec 21, 2004, 11:29:07 AM12/21/04
to
cst...@news.dtpq.com (Christopher C. Stacy) writes:

> > not if you run the app for a long enough time
>
> Has anyone done measurements of the consing and how long the
> application runs? Video games don't run for very long, and I
> bet they don't need to cons very much either. Maybe people
> are worrying about a problem that never happens.

I haven't written a video game, only some small OpenGL demos in
Scheme. At least with my implementation they do cons,
heavily. (During animations, new OpenGL primitives are created all the
time). I didn't much care for it, but the GC did cause the demos to
stop briefly every few seconds.

just...@gmail.com

unread,
Dec 21, 2004, 1:01:56 PM12/21/04
to

As usual whether your runtime performance is affected by GC time would
seem to be context dependent.

If you construct your lisp code so that it does no consing whatsoever
and manipulates only pre-existing (perhaps native) structures then it's
a non-issue. The question then becomes do you lose too much of the
utility of Lisp in that case for it to be worthwhile using it?

My guess is yes. The ideal place for Lisp code in a game is to run high
level game logic and AI, using internal native data structures for
doing rendering, CPU intensive AI operations like path finding,
animation etc.

If I was using Lisp in this way I'm pretty sure I would want to be able
to create lists, sequences, strings etc without having to worry about
the GC. Or if I did have to worry about it, to at least have a known
maximum hit per frame, or to be able to tell the GC when it has time
and how much.

I don't know much about GC and what techniques there are, but that in a
nutshell is what I would want. Something with a known worse case, and
strict control over the CPU time it has available.

In Christopher's post above he mentions that he creates OpenGL
primitives. My guess is that in a console game you would want to
create objects at the primitive level within native code. The Lisp
layer would merely message the native code to load, create and
manipulate them at the object level.
Justin Heyes-Jones
Senior Programmer
Exile Interactive

mikel

unread,
Dec 21, 2004, 2:21:18 PM12/21/04
to
Christopher C. Stacy wrote:
> > not if you run the app for a long enough time
>
> Has anyone done measurements of the consing and how long the
> application runs? Video games don't run for very long, and I
> bet they don't need to cons very much either. Maybe people
> are worrying about a problem that never happens.

It's not unusual for a session of World of Warcraft to last several
hours. It's not unheard-of among hardcore players for one to last
several days, but human stamina being what it is, a need to restart the
game a few times in that interval might be acceptable.

Christopher C. Stacy

unread,
Dec 21, 2004, 4:24:45 PM12/21/04
to
mikel <mi...@evins.net> writes:

It doesn't matter how many hours or days it runs, except as related
to how much garbage is being generated. Six hours might be nothing,
or it might be an eternity. Randomly guessing about a hypothetical
application is probably not a sound technical approach to making
this evaluation.

Vorax

unread,
Dec 26, 2004, 9:47:20 PM12/26/04
to


I am currently writing a Open GL game engine in Java. I have found
that you can avoid memory allocation in the rendering loop entirely if
you try. I used to believe that this would be a problem as well.

The best trick I can suggest (and this would work just as well in
Lisp), is creating a pool of temporary member variables within your
objects. Upon construction, allocate the objects, then use them when
you need temporary variables, or need to send a result back to a
caller, but don't want to mess your internal state. The next thing to
remember is that the callers should be doing the same thing.

Example:
class Caller {
private Matrix4 m = new Matrix4();
private Matrix4 myMatrix

...

myMatrix.set( matStack.getMultMatrix(myMatrix) );
// do something
...
}

class MatrixStack {
private Matrix4 scratch = new Matrix4();
private Matrix4 internallyUsedMatrix = new Matrix4();

public Matrix4 getMVMatrix(Matrix4 m) {
// math stuff here.
..

scratch.set( myMatrix );

return scratch;
}
}


If its safe, go ahead and use the variable directly to avoid having to
copy the results contents into your local variable. Try to return safe
values where you can because this avoids any data copying (the
myMatrix.set( ... code above). Most cases you don't need to copy
anything either because the life of data is only one rendering frame
within the loop and your loop is a very controlled environment (by
you).

My engine currently (though still in development) can maintain 63 fps
(configured to that for synchronization with monitor refresh) with 70K+
tris with 12 animated MD3 chars and 3 active particle systems of 6k
particles without a flicker in the FPS count (monitored for 12 hours).
...If Java can do it, Lisp can to. The engine is doing all Matrix
calcs on behalf of Open GL as well (so I can track VW coords while
using LC coords for my scene graph), and that would be the biggest risk
for GC (lots of passing around and potential for creating new objects).

Steven M. Haflich

unread,
Dec 27, 2004, 1:10:10 AM12/27/04
to
There seems to be a flawed assumption in this thread that normal
Lisp execution must suspend during gc. While this is true of
essentially all the main line CL implementations today, there is
no fundamental reason why this must be so. There have been in
the past gc implementations that worked in parallel without
suspending Lisp execution (at least with worst-possible
suspension being acceptably short) and there is no reason why
modern designs and implementations would not be possible today.

I've been sitting on sucha design for about a decade, and there
are other known designs that are at least as good as the one in
my head. Here are the issues:

Designing and implementing a robust GC in a mature and complete
CL implementation is a big job. The code generator needs to know
a lot about the gc and memory allocator, and vice versa. All the
changes to make these things work with one another causes a lot
of destabilization. Square that with the existence of
multithreading. Oh, and did I mention multithreading? (If you
don't get the joke, nevermind.) When and how does it become
worthwhile for a stable implementation to suffer the instability
of such a fundamental change? And how do you fund it?

Code generated by the compiler and higher-level code (for
instance, the implementation of hash tables) in certain
critical sections may depend upon atomicity of execution for
certain segments of code. In a multiprocessor environment these
assumptions may not happen automatically, and enforcing them may
require interlocks or suboptimal code sequences, and these
changes reduce the speed of _all_ code execution. How much is
the possibility of parallel gc (and perhaps also parallel
multiprocessor execution) be worth? Would it be worth a 20%
slowdown? Would it be worth a 50% slowdown? These are the
kinds of estimates postulated by experts who have implemented
existing compilers and gc systems.

Something to think about.

Christopher C. Stacy

unread,
Dec 27, 2004, 4:18:47 AM12/27/04
to
If you want to really hair things up, maybe a multiprocessor
system could actually be used to run the GC in parallel with
the mutator threads; I wonder if this would be fast, or worse?

Duane Rettig

unread,
Dec 27, 2004, 12:09:10 PM12/27/04
to
cst...@news.dtpq.com (Christopher C. Stacy) writes:

Yes :-)

It would first be a requirement - if a gc is not explicitly designed
to run in a thread while other threads are running (on other
processors, of course), then each thread would have to be stopped
while some segment of the gc runs, and it wouldn't be what I consider
to be a true multiprocessor-ready gc. If, however, other threads are
allowed to run during gc operations, then it is only a question of
configuration as to whether th gc runs in its own thread (preferred)
or in one of the threads that was already running Lisp code. Either
way, it is both

- faster: multiple threads can continue to run without stoppage
from gc operations

- slower: every access to the lisp heap must take a hardware or
software trap, or an extra indirection, to ensure that the access
is properly forwarded.

This means that individual threads would in general be slower,
but a system of highly parallelized operations would be faster.
So depending on the nature of the application, the overall result
might be faster or slower, to the extent that the application is
itself parallelizable to fit into the number of processors.
However, Amdahl's Law suggests that most applications would be
slower...

As for the tradeoffs on the slower side: the extra indirection
is the most straightforward solution; it takes very little extra
logic on the Lisp code side, but the cost is one indirection for
every access. This means that non-parallel applications are halved
in speed for access-heavy operation. The software trap situation
is less constricting to Lisp code, but it requires specific memory
layouts to ensure that the "trap" tests are fast. I believe that
Steve's solution is one of these. The hardware trap situation is
by far the least intrusive to normal operation _when_ a page is
not being forwarded, but it has an extremely high cost (on the order
of a hundred or more cycles) when forwarding is required, or, if
an operating system allows the kind of "fast user trap" that would
be necessary to get in and out with automatic forwarding built
in (say, as a level-one trap), the cycle count could be pushed
to something less than a hundred cycles.

--
Duane Rettig du...@franz.com Franz Inc. http://www.franz.com/
555 12th St., Suite 1450 http://www.555citycenter.com/
Oakland, Ca. 94607 Phone: (510) 452-2000; Fax: (510) 452-0182

Steven M. Haflich

unread,
Dec 27, 2004, 10:12:02 PM12/27/04
to

Parallel gc is implicit in the design, but the combined execution
would still be slower for a single, non-parallelizable mutator
computation. The argument is that a well-tuned application can
generally run with less than 10% overhead for gc, but the estimates
for parallel gc overhead are generally larger than 10%.

As a limiting case with a single processor, parallel gc designs
degenerate into nonobtrusive gc, where the gc can run in small
processor quanta. Such a design could still be useful, even
though the overall execution speed would be slowed, because gc-
induced latency could be made quite small and acceptable to
some real-time applications. Also, the interlocking issues
with only a single processor are much less severe than with
multiple processors, so the speed hit is much less than with
even two processors (mutator and gc). But I have no numbers
to offer.

0 new messages