On Sun, May 9, 2010 at 5:39 AM, Paolo 'Blaisorblade' Giarrusso
<p.gia...@gmail.com> wrote:
> Please, not again. The performance myth that "garbage collection is so
> slow" "Java is slow" is old and should be dead. About Java, it has
> just bad startup performance. So please don't draw uneducated
> conclusions on GC.
Four historic reasons keep the myth alive. One, most functional
languages at that time were garbage collected, but comparatively little
was known about making functional languages go fast, which fed the idea
that the garbage collector was to blame. Two, when Java came out, it
had an awful garbage collector. Three, Java programmers, the bad ones,
tend to like wrapping their objects in objects. If this happens several
times, cache memory quickly becomes a scarce resource, which in turn
hurts performance. Four, GC is forgiving. If you write a bad program
with a horrendous allocation rate and bad reference management, the GC
saves the day. The program runs, but it is slow.
> And some concurrent algorithms (like immutable data structures) become
> unmanageable without GC (unless GC is emulated through atomic
> refcounting - that's easy in C++, but much slower).
Right. GC is a productivity trade-off. With GC in Go, you spend more
time on your algorithm and problem than on tedious memory management.
When memory becomes a problem, garbage collectors normally allow for
heap inspection in some way. You can then draw statistics from the GC
and optimize the bottleneck. I will claim that your time is much
better spent with this approach.
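As a rough sketch of what "drawing statistics from the GC" can look
like in Go: the runtime package exposes cumulative allocation counters
through runtime.ReadMemStats (this is the API of current Go releases).
The helper name allocDelta and the sink variable are made up for
illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// allocDelta (hypothetical helper) runs work and reports how many heap
// objects and bytes were allocated while it ran, using the cumulative
// Mallocs and TotalAlloc counters in runtime.MemStats.
func allocDelta(work func()) (mallocs, bytes uint64) {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	work()
	runtime.ReadMemStats(&after)
	return after.Mallocs - before.Mallocs, after.TotalAlloc - before.TotalAlloc
}

// sink is package-level so the slices below escape to the heap.
var sink [][]byte

func main() {
	mallocs, bytes := allocDelta(func() {
		for i := 0; i < 1000; i++ {
			sink = append(sink, make([]byte, 1024))
		}
	})
	fmt.Printf("mallocs=%d bytes=%d\n", mallocs, bytes)
}
```

Wrapping a suspicious piece of code this way gives a quick feel for its
allocation rate before reaching for a full heap profile.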
There is another interesting GC strategy for a highly concurrent
language, which is the one Erlang uses. In Erlang, a process is a
light-weight, userland-scheduled beast, much akin to a goroutine. Two
differences, however: processes are completely isolated from each
other, hence warranting the process term, and each has its own garbage
collector attached. The consequence is that all communication between
processes must happen by copying. It makes the "Share data by
communicating" mantra extremely important. Don't hand a 16 MB binary
search tree to another process! It will not be passed by reference; it
will be copied. The one-GC-per-process design yields some nice
soft-realtime properties in the system. Also, it means that a process
crash is non-fatal by construction. The rest of the program can be sure
it does not hold garbled data, so it can keep running. The strategy of
limiting the impact of errors is central to Erlang programming.
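To make the difference concrete, here is a small Go sketch that
contrasts the two message-passing styles. It only emulates Erlang's
copy-on-send with an explicit copy; it is not how Erlang is
implemented, and the function names sendCopy and sendShared are
invented for the example.

```go
package main

import "fmt"

// sendCopy emulates Erlang-style message passing: the receiver gets its
// own copy of the data, so later writes by the sender are invisible.
func sendCopy(ch chan<- []int, data []int) {
	msg := make([]int, len(data))
	copy(msg, data)
	ch <- msg
}

// sendShared is what a Go channel does with a slice by default: only
// the slice header travels, and both sides share one backing array.
func sendShared(ch chan<- []int, data []int) {
	ch <- data
}

func main() {
	data := []int{1, 2, 3}

	copied := make(chan []int, 1) // buffered, so the sends do not block
	shared := make(chan []int, 1)
	sendCopy(copied, data)
	sendShared(shared, data)

	data[0] = 99 // the sender mutates after sending

	fmt.Println((<-copied)[0]) // prints 1: the receiver is isolated
	fmt.Println((<-shared)[0]) // prints 99: the receiver sees the write
}
```

The copy costs time proportional to the message size, which is exactly
why you do not want to send that 16 MB tree around.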
On the other hand, this design choice forces other parts of Erlang to
work in certain ways. To facilitate Erlang's main purpose, protocol
handling, large chunks of binary data are kept in a separate,
ref-counted heap. Each process only holds a small slice structure
referring to the binary in its own heap space. Also, the system
essentially supports pervasive hash tables, where any process can go
look up information.
--
J.