Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

advice/thoughts on garbage collection?

8 views
Skip to first unread message

mij...@yahoo.com

unread,
Feb 13, 2009, 6:18:27 PM2/13/09
to
Hello,

I'd appreciate any advice fellow bit surgeons may
have pertinent to a c language implementation
of a postscript interpreter/X11 graphics engine.

I've got the basic object model/scanner/executive
interface working quite well (if I do say so myself :),
and have begun to consider how best to handle
the virtual memory and garbage collection aspects.
I confess I haven't scoured the ghostscript code
for an answer to this for several reasons (it's so
ungodly huge; and my design is geared toward
simplicity rather than speed per se, so it seems
unlikely to reap much benefit for the days and
weeks it would require to digest), but I have
looked at several other postscript sources
with varying results: crispin goswell's interpreter
doesn't appear to do it at all, rbuss uses the GC
library with some painful interface (a brief
browsing reinforced my fear of ghostscript),
and the amiga programs are all in 68000 assembler.
I don't even have a reference for that.

So it appears my options boil down to:
1. Use GC.
pro: It's standard, it's good.
con: It's almost as big as my entire program!
I can manage the internal structures just fine,
using realloc for growing structures and freeing
display lists when they're replaced with new
ones. Most of these persist through the life
of the process anyway. So is it easy to
attach GC just to the object space and just
replace malloc in the object allocators?
Can I tell GC my object structure so it
knows exactly where my pointers may
be and under what specific circumstances
they may be pointers (from the type field)?
Will any of this help me with Virtual Memory
(scoped memory changes, allowing arbitrary
"backing out" of data changes)?

2. Do it all myself.
With all the specificity I'd have to feed to
GC, would it be simpler to write my own
tracing function? I could even add a quick
bit-field reference count check to the 'pop'
operator and free most garbage right away.
Would I front-end malloc and keep all the
pointers in a table?

I've got the book coming, but I wanted to
give my subconscious a jump-start on the
issues involved.

tia.

luser-ex-troll

Malcolm McLean

unread,
Feb 15, 2009, 10:43:05 AM2/15/09
to
<mij...@yahoo.com> wrote in message news:

> I'd appreciate any advice fellow bit surgeons may
> have pertinent to a c language implementation
> of a postscript interpreter/X11 graphics engine.
>
Garbage collection has to be built into the compiler. Barring lots of evil
hacks, there's no way of providing it as user code.
Jacob Navia has a C compiler that offers garbage collection as an extension.
However I don't think he has a version for Unix.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Bartc

unread,
Feb 15, 2009, 11:30:37 AM2/15/09
to

<mij...@yahoo.com> wrote in message
news:e174e25d-de5a-4d4d...@v31g2000vbb.googlegroups.com...

> Hello,
>
> I'd appreciate any advice fellow bit surgeons may
> have pertinent to a c language implementation
> of a postscript interpreter/X11 graphics engine.
>
> I've got the basic object model/scanner/executive
> interface working quite well (if I do say so myself :),
> and have begun to consider how best to handle
> the virtual memory and garbage collection aspects.
> I confess I haven't scoured the ghostscript code
> for an answer to this for several reasons (it's so
> ungodly huge; and my design is geared toward
> simplicity rather than speed per se, so it seems
> unlikely to reap much benefit for the days and
> weeks it would require to digest), but I have
> looked at several other postscript sources
> with varying results: crispin goswell's interpreter
> doesn't appear to do it at all, rbuss uses the GC
> library with some painful interface (a brief
> browsing reinforced my fear of ghostscript),
> and the amiga programs are all in 68000 assembler.
> I don't even have a reference for that.
>
> So it appears my options boil down to:
> 1. Use GC.

> 2. Do it all myself.

Perhaps look at whether a language as straightforward as Postscript (if I
recall) needs garbage collection all.

If the language variables don't share data, and it doesn't use pointers,
then you can just allocate/deallocate/reallocate as needed, as you seem to
have discovered already.

--
Bartc

mij...@yahoo.com

unread,
Feb 15, 2009, 1:35:23 PM2/15/09
to
On Feb 15, 10:30 am, "Bartc" <ba...@freeuk.com> wrote:
> <mijo...@yahoo.com> wrote in message
>
[snip!]

> > So it appears my options boil down to:
> > 1. Use GC.
> > 2. Do it all myself.
>
> Perhaps look at whether a language as straightforward as Postscript (if I
> recall) needs garbage collection all.
>
> If the language variables don't share data, and it doesn't use pointers,
> then you can just allocate/deallocate/reallocate as needed, as you seem to
> have discovered already.
>
> --
> Bartc

Ah. But certain of the objects DO have (internal)
pointers.
The object structure boils down to this:

typedef struct s_object Object;
struct s_object {
enum {booleantype, realtype, arraytype, ...} type;
unsigned flags;
unsigned length;
union { int b; float f; Object *a; ...};
};

And there is a troublesome operator, 'getinterval',
which returns a subarray. From the manual:

array index count getinterval subarray
creates a new array ... whose value consists of some
subsequence of the original array .... The subsequence
consists of count elements starting at the specified
index in the original object. The elements in the
subsequence are shared between the original and new
objects.

so if i do:
[0 1] dup 0 1 getinterval %[0 1] [0]
0 2 put ==

should print [2 1].
the subarray reuses the appropriate portion of the
original array so they are sharing memory through
pointers. The difficulty here is when the subarray
is dropped from the stack (by 'put' in the example),
its pointer is not necessarily going to be present
in any cache of malloc'd memory (since it could be
offset to anywhere in the middle).

Actually I may have just answered my own question.
If I add an integer offset to the structure, then
the Object *a member can always hold the pointer
returned by malloc.

All of this get much more complicated with the
Display Postscript and NeWS extensions for
multithreading.

My work is scratched out for me, but a great deal
of engraving and painting remain.

luser-ex-troll

mij...@yahoo.com

unread,
Feb 15, 2009, 1:42:42 PM2/15/09
to
On Feb 15, 9:43 am, "Malcolm McLean" <regniz...@btinternet.com> wrote:
> <mijo...@yahoo.com> wrote in message news:

I appreciate the response but it simply cannot be true.
The Boehm GC package is an example. But conceptually,
anything in the compiler must have been user code first
(although it need not have been present in the most
recent source: ie. the cc/login hack).

Or perhaps what you say is true, and all of my code
consists of "evil hacks". I can live with that.

luser-ex-trolley

Bartc

unread,
Feb 15, 2009, 2:34:53 PM2/15/09
to
mij...@yahoo.com wrote:
> On Feb 15, 10:30 am, "Bartc" <ba...@freeuk.com> wrote:
>> <mijo...@yahoo.com> wrote in message
>>
> [snip!]
>>> So it appears my options boil down to:
>>> 1. Use GC.
>>> 2. Do it all myself.
>>
>> Perhaps look at whether a language as straightforward as Postscript
>> (if I recall) needs garbage collection all.
>>
>> If the language variables don't share data, and it doesn't use
>> pointers, then you can just allocate/deallocate/reallocate as
>> needed, as you seem to have discovered already.

> The object structure boils down to this:


>
> typedef struct s_object Object;
> struct s_object {
> enum {booleantype, realtype, arraytype, ...} type;
> unsigned flags;
> unsigned length;
> union { int b; float f; Object *a; ...};
> };

That shouldn't be a problem, but depends on, when you're copying one object
to another, whether the memory pointed to by *a is duplicated.

>
> And there is a troublesome operator, 'getinterval',

> which returns a subarray ... The elements in the


> subsequence are shared between the original and new
> objects.
>
> so if i do:
> [0 1] dup 0 1 getinterval %[0 1] [0]
> 0 2 put ==
>
> should print [2 1].
> the subarray reuses the appropriate portion of the
> original array so they are sharing memory through
> pointers. The difficulty here is when the subarray
> is dropped from the stack (by 'put' in the example),
> its pointer is not necessarily going to be present
> in any cache of malloc'd memory (since it could be
> offset to anywhere in the middle).

Routines which decide whether a pointer points to an allocated block I think
can already deal with pointers into the middle of the block.

> Actually I may have just answered my own question.
> If I add an integer offset to the structure, then
> the Object *a member can always hold the pointer
> returned by malloc.
>
> All of this get much more complicated with the
> Display Postscript and NeWS extensions for
> multithreading.
>
> My work is scratched out for me, but a great deal
> of engraving and painting remain.

My strategy for implementing a simple stack language was:

* Variables always have their own, unshared data (ie. complex variables such
as arrays)

* When variable values are pushed onto the stack, a copy of the descriptor
(eg. your Object struct) is pushed, and a bit set saying this is a Copy.
This works provided the original variable is not going to disappear during
this expression.

* Values popped from the stack are freed, if the Copy bit is not set. Your
getinterval op would simply have created a slice (an Object set up for an
array) with Copy set; popping this requires no deallocation.

* Values popped into a variable, are duplicated if the Copy bit is set

* And functions which return a value should duplicate that value if
otherwise the Copy bit is set (for reasons which may not apply to
Postscript)

This way I've narrowly managed to avoid incorporating a proper GC for quite
a few years (and this language also has troublesome global variables and
pointers), but at a cost of deep copies for assignments.

--
Bartc

Tim Rentsch

unread,
Feb 15, 2009, 2:39:29 PM2/15/09
to
"Malcolm McLean" <regn...@btinternet.com> writes:

> <mij...@yahoo.com> wrote in message news:
> > I'd appreciate any advice fellow bit surgeons may
> > have pertinent to a c language implementation
> > of a postscript interpreter/X11 graphics engine.
> >
> Garbage collection has to be built into the compiler. Barring lots of evil
> hacks, there's no way of providing it as user code.
> Jacob Navia has a C compiler that offers garbage collection as an extension.
> However I don't think he has a version for Unix.

I don't know what sorts of code might be considered "evil hacks" by
this poster. But if you want something that's reasonably likely to
provide the capability you want, try the Hans Boehm collector:

http://www.hpl.hp.com/personal/Hans_Boehm/gc/

One sentence summary: just use malloc(), don't bother using free();
compiler modifications not required.

Quoting from the page, under "Platforms":

The collector is not completely portable, but the distribution
includes ports to most standard PC and UNIX/Linux platforms. The
collector should work on Linux, *BSD, recent Windows versions,
MacOS X, HP/UX, Solaris, Tru64, Irix and a few other operating
systems. Some ports are more polished than others. There are
instructions for porting the collector to a new platform.

Malcolm McLean

unread,
Feb 15, 2009, 3:06:56 PM2/15/09
to

"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfnr61z...@alumnus.caltech.edu...

> "Malcolm McLean" <regn...@btinternet.com> writes:
>
>> <mij...@yahoo.com> wrote in message news:
>> > I'd appreciate any advice fellow bit surgeons may
>> > have pertinent to a c language implementation
>> > of a postscript interpreter/X11 graphics engine.
>> >
>> Garbage collection has to be built into the compiler. Barring lots of
>> evil
>> hacks, there's no way of providing it as user code.
>> Jacob Navia has a C compiler that offers garbage collection as an
>> extension.
>> However I don't think he has a version for Unix.
>
> I don't know what sorts of code might be considered "evil hacks" by
> this poster.

Form the site:

DATASTART
The beginning of the main data segment. The collector will trace all memory
between DATASTART and DATAEND for root pointers. On some platforms,this can
be defined to a constant address, though experience has shown that to be
risky. Ideally the linker will define a symbol (e.g. _data whose address is
the beginning of the data segment. Sometimes the value can be computed using
the GC_SysVGetDataStart function. Not used if either the next macro is
defined, or if dynamic loading is supported, and the dynamic loading support
defines a function GC_register_main_static_data() which returns false.
SEARCH_FOR_DATA_START
If this is defined DATASTART will be defined to a dynamically computed value
which is obtained by starting with the address of _end and walking backwards
until non-addressable memory is found. This often works on Posix-like
platforms. It makes it harder to debug client programs, since startup
involves generating and catching a segmentation fault, which tends to
confuse users.

I am not saying the garbage collector is a bad thing, but these are clearly
evil hacks.

mij...@yahoo.com

unread,
Feb 15, 2009, 5:26:04 PM2/15/09
to
On Feb 15, 1:34 pm, "Bartc" <ba...@freeuk.com> wrote:
> mijo...@yahoo.com wrote:
[snip!]

> >> If the language variables don't share data, and it doesn't use
> >> pointers, then you can just allocate/deallocate/reallocate as
> >> needed, as you seem to have discovered already.
> > The object structure boils down to this:
>
> > typedef struct s_object Object;
> > struct s_object {
> > enum {booleantype, realtype, arraytype, ...} type;
> > unsigned flags;
> > unsigned length;
> > union { int b; float f; Object *a; ...};
> > };
>
> That shouldn't be a problem, but depends on, when you're copying one object
> to another, whether the memory pointed to by *a is duplicated.
>

it is for the 'copy' operator (which demands an
appropriately sized container. but the 'dup' operator
simply copies the entire structure (pointer and all).
So it appears that 'dup' will need to do a little
extra work for composite objects (objects whose payload
is a pointer to memory containing other objects).
It seems I can do this with a bit in the flags.
Thanks. This may help enormously.


> > so if i do:
> > [0 1] dup 0 1 getinterval %[0 1] [0]
> > 0 2 put ==
>
> > should print [2 1].
> > the subarray reuses the appropriate portion of the
> > original array so they are sharing memory through
> > pointers. The difficulty here is when the subarray
> > is dropped from the stack (by 'put' in the example),
> > its pointer is not necessarily going to be present
> > in any cache of malloc'd memory (since it could be
> > offset to anywhere in the middle).
>
> Routines which decide whether a pointer points to an allocated block I think
> can already deal with pointers into the middle of the block.

that makes sense. check whether pointer is within
the allocated space. p > space && p < (space + size)

> > My work is scratched out for me, but a great deal
> > of engraving and painting remain.
>
> My strategy for implementing a simple stack language was:
>
> * Variables always have their own, unshared data (ie. complex variables such
> as arrays)
>
> * When variable values are pushed onto the stack, a copy of the descriptor
> (eg. your Object struct) is pushed, and a bit set saying this is a Copy.
> This works provided the original variable is not going to disappear during
> this expression.
>
> * Values popped from the stack are freed, if the Copy bit is not set.

So one object /owns/ the memory, (flags & COPY) == 0.
And pop adds: ((o.flags&COPY)?:free(o.a))
(it's a macro).

Your
> getinterval op would simply have created a slice (an Object set up for an
> array) with Copy set; popping this requires no deallocation.
>

comprendo.

> * Values popped into a variable, are duplicated if the Copy bit is set
>

or in my case, defined in a dictionary.

> * And functions which return a value should duplicate that value if
> otherwise the Copy bit is set (for reasons which may not apply to
> Postscript)
>

perhaps not, but excellent fuel for thought, regardless.

> This way I've narrowly managed to avoid incorporating a proper GC for quite
> a few years (and this language also has troublesome global variables and
> pointers), but at a cost of deep copies for assignments.
>
> --
> Bartc

I still need to think through how this affects
save/restore (restores all composite objects to previous
values).

thanks again.

luser-ex-troll

Tim Rentsch

unread,
Feb 15, 2009, 11:29:04 PM2/15/09
to
"Malcolm McLean" <regn...@btinternet.com> writes:

In regular user code, I'd probably agree with you.

For this kind of implementation add-on, it kind of comes with the
territory. Pragmatically, if it provides the right functionality,
and if the quality of code and implementation is high enough, the
benefits of using it are pretty likely to outweigh the costs of
making sure it works correctly on whatever platform(s) the code is
intended to address.

Don't get me wrong, I'd prefer a GC implementation more integrated
with the compiler(s) the project is using. Absent that, certainly
its reasonable -- for some projects, and especially in the context
of this kind of add-on -- to judge the techniques mentioned as
raising only tactical concerns, not strategic objections.

nick_keigh...@hotmail.com

unread,
Feb 16, 2009, 4:53:40 AM2/16/09
to
On 13 Feb, 23:18, mijo...@yahoo.com wrote:
> Hello,
>
> I'd appreciate any advice fellow bit surgeons may
> have pertinent to a c language implementation
> of a postscript interpreter/X11 graphics engine.
>
> I've got the basic object model/scanner/executive
> interface working quite well (if I do say so myself :),
> and have begun to consider how best to handle
> the virtual memory and garbage collection aspects.

perhaps it's a bit late in the day to be considering
memory management *after* you've implemented your code.
Garbage collection isn't the only way to manage memory.

<sniP>

mij...@yahoo.com

unread,
Feb 16, 2009, 10:41:16 AM2/16/09
to

Agreed. My claim may have both over- and under-promoted
the work I've done so far. I began a few days after
January 1, and have already achieved a monstrous 2557
lines. I've been using the "tracer-bullets" strategy
from /The Pragmatic Programmer/, so it really does fit
to add these sorts of larger features laterally. And I
didn't see any need to worry about the issue until the
concept was proved.
Now that it's drawing filled polygons from a file,
run by a procedure, I feel it's safe to flesh out more
of the features concerning whose implementation I was
less conversant.

Is there something else that would accomplish similar
ends in a different manner (besides option 0: leave it
leaking memory).

luser-ex-troll

nick_keigh...@hotmail.com

unread,
Feb 17, 2009, 4:50:19 AM2/17/09
to
On 16 Feb, 15:41, mijo...@yahoo.com wrote:
> On Feb 16, 3:53 am, nick_keighley_nos...@hotmail.com wrote:
> > On 13 Feb, 23:18, mijo...@yahoo.com wrote:

> > > I'd appreciate any advice fellow bit surgeons may
> > > have pertinent to a c language implementation
> > > of a postscript interpreter/X11 graphics engine.
>
> > > I've got the basic object model/scanner/executive
> > > interface working quite well (if I do say so myself :),
> > > and have begun to consider how best to handle
> > > the virtual memory and garbage collection aspects.
>
> > perhaps it's a bit late in the day to be considering
> > memory management *after* you've implemented your code.
> > Garbage collection isn't the only way to manage memory.
>
> > <sniP>
>
> Agreed. My claim may have both over- and under-promoted
> the work I've done so far. I began a few days after
> January 1, and have already achieved a monstrous 2557
> lines. I've been using the "tracer-bullets" strategy
> from /The Pragmatic Programmer/,

I've considered buying this but never have. Is it any good?

> so it really does fit
> to add these sorts of larger features laterally. And I
> didn't see any need to worry about the issue until the
> concept was proved.

ok, I perhaps sounded harsher than I meant to.
But my experiences of trying to retrofit sane
memory management to leaky implementaions were
not good.

> Now that it's drawing filled polygons from a file,
> run by a procedure, I feel it's safe to flesh out more
> of the features concerning whose implementation I was
> less conversant.
>
> Is there something else that would accomplish similar
> ends in a different manner (besides option 0: leave it
> leaking memory).

eeek!

I was thinking of explicitly free()ing when the memory
went out of use. *Some* part of your program ought to
know when you've finished with something. Even take a look at
C++s smart pointers and RAII.


> luser-ex-troll

mij...@yahoo.com

unread,
Feb 17, 2009, 10:35:10 AM2/17/09
to
On Feb 17, 3:50 am, nick_keighley_nos...@hotmail.com wrote:
> On 16 Feb, 15:41, mijo...@yahoo.com wrote:
>
[snip!]

> > > > I've got the basic object model/scanner/executive
> > > > interface working quite well (if I do say so myself :),
> > > > and have begun to consider how best to handle
> > > > the virtual memory and garbage collection aspects.
>
> > > perhaps it's a bit late in the day to be considering
> > > memory management *after* you've implemented your code.
> > > Garbage collection isn't the only way to manage memory.
>
> > > <sniP>
>
> > Agreed. My claim may have both over- and under-promoted
> > the work I've done so far. I began a few days after
> > January 1, and have already achieved a monstrous 2557
> > lines. I've been using the "tracer-bullets" strategy
> > from /The Pragmatic Programmer/,
>
> I've considered buying this but never have. Is it any good?
>

I liked it so much I never took it back to the library.
(But I will soon. Sorry, Ben Franklin!)

> > so it really does fit
> > to add these sorts of larger features laterally. And I
> > didn't see any need to worry about the issue until the
> > concept was proved.
>
> ok, I perhaps sounded harsher than I meant to.
> But my experiences of trying to retrofit sane
> memory management to leaky implementaions were
> not good.

Yes. I agree. But I consider this an exception since
it's still "early days" for a project of this magnitude
(it took Peter Deutsch 5 years to release the first
version of ghostscript).
I still haven't tackled fonts, aliasing, images, ....
Of course I hope this software will be brilliant and
revolutionary, but I'll be satisfied if I learn enough
to try again (or something different but representing
a similar challange) and if it helps distract me from
porn.

> > Now that it's drawing filled polygons from a file,
> > run by a procedure, I feel it's safe to flesh out more
> > of the features concerning whose implementation I was
> > less conversant.
>
> > Is there something else that would accomplish similar
> > ends in a different manner (besides option 0: leave it
> > leaking memory).
>
> eeek!
>
> I was thinking of explicitly free()ing when the memory
> went out of use. *Some* part of your program ought to
> know when you've finished with something. Even take a look at
> C++s smart pointers and RAII.

There are specific points where objects are discarded
(storing a new value in a dictionary, popping from
the stack, every operator that takes arguments from
the stack). But at these places, the program logic is
unaware of whether the object is unique (and ready to
be free'd). The postscript language appears to demand
garbage collection of some sort.

I'm afraid of C++. It has disappointed my expections
on numerous occasions. But I'll look into RAII.

luser-ex-troll

Flash Gordon

unread,
Feb 17, 2009, 10:10:05 PM2/17/09
to

I'e not looked at the details of postscript so this might be completely
the wrong approach, but you could consider reference counting since I'm
assuming that you need to keep objects alive until the last reference to
it goes. You write functions to create objects, duplicate them, add and
destroy references etc. Then the routine for destroying the object (or
deleting a reference) decrements the reference count and only actually
frees it if that was the last reference.

> I'm afraid of C++. It has disappointed my expections
> on numerous occasions. But I'll look into RAII.

From the sounds of it you will learn a lot about managing memory by
implementing this in C, so if this is a learning project then it is
probably worth continuing. Then, once you have solved it in C, see if it
can be solved more elegantly by using C++ and RAII or some other tool.
However, remember that very little is free, and there are certainly
costs to garbage collection, but sometimes those costs are worth paying.
--
Flash Gordon

dj3v...@csclub.uwaterloo.ca.invalid

unread,
Feb 17, 2009, 11:31:44 PM2/17/09
to
In article <fpas66x...@news.flash-gordon.me.uk>,
Flash Gordon <sp...@flash-gordon.me.uk> wrote:

>I'e not looked at the details of postscript so this might be completely
>the wrong approach, but you could consider reference counting since I'm
>assuming that you need to keep objects alive until the last reference to
>it goes. You write functions to create objects, duplicate them, add and
>destroy references etc. Then the routine for destroying the object (or
>deleting a reference) decrements the reference count and only actually
>frees it if that was the last reference.

Reference counting has a lot of advantages (it's simple to implement,
simple to understand, and guaranteed to deallocate resources as soon as
it can be known that they're no longer needed), but it has one
nontrivial disadvantage (it needs space for a counter for every object
you create) and one often-fatal flaw:
*** If you can *EVER* create a reference cycle, a refcounting ***
*** GC will *NEVER* collect it. ***
So unless your type system is restricted enough (and strong enough) to
guarantee that cycles are impossible, or your object model lets you
forbid any mutation that might change what existing objects refer to
(if you can only create objects with references to existing objects,
and you can't modify objects once they're created, that forces your
reference graph to be acyclic), refcounting will be inadequate.
(It's possible to work around this problem, but in doing so you end up
losing a lot of the advantages refcounting has over other types of
garbage collection; in particular, the client code typically needs to
be aware that it's using a refcounted GC and know how to play nicely
with it.)
Like Flash, I don't know enough about postscript to comment on whether
refcounting is appropriate for your (OP's) purposes.


> From the sounds of it you will learn a lot about managing memory by
>implementing this in C, so if this is a learning project then it is
>probably worth continuing. Then, once you have solved it in C, see if it
>can be solved more elegantly by using C++ and RAII or some other tool.
>However, remember that very little is free, and there are certainly
>costs to garbage collection, but sometimes those costs are worth paying.

Experience in the world of software development at large suggests that
you meant to say "...often those costs are worth paying.".
Of course, if you narrow your scope to the sorts of things C is a good
tool for, the proportion goes way down. (I think the largest class of
things that C-plus-GC is a good match for is implementations of
higher-level languages that provide GC.)


dave

--
Dave Vandervies dj3vande at eskimo dot com
When a person knows as little as you do about a subject, they really
should stop talking. It's the only way to prevent looking foolish.
--Frank Krygowski in rec.bicycles.misc

Flash Gordon

unread,
Feb 18, 2009, 2:19:46 AM2/18/09
to

Good points. I *suspect* that in implementing a postscript interpreter
would fit within the limits of reference counting, but you are correct
that it might not.

>> From the sounds of it you will learn a lot about managing memory by
>> implementing this in C, so if this is a learning project then it is
>> probably worth continuing. Then, once you have solved it in C, see if it
>> can be solved more elegantly by using C++ and RAII or some other tool.
>> However, remember that very little is free, and there are certainly
>> costs to garbage collection, but sometimes those costs are worth paying.
>
> Experience in the world of software development at large suggests that
> you meant to say "...often those costs are worth paying.".
> Of course, if you narrow your scope to the sorts of things C is a good
> tool for, the proportion goes way down. (I think the largest class of
> things that C-plus-GC is a good match for is implementations of
> higher-level languages that provide GC.)

Well, most of the programming I've done in over 20 years has not needed
dynamic memory allocation. The instances where it was needed were

1) To get around a limit of 64K of static data, so GC was not needed
2) The software on one processor on a 20 processor system where I had to
write the SW in assembler, and the requirements were such that there
were easier ways than using GC (I could just drop *all* allocations for
the start of each time round the main loop)
5) The SW I've worked on over the past 8 years where GC might be useful.

So my personal experience is that you don't need dynamic memory
allocation most of the time, but that experience is biased by the types
of work I spent 15 years doing.

My opinion is that sometimes the cost of GC is worth it, and I'm not
going to take a guess at percentages either inside or outside C programming.
--
Flash Gordon

nick_keigh...@hotmail.com

unread,
Feb 18, 2009, 4:55:02 AM2/18/09
to
On 18 Feb, 07:19, Flash Gordon <s...@spam.causeway.com> wrote:

<snip>

> Well, most of the programming I've done in over 20 years has not needed
> dynamic memory allocation. The instances where it was needed were
>
> 1) To get around a limit of 64K of static data, so GC was not needed
> 2) The software on one processor on a 20 processor system where I had to
> write the SW in assembler, and the requirements were such that there
> were easier ways than using GC (I could just drop *all* allocations for
> the start of each time round the main loop)
> 5)

5?

> The SW I've worked on over the past 8 years where GC might be useful.
>
> So my personal experience is that you don't need dynamic memory
> allocation most of the time, but that experience is biased by the types
> of work I spent 15 years doing.

what about the cases where you don't know in advance how many
thingies
you'll need. For instance inter-processor messages where the messages
vary in size and you don't know how many you'll need. Its obvious
when you've finished with a message so you can explicitly return
and don't need Garbage Collection.

oh, wait what was the memory leak I spent 3 weeks hunting down...
That involved instrumenting every malloc and free (ok, it was new and
delete) in the system and seeing what type of thingy just kept
on increasing. Timers and inter-processor messages.

mij...@yahoo.com

unread,
Feb 18, 2009, 10:49:13 PM2/18/09
to
On Feb 17, 9:10 pm, Flash Gordon <s...@spam.causeway.com> wrote:
> mijo...@yahoo.com wrote:
[snip!]

> > There are specific points where objects are discarded
> > (storing a new value in a dictionary, popping from
> > the stack, every operator that takes arguments from
> > the stack). But at these places, the program logic is
> > unaware of whether the object is unique (and ready to
> > be free'd). The postscript language appears to demand
> > garbage collection of some sort.
>
> I'e not looked at the details of postscript so this might be completely
> the wrong approach, but you could consider reference counting since I'm
> assuming that you need to keep objects alive until the last reference to
> it goes. You write functions to create objects, duplicate them, add and
> destroy references etc. Then the routine for destroying the object (or
> deleting a reference) decrements the reference count and only actually
> frees it if that was the last reference.

Yes this appears to be exactly the approach taken by NeWS.
I found a pretty good description in Davison, et al.
/Distributed Window Systems/.
Apparently there's an extra mechanism called "soft" references
that helps will possibly cyclic structures, but I don't
really understand that part.

> > I'm afraid of C++. It has disappointed my expections
> > on numerous occasions. But I'll look into RAII.
>
>  From the sounds of it you will learn a lot about managing memory by
> implementing this in C, so if this is a learning project then it is
> probably worth continuing. Then, once you have solved it in C, see if it
> can be solved more elegantly by using C++ and RAII or some other tool.
> However, remember that very little is free, and there are certainly
> costs to garbage collection, but sometimes those costs are worth paying.
> --
> Flash Gordon

Yes, it is a learning project. Of course, I hope it'll change
the world and earn me millions. But mostly, this is the
software I want to use and nobody else is gonna do it for me.
So my only costs are time and cranial capacity. But dividends
in either tend to cross-benefit.
In short, excellent advice. Thank you.

luXer-ex-troXX

mij...@yahoo.com

unread,
Feb 18, 2009, 10:55:25 PM2/18/09
to
On Feb 18, 3:55 am, nick_keighley_nos...@hotmail.com wrote:
> On 18 Feb, 07:19, Flash Gordon <s...@spam.causeway.com> wrote:
[snip!]

>
> > 1) To get around a limit of 64K of static data, so GC was not needed
> > 2) The software on one processor on a 20 processor system where I had to
> > write the SW in assembler, and the requirements were such that there
> > were easier ways than using GC (I could just drop *all* allocations for
> > the start of each time round the main loop)
> > 5)
>
> 5?

Don't let him hold the holy handgrenade!

hee hee

Flash Gordon

unread,
Feb 18, 2009, 11:42:31 PM2/18/09
to
nick_keigh...@hotmail.com wrote:
> On 18 Feb, 07:19, Flash Gordon <s...@spam.causeway.com> wrote:

<snip>

>> 5)
>
> 5?

Do you have something against 5?

>> The SW I've worked on over the past 8 years where GC might be useful.
>>
>> So my personal experience is that you don't need dynamic memory
>> allocation most of the time, but that experience is biased by the types

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


>> of work I spent 15 years doing.
>
> what about the cases where you don't know in advance how many
> thingies
> you'll need. For instance inter-processor messages where the messages
> vary in size and you don't know how many you'll need. Its obvious
> when you've finished with a message so you can explicitly return
> and don't need Garbage Collection.
>
> oh, wait what was the memory leak I spent 3 weeks hunting down...
> That involved instrumenting every malloc and free (ok, it was new and
> delete) in the system and seeing what type of thingy just kept
> on increasing.

I acknowledged above my experience is biased. I also clearly
acknowledged that there are occasions where GC is worth it. I even
stated that I've worked for 8 years on SW where it could be useful!

> Timers and inter-processor messages.

Don't always deal with unknown amounts of stuff. Sometimes they do,
sometimes they don't. I suspect the 20 processor system I worked on
might have had a little communication between processes (4 of the
processors were just dealing with marshalling, synchronisation, data
transfer etc whilst the other 16 did the work), and I've dealt with
timers before as well.
--
Flash Gordon

Tim Rentsch

unread,
Feb 20, 2009, 12:41:46 PM2/20/09
to
Flash Gordon <sm...@spam.causeway.com> writes:

True story follows.

A development team working on a reasonably large (> 600 KLOC)
application made a decision to drop in a conservative collector
to do their memory management. Up until that point they had
managed memory in the usual malloc()/free() style, and everything
was working okay, but maintaining that part of their code that
did memory management was just taking too many of their cycles.
So they decided to "bite the bullet" and switch over.

The result was great. Not only did they no longer have to
worry about when to free() memory, but using the conservative
collector the application also ran faster.

No conclusions offered, just a data point.

mij...@yahoo.com

unread,
Feb 22, 2009, 11:19:41 PM2/22/09
to

Thanks to all for many useful perspectives.
Overall trends:
Use Boehm Garbage Collector for most C projects.
Prefer reference counting (possibly reduced to a bit)
for a simpler method where Boehm is too much.

I've also stumbled upon a library that offers a solution
to this and other issues specific to a dynamic
object-based language: GObject, part of Glib (GNOME).

I'm sorely tempted to scrap everything and start over
with GObject containers. Until then it's still more
useful perspective.

luXer-ex-troXX

0 new messages