Then the correct answer is ALLOCATION/WITH-FIXED-ALLOCATION; everything else is more or less negotiable. Ie. the compiler needs to know how to emit allocation sequences.
it's somewhat tangential, but it might be worth it in your endeavor to
consider completely reifying heaps as first class objects into the
system.
i think it's much more work than just pluggable GC's, so probably it
will be out of your scope, but nevertheless try to keep this in mind
when shaping the interfaces...
that way the user could have an API to tell that in a dynamic extent
the runtime should use a specific heap, with a potentially different
GC.
or just open a new heap in a dynamic extent and tell the runtime not
to bother with synchronization and GC, because this heap is big enough
for all the allocation that will happen in this dynamic extent, and
can be thrown away in its entirety afterwards.
does anyone know any system that already explored this idea? i guess
the FONC people (Alan Kay, VPRI) have something like this, but very
little info/code is leaking out of them.
--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“If you want to do good, work on the technology, not on getting power.”
— John McCarthy
I'd like to look into it and write a "linear lisp" in the style of
Henry Baker that has some such notion of first-class region. We'll see
after ELS2013. (Admittedly, I said I'd do it after ILC2012, and
instead I wrote ASDF3.)
—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
There cannot be Ethics without Models of possible behaviors, and Imagination
to explore them. [Corollary: there is no Ethics for an all-knowing God,
but there are Ethics for mostly-ignorant but nevertheless thinking humans]
well, it's easy to do it in a system where all that you have is random
access to the underying huge binary number we call memory, and as such
you must manage the object layout yourself.
it's an entirely different story to introduce first class heaps into a
system that also promises transparent memory management for you (GC).
--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Be who you are and say what you feel, because those who mind don't
matter and those who matter don't mind.”
— Dr. Seuss
On May 14, 2013 6:33 PM, "Craig Lanning" <clan...@isc8.com> wrote:
>
> On Tue, 2013-05-14 at 18:09 +0300, Nikodemus Siivola wrote:
> > Then the correct answer is ALLOCATION/WITH-FIXED-ALLOCATION;
> > everything else is more or less negotiable. Ie. the compiler needs to
> > know how to emit allocation sequences.
>
> Now, that was an eye opener.
>
> Ultimately, what I'd like to do is define a set of functions and global
> variables that can be define either in Lisp for a GC implemented in Lisp
> or in C for a GC implemented in C (and brought into Lisp via the alien
> facility).
>
> The company I work for has a very special use case that I think can be
> made easier by rewriting the GC. I read the opinions about why the GC
> is in C instead of Lisp. I understand them, agree with a few, and
> disagree with others. I think that the changes I need to do to the GC
> will be easier to implement if I can do it in Lisp instead of C.
Are you able to share any details about the sort of changes that you have in mind? Or about the use case, even in general terms?
> After looking at ALLOCATION/WITH-FIXED-ALLOCATION, it appears that
> switching the GC from a C implementation to a Lisp implementation would
> be very non-trivial. Has anyone given any serious thought to what would
> need to be done to implement the GC in Lisp?
Some years ago (late 2008, maybe), I did some preliminary investigation into writing parts of the GC in Lisp, although I was mostly focusing on how to implement things like the dispatch involved in scavenging an object.
If you're planning to implement a GC in Lisp, one of the things to make sure of is that the code and data that implements the GC is available while you're running the GC, which is something that seemed very difficult in my original context, but for what you're doing simply arranging for any data tables required to be in static space or otherwise pinned and for code objects to not move would cover the worst of it. Well, and there are the FDEFINITION objects to consider, but putting them into static space as well should work, if you can arrange that (I can think of a couple of approaches here).
Another aspect to consider is if the GC code itself should be allowed to allocate memory. If it shouldn't, then you have to be careful about how you write the code in order to avoid allocation and you may also want to figure out how to tell the compiler that any allocation would be an error (so that you don't backslide during maintenance).
The inline allocation logic itself is actually fairly straightforward, modulo the overflow handling. Each thread has an alloc_region (in a single-threaded system there is a global alloc_region), which contains two fields of interest to the allocation logic: the current allocation pointer and the end of the region. Everything else in the alloc_region is of interest to the GC only, and the general layout might well be completely different for a different GC.
You would also have to deal with the "safepoint" or "pseudo-atomic" logic, and I haven't really thought overmuch about what would be involved here, plus there's the whole issue of actually triggering a collection cycle. And there's the matter of scavenging any interrupt contexts and the various thread stacks.
And if you're changing the heap layout too drastically, or need to arrange for certain things to be in certain heap spaces even in the cold-core, you may well end up modifying genesis (SYS:SRC;COMPILER;GENERIC;GENESIS.LISP), the program that actually creates the cold-core from a set of cross-compiled FASLs.
Anyway, I've given the matter a certain amount of thought, and I'm reasonably confident that I could write SOMETHING that would work, but it would take a while and I simply don't have a use-case that would make it worth the effort.
I hope that this brain-dump helps, and would love to be kept "in the loop" if you decide to go forward with writing a new GC for SBCL in Lisp. Or even a new GC for SBCL in C.
> Craig
-- Alastair
If you can guarantee that your memory areas will always have sufficient space for all of the allocation that you need to do, are willing to run WITHOUT-INTERRUPTS while allocating into that space, and the space will contain no external references other than to static space, then I have a nasty, nasty hack in mind that will run on a stock system (hint: there are only two words used for the basic allocation and overflow check, and the region is in a known global location on single-thread systems and in the thread structure on multi-thread systems, easily poked at via SAP functions either way).
The requirement to always have sufficient space is so that you don't trip an allocation overflow "trap" (though they are actually straight calls on x86oids, but on PPC they are a TWI instruction that the trap handler can easily recognize). To alleviate this, we would need to define a way to hook the overflow trap on a per-region basis to point to the "correct" GC. Note that it should be plausible to run the trap handler in Lisp with a bit of care.
The requirement to run WITHOUT-INTERRUPTS is because an interrupt (unix signal) handler will almost certainly allocate memory, and you don't want that in your custom memory space. To alleviate this, modify the interrupt handling logic to detect that a thread is running with a custom allocation region, and to somehow bind a normal, empty gencgc region in its place. And modify your macro to arrange to close the old gencgc region before setting up your custom region, otherwise the GC could lose track of the old region. If you also have to enable scavenging of your allocation regions then this becomes more complex, as you would need to update whatever tracks the size of the allocated data in your region when binding the gencgc region into place.
The requirement to not have any pointers to non-static heap data is because gencgc doesn't know to scavenge your allocation regions. This should be straightforward to arrange if you keep track of the regions, by arranging to scavenge all of the regions at some point after all conservative roots have been scanned (somewhere near the scavenging of the binding stacks should suffice; it should be before scavenging newspace).
Depending on which of these restrictions you can live with, this could run anywhere from an afternoon to a month, as a zero'th order estimate. The easy restriction to lift is the one about scavenging custom allocation regions. I'd actually have to think through the details for the other two. There may be a dependency involved (you might need some of the bits for removing WITHOUT-INTERRUPTS before you can have a lisp-side overflow handler).
[ snip ]
> > I hope that this brain-dump helps, and would love to be kept "in the
> > loop" if you decide to go forward with writing a new GC for SBCL in
> > Lisp. Or even a new GC for SBCL in C.
>
> The "brain-dump" does help. You mentioned a few things that I hadn't
> run across yet. I certainly will keep posting info to the list. I'm
> interested to run any GC performance tests on whatever I end up
> building. I intend to provide the SBCL team with a GC chapter for the
> Internals manual at the very least. As I work on the GC changes, any
> code that is generally useful, SBCL is welcome to have. I will try to
> make sure that any of our "proprietary" changes are really nothing more
> than specific configuration of the general GC that I build.
>
> Based on the info from Nikodemus and Alastair, it looks like this will
> be a longer term project than I had originally thought. Fortunately,
> the application will still work with the existing GC so we're ok for the
> time being. Changing the GC would just make the application run more
> quickly and more efficiently.
>
> Craig
Now you've got me thinking about pluggable allocation regions and scavenging strategies... And I already have enough of a project list.
-- Alastair