Garbage collection

Christophe de Dinechin

unread,

Mar 11, 2011, 6:06:34 PM3/11/11

to xlr-...@googlegroups.com

On 10 mars 2011, at 10:20, Henry Weller wrote:

> What I don't understand is the way in which the types in XLR are dynamic, is this only a compile-time issue or are they dynamic at run-time?

The short answer is that this depends on the program, but the goal of type inference is to deduce as much information about actual types as possible, so as to do more at compile-time rather than run-time.

The longer answer, which I will try to put in a separate thread, boils down to this : type information in XL can either be entirely static (i.e. we know everything at compile-time), static types with dynamic tests (e.g. the type can be "integer" or "real" and we need a test at run-time to decide which code to execute), or fully dynamic types (i.e. the actual type can be computed at run-time).

> If the latter what is the overhead?

In the best scenarios where all type information is correctly inferred, I've been within 15% of code optimized at -O3 with gcc. It's hard to tell if what remains is due to LLVM code generation or some other cause.

The code generated by default from xlr doesn't use type inference, so it corresponds to a worst case scenario. The few times I measured, I typically observed a 20x to 50x slow-down compared to C. But this extreme scenario is very unlikely to occur once type inference is fully functional : the maximum overhead occurs when what we compare to are single machine instructions, and these are cases where I expect that type inference should work best.

> Also is the garbage collector used only at compile-time or also at run-time?

Remember that in XLR, compile-time and run-time are intertwined, so the answer is: "both" :-) The bad news is that there is a GC. The good news is that it only activates when objects need to be cleaned up, and you can write code so that it never activates. Actually, if you write code "C++ style", i.e. you specify types manually, I think that you can easily guarantee this "zero GC" objective at least for the parts of the code where it matters.

> The reason I ask is that GC is very inefficient in memory usage for the kind of code I write which is why I am glad is was not built-in to the new C++ standard and why I would not consider D.

Garbage collection can be inefficient. It can also be more efficient than manual memory management. It depends on the workload. We are only beginning to explore this space.

Regarding GC, One aspect of XL2 that I only barely scratched so far was memory management. I wanted to allow anything from stack-based variables to manual memory management a la C++ to garbage collection. But then, there is more garbage than just memory. I'd like to have generic garbage collection algorithms that I can use to recycle file descriptors, distributed objects, whatever. It's not very simple to design, though.

Similarly, the "pointer" type in the XL2 library is a generic type, but other pointer types can be easily written. If you make all data structures dependent on this pointer type, you can easily derive a "hash table on disk" from a "pointer on disk".

Long-term plans...

Henry Weller

unread,

Mar 12, 2011, 5:25:57 AM3/12/11

to xlr-...@googlegroups.com

> The longer answer, which I will try to put in a separate thread, boils down to
> this : type information in XL can either be entirely static (i.e. we know
> everything at compile-time), static types with dynamic tests (e.g. the type
> can be "integer" or "real" and we need a test at run-time to decide which code
> to execute), or fully dynamic types (i.e. the actual type can be computed at
> run-time).

Is this the case for both XL2 and XLR? Is this the means by which dynamic
dispatch and multi-methods would be supported in XL2?

>> Also is the garbage collector used only at compile-time or also at run-time?

> Remember that in XLR, compile-time and run-time are intertwined, so the answer
> is: "both" :-) The bad news is that there is a GC. The good news is that it
> only activates when objects need to be cleaned up, and you can write code so
> that it never activates.

Interesting.

> Actually, if you write code "C++ style", i.e. you specify types manually, I
> think that you can easily guarantee this "zero GC" objective at least for the
> parts of the code where it matters.

That would be great, but it seems rather ambitious. Would this be available in
both XL2 and XLR or only in one or other?

> Regarding GC, One aspect of XL2 that I only barely scratched so far was memory
> management. I wanted to allow anything from stack-based variables to manual
> memory management a la C++ to garbage collection. But then, there is more
> garbage than just memory. I'd like to have generic garbage collection
> algorithms that I can use to recycle file descriptors, distributed objects,
> whatever. It's not very simple to design, though.

Indeed, even more ambitious :-)

> Similarly, the "pointer" type in the XL2 library is a generic type, but other
> pointer types can be easily written. If you make all data structures dependent
> on this pointer type, you can easily derive a "hash table on disk" from a
> "pointer on disk".

> Long-term plans...

:-)

Christophe de Dinechin

unread,

Mar 12, 2011, 9:32:32 AM3/12/11

to xlr-...@googlegroups.com

On 12 mars 2011, at 11:25, Henry Weller wrote:

>> The longer answer, which I will try to put in a separate thread, boils down to
>> this : type information in XL can either be entirely static (i.e. we know
>> everything at compile-time), static types with dynamic tests (e.g. the type
>> can be "integer" or "real" and we need a test at run-time to decide which code
>> to execute), or fully dynamic types (i.e. the actual type can be computed at
>> run-time).
>
> Is this the case for both XL2 and XLR? Is this the means by which dynamic
> dispatch and multi-methods would be supported in XL2?

So far, the XL2 design only has fully static typing implemented. Dynamic tests is in the plans, with a relatively clear design. I had no plans for fully dynamic types in XL2, they "emerged" based on the XLR design.

>> Actually, if you write code "C++ style", i.e. you specify types manually, I
>> think that you can easily guarantee this "zero GC" objective at least for the
>> parts of the code where it matters.
>
> That would be great, but it seems rather ambitious. Would this be available in both XL2 and XLR or only in one or other?

It's the same in both cases. If you know that something is an integer or a struct {int x; float y;}, then there is no more need for a garbage collector than in C++. And if you write "x:integer", you do know the type of x. So I am relatively confident here.

>> Regarding GC, One aspect of XL2 that I only barely scratched so far was memory
>> management. I wanted to allow anything from stack-based variables to manual
>> memory management a la C++ to garbage collection. But then, there is more
>> garbage than just memory. I'd like to have generic garbage collection
>> algorithms that I can use to recycle file descriptors, distributed objects,
>> whatever. It's not very simple to design, though.
>
> Indeed, even more ambitious :-)
>
>> Similarly, the "pointer" type in the XL2 library is a generic type, but other
>> pointer types can be easily written. If you make all data structures dependent
>> on this pointer type, you can easily derive a "hash table on disk" from a
>> "pointer on disk".
>
>> Long-term plans...
>
> :-)

XL2 has always been about long term plans. This project has been 15+ years in the making, so what's another couple of years?

Reply all

Reply to author

Forward