Better GC and Memory Allocator for Go

2,408 views
Skip to first unread message

Dmitry Vyukov

unread,
May 22, 2013, 12:57:30 PM5/22/13
to golang-dev
Hi,

I want to share my rough plan on improving Go GC and memory allocator:
https://docs.google.com/document/d/1HCPu3WKyCX3ZRYxmIMKTk0Ik1dePxKW1p02k3uhcft4/edit?usp=sharing
I don't have all the details, and I don't have a timeline.
But since these are very significant changes I want share it early,
hear your thoughts and feedback, and ideally get a buy in from
somebody for compiler/linker part (GC info generation) and parts of
runtime work.
Thanks!

Dmitry Vyukov

unread,
May 24, 2013, 8:15:57 AM5/24/13
to golang-dev
Carl pointed that GC bitmasks (bit per word saying - pointer/not pointer) will conflict with interface -- the data word can be a pointer and not a pointer for the same object.

The point is that it would be very-very nice if we can use just
1 bit to represent type info. It would reduce memory consumption and
working set, simplify and speedup GC code.

One possible solution is to not store data inline (always allocate, as gccgo does).
Another solution is to store inline only pointers (and zero-size structs). I suspect that lots of interface values are pointers.
It's also possible to allocate object with sizes 1..7 inline if we set high bits to 1 so that it does not look like pointer (is it true for all OSes?). But it's probably too complex.
Will interface representation change break any public APIs?

Keith Randall

unread,
May 24, 2013, 9:41:32 AM5/24/13
to Dmitry Vyukov, golang-dev
On Fri, May 24, 2013 at 5:15 AM, Dmitry Vyukov <dvy...@google.com> wrote:
On Wed, May 22, 2013 at 8:57 PM, Dmitry Vyukov <dvy...@google.com> wrote:
Hi,

I want to share my rough plan on improving Go GC and memory allocator:
https://docs.google.com/document/d/1HCPu3WKyCX3ZRYxmIMKTk0Ik1dePxKW1p02k3uhcft4/edit?usp=sharing
I don't have all the details, and I don't have a timeline.
But since these are very significant changes I want share it early,
hear your thoughts and feedback, and ideally get a buy in from
somebody for compiler/linker part (GC info generation) and parts of
runtime work.
Thanks!


Carl pointed that GC bitmasks (bit per word saying - pointer/not pointer) will conflict with interface -- the data word can be a pointer and not a pointer for the same object.


So what kind of objects does this happen with?  I would imagine most objects do not fall into this category.  Certainly stacks, maybe interface values, others?  Perhaps a special case can be made for these (for stacks, there already is).
 
The point is that it would be very-very nice if we can use just
1 bit to represent type info. It would reduce memory consumption and
working set, simplify and speedup GC code.

One possible solution is to not store data inline (always allocate, as gccgo does).
Another solution is to store inline only pointers (and zero-size structs). I suspect that lots of interface values are pointers.
It's also possible to allocate object with sizes 1..7 inline if we set high bits to 1 so that it does not look like pointer (is it true for all OSes?). But it's probably too complex.
Will interface representation change break any public APIs?

--
 
---
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Dmitry Vyukov

unread,
May 24, 2013, 9:46:15 AM5/24/13
to Keith Randall, golang-dev
On Fri, May 24, 2013 at 5:41 PM, Keith Randall <k...@google.com> wrote:

On Fri, May 24, 2013 at 5:15 AM, Dmitry Vyukov <dvy...@google.com> wrote:
On Wed, May 22, 2013 at 8:57 PM, Dmitry Vyukov <dvy...@google.com> wrote:
Hi,

I want to share my rough plan on improving Go GC and memory allocator:
https://docs.google.com/document/d/1HCPu3WKyCX3ZRYxmIMKTk0Ik1dePxKW1p02k3uhcft4/edit?usp=sharing
I don't have all the details, and I don't have a timeline.
But since these are very significant changes I want share it early,
hear your thoughts and feedback, and ideally get a buy in from
somebody for compiler/linker part (GC info generation) and parts of
runtime work.
Thanks!


Carl pointed that GC bitmasks (bit per word saying - pointer/not pointer) will conflict with interface -- the data word can be a pointer and not a pointer for the same object.


So what kind of objects does this happen with?  I would imagine most objects do not fall into this category.  Certainly stacks, maybe interface values, others?  Perhaps a special case can be made for these (for stacks, there already is).


I think for stacks we can get precise info. In each safepoint it's either pointer or not, even if we reuse stack space across scopes.

Can it happen inside of maps?

Keith Randall

unread,
May 24, 2013, 9:48:35 AM5/24/13
to Dmitry Vyukov, golang-dev
On Fri, May 24, 2013 at 6:46 AM, Dmitry Vyukov <dvy...@google.com> wrote:
On Fri, May 24, 2013 at 5:41 PM, Keith Randall <k...@google.com> wrote:

On Fri, May 24, 2013 at 5:15 AM, Dmitry Vyukov <dvy...@google.com> wrote:
On Wed, May 22, 2013 at 8:57 PM, Dmitry Vyukov <dvy...@google.com> wrote:
Hi,

I want to share my rough plan on improving Go GC and memory allocator:
https://docs.google.com/document/d/1HCPu3WKyCX3ZRYxmIMKTk0Ik1dePxKW1p02k3uhcft4/edit?usp=sharing
I don't have all the details, and I don't have a timeline.
But since these are very significant changes I want share it early,
hear your thoughts and feedback, and ideally get a buy in from
somebody for compiler/linker part (GC info generation) and parts of
runtime work.
Thanks!


Carl pointed that GC bitmasks (bit per word saying - pointer/not pointer) will conflict with interface -- the data word can be a pointer and not a pointer for the same object.


So what kind of objects does this happen with?  I would imagine most objects do not fall into this category.  Certainly stacks, maybe interface values, others?  Perhaps a special case can be made for these (for stacks, there already is).


I think for stacks we can get precise info. In each safepoint it's either pointer or not, even if we reuse stack space across scopes.

Can it happen inside of maps?

No.  Every slot in a map data structure is either a pointer or not for its entire lifetime.  We can tell the GC what the layout is at allocation time. 

roger peppe

unread,
May 24, 2013, 9:51:47 AM5/24/13
to Dmitry Vyukov, golang-dev
On 24 May 2013 13:15, Dmitry Vyukov <dvy...@google.com> wrote:
> On Wed, May 22, 2013 at 8:57 PM, Dmitry Vyukov <dvy...@google.com> wrote:
>>
>> Hi,
>>
>> I want to share my rough plan on improving Go GC and memory allocator:
>>
>> https://docs.google.com/document/d/1HCPu3WKyCX3ZRYxmIMKTk0Ik1dePxKW1p02k3uhcft4/edit?usp=sharing
>> I don't have all the details, and I don't have a timeline.
>> But since these are very significant changes I want share it early,
>> hear your thoughts and feedback, and ideally get a buy in from
>> somebody for compiler/linker part (GC info generation) and parts of
>> runtime work.
>> Thanks!
>
>
>
> Carl pointed that GC bitmasks (bit per word saying - pointer/not pointer)
> will conflict with interface -- the data word can be a pointer and not a
> pointer for the same object.
>
> The point is that it would be very-very nice if we can use just
> 1 bit to represent type info. It would reduce memory consumption and
> working set, simplify and speedup GC code.

This is probably a stupid idea, but I wonder if you could make use
of the fact that the *second* pointer in an interface is always
of one or two well known kinds (a type or a method-table+type)

If the second of a pair of pointers is one of those kinds, we
know it's an interface and can potentially work out if the
first of the pair is a pointer.

We could store types and method tables in a separate arena
to make the check cheap.

Dmitry Vyukov

unread,
May 24, 2013, 9:55:08 AM5/24/13
to roger peppe, golang-dev
Thanks!
This is a possible solution.
But it also complicates and slows down GC. It would be nice to scan w/o any exceptions.

Daniel Morsing

unread,
May 24, 2013, 9:57:44 AM5/24/13
to Dmitry Vyukov, golang-dev
I don't think you gain a lot by doing this. People tend to use int as
a placeholder type to attach methods to, so you end up doing just as
much allocating, but with more complexity.

> Will interface representation change break any public APIs?
>

reflect has (Value).InterfaceData() for direct access to the interface
words. I don't think you can use that call for anything useful, so
maybe it's ok to break it?

Will each memory section have it's own copy of the bitmask? Then a
possibility would be to have the interface generation functions
clear/set the bitmask, depending on whether it's word contains a value
or a pointer.

Keith Randall

unread,
May 24, 2013, 10:01:22 AM5/24/13
to Dmitry Vyukov, golang-dev
I had a different idea about improving the way we record type info in the heap.  Perhaps it would compliment your plan.

The idea is that most of the heap is probably just a few object types.  So we can reserve some MSpans to hold just one type of object.  Kind of like the size classes we have now, but divided up by type instead of size class.  Of course, that means a lot more classes, so we can't do this for every type.  Instead, we keep a count of allocations by type and when the count of allocations exceed a threshold, we promote that type to have its own "class", meaning from then on it gets allocated as a separate class (and no longer allocated by size class).

The benefit is that we can record type info at the MSpan level instead of at the individual object level.  At least, for MSpans which correspond to a type class.  We also get the side benefit of saving some space if the type doesn't fit exactly in a size class.  We need type info per object only for the old size class MSpans.

There are still some problems to work through.  How are arrays handled?  And what if the number of type classes grows too large (maybe large = as many as size classes we have now = 61)?  Can we "unpromote" a type class later?  We could reset (or decay) the allocation counts at each GC and rechoose the type classes based on the top N allocations.

Dmitry Vyukov

unread,
May 24, 2013, 10:39:30 AM5/24/13
to Daniel Morsing, golang-dev
One more reason to use struct{} ;)

 

> Will interface representation change break any public APIs?
>

reflect has (Value).InterfaceData() for direct access to the interface
words. I don't think you can use that call for anything useful, so
maybe it's ok to break it?

Does it document the exact layout? AFAIK gccgo always use non-inlined data, so there must be no actual promise on representation...



Will each memory section have it's own copy of the bitmask? Then a
possibility would be to have the interface generation functions
clear/set the bitmask, depending on whether it's word contains a value
or a pointer.


Yeah, I though about it, but it will slowdown interface assignment...
 

Daniel Morsing

unread,
May 24, 2013, 11:23:53 AM5/24/13
to Dmitry Vyukov, golang-dev
It just returns an array of 2 uintptrs. I doubt there is anything you
could actually do with them.

>
>
>> Will each memory section have it's own copy of the bitmask? Then a
>> possibility would be to have the interface generation functions
>> clear/set the bitmask, depending on whether it's word contains a value
>> or a pointer.
>>
>
> Yeah, I though about it, but it will slowdown interface assignment...
>

Right, on top of that, It will make it a lot more difficult to
represent arrays of interfaces compactly. We can scrap that idea.

I'm leaning towards either just assuming that anything in an interface
is a pointer or allocating when it's a value type.

Dmitry Vyukov

unread,
May 24, 2013, 11:29:35 AM5/24/13
to Daniel Morsing, golang-dev
We must aim for 100% GC in long term.
It's OK to treat the interface data conservative as a pointer for now. But we need at least some idea how to implement it in a precise way later (to not reimplement the whole thing from scratch again).

Yes, for now I am inclined to allocate int's separately and store pointers.

Dmitry Vyukov

unread,
May 24, 2013, 11:33:42 AM5/24/13
to Keith Randall, golang-dev
On Fri, May 24, 2013 at 6:01 PM, Keith Randall <k...@google.com> wrote:
I had a different idea about improving the way we record type info in the heap.  Perhaps it would compliment your plan.

The idea is that most of the heap is probably just a few object types.  So we can reserve some MSpans to hold just one type of object.  Kind of like the size classes we have now, but divided up by type instead of size class.  Of course, that means a lot more classes, so we can't do this for every type.  Instead, we keep a count of allocations by type and when the count of allocations exceed a threshold, we promote that type to have its own "class", meaning from then on it gets allocated as a separate class (and no longer allocated by size class).

The benefit is that we can record type info at the MSpan level instead of at the individual object level.  At least, for MSpans which correspond to a type class.  We also get the side benefit of saving some space if the type doesn't fit exactly in a size class.  We need type info per object only for the old size class MSpans.

There are still some problems to work through.  How are arrays handled?  And what if the number of type classes grows too large (maybe large = as many as size classes we have now = 61)?  Can we "unpromote" a type class later?  We could reset (or decay) the allocation counts at each GC and rechoose the type classes based on the top N allocations.



If I correctly understand your proposal, I think the additional complexity outweigh the potential memory savings. It will also slow-down either allocation of GC somewhat, because we will need to count number of alive objects by type.

Dmitry Vyukov

unread,
May 29, 2013, 3:01:11 AM5/29/13
to golan...@googlegroups.com
Hi,

Here is a sketch of the scan loop that uses:
1. heap object type info (instead of points-to type info)
2. simple bitmasks for type info
3. object headers

#define BitAllocated 1
#define BitMarked 2
#define BitsPerTypeInfoWord 64
#define EmbedTypeInfoBits (BitsPerTypeInfoWord-2)
#define EmbedTypeInfoObjectSize (EmbedTypeInfoBits*PtrSize)
#define PAGETABLE (mheap.pagetab)  // can be const
#define SPANTABLE (mheap.spantab)  // can be const

void scan(void)
{
uintptr *p;
uintptr n, i, idx, v, vsize, tisize, tipos0, tipos1, spanstart;
uint64 *ti, *ti0;
Pagetab *pt;
Type *t;
Span *span;

for(;;) {
p = dequeue(&n);  // next object to scan
if(p == nil)
break;
// setup type info
if(n <= EmbedTypeInfoObjectSize) {
// embed type info
tisize = BitsPerTypeInfoWord;
tipos0 = tipos1 = 2;  // first 2 bits are allocated and marked
ti0 = ti = p;
} else {
// satellite type info
t = (Type*)(*p & ~3);
tisize = t->tisize;  // DEREF
tipos0 = tipos1 = 0;
ti0 = ti = t->ti;
}
for(i = 1; i < n/PtrSize; i++) {
if((*ti & (1<<tipos1)) && (v = p[i])) {
// have non-nil pointer v
// find object beginning
pt = PAGETABLE[v/PageSize];  // DEREF
if(pt.size) {
// small object
vsize = pt.size;
spanstart = (v&~(PageSize-1)) - pt.offset*PageSize;
idx = (v-spanstart)/vsize;  // DIV
v = spanstart + idx*vsize;
} else {
// large object
span = SPANTABLE[v/PageSize];  // DEREF
v = span->start;  // DEREF
vsize = span->size;
}
// now v - memory block, vsize - memory block size
// check if already marked
// (*v&BitAllocated) == 0 should not happen with precise GC
if((*v&BitMarked) == 0) {  // DEREF
// deliberately sloppy about atomicity, can scan the same object twice
*v |= BitMarked;
// enqueue for scanning
enqueue(v, vsize);
}
}
// update type info position
tipos0++;
tipos1++;
if(tipos0 == tisize) {
// move to the next array element
ti = ti0;
tipos1 = tipos0 = 0;
} else if(tipos1 == BitsPerTypeInfoWord) {
// move to the next type info word
ti++;
tipos1 = 0;
}
}
}
}

For objects of size <= 496 there is only 1 memory dereference (except the object itself) -- PAGETABLE.
For objects of size <= MaxSmallSize (32K) there is 1 additional memory dereference -- satellite type info.
For large object there are 2 additional memory dereferences -- SPANTABLE and span.
I afraid counting dereferences in the current code, I would expect this to be at least 2 times faster than what we have now.

Hopefully this will be less error-prone as well. The recent GC bugs were caused by (1) wrong points-to type info, (2) dangling pointers and (3) complexity, corner cases and accumulated state. (1) points-to type info is not used in this algo. (2) a dangling pointer can cause a leak at worst. (3) here is no corner cases and accumulated state, a bug is either in this handful of lines of code or in type info bitmasks.

With this algo it's also trivial to calculate alive heap size during scanning (basically impossible now), and this opens the path to concurrent sweeping (another 1/3 stop-the-world reduction).




 

unread,
May 29, 2013, 7:24:01 AM5/29/13
to golan...@googlegroups.com
From the document:


2. GC redesign.

Problems:

2.1 GC is complex

2.2 type info is complex, fat and slow to process

2.3 lots of scattered memory accesses and indirection

My opinion:

2.1 True
2.2 The current GC hasn't been optimized yet
2.3 The current GC hasn't been optimized yet

Dmitry Vyukov

unread,
May 29, 2013, 7:26:27 AM5/29/13
to ⚛, golang-dev
On Wed, May 29, 2013 at 3:24 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
> From the document:
>
>
> 2. GC redesign.
>
> Problems:
>
> 2.1 GC is complex
>
> 2.2 type info is complex, fat and slow to process
>
> 2.3 lots of scattered memory accesses and indirection
>
> My opinion:
>
> 2.1 True
> 2.2 The current GC hasn't been optimized yet
> 2.3 The current GC hasn't been optimized yet

Do you think it can be substantially faster (to outweigh complexity)
than what I propose?



> On Wednesday, May 22, 2013 6:57:30 PM UTC+2, Dmitry Vyukov wrote:
>>
>> Hi,
>>
>> I want to share my rough plan on improving Go GC and memory allocator:
>>
>> https://docs.google.com/document/d/1HCPu3WKyCX3ZRYxmIMKTk0Ik1dePxKW1p02k3uhcft4/edit?usp=sharing
>> I don't have all the details, and I don't have a timeline.
>> But since these are very significant changes I want share it early,
>> hear your thoughts and feedback, and ideally get a buy in from
>> somebody for compiler/linker part (GC info generation) and parts of
>> runtime work.
>> Thanks!
>

unread,
May 29, 2013, 7:59:01 AM5/29/13
to golan...@googlegroups.com, ⚛
On Wednesday, May 29, 2013 1:26:27 PM UTC+2, Dmitry Vyukov wrote:
On Wed, May 29, 2013 at 3:24 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
> From the document:
>
>
> 2. GC redesign.
>
> Problems:
>
> 2.1 GC is complex
>
> 2.2 type info is complex, fat and slow to process
>
> 2.3 lots of scattered memory accesses and indirection
>
> My opinion:
>
> 2.1 True
> 2.2 The current GC hasn't been optimized yet
> 2.3 The current GC hasn't been optimized yet

Do you think it can be substantially faster (to outweigh complexity)
than what I propose?

I do not know the answer. I only know that the current GC can be improved in multiple ways.

Note: The reason why the current GC does not store typeinfo in the beginning of an allocated block is because it has been rejected by rsc. I don't plan to change this decision.
Message has been deleted

Dmitry Vyukov

unread,
May 29, 2013, 8:54:03 AM5/29/13
to ⚛, golang-dev
My plan is to wait until Carl implement it :)
He already has precise args scanning for arguments.

On Wed, May 29, 2013 at 4:00 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
> How do you propose to scan stack frames in a precise way?

Ian Lance Taylor

unread,
May 29, 2013, 9:27:25 AM5/29/13
to Dmitry Vyukov, Daniel Morsing, golang-dev
On Fri, May 24, 2013 at 8:29 AM, Dmitry Vyukov <dvy...@google.com> wrote:
>
> We must aim for 100% GC in long term.
> It's OK to treat the interface data conservative as a pointer for now. But
> we need at least some idea how to implement it in a precise way later (to
> not reimplement the whole thing from scratch again).
>
> Yes, for now I am inclined to allocate int's separately and store pointers.

That is what gccgo does, but for some types of code it introduces
considerable overhead. Some code, like encoding/binary, uses small
values with methods and stores them in interfaces. encoding/binary
only uses a couple of values, but I've seen code that uses a lot more.
Forcing all those values into separate heap locations means a lot of
unnecessary heap allocations.

Ian

Dmitry Vyukov

unread,
May 30, 2013, 2:35:13 AM5/30/13
to Ian Lance Taylor, Daniel Morsing, golang-dev
Bad news...
We can make objects with size < PtrSize embedded as well (by setting
unused bits to 1).
Is it possible to rewrite that code in some other way so that it does
not use such interfaces?

Russ Cox

unread,
May 30, 2013, 11:18:36 AM5/30/13
to Dmitry Vyukov, Ian Lance Taylor, Daniel Morsing, golang-dev
On Thu, May 30, 2013 at 2:35 AM, Dmitry Vyukov <dvy...@google.com> wrote:
Is it possible to rewrite that code in some other way so that it does
not use such interfaces?

I get very scared when I see discussion like this. Our job as language implementers is to implement the language. It is not to change the language because we are not good enough to implement it. Sometimes new information about implementation does feed back into language design, of course, but that must be done deliberately and treated as a language change, not a one-off comment in the middle of an implementation design.

Russ

Dmitry Vyukov

unread,
May 30, 2013, 11:46:41 AM5/30/13
to Russ Cox, Ian Lance Taylor, Daniel Morsing, golang-dev
I just want to collect information for now.
Good design and implementation of memory
allocator/GC/interfaces/chans/maps is not a simple task, and some
restrictions can make it much simpler to solve. Win-win changes are
great, but sometimes we need to make trade-off decisions. This is
where information matters.

Probably I formulated the question badly. What I want to understand is:
Is that code a good idiomatic Go? Or maybe the author will now say
that there is a better way to write it and the better way does not use
"integers in interfaces". I do not see "integers in interfaces" used
often.

Russ Cox

unread,
May 30, 2013, 11:49:11 AM5/30/13
to Dmitry Vyukov, Ian Lance Taylor, Daniel Morsing, golang-dev
It is very important to me that we do not penalize existing uses of the language. Non-pointer values in interfaces do happen in common cases, such as enumerated constant types with String methods.

Russ

Dmitry Vyukov

unread,
May 30, 2013, 11:52:31 AM5/30/13
to ⚛, golang-dev
On Wed, May 29, 2013 at 3:59 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
> On Wednesday, May 29, 2013 1:26:27 PM UTC+2, Dmitry Vyukov wrote:
>
>> On Wed, May 29, 2013 at 3:24 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
>> > From the document:
>> >
>> >
>> > 2. GC redesign.
>> >
>> > Problems:
>> >
>> > 2.1 GC is complex
>> >
>> > 2.2 type info is complex, fat and slow to process
>> >
>> > 2.3 lots of scattered memory accesses and indirection
>> >
>> > My opinion:
>> >
>> > 2.1 True
>> > 2.2 The current GC hasn't been optimized yet
>> > 2.3 The current GC hasn't been optimized yet
>>
>> Do you think it can be substantially faster (to outweigh complexity)
>> than what I propose?
>
>
> I do not know the answer. I only know that the current GC can be improved in
> multiple ways.

There are several reasons why I think it won't be very fast:
1. Scanblock algorithm is complex (read - slow).
2. The complexity does not allow to implement other important
optimizations (e.g. concurrent sweep, and if complex changes will be
implemented, we will be hunting bugs for weeks).
3. There are some inevitable additional memory accesses during scan --
e.g. bitmap, span, span type info array.
4. The data structures are expensive to maintain (e.g. settype,
compare it to 1 store to the first word of the object).

unread,
May 30, 2013, 12:11:57 PM5/30/13
to golan...@googlegroups.com, ⚛
1. The proposal increases memory consumption by adding an uintptr to the start of each object. We can get a fairly accurate idea about the final memory consumption of the proposal by changing the case MTypes_Empty in function runtime·settype() to allocate MTypes_Words instead of MTypes_Bytes.

2. How does the proposal deal with alignment guarantees? If a program allocates 16 bytes, the expectation is for the data to be aligned to 16 bytes.

Russ Cox

unread,
May 30, 2013, 12:23:08 PM5/30/13
to ⚛, golang-dev
On Thu, May 30, 2013 at 12:11 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
1. The proposal increases memory consumption by adding an uintptr to the start of each object. We can get a fairly accurate idea about the final memory consumption of the proposal by changing the case MTypes_Empty in function runtime·settype() to allocate MTypes_Words instead of MTypes_Bytes.

2. How does the proposal deal with alignment guarantees? If a program allocates 16 bytes, the expectation is for the data to be aligned to 16 bytes.

This is a serious problem. Per-object headers are not okay, for this reason.

Russ

Dmitry Vyukov

unread,
May 30, 2013, 12:25:58 PM5/30/13
to ⚛, golang-dev
On Thu, May 30, 2013 at 8:11 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
> 1. The proposal increases memory consumption by adding an uintptr to the
> start of each object. We can get a fairly accurate idea about the final
> memory consumption of the proposal by changing the case MTypes_Empty in
> function runtime·settype() to allocate MTypes_Words instead of MTypes_Bytes.

It is not precise.
The real memory consumption can be lower because when you allocate 24
bytes, the header adds no overhead (fits into 32 byte size class).
On the other hand, for 4096 byte allocation we will need to allocate
much more to add the header.

I have little data now. I've measured memory consumption on
test/bench/garbage/parser. Overall RSS increase was about 5% (heap
size was about 580MB, 900MB Sys).
I need to do more investigation.


> 2. How does the proposal deal with alignment guarantees? If a program
> allocates 16 bytes, the expectation is for the data to be aligned to 16
> bytes.

I understand that. Do we guarantee the alignment somewhere? I would
expect only 8-byte alignment guarantee (enough for any Go type).
Larger alignment can be achieved by manual offset (e.g. for SSE one
would allocate a slightly larger []float32 and then subslice from
beginning to get the necessary alignment).


> On Thursday, May 30, 2013 5:52:31 PM UTC+2, Dmitry Vyukov wrote:
>>
>> On Wed, May 29, 2013 at 3:59 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
>> > On Wednesday, May 29, 2013 1:26:27 PM UTC+2, Dmitry Vyukov wrote:
>> >>
>> >> Do you think it can be substantially faster (to outweigh complexity)
>> >> than what I propose?
>> >
>> >
>> > I do not know the answer. I only know that the current GC can be
>> > improved in
>> > multiple ways.
>>
>> There are several reasons why I think it won't be very fast:
>> 1. Scanblock algorithm is complex (read - slow).
>> 2. The complexity does not allow to implement other important
>> optimizations (e.g. concurrent sweep, and if complex changes will be
>> implemented, we will be hunting bugs for weeks).
>> 3. There are some inevitable additional memory accesses during scan --
>> e.g. bitmap, span, span type info array.
>> 4. The data structures are expensive to maintain (e.g. settype,
>> compare it to 1 store to the first word of the object).
>

Dmitry Vyukov

unread,
May 30, 2013, 12:27:50 PM5/30/13
to Russ Cox, ⚛, golang-dev
Where do we guarantee it?

Dmitry Vyukov

unread,
May 30, 2013, 12:30:30 PM5/30/13
to Russ Cox, Ian Lance Taylor, Daniel Morsing, golang-dev
I understand that. But rejecting to give up any bit of performance for
all primitive operations does not look like a good way forward.

Russ Cox

unread,
May 30, 2013, 12:34:44 PM5/30/13
to Dmitry Vyukov, ⚛, golang-dev
It is not guaranteed by the spec, but it has historically been guaranteed by the implementation, and we have stated on mailing lists and such that it will continue to hold. I believe this is an important property of the current allocator.

I realize that these constraints mean it is not possible to just reuse implementations of other languages. But I don't see that as necessarily a bad thing. If you could implement Go in terms of other languages, what would be the point of having Go?

Russ

Dmitry Vyukov

unread,
May 30, 2013, 12:37:30 PM5/30/13
to Russ Cox, ⚛, golang-dev
On Thu, May 30, 2013 at 8:34 PM, Russ Cox <r...@golang.org> wrote:
>> >> 1. The proposal increases memory consumption by adding an uintptr to
>> >> the
>> >> start of each object. We can get a fairly accurate idea about the final
>> >> memory consumption of the proposal by changing the case MTypes_Empty in
>> >> function runtime·settype() to allocate MTypes_Words instead of
>> >> MTypes_Bytes.
>> >>
>> >> 2. How does the proposal deal with alignment guarantees? If a program
>> >> allocates 16 bytes, the expectation is for the data to be aligned to 16
>> >> bytes.
>> >
>> >
>> > This is a serious problem. Per-object headers are not okay, for this
>> > reason.
>
>
>>
>> Where do we guarantee it?
>
>
> It is not guaranteed by the spec, but it has historically been guaranteed by
> the implementation, and we have stated on mailing lists and such that it
> will continue to hold. I believe this is an important property of the
> current allocator.

OK

Robert Griesemer

unread,
May 30, 2013, 12:49:45 PM5/30/13
to Russ Cox, Dmitry Vyukov, ⚛, golang-dev
On Thu, May 30, 2013 at 9:34 AM, Russ Cox <r...@golang.org> wrote:
I realize that these constraints mean it is not possible to just reuse implementations of other languages. But I don't see that as necessarily a bad thing. If you could implement Go in terms of other languages, what would be the point of having Go?

+1

unread,
May 30, 2013, 1:13:35 PM5/30/13
to golan...@googlegroups.com, Russ Cox, ⚛
It is possible to put the extra uintptr just before the start of an object.

void **p;
uintptr info = (uintptr)p[-1];

Dmitry Vyukov

unread,
May 30, 2013, 1:30:35 PM5/30/13
to ⚛, golang-dev, Russ Cox
This is a good question. Russ, what exactly alignment do you mean?
Would it be OK to put header in the end of the block?

Russ Cox

unread,
May 30, 2013, 1:47:00 PM5/30/13
to Dmitry Vyukov, ⚛, golang-dev
If I allocate a power-of-two size block, I expect it to be aligned to that power of two, up to page size.
You could put a header at the end of the block, but then they wouldn't pack very tightly so you'd waste a lot of memory due to fragmentation.

Russ

Carl Shapiro

unread,
May 30, 2013, 3:26:07 PM5/30/13
to Russ Cox, Dmitry Vyukov, ⚛, golang-dev
On Thu, May 30, 2013 at 9:34 AM, Russ Cox <r...@golang.org> wrote:
On Thu, May 30, 2013 at 12:27 PM, Dmitry Vyukov <dvy...@google.com> wrote:
On Thu, May 30, 2013 at 8:23 PM, Russ Cox <r...@golang.org> wrote:
> On Thu, May 30, 2013 at 12:11 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
>> 2. How does the proposal deal with alignment guarantees? If a program
>> allocates 16 bytes, the expectation is for the data to be aligned to 16
>> bytes.
>
>
> This is a serious problem. Per-object headers are not okay, for this reason.
 
Where do we guarantee it?

It is not guaranteed by the spec, but it has historically been guaranteed by the implementation, and we have stated on mailing lists and such that it will continue to hold. I believe this is an important property of the current allocator.

How is this property currently observable by a Go prograrm?

Carl Shapiro

unread,
May 30, 2013, 4:57:07 PM5/30/13
to Ian Lance Taylor, Dmitry Vyukov, Daniel Morsing, golang-dev
On Wed, May 29, 2013 at 6:27 AM, Ian Lance Taylor <ia...@golang.org> wrote:
That is what gccgo does, but for some types of code it introduces
considerable overhead.  Some code, like encoding/binary, uses small
values with methods and stores them in interfaces.  encoding/binary
only uses a couple of values, but I've seen code that uses a lot more.
 Forcing all those values into separate heap locations means a lot of
unnecessary heap allocations.

I just looked at encoding/binary.  The small values are global structures so there should be no need for a heap allocation as the interface can point to the structure directly.  If this is not done, it should be relatively easy to implement.

I am curious about code which makes more aggressive use of this idiom, can you send me a pointer to the code you are thinking of?