On Wed, May 22, 2013 at 8:57 PM, Dmitry Vyukov <dvy...@google.com> wrote:
Hi,
I want to share my rough plan on improving Go GC and memory allocator:
https://docs.google.com/document/d/1HCPu3WKyCX3ZRYxmIMKTk0Ik1dePxKW1p02k3uhcft4/edit?usp=sharing
I don't have all the details, and I don't have a timeline.
But since these are very significant changes I want share it early,
hear your thoughts and feedback, and ideally get a buy in from
somebody for compiler/linker part (GC info generation) and parts of
runtime work.
Thanks!
Carl pointed that GC bitmasks (bit per word saying - pointer/not pointer) will conflict with interface -- the data word can be a pointer and not a pointer for the same object.
The point is that it would be very-very nice if we can use just
1 bit to represent type info. It would reduce memory consumption and
working set, simplify and speedup GC code.One possible solution is to not store data inline (always allocate, as gccgo does).Another solution is to store inline only pointers (and zero-size structs). I suspect that lots of interface values are pointers.It's also possible to allocate object with sizes 1..7 inline if we set high bits to 1 so that it does not look like pointer (is it true for all OSes?). But it's probably too complex.Will interface representation change break any public APIs?
--
---
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
On Fri, May 24, 2013 at 5:15 AM, Dmitry Vyukov <dvy...@google.com> wrote:
On Wed, May 22, 2013 at 8:57 PM, Dmitry Vyukov <dvy...@google.com> wrote:
Hi,
I want to share my rough plan on improving Go GC and memory allocator:
https://docs.google.com/document/d/1HCPu3WKyCX3ZRYxmIMKTk0Ik1dePxKW1p02k3uhcft4/edit?usp=sharing
I don't have all the details, and I don't have a timeline.
But since these are very significant changes I want share it early,
hear your thoughts and feedback, and ideally get a buy in from
somebody for compiler/linker part (GC info generation) and parts of
runtime work.
Thanks!
Carl pointed that GC bitmasks (bit per word saying - pointer/not pointer) will conflict with interface -- the data word can be a pointer and not a pointer for the same object.So what kind of objects does this happen with? I would imagine most objects do not fall into this category. Certainly stacks, maybe interface values, others? Perhaps a special case can be made for these (for stacks, there already is).
On Fri, May 24, 2013 at 5:41 PM, Keith Randall <k...@google.com> wrote:
On Fri, May 24, 2013 at 5:15 AM, Dmitry Vyukov <dvy...@google.com> wrote:
On Wed, May 22, 2013 at 8:57 PM, Dmitry Vyukov <dvy...@google.com> wrote:
Hi,
I want to share my rough plan on improving Go GC and memory allocator:
https://docs.google.com/document/d/1HCPu3WKyCX3ZRYxmIMKTk0Ik1dePxKW1p02k3uhcft4/edit?usp=sharing
I don't have all the details, and I don't have a timeline.
But since these are very significant changes I want share it early,
hear your thoughts and feedback, and ideally get a buy in from
somebody for compiler/linker part (GC info generation) and parts of
runtime work.
Thanks!
Carl pointed that GC bitmasks (bit per word saying - pointer/not pointer) will conflict with interface -- the data word can be a pointer and not a pointer for the same object.So what kind of objects does this happen with? I would imagine most objects do not fall into this category. Certainly stacks, maybe interface values, others? Perhaps a special case can be made for these (for stacks, there already is).I think for stacks we can get precise info. In each safepoint it's either pointer or not, even if we reuse stack space across scopes.Can it happen inside of maps?
reflect has (Value).InterfaceData() for direct access to the interface
> Will interface representation change break any public APIs?
>
words. I don't think you can use that call for anything useful, so
maybe it's ok to break it?
Will each memory section have it's own copy of the bitmask? Then a
possibility would be to have the interface generation functions
clear/set the bitmask, depending on whether it's word contains a value
or a pointer.
I had a different idea about improving the way we record type info in the heap. Perhaps it would compliment your plan.The idea is that most of the heap is probably just a few object types. So we can reserve some MSpans to hold just one type of object. Kind of like the size classes we have now, but divided up by type instead of size class. Of course, that means a lot more classes, so we can't do this for every type. Instead, we keep a count of allocations by type and when the count of allocations exceed a threshold, we promote that type to have its own "class", meaning from then on it gets allocated as a separate class (and no longer allocated by size class).The benefit is that we can record type info at the MSpan level instead of at the individual object level. At least, for MSpans which correspond to a type class. We also get the side benefit of saving some space if the type doesn't fit exactly in a size class. We need type info per object only for the old size class MSpans.There are still some problems to work through. How are arrays handled? And what if the number of type classes grows too large (maybe large = as many as size classes we have now = 61)? Can we "unpromote" a type class later? We could reset (or decay) the allocation counts at each GC and rechoose the type classes based on the top N allocations.
2. GC redesign.
Problems:
2.1 GC is complex
2.2 type info is complex, fat and slow to process
2.3 lots of scattered memory accesses and indirectionOn Wed, May 29, 2013 at 3:24 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
> From the document:
>
>
> 2. GC redesign.
>
> Problems:
>
> 2.1 GC is complex
>
> 2.2 type info is complex, fat and slow to process
>
> 2.3 lots of scattered memory accesses and indirection
>
> My opinion:
>
> 2.1 True
> 2.2 The current GC hasn't been optimized yet
> 2.3 The current GC hasn't been optimized yet
Do you think it can be substantially faster (to outweigh complexity)
than what I propose?
Is it possible to rewrite that code in some other way so that it doesnot use such interfaces?
1. The proposal increases memory consumption by adding an uintptr to the start of each object. We can get a fairly accurate idea about the final memory consumption of the proposal by changing the case MTypes_Empty in function runtime·settype() to allocate MTypes_Words instead of MTypes_Bytes.2. How does the proposal deal with alignment guarantees? If a program allocates 16 bytes, the expectation is for the data to be aligned to 16 bytes.
I realize that these constraints mean it is not possible to just reuse implementations of other languages. But I don't see that as necessarily a bad thing. If you could implement Go in terms of other languages, what would be the point of having Go?
On Thu, May 30, 2013 at 12:27 PM, Dmitry Vyukov <dvy...@google.com> wrote:On Thu, May 30, 2013 at 8:23 PM, Russ Cox <r...@golang.org> wrote:
> On Thu, May 30, 2013 at 12:11 PM, ⚛ <0xe2.0x...@gmail.com> wrote:
>> 2. How does the proposal deal with alignment guarantees? If a program
>> allocates 16 bytes, the expectation is for the data to be aligned to 16
>> bytes.
>
>
> This is a serious problem. Per-object headers are not okay, for this reason.Where do we guarantee it?
It is not guaranteed by the spec, but it has historically been guaranteed by the implementation, and we have stated on mailing lists and such that it will continue to hold. I believe this is an important property of the current allocator.
Allocate memory and use it with something that requires alignment.
Nothing in the language per se requires alignment, but since Go is a systems programming language it is common to run into system alignment requirements. For example, Linux I/O on a file opened with syscall.O_DIRECT requires page alignment, and on modern CPUs various vector instructions require 16- or 32-byte alignment even though the basic values are smaller. I am sure there are more examples.
On Wed, May 29, 2013 at 6:27 AM, Ian Lance Taylor <ia...@golang.org> wrote:
I just looked at encoding/binary. The small values are global structures so there should be no need for a heap allocation as the interface can point to the structure directly. If this is not done, it should be relatively easy to implement.
How about putting types at the end of the span? Something like this:
span: [obj0 obj1 ... objN type0 type1 ... typeN]
that way objects are packed and aligned correctly, but we can still get to the types efficiently. I'm thinking we use a page table like what Dmitry proposed:
struct PTEntry
{
uint8 sizeclass; // element size class (we have 61 classes right now)
uint8 offset; // offset from span beginning in pages (at most 15 right now - there's a few unused bits here)
};
An errno value is usually a very small integer. We can preallocate a range of small integer values around zero. When assigning an integer to an interface, if the value is within range, the preallocated integer can be used instead.
How about putting types at the end of the span?
On Sat, Jun 1, 2013 at 1:25 AM, Keith Randall <k...@google.com> wrote:Yes, this looks like the next best option after object headers.
> How about putting types at the end of the span? Something like this:
>
>
> span: [obj0 obj1 ... objN type0 type1 ... typeN]
I've prototyped it, and it showed 5% memory usage reduction (with no GC bitmap).
Probably the types must go in reverse order -- typeN ... type0.
If we have type info at the end of the span, it must also contain span
> that way objects are packed and aligned correctly, but we can still get to
> the types efficiently. I'm thinking we use a page table like what Dmitry
> proposed:
>
>
> struct PTEntry
>
> {
>
> uint8 sizeclass; // element size class (we have 61 classes right now)
>
> uint8 offset; // offset from span beginning in pages (at most 15 right now
> - there's a few unused bits here)
>
> };
size in pages. It will be needed to compute type info address.
I would prefer to store size instead of sizeclass. There is no easy
way to compute size from sizeclass. Currently we just do an additional
memory load.
And then you have a performance cliff as soon as someone uses a large enum value.
What is the problem with storing non-pointers in interface values? Can't you use the other half of the interface value to supply type information?
A stack frame is equivalent to a Go struct, there is nothing special about it. (Unless we want a stack frame to have different structure depending on the offset of the IP register from the start of the function, or to have the contents of a stack frame to partially describe its structure.)
On Tue, Jun 4, 2013 at 1:37 PM, ⚛ <0xe2.0x...@gmail.com> wrote:A stack frame is equivalent to a Go struct, there is nothing special about it. (Unless we want a stack frame to have different structure depending on the offset of the IP register from the start of the function, or to have the contents of a stack frame to partially describe its structure.)
It is more complicated than a struct.
It depends on the level of GC precision that is expected from the design of the garbage collector. What precision are you aiming for?
On Wed, May 22, 2013 at 8:57 PM, Dmitry Vyukov <dvy...@google.com> wrote:Hi,
I want to share my rough plan on improving Go GC and memory allocator:
https://docs.google.com/document/d/1HCPu3WKyCX3ZRYxmIMKTk0Ik1dePxKW1p02k3uhcft4/edit?usp=sharing
I don't have all the details, and I don't have a timeline.
But since these are very significant changes I want share it early,
hear your thoughts and feedback, and ideally get a buy in from
somebody for compiler/linker part (GC info generation) and parts of
runtime work.
Thanks!
Carl pointed that GC bitmasks (bit per word saying - pointer/not pointer) will conflict with interface -- the data word can be a pointer and not a pointer for the same object.The point is that it would be very-very nice if we can use just1 bit to represent type info. It would reduce memory consumption and
working set, simplify and speedup GC code.
One possible solution is to not store data inline (always allocate, as gccgo does).Another solution is to store inline only pointers (and zero-size structs). I suspect that lots of interface values are pointers.It's also possible to allocate object with sizes 1..7 inline if we set high bits to 1 so that it does not look like pointer (is it true for all OSes?). But it's probably too complex.Will interface representation change break any public APIs?