Reducing gc peak memory use

506 views
Skip to first unread message

Josh Bleecher Snyder

unread,
May 20, 2015, 3:01:29 PM5/20/15
to golan...@googlegroups.com
I've been slowly shrinking gc's memory footprint, and I think I've
gotten all the really low-hanging fruit, so I wanted to check in
before I continue hacking at it.

CL 10211 has a report on gc's peak memory use vs 1.4.2 on a
memory-hungry test. In short, approximately +31%, but with higher
variance.

gc.Node is now at 312 bytes. The next smaller malloc class is 288
bytes; getting there would mean shaving 24 bytes off. That'd reduce
Node size by 7.5%, which probably translates to ~3.5% less overall
memory use. How we could get there:

* Escflowsrc and Escretval are used only during escape analysis. They
could be moved either into the Opt field or into a map in EscState
keyed by *Node. Escloopdepth might also be movable. That's 16-20
bytes.
* Initplan could be removed entirely, on pain of doing a tiny bit of
fairly cheap duplicate work in sinit.go. That's 8 bytes.
* Converting bools to flags would cut 10 bytes.

Opinions?

-josh

Josh Bleecher Snyder

unread,
May 20, 2015, 3:37:27 PM5/20/15
to golan...@googlegroups.com
> I've been slowly shrinking gc's memory footprint, and I think I've
> gotten all the really low-hanging fruit, so I wanted to check in
> before I continue hacking at it.

I should add that the other memory-hungry type is obj.Prog. I don't
see any obvious angle of attack, but someone else might. (Could
ProgInfo safely become *ProgInfo?) Prog is currently 456 bytes; the
next smaller malloc classes are 448 and 416, so even cutting just 8
bytes off Prog will make a small but noticeable difference. If you see
a way, please say something or send a CL!

Changing the less-frequently used Addr fields (To2, From3) from Addr
to *Addr would have a big impact, but that change is broad and subtle.

-josh

Ingo Oeser

unread,
May 20, 2015, 6:56:06 PM5/20/15
to golan...@googlegroups.com
I tried converting some of the NodeList values into slices, but had trouble understanding how they are linked.

I started with Dcl and InlDcl here but getting segfaults when starting to use the compiler to compile itself.

I could send a (broken!) CL with that work if you are curious.

It will get rid of a lot of Next fields in many places.

Josh Bleecher Snyder

unread,
May 20, 2015, 7:22:17 PM5/20/15
to Ingo Oeser, golan...@googlegroups.com
> I tried converting some of the NodeList values into slices, but had trouble understanding how they are linked.
>
> I started with Dcl and InlDcl here but getting segfaults when starting to use the compiler to compile itself.
>
> I could send a (broken!) CL with that work if you are curious.

I started on this as well. You can see my flailing at
https://github.com/josharian/go/commits/wip/gc-remove-nodelist
(probably not a permanent URL, sorry).

The first few commits are stable (although they need rebasing and
cleanup); they remove specific quirky standalone uses in sinit.go,
esc.go, typecheck.go, order.go, etc. I'll probably mail those early in
the Go 1.6 cycle.

Then I added methods to NodeList to capture common things that get
done with them and changed all uses of NodeList to use those methods.
I then changed NodeList to be []*Node and re-implemented the methods.
It was never going to be a real fix, but I was curious about the
performance impact.

It didn't work and I eventually gave up. However, it illustrated that
there's some hidden complexity there. For example, that "Poison
pointer" bit in list1 is not just an optimization, it is currently
required for correctness. And it took me a long time to figure out how
to rewrite the code in order.go so that it made sense.


> It will get rid of a lot of Next fields in many places.

True, and even more importantly, it would make the code much more
readable. However, the goal of this original email is to make the Node
and Prog structs smaller. NodeLists don't contribute much to the
overall memory usage of the compiler, even though they do account for
a non-trivial percentage of its allocations.

-josh

Ingo Oeser

unread,
May 20, 2015, 8:08:00 PM5/20/15
to golan...@googlegroups.com, night...@googlemail.com


On Thursday, May 21, 2015 at 1:22:17 AM UTC+2, Josh Bleecher Snyder wrote:
> I tried converting some of the NodeList values into slices, but had trouble understanding how they are linked.
>
> I started with Dcl and InlDcl here but getting segfaults when starting to use the compiler to compile itself.
>
> I could send a (broken!) CL with that work if you are curious.

I started on this as well. You can see my flailing at
https://github.com/josharian/go/commits/wip/gc-remove-nodelist
(probably not a permanent URL, sorry).


Thanks for that link! 

Will play with it. And yes, I agree this is too risky for 1.5 and certainly 1.6 material.

Ingo Oeser

unread,
May 20, 2015, 8:22:15 PM5/20/15
to golan...@googlegroups.com


On Wednesday, May 20, 2015 at 9:37:27 PM UTC+2, Josh Bleecher Snyder wrote:
Changing the less-frequently used Addr fields (To2, From3) from Addr
to *Addr would have a big impact, but that change is broad and subtle.


Making only Addr.To2 a pointer is actually fairly straightforward and affects only arm64 AFAICS:

 7g/peep.go                 |    4 ++--
 asm/internal/asm/asm.go    |    6 ++++--
 internal/obj/arm64/asm7.go |    2 +-
 internal/obj/link.go       |    2 +-
 internal/obj/util.go       |    4 ++--
 internal/obj/x86/obj6.go   |    2 +-
 6 files changed, 11 insertions(+), 9 deletions(-)

And that saves 72 bytes on 64bit already, so seems a worthy trade-off to me.

What do you think?

Josh Bleecher Snyder

unread,
May 20, 2015, 8:34:42 PM5/20/15
to golan...@googlegroups.com
>> Changing the less-frequently used Addr fields (To2, From3) from Addr
>> to *Addr would have a big impact, but that change is broad and subtle.
>>
>
> Making only Addr.To2 a pointer is actually fairly straightforward and
> affects only arm64 AFAICS:
>
> 7g/peep.go | 4 ++--
> asm/internal/asm/asm.go | 6 ++++--
> internal/obj/arm64/asm7.go | 2 +-
> internal/obj/link.go | 2 +-
> internal/obj/util.go | 4 ++--
> internal/obj/x86/obj6.go | 2 +-
> 6 files changed, 11 insertions(+), 9 deletions(-)
>
> And that saves 72 bytes on 64bit already, so seems a worthy trade-off to me.
>
> What do you think?

Have you confirmed with toolstash -cmp that there is no functional
change? (I generally use
https://github.com/rsc/toolstash/blob/master/buildall to check all
architecture/OS combinations.)

Want to mail a CL so that we can look at the diff?

The challenge of switching to a pointer (and the reason I called this
change subtle and broad) is making sure that we preserve copy vs alias
behavior everywhere it matters. x.To2 = y.To2 means different things
if To2 is an Addr vs a *Addr, and such assignments are all over the
place.

-josh

Ingo Oeser

unread,
May 20, 2015, 10:37:35 PM5/20/15
to golan...@googlegroups.com


On Thursday, May 21, 2015 at 2:34:42 AM UTC+2, Josh Bleecher Snyder wrote:
Have you confirmed with toolstash -cmp that there is no functional
change? (I generally use
https://github.com/rsc/toolstash/blob/master/buildall to check all
architecture/OS combinations.)

Good idea! Just did it, looks good.
 
Want to mail a CL so that we can look at the diff?

 Just mailed out CL 10272 with you as reviewer.

The challenge of switching to a pointer (and the reason I called this
change subtle and broad) is making sure that we preserve copy vs alias
behavior everywhere it matters. x.To2 = y.To2 means different things
if To2 is an Addr vs a *Addr, and such assignments are all over the
place.


I changed all places I could find by the describe function of go oracle. If the change looks ok, 
I would love to have a try bot run, since I don't own an arm64.

PS: Thanks for your feedback, I value it very much!

Russ Cox

unread,
May 21, 2015, 4:21:57 PM5/21/15
to Josh Bleecher Snyder, Ingo Oeser, golan...@googlegroups.com
I'm in favor of reducing our dependence on NodeList as far as that makes sense, so thanks for looking into it, but does removing NodeList help memory? It seems to me that a nil *NodeList is 1 word while a nil []*Node is 3 words. In the Node struct I would expect you might want *[]*Node to avoid tripling the footprint of the empty list. It's definitely true that once there is more than one element, the slice is better. And slices are better than linked lists in general, so I'm willing to pay the cost for the 1-element lists.

Russ

Josh Bleecher Snyder

unread,
May 21, 2015, 6:34:42 PM5/21/15
to Russ Cox, Ingo Oeser, golan...@googlegroups.com
> I'm in favor of reducing our dependence on NodeList as far as that makes
> sense, so thanks for looking into it, but does removing NodeList help
> memory? It seems to me that a nil *NodeList is 1 word while a nil []*Node is
> 3 words. In the Node struct I would expect you might want *[]*Node to avoid
> tripling the footprint of the empty list.

Yes, my prototypes used *[]*Node. I was also going to experiment with
NodeLists that contained multiple *Nodes per list element. The key to
experimenting with this is removing dependence on the implementation
details of NodeList, which proved less tractable than I hoped. I plan
to resume that effort for Go 1.6.

-josh
Reply all
Reply to author
Forward
0 new messages