--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
2. Rewrite escape analysis. I've documented my progress at https://github.com/golang/go/issues/23109 and I have a working rewrite at https://github.com/mdempsky/go/blob/esc5/src/cmd/compile/internal/gc/esc2.go that's about half the size of esc.go. This might be a bit messier, unfortunately, because the new code both better matches Go implementation details (so escape analysis tags differ in some cases, ultimately resulting in some objects changing whether they're stack- or heap-allocated), and it's also much simpler (so some of the extra debugging messages that many regress tests are written to expect don't make sense).
Do you have a sense of how many stack allocations we are going to lose?
I am very happy with how good the escape analysis has gotten at keeping things on the stack and it would be good not to regress.
regexp/regexp.go:617:2: sreplnet/nss.go:137:31: ccrypto/x509/verify.go:587:2: comparisonCountcrypto/x509/x509.go:1117:39: dnsNamescrypto/x509/x509.go:1117:49: emailAddressescrypto/x509/x509.go:1117:74: ipAddressescrypto/x509/x509.go:1117:96: uriscmd/go/internal/modfetch/cache.go:218:29: revcmd/go/internal/modload/import.go:211:59: dircmd/go/internal/modload/import.go:230:7: dcmd/compile/internal/gc/typecheck.go:359:17: n
type Closure struct {yp *interr error}//go:noinlinefunc g(f *Closure) error {return f.err}func F() error {var y intreturn g(&Closure{yp: &y,err: nil,})}
--
I plan to make the compiler start generating new POWER9 instructions on ppc64/ppc64le. For that purpose, I’ll propose a method similar to GOARM to make sure this doesn’t break things for the current default (POWER8). I’m also planning some runtime performance optimizations for POWER9.
To test those, I’m also setting up a new POWER9 builder.
--
For 1.13, I plan to finalise aix/ppc64 port:
- fix the last remaining bugs
- port cgo (which is almost done locally btw)
- port golang.org/x packages
- port pprof
- port remaining linker/compiler modes if I still have time.
Sincerely
--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
My implementation heap allocates these 11 variables, that esc.go currently stack allocates:regexp/regexp.go:617:2: sreplnet/nss.go:137:31: ccrypto/x509/verify.go:587:2: comparisonCountcrypto/x509/x509.go:1117:39: dnsNamescrypto/x509/x509.go:1117:49: emailAddressescrypto/x509/x509.go:1117:74: ipAddressescrypto/x509/x509.go:1117:96: uriscmd/go/internal/modfetch/cache.go:218:29: revcmd/go/internal/modload/import.go:211:59: dircmd/go/internal/modload/import.go:230:7: dcmd/compile/internal/gc/typecheck.go:359:17: n
Escape analysis fundamentally models the Go source as a bunch of abstract locations (e.g., x, y, z), and assignments between them (e.g., x = &y, x = y, x = *y, x = **y, ...).An indirect call like "x = f()" is modeled as "x = *f" because f could potentially return any value stored in its closure. (It could also return any value that already escaped to the heap, but those aren't interesting.)A function literal like "func() error { y++; return nil }" needs to store a reference to y in its closure. This needs to be modeled as "closure = &y", because if the closure leaks to the heap, then y also needs to leak to the heap.
So working from this example again://go:noinlinefunc g(f func() error) error {return f()}func H() error {var y intreturn g(func() error {y++return nil})}There are a few relevant details about how esc.go handles this code:1. It first analyzes g and summarizes it as "~r0 = *f" (using EscReturnBits in the esc tags). This is fine.2. At L1064, we associate "H.func1 = &y" because y is captured by reference.3. At L1269, there's code in escassign to rewrite "dst = H.func1" into "dst = &H.func1", since function literals are actually a reference type.4. The call to "g(H.func1)" gets handled by escassignfromtag, which rewrites to ".out0 = *H.func1" (based on the summary from step 1 above) and calls escassign. (See L1479.)The problem is the interaction between 3 and 4: escassign doesn't recurse on src when src.Op == ODEREF, so ".out0 = *H.func1" is left as-is, rather than being rewritten to ".out0 = *&H.func1".
But both of these cases seem worth preservingif at all possible, and it does look possible to me.
I remain convinced though that the regressions are related to an off-by-1 error in the current code though that happens to get lucky in these cases, and not by design.
It's easy to believe there are bugscancelling each other out or going unnoticed.But the code does not just "happen to get lucky".Repeating that so many times dismisses the significantengineering work by others and is frankly a bit off-putting.
I’m really sorry. I don't mean to dismiss the engineering effort put into esc.go. I know you, David, Cherry, and others have put very significant work into improving escape analysis, investigating issues, writing test cases, and more over the years. I've always been impressed with that work. esc.go is a very subtle body of code that has accumulated a lot of complexity as a result of its long history. I'm personally thankful for the extensive tests written for it: they proved very instructive while I was learning how esc.go works.
By "not by design," I meant just that the interaction of esc.go routines that resulted in those objects being stack allocated seems unintentional; and by "happens to get lucky," I was referring to the affected code patterns were indeed always safe to stack allocate. I suggested this as a hypothesis earlier, and David supported that assessment. Moreover, having spent a while understanding esc.go, I agreed with his nervousness about the closure handling code in particular. In that context, I thought David agreed with my phrasing when he repeated it, and I used it again in two more emails (which were otherwise focused on factually discussing the code) to make clear I was continuing to refer back to that same original hypothesis.
I certainly did not mean to suggest esc.go was not designed or not thoughtfully engineered. I did not mean any personal offense to anyone.
That said, I found your earlier response very frustrating.
I have a lot of respect for you and your work, so when you showed a modicum of enthusiasm for mine and asked about regressions, I dropped everything to get you an up-to-date comparison. I also detailed those minority of cases that regressed, and then spent hours digging into the root cause. I’d already spent days looking into them before, but this time I was motivated by wanting to impress you. And I did track down why and explained it. I was brief to respect your time and general expertise, but I would have been happy to elaborate.
But your response? First, you lectured me about closure capture-by-value by capture-by-reference, as though that’s not something any experienced C++ programmer knows or covered in introductory compiler texts. As though I was specifying “captured by reference” in all of my emails because I just like to be verbose.
Then, you didn’t believe me when I’d already enumerated all of the regressions for you and said they were the same issue. (Sorry, I over simplified: the par.Cache.Do regressions actually involve esc.go:1465 rather than esc.go:1482.) No inquiry into why I thought that. Instead, you immediately started listing additional issues you assumed were in my code and asserted those were the actual faulty code patterns. Well sorry, you lost your “bet”: my rewrite handles both of your “g” functions without escaping x to the heap.
But most annoyingly, you dismissed my explanation and then misexplained how the existing code works. “The return-tracking bits don't really matter”? Then explain why new(int) only escapes in F1 here. The only difference is whether the call* function returns the result of f().
And do you know why new(int) escapes there, but y doesn’t? Because there’s an off-by-1 calculation in esc.go’s connectivity-graph edge construction code. Let me disregard your decades of programming expertise to patronizingly remind you: off-by-1 errors have a long history of causing issues and security vulnerabilities. And in the context of esc.go specifically, both of the memory corruption errors I linked above were instances of this same problem. So the fact that here another off-by-1 error instead resulted in a safe compiler optimization? I’d love to know how you describe that other than “lucky.”
I put weeks of effort into thoroughly understanding esc.go and wrote a new implementation that halves the code complexity, reduces compiler memory usage, fixes multiple memory corruption issues (#29197, #29353), and overall significantly improves escape analysis results. I think I deserve some credit that maybe I understand how esc.go works now.
Yes, my rewrite currently causes y to unnecessarily escape too, which is a legitimate performance regression concern. And yes, because of this tedious exchange I’m going to improve it even further so that neither y nor new(int) escape.
But your email was factually wrong, and I think it’s fair for me to have stood up for myself about that. And even though your email was also obnoxious, instead of publicly calling you out on that or directly saying “you’re wrong,” I charitably assumed the misunderstanding was because I inadequately explained the problem in the first place, so I tried again to give a clearer example.
Now, instead of owning up to your error, you instead suddenly take offense with my phrasing.
esc.go was described earlier in this very thread by another senior Go engineer as “has never been very maintainable” and “so complex” and mentioned its “lack of documentation,” and you even “seconded” their message. But calling esc.go “lucky” is the affront to engineering? Please.
They wrote lots of tests and made sure the codepassed those tests. The tests are the reasonthe code works on these cases, not luck.
--
I also apologize for the criticism about calling code lucky. I was trying to push back on the general tendency I've observed from many people, especially with respect to the more subtle parts of the Go compiler and runtime, to assume that code that doesn't make sense must be working entirely by accident.
(It was a big aha moment for me when I realized that the answer to "how could this possibly have worked?" is almost always "because there was a test".)
I think you can do that but still stay within those rules of conduct by
stating that you think you've been misunderstood, perhaps you weren't
clear, they haven't grasped you point, etc. Then the annoyance isn't
being bottled up and the other party gets a prompt to consider what
they've missed. :-)
IME it's quite common to have two off-by-one errors `cancel' each other
in complex areas, especially with test-driven development as the passing
of tests helps reassure against the nagging doubt the logic isn't fully
understood. So is it more goal-directed hill climbing than luck?