Go 1.4 Beta 1 is released


Andrew Gerrand

unread,
Oct 29, 2014, 11:29:32 PM10/29/14
to golang-nuts
Hi Go nuts,

We have just released go1.4beta1, a beta version of Go 1.4.
It is cut from the default branch at a revision tagged as go1.4beta1.

Please help us by testing your Go programs with the release, and report any problems using the issue tracker:

You can download binary and source distributions from the usual place:
    http://golang.org/dl/#go1.4beta1

To find out what has changed in Go 1.4, read the release notes:

Documentation for Go 1.4 is available at:

Our goal is to release the final version of Go 1.4 on December 1, 2014.

Andrew

Davide D'Agostino

unread,
Oct 30, 2014, 12:17:02 AM10/30/14
to golan...@googlegroups.com
Awesome news! 

That's a big release! Runtime and GC!

Congrats, guys!

Wael M. Nasreddine

unread,
Oct 30, 2014, 12:31:06 AM10/30/14
to Davide D'Agostino, golan...@googlegroups.com
Awesome news!

One minor issue, in the release notes, in the sentence:

A brief description of the plans for this experimental port are available here.

The "here" link above leads to a 404.

Wael

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andrew Gerrand

unread,
Oct 30, 2014, 12:38:46 AM10/30/14
to Wael M. Nasreddine, Davide D'Agostino, golan...@googlegroups.com
Ah, the link should be http://golang.org/go14android — which is what it will be when the release is complete. Probably not worth changing at this point.

Wael M. Nasreddine

unread,
Oct 30, 2014, 12:39:44 AM10/30/14
to Andrew Gerrand, Davide D'Agostino, golan...@googlegroups.com
Thanks Andrew, the same applies for "Details are available in the design document."

Bradford H. Cook

unread,
Oct 30, 2014, 12:53:57 AM10/30/14
to golan...@googlegroups.com, wael.na...@gmail.com, in...@daddye.it
The link might be this (Google Doc).  Dated June 2014.  Hope it helps!

daniel....@gmail.com

unread,
Oct 30, 2014, 1:26:48 AM10/30/14
to golan...@googlegroups.com

TODO: It may be bumped to 4096 for the release.

Did it get bumped to 4096?

Dave Cheney

unread,
Oct 30, 2014, 1:42:24 AM10/30/14
to golan...@googlegroups.com
Nope, but it might make it into the final release in a month.

Thorsten Sommer

unread,
Oct 30, 2014, 2:39:38 AM10/30/14
to golan...@googlegroups.com


Dear Go team,

thanks a lot for the great work :-)


Best regards,
Thorsten.

Emily Maier

unread,
Oct 30, 2014, 7:24:40 AM10/30/14
to golang-nuts

What's the status of passing Go pointers to C in 1.4? I know that's
supposed to be unsupported eventually, will the GC changes in this
release break it?

Emily

Rob Pike

unread,
Oct 30, 2014, 9:06:35 AM10/30/14
to golang-nuts
Nothing has changed. Users of cgo must understand that if the C heap holds the only pointer to an item in the Go heap, the garbage collector cannot see that and the memory might be freed. If that happens, it's the cgo user's error, not Go's.

-rob

Carlos Castillo

unread,
Oct 30, 2014, 9:31:23 AM10/30/14
to golan...@googlegroups.com
AFAIK contiguous stacks move (to be resized), so wouldn't passing a pointer to a Go stack object be bad now?

Rob Pike

unread,
Oct 30, 2014, 9:51:28 AM10/30/14
to Carlos Castillo, golan...@googlegroups.com
You can't pass a pointer to the Go stack to C: that would be an "escape" and the object would move to the heap first.

-rob


gauravk

unread,
Oct 30, 2014, 10:03:36 AM10/30/14
to golan...@googlegroups.com
Reading through release notes:

"Sometimes one wishes to have components that are not exported, for instance to avoid acquiring clients of interfaces to code that is part of a public repository but not intended for use outside the program to which it belongs."

Should it be  "..but intended.." ?

Thanks

Jan Mercl

unread,
Oct 30, 2014, 10:11:09 AM10/30/14
to Andrew Gerrand, golang-nuts
On Thu, Oct 30, 2014 at 4:29 AM, Andrew Gerrand <a...@golang.org> wrote:

> We have just released go1.4beta1, a beta version of Go 1.4.

FYI: In a sample of benchmarks for repositories at [0] I see some
quite big but infrequent speed improvements and, more often, rather
sizable speed regressions, both in the tens of percent or more. Top 5
result ranges by magnitude ($ benchcmp -mag old new).

b: +49% to +89%
b/example: +23% to +43%
mathutil: +85% to +190%
ql: -56% to +119% (and top 5 results for memory use increased
by about 45%)
zappy: +28% to +41%

Positive numbers are slowdowns, negative are speedups. 2 x Intel X5450
@ 3GHz machine (== 4 cores total).

[0]: github.com/cznic

-j

roger peppe

unread,
Oct 30, 2014, 10:27:15 AM10/30/14
to Jan Mercl, Andrew Gerrand, golang-nuts
I wonder if most of the cause of regression might be the change
that causes an assignment to an interface to allocate when
the value is a pointer-sized non-pointer.

For example, in benchmarkSetSeq in
https://github.com/cznic/b/blob/master/all_test.go,
it calls Set with two integers, which will now incur two
more allocations per iteration.

This may be affecting benchmark code disproportionately, as it's not
uncommon to use a dummy non-pointer value in a benchmark.

Jan Mercl

unread,
Oct 30, 2014, 10:47:33 AM10/30/14
to roger peppe, Andrew Gerrand, golang-nuts
On Thu, Oct 30, 2014 at 3:27 PM, roger peppe <rogp...@gmail.com> wrote:
> I wonder if most of the cause of regression might be the change
> that causes an assignment to an interface to allocate when
> the value is a pointer-sized non-pointer.
>
> For example, in benchmarkSetSeq in
> https://github.com/cznic/b/blob/master/all_test.go,
> it calls Set with two integers, which will now incur two
> more allocations per iteration.
>
> This may be affecting benchmark code disproportionately, as it's not
> uncommon to use a dummy non-pointer value in a benchmark.

Agreed, and I should have mentioned that. However, b/example is an
<int, int> specialization of package b which does not use interface{}
for storing the key or value. Yet there's a +23% to +43% range in the
top 5 by magnitude. That's only about half of the non-specialized case,
which I presume shows the share of the regression attributable to the
now-additional int allocations.

BTW, full results for b/example:
https://gist.github.com/cznic/d39a41d4dd4e8cf48784#file-gistfile1-sh

-j

James Bardin

unread,
Oct 30, 2014, 10:49:32 AM10/30/14
to golan...@googlegroups.com, 0xj...@gmail.com, a...@golang.org


On Thursday, October 30, 2014 10:27:15 AM UTC-4, rog wrote:
This may be affecting benchmark code disproportionately, as it's not
uncommon to use a dummy non-pointer value in a benchmark.

I'm wondering this too. I'm seeing a lot of performance regressions in benchmarks, but the overall test run time is often slightly better.

Rob Pike

unread,
Oct 30, 2014, 11:58:12 AM10/30/14
to gauravk, golan...@googlegroups.com
Correct as written. Maybe badly written, but correct.

-rob



Rob Pike

unread,
Oct 30, 2014, 1:00:14 PM10/30/14
to gauravk, golan...@googlegroups.com
Ints in interfaces allocate now, but they didn't before. Write barriers slow things down. The garbage collector is faster. The runtime has had some speedups.

The performance effects of these confounding factors are dependent on the programs. If you can isolate a single simple benchmark that illustrates a slowdown, we can investigate.

-rob

gwenn...@gmail.com

unread,
Oct 30, 2014, 2:23:27 PM10/30/14
to golan...@googlegroups.com
Hello,
I may have found a regression: it's related to the unsafe package.
A null pointer is returned in Go 1.4beta1 but not in Go 1.3.
Could you please confirm before I file an issue?
Or I can easily fix this on my side if you tell me to do so.

The real code is here:
Allocation can be avoided by using the custom 'cstring' method instead of 'C.CString'...
(SQLite will copy the value anyway: http://sqlite.org/c3ref/bind_blob.html
"If the fifth argument has the value SQLITE_TRANSIENT, then SQLite makes its own private copy of the data immediately, before the sqlite3_bind_*() routine returns.")

Thanks and Regards.

Rob Pike

unread,
Oct 30, 2014, 2:40:07 PM10/30/14
to gwenn...@gmail.com, golan...@googlegroups.com
Does it make a difference? It's an empty string either way.

-rob



gwenn...@gmail.com

unread,
Oct 30, 2014, 3:20:23 PM10/30/14
to golan...@googlegroups.com, gwenn...@gmail.com
In the SQLite DB, there is a big difference: a bound/inserted NULL value is completely different from an empty string.
Here is the test that is failing with Go1.4beta1:
https://github.com/gwenn/gosqlite/blob/master/stmt_test.go#L560
But it's perfectly ok to me if you decide to keep the new behaviour.
I will fix it on my side.
And my code may happen to be completely broken in Go1.5 due to the move/compaction performed by the GC.
So don't bother too much...
Regards.

Andrew Lytvynov

unread,
Oct 30, 2014, 4:46:36 PM10/30/14
to golan...@googlegroups.com
Would it make sense to mention https://codereview.appspot.com/164120043 in the Canonical Import Paths section?


xxs...@gmail.com

unread,
Oct 30, 2014, 5:05:22 PM10/30/14
to golan...@googlegroups.com
Has there been discussion of why it makes sense to replace the encoding/gob implementation with a 30% slower one? If so, I would appreciate a link, as it seems quite bizarre to me.
I gathered from the changelog that this makes encoding/gob available to more limited environments; however, are these limited environments so popular that it justifies hitting everyone else with a typical 30% performance loss?

robfig

unread,
Oct 30, 2014, 7:38:26 PM10/30/14
to golan...@googlegroups.com, rogp...@gmail.com, a...@golang.org
I also see a sizable performance regression for github.com/robfig/soy

benchmark                       old ns/op     new ns/op     delta
BenchmarkLexParseFeatures       1124934       1382339       +22.88%
BenchmarkExecuteFeatures        196904        238391        +21.07%

benchmark                       old allocs     new allocs     delta
BenchmarkLexParseFeatures       1969           2156           +9.50%
BenchmarkExecuteFeatures        740            1075           +45.27%

benchmark                       old bytes     new bytes     delta
BenchmarkLexParseFeatures       103802        93461         -9.96%
BenchmarkExecuteFeatures        41686         41066         -1.49%


I believe the benchmark is representative of real-world performance.  It parses and executes the templates in this file:

I guess the slowdown is probably entirely attributable to scalars stored in an interface requiring an allocation now.

Andrew Gerrand

unread,
Oct 30, 2014, 8:04:55 PM10/30/14
to Andrew Lytvynov, golan...@googlegroups.com
Yeah, I sent this: 

https://codereview.appspot.com/168890043



Caleb Spare

unread,
Oct 30, 2014, 8:21:49 PM10/30/14
to robfig, golang-nuts, rogp...@gmail.com, Andrew Gerrand
I've also seen some pretty drastic slowdowns in my own benchmarks. For example, this one parses a simple message format (statsd):

BenchmarkParseSimple      327           492           +50.46%     
BenchmarkParseComplex     621           816           +31.40%



Ian Taylor

unread,
Oct 30, 2014, 8:41:27 PM10/30/14
to gwenn...@gmail.com, golang-nuts
On Thu, Oct 30, 2014 at 12:20 PM, <gwenn...@gmail.com> wrote:
>
> In the SQLite DB, there is a big difference: the NULL value bound/inserted
> is completely different to an empty string.
> Here is the test that is failing with Go1.4beta1:
> https://github.com/gwenn/gosqlite/blob/master/stmt_test.go#L560
> But it's perfectly ok to me if you decide to keep the new behaviour.
> I will fix it on my side.

I think the new behaviour is acceptable from Go's perspective. The
length is zero either way, so the pointer tells you nothing. You can
only tell the difference if you import the unsafe package. The
distinction you are drawing between a NULL and non-NULL string value
pointer does not exist in Go, so you are already in an unspecified
netherworld.

The change was not made arbitrarily. See http://golang.org/issue/8404 .

Ian

gwenn...@gmail.com

unread,
Oct 30, 2014, 11:54:25 PM10/30/14
to golan...@googlegroups.com, gwenn...@gmail.com
Ok,
Thank you for the explanation.

Keith Randall

unread,
Oct 31, 2014, 12:20:25 AM10/31/14
to golan...@googlegroups.com, gwenn...@gmail.com
I've coded up a patch to avoid allocation when small integers are stored in interfaces.  Can people who saw performance regressions due to this problem try again with this patch?

Rob Pike

unread,
Oct 31, 2014, 2:14:10 AM10/31/14
to xxs...@gmail.com, golan...@googlegroups.com
A number of reasons. First, as Andrew mentioned, it works better with the garbage collector. Many of the unsafe operations it was doing will not work correctly with the new collector. Proof: This comment from decode.go has been deleted:

// TODO(rsc): When garbage collector changes, revisit
// the allocations in this file that use unsafe.Pointer.

That is no small thing. Those allocations are gone, and what replaced them won't need that kind of attention in future.

Second, it is now totally portable. Not only does that mean it can be used in environments where unsafe is disallowed, it is no longer sensitive to changes in the runtime. It previously needed updates when runtime representations changed, often in the subtlest parts of a very subtle implementation. (It's much less subtle now.) Also, in the process of doing the change I discovered a number of places where unsafe operations were not only unsafe, but also violated invariants that unsafe exposed, errors that would have caused problems later.

The story is not over yet. We clawed back a lot of the speed by careful tuning and custom code, and we will surely find more to recover. We have some ideas that will roll out in 1.5 (they missed the 1.4 cutoff). Plus, there remain clean, focused options for unsafe to be put back (under build tags) that could be done with confidence, but we don't want to go there yet.

Finally, the actual slowdown is highly data-dependent, and also not entirely due to the changes to gob itself. The runtime also contributes.

-rob




Rob Pike

unread,
Oct 31, 2014, 2:48:20 AM10/31/14
to xxs...@gmail.com, golan...@googlegroups.com

One more thing. A bug fix plus the introduction of atomic.Value makes it significantly faster under heavy load on multiple cores.

As always with performance, the full story is not well captured by a simple benchmark number.

-rob

Gustavo Niemeyer

unread,
Oct 31, 2014, 8:08:47 AM10/31/14
to Rob Pike, xxs...@gmail.com, golan...@googlegroups.com
It is also memory safe now, which ironically hasn't been mentioned as an advantage by itself. I would personally be willing to give up some performance for it. Hopefully the future performance improvements can be made without having the unsafe logic back.

Rob Pike

unread,
Oct 31, 2014, 9:41:55 AM10/31/14
to Gustavo Niemeyer, xxs...@gmail.com, golan...@googlegroups.com
Indeed, that's the main reason, and I did say that, just not as succinctly. "Works better with the garbage collector" means "works", and the invariants it was violating would corrupt the heap.

-rob

alexandr...@gmail.com

unread,
Oct 31, 2014, 10:41:12 AM10/31/14
to golan...@googlegroups.com, gus...@niemeyer.net, xxs...@gmail.com
I ran a benchmark of my library Gomail (it builds an email). And like others benchmarks mentioned here, it shows a ~30% slowdown:

benchmark         old ns/op     new ns/op     delta
BenchmarkFull     143705        189744        +32.04%

benchmark         old allocs     new allocs     delta
BenchmarkFull     322            336            +4.35%

benchmark         old bytes     new bytes     delta
BenchmarkFull     38328         38287         -0.11%

Brendan Tracey

unread,
Oct 31, 2014, 11:17:36 AM10/31/14
to golan...@googlegroups.com, gus...@niemeyer.net, xxs...@gmail.com, alexandr...@gmail.com
Could someone educate me on what determines the optimal goroutine stack size? I understand that the larger the stack becomes, the more memory each goroutine uses, limiting the maximum number of goroutines. What's the pressure in the other direction? Likelihood of copying and moving the stack immediately? If so, in theory could that be improved with better static analysis?

Keith Randall

unread,
Oct 31, 2014, 12:50:16 PM10/31/14
to golan...@googlegroups.com, gus...@niemeyer.net, xxs...@gmail.com, alexandr...@gmail.com


On Friday, October 31, 2014 8:17:36 AM UTC-7, Brendan Tracey wrote:
Could someone educate me on what determines the optimal goroutine stack size? I understand that the larger the stack becomes, the more memory each goroutine uses, limiting the maximum number of goroutines. What's the pressure in the other direction? Likelihood of copying and moving the stack immediately? If so, in theory could that be improved with better static analysis?

Yes, the smaller the stack the more likely you'll run out of space and have to grow it.  Static analysis might help in some cases.  For example, you could start with a larger stack for goroutines with known large stack requirements.  I don't think it would be easy to do the static analysis, however.  I suspect stack use by goroutines is highly data-dependent.

The cost of copying is not very large, so we can afford to start the stacks pretty small.  We're just playing it safe by ratcheting down the starting size slowly.

Keith Randall

unread,
Oct 31, 2014, 2:03:11 PM10/31/14
to golan...@googlegroups.com, gus...@niemeyer.net, xxs...@gmail.com, alexandr...@gmail.com
About 10% of this is the new memory barriers.
The other 20% is GC related; with GOGC=off the difference goes away.
GC seems to be triggering more often and taking somewhat longer each time.  This is strange, the 1.4 garbage collector should be generally faster than 1.3.  I'll investigate some more.

Rob Pike

unread,
Oct 31, 2014, 2:34:50 PM10/31/14
to Keith Randall, golan...@googlegroups.com, Gustavo Niemeyer, xxs...@gmail.com, alexandr...@gmail.com
That's odd. My data show that it's significantly faster. Definitely worth investigating.

-rob

Michael Jones

unread,
Oct 31, 2014, 2:48:50 PM10/31/14
to Rob Pike, Keith Randall, golan...@googlegroups.com, Gustavo Niemeyer, xxs...@gmail.com, alexandr...@gmail.com
I have several carefully written programs that run at 40% of the former speed. Perversely, the programs whose performance I care most about. I have other programs that are within noise of the same performance, and a few +/- 3%. The slow ones, though, have me pulling my hair out. That's a big problem, because I don't have much hair in the first place.
Michael T. Jones | Chief Technology Advocate  | m...@google.com |  +1 650-335-5765

Nate Finch

unread,
Oct 31, 2014, 2:58:54 PM10/31/14
to golan...@googlegroups.com
I don't know if this counts as a regression, but using 1.4 go tool with gccgo 4.9.1 breaks all tests with an error like this:

foo/_test/_testmain.go:52:15: error: reference to undefined identifier ‘testing.MainStart’
  m := testing.MainStart(matchString, tests, benchmarks, examples)
               ^

I see mention of gcc 5 supporting Go 1.4, but no indication of how to get such a thing.

Chris Manghane

unread,
Oct 31, 2014, 3:45:22 PM10/31/14
to Nate Finch, golan...@googlegroups.com

gcc5 will likely support Go 1.4, but it is not released yet. GCC trunk currently supports 1.3.3, and GCC 4.9.1 comes with 1.2 (not 1.3). There is currently no version of gccgo that supports the Go 1.4 language and runtime.



Nate Finch

unread,
Oct 31, 2014, 4:02:23 PM10/31/14
to golan...@googlegroups.com, nate....@gmail.com
There should probably be a big caution sign posted somewhere that if you upgrade to 1.4, it'll break usage of gccgo with the go tool until gcc 5 is out.

Keith Randall

unread,
Oct 31, 2014, 7:45:58 PM10/31/14
to Rob Pike, golan...@googlegroups.com, Gustavo Niemeyer, xxs...@gmail.com, alexandr...@gmail.com
I understand more about what is happening here.

The underlying cause is that the heap is a lot smaller in 1.4 than in 1.3.  In this example, about 40%.  Live heap data went down from 300K to 180K.  Because we GC when we reach 2x live data, a smaller heap means we GC more often.

About 10% of the heap reduction is due to more efficient encoding of type information in the heap.  The other 30% is reduction in (and change in accounting for) stacks.  We no longer account for stacks as part of the heap.  Non-default-sized stack segments used to be counted as part of the heap.

GC still seems slower than I thought it would be, but this effect accounts for most of it.  You can check yourself by adjusting GOGC to a larger value for 1.4, to match heap sizes with 1.3 (you can see the heap sizes using GODEBUG=gctrace=1).

This is an unfortunate side effect of improving our space efficiency.  I'm not sure what can be done about it, other than to increase the default GOGC value.  But that seems wrong also.

Dave Cheney

unread,
Oct 31, 2014, 8:57:08 PM10/31/14
to Keith Randall, Dmitry Vyukov, Rob Pike, golan...@googlegroups.com, Gustavo Niemeyer, xxs...@gmail.com, alexandr...@gmail.com
Great analysis. It feels like each GC cycle has a fixed overhead, so more frequent, smaller GCs add up to more wall time. How odd.





Alex Skinner

unread,
Oct 31, 2014, 10:03:05 PM10/31/14
to golan...@googlegroups.com, k...@google.com, dvy...@google.com, r...@golang.org, gus...@niemeyer.net, xxs...@gmail.com, alexandr...@gmail.com
Agreed, thank you for the analysis and the suggestion on how to mitigate. This seems like an extremely hard problem to solve. Any plans for the long run? As someone who doesn't write garbage collectors, at face value it seems the only sane solutions are very advanced (and perhaps impossible) compiler analysis and/or a dynamically calculated GOGC value. Increasing the static default GOGC value seems likely to cause a lot of ballooning and complaints.
Or maybe the solution is to leave it alone and just let the speed freaks tune it as they see fit.

Curious on other thoughts...
Alex



Andrew Gerrand

unread,
Nov 1, 2014, 3:54:06 AM11/1/14
to Alex Skinner, golan...@googlegroups.com, k...@google.com, dvy...@google.com, r...@golang.org, gus...@niemeyer.net, xxs...@gmail.com, alexandr...@gmail.com
The GC planned for 1.5 will be able to work incrementally and should have quite different performance characteristics from the current implementation. I'm not an expert, but I don't think this particular issue will affect the new GC. I'm not even sure that the GC trigger (heap some percentage bigger than at the last GC) will be the same.

Andrew

Alexandre Cesaro

unread,
Nov 1, 2014, 6:22:00 AM11/1/14
to Andrew Gerrand, Alex Skinner, golan...@googlegroups.com, k...@google.com, dvy...@google.com, r...@golang.org, gus...@niemeyer.net, xxs...@gmail.com
I don't have any knowledge of garbage collectors, so I hope what I'm saying is not stupid. But maybe, in Go 1.4 only, the GC should be triggered only when freshly allocated data reaches something like max(live data * GOGC, N) or (live data * N * GOGC), where N is a fixed value chosen to make the behavior of the 1.4 GC similar to 1.3.

Because I don't think telling people to tweak their GOGC value in Go 1.4 and to remove it in Go 1.5 is a good idea. Changing the default value of GOGC and bothering people who have already tweaked it doesn't look good either.

Naoki INADA

unread,
Nov 1, 2014, 9:12:02 AM11/1/14
to golan...@googlegroups.com, xxs...@gmail.com
I found there are too many MOVQ instructions.
For example, the disassembly of http.ReadRequest: https://gist.github.com/methane/328f0c6a0bb193f77b03

I feel Go's calling convention (no passing in registers, and all registers are volatile, i.e. caller-saved) causes this.
Do you have any plan to improve this?

Ian Taylor

unread,
Nov 1, 2014, 9:51:24 AM11/1/14
to Naoki INADA, golang-nuts, xxs...@gmail.com
On Sat, Nov 1, 2014 at 6:12 AM, Naoki INADA <songof...@gmail.com> wrote:
>
> I found there are too many MOVQ.
> For example, disasm of http.ReadRequest:
> https://gist.github.com/methane/328f0c6a0bb193f77b03
>
> I feel Go's calling convention (no pass by register and all registers are
> volatile) cause this.
> Do you have any plan to improve this?

It will be easier to improve the compiler optimizations when the
compiler has been converted to Go (http://golang.org/s/go13compiler).
We hope that that will happen before the 1.5 release.

However, there are no current plans to change the calling convention.

Ian

robfig

unread,
Nov 1, 2014, 4:16:03 PM11/1/14
to golan...@googlegroups.com, gwenn...@gmail.com
go1.4beta1 vs go1.4beta1 w/ patch

benchmark                       old ns/op     new ns/op     delta
BenchmarkLexParseFeatures       1370615       1363280       -0.54%
BenchmarkExecuteFeatures        236649        230015        -2.80%

benchmark                       old allocs     new allocs     delta
BenchmarkLexParseFeatures       2156           2155           -0.05%
BenchmarkExecuteFeatures        1075           917            -14.70%

benchmark                       old bytes     new bytes     delta
BenchmarkLexParseFeatures       93468         93447         -0.02%
BenchmarkExecuteFeatures        41065         39679         -3.38%

gwenn...@gmail.com

unread,
Nov 2, 2014, 12:56:37 PM11/2/14
to golan...@googlegroups.com, gwenn...@gmail.com
Hello,
I may have found another regression.
It seems to be related to cgo, callbacks, and I/O (sqlite3_set_authorizer).
I am trying to debug with gdb...
I will keep you posted.
Regards.



Chris Hines

unread,
Nov 2, 2014, 9:40:21 PM11/2/14
to golan...@googlegroups.com
On Wednesday, October 29, 2014 11:29:32 PM UTC-4, Andrew Gerrand wrote:
We have just released go1.4beta1, a beta version of Go 1.4.
It is cut from the default branch at a revision tagged as go1.4beta1.

Nice work. My favorite feature so far: `go tool pprof` works on Windows without any extra work!

Also, although I haven't had a chance to play with it yet, I think the ability to define a TestMain will help simplify tests that need to launch a child process.

Chris

gwenn...@gmail.com

unread,
Nov 3, 2014, 3:27:03 PM11/3/14
to golan...@googlegroups.com, gwenn...@gmail.com
Hello,
I've failed to debug it.
I will try to make a minimal program without SQLite...
Regards.

Russ Cox

unread,
Nov 7, 2014, 10:23:51 AM11/7/14
to Alexandre Cesaro, Rob Pike, Keith Randall, golang-nuts, Gustavo Niemeyer, xxs...@gmail.com
On Fri, Oct 31, 2014 at 10:41 AM, <alexandr...@gmail.com> wrote:
I ran a benchmark of my library Gomail (it builds an email). And like others benchmarks mentioned here, it shows a ~30% slowdown:

benchmark         old ns/op     new ns/op     delta
BenchmarkFull     143705        189744        +32.04%

benchmark         old allocs     new allocs     delta
BenchmarkFull     322            336            +4.35%

benchmark         old bytes     new bytes     delta
BenchmarkFull     38328         38287         -0.11%

I can reproduce this, but I don't think you should give much weight to the result. It is an artifact of having basically no memory allocated. If you run your test after 'export GODEBUG=gctrace=1' you can see that each of the garbage collections is starting and ending with a '0 MB' heap (rounded down to the nearest MB, of course). It is true that changing GOGC can improve the results here, but so does just having other memory allocated, as you would in a real program.

Here are results for Go 1.3 using -benchtime=2s (to average the GC across the benchmark a little better).

BenchmarkFull   50000    122326 ns/op
BenchmarkFull   50000    119026 ns/op
BenchmarkFull   50000    124336 ns/op

And for Go 1.4:

BenchmarkFull   20000    189629 ns/op
BenchmarkFull   20000    193394 ns/op
BenchmarkFull   20000    192372 ns/op

But if you allocate some other memory in your program:

g% git diff .
diff --git a/gomail_test.go b/gomail_test.go
index 74c8851..0acb122 100644
--- a/gomail_test.go
+++ b/gomail_test.go
@@ -11,6 +11,8 @@ import (
  "time"
 )
 
+var x = make(chan int, 1e6)
+
 type message struct {
  from    string
  to      []string
g% 

Here's Go 1.3:

BenchmarkFull   50000     93494 ns/op
BenchmarkFull   50000     94074 ns/op
BenchmarkFull   50000     96963 ns/op

and Go 1.4:

BenchmarkFull   30000     95748 ns/op
BenchmarkFull   30000     95028 ns/op
BenchmarkFull   30000     94307 ns/op

I filed golang.org/issue/9067 to try to do something about the "empty heap" garbage collections for Go 1.5, but I don't think it has enough impact in practice to do anything for Go 1.4.

Thanks for the interesting behavior.
Russ

Alexandre Cesaro

unread,
Nov 7, 2014, 10:53:25 AM11/7/14
to Russ Cox, golang-nuts
Ok I understand. Thanks!

Daniel Skinner

unread,
Nov 7, 2014, 12:38:48 PM11/7/14
to Russ Cox, Alexandre Cesaro, Rob Pike, Keith Randall, golang-nuts, Gustavo Niemeyer, xxs...@gmail.com
I don't actually understand the conclusions drawn here. The allocation clearly shows marked improvement, but 1.4 still only ran for 30k iterations (instead of 20k) while 1.3 ran for 50k. Is it that more allocations (as may be seen in a real-world scenario) would close the gap further?

Enjoying the 1.4 beta + mobile pkg, thanks to everyone involved :)

--

foxnet.d...@googlemail.com

unread,
Nov 7, 2014, 3:49:25 PM11/7/14
to golan...@googlegroups.com
Hey,

I just did a few tests and noticed that 1.4 beta1 offers huge improvements for apps using many goroutines.
I guess this relates to the decreased starting stack size of goroutines? Really great!

On Thursday, October 30, 2014 at 4:29:32 AM UTC+1, Andrew Gerrand wrote:
Hi Go nuts,

We have just released go1.4beta1, a beta version of Go 1.4.
It is cut from the default branch at a revision tagged as go1.4beta1.

Brendan Tracey

unread,
Nov 7, 2014, 6:47:48 PM11/7/14
to golan...@googlegroups.com, foxnet.d...@googlemail.com
I tried installing the beta on a Linux cluster for our group, and I get the error below. I also tried on tip; the error persists there, along with a number of panics in the GC. I've included the output of the last working Go version I had, as well as go env. That previous version passed all of the tests, so it seems something has broken since then. I've had problems in the past on this machine with clock skew, but the clocks appear to be synced this time. I believe this is on CentOS.


ok      runtime/debug    0.006s
--- FAIL: TestCPUProfileMultithreaded (0.25s)
    pprof_test.go:165: runtime/pprof_test.cpuHog1: 49
    pprof_test.go:165: runtime/pprof_test.cpuHog2: 0
    pprof_test.go:179: runtime/pprof_test.cpuHog2 has 0 samples out of 49, want at least 1, ideally 24
FAIL
FAIL    runtime/pprof    20.165s
?       runtime/race    [no test files]


[btracey@zion go_tip]$ go version
go version devel +3d68989bd1ea Mon Oct 27 08:46:18 2014 -0700 linux/amd64

Go env output (from previous go installation):
GOARCH="amd64"
GOBIN=""
GOCHAR="6"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/ADL/btracey/mygo"
GORACE=""
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"



Failure on tip:
panic: runtime error: index out of range
fatal error: panic during gc

goroutine 1 [running, locked to thread]:
runtime.gothrow(0x62cd90, 0xf)
    /ADL/btracey/gover/go_tip/go/src/runtime/panic.go:503 +0x8e fp=0xc2082d9c58 sp=0xc2082d9c40
runtime.gopanic(0x5db6a0, 0xc208010010)
    /ADL/btracey/gover/go_tip/go/src/runtime/panic.go:343 +0x19d fp=0xc2082d9cc0 sp=0xc2082d9c58
runtime.panicindex()
    /ADL/btracey/gover/go_tip/go/src/runtime/panic.go:12 +0x4e fp=0xc2082d9ce8 sp=0xc2082d9cc0
runtime.(*traceStackDepot).dump(0x74a6f0)
    /ADL/btracey/gover/go_tip/go/src/runtime/trace.go:498 +0x1ef fp=0xc2082e9f30 sp=0xc2082d9ce8
runtime.TraceStop()
    /ADL/btracey/gover/go_tip/go/src/runtime/trace.go:173 +0x1d5 fp=0xc2082e9f88 sp=0xc2082e9f30
runtime.traceFinalize()
    /ADL/btracey/gover/go_tip/go/src/runtime/trace.go:89 +0x1b fp=0xc2082e9f90 sp=0xc2082e9f88
runtime.main()
    /ADL/btracey/gover/go_tip/go/src/runtime/proc.go:68 +0x105 fp=0xc2082e9fe0 sp=0xc2082e9f90
runtime.goexit()
    /ADL/btracey/gover/go_tip/go/src/runtime/asm_amd64.s:2232 +0x1 fp=0xc2082e9fe8 sp=0xc2082e9fe0

goroutine 563 [syscall]:
runtime_test.func·046(0x0, 0x4e94914f0000, 0x63ca30, 0x14, 0xc20804a060)
    /ADL/btracey/gover/go_tip/go/src/runtime/futex_test.go:47 +0x4d
created by runtime_test.TestFutexsleep
    /ADL/btracey/gover/go_tip/go/src/runtime/futex_test.go:51 +0x222

goroutine 564 [syscall]:
runtime_test.func·046(0x0, 0x1dcd65174876e800, 0x63c630, 0x13, 0xc20804a0c0)
    /ADL/btracey/gover/go_tip/go/src/runtime/futex_test.go:47 +0x4d
created by runtime_test.TestFutexsleep
    /ADL/btracey/gover/go_tip/go/src/runtime/futex_test.go:51 +0x222

Brendan Tracey

unread,
Nov 7, 2014, 6:55:14 PM11/7/14
to golan...@googlegroups.com, foxnet.d...@googlemail.com, Russ Cox
I just checked, and compiling 1.3.0 works fine, so it appears to be a 1.4.beta1 problem and not a problem with the machine.

Dave Cheney

unread,
Nov 7, 2014, 7:21:42 PM11/7/14
to golan...@googlegroups.com
Please log a bug and include the specifics of the operating system.

foxnet.d...@googlemail.com

unread,
Nov 8, 2014, 2:37:02 AM11/8/14
to golan...@googlegroups.com, gaurav...@gmail.com
Just curious: why is there a need for write barriers on every access now?
Does it have something to do with the new GC planned for 1.5?
Also: are there any plans for improvements related to the write barriers?

Cheers,
Chris

On Thursday, October 30, 2014 at 6:00:14 PM UTC+1, Rob 'Commander' Pike wrote:
Ints in interfaces allocate now, but they didn't before. Write barriers slow things down. The garbage collector is faster. The runtime has had some speedups.

The performance effects of these confounding factors are dependent on the programs. If you can isolate a single simple benchmark that illustrates a slowdown, we can investigate.

-rob

Dave Cheney

unread,
Nov 8, 2014, 2:41:38 AM11/8/14
to foxnet.d...@googlemail.com, golan...@googlegroups.com

You will find more background information in this document. http://golang.org/s/go14gc
--
You received this message because you are subscribed to a topic in the Google Groups "golang-nuts" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/golang-nuts/7VAcfULjiB8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to golang-nuts...@googlegroups.com.

robfig

unread,
Nov 30, 2014, 3:37:49 PM11/30/14
to golan...@googlegroups.com
Interesting: despite the 20-25% increase in the Soy template rendering benchmark time, my "real world benchmark" shows ~20% more allocations, as expected, but no change in running time. I can't share the code, but it spends about half of its time rendering templates (it is a website generation system).

benchmark old ns/op new ns/op delta
BenchmarkStageDefaultSite 794588650 794801566 +0.03%

benchmark old allocs new allocs delta
BenchmarkStageDefaultSite 1954499 2334605 +19.45%

benchmark old bytes new bytes delta
BenchmarkStageDefaultSite 208373288 204872668 -1.68%

Either the other changes in 1.4 sped up the non-template-rendering parts sufficiently to offset the increase, or something is causing the Soy benchmark to be inaccurate. Either way, I'm very relieved that I don't have to swallow an overall slowdown to upgrade to go1.4.

Thanks!

Rob Pike

unread,
Nov 30, 2014, 8:28:55 PM11/30/14
to robfig, golan...@googlegroups.com
As it says in the 1.4 release notes, some things slowed down and some things sped up, particularly in multicore situations. It's likely, even common, that these two effects cancel each other out.

Relax and be happy. We have your back.

-rob

Dmitry Vyukov

unread,
Nov 30, 2014, 11:26:09 PM11/30/14
to robfig, golang-nuts
On Sun, Nov 30, 2014 at 11:37 PM, robfig <rob...@gmail.com> wrote:
> Interesting, despite the 20-25% increase in the Soy template rendering benchmark time, my "real world benchmark" shows 25% more allocations as expected but no change in running time. I can't share the code, but it spends about half of the time rendering templates (it is a website generation system).


We've made some adjustments to the malloc stats calculations. We now add +1 to MemStats.Mallocs more frequently, without doing any additional work. That's exactly what you see.


> benchmark old ns/op new ns/op delta
> BenchmarkStageDefaultSite 794588650 794801566 +0.03%
>
> benchmark old allocs new allocs delta
> BenchmarkStageDefaultSite 1954499 2334605 +19.45%
>
> benchmark old bytes new bytes delta
> BenchmarkStageDefaultSite 208373288 204872668 -1.68%
>
> Either the other changes in 1.4 sped up the non-template rendering parts sufficiently to off-balance the increase, or something is causing the Soy benchmark to be inaccurate. Either way, very relieved that I don't have to swallow an overall slow-down to upgrade to go1.4.
>
> Thanks!
>