proposal: GOMAXPROCS = NumCPU() by default

Russ Cox

unread,
May 28, 2015, 1:33:26 PM5/28/15
to golang-dev
For Go 1.5, we propose to default GOMAXPROCS to NumCPU() instead of 1.
See golang.org/s/go15gomaxprocs for details.
Comments and discussion welcome in this thread.

Thanks.
Russ
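
For context, a minimal sketch of the opt-in boilerplate many programs include today and which this change would make unnecessary (illustrative only; not taken from the design doc):

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        // Pre-Go 1.5 idiom: explicitly opt in to using every core.
        // Under this proposal the call below becomes redundant, because the
        // runtime would already start with GOMAXPROCS = runtime.NumCPU().
        runtime.GOMAXPROCS(runtime.NumCPU())

        // GOMAXPROCS(0) reports the current setting without changing it.
        fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0), "NumCPU:", runtime.NumCPU())
    }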


Keith Randall

unread,
May 28, 2015, 1:46:47 PM5/28/15
to Russ Cox, golang-dev
I didn't see any experiments checking whether the scheduler still works well when the machine has other load besides the Go program in question.
Effectively, this means checking that GOMAXPROCS=runtime.NumCPU()+n for a few n doesn't cause trouble.


Brendan Tracey

unread,
May 28, 2015, 2:07:56 PM5/28/15
to golan...@googlegroups.com, r...@golang.org
SGTM -- I frequently run with GOMAXPROCS = NumCPU already.

I assume that this setting will apply to benchmarking as well? That is, when running go test -bench, will it default to NumCPU?

Matt Jones

unread,
May 28, 2015, 2:38:05 PM5/28/15
to Russ Cox, golang-dev
I'm assuming this would apply to the test runner as well? Either way, this would be a large improvement in usability for newcomers (and one less step in our internal getting-started doc).

Russ Cox

unread,
May 28, 2015, 2:43:11 PM5/28/15
to Keith Randall, golang-dev
On Thu, May 28, 2015 at 1:46 PM, Keith Randall <k...@google.com> wrote:
I didn't see any experiments checking whether the scheduler still works well when the machine has other load besides the Go program in question.
Effectively, this means checking that GOMAXPROCS=runtime.NumCPU()+n for a few n doesn't cause trouble.

I think the make.bash measurements do test that: there are many compilations going on at once, they're all now trying to use GOMAXPROCS CPUs, and it seems fine. 

We have also run with large GOMAXPROCS values on Google production machines shared with other load for years, without problem.

Russ

Russ Cox

unread,
May 28, 2015, 2:44:57 PM5/28/15
to Brendan Tracey, golang-dev
On Thu, May 28, 2015 at 2:07 PM, Brendan Tracey <tracey....@gmail.com> wrote:
SGTM -- I frequently run with GOMAXPROCS = NumCPU already.

I assume that this setting will apply to benchmarking as well? That is, when running go test -bench, will it default to NumCPU?

Yes, because package testing does not override this (unless you say -cpu). And you'll continue to see the -N suffix on those benchmarks for N != 1.

Russ
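
As a concrete illustration of the -cpu flag and the -N suffix mentioned above, a hedged sketch (the package and benchmark names here are made up):

    // sum_test.go -- a throwaway benchmark used only to illustrate the -cpu flag.
    package sum

    import "testing"

    func BenchmarkSum(b *testing.B) {
        for i := 0; i < b.N; i++ {
            s := 0
            for j := 0; j < 1000; j++ {
                s += j
            }
            _ = s
        }
    }

Running "go test -bench=. -cpu=1,2,4" reports BenchmarkSum, BenchmarkSum-2, and BenchmarkSum-4; with the proposed default and no -cpu flag, a 4-core machine would report BenchmarkSum-4.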

Russ Cox

unread,
May 28, 2015, 2:45:27 PM5/28/15
to Matt Jones, golang-dev
On Thu, May 28, 2015 at 1:38 PM, Matt Jones <ma...@pinterest.com> wrote:
I'm assuming this would apply to the test runner as well? Either way, this would be a large improvement in usability for newcomers (and one less step in our internal getting-started doc).

Yes, it applies to all Go programs that do not explicitly override it, testing included.

Russ

Dan Kortschak

unread,
May 28, 2015, 4:36:05 PM5/28/15
to Russ Cox, golang-dev
I'm concerned about how this will impact users of shared machines with many cores - particularly where some users may be unaware of GOMAXPROCS (this is particularly relevant in a teaching/research environment). Would it be possible to place a sensible upper bound on the default GOMAXPROCS that may be less than NumCPU for big machines?

Andrew Gerrand

unread,
May 28, 2015, 4:38:13 PM5/28/15
to Dan Kortschak, Russ Cox, golang-dev
I'm not aware of other popular multi-threaded languages that let you restrict the number of processors they consume. For instance, if I start a Java program that spins up 50 threads, it could easily saturate 24 cores. Why should Go be special in this regard?

On 28 May 2015 at 13:36, Dan Kortschak <dan.ko...@adelaide.edu.au> wrote:
I'm concerned about how this will impact users of shared machines with many cores - particularly where some users may be unaware of GOMAXPROCS (this is particularly relevant in a teaching/research environment). Would it be possible to place a sensible upper bound on the default GOMAXPROCS that may be less than NumCPU for big machines?

Dan Kortschak

unread,
May 28, 2015, 4:39:50 PM5/28/15
to Andrew Gerrand, Russ Cox, golang-dev
I think you have misunderstood. I'm not asking for a restriction, just that the default not be 100% of cores for, say, a 64 core machine.

Brad Fitzpatrick

unread,
May 28, 2015, 4:39:52 PM5/28/15
to Dan Kortschak, Russ Cox, golang-dev
Also, one might argue that said teaching/research environments should provide better isolation between their users.


On Thu, May 28, 2015 at 1:36 PM, Dan Kortschak <dan.ko...@adelaide.edu.au> wrote:
I'm concerned about how this will impact users of shared machines with many cores - particularly where some users may be unaware of GOMAXPROCS (this is particularly relevant in a teaching/research environment). Would it be possible to place a sensible upper bound on the default GOMAXPROCS that may be less than NumCPU for big machines?

Dan Kortschak

unread,
May 28, 2015, 4:41:06 PM5/28/15
to Brad Fitzpatrick, Russ Cox, golang-dev
This is not trivial for the research problems that we deal with.

Aram Hăvărneanu

unread,
May 28, 2015, 4:44:49 PM5/28/15
to Dan Kortschak, Brad Fitzpatrick, Russ Cox, golang-dev
It will be fun to test this on a 512 thread machine.

--
Aram Hăvărneanu

Andrew Gerrand

unread,
May 28, 2015, 4:46:10 PM5/28/15
to Dan Kortschak, Russ Cox, golang-dev
I think I interpreted you correctly. I'm just asking: why should Go be the core police when other languages seem to rely on the operating system?

Brendan Tracey

unread,
May 28, 2015, 4:54:39 PM5/28/15
to Dan Kortschak, Russ Cox, golang-dev
Couldn’t this be fixed by using the environment variable that is set by the job scheduler? That way the message to students is “don’t set GOMAXPROCS in the code”, and the scheduler takes care of the sharing.

This is how I’ve done it with the computational resources in my lab. The job scheduler assumes that all parallel code is MPI (apparently it’s not possible to be parallel without MPI?), so I need to trick it, but it tells me how many cores I have, and I pass that value to GOMAXPROCS.


> On May 28, 2015, at 2:36 PM, Dan Kortschak <dan.ko...@adelaide.edu.au> wrote:
>
> I'm concerned about how this will impact users of shared machines with many cores - particularly where some users may be unaware of GOMAXPROCS (this is particularly relevant in a teaching/research environment). Would it be possible to place a sensible upper bound on the default GOMAXPROCS that may be less than NumCPU for big machines?
>
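
A minimal sketch of the env-var approach Brendan describes above. The variable name is an assumption (SLURM_CPUS_PER_TASK is what Slurm exports); other schedulers use different names, and Brendan's setup may differ:

    package main

    import (
        "os"
        "runtime"
        "strconv"
    )

    func init() {
        // Respect a scheduler-provided core count if one is present;
        // otherwise leave the runtime default alone.
        if v := os.Getenv("SLURM_CPUS_PER_TASK"); v != "" {
            if n, err := strconv.Atoi(v); err == nil && n > 0 {
                runtime.GOMAXPROCS(n)
            }
        }
    }

    func main() {
        // ... the actual analysis would go here ...
    }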

Dan Kortschak

unread,
May 28, 2015, 5:01:13 PM5/28/15
to Brendan Tracey, Russ Cox, golang-dev
On our HPC infrastructure I'm not concerned; that is managed by others, and nodes have 4 to 8 cores. The issue is on our large shared-memory machines, where much of our development occurs (as opposed to just running established pipelines, as is the case where we have queue management).

Dan Kortschak

unread,
May 28, 2015, 5:04:41 PM5/28/15
to Andrew Gerrand, Russ Cox, golang-dev
Go already allows you, and still will after this proposal is implemented, to restrict the number of threads being used.

Andrew Gerrand

unread,
May 28, 2015, 5:09:24 PM5/28/15
to Dan Kortschak, Russ Cox, golang-dev
Yes. This change puts the onus on the user to restrict the number of threads being used (if desired), as with the other mainstream languages that I know. Why should Go be different to the others?

(FWIW, I have found that most people are very surprised that Go doesn't use all their cores by default. It's great that they won't trip over this anymore.)

Dave Cheney

unread,
May 28, 2015, 5:14:46 PM5/28/15
to Dan Kortschak, Andrew Gerrand, Russ Cox, golang-dev
SGTM, but I think the comments suggesting this be clamped at some reasonable upper value, 8 or possibly 16, should be considered (Dmitry had some strong feelings about the scalability of the scheduler).

Paul Borman

unread,
May 28, 2015, 5:28:23 PM5/28/15
to Dave Cheney, Dan Kortschak, Andrew Gerrand, Russ Cox, golang-dev
How about clamping it based on the number of cores on the machine rather than treating all machines as if they were identical? So, on a 12-core machine, how about we make that clamp 12?

(And yes, I do understand the original proposal :-)

    -Paul

Nate Finch

unread,
May 28, 2015, 5:38:58 PM5/28/15
to golan...@googlegroups.com
This is a great idea.  It's a stumbling block for pretty much every newbie.

Rodrigo Kochenburger

unread,
May 28, 2015, 5:38:58 PM5/28/15
to Dave Cheney, Dan Kortschak, Andrew Gerrand, Russ Cox, golang-dev
I just want to clarify my previous message and why I think defaulting to all cores can be bad.

In other languages, when writing code, you explicitly specify the number of threads, and that is your upper bound on the number of CPUs used. In Go, most applications spawn far more goroutines than they would threads, relying on the scheduler to handle and multiplex them. Depending on how those goroutines block, the runtime can and will create more threads to keep running other goroutines, which can easily consume many more cores than first anticipated.

Also, there is the scheduler scalability problem that Dave mentioned. I agree it should be limited to 8 or possibly 16 of the available cores.
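
A tiny illustration of the point about goroutine counts (assuming nothing beyond the standard library): the number of goroutines routinely dwarfs GOMAXPROCS, which only bounds how many of them execute Go code simultaneously.

    package main

    import (
        "fmt"
        "runtime"
        "sync"
    )

    func main() {
        var wg sync.WaitGroup
        for i := 0; i < 10000; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                s := 0
                for j := 0; j < 1000000; j++ { // a little CPU-bound busywork
                    s += j
                }
                _ = s
            }()
        }
        // Thousands of goroutines, but at most GOMAXPROCS of them run Go code at once.
        fmt.Println("goroutines:", runtime.NumGoroutine(), "GOMAXPROCS:", runtime.GOMAXPROCS(0))
        wg.Wait()
    }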

Rodrigo Kochenburger

unread,
May 28, 2015, 5:38:58 PM5/28/15
to Andrew Gerrand, Dan Kortschak, Russ Cox, golang-dev
When you're working in an environment where threads are exposed to you, you usually limit the number of threads to control the number of cores you can use. Correct me if I'm wrong, but we don't have direct control over the number of threads, and Go programs usually run a lot more goroutines than you would threads; depending on how they block, that can cause as many threads as the number of cores to be created.

I think, at least for now, it should use the number of cores up to a maximum default value, to avoid hogging all CPUs on systems that are not prepared for it.

Paul Borman

unread,
May 28, 2015, 5:53:33 PM5/28/15
to Rodrigo Kochenburger, Andrew Gerrand, Dan Kortschak, Russ Cox, golang-dev
I think people are way too concerned about this. Operating systems are very good at balancing more than N tasks on an N-core machine. Just because you fire up N threads does not mean the OS will give you N threads. Often only a fraction of the number requested is actually given CPU resources, so as soon as you go above 1 you risk having more runnable threads than available CPUs. But most typical programs end up doing some reasonable amount of I/O (ray tracing, computational fluid dynamics, and structural analysis excluded), so they are not 100% busy on every thread at all times. And if they do need the CPU, the faster they complete, the faster they get out of the way for others to do work.

Defaulting to N seems very reasonable. In years past, the number of worker threads you typically wanted was on the order of 2*N (N usually was 1, 2, or 4, sometimes 8) to get maximum usage of your machine (I did a lot of empirical testing on this in the '90s). Things are better now, and Go handles this quite well, so N is quite a good default. As mentioned, we have been doing this at Google for years.

    -Paul

Dan Kortschak

unread,
May 28, 2015, 6:04:59 PM5/28/15
to Andrew Gerrand, Russ Cox, golang-dev
Go is different to other languages in many ways, usually in picking sensible defaults that work for most people, so I don't think that argument is tenable here.

If GOMAXPROCS were set to min(NumCPU, 16) or some other reasonable number for the default maximum, that would have no impact at all on nearly 100% of users. The users that will be affected are almost certainly going to be in a position of knowing what to do if it is harming their work.

On non-shared machines I welcome this proposal, since I very often preface my invocations with GOMAXPROCS=...
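
For concreteness, the clamp Dan is suggesting as a default is something any program (or a site-wide wrapper) could already impose on itself; a sketch only, not anything proposed for the runtime:

    package main

    import "runtime"

    // clampProcs caps GOMAXPROCS at max while still using every core on
    // smaller machines -- i.e. GOMAXPROCS = min(NumCPU, max).
    func clampProcs(max int) {
        n := runtime.NumCPU()
        if n > max {
            n = max
        }
        runtime.GOMAXPROCS(n)
    }

    func main() {
        clampProcs(16)
        // ... rest of the program ...
    }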

Russ Cox

unread,
May 28, 2015, 8:39:19 PM5/28/15
to Dan Kortschak, Andrew Gerrand, golang-dev
On Thu, May 28, 2015 at 6:04 PM, Dan Kortschak <dan.ko...@adelaide.edu.au> wrote:
If GOMAXPROCS were set to min(NumCPU, 16) or some other reasonable number for the default maximum, that would have no impact at all on nearly 100% of users.

Today we use min(NumCPU, 1). We could raise that limit to 2 or 4 (as another reader suggested privately) or 8 or 16 or any other value. The problem with any of these suggestions is that they are specific to certain assumptions about the capabilities of Go, the machine, and what else is going on. It surprises people today that for Go to use more than 1 core they need to ask explicitly. If we set a different minimum (2, 4, 16, ...), there will still be surprises when people have more than that many cores, it will be less obvious that there is an artificial limit being imposed, we will need to think about revising the limit at each release, the revision of the limit will change program performance from release to release in ways that possibly cause more surprises, and so on. In short, there are significant disadvantages to this approach.

NumCPU has always been our first choice, because it does not have these disadvantages. As I hope the doc makes clear, we didn't do that for Go 1.0 because there was clear evidence of problems with GOMAXPROCS>1 in certain programs that were at least somewhat likely to be written; those particular programs now work fine with GOMAXPROCS>1 and are no longer a case against it.

Instead of speculating about problems, I'd like to ask everyone to wait for the beta to be cut and then try it and report back specific details about problems, if any. If there are clear problems that warrant a (higher) limit, then we'll impose the limit, address those problems in future releases, and try this all over again at some point in the future.

The default GOMAXPROCS is a compromise. No default will work well for everyone, but the current default works well for almost no one. The goal is to find a default that balances being as simple as possible (NumCPU is simpler than min(NumCPU, N)) and working for as many people as possible.

The users that will be affected are almost certainly going to be in a position of knowing what to do if it is harming their work.

For what it's worth, I don't believe this. No new Go programmer intuitively knows there is a GOMAXPROCS setting. When Go doesn't make full use of the machine, the first reaction is very likely "Go is bad at using multiple cores" and not "I wonder if there is a setting to unlock the use of multiple cores". We don't know how many Go programmers have tried Go and gone away before finding this setting. We do know that people still ask the question, and we can only assume more people don't bother to ask.

In any event, if we set GOMAXPROCS to NumCPU and the only problem we find is that it's inappropriate policy on some large shared machines, it seems like the best solution in that case would be to have those machines configured to put an appropriate GOMAXPROCS value into the environment by default. But let's wait and try it first. Maybe it will work fine even on shared machines.

Russ

Alan Donovan

unread,
May 28, 2015, 8:42:36 PM5/28/15
to Russ Cox, Dan Kortschak, Andrew Gerrand, golang-dev
Will this change be an effective workaround for the Mac OS X kernel bug whereby GOMAXPROCS must be set to >1 when the profiling timer is enabled?

Dan Kortschak

unread,
May 28, 2015, 8:50:57 PM5/28/15
to Russ Cox, Andrew Gerrand, golang-dev
On Thu, 2015-05-28 at 20:39 -0400, Russ Cox wrote:
> > The users that will be affected are almost certainly going to be in
> > a position of knowing what to do if it is harming their work.
>
>
> For what it's worth, I don't believe this. No new Go programmer
> intuitively knows there is a GOMAXPROCS setting. When Go doesn't make
> full use of the machine, the first reaction is very likely "Go is bad
> at using multiple cores" and not "I wonder if there is a setting to
> unlock the use of multiple cores". We don't know how many Go
> programmers have tried Go and gone away before finding this setting.
> We do know that people still ask the question, and we can only assume
> more people don't bother to ask.

This is exactly my point. If I put a new researcher on a machine with some Go executable (that they may or may not have written) and tell them to run it to do some kind of analysis, in the default case now they won't monopolise resources, whether they know about GOMAXPROCS or not. When GOMAXPROCS=NumCPU, they may, depending on the code.


> In any event, if we set GOMAXPROCS to NumCPU and the only problem we
> find is that it's inappropriate policy on some large shared machines,
> it seems like the best solution in that case would be to have those
> machines configured to put an appropriate GOMAXPROCS value into the
> environment by default. But let's wait and try it first. Maybe it will
> work fine even on shared machines.

This seems like a reasonable approach, but can I suggest that it be
noted (if it is not already) that the default GOMAXPROCS value is
subject to change so that people are not surprised if it is changed.


Paul Borman

unread,
May 28, 2015, 9:13:32 PM5/28/15
to Dan Kortschak, Russ Cox, Andrew Gerrand, golang-dev
On Thu, May 28, 2015 at 5:50 PM, Dan Kortschak <dan.ko...@adelaide.edu.au> wrote:

This is exactly my point. If I put a new researcher on a machine with some Go executable (that they may or may not have written) and tell them to run it to do some kind of analysis, in the default case now they won't monopolise resources, whether they know about GOMAXPROCS or not. When GOMAXPROCS=NumCPU, they may, depending on the code.

I assume someone is an admin for that shared resource. Simply change the global .profile or what have you to set GOMAXPROCS to something else. Problem solved. If it is a timeshare system of some sort, then use the timeshare facilities to limit their resource usage.

Saying it is "subject to change" will encourage people to set it to make sure they are not affected by the change.  I don't think we want to encourage that.

Just as an anecdote, whenever I run ffmpeg to convert a bunch of .jpg files into a .mov, it doesn't care about any other programs and keeps all CPUs pegged at 100%, and I wouldn't want it any other way. Of course the machine is still responsive, because the OS scheduler works well. There is nothing here to see. Move on.

    -Paul 

Dan Kortschak

unread,
May 28, 2015, 9:17:19 PM5/28/15
to Paul Borman, Russ Cox, Andrew Gerrand, golang-dev
On Thu, 2015-05-28 at 18:13 -0700, Paul Borman wrote:
> I assume someone is an admin for that shared resource. Simply change
> the global .profile or what have you to set GOMAXPROCS to something
> else. Problem solved. If it is a timeshare system of some sort than
> use the timeshare facilities to limit their resource usage.
>
Note the initial clause in the response to Russ' second paragraph.


Paul Borman

unread,
May 28, 2015, 9:31:56 PM5/28/15
to Dan Kortschak, Russ Cox, Andrew Gerrand, golang-dev
I have no idea which of the many responses and paragraphs that is. I don't think Russ talked about shared machines or global profiles. I was suggesting that you, as an administrator (or someone who knows the administrators), institute a campus policy to set GOMAXPROCS in the global .profile environment (just as is done with PATH and other such variables), so unless a student does something special, they will have GOMAXPROCS set in their environment. They needn't be any the wiser. In truth I don't think there is an issue, but there are solutions for your situation that do not require wiring the concern into the default settings of Go. The environment variable GOMAXPROCS is already there and can be used to solve just these sorts of issues.

Dan Kortschak

unread,
May 28, 2015, 9:46:40 PM5/28/15
to Paul Borman, Russ Cox, Andrew Gerrand, golang-dev
On Thu, 2015-05-28 at 18:31 -0700, Paul Borman wrote:
> I have no idea which of the many responses and paragraphs that is. I
> don't think Russ talked about shared machines or global profiles.

> In any event, if we set GOMAXPROCS to NumCPU and the only problem we
> find is that it's inappropriate policy on some large shared machines,
> it seems like the best solution in that case would be to have those
> machines configured to put an appropriate GOMAXPROCS value into the
> environment by default. But let's wait and try it first. Maybe it will
> work fine even on shared machines.

This seems like a reasonable approach, ...

andrewc...@gmail.com

unread,
May 28, 2015, 10:42:30 PM5/28/15
to golan...@googlegroups.com

It is pretty clear to me that using more cores when a user launches multiple goroutines follows the principle of least surprise, as has been said.

If someone has a thousand cores or a shared system or something crazy, they likely understand what they are doing and can limit it themselves. In a teaching/research environment you can issue a note in the lecture/tutorial/meeting explaining what GOMAXPROCS is and why it is important.

Dan Kortschak

unread,
May 28, 2015, 10:55:45 PM5/28/15
to andrewc...@gmail.com, golan...@googlegroups.com
On Thu, 2015-05-28 at 19:42 -0700, andrewc...@gmail.com wrote:
> If someone has a thousand cores or a shared system or something crazy,
> they likely understand what they are doing and can limit it
> themselves. In a teaching/research environment you can issue a note in
> the lecture/tutorial/meeting explaining what GOMAXPROCS is and why it
> is important.

I love this optimism.

Russ Cox

unread,
May 28, 2015, 11:18:18 PM5/28/15
to Alan Donovan, Dan Kortschak, Andrew Gerrand, golang-dev
On Thu, May 28, 2015 at 8:42 PM, Alan Donovan <adon...@google.com> wrote:
Will this change be an effective workaround for the Mac OS X kernel bug whereby GOMAXPROCS must be set to >1 when the profiling timer is enabled?

I don't know about that bug. As far as I understand, the OS X kernel is broken for profiling regardless of the GOMAXPROCS setting. That's what rsc.io/pprof_mac_fix is for.

Russ

tita...@gmail.com

unread,
May 28, 2015, 11:31:45 PM5/28/15
to golan...@googlegroups.com
I think GOMAXPROCS = NumCPU is a reasonably sane default value, considering the cooperative scheduling of the runtime, the concurrency features built into the language and compiler, and the fact that multicore systems are the norm in most markets where Go is a good fit.

minux

unread,
May 28, 2015, 11:57:23 PM5/28/15
to Dan Kortschak, golang-dev
I'm wondering why everybody is focusing on the administrative side of things and neglecting the amazing Go 1.5 runtime performance improvements.

If you don't trust the users, use cgroups or other means to limit their CPU use. Limiting CPU use has nothing to do with Go's default GOMAXPROCS choice.

Even if GOMAXPROCS is 1, a user can still easily monopolize the whole machine if not suitably constrained, so I don't understand what this has to do with Go.

Besides, you can always set GOMAXPROCS to some limited value in the program.

Pieter Droogendijk

unread,
May 29, 2015, 9:57:50 AM5/29/15
to golan...@googlegroups.com
I'm all for it!

Those numbers, by the way, are amazing. I had no idea it had improved THAT much.


On Thursday, May 28, 2015 at 7:33:26 PM UTC+2, rsc wrote:
For Go 1.5, we propose to default GOMAXPROCS to NumCPU() instead of 1.
See golang.org/s/go15gomaxprocs for details.
Comments and discussion welcome in this thread.

Thanks.
Russ


Florian Weimer

unread,
May 29, 2015, 10:05:29 AM5/29/15
to Andrew Gerrand, Dan Kortschak, Russ Cox, golang-dev
On 05/28/2015 10:37 PM, Andrew Gerrand wrote:
> I'm not aware of other popular multi-threaded languages that let you
> restrict the number of processors they consume. For instance, if I start a
> Java program that spins up 50 threads, it could easily saturate 24 cores.
> Why should Go be special in this regard?

Maybe because goroutines aren't threads, and some applications create
them much more liberally, without concern for overscheduling?

Hotspot does limit the number of threads created by the VM proper (it
is up to the application to limit itself):

// For very large machines, there are diminishing returns
// for large numbers of worker threads. Instead of
// hogging the whole system, use a fraction of the workers for every
// processor after the first 8. For example, on a 72 cpu machine
// and a chosen fraction of 5/8
// use 8 + (72 - 8) * (5/8) == 48 worker threads.

But Hotspot has an idea what those worker threads do.

At least the Go run-time counts affinity bits, so it responds well to
affinity tuning.

--
Florian Weimer / Red Hat Product Security
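
Expressed in Go, the HotSpot heuristic quoted above works out to roughly the following (a sketch of the quoted comment's arithmetic only; the Go runtime does nothing of the sort):

    package main

    import "fmt"

    // hotspotWorkers mirrors the HotSpot comment above: use every processor
    // up to 8, then a 5/8 fraction of each additional one. On a 72-CPU
    // machine this gives 8 + (72-8)*5/8 = 48 worker threads.
    func hotspotWorkers(ncpu int) int {
        if ncpu <= 8 {
            return ncpu
        }
        return 8 + (ncpu-8)*5/8
    }

    func main() {
        fmt.Println(hotspotWorkers(72)) // 48
    }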