Select performance anomaly from #15442

196 views
Skip to first unread message

Nigel Vickers

unread,
May 2, 2016, 11:51:49 AM5/2/16
to golang-nuts

After discussion on issue (#15442) I was asked why I have reservations about select{}, the runtime.scheduler and GC. Here is an example

The program at https://play.golang.org/p/TV6McjG4cg runs with the defaults correctly. However if you change the  defaults from:

creators := 1
primers := 1
consumers := 1

to 
creators := 1
primers := 20
consumers := 1
which creates 20 goroutines to handle the primer. The program will be aborted by the playground for taking to much time.

Running in 3vpu KVM 14.04 LTS
ok!
sysop@godev16:~/go/src/looptest$ time  ./looptest 
Nr of tasks  100000 
Start 1 task Consumer
Start 1 task Primer
Start 1 task Creators
exited

real    0m0.246s
user    0m0.252s
sys     0m0.050s

try it with 20 primers ouch!
sysop@godev16:~/go/src/looptest$ time  ./looptest 
Nr of tasks  100000 
Start 1 task Consumer
Start 20 task Primer
Start 1 task Creators
exited

real    0m4.796s
user    0m5.225s
sys     0m0.781s

and it isn't the consumers or the creators!
sysop@godev16:~/go/src/looptest$ time  ./looptest 
Nr of tasks  100000 
Start 20 task Consumer
Start 1 task Primer
Start 20 task Creators
exited

real    0m0.210s
user    0m0.267s
sys     0m0.029s

The performance problem is in this implicit block in the select{} in the primer goroutine.
select {
case ch := <-creator:
              primed <- ch
ch <- i
close(ch)
i++
}

Each goroutine has only a single case in the select{}. Performance is dependant on the channel received message which is being held by the select{} until it exits. where the message is finally passed when the runtime.Scheduler fires. The logic of why this message is not being released by the implicit block completion eludes me. However >10 seconds to complete running with 20 goroutines compared to <0,5 second to complete with one needs an explanation, methinks.

rgds, Nigel Vickers 

Jan Mercl

unread,
May 2, 2016, 12:49:57 PM5/2/16
to Nigel Vickers, golang-nuts

On Mon, May 2, 2016 at 5:52 PM Nigel Vickers <rhe...@gmail.com> wrote:

I think that a select statement consisting of a single ubuffered channel operation case / solely of cases of unbuffered channel operations, with no default clause, is basically a race condition. If the only case / all of the cases is/are not ready to proceed at the time the select statement cases are evaluated, the select will block forever. (Additionally, the deadlock detector seems to miss such situation.)

Not a proof of anything, but: https://play.golang.org/p/vV0GF4gYVL

--

-j

Nigel Vickers

unread,
May 2, 2016, 1:58:40 PM5/2/16
to golang-nuts, rhe...@gmail.com
Hello Jan,


On Monday, 2 May 2016 18:49:57 UTC+2, Jan Mercl wrote:

On Mon, May 2, 2016 at 5:52 PM Nigel Vickers <rhe...@gmail.com> wrote:

I think that a select statement consisting of a single ubuffered channel operation case / solely of cases of unbuffered channel operations, with no default clause, is basically a race condition.

I have been plagued by this for some time. I also of the opinion that the default is essential to prevent deadlock with selects on unbuffered channels. Unfortunately this discussion from last year warned me off of defaults. 

 

If the only case / all of the cases is/are not ready to proceed at the time the select statement cases are evaluated, the select will block forever. (Additionally, the deadlock detector seems to miss such situation.)

The problem for me is that for unbuffered channels the propagation of the received message is the key to unblocking the next send. I have been unable to find a way to instrument these messages. Primarily because select{} is in two parts, one of which is in the compiler. I am getting too old to keep up with what's happening there.
 
Not a proof of anything, but: https://play.golang.org/p/vV0GF4gYVL

Yeah, buffers are sometimes useful.

rgds, Nigel
--

-j

Roberto Zanotto

unread,
May 2, 2016, 2:59:30 PM5/2/16
to golang-nuts, rhe...@gmail.com
Maybe I'm just misunderstanding what you are saying, but it seems very wrong to me.
Here's a simple example of a select with one case that is not ready when it is evaluated,
but there's no race and it does not deadlock:
If this deadlocked, it would be a bug for sure.
(I'm in a rush now and didn't look at the code that Nigel posted, I will take the time to do that tonight)

Roberto Zanotto

unread,
May 2, 2016, 3:25:50 PM5/2/16
to golang-nuts
I took a quick look at the code.
One thing seems more a bug than a design choice:
If you have a tasklimit of 100000 and 20 primers, the number of tasks executed can go up to 20*100000,
because each primer has its own i counter and the thing exits only when one of the i counters reaches tasklimit.
(each i counter of each primer can go up to 99999 before exiting).
Combine this with the fact that unbuffered channels make the thing constantly switch goroutines and we may have found
your performance problem.

Nigel Vickers

unread,
May 2, 2016, 4:32:56 PM5/2/16
to golang-nuts

Hello Roberto

On Monday, 2 May 2016 21:25:50 UTC+2, Roberto Zanotto wrote:
I took a quick look at the code.
One thing seems more a bug than a design choice:
If you have a tasklimit of 100000 and 20 primers, the number of tasks executed can go up to 20*100000,
because each primer has its own i counter and the thing exits only when one of the i counters reaches tasklimit.
(each i counter of each primer can go up to 99999 before exiting).

You are right. Being born in the first half of the last century is beginning  to show.
 
Combine this with the fact that unbuffered channels make the thing constantly switch goroutines and we may have found
your performance problem.

This example was created originally to do just that. My experience is that if you need to buffer work is accumulating faster than it can be processed. This may be temporary or not. However buffered or not the channel received message has to work quicky and reliably. I am just having a hard time proving that it sometime doesn't.  

Roberto Zanotto

unread,
May 2, 2016, 8:58:46 PM5/2/16
to golang-nuts
Keep in mind that sometimes it makes sense to use buffered channels even if you do not have a situation of unbalance
(where the "work is accumulating faster than it can be processed").
If the tasks to be performed are very fine grained, the cost of a goroutine switch can be bigger than
the cost of processing one item of the stream and context switches can be a bottleneck.
In such cases a channel buffer of capacity (for example) 10 can reduce the
frequency of context switches by a factor of 10x.
Reply all
Reply to author
Forward
0 new messages