Handling of blocking/nonblocking syscalls


Dmitry Vyukov

Jan 13, 2013, 9:12:51 AM
to golang-dev
Hi,

There is a known problem in the runtime with contention related to
handling of syscalls. My scheduler patch contains the following
solution (https://codereview.appspot.com/6441097/diff/51001/src/pkg/runtime/proc.c,
see the sysmon() func). A dedicated background thread checks the status of all
threads every 20 µs; if a thread has been in the same syscall for more than
20 µs, the background thread wakes up another worker thread to execute
Go code (assuming that the first thread is blocked).
There are several problems with the approach. It's based on magic
constants that may work well for one workload/platform and badly for
another. The background thread consumes CPU time all
the time: AFAIR ~10% on Mac and ~5% on Linux. The
background thread could be parked when the program itself does nothing,
but that's not implemented yet. The approach also does not actually detect
blocking: the syscall/cgo call may in fact be computing something.

What do you think about explicitly saying that net read/write syscalls
and possibly some others (connect/accept) do not block? For cgo we can
implement some annotations that say that the call does not block. Then
the runtime does not need to guess whether the call blocks or not. It has
the advantage of being simple and reliable for all
workloads/platforms. Also, a user can mark as nonblocking cgo calls that
do long computations. And it's probably simple enough that we can
implement it before Go 1.1. Later we can implement an automatic
solution and disregard the cgo annotations, so it does not burn any
bridges.

Ingo Oeser

Jan 13, 2013, 11:05:48 AM
to golan...@googlegroups.com
Hi Dmitry.

Your gut feeling is right, the sysmon solution looks a bit problematic.
But WHAT it checks sounds good. What about checking these things only
when something else wants to run? So wake it up on runtime·gcwaiting instead of spinning.

The marking of some syscalls as non-blocking is an idea that has been discussed,
but I don't remember the outcome. It sounds like a very natural thing to do, if one can be very sure
that this decision is right and will stay right across further OS releases.

For CGO you might verify the marking via PTRACE_SYSCALL/PTRACE_SYSEMU.
This could be done either on every first run (e.g. to cache the decision) or even on every run via a special option.
Transparently correct the marking, then panic/error if you notice the marking is bad.

So for the developer, the process of debugging these decisions will be similar to using the race detector.

When we later find a better way to detect the marking, I would just panic/error if we PROVED the marking is wrong,
but still assume it is right to begin with.

minux

Jan 13, 2013, 11:14:35 AM
to Ingo Oeser, golan...@googlegroups.com

On Mon, Jan 14, 2013 at 12:05 AM, Ingo Oeser <night...@googlemail.com> wrote:
> The marking of some syscalls as non-blocking is an idea that has been discussed,

The syscall package does have the option to mark some syscalls as nonblocking syscalls,
and lately Dave Cheney proposed a CL to introduce nonblocking write for the net package
(CL 6813046), but it's not submitted.

> but I don't remember the outcome. It sounds a very natural thing to do, if one can be very sure
> that this decision is right and will stay right after further OS releases.
> For CGO you might verify the marking via PTRACE_SYSCALL/PTRACE_SYSEMU.

This is not portable.

> This could be done either on every first run (e.g. to cache the decision) or even on every run per special option.
> Transparently correct the marking then panic/error if you noticed the marking is bad.
>
> So the process of debugging his decisions will be similar to the race detector for the developer.
>
> When we later find a better way to detect the marking, I would just panic/error if we PROVED the marking is wrong,
> but still assume it is right in the first way.

I don't like the idea of checking the special non-blocking mark (if we do introduce the marking).
Sometimes people do make intentionally wrong marks to achieve some special purpose. That is,
if we expose the ability to mark a cgo call as nonblocking, then trust the programmer to do the
right thing (or just don't expose the feature in the first place).

Actually, I don't like the idea of exposing the marking to users at all.

Ingo Oeser

Jan 13, 2013, 11:33:37 AM
to golan...@googlegroups.com, Ingo Oeser
Hi Minux,

On Sunday, January 13, 2013 5:14:35 PM UTC+1, minux wrote:
> On Mon, Jan 14, 2013 at 12:05 AM, Ingo Oeser <night...@googlemail.com> wrote:
>> For CGO you might verify the marking via PTRACE_SYSCALL/PTRACE_SYSEMU.
>
> This is not portable.

Yes, this is Linux-specific, but similar calls exist on the other supported platforms,
as this kind of functionality is useful for implementing sandboxing. If everything else fails, we simply cannot prove it and must trust
the developer.

>> When we later find a better way to detect the marking, I would just panic/error if we PROVED the marking is wrong,
>> but still assume it is right in the first way.
>
> I don't like the idea of checking the special non-blocking mark (if we do introduce the marking).
> sometimes people do make intentionally wrong marks to achieve some special purpose. That is,
> if we expose the ability to mark a cgo call as nonblocking, then trust the programmer to do the
> right thing (or just don't expose the feature in the first place).

The idea is to first trust the marking and then try to prove it, in the same way that races, integer divisions by zero, and similar problems are reported to guide the developer.
And people doing such hacks will probably compile their own Go and can make the hacks there.

Or one can disable it, like we can disable GC and bounds checking.

Best Regards

Ingo

Ian Lance Taylor

Jan 13, 2013, 1:57:25 PM
to Dmitry Vyukov, golang-dev
On Sun, Jan 13, 2013 at 6:12 AM, Dmitry Vyukov <dvy...@google.com> wrote:
>
> What do you think about explicitly saying that net read/write syscalls
> and possibly some others (connect/accept) do not block?

I'm not sure I understand the suggestion, since clearly connect and
accept do block routinely. read and write block when using NFS. Also
see the go-fuse project, in which while read and write may not
actually block, they must be treated as blocking because otherwise the
goroutines implementing the read and write will not be run.

> For cgo we can
> implement some annotations that say that the call does not block. Then
> runtime does not need to guess whether the call blocks or not. It has
> the advantage of being simple and reliable for all
> workloads/platforms. + a user can mark as nonblocking cgo calls that
> do long computations. + it's probably that simple that we can
> implement it before Go1.1. Later we can implement an automatic
> solution and disregard the cgo annotations, so it does not burn the
> bridges.

In gccgo for a while cgo was implemented incorrectly, and in effect
treated all calls as non-blocking. This let me see firsthand that
programs would deadlock under that assumption, since other goroutines,
which would otherwise unblock the program, were not run. So I'm
worried about any approach like this, since a minor error--marking an
occasionally blocking call as non-blocking--can lead to an
unpredictable deadlock.

Ian

Dave Cheney

Jan 13, 2013, 2:47:07 PM
to Ian Lance Taylor, Dmitry Vyukov, golang-dev
There is also issue http://golang.org/issue/3412

The patch should apply cleanly, but I have not been able to access testing resources sufficient to give me confidence that it is a net improvement.

Dmitry Vyukov

Jan 13, 2013, 11:53:50 PM
to Ian Lance Taylor, golang-dev
On Sun, Jan 13, 2013 at 10:57 PM, Ian Lance Taylor <ia...@google.com> wrote:
>> What do you think about explicitly saying that net read/write syscalls
>> and possibly some others (connect/accept) do not block?
>
> I'm not sure I understand the suggestion, since clearly connect and
> accept do block routinely. read and write block when using NFS. Also
> see the go-fuse project, in which while read and write may not
> actually block, they must be treated as blocking because otherwise the
> goroutines implementing the read and write will not be run.

Sorry, I mean only read/write calls from the net package where we know
they are (most likely) nonblocking. This is 99% of all syscalls for a
lot of Go programs.


>> For cgo we can
>> implement some annotations that say that the call does not block. Then
>> runtime does not need to guess whether the call blocks or not. It has
>> the advantage of being simple and reliable for all
>> workloads/platforms. + a user can mark as nonblocking cgo calls that
>> do long computations. + it's probably that simple that we can
>> implement it before Go1.1. Later we can implement an automatic
>> solution and disregard the cgo annotations, so it does not burn the
>> bridges.
>
> In gccgo for a while cgo was implemented incorrectly, and in effect
> treated all calls as non-blocking. This let me see firsthand that
> programs would deadlock under that assumption, since other goroutines,
> which would otherwise unblock the program, were not run. So I'm
> worried about any approach like this, since a minor error--marking an
> occasionally blocking call as non-blocking--can lead to an
> unpredictable deadlock.


Yes, if there are dependencies between functions, it can lead to
deadlocks. This is close to deadlocks of Go programs that require
parallelism.

OK, what about doing it just for net package?

Dmitry Vyukov

Jan 13, 2013, 11:55:07 PM
to Dave Cheney, Ian Lance Taylor, golang-dev
I think it's the simplest solution for now. We can even make
ReadNB/WriteNB private/non-visible, so we do not have any additional
obligations.

Dave Cheney

Jan 14, 2013, 12:07:08 AM
to Dmitry Vyukov, Ian Lance Taylor, golang-dev
I have asked Sugu if we can leverage the Vitess test harness to
performance test that CL.

Ian Lance Taylor

Jan 14, 2013, 1:00:15 PM
to Dmitry Vyukov, golang-dev
On Sun, Jan 13, 2013 at 8:53 PM, Dmitry Vyukov <dvy...@google.com> wrote:
> On Sun, Jan 13, 2013 at 10:57 PM, Ian Lance Taylor <ia...@google.com> wrote:
>>> What do you think about explicitly saying that net read/write syscalls
>>> and possibly some others (connect/accept) do not block?
>>
>> I'm not sure I understand the suggestion, since clearly connect and
>> accept do block routinely. read and write block when using NFS. Also
>> see the go-fuse project, in which while read and write may not
>> actually block, they must be treated as blocking because otherwise the
>> goroutines implementing the read and write will not be run.
>
> Sorry, I mean only read/write calls from the net package where we know
> they are (most likely) nonblocking. This is 99% of all syscalls for a
> lot of Go programs.

Ah, yes, I'm in favor of that.

>> In gccgo for a while cgo was implemented incorrectly, and in effect
>> treated all calls as non-blocking. This let me see firsthand that
>> programs would deadlock under that assumption, since other goroutines,
>> which would otherwise unblock the program, were not run. So I'm
>> worried about any approach like this, since a minor error--marking an
>> occasionally blocking call as non-blocking--can lead to an
>> unpredictable deadlock.
>
>
> Yes, if there are dependencies between functions, it can lead to
> deadlocks. This is close to deadlocks of Go programs that require
> parallelism.
>
> OK, what about doing it just for net package?

Works for me if it makes sense.

Ian