With Go version 1.14 I get a lot of errors when I run:

    go test -v github.com/pebbe/zmq4

I didn't see this with Go 1.13.8 or any earlier version.

Is this a problem with Go 1.14, or am I doing something wrong and just got lucky until now?

How do I debug this? The errors are different for each run. Below is a sample of some errors.

Line numbers are not always accurate, because I inserted some calls to test.Log().
On Wednesday, February 26, 2020 at 12:33:05 PM UTC+1, Peter Kleiweg wrote:
> With Go version 1.14 I get a lot of errors when I run:
>
>     go test -v github.com/pebbe/zmq4
>
> I didn't see this with Go 1.13.8 or any earlier version.
>
> Is this a problem with Go 1.14, or am I doing something wrong and just got lucky until now?
>
> How do I debug this? The errors are different for each run. Below is a sample of some errors.
>
> Line numbers are not always accurate, because I inserted some calls to test.Log().

The errors are probably caused by https://golang.org/doc/go1.14#runtime.

The solution is to update zmq4 to explicitly handle interrupted system calls.
However, it is strange that they happen in the tests. Is this caused by SIGPIPE?
This looks like fallout from the 1.14 changes that made goroutines
asynchronously preemptible.
It seems likely that this code didn't work before either, and that
the failure cases were just masked because fewer signals were
delivered (and thus had less chance of interrupting system calls).
--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/cdb7319d-542a-45ab-842b-bc1b5d838e93%40googlegroups.com.
-- -- Gregor Best be...@pferdewetten.de
On Wednesday, February 26, 2020 at 13:05:40 UTC+1, Manlio Perillo wrote:
> On Wednesday, February 26, 2020 at 12:33:05 PM UTC+1, Peter Kleiweg wrote:
>> [...]
>
> The errors are probably caused by https://golang.org/doc/go1.14#runtime.
> The solution is to update zmq4 to explicitly handle interrupted system calls.

Often the program freezes before I get an interrupted system call. It hangs inside a ZeroMQ C++ library function.

zmq4 is just a wrapper for ZeroMQ. I can't "fix" ZeroMQ to make it work with Go.

Is there a way to stop Go from interrupting my system calls? It happens rather randomly all over the place.
Two famous people, one from MIT and another from Berkeley (but working on Unix) once met to discuss operating system issues. The person from MIT was knowledgeable about ITS (the MIT AI Lab operating system) and had been reading the Unix sources. He was interested in how Unix solved the PC loser-ing problem.
The PC loser-ing problem occurs when a user program invokes a system routine to perform a lengthy operation that might have significant state, such as IO buffers. If an interrupt occurs during the operation, the state of the user program must be saved. Because the invocation of the system routine is usually a single instruction, the PC of the user program does not adequately capture the state of the process. The system routine must either back out or press forward. The right thing is to back out and restore the user program PC to the instruction that invoked the system routine so that resumption of the user program after the interrupt, for example, re-enters the system routine. It is called "PC loser-ing" because the PC is being coerced into "loser mode," where "loser" is the affectionate name for "user" at MIT.
The MIT guy did not see any code that handled this case and asked the New Jersey guy how the problem was handled. The New Jersey guy said that the Unix folks were aware of the problem, but the solution was for the system routine to always finish, but sometimes an error code would be returned that signaled that the system routine had failed to complete its action. A correct user program, then, had to check the error code to determine whether to simply try the system routine again. The MIT guy did not like this solution because it was not the right thing.
The New Jersey guy said that the Unix solution was right because the design philosophy of Unix was simplicity and that the right thing was too complex. Besides, programmers could easily insert this extra test and loop. The MIT guy pointed out that the implementation was simple but the interface to the functionality was complex. The New Jersey guy said that the right tradeoff has been selected in Unix: namely, implementation simplicity was more important than interface simplicity.
On Feb 26, 2020, at 11:41 AM, Peter Kleiweg <pkle...@xs4all.nl> wrote:
On Wed, Feb 26, 2020 at 9:11 AM Manlio Perillo <manlio...@gmail.com> wrote:
>
> On Wednesday, February 26, 2020 at 4:14:38 PM UTC+1, Ian Lance Taylor wrote:
>>
>> On Wed, Feb 26, 2020 at 7:11 AM Manlio Perillo <manlio...@gmail.com> wrote:
> [...]
>> >
>> > https://stackoverflow.com/questions/36040547/zeromq-how-to-react-on-different-signal-types-on-eintr
>> >
>> > ZeroMQ may return an EINTR error, but zmq4 does not list it in errors.go.
>> > ZeroMQ asks the caller to handle EINTR, so zmq4 should handle it internally or return it to the caller.
>> >
>> > https://golang.org/doc/go1.14#runtime should have mentioned that not only programs that use packages like syscall or golang.org/x/sys/unix will see more slow system calls fail with EINTR errors, but also programs that use Cgo.
>>
>> I don't know ZeroMQ. If the ZeroMQ calls correspond closely to system
>> calls, then it could work for them to return EINTR. In that case the
>> fix is going to be for the Go wrapper around ZeroMQ to check whether
>> the error returned is syscall.EINTR, and to retry the call if it is.
>>
>
> Unfortunately it is not that simple:
>
> http://250bpm.com/blog:12
> https://alobbs.com/post/54503240599/close-and-eintr
> http://man7.org/linux/man-pages/man7/signal.7.html
> https://github.com/golang/go/issues/11180
> https://www.python.org/dev/peps/pep-0475/
>
> The second entry about close and EINTR is enlightening.
Thanks for the links. Note that these issues don't really have
anything to do with Go. For certain system calls, you need to handle
EINTR one way or another. The Go runtime does as much as it can to
avoid these problems, but on Unix systems it is impossible to avoid
them entirely.
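A minimal sketch of the check-and-retry approach described above. The names retryOnEINTR and interruptedNTimes are hypothetical helpers written for this illustration; they are not part of zmq4 or ZeroMQ. The sketch assumes the wrapped call reports interruption as syscall.EINTR; if a wrapper uses its own error type, the comparison would need to be adapted. As noted in the linked discussions, blind retrying is not appropriate for every call (close is the usual counterexample), so this only fits calls that are safe to repeat.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// retryOnEINTR calls op until it returns something other than EINTR.
// Hypothetical helper for illustration; not part of zmq4.
func retryOnEINTR(op func() error) error {
	for {
		err := op()
		if !errors.Is(err, syscall.EINTR) {
			return err // nil, or an error that is not an interruption
		}
		// The call was interrupted by a signal before completing: retry.
	}
}

// interruptedNTimes returns a stand-in for a slow call that fails with
// EINTR n times before succeeding, plus a counter of how often it ran.
func interruptedNTimes(n int) (op func() error, calls *int) {
	calls = new(int)
	op = func() error {
		*calls++
		if *calls <= n {
			return syscall.EINTR
		}
		return nil
	}
	return op, calls
}

func main() {
	op, calls := interruptedNTimes(2)
	err := retryOnEINTR(op)
	fmt.Println("error:", err, "calls:", *calls)
}
```

In a real wrapper the closure passed to retryOnEINTR would invoke the underlying library call; the fake operation here only exists so the loop can be exercised without signals.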
On Thu, Feb 27, 2020 at 9:41 AM Robert Engels <ren...@ix.netcom.com> wrote:
>
>
> I re-read your comments, and I agree that a rare error is still an error and needs to be handled, but if the platform is introducing lots of errors, is that the library writer's issue?
>
> Maybe an easy solution is a flag to disable the signal usage for tight-loop preemption as a "backwards compatibility" mode ?
>
> As the OP pointed out, he can't really change ZeroMQ, and this is a fairly established product, maybe more so than Go, so doesn't it make more sense that Go adapts rather than the other way around?
We already have that flag: GODEBUG=asyncpreemptoff=1.
The discussion upthread explains that the Go wrapper for ZeroMQ should
handle EINTR, and take the appropriate action such as retrying the
operation when appropriate. The response to that was a bit of
distraction, as it discussed generic problems with EINTR. At this
point there is no reason to assume that any of those problems actually
apply to using ZeroMQ.
-----Original Message-----
From: Peter Kleiweg
Sent: Feb 27, 2020 11:59 AM
To: golang-nuts
Subject: Re: [go-nuts] Re: Lot's of test errors in package zmq4 with Go version 1.14, no errors with earlier versions
On Feb 27, 2020, at 2:26 PM, Peter Kleiweg <pkle...@xs4all.nl> wrote:
=== RUN TestSocketEvent
    TestSocketEvent: socketevent_test.go:73: rep.Bind: interrupted system call

=== RUN TestMultipleContexts
    TestMultipleContexts: zmq4_test.go:131: sock1.Connect: interrupted system call

freeze:
=== RUN TestMultipleContexts
^C
FAIL github.com/pebbe/zmq4 30.226s

freeze:
=== RUN TestMultipleContexts
    TestMultipleContexts: zmq4_test.go:148: sock1.RecvMessage: expected <nil> [tcp://127.0.0.1:9997 tcp://127.0.0.1:9997], got interrupted system call []
^C
FAIL github.com/pebbe/zmq4 21.445s

freeze:
=== RUN TestSecurityCurve
^C
FAIL github.com/pebbe/zmq4 31.143s

freeze:
=== RUN TestSecurityNull
    TestSecurityNull: zmq4_test.go:1753: server.Recv 1: resource temporarily unavailable
^C
FAIL github.com/pebbe/zmq4 44.828s

=== RUN TestDisconnectInproc
    TestDisconnectInproc: zmq4_test.go:523: Poll: interrupted system call
    TestDisconnectInproc: zmq4_test.go:623: isSubscribed

=== RUN TestHwm
    TestHwm: zmq4_test.go:823: bind_socket.Bind: interrupted system call
    TestHwm: zmq4_test.go:1044: test_inproc_bind_first(0, 0): expected 10000, got -1

freeze:
=== RUN TestSecurityPlain
^C
FAIL github.com/pebbe/zmq4 46.395s

=== RUN TestPairIpc
    TestPairIpc: zmq4_test.go:1124: client.Send SNDMORE|DONTWAIT: interrupted system call
Can you clarify that a bit? Did you change the code to look for EINTR errors and then retry the system call?
On Fri, Feb 28, 2020 at 7:18 AM Peter Kleiweg <pkle...@xs4all.nl> wrote:
>
> On Friday, February 28, 2020 at 16:13:50 UTC+1, Robert Engels wrote:
>>
>>
>> Can you clarify that a bit? Did you change the code to look for EINTR errors and then retry the system call?
>
>
> Yes, I did. But as an option that must be enabled by the user.
I don't understand why you're making it an option. The README
suggests that you would not want to enable it if you want to handle
^C, but in Go the ^C will be delivered on a channel, presumably to a
separate goroutine. At that point your program will either exit or do
some other operation. If the program doesn't exit, then it's not
going to want the interrupted system call to fail. It's going to want
it to be retried.
(As a minor side note, calls like getsockopt will never return EINTR,
it's not necessary to retry them. But it doesn't hurt.)
Ian
On Fri, Feb 28, 2020 at 9:14 AM Manlio Perillo <manlio...@gmail.com> wrote:
>
> On Friday, February 28, 2020 at 5:36:09 PM UTC+1, Ian Lance Taylor wrote:
>>
>> On Fri, Feb 28, 2020 at 8:27 AM Manlio Perillo <manlio...@gmail.com> wrote:
>> >
> [...]
>> Go programs always
>> have multiple threads of execution. Just let a goroutine sit in the
>> slow syscall; who cares?
>>
>
> A user running a client program from a terminal may care.
> If it takes too long to read data from a remote server, a user expects that ^C will interrupt the program.
>
> However a solution is to register an atexit handler using a closure to do some cleanup, so probably this is not an issue worth making the Go runtime more complex.
In Go, a ^C will interrupt a program if you write code like
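    // Needs imports: fmt, os, os/signal, syscall.
    c := make(chan os.Signal, 1) // buffered so a signal is not missed
    signal.Notify(c, syscall.SIGINT)
    go func() {
        <-c // blocks until ^C (SIGINT) is delivered
        fmt.Printf("exiting due to ^C\n")
        os.Exit(1)
    }()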
That process is entirely independent of whether the function zmq4.Poll
returns EINTR or not.
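The snippet above can be expanded into a complete program, sketched here under a few assumptions. onInterrupt and handlerRuns are hypothetical names introduced for this illustration, and time.Sleep stands in for a goroutine sitting in a slow call such as a poll. The point it tries to show is the one made above: the signal arrives on a channel and is handled in its own goroutine, regardless of what the blocked call does.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// onInterrupt runs handle in its own goroutine once a signal arrives on c.
// Hypothetical helper for illustration.
func onInterrupt(c <-chan os.Signal, handle func(os.Signal)) {
	go func() {
		handle(<-c)
	}()
}

// handlerRuns feeds a synthetic SIGINT value into a private channel and
// reports whether the handler ran. No real signal is sent to the process.
func handlerRuns() bool {
	c := make(chan os.Signal, 1)
	done := make(chan struct{})
	onInterrupt(c, func(os.Signal) { close(done) })
	c <- syscall.SIGINT
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	c := make(chan os.Signal, 1)
	signal.Notify(c, syscall.SIGINT)
	onInterrupt(c, func(s os.Signal) {
		fmt.Println("exiting due to", s)
		os.Exit(1)
	})

	// Stand-in for the main goroutine sitting in a slow call.
	time.Sleep(2 * time.Second)
	fmt.Println("finished without interruption")
}
```

Pressing ^C while the program sleeps should take the exit path in the handler; the sleeping goroutine does not need to observe EINTR for that to happen.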