Replacement for net.Error.Temporary in server Accept loops

Caleb Spare

Apr 20, 2022, 9:02:34 PM

to golang-nuts

In Go 1.18 net.Error.Temporary was deprecated (see
https://go.dev/issue/45729). However, in trying to remove it from my
code, I found one way in which Temporary is used for which there is no
obvious replacement: in a TCP server's Accept loop, when deciding
whether to wait and retry an Accept error.

You can see an example of this in net/http.Server today:
https://github.com/golang/go/blob/ab9d31da9e088a271e656120a3d99cd3b1103ab6/src/net/http/server.go#L3047-L3059

In this case, Temporary seems useful, and enumerating the OS-specific
errors myself doesn't seem like a good idea.

Does anyone have a good solution here? It doesn't seem like this was
adequately considered when making this deprecation decision.

Caleb

Damien Neil

Apr 20, 2022, 9:46:40 PM

to golang-nuts

The reason for deprecating Temporary is that the set of "temporary" errors was extremely ill-defined. The initial issue for https://go.dev/issue/45729 discusses the de facto definition of Temporary and the confusion resulting from it.

Perhaps there's a useful definition of temporary or retriable errors, perhaps limited in scope to syscall errors such as EINTR and EMFILE. I don't know what that definition is, but perhaps we should come up with one and add an os.ErrTemporary or some such. I don't think leaving net.Error.Temporary undeprecated was the right choice, however; the need for a good way to identify transient system errors such as EMFILE doesn't mean that it was a good way to do so or could ever be made into one.

Ian Lance Taylor

Apr 20, 2022, 10:49:20 PM

to Damien Neil, golang-nuts

On Wed, Apr 20, 2022 at 6:46 PM 'Damien Neil' via golang-nuts
<golan...@googlegroups.com> wrote:
>
> The reason for deprecating Temporary is that the set of "temporary" errors was extremely ill-defined. The initial issue for https://go.dev/issue/45729 discusses the de facto definition of Temporary and the confusion resulting from it.
>
> Perhaps there's a useful definition of temporary or retriable errors, perhaps limited in scope to syscall errors such as EINTR and EMFILE. I don't know what that definition is, but perhaps we should come up with one and add an os.ErrTemporary or some such. I don't think leaving net.Error.Temporary undeprecated was the right choice, however; the need for a good way to identify transient system errors such as EMFILE doesn't mean that it was a good way to do so or could ever be made into one.

To frame issue 45729 in a different way, whether an error is temporary
is not a general characteristic. It depends on the context in which
it appears. For the Accept loop in http.Server.Serve really the only
plausible temporary errors are ENFILE and EMFILE. Perhaps the net
package needs a RetriableAcceptError function.

Ian

Bryan C. Mills

Apr 21, 2022, 10:16:11 AM

to golang-nuts

Even ENFILE and EMFILE are not necessarily blindly retriable: if the process has run out of files, it may be because they have leaked (for example, they may be reachable from deadlocked goroutines).

If that is the case, it is arguably better for the program to fail with a useful error than to keep retrying without making progress.

(I would argue that the retry loop in net/http.Server is a mistake, and should be replaced with a user-configurable semaphore limiting the number of open connections — thus avoiding the file exhaustion in the first place!)

Caleb Spare

Apr 21, 2022, 6:35:44 PM

to Damien Neil, golang-nuts

On Wed, Apr 20, 2022 at 6:46 PM 'Damien Neil' via golang-nuts
<golan...@googlegroups.com> wrote:
>

> The reason for deprecating Temporary is that the set of "temporary" errors was extremely ill-defined. The initial issue for https://go.dev/issue/45729 discusses the de facto definition of Temporary and the confusion resulting from it.
>
> Perhaps there's a useful definition of temporary or retriable errors, perhaps limited in scope to syscall errors such as EINTR and EMFILE. I don't know what that definition is, but perhaps we should come up with one and add an os.ErrTemporary or some such. I don't think leaving net.Error.Temporary undeprecated was the right choice, however; the need for a good way to identify transient system errors such as EMFILE doesn't mean that it was a good way to do so or could ever be made into one.

Thanks for the response. I definitely appreciate and agree with the
sentiment in issue 45729. As I went through a large codebase to remove
Temporary, most usages were on the client side and didn't really make
sense. I'm glad we're getting rid of it.

While the whole suite of Temporary errors isn't really coherent, the
issue is that a small subset of Temporary is still useful for Accept
loops and it doesn't have a non-deprecated replacement. As a case in
point, I presume that http.Server is going to keep using Temporary
indefinitely.

In our code, we try to weed out any deprecated functions, so I will
need to write some replacement for this use case (maybe using Ian's
suggestion of RetriableAcceptError). Perhaps I'll do that and send a
proposal for the net package later.

But ideally I think we would've provided a replacement for this use
case before deprecating Temporary. Perhaps it is simply that this use
case wasn't identified as part of issue 45729. Near the end of the
issue you wrote

> The cases where Temporary does not imply Timeout are surprising and not particularly useful.

but Accept loops seem like a counterexample to that (surprising or
not, it is certainly useful).

Caleb

Caleb Spare

Apr 21, 2022, 6:39:39 PM

to Bryan C. Mills, golang-nuts

On Thu, Apr 21, 2022 at 7:16 AM 'Bryan C. Mills' via golang-nuts
<golan...@googlegroups.com> wrote:
>
> Even ENFILE and EMFILE are not necessarily blindly retriable: if the process has run out of files, it may be because they have leaked (for example, they may be reachable from deadlocked goroutines).
> If that is the case, it is arguably better for the program to fail with a useful error than to keep retrying without making progress.
>
> (I would argue that the retry loop in net/http.Server is a mistake, and should be replaced with a user-configurable semaphore limiting the number of open connections — thus avoiding the file exhaustion in the first place!)

ENFILE might be caused by a different process entirely, no?

Zhang Jie (Kn)

Nov 17, 2025, 10:11:24 PM

to golang-nuts

Hello everyone,

Over the past year, I've encountered two strange issues with net.ListenTCP and listener.Accept. Without explicitly enabling reuseport, multiple service processes on the same machine, all searching for available ports starting from 9000, managed to successfully call listen on the same IP and port. At least when calling net.ListenTCP, it returned err == nil, and the error only appeared during listener.Accept. However, at the time, we weren't explicitly checking the returned error or printing the error message. Instead, when we found the returned conn == nil, we kept retrying listener.Accept in a for loop.

We've reproduced this issue twice within a year. The environment was a virtual machine allocated on a physical host with a Linux 5.4 kernel, and it was very difficult to reproduce. Our immediate fix was to add the error checking logic and print the specific error. While handling this issue, we also ran into the problem with netError.Temporary().

I completely agree with Ian's insight: "Whether an error is temporary depends on what you were doing at the time." For the specific case of listener.Accept(), even if netError.Temporary() returns true, retrying doesn't necessarily mean the service can remain available. Errors always manifest in wildly different ways. In our specific flawed usage scenario, the service had already successfully registered with the name service, and other services had already discovered it and started sending requests. However, because the listen wasn't actually successful (the IP:port was held by another process), it resulted in persistent access failures.

But if we don't use Temporary(), asking developers to enumerate all possible temporary errors that can be retried isn't a very straightforward task. Could several categorical functions, similar to IsTimeout, be provided to allow developers to combine them freely? For example, something like if ne.IsTimeout() || ne.IsXXX() || ne.IsYYY().

Brian Candler

Nov 18, 2025, 4:31:12 AM

to golang-nuts

When the problem occurs, I suggest you look at "ss -natp" ("netstat -natp" on older systems) and see if you really do have two listening sockets on the same port and address.

If you do, that seems like a kernel bug / some sort of race. What kernel version is the VM running? (The kernel on the physical host shouldn't really make any difference).

Robert Engels

Nov 18, 2025, 7:38:59 AM

to Brian Candler, golang-nuts

I believe that if the port has previous connections still in the CLOSE_WAIT state (could be a previous run of the same app) the port cannot be opened.

Linux also has an SO_REUSEPORT option that allows multiple processes to bind to the same port, and it balances the incoming requests automatically.

Zhang Jie (Kn)

Nov 18, 2025, 7:49:18 AM

to golang-nuts

The distribution is Tencent tlinux3; the kernel is Linux 5.4, modified by Tencent.

---

In Go, net.ListenTCP sets SO_REUSEADDR so that the same IP:port can be reused quickly, but listening twice on it shouldn't succeed unless SO_REUSEPORT is set.

When the problem occurred, we used `fuser port/tcp` to check whether only one process was listening on that IP:port. Yes, there was only one.

The other process trying to listen on the same ipport succeeded:

```
ln, err := net.ListenTCP(...)
```

here err is nil.

Then:

```
conn, err := ln.Accept()
```

Here conn is nil and err != nil, but our previous code ignored err (bad practice), so I don't know what error it returned.

And I cannot reproduce this problem.

Zhang Jie (Kn)

Nov 18, 2025, 7:56:08 AM

to golang-nuts

I searched the web and the Linux commit history, and found only two meaningful pieces of information:

1) Before Linux 2.6, there was a double bind race.
2) In Linux 6.0.16, there was a double bind race.

But there seem to be no reports for 5.x kernels.

---

I also asked ChatGPT, which said:

```
"Double bind race" refers to a scenario where multiple threads/CPUs attempt to bind() to the same (IP, port, proto) almost simultaneously. Due to a race condition window in the kernel when creating and inserting an inet_bind_bucket (port binding bucket), the following may occur:

- Both threads may believe the port is available.
- Both threads may create their own inet_bind_bucket.
- The kernel might ultimately insert one bucket, but in an inconsistent state.

This can lead to one thread's bind operation failing with an unexpected error (e.g., not EADDRINUSE), or, in older versions, even result in a temporary "successful duplicate bind" (which theoretically should not happen).

This type of race condition is typically difficult to reproduce and requires a multi-core environment with near-instantaneous concurrent attempts to bind to the same port.
```

I don't know if the kernel bug really exists, or is it caused by some virtualization technology bugs.

Steven Hartland

Nov 18, 2025, 8:15:31 AM

to Zhang Jie (Kn), golang-nuts

In your case, were the two instances started very close together? If not, a double bind race shouldn't be a problem.

Brian Candler

Nov 18, 2025, 8:49:14 AM

to golang-nuts

On Tuesday, 18 November 2025 at 12:56:08 UTC Zhang Jie (Kn) wrote:

> I don't know if the kernel bug really exists, or is it caused by some virtualization technology bugs.

If it exists, it's a bug in the guest kernel. The outer virtualization layer has no knowledge or visibility of socket data structures in the guest kernel.

> In golang, net.ListenTCP will set REUSEADDR to quickly reuse the same ipport, but listen twice shouldn't success unless REUSEPORT set.

I found this in the go source:

func setDefaultListenerSockopts(s int) error {
	// Allow reuse of recently-used addresses.
	return os.NewSyscallError("setsockopt", syscall.SetsockoptInt(s, syscall.SOL_SOCKET, syscall.SO_REUSEADDR, 1))
}

If that's used on sockets by default, then yes this condition could quite possibly happen.

If you want to report this as a bug, try making a small piece of standalone Go code which reproduces the issue with what you're trying to do.

Robert Engels

Nov 18, 2025, 9:12:36 AM

to Brian Candler, golang-nuts

As expected there are other states on the port that can prevent reusing the port from working. I suspect the OP is encountering one of those. https://www.unixguide.net/network/socketfaq/4.5.shtml

Brian Candler

Nov 18, 2025, 11:53:46 AM

to golang-nuts

I don't think he wants to re-use the port.

It sounds like he has code which hunts for a free port to bind to, in the range 9000 to 9000+N. It does this by binding to a port in that range, and if it fails, picking a new port and repeating. There are multiple tasks doing this concurrently, and sometimes, two tasks end up (wrongly) thinking they have found the same free port.

At least, that was how I understood the post - I stand to be corrected.

Zhang Jie (Kn)

Nov 18, 2025, 9:15:21 PM

to golang-nuts

As Brian says, that's exactly our situation.
