Why does my code keep giving me "socket: too many open files"

Tenjin

Apr 9, 2021, 1:21:19 PM
to golang-nuts
Basically I am making a script to query an API for some JSON data. If I run it synchronously it works fine and does what I intend, but it needs to do this give or take 15 thousand times, so I thought I would use concurrency to get the job done. It works fine for about 4 thousand queries, then it stops working and gives me this error: "socket: too many open files". When I researched the issue I implemented everything people suggested: I am consuming the response body and closing it once I am finished, as well as letting my wait group know I am finished so the goroutine exits.

Right now I am spawning a new goroutine for each request and letting it exit when the request finishes. I am also running the code under WSL2; I came across a Stack Overflow post that said I need to raise the ulimit on the machine, which I cannot do and ideally do not want others to have to do either.

I have been stuck on this for a few days now and no one has been able to help me so far. Fingers crossed this post works out and I can get it resolved.

Jesper Louis Andersen

Apr 9, 2021, 1:54:22 PM
to Tenjin, golang-nuts
On Fri, Apr 9, 2021 at 7:20 PM Tenjin <jdiscord...@gmail.com> wrote:
Basically I am making a script to query an API for some JSON data [...]

I'm a bat in a vacuum here, so I'm going to guess and give some general advice rather than concrete advice. I hope it's enough to poke you in the right direction so you can solve your problem.

Unix uses "file descriptors" for both files and network connections, so they come from the same pool of resources. Each process is allowed only a certain number of them, and in a virtualized environment that limit can be quite low. You can check your limits with something like `ulimit -a`. This is the background for the error you are seeing. In general, it can be good to keep this limit artificially low in dev environments so you run into it sooner and can act before hitting production.
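If you want to see the limit from inside the program rather than from the shell, something like this should work on Linux (a minimal sketch using the standard syscall package):

    package main

    import (
        "fmt"
        "syscall"
    )

    func main() {
        var rl syscall.Rlimit
        // RLIMIT_NOFILE is the per-process limit on open file descriptors.
        if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
            fmt.Println("getrlimit:", err)
            return
        }
        fmt.Printf("fd soft limit: %d, hard limit: %d\n", rl.Cur, rl.Max)
    }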

The particular problem can happen in a number of ways. You may be leaking network connections, or you may be leaking files (and it just so happens you run out of descriptors via network connections first). It can also happen because your connections "linger" for a while after you've used them (HTTP keep-alive, etc.), because connections are kept open for reuse, or because you open new connections while the old ones are still in use, so they stack up. The garbage collector can't be relied on for cleaning up, because it may take a while before it runs, and you want to give resources back quickly so they can be reused.
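In Go specifically, an HTTP response body has to be read to the end and closed before the underlying connection can go back into the client's pool. A minimal sketch of that pattern (fetchJSON is a hypothetical helper name, not something from your code):

    import (
        "encoding/json"
        "io"
        "net/http"
    )

    // fetchJSON fetches a URL and decodes the JSON body into out.
    func fetchJSON(url string, out interface{}) error {
        resp, err := http.Get(url)
        if err != nil {
            return err
        }
        defer resp.Body.Close()

        err = json.NewDecoder(resp.Body).Decode(out)
        // Drain anything the decoder left behind so the keep-alive
        // connection can be reused instead of being torn down.
        io.Copy(io.Discard, resp.Body)
        return err
    }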

A good first step is to verify that this is actually what is happening. Look up the `netstat` or `lsof` commands (on Linux) to get a view of the open files and open network connections. That tells you which resource is piling up, and then you can drill down into where the problem might be.

Any API has a limit to how much it can handle, and your client has a limit as well. Hence it is good style to put some kind of concurrency limit on your requests. Either spawn a fixed number of workers that read work from the same channel, or use a buffered channel as a limiter: you only start new work when there's space to put a token on the channel, and when you're done you take a token back out. Bryan C. Mills had an excellent talk at GopherCon 2018 about this: https://youtu.be/5zXAHh5tJqQ?t=1641 where I've linked to the point where he discusses it, but the whole talk is highly recommended. The reason Bryan's suggestion is good is that it bounds the number of goroutines you have, which also makes debugging much easier.
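A minimal sketch of the token-channel idea, reusing the hypothetical fetchJSON from above (the limit of 8, urls, and MyPayload are all placeholders):

    // sem limits how many requests are in flight at once.
    sem := make(chan struct{}, 8)
    var wg sync.WaitGroup

    for _, url := range urls {
        wg.Add(1)
        sem <- struct{}{} // acquire a slot; blocks while 8 requests are in flight
        go func(u string) {
            defer wg.Done()
            defer func() { <-sem }() // give the slot back when done
            var payload MyPayload
            if err := fetchJSON(u, &payload); err != nil {
                log.Printf("fetch %s: %v", u, err)
            }
        }(url)
    }
    wg.Wait()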

Some APIs have overload protection. They'll begin returning 429 Too Many Requests, or something similar. Make sure you have a system which detects this and stops hammering the API when it happens.
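Detection can be as simple as checking the status code inside the fetch helper; how you back off afterwards is up to you (the error message here is just an illustration):

    if resp.StatusCode == http.StatusTooManyRequests {
        // A real client would honor this header and retry with backoff
        // instead of just reporting it.
        retryAfter := resp.Header.Get("Retry-After")
        return fmt.Errorf("rate limited, Retry-After=%q", retryAfter)
    }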

You can use the channel to test your setup as well. Set the limit to 1 and you should get the synchronous behavior back, with resource usage steady when you analyze it. Then you can bump it to 2, 4, 8, ... or whatever you feel is fair to the API.

--
J.

Jeff Peck

Apr 9, 2021, 2:24:41 PM
to golan...@googlegroups.com

On 4/9/21 11:57 AM, Tenjin wrote:
Basically I am making a script to query an API for some JSON data [...]

Right now I am spawning a new goroutine for each request and letting it exit when the request finishes. I am also running the code under WSL2; I came across a Stack Overflow post that said I need to raise the ulimit on the machine, which I cannot do and ideally do not want others to have to do either.

This is most likely your problem. I can't be sure, but it sounds like you are launching all of these goroutines at once on startup. If so, they are likely all trying to open a network connection at the same time. Instead, I would suggest pushing your requests into a channel and having a well-defined goroutine iterate over that channel to service the requests. Then you can experiment with launching N instances of that goroutine when your application starts up. This gives you better control over the number of outstanding sockets, and you can play with N to get good performance.
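A minimal sketch of that shape, again with placeholder names (jobs, urls, N, MyPayload, and the fetchJSON helper from earlier in the thread are all assumptions, not your actual code):

    // jobs carries the URLs to fetch; workers range over it until it is closed.
    jobs := make(chan string)
    var wg sync.WaitGroup

    const N = 8 // number of workers; tune experimentally
    for i := 0; i < N; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for u := range jobs {
                var payload MyPayload // hypothetical result type
                if err := fetchJSON(u, &payload); err != nil {
                    log.Printf("fetch %s: %v", u, err)
                }
            }
        }()
    }

    for _, u := range urls {
        jobs <- u
    }
    close(jobs)
    wg.Wait()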
