Hi,
Our application receives about 50k POST (JSON) requests per second, and we
are facing some problems in production. The application gets stuck and
stops accepting connections.
During this time, this warning appears in our logs:
[default-akka.actor.default-dispatcher-11] WARN s.can.server.HttpServerConnection - Configured registration timeout of 1 second expired, stopping
Many connections (~20K) are also in the "CLOSE_WAIT" state.
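For reference, a per-state count like the one above can be reproduced with something like the following sketch. It assumes the usual `ss -tan` column layout (state in the first column, one header line); `count_states` is just an illustrative helper name, not an existing tool. Note that `ss` prints the state as CLOSE-WAIT, while netstat prints CLOSE_WAIT.

```shell
# Reads `ss -tan` output on stdin and prints the number of sockets per TCP state.
# count_states is a hypothetical helper name used only for this example.
count_states() {
  # Skip the header line, tally the first column (the state), then sort the result.
  awk 'NR > 1 { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage: ss -tan | count_states
```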
We are using spray-can 1.3.1 and akka 2.3.6, and we haven't changed spray's default configuration.
When I run ss -lt, I see this output:
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 127.0.0.1:9000 *:*
LISTEN 0 128 :::11211 :::*
LISTEN 0 128 *:11211 *:*
LISTEN 0 128 *:6379 *:*
LISTEN 0 128 :::6379 :::*
LISTEN 0 128 127.0.0.1:11212 *:*
LISTEN 0 128 127.0.0.1:11213 *:*
LISTEN 0 128 127.0.0.1:11214 *:*
LISTEN 129 128 *:8080 *:*
LISTEN 0 128 *:http *:*
LISTEN 0 128 *:81 *:*
LISTEN 0 128 *:82 *:*
LISTEN 0 128 *:83 *:*
LISTEN 0 128 *:84 *:*
LISTEN 0 128 :::ssh :::*
LISTEN 0 128 *:ssh *:*
LISTEN 0 128 *:4568 *:*
LISTEN 0 128 *:https *:*
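As far as I understand, for sockets in the LISTEN state on Linux, Recv-Q is the current length of the accept queue and Send-Q is the backlog limit in effect, so the line for port 8080 (129 vs. 128) is the one that stands out. A minimal sketch for picking such lines out of the output, assuming the column layout shown above (`full_listeners` is only an illustrative helper name):

```shell
# Reads `ss -lt` output on stdin and prints listeners whose Recv-Q
# has reached or exceeded Send-Q.
# full_listeners is a hypothetical helper name, not part of ss.
full_listeners() {
  # "+ 0" forces a numeric comparison; the header line is skipped
  # because its first column is not "LISTEN".
  awk '$1 == "LISTEN" && ($2 + 0) >= ($3 + 0) { print $4 " Recv-Q=" $2 " Send-Q=" $3 }'
}

# Usage: ss -lt | full_listeners
```

Against the output above, this would flag only the *:8080 listener.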
We've tried increasing the backlog on Http.Bind (it is currently set to 2048),
and also raising the kernel setting with sysctl -w net.ipv4.tcp_max_syn_backlog=50000,
but neither made a difference.
I've read some info about this here, but I didn't understand what the suggested solution is.
Do you think it's related? If so, do you have any idea how I can solve this?
We would really appreciate any help with this problem.
Thanks