failed: Connection reset by peer


Ajith Shetty

Sep 11, 2017, 7:17:38 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi Group Members,

I am using Apache Cassandra 3.0.11

DataStax driver: 3.1.3

Netty version: 4.1.6.Final


We frequently get the error below on the database side:

INFO  [SharedPool-Worker-2] 2017-09-11 01:39:31,738 Message.java:615 - Unexpected exception during request; channel = [id: 0x22d0c179, L:/xx.xx.xx.xx:9042 ! R:/yy.yy.yy.yy:57774]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
        at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
INFO  [SharedPool-Worker-2] 2017-09-11 01:44:25,789 Message.java:615 - Unexpected exception during request; channel = [id: 0xace60b99, L:/xx.xx.xx.xx:9042 ! R:/yy.yy.yy.yy:54372]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
        at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]


And on the application side we get the error below:

08/24/17 05:30:08.321   DEBUG [CassandraAccessor:57] - executing CQL [select * from src where host_ip= 'cc.cc.cc.cc']
08/24/17 05:30:08.319   DEBUG [Qwerty:49] - Finding route for [ Source IP : yy.yy.yy.yy , Source Port : 16107 , Messaage Type : ISONMQ
08/24/17 05:30:08.322   DEBUG [CassandraAccessor:57] - executing CQL [select * from test]
08/24/17 05:30:08.321   ERROR [RequestHandler$SpeculativeExecution$1:333] - Unexpected error while querying /xx.xx.xx.xx
com.datastax.driver.core.exceptions.ConnectionException: [/xx.xx.xx.xx:9042] Pool is CLOSING
        at com.datastax.driver.core.HostConnectionPool.borrowConnection(HostConnectionPool.java:196)
        at com.datastax.driver.core.RequestHandler$SpeculativeExecution.query(RequestHandler.java:293)
        at com.datastax.driver.core.RequestHandler$SpeculativeExecution.findNextHostAndQuery(RequestHandler.java:272)
        at com.datastax.driver.core.RequestHandler.startNewExecution(RequestHandler.java:115)
        at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:95)
        at com.datastax.driver.core.SessionManager.executeAsync(SessionManager.java:132)
        at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:68)
        at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:43)
        at org.springframework.cassandra.core.CqlTemplate$3.doInSession(CqlTemplate.java:286)
        at org.springframework.cassandra.core.CqlTemplate$3.doInSession(CqlTemplate.java:283)
        at org.springframework.cassandra.core.CqlTemplate.doExecute(CqlTemplate.java:276)
        at org.springframework.cassandra.core.CqlTemplate.doExecuteQueryReturnResultSet(CqlTemplate.java:283)
        at org.springframework.data.cassandra.core.CassandraTemplate.select(CassandraTemplate.java:594)
        at org.springframework.data.cassandra.core.CassandraTemplate.select(CassandraTemplate.java:376)
        at com.fs.searshc.commlink.db.cassandra.RouterData.findBySourceIp(RouterData.java:27)
        at com.fs.searshc.commlink.router.tcpip.CommLinkRouter.findRoute(CommLinkRouter.java:75)


Any kind of help would be greatly appreciated.


Thanks in advance.


-Ajith

Andrew Tolbert

Sep 11, 2017, 10:48:52 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hello Ajith,

If neither side's logs show that it is closing the connection, one possibility is that the reset is being initiated by some intermediate networking equipment between your client and the Cassandra nodes.  I would suggest doing a packet trace with a tool like Wireshark or tcpdump on both sides. If both sides observe the connection being reset by the other, an intermediate networking device is the likely cause; otherwise the trace should show which side is sending the reset.

Thanks,
Andy

Ajith Shetty

Sep 11, 2017, 10:58:51 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Thank you for the quick response, Andy.
We will run the packet trace and see what we can find.

Thanks,
Ajith

Ajith Shetty

Sep 26, 2017, 7:48:40 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi Andrew,

We ran tcpdump and got the output below.
Could you please help me decode it?
Which server is the culprit sending the RST?


Application server: xx.xx.xx.139
Cassandra server1: xx.xx.xx.78
Cassandra server2: xx.xx.xx.79

tcpdump from application server: xx.xx.xx.139

2017-09-20 05:09:22.170273 IP xx.xx.xx.78.9042 > xx.xx.xx.139.44408: Flags [R.], seq 558647898, ack 47750391, win 6145, options [nop,nop,TS val 3419854254 ecr 1293665900], length 0
2017-09-20 05:09:22.170319 IP xx.xx.xx.79.9042 > xx.xx.xx.139.42696: Flags [R.], seq 594588671, ack 44660661, win 6146, options [nop,nop,TS val 3410573711 ecr 1293665900], length 0


Andrew Tolbert

Sep 26, 2017, 10:13:11 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi Ajith,

Given that output, it looks like the application server is observing resets sent by each of the Cassandra nodes at around the same time.  It's still possible that something intermediate between the two is resetting the connection, as I wouldn't expect Cassandra itself to do that unless the nodes were coming down or someone ran nodetool disablebinary or something similar.  What is your environment like?  Is it on-premises, or is it hosted on a cloud provider?

Thanks,
Andy
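
A note on reading captures like the one quoted above: in tcpdump's text output, the address before ">" is the source address in the packet's IP header, and an "R" inside "Flags [...]" marks a reset. The following is a minimal Python sketch, not part of the original thread, that extracts those fields. The function name find_resets and the regular expression are illustrative assumptions fitted to the two lines Ajith posted. A Cassandra node's address in the source field only shows whose address the packet carries, so a capture taken on the Cassandra side is still needed to tell a node-originated reset from one injected by an intermediate device.

```python
import re

# Fitted to the tcpdump text format seen in this thread, e.g.
# "<date> <time> IP <src>.<sport> > <dst>.<dport>: Flags [R.], seq ..."
LINE_RE = re.compile(
    r"^(?P<ts>\S+ \S+) IP (?P<src>\S+)\.(?P<sport>\d+) > "
    r"(?P<dst>\S+)\.(?P<dport>\d+): Flags \[(?P<flags>[^\]]*)\]"
)


def find_resets(lines):
    """Return (timestamp, sender, receiver) for each line whose flags include RST."""
    resets = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        # tcpdump prints 'R' for RST; '.' in the same bracket means ACK is also set.
        if m and "R" in m.group("flags"):
            resets.append((
                m.group("ts"),
                f"{m.group('src')}:{m.group('sport')}",
                f"{m.group('dst')}:{m.group('dport')}",
            ))
    return resets


if __name__ == "__main__":
    # The two lines from the application-server capture, addresses masked as posted.
    sample = [
        "2017-09-20 05:09:22.170273 IP xx.xx.xx.78.9042 > xx.xx.xx.139.44408: "
        "Flags [R.], seq 558647898, ack 47750391, win 6145, length 0",
        "2017-09-20 05:09:22.170319 IP xx.xx.xx.79.9042 > xx.xx.xx.139.42696: "
        "Flags [R.], seq 594588671, ack 44660661, win 6146, length 0",
    ]
    for ts, sender, receiver in find_resets(sample):
        print(f"{ts}: RST carried source {sender}, delivered to {receiver}")
```

Running the same function over a capture from each Cassandra node and comparing timestamps and ports would show whether the nodes actually emitted these packets.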

Ajith Shetty

Sep 26, 2017, 10:28:39 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Thanks for the quick response, Andy.

We have the Cassandra cluster and application servers behind a firewall.
Below are the TCP keepalive values:

-bash-4.1$ cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
-bash-4.1$ cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
-bash-4.1$ cat /proc/sys/net/ipv4/tcp_keepalive_time
7200

Thanks,

Ajith Shetty
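
For reference, the three kernel values posted above combine as follows: an idle connection with SO_KEEPALIVE enabled sends its first probe after tcp_keepalive_time seconds, repeats it every tcp_keepalive_intvl seconds, and is declared dead after tcp_keepalive_probes unanswered probes. A small Python sketch of that arithmetic follows. The function names and the firewall idle timeout are hypothetical, since the thread does not say what the firewall's timeout is or whether it sits between the application and the Cassandra nodes.

```python
def keepalive_timeline(time_s, intvl_s, probes):
    """Return (seconds idle before the first probe, seconds idle before the peer is declared dead)."""
    first_probe_s = time_s
    declared_dead_s = time_s + intvl_s * probes
    return first_probe_s, declared_dead_s


def firewall_may_drop_idle_flow(firewall_idle_timeout_s, time_s):
    """True if a firewall with this idle timeout would expire a silent flow
    before the kernel sends its first keepalive probe."""
    return firewall_idle_timeout_s < time_s


if __name__ == "__main__":
    # Values from /proc/sys/net/ipv4 as posted in the thread.
    first, dead = keepalive_timeline(time_s=7200, intvl_s=75, probes=9)
    print(f"first probe after {first} s idle; peer declared dead after {dead} s")

    # 3600 s is a made-up firewall idle timeout, used only for illustration.
    print(firewall_may_drop_idle_flow(firewall_idle_timeout_s=3600, time_s=7200))
```

With the posted values, the first probe goes out after 7200 s (2 hours) of idleness and an unresponsive peer is dropped after 7875 s. If a firewall in the path expires idle flows sooner than 7200 s, keepalive at these settings would not keep the flow open.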

