"Connection reset by peer: socket write error" when inserting more than 70K entries


Vinay Kumar Chella

Oct 2, 2012, 6:12:22 PM
to hector...@googlegroups.com
Any idea why I get the error below when I try to insert more than 70K entries into Cassandra with a single Mutator object?

Thanks in advance for your help

Caused by: me.prettyprint.hector.api.exceptions.HectorTransportException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset by peer: socket write error
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:33)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:252)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)

Nate McCall

Oct 2, 2012, 6:16:25 PM
to hector...@googlegroups.com
You are going past Thrift's max frame size (15 MB by default in
Cassandra; it's per connection, so don't change it).

Split your inserts up into parallel threads with smaller batch sizes.
Here is an example of using an executor and a callable for
parallelizing insertion:
https://github.com/zznate/cassandra-tutorial/blob/master/src/main/java/com/datastax/tutorial/composite/CompositeDataLoader.java
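
A minimal sketch of the split-and-parallelize idea, using only the JDK. It is not the code from the linked CompositeDataLoader. The names `BatchedInserter`, `partition`, `insertInParallel`, and `batchWriter` are hypothetical; in a Hector application `batchWriter` would presumably create a fresh Mutator, add that batch's insertions, and call `execute()`, so that no single request approaches the frame limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class BatchedInserter {

    // Split a list into consecutive chunks of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    // Submit one Callable per batch to a fixed-size pool and wait for all of them.
    // batchWriter is a hypothetical hook: it stands in for "build a mutator for
    // this batch and execute it". Returns the total number of entries written.
    static <T> int insertInParallel(List<T> items, int batchSize, int threads,
                                    Consumer<List<T>> batchWriter) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (List<T> batch : partition(items, batchSize)) {
                tasks.add(() -> {
                    batchWriter.accept(batch);
                    return batch.size();
                });
            }
            int written = 0;
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                written += result.get(); // rethrows any failure from a batch
            }
            return written;
        } finally {
            pool.shutdown();
        }
    }
}
```

With 70,000 entries and a batch size of 1,000 (an arbitrary figure, to be tuned against the actual row size), this issues 70 small requests instead of one oversized one.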

Vinay Kumar Chella

Oct 2, 2012, 6:27:30 PM
to hector...@googlegroups.com
Nate, you have been very helpful and responsive. Thanks a lot. Does Thrift's max frame size also apply to reads? That is, if I read with a MultigetSliceQuery over many row keys and without filtering the columns, and the result is more than 15 MB of data, will I get the same exception?

Nate McCall

Oct 2, 2012, 6:42:55 PM
to hector...@googlegroups.com
Yes. Same for reads. This is analogous to max_allowed_packet in MySQL
and similar database systems. It's just not a good idea to return a
large result set over the wire in one shot.
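
The same idea applied to reads, as a rough sketch: ask for a bounded number of row keys per request and merge the partial results on the client. `ChunkedReader`, `readInChunks`, and `fetchChunk` are hypothetical names; `fetchChunk` stands in for whatever issues the actual multiget for one chunk of keys. Limiting the columns returned per row would shrink each response further.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ChunkedReader {

    // Read rows for all keys, but never request more than keysPerRequest keys
    // in one call. fetchChunk is a hypothetical hook that performs one bounded
    // read and returns the rows it found, keyed by row key.
    static <K, V> Map<K, V> readInChunks(List<K> keys, int keysPerRequest,
                                         Function<List<K>, Map<K, V>> fetchChunk) {
        if (keysPerRequest <= 0) {
            throw new IllegalArgumentException("keysPerRequest must be positive");
        }
        Map<K, V> merged = new LinkedHashMap<>();
        for (int i = 0; i < keys.size(); i += keysPerRequest) {
            int end = Math.min(i + keysPerRequest, keys.size());
            merged.putAll(fetchChunk.apply(keys.subList(i, end)));
        }
        return merged;
    }
}
```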

Vinay Kumar Chella

Oct 2, 2012, 7:06:06 PM
to hector...@googlegroups.com
Got it Nate. Thanks again. :)

Vinay Kumar Chella

Oct 2, 2012, 7:36:07 PM
to hector...@googlegroups.com
How about changing this setting in cassandra.yaml to 25 MB or so? Would that cause any issues?

Nate McCall

Oct 2, 2012, 8:58:07 PM
to hector...@googlegroups.com
Just turn the batch sizes down.

Thrift has a "feature" that re-uses the buffers. They only grow, up to
the max size; they never shrink.

After running for some time, your app could have 50 connections *per host*,
each with a 25 MB buffer. You will OOM on the client.
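
The arithmetic behind that warning, as a small sketch. `BufferEstimate` and `worstCaseBufferMb` are hypothetical names, and the estimate assumes every pooled connection has grown its buffer to the full frame size.

```java
class BufferEstimate {

    // Worst-case client memory held by Thrift buffers, in MB, assuming every
    // connection's buffer has grown to the configured max frame size.
    static long worstCaseBufferMb(int hosts, int connectionsPerHost, int maxFrameMb) {
        return (long) hosts * connectionsPerHost * maxFrameMb;
    }
}
```

For a single host, `worstCaseBufferMb(1, 50, 25)` gives 1250 MB, against 750 MB at the 15 MB default, and the figure scales linearly with the number of hosts.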
