columnsIteratee connection errors


Tyson Hamilton

Mar 28, 2012, 2:59:38 PM
to twitter...@googlegroups.com
Hello,

I'm seeing some strange behaviour from the columnsIteratee method. I'm attempting to iterate over all the columns in a single row key; there are quite a lot of them, so as a test I'm just trying to count and print them all. Does anyone have experience using columnsIteratee? I'm getting an error that I don't understand. Here is a code fragment:


/////////////////////////// code //////////////////////////
    val flumeIndexColumnFamily = keyspace.columnFamily("flumeindex", Utf8Codec, LexicalUUIDCodec, Utf8Codec)

    // Iterate over all the columns in a row in batches of 100
    // Currently throws a java.net.ConnectException: connection timed out
    val colsIt = flumeIndexColumnFamily.columnsIteratee("2012022503")
    var count = 0
    val indexFin = colsIt.foreach { column =>
      count += 1
      println("<" + column.name + "> => " + column.value) // executed asynchronously on each column
    }
    indexFin() // block until the iteration completes

///////////////////////////

This iterates through a number of columns but then throws the following exception before it completes the entire row:


[error] (run-main) com.twitter.finagle.WriteException: java.net.ConnectException: connection timed out
com.twitter.finagle.WriteException: java.net.ConnectException: connection timed out
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processConnectTimeout(NioClientSocketPipelineSink.java:387)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:291)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: connection timed out
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processConnectTimeout(NioClientSocketPipelineSink.java:387)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:291)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

Ryan King

Mar 28, 2012, 3:02:28 PM
to twitter...@googlegroups.com
Huge rows are rough. Even though you only ask for 100 items per batch, Cassandra has to read the whole row first. That's probably what's causing the timeouts.
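
If you want to experiment, columnsIteratee should also accept an explicit batch size as its first argument (I'm going from memory here, so double-check the signature against your Cassie version). Reusing the column family from your fragment, a rough, untested sketch:

    // Rough sketch (untested): a smaller batch size means shorter individual
    // requests at the cost of more round trips. Assumes a two-argument
    // columnsIteratee(batchSize, key) overload exists.
    val colsIt = flumeIndexColumnFamily.columnsIteratee(25, "2012022503")
    var count = 0
    val done = colsIt.foreach { column =>
      count += 1
      println("<" + column.name + "> => " + column.value)
    }
    done() // block until the iteration completes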

-ryan

Ryan King

Mar 28, 2012, 3:10:52 PM
to twitter...@googlegroups.com
Actually, I was wrong about that. It's not as bad as I described below.

I think you need to look at what's going on in Cassandra to figure this out. What do the logs say?

-ryan

Tyson Hamilton

Mar 28, 2012, 3:57:15 PM
to twitter...@googlegroups.com
I don't see much happening in Cassandra. In fact, the system.log and service.log aren't showing any entries at all when I execute the code. Perhaps the problem is in the connection to the cluster; my set-up is somewhat convoluted. I'm going to try compiling the code and putting it directly on a cluster node; maybe that will make a difference?

-Tyson

Ryan King

Mar 28, 2012, 4:03:44 PM
to twitter...@googlegroups.com
Yeah, that's possible. Someone over here pointed out that it's possible you haven't successfully connected to the cluster at all.

-ryan

Tyson Hamilton

Mar 28, 2012, 4:11:29 PM
to twitter...@googlegroups.com
No, I've definitely connected; a number of columns are returned before it errors out. I have also tested rowsIteratee successfully against the same cluster.

-Tyson

Tyson Hamilton

Mar 28, 2012, 4:23:14 PM
to twitter...@googlegroups.com
Well, it is definitely my set-up: I put the code onto the cluster itself and it iterated through a total of 263,372 columns, printing each key to stdout, in 28 seconds (sweet!).

I'm curious about this error, though, and will dig deeper to see what is happening. Is there a default timeout set on the Future or in Finagle that I am missing?
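
For reference, this is roughly the kind of knob I'd expect to be able to set when building the keyspace connection; the builder method names and values below are guesses on my part rather than confirmed Cassie API, and "Flume" is just a placeholder keyspace name:

    // Hypothetical sketch only: connectTimeout/requestTimeout/retries are
    // guessed method names (values in milliseconds), not confirmed against
    // the Cassie API, and "Flume" is a placeholder keyspace name.
    val keyspace = cluster.keyspace("Flume")
      .connectTimeout(10000)
      .requestTimeout(10000)
      .retries(3)
      .connect()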

Thanks for your help (again), Ryan.

-Tyson