Thrift Connection and Timeout issues

Fonzy

Oct 12, 2012, 1:40:14 PM
to hyperta...@googlegroups.com
Hi,

I'm running my client application on the same server as Hypertable and I constantly get timeout errors when trying to connect to Hypertable. I'm writing this in C++ and running Ubuntu 12.x (32-bit Intel) with Hypertable 0.9.6.4.

The error I get is "terminate called after throwing an instance of 'apache::thrift::transport::TTransportException' what(): EAGAIN (timed out)"

Has anyone come across this before and does anyone have any ideas on how to fix it?

Thanks for any help or pointers.
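
The "terminate called after throwing an instance of" part of that message typically means the exception was never caught by the client. Below is a minimal sketch of catching it so the process survives a timeout. TransportError is a stand-in so the snippet compiles without Thrift (a real client would catch apache::thrift::transport::TTransportException), and call_with_retry is a hypothetical helper, not part of any Hypertable or Thrift API.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <stdexcept>

// Stand-in for apache::thrift::transport::TTransportException (assumption:
// used here only so the sketch builds without the Thrift headers).
struct TransportError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical helper: runs `call`, retrying up to `max_attempts` times when
// it throws a transport error. Returns true on success, false if every
// attempt failed.
bool call_with_retry(const std::function<void()>& call, int max_attempts) {
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        try {
            call();
            return true;
        } catch (const TransportError& e) {
            // Log and fall through to the next attempt instead of aborting.
            std::cerr << "attempt " << attempt << " failed: " << e.what() << '\n';
        }
    }
    return false;
}
```

This only keeps the process alive. It does not explain why the call timed out, and a real client may need to reopen its connection before retrying.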
Message has been deleted

Christoph Rupp

Oct 15, 2012, 1:20:48 AM
to hyperta...@googlegroups.com
Hi,

Does the ThriftBroker.log report any errors?

I would also check the monitor UI to see if there's a high load when this problem appears, just to rule out that the server is too busy to accept the connection.

bye
Christoph



2012/10/12 Fonzy <joem...@gmail.com>

--
You received this message because you are subscribed to the Google Groups "Hypertable User" group.
To view this discussion on the web visit https://groups.google.com/d/msg/hypertable-user/-/BuxKPTt3fU8J.
To post to this group, send email to hyperta...@googlegroups.com.
To unsubscribe from this group, send email to hypertable-us...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/hypertable-user?hl=en.

Fonzy

Oct 15, 2012, 1:53:13 PM
to hyperta...@googlegroups.com, ch...@hypertable.com
Christoph,

If I restart Hypertable it works for a bit, then it goes back to throwing timeouts. The error in ThriftBroker.log is:
Error ThriftBroker: get_mutator (/root/src/hypertable/src/cc/ThriftBroker/ThriftBroker.cc:2322): Bad Mutator id - 0

Does this ring a bell?

By the way, I'm using hypertable 0.9.6.4

Thanks

Christoph Rupp

Oct 15, 2012, 1:56:06 PM
to hyperta...@googlegroups.com
That means there is a problem in your application: it is using an invalid mutator handle.

You can enable Thrift API logging to get more information in your log file:

ThriftBroker.API.Logging = true

bye
Christoph


Fonzy

Oct 15, 2012, 5:14:53 PM
to hyperta...@googlegroups.com, ch...@hypertable.com
Christoph,

I turned on the logging and here is a log from a working run:
1350330910.502859000 API namespace_exists: API namespace_exists: namespace=l3 exists=1 latency=0
1350330910.503992000 API namespace_open: API namespace_open: namespace name=l3 id=8 latency=0
1350330910.504118000 API table_drop: API table_drop: namespace=8 table=recorder if_exists=1 latency=695
1350330911.199898000 API table_create: API table_create: namespace=8 table=recorder schema=<schema>my table stuff</schema> latency=920
1350330912.119835000 API mutator_open: API mutator_open: namespace=8table=recorder flags=0 flush_interval=0 async_mutator=149461888 latency=1
1350330912 ERROR ThriftBroker : TSocket::peek() recv() <Host: 127.0.0.1 Port: 56717>Connection reset by peer
1350330912 ERROR ThriftBroker : TThreadedServer client died: recv(): Connection reset by peer

I then stopped the program and ran it again and it failed.  Here is the log for that instance:
1350331015.109138000 API namespace_exists: API namespace_exists: namespace=l3 exists=1 latency=2555
1350331017 ERROR ThriftBroker : get_mutator (/root/src/hypertable/src/cc/ThriftBroker/ThriftBroker.cc:2322): Bad mutator id - 0
1350331023 ERROR ThriftBroker : TSocket::peek() recv() <Host: 127.0.0.1 Port: 56718>Connection reset by peer
1350331023 ERROR ThriftBroker : TThreadedServer client died: recv(): Connection reset by peer


I'm not asking for a mutator, yet it is throwing a bad mutator exception.

Thanks

Christoph Rupp

Oct 16, 2012, 12:56:24 AM
to Fonzy, hyperta...@googlegroups.com
So here's definitely a problem:


1350331017 ERROR ThriftBroker : get_mutator (/root/src/hypertable/src/cc/ThriftBroker/ThriftBroker.cc:2322): Bad mutator id - 0

One of the mutator functions called by the Thrift client is being passed a bad mutator handle. Can you add some client-side logging to narrow it down?

Basically, that could be any mutator function, such as shared_mutator_refresh, *mutator_close, *mutator_set_cell, *mutator_set_cells_serialized, etc.
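
One way to add that client-side logging is a small scope guard placed around each Thrift call, so its entry and exit lines can be matched against the timestamps in ThriftBroker.log. The sketch below is only an illustration: CallTrace is a hypothetical class, not part of the Hypertable or Thrift client libraries.

```cpp
#include <cassert>
#include <chrono>
#include <iostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical scope guard: logs when a named call starts and, on scope
// exit (normal return or exception unwinding), how long it took.
class CallTrace {
public:
    explicit CallTrace(std::string name, std::ostream& out = std::cerr)
        : name_(std::move(name)),
          out_(out),
          start_(std::chrono::steady_clock::now()) {
        out_ << "-> " << name_ << std::endl;
    }

    ~CallTrace() {
        using namespace std::chrono;
        const auto ms =
            duration_cast<milliseconds>(steady_clock::now() - start_).count();
        out_ << "<- " << name_ << " (" << ms << " ms)" << std::endl;
    }

    CallTrace(const CallTrace&) = delete;
    CallTrace& operator=(const CallTrace&) = delete;

private:
    std::string name_;
    std::ostream& out_;
    std::chrono::steady_clock::time_point start_;
};
```

Usage would look like `{ CallTrace t("mutator_open"); /* the Thrift call goes here */ }`, with one guard per call. The last "->" line without a matching "<-" line is the call that never returned.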


Fonzy

Oct 16, 2012, 10:16:54 AM
to hyperta...@googlegroups.com, Fonzy, ch...@hypertable.com
Christoph,

My code at this point is very simple, so it is easy to see where it is failing. I put some cout<< lines in to trace where it is stopping. Here is my code:

client = new Thrift::Client(host,(int)port,1000,true);

cout<<"Checking if namespace exists"<<endl;
if (client->namespace_exists(NAMESPACE)==false)
{
    cout<<"Creating namespace"<<endl;
    client->namespace_create(NAMESPACE);
}

cout<<"Opening the name space"<<endl;
namespaceId = client->namespace_open(NAMESPACE);

cout<<"Dropping table"<<endl;
client->drop_table(namespaceId,TABLENAME,true);
cout<<"Create table"<<endl;
client->create_table(namespaceId,TABLENAME,TABLECREATE);

cout<<"Open mutator"<<endl;
mutator = client->mutator_open(namespaceId,TABLENAME,0,0);
cout<<"Done opening mutator"<<endl;


The first time I ran it, it failed on the if statement that checks whether the namespace exists:
1350396252 ERROR ThriftBroker : get_mutator (/root/src/hypertable/src/cc/ThriftBroker/ThriftBroker.cc:2322): Bad mutator id - 0
1350396259 ERROR ThriftBroker : TSocket::peek() recv() <Host: 127.0.0.1 Port: 56884>Connection reset by peer
1350396259 ERROR ThriftBroker : TThreadedServer client died: recv(): Connection reset by peer

I ran it a second time and it failed creating the table.
1350396307 ERROR ThriftBroker : TSocket::write_partial() send() <Host: 127.0.0.1 Port: 56885>Connection reset by peer
1350396307 ERROR ThriftBroker : TThreadedServer client died: write() send(): Connection reset by peer

As you can see, where it fails is quite random. I checked my code and I never call a function named get_mutator.

Thanks

Christoph Rupp

Oct 16, 2012, 10:46:23 AM
to hyperta...@googlegroups.com, Fonzy
get_mutator is an internal function. It's used in mutator_flush, mutator_set_cells_serialized and a couple of other functions.

Internally (in the ThriftBroker), there's a std::map<uint64_t, TableMutatorAsync *>. Whenever you use a mutator handle, the ThriftBroker uses this map to look up the TableMutatorAsync object which is associated with this handle. And in your case an invalid handle is used.
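
To make that lookup concrete, here is a rough sketch of a handle registry of the kind described. It is not the ThriftBroker source: MutatorRegistry and FakeMutator are hypothetical names, and FakeMutator merely stands in for TableMutatorAsync.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Stand-in for TableMutatorAsync (assumption: the real class is not needed
// to illustrate the handle lookup).
struct FakeMutator {
    std::string table;
};

// Hypothetical registry mapping client-visible handles to mutator objects.
class MutatorRegistry {
public:
    // Creates a mutator and returns the handle a client would be given.
    std::uint64_t open(const std::string& table) {
        const std::uint64_t id = next_id_++;
        mutators_[id] = std::make_shared<FakeMutator>(FakeMutator{table});
        return id;
    }

    // Resolves a handle; a handle that was never issued, or that has been
    // closed, is rejected.
    std::shared_ptr<FakeMutator> get(std::uint64_t id) const {
        const auto it = mutators_.find(id);
        if (it == mutators_.end())
            throw std::invalid_argument("Bad mutator id - " + std::to_string(id));
        return it->second;
    }

    // Forgets a handle; later lookups with it will fail.
    void close(std::uint64_t id) { mutators_.erase(id); }

private:
    std::uint64_t next_id_ = 1;
    std::map<std::uint64_t, std::shared_ptr<FakeMutator>> mutators_;
};
```

In this sketch a lookup fails both for a handle that was never handed out (such as 0) and for one that was valid earlier but has since been removed from the map.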

The other error messages that you're seeing ("write partial", "connection reset by peer") most likely come from the fact that the client terminates without closing its ThriftBroker connection. This is not a problem.

If you single-step through your client, and in another terminal window run "tail -f /opt/hypertable/current/log/ThriftBroker.log", can you pinpoint exactly which line in the client generates the "Bad mutator id - 0" error?

You could also build a debug version of the ThriftBroker and set a breakpoint where the error message is printed.

bye
Christoph



Fonzy

Oct 16, 2012, 1:28:50 PM
to hyperta...@googlegroups.com
Christoph,

I did as you asked and this time it happened when I tried to drop the table. This happens very randomly, mostly at startup, but I also see it happen when I am writing data to the database. I captured a log while writing data to the database at a decent speed, and it ran for about 10 to 20 seconds before I got the timeout. Here is the tail of the output:

1350398926.804540000 API async_mutator_set_cells_serialized: API async_mutator_set_cells_serialized: mutator=170083264 cells.size=271 latency=0
1350398926.804683000 API async_mutator_set_cells_serialized: API async_mutator_set_cells_serialized: mutator=170083264 cells.size=271 latency=0
1350398926.804915000 API async_mutator_set_cells_serialized: API async_mutator_set_cells_serialized: mutator=170083264 cells.size=271 latency=0
1350398926.805110000 API async_mutator_set_cells_serialized: API async_mutator_set_cells_serialized: mutator=170083264 cells.size=271 latency=0
1350398926.805334000 API async_mutator_set_cells_serialized: API async_mutator_set_cells_serialized: mutator=170083264 cells.size=271 latency=0
1350398926.805562000 API async_mutator_set_cells_serialized: API async_mutator_set_cells_serialized: mutator=170083264 cells.size=271 latency=0
1350398926.805897000 API async_mutator_set_cells_serialized: API async_mutator_set_cells_serialized: mutator=170083264 cells.size=271 latency=5598
1350398932 ERROR ThriftBroker : get_mutator (/root/src/hypertable/src/cc/ThriftBroker/ThriftBroker.cc:2322): Bad mutator id - 170083264
1350398950 ERROR ThriftBroker : TSocket::peek() recv() <Host: 127.0.0.1 Port: 56930>Connection reset by peer
1350398950 ERROR ThriftBroker : TThreadedServer client died: recv(): Connection reset by peer


The connection reset by peer is because the app crashes from the Thrift error.

Thanks

Christoph Rupp

Oct 17, 2012, 9:37:53 AM
to hyperta...@googlegroups.com
Hi Fonzy,

What I would do is set a breakpoint in the ThriftBroker. I think that's the best way to figure out what is going on. It's part of the Hypertable sources, and you would have to build it from scratch. If you want, I can give you instructions on how to build it.

bye
Christoph


Fonzy

Oct 17, 2012, 11:26:31 AM
to hyperta...@googlegroups.com, ch...@hypertable.com
Christoph,

I don't think I have that much time I can devote to this. By the way, this never happens when I try this in Java.  When I compiled my C++ app I had to grab the Thrift library from the thrift sources. Could I be using the wrong library?

If I can't get the ThriftBroker to use less CPU or work reliably, I'm thinking about taking a different approach. We are running on a low-power device with a lot of incoming data. The ThriftBroker is using a lot of CPU time in my tests (when it works), so I'm wondering if it would be more efficient to bypass the Thrift interface. If I compile my app with the Hypertable libraries directly, will other users be able to access the data, or do I effectively kill that feature? I would like other, less data-intensive applications to have access through the Thrift interface. Would that be possible? Are there code examples or documentation for running Hypertable embedded within one's application?

Thanks

Christoph Rupp

Oct 18, 2012, 2:20:25 AM
to hyperta...@googlegroups.com
It doesn't sound as if you are using the wrong Thrift library, but maybe you can double-check that you use Thrift 0.8.0 (or use the one from /opt/hypertable/lib).

If you want to use the native C++ library, then everything will work as expected. Others can read your data, with and without Thrift. Here's a code sample:

https://github.com/cruppstahl/hypertable/blob/master/examples/apache_log/apache_log_load.cc

Just go through main(), where it sets up a Client object, opens the Namespace and the Table, then creates a Mutator. With Mutator::set you can store a key in the mutator (it buffers the keys until a certain threshold is reached), and in the end the mutator is flushed.
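
The buffer-then-flush behaviour described above can be pictured with a toy model. This is not the Hypertable API: BufferingMutator and Cell are hypothetical, the threshold here counts cells (an assumption made for simplicity), and "storing" just means appending to a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical cell type for the sketch.
struct Cell {
    std::string row;
    std::string column;
    std::string value;
};

// Toy mutator: buffers cells and writes them to `sink` once `threshold`
// cells are pending, or when flush() is called explicitly.
class BufferingMutator {
public:
    BufferingMutator(std::size_t threshold, std::vector<Cell>& sink)
        : threshold_(threshold), sink_(sink) {}

    // Buffers one cell; triggers a flush when the threshold is reached.
    void set(const Cell& cell) {
        buffer_.push_back(cell);
        if (buffer_.size() >= threshold_) flush();
    }

    // Writes everything still buffered and empties the buffer.
    void flush() {
        sink_.insert(sink_.end(), buffer_.begin(), buffer_.end());
        buffer_.clear();
    }

    // Number of cells buffered but not yet written.
    std::size_t pending() const { return buffer_.size(); }

private:
    std::size_t threshold_;
    std::vector<Cell>& sink_;
    std::vector<Cell> buffer_;
};
```

The point of the model is that cells below the threshold stay in the buffer and are not visible in the store until flush() runs, which is why the final flush matters.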

bye
Christoph


Fonzy

Oct 19, 2012, 10:54:14 AM
to hyperta...@googlegroups.com, ch...@hypertable.com
Christoph,

Which is more efficient, a tablePtr->create_mutator_sync or tablePtr->create_mutator_async?

Thanks

Christoph Rupp

Oct 19, 2012, 11:20:44 AM
to hyperta...@googlegroups.com
It depends on your application. The async mutator allows your application to run while the mutator flushes the keys to the RangeServers, and then it will report the result through a callback object. If your application can be structured in a way that it can continue running while the mutator is flushing in the background, then the async mutator will lead to faster results. But it comes at the cost of additional complexity when developing and debugging.

I personally would start with a synchronous mutator, and switch to an asynchronous one only if I discover that it is too slow.
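
To show the shape of the callback style, here is a deliberately simplified sketch. It is not the Hypertable API: AsyncMutatorSketch and FlushCallback are hypothetical, and the background work is simulated by an explicit complete_pending() call so that no threads are involved.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical callback type: receives the number of cells in a finished flush.
using FlushCallback = std::function<void(std::size_t)>;

class AsyncMutatorSketch {
public:
    explicit AsyncMutatorSketch(FlushCallback on_done)
        : on_done_(std::move(on_done)) {}

    // Buffers a cell locally; nothing is sent yet.
    void set(std::string cell) { buffer_.push_back(std::move(cell)); }

    // Hands the buffer off and returns at once, so the caller keeps running.
    void flush() {
        if (buffer_.empty()) return;
        in_flight_.push_back(buffer_.size());
        buffer_.clear();
    }

    // Stands in for the background work finishing: reports every
    // outstanding flush through the callback.
    void complete_pending() {
        std::vector<std::size_t> done;
        done.swap(in_flight_);
        for (std::size_t n : done) on_done_(n);
    }

private:
    FlushCallback on_done_;
    std::vector<std::string> buffer_;
    std::vector<std::size_t> in_flight_;
};
```

The extra complexity is visible even in the toy version: after flush() returns, the caller does not yet know the outcome and has to handle it later, inside the callback.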

bye
Christoph

Message has been deleted
Message has been deleted

Christoph Rupp

Nov 22, 2012, 12:52:45 AM
to hyperta...@googlegroups.com
Hi,

Are you flushing the mutator? It buffers the data; if you have not inserted many keys, it's possible that the data has not yet been sent to the RangeServers.

bye
Christoph

2012/11/21 Fonzy <joem...@gmail.com>
Christoph,

Okay, I spoke too soon. I created a class derived from the callback and passed the pointer to the table->create_mutator_async method, like so:

MyCallback *callback = new MyCallback();
mutator = table->create_mutator_async(callback,2000,0);

I only create the mutator once and keep it around and call
mutator->set(key,data,size);
when I need to send data to the database, which will be very frequent and very fast. But nothing gets written to the database. Any ideas?

Thanks


On Wednesday, November 21, 2012 1:42:09 PM UTC-5, Fonzy wrote:
Christoph,

Can you point me to an example of how to use the async_mutator? I don't know how to use the callback.

Thanks,