thread pool, some questions


rogerdpack

Sep 19, 2008, 3:54:07 PM
to neverblock
I think for sure that neverblock [i.e. fibered processing] will be
faster than the typical mongrel "one thread per connection" model
[esp. if you have many, many connections].

Rails itself doesn't seem to realize this. With rails 2.2 it appears
that there will still be tons of threads, they will just block waiting
for a member of the thread pool to become available, then process the
request [they grab a thread pool member and hang onto it for the
duration of the request, then release it after the request is
processed].

This makes me wonder if a thread pool would be "a good thing" for
rails. Since it would be a pool it wouldn't have the thread creation
overhead, and it wouldn't have all the extra threads sleeping and
slowing things down. But in reality fibers should still be slightly
faster than a thread pool, no matter what. Fibers would be intensely
useful if you needed like 500 concurrent requests all accessing the DB
at the same time, but in reality I think people only would really
'want' about 20 concurrent DB requests, so...a thread pool of 20 might
be about as fast.
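The thread pool idea above can be sketched in a few lines of plain Ruby (SimplePool is an invented name, not part of Rails or neverblock): workers are created once up front and reused, so the per-request thread creation overhead disappears.

```ruby
require "thread"

# Hypothetical sketch of a fixed-size thread pool (names are mine, not
# Rails' or neverblock's): N workers are created once and then reused,
# so there is no per-request thread creation overhead.
class SimplePool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # each worker loops until it pops the nil sentinel
        while (job = @jobs.pop)
          job.call
        end
      end
    end
  end

  # enqueue work; any idle worker will pick it up
  def schedule(&block)
    @jobs << block
  end

  # push one sentinel per worker, then wait for them all to exit
  def shutdown
    @workers.size.times { @jobs << nil }
    @workers.each(&:join)
  end
end
```

With size 20 this matches the "20 concurrent DB requests" figure above: the 21st query just queues until a worker frees up, instead of spawning a 21st thread.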

Also do we want to create a plugin which is 'mysqlplus+async instead
of mysql' for people to use with rails [like a drop in replacement--
all you do is require one file and you're good to go] or just require
them to use neverblock?

Also do we want to rename the .so file so that it doesn't conflict
with previously installed mysql gems?

Anyway just thinking out loud.
Go NB :)
-=R

Aman Gupta

Sep 19, 2008, 4:45:47 PM
to never...@googlegroups.com
> This makes me wonder if a thread pool would be "a good thing" for
> rails.  Since it would be a pool it wouldn't have the thread creation
> overhead, and it wouldn't have all the extra threads sleeping and
> slowing things down.  But in reality fibers should still be slightly
> faster than a thread pool, no matter what.  Fibers would be intensely
> useful if you needed like 500 concurrent requests all accessing the DB
> at the same time, but in reality I think people only would really
> 'want' about 20 concurrent DB requests, so...a thread pool of 20 might
> be about as fast.

The ruby thread scheduler uses setitimer to have the kernel send it a SIGVTALRM every 10,000 microseconds. Under high load this is highly undesirable, as ruby is constantly getting interrupted, increasing latency and reducing throughput. This timer is set up as soon as the first ruby thread is created, and never removed.

The approach I've found to work best with MRI is to patch ruby to remove the setitimer when no threads exist (http://www.ruby-forum.com/topic/164574), and to create threads only when required. This way the extra cost of the setitimer is only incurred when a thread is actually processing, instead of constantly, just because there is a thread pool full of sleeping threads.
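The "create threads only when required" half of this can be sketched in plain Ruby (the setitimer patch itself is C-level; LazyPool is an invented name): workers are spawned only when a job actually arrives, and they retire as soon as the queue drains, so an idle process holds no extra threads at all.

```ruby
require "thread"

# Hypothetical sketch: spawn workers only when jobs exist, and let them
# exit once the queue drains, so an idle process has zero extra threads
# (and, with the setitimer patch, no 10ms timer firing while idle).
class LazyPool
  def initialize(max)
    @max  = max
    @jobs = Queue.new
    @lock = Mutex.new
    @live = 0          # currently running workers
  end

  def schedule(&job)
    @lock.synchronize do
      @jobs << job
      if @live < @max
        @live += 1
        spawn_worker
      end
    end
  end

  # block until all jobs are done and every worker has retired
  def quiesce
    sleep 0.005 until @lock.synchronize { @jobs.empty? && @live.zero? }
  end

  def live_workers
    @lock.synchronize { @live }
  end

  private

  def spawn_worker
    Thread.new do
      loop do
        job = nil
        @lock.synchronize do
          begin
            job = @jobs.pop(true)   # non-blocking pop
          rescue ThreadError        # queue empty: retire this worker
            @live -= 1
          end
        end
        break unless job
        job.call
      end
    end
  end
end
```

Retiring under the same lock that guards scheduling avoids the race where a job is queued just as the last worker decides to exit.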
 
> Also do we want to rename the .so file so that it doesn't conflict
> with previously installed mysql gems?

Yea, probably a good idea.

  Aman

Roger Pack

Sep 19, 2008, 4:56:02 PM
to never...@googlegroups.com
>> Also do we want to rename the .so file so that it doesn't conflict
>>
>> with previously installed mysql gems?
>
> Yea, probably a good idea.
> Aman

Interestingly, renaming it causes a subtle problem. If I

require 'mysqlplus'
require 'thin_attributes'

then thin_attributes itself runs

require 'mysql'

which effectively overrides mysqlplus :)
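That override falls out of how require tracks loaded features: once a feature name is in $LOADED_FEATURES, later requires of the same name are no-ops, so whichever library claims the name first wins. A small self-contained illustration (the stand-in file and the LOADED_BY constant are invented for this demo):

```ruby
require "tmpdir"

# Demo of require's load-once behavior: the first library to register a
# feature name wins, and later require calls for that name are no-ops.
# The stand-in file and LOADED_BY constant are invented for this demo.
first = second = nil

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "mysql.rb"), "LOADED_BY = 'mysqlplus stand-in'")
  $LOAD_PATH.unshift(dir)

  first  = require "mysql"   # loads our stand-in, returns true
  second = require "mysql"   # already loaded: returns false, runs nothing
end
```

This is also why shipping the extension under the same feature name makes it a drop-in: whatever require 'mysql' other libraries do afterwards becomes the no-op.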

I'd assume that the pain is worth it, though, so that we don't get
confused as to which one we're using.
Thoughts?
-=R

Aman Gupta

Sep 19, 2008, 5:24:49 PM
to never...@googlegroups.com
On Fri, Sep 19, 2008 at 1:56 PM, Roger Pack <roger...@leadmediapartners.com> wrote:

>> Also do we want to rename the .so file so that it doesn't conflict
>>
>> with previously installed mysql gems?
>
> Yea, probably a good idea.
>   Aman

> Interestingly, renaming it causes something interesting
> if I
> require 'mysqlplus'
> require 'thin_attributes'
>
> thin attributes then runs
> require 'mysql'

Actually, given this behavior I think we're better off keeping it the same name. That way it plugs in as a monkeypatch.. ruby will ignore any require 'mysql' done by any other libraries (AR, Sequel, etc) and use the mysqlplus version instead.

> within itself, which effectively overrides mysqlplus :)
>
> I'd assume that the pain is worth it, though, so that we don't get
> confused as to which one we're using.
> Thoughts?

I vote for leaving it the way it is right now (using mysql.so)

  Aman
 

-=R



Muhammad A. Ali

Sep 19, 2008, 6:06:58 PM
to never...@googlegroups.com
> Also do we want to create a plugin which is 'mysqlplus+async instead
> of mysql' for people to use with rails [like a drop in replacement--
> all you do is require one file and you're good to go] or just require
> them to use neverblock?

They just need to install the mysqlplus gem (I need to move that to RubyForge). That's assuming AR provides a thread pool for the mysql adapter that uses the async_query method.
 



Muhammad A. Ali

Sep 19, 2008, 6:07:53 PM
to never...@googlegroups.com

> They need to just install the mysqlplus gem (I need to move that to RubyForge). That's if AR provides a thread pool for the mysql adapter that uses the async_query method

I meant connection pool rather than thread pool.
 




Lourens Naude

Sep 19, 2008, 6:54:45 PM
to never...@googlegroups.com
Roger,

Played with the thread pool in Edge and mysqlplus for a bit earlier
today.

The difference with current head being :

http://gist.github.com/11429

Found the following to trip up the query sequence checks :

- SET *
- BEGIN
- ROLLBACK

etc.

Another way to guard against this ( exclusively via #async_query ) is
to clear any previous results before
firing #send_query, scheduling and #get_result ... like the Postgres
client does :

http://gist.github.com/11678

When using #async_query with a threaded connection pool, that
shouldn't be an issue - one's more likely to mess up the sequence
with the evented model.

Included the following as an initializer in /config :

http://gist.github.com/11681

Thoughts ?

Muhammad A. Ali

Sep 19, 2008, 7:45:23 PM
to never...@googlegroups.com
Connection pooling in Rails will assign a connection to the requesting thread and check it back in after the request is finished. This means that there should be no syncing issues at all, even with commands like set, begin and commit. Unless of course you manage the checkin/checkout manually, in which case you have to handle query syncing.

The NeverBlock connection pool otoh will attempt to retrieve a new connection on each db request. It will not stick the connection to the fiber for all its lifetime. Special support is added for transaction handling, but not for "set" commands or connection methods that modify the connection status.

I was wondering if this fine grained model is better than just slamming a connection on the fiber till it is done. Set and the like can still be wrapped in a request wide transaction (unless you are using those in filters). Or maybe we can provide a transaction-wrapping around filter for those actions that need "set".
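The fine-grained model with transaction pinning can be shown in a toy sketch (FiberPool and the Array-based fake connections are invented for illustration): each statement grabs any free connection, except inside a transaction, where the fiber keeps one pinned so BEGIN, SET and COMMIT all hit the same session.

```ruby
# Toy sketch of NeverBlock-style fine-grained checkout with transaction
# pinning (FiberPool and the Array-based fake connections are invented
# for illustration). Fake connections just record the SQL they receive.
class FiberPool
  def initialize(connections)
    @free   = connections.dup
    @pinned = {}            # fiber => pinned connection
  end

  def execute(sql)
    fiber = Fiber.current
    if (conn = @pinned[fiber])
      conn << sql           # inside a transaction: reuse the pinned conn
    else
      conn = @free.shift
      raise "pool exhausted" unless conn
      begin
        conn << sql
      ensure
        @free << conn       # fine-grained: return it immediately
      end
    end
  end

  def transaction
    fiber = Fiber.current
    conn = @free.shift
    raise "pool exhausted" unless conn
    @pinned[fiber] = conn
    conn << "BEGIN"
    yield
    conn << "COMMIT"
  rescue => e
    conn << "ROLLBACK" if conn
    raise e
  ensure
    @pinned.delete(fiber)
    @free << conn if conn
  end
end
```

The pinning is what keeps a SET issued inside the transaction on the same session as the BEGIN; outside a transaction, successive statements may land on different connections.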

ideas? :)

Roger Pack

Sep 20, 2008, 3:33:51 PM
to never...@googlegroups.com
> Another way to guard against this ( exclusively via #async_query ) is
> to clear any previous results before
> firing #send_query, scheduling and #get_result ... like the Postgres
> client does :
>
> http://gist.github.com/11678

Yeah--I like our way [?] where we raise if there's already a query in
progress. If we do raise, that is :P


> When using #async_query with a threaded connection pool, that
> shouldn't be an issue - one's more likely to mess up the sequence
> with the evented model.
>
> Included the following as an initializer in /config :
>
> http://gist.github.com/11681

Looks good. I guess we have two options--either do as NeverBlock does
[basically inspect incoming queries, check if they're SET *'s, and if
they are, pin the connection to the fiber] or piggyback on the
existing Rails pool system, which checks out a single connection per
request and then checks it back in.
There might be some nicety in just following rails' [somewhat weird]
pool system--if only because it would merge more easily with what rails
does, so it might be friendlier to the community.
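The Rails-style option is even simpler to sketch (RequestPool is an invented name): one checkout pinned to the current thread for the whole request, checked back in afterwards, so every statement in the request sees the same session.

```ruby
require "thread"

# Toy sketch of the Rails 2.2-style model (RequestPool is an invented
# name): a connection is checked out once per request, pinned to the
# current thread for the request's whole duration, then checked back in.
# Fake connections are Arrays that record the SQL they receive.
class RequestPool
  def initialize(connections)
    @free = Queue.new
    connections.each { |c| @free << c }
  end

  # everything inside the block -- SET, BEGIN, queries, COMMIT --
  # goes through the one pinned connection
  def with_request
    conn = @free.pop                  # blocks if the pool is exhausted
    Thread.current.thread_variable_set(:db_connection, conn)
    yield
  ensure
    Thread.current.thread_variable_set(:db_connection, nil)
    @free << conn
  end

  def execute(sql)
    conn = Thread.current.thread_variable_get(:db_connection)
    raise "no checked-out connection" unless conn
    conn << sql
  end
end
```

The trade-off against the fine-grained model above: no sequencing worries at all, but a connection sits idle for the whole request even when most of the request isn't touching the DB.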

Thoughts?
-=R

Lourens Naude

Sep 21, 2008, 12:04:32 AM
to never...@googlegroups.com

Another diff, with cleaner async order tracking :

http://gist.github.com/11834

Also exposes the following :

Mysql#async_in_progress #compares to the current connection identifier

&&

Mysql#async_in_progress = ( true | false | nil )

Muhammad mentioned that MySQL 6 would feature a hybrid threaded and
evented client, thus #connection_identifier is extracted to handle any
logic required for that use case - currently it uses #mysql_thread_id

It may also be useful to expose that as Mysql#connection_identifier
( same as Mysql#thread_id currently, but perhaps a bit more versatile )

When handling cases such as SET, one can then still play well with the
expected send, get_result order eg.

connection.send_query( "SET something" )
connection.async_in_progress = false

or maybe even sugar that use case :

connection.send_query!( "SET something" )

Not sure if it's safe, or even sane, to commit at present - it WILL
blow up for any successive #send_query without a #get_result, which
shouldn't affect neverblock or em-mysql at present.

Mysql#c_async_query clears Mysql#async_in_progress on consecutive
calls and plays well with typical ActiveRecord use.

Thoughts ?

Lourens Naude

Sep 21, 2008, 12:09:07 AM
to never...@googlegroups.com
Also of note,

As per http://faemalia.net/mysqlUtils/mysql-internals.pdf

"Avoid using malloc(), which is very slow. For memory allocations that
only need to live
for the lifetime of one thread, use sql_alloc() instead. "

Not sure about libmysqld support or how that affects Ruby's GC
requirements, but may be most useful for resultset retrieval ...

Thoughts ?


Roger Pack

Sep 21, 2008, 1:20:40 AM
to never...@googlegroups.com
> connection.send_query( "SET something" )
> connection.async_in_progress = false

What does this do, exactly? Does this somehow bind the
connection to the thread, then?

> Not sure if it's safe, or even sane, to commit at present - it WILL
> blow up for any successive #send_query without a #get_result, which
> shouldn't
> affect neverblock or em-mysql @ present.

I'm definitely in favor of raising if people do successive
send_queries. Make them pay :)

re: non-malloc use: it appears we mostly use rb_hash_new() and such,
though I do notice that slim-attributes uses malloc. I guess in Ruby
land it's tough to tell if your memory will need to span multiple
threads or not, since it could be stored away and re-used later.
Not sure.
-=R

Lourens Naude

Sep 21, 2008, 9:49:55 AM
to never...@googlegroups.com

On 2008/09/21, at 06:20, Roger Pack wrote:

>
>> connection.send_query( "SET something" )
>> connection.async_in_progress = false
>
> What does this do exactly, then? Does this somehow bind the
> connection to the thread then?
>

Just sets mysql_struct->async_in_progress to 0 ... so the next call to
#send_query won't fail the async sequence check.

Roger Pack

Oct 9, 2008, 11:41:28 AM
to neverblock
> Also do we want to rename the .so file so that it doesn't conflict
> with previously installed mysql gems?

Yeah, I guess I kind of like it named the same as the mysql gem,
since it's more of a drop-in replacement. Which is nice.

I think I may still add the all_pseudo_hashes method just so that
users have the flexibility to tune their DB "to their pleasure" if
they want to try and eke out more speed. It would be convenient.

-=R
