We have a replica set, which performs well until a node (master or
slave) is disconnected. If that node remains in the replica set
configuration, data access to mongo becomes painfully slow.
Has anybody else encountered this issue? If yes, what is the recommended
resolution?
Thank you,
Brian
> rs.status()
{
"set" : "tcsf",
"date" : "Mon Mar 07 2011 21:50:50 GMT+0000 (UTC)",
"myState" : 1,
"members" : [
{
"_id" : 2,
"name" : "...",
"health" : 1,
"state" : 1,
"self" : true
},
{
"_id" : 3,
"name" : "...",
"health" : 0,
"state" : 2,
"uptime" : 0,
"lastHeartbeat" : "Mon Mar 07 2011 21:50:47 GMT+0000 (UTC)",
"errmsg" : "connect/transport error"
}
],
"ok" : 1
}
I was seeing ~5 seconds per query, versus < 100 ms normally.
I tried to reproduce the issue and I'm getting a different error:
Read error: #<Mongo::ConnectionFailure: Failed to connect any given host:port>
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/mongo-1.2.0/lib/../lib/mongo/repl_set_connection.rb:121:in `connect'
/data/honk/releases/20110301211724/vendor/gems/mongo-reconnect-0.0.1/lib/mongo_reconnect.rb:9:in `call'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/rails-2.3.10/lib/rails/rack/static.rb:31:in `call'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/rack-1.1.0/lib/rack/urlmap.rb:47:in `call'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/rack-1.1.0/lib/rack/urlmap.rb:41:in `each'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/rack-1.1.0/lib/rack/urlmap.rb:41:in `call'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/rails-2.3.10/lib/rails/rack/log_tailer.rb:17:in `call'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/lib/unicorn/http_server.rb:519:in `process_client'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/lib/unicorn/http_server.rb:594:in `worker_loop'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/lib/unicorn/http_server.rb:592:in `each'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/lib/unicorn/http_server.rb:592:in `worker_loop'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/honkster-newrelic_rpm-2.13.1/lib/new_relic/control/../agent/instrumentation/unicorn_instrumentation.rb:7:in `call'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/honkster-newrelic_rpm-2.13.1/lib/new_relic/control/../agent/instrumentation/unicorn_instrumentation.rb:7:in `worker_loop'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/lib/unicorn/http_server.rb:482:in `spawn_missing_workers'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/lib/unicorn/http_server.rb:479:in `fork'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/lib/unicorn/http_server.rb:479:in `spawn_missing_workers'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/lib/unicorn/http_server.rb:475:in `each'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/lib/unicorn/http_server.rb:475:in `spawn_missing_workers'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/lib/unicorn/http_server.rb:489:in `maintain_worker_count'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/lib/unicorn/http_server.rb:163:in `start'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/lib/unicorn.rb:13:in `run'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/gems/unicorn-3.0.0/bin/unicorn_rails:208
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/bin/unicorn_rails:19:in `load'
/usr/local/rvm/gems/ree-1.8.7-2010.02@honk/bin/unicorn_rails:19
It seems like automatic failover is not working at all now :-(
I didn't see any errors when access was slow.
MongoDB v1.6.5
Ruby mongo driver v1.2.0
Thanks,
Brian
Quick question: what are you expecting the driver to do when the
replica set fails over? Do you expect no ConnectionFailure exceptions
at all? The driver doesn't provide that functionality.
See these docs:
http://api.mongodb.org/ruby/current/file.REPLICA_SETS.html
Also, I have an FAQ entry in the docs that explains the reasoning behind this:
http://api.mongodb.org/ruby/current/file.FAQ.html#I_periodically_see_connection_failures_between_the_driver_and_MongoDB._Why_can't_the_driver_retry_the_operation_automatically_
I _think_ the issue here is that we have different expectations for
how the driver is supposed to react. All the driver does is attempt to
connect on the subsequent request using everything it knows about the
replica set.
Kyle
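For readers following along, here is a minimal sketch of the application-side
retry that the reply above implies: the driver raises, and the caller decides
whether to try again. The helper name with_connection_retry, its parameters,
and the StandInConnectionFailure class are assumptions made for illustration
so the snippet runs without the gem; with the 1.2.0 driver from this thread
you would pass Mongo::ConnectionFailure (the class named in the stack trace)
as error_class.

```ruby
# Hypothetical helper, not part of the mongo gem.
# Runs the block and retries it when error_class is raised, up to
# max_retries times, sleeping `delay` seconds between attempts.
def with_connection_retry(error_class, max_retries = 5, delay = 0.5)
  attempts = 0
  begin
    yield
  rescue error_class
    attempts += 1
    raise if attempts > max_retries  # give up: re-raise the last failure
    sleep(delay)
    retry                            # re-run the begin block
  end
end

# Stand-in for Mongo::ConnectionFailure so the demo is self-contained.
class StandInConnectionFailure < StandardError; end

# Demo: the "query" fails twice (as it might during an election),
# then succeeds on the third attempt.
calls = 0
result = with_connection_retry(StandInConnectionFailure, 5, 0) do
  calls += 1
  raise StandInConnectionFailure, "no primary yet" if calls < 3
  :ok
end
```

Retrying blindly is only safe for operations that can be repeated without
harm, which is part of why the driver leaves this decision to the caller.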
It would be nice if there were a supported way to override the default
behavior, which is to raise a ConnectionFailure when any node fails.
More on this:
http://groups.google.com/group/mongodb-user/msg/7038a88b0f005413
I'll do the monkey patch for now, but I bet there can be a
configuration method that takes a lambda (or block), to handle the
connection error logic.
Would such a patch be welcome?
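To make the proposal concrete, here is a rough sketch of what a block-based
failure handler could look like. Everything in it is hypothetical: the class
ConnectionFailurePolicy, its run method, and the handler's (exception,
attempt) arguments are invented for illustration and are not an existing
driver API. The stand-in error class takes the place of
Mongo::ConnectionFailure so the snippet runs on its own.

```ruby
# Hypothetical sketch, not part of the mongo gem.
# The handler block receives the exception and the attempt number and
# returns true to retry the operation or false to let the error propagate.
class ConnectionFailurePolicy
  def initialize(error_class, &handler)
    @error_class = error_class
    @handler = handler
  end

  def run
    attempt = 0
    begin
      yield
    rescue @error_class => e
      attempt += 1
      retry if @handler.call(e, attempt)
      raise  # handler declined: re-raise the original failure
    end
  end
end

# Stand-in for Mongo::ConnectionFailure so the demo is self-contained.
class StandInFailure < StandardError; end

# Demo policy: retry at most twice, then give up.
policy = ConnectionFailurePolicy.new(StandInFailure) { |_error, attempt| attempt <= 2 }

tries = 0
value = policy.run do
  tries += 1
  raise StandInFailure, "primary unavailable" if tries <= 2
  "document"
end
```

Passing the logic in as a block keeps the retry decision (how many times,
how long to wait, whether to log) in application code instead of in the
driver.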
I fixed this by adding another replica set node.
The monkey patch turned out to be completely unnecessary, as the mongo
driver seems to attempt to reconnect whenever it is not connected.
Here are more of the gory details:
http://jira.mongodb.org/browse/RUBY-248
Kyle, thanks for your help,
Brian
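On why adding a node may have helped: a replica set can only elect a primary
while a strict majority of its voting members can reach each other. The
arithmetic below is a sketch under the assumption that the set originally
had just the two members shown in the rs.status() output; the thread itself
does not confirm this was the cause, and the method names are made up for
illustration.

```ruby
# Smallest number of votes that is a strict majority of the voting members.
def majority(voting_members)
  voting_members / 2 + 1
end

# A primary can be elected only if the reachable members form a majority.
def primary_electable?(voting_members, reachable_members)
  reachable_members >= majority(voting_members)
end

# Two-member set with one node down: 1 of 2 votes, no majority.
two_node_one_down = primary_electable?(2, 1)    # false

# Three-member set with one node down: 2 of 3 votes, majority holds.
three_node_one_down = primary_electable?(3, 2)  # true
```

Under that assumption, losing either node of a two-member set leaves no
primary, which would match the "Failed to connect any given host:port" error
earlier in the thread.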