http://api.mongodb.org/ruby/current/file.REPLICA_SETS.html#Recovery
On Fri, Feb 11, 2011 at 2:41 PM, tsxn <tsx...@gmail.com> wrote:
> I am running a replicaset setup with a master, secondary, and arbiter
> with reads to secondary enabled using a ReplSetConnection. My master
> server went down and was started back up as a secondary. Now my reads
> are producing Mongo::ConnectionFailure exceptions. Does the Ruby
> driver not reestablish connections to restarted MongoDB servers?
>
> Here's the stack trace:
>
> Operation failed with the following exception: Broken pipe - send(2)
> /home/jetty/.rvm/gems/ree-1.8....@api.ign.com/gems/mongo-1.2.0/
> lib/../lib/mongo/connection.rb:746:in `send_message_on_socket'
> /home/jetty/.rvm/gems/ree-1.8....@api.ign.com/gems/mongo-1.2.0/
> lib/../lib/mongo/connection.rb:418:in `receive_message'
> /home/jetty/.rvm/gems/ree-1.8....@api.ign.com/gems/mongo-1.2.0/
> lib/../lib/mongo/connection.rb:417:in `synchronize'
> /home/jetty/.rvm/gems/ree-1.8....@api.ign.com/gems/mongo-1.2.0/
> lib/../lib/mongo/connection.rb:417:in `receive_message'
> /home/jetty/.rvm/gems/ree-1.8....@api.ign.com/gems/mongo-1.2.0/
> lib/../lib/mongo/cursor.rb:382:in `send_initial_query'
> /home/jetty/.rvm/gems/ree-1.8....@api.ign.com/gems/mongo-1.2.0/
> lib/../lib/mongo/cursor.rb:348:in `refresh'
> /home/jetty/.rvm/gems/ree-1.8....@api.ign.com/gems/mongo-1.2.0/
> lib/../lib/mongo/cursor.rb:72:in `next_document'
> /home/jetty/.rvm/gems/ree-1.8....@api.ign.com/gems/mongo-1.2.0/
> lib/../lib/mongo/collection.rb:230:in `find_one'
> /home/jetty/.rvm/gems/ree-1.8....@api.ign.com/gems/plucky-0.3.6/
> lib/plucky/query.rb:62:in `find_one'
> /home/jetty/.rvm/gems/ree-1.8....@api.ign.com/gems/plucky-0.3.6/
> lib/plucky/query.rb:79:in `first'
> /home/jetty/.rvm/gems/ree-1.8....@api.ign.com/gems/ign-
> mongo_mapper-0.8.6.2/lib/mongo_mapper/plugins/querying/decorator.rb:
> 29:in `first'
> /home/jetty/.rvm/gems/ree-1.8....@api.ign.com/gems/plucky-0.3.6/
> lib/plucky/query.rb:68:in `find'
>
The driver definitely reconnects after failures. If you're resetting
your app, you do need to make sure you've specified seed nodes. If you
can provide a reproducible test case, that'd be very helpful.
What's the current status of the replica set?
How are you connecting to the replica set?
I added a line after:
https://github.com/mongodb/mongo-ruby-driver/blob/master/lib/mongo/util/pool.rb#L77
I put in:
socket.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEADDR, true)
That seems to cure some issues with reconnection. I haven't submitted it as a patch because I can't seem to produce a failure case on a consistent basis. If you have code that can do so, that would be really helpful.
If this seems reasonable, then we probably want to add a similar line in other source files where a socket is allocated.
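For anyone who wants to try the same change, here is a minimal standalone sketch of setting the option and reading it back, using only Ruby's standard socket library. The socket below is a plain unconnected TCP socket created for illustration, not one taken from the driver's pool, and the variable names are made up for this example.

```ruby
require 'socket'

# A plain, unconnected TCP socket created only for this illustration;
# it is not a socket from the driver's connection pool.
socket = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)

# The same call as the one-line change described above.
socket.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEADDR, true)

# Read the option back to confirm that it was applied.
reuse_enabled = socket.getsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEADDR).bool

socket.close
```

Whether this actually changes reconnection behavior is still an open question, since no consistent failure case has been found.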
cr
All the tests I use to verify that replica set failover works live here:
https://github.com/mongodb/mongo-ruby-driver/tree/master/test/replica_sets
Feel free to run them in your environments with the following Rake task:
rake test:rs
@Chuck. Thanks for the note about Socket::SO_REUSEADDR. My first
thought is that this would be unnecessary given that all sockets are
closed on any connection failure.
Again, if anyone can find any reproducible scenarios, or simply
provide more details, for when the driver fails to reconnect, that'd
be much appreciated.
For instance, if you're writing to the database without safe mode
enabled, and a failover occurs, then you have no idea how many recent
writes you've lost, and you may want to do more than simply retry the
previous write.
If you are running in safe mode, you still don't know for certain if
the previous write arrived so, again, retrying the write automatically
may not be best.
Thus, the question of whether to retry really depends on the operation
being performed and the needs of the application. That's why we leave
this up to the app developer. A global driver setting to automatically
retry everything would imply that blind retries are a good policy for
most applications. That is almost never the case, which is why we
don't support it at the moment.
Definitely interested in hearing more thoughts on the issue, and
certainly feel free to open a JIRA for further discussion.
Kyle
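As a rough illustration of leaving the retry decision to the application, here is a minimal sketch of a retry wrapper. The helper name retry_on, its parameters, and the FakeConnectionFailure class are all hypothetical and exist only so the example runs without a database; in a real application you would pass Mongo::ConnectionFailure as the error class.

```ruby
# Hypothetical helper (not part of the driver): re-run the block when
# error_class is raised, allowing up to max_retries additional attempts.
def retry_on(error_class, max_retries = 5, delay = 0.5)
  attempts = 0
  begin
    yield
  rescue error_class
    attempts += 1
    raise if attempts > max_retries  # give up and re-raise the last error
    sleep(delay)                     # give the replica set time to elect a primary
    retry
  end
end

# Stand-in error so the sketch runs without a database; a real
# application would pass Mongo::ConnectionFailure instead.
class FakeConnectionFailure < StandardError; end

calls = 0
result = retry_on(FakeConnectionFailure, 5, 0.01) do
  calls += 1
  # Simulate two failed attempts during a failover, then success.
  raise FakeConnectionFailure, "simulated failover" if calls < 3
  :ok
end
```

A wrapper like this is probably only appropriate around operations that are safe to repeat, such as reads. For writes, the concerns above still apply: you may not know whether the previous attempt arrived.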
If these are the ideal choices for your application, then you can
easily build a thin layer atop the driver to handle them. But everyone
has to make different choices here, so we're not convinced that
building a given failover architecture into the driver makes sense.
That said, you're welcome to post a ticket to
http://jira.mongodb.org/browse/RUBY where we can continue the
discussion and allow other users to vote and comment on the issue as
well.
There is no timeout at the moment, but I've just created a ticket for it:
http://jira.mongodb.org/browse/RUBY-236
This hasn't been implemented yet for two reasons. One is simply lack
of demand and the other is that socket timeouts aren't easy to
implement well in Ruby. In any case, I'll start looking into it and
hopefully get something added before the next release.
Kyle
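Until the driver has its own timeout, one application-level stopgap is to wrap an operation in Ruby's standard Timeout module. This is only a sketch: the helper name with_deadline is hypothetical, and the sleep call stands in for a database operation that hangs. Timeout works by raising an exception in the running thread, which can interrupt a socket operation partway through, and that is part of why timeouts are hard to do well in Ruby.

```ruby
require 'timeout'

# Hypothetical helper: raise Timeout::Error if the block runs longer
# than the given number of seconds.
def with_deadline(seconds)
  Timeout.timeout(seconds) { yield }
end

outcome = begin
  # sleep stands in for a database call that never returns.
  with_deadline(0.1) { sleep(1); :finished }
rescue Timeout::Error
  :timed_out
end
```

After a timeout like this, the connection is probably best treated as being in an unknown state.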