MotorReplicaSetConnection with Motor on Tornado

L-R

Oct 24, 2012, 10:05:19 AM

to python-...@googlegroups.com

Quick question: I'm running a replica set with 3 nodes and using MotorReplicaSetConnection to connect to them in the following way:

DB = motor.MotorReplicaSetConnection(host="mongodb://user:pass@primary_db:27017", replicaSet='rep_set_name').open_sync().db_name

This seems to work fine, but does it matter which DB node I connect to, e.g. primary_db, second_node_db, or third_node_db? Any best practices here? It seems like the primary does the load balancing (judging by the console output), but this is the first time I've used a replSet.

thanks.

Serge S. Koval

Oct 24, 2012, 10:17:58 AM

to python-...@googlegroups.com

I tried it as well, and it appears to be smart enough to connect to all Mongo nodes. If one of the nodes goes down, you'll see lots of "connection closed" messages in the console, but your application will continue working with the new master.

Interestingly, we had a node crash and MotorReplicaSetConnection switched to the new master without interrupting service. I know it should work, but when it *really* worked I was surprised :-)

Serge.

L-R

Oct 24, 2012, 10:18:48 AM

to python-...@googlegroups.com

Actually, I've noticed that I get a "No replica set primary available for query with ReadPreference PRIMARY" error when trying to retrieve data from the DB. I've reverted to MotorConnection. Any ideas?

L-R

Oct 24, 2012, 10:30:10 AM

to python-...@googlegroups.com

Hm, interesting, Serge. I'm getting a "No replica set primary available for query with ReadPreference PRIMARY" error when using MotorReplicaSetConnection, so I switched back to a normal MotorConnection. I tried taking down nodes, and it works fine as long as I don't touch PRIMARY, which isn't ideal. It looks like the driver can't elect a new primary with MotorConnection.

L-R

Oct 24, 2012, 3:27:51 PM

to python-...@googlegroups.com

Also, could you post your connection string for MotorReplicaSetConnection? I simply can't get rid of the "pymongo.errors.AutoReconnect: No replica set members available for query with ReadPreference PRIMARY_PREFERRED" error, which doesn't make sense since I can easily reach my nodes. They are syncing properly, voting is fine, etc., but I cannot get my client to connect to the replica set. Thanks.

A. Jesse Jiryu Davis

Oct 24, 2012, 10:08:25 PM

to python-...@googlegroups.com

Hi guys.

1. When you create your connection string, best to include all the members of your set:

DB = motor.MotorReplicaSetConnection(host="mongodb://user:pass@primary_db,second_db,third_db", replicaSet='rep_set_name').open_sync().db_name

If you only pass one member 'primary_db' then the MotorReplicaSetConnection will connect to primary_db, discover the rest of the members of the set, and continue to connect to them even after 'primary_db' fails. *However*, if your application restarts after primary_db goes down, there's no way for MotorRSC to discover the surviving members of the replica set. Thus, when you create your connection string, include all the members you know about at that time. This will maximize the chance that MotorRSC can connect to some available member of the replica set. As long as one of them is available, MotorRSC can discover the rest of them.
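
To make the restart case concrete, here is a minimal sketch in plain Python 3. `parse_seeds` is a hypothetical helper written for this illustration, not part of Motor or PyMongo, and it ignores IPv6 literals and other URI corner cases. It only shows which addresses a freshly started client has to work with.

```python
# Hypothetical helper, not part of Motor or PyMongo: split a mongodb:// URI
# into (host, port) seeds to see what a freshly started client can try.
DEFAULT_PORT = 27017


def parse_seeds(uri):
    """Return the list of (host, port) seeds named in a mongodb:// URI."""
    prefix = "mongodb://"
    if not uri.startswith(prefix):
        raise ValueError("expected a mongodb:// URI")
    rest = uri[len(prefix):]
    # Drop an optional "/database?options" suffix.
    rest = rest.split("/", 1)[0]
    # Drop optional "user:pass@" credentials.
    if "@" in rest:
        rest = rest.rsplit("@", 1)[1]
    seeds = []
    for node in rest.split(","):
        host, _, port = node.partition(":")
        seeds.append((host, int(port) if port else DEFAULT_PORT))
    return seeds


single = parse_seeds("mongodb://user:pass@primary_db:27017")
full = parse_seeds("mongodb://user:pass@primary_db,second_db,third_db")

# Suppose the application restarts while primary_db is down.
down = {("primary_db", 27017)}
print([s for s in single if s not in down])  # []
print([s for s in full if s not in down])    # [('second_db', 27017), ('third_db', 27017)]
```

With a single seed the restarted client has nothing left to contact; with all three listed it still has two candidates.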

2. MotorRSC will only read from the current primary, by default. If there's no primary, you should expect the error, "No replica set primary available for query with ReadPreference PRIMARY." You can set a different read preference like:

from pymongo.read_preference import ReadPreference
# Read from primary if available, otherwise secondary
DB.read_preference = ReadPreference.PRIMARY_PREFERRED

# Read from secondary if available, otherwise primary
DB.read_preference = ReadPreference.SECONDARY_PREFERRED

See the "ReplicaSetConnection" section here: http://api.mongodb.org/python/current/api/pymongo/index.html#pymongo.read_preferences.ReadPreference
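
As a rough mental model of the modes mentioned above, the following toy function picks a member to read from. It is an illustration only: `pick_member` and the member dictionary are made up for this sketch, the real selection logic lives inside PyMongo, and picking the first host alphabetically is just a way to keep the output deterministic.

```python
# Toy model of read-preference selection; not PyMongo's implementation.
PRIMARY = "PRIMARY"
PRIMARY_PREFERRED = "PRIMARY_PREFERRED"
SECONDARY = "SECONDARY"
SECONDARY_PREFERRED = "SECONDARY_PREFERRED"


def pick_member(members, mode):
    """Pick a host to read from.

    members maps host -> "PRIMARY" or "SECONDARY" for reachable members only.
    Returns a host name, or None when no member satisfies the mode.
    """
    primaries = sorted(h for h, s in members.items() if s == "PRIMARY")
    secondaries = sorted(h for h, s in members.items() if s == "SECONDARY")
    # Each mode is an ordered list of candidate groups to try.
    order = {
        PRIMARY: [primaries],
        PRIMARY_PREFERRED: [primaries, secondaries],
        SECONDARY: [secondaries],
        SECONDARY_PREFERRED: [secondaries, primaries],
    }[mode]
    for candidates in order:
        if candidates:
            return candidates[0]
    return None


# The primary is down and no new primary has been elected yet.
no_primary = {"second_db": "SECONDARY", "third_db": "SECONDARY"}
print(pick_member(no_primary, PRIMARY))            # None
print(pick_member(no_primary, PRIMARY_PREFERRED))  # second_db
```

The `None` result corresponds to the "No replica set primary available" error: with mode PRIMARY there is simply nothing to read from until an election completes.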

3. It's not the *driver* that elects a new primary -- the surviving replica-set members elect a primary after the primary goes down. The driver should detect when there's a new primary, and connect to it. (Unless more than half the replica-set members survive, there will be no primary -- e.g., if you have two members and one goes down, you can't have a primary until it comes back up. See http://www.kchodorow.com/blog/2012/01/04/replica-set-internals-bootcamp-part-i-elections/ )
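
The majority rule can be written down in a couple of lines. This is a simplified sketch that assumes every member has exactly one vote (no arbiters, no custom vote settings); `can_elect_primary` is a name invented for the example.

```python
def can_elect_primary(total_members, reachable_members):
    """True when the reachable members form a strict majority of the set.

    Simplification: assumes each member holds exactly one vote.
    """
    return reachable_members > total_members // 2


for total, up in [(3, 2), (3, 1), (2, 1)]:
    print(total, up, can_elect_primary(total, up))
# 3 2 True
# 3 1 False
# 2 1 False
```

A 3-member set survives the loss of one member, while a 2-member set cannot elect a primary after losing either one.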

I've tested these cases thoroughly and I believe MotorReplicaSetConnection works as desired with replica sets, but if you have a reproducible problem I would love to hear about it.

Peace,
Jesse

L-R

Oct 25, 2012, 10:23:42 AM

to python-...@googlegroups.com

Hey Jesse, thanks for the clarifications. The problem persists; here's the setup:

RE 1) I've changed the string to include all 3 members. It appears that MotorRSC knows of the secondary nodes but can't connect to them (see below).

RE 2) The error "No replica set primary available for query with ReadPreference PRIMARY" has gone away: I was using internal IPs in my replica set settings, so of course Motor couldn't connect to them. I've added the line "DB.read_preferences = ReadPreference.SECONDARY_PREFERRED" after my connection string; no difference noted.

RE 3) My replica set seems to work fine. Here are the steps that cause the problem:

A) 3-node set running; when connected to the primary I check rs.status():

{
  "set" : "rs01",
  "date" : ISODate("2012-10-25T14:03:08Z"),
  "myState" : 1,
  "members" : [
    {
      "_id" : 0,
      "name" : "IP_ONE_HERE:27017",
      "health" : 1,
      "state" : 1,
      "stateStr" : "PRIMARY",
      "uptime" : 76,
      "optime" : Timestamp(1351164396000, 2),
      "optimeDate" : ISODate("2012-10-25T11:26:36Z"),
      "self" : true
    },
    {
      "_id" : 1,
      "name" : "IP_TWO_HERE:27017",
      "health" : 1,
      "state" : 2,
      "stateStr" : "SECONDARY",
      "uptime" : 76,
      "optime" : Timestamp(1351164396000, 2),
      "optimeDate" : ISODate("2012-10-25T11:26:36Z"),
      "lastHeartbeat" : ISODate("2012-10-25T14:03:08Z"),
      "pingMs" : 1
    },
    {
      "_id" : 2,
      "name" : "IP_THREE_HERE:27017",
      "health" : 1,
      "state" : 2,
      "stateStr" : "SECONDARY",
      "uptime" : 76,
      "optime" : Timestamp(1351164396000, 2),
      "optimeDate" : ISODate("2012-10-25T11:26:36Z"),
      "lastHeartbeat" : ISODate("2012-10-25T14:03:08Z"),
      "pingMs" : 1
    }
  ],
  "ok" : 1
}

All seems okay. I have my app running with no errors. I take down PRIMARY, and node #2 takes over. I log into mongo and check rs.status() :

{
  "set" : "rs01",
  "date" : ISODate("2012-10-25T14:16:03Z"),
  "myState" : 2,
  "syncingTo" : "NEW_PRIMARY_IP:27017",
  "members" : [
    {
      "_id" : 0,
      "name" : "IP_ONE_HERE:27017",
      "health" : 0,
      "state" : 8,
      "stateStr" : "(not reachable/healthy)",
      "uptime" : 0,
      "optime" : Timestamp(1351164396000, 2),
      "optimeDate" : ISODate("2012-10-25T11:26:36Z"),
      "lastHeartbeat" : ISODate("2012-10-25T14:15:37Z"),
      "pingMs" : 0,
      "errmsg" : "socket exception [CONNECT_ERROR] for IP_ONE_HERE:27017"
    },
    {
      "_id" : 1,
      "name" : "IP_TWO_HERE:27017",
      "health" : 1,
      "state" : 1,
      "stateStr" : "PRIMARY",
      "uptime" : 64666,
      "optime" : Timestamp(1351164396000, 2),
      "optimeDate" : ISODate("2012-10-25T11:26:36Z"),
      "lastHeartbeat" : ISODate("2012-10-25T14:16:03Z"),
      "pingMs" : 2
    },
    {
      "_id" : 2,
      "name" : "IP_THREE_HERE:27017",
      "health" : 1,
      "state" : 2,
      "stateStr" : "SECONDARY",
      "uptime" : 64666,
      "optime" : Timestamp(1351164396000, 2),
      "optimeDate" : ISODate("2012-10-25T11:26:36Z"),
      "errmsg" : "syncing to: NEW_PRIMARY_IP:27017",
      "self" : true
    }
  ],
  "ok" : 1
}

All looks good, we have a new PRIMARY running on the second node. Now I fire a request to my app and get the following :

[W 121025 10:18:52 iostream:507] Connect error on fd 16: ECONNREFUSED
[W 121025 10:18:52 iostream:507] Connect error on fd 17: ECONNREFUSED
[W 121025 10:18:52 iostream:507] Connect error on fd 15: ECONNREFUSED
[E 121025 10:18:52 ioloop:435] Exception in callback <tornado.stack_context._StackContextWrapper object at 0x1553050>
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tornado/ioloop.py", line 421, in _run_callback
    callback()
  File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 229, in wrapped
    callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/motor/__init__.py", line 1324, in _to_list_got_more
    callback(None, error)
  File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 382, in inner
    self.set_result(key, result)
  File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 315, in set_result
    self.run()
  File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 343, in run
    yielded = self.gen.throw(*exc_info)
  File "/mypath/file.py", line 27, in get
    segments, uids, segment_data = yield motor.WaitAllOps(['one', 'two', 'three'])
  File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 335, in run
    next = self.yield_point.get_result()
  File "/usr/local/lib/python2.7/dist-packages/motor/__init__.py", line 1679, in get_result
    raise error
AttributeError: 'NoneType' object has no attribute 'close'

If I resuscitate my old PRIMARY, everything goes back online and works fine. If I read the first 3 lines of the error message correctly, it seems like Motor sees the 3 seeds but gets a refused connection? Odd note: the very first time I tried these steps, when putting PRIMARY back online, I actually got a message saying "master has changed" (not that it helped; I need it to detect this when the primary goes offline).

Cheers

L

A Jesse Jiryu Davis

Oct 25, 2012, 2:08:35 PM

to python-...@googlegroups.com

2. Make sure you're setting DB.read_preference, not DB.read_preferences!

3. Huh, that looks like it could be a Motor bug. Do you know if it
behaves the same without using Mongo's authentication? Could you share
your whole script with me? Either on the list, or in a gist, or just
email me je...@10gen.com

Thanks

L-R

Oct 26, 2012, 10:50:18 AM

to python-...@googlegroups.com

So I've done a bunch of testing and from what I understand there seems to be a problem with Motor detecting / switching nodes when some go down. You were right about DB.read_preference syntax (but the module is apparently pymongo.read_preferences), and here's the script I'm using, which has nothing special to it : https://gist.github.com/a6537c8cd9c15d0c31b6. I've put the 2 most common errors in there, but they vary from request to request.

The setup I have working for now is :

- specify the 3 nodes in the connection string

- ReadPreference.SECONDARY_PREFERRED

The problem : the driver is unable to switch back to primary if I take down both secondary nodes.

I tested more and had :

- specify the 3 nodes in the connection string

- ReadPreference.PRIMARY_PREFERRED

The problem: the driver would not switch to secondary nodes when the primary went down. This would have been the optimal setup for me, but for the moment SECONDARY_PREFERRED handling reads is OK. I've also tried the NEAREST setting, and if I took any node down, I'd get an error once in a while, as if Motor still tried to make requests to the downed node.

Cheers!

A. Jesse Jiryu Davis

Oct 26, 2012, 12:24:16 PM

to python-...@googlegroups.com

Hi L-R. Can you make sure you're really setting DB.read_preference, not read_preferences? I still see the plural in your settings.py:

https://gist.github.com/a6537c8cd9c15d0c31b6#file_settings.py

My expectation is that if you *haven't* set the read_preference to SECONDARY_PREFERRED correctly, then during the 5- or 10-second period when there's no primary, you'll see pymongo.errors.AutoReconnect exceptions. After that, MotorReplicaSetConnection should find the new primary and start reading from it. If you *have* correctly set read_preference, then you still might see a handful of errors, although fewer of them, since replica set members close their connections when the replica set status changes, forcing MotorRSC to reconnect.
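
One way to ride out that short window is to retry reads that fail with AutoReconnect. The sketch below is a blocking, plain-Python illustration of the idea, not Motor code: `AutoReconnect` here is a local stand-in for pymongo.errors.AutoReconnect so the example runs on its own, and `retry_on_failover` and `read_once` are hypothetical names. In a Tornado handler you would schedule the retry on the IOLoop instead of sleeping.

```python
import time


class AutoReconnect(Exception):
    """Stand-in for pymongo.errors.AutoReconnect so this sketch runs alone."""


def retry_on_failover(operation, attempts=5, delay=2.0, sleep=time.sleep):
    """Call operation() and retry while it raises AutoReconnect.

    Hypothetical helper: re-raises the last error once attempts run out.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except AutoReconnect:
            if attempt == attempts:
                raise
            sleep(delay)


# Simulate a failover: the first two reads fail, the third finds the new primary.
outcomes = [AutoReconnect("no primary"), AutoReconnect("no primary"), "document"]


def read_once():
    result = outcomes.pop(0)
    if isinstance(result, Exception):
        raise result
    return result


print(retry_on_failover(read_once, delay=0))  # document
```

With a real replica set, a delay of a second or two and a handful of attempts would roughly cover the 5- to 10-second election period described above.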

Can you connect to each member using the shell, like "mongo -u user -p pass public_ip:27017/admin" and the same for db2:27017 and db3:27017? I'm concerned that db2 and db3 aren't accessible from the machine on which you're running your Python application.
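
The same reachability check can be scripted from the application machine. This is a minimal Python 3 sketch using only the standard library; `can_connect` and `unreachable` are names made up for the example, and a successful TCP connect says nothing about authentication, only that the port accepts connections from this host.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
    conn.close()
    return True


def unreachable(members, timeout=2.0):
    """Return the (host, port) pairs that could not be connected to."""
    return [m for m in members if not can_connect(m[0], m[1], timeout)]


# Example (placeholder addresses; use the "name" values from rs.status()):
#   unreachable([("IP_ONE", 27017), ("IP_TWO", 27017), ("IP_THREE", 27017)])
```

Run it on the machine where the Python application runs, and feed it the addresses reported by rs.status() rather than the ones in the connection string.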

L-R

Oct 26, 2012, 12:49:39 PM

to python-...@googlegroups.com

hey,

RE : syntax, the gist was from yesterday so I have read_preference (singular) now.

Interesting thing: I just tested on both staging and prod environments, and PRIMARY_PREFERRED actually works as expected! Even when PRIMARY is down and the app restarts, it manages to find the 2 other seeds and reads just fine. On local dev though, none of this works, which is strange since I can in fact connect to both nodes via CLI (I stay connected to check their rs.status()), but Tornado throws the same errors:

[W 121026 12:43:01 iostream:507] Connect error on fd 14: ECONNREFUSED
[W 121026 12:43:01 iostream:507] Connect error on fd 15: ECONNREFUSED
[W 121026 12:43:01 iostream:507] Connect error on fd 23: ECONNREFUSED
[W 121026 12:43:01 iostream:507] Connect error on fd 14: ECONNREFUSED

At this point it's really a minor annoyance (getting an error if my primary is down might not be such a bad thing ;)), but I do wonder what would cause this.

A. Jesse Jiryu Davis

Oct 26, 2012, 1:10:09 PM

to python-...@googlegroups.com

OK, if MotorRSC works as desired everywhere except your local machine, I think it has to do with the reachability of all the replica set members from the machine the driver is running on. Can you tell me more about your local dev setup? Show me rs.status() from the shell, and your MotorReplicaSetConnection options?

L-R

Oct 26, 2012, 2:23:25 PM

to python-...@googlegroups.com

Yep. When PRIMARY is up (rs.status() from primary node):

{
  "set" : "rs01",
  "date" : ISODate("2012-10-26T18:12:15Z"),
  "myState" : 1,
  "members" : [
    {
      "_id" : 0,
      "name" : "IP_ONE:27017",
      "health" : 1,
      "state" : 1,
      "stateStr" : "PRIMARY",
      "uptime" : 4939,
      "optime" : Timestamp(1351274217000, 78),
      "optimeDate" : ISODate("2012-10-26T17:56:57Z"),
      "self" : true
    },
    {
      "_id" : 1,
      "name" : "IP_TWO:27017",
      "health" : 1,
      "state" : 2,
      "stateStr" : "SECONDARY",
      "uptime" : 4939,
      "optime" : Timestamp(1351274217000, 78),
      "optimeDate" : ISODate("2012-10-26T17:56:57Z"),
      "lastHeartbeat" : ISODate("2012-10-26T18:12:14Z"),
      "pingMs" : 2
    },
    {
      "_id" : 2,
      "name" : "IP_THREE:27017",
      "health" : 1,
      "state" : 2,
      "stateStr" : "SECONDARY",
      "uptime" : 4939,
      "optime" : Timestamp(1351274217000, 78),
      "optimeDate" : ISODate("2012-10-26T17:56:57Z"),
      "lastHeartbeat" : ISODate("2012-10-26T18:12:13Z"),
      "pingMs" : 4
    }
  ],
  "ok" : 1
}

When it's down (rs.status() from node #2):

{
  "set" : "rs01",
  "date" : ISODate("2012-10-26T18:14:19Z"),
  "myState" : 1,
  "members" : [
    {
      "_id" : 0,
      "name" : "IP_ONE:27017",
      "health" : 0,
      "state" : 8,
      "stateStr" : "(not reachable/healthy)",
      "uptime" : 0,
      "optime" : Timestamp(1351274217000, 78),
      "optimeDate" : ISODate("2012-10-26T17:56:57Z"),
      "lastHeartbeat" : ISODate("2012-10-26T18:13:38Z"),
      "pingMs" : 0,
      "errmsg" : "socket exception [CONNECT_ERROR] for IP_ONE:27017"
    },
    {
      "_id" : 1,
      "name" : "IP_TWO:27017",
      "health" : 1,
      "state" : 1,
      "stateStr" : "PRIMARY",
      "uptime" : 14080,
      "optime" : Timestamp(1351274217000, 78),
      "optimeDate" : ISODate("2012-10-26T17:56:57Z"),
      "self" : true
    },
    {
      "_id" : 2,
      "name" : "IP_THREE:27017",
      "health" : 1,
      "state" : 2,
      "stateStr" : "SECONDARY",
      "uptime" : 5498,
      "optime" : Timestamp(1351274217000, 78),
      "optimeDate" : ISODate("2012-10-26T17:56:57Z"),
      "lastHeartbeat" : ISODate("2012-10-26T18:14:18Z"),
      "pingMs" : 3,
      "errmsg" : "syncing to: IP_TWO:27017"
    }
  ],
  "ok" : 1
}

As for my MotorReplicaSetConnection, this :

DB = motor.MotorReplicaSetConnection("mongodb://user:pass@PRIMARY_IP:27017,2nd_IP:27017,3rd_IP:27017", replicaSet='rs_name').open_sync().my_collection
DB.read_preference = ReadPreference.PRIMARY_PREFERRED
attrs = vars(DB)
print ', '.join("%s: %s" % item for item in attrs.items())

prints out :

connection: MotorReplicaSetConnection(ReplicaSetConnection([u'IP_ONE:27017', u'IP_TWO:27017', u'IP_THREE:27017'])), delegate: Database(ReplicaSetConnection([u'IP_ONE:27017', u'IP_TWO:27017', u'IP_THREE:27017']), u'my_collection')

I double checked IPs, all kosher. All 3 nodes reachable by shell from local dev machine with auth.

And full stacktrace once again :

[I 121026 14:19:14 autoreload:175] /myfile.py modified; restarting server
WARNING:root:Connect error on fd 15: ECONNREFUSED
Traceback (most recent call last):
  File "myfile.py", line 1, in <module>
    from imports import *
  File "/imports.py", line 32, in <module>
    import settings
  File "/settings.py", line 7, in <module>
    DB = motor.MotorReplicaSetConnection("mongodb://user:pass@IP1:27017,IP2:27017,IP3:27017", replicaSet='rs_name').open_sync().my_collection
  File "/usr/local/lib/python2.7/dist-packages/motor/__init__.py", line 819, in open_sync
    super(MotorReplicaSetConnection, self).open_sync()
  File "/usr/local/lib/python2.7/dist-packages/motor/__init__.py", line 690, in open_sync
    raise outcome['error']
KeyError: 'pop from an empty set'

L

A Jesse Jiryu Davis

Oct 26, 2012, 3:15:38 PM

to python-...@googlegroups.com

Oh, it's a failure from open_sync()? I didn't understand that earlier,
I thought we were talking only about failures when querying from a
MotorRSC that was initialized before a member went down.

A Jesse Jiryu Davis

Oct 26, 2012, 4:01:42 PM

to python-...@googlegroups.com

So, this works for me: I start a replica set on localhost:27017,
27018, and 27019. I shut down one of them. I do:

motor.MotorReplicaSetConnection('localhost:27017,localhost:27018,localhost:27019',
replicaSet='repl0').open_sync()

MotorRSC correctly detects the current primary and secondary, and
makes no attempt to connect to the downed member.

L-R

Oct 26, 2012, 4:29:09 PM

to python-...@googlegroups.com

(Oops, that reply got sent privately.)

A. Jesse Jiryu Davis

Oct 26, 2012, 7:15:00 PM

to python-...@googlegroups.com

Taking this back public.

1. I'm using read preference PRIMARY, actually. But I just tried PRIMARY_PREFERRED with a 3-member replica set where I kill two members, so I'm left with one secondary. Still works as expected. This is frustrating because I think you might be revealing a real bug.

2. Your connection string is correctly formatted, with "mongodb://user:pass@" before the comma-separated list of hosts.

3. So you can definitely always connect to all the hosts listed from the shell, from the same machine as you're running Motor from? And what if you run without authentication enabled on Mongo - does this have to do w/ authentication at all?

4. The two kinds of errors you see, "'NoneType' object has no attribute 'close'" and "pop from an empty set," are unexpected, and I want to understand and fix them. What should happen is either a successful query (if possible) or an AutoReconnect exception. Alas, Tornado stack traces are often useless, and yours is mostly useless....

A. Jesse Jiryu Davis

Oct 26, 2012, 7:16:30 PM

to python-...@googlegroups.com

I'm going to run an application to test against, later this weekend or Tuesday. Could you email me directly a full application I can run? je...@10gen.com

Thanks

L-R

Oct 27, 2012, 5:27:49 PM

to python-...@googlegroups.com

#3: Yes, I've no problem connecting w/ authentication from my shell on this machine. But I will test without auth to see if that's the problem.

Assuming the above doesn't help, I'll email you a full app that connects to my replica set to see if you can replicate the issue. Will get back to this on Monday.

Cheers

A. Jesse Jiryu Davis

Mar 7, 2013, 3:41:08 PM

to python-...@googlegroups.com

Yeah, that's right: the driver, whether it's PyMongo or Motor (the
underlying code is the same), connects to hosts in the "seed list"
that you pass in, looking for the primary. Once it finds the primary
it calls the isMaster command on the primary and gets a list of
replica-set members, which is more or less the list you passed into
rs.initiate(). Then, the driver connects to all the members according
to their addresses in the isMaster response.

In summary: the hostnames that you pass into rs.initiate() have to be
resolvable and available from the client machine.
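
A toy model of that sequence, in plain Python: `discover` and the address lists are invented for this illustration and heavily simplify what PyMongo and Motor actually do, but they show why resolvable seeds are not enough when the set advertises "localhost".

```python
def discover(seeds, reachable, advertised):
    """Model driver discovery (simplified, hypothetical).

    seeds: addresses from the connection string.
    reachable: set of addresses the client machine can actually connect to.
    advertised: member addresses the set reports about itself, roughly what
        was passed to rs.initiate() and rs.add().
    Returns the advertised addresses the client ends up able to use.
    """
    if not any(seed in reachable for seed in seeds):
        raise RuntimeError("no seed is reachable")
    # After first contact, only the advertised addresses are used.
    return [addr for addr in advertised if addr in reachable]


seeds = ["some-machine-name:27017"]
reachable = {"some-machine-name:27017", "some-machine-name:27020"}

# Set configured with "localhost": from the client, that name points at the
# client itself, where no mongod is listening.
print(discover(seeds, reachable, ["localhost:27017", "localhost:27020"]))  # []

# Set reconfigured with names the client can resolve and reach.
print(discover(seeds, reachable,
               ["some-machine-name:27017", "some-machine-name:27020"]))
# ['some-machine-name:27017', 'some-machine-name:27020']
```

In the first case the seed is fine, yet the client is left with no usable members, which matches the ECONNREFUSED symptoms reported earlier in the thread.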

On Thu, Mar 7, 2013 at 1:09 PM, Tom Vaughan
<thomas.dav...@gmail.com> wrote:
> I recently ran into this problem. In my case I set up my replica set like:
>
> $ mongo localhost:27017
> > conf = {_id:'bazaar', members:[]}
> > conf.members[0] = {_id: 0, host:'localhost:27017'}
> > rs.initiate(conf)
> > rs.add('localhost:27020')
> > exit
>
> I also have authentication enabled (both `auth` and `keyfile`).
>
> Then I connected to one of the replica set instances like:
>
> $ mongo some-machine-name:27017
> > use admin
> > db.auth("username", "password")
>
> That worked. But I can't connect to some-machine-name:27017 from PyMongo or
> Motor. This is because I set up the replica set members as "localhost"
> instead of "some-machine-name". Once I switched the replica set to use
> "some-machine-name" everything worked. I suspect PyMongo and Motor attempt
> to resolve the replica set host names on the client machine.
>
> Hope this helps.
>
> -Tom


Tom Vaughan

Mar 7, 2013, 3:55:04 PM

to python-...@googlegroups.com

On Thu, Mar 7, 2013 at 5:41 PM, A. Jesse Jiryu Davis
<je...@emptysquare.net> wrote:
> Yeah, that's right: the driver, whether it's PyMongo or Motor (the
> underlying code is the same), connects to hosts in the "seed list"
> that you pass in, looking for the primary. Once it finds the primary
> it calls the isMaster command on the primary and gets a list of
> replica-set members, which is more or less the list you passed into
> rs.initiate(). Then, the driver connects to all the members according
> to their addresses in the isMaster response.
>
> In summary: the hostnames that you pass into rs.initiate() have to be
> resolvable and available from the client machine.

I see. I specified resolvable host names in the list of seed hosts on
the client. I was under the impression that took precedence. I
appreciate the response.

Thanks.

-Tom

A. Jesse Jiryu Davis

Mar 7, 2013, 5:04:58 PM

to python-...@googlegroups.com

There's a reason it's built this way. If you start with a 3-member
set, and specify some of those members in the seed list, and the
client connects, great. Then over time, if you add a bunch of new
members to the set, and eventually kill all the members of the set
that were in the seed list, the client should be able to stay
connected to the set, even though all the members it was initialized
with are now gone. (See Plutarch, "Ship of Theseus.") Thus the client
only uses the seed list for its initialization, and uses the list of
hosts it gets from the replica set configuration after that.
