I double checked IPs, all kosher. All 3 nodes reachable by shell from local
dev machine with auth.
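
Since the shell connects fine while Tornado still logs ECONNREFUSED, one way to rule out a plain TCP problem is to probe the exact "host:port" names that rs.status() reports. Run the probe from the same machine that runs the app, because the driver connects to those names after discovery. This is a minimal sketch using only the Python 3 standard library. The check_members helper and the address in the usage lines are hypothetical placeholders, not part of Motor or PyMongo:

```python
import socket


def check_members(members, timeout=2.0):
    """Try a plain TCP connect to each "host:port" string.

    Returns a dict mapping each member to None on success, or to the
    error message if the connection attempt failed.
    """
    results = {}
    for member in members:
        host, _, port = member.rpartition(":")
        try:
            sock = socket.create_connection((host, int(port)), timeout=timeout)
        except (OSError, ValueError) as exc:
            # OSError covers refused/timed-out connects; ValueError covers
            # a missing or non-numeric port.
            results[member] = str(exc)
        else:
            sock.close()
            results[member] = None
    return results


if __name__ == "__main__":
    # Placeholder address: substitute the "name" values from rs.status().
    for member, error in check_members(["127.0.0.1:27017"]).items():
        print(member, "OK" if error is None else "FAILED: " + error)
```

If a member fails here but works from the shell, compare the name the shell was given with the name stored in the replica set config. They can differ (internal vs. public IPs, for example).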
On Friday, October 26, 2012 1:10:09 PM UTC-4, A. Jesse Jiryu Davis wrote:
> OK, if MotorRSC works as desired everywhere except your local machine, I
> think it has to do with the reachability of all the replica set members
> from the machine the driver is running on. Can you tell me more about your
> local dev setup? Show me rs.status() from the shell, and your
> MotorReplicaSetConnection options?
> On Friday, October 26, 2012 12:49:39 PM UTC-4, L-R wrote:
>> hey,
>> RE : syntax, the gist was from yesterday so I have read_preference
>> (singular) now.
>> Interesting thing - I just tested on both staging and prod environments,
>> and PRIMARY_PREFERRED actually works as expected! Even when PRIMARY is down
>> and the app restarts, it manages to find the 2 other seeds and reads just
>> fine. On local dev though, none of this works, which is strange since I can
>> effectively connect to both nodes via CLI (I stay connected to check their
>> rs.status()) but Tornado throws the same errors :
>> [W 121026 12:43:01 iostream:507] Connect error on fd 14: ECONNREFUSED
>> [W 121026 12:43:01 iostream:507] Connect error on fd 15: ECONNREFUSED
>> [W 121026 12:43:01 iostream:507] Connect error on fd 23: ECONNREFUSED
>> [W 121026 12:43:01 iostream:507] Connect error on fd 14: ECONNREFUSED
>> At this point it's really a minor annoyance (getting an error if my
>> primary is down might not be such a bad thing ;)), but I do wonder what
>> would cause this.
>> On Friday, October 26, 2012 12:24:16 PM UTC-4, A. Jesse Jiryu Davis wrote:
>>> Hi L-R. Can you make sure you're really setting DB.read_preference, not
>>> read_preferences? I still see the plural in your settings.py:
>>> https://gist.github.com/a6537c8cd9c15d0c31b6#file_settings.py
>>> My expectation is that if you *haven't* set the read_preference to
>>> SECONDARY_PREFERRED correctly, then during the 5- or 10-second period when
>>> there's no primary, you'll see pymongo.errors.AutoReconnect exceptions.
>>> After that, MotorReplicaSetConnection should find the new primary and start
>>> reading from it. If you *have* correctly set read_preference, then you
>>> still might see a handful of errors, although fewer of them, since replica
>>> set members close their connections when the replica set status changes,
>>> forcing MotorRSC to reconnect.
>>> Can you connect to each member using the shell, like "mongo -u user -p
>>> pass public_up:27017/admin" and the same for db2:27017 and db3:27017? I'm
>>> concerned that db2 and db3 aren't accessible from the machine on which
>>> you're running your Python application.
>>> On Friday, October 26, 2012 10:50:18 AM UTC-4, L-R wrote:
>>>> So I've done a bunch of testing and from what I understand there seems
>>>> to be a problem with Motor detecting / switching nodes when some go down.
>>>> You were right about DB.read_preference syntax (but the module is
>>>> apparently pymongo.read_preferences), and here's the script I'm using,
>>>> which has nothing special to it :
>>>> https://gist.github.com/a6537c8cd9c15d0c31b6. I've put the 2 most
>>>> common errors in there, but they vary from request to request.
>>>> The setup I have working for now is :
>>>> - specify the 3 nodes in the connection string
>>>> - ReadPreference.SECONDARY_PREFERRED
>>>> The problem : the driver is unable to switch back to primary if I take
>>>> down both secondary nodes.
>>>> I tested more and had :
>>>> - specify the 3 nodes in the connection string
>>>> - ReadPreference.PRIMARY_PREFERRED
>>>> The problem : driver would not switch to secondary nodes when primary
>>>> went down. This would have been the optimal setup for me, but for the
>>>> moment SECONDARY_PREFERRED handling reads is OK. I've also attempted the
>>>> NEAREST setting, and if I took any node down, I'd get an error once in a
>>>> while, as if Motor still tried to make requests to the node.
>>>> Cheers!
>>>> On Thursday, October 25, 2012 2:09:24 PM UTC-4, A. Jesse Jiryu Davis
>>>> wrote:
>>>>> 2. Make sure you're setting DB.read_preference, not
>>>>> DB.read_preferences!
>>>>> 3. Huh, that looks like it could be a Motor bug. Do you know if
>>>>> it behaves the same without using Mongo's authentication? Could you share
>>>>> your whole script with me? Either on the list, or in a gist, or just
>>>>> email me je...@10gen.com
>>>>> Thanks
>>>>> On Thu, Oct 25, 2012 at 10:23 AM, L-R <lau...@human.co> wrote:
>>>>> > hey Jesse, thanks for clarifications. The problem persists, here's the
>>>>> > setup :
>>>>> > RE 1) I've changed the string to include all 3 members. It appears that
>>>>> > MotorRSC knows of the secondary nodes, but can't connect to them (see
>>>>> > below)
>>>>> > RE 2) The error "No replica set primary available for query with
>>>>> > ReadPreference PRIMARY" has gone away - I was using internal IPs in my
>>>>> > replica set settings, so of course Motor couldn't connect to them. I've
>>>>> > added the line "DB.read_preferences = ReadPreference.SECONDARY_PREFERRED"
>>>>> > after my connection string, no difference noted.
>>>>> > RE 3) My replica set seems to work fine, here are the steps that cause
>>>>> > the problem :
>>>>> > A) 3-node set running, when connecting to primary I check rs.status() :
>>>>> > {
>>>>> > "set" : "rs01",
>>>>> > "date" : ISODate("2012-10-25T14:03:08Z"),
>>>>> > "myState" : 1,
>>>>> > "members" : [
>>>>> > {
>>>>> > "_id" : 0,
>>>>> > "name" : "IP_ONE_HERE:27017",
>>>>> > "health" : 1,
>>>>> > "state" : 1,
>>>>> > "stateStr" : "PRIMARY",
>>>>> > "uptime" : 76,
>>>>> > "optime" : Timestamp(1351164396000, 2),
>>>>> > "optimeDate" : ISODate("2012-10-25T11:26:36Z"),
>>>>> > "self" : true
>>>>> > },
>>>>> > {
>>>>> > "_id" : 1,
>>>>> > "name" : "IP_TWO_HERE:27017",
>>>>> > "health" : 1,
>>>>> > "state" : 2,
>>>>> > "stateStr" : "SECONDARY",
>>>>> > "uptime" : 76,
>>>>> > "optime" : Timestamp(1351164396000, 2),
>>>>> > "optimeDate" : ISODate("2012-10-25T11:26:36Z"),
>>>>> > "lastHeartbeat" : ISODate("2012-10-25T14:03:08Z"),
>>>>> > "pingMs" : 1
>>>>> > },
>>>>> > {
>>>>> > "_id" : 2,
>>>>> > "name" : "IP_THREE_HERE:27017",
>>>>> > "health" : 1,
>>>>> > "state" : 2,
>>>>> > "stateStr" : "SECONDARY",
>>>>> > "uptime" : 76,
>>>>> > "optime" : Timestamp(1351164396000, 2),
>>>>> > "optimeDate" : ISODate("2012-10-25T11:26:36Z"),
>>>>> > "lastHeartbeat" : ISODate("2012-10-25T14:03:08Z"),
>>>>> > "pingMs" : 1
>>>>> > }
>>>>> > ],
>>>>> > "ok" : 1
>>>>> > }
>>>>> > All seems okay. I have my app running with no errors. I take down
>>>>> > PRIMARY, and node #2 takes over. I log into mongo and check rs.status() :
>>>>> > {
>>>>> > "set" : "rs01",
>>>>> > "date" : ISODate("2012-10-25T14:16:03Z"),
>>>>> > "myState" : 2,
>>>>> > "syncingTo" : "NEW_PRIMARY_IP:27017",
>>>>> > "members" : [
>>>>> > {
>>>>> > "_id" : 0,
>>>>> > "name" : "IP_ONE_HERE:27017",
>>>>> > "health" : 0,
>>>>> > "state" : 8,
>>>>> > "stateStr" : "(not reachable/healthy)",
>>>>> > "uptime" : 0,
>>>>> > "optime" : Timestamp(1351164396000, 2),
>>>>> > "optimeDate" : ISODate("2012-10-25T11:26:36Z"),
>>>>> > "lastHeartbeat" : ISODate("2012-10-25T14:15:37Z"),
>>>>> > "pingMs" : 0,
>>>>> > "errmsg" : "socket exception [CONNECT_ERROR] for IP_ONE_HERE:27017"
>>>>> > },
>>>>> > {
>>>>> > "_id" : 1,
>>>>> > "name" : "IP_TWO_HERE:27017",
>>>>> > "health" : 1,
>>>>> > "state" : 1,
>>>>> > "stateStr" : "PRIMARY",
>>>>> > "uptime" : 64666,
>>>>> > "optime" : Timestamp(1351164396000, 2),
>>>>> > "optimeDate" : ISODate("2012-10-25T11:26:36Z"),
>>>>> > "lastHeartbeat" : ISODate("2012-10-25T14:16:03Z"),
>>>>> > "pingMs" : 2
>>>>> > },
>>>>> > {
>>>>> > "_id" : 2,
>>>>> > "name" : "IP_THREE_HERE:27017",
>>>>> > "health" : 1,
>>>>> > "state" : 2,
>>>>> > "stateStr" : "SECONDARY",
>>>>> > "uptime" : 64666,
>>>>> > "optime" : Timestamp(1351164396000, 2),
>>>>> > "optimeDate" : ISODate("2012-10-25T11:26:36Z"),
>>>>> > "errmsg" : "syncing to: NEW_PRIMARY_IP:27017",
>>>>> > "self" : true
>>>>> > }
>>>>> > ],
>>>>> > "ok" : 1
>>>>> > }
>>>>> > All looks good, we have a new PRIMARY running on the second node. Now I
>>>>> > fire a request to my app and get the following :
>>>>> > [W 121025 10:18:52 iostream:507] Connect error on fd 16: ECONNREFUSED
>>>>> > [W 121025 10:18:52 iostream:507] Connect error on fd 17: ECONNREFUSED
>>>>> > [W 121025 10:18:52 iostream:507] Connect error on fd 15: ECONNREFUSED
>>>>> > [E 121025 10:18:52 ioloop:435] Exception in callback <tornado.stack_context._StackContextWrapper object at 0x1553050>
>>>>> > Traceback (most recent call last):
>>>>> >   File "/usr/local/lib/python2.7/dist-packages/tornado/ioloop.py", line 421, in _run_callback
>>>>> >     callback()
>>>>> >   File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 229, in wrapped
>>>>> >     callback(*args, **kwargs)
>>>>> >   File "/usr/local/lib/python2.7/dist-packages/motor/__init__.py", line 1324, in _to_list_got_more
>>>>> >     callback(None, error)
>>>>> >   File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 382, in inner
>>>>> >     self.set_result(key, result)
>>>>> >   File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 315, in set_result
>>>>> >     self.run()
>>>>> >   File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 343, in run
>>>>> >     yielded = self.gen.throw(*exc_info)
>>>>> >   File "/mypath/file.py", line 27, in get
>>>>> >     segments, uids, segment_data = yield motor.WaitAllOps(['one', 'two', 'three'])
>>>>> >   File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 335, in run
>>>>> >     next = self.yield_point.get_result()
>>>>> >   File "/usr/local/lib/python2.7/dist-packages/motor/__init__.py", line 1679, in get_result
>>>>> >     raise error
>>>>> > AttributeError: 'NoneType' object has no attribute 'close'
>>>>> > If I resuscitate my old PRIMARY, everything goes back online and works
>>>>> > fine. If I get the first 3 lines of the error msg correctly, it seems
>>>>> > like Motor sees the 3 seeds, but gets a refused connection? Odd note :
>>>>> > the very first time I tried these steps, when putting PRIMARY back
>>>>> > online, I actually got a message saying : "master has changed" (not that
>>>>> > it helped, I need it to detect this when going offline).
>>>>> > Cheers
>>>>> > L
>>>>> > On Wednesday, October 24, 2012 10:08:25 PM UTC-4, A. Jesse Jiryu Davis
>>>>> > wrote:
>>>>> >> Hi guys.
>>>>> >> 1. When you create your connection string, best to include all the
>>>>> >> members of your set:
>>>>> >> DB = motor.MotorReplicaSetConnection(
>>>>> >>     host="mongodb://user:pass@primary_db,second_db,third_db",
>>>>> >>     replicaSet='rep_set_name').open_sync().db_name
>>>>> >> If you only pass one member 'primary_db' then the
>>>>> >> MotorReplicaSetConnection will connect to primary_db, discover the rest
>>>>> >> of the members of the set, and continue to connect to them even after
>>>>> >> 'primary_db' fails. *However*, if your application restarts after
>>>>> >> primary_db goes down, there's no way for MotorRSC to discover the
>>>>> >> surviving members of the replica set. Thus, when you create your
>>>>> >> connection string, include all the members you know about at that time.
>>>>> >> This will maximize the chance that MotorRSC can connect to some
>>>>> >> available member of the replica set. As long as one of them is
>>>>> >> available, MotorRSC can discover the rest of them.
>>>>> >> 2. MotorRSC will only read from the current primary, by default. If
>>>>> >> there's no primary, you should expect the error, "No replica set
>>>>> >> primary available for query with ReadPreference PRIMARY." You can set a
>>>>> >> different read preference like:
>>>>> >> from pymongo.read_preferences import ReadPreference
>>>>> >> # Read from primary if available, otherwise secondary
>>>>> >> DB.read_preference = ReadPreference.PRIMARY_PREFERRED
>>>>> >> # Read from secondary if available, otherwise primary
>>>>> >> DB.read_preference = ReadPreference.SECONDARY_PREFERRED
>>>>> >> See the "ReplicaSetConnection" section here:
>>>>> >> http://api.mongodb.org/python/current/api/pymongo/index.html#pymongo....
>>>>> >> 3. It's not the *driver* that elects a new primary -- the surviving
>>>>> >> replica-set members elect a primary after the primary goes down. The
>>>>> >> driver should detect when there's a new primary, and connect to it.
>>>>> >> (Unless more than half the replica-set members survive, there will be
>>>>> >> no primary -- e.g., if you have two members and one goes down, you
>>>>> >> can't have a primary until it comes back up. See
>>>>> >> http://www.kchodorow.com/blog/2012/01/04/replica-set-internals-bootca...
>>>>> >> )
>>>>> >> I've tested these cases thoroughly and I believe
>>>>> >> MotorReplicaSetConnection works as desired with replica sets, but if
>>>>> >> you have a reproducible problem I would love to hear about it.
>>>>> >> Peace,
>>>>> >> Jesse
>>>>> >> On Wednesday, October 24, 2012 3:27:52 PM UTC-4, L-R wrote:
>>>>> >>> Also, could you post your connection string for
>>>>> >>> MotorReplicaSetConnection? I simply can't get rid of the
>>>>> >>> "pymongo.errors.AutoReconnect: No replica set members available for
>>>>> >>> query with ReadPreference PRIMARY_PREFERRED" error, which doesn't make
>>>>> >>> sense since I can easily reach my nodes. They are syncing properly,
>>>>> >>> voting is fine, etc. but I cannot get my client to connect to the
>>>>> >>> replica set. Thanks.
>>>>> >>> On Wednesday, October 24, 2012 10:18:14 AM UTC-4, Serge S. Koval wrote:
>>>>> >>>> I tried it as well and it appears that it is smart enough to connect
>>>>> >>>> to all Mongo nodes. If one of the nodes goes down, you'll see lots of
>>>>> >>>> "connection closed" messages in the console, but your application
>>>>> >>>> will continue working with the new master.
>>>>> >>>> What's interesting, we had a node crash and MotorReplicaSetConnection
>>>>> >>>> switched to the new master without interrupting service. I know it
>>>>> >>>> should work, but when it *really* worked I was surprised :-)
>>>>> >>>> Serge.
>>>>> >>>> On Wed, Oct 24, 2012 at 5:05 PM, L-R <lau...@human.co> wrote:
>>>>> >>>>> Quick question - I'm running a replica set with 3 nodes and using
>>>>> >>>>> MotorReplicaSetConnection to connect to them in the following way :
>>>>> >>>>> DB = motor.MotorReplicaSetConnection(
>>>>> >>>>>     host="mongodb://user:pass@primary_db:27017",
>>>>> >>>>>     replicaSet='rep_set_name').open_sync().db_name
>>>>> >>>>> Which seems to work more than fine - but does it matter which DB
>>>>> >>>>> node I connect to? Ex : primary_db, second_node_db, or
>>>>> >>>>> third_node_db? Any best practices here? It seems like the primary
>>>>> >>>>> does the load balancing (judging by the output of consoles) but it's
>>>>> >>>>> the first time I use a replSet.
>>>>> >>>>> thanks.
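
On the original question of which node to list: the advice in the thread is to list every member you know about, so that a restart still finds a live seed. A minimal sketch of assembling such a seed-list URI in Python 3 follows. The build_seed_uri helper is hypothetical (not part of Motor or PyMongo), the host names are the placeholders used in this thread, and IPv6 literals are not handled:

```python
from urllib.parse import quote_plus


def build_seed_uri(user, password, hosts, default_port=27017):
    """Build a mongodb:// URI that lists every known replica set member."""
    if not hosts:
        raise ValueError("at least one host is required")
    seeds = []
    for host in hosts:
        # Append the default port only when the caller gave a bare hostname.
        seeds.append(host if ":" in host else "%s:%d" % (host, default_port))
    # Percent-encode credentials so characters like '@' or ':' cannot be
    # mistaken for URI delimiters.
    return "mongodb://%s:%s@%s" % (
        quote_plus(user), quote_plus(password), ",".join(seeds))


uri = build_seed_uri("user", "pass", ["primary_db", "second_db", "third_db"])
print(uri)  # mongodb://user:pass@primary_db:27017,second_db:27017,third_db:27017
```

The resulting string would be passed as the host argument, with replicaSet given separately as in the snippets above.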