Kazoo 2.0 ... problems with KazooClient.retry() and add_auth()?


Matt

Jun 20, 2014, 8:34:05 PM
to pyth...@googlegroups.com
It seems that Kazoo 2.0 was released without an announcement to the list ... and we inadvertently had a server boot up and install it instead of our tried and true 1.3.1. We discovered a problem with our nd_service_registry code and Kazoo 2.0 that I could use help with.

The problem manifests as a deadlock when we instantiate an nd_service_registry.KazooServiceRegistry() object with a username/password setting, and then call the set_node() method.

The issue seems to be line 855 (https://github.com/Nextdoor/ndserviceregistry/blob/master/nd_service_registry/__init__.py#L827-L856) where we use the KazooClient.handler.lock_object() method to get a run lock, then call the KazooClient.retry() method on the KazooClient.add_auth() method. If we remove either the 'with self._run_lock' line (827), OR we remove the self._zk_retry() line (855), the problem goes away and our code works just fine.
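
For readers without the linked file open, the pattern described above looks roughly like the following. This is a minimal sketch reconstructed from the description, not the actual nd_service_registry code: the function name `authenticate_under_lock` and its parameters are hypothetical, `zk` is assumed to be a started KazooClient, and `run_lock` is assumed to come from `zk.handler.lock_object()`.

```python
def authenticate_under_lock(zk, run_lock, scheme, credential):
    """Hypothetical reconstruction of the pattern described in this post.

    zk       -- assumed to be a started kazoo.client.KazooClient
    run_lock -- assumed to be a lock from zk.handler.lock_object()
    """
    with run_lock:
        # KazooClient.retry is callable as retry(func, *args, **kwargs).
        # With Kazoo 2.0, this is the call that hung while the lock was held.
        return zk.retry(zk.add_auth, scheme, credential)
```

Dropping either the `with run_lock` block or the `zk.retry(...)` wrapper corresponds to the two workarounds mentioned above.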

Any thoughts on what's changed in Kazoo and what we're doing wrong?

Matt Wise

Jun 23, 2014, 5:16:11 PM
to pyth...@googlegroups.com
Any thoughts guys?

Matt Wise
Sr. Systems Architect
Nextdoor.com



Matt Wise

Jun 30, 2014, 12:57:49 AM
to pyth...@googlegroups.com
Kazoo guys, any ideas?

Matt Wise
Sr. Systems Architect
Nextdoor.com


Ben Bangert

Sep 23, 2014, 3:27:01 PM
to pyth...@googlegroups.com
Ah, I think Hanno forgot about this list, as he did make an announcement to the apache-zookeeper-users list instead. Just following up and asking if this was resolved before I take a look.

Matt Wise

Sep 23, 2014, 3:30:57 PM
to Ben Bangert, pyth...@googlegroups.com
So the ultimate issue was that the old synchronous add_auth() code actually lied. It fired off an async add_auth call and then returned immediately. The new code is fixed, which means it fires off the async add_auth call but then waits for that call to complete before returning.

The simple fix for us was to change our code to fire off the add_auth_async() call and then ignore the result. Not entirely clean, but it restores the previous behavior that our code expected.
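
A minimal sketch of that workaround, assuming `zk` is a started KazooClient. The helper name `add_auth_without_waiting` is hypothetical and is not part of Kazoo or nd_service_registry; only `add_auth_async()` is a Kazoo call.

```python
def add_auth_without_waiting(zk, scheme, credential):
    """Send credentials without blocking on the server's reply.

    zk is assumed to be a started kazoo.client.KazooClient.
    """
    # add_auth_async() queues the auth request and returns an async result
    # object right away instead of blocking until the server answers.
    async_result = zk.add_auth_async(scheme, credential)
    # The result is deliberately not waited on here, which mirrors the
    # return-immediately behavior the pre-2.0 add_auth() effectively had.
    return async_result
```

The trade-off is that a failed authentication goes unnoticed at this point. A caller that does want confirmation could call `get(timeout=...)` on the returned object later.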


Matt Wise
Sr. Systems Architect
Nextdoor.com
