kazoo not retrying properly when the connection times out?


q4brk

Feb 11, 2014, 5:11:26 AM
to pyth...@googlegroups.com
I'm testing how kazoo retries connections and noticed the following behaviour: when connecting to an IP address that sends a RST back to the client (e.g. the machine is up but no ZooKeeper server is listening on the port), retrying seems to work properly. But when connecting to an IP address that drops all traffic (e.g. a firewalled machine), kazoo makes only two connection attempts and then raises an exception.
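
As a side illustration of the two failure modes (independent of kazoo), here is a minimal sketch using only the standard library. The helper name `probe` and its return strings are made up for this example. A refused connection fails almost immediately, while a dropped one blocks until the timeout expires.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify one TCP connection attempt as 'open', 'refused', 'timeout' or 'error'."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        # The peer answered with a RST: the host is up but nothing listens on the port.
        return "refused"
    except socket.timeout:
        # No answer at all within `timeout` seconds, e.g. packets are silently dropped.
        return "timeout"
    except OSError:
        # Anything else (no route to host, network unreachable, ...).
        return "error"
    conn.close()
    return "open"


# Example calls (results depend on the local network, so none are shown):
#   probe("127.0.0.1", 19999)   # fails fast if the port is closed
#   probe("240.0.0.1", 3000)    # may block for the full timeout
```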

The kazoo test script I used is:

from kazoo.client import KazooClient, KazooState
from kazoo.retry import KazooRetry
import time
import logging

logging.basicConfig(level=logging.DEBUG, format="%(asctime)s [%(thread)d]: %(message)s")
l = logging.getLogger("main")

kz_retry = KazooRetry(max_tries=1000, delay=0.5, backoff=2)

# Use hosts="127.0.0.1:19999" for the "normal" behaviour. Port 19999 on localhost
# must be closed and not firewalled.

# Use hosts="240.0.0.1:3000" for the "timeout" behaviour. This works on my machine but may
# not for others; replace it with any IP address that drops packets sent to it.

zk = KazooClient(hosts='240.0.0.1:3000',
                 connection_retry=kz_retry,
                 command_retry=kz_retry)

try:
    zk.start()
except Exception as e:
    l.exception(e)



When using "127.0.0.1:19999" as a host, I get:

2014-02-11 12:09:01,695 [4352249856]: ZK loop started
2014-02-11 12:09:01,695 [4352249856]: Skipping state change
2014-02-11 12:09:01,695 [4352249856]: Connecting to 127.0.0.1:19999
2014-02-11 12:09:01,696 [4352249856]:     Using session_id: None session_passwd: 00000000000000000000000000000000
2014-02-11 12:09:01,696 [4352249856]: Connection dropped: socket connection error: Connection refused
2014-02-11 12:09:02,570 [4352249856]: Connecting to 127.0.0.1:19999
2014-02-11 12:09:02,570 [4352249856]:     Using session_id: None session_passwd: 00000000000000000000000000000000
2014-02-11 12:09:02,570 [4352249856]: Connection dropped: socket connection error: Connection refused
2014-02-11 12:09:04,051 [4352249856]: Connecting to 127.0.0.1:19999
2014-02-11 12:09:04,051 [4352249856]:     Using session_id: None session_passwd: 00000000000000000000000000000000
2014-02-11 12:09:04,051 [4352249856]: Connection dropped: socket connection error: Connection refused
2014-02-11 12:09:06,687 [4352249856]: Connecting to 127.0.0.1:19999
2014-02-11 12:09:06,687 [4352249856]:     Using session_id: None session_passwd: 00000000000000000000000000000000
2014-02-11 12:09:06,688 [4352249856]: Connection dropped: socket connection error: Connection refused
^CTraceback (most recent call last):
  File "test_zoo.py", line 22, in <module>
    zk.start()
  File "/Users/user/.pyenv/versions/mac_venv/lib/python2.7/site-packages/kazoo/client.py", line 471, in start
    event.wait(timeout=timeout)
  File "/Users/user/.pyenv/versions/2.7.5/lib/python2.7/threading.py", line 618, in wait
    self.__cond.wait(timeout)
  File "/Users/user/.pyenv/versions/2.7.5/lib/python2.7/threading.py", line 358, in wait
    _sleep(delay)
KeyboardInterrupt


It seems to retry properly. However, when I use the other IP address:

2014-02-11 12:09:58,933 [4326440960]: ZK loop started
2014-02-11 12:09:58,933 [4326440960]: Skipping state change
2014-02-11 12:09:58,934 [4326440960]: Connecting to 240.0.0.1:3000
2014-02-11 12:09:58,934 [4326440960]:     Using session_id: None session_passwd: 00000000000000000000000000000000
2014-02-11 12:10:08,936 [4326440960]: Connection dropped: socket connection error: None
2014-02-11 12:10:09,631 [4326440960]: Connecting to 240.0.0.1:3000
2014-02-11 12:10:09,631 [4326440960]:     Using session_id: None session_passwd: 00000000000000000000000000000000
2014-02-11 12:10:19,632 [4326440960]: Connection dropped: socket connection error: None
2014-02-11 12:10:19,733 [4326440960]: Failed connecting to Zookeeper within the connection retry policy.
2014-02-11 12:10:19,733 [4326440960]: Zookeeper session lost, state: CLOSED
2014-02-11 12:10:19,733 [4326440960]: Connection stopped
2014-02-11 12:10:19,734 [140735278076688]: Connection time-out
Traceback (most recent call last):
  File "test_zoo.py", line 22, in <module>
    zk.start()
  File "/Users/user/.pyenv/versions/mac_venv/lib/python2.7/site-packages/kazoo/client.py", line 475, in start
    raise self.handler.timeout_exception("Connection time-out")
TimeoutError: Connection time-out
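
A rough timing check on this log, meant as a sketch rather than a statement about kazoo's internals: each attempt against 240.0.0.1 blocks for about 10 seconds before failing, and the traceback shows start() raising after event.wait(timeout=timeout) returns. Assuming that wait uses start()'s default timeout of 15 seconds, and ignoring the jitter applied to the retry delays, a simple model estimates how many attempts can begin within that budget. The function and parameter names are hypothetical.

```python
def attempts_started(wait_budget, attempt_timeout, delay, backoff):
    """Count connection attempts that begin before `wait_budget` seconds elapse.

    Simplified model: every attempt blocks for `attempt_timeout` seconds and
    fails, then the retry policy sleeps for `delay`, which is multiplied by
    `backoff` after each failure. Jitter and max_tries are ignored.
    """
    elapsed = 0.0
    started = 0
    while elapsed < wait_budget:
        started += 1
        elapsed += attempt_timeout  # time spent waiting for the connect to fail
        elapsed += delay            # back-off sleep before the next attempt
        delay *= backoff
    return started


# Values taken from the test script and the log: 15 s assumed start() budget,
# about 10 s per dropped connection attempt, delay=0.5, backoff=2.
print(attempts_started(15, 10, 0.5, 2))
```

With these numbers the model gives 2, which matches the two "Connecting to 240.0.0.1:3000" lines in the log. If this reading is right, passing a larger value such as zk.start(timeout=120) should leave room for more attempts before the exception is raised.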



...it times out after only two connection attempts. Is this the normal behaviour? Is there a way to get kazoo to keep retrying even when connections time out?
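
One possible workaround, sketched under the assumption that the client can be started again after a timed-out start() (the log shows the connection being stopped before the exception is raised): call start() in a loop and catch the handler's timeout exception. The helper name `start_with_rounds` and its parameters are hypothetical. Only `zk.start(timeout=...)` and `zk.handler.timeout_exception` come from kazoo, and both are passed in so the helper itself does not import kazoo.

```python
import time


def start_with_rounds(start, timeout_exception, max_rounds=5,
                      start_timeout=60, pause=1.0, sleep=time.sleep):
    """Call `start(timeout=...)` until it succeeds or `max_rounds` is exhausted.

    Returns the 1-based number of the round that succeeded. The timeout
    exception from the final round is re-raised to the caller.
    """
    for round_no in range(1, max_rounds + 1):
        try:
            start(timeout=start_timeout)
            return round_no
        except timeout_exception:
            if round_no == max_rounds:
                raise
            sleep(pause)  # brief pause before starting the next round


# Intended use with the client from the test script (assumed, not verified here):
#   start_with_rounds(zk.start, zk.handler.timeout_exception)
```

Passing a larger start_timeout gives the connection retry policy more room inside each round, and max_rounds bounds the total wait.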
