[scribe-server] review request: Add zookeeper-based network store discovery.

514 views
Skip to first unread message

Travis Crawford

unread,
May 1, 2010, 6:42:24 PM5/1/10
to scribe...@googlegroups.com
Hey all -

Anyone feel like doing a code review?

At present Twitter is pretty far away from the facebook/master branch,
and we're trying to get everything merged back into the community
branch, at which point we can help testing the other awesome patches
that have come through :)

http://github.com/traviscrawford/scribe/commit/2e71b143c54a84f0be2025ec01b994b48d0ce7ab

"""
Add zookeeper-based network store discovery.

New configuration options:

zk_server -- Zookeeper connection string. Comma-separated list of
host:port pairs.
zk_registration_prefix -- Where to register in zookeeper, making the
scribe discoverable.

Modified configuration option:

remote_host -- Now accepts targets in zk:///zk/registration/prefix
format. Existing hostname-format values function the same as before.
Note when using the new form remote_port is ignored.

To reduce Zookeeper load only scribes that require discovery should
register themselves. That is, if a node will never be a network store
remote_host it should not register itself.

When discovering a remote scribe via Zookeeper, one of the registered
scribes will be selected at random. If zookeeper is unavailable the
previously
discovered scribe will be reused. If there is no previously discovered
scribe is available (for example, scribe restarted during a zookeeper
outage) sending to the network store will fail. In this case, a buffer
store can be used to buffer locally until zookeeper comes back online,
at which point scribe will retry discovery, find a remote scribe, and
replay the buffer.
"""

--travis

David Recordon

unread,
May 5, 2010, 9:09:01 PM5/5/10
to Scribe Server
Hey Travis,
I'm helping Gautam and the team get caught up on outstanding patches.
Can you please describe how you guys went about testing this patch?

Thanks,
--David


On May 1, 3:42 pm, Travis Crawford <traviscrawf...@gmail.com> wrote:
> Hey all -
>
> Anyone feel like doing a code review?
>
> At present Twitter is pretty far away from the facebook/master branch,
> and we're trying to get everything merged back into the community
> branch, at which point we can help testing the other awesome patches
> that have come through :)
>
> http://github.com/traviscrawford/scribe/commit/2e71b143c54a84f0be2025...

Travis Crawford

unread,
May 6, 2010, 5:38:00 PM5/6/10
to scribe...@googlegroups.com
On May 5, 2010, at 6:09 PM, David Recordon wrote:

Hey Travis,
I'm helping Gautam and the team get caught up on outstanding patches.
Can you please describe how you guys went about testing this patch?


Sure thing! Hopefully the highlighting comes through in logs below, as that helps identify the important parts. Let me know if you have any questions about how/why/wtf, or Zookeeper stuff if you're not familiar.


The following logs and examples are from my testing of the following two patches:


That are checked into this branch: http://github.com/traviscrawford/scribe/commits/fb-zk

In addition to the following tests, I run this in production on at least several hundred machines.

All testing thus far has been from looking at logs, and verifying the logs look correct in the analysis system.



######################################################################
### Test case: Zookeeper already running, scribe registers itself. ###
######################################################################

SUMMARY
At startup, a new ZKClient object is created, and because the config has a zk_registration_prefix a Zookeeper connection is established. When the connection and session are established we receive a callback which then registers the task in Zookeeper. Note the parent directories are created if necessary and an ephemeral registration node in host:port format is created.

LOGS
[travis@hoth ~/src/scribe]$ cat zk-agg.conf 
port=4444
zk_server=localhost:2181
zk_registration_prefix=/scribe/aggregator
<store>
  category=travis
  type=null
</store>
[travis@hoth ~/src/scribe]$ src/scribed -c zk-agg.conf 
[Thu May  6 11:34:25 2010] "setrlimit error (setting max fd size)" 
[Thu May  6 11:34:25 2010] "STATUS: STARTING" 
[Thu May  6 11:34:25 2010] "STATUS: configuring" 
[Thu May  6 11:34:25 2010] "got configuration data from file <zk-agg.conf>" 
[Thu May  6 11:34:25 2010] "Creating new ZKClient." 
2010-05-06 11:34:25,744:9223(0x7fff70a75be0):ZOO_INFO@log_env@658: Client environment:zookeeper.version=zookeeper C client 3.3.0
2010-05-06 11:34:25,744:9223(0x7fff70a75be0):ZOO_INFO@log_env@662: Client environment:host.name=hoth.local
2010-05-06 11:34:25,744:9223(0x7fff70a75be0):ZOO_INFO@log_env@669: Client environment:os.name=Darwin
2010-05-06 11:34:25,744:9223(0x7fff70a75be0):ZOO_INFO@log_env@670: Client environment:os.arch=10.3.0
2010-05-06 11:34:25,744:9223(0x7fff70a75be0):ZOO_INFO@log_env@671: Client environment:os.version=Darwin Kernel Version 10.3.0: Fri Feb 26 11:58:09 PST 2010; root:xnu-1504.3.12~1/RELEASE_I386
2010-05-06 11:34:25,744:9223(0x7fff70a75be0):ZOO_INFO@log_env@679: Client environment:user.name=travis
2010-05-06 11:34:25,745:9223(0x7fff70a75be0):ZOO_INFO@log_env@687: Client environment:user.home=/Users/travis
2010-05-06 11:34:25,745:9223(0x7fff70a75be0):ZOO_INFO@log_env@699: Client environment:user.dir=/Users/travis/src/scribe
2010-05-06 11:34:25,745:9223(0x7fff70a75be0):ZOO_INFO@zookeeper_init@727: Initiating client connection, host=localhost:2181 sessionTimeout=10000 watcher=0x10004b3b0 sessionId=0 sessionPasswd=<null> context=0x0 flags=0
2010-05-06 11:34:25,746:9223(0x7fff70a75be0):ZOO_DEBUG@start_threads@152: starting threads...
2010-05-06 11:34:25,746:9223(0x100481000):ZOO_DEBUG@do_completion@326: started completion thread
[Thu May  6 11:34:25 2010] "CATEGORY : travis" 
2010-05-06 11:34:25,747:9223(0x1002c0000):ZOO_DEBUG@do_io@279: started IO thread
[Thu May  6 11:34:25 2010] "configured <1> stores" 
2010-05-06 11:34:25,748:9223(0x1002c0000):ZOO_INFO@check_events@1585: initiated connection to server [::1:2181]
[Thu May  6 11:34:25 2010] "store thread starting" 
[Thu May  6 11:34:25 2010] "STATUS: " 
[Thu May  6 11:34:25 2010] "STATUS: ALIVE" 
[Thu May  6 11:34:25 2010] "Starting scribe server on port 4444" 
Thrift: Thu May  6 11:34:25 2010 libevent 1.4.13-stable method kqueue
2010-05-06 11:34:25,755:9223(0x1002c0000):ZOO_INFO@check_events@1632: session establishment complete on server [::1:2181], sessionId=0x1286ee218490001, negotiated timeout=10000
2010-05-06 11:34:25,755:9223(0x1002c0000):ZOO_DEBUG@check_events@1638: Calling a watcher for a ZOO_SESSION_EVENT and the state=ZOO_CONNECTED_STATE
2010-05-06 11:34:25,755:9223(0x100481000):ZOO_DEBUG@process_completions@1765: Calling a watcher for node [], type = -1 event=ZOO_SESSION_EVENT
[Thu May  6 11:34:25 2010] "Registering task in Zookeeper." 
2010-05-06 11:34:25,755:9223(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be30bb2 for path [/scribe] to ::1:2181
2010-05-06 11:34:25,756:9223(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be30bb2 rc=-110
2010-05-06 11:34:25,756:9223(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be30bb3 for path [/scribe/aggregator] to ::1:2181
2010-05-06 11:34:25,757:9223(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be30bb3 rc=-110
2010-05-06 11:34:25,757:9223(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be30bb4 for path [/scribe/aggregator/hoth.local:4444] to ::1:2181
2010-05-06 11:34:25,758:9223(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be30bb4 rc=0
2010-05-06 11:34:29,092:9223(0x1002c0000):ZOO_DEBUG@zookeeper_process@1974: Got ping response in 0 ms
^C
[travis@hoth ~/src/scribe]$ 





####################################################################################
### Test case: Znode already exists, set watch and register when node goes away. ###
####################################################################################

SUMMARY
In the case where scribe restarts quickly the previous ephemeral node may not have been garbage collected yet, so we set a watch on that znode and register when it goes away. Scribe can still receive messages during this time, and technically its still discoverable since clients would see the stale ephemeral znode.

LOGS
2010-05-06 11:41:45,997:9251(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be30d6a for path [/scribe/aggregator/hoth.local:4444] to fe80::1:2181
2010-05-06 11:41:45,998:9251(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be30d6a rc=0
2010-05-06 11:41:49,331:9251(0x1002c0000):ZOO_DEBUG@zookeeper_process@1974: Got ping response in 0 ms
^C
[travis@hoth ~/src/scribe]$ src/scribed -c zk-agg.conf 
[Thu May  6 11:41:51 2010] "setrlimit error (setting max fd size)" 
[Thu May  6 11:41:51 2010] "STATUS: STARTING" 
[Thu May  6 11:41:51 2010] "STATUS: configuring" 
[Thu May  6 11:41:51 2010] "got configuration data from file <zk-agg.conf>" 
[Thu May  6 11:41:51 2010] "Creating new ZKClient." 
2010-05-06 11:41:51,270:9252(0x7fff70a75be0):ZOO_INFO@log_env@658: Client environment:zookeeper.version=zookeeper C client 3.3.0
2010-05-06 11:41:51,270:9252(0x7fff70a75be0):ZOO_INFO@log_env@662: Client environment:host.name=hoth.local
2010-05-06 11:41:51,270:9252(0x7fff70a75be0):ZOO_INFO@log_env@669: Client environment:os.name=Darwin
2010-05-06 11:41:51,270:9252(0x7fff70a75be0):ZOO_INFO@log_env@670: Client environment:os.arch=10.3.0
2010-05-06 11:41:51,270:9252(0x7fff70a75be0):ZOO_INFO@log_env@671: Client environment:os.version=Darwin Kernel Version 10.3.0: Fri Feb 26 11:58:09 PST 2010; root:xnu-1504.3.12~1/RELEASE_I386
2010-05-06 11:41:51,270:9252(0x7fff70a75be0):ZOO_INFO@log_env@679: Client environment:user.name=travis
2010-05-06 11:41:51,271:9252(0x7fff70a75be0):ZOO_INFO@log_env@687: Client environment:user.home=/Users/travis
2010-05-06 11:41:51,271:9252(0x7fff70a75be0):ZOO_INFO@log_env@699: Client environment:user.dir=/Users/travis/src/scribe
2010-05-06 11:41:51,271:9252(0x7fff70a75be0):ZOO_INFO@zookeeper_init@727: Initiating client connection, host=localhost:2181 sessionTimeout=10000 watcher=0x10004b3b0 sessionId=0 sessionPasswd=<null> context=0x0 flags=0
2010-05-06 11:41:51,271:9252(0x7fff70a75be0):ZOO_DEBUG@start_threads@152: starting threads...
2010-05-06 11:41:51,271:9252(0x100481000):ZOO_DEBUG@do_completion@326: started completion thread
2010-05-06 11:41:51,271:9252(0x1002c0000):ZOO_DEBUG@do_io@279: started IO thread
[Thu May  6 11:41:51 2010] "CATEGORY : travis" 
[Thu May  6 11:41:51 2010] "configured <1> stores" 
[Thu May  6 11:41:51 2010] "STATUS: " 
[Thu May  6 11:41:51 2010] "STATUS: ALIVE" 
[Thu May  6 11:41:51 2010] "Starting scribe server on port 4444" 
2010-05-06 11:41:51,272:9252(0x1002c0000):ZOO_INFO@check_events@1585: initiated connection to server [127.0.0.1:2181]
[Thu May  6 11:41:51 2010] "store thread starting" 
Thrift: Thu May  6 11:41:51 2010 libevent 1.4.13-stable method kqueue
2010-05-06 11:41:51,282:9252(0x1002c0000):ZOO_INFO@check_events@1632: session establishment complete on server [127.0.0.1:2181], sessionId=0x1286ee218490003, negotiated timeout=10000
2010-05-06 11:41:51,283:9252(0x1002c0000):ZOO_DEBUG@check_events@1638: Calling a watcher for a ZOO_SESSION_EVENT and the state=ZOO_CONNECTED_STATE
2010-05-06 11:41:51,283:9252(0x100481000):ZOO_DEBUG@process_completions@1765: Calling a watcher for node [], type = -1 event=ZOO_SESSION_EVENT
[Thu May  6 11:41:51 2010] "Registering task in Zookeeper." 
2010-05-06 11:41:51,283:9252(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be30d70 for path [/scribe] to 127.0.0.1:2181
2010-05-06 11:41:51,285:9252(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be30d70 rc=-110
2010-05-06 11:41:51,285:9252(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be30d71 for path [/scribe/aggregator] to 127.0.0.1:2181
2010-05-06 11:41:51,287:9252(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be30d71 rc=-110
2010-05-06 11:41:51,287:9252(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be30d72 for path [/scribe/aggregator/hoth.local:4444] to 127.0.0.1:2181
2010-05-06 11:41:51,289:9252(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be30d72 rc=-110
2010-05-06 11:41:51,299:9252(0x100481000):ZOO_DEBUG@zoo_awexists@2573: Sending request xid=0x4be30d73 for path [/scribe/aggregator/hoth.local:4444] to 127.0.0.1:2181
2010-05-06 11:41:51,342:9252(0x1002c0000):ZOO_DEBUG@zookeeper_process@2016: Calling COMPLETION_STAT for xid=0x4be30d73 rc=0
[Thu May  6 11:41:51 2010] "Set watch on znode that already exists: /scribe/aggregator/hoth.local:4444" 
2010-05-06 11:41:54,633:9252(0x1002c0000):ZOO_DEBUG@zookeeper_process@1974: Got ping response in 0 ms
2010-05-06 11:41:57,967:9252(0x1002c0000):ZOO_DEBUG@zookeeper_process@1974: Got ping response in 0 ms
2010-05-06 11:42:00,015:9252(0x1002c0000):ZOO_DEBUG@zookeeper_process@1916: Processing WATCHER_EVENT
2010-05-06 11:42:00,015:9252(0x100481000):ZOO_DEBUG@process_completions@1765: Calling a watcher for node [/scribe/aggregator/hoth.local:4444], type = -1 event=ZOO_DELETED_EVENT
[Thu May  6 11:42:00 2010] "Registering task in Zookeeper." 
2010-05-06 11:42:00,015:9252(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be30d74 for path [/scribe] to 127.0.0.1:2181
2010-05-06 11:42:00,016:9252(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be30d74 rc=-110
2010-05-06 11:42:00,016:9252(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be30d75 for path [/scribe/aggregator] to 127.0.0.1:2181
2010-05-06 11:42:00,017:9252(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be30d75 rc=-110
2010-05-06 11:42:00,017:9252(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be30d76 for path [/scribe/aggregator/hoth.local:4444] to 127.0.0.1:2181
2010-05-06 11:42:00,018:9252(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be30d76 rc=0
2010-05-06 11:42:03,352:9252(0x1002c0000):ZOO_DEBUG@zookeeper_process@1974: Got ping response in 0 ms
2010-05-06 11:42:06,686:9252(0x1002c0000):ZOO_DEBUG@zookeeper_process@1974: Got ping response in 0 ms
^C
[travis@hoth ~/src/scribe]$ 


################################################################################################
### Test case: Zookeeper down when scribe starts, Scribe registers when Zookeeper is started ###
################################################################################################

SUMMARY
The Zookeeper client library takes care of retries when Zookeeper server is not available. More specifically, the client starts it its ZOO_CONNECTING_STATE, and the library retries when in this state. This is the same flow when a ZK server goes away -- the state machines goes back to ZOO_CONNECTING_STATE and we follow the same reconnect flow.

LOGS
[travis@hoth ~/src/scribe]$ src/scribed -c zk-agg.conf 
[Thu May  6 11:56:04 2010] "setrlimit error (setting max fd size)" 
[Thu May  6 11:56:04 2010] "STATUS: STARTING" 
[Thu May  6 11:56:04 2010] "STATUS: configuring" 
[Thu May  6 11:56:04 2010] "got configuration data from file <zk-agg.conf>" 
[Thu May  6 11:56:04 2010] "Creating new ZKClient." 
2010-05-06 11:56:04,476:9291(0x7fff70a75be0):ZOO_INFO@log_env@658: Client environment:zookeeper.version=zookeeper C client 3.3.0
2010-05-06 11:56:04,476:9291(0x7fff70a75be0):ZOO_INFO@log_env@662: Client environment:host.name=hoth.local
2010-05-06 11:56:04,476:9291(0x7fff70a75be0):ZOO_INFO@log_env@669: Client environment:os.name=Darwin
2010-05-06 11:56:04,476:9291(0x7fff70a75be0):ZOO_INFO@log_env@670: Client environment:os.arch=10.3.0
2010-05-06 11:56:04,476:9291(0x7fff70a75be0):ZOO_INFO@log_env@671: Client environment:os.version=Darwin Kernel Version 10.3.0: Fri Feb 26 11:58:09 PST 2010; root:xnu-1504.3.12~1/RELEASE_I386
2010-05-06 11:56:04,476:9291(0x7fff70a75be0):ZOO_INFO@log_env@679: Client environment:user.name=travis
2010-05-06 11:56:04,477:9291(0x7fff70a75be0):ZOO_INFO@log_env@687: Client environment:user.home=/Users/travis
2010-05-06 11:56:04,477:9291(0x7fff70a75be0):ZOO_INFO@log_env@699: Client environment:user.dir=/Users/travis/src/scribe
2010-05-06 11:56:04,477:9291(0x7fff70a75be0):ZOO_INFO@zookeeper_init@727: Initiating client connection, host=localhost:2181 sessionTimeout=10000 watcher=0x10004b3b0 sessionId=0 sessionPasswd=<null> context=0x0 flags=0
2010-05-06 11:56:04,478:9291(0x7fff70a75be0):ZOO_DEBUG@start_threads@152: starting threads...
2010-05-06 11:56:04,478:9291(0x1002c0000):ZOO_DEBUG@do_io@279: started IO thread
2010-05-06 11:56:04,478:9291(0x100481000):ZOO_DEBUG@do_completion@326: started completion thread
[Thu May  6 11:56:04 2010] "CATEGORY : travis" 
[Thu May  6 11:56:04 2010] "configured <1> stores" 
2010-05-06 11:56:04,478:9291(0x1002c0000):ZOO_ERROR@handle_socket_error_msg@1579: Socket [fe80::1:2181] zk retcode=-4, errno=61(Connection refused): server refused to accept the client
[Thu May  6 11:56:04 2010] "store thread starting" 
[Thu May  6 11:56:04 2010] "STATUS: " 
2010-05-06 11:56:04,479:9291(0x1002c0000):ZOO_ERROR@handle_socket_error_msg@1579: Socket [::1:2181] zk retcode=-4, errno=61(Connection refused): server refused to accept the client
[Thu May  6 11:56:04 2010] "STATUS: ALIVE" 
2010-05-06 11:56:04,479:9291(0x1002c0000):ZOO_ERROR@handle_socket_error_msg@1579: Socket [127.0.0.1:2181] zk retcode=-4, errno=61(Connection refused): server refused to accept the client
[Thu May  6 11:56:04 2010] "Starting scribe server on port 4444" 
Thrift: Thu May  6 11:56:04 2010 libevent 1.4.13-stable method kqueue
2010-05-06 11:56:07,813:9291(0x1002c0000):ZOO_ERROR@handle_socket_error_msg@1579: Socket [fe80::1:2181] zk retcode=-4, errno=61(Connection refused): server refused to accept the client
2010-05-06 11:56:07,813:9291(0x1002c0000):ZOO_ERROR@handle_socket_error_msg@1579: Socket [::1:2181] zk retcode=-4, errno=61(Connection refused): server refused to accept the client
2010-05-06 11:56:07,813:9291(0x1002c0000):ZOO_ERROR@handle_socket_error_msg@1579: Socket [127.0.0.1:2181] zk retcode=-4, errno=61(Connection refused): server refused to accept the client
2010-05-06 11:56:11,146:9291(0x1002c0000):ZOO_ERROR@handle_socket_error_msg@1579: Socket [fe80::1:2181] zk retcode=-4, errno=61(Connection refused): server refused to accept the client
2010-05-06 11:56:11,147:9291(0x1002c0000):ZOO_ERROR@handle_socket_error_msg@1579: Socket [::1:2181] zk retcode=-4, errno=61(Connection refused): server refused to accept the client
2010-05-06 11:56:11,147:9291(0x1002c0000):ZOO_ERROR@handle_socket_error_msg@1579: Socket [127.0.0.1:2181] zk retcode=-4, errno=61(Connection refused): server refused to accept the client
2010-05-06 11:56:14,481:9291(0x1002c0000):ZOO_INFO@check_events@1585: initiated connection to server [fe80::1:2181]
2010-05-06 11:56:14,505:9291(0x1002c0000):ZOO_INFO@check_events@1632: session establishment complete on server [fe80::1:2181], sessionId=0x1286ef99f260000, negotiated timeout=10000
2010-05-06 11:56:14,506:9291(0x1002c0000):ZOO_DEBUG@check_events@1638: Calling a watcher for a ZOO_SESSION_EVENT and the state=ZOO_CONNECTED_STATE
2010-05-06 11:56:14,506:9291(0x100481000):ZOO_DEBUG@process_completions@1765: Calling a watcher for node [], type = -1 event=ZOO_SESSION_EVENT
[Thu May  6 11:56:14 2010] "Registering task in Zookeeper." 
2010-05-06 11:56:14,506:9291(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be310c5 for path [/scribe] to fe80::1:2181
2010-05-06 11:56:14,517:9291(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be310c5 rc=-110
2010-05-06 11:56:14,517:9291(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be310c6 for path [/scribe/aggregator] to fe80::1:2181
2010-05-06 11:56:14,526:9291(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be310c6 rc=-110
2010-05-06 11:56:14,526:9291(0x100481000):ZOO_DEBUG@zoo_acreate@2489: Sending request xid=0x4be310c7 for path [/scribe/aggregator/hoth.local:4444] to fe80::1:2181
2010-05-06 11:56:14,527:9291(0x1002c0000):ZOO_DEBUG@zookeeper_process@2049: Calling COMPLETION_STRING for xid=0x4be310c7 rc=0
2010-05-06 11:56:17,860:9291(0x1002c0000):ZOO_DEBUG@zookeeper_process@1974: Got ping response in 0 ms
2010-05-06 11:56:21,194:9291(0x1002c0000):ZOO_DEBUG@zookeeper_process@1974: Got ping response in 0 ms
^C
[travis@hoth ~/src/scribe]$ 



#######################################################################
### Test case: Discover remote_host and remote_port from Zookeeper. ###
#######################################################################

SUMMARY
When remote_host specifies a znode we randomly select a child registered at that location, overriding the remote_host and remote_port config options. Using this method means the existing code paths are reusued, so we have the same failure modes. Note that ZKClient is created, then a store, then we see the ZK connection start. This is because this config does not register a znode so we only open the connection when needed. After discovering a remote scribe we establis a connection to it as usual. Note we also disconnect if not registering ourselves, since the connection is not needed.

LOGS
[travis@hoth ~/src/scribe]$ cat zk.conf 
port=3333
zk_server=localhost:2181
<store>
  category=travis
  type=buffer
  retry_interval=5
  retry_interval_range=2
  <primary>
    type=network
    remote_host=zk:///scribe/aggregator
    remote_port=2222
  </primary>
  <secondary>
    type=file
    file_path=/tmp
    base_filename=travis
  </secondary>
</store>
[travis@hoth ~/src/scribe]$ src/scribed -c zk.conf 
[Thu May  6 12:03:28 2010] "setrlimit error (setting max fd size)" 
[Thu May  6 12:03:28 2010] "STATUS: STARTING" 
[Thu May  6 12:03:28 2010] "STATUS: configuring" 
[Thu May  6 12:03:28 2010] "got configuration data from file <zk.conf>" 
[Thu May  6 12:03:28 2010] "Creating new ZKClient." 
[Thu May  6 12:03:28 2010] "CATEGORY : travis" 
[Thu May  6 12:03:28 2010] "configured <1> stores" 
[Thu May  6 12:03:28 2010] "STATUS: " 
[Thu May  6 12:03:28 2010] "STATUS: ALIVE" 
[Thu May  6 12:03:28 2010] "store thread starting" 
2010-05-06 12:03:28,115:9321(0x1002c0000):ZOO_INFO@log_env@658: Client environment:zookeeper.version=zookeeper C client 3.3.0
2010-05-06 12:03:28,115:9321(0x1002c0000):ZOO_INFO@log_env@662: Client environment:host.name=hoth.local
2010-05-06 12:03:28,115:9321(0x1002c0000):ZOO_INFO@log_env@669: Client environment:os.name=Darwin
2010-05-06 12:03:28,115:9321(0x1002c0000):ZOO_INFO@log_env@670: Client environment:os.arch=10.3.0
2010-05-06 12:03:28,115:9321(0x1002c0000):ZOO_INFO@log_env@671: Client environment:os.version=Darwin Kernel Version 10.3.0: Fri Feb 26 11:58:09 PST 2010; root:xnu-1504.3.12~1/RELEASE_I386
2010-05-06 12:03:28,115:9321(0x1002c0000):ZOO_INFO@log_env@679: Client environment:user.name=travis
[Thu May  6 12:03:28 2010] "Starting scribe server on port 3333" 
Thrift: Thu May  6 12:03:28 2010 libevent 1.4.13-stable method kqueue
2010-05-06 12:03:28,117:9321(0x1002c0000):ZOO_INFO@log_env@687: Client environment:user.home=/Users/travis
2010-05-06 12:03:28,117:9321(0x1002c0000):ZOO_INFO@log_env@699: Client environment:user.dir=/Users/travis/src/scribe
2010-05-06 12:03:28,117:9321(0x1002c0000):ZOO_INFO@zookeeper_init@727: Initiating client connection, host=localhost:2181 sessionTimeout=10000 watcher=0x10004b3b0 sessionId=0 sessionPasswd=<null> context=0x0 flags=0
2010-05-06 12:03:28,117:9321(0x1002c0000):ZOO_DEBUG@start_threads@152: starting threads...
2010-05-06 12:03:28,117:9321(0x101184000):ZOO_DEBUG@do_completion@326: started completion thread
2010-05-06 12:03:28,117:9321(0x100787000):ZOO_DEBUG@do_io@279: started IO thread
2010-05-06 12:03:28,118:9321(0x100787000):ZOO_INFO@check_events@1585: initiated connection to server [::1:2181]
[Thu May  6 12:03:28 2010] "Getting the best remote scribe." 
2010-05-06 12:03:28,118:9321(0x1002c0000):ZOO_DEBUG@zoo_awget_children_@2612: Sending request xid=0x4be31281 for path [/scribe/aggregator] to ::1:2181
2010-05-06 12:03:28,122:9321(0x100787000):ZOO_INFO@check_events@1632: session establishment complete on server [::1:2181], sessionId=0x1286ef99f260003, negotiated timeout=10000
2010-05-06 12:03:28,122:9321(0x100787000):ZOO_DEBUG@check_events@1638: Calling a watcher for a ZOO_SESSION_EVENT and the state=ZOO_CONNECTED_STATE
2010-05-06 12:03:28,123:9321(0x100787000):ZOO_DEBUG@zookeeper_process@2026: Calling COMPLETION_STRINGLIST for xid=0x4be31281 rc=0
2010-05-06 12:03:28,123:9321(0x101184000):ZOO_DEBUG@process_completions@1765: Calling a watcher for node [], type = -1 event=ZOO_SESSION_EVENT
2010-05-06 12:03:28,123:9321(0x100787000):ZOO_DEBUG@do_io@317: IO thread terminated
2010-05-06 12:03:28,123:9321(0x101184000):ZOO_DEBUG@do_completion@336: completion thread terminated
2010-05-06 12:03:28,123:9321(0x1002c0000):ZOO_INFO@zookeeper_close@2290: Closing zookeeper sessionId=0x1286ef99f260003 to [::1:2181]
[Thu May  6 12:03:28 2010] "Got remote scribe from zookeeper <hoth.local:4444>" 
[Thu May  6 12:03:28 2010] "Opened connection to remote scribe server <hoth.local:4444>" 
[Thu May  6 12:03:28 2010] "[travis] Opened file </tmp/travis_00000> for writing" 
[Thu May  6 12:03:28 2010] "[travis] Changing state from <DISCONNECTED> to <SENDING_BUFFER>" 
[Thu May  6 12:03:28 2010] "[travis] successfully read <0> entries of <0> bytes from file </tmp/travis_00000>" 
[Thu May  6 12:03:28 2010] "[travis] No more buffer files to send, switching to streaming mode" 
[Thu May  6 12:03:28 2010] "[travis] Changing state from <SENDING_BUFFER> to <STREAMING>" 
^C
[travis@hoth ~/src/scribe]$ 


###################################################
### Test case: Unable to discover remote scribe ###
###################################################

SUMMARY
Zookeeper outages will happen. When unable to get a remote scribe the secondary store is used, if configured. Later when switching to the primary store we go through the discovery code path and potentially get a place to connect.


LOGS
[travis@hoth ~/src/scribe]$ src/scribed -c zk.conf 
[Thu May  6 14:27:51 2010] "setrlimit error (setting max fd size)" 
[Thu May  6 14:27:51 2010] "STATUS: STARTING" 
[Thu May  6 14:27:51 2010] "STATUS: configuring" 
[Thu May  6 14:27:51 2010] "got configuration data from file <zk.conf>" 
[Thu May  6 14:27:51 2010] "Creating new ZKClient." 
[Thu May  6 14:27:51 2010] "CATEGORY : travis" 
[Thu May  6 14:27:51 2010] "configured <1> stores" 
[Thu May  6 14:27:51 2010] "STATUS: " 
[Thu May  6 14:27:51 2010] "STATUS: ALIVE" 
[Thu May  6 14:27:51 2010] "store thread starting" 
2010-05-06 14:27:51,523:21643(0x1002c0000):ZOO_INFO@log_env@658: Client environment:zookeeper.version=zookeeper C client 3.3.0
2010-05-06 14:27:51,523:21643(0x1002c0000):ZOO_INFO@log_env@662: Client environment:host.name=hoth.local
[Thu May  6 14:27:51 2010] "Starting scribe server on port 3333" 
2010-05-06 14:27:51,523:21643(0x1002c0000):ZOO_INFO@log_env@669: Client environment:os.name=Darwin
2010-05-06 14:27:51,523:21643(0x1002c0000):ZOO_INFO@log_env@670: Client environment:os.arch=10.3.0
2010-05-06 14:27:51,523:21643(0x1002c0000):ZOO_INFO@log_env@671: Client environment:os.version=Darwin Kernel Version 10.3.0: Fri Feb 26 11:58:09 PST 2010; root:xnu-1504.3.12~1/RELEASE_I386
2010-05-06 14:27:51,523:21643(0x1002c0000):ZOO_INFO@log_env@679: Client environment:user.name=travis
Thrift: Thu May  6 14:27:51 2010 libevent 1.4.13-stable method kqueue
2010-05-06 14:27:51,524:21643(0x1002c0000):ZOO_INFO@log_env@687: Client environment:user.home=/Users/travis
2010-05-06 14:27:51,524:21643(0x1002c0000):ZOO_INFO@log_env@699: Client environment:user.dir=/Users/travis/src/scribe
2010-05-06 14:27:51,524:21643(0x1002c0000):ZOO_INFO@zookeeper_init@727: Initiating client connection, host=localhost:2181 sessionTimeout=10000 watcher=0x10004b3b0 sessionId=0 sessionPasswd=<null> context=0x0 flags=0
2010-05-06 14:27:51,526:21643(0x1002c0000):ZOO_DEBUG@start_threads@152: starting threads...
2010-05-06 14:27:51,526:21643(0x101181000):ZOO_DEBUG@do_completion@326: started completion thread
2010-05-06 14:27:51,526:21643(0x10078a000):ZOO_DEBUG@do_io@279: started IO thread
[Thu May  6 14:27:51 2010] "Getting the best remote scribe." 
2010-05-06 14:27:51,527:21643(0x1002c0000):ZOO_DEBUG@zoo_awget_children_@2612: Sending request xid=0x4be33458 for path [/scribe/aggregator] to 127.0.0.1:2181
2010-05-06 14:27:51,527:21643(0x10078a000):ZOO_INFO@check_events@1585: initiated connection to server [127.0.0.1:2181]
2010-05-06 14:27:51,530:21643(0x10078a000):ZOO_INFO@check_events@1632: session establishment complete on server [127.0.0.1:2181], sessionId=0x1286ef99f260019, negotiated timeout=10000
2010-05-06 14:27:51,530:21643(0x10078a000):ZOO_DEBUG@check_events@1638: Calling a watcher for a ZOO_SESSION_EVENT and the state=ZOO_CONNECTED_STATE
2010-05-06 14:27:51,530:21643(0x101181000):ZOO_DEBUG@process_completions@1765: Calling a watcher for node [], type = -1 event=ZOO_SESSION_EVENT
2010-05-06 14:27:51,530:21643(0x10078a000):ZOO_DEBUG@zookeeper_process@2026: Calling COMPLETION_STRINGLIST for xid=0x4be33458 rc=0
2010-05-06 14:27:51,531:21643(0x10078a000):ZOO_DEBUG@do_io@317: IO thread terminated
2010-05-06 14:27:51,531:21643(0x101181000):ZOO_DEBUG@do_completion@336: completion thread terminated
2010-05-06 14:27:51,531:21643(0x1002c0000):ZOO_INFO@zookeeper_close@2290: Closing zookeeper sessionId=0x1286ef99f260019 to [127.0.0.1:2181]
[Thu May  6 14:27:51 2010] "Unable to get a remote Scribe from zk:///scribe/aggregator
Thrift: Thu May  6 14:27:51 2010 TSocket::open() getaddrinfo() <Host: zk:///scribe/aggregator Port: 2222>nodename nor servname provided, or not known
[Thu May  6 14:27:51 2010] "failed to open connection to remote scribe server <zk:///scribe/aggregator:2222> thrift error <Could not resolve host for client socket.>" 
[Thu May  6 14:27:51 2010] "[travis] Opened file </tmp/travis_00000> for writing" 
[Thu May  6 14:27:51 2010] "[travis] choosing new retry interval <4> seconds" 
[Thu May  6 14:27:51 2010] "[travis] Changing state from <DISCONNECTED> to <DISCONNECTED>" 
^C
[travis@hoth ~/src/scribe]$ 



--Travis

g13760...@gmail.com

unread,
Jun 19, 2015, 3:51:10 AM6/19/15
to scribe...@googlegroups.com, travisc...@gmail.com
hi travis , could you tell me how to install the scribe (https://github.com/traviscrawford/scribe)。
i want the boost version and thinft version.
i install it with boost1.44 and thinft 3/4/5  both all failed. 

and the configure cmd is :

./bootstrap.sh  --enable-zookeeper --prefix=/usr/local/scribe --with-boost=/usr/local/boost/ --with-thriftpath=/usr/local/thrift/ --with-fb303path=/usr/local/fb303/ LDFLAGS="-L/usr/soft/zookeeper-3.4.6/src/c/.libs -lzookeeper_mt" CPPFLAGS='-I/usr/soft/zookeeper-3.4.6/src/c/include -I/usr/soft/zookeeper-3.4.6/src/c/generated'

think you!

在 2010年5月2日星期日 UTC+8上午6:42:24,Travis Crawford写道:

g13760...@gmail.com

unread,
Jun 19, 2015, 3:54:26 AM6/19/15
to scribe...@googlegroups.com, travisc...@gmail.com
hi , could you tell me how to install the scribe (https://github.com/traviscrawford/scribe)。

i install it with boost1.44 and thinft 3/4/5  both all failed. 

i want to konw  the boost version and thinft version.

and the configure cmd is :

./bootstrap.sh  --enable-zookeeper --prefix=/usr/local/scribe --with-boost=/usr/local/boost/ --with-thriftpath=/usr/local/thrift/ --with-fb303path=/usr/local/fb303/ LDFLAGS="-L/usr/soft/zookeeper-3.4.6/src/c/.libs -lzookeeper_mt" CPPFLAGS='-I/usr/soft/zookeeper-3.4.6/src/c/include -I/usr/soft/zookeeper-3.4.6/src/c/generated'

think you!

在 2010年5月7日星期五 UTC+8上午5:38:00,Travis Crawford写道:
...
Reply all
Reply to author
Forward
0 new messages