need help with my library tests

120 views
Skip to first unread message

Ingo Sauerzapf

unread,
Jun 24, 2016, 11:03:55 AM6/24/16
to openresty-en
Hi guys,
I am writing a library to connect to zookeeper from nginx with cosocket and semaphore and the library works just fine with a little test program I wrote that I run with the resty cli. I had never build any test for it but wanted to start with that now. I read the testing part in the eBook that was published on github and I am looking through the test in other modules but I seem to not be able to figure this one out. My current test is
# vim:set ft= ts=4 sw=4 et fdm=marker:

use Test::Nginx::Socket::Lua;
use Cwd qw(cwd);

repeat_each(1);

plan tests => repeat_each() * (3 * blocks());

my $pwd = cwd();

our $HttpConfig = qq{
    resolver \$TEST_NGINX_RESOLVER;
    lua_package_path "$pwd/lib/?.lua;$pwd/t/lib/?.lua;;";
    lua_package_cpath "/usr/local/openresty-debug/lualib/?.so;/usr/local/openresty/lualib/?.so;;";
};

$ENV{TEST_NGINX_RESOLVER} = '8.8.8.8';
$ENV{TEST_NGINX_ZOOKEEPER_HOST} = '127.0.0.1';
$ENV{TEST_NGINX_ZOOKEEPER_POST} = 2181;
 
log_level 'debug';
no_shuffle();
#no_long_string();

run_tests();

__DATA__

=== TEST 1: connect 
Connect and disconnect form zookeeper
--- http_config eval: $::HttpConfig
--- config
location = /t {
  content_by_lua '
    local zookeeper = require "resty.zookeeper"

    local zoo = zookeeper:new()
    local ok, err = zoo:connect("$TEST_NGINX_ZOOKEEPER_HOST", $TEST_NGINX_ZOOKEEPER_POST)
    if not ok then
      ngx.say("connect failed ", err)
      return
    end
    
    ngx.say("connected")
    
    zoo:close()

  ';
}
--- request
GET /t
--- response_body
connected

and when I run it I get below error. The test does connect and disconnect from zookeeper just fine, so I am assuming I make a mistake in the test setup.
:~/Development/lua-resty-zookeeper$ ./go
t/query.t .. 1/3
#   Failed test 'ERROR: client socket timed out - TEST 1: connect
# '
#   at /Library/Perl/5.18/Test/Nginx/Socket.pm line 1710.

#   Failed test 'TEST 1: connect  - status code ok'
#   at /Library/Perl/5.18/Test/Nginx/Socket.pm line 891.
#          got: ''
#     expected: '200'

#   Failed test 'TEST 1: connect  - response_body - response is expected (req 0)'
#   at /Library/Perl/5.18/Test/Nginx/Socket.pm line 1282.
#          got: ""
#       length: 0
#     expected: "connected\x{0a}"
#       length: 10
#     strings begin to differ at char 1 (line 1 column 1)
t/query.t .. Failed 3/3 subtests

Test Summary Report
-------------------
t/query.t (Wstat: 0 Tests: 3 Failed: 3)
  Failed tests:  1-3
Files=1, Tests=3,  3 wallclock secs ( 0.02 usr  0.00 sys +  0.13 cusr  0.03 csys =  0.18 CPU)
Result: FAIL

Any help is highly appreciated.

Yichun Zhang (agentzh)

unread,
Jun 24, 2016, 5:00:10 PM6/24/16
to openresty-en
Hello!

On Fri, Jun 24, 2016 at 8:03 AM, Ingo Sauerzapf wrote:
> and when I run it I get below error. The test does connect and disconnect
> from zookeeper just fine, so I am assuming I make a mistake in the test
> setup.
> :~/Development/lua-resty-zookeeper$ ./go
> t/query.t .. 1/3
> # Failed test 'ERROR: client socket timed out - TEST 1: connect
> # '

This error means the nginx server does not close the connection within
the test scaffold's client timeout threshold (default to 3 seconds).
To debug such problems, you should dig up your nginx's debug logs in
the file t/servroot/logs/error.log.

Regards,
-agentzh

Ingo Sauerzapf

unread,
Jun 24, 2016, 6:52:41 PM6/24/16
to openresty-en
thanks for the quick reply,
I went through the error log and found that the problem seems to be with ngx.sleep. Zookeeper expects clients to periodically ping the server and I did put that into a separate thread.

--[[ send pings as long as the state is connected to keep the tcp connection
     from timing out ]]
local function ping(self)
  local req = _str_byte4(-2)
            .. _str_byte4(11)

  while self.state == STATE.CONNECTED do
    local timeout = floor(self.props.timeout / 3000)
    ngx.sleep(timeout)
    ngx.log(ngx.DEBUG, "ping: send, timout: ", timeout)

    -- send ping, do not wait for any reponse
    send(self, req)
  end
end

Obviously a bad joice. How would I implement such a functionality correctly?
Message has been deleted

Ingo Sauerzapf

unread,
Jun 24, 2016, 7:23:11 PM6/24/16
to openresty-en
I think I found the answer, correct? https://github.com/openresty/lua-nginx-module#ngxtimerat. From what it looks like trying it out briefly this works. I am surprised my thread and the ngx.sleep worked just fine in the resty cli. Should nginx and the resty cli not behave the same here?

Ingo Sauerzapf

unread,
Jun 25, 2016, 9:22:02 PM6/25/16
to openresty-en
 have to stand myself corrected. It does not work with ngx.timer.at as I have to reuse the socket for the ping and ngx.timer.at is detached and cannot reuse the socket. How would I do a timed ping to zookeeper?

Yichun Zhang (agentzh)

unread,
Jun 29, 2016, 5:13:58 PM6/29/16
to openresty-en
Hello!

On Fri, Jun 24, 2016 at 3:52 PM, Ingo Sauerzapf wrote:
> I went through the error log and found that the problem seems to be with
> ngx.sleep. Zookeeper expects clients to periodically ping the server and I
> did put that into a separate thread.
>

Zookeeper requires a periodic ping "thread" for every connection?

> --[[ send pings as long as the state is connected to keep the tcp connection
> from timing out ]]
> local function ping(self)
> local req = _str_byte4(-2)
> .. _str_byte4(11)
>
> while self.state == STATE.CONNECTED do
> local timeout = floor(self.props.timeout / 3000)
> ngx.sleep(timeout)
> ngx.log(ngx.DEBUG, "ping: send, timout: ", timeout)
>
> -- send ping, do not wait for any reponse
> send(self, req)
> end
> end

Please note that the request handler always wait for all the light
threads to finish. The "resty" utility does not do this, however. So
if you use an infinite loop in a separate light thread to send ping
for your connection, then you should really kill your thread when the
main thread calls zoo:close().

Regards,
-agentzh

Ingo Sauerzapf

unread,
Jun 29, 2016, 11:15:20 PM6/29/16
to openresty-en
This was super helpful, makes perfect sense to kill the thread, but I am still making a mistake I guess. Yes every connection to zookeeper needs to periodically send a ping to one of the servers in a zookeeper cluster in order for the server to keep it alive. Of course one can just ignore that and just do a quick connect(), dosomething(), close(), but that would basically violate the idea of zookeeper where clients are expected to be permanently connected in order to receive notifications on changes happening in zookeeper.

I am now seeing the sleep timer expiring 10 seconds after the ping thread was killed
2016/06/29 20:03:44 [debug] 95200#0: *6 event timer del: 587210664: 1467255829095
2016/06/29 20:03:44 [debug] 95200#0: *6 lua run thread, top:0 c:1
2016/06/29 20:03:44 [debug] 95200#0: *6 [lua] zookeeper.lua:148: send(): sama continue:
2016/06/29 20:03:44 [debug] 95200#0: *6 [lua] zookeeper.lua:452: close(): ping thread killed
2016/06/29 20:03:44 [debug] 95200#0: *6 lua deleting light thread
2016/06/29 20:03:44 [debug] 95200#0: *6 [lua] zookeeper.lua:459: close(): receive thread killed
2016/06/29 20:03:44 [debug] 95200#0: *6 lua finalize socket
2016/06/29 20:03:44 [debug] 95200#0: *6 lua http cleanup free: 00007FDE23001648
2016/06/29 20:03:44 [debug] 95200#0: *6 lua close socket connection
2016/06/29 20:03:44 [debug] 95200#0: *6 free: 00007FDE20700200, unused: 48
2016/06/29 20:03:44 [debug] 95200#0: *6 reusable connection: 0
2016/06/29 20:03:44 [debug] 95200#0: *6 lua resume returned 0
2016/06/29 20:03:44 [debug] 95200#0: *6 lua light thread ended normally
2016/06/29 20:03:44 [debug] 95200#0: *6 lua deleting light thread
2016/06/29 20:03:44 [debug] 95200#0: *6 lua run thread returned -2
2016/06/29 20:03:44 [debug] 95200#0: worker cycle
2016/06/29 20:03:44 [debug] 95200#0: kevent timer: 9999, changes: 0
2016/06/29 20:03:54 [debug] 95200#0: kevent events: 0
2016/06/29 20:03:54 [debug] 95200#0: timer delta: 9999
2016/06/29 20:03:54 [debug] 95200#0: *6 event timer del: 0: 1467255834095
2016/06/29 20:03:54 [debug] 95200#0: *6 lua sleep timer expired: "/zookeeper?"
2016/06/29 20:03:54 [debug] 95200#0: *6 lua run thread, top:0 c:1
2016/06/29 20:03:54 [debug] 95200#0: *6 [lua] zookeeper.lua:179: ping: send, timout: 10
2016/06/29 20:03:54 [debug] 95200#0: *6 [lua] zookeeper.lua:136: send(): Packet Size: 8
2016/06/29 20:03:54 [error] 95200#0: *6 attempt to send data on a closed socket: u:000000000C5D79E8, c:0000000000000000, ft:0 eof:0, client: 127.0.0.1, server: , request: "GET /zookeeper HTTP/1.1", host: "localhost:1080"
2016/06/29 20:03:54 [debug] 95200#0: *6 lua resume returned 0
2016/06/29 20:03:54 [debug] 95200#0: *6 lua light thread ended normally
2016/06/29 20:03:54 [debug] 95200#0: *6 lua deleting light thread
2016/06/29 20:03:54 [debug] 95200#0: *6 http set discard body
2016/06/29 20:03:54 [debug] 95200#0: *6 lua sending last buf of the response body
2016/06/29 20:03:54 [debug] 95200#0: *6 http output filter "/zookeeper?"
2016/06/29 20:03:54 [debug] 95200#0: *6 http copy filter: "/zookeeper?"
2016/06/29 20:03:54 [debug] 95200#0: *6 lua capture body filter, uri "/zookeeper"
2016/06/29 20:03:54 [debug] 95200#0: *6 http postpone filter "/zookeeper?" 00007FFF5388ED60
2016/06/29 20:03:54 [debug] 95200#0: *6 http chunk: 0


Yichun Zhang (agentzh)

unread,
Jun 30, 2016, 8:01:14 PM6/30/16
to openresty-en
Hello!

On Wed, Jun 29, 2016 at 8:15 PM, Ingo Sauerzapf wrote:
> Yes every connection to zookeeper needs to
> periodically send a ping to one of the servers in a zookeeper cluster in
> order for the server to keep it alive.

Seems like it's much easier, more transparent, and more efficient to
let your kernel's TCP stack do "TCP keepalive" for this.

> Of course one can just ignore that
> and just do a quick connect(), dosomething(), close(), but that would
> basically violate the idea of zookeeper where clients are expected to be
> permanently connected in order to receive notifications on changes happening
> in zookeeper.

Then you shouldn't create the zookeeper connection in your request
handler in the first place since requests do quit fast.

>
> I am now seeing the sleep timer expiring 10 seconds after the ping thread
> was killed

Then you should kill the light thread doing sleeping as well.

Regards,
-agentzh
Message has been deleted

Ingo Sauerzapf

unread,
Jun 30, 2016, 10:32:00 PM6/30/16
to openresty-en
On Thursday, June 30, 2016 at 5:01:14 PM UTC-7, agentzh wrote:
Hello!

On Wed, Jun 29, 2016 at 8:15 PM, Ingo Sauerzapf wrote:
>  Yes every connection to zookeeper needs to
> periodically send a ping to one of the servers in a zookeeper cluster in
> order for the server to keep it alive.

Seems like it's much easier, more transparent, and more efficient to
let your kernel's TCP stack do "TCP keepalive" for this.
 
TCP keepalive is not enough. Zookeeper expects each client to 
periodically send a Zookeeper ping which is part of the protocol. 
I got that covered, not a problem. 

> Of course one can just ignore that
> and just do a quick connect(), dosomething(), close(), but that would
> basically violate the idea of zookeeper where clients are expected to be
> permanently connected in order to receive notifications on changes happening
> in zookeeper.

Then you shouldn't create the zookeeper connection in your request
handler in the first place since requests do quit fast.
   I fully agree. I am just writing tests right now and in them I want the connection to be short lived. 
   A real use of the library would however not be in the request. 
>
> I am now seeing the sleep timer expiring 10 seconds after the ping thread
> was killed

Then you should kill the light thread doing sleeping as well.
I am killing the thread doing the sleeping and the log shows that 
the thread is killed but it is not working and nginx still waits. I will try some more 
I must be doing a stupid mistake that I am not able to see right now.
 

Regards,
-agentzh

Ingo Sauerzapf

unread,
Jun 30, 2016, 11:00:59 PM6/30/16
to openresty-en
I found the problem. The thread was actually not killed. I did not check the response from thread.kill() which was "failed to kill user thread: killer not paren". The 
spawn was in a local function and the kill was in the zoo:close() member function. I am really new to nginx and lua, would you be able to explain why this is the case?

Yichun Zhang (agentzh)

unread,
Jul 1, 2016, 2:15:49 PM7/1/16
to openresty-en
Hello!
Right now a light thread can only be killed by its parent. There is a
pending PR relaxing this constraint but has not yet got merged yet:

https://github.com/openresty/lua-nginx-module/pull/476

Regards,
-agentzh

Ingo Sauerzapf

unread,
Jul 1, 2016, 4:29:30 PM7/1/16
to openresty-en
Understood, but I am wondering why a local function xyz in my zookeeper.lua file is not the same parent as a function _M.close?

Yichun Zhang (agentzh)

unread,
Aug 25, 2016, 5:48:30 PM8/25/16
to openresty-en
Hello!

On Fri, Jul 1, 2016 at 1:29 PM, Ingo Sauerzapf wrote:
> Understood, but I am wondering why a local function xyz in my zookeeper.lua
> file is not the same parent as a function _M.close?
>

It's not the parent of function calls; instead, it's the parent light
thread of that light thread (that is, the light thread creates the
light thread in question).

Regards,
-agentzh
Reply all
Reply to author
Forward
0 new messages