lua-upstream-nginx-module & lua-resty-upstream-healthcheck questions


Natasha K

Jan 19, 2017, 7:25:41 PM
to openre...@googlegroups.com
Hello,

I am using both the lua-upstream-nginx-module and lua-resty-upstream-healthcheck modules and am looking for some clarification on how they work.

I have several upstream groups and I use spawn_checker to create a checker for each of them.  For example, here's one of the groups:

upstream default {
  server 10.11.12.13:8080;
  server 14.15.16.17:8080;
}

Let's say, once nginx starts, 10.11.12.13 is marked DOWN because it's actually down and 14.15.16.17 is up.  So the /status endpoint would show:

Nginx Worker PID: 94
Upstream default
    Primary Peers
        10.11.12.13:8080 DOWN
        14.15.16.17:8080 up
    Backup Peers

1.  If at this point I use `set_peer_down` to mark the 10.11.12.13 server up, it will be marked up (even though it is still physically down) and it will not be marked down again by the checker after x failed healthcheck calls.  In fact, the healthcheck calls do not seem to be happening at all at that point, at least I am not seeing them in the logs.  By the same token, if I need to disable a server that is actually up and I mark it down, it will not be marked up again after y successful healthcheck calls (in this scenario I actually want it to stay down, whereas in the opposite scenario I don't want a server that's physically down to be considered up by the config).  Is this expected behavior?  Does using `set_peer_down` somehow interfere with the checkers from the healthcheck module?  And if yes, how can I modify this so that only a server that is, in fact, up gets enabled?

2.  If I have multiple workers and I use `set_peer_down` to disable a server inside an upstream block, do I need to synchronize the workers myself, or will that be taken care of by the healthcheck module?  The documentation for `set_peer_down` says the following:

Note that this method only changes the peer settings in the current Nginx worker process. You need to synchronize the changes across all the Nginx workers yourself if you want a server-wide change (for example, by means of ngx_lua's ngx.shared.DICT).

But then I also found this, where agentzh suggests checking the healthcheck.lua library for details, and this Google group discussion on the same issue. Now I am not sure whether, if I am already using the healthcheck library and have spawned checkers for my upstreams, the worker synchronization is taken care of, or whether it still needs to be done explicitly by me.


3.  A slightly off-topic question: when I define my upstreams with hostnames instead of IP addresses, I see that the /status endpoint and get_servers() from lua-upstream show the hostnames resolved to IP addresses.  Is this because nginx internally resolves them to IPs?  Is there a way to make it keep the hostnames and show them instead of the IPs?


Thank you!
--
=^..^=

N K

Jan 23, 2017, 4:09:19 PM
to openresty-en
Update on #2.

After some testing, I now see that the workers do, in fact, have to be synced manually.  I'm attempting to reuse what the healthcheck.lua library does, and I'm doing the following, but it is not working: only one worker sees the server as disabled.

Any idea what I am doing wrong? 

local function gen_peer_key(prefix, upstream, is_backup, server_id)
    if is_backup then
        return prefix .. upstream .. ":b" .. server_id
    end
    return prefix .. upstream .. ":p" .. server_id
end

local function set_peer_down_globally(upstream, is_backup, server_id, value)
    local ngx_upstream = require "ngx.upstream"
    local set_peer_down = ngx_upstream.set_peer_down

    local SHM = "healthcheck" -- this is set in nginx.conf as lua_shared_dict
    local dict = ngx.shared[SHM]

    -- mark the peer down/up in the current worker
    local ok, err = set_peer_down(upstream, is_backup, server_id, value)
    if not ok then
        ngx.log(ngx.ERR, "failed to set peer down: ", err)
        return ngx.exit(ngx.HTTP_NOT_FOUND)
    end

    -- record the desired state in the shared dict, reusing the "d:" key
    -- scheme from healthcheck.lua
    local key = gen_peer_key("d:", upstream, is_backup, server_id)
    ok, err = dict:set(key, value)
    if not ok then
        ngx.log(ngx.ERR, "failed to set peer down state: ", err)
        return ngx.exit(ngx.HTTP_NOT_FOUND)
    end

    return true
end

function disable_server(upstream_name, server_id)
    local ok = set_peer_down_globally(upstream_name, false, server_id, true)
    if not ok then
        return ngx.exit(ngx.HTTP_NOT_FOUND)
    end
    return ngx.exit(ngx.HTTP_OK)
end
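
For completeness, this is roughly how I expose the helper above; the location path and the query argument names here are just illustrative placeholders, not my real config:

location = /disable {
    content_by_lua_block {
        local args = ngx.req.get_uri_args()
        disable_server(args.upstream, tonumber(args.id))
    }
}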

N K

Jan 23, 2017, 11:16:33 PM
to openresty-en
Hello,

I've figured out what was not working: I was missing the update of the upstream version in ctx.  Now everything works as expected, which answers my original 1st and 2nd questions.  But it brings me to another one: is there a way to turn a peer down/up without a reload and have it stay that way, i.e. avoid it being turned back up/down by /healthcheck, while still making use of the healthcheck library?
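
In case it helps anyone else, here is a minimal sketch of that missing step. It assumes the internal key scheme used by lua-resty-upstream-healthcheck ("d:" keys for per-peer down state and a "v:" .. upstream key as the per-upstream version counter), so please verify it against the healthcheck.lua you are actually running. After writing the "d:" key, bumping the version should make each worker's checker notice the change on its next check cycle, re-read the "d:" flags, and apply them with set_peer_down:

local dict = ngx.shared.healthcheck  -- the same lua_shared_dict given to spawn_checker

local function bump_upstream_version(upstream)
    local key = "v:" .. upstream
    local new_ver, err = dict:incr(key, 1)
    if not new_ver then
        -- no checker has written a version yet; start the counter at 1
        local ok
        ok, err = dict:add(key, 1)
        if not ok then
            return nil, err
        end
        return 1
    end
    return new_ver
end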

Thank you.