detecting shutdown vs HUP


David Birdsong

Jun 15, 2015, 9:21:50 PM
to openresty-en
I've got a timer, and I'd like to be able to detect a full shutdown event vs. a worker exiting upon reload.

Does this work?

init_by_lua - the master process sets a key in shm
init_worker_by_lua - a timer function uses resty-lock and removes the key, OR
detects that the premature flag is set to true and checks for the key in shm: if it's present, then init_by_lua was run recently and only the worker process is exiting; if it's absent, the whole master/child process group is exiting.
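
Roughly, I mean something like this (just a sketch of the idea; the "flags" and "locks" shm names are made up):

    lua_shared_dict flags 64k;
    lua_shared_dict locks 64k;

    init_by_lua_block {
        -- the master process marks every (re)load of the configuration
        ngx.shared.flags:set("just_loaded", true)
    }

    init_worker_by_lua_block {
        local function check(premature)
            local lock = require("resty.lock"):new("locks")
            lock:lock("just_loaded")

            if premature then
                -- this worker is exiting: if the key is still there,
                -- init_by_lua ran again recently, so this is only a reload;
                -- if it's gone, the whole master/child group is going down
                local reloading = ngx.shared.flags:get("just_loaded")
                ngx.log(ngx.NOTICE, reloading and "reload" or "full shutdown")
            else
                -- normal startup: consume the key and keep a timer pending
                ngx.shared.flags:delete("just_loaded")
                ngx.timer.at(60, check)
            end

            lock:unlock()
        end

        ngx.timer.at(0, check)
    }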

Will this logic work? Have I understood the phases correctly?

Yichun Zhang (agentzh)

Jun 15, 2015, 9:30:21 PM
to openresty-en
Hello!

On Tue, Jun 16, 2015 at 9:21 AM, David Birdsong wrote:
> I've got a timer that I'd like to be able to detect a full shutdown event vs
> a worker exiting upon reload.
>
> Does this work?
>
> init_by_lua - master process sets a key in shm
> init_worker_by_lua - timer function uses resty-lock and removes the key OR
> detects the premature flag is set to true and checks for the key in shm, if
> present, then init_by_lua was run recently and only the worker process is
> exiting; if absent, the whole master/child process group is exiting.
>

No, this is not reliable: there is a potential race condition here
because you cannot assume anything about the relative running order
of the new workers' init_worker_by_lua and the old workers' exits.

You can detect a HUP reload by checking in init_by_lua whether there
is already existing data in the shm. This works because shm data
survives a HUP reload but not a full stop/start. Other than that, all
worker exits can be considered a "full shutdown" according to your
definition :)

Best regards,
-agentzh

David Birdsong

Jun 15, 2015, 9:49:42 PM
to openre...@googlegroups.com
OK, what about every init_worker_by_lua appending its pid to a serialized list in shm, inside a resty-lock?

When a timer fires w/ premature set, the worker locks, removes its pid, unlocks, then locks again and checks the pid list?

I'm trying to add a hook that would deregister a service in consul but cut down on service register/deregister churn. If a new worker arrives so late that the old worker can't detect it in the shm list of pids, then the service should be deregistered anyway IMO.

As I understand timers, they're similar to a connection, so can a timer hold up a worker's exiting or is there a cutoff time?


Yichun Zhang (agentzh)

Jun 16, 2015, 1:57:35 AM
to openresty-en
Hello!

On Tue, Jun 16, 2015 at 9:49 AM, David Birdsong wrote:
> Ok, what about every init_worker_by_lua appending its pid to a serialized
> list in shm inside a resty-lock.
>

Be prepared for abnormal cases where a worker crashes or otherwise
dies uncleanly. That should not usually happen, but it MAY happen.

> When a timer fires w/ premature set, the worker locks, removes its pid,
> unlocks, then locks again and checks the pid list?
>

This is fragile in extreme cases where a worker exits abnormally
(like crashing). You need to be prepared for that even though it
usually won't happen.

> I'm trying to add a hook that would deregister a service in consul but cut
> down on service register/deregister churn. If a new worker arrives so late
> that the old worker can't detect it in the shm list of pids, then the
> service should be deregistered anyway IMO.
>

I think you should just add the HUP reload check in init_by_lua, plus
a trigger in a recurring timer created in init_worker_by_lua that
only fires for one of the workers (using lua-resty-lock or just a
special flag in shdict). And yeah, record a timestamp in init_by_lua
and compare it with the current one in init_worker_by_lua.
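
One way this could look (just a sketch; the "state" dict and the key
names are made up, and the consul call is left as a comment):

    lua_shared_dict state 1m;

    init_by_lua_block {
        local state = ngx.shared.state
        -- existing data in shm => this load is a HUP reload, not a fresh start
        state:set("is_hup_reload", state:get("boot_time") ~= nil)
        state:set("boot_time", os.time())
    }

    init_worker_by_lua_block {
        local state = ngx.shared.state
        local my_boot_time = state:get("boot_time")

        local function tick(premature)
            if premature then
                -- this worker is exiting; a newer boot_time in shm means
                -- init_by_lua already ran again, i.e. this is just a reload
                if state:get("boot_time") == my_boot_time then
                    ngx.log(ngx.NOTICE, "no newer generation seen: ",
                            "treating this as a real shutdown")
                    -- deregister the service from consul here
                end
                return
            end

            -- let only one worker do the periodic work per interval:
            -- add() succeeds for exactly one caller until the key expires
            if state:add("leader", ngx.worker.pid(), 5) then
                -- the periodic registration/health work goes here
            end

            local ok, err = ngx.timer.at(1, tick)
            if not ok then
                ngx.log(ngx.ERR, "failed to re-arm the timer: ", err)
            end
        end

        local ok, err = ngx.timer.at(0, tick)
        if not ok then
            ngx.log(ngx.ERR, "failed to create the timer: ", err)
        end
    }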

> As I understand timers, they're similar to a connection,

Yes.

> so can a timer hold
> up a worker's exiting or is there a cutoff time?
>

Neither. All pending timers will expire immediately and have their
handlers called with the premature argument set.

Regards,
-agentzh

Nelson, Erik - 2

Jun 16, 2015, 9:06:58 AM
to openre...@googlegroups.com
I'm trying to do something logically like the redis routing example on

http://openresty.org/#DynamicRoutingBasedOnRedis

but instead of routing to a certain host, I want the request routed to a certain worker. Is there any way to direct a specific location/request to a designated worker?

Thanks

Erik


Yichun Zhang (agentzh)

Jun 16, 2015, 9:14:39 AM
to openresty-en
Hello!

On Tue, Jun 16, 2015 at 9:06 PM, Nelson, Erik - 2 wrote:
> I'm trying to do something logically like the redis routing example on
> http://openresty.org/#DynamicRoutingBasedOnRedis
> but instead of routing to a certain host, I want the request routed to a certain worker. Is there any way to direct a specific location/request to a designated worker?
>

No.

I wonder why you need that in the first place, especially since we
already have shared memory dictionaries that are visible across all
the worker processes (and also lua-resty-lock for activating only a
single worker).
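
For example (dict names are made up; the lock is only needed when a
single "winner" should do the expensive step):

    lua_shared_dict cache 10m;
    lua_shared_dict locks 1m;

    location /answer {
        content_by_lua_block {
            local cache = ngx.shared.cache

            -- any worker can serve this: the dict is shared by all of them
            local val = cache:get("answer")
            if not val then
                local lock = require("resty.lock"):new("locks")
                lock:lock("answer")

                val = cache:get("answer")  -- someone may have filled it meanwhile
                if not val then
                    val = 42               -- the expensive computation goes here
                    cache:set("answer", val)
                end

                lock:unlock()
            end

            ngx.say(val)
        }
    }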

Regards,
-agentzh

Nelson, Erik - 2

Jun 16, 2015, 9:26:44 AM
to openre...@googlegroups.com
Yichun Zhang wrote on Tuesday, June 16, 2015 9:15 AM
>
> On Tue, Jun 16, 2015 at 9:06 PM, Nelson, Erik - 2 wrote:
> > I'm trying to do something logically like the redis routing example
> on
> > http://openresty.org/#DynamicRoutingBasedOnRedis
> > but instead of routing to a certain host, I want the request routed
> to a certain worker. Is there any way to direct a specific
> location/request to a designated worker?
> >
>
> No.
Thanks!
>
> I wonder why you need that in the first place especially when we
> already have shared memory dictionaries which are visible across all
> the worker processes already (and also lua-resty-lock for activating
> only a single worker).

It definitely can be solved with those mechanisms; I just thought that if I could do it this way, it might be a simpler solution.

Yichun Zhang (agentzh)

Jun 17, 2015, 2:53:42 AM
to openresty-en
Hello!

On Tue, Jun 16, 2015 at 9:26 PM, Nelson, Erik - 2 wrote:
> It definitely can be solved with those mechanisms- I just thought if I could do it then it might be a simpler solution.
>

Well, from the implementer's perspective, the setup with
undifferentiated (identical) workers is actually simpler and more
robust :)

Regards,
-agentzh

Nelson, Erik - 2

Jun 17, 2015, 9:12:46 AM
to openre...@googlegroups.com
Yichun Zhang wrote on Wednesday, June 17, 2015 2:54 AM
I believe you, but I don't understand. My implementation uses something like the example at

https://github.com/openresty/lua-resty-lock#for-cache-locks

but it seems to me that much of that could be avoided if the need to share between workers could be reduced to

http://wiki.nginx.org/HttpLuaModule#Data_Sharing_within_an_Nginx_Worker

by routing things that need the shared data to the same worker, which looks much simpler to me.
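
What I mean by the per-worker sharing on that wiki page is plain module-level Lua state, roughly like this ("mydata" is a made-up module name):

    -- mydata.lua: each nginx worker gets its own copy of this table,
    -- shared by all the requests that the SAME worker handles
    local _M = {}

    local store = {}

    function _M.get(key)
        return store[key]
    end

    function _M.set(key, value)
        store[key] = value
    end

    return _M

If every request that needs a given piece of data always landed on the same worker, a table like this would be enough.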

What am I missing?

Yichun Zhang (agentzh)

Jun 17, 2015, 9:26:03 AM
to openresty-en
Hello!

On Wed, Jun 17, 2015 at 9:12 PM, Nelson, Erik - 2 wrote:
> I believe you, but don't understand. My implementation uses something like the example at
>
> https://github.com/openresty/lua-resty-lock#for-cache-locks
>

This is for external caches involving I/O, like memcached, redis,
etc. Such an external cache can be shared across multiple
nginx/openresty boxes.

> but it seems to me that much of that could be avoided if the need to share between workers could be reduced to
>
> http://wiki.nginx.org/HttpLuaModule#Data_Sharing_within_an_Nginx_Worker
>
> by routing things that need the shared data to the same worker, which looks much simpler to me.
>

This shm-based cache is per-server. It's for a different use case but
can also be backed by an external cache.

Regards,
-agentzh

Nelson, Erik - 2

Jun 17, 2015, 10:28:04 AM
to openre...@googlegroups.com
Yichun Zhang wrote on Wednesday, June 17, 2015 9:26 AM
Confirming that

http://wiki.nginx.org/HttpLuaModule#Data_Sharing_within_an_Nginx_Worker

is shm-based and per-server? The docs mention that it is per-worker, not per-server.

Maybe I misunderstood your reply.

Thanks

Yichun Zhang (agentzh)

Jun 17, 2015, 10:54:41 PM
to openresty-en
Hello!

On Wed, Jun 17, 2015 at 10:27 PM, Nelson, Erik - 2 wrote:
> Confirming that
>
> http://wiki.nginx.org/HttpLuaModule#Data_Sharing_within_an_Nginx_Worker
>
> is shm-based and per-server?

Nope. It's not shm backed. It's just a Lua VM thing.

> The docs mention that it is per-worker, not per-server.
>

Correct.

> Maybe I misunderstood your reply.
>

I don't think there are any errors in my previous reply.

Regards,
-agentzh

Nelson, Erik - 2

Jun 24, 2015, 1:45:23 PM
to openre...@googlegroups.com
I'm using tcpsock:connect() as documented here

http://wiki.nginx.org/HttpLuaModule#tcpsock:connect

to test whether anything is listening on certain TCP ports. It's not an error if nothing is listening, but I end up with lots of errors in my log, like

2015/06/24 13:34:42 [error] 1336#0: *104 connect() failed (111: Connection refused)...
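
The check itself is roughly this (simplified; the host and port are placeholders):

    local sock = ngx.socket.tcp()
    sock:settimeout(1000)  -- connect timeout, in milliseconds

    local ok, err = sock:connect("127.0.0.1", 8080)
    if not ok then
        -- a refused connection is an expected outcome here, not a failure
        ngx.say("closed: ", err)
        return
    end

    sock:close()
    ngx.say("open")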

Is there some way to prevent the tcp socket from logging these errors?

Yichun Zhang (agentzh)

Jun 24, 2015, 2:05:33 PM
to openresty-en
Hello!

On Thu, Jun 25, 2015 at 1:45 AM, Nelson, Erik - 2 wrote:
> I'm using tcpsock:connect() as documented here
> http://wiki.nginx.org/HttpLuaModule#tcpsock:connect
> to test whether ports have something listening on TCP. It's not an error if nothing is listening, but I end up with lots of errors in my log, like
> 2015/06/24 13:34:42 [error] 1336#0: *104 connect() failed (111: Connection refused)...
> Is there some way to prevent the tcp socket from logging these errors?
>

Currently no. This error message is generated by the nginx core and
is hard-coded there. It would require patching the nginx core to make
it conditional. Would you look into a patch for the nginx core? We
could ship such a patch in OpenResty's bundled version of nginx by
default.

Thanks!
-agentzh

Nelson, Erik - 2

Jun 24, 2015, 2:27:04 PM
to openre...@googlegroups.com
Yichun Zhang (agentzh) wrote on Wednesday, June 24, 2015 2:06 PM
Sure, I'll take a look at it. Under ngx_openresty-1.7.7.2/bundle/nginx-1.7.7/src/http (for example) I see files like ngx_http_upstream.c.orig and ngx_http_upstream.c; are those the original nginx file and the resty-patched file, respectively?

Nelson, Erik - 2

Jun 24, 2015, 4:18:33 PM
to openre...@googlegroups.com
Yichun Zhang (agentzh) wrote on Wednesday, June 24, 2015 2:06 PM
I poked around a little bit. It seems like the message is coming from around line 3468 of ngx_http_lua_socket_tcp.c

if (llcf->log_socket_errors) {
    (void) ngx_connection_error(c, err, "connect() failed");
}

It looks like it's conditional on llcf->log_socket_errors, but it wasn't obvious to me how to control that.

Does this seem like the right path?

Thanks

Yichun Zhang (agentzh)

Jun 24, 2015, 11:37:33 PM
to openresty-en
Hello!

On Thu, Jun 25, 2015 at 4:18 AM, Nelson, Erik - 2 wrote:
> I poked around a little bit. It seems like the message is coming from around line 3468 of ngx_http_lua_socket_tcp.c
> if (llcf->log_socket_errors) {
> (void) ngx_connection_error(c, err, "connect() failed");
> }
> It looks like it's conditional on llcf->log_socket_errors, but it wasn't obvious to me how to control that.
>

Oh, you can control that via the lua_socket_log_errors directive:

https://github.com/openresty/lua-nginx-module#lua_socket_log_errors
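
For example, to silence these only where the probing happens (the
location name here is arbitrary):

    location /probe {
        lua_socket_log_errors off;
        content_by_lua_block {
            local sock = ngx.socket.tcp()
            local ok, err = sock:connect("127.0.0.1", 8080)
            ngx.say(ok and "open" or ("closed: " .. err))
        }
    }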

Glad it can be controlled in nginx.conf :)

Regards,
-agentzh