Balancing work between workers in init_worker_by_lua* and ngx.timer.at

Bogdan Irimia

Nov 3, 2015, 4:22:42 AM
to openre...@googlegroups.com
Hello, everybody

In our application we have the following situation: we have a set of parameters that need to be calculated offline (with a timer), at different intervals. What we do is we start a timer (with ngx.timer.at) for each parameter, with the corresponding interval, in init_worker_by_lua_file. Each timer, when it runs, sets a lock (using the lua-resty-lock library) and executes the processing code. The lock has the role of avoiding computing the same parameter in each worker.
With this configuration, the load between workers is different. It is very possible that all the work is done by one worker, and all the others stay idle. What we would like is to find a method to balance the work between these workers.
Do you guys have any suggestions?

Thank you

Bogdan Irimia

Nelson, Erik - 2

Nov 3, 2015, 9:26:29 AM
to openre...@googlegroups.com
Bogdan Irimia Sent: Tuesday, November 03, 2015 4:23 AM

>In our application we have the following situation: we have a set of parameters that need to be calculated offline (with a timer), at different intervals. What we do is we start a timer (with ngx.timer.at) for each parameter, with the corresponding interval, in init_worker_by_lua_file. Each timer, when it runs, sets a lock (using the lua-resty-lock library) and executes the processing code. The lock has the role of avoiding computing the same parameter in each worker.
>With this configuration, the load between workers is different. It is very possible that all the work is done by one worker, and all the others stay idle. What we would like is to find a method to balance the work between these workers.

Maybe have the workers decide among themselves how to divide up the work to balance it out?

Bogdan Irimia

Nov 3, 2015, 9:28:56 AM
to openre...@googlegroups.com
Ok, but how should I do that?

Aapo Talvensaari

Nov 3, 2015, 9:45:35 AM
to openresty-en
On Tuesday, 3 November 2015 16:28:56 UTC+2, bogdan wrote:
> Ok, but how should I do that?

1. Say you have 4 workers that do the computing of 8 different calculations or parameters
2. Each worker will need to process 2 parameters

Now, in init_by_lua, compile a list of the parameters in SHM.

Then, in init_worker_by_lua, each worker will read 2 different parameters from that list. When this gets merged you can lpop in init_worker_by_lua, but let's not think about that now; let's think about how you can do this today:

How to do this now:

One thing I'm not sure about is whether the workers are started serially or in parallel. Let's just say (I'm not sure if it is correct) that they are started in parallel, so they run init_worker_by_lua concurrently; if not, then this is even easier.

1. In init_worker_by_lua, start looping over that list of parameters
2. Check whether the parameter is already processed; if it is, skip it, if not:
3. Lock against the parameter name with a 0 ms timeout so that only one worker will take it
4. Mark the parameter as processed
5. Unlock the parameter
6. Break the loop when ceil(parameters / workers count) parameters have been taken by this worker
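
Steps 1 to 6 can be sketched roughly as follows. This is only a sketch under assumptions: `claim_parameters` and the `claimed:` key prefix are made-up names, and instead of lua-resty-lock with a 0 ms timeout it relies on an atomic `add` on the store to cover the check/lock/mark/unlock steps in one call, which is a swapped-in technique rather than what the steps literally describe. `dict` is anything with an `add(key, value)` method that fails when the key already exists; in OpenResty an `ngx.shared.DICT` zone behaves this way.

```lua
-- Sketch only: claim up to ceil(#params / worker_count) parameters for one worker.
-- `dict` must provide add(key, value) that succeeds only if the key is absent
-- (ngx.shared.DICT:add behaves this way); `owner` identifies the worker.
local function claim_parameters(dict, params, worker_count, owner)
    local quota = math.ceil(#params / worker_count)
    local mine = {}
    for _, name in ipairs(params) do
        if #mine >= quota then
            break                      -- step 6: this worker has its share
        end
        -- steps 2-5 in one atomic call: only the first caller gets `ok`
        local ok = dict:add("claimed:" .. name, owner)
        if ok then
            mine[#mine + 1] = name
        end
    end
    return mine
end
```

In init_worker_by_lua this might be called as `claim_parameters(ngx.shared.params, list, workers, ngx.worker.pid())`, where the zone name `params`, the `list` table and the `workers` count are placeholders for your own setup.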

Now each worker has its own set of parameters. Just in case an Nginx worker crashes,
store the names in SHM together with the worker pid.

Now start a recurring timer for each parameter.
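
A recurring timer is usually built by having the handler reschedule itself. The sketch below assumes exactly that pattern; `start_recurring`, `compute` and `on_error` are hypothetical names, and `timer_at` is expected to behave like `ngx.timer.at(delay, callback)`, which passes a `premature` flag as the first argument to the callback.

```lua
-- Sketch only: run compute(name) every `interval` seconds for one parameter.
-- `timer_at` is expected to behave like ngx.timer.at(delay, callback).
local function start_recurring(timer_at, interval, name, compute, on_error)
    local handler
    handler = function(premature)
        if premature then
            return                     -- worker is shutting down; stop rescheduling
        end
        -- pcall keeps one failed computation from ending the timer chain
        local ok, err = pcall(compute, name)
        if not ok and on_error then
            on_error(name, err)
        end
        timer_at(interval, handler)    -- schedule the next run
    end
    return timer_at(interval, handler)
end
```

In OpenResty the call would be something like `start_recurring(ngx.timer.at, 60, "param_a", compute_param)`, where `compute_param` and the 60-second interval are placeholders.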

If a crash happens, then before step 1 you need to check whether there are previously registered
parameters for this worker in SHM. If there are, use those and skip steps 2-6.

Now if the workers are started serially (this can be easily tested), this is easy: just pop
ceil(parameters / workers count) entries from the SHM list built in init_by_lua. If the list is empty,
all the work has already been taken. For example, if you have 4 workers and 2 parameters,
only 2 workers will have something to do.

And that's about it. Now you have Nginx running as an event loop for these timers and the
timers are divided between workers.

Bogdan Irimia

Nov 3, 2015, 10:00:38 AM
to openre...@googlegroups.com
Ok, I understand. I will still need to figure out the way I should handle these "crashes", but the balancing method seems quite ok.
I appreciate the verbose response. Thank you!

Nelson, Erik - 2

Nov 6, 2015, 12:44:52 PM
to openre...@googlegroups.com
I'm using openresty 1.9.3.1 on RHEL5, I noticed if I set a header like

ngx.header['X-Custom-Error'] = header_string

that the output headers will be corrupted if header_string has CR/LF in it. It makes sense, just surprised me. The error from the C# client that I saw was

"The server committed a protocol violation. Section=ResponseHeader Detail=CR must be followed by LF"

Just a heads-up in case someone else runs into that.
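
A minimal sketch of guarding against this, assuming the stray CR/LF is only trailing whitespace that can safely be dropped. `strip_trailing_newlines` is a made-up helper name, not part of the ngx API.

```lua
-- Sketch only: drop trailing CR/LF characters before using a string as a header value.
local function strip_trailing_newlines(s)
    -- the parentheses discard gsub's second return value (the match count)
    return (s:gsub("[\r\n]+$", ""))
end

-- In OpenResty one might then write:
-- ngx.header['X-Custom-Error'] = strip_trailing_newlines(header_string)
```

This only handles trailing line endings; a CR or LF in the middle of the value is left untouched and would still need to be rejected or handled separately.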

Lord Nynex

Nov 6, 2015, 1:10:50 PM
to openre...@googlegroups.com
Hello,

Can you explain more details? CR/LF is what is defined in the standard for separating header values. You can read more about that here http://www.w3.org/Protocols/rfc2616/rfc2616-sec2.html#sec2.2

If header_string contains CR/LF anywhere but the end of the string, this is a protocol violation and the warning you receive is correct. Headers aren't the best place to store multi-line data. It should not be necessary to insert cr/lf line terminators into header strings manually as nginx should do this for you. 

Can you explain what you mean by 'corrupt' ?

Nelson, Erik - 2

Nov 6, 2015, 1:32:50 PM
to openre...@googlegroups.com

In my case, header_string had been read from the body of a URL that was served by ngx.say() and consequently had a trailing newline (that I failed to trim off); it was not intended to be multi-line data.

The warning/error was correct; it *was* a protocol violation, which is how I defined ‘corrupt’. I’m not saying there’s anything unexplained or incorrect with openresty.

There’s nothing to be solved, just a searchable note on the list that might help someone spend less time tracking the problem down than I did.

Yichun Zhang (agentzh)

Nov 6, 2015, 10:36:01 PM
to openresty-en
Hello!

On Sat, Nov 7, 2015 at 2:32 AM, Nelson, Erik - 2 wrote:
> In my case, header_string had been read from the body of a URL that was
> served by ngx.say() and consequently had a trailing newline (that I failed
> to trim off); it was not intended to be multi-line data.
>
> The warning/error was correct; it *was* a protocol violation, which is how I
> defined ‘corrupt’. I’m not saying there’s anything unexplained or incorrect
> with openresty.
>
> There’s nothing to be solved, just a searchable note on the list that might
> help someone spend less time tracking the problem down than I did.
>

Agreed. We can reject header values containing CR or LF in the response
and request header APIs, for security reasons as well. It comes with a
small price though. Maybe it's worth it. Patches welcome.

Regards,
-agentzh

Nelson, Erik - 2

Nov 9, 2015, 1:50:10 PM
to openre...@googlegroups.com
Yichun Zhang Sent: Friday, November 06, 2015 10:36 PM
> To: openresty-en
> Subject: Re: [openresty-en] warning about headers with CR/LF
When you mention the request header API, maybe you're thinking it's very similar to transform_underscores_in_resp_headers?

It seems to me that on request headers, if there is a CRLF, the request is already malformed; can that really be fixed up? Won't that already be parsed by nginx?

Erik

Yichun Zhang (agentzh)

Nov 9, 2015, 9:44:09 PM
to openresty-en
Hello!

On Tue, Nov 10, 2015 at 2:50 AM, Nelson, Erik - 2 wrote:
>
> When you mention the request header API, maybe you're thinking it's very similar to transform_underscores_in_resp_headers?
>

I was talking about the ngx.req.set_header() API function :)

> It seems to me on request headers, if there are CRLF the request is already malformed- can that really be fixed up? Won't that already be parsed by nginx?
>

I mean new headers inserted by the Lua API, not those from the HTTP clients.

Regards,
-agentzh

Anshul Agrawal

Nov 14, 2015, 12:35:56 PM
to openre...@googlegroups.com
Sorry for chiming in a little late, but doesn't the HTTP protocol allow CRLF in header values? The only requirement is that the next line should start with a space or horizontal tab.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec2.html#sec2.2

Regards,
Anshul

Yichun Zhang (agentzh)

Nov 15, 2015, 10:33:27 AM
to openresty-en
Hello!

On Sun, Nov 15, 2015 at 1:35 AM, Anshul Agrawal <anshul...@gmail.com> wrote:
> Sorry for chiming in a little late, but doesn't the HTTP protocol allows
> CRLF in the header values? The only requirement being, next line should
> start with the space or horizontal tab.
>
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec2.html#sec2.2
>

Yeah, seems like the safest way is to check for exactly this setting.

Regards,
-agentzh
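
The check discussed above could look roughly like this. It is a sketch only: `is_safe_header_value` is a hypothetical helper, not an existing ngx_lua function, and it accepts CR/LF only in the folding form described in the thread (CRLF immediately followed by a space or horizontal tab). RFC 7230 later deprecated this kind of line folding, so rejecting CR and LF outright may be the simpler policy.

```lua
-- Sketch only: true if every CR/LF in `value` is part of a legal fold
-- (CRLF immediately followed by a space or horizontal tab).
local function is_safe_header_value(value)
    -- remove every legal fold; assigning to one local keeps only the string
    local unfolded = value:gsub("\r\n[ \t]", "")
    -- any CR or LF still left would break the header block
    return not unfolded:find("[\r\n]")
end
```

A header setter could call this before assigning to `ngx.header[...]` or calling `ngx.req.set_header()` and refuse the value when it returns false.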