Memory utilization across thousands of sites


John K.

Feb 20, 2016, 2:38:13 PM
to openresty-en
Hey guys,

First of all, thanks agentzh for the project and work here. We love OpenResty and are excited to use it even more. Now to my questions... They are fairly generic and not about a bug or issue with OpenResty per se.

We are using OpenResty + lua-nginx-module on a small hosting setup with a few thousand sites, and we started to notice an increase in memory utilization after we began using Lua more often. I'm wondering if you guys have a good strategy or recommendation to improve our flow.

Basically, each site has a server {} block with multiple location {} blocks inside, like this:

location /1 {
    access_by_lua_file /home/clientX/lua/include1.lua;
}

location /wp-admin {
    access_by_lua_file /home/clientX/lua/wp-admin.lua;
}

location /2...
location /3..

location / {
    access_by_lua_file /home/clientX/lua/generic.lua;
}


Inside each block, we include access_by_lua_file with slightly different code, depending on the use case. So in the end, we have a few thousand entries of:

access_by_lua_file /home/clientX/lua/code.lua
access_by_lua_file /home/client1/lua/code.lua
access_by_lua_file /home/client2/lua/code.lua
..

Most of them have exactly the same code.lua content, just in different files (only a few are different).


My questions:

1- Is that a good approach? Do you guys see any issues there?


2- Most clients have the exact same lua code. If we used /home/generic/lua/code.lua (instead of /home/clientX) for the majority of location/server blocks, instead of having one different file per client, does that affect the memory footprint?

I mean, how does OpenResty / the Lua module handle memory if the same file is included as "access_by_lua_file" across multiple location & server blocks? Will it reuse the memory or allocate it separately for each? It complicates our setup a bit more, but if it gives better performance, we might change our approach.


3- Is there any performance gain from including multiple Lua files, or would it be better to concatenate all of them into one big Lua file? I mean, is it better to do:


access_by_lua_file small1.lua
access_by_lua_file small2.lua
access_by_lua_file small3.lua

Or

access_by_lua_file big.lua (with all small* files together)?


Too many questions, but thanks for the help. 






John K.

Feb 23, 2016, 1:05:40 AM
to openresty-en
Little bump here to see if anyone can help.

thanks!

Guanglin Lv

Feb 23, 2016, 5:25:20 AM
to openresty-en


On Sunday, February 21, 2016 at 3:38:13 AM UTC+8, John K. wrote:

> 2- Most clients have the exact same lua code. If we used /home/generic/lua/code.lua (instead of /home/clientX) for the majority of location/server blocks, instead of having one different file per client, does that affect the memory footprint?
>
> I mean, how does OpenResty / the Lua module handle memory if the same file is included as "access_by_lua_file" across multiple location & server blocks? Will it reuse the memory or allocate it separately for each? It complicates our setup a bit more, but if it gives better performance, we might change our approach.


  I'm not sure how OpenResty handles memory if the same file is included via "*_by_lua_file". If most clients have the same Lua code in your case, why don't you put it in a Lua module? Then you can require it in each file. The module will be loaded only once per worker.
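
A minimal sketch of this suggestion, not taken from the thread: the module name `site_common`, its `settings` function, and the option names are all hypothetical. The shared module is registered through `package.preload` here only so the snippet is self-contained; in a real setup it would live in its own file on `lua_package_path`.

```lua
-- Hypothetical shared module; normally this would be site_common.lua
-- somewhere on lua_package_path rather than a package.preload entry.
package.preload["site_common"] = function()
    local _M = {}

    -- Defaults shared by every client (illustrative values only).
    local defaults = { block_xmlrpc = true, max_body_kb = 1024 }

    -- Return a new table: shared defaults with per-client overrides on top.
    function _M.settings(overrides)
        local merged = {}
        for k, v in pairs(defaults) do merged[k] = v end
        for k, v in pairs(overrides or {}) do merged[k] = v end
        return merged
    end

    return _M
end

-- What a per-client file such as /home/clientX/lua/code.lua might shrink to.
-- require() runs the module body once per Lua VM and caches the result.
local common = require "site_common"
local cfg = common.settings({ max_body_kb = 4096 })
```

Clients with custom settings would pass only their overrides, so the per-client files stay tiny while the bulk of the code is shared.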

 

Hamish Forbes

Feb 23, 2016, 6:55:52 AM
to openresty-en

There are also other issues with having a large number of server { } blocks with the same (or very similar) Lua code.
For example, if you wish to use SSL with ngx.socket.tcp cosockets and use SSL verification, then you need to load a set of CA certs; doing this for every server block uses a ton of memory.

With the ssl_certificate_by_lua directives now available, there's little reason to use multiple server {} blocks anymore, unless you're using a lot of nginx C modules with different configuration, I suppose.

Just a couple of days ago I pushed some code to do simple vhost-style configuration in pure Lua: https://github.com/hamishforbes/lua-resty-vhost
The idea is that you can replace multiple server {} blocks that have different server_name configurations with an instance of resty.vhost and use the values stored in there to configure your Lua modules at runtime.

Hamish

John K.

Feb 24, 2016, 2:39:09 AM
to openresty-en
Thanks all.

We don't use anything specific like that, mostly basic rules and some basic apps for our users.

The issue with having a central file and using require is that some users have custom settings and small custom modifications, which makes it harder. That's why I was wondering about the memory utilization and how it is set up, to see whether it was worth trying to re-code it all.

Btw, do any of you know if there is any difference between including multiple small files or one big one (with access_by_lua_file), besides tracking extra file descriptors? I also thought the memory for each one was automatically duplicated for each location {} block, so specifying it at the server {} level or the location {} level would not make any difference.

thanks!

Mathew Heard

Feb 24, 2016, 8:50:05 PM
to openresty-en
A Lua file is only parsed and kept in memory once. Between loading multiple small files and one large file, where there is no overlap between the files, there should be minimal difference in memory usage, unless you consider other factors (such as copy-on-write).

In our case we have files for each phase and role, i.e. phases/access_by_${role}.lua etc. Configuration is loaded in init_by_lua (for copy-on-write memory reduction between workers).

Yichun Zhang (agentzh)

Feb 24, 2016, 9:44:12 PM
to openresty-en
Hello!

On Tue, Feb 23, 2016 at 2:25 AM, Guanglin Lv wrote:
> I'm not sure how does openresty handles memory if the same file is
> included as "*_by_lua_file".

To answer your question:

1. If the same file path is used in multiple *_by_lua_file directives,
then only one instance of the Lua code will be loaded into the memory
of a worker process (the Lua code cache uses the file path as the
key).

2. If the same piece of the Lua code snippet is used in multiple
*_by_lua_block {} or *_by_lua "" directives, then only one instance of
the Lua code will be loaded into a worker's memory (the Lua source's
MD5 checksum is used as the Lua code cache key).

As already mentioned by others, for optimal memory usage, one should
put as much of the Lua code as possible into Lua module files and
pre-load them in the context of init_by_lua* (this way all the nginx
worker processes MAY have a chance to share physical memory pages via
the copy-on-write (COW) feature of modern operating systems).
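
A minimal sketch of such pre-loading, not from the original post: it is meant to be the body of an init_by_lua_block { ... } in the http {} context, which runs in the master process before the workers are forked. The helper `preload_modules` and the module names `site_common` and `site_rules` are hypothetical placeholders.

```lua
-- Try to require each module up front and collect the ones that fail,
-- so a single missing module is reported instead of going unnoticed.
local function preload_modules(names)
    local failed = {}
    for _, name in ipairs(names) do
        local ok, err = pcall(require, name)
        if not ok then
            failed[#failed + 1] = { name = name, err = err }
        end
    end
    return failed
end

-- Hypothetical module names; replace with the modules your sites share.
local failed = preload_modules({ "site_common", "site_rules" })

for _, f in ipairs(failed) do
    local msg = "preload failed for " .. f.name .. ": " .. tostring(f.err)
    -- ngx is only defined inside OpenResty; fall back to stderr elsewhere.
    if ngx then
        ngx.log(ngx.ERR, msg)
    else
        io.stderr:write(msg, "\n")
    end
end
```

Whether pages are actually shared afterwards depends on the operating system and on how much the workers later write to that memory, which is why the sharing is only a possibility.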

Regards,
-agentzh

John K.

Feb 25, 2016, 1:34:05 PM
to openresty-en
agentzh: Thanks!

So in theory, if I include the same file on all server blocks, the memory utilization will be the same as if I had used a module, no?

thanks,



Yichun Zhang (agentzh)

Feb 25, 2016, 2:36:10 PM
to openresty-en
Hello!

On Thu, Feb 25, 2016 at 10:34 AM, John K. wrote:
> So in theory, if I include the same file on all server blocks, the memory
> utilization will be the same as if I had used a module, no?
>

Yes, as long as you use exactly the same file path.

But regarding your usage, it's recommended to use a generic server {}
block and do dispatch based on the host name (or the server IP address
in case of SSL requests lacking SNI) of the request. For example, CDN
gateways like CloudFlare use this approach for a huge number of
different hosts (WAY more than thousands).
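
A minimal sketch of host-based dispatch, not from the original post: the `sites` table, its `handler` field, the host names, and the `site_for` helper are hypothetical. Inside a single generic server {} block, an access_by_lua* handler could call `site_for(ngx.var.host)` and branch on the result.

```lua
-- Hypothetical per-site settings keyed by host name. A real deployment
-- would load these from its own configuration store.
local sites = {
    ["example.com"]      = { handler = "generic" },
    ["blog.example.com"] = { handler = "wp-admin" },
}

-- Settings used for any host that has no entry of its own.
local default_site = { handler = "generic" }

-- Look up the settings for a request's host name (case-insensitive).
local function site_for(host)
    if type(host) ~= "string" then
        return default_site
    end
    return sites[host:lower()] or default_site
end
```

Because every site goes through the same file path and the same module, only one copy of the Lua code is loaded per worker and the per-site differences live in plain data.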

Regards,
-agentzh

John K.

Feb 27, 2016, 11:14:51 AM
to openresty-en
Very useful to know. One more question:

I could not find anywhere how I would do a dispatch on a generic server {} block. Do you have an example? In your case, how
do you assign the upstream addresses and SSL settings + all the custom values for each site? 

thanks,

Yichun Zhang (agentzh)

Feb 27, 2016, 1:14:07 PM
to openresty-en
Hello!

On Sat, Feb 27, 2016 at 8:14 AM, John K. wrote:
> Very useful to know. One more question:
>
> I could not find anywhere how I would do a dispatch on a generic server {}
> block. Do you have an example? In your case, how
> do you assign the upstream addresses and SSL settings + all the custom
> values for each site?
>

Via the ngx_http_lua module. For example, there is
ssl_certificate_by_lua for dynamic downstream SSL configurations and
balancer_by_lua for dynamic upstream configurations (when working with
ngx_proxy, etc.).
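
A rough sketch of how those two hooks might be wired up, not from the original post. The `sites` table, its fields (`upstream`, `cert_der`, `key_der`), and the helper names are hypothetical; the `ngx.ssl` and `ngx.balancer` modules come from lua-resty-core. The two handler functions only work inside OpenResty, in the contexts named in the comments.

```lua
-- Hypothetical per-site data. Upstream hosts must be IP addresses, since
-- no DNS resolution happens in the balancer phase. cert_der / key_der
-- would hold certificate and key data already converted to DER.
local sites = {
    ["example.com"] = { upstream = { "127.0.0.1", 8080 } },
}

-- Case-insensitive lookup; returns nil for unknown or missing names.
local function site_for(name)
    return name and sites[name:lower()]
end

-- Body for ssl_certificate_by_lua_block: choose the certificate by SNI.
local function on_ssl_cert()
    local ssl = require "ngx.ssl"
    local name = ssl.server_name()
    local site = site_for(name)
    if not (site and site.cert_der and site.key_der) then
        return  -- keep the certificate configured statically in nginx.conf
    end
    local ok, err = ssl.clear_certs()
    if ok then ok, err = ssl.set_der_cert(site.cert_der) end
    if ok then ok, err = ssl.set_der_priv_key(site.key_der) end
    if not ok then
        ngx.log(ngx.ERR, "failed to set certificate: ", err)
    end
end

-- Body for balancer_by_lua_block: choose the upstream peer by host name.
local function on_balance()
    local balancer = require "ngx.balancer"
    local site = site_for(ngx.var.host)
    if not site then
        ngx.log(ngx.ERR, "no upstream configured for ", ngx.var.host)
        return
    end
    local ok, err = balancer.set_current_peer(site.upstream[1], site.upstream[2])
    if not ok then
        ngx.log(ngx.ERR, "failed to set peer: ", err)
    end
end
```

In practice each function would live in a shared module and be called from the matching *_by_lua_block in the generic server {} and upstream {} blocks.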

Regards,
-agentzh