nginx-lua shared dict

NinjaPenguin

Feb 11, 2013, 4:54:29 PM
to openre...@googlegroups.com
Hi All

I have a question relating to the lua_shared_dict

I've implemented a mechanism to serve cache contents from redis via lua

Every time a cache hit occurs I record a bunch of data (timestamp, id of the item returned, the browser's user agent, etc.). Originally I submitted this instantly to a Gearmand instance (using the OpenResty Gearman lib)

This worked relatively well, but Gearman struggled to cope with the number of connections required (it started struggling at around 15K/sec)

Using the lua_shared_dict I wanted to chunk up these 'stats records' and then submit them to Gearman once I had a batch of them (I chose a batch size of 40). The code for this is as follows:

        -- build the new job record
        local jobs = ngx.shared.jobs
        local new_job = ngx.encode_args( { ["impressions[]"] = string.format('%s||s=%s||i=%s||u=%s', ngx.var.request_uri, creative_id, tostring(ngx.time()), tostring(ngx.var.http_user_agent) ) } )

        -- fetch the current batch so we can append the new job
        local job_string, err = jobs:get('job_string')

        -- set the string
        if not job_string then
            jobs:add("job_string", string.format("worker_impression_processor#%s",new_job))
        else
            jobs:set('job_string', string.format("%s&%s", job_string, new_job))
        end

        -- incr the count safely
        local newval, err = jobs:incr("job_count", 1)
        if not newval and err == "not found" then
            jobs:add("job_count", 0)
            newval = jobs:incr("job_count", 1)
        end

        -- submit the job if we've seen 40 calls
        if (newval == 40) then

            -- proceed to submit the impression to gearman
            local gearman = require "resty.gearman"
            local gm = gearman:new()

            gm:set_timeout(500) -- 0.5 sec

            local ok, err = gm:connect("127.0.0.1", 4730)
            if not ok then
                -- can't submit the job, so we continue to serve the request
                ngx.log(ngx.ERR, "@cache: unable to connect to gearman: ", err)
            else
                -- submit the batched job to gearman
                local ok, err = gm:submit_job_bg("creative_call", jobs:get('job_string'))
                if not ok then
                    ngx.log(ngx.ERR, "@cache: unable to submit job to gearman: (",ngx.var.request_uri,") ", err)
                else
                    -- put it into the connection pool of size 100, with 0 idle timeout
                    local ok, err = gm:set_keepalive(0, 100)
                end
            end

            -- reset the shared dictionary
            jobs:flush_all()

        end


When testing this at load (12K/sec) I am seeing a large number of 'ngx_slab_alloc() failed: no memory in lua_shared_dict zone "jobs"' messages in the error log

The other issue I'm having, which is slightly more concerning (as I've read that the above 'error' is not a concern), is that I can put 12K calls into the system but don't subsequently get all 12K (in chunks of 40) submitted to Gearman - it's always a varying number short (and I'm getting no other errors, such as Gearman connection errors, for example)

Could this be because of concurrent access to the shared dict? Is this a known limitation (and so I'm not using it in the correct manner)?

Any help would be greatly appreciated as I'm a little stumped as to the best course of action here

Many thanks

/Matt

agentzh

Feb 11, 2013, 6:33:05 PM
to openre...@googlegroups.com
Hello!

On Mon, Feb 11, 2013 at 1:54 PM, NinjaPenguin wrote:
> The other issue i'm having, which is slightly more concerning (as i've read
> the above 'error' is not a concern) is that I can put 12K calls into the
> system but not subsequently get 12K (in chunks of 40) submitted to gearman -
> its always a varying number short (I'm getting no other errors such as
> gearman connection errors for example)
>
> Could this be because of concurrent access to the shared dict? is this a
> known limitation (and so i'm not using it in the correct manor?)
>

What is the total capacity defined in your lua_shared_dict
configuration? It seems that you're running out of room in the shared
memory zone. When you're running out of space in the zone, ngx_lua's
shared dict will start forcibly removing the least recently used items
that are not expired yet (and the Nginx core will also print out the
harmless "ngx_slab_alloc() failed: no memory" alert to your nginx
error.log).

Another possible cause for running out of memory in your shared dict
is memory fragmentation. To reduce fragmentation, you should put
similarly sized items into the same store whenever possible, and put
vastly differently sized items into different stores (equally sized
items will never lead to memory fragmentation at all).
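
For example, something along these lines (the zone names below are
just made up for illustration; each would need its own lua_shared_dict
directive in nginx.conf):

    -- small, similarly sized items (counters, flags) in one zone
    local meta = ngx.shared.jobs_meta
    -- large, variable-length items (the batched job string) in another
    local data = ngx.shared.jobs_data

    -- the counter only ever competes with other tiny items
    local newval = meta:incr("job_count", 1)
    if not newval then
        meta:add("job_count", 0)
        newval = meta:incr("job_count", 1)
    end

    -- the growing job string only ever competes with other long strings
    local job_string = data:get("job_string") or ""
    data:set("job_string", job_string .. "&" .. ngx.encode_args({ ts = tostring(ngx.time()) }))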

If you're on Linux, you can consider using the ngx-shm script in my
Nginx Systemtap Toolkit to analyse the shared memory zones in your
running Nginx processes in real time:

https://github.com/agentzh/nginx-systemtap-toolkit#ngx-shm

Best regards,
-agentzh

agentzh

Feb 11, 2013, 8:57:32 PM
to openre...@googlegroups.com
Hello!

On Mon, Feb 11, 2013 at 3:33 PM, agentzh <age...@gmail.com> wrote:
> What is the total capacity defined in your lua_shared_dict
> configuration? It seems that you're running out of room in the shared
> memory zone. When you're running out of space in the zone, ngx_lua's
> shared dict will start forcibly removing the least recently used items
> that are not expired yet (and the Nginx core will also print out the
> harmless "ngx_slab_alloc() failed: no memory" alert to your nginx
> error.log).
>

I should add that in your use case it's recommended to call the
"flush_expired" method manually (especially after the "flush_all"
call):

http://wiki.nginx.org/HttpLuaModule#ngx.shared.DICT.flush_expired

Because the "flush_all" method just mark all items as expired without
actually releasing the memory blocks in the store while
"flush_expired" will actually release all the expired items.

Best regards,
-agentzh

NinjaPenguin

Feb 12, 2013, 8:17:18 AM
to openre...@googlegroups.com
Hi Agentzh

Firstly thanks very much for getting back to me!

The storage size is 100M (I dramatically highballed it to ensure I had space). Testing today with the added call to flush_expired did indeed seem to remove the "ngx_slab_alloc() failed: no memory" msg - so thanks very much for that!

I am still seeing the issue with calls essentially being lost (and not being submitted to Gearman). The more I think about this, though, the more I believe it is due to the lack of atomicity within the shared dict: in the time between making a submission to Gearman and calling flush, other workers are probably writing to the dict, and those records are then flushed without ever having been submitted.

It's possible that I could work around this with a basic lock method using the :get call, but I'm undecided on how exactly this would work, and on the performance impact it may have

For now I have simply removed the chunking of these jobs and now submit on each request

This did however reveal a subsequent issue with connections to redis (I'm using the redis2 module) and the current upstream configuration:

    upstream redis {
        server unix:/redis-6406/redis.sock;

        # a pool with at most 4096 connections
        keepalive 4096;
    }

At high load this results in a number of:

[error] 5134#0: *67436 connect() to unix:/redis-6406/redis.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: XXX.XXX.XXX.XXX, server: test.io, request: "GET /test/mode:direct HTTP/1.1", subrequest: "/redis_creative_get", upstream: "redis2://unix:/redis-6406/redis.sock:", host: "XXX.XXX.XXX.XXX"

This may simply be a natural limit (I'm seeing this at a traffic load of around 15K/s), but I was wondering if there is anything I could look at to tune this further? (I should note that redis-server sees very low load at this point)

Many thanks for the continued help!

/Matt

agentzh

Feb 12, 2013, 2:14:16 PM
to openre...@googlegroups.com
Hello!

On Tue, Feb 12, 2013 at 5:17 AM, NinjaPenguin wrote:
> Hi Agentzh
>

Please do not capitalize my nick. Thank you.

> Firstly thanks very much for getting back to me!
>
> The storage size is 100M (I dramatically high balled it to ensure I had
> space). Testing today with the added call to flush_expired did indeed seem
> to remove the "ngx_slab_alloc() failed: no memory" msg - so thanks very much
> for that!
>

Good to know :)

> I am still seeing the issue with calls essentially being lost (and not being
> submitted to Gearman) - the more I think about this though the more I
> believe it is due to the lack of atomicity within the shared dict. I believe
> in the time between making a submission to Gearman and calling flush, other
> processes are probably writing to the space and so they are then
> subsequently flushed without having been written.
>

Atomicity is only guaranteed on the method call level. That is, "get"
is atomic, "set" is atomic, but the calling sequence of "get" and
"set" is not.

If you want to lock a sequence of calls, you have to emulate a
high-level lock yourself as discussed here:

https://groups.google.com/group/openresty-en/browse_thread/thread/4c91de9fc25dd2d7/6fdf04d24f12443f

Maybe we can eventually implement a builtin transaction API as in
Redis here in shared dict :)

> Its possible that I could work around this with a basic lock method using
> the :get call, but i'm undecided on how exactly this would work, and the
> performance impact it may have
>

See above. Also, group your shared dict operations together and do not
do any I/O while holding the lock.
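
For illustration, a rough sketch of that pattern for your batching case
(the "batch_lock" key name is just made up; this is the emulated lock,
not a builtin API):

    local jobs = ngx.shared.jobs

    -- "add" only succeeds when the key does not exist yet, so it acts as
    -- an atomic try-lock; the 1-second expiry keeps a crashed worker from
    -- holding the lock forever
    local locked = jobs:add("batch_lock", true, 1)
    if locked then
        -- critical section: shared dict operations only, no network I/O here
        local batch = jobs:get("job_string")
        jobs:delete("job_string")
        jobs:set("job_count", 0)
        jobs:delete("batch_lock")   -- release the lock first...

        if batch then
            -- ...then submit "batch" to Gearman outside the lock
        end
    end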

> For now I have simply removed the chunking of these jobs and now submit on
> each request
>
> This did however reveal a subsequent issue with connections to redis (I'm
> using the redis2 module) and the current upstream configuration:
>
> upstream redis {
> server unix:/redis-6406/redis.sock;
>
> # a pool with at most 4096 connections
> keepalive 4096;
> }
>
> At high load this results in a number of:
>
> [error] 5134#0: *67436 connect() to unix:/redis-6406/redis.sock failed (11:
> Resource temporarily unavailable) while connecting to upstream, client:
> XXX.XXX.XXX.XXX, server: test.io, request: "GET /test/mode:direct HTTP/1.1",
> subrequest: "/redis_creative_get", upstream:
> "redis2://unix:/redis-6406/redis.sock:", host: "XXX.XXX.XXX.XXX"
>
> This may be a natural limit possibly (I'm seeing this at a traffic load of
> around 15K/s) but was wondering if there was anything I could be looking at
> to further tune this? (I should note that redis-server sees very low load at
> this point)
>

It seems like your Redis server is just not catching up with the
traffic. You can consider tuning the Redis configurations, especially
enlarging the "backlog" setting and/or just sharding across multiple
Redis server instances.
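
For example, a very rough sketch of client-side sharding (the socket
paths below are made up; ngx.crc32_short just maps a given key to the
same instance every time):

    -- one local Redis instance per unix socket (paths are illustrative)
    local shards = {
        "unix:/redis-6406/redis.sock",
        "unix:/redis-6407/redis.sock",
        "unix:/redis-6408/redis.sock",
    }

    local function shard_for(key)
        return shards[(ngx.crc32_short(key) % #shards) + 1]
    end

    local sock_path = shard_for(ngx.var.request_uri)
    -- then connect to sock_path (with lua-resty-redis, or via a per-shard upstream)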

BTW, you should get better performance with the lua-resty-redis
library instead of subrequesting to ngx_redis2 in Lua.
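
Something along these lines (a minimal sketch only; the timeout, pool
size, and key are made up for illustration):

    local redis = require "resty.redis"
    local red = redis:new()
    red:set_timeout(500)  -- 0.5 sec

    local ok, err = red:connect("unix:/redis-6406/redis.sock")
    if not ok then
        ngx.log(ngx.ERR, "failed to connect to redis: ", err)
        return
    end

    local res, err = red:get(ngx.var.request_uri)
    if not res then
        ngx.log(ngx.ERR, "failed to get key: ", err)
        return
    end

    -- only after the last reply has been read, put the connection back
    -- into the pool (10 sec max idle time, pool size 100)
    local ok, err = red:set_keepalive(10000, 100)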

For high throughput like this, properly setting CPU affinity on both
the Nginx workers and the local backend servers like Redis will boost
performance dramatically in practice.

Also, if the CPU resource is the bottleneck, then generating Flame
Graphs for the daemon processes eating CPU time can give you a lot of
clues about further performance improvements. For example, the
following tools can be used to render Flame Graphs by sampling your
live systems under load (both Nginx and other processes like Redis)
on Linux:

https://github.com/agentzh/nginx-systemtap-toolkit#ngx-sample-bt
https://github.com/agentzh/nginx-systemtap-toolkit#ngx-sample-lua-bt

Best regards,
-agentzh

NinjaPenguin

Feb 12, 2013, 5:40:59 PM
to openre...@googlegroups.com
Hi agentzh (firstly, apologies for the uppercase, I blame autocorrect ;) )

There is some great info in here for me to take a look at now:

On Tuesday, 12 February 2013 19:14:16 UTC, agentzh wrote:
Hello!

On Tue, Feb 12, 2013 at 5:17 AM, NinjaPenguin wrote:
> Hi Agentzh
>

Please do not capitalize my nick. Thank you.

> Firstly thanks very much for getting back to me!
>
> The storage size is 100M (I dramatically high balled it to ensure I had
> space). Testing today with the added call to flush_expired did indeed seem
> to remove the "ngx_slab_alloc() failed: no memory" msg - so thanks very much
> for that!
>

Good to know :)

> I am still seeing the issue with calls essentially being lost (and not being
> submitted to Gearman) - the more I think about this though the more I
> believe it is due to the lack of atomicity within the shared dict. I believe
> in the time between making a submission to Gearman and calling flush, other
> processes are probably writing to the space and so they are then
> subsequently flushed without having been written.
>

Atomicity is only guaranteed on the method call level. That is, "get"
is atomic, "set" is atomic, but the calling sequence of "get" and
"set" is not.

This makes sense, thanks very much. My feeling is that locking the dict while I perform the I/O to Gearman may compromise the performance too much, but I will test this!
 

If you want to lock a sequence of calls, you have to emulate a
high-level lock yourself as discussed here:

    https://groups.google.com/group/openresty-en/browse_thread/thread/4c91de9fc25dd2d7/6fdf04d24f12443f

Maybe we can eventually implement a builtin transaction API as in
Redis here in shared dict :)

> Its possible that I could work around this with a basic lock method using
> the :get call, but i'm undecided on how exactly this would work, and the
> performance impact it may have
>

See above. Also, group your shared dict operations together, do not do
I/O in the middle of locking.

multi/exec would be a fantastic addition (IMO of course :) )
 

> For now I have simply removed the chunking of these jobs and now submit on
> each request
>
> This did however reveal a subsequent issue with connections to redis (I'm
> using the redis2 module) and the current upstream configuration:
>
>     upstream redis {
>         server unix:/redis-6406/redis.sock;
>
>         # a pool with at most 4096 connections
>         keepalive 4096;
>     }
>
> At high load this results in a number of:
>
> [error] 5134#0: *67436 connect() to unix:/redis-6406/redis.sock failed (11:
> Resource temporarily unavailable) while connecting to upstream, client:
> XXX.XXX.XXX.XXX, server: test.io, request: "GET /test/mode:direct HTTP/1.1",
> subrequest: "/redis_creative_get", upstream:
> "redis2://unix:/redis-6406/redis.sock:", host: "XXX.XXX.XXX.XXX"
>
> This may be a natural limit possibly (I'm seeing this at a traffic load of
> around 15K/s) but was wondering if there was anything I could be looking at
> to further tune this? (I should note that redis-server sees very low load at
> this point)
>

It seems like your Redis server is just not catching up with the
traffic. You can consider tuning the Redis configurations, especially
enlarging the "backlog" setting and/or just sharding across multiple
Redis server instances.

I'll definitely take a look at those settings, thanks very much! Sharding across multiple instances is something I had actually considered as well; it seems like the 'easiest' win perhaps.
 

BTW, you should get better performance with the lua-resty-redis
library instead of subrequesting to ngx_redis2 in Lua.
 
I will check this out again. I actually tested the lua-resty-redis lib but saw a marked increase in the number of connection issues; perhaps I should look at it again knowing the additional stuff I know now!


For high throughput like this, properly setting CPU affinity on both
the Nginx workers and the local backend servers like Redis will boost
performance dramatically in practice.
 
This is a fantastic tip, thanks - I will look into this. Do you have any specific resources I could read up on regarding this in practice?


Also, if the CPU resource is the bottleneck, then generating Flame
Graphs for the daemon processes eating CPU time can give you a lot of
clues about further performance improvements. For example, the
following tools can be used to render Flame Graphs by sampling your
live systems under load (both Nginx and other processes like Redis)
on Linux:

    https://github.com/agentzh/nginx-systemtap-toolkit#ngx-sample-bt
    https://github.com/agentzh/nginx-systemtap-toolkit#ngx-sample-lua-bt

Another awesome tip - something I've never played with before, so I will absolutely check this out!

Best regards,
-agentzh

Once again, thanks very much for the help and guidance, it's greatly appreciated!

/Matt 

agentzh

Feb 12, 2013, 5:53:11 PM
to openre...@googlegroups.com
Hello!

On Tue, Feb 12, 2013 at 2:40 PM, NinjaPenguin wrote:
> I will check this out again, I actually tested the lua-resty-redis lib but
> saw a marked increase in the number of connection issues, perhaps I should
> look at this again knowing the additional stuff I know now!
>

Maybe it was because you just didn't call the set_keepalive method in
the right way. People tend to call it too early, because the method
name can be a little confusing :)
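
In other words, the ordering matters. A tiny sketch (here "red" is
assumed to be an already connected lua-resty-redis object, and the key
and pool settings are made up):

    -- too early: this hands the connection back to the pool, and any
    -- further call on the object just fails with "closed"
    --   red:set_keepalive(10000, 100); red:get("some_key")

    -- correct: read every reply first, then make set_keepalive the very
    -- last call on the connection
    local res, err = red:get("some_key")
    local ok, err = red:set_keepalive(10000, 100)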

>>
>> For high throughput like this, properly setting CPU affinity on both
>> the Nginx workers and the local backend servers like Redis will boost
>> performance dramatically in practice.
>
>
> This is a fantastic tip, thanks - I will take a look into this, do you have
> any specific resources I could read up on with relation to this in practice?
>

Check out the worker_cpu_affinity directive in Nginx, for example:

http://wiki.nginx.org/CoreModule#worker_cpu_affinity

>
> Once again, thanks very much for the help and guidance, its greatly
> appreciated!
>

You're welcome :)

Best regards,
-agentzh