Does tcpsock:receive invoke the receive system call each time?


guang...@gmail.com

Jun 1, 2015, 05:33:10
to openre...@googlegroups.com
Hi,

I am using lua-resty-http and proxy_response to proxy a stream output, and I am hitting a performance problem: high CPU utilization, up to 98 percent.

There are about 150-200 active connections, and the stream output data size is less than 1 MB.

nginx is started with only one worker. I traced the nginx worker process with the command strace -p <nginx-worker-pid> -c:

It shows that most of the time is spent invoking receive and writev. I guess this is because tcpsock:receive is invoked in a loop inside the Lua coroutine; see the _chunked_body_reader function for more details.

Does tcpsock:receive invoke the receive system call each time? Is there a more performant approach, such as event notification?

Any suggestion or solution would be a big help.

Thanks.

James Hurst

Jun 1, 2015, 06:43:26
to openre...@googlegroups.com
Hi,

On 1 June 2015 at 10:33, <guang...@gmail.com> wrote:
> Hi,
>
> I am using lua-resty-http and proxy_response to proxy a stream output, and I am hitting a performance problem: high CPU utilization, up to 98 percent.
>
> There are about 150-200 active connections, and the stream output data size is less than 1 MB.

Can you post a gist of your code and an example of the response headers? It's important to know whether you are specifying a chunksize to proxy_response and whether the response is chunked-encoded. The more complete your example, the better.
 

> nginx is started with only one worker. I traced the nginx worker process with the command strace -p <nginx-worker-pid> -c:
>
> It shows that most of the time is spent invoking receive and writev. I guess this is because tcpsock:receive is invoked in a loop inside the Lua coroutine; see the _chunked_body_reader function for more details.
>
> Does tcpsock:receive invoke the receive system call each time? Is there a more performant approach, such as event notification?


tcpsock:receive is the ngx_lua cosocket API for reading from a TCP socket in a manner which yields when waiting on I/O. That is, it's the correct way to do this in ngx_lua. It makes sense that this is where almost all the CPU time is spent, because there's not really anything else happening.
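
For anyone less familiar with cosockets, a minimal sketch of the pattern (hypothetical host, port, and read pattern; not taken from lua-resty-http):

-- Minimal ngx_lua cosocket read. While waiting for data, receive()
-- yields the current Lua coroutine back to the nginx event loop instead
-- of blocking the worker, and is resumed when the socket is readable.
local sock = ngx.socket.tcp()
local ok, err = sock:connect("127.0.0.1", 8080)  -- hypothetical upstream
if not ok then
    ngx.log(ngx.ERR, "connect failed: ", err)
    return ngx.exit(502)
end

local line, err = sock:receive("*l")  -- yields until a full line arrives
if line then
    ngx.say(line)
end
sock:close()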

I suspect your performance issue might be a tuning thing: chunk sizes, number of workers, etc. 150-200 active connections on a single worker, potentially reading/writing up to 1 MB at a time?

Regards,

--

guang...@gmail.com

Jun 1, 2015, 07:27:24
to openre...@googlegroups.com
Hi James,

Sorry, I can't post a gist because of my company's network policy.

Yes, it's a chunked stream output on a single nginx worker. The stream connections are never closed until the data size reaches 1 MB or the connection has been open for 20 minutes.

The workflow is CLIENT <-----> OPENRESTY <-----> STREAMSERVER.

Here is the strace output:



I also did a comparison test between lua-resty-http and the nginx upstream method. The nginx upstream method performs much better than lua-resty-http; the CPU utilization stays between 7 and 15 percent.

Here is my nginx upstream config:

location = /proxy {
    internal;
    proxy_set_header Host $host;
    # $proxy_add_x_forwarded_for already appends to $http_x_forwarded_for
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Connection "";
    proxy_http_version 1.1;

    proxy_pass http://${backend}${request_uri};
}

location = /portal/eventstream {
    set $backend 'STREAMSERVER';
    try_files $uri /proxy;
}



On Monday, June 1, 2015 at 6:43:26 PM UTC+8, James Hurst wrote:

guang...@gmail.com

Jun 1, 2015, 07:36:42
to openre...@googlegroups.com
The workflow looks like:

CLIENT <-----> OPENRESTY <-----> STREAMSERVER
   |-- <heartbeat>   |   <heartbeat> --|
   |-- <some data>   |   <some data> --|
   |-- <some data>   |   <some data> --|
   |-- <heartbeat>   |   <heartbeat> --|
   |-- <heartbeat>   |   <heartbeat> --|
   |-- <some data>   |   <some data> --|

The stream connections are never closed until the data size reaches 1 MB or the connection has been open for 20 minutes.

On Monday, June 1, 2015 at 6:43:26 PM UTC+8, James Hurst wrote:
> Hi,

James Hurst

Jun 1, 2015, 09:18:32
to openre...@googlegroups.com
Unfortunately it's quite hard to imagine what's happening without some code examples and example response headers / data. Surely you can share a few simplified lines which illustrate how you're using lua-resty-http?

Again, have you tried specifying a chunksize? You could try passing in 2^16 or 2^17 as the chunksize parameter and see if it changes anything.

If you don't specify a chunksize, then proxy_response will read/yield data as per the chunked encoding (or the whole thing at once if there is no chunked encoding), whereas the nginx proxy module uses internal buffering.

So one guess might be that lua-resty-http is reading/yielding/writing in much smaller chunks than the proxy module. I'd rather not be guessing though; some examples would be really handy ;)
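
For illustration, such a call would look something like this (a sketch, assuming the proxy_response in the lua-resty-http version under discussion accepts an optional chunksize as its second argument, and using the hypothetical http_conn connection object):

-- Read/yield the upstream body in fixed 64 KiB buffer reads instead of
-- one cosocket read per encoded chunk (chunksize as assumed second arg).
http_conn:proxy_response(http_res, 2^16)  -- 2^16 = 65536 bytes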

James.


--

guang...@gmail.com

Jun 1, 2015, 21:44:18
to openre...@googlegroups.com

My nginx.conf looks like:

user root root;
worker_processes 1;
worker_rlimit_nofile 100000;

events {
    use epoll;
    worker_connections 100000;
}

http {
    include       mime.types;
    default_type  text/html;
    uninitialized_variable_warn off;

    server_names_hash_bucket_size 128;
    client_header_buffer_size 32k;
    large_client_header_buffers 4 32k;
    client_max_body_size 0;
    client_body_buffer_size 512k;
    directio 20m;
    directio_alignment 512;

    sendfile on;
    tcp_nopush on;
    keepalive_timeout 60 60;
    tcp_nodelay on;
    server_tokens off;

    fastcgi_connect_timeout 300;
    fastcgi_send_timeout 300;
    fastcgi_read_timeout 300;
    fastcgi_buffer_size 64k;
    fastcgi_buffers 4 64k;
    fastcgi_busy_buffers_size 128k;
    fastcgi_temp_file_write_size 128k;
    fastcgi_intercept_errors on;
    proxy_intercept_errors on;
    proxy_connect_timeout 75;
    proxy_read_timeout 300;
    proxy_send_timeout 300;
    proxy_buffer_size 64k;
    proxy_buffers 4 64k;
    proxy_busy_buffers_size 128k;
    proxy_temp_file_write_size 128k;
    proxy_http_version 1.1;
    proxy_set_header Connection "";

    gzip on;
    gzip_min_length 1k;
    gzip_buffers 4 16k;
    gzip_http_version 1.0;
    gzip_comp_level 2;
    gzip_types text/plain application/x-javascript text/css application/xml;
    gzip_vary on;

    proxy_headers_hash_max_size 51200;
    proxy_headers_hash_bucket_size 6400;

    server {
        listen 80 default_server backlog=10000;
        server_name _;

        location / {
            content_by_lua_file '/path/to/stream.lua';
        }
    }
}
-----------------

stream.lua looks like:

local http_req_headers = {}
local http_req_body = nil
local http_res = nil

-- Pass all of the client request headers through to the upstream
local headers = ngx.req.get_headers(0, true)
for k, v in pairs(headers) do http_req_headers[k] = v end

-- Extra proxy headers (req_host is defined elsewhere in my code)
http_req_headers["Host"] = req_host
http_req_headers["X-Real-IP"] = ngx.var.remote_addr
local forwarded = http_req_headers["X-Forwarded-For"]
if not forwarded then
    forwarded = ngx.var.remote_addr
else
    forwarded = forwarded .. "," .. ngx.var.remote_addr
end
http_req_headers["X-Forwarded-For"] = tostring(forwarded)
http_req_headers["Connection"] = ""
http_req_headers["X-Forwarded-Host"] = ngx.var.http_host

local http_req_params = {
    version = ngx.req.http_version() or 1.1,
    method = ngx.var.request_method,
    path = ngx.var.request_uri,
    headers = http_req_headers,
    body = http_req_body
}

local http = require "resty.http"
local http_conn = http:new()
http_conn:set_timeout(600000)  -- 10-minute cosocket timeout

local ok, err = http_conn:connect(STREAMSERVER_HOST, STREAMSERVER_PORT)
if ok then
    -- Stream the client request body to the upstream instead of buffering it
    http_req_params["body"] = http_conn:get_client_body_reader()
    http_res, err = http_conn:request(http_req_params)
    if not http_res then
        http_conn:close()
        return ngx.exit(500)
    end
else
    http_conn:close()
    return ngx.exit(500)
end

-- Stream the upstream response back to the client
http_conn:proxy_response(http_res)
http_conn:set_keepalive(60000, 150)

---------------

By the way, I have to invoke ngx.flush(true) after line resty/http.lua#L793 in proxy_response; it helps resolve lua-resty-http/issues/28.
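
The change amounts to something like this inside proxy_response's read loop (a sketch only; the exact surrounding code depends on the lua-resty-http version, and the loop body shown here is an assumption based on the line reference above):

-- Sketch of the local patch: flush after each proxied chunk so that
-- small heartbeat chunks reach the client immediately.
repeat
    local chunk, err = reader(chunksize)
    if chunk then
        ngx.print(chunk)   -- existing write of the proxied chunk
        ngx.flush(true)    -- added: wait until the data is actually sent
    end
until not chunk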

Yes, I don't specify a chunksize for proxy_response; it's intentional, to preserve the same chunked encoding as the stream server.



On Monday, June 1, 2015 at 9:18:32 PM UTC+8, James Hurst wrote:

guang...@gmail.com

Jun 1, 2015, 22:02:50
to openre...@googlegroups.com
I have no idea about the implementation of the stream server; someone else is in charge of it. But the stream server's behavior is as described above. Each heartbeat is about 13 bytes, and the data size differs each time. If the total size reaches 1 MB or the connection has been open for 20 minutes, the client closes it and opens a new connection.

Thanks for your kind reply. I will try specifying a chunksize for proxy_response and re-test.


On Monday, June 1, 2015 at 9:18:32 PM UTC+8, James Hurst wrote:
> Unfortunately it's quite hard to imagine what's happening without some code examples and example response headers / data. Surely you can share a few simplified lines which illustrate how you're using lua-resty-http?

Yichun Zhang (agentzh)

Jun 13, 2015, 06:17:03
to openresty-en
Hello!

On Mon, Jun 1, 2015 at 5:33 PM, guanglinlv wrote:
> I am using lua-resty-http and proxy_response to proxy a stream output, and
> I am hitting a performance problem: high CPU utilization, up to 98 percent.
>

Time to profile! See

http://openresty.org/#Profiling

Every time I see a CPU hog, I'll definitely sample some flame graphs ;)

You're encouraged to post your flame graphs here so that I can give
you advice with ease.

> nginx is started with only one worker. I traced the nginx worker process
> with the command strace -p <nginx-worker-pid> -c:
>
> It shows that most of the time is spent invoking receive and writev.

I wonder whether your process indeed spends most of its CPU time in those
syscalls. Maybe it's busy with userland code? A C-land flame graph can
give us the big picture.

> I guess this is because tcpsock:receive is invoked in a loop inside the
> Lua coroutine; see the _chunked_body_reader function for more details.
>

Ensure that you handle errors from your receive() calls (and any other
I/O calls) properly in your Lua code. Failing to handle errors may make
your loop degenerate into a dead busy loop, depending on how the loop
is written.
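
For example, a sketch of a receive loop with the error branch handled (hypothetical code, not taken from lua-resty-http):

-- Without the error branches below, a closed or timed-out socket makes
-- receive() return nil immediately on every call: a dead busy loop.
while true do
    local data, err, partial = sock:receive(8192)
    if data then
        ngx.print(data)
    elseif err == "closed" then
        if partial and #partial > 0 then ngx.print(partial) end
        break  -- upstream finished; stop reading
    else
        ngx.log(ngx.ERR, "receive failed: ", err)
        break  -- timeout or other error; do NOT keep looping
    end
end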

> Does tcpsock:receive invoke the receive system call each time?

It depends on whether there's remaining data in the cosocket receive
buffers (userland buffers).
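
The size of those userland buffers is controlled by ngx_lua's
lua_socket_buffer_size directive, for example:

# http {} / server {} / location {} scope; illustrative value only.
# A larger cosocket buffer lets a single recv() syscall satisfy many
# small receive() calls from Lua when data is already buffered.
lua_socket_buffer_size 64k;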

> Is there a more performant approach, such as event notification?
>

tcpsock:receive *is* based on event notification under the hood.

> Any suggestion or solution would be a big help.
>

See above.

Best regards,
-agentzh

guang...@gmail.com

Jul 2, 2015, 07:05:37
to openre...@googlegroups.com
Hi,

I use the lua-resty-http library for this business logic. I made one small change: invoking ngx.flush(true) after resty/http.lua#L793 in the proxy_response function; it helps resolve the heartbeat issue lua-resty-http/issues/28. I think that is all of my userland code.

For now, I use nginx proxy_pass to serve the eventstream business, handing off via ngx.exec. The CPU utilization looks good.
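
That hand-off can look something like this (a sketch, assuming $backend is declared with a set directive in the current location so it can be assigned from Lua, as in the config posted earlier in this thread):

-- content_by_lua sketch: delegate the streaming to the internal /proxy
-- location so the nginx proxy module handles the buffering natively.
ngx.var.backend = "STREAMSERVER"  -- requires: set $backend ''; in this location
return ngx.exec("/proxy")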


On Saturday, June 13, 2015 at 6:17:03 PM UTC+8, agentzh wrote: