Nginx memory usage and worker restarts

narayan sagar

Jun 23, 2015, 5:45:17 AM
to nginx-...@googlegroups.com
Hi,

1) During some benchmark testing on nginx-clojure, I have observed that under certain concurrent load, the nginx worker process is restarted automatically. I see the log line "[alert] 2128#0: worker process 2129 exited on signal 9" in error.log and worker process runs with a new process ID. 

Has anyone come across this behaviour earlier? Is it to work around some known issue?

More information:
  • I am running nginx-clojure in async channel mode.
  • Java heap space usage is well within the "Xmx" limit specified in nginx.conf.
  • CPU utilization is not high (< 10%).

The problem with this restart is that send_timeout does not take effect on the connections that were open when the restart happened.

2) In the same benchmark test, I have observed that the memory usage of nginx is pretty high. I have set max heap size to 2GB in nginx.conf (jvm_options "-Xmx2048m";). After the concurrent load has been submitted I see that the java heap memory usage is pretty low (<150 MB), but the memory usage of nginx process is close to 2.8 GB. Does this mean something that is not java is taking up a lot of memory? Am I missing something here?
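On the second question, -Xmx caps only the Java heap; memory allocated natively inside the same process is not counted against it. A minimal sketch of that distinction follows, using a direct ByteBuffer as a stand-in for any native allocation. The class name OffHeapDemo is made up for illustration, and the sketch does not show where the nginx worker's 2.8 GB actually goes.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of nginx-clojure.
final class OffHeapDemo {
    /** Bytes currently held by the JVM's "direct" buffer pool (outside the heap). */
    static long directBytes() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return 0L;
    }

    /** Bytes of Java heap in use, which is the number that -Xmx limits. */
    static long heapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = directBytes();
        // Allocated in native memory, so it does not count against -Xmx.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        System.out.println("heap in use:     " + heapBytes());
        System.out.println("direct growth:   " + (directBytes() - before));
        System.out.println("buffer capacity: " + offHeap.capacity());
    }
}
```

A process-level figure such as the one reported by top or ps includes both numbers plus everything the native side of the process allocates on its own.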

Thanks in advance.

Regards,
Sagar

SVJ

Jun 23, 2015, 4:02:12 PM
to nginx-...@googlegroups.com
I got something like this.


2015/06/23 12:59:33 [alert] 35562#0: worker process 35563 exited on signal 6
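A note on the two alerts quoted so far: on Linux, signal 9 is SIGKILL and signal 6 is SIGABRT, so the two reports may well have different causes. The sketch below pulls the signal number out of such an error.log line. The class name WorkerExitLine and its methods are made up for illustration, and the regular expression assumes only the message format quoted in this thread.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of nginx or nginx-clojure.
final class WorkerExitLine {
    // Matches the worker-exit alert in the form quoted in this thread.
    private static final Pattern EXIT =
            Pattern.compile("worker process (\\d+) exited on signal (\\d+)");

    /** Returns the signal number if the line is a worker-exit alert. */
    static Optional<Integer> signalOf(String logLine) {
        Matcher m = EXIT.matcher(logLine);
        if (m.find()) {
            return Optional.of(Integer.parseInt(m.group(2)));
        }
        return Optional.empty();
    }

    /** Names for the two signals seen in this thread (Linux numbering). */
    static String describe(int signal) {
        switch (signal) {
            case 6:
                return "SIGABRT (typically the process aborted itself)";
            case 9:
                return "SIGKILL (killed from outside, e.g. by the kernel OOM killer)";
            default:
                return "signal " + signal;
        }
    }

    public static void main(String[] args) {
        String line = "[alert] 2128#0: worker process 2129 exited on signal 9";
        signalOf(line).ifPresent(s -> System.out.println(describe(s)));
    }
}
```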

Yuexiang Zhang

Jun 23, 2015, 8:38:17 PM
to narayan sagar, nginx-...@googlegroups.com
Hi,

It seems that in the benchmark the client is too slow to read data from the server.
When we use NginxHttpServerChannel.send or , the data remaining after the first send is copied into the busying buffer chain. So if the client is too slow to read data from the server while the server keeps producing data to send to it, the busying buffer chain grows longer and longer. This makes the nginx worker process use more and more memory until it hits an out of memory error.

NginxHttpServerChannel.setAsyncTimeout(mis) can be used to let nginx release the request and its busying buffer chain when the timeout fires.

But this feature is not in v0.3.0, so we need to compile the source from GitHub or wait for the release of v0.4.0.
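The growth described above can be pictured with a toy model. This is not nginx-clojure's code: PendingChainModel, send, release and the socketAccepts parameter are hypothetical names, and the model simply assumes that bytes the socket does not accept right away are copied and queued in memory until the request is released.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a connection whose unsent bytes are queued in memory. */
final class PendingChainModel {
    private final Deque<byte[]> pending = new ArrayDeque<>();
    private long pendingBytes = 0;

    /**
     * Tries to send data. socketAccepts is how many bytes the simulated
     * socket takes right now; nothing is written directly while older
     * data is still queued. Whatever is left over is copied and queued.
     */
    void send(byte[] data, int socketAccepts) {
        int written = pending.isEmpty() ? Math.min(socketAccepts, data.length) : 0;
        int rest = data.length - written;
        if (rest > 0) {
            byte[] copy = new byte[rest];
            System.arraycopy(data, written, copy, 0, rest);
            pending.addLast(copy);
            pendingBytes += rest;
        }
    }

    /** Drops everything queued, as a timeout that releases the request would. */
    void release() {
        pending.clear();
        pendingBytes = 0;
    }

    long pendingBytes() {
        return pendingBytes;
    }

    public static void main(String[] args) {
        PendingChainModel ch = new PendingChainModel();
        byte[] chunk = new byte[64 * 1024];
        for (int i = 0; i < 100; i++) {
            ch.send(chunk, 1024); // slow client: at most 1 KB accepted per call
        }
        System.out.println("queued bytes:  " + ch.pendingBytes());
        ch.release();
        System.out.println("after release: " + ch.pendingBytes());
    }
}
```

In this model the queue only shrinks when release is called, which is the role the timeout plays in the explanation above.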

--
You received this message because you are subscribed to the Google Groups "Nginx-Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nginx-clojur...@googlegroups.com.
To post to this group, send email to nginx-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nginx-clojure/ec5bd09c-015a-4c0e-a75d-eced83985b69%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Yuexiang Zhang

Jun 23, 2015, 9:05:15 PM
to narayan sagar, nginx-...@googlegroups.com
Hi,

By default the NginxHttpServerChannel write timeout is set to the send_timeout defined in nginx.conf.

Could you please show some details about your testing, e.g. some source code and your nginx.conf? Do you use

req.handler().hijack(req, false);

or 

req.handler().hijack(req, true);

to get the server channel?  

Thanks. I will dig into this issue.

Regards.
Xfeep

Yuexiang Zhang

Jun 23, 2015, 11:35:21 PM
to narayan sagar, nginx-...@googlegroups.com
Hi, 

Thanks for reporting this issue!
I found the reason why send_timeout does not take effect.
It's a bug: nginx-clojure sets the timer only after the first write event happens, so if it never gets a write event, the timeout timer is never set.

So with the current source from GitHub, calling NginxHttpServerChannel.setAsyncTimeout(mis) before we invoke NginxHttpServerChannel.send can resolve this issue.

It'll be fixed in v0.4.0. If anybody really wants it fixed in v0.3.0, please tell me so that we can create a patch for v0.3.0.

Regards.
Xfeep

narayan sagar

Jun 24, 2015, 12:20:56 AM
to nginx-...@googlegroups.com
Thanks for the detailed explanation. We are indeed testing out the use of send_timeout in a case where a client reads a part of the response and then stops communicating with the server and the connection never gets closed from the client side.

Is there any way to set a limit to the size this buffer can grow? Also, when will the 0.4.0 release be available?

Thanks again.

Regards,
Sagar

Yuexiang Zhang

Jun 24, 2015, 1:06:08 AM
to narayan sagar, nginx-...@googlegroups.com
There's no way to set a limit on the total size of the busying buffer chain, but we can add this feature. Could you please create an issue for it?
In the GitHub issue, please tell us what behavior you expect when nginx-clojure grows a busying buffer chain beyond the size limit: close the request, or fire an error event to handle? Thanks!

BTW v0.4.0 will be released before July 5th.

Regards.
Xfeep


Yuexiang Zhang

Jun 25, 2015, 12:40:00 PM
to narayan sagar, nginx-...@googlegroups.com
Hi,

Commit c72ba9e fixes the bug where send_timeout does not take effect with NginxHttpServerChannel.
Please try it.

Regards
Xfeep

narayan sagar

Jun 26, 2015, 12:13:57 AM
to nginx-...@googlegroups.com
Thanks for the fix. We will test this along with the fix for freeing up busying buffers once v0.4.0 release is out.

Regards,
Sagar


Yuexiang Zhang

Jul 5, 2015, 4:29:54 PM
to narayan sagar, nginx-...@googlegroups.com
Hi, v0.4.0 was released!
Please try it. Thanks!

Regards.
Xfeep



narayan sagar

Jul 7, 2015, 4:10:55 AM
to nginx-...@googlegroups.com
Hi,

Thanks for the information about v0.4.0 release!

I did try the new release and I see that the connections are still not freed up after the automatic worker restart. Below is the code I am trying with. Please let me know if I am missing something (please ignore compilation issues, syntax incorrectness etc.; I have only pasted the relevant parts of the code).

public Object[] invoke(Map<String, Object> request) {
        log.debug("In AsyncChannelService.invoke at "+new Date());
        NginxRequest req = (NginxRequest) request;
        NginxHttpServerChannel downstream = req.handler().hijack(req, true);
        ChannelListener cl = new ChannelListener<NginxHttpServerChannel>() {
            @Override
            public void onClose(NginxHttpServerChannel o) {
                log.debug("***downstream closed at " + new Date());
            }

            @Override
            public void onConnect(long l, NginxHttpServerChannel downstream) {
                log.info("***downstream connected at "+new Date());
            }

            @Override
            public void onRead(long l, NginxHttpServerChannel nginxHttpServerChannel) throws IOException {

            }

            @Override
            public void onWrite(long l, NginxHttpServerChannel nginxHttpServerChannel) throws IOException {

            }
        };
        downstream.addListener(downstream, cl);
        downstream.setAsyncTimeout(30000);
        downstream.sendResponse(new Object[] {
                NGX_HTTP_OK,
                ArrayMap.create(CONTENT_TYPE, response.getContentType()),
                new ByteArrayInputStream(response.getResponseBody())
        });
        return null;
    }


Thanks and Regards,
Sagar


Yuexiang Zhang

Jul 7, 2015, 4:48:44 AM
to narayan sagar, nginx-...@googlegroups.com
Hi, could you please tell me your OS version and show me your nginx.conf?
What is response.getResponseBody()? A byte array of a certain size?
Does your test client not read at all after sending the request, or does it read some bytes and then suspend?
I need to dig into it.

Thanks.
Xfeep


narayan sagar

Jul 7, 2015, 5:03:01 AM
to nginx-...@googlegroups.com
Hi,

  • The OS is Red Hat Enterprise Linux Server release 7.1 (Maipo).
  • response.getResponseBody() is a byte array containing all the content.
  • Test client reads some bytes and suspends.

Thanks and Regards,
Sagar


Yuexiang Zhang

Jul 7, 2015, 5:09:28 AM
to narayan sagar, nginx-...@googlegroups.com
Please try setting send_timeout, e.g.

location /chtimeout {
    content_handler_type java;
    content_handler_name 'narayan.TimeoutExampleHandler';
    send_timeout 2s;
}

I don't think NginxHttpServerChannel.setAsyncTimeout is suitable for this case, because setAsyncTimeout only works once: if a write event happens we need to set it again (e.g. with websocket). In your case you have no chance to set it again, because a non-websocket channel won't fire a write event.
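A toy model of that one-shot behavior, as I read the explanation above. It is not nginx-clojure's implementation: OneShotTimeout and its methods are hypothetical names, and time is passed in by hand so the example is deterministic.

```java
/** Toy model of a one-shot timeout that must be re-armed after each write event. */
final class OneShotTimeout {
    private long deadline = -1; // -1 means "not armed"

    /** Arms the timer to fire timeoutMs after nowMs. */
    void arm(long nowMs, long timeoutMs) {
        deadline = nowMs + timeoutMs;
    }

    /** A write event consumes the current timer; the caller has to arm it again. */
    void onWriteEvent() {
        deadline = -1;
    }

    /** True if an armed timer has expired at nowMs. */
    boolean expired(long nowMs) {
        return deadline >= 0 && nowMs >= deadline;
    }

    public static void main(String[] args) {
        OneShotTimeout t = new OneShotTimeout();
        t.arm(0, 30_000);
        t.onWriteEvent(); // the client read part of the response
        // The handler never saw the write event, so nothing re-armed the timer.
        System.out.println("expired at 60s without re-arm: " + t.expired(60_000));
        t.arm(1_000, 30_000); // what a websocket handler could do in its write callback
        System.out.println("expired at 60s with re-arm:    " + t.expired(60_000));
    }
}
```

The first case corresponds to a client that reads part of the response and then stalls: once the timer has been consumed, nothing fires unless something arms it again.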

Regards.
Xfeep


narayan sagar

Jul 7, 2015, 5:16:33 AM
to nginx-...@googlegroups.com
Sorry, I forgot to attach the nginx.conf file to my earlier message. I have already set send_timeout in nginx.conf.

Thanks and Regards,
Sagar

nginx.conf

Yuexiang Zhang

Jul 7, 2015, 5:31:51 AM
to narayan sagar, nginx-...@googlegroups.com
Sorry, I cannot reproduce it.

Here is my test example:

location /chtimeout {
    content_handler_type java;
    content_handler_name 'narayan.TimeoutExampleHandler';
    send_timeout 2s;
    proxy_buffering off;
    chunked_transfer_encoding on;
}

We will find that the connection is closed after 2s. The log:

2015-07-07 17:26:25[info]:in TimeoutExampleHandler
2015-07-07 17:26:25[info]:sending...
2015-07-07 17:26:25[info]:we end of invoke
2015-07-07 17:26:27[info]:closed now!


Here is the Java handler, which also has a main method to run as a test client:

package narayan;

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Map;

import nginx.clojure.ChannelCloseAdapter;
import nginx.clojure.NginxClojureRT;
import nginx.clojure.NginxHttpServerChannel;
import nginx.clojure.java.ArrayMap;
import nginx.clojure.java.NginxJavaRequest;
import nginx.clojure.java.NginxJavaRingHandler;
import nginx.clojure.logger.LoggerService;

public class TimeoutExampleHandler implements NginxJavaRingHandler {

    LoggerService log = NginxClojureRT.getLog();

    @Override
    public Object[] invoke(Map<String, Object> r) throws IOException {
        log.info("in TimeoutExampleHandler");
        NginxJavaRequest req = (NginxJavaRequest)r;
        final NginxHttpServerChannel ch = req.handler().hijack(req, false);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10240; i++) {
            sb.append("TimeoutExampleHandler\r\n");
        }
        final String body = sb.toString();
        ch.addListener(ch, new ChannelCloseAdapter<NginxHttpServerChannel>() {
            @Override
            public void onClose(NginxHttpServerChannel data) throws IOException {
                log.info("closed now!");
            }
        });
        Thread t = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    log.info("sending...");
                    ch.sendResponse(new Object[] {200, ArrayMap.create("Content-Type", "text/plain"), body});
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        });
        //t.start to run at another thread. here we test run it in main thread first
        t.run();
        log.info("we end of invoke");
        return null;
    }

    public static void main(String[] args) {
        Socket socket = new Socket();
        InetSocketAddress inetSocketAddress = new InetSocketAddress("127.0.0.1", 8080);
        try {
            socket.setSoTimeout(5000000);
            System.out.println("socket.getReceiveBufferSize()" + socket.getReceiveBufferSize());
            socket.setReceiveBufferSize(1024);
            System.out.println("socket.getReceiveBufferSize()" + socket.getReceiveBufferSize());
            socket.setTcpNoDelay(true);
            socket.setKeepAlive(true);
            socket.connect(inetSocketAddress);
            OutputStream out = socket.getOutputStream();
            // out.write("GET /ubuntu/dists/trusty/Release HTTP/1.1\r\nUser-Agent: nginx-clojure/0.2.0\r\nHost: mirrors.163.com\r\nAccept: */*\r\nConnection: close\r\n\r\n".getBytes());
            out.write("GET /chtimeout HTTP/1.1\r\nUser-Agent: nginx-clojure/0.2.5\r\nHost: www.apache.org\r\nAccept: */*\r\nConnection: close\r\n\r\n".getBytes());
            out.flush();
            byte[] buf = new byte[socket.getReceiveBufferSize()/2];
            InputStream in = socket.getInputStream();
            System.out.println("first read:" + (char)in.read());
            Thread.sleep(10000);
            do {
                int c = in.read(buf);
                if (c > 0) {
                    System.out.print(new String(buf, 0, c));
                } else {
                    break;
                }
                // Thread.sleep(3000);
            } while (true);
        } catch (Throwable e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } finally {
            try {
                socket.close();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }
}


Could you run it on your computer and give me the result?

Regards
Xfeep


Yuexiang Zhang

Jul 7, 2015, 5:38:30 AM
to narayan sagar, nginx-...@googlegroups.com
Hi.

I found that nginx-clojure-0.3.0.jar is still referenced in your nginx.conf?

Regards.
Xfeep


Yuexiang Zhang

Jul 7, 2015, 7:39:10 AM
to narayan sagar, nginx-...@googlegroups.com
Hi,

If you are sure nginx-clojure v0.4.0 was used instead of v0.3.0, and my test example works and does close the connection after 2 seconds as expected, please give me your test client code to help me reproduce this issue.

Thanks!
Xfeep

narayan sagar

Jul 8, 2015, 12:55:22 AM
to nginx-...@googlegroups.com
Hi,

Thanks for diving deep into the issue.

I am using nginx-clojure-0.4.0 (0.3.0 is only a directory in the path for other jars). 

Maybe I was not clear in my initial message: the problem occurs only when the worker is restarted automatically under concurrent load. In that case the connections that were created before the restart are not closed. Unless the worker gets restarted automatically, send_timeout kicks in and the connections are closed (everything works fine).

Thanks and Regards,
Sagar


Yuexiang Zhang

Jul 8, 2015, 1:15:50 AM
to narayan sagar, nginx-...@googlegroups.com
Hi,

Thanks for your explanation. I see.
We need to find what causes the nginx worker to restart automatically.
What is the error info in error.log? It would be great if a core dump file exists.
Could you give me your test code and test procedure to help me reproduce it on my computer?

Thanks!
Xfeep


narayan sagar

Jul 8, 2015, 2:28:24 AM
to nginx-...@googlegroups.com
Hi,

Please find the client code attached. 

The only log line in error.log is : [alert] 3430#0: worker process 3431 exited on signal 9

I have the JVM option 'jvm_options "-XX:+HeapDumpOnOutOfMemoryError";'  enabled in nginx.conf but not sure where the dump is getting created.

Thanks and Regards,
Sagar

TestPullContent.java
URL.txt

narayan sagar

Jul 8, 2015, 2:37:43 AM
to nginx-...@googlegroups.com
You can run the tool as "java -cp . -Xms256m -Xmx1024m com.hp.test.other.TestPullContent 100 URL.txt 1 600000 > test.log".

Thanks and Regards,
Sagar


Yuexiang Zhang

Jul 8, 2015, 2:44:44 AM
to narayan sagar, nginx-...@googlegroups.com
Hi,

Thank you very much!
I guess an out of memory error causes the nginx worker to exit. Please check your /var/log/dmesg to verify it.
How much memory does your computer have? Is the OS 32-bit or 64-bit?
How many bytes is one response?
The core dump file is created in the working directory, named "core" or "core-xxxx". If the exit really was caused by out of memory, we don't need the core dump file.

Regards.
Xfeep


narayan sagar

Jul 8, 2015, 4:29:06 AM
to nginx-...@googlegroups.com
Please find attached the dmesg file. I could not spot anything related to out of memory in it, but I might have missed it.

We are using a 64 bit linux (RHEL 7) VM.

The response file size is 30 MB.

Thanks and Regards,
Sagar


narayan sagar

Jul 8, 2015, 4:31:29 AM
to nginx-...@googlegroups.com
Sorry! Missed the attachment, again.. Here it is...


dmesg.txt

Yuexiang Zhang

Jul 8, 2015, 4:54:50 AM
to narayan sagar, nginx-...@googlegroups.com
Thanks. On CentOS/RedHat the out of memory error may be in /var/log/messages.

e.g. we can run this command:

sudo grep -i "out of memory" /var/log/messages


Because one response is 30M, with 100 concurrent users and a send_timeout of 30s there will be up to 3G of memory in use. The nginx worker will have no chance to release the memory belonging to those connections within 30s, so it will crash from running out of memory if your VM has less than 3G of memory. I can reproduce this on my 1G memory VM.
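The arithmetic behind that estimate, as a small sketch. The class and method names are made up for illustration, and the bound assumes every stalled connection holds one full in-memory response until its timeout fires.

```java
// Hypothetical helper for the back-of-the-envelope estimate above.
final class WorstCaseMemory {
    /** Upper bound on buffered bytes if every stalled connection holds a full response. */
    static long worstCaseBytes(long responseBytes, int concurrentConnections) {
        return Math.multiplyExact(responseBytes, (long) concurrentConnections);
    }

    public static void main(String[] args) {
        long bytes = worstCaseBytes(30L * 1024 * 1024, 100); // 30 MB response, 100 clients
        System.out.printf("worst case: %d bytes (%.2f GiB)%n",
                bytes, bytes / (1024.0 * 1024 * 1024));
    }
}
```

With 30 MB taken as 30 × 1024 × 1024 bytes, the bound comes to 3,145,728,000 bytes, a little under 3 GiB, which is in line with the process size reported at the start of the thread.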

A 30M in-memory response is really not reasonable for a general app. Could you please tell me the use case?
If it is a static file, please try returning java.io.File as your response body; nginx will handle it better.
If you want to implement a dynamic proxy, please try a Java rewrite handler plus the nginx directives proxy_pass, proxy_buffering and proxy_buffer_size. nginx will manage the buffers carefully, e.g. when proxy_buffering is off nginx will use at most proxy_buffer_size bytes for one connection.
If you really need to write such a function in Java, maybe you need to dig into NginxHttpServerChannel's low-level API, such as write/read, which use no buffer and are non-blocking.

Regards.
Xfeep







Yuexiang Zhang

Jul 8, 2015, 11:55:04 AM
to narayan sagar, nginx-...@googlegroups.com
Hi, 

I have made a simple example that uses a very small buffer (4k) per connection to stream a large response, so only 4M is needed for 1000 concurrent users.
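The example itself is not reproduced in this thread. As a rough illustration of the bounded-buffer idea only, here is a sketch with plain blocking java.io streams; the real example presumably uses the non-blocking channel API mentioned earlier, which this sketch does not attempt to show. SmallBufferCopy and stream are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration; not the example referred to in the post.
final class SmallBufferCopy {
    static final int BUFFER_SIZE = 4 * 1024; // 4 KB per connection, as in the post

    /** Copies in to out through one fixed 4 KB buffer and returns the bytes copied. */
    static long stream(InputStream in, OutputStream out) {
        byte[] buf = new byte[BUFFER_SIZE];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n); // at most BUFFER_SIZE bytes are held at a time
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] source = new byte[1024 * 1024]; // stand-in for a large response
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println("copied " + stream(new ByteArrayInputStream(source), sink) + " bytes");
        System.out.println("buffers for 1000 connections: " + 1000L * BUFFER_SIZE + " bytes");
    }
}
```

The per-connection cost is the fixed buffer rather than the whole response, so 1000 connections need 1000 × 4096 = 4,096,000 bytes of buffer, which is the 4M figure quoted above.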


Please get the latest nginx-clojure Java source and rebuild it (the nginx-clojure executable binary need not be rebuilt). The example can run with nginx-clojure-0.4.0.jar from v0.4.0, but when test clients are killed before they finish, some array-index-out-of-range exceptions will happen. These exceptions are harmless but make the log file larger. The latest source fixes this issue.

We can follow the guide below to download the Java source and rebuild the jar file.

git clone https://github.com/nginx-clojure/nginx-clojure.git
cd nginx-clojure
lein jar

Then we'll find nginx-clojure-xxxx.jar in the folder named "target".

If you haven't installed lein, please follow this guide http://leiningen.org/#install


Regards.
Xfeep


narayan sagar

Jul 27, 2015, 1:16:23 AM
to Nginx-Clojure, bsns...@gmail.com
Hi,

Thanks for the example.

Will send_timeout work with the approach taken in the example?

Thanks and Regards,
Sagar

Yuexiang Zhang

Jul 27, 2015, 1:29:41 AM
to narayan sagar, Nginx-Clojure
You're welcome!
Yes, send_timeout works with the example.

Regards.
Xfeep
