Nginx with Pagespeed memory leak


Stace Baal

Sep 19, 2013, 12:32:34 AM
to mod-pagesp...@googlegroups.com
Hi All, 

We are having a problem with our Nginx + PageSpeed install and an apparent memory leak. When I enable any PageSpeed filter, Nginx continues to increase memory usage until RAM is exhausted and the box begins to swap.  Restarting Nginx frees up all memory.  Running Nginx with pagespeed off behaves normally and memory use is stable. I have tried this with several combinations of filters and cannot isolate the problem to a particular action. When filters are enabled they work as expected, except for the ever-increasing memory footprint. The memory growth shows up even when running only a simple filter like collapse_whitespace.

I'm not able to find much documentation on ngx_pagespeed_statistics, but I will call out some items I find curious.  When running under moderate load with only collapse_whitespace on static HTML files I see the following: total_rewrite_count, num_flushes, pcache-cohorts-beacon_cohort_misses, and pcache-cohorts-dom_inserts all increase by the same amount; file_cache_misses is about 2x total_rewrite_count. I also see an increasing resource_404_count.

We are running Nginx to serve static assets and as a proxy in front of a Node.js application (no FastCGI/PHP).  Our goal is to use PageSpeed to combine and minify CSS/JS, rewrite domains, and remove whitespace and comments.  I'll include the relevant portions of the nginx conf below.  Any advice or help would be appreciated.

-----

  #Pagespeed
  pagespeed off;
  location ~ "\.pagespeed\.([a-z]\.)?[a-z]{2}\.[^.]{10}\.[^.]+" {
    add_header "" "";
  }
  location ~ "^/ngx_pagespeed_static/" {  }
  location ~ "^/ngx_pagespeed_beacon$" {  }

  # Needs to exist and be writable by nginx.
  pagespeed FileCachePath /data/svc/nginx/cache;
  pagespeed CacheFlushFilename /data/svc/nginx/cache/pagespeed.flush;

  #Explicitly define the filters we want to use
  pagespeed RewriteLevel PassThrough;
  pagespeed EnableFilters combine_css,combine_javascript,rewrite_css,rewrite_javascript,collapse_whitespace,remove_comments,rewrite_domains,extend_cache;
  pagespeed CombineAcrossPaths on;
  pagespeed Domain *.domain.com;
  pagespeed MapRewriteDomain "https://mstatic.domain.com" "http://m.domain.com";
  pagespeed MapRewriteDomain "https://mstatic.domain.com" "http://m.domain.ca";
  pagespeed MapRewriteDomain "https://mstatic.domain.com" "http://m.domain.co.uk";
  pagespeed MapRewriteDomain "https://mstatic.domain.com" "http://m.domain.com.au";

  #Tell Nginx to look for files on the filesystem
  pagespeed LoadFromFileMatch "^https?://m(static)?.domain.com/([^/]*)/public/"  "/data/svc/nodejs/webapps/\\2/current/public/";
  pagespeed LoadFromFileMatch "^https?://m(static)?.domain.com/([^/]*)/public/shared/"  "/data/svc/nodejs/webapps/\\2/current/node_modules/singles-shared/public/";
  pagespeed Statistics on;
  pagespeed StatisticsLogging off;
  pagespeed LogDir /data/svc/nginx/logs;

  # Default expires time
  expires       5m;

  # Enable gzip
  gzip on;
  gzip_comp_level 5;
  gzip_vary on;
  gzip_proxied any;
  gzip_types text/plain text/css text/javascript application/x-javascript application/json image/svg+xml;

-----

** Note: I am seeing similar behavior to this thread on mod_pagespeed for Apache which was resolved last year.  https://groups.google.com/forum/#!topic/mod-pagespeed-discuss/gGYruoCujMQ





Jeff Kaufman

Sep 19, 2013, 1:52:05 PM
to mod-pagesp...@googlegroups.com
What version of ngx_pagespeed are you running? (Not that any version
should have a memory leak...)

Would it be possible for you to run a test under valgrind on an alternate port?
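(In case it helps, a rough sketch of how such a valgrind run could look - the config path is a placeholder, and a copy of your nginx.conf listening on an alternate port with "daemon off;" and "master_process off;" keeps everything in one process so valgrind can track it:)

# test.conf = copy of nginx.conf on an alternate port, plus:
#   daemon off;
#   master_process off;
valgrind --leak-check=full --show-reachable=yes \
         --log-file=/tmp/nginx-valgrind.log \
         /usr/sbin/nginx -c /path/to/test.conf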

Stace Baal

Sep 19, 2013, 2:56:17 PM
to mod-pagesp...@googlegroups.com
Jeff,

Thanks for the reply.  We are running pagespeed v1.6.29.5-beta and nginx 1.4.2 on Scientific Linux release 6.3 (Carbon).
I'll check into the feasibility of running a test under valgrind.  


AnInterestedPerson

Oct 20, 2014, 7:47:45 AM
to mod-pagesp...@googlegroups.com
Same here with the pagespeed memory leak.
CentOS 6.5
nginx 1.6.1
ngx_pagespeed-release-1.9.32.1-beta
php 5.6.1

All ok in normal use.

Ran a stress test to the max, which created a large number of nginx worker processes, then went back to normal operation. The worker processes persist and slowly leak memory, roughly 1-5 MB every 10 minutes in total.
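(For reference, a simple way to watch this kind of slow growth is to log worker memory periodically - purely illustrative:)

# log the RSS/VSZ of all nginx processes every 10 minutes
while true; do date; ps -C nginx -o pid,rss,vsz,cmd; sleep 600; done >> /tmp/nginx-mem.log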

Joshua Marantz

Oct 20, 2014, 2:26:58 PM
to mod-pagespeed-discuss
Is that because nginx is slowly getting behind your load?  If you slow down or stop the load, does the memory get released?

-Josh


AnInterestedPerson

Oct 20, 2014, 5:03:26 PM
to mod-pagesp...@googlegroups.com
1. Did a stress test to the max - nginx fell far behind the load (approx. 8% of connections dropped); it totalled approx. 50K concurrent connection attempts.
2. When the stress test ended, all nginx workers stayed active. No memory was released.
3. Instead of releasing memory after the stress test ended, nginx slowly increased memory usage at the rate mentioned above (approx. 1-5 MB every 10 minutes).

Later, after nginx had allocated an additional approx. 1 GB (over several hours), I simply restarted nginx since it is already partially a production server. All worker processes from the heavy load were still up and idling, while the server was at approx. 10 hits/hour (close to no load).

I recall behaviour being a whole lot different without pagespeed :/

Apart from that: huge leap forward in performance with pagespeed on tricky WordPress sites :D - so no, I am not willing to give up pagespeed, but I am afraid of a memory leak :(

Help?


On Monday, October 20, 2014 at 20:26:58 UTC+2, jmarantz wrote:
Is that because nginx is slowly getting behind your load?  If you slow down or stop the load, does the memory get released?

-Josh
On Mon, Oct 20, 2014 at 7:47 AM, 'AnInterestedPerson' via mod-pagespeed-discuss <mod-pagesp...@googlegroups.com> wrote:
Same here with the pagespeed memory leak.
CentOS 6.5
nginx 1.6.1
ngx_pagespeed-release-1.9.32.1-beta
php 5.6.1

All ok in normal use.

Ran a stress test to the max creating loads of nginx worker processes and back to normal operation. The worker processes persist and slowly leak memory, like 1-5 MB every 10 minutes total.


AnInterestedPerson

Oct 21, 2014, 6:51:31 AM
to mod-pagesp...@googlegroups.com
I did some less intense stress testing and left it idle overnight.

Here are my findings:
  • all worker processes stayed idle but active (still consuming memory)
  • over the course of approx. 7 hours memory usage slowly increased to a peak.
  • after those 7 hours memory stayed constant.
  • worker processes were still idle and active after 12 hours

Afterwards I restarted those services.

Assumptions:

  • The memory increase is per worker process (logical)
  • since I had far fewer workers this time, the memory increase was a lot less in total
  • somehow pagespeed apparently keeps the workers active

Otto van der Schaaf

Oct 21, 2014, 8:36:37 AM
to mod-pagesp...@googlegroups.com
Would you be able to show (or PM) your nginx.conf, plus the output of nginx -V ?
What did your stress test look like? Are you specifically stress testing rewriting of php generated html?

Hopefully I can reproduce your observation with that information and have a look at how to improve behaviour.

Otto



AnInterestedPerson

Oct 21, 2014, 9:00:42 AM
to mod-pagesp...@googlegroups.com
Would you be able to show (or PM) your nginx.conf, plus the output of nginx -V ?
> see below


What did your stress test look like? Are you specifically stress testing rewriting of php generated html?
> stress test was quite simple:
# ab -c 1000 -t 60 http://URL/
> the initial php-generated html page was approx. 160 KB (uncompressed), and caches were prefetched, so the page was sent directly from memory (from zendopcache/apcu + pagespeed).

> mainly stress tested for the response time of php-generated html. My configuration is rather complex: nginx reverse proxy to apache, where nginx has pagespeed installed. The goal was to find out about the load my configuration can handle (sessions, concurrent connections etc.) as well as the behaviour of pagespeed + apc + zendopcache + WP (incl. W3TC under a php/mysql-heavy theme) under load.

Result: probably able to handle 14 million hits/day while still being responsive under extremely heavy load (I was content, considering it's standard hardware :D )


Hopefully I can reproduce your observation with that information and have a look at how to improve behaviour.
> I am very, very curious about your findings. Don't get me wrong here: I love what pagespeed did number-wise. Yet I must be extremely cautious about potential issues (I somehow hope I am overreacting ;) ).



# nginx -V
nginx version: nginx/1.6.1
built by gcc 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)
TLS SNI support enabled
configure arguments: --prefix=/usr/share --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --lock-path=/var/lock/nginx.lock --pid-path=/var/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --user=nginx --group=nginx --with-ipv6 --with-file-aio --with-http_ssl_module --with-http_realip_module --with-http_sub_module --with-http_dav_module --with-http_gzip_static_module --with-http_stub_status_module --add-module=/usr/src/ngx_pagespeed-release-1.9.32.1-beta


nginx.conf
#user  nginx;
worker_processes  8;

#error_log  /var/log/nginx/error.log;
#error_log  /var/log/nginx/error.log  notice;
#error_log  /var/log/nginx/error.log  info;

#pid        /var/run/nginx.pid;


events {
    worker_connections  1024;
}


http {
    include       mime.types;
    default_type  application/octet-stream;

    #log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
    #                  '$status $body_bytes_sent "$http_referer" '
    #                  '"$http_user_agent" "$http_x_forwarded_for"';

    #access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    #keepalive_timeout  0;
    keepalive_timeout  65;
    #tcp_nodelay        on;

    #gzip  on;
    #gzip_disable "MSIE [1-6]\.(?!.*SV1)";

    server_tokens off;

    pagespeed on;
    pagespeed FileCachePath /tmp/ngx_pagespeed_cache;
    pagespeed MessageBufferSize 100000;
    pagespeed EnableFilters make_google_analytics_async;

    pagespeed EnableFilters insert_image_dimensions;
    pagespeed EnableFilters inline_google_font_css;
    pagespeed EnableFilters remove_comments;
    pagespeed InPlaceResourceOptimization on;
    pagespeed LowercaseHtmlNames on;
    pagespeed XHeaderValue "Powered By durchd8.de";
    pagespeed EnableFilters move_css_above_scripts;
    pagespeed EnableFilters move_css_to_head;

    # pagespeed StatisticsPath /ngx_pagespeed_statistics;
    # pagespeed GlobalStatisticsPath /ngx_pagespeed_global_statistics;
    # pagespeed MessagesPath /ngx_pagespeed_message;
    # pagespeed ConsolePath /pagespeed_console;
    # pagespeed AdminPath /pagespeed_admin;
    # pagespeed GlobalAdminPath /pagespeed_global_admin;

    include /etc/nginx/conf.d/*.conf;
}


additional nginx.conf
Actually there are a lot of conf files. It's my own server, currently running 9 WP sites, of which 2 are productive with the above configuration. 7 sites have no special additional configuration.

The two productive sites have the following additional configuration (apart from the vhost config):
# Ensure requests for pagespeed optimized resources go to the pagespeed handler
# and no extraneous headers get set.

location ~ "\.pagespeed\.([a-z]\.)?[a-z]{2}\.[^.]{10}\.[^.]+" {
    add_header "" "";
}

location ~ "^/pagespeed_static/" { }
location ~ "^/ngx_pagespeed_beacon$" { }

# location /ngx_pagespeed_statistics { allow 127.0.0.1; allow 93.207.89.139; deny all; }
# location /ngx_pagespeed_global_statistics { allow 127.0.0.1; allow 93.207.89.139; deny all; }
# location /ngx_pagespeed_message { allow 127.0.0.1; allow 93.207.89.139; deny all; }
# location /pagespeed_console { allow 127.0.0.1; allow 93.207.89.139; deny all; }
# location ~ ^/pagespeed_admin { allow 127.0.0.1; allow 93.207.89.139; deny all; }
# location ~ ^/pagespeed_global_admin { allow 127.0.0.1; allow 93.207.89.139; deny all; }

# Leverage browser caching
location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ {
    expires 8d;
    add_header Pragma public;
    add_header Cache-Control "public";
    try_files $uri @fallback;
}



On any question: dont hesitate but ask.

Yours

Martin


AnInterestedPerson

Oct 21, 2014, 9:05:51 AM
to mod-pagesp...@googlegroups.com
PS: any additional performance-related advice on my setup is welcome ;) (it was my first run on pagespeed after two tests)

Otto van der Schaaf

Oct 22, 2014, 9:32:20 AM
to mod-pagesp...@googlegroups.com
I had a look with a setup that matches your os/gcc/nginx/ngx_pagespeed version, and so far I can't find any problems and/or leaks when testing static files, nor when proxying a website.

PageSpeed uses a memory cache per worker, which won't use up a lot of memory when you repeatedly hit the same page, but under live traffic it will probably fill up and use more. Could that explain the slowly growing memory you are seeing, which stabilizes after a while?
Because on my system, when the server gets 0 requests, memory usage doesn't change. What is the absolute memory overhead you see when you compare running with and without ngx_pagespeed, after usage has stabilized?
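(If the per-worker in-memory LRU cache is the suspect, it can also be capped explicitly; a minimal sketch - the sizes below are only examples, check the defaults for your version:)

# limit the in-process LRU cache per worker, and the largest entry it may hold
pagespeed LRUCacheKbPerProcess 8192;
pagespeed LRUCacheByteLimit    16384;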

Otto


Hans van Eijsden

Oct 22, 2014, 11:18:05 AM
to mod-pagesp...@googlegroups.com
I have seen similar problems. But I could solve it by linking with jemalloc.
Here is how I did it (with Intel ICC as far as possible, for much much extra performance):

Download jemalloc from http://www.canonware.com/jemalloc/download.html and then:
unset LDFLAGS
source /opt/intel/bin/compilervars.sh intel64
export PATH=/usr/lib/ccache:$PATH
CC=icc
EXTRA_CFLAGS="-fast -parallel -qopenmp -pthread -unroll-aggressive -qopt-prefetch"
LD=xild
AR=xiar
CXX=icpc
CXXFLAGS="-fast -parallel -qopenmp -pthread -unroll-aggressive -qopt-prefetch"
export CC EXTRA_CFLAGS LD AR CXX CXXFLAGS

./autogen.sh && make -j3 && make install && make clean

unset EXTRA_CFLAGS
That compiles and installs it.

And then I build ngx_pagespeed manually, unfortunately with gcc and not icc because icc isn't supported by pagespeed:
cd ~
### git clone https://github.com/pagespeed/ngx_pagespeed.git
cd ngx_pagespeed/
git pull
cd ~/bin
svn co http://src.chromium.org/svn/trunk/tools/depot_tools
export PATH=$PATH:~/bin/depot_tools
cd ~/mod_pagespeed
gclient config http://modpagespeed.googlecode.com/svn/branches/latest-beta/src
gclient sync --force --jobs=1
cd ~/mod_pagespeed/src
export PATH=/usr/lib/ccache:$PATH
CFLAGS="-O3 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -fopenmp -mtune=native -march=native -Wformat -Werror=format-security"
CXXFLAGS="-O3 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -fopenmp -mtune=native -march=native -Wformat -Werror=format-security"
LDFLAGS="-ljemalloc -Wl,-z,relro -Wl,--as-needed"
export CFLAGS CXXFLAGS LDFLAGS
nice make -j 3 AR.host=`pwd`/build/wrappers/ar.sh AR.target=`pwd`/build/wrappers/ar.sh BUILDTYPE=Release && cd ~/mod_pagespeed/src/net/instaweb/automatic && nice make -j 3 BUILDTYPE=Release AR.host="$PWD/../../../build/wrappers/ar.sh" AR.target="$PWD/../../../build/wrappers/ar.sh" all

And then I build nginx, with icc, as local user:
cd /usr/local/src/nginx-1.7.6
make clean
source /opt/intel/bin/compilervars.sh intel64

export PATH=/usr/lib/ccache:$PATH \
CC=icc \
CFLAGS="-xHOST -O3 -ipo -no-prec-div -pthread -unroll-aggressive -qopt-prefetch" \
LD=xild \
AR=xiar \
CXX=icpc \
CXXFLAGS="-xHOST -O3 -ipo -no-prec-div -pthread -unroll-aggressive -qopt-prefetch" \
LDFLAGS=-ljemalloc \
export CC CFLAGS LD AR CXX CXXFLAGS LDFLAGS
MOD_PAGESPEED_DIR="$HOME/mod_pagespeed/src" ./configure --prefix=/opt/nginx17 --user=www-data --group=www-data --with-http_ssl_module --with-http_spdy_module --with-openssl=/usr/local/src/openssl-1.0.2-beta3 --with-openssl-opt="enable-ec_nistp_64_gcc_128 threads" --with-md5=/usr/local/src/openssl-1.0.2-beta3 --with-md5-asm --with-sha1=/usr/local/src/openssl-1.0.2-beta3 --with-sha1-asm --with-pcre-jit --with-file-aio --with-http_flv_module --with-http_geoip_module --with-http_gzip_static_module --with-http_gunzip_module --with-http_mp4_module --with-http_realip_module --with-http_stub_status_module --with-ipv6 --add-module=/usr/local/src/nginx-rtmp-module --add-module=/usr/local/src/ngx_cache_purge-2.1 --add-module=$HOME/ngx_pagespeed --with-ld-opt="-ljemalloc" --with-cc-opt="-DTCP_FASTOPEN=23 -xHOST -O3 -ipo -no-prec-div -pthread -unroll-aggressive -qopt-prefetch"

And then I install nginx, as root: 
source /opt/intel/bin/compilervars.sh intel64

export PATH=/usr/lib/ccache:$PATH \
CC=icc \
CFLAGS="-xHOST -O3 -ipo -no-prec-div -pthread -unroll-aggressive -qopt-prefetch" \
LD=xild \
AR=xiar \
CXX=icpc \
CXXFLAGS="-xHOST -O3 -ipo -no-prec-div -pthread -unroll-aggressive -qopt-prefetch" \
LDFLAGS=-ljemalloc \
export CC CFLAGS LD AR CXX CXXFLAGS LDFLAGS

nice make install


For me, this has solved the memory problems. JEMALLOC is great, it's perfect for a stable and high performance memory management. 
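(If you want to double-check that the final binary really picked up jemalloc, a quick test - the path depends on your --prefix - is:)

ldd /opt/nginx17/sbin/nginx | grep jemalloc
# should print something like: libjemalloc.so.1 => /usr/lib64/libjemalloc.so.1 (...)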

I don't know if it's gonna work for others. But good luck!

-Hans

 

On Thursday, September 19, 2013 at 06:32:34 UTC+2, Stace Baal wrote:

AnInterestedPerson

Oct 22, 2014, 11:44:57 AM
to mod-pagesp...@googlegroups.com
Hi,

thank you for looking into it. I do understand the need to replicate the issue I saw. Right now I started some tests and will post later about the results.

Martin


AnInterestedPerson

Oct 22, 2014, 12:28:59 PM
to mod-pagesp...@googlegroups.com

Apparently I can replicate the issue with a test like:

ab -c 1000 -n 1000 http://URL/

and then

ab -c 1000 -t 60 http://URL/

In my case I could raise CPU usage of the server for all cores to 100%. The preliminary result looks like this:


Hans van Eijsden

Oct 22, 2014, 12:42:38 PM
to mod-pagesp...@googlegroups.com
I can replicate that too. But my RAM usage stays stable, thanks to my custom linking with jemalloc (see above). My CPU usage also goes to 100%, on all cores. But when I access http://url/?ModPagespeed=Off, my CPU usage is < 1%. It doesn't matter if I use downstream caching or not. And the PageSpeed cache is in memcached.

Here is my post of 5 weeks ago with probably the same issue: https://groups.google.com/forum/#!topic/ngx-pagespeed-discuss/4IKWjqAzguQ
I'm still trying to find the cause, but it seems to be something with how Pagespeed handles and processes incoming requests... :(

-Hans

On Wednesday, October 22, 2014 at 18:28:59 UTC+2, AnInterestedPerson wrote:

Joshua Marantz

Oct 22, 2014, 12:45:05 PM
to mod-pagespeed-discuss
RE jemalloc: very cool.  I have been concerned for years that the normal system malloc might not be great for PageSpeed, but as it stands we use the default one and had not noticed any trouble.  Maybe this is because Apache recycles processes before fragmentation becomes an issue.

AnInterestedPerson: would you be willing to try the instructions from Hans?
Hans: how important is ICC for this, in your experience?

Probably we should find out which of our objects is causing all the fragmentation and adjust our pooling in some fashion to alleviate the problem. 

AnInterestedPerson: what's in your HTML?  It's great that you are able to replicate this just with the HTML flow.  Could you also take a look
at our statistics page to see which numbers are changing a lot?  Check http://localhost/pagespeed_global_admin/statistics

-Josh


Joshua Marantz

Oct 22, 2014, 12:50:20 PM
to mod-pagespeed-discuss
Hans -- it doesn't shock me to find that ngx_pagespeed is compute-bound whereas vanilla nginx is network bound, particularly if the load is 100% HTML.  This is because NPS lexes all the bytes of HTML on every HTML request.  On a more realistic mixture of resources (which are usually cached) and HTML, the CPU is only used on a fraction (10% or less usually) of the requests and you might once again be network bound.

But it's worth checking, in all these configurations, if you have the same load measurements with all PageSpeed filters turned off in your NPS configuration.  The HTML will still be parsed but we won't do anything else substantive during each request.

In short, please try isolating the problem by reducing the functionality of NPS to the bare minimum (parse & reserialize HTML) and then incrementally add in new filters and other configuration options until you see the problem re-appear. 
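(As a concrete starting point for that isolation, a bare-minimum sketch of the pagespeed part of the config - paths are illustrative - would be roughly:)

pagespeed on;
pagespeed FileCachePath /tmp/ngx_pagespeed_cache;
pagespeed RewriteLevel PassThrough;   # no EnableFilters lines yet
# ...then re-add one "pagespeed EnableFilters <filter>;" line per test run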

-Josh


Hans van Eijsden

Oct 22, 2014, 12:50:25 PM
to mod-pagesp...@googlegroups.com
Hi Josh,

Hans: how important is ICC for this, in your experience?

Well, ICC gives me 1.8x - 2.4x more performance. I can handle the load with ~50% less servers, thanks to ICC. And ICC speeds up while lowering system load. Plus the TTFB is 22% lower and total request times are 30 - 40% lower (on average).

# icc -V
Intel(R) C Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0.0.090 Build 20140723
Copyright (C) 1985-2014 Intel Corporation.  All rights reserved.

So, yes: to me ICC is pretty important. :-)

-Hans
 

On Wednesday, October 22, 2014 at 18:45:05 UTC+2, jmarantz wrote:


Hans van Eijsden

Oct 22, 2014, 12:54:47 PM
to mod-pagesp...@googlegroups.com
Thanks Josh!

But it's worth checking, in all these configurations, if you have the same load measurements with all PageSpeed filters turned off in your NPS configuration.  The HTML will still be parsed but we won't do anything else substantive during each request.

When I turn off all the filters and put MPS on PassThrough, the load issue is still there. As soon as I turn MPS off by "pagespeed off;" in the conf, the load is perfectly low.
The same low load as if testing it with ?ModPagespeed=Off.
So, as soon as I turn on MPS, without filters active, load rises to 100% while accessing html/php pages.
Static elements are doing fine indeed. :)

-Hans


On Wednesday, October 22, 2014 at 18:50:20 UTC+2, jmarantz wrote:


Joshua Marantz

Oct 22, 2014, 12:55:39 PM
to mod-pagespeed-discuss
Wow those are amazing stats.  I had benchmarked ICC a long time back (mid 2000s) and did not get such great results.  I assume you are comparing against a gcc -O3 build of nginx.

I wonder how much better this would get if you could compile PageSpeed with ICC.

-Josh


Hans van Eijsden

Oct 22, 2014, 12:58:03 PM
to mod-pagesp...@googlegroups.com
Much better I guess. And yes, I compare the ICC build method against this GCC build method (gave me the highest GCC performance):

CFLAGS="-O3 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -fopenmp -mtune=native -march=native -Wformat -Werror=format-security" \
CXXFLAGS="-O3 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -fopenmp -mtune=native -march=native -Wformat -Werror=format-security" \
LDFLAGS="-ljemalloc -Wl,-z,relro -Wl,--as-needed" \
export CFLAGS CXXFLAGS LDFLAGS

MOD_PAGESPEED_DIR="$HOME/mod_pagespeed/src" ./configure --prefix=/opt/nginx17 --user=www-data --group=www-data --with-http_ssl_module --with-http_spdy_module --with-openssl=/usr/local/src/openssl-1.0.2-beta3 --with-openssl-opt="enable-ec_nistp_64_gcc_128 threads" --with-md5=/usr/local/src/openssl-1.0.2-beta3 --with-md5-asm --with-sha1=/usr/local/src/openssl-1.0.2-beta3 --with-sha1-asm --with-pcre=/usr/local/src/pcre-8.35 --with-pcre-jit --with-zlib=/usr/local/src/zlib-1.2.8 --with-file-aio --with-http_flv_module --with-http_geoip_module --with-http_gzip_static_module --with-http_gunzip_module --with-http_mp4_module --with-http_realip_module --with-http_stub_status_module --with-ipv6 --add-module=/usr/local/src/nginx-rtmp-module --add-module=/usr/local/src/ngx_cache_purge-2.1 --add-module=$HOME/ngx_pagespeed --with-ld-opt="-ljemalloc -Wl,-z,relro -Wl,--as-needed" --with-cc-opt="-DTCP_FASTOPEN=23 -O3 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -fopenmp -mtune=native -march=native -Wformat -Werror=format-security"

-Hans
 

On Wednesday, October 22, 2014 at 18:55:39 UTC+2, jmarantz wrote:
Hi Josh,


Joshua Marantz

Oct 22, 2014, 1:01:33 PM
to mod-pagespeed-discuss
Got it.  There is some possibility of improving the speed of the HTML parser, but it hasn't been our priority because it hasn't been the bottleneck in normal load scenarios for browser-originated traffic.  When the load is pure HTML, I'm glad it's the bottleneck because that makes sense and it isn't just some random bug :)

I guess one other possibility is that there's something else in ngx_pagespeed filter infrastructure that's the bottleneck besides running the parser.  Any chance you could generate a profile (callgrind or gprof or whatever) from your load test?  That would tell us for sure.
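(A rough sketch of collecting such a profile with callgrind - single-process mode again, the config path is a placeholder, and expect it to run very slowly:)

valgrind --tool=callgrind --callgrind-out-file=/tmp/nginx-callgrind.out \
         /usr/sbin/nginx -c /path/to/test.conf
# run the ab load against it, stop nginx, then:
callgrind_annotate /tmp/nginx-callgrind.out | head -50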

-Josh


Hans van Eijsden

Oct 22, 2014, 1:05:03 PM
to mod-pagesp...@googlegroups.com
Yes, sure, I would love to help you with that!
One thing: I don't have experience with profiling (callgrind, prof) yet. So I need some instructions: the commands I need to do and how to compile.
I think it's too much to "pollute" this topic with it but if you want me to test, feel free to give me the instructions by mail (in...@hansvaneijsden.nl) or in a separate topic. Thanks!

-Hans


On Wednesday, October 22, 2014 at 19:01:33 UTC+2, jmarantz wrote:
Thanks Josh!


AnInterestedPerson

Oct 22, 2014, 1:05:29 PM
to mod-pagesp...@googlegroups.com
Well sure sounds good - trying Hans's instructions. :)


AnInterestedPerson

Oct 22, 2014, 2:30:25 PM
to mod-pagesp...@googlegroups.com

Joshua Marantz

Oct 22, 2014, 2:37:55 PM
to mod-pagespeed-discuss
AnInterestedPerson: are these new graphs from after you changed something (e.g. using jemalloc)?

-Josh


AnInterestedPerson

Oct 22, 2014, 3:14:35 PM
to mod-pagesp...@googlegroups.com
Nope, just the last graph before I restarted nginx - it gained approx. 180 MB of memory usage with 0 traffic during that time.

AnInterestedPerson

Oct 25, 2014, 12:44:53 PM
to mod-pagesp...@googlegroups.com
My make ends with the following error with icc - any idea, Hans?

In file included from /usr/src/ngx_pagespeed-release-1.9.32.1-beta/psol/include/pagespeed/kernel/base/scoped_ptr.h(26),
                 from /usr/src/ngx_pagespeed-release-1.9.32.1-beta/psol/include/pagespeed/kernel/http/headers.h(25),
                 from /usr/src/ngx_pagespeed-release-1.9.32.1-beta/psol/include/pagespeed/kernel/http/response_headers.h(24),
                 from /usr/src/ngx_pagespeed-release-1.9.32.1-beta/psol/include/net/instaweb/http/public/response_headers.h(20),
                 from /usr/src/ngx_pagespeed-release-1.9.32.1-beta/src/ngx_pagespeed.h(37),
                 from /usr/src/ngx_pagespeed-release-1.9.32.1-beta/src/ngx_base_fetch.h(46),
                 from /usr/src/ngx_pagespeed-release-1.9.32.1-beta/src/ngx_base_fetch.cc(19):
/usr/src/ngx_pagespeed-release-1.9.32.1-beta/psol/include/third_party/chromium/src/base/memory/scoped_ptr.h(163): error #68: integer conversion resulted in a change of sign
    COMPILE_ASSERT(sizeof(T) == -1, do_not_use_array_with_size_as_type);
    ^

/usr/src/ngx_pagespeed-release-1.9.32.1-beta/src/ngx_base_fetch.cc(167): error #68: integer conversion resulted in a change of sign
    if (__sync_add_and_fetch(&references_, -1) == 0) {
                                            ^

compilation aborted for /usr/src/ngx_pagespeed-release-1.9.32.1-beta/src/ngx_base_fetch.cc (code 2)
make[1]: *** [objs/addon/src/ngx_base_fetch.o] Error 2
make[1]: Leaving directory `/usr/src/nginx-1.6.1'
make: *** [build] Error 2

Hans van Eijsden

Oct 25, 2014, 8:14:06 PM
to mod-pagesp...@googlegroups.com
Hi AnInterestedPerson,

Yes! I have the idea.
The problem is, you cannot compile mod_pagespeed with ICC. You have to use GCC because the Google framework doesn't support ICC (yet).
I compile mod_pagespeed with GCC first (ngx_pagespeed needs mod_pagespeed). Then I use ICC to compile ngx_pagespeed and nginx. Here is how.

1: jemalloc:

Install from package or from source. Doesn't matter that much, I choose to install from source with ICC.

2: mod_pagespeed with GCC:

cd ~
### git clone https://github.com/pagespeed/ngx_pagespeed.git       #(first time install? then uncomment)
mkdir ngx_pagespeed/ && cd ngx_pagespeed/
git pull
mkdir ~/bin && cd ~/bin
svn co http://src.chromium.org/svn/trunk/tools/depot_tools
export PATH=$PATH:~/bin/depot_tools
mkdir ~/mod_pagespeed && cd ~/mod_pagespeed
gclient config http://modpagespeed.googlecode.com/svn/branches/latest-beta/src
gclient sync --force --jobs=1
cd ~/mod_pagespeed/src
export PATH=/usr/lib/ccache:$PATH
CFLAGS="-O3 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -fopenmp -mtune=native -march=native -Wformat -Werror=format-security"
CXXFLAGS="-O3 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -fopenmp -mtune=native -march=native -Wformat -Werror=format-security"
LDFLAGS="-ljemalloc -Wl,-z,relro -Wl,--as-needed"
export CFLAGS CXXFLAGS LDFLAGS
nice make -j 3 AR.host=`pwd`/build/wrappers/ar.sh AR.target=`pwd`/build/wrappers/ar.sh BUILDTYPE=Release && cd ~/mod_pagespeed/src/net/instaweb/automatic && nice make -j 3 BUILDTYPE=Release AR.host="$PWD/../../../build/wrappers/ar.sh" AR.target="$PWD/../../../build/wrappers/ar.sh" all

3: NGINX with ICC:

source /opt/intel/bin/compilervars.sh intel64

export PATH=/usr/lib/ccache:$PATH \
CC=icc \
CFLAGS="-xHOST -O3 -ipo -no-prec-div -pthread -unroll-aggressive -qopt-prefetch" \
LD=xild \
AR=xiar \
CXX=icpc \
CXXFLAGS="-xHOST -O3 -ipo -no-prec-div -pthread -unroll-aggressive -qopt-prefetch" \
LDFLAGS=-ljemalloc \
export CC CFLAGS LD AR CXX CXXFLAGS LDFLAGS

MOD_PAGESPEED_DIR="$HOME/mod_pagespeed/src" ./configure --prefix=/opt/nginx17 --user=www-data --group=www-data --with-http_ssl_module --with-http_spdy_module --with-openssl=/usr/local/src/openssl-1.0.2-beta3 --with-openssl-opt="enable-ec_nistp_64_gcc_128 threads" --with-md5=/usr/local/src/openssl-1.0.2-beta3 --with-md5-asm --with-sha1=/usr/local/src/openssl-1.0.2-beta3 --with-sha1-asm --with-pcre-jit --with-file-aio --with-http_flv_module --with-http_geoip_module --with-http_gzip_static_module --with-http_gunzip_module --with-http_mp4_module --with-http_realip_module --with-http_stub_status_module --with-ipv6 --add-module=/usr/local/src/nginx-rtmp-module --add-module=/usr/local/src/ngx_cache_purge-2.1 --add-module=/usr/local/src/ngx_http_substitutions_filter_module --add-module=$HOME/ngx_pagespeed --with-ld-opt="-ljemalloc" --with-cc-opt="-DTCP_FASTOPEN=23 -xHOST -O3 -ipo -no-prec-div -pthread -unroll-aggressive -qopt-prefetch"

nice make install

This is how I do it. Make sure to adjust the configure line as you wish.

-Hans


On Saturday, October 25, 2014 at 18:44:53 UTC+2, AnInterestedPerson wrote:
Hi Josh,

...

AnInterestedPerson

Oct 26, 2014, 4:19:25 AM
to mod-pagesp...@googlegroups.com
Doing it step by step now.

jemalloc
The overall improvement from using jemalloc is drastic - to say the least. In my case (Atom 2570 CPU) I simply added jemalloc to my previous config (no other change).
Results:
  1. nginx uses approx. 1/3 less memory on the same load test.
  2. nginx serves almost twice (!!!!) as many successful requests as before, with far fewer worker processes
  3. Overall my requests/second improved by approx. 8-20% (depending on test parameters)
  4. Potential requests per day (a theoretical number, not a real-world value) rose from approx. 14 million to 15.2 million requests/day.
  5. Overall behaviour and responsiveness under load improved.

Bottom line: use jemalloc with nginx and pagespeed.
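(For anyone who wants to try jemalloc without the full custom build above, the two usual options are linking it in at configure time or preloading it at runtime - the library path varies by distro:)

# at build time (added to your existing ./configure line):
./configure ... --add-module=$HOME/ngx_pagespeed --with-ld-opt="-ljemalloc"

# or at runtime, without rebuilding:
LD_PRELOAD=/usr/lib64/libjemalloc.so.1 /usr/sbin/nginx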


On Wednesday, October 22, 2014 at 18:28:59 UTC+2, AnInterestedPerson wrote:

AnInterestedPerson

Oct 26, 2014, 7:30:36 AM
to mod-pagesp...@googlegroups.com
Compiled nginx with icc.

Result:
  1. I observe a slight performance gain in normal operations (without much load) of approx. 3-8%
  2. On more compute-intensive operations (e.g. opcache rebuild etc.) the gain with icc is approx. 25% in normal operations

The apparent memory leak is gone in my case. So from my view I am ok :D

AnInterestedPerson

Oct 26, 2014, 7:39:44 AM
to mod-pagesp...@googlegroups.com
@Hans: thanks a lot for the kickstart :) - in the end I did it differently, yet you helped me a lot.

Hans van Eijsden

Oct 26, 2014, 7:51:49 AM
to mod-pagesp...@googlegroups.com
You're welcome. Glad I could help. Enjoy! :)

-Hans

On Sunday, October 26, 2014 at 12:39:44 UTC+1, AnInterestedPerson wrote:

Joshua Marantz

Oct 26, 2014, 7:57:49 AM
to mod-pagespeed-discuss
Did this solve your memory leak problems completely?  Or does it still leak memory under heavy load, with incremental improvements due to ICC & jemalloc?

-Josh



AnInterestedPerson

Oct 26, 2014, 8:22:35 AM
to mod-pagesp...@googlegroups.com
It looks better, but we have to wait a day or two until I can answer this definitively - until then, "from my view it looks ok".

Back in a few days.

Joshua Marantz

Oct 26, 2014, 8:56:31 AM
to mod-pagespeed-discuss
Great -- I look forward to your report in a few days!

Thanks,
-Josh



AnInterestedPerson

Oct 26, 2014, 2:28:36 PM
to mod-pagesp...@googlegroups.com
Hi Josh,

some findings, while I'm at it (somehow using this thread as my post-it) ;)
  • It's important to build pagespeed and nginx with jemalloc
  • one can build the pagespeed module with gcc/jemalloc and nginx with icc, while still using that pagespeed module (gcc/jemalloc)

Results so far:

  1. I had to reconfigure nginx to be able to use that new power:
    • despite 8 cores, performance improves with 16 worker processes (my new nginx) instead of 8 (before)
    • I had to activate the open file cache
  2. By now my server's backplane limits my web server performance (before it was the CPU and the internal delay in my Apache/nginx sandwich)
  3. single cores seldom reach 100% but work at 96-98.9% (before, all were at 100% on stress tests for the most part)
  4. The system overall handles 30% more load than before, while nginx uses approx. 50% less (!!!) memory (for more load)

I am just having a fun time testing here.

sidenote: I am testing with a real-life WordPress page so that I am able to translate my findings into real-life examples (the same applies to my 14 million hits/day - before - and 15.5-18 million hits/day - now)

Why WordPress? (i) it's my business, (ii) WP is a resource monster and a beast to tame.

Yours

Martin

PS: please make pagespeed compilable with icc

...

AnInterestedPerson

Oct 27, 2014, 4:53:19 AM
to mod-pagesp...@googlegroups.com

It's more difficult than I thought. Overnight a pattern emerged. Diagram first:

So basically:
  • Shortly after 6 pm (18:00) I restarted the server after reconfiguration
  • Initiated some heavy load testing (making sure to hit the open file limit, which apparently makes nginx struggle)
  • Left the server alone since then
  • Memory usage is (again) slowly increasing, yet in a completely different way than before: (i) memory usage does not increase or decrease smoothly, but in steps; (ii) keeping in mind that I have twice the number of worker processes configured, the %-change is far less than before; (iii) the server is still in production, so the memory usage can be due to additional external traffic; (iv) apparently memory usage is approaching an absolute maximum, staying within a range of approx. +/- 150-200 MB.

Bottom line: I am unsure about the outcome, but not worried so far. More to post later on.

...

Joshua Marantz

Oct 27, 2014, 6:56:32 AM
to mod-pagespeed-discuss
What exactly is the meaning of the nginx 'open file limit'?  Is that simply the maximum number of file descriptors allowed per process by your kernel?  Or is that something enforced by nginx?

This looks to me like a pattern of configuring the load so that it's coming in faster than the server can keep up.  Independent of memory allocator or compiler flow, a server has some limit on how fast it can process requests.  If you send requests faster than that limit, the server has two choices: shed load (return 500s) or queue up the requests in hopes that the load will slow down and it can catch up. In the latter case, if the load does not slow down then the server will start consuming memory to hold the growing queues, without bound.

Of course by changing the allocator and the compiler you can increase the server capacity, which is all good.

But to solve the leak we have to get the servers to enforce a bounded queue size and start sending 500s when they run out of it.  I don't know if nginx can be configured like this.

PageSpeed itself has some queues internally for fetches and for optimizing resources.  I believe at this point we have bounded sizes for all of these, and above that usually give up optimizing stuff.  But it's possible that we've missed some of them.

As you run your load test at this high rate, could you look at /pagespeed_admin/statistics?  Click the buttons to sort by frequency and to update every 5 seconds, and see which statistics are changing fastest.  That might shed some light on pagespeed's specific contribution to the memory leaks.  Also check /pagespeed_admin/histograms.

Of course, PageSpeed will indirectly contribute to the leak by adding compute overhead to every request, and making it much easier for nginx to get into this state where it might be doing some queuing.  And jemalloc and ICC can help push that out further.
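(On the bounded-queue point above: nginx can at least shed excess requests instead of queueing them indefinitely via its built-in rate limits; a minimal sketch with a made-up zone name and rates:)

http {
    limit_req_zone $binary_remote_addr zone=perip:10m rate=50r/s;

    server {
        location / {
            limit_req zone=perip burst=100 nodelay;
            limit_req_status 503;   # reject excess requests instead of queueing them
        }
    }
}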

-Josh


AnInterestedPerson

Oct 28, 2014, 5:53:59 AM
to mod-pagesp...@googlegroups.com
First of all: everything is ok now with regard to memory.

The current graph (3 days):

nginx (and Apache as well) unloaded (like they should) and the system returned to normal. No more memory leaking at all, just normal operations. From my current findings I get the impression that jemalloc served me well in arriving at such a nice solution.

Recommendation: use jemalloc as standard for pagespeed; preferably for any compilation of nginx with pagespeed.
...

AnInterestedPerson

Oct 28, 2014, 6:45:12 AM
to mod-pagesp...@googlegroups.com
I've included my comments inline in your post. I hope I was able to express my findings and opinions in an understandable way.

Yours

Martin


On Monday, October 27, 2014 at 11:56:32 UTC+1, jmarantz wrote:
What exactly is the meaning of the nginx 'open file limit'?  Is that simply the maximum number of file descriptors allowed per process by your kernel?  Or is that something enforced by nginx?
With too many open files, nginx and pagespeed show errors in the nginx error log. Excerpt (original domain name replaced with <domain.com>):
2014/10/26 11:30:46 [error] 18750#0: [ngx_pagespeed 1.9.32.1-4238] /tmp/ngx_pagespeed_cache/prop_page/http,3A/,2F<domain.com>/_gEk8rSxoyX,40Desktop,40beacon_cohort,.tempOBdDA0:0:opening temp file: Too many open files
2014/10/26 11:30:46 [error] 18750#0: [ngx_pagespeed 1.9.32.1-4238] /tmp/ngx_pagespeed_cache/prop_page/http,3A/,2F<domain.com>/_gEk8rSxoyX,40Desktop,40beacon_cohort,.tempeBdFO5:0:opening temp file: Too many open files
2014/10/26 11:31:34 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 11:31:42 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 11:31:47 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 11:31:49 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 11:31:52 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 11:31:52 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 11:31:52 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 11:31:52 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 11:31:52 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 11:31:53 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 11:31:53 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 11:31:53 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 11:32:04 [crit] 18751#0: accept4() failed (24: Too many open files)
2014/10/26 14:15:46 [crit] 1616#0: accept4() failed (24: Too many open files)

One can actually configure the open file cache limit in nginx, but not the open file limit, which apparently comes from the kernel (?!?).
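(For what it's worth, nginx does expose a directive for the per-worker descriptor limit on top of the kernel/ulimit settings; the value below is only an example:)

# nginx.conf, top level:
worker_rlimit_nofile 65536;

# and make sure the OS allows it, e.g. in /etc/security/limits.conf:
#   nginx  soft  nofile  65536
#   nginx  hard  nofile  65536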

This looks to me like a pattern of configuring the load so that it's coming in faster than the server can keep up.  Independent of memory allocator or compiler flow, a server has some limit on how fast it can process requests.  If you send requests faster than that limit, the server has two choices: shed load (return 500s) or queue up the requests in hopes that the load will slow down and it can catch up. In the latter case, if the load does not slow down then the server will start consuming memory to hold the growing queues, without bound.
It is actually both: testing ranged from the sweet spot where the server can just handle the load up to more load than the server can handle. And that there are server limitations is obvious ;). Just to clarify: the tests were "burst tests", meaning a high load which dropped back to normal operations after at most 60 seconds, while I observed memory usage and service behaviour after the burst load tests.

There are several important reasons for this:
1. performance: evaluate server performance overall and pinpoint bottlenecks
2. pagespeed impact: verify impact of pagespeed to normal and heavy-load operations
3. security: system behaviour under DDOS attacks (something I suffered from in the past)
4. security + pagespeed: like 3. but including pagespeed
5. tuning pagespeed with nginx: overall tuning
6. admin: trying to predict long term behaviour of nginx with pagespeed
7. ...


Of course by changing the allocator and the compiler you can increase the server capacity, which is all good.
Yes and no:
- Yes, changing the compiler probably increases system capacity as well
- No, changing the compiler and allocator apparently also prevents e.g. memory leaks and other unpredictable behaviour, or at least reduces the risk of them occurring (which was the whole point of trying them all in the first place - the increase in capacity is a bonus I didn't expect).
 

But to solve the leak we have to get the servers to enforce a bounded queue size and start sending 500s when they run out of it.  I don't know if nginx can be configured like this.
Not an nginx expert here - sorry :(
 
PageSpeed itself has some queues internally for fetches and for optimizing resources.  I believe at this point we have bounded sizes for all of these, and above that usually give up optimizing stuff.  But it's possible that we've missed some of them.
I figure looking into that open file limit would be a good start (whether it is kernel- or pagespeed-related with regard to the tmp cache files).
 

As you run your load test at this high rate, could you look at /pagespeed_admin/statistics?  Click the buttons to sort by frequency and to update every 5 seconds, and see which statistics are changing fastest.  That might shed some light on pagespeed's specific contribution to the memory leaks.  Also check /pagespeed_admin/histograms.
- I need to rerun tests for this - for now I wanted to complete the last test first.
- I will probably rerun tests to supply you with this information.
 

Of course, PageSpeed will indirectly contribute to the leak by adding compute overhead to every request, and making it much easier for nginx to get into this state where it might be doing some queuing.  And jemalloc and ICC can help push that out further.
Hmmm, I see it slightly differently with regard to your last remark ("And jemalloc and ICC can help push that out further.").

Basically I increased the load to the extreme - apparently to a point where the nginx-pagespeed combo leaves normal operations and enters a - somehow - extreme state of operation. For sure jemalloc and ICC help to push that out further. Yet judging the server combo's behaviour when back to normal operations, it looked much more "structured" and "organized". Examples: (I) memory ups and downs are more ordered and appear less volatile, (II) memory increases and decreases changed from "curves" to steps, as if nginx worker processes freed or claimed memory simultaneously, (III) more even CPU usage, even under extreme load (no matter how extreme the load).

Bottom line: nginx and pagespeed together are more predictable when used with icc and jemalloc, leading to better overall system behaviour and therefore easier administration.

 
...

Otto van der Schaaf

Oct 28, 2014, 3:41:57 PM
to mod-pagesp...@googlegroups.com
My 2p:
- With regards to hitting file descriptor limits:  There's a code change [1] pending for review 
which removes two file-descriptors per request that ngx_pagespeed currently needs by changing 
the way PageSpeed and nginx communicate. 

- If you don't already have proxy buffering off: for this specific test it might also be worth it to test 
how behaviour changes with proxy buffering off [2], which is on by default and makes nginx act 
as a spool between the backend and clients. 
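(For reference, that is a one-line toggle in the proxy location; the backend address here is just a placeholder:)

location / {
    proxy_pass      http://127.0.0.1:8080;
    proxy_buffering off;
}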


Otto

