Nginx maxing out CPU


matt.j...@ecommistry.com

Aug 7, 2016, 10:55:30 PM
to ngx-pagespeed-discuss
Hi there,

We deployed Pagespeed a month ago in a reverse-proxy configuration in front of our application servers. It worked fine for about a week, but about two weeks ago it suddenly maxed out CPU on all proxy servers (3 instances, AWS m3.medium, 3.75GB RAM) and started dropping connections.
nginx is running nothing other than pagespeed, with a backend proxy config that forwards to our internal load balancers.
Varnish is also running on the same box, redirecting traffic to pagespeed (it's how we turn it on and off easily) - no unusual behavior there.

Symptom is very basic - any time we enable pagespeed, nginx CPU reaches 100% nearly immediately, and within 5-10 minutes starts refusing/dropping connections due to load.

I've confirmed server resources are fine - it's not running out of RAM or disk space, and disk I/O is fine. The server was fine for the first week or so with no issues, so I don't think this is a server sizing issue.

We use a shared memcached server for Pagespeed - I have tried with and without it.

I have the ability to send single requests through pagespeed with a custom HTTP request header - this works fine, and it seems to cope with only one or two users browsing this way. It just cannot handle full traffic. I do not have a way to send partial traffic.

I'm mainly looking for advice on how to debug this - I'm really out of ideas!

Happy to provide full configs & URLs of live system privately to anyone with ideas, and I can do any debugging required, with the right pointers.

Thanks,
Matt

Config as follows (confidential parts snipped - happy to forward full actual config to anyone who thinks it might help):

        proxy_read_timeout 600s;
        proxy_send_timeout 600s;
        client_max_body_size 32M;

        pagespeed on;
        pagespeed LowercaseHtmlNames on;
        pagespeed FileCachePath "/mnt/pagespeed/"; # 4GB SSD, nowhere near full

        pagespeed MemcachedServers "{internal memcached 1}:11211,{internal memcached 2}:11211";

        pagespeed EnableCachePurge on;

        pagespeed HttpCacheCompressionLevel 0; # Disabled due to previous bug - causes nginx to return invalid responses.
        pagespeed AvoidRenamingIntrospectiveJavascript off; # Some of our JS fails this detection, but no problem being rewritten.

        pagespeed Domain *.{cdn domain here};
        pagespeed Domain https://*.{cdn domain here};
        pagespeed MapOriginDomain http://{internal loadbalancer}/ *.{cdn domain here};
        pagespeed LoadFromFileMatch "^https?://([a-z]+)-[0-9].{cdn domain here}/media/" "/storage/lfs/web/\\1/media/"; # Tested without this, no change
        pagespeed LoadFromFileRuleMatch Disallow \.php;
        pagespeed LoadFromFileRuleMatch Disallow \.xml;
        pagespeed LoadFromFileRuleMatch Disallow \.txt;
        pagespeed FetchHttps enable;
        pagespeed SslCertDirectory /etc/ssl/certs;
        pagespeed RespectXForwardedProto on;
        pagespeed InPlaceResourceOptimization off;
        pagespeed RateLimitBackgroundFetches on;

        pagespeed RewriteLevel CoreFilters;
        pagespeed EnableFilters remove_comments,rewrite_domains,collapse_whitespace,defer_javascript,lazyload_images,responsive_images,resize_images,combine_css,combine_javascript;

        pagespeed Statistics on;
        pagespeed StatisticsLogging on;
        pagespeed UsePerVhostStatistics off;
        pagespeed LogDir /var/log/pagespeed;
        pagespeed MessageBufferSize 100000;

        pagespeed StatisticsPath /ecngx_pagespeed_statistics;
        pagespeed GlobalStatisticsPath /ecngx_pagespeed_global_statistics;
        pagespeed MessagesPath /ecngx_pagespeed_message;
        pagespeed ConsolePath /ecpagespeed_console;
        pagespeed AdminPath /ecpagespeed_admin;
        pagespeed GlobalAdminPath /ecpagespeed_global_admin;

        pagespeed StatisticsDomains Allow {internal domain};
        pagespeed GlobalStatisticsDomains Allow {internal domain};
        pagespeed MessagesDomains Allow {internal domain};
        pagespeed ConsoleDomains Allow {internal domain};
        pagespeed AdminDomains Allow {internal domain};
        pagespeed GlobalAdminDomains Allow {internal domain};
        pagespeed StatisticsDomains Allow *.{internal domain};
        pagespeed GlobalStatisticsDomains Allow *.{internal domain};
        pagespeed MessagesDomains Allow *.{internal domain};
        pagespeed ConsoleDomains Allow *.{internal domain};
        pagespeed AdminDomains Allow *.{internal domain};
        pagespeed GlobalAdminDomains Allow *.{internal domain};


server {
        listen 8080;

        set_real_ip_from 127.0.0.0/16;
        set_real_ip_from 10.0.0.0/8;
        real_ip_header X-Forwarded-For;
        real_ip_recursive on;

        resolver 10.1.0.2;

        set $upstream_endpoint http://{internal load balancer};

        location ~ "\.pagespeed\.([a-z]\.)?[a-z]{2}\.[^.]{10}\.[^.]+" {
          add_header "" "";
        }
        location ~ "^/pagespeed_static/" { }
        location ~ "^/ngx_pagespeed_beacon$" { }

        location / {

                proxy_pass $upstream_endpoint;
                proxy_pass_request_headers      on;
                proxy_set_header        Host    $host;
                proxy_set_header        X-Forwarded-For $remote_addr;
                proxy_set_header        X-Proxy-Proto $http_x_forwarded_proto;
                add_header X-P-Server   {server name identifier};
        }

}




NOTE: The following is repeated similarly for a number of domains:

# Site URLs
pagespeed Domain https://{production domain};
pagespeed Domain {production domain};

# One per store view - local loading of resources
pagespeed MapOriginDomain http://{internal loadbalancer}/ https://{production domain} {cdn subdomain};
pagespeed MapOriginDomain http://{internal loadbalancer}/ {production domain} {cdn subdomain};

# Load Media from local repository
pagespeed LoadFromFile "https://{production domain}/media/" "/storage/lfs/web/{production site name}/media/";


Otto van der Schaaf

Aug 8, 2016, 4:08:43 AM
to ngx-pagespeed-discuss
- Is there anything noteworthy logged to nginx's error.log when the CPU spikes (or slightly before)?
- It looks like you have statistics logging turned on. Could you post or email me the raw stats file?

Otto
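
For the first question, one way to narrow things down is to pull out only the error.log lines stamped shortly before and after the spike. The following is a minimal Python sketch, assuming nginx's usual error log timestamp prefix (YYYY/MM/DD HH:MM:SS). The function name, the window sizes, the spike time and the sample log lines are all hypothetical and only there for illustration; they are not taken from the system discussed in this thread.

```python
from datetime import datetime, timedelta

# Assumption: each error.log entry starts with "YYYY/MM/DD HH:MM:SS".
TS_FORMAT = "%Y/%m/%d %H:%M:%S"
TS_LENGTH = 19  # length of a timestamp such as "2016/08/01 14:00:00"


def lines_near_spike(lines, spike_time,
                     before=timedelta(minutes=5),
                     after=timedelta(minutes=1)):
    """Return log lines stamped within [spike_time - before, spike_time + after]."""
    start, end = spike_time - before, spike_time + after
    selected = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
        except ValueError:
            # No parsable timestamp (blank or continuation line): skip it.
            continue
        if start <= stamp <= end:
            selected.append(line.rstrip("\n"))
    return selected


# Hypothetical sample lines, invented for illustration only.
SAMPLE_LOG = [
    "2016/08/01 13:40:02 [notice] 1201#0: hypothetical notice",
    "2016/08/01 13:57:10 [warn] 1203#0: *10 hypothetical warning A",
    "2016/08/01 13:59:55 [error] 1203#0: *17 hypothetical error B",
    "2016/08/01 14:15:00 [error] 1203#0: *42 hypothetical error C",
]

if __name__ == "__main__":
    spike = datetime(2016, 8, 1, 14, 0, 0)  # hypothetical spike time
    for entry in lines_near_spike(SAMPLE_LOG, spike):
        print(entry)
```

To run it against a real log, pass an open file object instead of SAMPLE_LOG, for example `lines_near_spike(open(path), spike)`, where `path` is wherever your error_log directive points. The window is deliberately asymmetric because the messages leading up to the spike are usually the more interesting ones.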