nginx vs apache

Kent

16 Mar 2016, 7:50:15 a.m.

to modwsgi

I'm looking for a very brief high-level pros vs. cons of wsgi under apache vs. under nginx and then to be pointed to more details I can study myself (or at least the latter).

Our application occasionally allows requests that consume a large amount of RAM (no obvious way around that, they are valid requests) and occasionally this causes problems since we can't readily reclaim the RAM from Apache. (We have already tweaked and do use "inactivity-timeout". This helps, but now and then we still hit problems where we end up swapping to disk.)

I'm wondering if nginx may solve this problem. I've read much of what you (Graham) have had to say about the memory strategies with apache and mod_wsgi, but wonder what your opinion of nginx is and where you've already discussed this. I've read articles I could find you've written on nginx, such as "Blocking requests and nginx version of mod_wsgi," but wonder if the same weaknesses are still applicable today, 7 years later?

Thank you very much in advance!
Kent

Bill Freeman

16 Mar 2016, 1:11:30 p.m.

to modwsgi

I don't know about nginx, but one possibility, if the large memory requests are infrequent, is to detect when you have completed one and trigger the exit/reload of the daemon process (calling sys.exit() is not the way, since there could be other threads in the middle of something, unless you run one thread per process).

--
You received this message because you are subscribed to the Google Groups "modwsgi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to modwsgi+u...@googlegroups.com.
To post to this group, send email to mod...@googlegroups.com.
Visit this group at https://groups.google.com/group/modwsgi.
For more options, visit https://groups.google.com/d/optout.

Kent Bower

16 Mar 2016, 1:29:50 p.m.

to mod...@googlegroups.com

Interesting idea.. yes, we are using multiple threads and also other stack frameworks, so that's not straightforward, but worth thinking about... not sure how to approach that with the other threads. Thank you Bill.

Bill Freeman

16 Mar 2016, 2:30:51 p.m.

to modwsgi

mod_wsgi has the maximum-number-of-requests restart, so it must know a way (e.g., "I'm no longer sending requests to this process, and all pending requests have completed, so I'll send a signal"), but how to get it to invoke that process will require someone else's input.

Kent Bower

16 Mar 2016, 2:56:48 p.m.

to mod...@googlegroups.com

Well, I think the answer would be to signal SIGUSR1 to the parent process and mod_wsgi handles the rest quite beautifully. Thanks for the idea Bill!

But, Graham, if you are still listening, I'm curious if at this point in time, the same blocking problems persist with nginx and your opinion of it and any other links to anywhere you may have discussed this in more detail.

Thanks!

Graham Dumpleton

16 Mar 2016, 4:16:07 p.m.

to mod...@googlegroups.com

What version of mod_wsgi and Apache are you using?

Are you stuck with old versions of both?

For memory tracking, recent versions of mod_wsgi provide API calls for getting memory usage, which can be used as part of a scheme to trigger a process restart. You can’t use sys.exit(), but you can use signals to trigger a clean shutdown of a process. Again, it is better to have a recent mod_wsgi version, as you can then also set up some graceful timeout options for a signal-induced restart.
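One way to picture such a scheme is a small WSGI wrapper that checks memory once each response has finished and, if it is over a limit, sends SIGUSR1 to its own process. This is only a sketch: `RestartWhenLarge`, `get_rss`, `threshold_bytes` and `restart` are hypothetical names, and it assumes mod_wsgi daemon mode, where `get_rss` could be `lambda: mod_wsgi.process_metrics()['memory_rss']`.

```python
import os
import signal


def _signal_self():
    # Ask for a graceful restart of this process (assumes mod_wsgi daemon mode).
    os.kill(os.getpid(), signal.SIGUSR1)


class RestartWhenLarge:
    """WSGI wrapper: after each response completes, restart if memory is high."""

    def __init__(self, app, get_rss, threshold_bytes, restart=_signal_self):
        self.app = app
        self.get_rss = get_rss                  # callable returning RSS in bytes
        self.threshold_bytes = threshold_bytes  # hypothetical limit
        self.restart = restart                  # hook, replaceable in tests

    def __call__(self, environ, start_response):
        # This is a generator, so the wrapped app runs on first iteration.
        result = self.app(environ, start_response)
        try:
            yield from result                   # stream the body unchanged
        finally:
            close = getattr(result, "close", None)
            if close is not None:
                close()
            # The request is finished; check memory and ask for a restart.
            if self.get_rss() > self.threshold_bytes:
                self.restart()
```

Because the check runs only after the response body is exhausted, the request that grew the process is never the one interrupted; other threads still rely on graceful-timeout to finish.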

Also, what is your mod_wsgi configuration, so I can make sure you are doing all the typical things one would do to limit memory usage, or to quarantine particular handlers which are memory hungry?

Graham

Graham Dumpleton

16 Mar 2016, 7:27:22 p.m.

to mod...@googlegroups.com

On the question of whether nginx will solve this problem, I can’t see how.

When one talks about nginx and Python web applications, it is only as a proxy for HTTP requests to some backend WSGI server. The Python web application doesn’t run in nginx itself. So memory issues and how to deal with them are the province of the WSGI server used, whatever that is, and not nginx.

Anyway, answer the questions from my previous message and we can start with that.

You really want to be using a recent mod_wsgi version and not Apache 2.2.

Apache 2.2 design has various issues and bad configuration defaults which means it can gobble up more memory than you want. Recent mod_wsgi versions have workarounds for Apache 2.2 issues and are much better at eliminating those Apache 2.2 issues. Recent mod_wsgi versions also have fixes for memory usage problems in some corner cases. As far as what I mean by recent, I recommend 4.4.12 or later. The most recent version is 4.4.21. If you are stuck with 3.4 or 3.5 from your Linux distro that is not good and that may increase problems.

So long as you have a recent mod_wsgi version, you can look at using vertical partitioning to farm out memory hungry request handlers to their own daemon process group, and better configure those to handle that and recycle processes based on activity or memory usage. A blog post related to that is:

* http://blog.dscpl.com.au/2014/02/vertically-partitioning-python-web.html

Graham

Bill Freeman

16 Mar 2016, 7:47:31 p.m.

to modwsgi

An extra note: It's not hard to build Apache and mod_wsgi, and even Python itself, from source. I tend to do all three (particularly on Debian derived distributions, where sys.path has surprises). (I've been known to build my own PostgreSQL, but usually I'm happy with packages from the PostgreSQL web site.)

It means that you can take updates when you decide it's time, rather than when the distro gets around to it. Deployment of security fixes is under your control. On the other hand, you have to do it. Usually you would have to run your distro's update tool anyway, but that's easier than a new compile. You have to decide which one is a plus for you.

You also get to think about configuration, rather than having the easy path of accepting distro defaults. You can look at that as a plus or a minus.

Kent Bower

17 Mar 2016, 8:28:21 a.m.

to mod...@googlegroups.com

My answers are below, but before you peek, Graham, note that you and I have been through this memory discussion before & I've read the vertical partitioning article and use inactivity-timeout, "WSGIRestrictEmbedded On", considered maximum-requests, etc.

After years of this, I'm resigned to the fact that python is memory hungry, especially built on many of these web-stack and database libraries, etc. I'm Ok with that. I'm fine with a high-water RAM mark imposed by running under Apache, mostly. But, dang, it sure would be great if the 1 or 2% of requests that really (and legitimately) hog a ton of RAM, like, say 500MB extra, didn't keep it when done. I may revisit vertical partitioning again, but last time I did I think I found that the 1 or 2% in my case generally won't be divisible by url. In most cases I wouldn't know whether the particular request is going to need lots of RAM until after the database queries return (which is far too late for vertical partitioning to be useful).

So I was mostly just curious about the status of nginx running wsgi, which doesn't solve python's memory piggishness, but would at least relinquish the extra RAM once python garbage collected. (Have you considered a max-memory parameter to mod_wsgi that would gracefully stop taking requests and shutdown after the threshold is reached for platforms that would support it? I recall -- maybe incorrectly -- you saying on Windows or certain platforms you wouldn't be able to support that. What about the platforms that could support it? It seems to me to be the very best way mod_wsgi could approach this Apache RAM nuance, so seems like it would be tremendously useful for the platforms that could support it.)

Here (http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html) you discuss nginx's tendency to block requests that may otherwise be executing in a different process, depending on timing, etc. Is this issue still the same? I thought I read a hint somewhere that there may be a workaround for that, so I ask.

And so I wanted your opinion on nginx...

====

Here is what you asked for if it can still be useful.

I'm on mod_wsgi-4.4.6 and the particular server that prompted me this time is running Apache 2.4 (prefork), though some of our clients use 2.2 (prefork).

Our typical wsgi conf setting is something like this, though threads and processes vary depending on server size:

LoadModule wsgi_module modules/mod_wsgi.so
WSGIPythonHome /home/rarch/tg2env
# see http://code.google.com/p/modwsgi/issues/detail?id=196#c10 concerning timeouts
WSGIDaemonProcess rarch processes=20 threads=14 inactivity-timeout=1800 display-name=%{GROUP} graceful-timeout=5 python-eggs=/home/rarch/tg2env/lib/python-egg-cache
WSGIProcessGroup rarch
WSGISocketPrefix run/wsgi
WSGIRestrictStdout Off
WSGIRestrictStdin On
# Memory tweak. http://blog.dscpl.com.au/2009/11/save-on-memory-with-modwsgi-30.html
WSGIRestrictEmbedded On
WSGIPassAuthorization On
# we'll make the /tg/ directory resolve as the wsgi script
WSGIScriptAlias /tg /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py process-group=rarch application-group=%{GLOBAL}
WSGIScriptAlias /debug/tg /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py process-group=rarch application-group=%{GLOBAL}

MaxRequestsPerChild 0
MaxClients 308
ServerLimit 308
</IfModule>

ThreadsPerChild 25
MaxClients 400
ServerLimit 16
</IfModule>

Thanks for all your help and for excellent software!

Kent


Kent Bower

17 Mar 2016, 8:35:06 a.m.

to mod...@googlegroups.com

Yep, my install scripts compile mod_wsgi, and sometimes even Python, etc. (I've also patched mod_wsgi in the past with some enhancements before Graham officially released them.)

Thanks!

Graham Dumpleton

19 Mar 2016, 1:24:49 a.m.

to mod...@googlegroups.com

On 17 Mar 2016, at 11:28 PM, Kent Bower <ke...@bowermail.net> wrote:

> So I was mostly just curious about the status of nginx running wsgi, which doesn't solve python's memory piggishness, but would at least relinquish the extra RAM once python garbage collected.

Where have you got the idea that using nginx would result in memory being released back to the OS once garbage collected? It isn’t able to do that.

The situations in which a process is able to give memory back to the operating system are very narrow. It can only be done when the now free memory was at the top of the allocated memory. This generally only happens for large block allocations and not in normal circumstances for a running Python application.

> (Have you considered a max-memory parameter to mod_wsgi that would gracefully stop taking requests and shutdown after the threshold is reached for platforms that would support it? I recall -- maybe incorrectly -- you saying on Windows or certain platforms you wouldn't be able to support that. What about the platforms that could support it? It seems to me to be the very best way mod_wsgi could approach this Apache RAM nuance, so seems like it would be tremendously useful for the platforms that could support it.)

You can do this yourself rather easily with a more recent mod_wsgi version.

If you create a background thread from a WSGI script file, in a similar way to what the monitor for code changes does in:

http://modwsgi.readthedocs.org/en/develop/user-guides/debugging-techniques.html#extracting-python-stack-traces

but instead of looking for code changes, inside the main loop of the background thread do:

  import os
  import signal

  import mod_wsgi

  metrics = mod_wsgi.process_metrics()

  # MYMEMORYTHRESHOLD is a limit you choose yourself.
  if metrics['memory_rss'] > MYMEMORYTHRESHOLD:
      os.kill(os.getpid(), signal.SIGUSR1)

So mod_wsgi provides the way of determining the amount of memory without resorting to importing psutil, which is quite fat in itself, but how you use it is up to you.

Do note that if using SIGUSR1 to restart the current process (which should only be done for daemon mode), you should also set the graceful-timeout option to WSGIDaemonProcess if you have long running requests. It is the maximum time the process will wait to shut down while still waiting for requests when doing a SIGUSR1 graceful shutdown of the process, before going into forced shutdown mode where no requests will be accepted and requests can be interrupted.
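As a rough sketch of how that check might sit in a background thread's main loop (not the code from the linked documentation): `watch_memory`, `start_memory_watch`, `get_rss`, `threshold_bytes` and `restart` are hypothetical names. Under mod_wsgi daemon mode `get_rss` could be `lambda: mod_wsgi.process_metrics()['memory_rss']`; it is passed in here so the loop can be run and tested without mod_wsgi.

```python
import os
import signal
import threading


def watch_memory(get_rss, threshold_bytes, stop, interval=1.0, restart=None):
    """Poll get_rss() until it exceeds threshold_bytes, then restart once.

    `stop` is a threading.Event that is set when the process shuts down.
    Returns True if a restart was requested, False if stopped first.
    """
    if restart is None:
        # Default: SIGUSR1 to ourselves, as in the recipe above.
        def restart():
            os.kill(os.getpid(), signal.SIGUSR1)

    # Event.wait() returns False on timeout and True once the event is set.
    while not stop.wait(interval):
        if get_rss() > threshold_bytes:
            restart()
            return True  # signal sent once; the thread ends by returning
    return False


def start_memory_watch(get_rss, threshold_bytes, interval=1.0, restart=None):
    """Start the watcher in a daemon thread; returns (thread, stop_event)."""
    stop = threading.Event()
    thread = threading.Thread(
        target=watch_memory,
        args=(get_rss, threshold_bytes, stop, interval, restart),
        daemon=True,
    )
    thread.start()
    return thread, stop
```

The signal is sent only once per process, so graceful-timeout, not the watcher, decides how long in-flight requests get to finish.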

> Here (http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html) you discuss nginx's tendency to block requests that may otherwise be executing in a different process, depending on timing, etc. Is this issue still the same (I thought I read a hint somewhere that there may be a workaround for that), so I ask.

That was related to someone's attempt to embed a Python interpreter inside the nginx processes themselves. That project died a long time ago. No one embeds Python interpreters inside nginx processes. It was a flawed design.

I don’t know what you are reading to get all these strange ideas. :-)

> LoadModule wsgi_module modules/mod_wsgi.so
> WSGIPythonHome /home/rarch/tg2env
> # see http://code.google.com/p/modwsgi/issues/detail?id=196#c10 concerning timeouts
> WSGIDaemonProcess rarch processes=20 threads=14 inactivity-timeout=1800 display-name=%{GROUP} graceful-timeout=5 python-eggs=/home/rarch/tg2env/lib/python-egg-cache

Is your web server really going to be idle for 30 minutes? I can’t see how that would have been doing anything.

Also, in mod_wsgi 4.x the conditions under which inactivity-timeout kicks in have changed.

It used to apply when there were active requests and they were blocked, as well as when no requests were running.

Now it only applies to the case where there are no requests.

The case for running but blocked requests is now handled by request-timeout.

You may be better off setting request-timeout now to a more reasonable value for your expected longest request, but setting inactivity-timeout to something much shorter.

So suggest you play with that.

Also, are your request handlers I/O or CPU intensive, and how many requests?

Such a high number of processes and threads always screams to me that half the performance problems are due to setting these too hard, invoking pathological OS process swapping issues and Python GIL issues.

Graham Dumpleton

19 Mar 2016, 1:30:34 a.m.

to mod...@googlegroups.com

That last sentence should have been:

"""

Such a high number of processes and threads always screams to me that half the performance problems are due to setting these too HIGH, invoking pathological OS process swapping issues and Python GIL issues.

"""

Kent Bower

19 Mar 2016, 10:10:09 a.m.

to mod...@googlegroups.com

Thanks Graham, few more items inline...

On Sat, Mar 19, 2016 at 1:24 AM, Graham Dumpleton <graham.d...@gmail.com> wrote:

> Where have you got the idea that using nginx would result in memory being released back to the OS once garbage collected? It isn’t able to do that.
>
> The situations in which a process is able to give memory back to the operating system are very narrow. It can only be done when the now free memory was at the top of the allocated memory. This generally only happens for large block allocations and not in normal circumstances for a running Python application.

At this point I'm not sure where I got that idea, but I'm surprised at this. For example, my previous observations of paster running wsgi were that it is quite faithful at returning free memory to the OS. Was I just getting lucky, or would paster be different for some reason?

In any case, if nginx won't solve that, then I can't see any reason to even consider it over apache/mod_wsgi. Thank you for answering that.

> So mod_wsgi provides the way of determining the amount of memory without resorting to importing psutil, which is quite fat in itself, but how you use it is up to you.

Right, that's an idea (it could even be a shell script that takes this approach, I suppose, but I like your recipe).

Unfortunately, I don't want to automate bits that can feasibly clobber blocked sessions. SIGUSR1, after graceful-timeout & shutdown-timeout, can result in ungraceful killing. Our application shares a database with an old legacy application which was poorly written to hold transactions while waiting on user input (this was apparently common two decades ago). So, unfortunately, it isn't terribly uncommon that our application is blocked at the database level waiting for someone using the legacy application who has a record(s) locked and may not even be at their desk or may have gone to lunch. Sometimes our client's IT staff has to hunt down these people or decide to kill their database session. In any case, from a professional point of view, our application should be the responsible one and wait patiently, allowing our client's IT staff the choice of how to handle those cases. So, while the likelihood is pretty low, even with graceful-timeout & shutdown-timeout set at a very high value like 5 minutes, I still run the risk of killing legitimate sessions with SIGUSR1. (I've brought this up before and you didn't agree with my gripe and I do understand why, but in my use case, I don't feel I can automate that route responsibly.... we do use SIGUSR1 manually sometimes, when we can monitor and react to cases where a session is blocked at the database level.)

inactivity-timeout doesn't present this concern: it won't ever kill anything, just silently restarts like a good boy when inactive. I've recently reconsidered dropping that way down from 30 minutes. (When I first implemented this, it was just to reclaim RAM at the end of the day, so that's why it is 30 minutes. I didn't like the idea of churning new processes during busy periods, but I've been thinking 1 or 2 minutes may be quite reasonable.)

If I could signal processes to shut down at their next opportunity (meaning the next time they are handling no requests, like inactivity-timeout), that would solve many issues in this regard for me because I could signal these processes when their RAM consumption is high and let them restart when "convenient," being the ultimate in gracefulness. SIGUSR2 could mean "the next time you are completely idle," while SIGUSR1 continues to mean "initiate shutdown now."

> I don’t know what you are reading to get all these strange ideas. :-)

Google, I suppose ;) That's why I finally asked you when I couldn't find anything more about it via Google.

> Also, are your request handlers I/O or CPU intensive, and how many requests?
>
> Such a high number of processes and threads always screams to me that half the performance problems are due to setting these too [HIGH], invoking pathological OS process swapping issues and Python GIL issues.

Yes, the requests are I/O intensive (that is, database intensive, which adds a huge overhead to our typical request). Often requests finish in under a second or two, but they also can take many seconds (not terrible for the user, but sometimes they do a lot of processing with many trips to the database).

We have several clients (companies), so the number of requests varies widely, but can get pretty heavy on busy days (like Black Friday, since they are in retail). We've played with those numbers quite a bit and without high numbers like that, responsiveness suffers because we backlog due to requests often taking several seconds.

Thanks for all your input, you've been tremendously helpful!

Kent

Graham Dumpleton

19 Mar 2016, 11:22:39 p.m.

to mod...@googlegroups.com

On 20 Mar 2016, at 1:10 AM, Kent Bower <ke...@bowermail.net> wrote:

> Unfortunately, I don't want to automate bits that can feasibly clobber blocked sessions. SIGUSR1, after graceful-timeout & shutdown-timeout, can result in ungraceful killing. Our application shares a database with an old legacy application which was poorly written to hold transactions while waiting on user input (this was apparently common two decades ago). So, unfortunately, it isn't terribly uncommon that our application is blocked at the database level waiting for someone using the legacy application who has a record(s) locked and may not even be at their desk or may have gone to lunch. Sometimes our client's IT staff has to hunt down these people or decide to kill their database session. In any case, from a professional point of view, our application should be the responsible one and wait patiently, allowing our client's IT staff the choice of how to handle those cases. So, while the likelihood is pretty low, even with graceful-timeout & shutdown-timeout set at a very high value like 5 minutes, I still run the risk of killing legitimate sessions with SIGUSR1. (I've brought this up before and you didn't agree with my gripe and I do understand why, but in my use case, I don't feel I can automate that route responsibly.... we do use SIGUSR1 manually sometimes, when we can monitor and react to cases where a session is blocked at the database level.)

If we have discussed it previously, then I may not have anything more to add.

Did I previously suggest offloading these memory-consuming tasks behind a job queue run under Celery or something else? That way they are out of the web server processes at least.
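To illustrate why moving the work out of the web server process helps, here is a minimal stand-in that uses only the standard library rather than Celery: the memory-hungry function runs in a short-lived child process, and whatever it allocated goes back to the OS when that child exits. `build_report` and `run_isolated` are hypothetical names, and the sketch assumes a POSIX system where the "fork" start method is available.

```python
import multiprocessing


def build_report(n):
    # Stand-in for a memory-hungry request handler body.
    rows = [str(i) * 10 for i in range(n)]
    return len(rows)


def run_isolated(func, *args):
    """Run func(*args) in a one-shot child process and return its result."""
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX only
    # maxtasksperchild=1 makes the worker exit after this single task,
    # so its memory is released regardless of Python's allocator.
    with ctx.Pool(processes=1, maxtasksperchild=1) as pool:
        return pool.apply(func, args)
```

A real job queue adds persistence, retries and separate worker hosts; this only shows the memory-isolation part of the idea.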

> inactivity-timeout doesn't present this concern: it won't ever kill anything, just silently restarts like a good boy when inactive. I've recently reconsidered dropping that way down from 30 minutes. (When I first implemented this, it was just to reclaim RAM at the end of the day, so that's why it is 30 minutes. I didn't like the idea of churning new processes during busy periods, but I've been thinking 1 or 2 minutes may be quite reasonable.)
>
> If I could signal processes to shut down at their next opportunity (meaning the next time they are handling no requests, like inactivity-timeout), that would solve many issues in this regard for me because I could signal these processes when their RAM consumption is high and let them restart when "convenient," being the ultimate in gracefulness. SIGUSR2 could mean "the next time you are completely idle," while SIGUSR1 continues to mean "initiate shutdown now."

That is what SIGUSR1 does if you set graceful-timeout large enough. It is SIGINT or SIGTERM which is effectively "initiate shutdown now". So there shouldn’t be a need for a SIGUSR2, as SIGUSR1 should already do what you are hoping for with a reasonable setting of graceful-timeout.

Kent Bower

21 Mar 2016, 1:01:57 p.m.

to mod...@googlegroups.com

In your recipe for a background monitoring thread watching memory consumption, after issuing the SIGUSR1, I'd probably just want the thread to exit instead of sleeping... do I just do "sys.exit()" to safely accomplish that?

Also, regarding my observations of paster returning garbage-collected memory to the OS, was I just getting lucky while monitoring (the memory was at the very top of the allocated memory)? This is a universal python issue?

Again, thanks for all your help!

Graham Dumpleton

21 Mar 2016, 6:31:11 p.m.

to mod...@googlegroups.com

On 22 Mar 2016, at 4:01 AM, Kent Bower <ke...@bowermail.net> wrote:

In your recipe for a background monitoring thread watching memory consumption, after issuing the SIGUSR1, I'd probably just want the thread to exit instead of sleeping... do I just do "sys.exit()" to safely accomplish that?

The code isn’t just sleeping. It waits on a queue object which has something placed on it when mod_wsgi is shutting down the process via atexit callback. When the thread gets that it will exit cleanly, with the main thread waiting on it to exit to ensure it isn’t running.

If you just call sys.exit() that results in a SystemExit exception being raised which causes the thread to exit but leaves an exception in the error logs.

The use of the queue is better as it ensures that threads are shut down properly when the process is shutting down; otherwise you risk the thread trying to run while the interpreter is being destroyed, causing Python to crash the process.
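For anyone following along, here is a minimal sketch of the queue-based pattern described above. It is not the original recipe, just an illustration under some assumptions: it uses Python 3's `queue` module (the traceback later in this thread is from Python 2.6, where the module is `Queue`), the names `start_memory_monitor`, `get_metrics`, `limit_mb`, `interval` and `restart` are made up for the example, and the metrics source is passed in as a callable so the code can run outside Apache. Under mod_wsgi the callable would presumably be `mod_wsgi.process_metrics`.

```python
import atexit
import os
import queue
import signal
import threading

MEGABYTE = 1048576


def start_memory_monitor(get_metrics, limit_mb, interval=1.0, restart=None):
    """Start a monitor thread; return a function that stops it cleanly.

    get_metrics is any callable returning a dict with 'memory_rss' (bytes)
    or None, e.g. mod_wsgi.process_metrics when running under mod_wsgi.
    """
    if restart is None:
        # Assumption: in daemon mode SIGUSR1 requests a graceful restart.
        def restart():
            os.kill(os.getpid(), signal.SIGUSR1)

    stop_queue = queue.Queue()

    def monitor():
        signalled = False
        while True:
            try:
                # Acts as the sleep, but wakes at once on shutdown.
                stop_queue.get(timeout=interval)
                return
            except queue.Empty:
                pass
            if signalled:
                continue  # restart already requested; just wait for shutdown
            metrics = get_metrics()
            if metrics and metrics['memory_rss'] / MEGABYTE > limit_mb:
                restart()
                signalled = True

    thread = threading.Thread(target=monitor)
    thread.daemon = True

    def stop():
        # Wake the thread and wait until it has really finished.
        stop_queue.put(True)
        thread.join()

    atexit.register(stop)
    thread.start()
    return stop
```

The thread never calls sys.exit(). After requesting the restart it simply keeps waiting on the queue until the atexit callback tells it to finish, and the main thread joins it before the interpreter is torn down. A call such as `start_memory_monitor(mod_wsgi.process_metrics, limit_mb=500)` in the WSGI script would be one way to wire it up; the 500 MB threshold is only a placeholder.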

Also, regarding my observations of paster returning garbage-collected memory to the OS, was I just getting lucky while monitoring (the memory was at the very top of the allocated memory)? This is a universal python issue?

It is a universal issue with any programs running on a UNIX system.

You may want to Google up some articles on how memory allocation in UNIX as well as in Python works.

Kent Bower

22 Mar 2016, 7:13:49 a.m.

to mod...@googlegroups.com

A huge, grateful "thank you!"

Kent

Kent Bower

28 Mar 2016, 8:42:26 a.m.

to mod...@googlegroups.com

(Graham, your suggestions and recipe for a memory monitoring thread are working beautifully. Thanks again.)

Kent Bower

7 Apr 2016, 10:27:25 a.m.

to mod...@googlegroups.com

Graham,

Under what circumstances might mod_wsgi.process_metrics() return None?

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/threading.py", line 525, in __bootstrap_inner
    self.run()
  File "/usr/local/lib/python2.6/threading.py", line 477, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/rarch/trunk/src/appserver/wsgi-config/memory_monitor.py", line 41, in monitor
    megs = metrics['memory_rss']/1048576
TypeError: 'NoneType' object is unsubscriptable

I've only seen this once in the Apache log file. Some strange timing??

Kent

Graham Dumpleton

7 Apr 2016, 3:51:35 p.m.

to mod...@googlegroups.com

When you are not using Linux or MacOS X. Or when the Linux system you are using doesn’t provide proc file system for some reason.

Also, randomly, if you are using the develop version from the Git repository from before 4.5.0 was released, or if you are using 4.5.0 itself. :-)

The random bug should be fixed by using 4.5.1.

I have now released this code, but I detected the issue about five minutes after releasing 4.5.0, as that was the first time I had tried it on Linux. Thus 4.5.1 was released to fix it.

Try again with the latest version.

Graham

Kent Bower

7 Apr 2016, 4:33:40 p.m.

to mod...@googlegroups.com

It's Linux, mod_wsgi version 4.4.22

I can go to 4.5.1, but if the proc file system isn't available for some reason (I assume transiently??), will 4.5.1 still return None?

Kent

Graham Dumpleton

7 Apr 2016, 4:37:21 p.m.

to mod...@googlegroups.com

If the proc file system isn’t available, it will always return None.

I think the proc file system should always be available, unless it is a very special Linux build. But then that means it will always return None, not transiently.

Kent Bower

7 Apr 2016, 4:51:09 p.m.

to mod...@googlegroups.com

Ok, and if I hit the 'random' problem, then will a subsequent call to mod_wsgi.process_metrics() likely succeed or will it be messed up from then on out?

Graham Dumpleton

7 Apr 2016, 4:55:40 p.m.

to mod...@googlegroups.com

In the fixed version, process_metrics() will always return a dictionary if procfs is supported. There should be no random cases where it doesn’t.

If procfs isn’t supported, or the platform is Windows or Solaris, it will always return None.

The bug which was fixed was that I was checking the wrong stack variable and that variable wasn’t initialised, so what happened depended on what random value was in the variable. Thus it was random per call whether it returned something or not. It was a bug, plain and simple, but it is now fixed.
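Since process_metrics() can legitimately return None on some platforms (and did so randomly in the affected versions), a monitoring thread is probably safer if it checks the result before indexing into it. A small sketch, where `rss_megabytes` is a hypothetical helper name and the only key assumed is `memory_rss`, as used in the traceback earlier in this thread:

```python
def rss_megabytes(metrics):
    """Return resident memory in megabytes, or None if metrics are unavailable."""
    if metrics is None:
        # No procfs support, unsupported platform, or the bug fixed in 4.5.1.
        return None
    return metrics['memory_rss'] / 1048576
```

The caller can then skip the sample when the helper returns None, instead of letting a TypeError kill the monitoring thread.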
