--
You received this message because you are subscribed to the Google Groups "modwsgi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to modwsgi+u...@googlegroups.com.
To post to this group, send email to mod...@googlegroups.com.
Visit this group at https://groups.google.com/group/modwsgi.
For more options, visit https://groups.google.com/d/optout.
On 17 Mar 2016, at 11:28 PM, Kent Bower <ke...@bowermail.net> wrote:

> My answers are below, but before you peek, Graham, note that you and I have been through this memory discussion before & I've read the vertical partitioning article and use inactivity-timeout, "WSGIRestrictEmbedded On", considered maximum-requests, etc.
>
> After years of this, I'm resigned to the fact that Python is memory hungry, especially built on many of these web-stack and database libraries. I'm OK with that. I'm fine with a high-water RAM mark imposed by running under Apache, mostly. But, dang, it sure would be great if the 1 or 2% of requests that really (and legitimately) hog a ton of RAM, say 500MB extra, didn't keep it when done. I may revisit vertical partitioning again, but last time I did I think I found that the 1 or 2% in my case generally won't be divisible by URL. In most cases I wouldn't know whether a particular request is going to need lots of RAM until after the database queries return (which is far too late for vertical partitioning to be useful).
>
> So I was mostly just curious about the status of nginx running WSGI, which doesn't solve Python's memory piggishness, but would at least relinquish the extra RAM once Python garbage collected.

Where have you got the idea that using nginx would result in memory being released back to the OS once garbage collected? It isn't able to do that.

The situations in which a process can give memory back to the operating system are very narrow. It can only be done when the now-free memory was at the top of allocated memory. This generally only happens for large block allocations, and not in normal circumstances for a running Python application.
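The distinction between large blocks and ordinary Python workloads can be seen with a small experiment. This is only a sketch: it is Linux-specific (it reads `/proc/self/statm` to get the current resident set size) and assumes a glibc-style allocator where big allocations are serviced by mmap():

```python
import gc
import os

def current_rss_bytes():
    # Linux-specific: the second field of /proc/self/statm is the
    # number of resident pages for this process
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")

before = current_rss_bytes()

# one big ~100 MB block: an allocation this large is typically
# serviced by mmap(), so freeing the object hands the pages
# straight back to the operating system
blob = b"x" * (100 * 1024 * 1024)
during = current_rss_bytes()

del blob
gc.collect()  # not strictly needed; refcounting frees the block immediately
after = current_rss_bytes()

# 'during' sits roughly 100 MB above 'before', and 'after' drops back
# down. A typical Python web application instead makes millions of
# small allocations scattered through the heap; the allocator keeps
# those freed chunks for reuse, which is why the process high-water
# mark rarely comes back down in practice.
```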
> (Have you considered a max-memory parameter to mod_wsgi that would gracefully stop taking requests and shut down after the threshold is reached, for platforms that would support it? I recall -- maybe incorrectly -- you saying on Windows or certain platforms you wouldn't be able to support that. What about the platforms that could? It seems to me to be the very best way mod_wsgi could approach this Apache RAM nuance, so it would be tremendously useful for the platforms that could support it.)

You can do this yourself rather easily with more recent mod_wsgi versions.

Create a background thread from the WSGI script file, in a similar way as the monitor for code changes does, but instead of looking for code changes, inside the main loop of the background thread do:

    import os
    import signal

    import mod_wsgi

    metrics = mod_wsgi.process_metrics()
    if metrics['memory_rss'] > MYMEMORYTHRESHOLD:
        os.kill(os.getpid(), signal.SIGUSR1)

So mod_wsgi provides a way of determining the amount of memory used without resorting to importing psutil, which is quite fat in itself, but how you use it is up to you.
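Spelled out as a complete background-monitor sketch (the threshold value, function names, and 10-second polling interval here are illustrative, and `mod_wsgi.process_metrics()` is only available when actually running inside a recent mod_wsgi daemon process):

```python
import os
import signal
import threading
import time

# hypothetical threshold: restart once resident memory exceeds ~800 MB
MEMORY_LIMIT_RSS = 800 * 1024 * 1024

def exceeds_limit(metrics, limit=MEMORY_LIMIT_RSS):
    # metrics is the dict returned by mod_wsgi.process_metrics(),
    # which includes a 'memory_rss' entry in bytes
    return metrics.get('memory_rss', 0) > limit

def memory_monitor(interval=10.0):
    import mod_wsgi  # only importable inside a mod_wsgi process
    while True:
        if exceeds_limit(mod_wsgi.process_metrics()):
            # trigger mod_wsgi's graceful restart of this daemon process
            os.kill(os.getpid(), signal.SIGUSR1)
            return  # the thread's work is done; let it exit
        time.sleep(interval)

# in the WSGI script file, start the monitor once at import time:
# threading.Thread(target=memory_monitor, daemon=True).start()
```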
Do note that if using SIGUSR1 to restart the current process (which should only be done for daemon mode), you should also set the graceful-timeout option on WSGIDaemonProcess if you have long running requests. It is the maximum time the process will wait to shut down, while still accepting requests, when doing a SIGUSR1 graceful shutdown of the process, before going into forced shutdown mode, where no new requests will be accepted and in-flight requests can be interrupted.

> Here (http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html) you discuss nginx's tendency to block requests that may otherwise be executing in a different process, depending on timing, etc. Is this issue still the same? (I thought I read a hint somewhere that there may be a workaround for that.) So I ask.

That was related to someone's attempt to embed a Python interpreter inside the nginx worker processes themselves. That project died a long time ago. No one embeds Python interpreters inside of nginx processes. It was a flawed design.

I don't know what you are reading to get all these strange ideas. :-)
> And so I wanted your opinion on nginx...
>
> ====
>
> Here is what you asked for, if it can still be useful. I'm on mod_wsgi-4.4.6, and the particular server that prompted me this time is running Apache 2.4 (prefork), though some of our clients use 2.2 (prefork). Our typical WSGI conf is something like this, though threads and processes vary depending on server size:
>
>     LoadModule wsgi_module modules/mod_wsgi.so
>     WSGIPythonHome /home/rarch/tg2env
>     # see http://code.google.com/p/modwsgi/issues/detail?id=196#c10 concerning timeouts
>     WSGIDaemonProcess rarch processes=20 threads=14 inactivity-timeout=1800 display-name=%{GROUP} graceful-timeout=5 python-eggs=/home/rarch/tg2env/lib/python-egg-cache

Is your web server really going to be idle for 30 minutes? I can't see how that setting would have been doing anything.

Also, when inactivity-timeout kicks in has changed in mod_wsgi 4.x. It used to apply both when there were active but blocked requests and when no requests were running. Now it only applies to the case where there are no requests; the case of running but blocked requests is now handled by request-timeout. You may be better off setting request-timeout to a reasonable value for your expected longest request, and setting inactivity-timeout to something much shorter. So I suggest you play with that.

Also, are your request handlers I/O or CPU intensive, and how many requests do you handle?
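That suggestion could look something like the following. This is a sketch only: the two timeout values are illustrative and would need tuning against real traffic, and the other options are simply carried over from the configuration above:

```apache
# request-timeout bounds a single running (possibly blocked) request;
# inactivity-timeout now only recycles a process with no requests at all
WSGIDaemonProcess rarch processes=20 threads=14 \
    request-timeout=300 \
    inactivity-timeout=120 \
    graceful-timeout=5 \
    display-name=%{GROUP} \
    python-eggs=/home/rarch/tg2env/lib/python-egg-cache
```

Note that request-timeout forcibly restarts a process whose requests run past the limit, so it carries the same risk for legitimately long-blocked requests that Kent raises later in the thread.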
Such a high number of processes and threads always screams to me that half the performance problems are due to setting these too high, invoking pathological OS process swapping issues and Python GIL issues.
On 20 Mar 2016, at 1:10 AM, Kent Bower <ke...@bowermail.net> wrote:

Thanks Graham, a few more items inline...

On Sat, Mar 19, 2016 at 1:24 AM, Graham Dumpleton <graham.d...@gmail.com> wrote:

> Where have you got the idea that using nginx would result in memory being released back to the OS once garbage collected? It isn't able to do that. The situations are very narrow as to when a process is able to give back memory to the operating system. It can only be done when the now free memory was at top of allocated memory.

At this point I'm not sure where I got that idea, but I'm surprised at this. For example, my previous observations of paster running WSGI were that it was quite faithful at returning freed memory to the OS. Was I just getting lucky, or would paster be different for some reason?

In any case, if nginx won't solve that, then I can't see any reason to even consider it over Apache/mod_wsgi. Thank you for answering that.

> You can do this yourself rather easily with more recent mod_wsgi version. [...] So mod_wsgi provides the way of determining the amount of memory without resorting to importing psutil, which is quite fat in itself, but how you use it is up to you.

Right, that's an idea (it could even be a shell script that takes this approach, I suppose, but I like your recipe). Unfortunately, I don't want to automate bits that can feasibly clobber blocked sessions: SIGUSR1, after graceful-timeout & shutdown-timeout, can result in ungraceful killing. Our application shares a database with an old legacy application which was poorly written to hold transactions open while waiting on user input (this was apparently common two decades ago).
So, unfortunately, it isn't terribly uncommon that our application is blocked at the database level, waiting for someone using the legacy application who has records locked and may not even be at their desk, or may have gone to lunch. Sometimes our client's IT staff has to hunt down these people or decide to kill their database session. In any case, from a professional point of view, our application should be the responsible one and wait patiently, allowing our client's IT staff the choice of how to handle those cases. So, while the likelihood is pretty low, even with graceful-timeout & shutdown-timeout set at a very high value like 5 minutes, I still run the risk of killing legitimate sessions with SIGUSR1. (I've brought this up before and you didn't agree with my gripe, and I do understand why, but in my use case I don't feel I can automate that route responsibly... we do use SIGUSR1 manually sometimes, when we can monitor and react to cases where a session is blocked at the database level.)
inactivity-timeout doesn't present this concern: it won't ever kill anything; it just silently restarts, like a good boy, when inactive. I've recently been reconsidering that, and may drop it way down from 30 minutes. (When I first implemented this, it was just to reclaim RAM at the end of the day, so that's why it is 30 minutes. I didn't like the idea of churning new processes during busy periods, but I've been thinking 1 or 2 minutes may be quite reasonable.)
If I could signal processes to shut down at their next opportunity (meaning the next time they are handling no requests, like inactivity-timeout), that would solve many issues in this regard for me, because I could signal these processes when their RAM consumption is high and let them restart when "convenient," being the ultimate in gracefulness. SIGUSR2 could mean "the next time you are completely idle," while SIGUSR1 continues to mean "initiate shutdown now."
On 22 Mar 2016, at 4:01 AM, Kent Bower <ke...@bowermail.net> wrote:

In your recipe for a background monitoring thread watching memory consumption, after issuing the SIGUSR1, I'd probably just want the thread to exit instead of sleeping... do I just call "sys.exit()" to safely accomplish that?
Also, regarding my observations of paster returning garbage-collected memory to the OS: was I just getting lucky while monitoring (i.e., the freed memory happened to be at the very top of the allocated memory)? Is this a universal Python issue?