Been trying to catch up on other stuff the last few days, which is why this response is delayed.
Over the years I have seen a number of people doing exactly what you are doing: performing image manipulation on an uploaded image and then returning the result. For one reason or another, the final outcome always seemed to be that you are better off using a backend queuing system such as Celery to handle the image manipulation. In other words, remove the processing of the images from your web application processes.
There are a few reasons why this is the case.
The first is that images and image manipulation can use a lot of transient memory. Especially when using multithreading in your web application with Python, this can result in high peak memory usage for the process. This is because a whole bunch of requests might come in at the same time, so their processing overlaps. Memory consumption will blow out to the maximum required to support that number of requests all being processed at once. When done, although the memory is released back for use by other parts of the application, the damage has already been done and the process will keep the overall high memory reservation. The end result is that most of the time the process holds a lot of unused memory, which is only used again the next time you get that many concurrent requests.
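To make that effect concrete, here is a minimal, self-contained sketch (buffer counts and sizes are arbitrary, chosen purely for illustration) showing that a process's peak memory reservation does not come back down once transient allocations are freed:

```python
import resource

def peak_rss():
    # Peak resident set size of this process so far:
    # kilobytes on Linux, bytes on macOS.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss()

# Simulate a burst of overlapping requests, each needing ~20MB of
# transient memory for image manipulation.
buffers = [b"x" * (20 * 1024 * 1024) for _ in range(3)]
during = peak_rss()

del buffers  # the memory is released back to the allocator...
after = peak_rss()

# ...but the peak reservation stays where it was.
assert during > before
assert after >= during
print(before, during, after)
```

The same thing happens with real image buffers: the Python-level objects are freed, but the process-level high-water mark remains.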
The second problem is that image manipulation can be CPU intensive. In a multithreaded application, depending on how well the image manipulation library works and how it handles the global interpreter lock, in the worst case parts of the image processing will be forced to be serialised, with requests being blocked and taking longer than they would if the processes were single threaded. In other words, image manipulation done in different threads interferes, and all the requests suffer.
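As a toy illustration of that serialisation (a pure-Python loop standing in for image work, not a benchmark of any particular library):

```python
import threading
import time

def busy(n):
    # CPU-bound pure-Python loop standing in for image manipulation.
    total = 0
    for i in range(n):
        total += i * i
    return total

N = 2_000_000

# Sequential baseline: two jobs, one after the other.
start = time.perf_counter()
busy(N)
busy(N)
sequential = time.perf_counter() - start

# Two threads: because of the GIL only one thread can execute Python
# bytecode at a time, so this takes about as long as the sequential
# run rather than half the time.
start = time.perf_counter()
threads = [threading.Thread(target=busy, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential={sequential:.2f}s threaded={threaded:.2f}s")
```

A C extension that releases the GIL during its number crunching would fare better, which is why how well the image library behaves matters so much here.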
The third is that if using embedded mode of mod_wsgi, you can see problems with the per request thread memory pool usage of the Apache worker processes (in which the Python code is running) blowing out due to large response sizes. In older versions of Apache, up to 8MB could be held in the per request thread memory pool, and only memory above that limit would actually be released. Thus if you have a lot of threads per worker process, that means 8MB of memory stays reserved for each worker thread. In more recent Apache versions the sample configuration that comes with Apache drops this to 2MB, but if the distro has removed that setting from the original Apache sample configuration, or you remove it, then I believe it defaults back to 8MB.
Using a backend Celery task system avoids the first two issues, as the work is done in a separate process, and that process can even be recycled after every task, so you avoid the problem of unused memory hanging around reserved. The Celery processes are also single threaded, eliminating Python global interpreter lock issues.
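Celery itself needs a broker running, so as a self-contained sketch of the same principle using only the standard library (function and variable names here are hypothetical, invented for the example): the request handler hands the work to a separate worker process, and the transient memory lives and dies in that process, not in the web application.

```python
from concurrent.futures import ProcessPoolExecutor
import multiprocessing

def shrink(data):
    # Stand-in for real image manipulation (e.g. thumbnailing with
    # Pillow); pretend the result is half the size of the input.
    return len(data) // 2

# The web request handler only hands the work off and collects the
# result; the heavy lifting happens in a separate worker process whose
# memory is reclaimed when the pool shuts the worker down.
ctx = multiprocessing.get_context("fork")  # POSIX-only start method
with ProcessPoolExecutor(max_workers=2, mp_context=ctx) as pool:
    result = pool.submit(shrink, b"fake-image-bytes" * 1000).result()

print(result)
```

With Celery the shape is the same, except the handler enqueues a task via the broker and typically returns immediately, with the client polling or being notified for the result, so the web process never blocks on the image work at all.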
The third problem above can be lessened by ensuring the Apache configuration directive for setting per request memory pool size is actually set, and lower the value if necessary. How you configure the Apache MPM settings can also affect this.
In general though, the first recommendation is always that you avoid using mod_wsgi embedded mode at all and use daemon mode instead. This avoids various problems caused by the choice of Apache MPM and its settings.
So if you can't change to Celery in the short term, at least switch to daemon mode.
In doing this, ensure that embedded mode is disabled completely by setting:
WSGIRestrictEmbedded On
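A minimal daemon mode fragment pulling this together might look like the following; the process/thread counts, the name myapp and the script path are placeholders for illustration, not recommendations:

```apache
# Hypothetical example; tune processes/threads for your site.
WSGIRestrictEmbedded On

WSGIDaemonProcess myapp processes=3 threads=5 display-name=%{GROUP}
WSGIProcessGroup myapp
WSGIApplicationGroup %{GLOBAL}

WSGIScriptAlias / /var/www/myapp/wsgi.py
```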
Also reduce the per request thread pool size. Where the Apache worker processes are only acting as a proxy to the mod_wsgi daemon processes, the value I set in the mod_wsgi-express configuration is:
ThreadStackSize 262144
Thus 0.25MB per thread instead of 2MB or 8MB.
Another dangerous setting you were using, which would have caused lots of problems when using embedded mode, was:

MaxRequestsPerChild 100

This would cause Apache to restart your application processes too frequently, resulting in higher CPU usage due to the high startup cost. In mod_wsgi-express I don't set this at all.
The next problem is:
KeepAliveTimeout 45
In mod_wsgi-express, I set this to 2 seconds. By having such a high value you risk problems, especially when using the worker MPM, although the event MPM can have its own issues. With a lower value, you may not need as many Apache worker processes and threads.
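In other words, something closer to what mod_wsgi-express generates:

```apache
# Keep keep-alive, but time out idle connections quickly.
KeepAlive On
KeepAliveTimeout 2
```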
The question now is why you were restarting processes after 100 requests. Was this an attempt to try and keep memory usage down?
One of the consequences of this is that you would possibly see a lot of interrupted requests. This is what those warning messages about killing off processes are about. Apache will only wait so long for processes to shut down. Depending on how shutdown is managed, this can be as little as 5 seconds, but since you have long running requests, they can hold up the shutdown, so Apache kills the processes anyway, and that is why requests can be interrupted. You really want to avoid periodic restarts of the Apache child worker processes using that option.
If you do have a growing memory problem because of issues with your application code, there are various ways you can trigger restarts of the mod_wsgi daemon processes, but these self initiated restarts allow for a graceful restart timeout. Thus for the WSGIDaemonProcess directive you can set the options:
maximum-requests=100 graceful-timeout=120
So when 100 requests have arrived, a restart of the process will be signalled, but since the graceful timeout is set to 120 seconds, the process will only be forcibly restarted after 120 seconds. In the interim, if the number of active requests being handled by the process drops to 0, the restart will be triggered at that point. This way it limits the interruption of active requests. You will still have issues if requests get blocked indefinitely, as the process then never reaches the point of having no active requests, but if that is occurring, and that is why you were restarting so frequently, you have bigger problems.
For the latter, if you are getting stuck requests, you want to look at the request-timeout option of WSGIDaemonProcess.
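Putting those options together on the daemon process definition might look like this; the values are only examples to tune for your site:

```apache
# Hypothetical: recycle the process after 100 requests, but give
# active requests up to 120 seconds to drain; abort a process whose
# requests are stuck for more than 60 seconds.
WSGIDaemonProcess myapp processes=3 threads=5 \
    maximum-requests=100 graceful-timeout=120 request-timeout=60
```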
Anyway, for further guidance on setting up mod_wsgi daemon mode, I would suggest watching:
The defaults for mod_wsgi daemon mode are not the best options for historical reasons. The video talks about that and how mod_wsgi-express sets different defaults.
To start with, that is probably all I can suggest. Giving recommendations on tuning the Apache MPM settings and mod_wsgi daemon mode is harder to do at this point.
Summarising things: use Celery as an out of process means of handling the image manipulation. If you can't do that for now, try to switch to mod_wsgi daemon mode, as that will allow memory and CPU usage to be better controlled.
Graham