Long Running WSGI Import Script

Steve Clark

Jun 12, 2020, 6:05:03 PM

to modwsgi

Hi There,

I'm currently having an issue and was hoping for a little advice on workarounds or ways to solve it.

We need to extract a lot of information from memcached before serving requests. I thought I had solved this by using a WSGIDaemonProcess that my WSGI scripts can use, and then simply loading all the memcached data into a singleton object at the module level of the WSGI scripts. This all seems to work perfectly: it takes around 40 seconds to load all of the data, but works fine after that.
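To make the setup concrete, here is a minimal sketch of a WSGI script that preloads data into a module-level singleton. It is only an illustration: `load_all_from_cache` and `DataStore` are hypothetical names, and the loader is a stand-in that returns a small dict rather than talking to memcached.

```python
import time


def load_all_from_cache():
    """Hypothetical stand-in for the slow memcached extraction."""
    time.sleep(0.01)  # the real load reportedly takes around 40 seconds
    return {"greeting": "hello"}


class DataStore:
    """Process-wide singleton holding the preloaded data."""

    _instance = None

    def __init__(self, data):
        self.data = data

    @classmethod
    def instance(cls):
        # Build the singleton on first use, then always return the same object.
        if cls._instance is None:
            cls._instance = cls(load_all_from_cache())
        return cls._instance


# Runs once per process when the WSGI script is imported, so the
# process cannot answer requests until this call returns.
STORE = DataStore.instance()


def application(environ, start_response):
    body = STORE.data["greeting"].encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Because the load happens at import time, every freshly started process pays the full load cost before it can serve its first request.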

My problem comes when updating the WSGI files or running the apachectl graceful command. At that point all subsequent requests have to wait for the 40-second data load to complete.

Our current deployment is using mod_wsgi 3.4, but I've updated to 4.5 and have the same issue.

Is there a way for mod_wsgi to only start tearing down processes after there are new processes ready and fully loaded to handle requests?

The only option I can see to move forward is to load the data required in a background thread. This way requests can still be handled, but they may have to load a section of the data required from memcached at request time. Not the ideal situation as this is what I'm trying to move away from with the pre-loading of this data. But at least this way it will only happen if a process goes down, or someone manually gracefully restarts apache.

I would really appreciate some advice on how best to move forward. I did try to search this forum for a similar question, but couldn't quite find the correct search terms.

Thank you

Steve

Jason Garber

Jun 12, 2020, 8:14:35 PM

to mod...@googlegroups.com

Hi Steve,

I can't directly answer your question, but wanted to suggest that you create your own index... surely you don't need all the data on every request. Can you boil it down and keep it in redis or memcached, in a server that is made for that?

Alternately, run mod_wsgi Express outside of apache and use it to serve the data part of your request to your other processes in Apache.

Jason

--
You received this message because you are subscribed to the Google Groups "modwsgi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to modwsgi+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/modwsgi/bf2e4b18-458f-43e3-9130-363f1d524335o%40googlegroups.com.

Steve Clark

Jun 15, 2020, 5:49:11 AM

to modwsgi

Thanks for your comment Jason,

Over the last few weeks, I've been working on splitting up the data stored in memcached to make it smaller per request. But a lot of processing goes into creating each object, which is then pickled and compressed before being loaded into memcached, which runs on the same machine as Apache. In short, we currently load only what is needed at request time (and then cache it for subsequent requests), but the request that does this loading takes too long.
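For reference, the store-and-load path described above looks roughly like the sketch below. It assumes zlib for the compression (the actual codec isn't stated here) and uses a hypothetical in-memory `FakeCache` in place of a real memcached client. `store_object`, `load_object` and the `_local` dict are illustrative names only.

```python
import pickle
import zlib


class FakeCache:
    """Hypothetical in-memory stand-in for a memcached client (get/set only)."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)


def store_object(cache, key, obj):
    """Pickle and compress an object, then put the bytes in the cache."""
    blob = zlib.compress(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))
    cache.set(key, blob)


_local = {}  # per-process cache of objects that are already unpacked


def load_object(cache, key):
    """Return the object for key, unpacking it only on first use."""
    if key in _local:
        return _local[key]
    blob = cache.get(key)
    if blob is None:
        raise KeyError(key)
    # Only unpickle data written by a trusted producer.
    obj = pickle.loads(zlib.decompress(blob))
    _local[key] = obj
    return obj
```

The first `load_object` call for a key pays the fetch, decompress and unpickle cost. Later calls in the same process return the cached object directly.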

We're currently running in the default way (not via a daemon process), and this results in occasional spikes in our request times. I'm assuming this is due to processes being spun up and down automatically by apache or mod_wsgi?

I'm intrigued by what you've mentioned about mod_wsgi Express. Is there a way for that to spawn a background process (or thread?) that shares the same memory as the standard mod_wsgi requests?

Thanks again for your help Jason,

Steve
