Re: [modwsgi] 404 on first requests for daemon process


Graham Dumpleton

Nov 6, 2012, 6:32:04 AM
to mod...@googlegroups.com
This sounds like the typical problem with code that is not thread
safe, or where you have import order dependencies in the code so that
things only work properly if URLs are requested in a certain order.
Once the critical URL has been hit and the modules have finally been
imported properly, everything works fine.

You could always use a WSGI middleware wrapper to log the order in
which URL requests arrive, and their responses, for the initial life of
the process. Then replay any captured order in a test environment.
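
As a rough illustration of such a wrapper, here is a minimal sketch. The
class name RequestOrderLogger, the limit parameter and the log format are
all made up for this example; only the WSGI calling convention itself is
standard.

```python
import itertools
import sys
import threading
import time


class RequestOrderLogger:
    """WSGI middleware that logs the first `limit` requests seen by a process.

    Hypothetical helper for illustration; not part of mod_wsgi.
    """

    def __init__(self, application, limit=200, stream=None):
        self.application = application
        self.limit = limit
        # Under mod_wsgi, stderr normally ends up in the Apache error log.
        self.stream = stream if stream is not None else sys.stderr
        self.counter = itertools.count(1)
        self.lock = threading.Lock()

    def __call__(self, environ, start_response):
        with self.lock:
            seq = next(self.counter)
        if seq > self.limit:
            # Past the initial life of the process: pass straight through.
            return self.application(environ, start_response)

        method = environ.get("REQUEST_METHOD", "-")
        path = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
        thread = threading.current_thread().name
        self._log("%d IN  %s %s thread=%s" % (seq, method, path, thread))

        def logging_start_response(status, headers, exc_info=None):
            # Record the status line the application chose for this request.
            self._log("%d OUT %s %s status=%s" % (seq, method, path, status))
            return start_response(status, headers, exc_info)

        return self.application(environ, logging_start_response)

    def _log(self, message):
        with self.lock:
            self.stream.write("[%.3f] %s\n" % (time.time(), message))
            self.stream.flush()
```

It would be applied in the WSGI script file by rebinding the entry point,
for example application = RequestOrderLogger(application). If a 404 shows
up in an OUT line, the application itself produced it.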

Another possibility is that you rely on a caching system being loaded,
but loading only occurs on the initial request, and concurrent requests
wrongly think the cache is loaded when the process of loading it has
only just started. Those concurrent requests can fail. This can be due
to the caching system not being thread safe in access or update.
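
To make that failure mode concrete, here is a minimal sketch of lazy
loading that is safe against concurrent first requests. LazyCache and its
loader argument are hypothetical names, not anything from the poster's
application; the point is only that the cache is published after it is
completely built, and that other threads wait on a lock in the meantime.

```python
import threading


class LazyCache:
    """Loads data once; concurrent callers block until loading has finished."""

    def __init__(self, loader):
        self._loader = loader          # hypothetical callable returning a dict
        self._lock = threading.Lock()
        self._data = None              # set only after loading has completed

    def get(self, key, default=None):
        data = self._data
        if data is None:
            with self._lock:
                # Re-check under the lock: another thread may have
                # finished loading while this one was waiting.
                if self._data is None:
                    self._data = self._loader()
                data = self._data
        return data.get(key, default)
```

The unsafe variant would assign an empty dict first and fill it in
afterwards, which is exactly what lets a second thread see a "loaded"
cache that is still missing entries.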

Graham

On 6 November 2012 17:06, mtk <mtt...@gmail.com> wrote:
> Hi everyone,
>
> We've been trying to debug a strange problem in our mod_wsgi/apache
> configuration. We have a mostly working configuration that serves thousands
> of successful requests per day, but a few requests will incorrectly return a
> 404 (same request will work again immediately after). We noticed that when
> we restart our webservers, we also get a bunch of 404s for the first
> requests which we always handled by quickly refreshing the page until they
> stopped. We've configured our daemon processes to restart after several
> thousand requests, so it seems like the random 404s are caused by a daemon
> process just starting (looking through apache logs, the access log reports a
> 404 and there's output in the error log indicating our app modules have been
> loaded again at the same time, but nothing else). Our wsgi application is a
> bit slow to load (on the order of a couple hundred milliseconds, and maybe 1 or
> 2 seconds if there's a lot of load), so maybe there is some kind of timeout
> the apache processes have waiting for results from a daemon?
>
> Really appreciate any ideas or suggestions on how to debug.
>
> Here's relevant version and config info:
> Apache/2.2.22
> Python/2.7.3
> mod_wsgi/3.3
> Werkzeug/Flask 0.8
>
>
> apache2.conf:
> LoadModule wsgi_module modules/mod_wsgi.so
> ...
> WSGIRestrictEmbedded On
> WSGISocketPrefix /var/run/wsgi
> WSGIPythonHome /python/home/directory
> ...
> <VirtualHost *:80>
> ...
> WSGIDaemonProcess group_name processes=10 threads=15 display-name=worker maximum-requests=10000
> WSGIProcessGroup group_name
> WSGIScriptAlias / /path_to_wsgi
> WSGIApplicationGroup %{GLOBAL}
> ...
> </VirtualHost>
>
> --
> You received this message because you are subscribed to the Google Groups
> "modwsgi" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/modwsgi/-/qINw6J75kHAJ.
> To post to this group, send email to mod...@googlegroups.com.
> To unsubscribe from this group, send email to
> modwsgi+u...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/modwsgi?hl=en.

Gnarlodious

Nov 10, 2012, 9:45:33 AM
to mod...@googlegroups.com
I get the same thing happening. My webapp takes a while to initialize, and meanwhile all requests get the error. After uploading a new version I run a curl command to make it start up, which minimizes the chance that a user will get the error. I have also noticed that browsers are becoming more standards-compliant, so their cache handling and request timeouts can make diagnosis difficult. Requesting the first page with curl has the benefit of being more patient than a browser, and with no cache, obsolete pages are not returned.
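
The same warm-up idea can be scripted instead of done by hand. This is
only a sketch: warm_up, fetch_status and their parameters are names
invented for the example, and it does not handle connection failures
(urllib raises URLError for those).

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=30):
    """Return the HTTP status code for a GET of `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # 4xx/5xx responses are raised as HTTPError


def warm_up(url, attempts=10, delay=1.0, fetch=fetch_status):
    """Request `url` until it returns 200; return the list of statuses seen."""
    seen = []
    for attempt in range(attempts):
        status = fetch(url)
        seen.append(status)
        if status == 200:
            break
        if attempt + 1 < attempts:
            time.sleep(delay)  # give the application time to finish starting
    return seen
```

Note that one successful request only initializes whichever process
handled it, so with several daemon processes this reduces the chance of a
user seeing the error rather than removing it.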

I can imagine that in the future mod_wsgi will send a minimal HTTP header to prevent the browser from showing an error or cached page during webapp startup.

-- Gnarlie

Joonas Lehtolahti

Nov 10, 2012, 10:27:34 AM
to mod...@googlegroups.com
On Sat, 10 Nov 2012 16:45:33 +0200, Gnarlodious <gnarl...@gmail.com>
wrote:
But if the error is 404, that comes from the server side. To my knowledge
no client generates a fake 404 error; it is the server's response saying
that the requested resource was not found. In this light, mod_wsgi sending
a minimal HTTP header at startup won't do anything, since the server *is*
sending a header and the browser is getting it. It's only that the response
is not what you would expect.

Graham (or others who know the internals of mod_wsgi), is there any case
where Apache would send a 404 if mod_wsgi is stalling, or will it always
wait for the response from the WSGI application or time out with some
other error? I can't imagine Apache randomly sending a 404 for a resource
handled by a specific handler without the handler itself returning 404.
(Could it be that the webapp itself is sending the 404?)

- Joonas

Graham Dumpleton

Nov 10, 2012, 7:37:21 PM
to mod...@googlegroups.com
This sounds more like you are doing lazy initialisation of application
data/caches on the first request and they are not properly protected
against access from multiple threads. The first thread probably does
enough setup to make subsequent requests think that initialisation has
completed when it hasn't, so when they hit the cache or whatever it
is, they fail to find what they are looking for and return 404.

This, though, depends on URL dispatch somehow being dependent
on information in a cache.

Can you provide more details about what the URLs that are returning
404 are targeting? Are they something where the application itself
would return 404?

Consider using:

http://code.google.com/p/modwsgi/wiki/DebuggingTechniques#Tracking_Request_and_Response

to log requests going into your application and determine if the 404
is coming from it.

I can think of no reason Apache/mod_wsgi itself would cause a 404 to
be returned during any startup phase.

Graham