How to increase apache2 timeout for slow wsgi application


Dave Williams

Aug 24, 2017, 6:50:45 AM
to modwsgi
I have an app running under mod_wsgi on a real apache2 instance which takes a LONG time to process requests. This is due to huge quantity of database operations in a single transaction and is fine for our application requirements.
Due to database locking both the original request (and other requests) have to wait. These fail with gateway timeouts. To resolve this I simply increased the apache2.conf Timeout setting to the worst case number of seconds.

I now have moved to a docker environment using Graham Dumpleton's excellent mod_wsgi-express onbuild container. This uses defaults for apache2 and as a result I am back with the same errors.

I have spent a morning wandering around the internet and cannot find out the "correct" way to increase the Timeout to the value I require.
Clearly I could update the prepared httpd config file after it is created but this doesn't seem right.

The mod_wsgi-express start-server help talks about various timeouts but I never needed to tweak anything on my "real" instance in this area.  Are some of these relevant to my issue?
I don't want to randomly start changing things.

Some comments (eg Graham's blogs) indicate httpd settings propagate through (via SERVER_ARGS) and the mod_wsgi-start-server script shows it can be set up in a .whiskey/server_args file (but no required format given).
The source code for mod_wsgi-express is rather opaque in how it starts the httpd instance and under what environment or option switch settings.


Please can anyone help?

Thanks

Graham Dumpleton

Aug 25, 2017, 12:29:51 AM
to mod...@googlegroups.com

> On 24 Aug 2017, at 8:49 pm, Dave Williams <flash...@gmail.com> wrote:
>
> I have an app running under mod_wsgi on a real apache2 instance which takes a LONG time to process requests. This is due to huge quantity of database operations in a single transaction and is fine for our application requirements.
> Due to database locking both the original request (and other requests) have to wait. These fail with gateway timeouts. To resolve this I simply increased the apache2.conf Timeout setting to the worst case number of seconds.

If using straight Apache/mod_wsgi (not mod_wsgi-express), requests shouldn't fail with gateway timeouts. So I can only assume that the error message you mention is what you are now seeing with mod_wsgi-express. The error would be different if you were using Apache/mod_wsgi directly and had configured it yourself.

> I now have moved to a docker environment using Graham Dumpleton's excellent mod_wsgi-express onbuild container. This uses defaults for apache2 and as a result I am back with the same errors.

FWIW, I really don't recommend onbuild Docker images. I am also not putting new effort into the mod_wsgi-docker image anymore as it isn't built to what I would regard as best practices. Unfortunately people are dependent on the way it worked and so I can't change it. I have newer images I have been working on which do things in a much better way.

The main problem with onbuild and the mod_wsgi-docker image is that it results in people running their container as root. This is really not a good idea.

> I have spent a morning wandering around the internet and cannot find out the "correct" way to increase the Timeout to the value I require.
> Clearly I could update the prepared httpd config file after it is created but this doesn't seem right.
>
> The mod_wsgi-express start-server help talks about various timeouts but I never needed to tweak anything on my "real" instance in this area. Are some of these relevant to my issue?
> I don't want to randomly start changing things.
>
> Some comments (eg Graham's blogs) indicate httpd settings propagate through (via SERVER_ARGS) and the mod_wsgi-start-server script shows it can be set up in a .whiskey/server_args file (but no required format given).
> The source code for mod_wsgi-express is rather opaque in how it starts the httpd instance and under what environment or option switch settings.

You can list additional command line arguments for mod_wsgi-express in the .whiskey/server_args file. These will be added at the end of any builtin arguments added by the startup script and so can override prior options if need be. The arguments should be able to be on separate lines if you want.
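As a rough illustration of why arguments appended at the end take precedence, here is a minimal sketch. It assumes an optparse-style parser and a plain whitespace split of the file contents; the built-in argument list and the file text are illustrative only and are not copied from the actual startup script.

```python
from optparse import OptionParser


def build_parser():
    # A tiny stand-in parser with three of the options discussed in this
    # thread. The real mod_wsgi-express parser has many more options.
    parser = OptionParser()
    parser.add_option("--port", type="int", default=8000)
    parser.add_option("--request-timeout", type="int", default=60)
    parser.add_option("--socket-timeout", type="int", default=60)
    return parser


# Hypothetical arguments supplied by a startup script.
builtin_args = ["--port", "80", "--request-timeout", "60"]

# Hypothetical contents of .whiskey/server_args, one option per line.
server_args_text = """
--request-timeout 0
--socket-timeout 21600
"""

# Assumption: the file is split on whitespace, much as a shell would do.
extra_args = server_args_text.split()

# For a plain "store" option the last occurrence on the command line wins.
options, _ = build_parser().parse_args(builtin_args + extra_args)
print(options.port, options.request_timeout, options.socket_timeout)
# prints: 80 0 21600
```

Because the file contents are appended after the built-in arguments, the later --request-timeout replaces the earlier one.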

In your case the main options you are going to want to play with are:

  --request-timeout SECONDS
                        Maximum number of seconds allowed to pass before the
                        worker process is forcibly shutdown and restarted when
                        a request does not complete in the expected time. In a
                        multi threaded worker, the request time is calculated
                        as an average across all request threads. Defaults to
                        60 seconds.

  --connect-timeout SECONDS
                        Maximum number of seconds allowed to pass before
                        giving up on attempting to get a connection to the
                        worker process from the Apache child process which
                        accepted the request. This comes into play when the
                        worker listener backlog limit is exceeded. Defaults to
                        15 seconds.

  --socket-timeout SECONDS
                        Maximum number of seconds allowed to pass before
                        timing out on a read or write operation on a socket
                        and aborting the request. Defaults to 60 seconds.

  --queue-timeout SECONDS
                        Maximum number of seconds allowed for a request to be
                        accepted by a worker process to be handled, taken from
                        the time when the Apache child process originally
                        accepted the request. Defaults to 30 seconds.

  --response-socket-timeout SECONDS
                        Maximum number of seconds allowed to pass before
                        timing out on a write operation back to the HTTP
                        client when the response buffer has filled and data is
                        being forcibly flushed. Defaults to 0 seconds
                        indicating that it will default to the value of the
                        'socket-timeout' option.

You may also need to use:

  --processes NUMBER    The number of worker processes (instances of the WSGI
                        application) to be started up and which will handle
                        requests concurrently. Defaults to a single process.

  --threads NUMBER      The number of threads in the request thread pool of
                        each process for handling requests. Defaults to 5 in
                        each process.

  --max-clients NUMBER  The maximum number of simultaneous client connections
                        that will be accepted. This will default to being 1.5
                        times the total number of threads in the request
                        thread pools across all process handling requests.

Questions I would need answered in order to make recommendations are:

* How long does a request actually take up to?
* How many processes are you running? It will be 1 if you didn't change it.
* How many threads are you running in that process for handling requests? It will be 5 if you didn't change it.
* How many concurrent requests do you expect?

For an I/O bound application, first thing you might do is increase the number of threads. So add to 'server_args' file:

--threads 20

This gives you more capacity to handle concurrent requests.

If you still have various CPU bound activities, you should balance things out by increasing the number of processes at the same time.

--threads 10
--processes 2

Either way, it allows you to have more requests in the Python application waiting directly on the database operation to finish.

If the number of concurrent requests exceeds capacity (threads*processes), then requests will queue up.

If those requests are queued for more than the queue timeout, they will be failed before even being handled and you will see the gateway timeout for those.

The first preference is to increase the capacity, because timing out requests in the queue is important for recovering a server which temporarily gets stuck and builds up a backlog of requests. If your clients are users, they have probably given up after a while, and there is no point handling a request which has been sitting there a long time when you are already trying to catch up because you reached capacity.

If you still want to allow requests to be queued for longer, then use:

--queue-timeout 30

If you make that so large that requests back up into the Apache child processes, then you may want to increase the maximum number of clients, if you are happy with that.

--max-clients 40

This defaults to 1.5*threads*processes. When you set it you need to work out a value yourself. It should be greater than threads*processes.
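The capacity arithmetic above can be sketched in a few lines. This is only an illustration of the formulas stated in this message: the function names are made up for the example, and how mod_wsgi-express rounds the 1.5x default is not modelled.

```python
def capacity(processes=1, threads=5):
    """Requests the WSGI application can work on at once (threads*processes)."""
    return processes * threads


def default_max_clients(processes=1, threads=5):
    """The 1.5*threads*processes default described above (rounding not modelled)."""
    return 1.5 * capacity(processes, threads)


def max_clients_ok(max_clients, processes=1, threads=5):
    """A manually chosen --max-clients should exceed threads*processes."""
    return max_clients > capacity(processes, threads)


print(capacity())                                      # defaults -> 5
print(capacity(processes=2, threads=10))               # -> 20
print(default_max_clients(processes=2, threads=10))    # -> 30.0
print(max_clients_ok(40, processes=2, threads=10))     # -> True
```

Any concurrent requests beyond the capacity figure wait in the queue, where --queue-timeout applies.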

As to the Timeout directive, that is a base timeout applying to any single read or write operation on a socket. It can be modified using --socket-timeout, or for specific sub cases the --response-socket-timeout option.

You want to be a little more careful playing with that one. It really depends on how long your request is going to take and needs to be considered in relation to --request-timeout as well.

The request timeout is a notional cap on how long a request is allowed to run before it is failed. How it works is a bit tricky though. In a threaded process, the request timeout is actually an average across all request threads for active requests. So although it says 60 seconds, a request can actually run longer if there were no other requests running and so there was still free capacity. It would only kick in at 60 seconds if you had concurrent requests all start at the same time and they all hit 60 seconds running time. This is why I really need to understand how long your requests actually run for before I can say what to do about it.
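One way to picture the averaging is the simplified model below. It is an assumption made for illustration, not the actual mod_wsgi implementation: idle threads are counted as contributing zero running time, all active requests are assumed to have started together, and the function name is hypothetical.

```python
def seconds_until_timeout(request_timeout, threads, active_requests):
    """Running time at which the average across all threads hits the timeout.

    Simplified model: average = active_requests * running_time / threads,
    so the limit is reached when running_time = request_timeout * threads
    / active_requests.
    """
    if not 1 <= active_requests <= threads:
        raise ValueError("active_requests must be between 1 and threads")
    return request_timeout * threads / active_requests


# All 5 threads busy with requests that started together: 60 seconds.
print(seconds_until_timeout(60, threads=5, active_requests=5))   # 60.0

# A single long request with 4 idle threads can run much longer.
print(seconds_until_timeout(60, threads=5, active_requests=1))   # 300.0
```

Under this model the effective limit for a lone request grows with the number of idle threads, which is why the configured value alone does not say when a request will be cut off.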

Graham

Dave Williams

Aug 25, 2017, 4:45:17 AM
to modwsgi
Thanks Graham for your prompt response.

To answer your questions:
- Yes using mod_wsgi-express.
- onbuild: I am not wedded to this - I was just following the line of least resistance to port my application. What precisely are you recommending and what do I need to change? (I am running docker-compose to orchestrate multiple wsgi containers behind haproxy acting as application selector.)
- The app is part of a jenkins build pipeline. The request normally takes around 60-120 seconds. In the worst case it takes 6 hours (I know!). We have 100,000+ lines of XML across around 100 files to parse (submitted with the request) and to write to the database as a single transaction. It is almost impossible to split these up into small transactions and roll back if anything should fail. It would also leave the database in a "live" but inconsistent state until done. This has been running in full production for a couple of years now without issue so although not architecturally nice - it works. The dockerisation was part of providing more concurrency.
- We have 1 process (default) running on each wsgi container.
- We have 5 threads (default)
- Normally expect only 1 concurrent request that writes to the database (most of the DB is write locked due to the transaction - so pointless trying more). We try to serialise access via jenkins pipelines and reference an unoccupied wsgi container - but expect a small number of concurrent database "read only client" accesses (20 worst case estimate).

Currently (bypassing the haproxy) and talking direct to a mod_wsgi container we are getting gateway timeouts.
- I found .whiskey/server_args last night and added an --include-file reference so my /tmp/mod_wsgi-localhost:80:0/httpd.conf now contains Include '/app/extra.conf' so I can add apache2 directives there.
I also tried --socket-timeout and --request-timeout as you have subsequently recommended. 

It appears that Apache Timeout and --socket-timeout are the critical items that need to be increased.

Whilst that has moved me forward a bit I am subsequently hitting other problems which result in the wsgi process being abruptly restarted (with little or nothing reported in the apache logs) when sent test data taken from my production environment.
I am seeing "Truncated or oversized response headers received from daemon process localhost:80" or simply "nothing reported" followed by "[MainThread] Returning wsgi_app" from my wsgi bootstrap as it restarts. The client sees a 500 internal server error in this case.
I am not sure if these relate to timeouts or how to debug further. They are very consistent in terms of time after the run starts (20 seconds). 

Dave

Dave Williams

Aug 25, 2017, 5:29:45 AM
to modwsgi
Regarding the Truncated/Oversize response headers etc:
a) The app does not send anything back to the user beyond a few lines of HTML and a 200 (if successful).
b) The post request contains a tar.gz file form element of around 4MB so although apache2 doesn't have a specific limit the post puts a load on the system, especially when it's subsequently decompressed by python in the app.
c) The jessie based container doesn't contain tcp_rmem/wmem entries under /proc/sys/net with which to determine socket buffer sizes (which is what mod_wsgi uses by default).
 
I suspect I am running out of resources somewhere and maybe the error message (where it does show) is simply misleading.



Dave Williams

Aug 25, 2017, 6:20:52 AM
to modwsgi
Setting info level debug on Apache gives a little more:
mod_wsgi (pid=23): Aborting process 'localhost:80'
mod_wsgi (pid=23): Exiting process 'localhost:80'
[client 10.10.66.7:39118] Truncated or oversized response headers received from daemon process 'localhost:80': /tmp/mod_wsgi-localhost:80:0/htdocs/submit
mod_wsgi (pid=23): Process 'localhost:80' has died, deregister and restart it.
mod_wsgi (pid=23): Process 'localhost:80' has been deregistered and will no longer be monitored.
mod_wsgi (pid=61): Starting process 'localhost:80' with uid=1001, gid=0 and threads=5.
mod_wsgi (pid=61): Python home /usr/local/python.
mod_wsgi (pid=61): Initializing Python.
mod_wsgi (pid=61): Attach interpreter ''.
mod_wsgi (pid=61): Imported 'mod_wsgi'.

I am not sure how you should set ONE_PROCESS as a docker environment variable so it gets picked up by Apache to set WSGIApplicationGroup to %{GLOBAL}. This appears to be a potential way of fixing the problem. It appears it gets appended via -D if you enable debug mode within mod_wsgi with --debug-mode but I wouldn't want that in production.

Dave Williams

Aug 25, 2017, 6:43:45 AM
to modwsgi
Setting --debug-mode gets me a lot further (now hitting a test data set DB insertion python exception which is my problem to fix).
So if it's the ONE_PROCESS that is helping, how do I set it properly?


Graham Dumpleton

Aug 25, 2017, 8:26:38 AM
to mod...@googlegroups.com
Very quickly on one point as is late.

You want to override --request-timeout option. This defaults to 60 seconds. Even if you only had one request, with 5 threads the WSGI process will be force restarted after 300 seconds. If you don't want a timeout try --request-timeout 0. That way requests can run as long as they like and is what straight Apache/mod_wsgi behaves like.


Dave Williams

Aug 25, 2017, 12:00:43 PM
to modwsgi

> Very quickly on one point as is late.
>
> You want to override --request-timeout option. This defaults to 60 seconds. Even if you only had one request, with 5 threads the WSGI process will be force restarted after 300 seconds. If you don't want a timeout try --request-timeout 0. That way requests can run as long as they like and is what straight Apache/mod_wsgi behaves like.


OK - I will try as soon as I can (it's the long summer "bank holiday" weekend here in the UK so I might not be allowed anywhere near a computer!).

With apache "extra.conf" settings still set to
Timeout 21600
RequestReadTimeout header=300-360,MinRate=500 body=300-360,MinRate=500
LogLevel info

and mod_wsgi-express set to
--include extra.conf  # hmm shouldn't that be --include-file but it appears to work 
--socket-timeout 3600
--request-timeout 60
--debug-mode

 I ran a test data request for 85 minutes before it fell over with another (unrelated and stupid) python exception that I have now fixed.
I am repeating that run to make sure this aspect is out of the equation.


Dave 

Graham Dumpleton

Aug 25, 2017, 8:02:20 PM
to mod...@googlegroups.com
> On 26 Aug 2017, at 2:00 am, Dave Williams <flash...@gmail.com> wrote:
>
>> Very quickly on one point as is late.
>>
>> You want to override --request-timeout option. This defaults to 60 seconds. Even if you only had one request, with 5 threads the WSGI process will be force restarted after 300 seconds. If you don't want a timeout try --request-timeout 0. That way requests can run as long as they like and is what straight Apache/mod_wsgi behaves like.
>
> OK - I will try as soon as I can (it's the long summer "bank holiday" weekend here in the UK so I might not be allowed anywhere near a computer!).
>
> With apache "extra.conf" settings still set to
> Timeout 21600

This is what is set by --socket-timeout.

> RequestReadTimeout header=300-360,MinRate=500 body=300-360,MinRate=500

This is what is set by options:

  --header-timeout SECONDS
                        The number of seconds allowed for receiving the
                        request including the headers. This may be dynamically
                        increased if a minimum rate for reading the request
                        and headers is also specified, up to any limit imposed
                        by a maximum header timeout. Defaults to 15 seconds.
  --header-max-timeout SECONDS
                        Maximum number of seconds allowed for receiving the
                        request including the headers. This is the hard limit
                        after taking into consideration and increases to the
                        basic timeout due to minimum rate for reading the
                        request and headers which may be specified. Defaults
                        to 30 seconds.
  --header-min-rate BYTES
                        The number of bytes required to be sent as part of the
                        request and headers to trigger a dynamic increase in
                        the timeout on receiving the request including
                        headers. Each time this number of bytes is received
                        the timeout will be increased by 1 second up to any
                        maximum specified by the maximum header timeout.
                        Defaults to 500 bytes.
  --body-timeout SECONDS
                        The number of seconds allowed for receiving the
                        request body. This may be dynamically increased if a
                        minimum rate for reading the request body is also
                        specified, up to any limit imposed by a maximum body
                        timeout. Defaults to 15 seconds.
  --body-max-timeout SECONDS
                        Maximum number of seconds allowed for receiving the
                        request body. This is the hard limit after taking into
                        consideration and increases to the basic timeout due
                        to minimum rate for reading the request body which may
                        be specified. Defaults to 0 indicating there is no
                        maximum.
  --body-min-rate BYTES
                        The number of bytes required to be sent as part of the
                        request body to trigger a dynamic increase in the
                        timeout on receiving the request body. Each time this
                        number of bytes is received the timeout will be
                        increased by 1 second up to any maximum specified by
                        the maximum body timeout. Defaults to 500 bytes.
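To make the correspondence concrete, the sketch below translates a RequestReadTimeout directive into the equivalent option list. The field-to-option mapping is inferred from the help text above together with the Apache `timeout-maxtimeout,MinRate=rate` syntax; the function name is hypothetical and the parsing is deliberately minimal (for example, it is case sensitive).

```python
import re


def read_timeout_to_options(directive):
    """Map 'header=T-MAX,MinRate=R body=...' onto mod_wsgi-express options."""
    pattern = r"(header|body)=(\d+)(?:-(\d+))?(?:,MinRate=(\d+))?"
    options = []
    for stage, timeout, max_timeout, min_rate in re.findall(pattern, directive):
        options += ["--%s-timeout" % stage, timeout]
        if max_timeout:
            options += ["--%s-max-timeout" % stage, max_timeout]
        if min_rate:
            options += ["--%s-min-rate" % stage, min_rate]
    return options


directive = "RequestReadTimeout header=300-360,MinRate=500 body=300-360,MinRate=500"
for name, value in zip(*[iter(read_timeout_to_options(directive))] * 2):
    print(name, value)
```

With that input the output lists --header-timeout 300, --header-max-timeout 360 and --header-min-rate 500, followed by the same three values for the body options.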

> LogLevel info

This is what is set by --log-level.

> and mod_wsgi-express set to
> --include extra.conf  # hmm shouldn't that be --include-file but it appears to work

Yes, is meant to be --include-file, which may mean where you put it isn't being picked up.

> --socket-timeout 3600
> --request-timeout 60
> --debug-mode

If you set --debug-mode, it will run in single process mode, which makes all the timeouts meaningless except for Timeout/RequestReadTimeout.

> I ran a test data request for 85 minutes before it fell over with another (unrelated and stupid) python exception that I have now fixed.
> I am repeating that run to make sure this aspect is out of the equation.

You need to check whether the options are even making it through.

Use docker exec to create an interactive shell in the container and look at the contents of the generated apachectl file down in the mod_wsgi directory under /tmp. It records the options mod_wsgi-express got passed and will confirm whether server_args is being picked up correctly.
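If you would rather check this from a script than by eye, something like the sketch below could work. It assumes the recorded arguments appear in the apachectl file as a Python list literal on a single line, possibly behind a comment marker; that format is an assumption, and the sample text is made up for the example.

```python
import ast


def recorded_args(apachectl_text):
    """Return the first list literal found on a line of the file text, if any."""
    for line in apachectl_text.splitlines():
        candidate = line.lstrip("# ").strip()
        if candidate.startswith("[") and candidate.endswith("]"):
            try:
                value = ast.literal_eval(candidate)
            except (ValueError, SyntaxError):
                continue  # looked like a list but was not a valid literal
            if isinstance(value, list):
                return value
    return None


# Hypothetical file contents, standing in for the real generated script.
sample = """#!/bin/sh

# ['mod_wsgi-express', 'start-server', '--port', '80', '--socket-timeout', '3600', '--request-timeout', '60']
"""

args = recorded_args(sample)
for option in ("--socket-timeout", "--request-timeout", "--queue-timeout"):
    print(option, "present" if option in args else "missing")
```

An option that was put in server_args but shows as missing here never reached mod_wsgi-express.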

Graham

Dave Williams

Aug 29, 2017, 9:19:09 AM
to modwsgi
Hi Graham,

All the settings I quoted (including those invoked by using --include rather than --include-file) ARE making it through. 
I had been exec'ing into the container and looking at the /tmp dir already to check - I was doing a ps to pick up the command line for the process parameters - I hadn't spotted the apachectl file explicitly - so thank you for that!

I have now fixed my exception issue and ran for 4 hours to successful completion - but still need to work out which aspect of the --debug-mode actually fixed the Truncated/Oversize packet and unexplained service restarts issue I was suffering. You didn't advise on how to set the ONE_PROCESS mode properly if that IS indeed the fix for those symptoms. Should I be setting threads/processes instead or is the root cause and fix best done elsewhere?

You were earlier talking about not using onbuild. As I said I followed your instructions on  https://hub.docker.com/r/grahamdumpleton/mod-wsgi-docker/  (ie the README.rst from the repo). You use onbuild in all your demos too. Are there alternate instructions/examples you can point me at?

Thanks for your ongoing attention.

Dave




Graham Dumpleton

Aug 29, 2017, 8:27:48 PM
to mod...@googlegroups.com

> On 29 Aug 2017, at 11:19 pm, Dave Williams <flash...@gmail.com> wrote:
>
> Hi Graham,
>
> All the settings I quoted (including those invoked by using --include rather than --include-file) ARE making it through.

Don't use --include, use --include-file. The latter is the official option name.

The only reason it works is that the option parser in Python seems to be tolerant and will match an incomplete option name so long as there are no conflicts. I would not rely on that though, because if I were to later add a new option --include-stuff, your --include would then stop working and the server would fail to start up.
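The behaviour is easy to reproduce with the standard library. The sketch below assumes an optparse-based parser (argparse abbreviates long options in a similar way); the `--include-stuff` option is the hypothetical one mentioned above, and the `RaisingParser` class exists only so the failure can be caught instead of exiting the interpreter.

```python
from optparse import OptionParser


class RaisingParser(OptionParser):
    def error(self, msg):
        # optparse normally prints usage and exits; raise instead for the demo.
        raise ValueError(msg)


def parse(option_names, argv):
    parser = RaisingParser()
    for name in option_names:
        parser.add_option(name)
    options, _ = parser.parse_args(argv)
    return options


# Only one option starts with "--include", so the abbreviation is accepted.
opts = parse(["--include-file"], ["--include", "extra.conf"])
print(opts.include_file)

# Once a second option shares the prefix, the same arguments are rejected.
ambiguous_error = None
try:
    parse(["--include-file", "--include-stuff"], ["--include", "extra.conf"])
except ValueError as exc:
    ambiguous_error = str(exc)
print("startup would fail:", ambiguous_error)
```

Spelling the option out in full as --include-file avoids the problem, because an exact match is always preferred over a prefix match.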

> I had been exec'ing into the container and looking at the /tmp dir already to check - I was doing a ps to pick up the command line for the process parameters - I hadn't spotted the apachectl file explicitly - so thank you for that!
>
> I have now fixed my exception issue and ran for 4 hours to successful completion - but still need to work out which aspect of the --debug-mode actually fixed the Truncated/Oversize packet and unexplained service restarts issue I was suffering. You didn't advise on how to set the ONE_PROCESS mode properly if that IS indeed the fix for those symptoms.

It isn't really intended that you ever run in single process mode, except for debugging. By doing that you lose many features which make running Python web applications with Apache more robust. It is better you do not rely on that mode for a production system.

> Should I be setting threads/processes instead or is the root cause and fix best done elsewhere?

Show me all the options you are currently using.

> You were earlier talking about not using onbuild. As I said I followed your instructions on https://hub.docker.com/r/grahamdumpleton/mod-wsgi-docker/ (ie the README.rst from the repo). You use onbuild in all your demos too. Are there alternate instructions/examples you can point me at?

That docker image is effectively deprecated and is no longer being developed further. Anything in there is from over 2 years ago and not what I would regard as best practice anymore.

Anyway, provide what options you are using so can see where things are at. Can worry about other images you might use later.

Graham

Dave Williams

Aug 30, 2017, 11:54:17 AM
to modwsgi

> Anyway, provide what options you are using so can see where things are at. Can worry about other images you might use later.
>
> Graham

All noted. (Thought --include match was an unintended fallout of the parsing)

 apachectl: 
['/usr/local/python/bin/mod_wsgi-express', 'start-server', '--log-to-terminal', '--startup-log', '--port', '80', '--include', 'extra.conf', '--socket-timeout', '3600', '--request-timeout', '60', '--debug-mode', 'pyramid.wsgi']

httpd.conf all default except inclusion of 
Include '/app/extra.conf' at end
as per the --include-file

extra.conf contains 

Timeout 21600
RequestReadTimeout header=300-360,MinRate=500 body=300-360,MinRate=500
LogLevel info

The RequestReadTimeout was my attempt to stop the Gateway Timeout errors and might be superfluous.
The --debug-mode option was the item that fixed the Overrun/Truncated packet issue.

Dockerfile:
FROM grahamdumpleton/mod-wsgi-docker:python-3.4-onbuild
CMD [ "pyramid.wsgi" ]

pyramid.wsgi is a standard pyramid/pylons framework app using SQLAlchemy for DB accesses.
Contains:

from pyramid.paster import get_app, setup_logging
ini_path = 'production.ini'
setup_logging(ini_path)
application = get_app(ini_path, 'main')


docker-compose (extract)
  wsgi:
    build:
      context: ./wsgi
    env_file:
      - ./db/.env # contains common DB environment vars
    environment:
      - MYSQL_DATABASE=CICodeAnalysis
    container_name: wsgi
    restart: always
    links:
      - "db"
    ports:
      - "88:80"
    depends_on:
      - db
    networks:
      - back

.whiskey/pre_build:
#!/usr/bin/env bash
set -eo pipefail
# required packages to allow pip to install lxml
apt-get update
apt-get install -y libxml2-dev libxslt-dev
rm -r /var/lib/apt/lists/*
pip install -e .

server_args (cited in previous post and result in conf just given in the container)


Any more that you need that I have missed?

Dave

Dave Williams

Sep 4, 2017, 10:24:37 AM
to modwsgi
In trying to move to a more production environment as soon as I take --debug-mode off I start getting the same errors as before:

mod_wsgi (pid=62): Aborting process 'localhost:80'.
mod_wsgi (pid=62): Exiting process 'localhost:80'.
[client 172.20.0.12:57510] Truncated or oversized response headers received from daemon process 'localhost:80': /tmp/mod_wsgi-localhost:80:0/htdocs/submit
mod_wsgi (pid=24): Process 'localhost:80' has died, deregister and restart it.
mod_wsgi (pid=24): Process 'localhost:80' has been deregistered and will no longer be monitored.
mod_wsgi (pid=62): Starting process 'localhost:80' with uid=1001, gid=0 and threads=5.

These occur at around a consistent 10 minutes with mod_wsgi config of 
--request-timeout 120
--socket-timeout 0
and Apache2 
Timeout 21600

Graham Dumpleton

Sep 4, 2017, 4:25:45 PM
to mod...@googlegroups.com
Settings wrong way around:

    --request-timeout 0
    --socket-timeout 21600

You don't need Timeout as that is what --socket-timeout sets.

The --request-timeout 0 disables timeout on request running time.

Graham
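The failures at a consistent 10 minutes with --request-timeout 120 fit the averaging behaviour described earlier in the thread. The sketch below is a simplified model, not mod_wsgi's actual code: it assumes one active request on the default 5 threads, with idle threads counted as zero running time.

```python
THREADS = 5                      # default thread count reported in the thread
WORST_CASE_SECONDS = 6 * 60 * 60  # the 6 hour worst case request


def forced_restart_after(request_timeout, threads=THREADS, active_requests=1):
    """Seconds a lone request can run before the averaged timeout trips."""
    if request_timeout == 0:
        return None  # 0 disables the request timeout
    return request_timeout * threads / active_requests


for timeout in (60, 120, 0):
    limit = forced_restart_after(timeout)
    survives = limit is None or limit >= WORST_CASE_SECONDS
    print(timeout, limit, survives)
# prints:
# 60 300.0 False
# 120 600.0 False
# 0 None True
```

Under this model 120 seconds across 5 threads gives 600 seconds, which is the 10 minutes observed, and only a setting of 0 lets the 6 hour case complete.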
