Logging with Apache 2.4, mod_wsgi and Django

1,584 views
Skip to first unread message

Stefan Nastic

unread,
Aug 5, 2016, 7:55:55 AM8/5/16
to modwsgi
Is there a way to configure the logging with mod_wsgi and Django without losing the request information, such as remote client IP or unique log ID AND without using environ['wsgi.errors']? Other solutions such as logging wrapper would be also acceptable...

Thanks in advance!
Stefan

Graham Dumpleton

unread,
Aug 5, 2016, 8:18:35 AM8/5/16
to mod...@googlegroups.com
Short answer, no.

The two basic problems are detailed in:


but the solution even for the first is far from simple.

The problem is that the request and connection log IDs are only generated the first time a message is logged via the Apache log API. In the case of proxying a request from the Apache child worker process to a daemon mode process, there wouldn’t have been any messages logged and so nothing has triggered the generation of the log IDs. This means they aren’t available to transfer across to the daemon process such that they could be reconstructed into the connection and request records to fake up things so that logging works in daemon mode. A solution may be for mod_wsgi to forcibly cause the generation of the log IDs on every request using an Apache API call, even though they may not be required. The implications of doing this need to be looked at. Alternatively, one works out what seed information from a request is used to generate the log ID so can ensure that is being passed across and added to the fake connection and request objects that the logging will eventually use.

You also keep saying remote client IP doesn’t show. On my testing on MacOS X it does and the client IP is transferred across to daemon mode, so not sure what the issue is there as cannot replicate it at this point.

So trying to address even the log ID issues for wsgi.errors is going to take some time and work. Linking messages to stdout/stderr back to requests is going to be even more complicated.

Graham

Stefan Nastic

unread,
Aug 5, 2016, 9:30:08 AM8/5/16
to modwsgi
Hi Graham,

tanks for the detailed explanation. I understand the difficulties to make the unique log ID transferred to the deamon process since as you have said it is only generated on the first request by Apache.  

You also mention that in your setup the client IP is successfully transferred to the deamon. I could see this working as an rough solution to correlate the log entries, i.e., using client IP+ timestamp and maybe proc ID. Unfortunately, I cannot get this to work, since for any message logged to stdout/stderr the client IP also gets lost (not shown in Apache error log). Are you using environ['wsgi.errors'] for this or should it work out-of-the-box with standard Django stderr logger?

Thanks,
Stefan

Graham Dumpleton

unread,
Aug 5, 2016, 9:34:34 AM8/5/16
to mod...@googlegroups.com
On 5 Aug 2016, at 10:18 PM, Graham Dumpleton <graham.d...@gmail.com> wrote:


On 5 Aug 2016, at 9:54 PM, Stefan Nastic <stefan....@gmail.com> wrote:

Is there a way to configure the logging with mod_wsgi and Django without losing the request information, such as remote client IP or unique log ID AND without using environ['wsgi.errors']? Other solutions such as logging wrapper would be also acceptable...

Short answer, no.

The two basic problems are detailed in:


but the solution even for the first is far from simple.

The problem is that the request and connection log IDs are only generated the first time a message is logged via the Apache log API. In the case of proxying a request from the Apache child worker process to a daemon mode process, there wouldn’t have been any messages logged and so nothing has triggered the generation of the log IDs. This means they aren’t available to transfer across to the daemon process such that they could be reconstructed into the connection and request records to fake up things so that logging works in daemon mode. A solution may be for mod_wsgi to forcibly cause the generation of the log IDs on every request using an Apache API call, even though they may not be required. The implications of doing this need to be looked at. Alternatively, one works out what seed information from a request is used to generate the log ID so can ensure that is being passed across and added to the fake connection and request objects that the logging will eventually use.

Copying across underlying seed information isn’t possible because part of the calculation involves turning a C pointer to a data structure for the thread into an integer. One can’t set the C pointer to same value of daemon process side because if something then tries to access via it, process will crash.

It is possible to force the generation of the connection and request IDs though if not already set and pass those across. Luckily is not so hard.

The develop branch of the repo now has changes which will do that at least, with the connection and request ID available in the per request WSGI environ dictionary as mod_wsgi.connection_id and mod_wsgi.request_id. Any messages logged via wsgi.errors will now show the correct log ID.

You also keep saying remote client IP doesn’t show. On my testing on MacOS X it does and the client IP is transferred across to daemon mode, so not sure what the issue is there as cannot replicate it at this point.

So trying to address even the log ID issues for wsgi.errors is going to take some time and work. Linking messages to stdout/stderr back to requests is going to be even more complicated.

The stdout/stderr issues needs more investigation. One thing in our favour is that I had already added a thread local for tracking request state due to metrics collection. I don’t know that that gives access to the raw Apache request object though which is needed, so would likely need extending.

If you really need something though, you can try caching those IDs from the per request WSGI environ dictionary and using those in messages somehow.

Graham

Graham Dumpleton

unread,
Aug 5, 2016, 11:59:50 PM8/5/16
to mod...@googlegroups.com
So I have it sort of working when using stdout/stderr, but there is one issue which means still need to do some work if can decide what should be done.

niaZC/85C0Q AH01628: authorization result: granted (no directives)
niaZC/85C0Q AH01628: authorization result: granted (no directives)
niaZC/85C0Q global message
niaZC/85C0Q request message

X+fwC/85C0Q AH01628: authorization result: granted (no directives)
X+fwC/85C0Q AH01628: authorization result: granted (no directives)
X+fwC/85C0Q queued messageglobal message
X+fwC/85C0Q request message

The problem is the ‘queued message’. This is with application of:

def application(environ, start_response):
    print('global message')
    print('request message', file=environ['wsgi.errors'])
    print('queued message', end=‘')

The ‘queued message’ is one which there was no newline terminator.

What normally occurs is that messages get buffered up until a newline terminator is encountered, which then acts to flush it out and write it to Apache log API. This is important else if you do something like:

    print(1, 2, 3, 4)

They will all get printed on separate log lines, which isn’t what you would expect.

The log ID only gets pulled in when you thus flush out what is buffered.

There a couple of problems around this.

Each of those numbers in that last print() statement get buffered separately. If another request handler does the same thing, they can get interleaved, as there is only one stdout. When they get flushed out, the log ID will be for whatever request sent the newline first. This data will get attributed against wrong request.

Similar issue is where a request never sent a new line like above, in this case it ends up at start of data for next request.

One solution which should work out reasonably cleanly is that if in a request, reach back and grab wsgi.errors and divert output via that, which is per request and is flushed at the end of the request. At least I think that will work.

More later.

Graham

Graham Dumpleton

unread,
Aug 6, 2016, 1:33:03 AM8/6/16
to mod...@googlegroups.com
Getting closer. Anything output to sys.stdout and sys.stderr from within a request handler will now be routed via wsgi.errors. Your %L format will work. The remote address shows IP correctly, but is not showing the remote port correctly, instead showing 0.

Please start testing using develop branch and let me know of any unexpected problems.

Graham

Graham Dumpleton

unread,
Aug 6, 2016, 8:16:31 PM8/6/16
to mod...@googlegroups.com
Believe this should all now be fixed on develop branch of mod_wsgi. Please test and let me know if is now working as you expect so I can get a new release out.

If anyone is able to test off develop with Apache 2.0 and 2.2 (not just 2.4), that would be most appreciated.

Graham

Stefan Nastic

unread,
Aug 15, 2016, 8:43:12 AM8/15/16
to modwsgi
Hi,

sorry for a bit delayed response, but I was traveling the last week and had a very  limited internet connectivity.

Now back to the issue. I tested the version 4.5.4 and in short there is some problem as I get seg fault in the apache. I have  checked 4.5.4. for the most common issues such as loading mod_python, linking against shared libraries, etc. and there everything seam to be fine.  I have also tested the same compilation and installation procedure with 4.5.3 and it works fine. 

So I suspect that this is a codding error, most probably accessing the protected memory region when you try to access/reproduce Log ID seed. However, this is only my assumption since I have not had a chance to look at your code.

For debugging purposes .... 

Apache error log:

[mpm_event:notice] [pid 25563:tid 140401132873600] AH00494: SIGHUP received.  Attempting to restart
[mpm_event:notice] [pid 25563:tid 140401132873600] AH00489: Apache/2.4.10 (Debian) mod_wsgi/4.5.4 Python/3.4.2 configured -- resuming normal operations
[core:notice] [pid 25563:tid 140401132873600] AH00094: Command line: '/usr/sbin/apache2'
[core:notice] [pid 25563:tid 140401132873600] AH00052: child pid 27051 exit signal Segmentation fault (11)

OS: Linux  3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4 (2016-02-29) x86_64 GNU/Linux

APACHE:
Server version: Apache/2.4.10 (Debian)
Server built:   Nov 28 2015 14:05:48
Server's Module Magic Number: 
Server loaded:  APR 1.5.1, APR-UTIL 1.5.4
Compiled using: APR 1.5.1, APR-UTIL 1.5.4
Architecture:   64-bit
Server MPM:     event
  threaded:     yes (fixed thread count)
    forked:     yes (variable process count)
Server compiled with....
 -D APR_HAS_SENDFILE
 -D APR_HAS_MMAP
 -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
 -D APR_USE_SYSVSEM_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D DYNAMIC_MODULE_LIMIT=256
 -D HTTPD_ROOT="/etc/apache2"
 -D SUEXEC_BIN="/usr/lib/apache2/suexec"
 -D DEFAULT_PIDLOG="/var/run/apache2.pid"
 -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
 -D DEFAULT_ERRORLOG="logs/error_log"
 -D AP_TYPES_CONFIG_FILE="mime.types"
 -D SERVER_CONFIG_FILE="apache2.conf"

Python:
Python 3.4.2


Linked libs for mod_wsgi.so:

linux-vdso.so.1 (0x00007ffd436c5000)
libpython3.4m.so.1.0 => /usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0 (0x00007f79fea6f000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f79fe852000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f79fe64e000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f79fe44b000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f79fe14a000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f79fdd9f000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f79fdb97000)
libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007f79fd96e000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f79fd753000)
/lib64/ld-linux-x86-64.so.2 (0x00007f79ff2ea000)




Hope this helps!
Best,
Stefan

Graham Dumpleton

unread,
Aug 15, 2016, 8:46:56 AM8/15/16
to mod...@googlegroups.com
Problem only affects Python 3.X. Verification of fix being worked on right now.

If you are able to download:


and try with that it would help immensely.

Thanks.

Graham

-- 
You received this message because you are subscribed to the Google Groups "modwsgi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to modwsgi+u...@googlegroups.com.
To post to this group, send email to mod...@googlegroups.com.
Visit this group at https://groups.google.com/group/modwsgi.
For more options, visit https://groups.google.com/d/optout.

Stefan Nastic

unread,
Aug 15, 2016, 8:51:38 AM8/15/16
to mod...@googlegroups.com

Hi,

I have just tested it (4.5.5) and am still getting the same error.

Best,

Stefan

You received this message because you are subscribed to a topic in the Google Groups "modwsgi" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/modwsgi/6I2jiEQomh8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to modwsgi+u...@googlegroups.com.

Stefan Nastic

unread,
Aug 15, 2016, 8:53:46 AM8/15/16
to mod...@googlegroups.com

BTW I can also confirm that it works with Python 2, but then of course I get Python exception, since I use Python 3.4 ...



On 15-Aug-16 14:46, Graham Dumpleton wrote:
You received this message because you are subscribed to a topic in the Google Groups "modwsgi" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/modwsgi/6I2jiEQomh8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to modwsgi+u...@googlegroups.com.

Graham Dumpleton

unread,
Aug 15, 2016, 9:16:40 AM8/15/16
to mod...@googlegroups.com
I am not confident that next change will solve the problem, but can you try pulling down source code again and try.

I can’t recreate it so far after this change, but then that is what happened with the first change and it only started occurring again after you and someone else said it didn’t solve problem. Very confusing.

The problem seems to relate to strange things going on when Python objects are being destroyed. Python 3 appears to behave differently with the new streams implementation.

Graham

Stefan Nastic

unread,
Aug 15, 2016, 9:30:35 AM8/15/16
to mod...@googlegroups.com
tried again and no luck  ....

Just to remind you that this happens immediately upon the firs request after restarting the apache, so if you are referring to gc below, I don't see how it can  kick in and cause a problem. Just a thought ....

Stefan

Graham Dumpleton

unread,
Aug 15, 2016, 9:40:26 AM8/15/16
to mod...@googlegroups.com
Giving up for the night. Getting memory corruption all over the place now and can’t even get sane stack traces from gdb. Nothing making sense. Will start over tomorrow.

Graham Dumpleton

unread,
Aug 16, 2016, 12:38:16 AM8/16/16
to mod...@googlegroups.com
New set of changes on ‘develop’ branch of mod_wsgi repo.

Stefan Nastic

unread,
Aug 16, 2016, 4:43:17 AM8/16/16
to mod...@googlegroups.com

Works like a charm :) Thanks Graham.

Could you also maybe briefly describe what the problem was?

Best,
Stefan

Graham Dumpleton

unread,
Aug 16, 2016, 5:23:56 AM8/16/16
to mod...@googlegroups.com
In Python 2.X, to intercept sys.stdout/sys.stderr they are both replaced with a file like object which passes through messages though to the Apache error log API. Being Python 2.X, these are byte oriented streams.

In Python 3.X, you need to wrap the instance of the file like object in a TextIOWrapper class so that it can accept Unicode. The problem was that I was caching in the per request thread local the reference to the instance of TextIOWrapper. When I detected that a message was being logged via sys.stderr/sys.stdout in the file like object it wrapped, and attempted to then pass it onto the per request file like object, because I had cached the reference to TextIOWrapper instead of the file like object it wraps, I was casting at C level to wrong object type and so code would crash as data it was given was all wrong.

As a solution, rather than assuming type, casting and using direct C API, one could call it via the exposed Python method interface, but the problem then was that at level of file like object only have bytes and TextIOWrapper wanted Unicode. So that would generate a Python exception about wrong type for string passed in.

What I had to therefore do was change the code such that could still get access to the file like object held by the TextIOWrapper instance under Python 3 and when pass on message, make sure it bypassed the TextIOWrapper level and went straight to the file like object as bytes.

Graham
Reply all
Reply to author
Forward
0 new messages