Passing username back to Apache for access log


Gunnlaugur Briem

Feb 11, 2010, 9:47:06 PM
to modwsgi
Hi,

[moving my StackOverflow question to the mailing list per Graham
Dumpleton's suggestion; note his reply there: http://stackoverflow.com/questions/2244244/
]

My Django app, deployed using Django's standard WSGIHandler,
authenticates users via form login on the Django side. So to Apache,
the user is anonymous, making for an unsatisfying access log.

I want to pass the username back through the WSGI wrapper to Apache
after handling the request, so that it appears in the Apache access
log.

Graham's suggestion involves using apswigpy and configuring mod_wsgi
to pass the Apache request object in as a Python CObject. Even then,
he says it will only work in embedded mode.

Some follow-up questions:

1) the apswigpy page says it was abandoned in 2007 due to lack of
interest, and was then “very much a work in progress”. This sounds a
tad risky. Graham being the author of both mod_wsgi and apswigpy, I
don't suppose I should hope that anyone else here knows a better
way? :)

2) is the limitation to embedded mode just a reference to mod_wsgi's
(or apswigpy's) current implementation, or is it inherent in the
Apache invocation model? I.e. will this *continue* to be possible only
in embedded mode?

3) is there a reason why that functionality in apswigpy isn't/
shouldn't be merged into mod_wsgi as an optional extra? Because
mod_wsgi wants to stick strictly to implementing only the WSGI spec?

Regards,

- Gulli

Graham Dumpleton

Feb 12, 2010, 1:01:32 AM
to mod...@googlegroups.com
On 12 February 2010 13:47, Gunnlaugur Briem <gunnl...@gmail.com> wrote:
> Hi,
>
> [moving my StackOverflow question to the mailing list per Graham
> Dumpleton's suggestion; note his reply there: http://stackoverflow.com/questions/2244244/
> ]
>
> My Django app, deployed using Django's standard WSGIHandler,
> authenticates users via form login on the Django side. So to Apache,
> the user is anonymous, making for an unsatisfying access log.
>
> I want to pass the username back through the WSGI wrapper to Apache
> after handling the request, so that it appears in the Apache access
> log.
>
> Graham's suggestion involves using apswigpy and configuring mod_wsgi
> to pass the Apache request object in as a Python CObject. Even then,
> he says it will only work in embedded mode.
>
> Some follow-up questions:
>
> 1) the apswigpy page says it was abandoned in 2007 due to lack of
> interest, and was then “very much a work in progress”. This sounds a
> tad risky.

I am always happy to address any issues with it and make changes as
necessary. It's just that until now no one has been interested. I don't
want to see that project die, as I find it quite interesting. I just
don't have that much time these days and am obviously going to focus on
mod_wsgi, as that is where the interest of others is focused.

> Graham being the author of both mod_wsgi and apswigpy, I
> don't suppose I should hope that anyone else here knows a better
> way? :)

There is another way, but apswigpy was the quick way as it was a
generic solution.

The other way is to implement a very small Python C extension module
which is coded to perform just that one operation. That isn't hard, for
me at least, and has the benefit of not bringing in all the other API
wrappings that apswigpy provides.

If you really want to go down that path, I'm happy to try and get that
C extension module together for you, but with apswigpy you could at
least test the concept first.

> 2) is the limitation to embedded mode just a reference to mod_wsgi's
> (or apswigpy's) current implementation, or is it inherent in the
> Apache invocation model? I.e. will this *continue* to be possible only
> in embedded mode?

It is only possible to make that change to the request object in
embedded mode, as that is the real one. The one in daemon mode is a
fake request object, and changes to it wouldn't get reflected back in
the main Apache server process that proxied the request.

The only way around that would be for mod_wsgi to provide a special
feature whereby you could provide a response header which specifies
the user. mod_wsgi would pick that up in the main Apache server
processes, whether embedded or daemon mode is used, and make the
update transparently.

> 3) is there a reason why that functionality in apswigpy isn't/
> shouldn't be merged into mod_wsgi as an optional extra? Because
> mod_wsgi wants to stick strictly to implementing only the WSGI spec?

Because users wanted me to stick with mod_wsgi really only being about
WSGI. I did think about making it part of mod_wsgi. But then I would
have still had to maintain it even though no one was using it.

Graham

Gunnlaugur Thor Briem

Feb 12, 2010, 8:21:36 AM
to mod...@googlegroups.com
On Fri, Feb 12, 2010 at 6:01 AM, Graham Dumpleton <graham.d...@gmail.com> wrote:
> If you really want to go down that path, happy to try and get that C
> extension module together for you, but with apswigpy you could at
> least test concept first.

That's really good of you to offer. But you mean making this extension module as a sort of minimal subset of apswigpy, using the same approach, right? So it too would only work in embedded mode? I'll gladly try out apswigpy and tell you how it goes, but in the end I will probably want to avoid changing over to embedded mode if at all possible.

> The only way around that would be for mod_wsgi to provide a special
> feature whereby you could provide a response header which specifies
> the user and mod_wsgi picks that up in main Apache server processes,
> whether embedded or daemon mode used and makes update transparently.

This actually sounds quite neat and minimal — if it's an obviously non-conflicting header like X-modwsgi-trans-request-user that gets removed on the Apache side so it only exists over the daemon process divide, and only exists when specifically configured, that would be a fairly safe and low-effort feature to add, right? And works in either embedded or daemon mode.

Are there more cases besides the authenticated username, of information that people might commonly want to pass back to Apache? I'm wondering if this is a special case or whether a general scheme like 'X-modwsgi-trans-%s' for string-valued key-value pairs is warranted.

> Because users wanted me to stick with mod_wsgi really only being about
> WSGI. I did think about making it part of mod_wsgi. But then I would
> have still had to maintain it even though no one was using it.

Fully understood. Would this apply also to the transient-header-passing approach above? I.e. is it too out-of-scope to be included, and thus should be a separate module? I imagine (unfoundedly) that it would be a fairly commonly-desired feature though.

Thanks,

- Gulli

Gunnlaugur Thor Briem

Feb 12, 2010, 9:46:15 AM
to mod...@googlegroups.com
On Fri, Feb 12, 2010 at 1:21 PM, Gunnlaugur Thor Briem <gunnl...@gmail.com> wrote:
> I'll gladly try out apswigpy and tell you how it goes, but in the end I will probably want to avoid changing over to embedded mode if at all possible.


Tested it, worked fine except that I had to add version=(2,2) to __init__.py, to work around the apache.httpd import bombing on the missing version attribute. I set WSGIApacheExtensions On, and put this in a trivial test.wsgi in an embedded-mode site:

    req = apache.httpd.request_rec(environ["apache.request_rec"])
    req.user = 'foo'

and sure enough, I got the foo username in the access log.
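
For context, a fuller version of such a test script might look like the sketch below. Only the two `apache.httpd` lines come from the test above; the helper name `set_log_user`, the fallback behaviour and the response body are illustrative assumptions. It depends on apswigpy being importable and on mod_wsgi passing the request object in `environ["apache.request_rec"]`; when either is missing it leaves the request untouched.

```python
def set_log_user(environ, username):
    """Try to set the user that Apache logs for this request.

    Returns True if the Apache request record was updated, False otherwise.
    """
    raw = environ.get("apache.request_rec")
    if raw is None:
        # The request object was not passed in (feature off, or not mod_wsgi).
        return False
    try:
        import apache.httpd  # provided by apswigpy; may not be installed
    except ImportError:
        return False
    req = apache.httpd.request_rec(raw)
    req.user = username
    return True


def application(environ, start_response):
    # Report in the body whether the username could be handed to Apache.
    logged = set_log_user(environ, "foo")
    body = ("user set: %s\n" % logged).encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Run outside Apache, the helper returns False and the application still responds normally, which makes the script safe to exercise before deploying it.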

Still, a way that also works in daemon mode would be preferable.

    - Gulli

Graham Dumpleton

Feb 12, 2010, 2:54:51 PM
to mod...@googlegroups.com

Use mod_wsgi 3.X and not the older mod_wsgi version you are using. You
will need to use, though:

WSGIPassApacheRequest On

instead of WSGIApacheExtensions, as the way it is enabled was changed.

Yes I know that change wasn't documented. This was because of lack of
interest in it. :-)

Graham

Graham Dumpleton

Feb 13, 2010, 6:20:22 AM
to mod...@googlegroups.com
On 13 February 2010 00:21, Gunnlaugur Thor Briem <gunnl...@gmail.com> wrote:
> On Fri, Feb 12, 2010 at 6:01 AM, Graham Dumpleton
> <graham.d...@gmail.com> wrote:
>>
>> If you really want to go down that path, happy to try and get that C
>> extension module together for you, but with apswigpy you could at
>> least test concept first.
>
> That's really good of you to offer. But you mean making this extension
> module as a sort of minimal subset of apswigpy, using the same approach,
> right?

No and yes.

The apswigpy module uses SWIG; I am talking about a hand-crafted C
extension module for just the specific purpose required. It would
still effectively do the same thing, but without the extra baggage of
apswigpy.

> So it too would only work in embedded mode?

Yes, it would still only work in embedded mode.

> I'll gladly try out
> apswigpy and tell you how it goes, but in the end I will probably want to
> avoid changing over to embedded mode if at all possible.
>>
>> The only way around that would be for mod_wsgi to provide a special
>> feature whereby you could provide a response header which specifies
>> the user and mod_wsgi picks that up in main Apache server processes,
>> whether embedded or daemon mode used and makes update transparently.
>
> This actually sounds quite neat and minimal — if it's an obviously
> non-conflicting header like X-modwsgi-trans-request-user that gets removed
> on the Apache side so it only exists over the daemon process divide, and
> only exists when specifically configured, that would be a fairly safe and
> low-effort feature to add, right? And works in either embedded or daemon
> mode.
> Are there more cases besides the authenticated username, of information that
> people might commonly want to pass back to Apache? I'm wondering if this is
> a special case or whether a general scheme like 'X-modwsgi-trans-%s' for
> string-valued key-value pairs is warranted.

Personally I don't like the X-Header practice. In CGI 1.2 they propose
a special header called Script-Control. See:

http://tools.ietf.org/id/draft-rfced-info-coar-00.txt

Since how WSGI scripts are mapped in mod_wsgi is not much different to
how CGI scripts are mapped and WSGI has its derivations from CGI, I
think that this Script-Control header would make more sense.

You can see one example of where a web server has used Script-Control
to implement various control mechanisms that scripts can trigger at:

http://wasd.vsm.com.au/wasd_root/doc/scripting/scripting_0200.html

Just search for Script-Control.

My only issue with Script-Control is that I am not sure how well
defined the value you provide to it is.

That VMS server uses ';' as a separator, but I'm not sure if that is
correct. The CGI specification says:

    Script-Control      = "Script-Control" ":" 1#control-directive NL
    control-directive   = "no-abort"
                        | extension-directive
    extension-directive = *<CHAR, excluding CTLs, NL>

So there is no mention of a separator, and I am not even sure it
supports the return of multiple control directives.

Since, from what I have seen, no server besides that VMS server has
ever implemented Script-Control, even if mod_wsgi did the same it
isn't likely to upset anything else.
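
To make the separator question concrete, here is a minimal sketch of how a receiver might split a Script-Control value if it followed the VMS server's ';' convention. The function name `parse_script_control`, the `name=value` form for extension directives and the lower-casing of directive names are all assumptions for illustration; none of them is defined by the CGI draft.

```python
def parse_script_control(value):
    """Split a Script-Control value into a dict of directives.

    Bare directives such as "no-abort" map to None; "name=value"
    directives map to their value. Directive names are lower-cased.
    The ';' separator is an assumption borrowed from the VMS server.
    """
    directives = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # ignore empty segments such as a trailing ';'
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip() if sep else None
    return directives
```

For example, `parse_script_control("no-abort; x-example=1")` gives a dict with `"no-abort"` mapped to `None` and `"x-example"` mapped to `"1"`.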

In the past the control directives I have thought could be useful were:

X-add-output-filter - Add a named Apache output filter to output
filter chain for this request.
X-remove-output-filter - Remove a named Apache output filter from
output filter chain for this request.

Another might be used to override the default WSGI output buffering
requirements. I.e., instead of flushing after every yielded value, wait
until some defined buffer size is exceeded and then flush, so as to try
and avoid flushing small data chunks.
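
The buffering behaviour described here can be illustrated in plain Python. This is only a sketch of the idea, not how mod_wsgi handles output; the name `coalesce` and the `min_size` parameter are made up for the example.

```python
def coalesce(chunks, min_size):
    """Yield data from `chunks`, joining small pieces until at least
    `min_size` bytes are pending, then emit them as one block."""
    pending = []
    pending_len = 0
    for chunk in chunks:
        if not chunk:
            continue  # skip empty strings yielded by the application
        pending.append(chunk)
        pending_len += len(chunk)
        if pending_len >= min_size:
            yield b"".join(pending)
            pending = []
            pending_len = 0
    if pending:
        # Flush whatever is left once the application is done.
        yield b"".join(pending)
```

Wrapping a WSGI response iterable in `coalesce(result, 8192)` would turn many tiny writes into a few larger ones, at the cost of delaying the first bytes sent to the client.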

So, along these lines one could also have:

X-remote-user - Set the authenticated user for this request so long
as an authentication handler wasn't already noted as having determined
this and set the user.
X-auth-type - Set the name of the mechanism used to authenticate the user.

In other words, it would only override req->user in the Apache request
if req->ap_auth_type wasn't previously set. This check is just to
ensure that an application can't override what was already calculated
by a prior Apache authentication module.
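
The override rule can be sketched as follows. The `FakeRequest` class is a stand-in carrying only the two request fields named above, and `apply_remote_user` is a hypothetical helper; the real check would live in C inside the Apache module.

```python
class FakeRequest(object):
    """Stand-in for the two Apache request fields discussed above."""

    def __init__(self, user=None, ap_auth_type=None):
        self.user = user
        self.ap_auth_type = ap_auth_type


def apply_remote_user(req, user, auth_type=None):
    """Set req.user from the application unless Apache already authenticated.

    Returns True if the request was updated, False if it was left alone.
    """
    if req.ap_auth_type is not None:
        # An Apache authentication module got there first; leave it alone.
        return False
    req.user = user
    if auth_type is not None:
        req.ap_auth_type = auth_type
    return True
```

With this rule, a request already authenticated by an Apache module keeps its user, while an anonymous request picks up the name supplied by the application.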

Anyway, the idea is you would return a response header of:

Script-Control: X-remote-user=grahamd; X-auth-type=Session/Django

Setting the authentication type possibly serves no purpose as the
places where it is consulted are in earlier phases of request handling
in Apache.
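
On the application side, emitting such a header could be done with a small piece of WSGI middleware. The sketch below assumes the proposed directive names; mod_wsgi is not stated anywhere in this thread to act on them. The class name `ScriptControlUser` and the `get_username` callable are hypothetical, and how the username is found (for example, a value the application stores in `environ`) is left to the caller.

```python
class ScriptControlUser(object):
    """WSGI middleware that reports the logged-in user via Script-Control."""

    def __init__(self, app, get_username):
        self.app = app
        # Hypothetical hook: takes the WSGI environ, returns a name or None.
        self.get_username = get_username

    def __call__(self, environ, start_response):
        def wrapped_start_response(status, headers, exc_info=None):
            user = self.get_username(environ)
            # Skip names that could break the header or the directive list.
            if user and not any(c in user for c in "\r\n;"):
                headers = list(headers) + [
                    ("Script-Control", "X-remote-user=%s" % user)]
            return start_response(status, headers, exc_info)

        return self.app(environ, wrapped_start_response)
```

The username is looked up when the application calls `start_response`, by which point a framework has normally finished authenticating the request.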

>> Because users wanted me to stick with mod_wsgi really only being about
>> WSGI. I did think about making it part of mod_wsgi. But then I would
>> have still had to maintain it even though no one was using it.
>
> Fully understood. Would this apply also to the transient-header-passing
> approach above? I.e. is it too out-of-scope to be included, and thus should
> be a separate module?

Not necessarily as it melds well with my prior thoughts on allowing
some measure of control of Apache request handling through
Script-Control response header.

Graham

> I imagine (unfoundedly) that it would be a fairly
> commonly-desired feature though.
> Thanks,
> - Gulli
>


Gunnlaugur Thor Briem

Feb 14, 2010, 10:50:37 AM
to mod...@googlegroups.com
On Sat, Feb 13, 2010 at 11:20 AM, Graham Dumpleton <graham.d...@gmail.com> wrote:
> Personally I don't like the X-Header practice. In CGI 1.2 they propose
> a special header called Script-Control. See:
>
> http://tools.ietf.org/id/draft-rfced-info-coar-00.txt

Yes, this sounds like the right way to me — amounts to the same thing (getting information back to Apache in daemon mode) but without introducing any ad-hoc header names.

    - Gulli
