I may need to ask in the Apache lists about this, but I figured I'd
try here first.
Is there an easy way to get the apr_proc_fork mechanism (as used on
line 5782 of mod_wsgi.c) to set the SELinux context or domain of child
scripts?
I'm trying to figure out a way to use SELinux to isolate two unrelated
Python scripts from each other. For example: if I'm a hosting
company, and Bob has a misconfigured Trac site and Alice has a Django
site, I'd like to prevent Bob's Trac site from doing anything with
Alice's Django site.
I know that I can accomplish this using the standard user-based DAC
mechanisms and running my websites as Daemon processes, but I'd like
to use SELinux instead.
Currently, under RHEL5, all httpd processes - including mod_wsgi
daemon child processes - run under the httpd_t type. This is great
for protecting the rest of the OS, but alas, the sites could
theoretically still step on other sites running as httpd_t if they
somehow got root privilege...
Cheers,
-J
I think the SELinux lists themselves may actually be a better place.
> Is there an easy way to get the apr_proc_fork mechanism (as used on
> line 5782 of mod_wsgi.c) to set the SELinux context or domain of child
> scripts?
The apr_proc_fork() call is actually a very small wrapper around
fork(): it fills in the APR proc structure and resets the random
number state in the forked process. It is thus easy to duplicate the
function and factor in other stuff, so long as the copy also does what
APR was doing. This is why it is probably not a question for the APR
people but for the SELinux people: you can always call fork() directly,
so what remains is how SELinux lets you do what you want.
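To show how thin such a wrapper can be, here is a minimal sketch, in Python rather than C purely for brevity. The name fork_with_hook and its child_setup parameter are made up for this example and are not part of APR or mod_wsgi; child_setup only marks the spot where extra per-child work, such as a security context change, could be slotted in.

```python
import os
import random


def fork_with_hook(child_setup, child_main):
    """Hypothetical stand-in for a hand-rolled fork wrapper.

    Forks, reseeds the random number generator in the child, runs
    child_setup() for any extra per-child work, then runs child_main()
    and exits with its return value. Returns the child's pid in the parent.
    """
    pid = os.fork()
    if pid == 0:
        # Child: never fall back into the parent's code path.
        status = 1
        try:
            random.seed()      # fresh seed so parent and child diverge
            child_setup()      # hook for extra setup in the child
            status = int(child_main() or 0)
        finally:
            os._exit(status)   # exits with 1 if anything above raised
    return pid
```

The parent would then call os.waitpid() on the returned pid, just as it would after a plain fork().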
> I'm trying to figure out a way to use SELinux to isolate two unrelated
> Python scripts from each other. For example: if I'm a hosting
> company, and Bob has a misconfigured Trac site and Alice has a Django
> site, I'd like to prevent Bob's Trac site from doing anything with
> Alice's Django site.
>
> I know that I can accomplish this using the standard user-based DAC
> mechanisms and running my websites as Daemon processes, but I'd like
> to use SELinux instead.
>
> Currently, under RHEL5, all httpd processes - including mod_wsgi
> daemon child processes - run under the httpd_t type. This is great
> for protecting the rest of the OS, but alas, the sites could
> theoretically still step on other sites running as httpd_t if they
> somehow got root privilege...
They don't need to be able to get root privileges to trample on other
applications running as the Apache user, as they can do it already given
they are all the same user.
The problem is that the Apache parent process is running as a
non-privileged user, yet you want the forked process to be running as a
different user. Normally this can only be done if the parent process
doing the fork was root, with it then dropping privileges after the
fork to the appropriate user.
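The usual fork-then-drop-privileges pattern can be sketched as follows, again in Python for brevity. The function name fork_as_user is made up for this example. Switching to a different user only succeeds when the caller is root; an unprivileged caller can only "switch" to the ids it already has.

```python
import os


def fork_as_user(uid, gid, child_main):
    """Fork, switch the child to (uid, gid), then run child_main().

    Returns the child's pid in the parent. The child exits with the
    return value of child_main(), or 126 if the id change was refused.
    """
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            if os.geteuid() == 0:
                os.setgroups([])   # drop root's supplementary groups
            os.setgid(gid)         # group first, while still privileged
            os.setuid(uid)         # a root caller cannot undo this
            status = int(child_main() or 0)
        except OSError:
            status = 126           # the privilege change was refused
        finally:
            os._exit(status)
    return pid
```

The order matters: once setuid() has given up root, the process no longer has the right to change its group.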
Whether SELinux has a way of allowing you to do this safely I have no
idea, as it would mean a non-privileged process being able to change
the user it runs as. That doesn't exactly sound like a good thing to
allow: how do you disable the ability afterwards and thus stop user
code from also doing it?
Graham
> The problem is that the Apache parent process is running as a
> non-privileged user, yet you want the forked process to be running as a
> different user. Normally this can only be done if the parent process
> doing the fork was root, with it then dropping privileges after the
> fork to the appropriate user.
The environments I'm targeting (RHEL5 specifically) do run the initial
Apache process as root, and then spawn children to handle the
requests. This is how you can have Apache spawn child WSGI daemons
that run as other users, no?
Is it worth contemplating the use of apr_proc_create instead of
apr_proc_fork? It appears that apr_proc_create does an execv to
create the new process, and if this is the case, you can use
setexeccon under SELinux to set the context of the new process - and
this is our end goal.
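A rough sketch of the fork/exec idea, in Python for brevity. It rests on the assumption that setexeccon amounts to writing the context string to the calling task's exec attribute under /proc (here /proc/self/attr/exec) before the exec. The helper name exec_with_context and its attr_path parameter are made up for illustration, and SELinux policy would still have to permit the transition.

```python
import os


def exec_with_context(context, argv, attr_path="/proc/self/attr/exec"):
    """Fork, request an exec context in the child, exec argv, and wait.

    Returns the exit status of the exec'd program, or 127 if the context
    could not be written or the exec itself failed.
    """
    pid = os.fork()
    if pid == 0:
        try:
            # Assumption: a write here applies to the child's next
            # execve(), which is what setexeccon is understood to arrange.
            fd = os.open(attr_path, os.O_WRONLY)
            try:
                os.write(fd, context.encode())
            finally:
                os.close(fd)
            os.execvp(argv[0], argv)   # only returns by raising OSError
        except OSError:
            pass
        os._exit(127)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)
```

A call might look like exec_with_context("system_u:system_r:httpd_alice_t:s0", ["/usr/bin/python", "alice_site.py"]), where both the httpd_alice_t type and the script name are hypothetical.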
Is the use of apr_proc_create possible in mod_wsgi?
Cheers,
-J
No, as there is no separate executable for mod_wsgi to exec. This is
because mod_wsgi consists only of the Apache module, and through the
fork it inherits access to a lot of the internals of Apache with
respect to processing requests, which would otherwise not be available
if it was a separate executable. In other words, if it was a separate
executable, it would itself have to implement all the request
processing and configuration handling which at the moment it gets for
free.
The only thing of value in mod_wsgi may be some stuff I am thinking
about for the future. Specifically, I am looking at a new feature in
mod_wsgi which would allow one to create managed processes. These
processes are effectively independent executables, but you are using
Apache as a supervisor for starting them up, shutting them down and
restarting them if they die.
The intent in adding this feature is purely as a convenience for
starting up a separate Python based web server instance hosting a WSGI
application, so that a user doesn't have to implement some other
supervisor system for doing it. One would then not use mod_wsgi, but
standard mod_proxy or mod_rewrite to proxy requests through to this
managed back end web server instance.
This may be of interest because it will then be a fork/exec, so you
could insert the call you are talking about and thus control the
security context of the back end web server. Note though that you can
already set up this sort of proxying system now; the only difference in
what I am thinking of doing is that one is using Apache as the
supervisor system.
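What "supervisor" means here can be sketched in a few lines, in Python for brevity. The function name supervise is made up for this example and says nothing about how the feature would actually be built into mod_wsgi; it only shows the start, wait, restart cycle.

```python
import subprocess


def supervise(argv, max_restarts):
    """Run argv, restarting it each time it exits, up to max_restarts times.

    Returns the list of exit codes seen, one per launch. A real supervisor
    would also handle shutdown requests and back off between restarts.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        proc = subprocess.Popen(argv)    # fork/exec of the back end server
        exit_codes.append(proc.wait())   # block until that process dies
    return exit_codes
```

Because each launch is a fork followed by an exec, this is also the point at which a security context could be chosen per back end.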
Graham
So plan B would be Apache starting and stopping another Python
web server? That sounds like just adding more resource usage and
latency to me. Why use 2 web servers when the first one can handle it
just fine? Can you not make a mod_cluster instead, where one server
controls all mod_wsgi servers as one server? So plan B would be the
other way around, converting 2 mod_wsgi servers to 1 server?
The resources used are basically the same, i.e., the same number of
processes. The only difference is that rather than the daemon process
being just a fork of the Apache parent process, a subsequent exec is
done, thus replacing it with a separate program. As the daemon
processes at the moment are all created up front, any slight
additional startup time wouldn't be an issue there. Startup time could
be an issue when a process is restarted after a set number of requests,
or when a process dies and needs restarting. Even then I wouldn't
expect the difference to be that noticeable, given that the bulk of the
startup time is loading the WSGI application and would be the same in
both cases.
As to latency, the existing daemon processes do not handle requests
from a client directly; the request first needs to be proxied from the
Apache child processes. There is thus no difference in the data path in
terms of the number of hops involved. The only difference here is that
if a separate web server program is used, the protocol is HTTP rather
than the internal mod_wsgi protocol. The only issue therefore is how
much extra overhead there might be in processing the HTTP request
header rather than the slightly more compact mod_wsgi request header
format.
It is actually quite possible that, if using Apache 2.2, using a
back end web server may end up being more efficient. This is because
in Apache 2.2 it is possible in mod_proxy to specify that socket
connections to the back end web server be kept alive and also pooled.
This means one can to a degree eliminate the need to open a new socket
connection for every request. As mod_wsgi uses a new socket connection
every time for proxying a request from an Apache child process to a
daemon process, the mod_proxy arrangement may thus have less overhead
and allow greater throughput.
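The difference between a connection per request and a kept-alive connection can be seen with a small self-contained experiment. This is only an illustration using Python's standard library HTTP server and client, not mod_proxy or mod_wsgi; the names CountingHandler and count_connections are made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # allows keep-alive on the server side
    connections = 0

    def setup(self):
        super().setup()
        CountingHandler.connections += 1   # called once per accepted socket

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass   # keep the example quiet


def count_connections(requests, reuse):
    """Send `requests` GETs and return how many sockets the server accepted."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        conn = http.client.HTTPConnection(host, port)
        for i in range(requests):
            if i and not reuse:
                conn.close()   # force a fresh socket for every request
                conn = http.client.HTTPConnection(host, port)
            conn.request("GET", "/")
            conn.getresponse().read()
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections
```

Calling count_connections(5, reuse=False) and count_connections(5, reuse=True) and comparing the two results shows how many socket setups the kept-alive case avoids.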
> Can you not make a mod_cluster instead, where one server controls all
> mod_wsgi servers as one server? So plan B would be the other way
> around, converting 2 mod_wsgi servers to 1 server?
Huh? Not sure what you mean here. mod_wsgi is already capable of
managing multiple daemon process groups, where each process group may
have one or more processes, and each process in a specific group may
have one or more threads as necessary.
Anyway, after some more thought I have come up with an even more
devilish plan of how I could take this idea of managed processes
further and refactor it into something mod_wsgi specific, where
mod_wsgi could just invoke Python against Python/C extension modules
which implement the whole multi-interpreter, server-like environment
within the context of a standard Python process. It could even be made
to work with the exact same script file again. This would solve the
problem of mod_wsgi currently being tied to a specific Python version,
allowing other Python versions to also be used, albeit in a slightly
different, but at least compatible, configuration. I'll have to let
this idea gestate some more. The important thing though is that this
would be an additional optional feature and would not replace the
existing code base. The idea is that it would just provide more
flexibility.
Graham
current plan
------------
1)REQUEST
2)APACHE DAEMON :80
3)FORK AN APACHE WSGI PROCESS
4)CREATE WSGI SOCKET TO RECEIVE REQUEST FROM APACHE
5)RUN PYTHON2.5 AND RETURN ANSWER TO APACHE
6)CLOSE WSGI SOCKET
proxy plan
----------
1)REQUEST
2)APACHE DAEMON :80
3)START EXTERNAL HTTP PROCESS
4)CREATE HTTP SOCKET TO RECEIVE REQUEST FROM APACHE
5)RUN PYTHON2.5 AND RETURN ANSWER TO APACHE
6)CLOSE HTTP SOCKET
devil plan
----------
1)REQUEST
2)APACHE DAEMON :80
3)FORK AN APACHE UNIVERSAL HTTP PROCESS
4)CREATE HTTP SOCKET TO RECEIVE REQUEST FROM APACHE
5)RUN C/PYTHON/WHATEVER AND RETURN ANSWER TO APACHE
6)CLOSE HTTP SOCKET
fastcgi plan
------------
1)REQUEST
2)APACHE DAEMON :80
3)RUN EXTERNAL PROCESS AND RETURN ANSWER TO APACHE THROUGH STD SOCKET
So is this what I think you are thinking?
The next question would be adding latency numbers next to each line,
and how many times each step occurs at, let's say, 1 request per second.
Please consider the fact that this could make no sense at all :)
I haven't done the proposed devilish plan, nor proposed mod_wsgi
transient daemon variants yet, but have a look at the attached and see
if this makes it slightly clearer for the attached cases at least.
The mod_wsgi transient daemon process feature would require a process
manager similar to the one in mod_fastcgi, as the Apache child processes
will need to be able to communicate with the process manager to ensure
that the transient daemon processes exist before they try to connect.
In the devilish plan, in place of the WSGI daemon processes there would
be a separate Python process created by fork/exec from either the Apache
parent or the process manager, instead of just a fork as for a standard
daemon. The exec would do something like:
python -c 'import mod_wsgi.daemon; mod_wsgi.daemon.run(....)'
This would start up a mini server which understands the WSGI wire
protocol and can handle the requests proxied from the Apache child
processes. Lots of details are still to be sorted out over this one if
it were to be done, such as whether it is still possible to get logging
back into the correct Apache virtual host log file.
Graham
Please make sure those docs get onto your
http://code.google.com/p/modwsgi/ page. They make a lot of sense for
people like me who are trying to understand how mod_wsgi works.