pylons and Apache's DocumentRoot


Ramon Diaz-Uriarte

Dec 12, 2006, 2:51:16 PM
to pylons-...@googlegroups.com
Apologies in advance: I think this is a trivial question, but I don't
seem to be able to get the answer from the docs or the mailing list.
If I follow either

http://pylonshq.com/project/pylonshq/wiki/FastCGI

or

http://pylonshq.com/docs/0.9.3/webserver_config.html

or

http://pylonshq.com/project/pylonshq/wiki/CgiOnNoFrillsHostingSvc

I see that DocumentRoot is a directory where, I think, the pylons egg
cannot have been installed; the eggs will install under either the
system-wide site-packages or some other location, as provided by
virtual python or workingenv. But, as far as I know, none of the
Pylons project's files will be left under, say,
"/var/www/example.com/htdocs" (as in the FastCGI doc from the wiki) or
similar. Thus:

a) are people making a sym link from, say
~/somewhere/lib/python2.4/my-project.egg/my-project/public to
/var/www/example.com/htdocs

b) are people copying (or linking) just a few selected py files
to /var/www/example.com/htdocs? If so, which ones, and why not a)?


(I don't think this matters here, but I am for now restricted to apache 1.3).

Best,

R.

--
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz

Mike Orr

Dec 12, 2006, 4:34:00 PM
to pylons-...@googlegroups.com
On 12/12/06, Ramon Diaz-Uriarte <rdi...@gmail.com> wrote:
> I see that DocumentRoot is a directory where, I think, the pylons egg
> cannot have been installed; the eggs will install under either the
> system-wide site-packages or some other location, as provided by
> virtual python or workingenv. But, as far as I know, none of the
> Pylons' project file will be left under, say,
> "/var/www/example.com/htdocs" (as in the FastCGI doc from the wiki) or
> similar. Thus:
>
> a) are people making a sym link from, say
> ~/somewhere/lib/python2.4/my-project.egg/my-project/public to
> /var/www/example.com/htdocs
>
> b) are people just copying (or linking) just a few selected py files
> to /var/www/example.com/htdocs? If so, which ones and why not a)

Neither one. If you put Python files in the document root or symlink
to them, Apache will serve them as ordinary static files and users
will see your Python source code, not the webpages your application
produces.

Instead, set up your application outside Apache's space, and adjust
Apache's config file (httpd.conf) to forward requests to your
application server. The easiest way is to use the HTTP server that
comes with Paste, and use Apache's mod_proxy to forward requests to
the other HTTP server. Something like this:

ProxyPass / http://localhost:5000/

See http://httpd.apache.org/docs/1.3/mod/mod_proxy.html .
You may need ProxyPassReverse too.
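
As a rough illustration of what sits behind that ProxyPass line, here is a minimal sketch of a backend HTTP server. It uses the standard library's wsgiref instead of Paste, and a trivial stand-in app instead of a real Pylons application; the names backend_app, run_backend and call_directly are made up for this example.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def backend_app(environ, start_response):
    """Tiny WSGI app standing in for the Pylons application (assumption)."""
    body = ("backend saw path " + environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]


def run_backend(host="127.0.0.1", port=5000):
    """Serve backend_app on the address the ProxyPass line points at."""
    httpd = make_server(host, port, backend_app)
    httpd.serve_forever()


def call_directly(path):
    """Invoke the app without a network socket, for a quick check."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    body = b"".join(backend_app(environ, start_response))
    return seen["status"], body
```

Calling run_backend() would start the listener on port 5000; Apache would then forward requests to it and never needs the application files under its document root.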

Alternatively, you can use FastCGI, SCGI, mod_python, etc. The
directives are different but again it's all in the Apache config file,
not in the document root.

The only time the document root would be involved is if you put the
directives in an .htaccess file. I don't know if that's even possible
with any of these directives, and you certainly shouldn't consider it
if you have the ability to modify httpd.conf yourself.

--
Mike Orr <slugg...@gmail.com>

Shannon -jj Behrens

Dec 12, 2006, 4:37:05 PM
to pylons-...@googlegroups.com

Of course, you can and probably should have Apache serve your static
files. You can accomplish that by either configuring Apache with an
alias or setting up a symlink as you suggest.

Best Regards,
-jj

--
http://jjinux.blogspot.com/

Mike Orr

Dec 12, 2006, 5:42:13 PM
to pylons-...@googlegroups.com
[stuff deleted]

>
> Of course, you can and probably should have Apache serve your static
> files. You can accomplish that by either configuring Apache with an
> alias or setting up a symlink as you suggest.

You would have to alias the specific URLs or directories containing
static files.

Alias /default.css /PATH/TO/APP/EGG/packagename/static/default.css
Alias /images /PATH/TO/APP/EGG/packagename/static/images

Then you'd have to verify that Apache processes Alias before ProxyPass
or whatever module you're using. Apache processes modules in a
certain order, sometimes the way you want, sometimes not.

A symlink from the document root would probably not work unless you
can convince Apache *not* to activate ModProxy or whatever for those
particular URLs.

--
Mike Orr <slugg...@gmail.com>

Ramon Diaz-Uriarte

Dec 12, 2006, 8:01:57 PM
to pylons-...@googlegroups.com
On 12/12/06, Mike Orr <slugg...@gmail.com> wrote:
>

I think I understand what you are saying here, but it seems very
different from what the above documents suggest (if I am understanding
correctly). I see no mention of using ProxyPass (although it might be
implicit).

> The only time the document root would be involved is if you put the
> directives in an .htaccess file. I don't know if that's even possible
> with any of these directives, and you certainly shouldn't consider it
> if you have the ability to modify httpd.conf yourself.

Yes, I have access to httpd.conf. Anyway, the need for ProxyPass would
be there just the same?

R.


>
> --
> Mike Orr <slugg...@gmail.com>

Ramon Diaz-Uriarte

Dec 12, 2006, 8:05:46 PM
to pylons-...@googlegroups.com
On 12/12/06, Shannon -jj Behrens <jji...@gmail.com> wrote:
>

Right now, getting the whole thing to work would be more than enough
:-) even if I pay a small penalty (our applications spend most of
their time doing number crunching, which keeps them busy for 5 to 50
minutes, and we do not have that many daily hits, so it's not a big
deal right now if serving static content takes longer than needed).


R.

> Best Regards,
> -jj
>
> --
> http://jjinux.blogspot.com/
>
> >
>

Jose Galvez

Dec 13, 2006, 1:57:12 AM
to pylons-...@googlegroups.com
Dear Ramon,

Do you even need Apache? If most of your application doesn't need
static files then you really don't need Apache. Having said that, I
understand that most of us use Apache for the things that it's good at,
like virtual hosts and serving the occasional PHP file. All the recipes
you've seen, whether with mod_proxy, FastCGI, SCGI or whatever, are
really nothing more than getting Apache to map some URLs to your Pylons
server. I personally have used both SCGI and mod_proxy (currently I'm
using mod_proxy_ajp). Sorry if this is obvious, but from reading your
post I wasn't sure that it was. If you want some concrete examples of how
to integrate Pylons with Apache, tell us what you want to do and which
connector you want to use (or can use), and I'm sure someone will send
you some canned code you can drop into your Apache conf file to make it
work.

Jose

Mike Orr

Dec 13, 2006, 3:47:31 AM
to pylons-...@googlegroups.com

I was wondering that too, but it's simply a case of that page not
mentioning that alternative. I would have added it myself but I want
to get some more experience with flup first so I don't put anything
incorrect in the wiki.

By the way, I'm using Apache only because my Pylons sites will have to
share the port with existing applications that are being served by
Apache. Otherwise I'd consider one of the Python HTTP servers
instead.

--
Mike Orr <slugg...@gmail.com>

Ramon Diaz-Uriarte

Dec 13, 2006, 5:17:10 AM
to pylons-...@googlegroups.com, Andrés Cañada, Andreu Alibés, Lara. Mariana, Oscar Rueda
On 12/13/06, Jose Galvez <jj.g...@gmail.com> wrote:
>
> Dear Ramon,
>
> Do you even need Apache? if most of your application doesn't need
> static files then you really don't need Apache. Having said that I
> understand that most of us use apache for things that its good for, like
> virtual hosts and serving the occasional php file. All the recipes
> you've seen weather its with mod_proxy, fastcgi scgi or what ever are
> really nothing more then getting apache to map some urls to your pylons
> server. I personally have used both scgi and mod_proxy (currently I'm
> using mod_proxy_ajp). Sorry this may be obvious, but from reading your
> post I wasn't sure if it was.

Dear José,

No, no, it wasn't obvious: I am very ignorant about Apache and I
definitely don't really understand how Pylons (or similar web
frameworks) work (despite detailed explanations such as
http://groups.google.com/group/pylons-discuss/browse_thread/thread/c71a313bc84a2e1f/555190e0554d33a7?lnk=gst&q=apache&rnum=16#555190e0554d33a7).


> If you want some concrete examples on how
> to integrate pylons with apache tell us what you want to do, which
> connector you want to use (or can use) and I'm sure someone will send
> you some canned code you can drop into your apache conf file to make it work
>

Here I go (long message follows). First the "what I (think) I
understand", then "context and what we want to do".


What I (think) I understand
---------------------------------------
Regardless of other intermediate things in between, you (almost)
always need to have paster serving the thing. If using Apache (at
least without mod_python), paster MUST be up and running, serving
things at some address and port. However, I am still puzzled by:

"Apache is an incredibly mature technology, and having the knowledge
that when Apache is up, my site is up is quite nice. " by Ben Bangert
in
http://groups.google.com/group/pylons-discuss/tree/browse_frm/thread/5de35593c4571633/a1ac095664259688?rnum=1&_done=%2Fgroup%2Fpylons-discuss%2Fbrowse_frm%2Fthread%2F5de35593c4571633%2F%3F#doc_ad8deb4f3acec8e1

Is this because of mod_python? Don't they need to have paster running
underneath?

Context and what we want to do
-----------------------------------------------

We have a set of number-crunching bioinfo applications (reachable from
http://asterias.bioinfo.cnio.es) where the CGI part is written in
Python. Most of the apps. share a lot of code and installation in
other machines was a pain. Thus, we looked around and thought Pylons
would be great. It's been great so far and most of the work of moving
things to Pylons is done. Our main meta-application is called
asterias. We have several controllers for each of the applications
(e.g., tnasas, genesrf, etc.): things work fine using "paster
serve --reload whatever.ini" and typing the right URL (e.g.,
0.0.0.0:5000/tnasas).

We create eggs, and using workingenv, we install the egg, and other
modules, under /http (a dir we created, owned by www-data). So we have
a structure like:
/http/AsteriasEnv/lib/python2.4/asterias.egg blablabla

As for connectors, we do not serve thousands of short requests a day;
our apps take a while to run (5 to 50 minutes) and there are rarely
more than 20 requests per application per day. (I.e., speed is an issue
in the number-crunching code; it is inconsequential in the web-serving
part.) I think plain CGI will be just fine.


But I think we cannot just dump Apache because:

a) We have 7 applications, and right now only 3 or 4 of those would be
ready to run with Pylons. We need to make sure the others can still
run. I thought using Pylons via Apache would allow us a non-traumatic
transition (just re-write the relevant parts of httpd.conf when
ready).

b) We cannot change the way people call an application (these apps
have been running for > 1 year, and we cannot break these habits).
E.g., we want to allow people to keep typing "tnasas.bioinfo.cnio.es"
and not force them to type "asterias.bioinfo.cnio.es/tnasas".
I thought I'd be able to take care of this via apache. (Just define as
many virtual hosts as ways of calling the application I want).

c) We cannot ask users to type a port number after the URL. (I think
this prevents me from keeping apache for the "not-yet-pylons" but
serve the rest via paster, distinguishing via port).

d) Other things (not "owned by us") run on these machines and use Apache.

e) We run things in a cluster; we have Linux Virtual Server (+
heartbeat) on a master node distributing the requests to the cluster
nodes. But I do not think we can do anything at the LVS level to send
things to either Apache or paster.

f) We'd rather not use mod_python (because we are using Apache 1.3).


As soon as I take care of my daughter's breakfast, I am going to try a
simple ProxyPass approach with Apache.

Best, and thanks for the patience :-).


R.

Ramon Diaz-Uriarte

Dec 13, 2006, 5:20:49 AM
to pylons-...@googlegroups.com

Sorry, Mike, I am getting lost here: so the ProxyPass approach is an
alternative to what they recommend in those pages? So can they use
FastCGI (as in http://pylonshq.com/project/pylonshq/wiki/FastCGI) or
plain CGI (as in
http://pylonshq.com/project/pylonshq/wiki/CgiOnNoFrillsHostingSvc)
without using ProxyPass?

Don't they still need to have paster up and running?


> By the way, I'm using Apache only because my Pylons sites will have to
> share the port with existing applications that are being served by
> Apache.

Yes, same situation here.


> Otherwise I'd consider one of the Python HTTP servers
> instead.
>

I am all for making my life simple here!

Best,

James Gardner

Dec 13, 2006, 6:44:01 AM
to pylons-...@googlegroups.com, Andrés Cañada, Andreu Alibés, Lara. Mariana, Oscar Rueda
Dear Ramon,

It sounds like you are simply bewildered by choice here! Since you are
serving only 50 requests/day it really doesn't matter which deployment
technique you use. Here are a load of bullet points which hopefully
clear up all the various areas you have touched on!

* Pylons doesn't have a server so you serve a Pylons app in whichever
way you prefer with whichever server

* Pylons is thread safe so you can use multi-threaded as well as
multi-process server techniques

* Pylons doesn't do any threading itself even though it can be used in a
multi-threaded environment so life is nice and simple

* One way of deploying a Pylons app is with paster serve. It is useful
for testing and (with care) can be used for production deployment.

* People who deploy with paster serve typically use Apache with
ProxyPass so that the visitors don't have to see the port and so that
they can take advantage of Apache's virtual hosts and mod_rewrite
capabilities.

* You can also use dedicated reverse proxies such as Pound to load
balance across several running paster serve apps

* If you deploy a Pylons app using paster serve it is wise to set up a
cron job to check the server is still running and restart it if
necessary. I use daemontools as described on the Pylons wiki to achieve
a similar thing.
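
The cron-job idea could look roughly like the sketch below: check whether anything accepts connections on the server's port and run a restart command if not. The function names and the restart command are hypothetical; how paster is actually started on a given machine is not specified here.

```python
import socket
import subprocess


def is_listening(host, port, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def ensure_running(host, port, restart_cmd):
    """Run restart_cmd if the port is dead; return True if a restart was tried."""
    if is_listening(host, port):
        return False
    subprocess.call(restart_cmd)
    return True


# Hypothetical use from a script that cron runs every few minutes:
# ensure_running("127.0.0.1", 5000, ["/hypothetical/path/restart-paster.sh"])
```

A port check only shows that the process is accepting connections, not that the application answers correctly, so a supervisor such as daemontools may still be the more robust choice.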

* If you are serving a Pylons application you don't need to do anything
special with static files, Pylons itself handles them for you. Of
course, if you want Apache to handle them you can set that up too. The
easiest way is to copy them into your htdocs directory and then set up a
mod_rewrite rule so that Apache serves from the static directory where
possible and passes the request on to Pylons where it is not. I'm using
this for one of my apps, feel free to adapt it for your use:

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([-_a-zA-Z0-9/\.]+)$ /cgi-bin/dispatch.cgi/$1

Something like the above would check the filename first then internally
redirect the request to the dispatch.cgi file which could serve the
Pylons app. If you are doing this then the Pylons app will actually be
at a different location to the URL so you will need some middleware that
manually alters the environment so it thinks it is running at that
correct URL and therefore generates links that work before the rewrite.
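
A minimal sketch of such middleware, assuming the app builds its links from SCRIPT_NAME (as WSGI apps commonly do): it overrides SCRIPT_NAME with the public prefix so links no longer contain /cgi-bin/dispatch.cgi. The class name PublicPrefix and the stand-in link_app are invented for this example and are not part of Pylons or Paste.

```python
class PublicPrefix:
    """WSGI middleware that replaces SCRIPT_NAME with the public URL prefix."""

    def __init__(self, app, public_prefix=""):
        self.app = app
        self.public_prefix = public_prefix.rstrip("/")

    def __call__(self, environ, start_response):
        environ = dict(environ)  # work on a copy, leave the caller's dict alone
        environ["SCRIPT_NAME"] = self.public_prefix
        return self.app(environ, start_response)


def link_app(environ, start_response):
    """Stand-in app that builds a link from SCRIPT_NAME."""
    link = environ.get("SCRIPT_NAME", "") + "/results"
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [link.encode("utf-8")]
```

In a dispatch.cgi, the loaded application would be wrapped (for example PublicPrefix(wsgi_app)) before being handed to the CGI handler.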

If you do this you could also disable Pylons' static file support by
taking the static file app out of the Cascade middleware in
config/middleware.py since it won't be needed.

> What I (think) I understand
> ---------------------------------------
> Regardless of other intermediate things in between, you (almost)
> always need to have paster serving the thing. If using Apache (at
> least without mod_python), paster MUST be up and running, serving
> things at some address and port.

Nope, not at all. You don't *need* to have a standalone running server,
it is just that some people find that the most simple/flexible way.

You can also deploy via CGI, FastCGI or mod_python and more which are
all described in detail on the wiki.

CGI and FastCGI methods use a dispatch.cgi or dispatch.fcgi file which
you could put in your rewrite rules for static files.

> However, I am still puzzled by:
>
> "Apache is an incredibly mature technology, and having the knowledge
> that when Apache is up, my site is up is quite nice. " by Ben Bangert
> in
> http://groups.google.com/group/pylons-discuss/tree/browse_frm/thread/5de35593c4571633/a1ac095664259688?rnum=1&_done=%2Fgroup%2Fpylons-discuss%2Fbrowse_frm%2Fthread%2F5de35593c4571633%2F%3F#doc_ad8deb4f3acec8e1
>
> Is this because of mod_python? Don't they need to have paster running
> underneath?

Exactly, Ben's just saying that if you deploy via Apache you pretty much
know everything will be working because Apache is good at CGI, FastCGI,
etc., whereas if you go down the standalone server (e.g. paster serve)
route you need to write some monitoring code to restart the server if it
fails. Neither should be difficult; it just depends which approach you
prefer.

> As for connectors, we do not server thousands of short requests a day;
> our apps. take a while to run (5 to 50 minutes) and there are rarely
> more than 20 requests per appl. per day. (I.e., speed is an issues in
> the number crunching code; it is inconsequential in the web-serving
> part). I think plain CGI will be just fine.

Yup, that sounds fine unless your app displays thousands of static files,
because with CGI a new Pylons app is loaded and unloaded to serve each
static file. This can be avoided by using the mod_rewrite approach
and having Apache serve your static files.

> But I think we cannot just dump Apache because:
>
> a) We have 7 applications, and right now only 3 or 4 of those would be
> ready to run with Pylons. We need to make sure the others can still
> run. I thought using Pylons via Apache would allow us a non-traumatic
> transition (just re-write the relevant parts of httpd.conf when
> ready).

Keep it then and use CGI. If performance were an issue you'd use FastCGI.

> b) We cannot change the way people call an application (these apps.
> have been running for > 1 year, and we cannot break this habits).
> E.g., we want to allow people to keep typing "tnasas.bioinfo.cnio.es"
> and not force them to type "asterias.bioinfo.cnio.es/tnasas".
> I thought I'd be able to take care of this via apache. (Just define as
> many virtual hosts as ways of calling the application I want).

Yup, use Apache and CGI then. You simply need to write a short CGI
application that loads the egg and runs the Pylons app::

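#!/home/user/bin/python

# Minimal CGI entry point: load the Pylons app and run it for one request.
import wsgiref.handlers

from paste.deploy import loadapp

# Build the WSGI application described by the config file.
wsgi_app = loadapp('config:/home/user/pylons_app/test.ini')

# CGIHandler takes the request from the CGI environment and stdin.
wsgiref.handlers.CGIHandler().run(wsgi_app)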
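
To see what the CGI handler does without Apache or an installed egg, the following sketch runs one CGI-style request entirely in memory. It assumes a trivial stand-in app (demo_app) in place of whatever loadapp() returns, and uses wsgiref's BaseCGIHandler so that the input, output and environment can be supplied by hand; run_as_cgi is a name made up for this example.

```python
import io
import sys
from wsgiref.handlers import BaseCGIHandler


def demo_app(environ, start_response):
    """Stand-in for the object loadapp() would return (assumption)."""
    body = ("hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def run_as_cgi(app, cgi_environ):
    """Run one CGI-style request in memory and return the raw bytes written."""
    stdout = io.BytesIO()
    handler = BaseCGIHandler(io.BytesIO(), stdout, sys.stderr, cgi_environ)
    handler.run(app)
    return stdout.getvalue()


raw = run_as_cgi(demo_app, {"REQUEST_METHOD": "GET", "PATH_INFO": "/tnasas",
                            "SCRIPT_NAME": "", "SERVER_NAME": "localhost",
                            "SERVER_PORT": "80"})
```

The bytes in raw are what a CGI script would print for Apache to relay: a Status line, the headers, a blank line and the body.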

> c) We cannot ask users to type a port number after the URL. (I think
> this prevents me from keeping apache for the "not-yet-pylons" but
> serve the rest via paster, distinguishing via port).

Use Apache and CGI then, or a paster server with Apache using mod_proxy.

> d) Other things (not "owned by us") run on these machines and use Apache.

Sounds wise to keep Apache then.

> e) We run things in a cluster; we have Linux Virtual Sever (+
> heartbeat) on a master node distributing the requests to the cluster
> nodes. But I do not think we can do anything at the LVS level to send
> things to either Apache or paster.

You could use a reverse proxy like Pound to distribute the requests over
the nodes, but I expect you are already doing something similar?

> f) We'd rather not use mod_python (because we are using Apache 1.3).

Fair enough.

> As soon as I take care of my daughter's breakfast, I am going to try a
> simple ProxyPass approach with Apache.

OK, hope it works out!

Let us know if any of this doesn't make sense or you need any more
advice because there are bound to be others with similar experiences and
it is good to have a definitive record of sorting things out for others
to refer to.

Best wishes,

James

Ramon Diaz-Uriarte

Dec 13, 2006, 1:23:13 PM
to pylons-...@googlegroups.com, Andrés Cañada, Andreu Alibés, Lara. Mariana, Oscar Rueda
Dear James,

Thanks a lot for your message! I think it has cleared up most of my
confusion. After considering the issues, I think I will try using
CGI. I am using essentially the same dispatch.cgi as you showed.

I think it almost works, but not quite. The main app. is called
asterias here. Now, with the asterias.ini I use, and using paster, I
can do:

http://0.0.0.0:5000/adacgh/

and it takes me where I want (adacgh.py lives under /controllers). But
I've not been able to get the same to work with dispatch.cgi. I think
the key is somewhere in the call to loadapp, but I've been unable to
get it right.

Of course, having several "dispatch-whatever.cgi" is fine (one would
be accessed by each virtual host).

Best,

R.

P.S. Here are the dispatch and ini:

************* dispatch.cgi

#!/usr/bin/python2.4

""" To try to use the Pylons app. via CGI. See:
http://pylonshq.com/project/pylonshq/wiki/CgiOnNoFrillsHostingSvc.
http://groups.google.com/group/pylons-discuss/tree/browse_frm/thread/4bb9b20d4724d8ae/08e4e54d15682cb1?rnum=11&hl=en&_done=%2Fgroup%2Fpylons-discuss%2Fbrowse_frm%2Fthread%2F4bb9b20d4724d8ae%3Fscoring%3Dd%26hl%3Den%26&scoring=d#doc_08e4e54d15682cb1


This file lives right under the asterias dir in the repository.
Make sure libraries, etc, are OK by running it on its own.
"""

import os, sys

## to use the stuff installed in workingenv (created in /http/Asterias-Pylons)
sys.path.append('/http/Asterias-Pylons/lib/python2.4')
sys.path.append('/http/Asterias-Pylons/lib/python2.4/wsgiref-0.1.2-py2.4.egg')

# Load the WSGI application from the config file
from paste.deploy import loadapp
## It is not this one:
## wsgi_app = loadapp('config:/http/Asterias-Pylons/lib/python2.4/asterias-0.5dev-py2.4.egg/asterias/asterias.ini')
wsgi_app = loadapp('config:/http/asterias.ini')
import wsgiref.handlers
wsgiref.handlers.CGIHandler().run(wsgi_app)


********** asterias.ini

#
# asterias - Pylons configuration
#
# The %(here)s variable will be replaced with the parent directory of this file
#
[DEFAULT]
debug = false
email_to = rdi...@gmail.com
#smtp_server = localhost
#error_email_from = pa...@exceptions.com

[server:main]
# The default, which will work with paster serve
use = egg:Paste#http
# Next for FastCGI
#use = egg:PasteScript#flup_fcgi_thread
host = 0.0.0.0
port = 5000

[app:main]
use = egg:asterias
myghty_data_dir = %(here)s/data/templates
cache_data_dir = %(here)s/data/cache
session_data_dir = %(here)s/data/sessions
session_key = asterias
session_secret = LeO/+CfUwlICZIJzDkzH50F91
app_instance_uuid = {ce43a7b4-5499-403a-b157-fb28e10d534c}
option1 = adacgh


## Nothing below works
[app:adacgh]
use = egg:asterias#adacgh.py

[app:adacgh2]
use = egg:asterias
document_root = %(here)s/Asterias-Pylons/lib/python2.4/asterias-1.0-py2.4.egg/asterias/controllers

[composite:main1]
use = egg:Paste#urlmap
/adacgh = adacgh
/genesrf = genesrf


# Specify the database for SQLObject to use via pylons.database.PackageHub.
# %(here) may include a ':' character on Windows environments; this can
# invalidate the URI when specifying a SQLite db via path name. Refer to the
# SQLObject documentation for a special syntax to preserve the URI.
#sqlobject.dburi = sqlite:%(here)s/somedb.db

# WARNING: *THE LINE BELOW MUST BE UNCOMMENTED ON A PRODUCTION ENVIRONMENT*
# Debug mode will enable the interactive debugging tool, allowing ANYONE to
# execute malicious code after an exception is raised.
set debug = false
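
One way to inspect a file like this is to list its [kind:name] sections. The sketch below does that with the standard library's configparser on a trimmed copy of the file above; the helper named_sections is invented for this example. As far as I understand PasteDeploy, a config: URI without a "#name" suffix looks up the section named main, so with this file loadapp('config:/http/asterias.ini') would use [app:main] and never consult [composite:main1]; treat that as an assumption to verify against the PasteDeploy documentation.

```python
import configparser

# Trimmed copy of the sections relevant to dispatching.
INI_SNIPPET = """
[server:main]
use = egg:Paste#http
host = 0.0.0.0
port = 5000

[app:main]
use = egg:asterias

[composite:main1]
use = egg:Paste#urlmap
/adacgh = adacgh
/genesrf = genesrf
"""


def named_sections(ini_text):
    """Map (kind, name) -> options for sections of the form [kind:name]."""
    parser = configparser.RawConfigParser()
    parser.optionxform = str  # keep option names such as /adacgh unchanged
    parser.read_string(ini_text)
    found = {}
    for section in parser.sections():
        kind, _, name = section.partition(":")
        found[(kind, name)] = dict(parser.items(section))
    return found


sections = named_sections(INI_SNIPPET)
```

Listing the keys of sections shows at a glance which names exist; in the full file above, for instance, the urlmap entry /genesrf refers to an application called genesrf for which no section is defined.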

Shannon -jj Behrens

Dec 13, 2006, 3:07:57 PM
to pylons-...@googlegroups.com, Andrés Cañada, Andreu Alibés, Lara. Mariana, Oscar Rueda

I'm 95% positive that's not true. CGI is a way for a Web server
(Apache) to execute an external application (your app running like a
shell script). mod_python is a way of embedding Python inside Apache.
SCGI and FastCGI are ways of having one Web server (Apache) defer to
another application over a certain protocol. Proxying is a way of
having one Web server (Apache) defer to another application (Paster)
using HTTP as the protocol.

I like to use Apache because:

a) It's relatively fast with static content.
b) It knows about SSL.
c) It knows about virtual hosts.

I proxy Paster behind Apache because it's a reasonable thing to do,
and it's worked well for me in the past. CGI is too slow since you
have to respawn Python every time. mod_python has bitten me many
times over the years. FastCGI didn't work as well as proxying in some
of my past applications. I've never tried SCGI.

I'm guessing that you'll want to do the same. You probably shouldn't
use CGI because CGI starts up an entire Python interpreter for every
request, and it sounds like you need a long running process.

If you're proxying, you have to make sure that Paster gives a response
to Apache within a reasonable amount of time, otherwise Apache will
time out. For long running tasks like this, I like to return a
response immediately that tells the user "Yeah, yeah, I'm going to do
what you asked." Then I can do the processing in the background.
Then, you can use AJAX to poll the Web server to see if it's ready to
produce some output.
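
The "answer immediately, work in the background, poll for the result" pattern could be sketched as follows. This is only an in-memory illustration with invented names (submit, status); it assumes a long-running server process such as paster, since under plain CGI the process and its job table disappear after every request.

```python
import threading
import time
import uuid

_jobs = {}
_jobs_lock = threading.Lock()


def submit(task, *args):
    """Start task in a background thread and return a job id immediately."""
    job_id = uuid.uuid4().hex
    started = time.time()
    with _jobs_lock:
        _jobs[job_id] = {"state": "running", "result": None, "started": started}

    def worker():
        try:
            update = {"state": "done", "result": task(*args)}
        except Exception as exc:  # record the failure instead of losing it
            update = {"state": "failed", "result": repr(exc)}
        update["started"] = started
        with _jobs_lock:
            _jobs[job_id] = update

    threading.Thread(target=worker, daemon=True).start()
    return job_id


def status(job_id):
    """What a polling request (for example from AJAX) would be told."""
    with _jobs_lock:
        job = _jobs.get(job_id)
        if job is None:
            return {"state": "unknown", "result": None}
        return dict(job)
```

A controller action would call submit() and return the job id right away, and a second action would return status(job_id) to the polling page, so Apache never waits on the 5 to 50 minute computation.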

> If using Apache (at
> least without mod_python), paster MUST be up and running,

I'm 95% positive that mod_python and paster are mutually exclusive.
You either run your app under one or the other. Of course, you can
run one in mod_python and another in paster, but that's another story.

> serving
> things at some address and port. However, I am still puzzled by:
>
> "Apache is an incredibly mature technology, and having the knowledge
> that when Apache is up, my site is up is quite nice. " by Ben Bangert
> in

Apache is good. However, so much can go wrong besides apache that you
should ignore the second half of that statement.

mod_python is no longer everyone's favorite solution.

I agree with this approach because it's exactly what I do.

> b) We cannot change the way people call an application (these apps.
> have been running for > 1 year, and we cannot break this habits).
> E.g., we want to allow people to keep typing "tnasas.bioinfo.cnio.es"
> and not force them to type "asterias.bioinfo.cnio.es/tnasas".
> I thought I'd be able to take care of this via apache. (Just define as
> many virtual hosts as ways of calling the application I want).

Yep.

> c) We cannot ask users to type a port number after the URL. (I think
> this prevents me from keeping apache for the "not-yet-pylons" but
> serve the rest via paster, distinguishing via port).
>
> d) Other things (not "owned by us") run on these machines and use Apache.
>
> e) We run things in a cluster; we have Linux Virtual Sever (+
> heartbeat) on a master node distributing the requests to the cluster
> nodes. But I do not think we can do anything at the LVS level to send
> things to either Apache or paster.
>
> f) We'd rather not use mod_python (because we are using Apache 1.3).
>
>
> As soon as I take care of my daughter's breakfast, I am going to try a
> simple ProxyPass approach with Apache.
>
> Best, and thanks for the patience :-).

I hope my rambling was of some use.

Happy Hacking!
-jj

--
http://jjinux.blogspot.com/

Ben Bangert

Dec 13, 2006, 3:46:03 PM
to pylons-...@googlegroups.com
On Dec 13, 2006, at 12:07 PM, Shannon -jj Behrens wrote:

>>> Do you even need Apache? if most of your application doesn't need
>>> static files then you really don't need Apache. Having said that I

Apache or something else that can do some basic HTTP scrubbing in
front of paster serve's HTTP server is always recommended. Apache and
other front-end HTTP servers typically will 'fix' malformed HTTP
requests and other stuff that paster's HTTP server may not handle
very nicely. Whether you have squid, apache, or some other HTTP proxy
in front, I believe they all scrub the request to some extent.

>> Regardless of other intermediate things in between, you (almost)
>> always need to have paster serving the thing.
>
> I'm 95% positive that's not true. CGI is a way for a Web server
> (Apache) to execute an external application (your app running like a
> shell script). mod_python is a way of embedding Python inside Apache.
> SCGI and FastCGI are ways of having one Web server (Apache) defer to
> another application over a certain protocol. Proxying is a way of
> having one Web server (Apache) defer to another application (Paster)
> using HTTP as the protocol.

JJ is correct on this. paster serve can start your app as a
stand-alone app that handles incoming HTTP, SCGI, or FastCGI requests.
When you use a Pylons app under mod_python, mod_python loads a
handler that creates your Pylons app as a WSGI app and serves it;
no paster serve is involved in this case, and no processes other than
the Apache ones are floating around.

> I like to use Apache because:
>
> a) It's relatively fast with static content.
> b) It knows about SSL.
> c) It knows about virtual hosts.

I'd really like to see some middleware that can dispatch based on
virtual host as well. :)
Routes can already handle sub-domains, but that's only within an
application.
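
Middleware of that kind might look roughly like the sketch below, which picks a WSGI app from the Host header. It is an illustration only, not an existing Pylons or Paste component; HostDispatcher and make_named_app are invented names, and the host names are the ones mentioned earlier in the thread.

```python
class HostDispatcher:
    """Route a request to a WSGI app chosen by the Host header."""

    def __init__(self, apps_by_host, default_app):
        self.apps_by_host = {h.lower(): a for h, a in apps_by_host.items()}
        self.default_app = default_app

    def __call__(self, environ, start_response):
        host = environ.get("HTTP_HOST", environ.get("SERVER_NAME", ""))
        host = host.split(":", 1)[0].lower()  # drop an optional :port
        app = self.apps_by_host.get(host, self.default_app)
        return app(environ, start_response)


def make_named_app(name):
    """Build a stand-in WSGI app that only reports its own name."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [name.encode("utf-8")]
    return app


dispatcher = HostDispatcher(
    {"tnasas.bioinfo.cnio.es": make_named_app("tnasas")},
    default_app=make_named_app("asterias"),
)
```

With something like this in front, a single server process could answer for several virtual hosts, provided the front-end proxy passes the original Host header through.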

> I'm 95% positive that mod_python and paster are mutually exclusive.
> You either run your app under one or the other. Of course, you can
> run one in mod_python and another in paster, but that's another story.

Yup!

>> "Apache is an incredibly mature technology, and having the knowledge
>> that when Apache is up, my site is up is quite nice. " by Ben Bangert
>> in
>
> Apache is good. However, so much can go wrong besides apache that you
> should ignore the second half of that statement.

I should also note that since then, I've gone to a proxy setup. I
have djb's supervise process running, which ensures my paster serve
process is always up and running (mostly following the how-to James
Gardner put up on the wiki about supervise). It works wonderfully,
and paster serve has been incredibly solid; to date it hasn't even
crashed (and if it did, supervise would instantly restart it). This
solution is also rather resource efficient. When your app is loaded in
mod_python, the Apache process handling your request typically grows
in RAM. So if you want to handle 100 simultaneous requests under
Apache pre-fork, you'll have 100 Apache processes, which means 100
Pylons apps loaded into RAM.

With a proxy solution to paster serve, you can set the thread pool
higher, and since it's a single process that can share RAM, your app
is loaded just once. This keeps all those Apache processes lighter,
and they start up faster since they're not loading your app when a
new one is spawned.

If you want something even lighter and more efficient, Bob Ippolito
is running what I'd consider one of the most efficient and fastest
setups. nginx (though I'd imagine squid could almost work as well) on
the front to handle requests, dispatching to a pool of Pylons
processes on the back with FastCGI. nginx implements FastCGI
differently than Apache, using intelligent keep-alives on the
connections and multiplexing requests to the backend pool. He's
handling a load significantly higher than I'd imagine many of us do
though. :)

> mod_python is no longer everyone's favorite solution.

Hopefully what I said above might help illustrate why as well. :)

Cheers,
Ben

Graham Dumpleton

Dec 13, 2006, 11:08:05 PM
to pylons-discuss

Ben Bangert wrote:
> I should also note that since then, I've gone to a proxy setup. I
> have djb's supervise process running which ensures my paster serve
> process is always up and running (mostly following the how-to James
> Gardner put up on the wiki about supervise). It works wonderfully,
> and paster serve has been incredibly solid, to date it hasn't even
> crashed (which supervise would instantly restart). This solution is
> also rather resource efficient, when your app is loaded in
> mod_python, it typically means the apache process handling your
> request increasing in ram. So if you want to handle 100 simultaneous
> requests, under Apache pre-fork, you'll have 100 apache processes....
> which means 100 Pylons apps loaded into ram.

Which is easily avoided if you compile Apache to use the "worker" MPM
rather than "prefork". When using "worker" MPM you have more than one
thread in each Apache child process with the ability to handle
concurrent requests. At the same time you can still benefit somewhat
from multiple child Apache processes like "prefork". A typical
configuration may be:

<IfModule mpm_worker_module>
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
</IfModule>

Thus, the number of actual processes required is a lot lower.
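
The arithmetic behind that claim, as a small Python sketch. It uses
the values from the configuration above and assumes the simplification
that the number of child processes is capped at MaxClients divided by
ThreadsPerChild (ServerLimit and other tuning are ignored). The
function name is hypothetical.

```python
import math


def max_child_processes(max_clients, threads_per_child=1):
    """Upper bound on Apache child processes needed to serve max_clients.

    threads_per_child=1 models prefork (one request per process).
    """
    return math.ceil(max_clients / threads_per_child)


worker = max_child_processes(150, threads_per_child=25)
prefork = max_child_processes(150)
print("worker:", worker, "prefork:", prefork)
```

With these values the worker MPM tops out at 6 processes, where
prefork would need up to 150 for the same number of clients.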

> > mod_python is no longer everyone's favorite solution.
>
> Hopefully what I said above might help illustrate why as well. :)

Unfortunately mod_python gets quite a bad rap at times, and when one
delves into it, that is often based on experiences people have had
with quite old versions. That isn't to say that the current version
doesn't have problems, but version 3.3, which is just about to be
released, fixes just about all known issues and should be much, much
better. The new version also incorporates a lot of new features.
Thus, don't give up on mod_python completely, but it would be a good
idea to move to 3.3 as soon as you can.

Graham

Jose Galvez

Dec 14, 2006, 12:45:16 AM12/14/06
to pylons-...@googlegroups.com
Pylons is a WSGI-compliant web framework, and as such it can be served
and started in lots of ways. The way most of us do it is to use paster
because, frankly, Ian has made it really easy to run lots of WSGI
applications and middleware together with his very nice code. If you
look in your development.ini file you will find the [server:main]
section, which describes what server you want to run. If you run
paster's own HTTP server then you can use "use = egg:Paste#http" and
serve your Pylons app directly on port 5000 (or whatever port you
want). If you use FastCGI then you would use
"use = egg:PasteScript#flup_fcgi_thread", which uses the flup FastCGI
server to serve your application. If you want to use mod_python then
you can ignore [server:main], since the Pylons application will be
served via mod_python; the wiki has good instructions on how to set
that up and it runs very nicely. If you don't use mod_python then you
will have to choose one of the servers (http, fcgi, scgi, ajp).
Assuming you are using the http server on port 5000, you would turn
on mod_proxy in Apache and add the ProxyPass directives to your Apache
config so that your users can get to your application. For example, if
you wanted your users to get to your Pylons app from, say,
http://somehost.com/myapp, then in your Apache conf you would have:

ProxyPass /myapp http://localhost:5000
ProxyPassReverse /myapp http://localhost:5000
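
As a rough illustration of what those two directives do, here is a
deliberately simplified Python model of the path mapping. It is not
how Apache implements it (the real matching rules have more detail),
and the function names are hypothetical; only the /myapp prefix and
the http://localhost:5000 backend come from the example above.

```python
def proxy_pass(path, prefix="/myapp", backend="http://localhost:5000"):
    """Map a public request path to the backend URL, or None if not proxied."""
    if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
        # Strip the public prefix and hand the rest to the backend.
        return backend + path[len(prefix):]
    return None


def proxy_pass_reverse(location, prefix="/myapp", backend="http://localhost:5000"):
    """Rewrite a backend Location header back to the public path."""
    if location.startswith(backend):
        return prefix + location[len(backend):]
    return location


print(proxy_pass("/myapp/hello"))
print(proxy_pass_reverse("http://localhost:5000/login"))
```

The second function is why ProxyPassReverse matters: without it, a
redirect issued by the app would send the browser to localhost:5000
instead of back through Apache.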

From one of your other emails it looks like you are trying to install
your Pylons application under your httpdocs folder so that it is
visible to Apache. If this is correct, you really don't need to do
that, and it's probably not what you want: the code doesn't run the
way a CGI script does, so Apache really doesn't need to know where
your files are.

I think the main problem you're having is that there are too many
choices (a problem that lots of new users have). I personally would
pick either mod_python or the paster http server with mod_proxy. Of
those two I think mod_proxy is the easier to set up, so I recommend
that for newcomers.
Jose

Kendall Clark

Dec 14, 2006, 1:50:04 AM12/14/06
to pylons-...@googlegroups.com

On Dec 13, 2006, at 3:46 PM, Ben Bangert wrote:

> nginx (though I'd imagine squid could almost work as well) on
> the front to handle requests, dispatching to a pool of Pylons
> processes on the back with Fast-CGI. nginx implements Fast-CGI
> differently than Apache, using intelligent keep-alives on the
> connections and multiplexing requests to the backend pool.

Ben,

Has Bob documented this setup publicly? This is a real tease and I'd
like to see details. :>

Cheers,
Kendall


Ramon Diaz-Uriarte

Dec 14, 2006, 8:01:38 AM12/14/06
to pylons-...@googlegroups.com, Oscar Rueda, Andrés Cañada, Andreu Alibés, marian...@cnio.es
Thank you _very much_ to all of you for your detailed responses and
your patience: this has been like taking Pylons 101 (plus "the Apache
you should have known before getting into this"). Using the proxy
approach works just fine.

I am pasting below some notes I took to document what I did right and
wrong, and links to several of your messages. Maybe this can be of
some use to other newbies like me? (In addition to my ignorance,
several of you pointed out correctly that I was getting lost because
of the wide variety of options). As soon as we have our setup fully up
and running, we will provide a complete document where we detail other
stuff (how we deal with MPI, dependencies from R, etc).

Best,

R.

P.S. Is there any "canonical way" of citing Pylons? We are writing a
paper where we want to cite Pylons.

**************************************
(This uses emacs org-mode.)
******************************************

Notes for how to deploy our apps. They run via Pylons and
paster. What about more complex set-ups?

Most of these notes come from the thread
[[http://groups.google.com/group/pylons-discuss/browse_frm/thread/4bb9b20d4724d8ae?]]


Starting on 2006-12-07 I tried to get Pylons and Apache to work
together. Tried a bunch of things, unsuccessfully. (Some historical traces
are available either on the current code base or in the repository, with
names such as dispatch.cgi, asterias_disptatch.cgi and
asterias_dispatch.fcgi, and sample_httpd.conf).


I ask on pylons-discuss. Lots of extremely useful answers!!! First, I am
obviously confused about how Pylons, Apache, FastCGI, SCGI, CGI, etc, all
talk to each other. Now I see that the info is in the docs and wiki, but I
just didn't see it (or read it right). Important background info in the
messages (which are like a compressed "Pylons 101"):

Jose Galvez's 2006-12-13:
[[http://groups.google.com/group/pylons-discuss/tree/browse_frm/thread/4bb9b20d4724d8ae/12e938531d04af61?rnum=1&_done=%2Fgroup%2Fpylons-discuss%2Fbrowse_frm%2Fthread%2F4bb9b20d4724d8ae%3F#doc_7f4d89a8ce679d09]]

James Gardner's 2006-12-13:
[[http://groups.google.com/group/pylons-discuss/tree/browse_frm/thread/4bb9b20d4724d8ae/12e938531d04af61?rnum=1&_done=%2Fgroup%2Fpylons-discuss%2Fbrowse_frm%2Fthread%2F4bb9b20d4724d8ae%3F#doc_08e4e54d15682cb1]]

Shannon -jj Behrens's, 2006-12-13:
[[http://groups.google.com/group/pylons-discuss/tree/browse_frm/thread/4bb9b20d4724d8ae/5e184a674437f6f2?rnum=11&_done=%2Fgroup%2Fpylons-discuss%2Fbrowse_frm%2Fthread%2F4bb9b20d4724d8ae%3F#doc_723ce609ea1f2c3a]]

Ben Bangert, on why having Apache before paster is good:
[[http://groups.google.com/group/pylons-discuss/tree/browse_frm/thread/4bb9b20d4724d8ae/5e184a674437f6f2?rnum=11&_done=%2Fgroup%2Fpylons-discuss%2Fbrowse_frm%2Fthread%2F4bb9b20d4724d8ae%3F%5D%5D%3D%26#doc_723ce609ea1f2c3a]]


On 2006-12-13 I thought it was for sure gonna be CGI. The most attractive
thing here is that I would not need to start paster (i.e., I would not
need to monitor it); FastCGI seems unwarranted in our case, mod_python
is not an option (we are using Apache 1.3), and SCGI I had never heard
of. However, I ran into the problem I describe in:

[[http://groups.google.com/group/pylons-discuss/tree/browse_frm/thread/4bb9b20d4724d8ae/12e938531d04af61?rnum=11&_done=%2Fgroup%2Fpylons-discuss%2Fbrowse_frm%2Fthread%2F4bb9b20d4724d8ae%3F#doc_797f64ff3f099bf2]]


I ask again in the list, and I get a whole bunch of very useful
answers. Including Ben Bangert, who now seems to be using Apache as proxy
and paster in the back. So I decide to try using paster in the back, and
Apache with mod_proxy. Instructions and details in this email by Jose
Galvez:
[[http://groups.google.com/group/pylons-discuss/tree/browse_frm/thread/4bb9b20d4724d8ae/5e184a674437f6f2?rnum=11&_done=%2Fgroup%2Fpylons-discuss%2Fbrowse_frm%2Fthread%2F4bb9b20d4724d8ae%3F%5D%5D%3D%26#doc_0367b653aa1d43cb]]


A few things to watch out for (from JJ's message):

If you're proxying, you have to make sure that Paster gives a
response to Apache within a reasonable amount of time, otherwise Apache
will time out. For long running tasks like this, I like to return a
response immediately that tells the user "Yeah, yeah, I'm going to do
what you asked." Then I can do the processing in the background. Then,
you can use AJAX to poll the Web server to see if it's ready to produce
some output.

I think we should be OK here, with our "checkdone" approach, that
periodically has the server send the client a message saying "we are
working on it, this page will be autorefreshed".
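
The pattern JJ describes (answer right away, work in the background,
let the client poll) might look roughly like this in Python. This is a
sketch only: start_job and check_done are hypothetical names, not
Pylons or Paste APIs, and the in-memory job table assumes a single
paster process serving all requests.

```python
import threading
import uuid

_jobs = {}                 # job id -> {"done": bool, "result": object}
_lock = threading.Lock()   # protects _jobs across request threads


def _run(job_id, work):
    """Run the slow task, then record its result."""
    result = work()
    with _lock:
        _jobs[job_id] = {"done": True, "result": result}


def start_job(work):
    """Kick off work() in a background thread and return a job id at once."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {"done": False, "result": None}
    threading.Thread(target=_run, args=(job_id, work), daemon=True).start()
    return job_id


def check_done(job_id):
    """What a polling request (AJAX or auto-refresh) would call."""
    with _lock:
        return dict(_jobs[job_id])
```

A controller would call start_job() and immediately return a "we are
working on it" page carrying the job id; each later poll calls
check_done() and shows the result once "done" is true.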


Ben Bangert, on supervising the Paster processes
([[http://groups.google.com/group/pylons-discuss/tree/browse_frm/thread/4bb9b20d4724d8ae/5e184a674437f6f2?rnum=11&_done=%2Fgroup%2Fpylons-discuss%2Fbrowse_frm%2Fthread%2F4bb9b20d4724d8ae%3F%5D%5D%3D%26#doc_723ce609ea1f2c3a]]
)

I should also note that since then, I've gone to a proxy setup. I
have djb's supervise process running which ensures my paster serve
process is always up and running (mostly following the how-to James
Gardner put up on the wiki about supervise). It works wonderfully,
and paster serve has been incredibly solid, to date it hasn't even
crashed (which supervise would instantly restart). This solution is
also rather resource efficient, when your app is loaded in
mod_python, it typically means the apache process handling your
request increasing in ram. So if you want to handle 100 simultaneous
requests, under Apache pre-fork, you'll have 100 apache processes....
which means 100 Pylons apps loaded into ram.

The link to the supervise stuff is:
[[http://pylonshq.com/project/pylonshq/wiki/DaemonTools]]

From what I see, we could try to integrate this with the scripts that
automatically check that MPI is up, that nodes are accessible, etc. Maybe
better to integrate into our LVS setup?
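
To make clear what "supervising" amounts to, here is a minimal Python
sketch of the restart-on-exit idea. It is not daemontools and does not
follow its run-script conventions; keep_running and its parameters are
hypothetical, and a real deployment would use supervise itself as the
wiki page describes.

```python
import subprocess
import sys
import time


def keep_running(cmd, max_runs=None, delay=1.0):
    """Run cmd, restarting it each time it exits; return the exit codes seen.

    max_runs=None means restart forever, which is what a supervisor does.
    """
    exit_codes = []
    while max_runs is None or len(exit_codes) < max_runs:
        exit_codes.append(subprocess.call(cmd))
        time.sleep(delay)  # avoid a tight loop if the process dies at once
    return exit_codes


if __name__ == "__main__":
    # Bounded demo with a child that exits immediately with status 3.
    demo = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(keep_running(demo, max_runs=2, delay=0))
```

For our case the supervised command would be something like
["paster", "serve", "asterias.ini"] with max_runs left at None.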


------

I tried the proxy thing. Works just fine. In httpd.conf I add:

####################

LoadModule proxy_module /usr/lib/apache/1.3/libproxy.so

ProxyRequests off

<VirtualHost adacgh.bioinfo.cnio.es>
ServerAdmin rdi...@gmail.com
ServerName adacgh.bioinfo.cnio.es
ErrorLog /http/Asterias-Pylons/log/adacgh_error.log
TransferLog /http/Asterias-Pylons/log/adacgh_access.log
ProxyPass / http://localhost:5000/adacgh/
ProxyPassReverse / http://localhost:5000/adacgh/
</VirtualHost>
##########

and the asterias.ini file has:
####################
[server:main]
use = egg:Paste#http
host = localhost
port = 5000

[app:main]
use = egg:asterias

###########
and, sure enough, when I go to http://adacgh.bioinfo.cnio.es the thing works.
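
One thing that is easy to get wrong is letting the port in the ini
file drift away from the ProxyPass target. A small Python sketch of a
sanity check, assuming the [server:main] values shown above; the
backend_url helper is hypothetical, and a real Pylons ini with
%(here)s entries might need interpolation settings this sketch does
not handle.

```python
import configparser

# The same settings as in asterias.ini above, inlined for the example.
INI_TEXT = """
[server:main]
use = egg:Paste#http
host = localhost
port = 5000

[app:main]
use = egg:asterias
"""


def backend_url(ini_text):
    """Build the URL that paster's HTTP server will listen on."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    server = parser["server:main"]
    return "http://%s:%d" % (server["host"], server.getint("port"))


proxy_target = "http://localhost:5000/adacgh/"  # from the ProxyPass line
print(backend_url(INI_TEXT), proxy_target.startswith(backend_url(INI_TEXT)))
```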

Ben Bangert

Dec 14, 2006, 1:02:29 PM12/14/06
to pylons-...@googlegroups.com
On Dec 13, 2006, at 10:50 PM, Kendall Clark wrote:

>> nginx (though I'd imagine squid could almost work as well) on
>> the front to handle requests, dispatching to a pool of Pylons
>> processes on the back with Fast-CGI. nginx implements Fast-CGI
>> differently than Apache, using intelligent keep-alives on the
>> connections and multiplexing requests to the backend pool.
>
> Ben,
>
> Has Bob documented this setup publicly? This is real tease and I'd
> like to see details. :>

Afaik, Bob hasn't posted any more details on his setup than his use
of nginx, Pylons, and Erlang, which he mentioned here:
http://bob.pythonmac.org/archives/2006/11/21/mochiads-flash-game-ad-network/

There are several documents out there on setting up nginx to a
Fast-CGI process pool, which I'd assume is the same setup he's using.
He has another post about his experiences of nginx as well:
http://bob.pythonmac.org/archives/2006/09/13/nginx-reverse-proxy-panacea/

HTH,
Ben

Ben Bangert

Dec 14, 2006, 1:08:30 PM12/14/06
to pylons-...@googlegroups.com
On Dec 13, 2006, at 8:08 PM, Graham Dumpleton wrote:

> Which is easily avoided if you compile Apache to use the "worker" MPM
> rather than "prefork". When using "worker" MPM you have more than one
> thread in each Apache child process with the ability to handle
> concurrent requests. At the same time you can still benefit somewhat
> from multiple child Apache processes like "prefork". A typical
> configuration may be:
>
> <IfModule mpm_worker_module>
> StartServers 2
> MaxClients 150
> MinSpareThreads 25
> MaxSpareThreads 75
> ThreadsPerChild 25
> MaxRequestsPerChild 0
> </IfModule>
>
> Thus, number of actual processes required is a lot less.

Yep, very true. Though note that given such a setup you're maxed out
at 150 connections, whereas the async-based front ends capable of
reverse proxying, like squid and nginx, spawn no additional threads or
processes per connection and can easily handle thousands of
connections. I rather like squid as it can do caching, which saves me
hits to back-end app servers entirely, giving me a nice bump in speed.
Of course, one has to be proactive in setting HTTP cache headers in
their dynamic pages to get the most out of this setup.
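
A minimal sketch of what setting such a header looks like, written as
a bare WSGI app in Python rather than a real Pylons controller. The
five-minute max-age and the app name are arbitrary assumptions; pick
values that suit how stale a page may get.

```python
def cacheable_app(environ, start_response):
    """Tiny WSGI app whose response a caching proxy is allowed to store."""
    body = b"Hello from a dynamic page\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
        # Tell shared caches they may reuse this response for 300 seconds.
        ("Cache-Control", "public, max-age=300"),
    ]
    start_response("200 OK", headers)
    return [body]
```

With a header like this, repeat requests within the window can be
answered by the cache without ever reaching the app server.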

> Unfortunately mod_python gets quite a bad rap at times and where when
> one delves into it is often based on experiences people have had with
> quite old versions. That isn't to say that the current version doesn't
> have problems, but version 3.3 which is just about to be released fixes
> just about all known issues and should be much much better. The new
> version also incorporates a lot of new features. Thus, don't give up on
> mod_python completely, but would be a good idea to move to 3.3 as soon
> as you can.

I've had great experiences with mod_python as well, in my particular
setup it just wasn't the best choice. YMMV :)

Cheers,
Ben

Graham Dumpleton

Dec 14, 2006, 3:52:30 PM12/14/06
to pylons-discuss

That configuration was the default for Apache 2.2; you can change it.

FWIW, the default for 'prefork' when using Apache 2.2 is:

<IfModule mpm_prefork_module>
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 150
MaxRequestsPerChild 0
</IfModule>

Thus it has a max clients setting of 150 as well. Anyone setting up a
serious web site would review any such defaults and modify as
appropriate. If your hardware supports it and the amount of site
traffic warrants it, then by all means change the values.

But then you are talking purely about proxying as well, so comparing
it to Apache may not be fair in that case, as Apache could be doing a
whole lot more than proxying. The application could, for example, be
running within Apache and not behind it, so proxying performance
would not be an issue then.

Graham
