I'm contemplating setting up a Python-powered website for the tourist
industry, which will involve a web service, a good deal of XML
processing, and a Django-powered front-end. If the project works, it
could get a lot of traffic. I'm sure it can be done, but I'm looking
to find out more about how existing high-volume Python sites have
managed their workload. Can anyone give me examples of high-volume
Python-powered websites, if possible with some idea of their
Managing the load from a high volume of visitors is a common issue for
all kinds of web technologies; it is not a Python issue. It is mostly
a matter of server-level design. You need load balancing for both the
web servers and the databases so that your site can respond to many
concurrent visitors. Of course a well-programmed website is key to
performance, but in your case I would also suggest considering how
much hardware, how many web servers and how many database clusters you
need, and which database server is used now or will be used in the
future.
I wasn't aware that Google used Python for running their Google groups
servers. Can you confirm that? The only place
I've seen Google explicitly use Python on their web front end is in
the Google Ads tests.
They seem to do a great job of loading large, complex pages using
Django (stacked on Python, stacked on bytecode, stacked on C.)
Shows it can be done.
Sorry, my remark about Google caused some confusion about Google
Groups. I didn't mean to say where exactly Google uses Python; I only
mentioned that Google uses Python. A Google representative said in an
interview that they run Python on thousands of their servers. That
claim is why I brought it up.
Actually, of course it can be done, and it will be no worse than with
any other framework; I bet it can be better than Java and ASP.NET if
configured and programmed well. I really encourage you to use
mod_python for any project. mod_python and mod_wsgi have made Python
very powerful on the web side. As I said in my previous message, a web
application's responsiveness depends on several things: a good web
framework, server speed, database design and so on. In this case you
want to use Django, which as far as I know was built for mod_python and
can be configured to run under mod_wsgi. Suppose you have one million
members on your site. That traffic simply needs several clustered web
servers and clustered databases, which means you have to provide load
balancing, so that concurrent user sessions are spread across different
web servers. You can do very well with such a clustered system and
such a powerful language. Really, don't worry about that.
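To make the load-balancing idea above a bit more concrete, here is a
minimal sketch of round-robin distribution. The backend names (web1,
web2, web3) and the RoundRobinBalancer class are made up for
illustration; a real site would put a dedicated balancer in front of
the web servers rather than Python code like this.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backends in rotation (hypothetical illustration only)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        # cycle() repeats the backend list forever, in order.
        self._cycle = itertools.cycle(self._backends)

    def pick(self):
        """Return the backend that should serve the next request."""
        return next(self._cycle)


# Hypothetical server names, not taken from any real deployment.
balancer = RoundRobinBalancer(["web1", "web2", "web3"])
assignments = [balancer.pick() for _ in range(6)]
print(assignments)
```

Since consecutive requests from one visitor may land on different
servers, session data has to live somewhere every server can reach,
for example in the database.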
I also want to say something about creating a better web application.
If I were you, I wouldn't use any web framework, Django or otherwise.
If you want to create a commercial project and manage all modules of
the application yourself, I suggest you create your own framework. I
don't mean a Django-like framework. Determine your needs, modules and
services, and build your own modules. Use independent open source
modules for specific tasks. A successful project requires your own
business logic, and you can implement it with your own algorithms.
Don't hesitate to pick up existing modules for templating, XML parsing
and DB connectivity, but build your own framework from these and your
own modules.
YouTube once used quite a lot of Python, IIRC. You may be able to find
relevant information on the net.
While I may disagree with Kutlu on some points, it's clear that the
key to handling huge traffic is the ability to scale up. So better to
avoid solutions that make it hard - or impossible - to setup load
balancing, replication etc. Now that doesn't mean that decent
performance and reasonable memory usage are not a concern - even a
simple website with moderate traffic can become a PITA if you choose the
wrong tools / architecture (Plone perf problems, anyone?).
Anyway: just make sure your solution is both simple enough to avoid
becoming a resource-eater yet serious enough to allow for fine-grained
caching, load-balancing and the like.
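As a rough illustration of the caching point, here is a sketch of a
small time-based cache wrapped around an expensive function. The
TTLCache class, the parse_feed function and the 60-second lifetime are
all invented for the example; parse_feed just stands in for whatever
slow work the site does (XML processing, for instance).

```python
import time


class TTLCache:
    """Keep computed values for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expiry_time, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, recomputing it once expired."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._store[key] = (now + self._ttl, value)
        return value


calls = []


def parse_feed():
    """Hypothetical stand-in for slow work such as parsing an XML feed."""
    calls.append(1)
    return {"hotels": 42}


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_compute("feed", parse_feed)
second = cache.get_or_compute("feed", parse_feed)
print(first == second, len(calls))
```

The second lookup is served from the cache, so the slow function runs
only once within the lifetime.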
 like reinventing your own framework - whatever architecture
(including non-blocking IO/event-based server like Twisted) you settle
on, chances are most of the grunt work has already been done, and
probably better than what you could come with in a reasonable amount of
time - unless you have a really BIG budget of course.
Bruno and Kutlu,
It's a small start-up project.
By mentioning lawrence and other Django-powered websites, I was
pointing out that the problem of creating a high-performance
web solution using Python has already been solved in several places,
and the lessons learned have been given back to us
in the form of products like Django. I tend to agree with Bruno that
I'm unlikely to do a
better job than Django.
Thanks for your responses,
You should know that there's a Google service called Google App Engine
that lets you host your website on Google's own infrastructure (this
is known nowadays as "cloud computing").
It's free to start (as long as you don't exceed the free quotas of
space and traffic, which are quite generous).
The good thing is that you don't have to think about scaling issues or
about your site's overall architecture or hardware. You have the whole
Google infrastructure at your disposal, which can scale from one user
to tens of thousands without you having to change anything.
Simply code your site correctly in Python or Java, using Django or any
other WSGI-compliant framework, and you are set to go.
I don't know a lot about this issue, but take Apache + PHP. every
time a page is loaded a new instance of PHP is loaded to run the
page, so i imagine load balancing can easily be done on the page
request level by distributing instances of PHP processes.
whereas if you use python, you don't really want to load the python
interpreter for every page request. as far as i can tell, the
canonical way is to have one app for the whole website that's
constantly running and communicates with the server via WSGI. or is
that wrong? and wouldn't that make load balancing a little bit more
tricky, or at least different? not sure..
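For what it's worth, the long-running app described above can be
sketched with nothing but the standard library. A WSGI application is
just a callable taking environ and start_response; the names
application and call_app and the response text are only illustrative.
The server keeps the interpreter and this callable loaded between
requests instead of starting Python for every page.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI app: one callable serves every request."""
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from %s\n" % path).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path):
    """Invoke a WSGI app directly, the way a server would, without a socket."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = app(environ, start_response)
    return captured["status"], b"".join(chunks)


status, body = call_app(application, "/tours")
print(status, body)
```

Load balancing then just means running several copies of the same
process, on one machine or many, behind a balancer; the app itself
doesn't need to know about it.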
AFAIK that's only the case for PHP-CGI, and Python as a CGI scripting
language is used the same way. Apache is very often run with mod_php,
though, which embeds the PHP interpreter; mod_python does something
similar for Python.
"As an online discussion grows longer, the probability of a comparison
involving Nazis or Hitler approaches one." -- Godwin's Law
> Apache is very often run with mod_php,
> though, which embeds the PHP interpreter; mod_python does something
> similar for Python.
And FWIW, this has very little impact wrt/ load balancing issues.
> I am impressed by the responsiveness of lawrence.com, ljworld.com and
> others on the Django home page (http://www.djangoproject.com/)
I'm running two sites using Django which are certainly infinitely more
modest and much less visited than the ones you quote but which
nevertheless are extremely responsive compared to other frameworks
tested, using a single machine (http://nonlineaire.univ-lille1.fr/SNL/
I can't speak to Django specifically but
you can certainly get essentially unlimited
scalability on the front-end side of the
equation using a Python based web app.
Google App Engine will set such a
configuration up for you automatically, but
they are still working some bugs out
with regard to performance, I think,
based on my experience here
I hope that helps.
-- Aaron Watters
an apple every 8 hours
will keep 3 doctors away. -kliban