High-performance Python websites


Nick Mellor

Nov 25, 2009, 7:24:57 PM
Hi all,

I'm contemplating setting up a Python-powered website for the tourist
industry, which will involve a web service, a good deal of XML
processing, and a Django-powered front-end. If the project works, it
could get a lot of traffic. I'm sure it can be done, but I'm looking
to find out more about how existing high-volume Python sites have
managed their workload. Can anyone give me examples of high-volume
Python-powered websites, if possible with some idea of their
architecture?

Many thanks,

Nick

ShoqulKutlu

Nov 25, 2009, 7:33:31 PM
Hi,

Handling a high volume of visitors is a common problem for every
kind of web technology; it is not specific to Python. It is mostly a
matter of server-level design. You need load balancing for both the
web servers and the databases so that your site can respond to many
concurrent visitors. A well-programmed website is of course key to
performance, but in your case I would also suggest working out how
much hardware, how many web servers and how many database clusters
you need, and which database server you will use now and in the
future.
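
As a rough illustration of the load-balancing idea above, here is a
minimal sketch of round-robin dispatch, one of the simplest balancing
policies. The class name and the backend addresses are hypothetical; in
practice this job is usually done by a dedicated load balancer or
reverse proxy rather than by application code.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backend addresses in a fixed rotation."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._cycle = itertools.cycle(self._backends)

    def next_backend(self):
        # Each call returns the next backend; the rotation wraps around.
        return next(self._cycle)


# Hypothetical addresses, for illustration only.
balancer = RoundRobinBalancer(["10.0.0.1:8000", "10.0.0.2:8000", "10.0.0.3:8000"])
picks = [balancer.next_backend() for _ in range(6)]
```

Six consecutive requests are spread evenly, two per backend, which is
why this only works well when any web server can handle any request.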

Regards,
Kutlu

ShoqulKutlu

Nov 25, 2009, 7:45:27 PM
If you need an example, one is right in front of you and you are using
it now. Google uses Python on thousands of its servers. But as you can
see, Google gets this performance from the power of hundreds of
thousands of servers. The main idea behind the Google search engine was
to index the whole web across hundreds of servers, so you can imagine
where Google's power comes from.

Regards


Nick Mellor

Nov 26, 2009, 12:21:25 AM
Thanks Kutlu,

I wasn't aware that Google used Python for running their Google groups
servers. Can you confirm that? The only place
I've seen Google explicitly use Python on their web front end is in
the Google Ads tests.

I am impressed by the responsiveness of lawrence.com, ljworld.com and
others on the Django home page (http://www.djangoproject.com/)

They seem to do a great job of loading large, complex pages using
Django (stacked on Python, stacked on bytecode, stacked on C.)
Shows it can be done.

Nick

ShoqulKutlu

Nov 26, 2009, 12:59:22 AM
Hi Nick,

Sorry, my remark about Google caused some confusion about Google
Groups. I didn't mean to say exactly where Google uses Python; I only
mentioned that Google uses Python. A Google employee said in an
interview that they run Python on thousands of their servers, and I
wanted to pass that on.
Of course it can be done, and it will be no worse than any other
framework; I bet it can even be better than Java and ASP.NET if
configured and programmed well. I really encourage you to use
mod_python for any project. mod_python and mod_wsgi have made Python
very powerful on the web side. As I said in my previous message, a web
application's responsiveness depends on several things: a good web
framework, server speed, database design and so on. In this case you
want to use Django, which as far as I know was built for mod_python and
can be configured to run on a mod_wsgi web server. Suppose you have one
million members on your site. That traffic simply needs several
clustered web servers and clustered databases, which means you provide
load balancing, so concurrent user sessions will be spread across
different web servers. You can do your best with such a clustered
system and such a powerful language. Really, don't worry about that.
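
One way to picture sessions being spread across web servers is session
affinity: hash each session ID to pick a server, so the same user keeps
landing on the same machine. This is a minimal sketch under that
assumption; `pick_server` and the host names are hypothetical, and this
is not something Django or mod_wsgi does for you.

```python
import hashlib


def pick_server(session_id, servers):
    """Map a session ID to one server, the same one on every call."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # into the range of available servers.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


SERVERS = ["web1", "web2", "web3"]  # hypothetical host names

chosen = pick_server("session-abc", SERVERS)
```

The trade-off is that changing the number of servers remaps most
sessions, so many sites instead keep session data in a shared store
that every web server can read.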

Regards,
Kutlu

ShoqulKutlu

Nov 26, 2009, 1:09:30 AM
Hi again,

I also want to say something about creating a better web application.
If I were you I wouldn't use any web framework, Django or otherwise.
If you want to create a commercial project and manage all the modules
of the application yourself, I suggest you create your own framework.
I don't mean a Django-like framework. Determine your needs, modules
and services, and build your own modules. Use other independent open
source modules for specific tasks. A successful project requires your
own business logic, and you can implement it with your own algorithms.
Don't avoid picking up existing modules for templating, XML parsing
and DB connectivity, but build your own framework from these and your
own modules.

Regards,
Kutlu

Bruno Desthuilliers

Nov 26, 2009, 11:26:47 AM
Nick Mellor wrote:

YouTube once used quite a lot of Python, IIRC. You may be able to find
relevant information on the net.

While I may disagree with Kutlu on some points[1], it's clear that the
key to handling huge traffic is the ability to scale up. So better to
avoid solutions that make it hard - or impossible - to set up load
balancing, replication etc. Now that doesn't mean that decent
performance and reasonable memory usage are not a concern - even a
simple website with moderate traffic can become a PITA if you choose the
wrong tools / architecture (Plone performance problems, anyone?).

Anyway: just make sure your solution is both simple enough to avoid
becoming a resource-eater yet serious enough to allow for fine-grained
caching, load-balancing and the like.
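
To make "fine-grained caching" concrete, here is a minimal sketch of a
time-limited cache for individual page fragments. The decorator
`ttl_cache` and the function `render_fragment` are hypothetical names
for illustration; a real site would more likely use its framework's
cache API backed by something like memcached.

```python
import functools
import time


def ttl_cache(seconds, clock=time.monotonic):
    """Cache a function's results per argument tuple for `seconds`."""
    def decorator(func):
        store = {}  # maps args -> (expiry_time, value)

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args)
            store[args] = (now + seconds, value)
            return value
        return wrapper
    return decorator


calls = []


@ttl_cache(seconds=60)
def render_fragment(name):
    # Stand-in for an expensive template render or database query.
    calls.append(name)
    return "<div>%s</div>" % name


first = render_fragment("sidebar")
second = render_fragment("sidebar")  # served from the cache
```

Only the first call does the work; the second is answered from memory
until the entry expires. Caching per fragment rather than per page lets
the slow-changing parts of a page be reused across many requests.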


[1] like reinventing your own framework - whatever architecture
(including non-blocking IO/event-based server like Twisted) you settle
on, chances are most of the grunt work has already been done, and
probably better than what you could come up with in a reasonable amount
of time - unless you have a really BIG budget of course.

Nick Mellor

Nov 26, 2009, 8:57:48 PM
On Nov 27, 3:26 am, Bruno Desthuilliers
<bruno.42.desthuilli...@websiteburo.invalid> wrote:
> Nick Mellor wrote:

Bruno and Kutlu,

It's a small start-up project.

By mentioning lawrence and other Django-powered websites, I was
pointing out that the problem of creating a high-performance
web solution using Python has already been solved in several places,
and the lessons learned have been given back to us
in the form of products like Django. I tend to agree with Bruno that
I'm unlikely to do a
better job than Django.

Thanks for your responses,

Nick

Luis M. González

Nov 26, 2009, 10:12:42 PM

You should know that there's a Google service called Google App Engine
that lets you host your website on Google's own infrastructure (this
is known nowadays as "cloud computing").
It's free to start (as long as you don't exceed the free quotas for
space and traffic, which are quite generous).
The good thing is that you don't have to think about scaling issues or
about your site's overall architecture or hardware. It's the whole
Google infrastructure at your disposal, which can scale from one user
to tens of thousands without your having to change anything.
Simply code your site correctly in Python or Java, using Django or any
other WSGI-compliant framework, and you are set to go.

Check it out: http://code.google.com/appengine/docs/whatisgoogleappengine.html

Luis

inhahe

Nov 30, 2009, 1:55:55 PM
to ShoqulKutlu, pytho...@python.org
On Wed, Nov 25, 2009 at 7:33 PM, ShoqulKutlu <kursat...@gmail.com> wrote:
> Hi,
>
> Managing load of high volume of visitors is a common issue for all
> kind of web technologies. I mean this is not the python issue. This
> issue is mostly about server level designs. You need to supply load
> balancing for both web servers and databases to make your web site
> able to respond to several concurrent visitors. Of course a good
> programmed website is a key performance issue but for your mention I
> would also suggest considering how many hardwares, how many
> webservers, how many database cluster and which database server should
> be used or will be used in the future..
>

I don't know a lot about this issue, but take apache + php. every
time a page is loaded a new instance of php is loaded to run the
page, so i imagine load balancing can easily be done on the page
request level by distributing instances of php processes.
whereas if you use python, you don't really want to load the python
interpreter for every page request. as far as i can tell, the
canonical way is to have one app for the whole website that's
constantly running and communicates with the server via WSGI. or is
that wrong? and wouldn't that make load balancing a little bit more
tricky, or at least different? not sure..
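
For reference, this is roughly what the long-running WSGI arrangement
described above looks like: a minimal sketch using only the standard
library. The names `application`, `REQUEST_COUNT` and `call_app` are
illustrative; `call_app` just invokes the app in-process the way a WSGI
server would.

```python
from wsgiref.util import setup_testing_defaults

REQUEST_COUNT = {"n": 0}  # lives as long as the process does


def application(environ, start_response):
    """A minimal WSGI callable: loaded once, invoked once per request."""
    REQUEST_COUNT["n"] += 1
    body = ("Hello from request %d\n" % REQUEST_COUNT["n"]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app):
    """Invoke a WSGI app once in-process, as a server would per request."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a plausible fake request
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status1, body1 = call_app(application)
status2, body2 = call_app(application)
```

The counter survives between requests because the interpreter is not
reloaded each time. It is also per-process, so with several such
processes behind a load balancer each would keep its own count, which
is why shared state normally goes in a database or cache instead. For
local experiments, `wsgiref.simple_server.make_server` can serve the
same `application` over HTTP.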

Rami Chowdhury

Dec 1, 2009, 11:57:12 PM
to pytho...@python.org, ShoqulKutlu
On Monday 30 November 2009 10:55:55 inhahe wrote:
> On Wed, Nov 25, 2009 at 7:33 PM, ShoqulKutlu <kursat...@gmail.com> wrote:
> > Hi,
> >
> > Managing load of high volume of visitors is a common issue for all
> > kind of web technologies. I mean this is not the python issue. This
> > issue is mostly about server level designs. You need to supply load
> > balancing for both web servers and databases to make your web site
> > able to respond to several concurrent visitors. Of course a good
> > programmed website is a key performance issue but for your mention
> > I would also suggest considering how many hardwares, how many
> > webservers, how many database cluster and which database server
> > should be used or will be used in the future..
>
> I don't know a lot about this issue, but take apache + php. every
> time a page is loaded a new instance of php is loaded to run the
> page,

AFAIK that's only the case for PHP-CGI, and Python as a CGI scripting
language is used the same way. Apache is very often run with mod_php,
though, which embeds the PHP interpreter; mod_python does something
similar for Python.


----
Rami Chowdhury
"As an online discussion grows longer, the probability of a comparison
involving Nazis or Hitler approaches one." -- Godwin's Law
408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD)

Bruno Desthuilliers

Dec 2, 2009, 3:52:32 AM
Rami Chowdhury wrote:

> On Monday 30 November 2009 10:55:55 inhahe wrote:
>> On Wed, Nov 25, 2009 at 7:33 PM, ShoqulKutlu <kursat...@gmail.com> wrote:
>>> Hi,
>>>
>>> Managing load of high volume of visitors is a common issue for all
>>> kind of web technologies. I mean this is not the python issue. This
>>> issue is mostly about server level designs. You need to supply load
>>> balancing for both web servers and databases to make your web site
>>> able to respond to several concurrent visitors. Of course a good
>>> programmed website is a key performance issue but for your mention
>>> I would also suggest considering how many hardwares, how many
>>> webservers, how many database cluster and which database server
>>> should be used or will be used in the future..
>> I don't know a lot about this issue, but take apache + php. every
>> time a page is loaded a new instance of php is loaded to run the
>> page,
>
> AFAIK that's only the case for PHP-CGI, and Python as a CGI scripting
> language is used the same way.

Yeps.

> Apache is very often run with mod_php,
> though, which embeds the PHP interpreter; mod_python does something
> similar for Python.

Indeed.

And FWIW, this has very little impact wrt/ load balancing issues.

Dr. Marco

Dec 5, 2009, 7:13:59 PM
On Wed, 25 Nov 2009 21:21:25 -0800 (PST),
Nick Mellor <nick.mell...@pobox.com> wrote:
Hi,

> I wasn't aware that Google used Python for running their Google groups
> servers. Can you confirm that? The only place
> I've seen Google explicitly use Python on their web front end is in
> the Google Ads tests.
>
> I am impressed by the responsiveness of lawrence.com, ljworld.com and
> others on the Django home page (http://www.djangoproject.com/)

I'm running two sites using Django which are certainly infinitely more
modest and much less visited than the ones you quote but which
nevertheless are extremely responsive compared to other frameworks
tested, using a single machine (http://nonlineaire.univ-lille1.fr/SNL/
and http://nonlineaire.univ-lille1.fr/GDR3070/).

--
Dr. Marco

Aaron Watters

Dec 6, 2009, 9:38:55 PM
The key to scaling a web site is to make
sure you can load-balance to as many front
ends as needed and then use a common database
backend that is fast enough or possibly a
common file system that is fast enough.
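
A minimal sketch of that arrangement: several front ends that hold no
state of their own, all reading and writing one common backend. Here an
in-memory SQLite database stands in for the shared database server, and
the `FrontEnd` class and `visits` table are hypothetical names used only
for illustration.

```python
import sqlite3


class FrontEnd:
    """A stateless front end: all persistent data lives in the backend."""

    def __init__(self, db):
        self.db = db

    def record_visit(self, page):
        # Try to bump an existing counter; insert a new row if none exists.
        cur = self.db.execute(
            "UPDATE visits SET hits = hits + 1 WHERE page = ?", (page,))
        if cur.rowcount == 0:
            self.db.execute(
                "INSERT INTO visits (page, hits) VALUES (?, 1)", (page,))
        self.db.commit()

    def hits(self, page):
        row = self.db.execute(
            "SELECT hits FROM visits WHERE page = ?", (page,)).fetchone()
        return row[0] if row else 0


# One shared backend, two interchangeable front ends.
shared_db = sqlite3.connect(":memory:")
shared_db.execute(
    "CREATE TABLE visits (page TEXT PRIMARY KEY, hits INTEGER NOT NULL)")
front_a = FrontEnd(shared_db)
front_b = FrontEnd(shared_db)

front_a.record_visit("/home")
front_b.record_visit("/home")
total = front_a.hits("/home")
```

Because neither front end keeps data locally, a request can go to
either one and see the same totals, so more front ends can be added
until the shared backend becomes the bottleneck.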

I can't speak to Django specifically but
you can certainly get essentially unlimited
scalability on the front-end side of the
equation using a Python based web app.

The Google App Engine will set such a
configuration up for you automatically, but
they are still working some bugs out
with regard to performance, I think,
based on my experience here
http://whiffdoc.appspot.com/
and here
http://listtree.appspot.com/

I hope that helps.
-- Aaron Watters

===
an apple every 8 hours
will keep 3 doctors away. -kliban
