I'm contemplating setting up a Python-powered website for the tourist industry, which will involve a web service, a good deal of XML processing, and a Django-powered front-end. If the project works, it could get a lot of traffic. I'm sure it can be done, but I'm looking to find out more about how existing high-volume Python sites have managed their workload. Can anyone give me examples of high-volume Python-powered websites, if possible with some idea of their architecture?
Handling a high volume of visitors is a common issue for all kinds of web technologies; it is not a Python-specific issue. It is mostly about server-level design. You need load balancing for both web servers and databases so that your site can respond to many concurrent visitors. Of course a well-programmed website is key to performance, but in your case I would also suggest considering how much hardware, how many web servers, how many database clusters, and which database server you will use now or in the future.
Regards, Kutlu
On Nov 26, 2:24 am, Nick Mellor <nick.mellor.gro...@pobox.com> wrote:
> I'm contemplating setting up a Python-powered website for the tourist > industry, which will involve a web service, a good deal of XML > processing, and a Django-powered front-end. If the project works, it > could get a lot of traffic. I'm sure it can be done, but I'm looking > to find out more about how existing high-volume Python sites have > managed their workload. Can anyone give me examples of high-volume > Python-powered websites, if possible with some idea of their > architecture?
If you need an example, one is in front of you and you are using it now. Google uses Python on thousands of its servers. But as you can see, Google gets this performance from the power of hundreds of thousands of servers. The main idea behind the Google search engine was indexing the whole web across hundreds of servers. So you can imagine where Google's power comes from.
Regards
On Nov 26, 2:33 am, ShoqulKutlu <kursat.ku...@gmail.com> wrote:
> Managing load of high volume of visitors is a common issue for all > kind of web technologies. I mean this is not the python issue. This > issue is mostly about server level designs. You need to supply load > balancing for both web servers and databases to make your web site > able to respond to several concurrent visitors. Of course a good > programmed website is a key performance issue but for your mention I > would also suggest considering how many hardwares, how many > webservers, how many database cluster and which database server should > be used or will be used in the future..
> Regards, > Kutlu
> On Nov 26, 2:24 am, Nick Mellor <nick.mellor.gro...@pobox.com> wrote:
> > Hi all,
> > I'm contemplating setting up a Python-powered website for the tourist > > industry, which will involve a web service, a good deal of XML > > processing, and a Django-powered front-end. If the project works, it > > could get a lot of traffic. I'm sure it can be done, but I'm looking > > to find out more about how existing high-volume Python sites have > > managed their workload. Can anyone give me examples of high-volume > > Python-powered websites, if possible with some idea of their > > architecture?
I wasn't aware that Google used Python for running their Google groups servers. Can you confirm that? The only place I've seen Google explicitly use Python on their web front end is in the Google Ads tests.
I am impressed by the responsiveness of lawrence.com, ljworld.com and others on the Django home page (http://www.djangoproject.com/)
They seem to do a great job of loading large, complex pages using Django (stacked on Python, stacked on bytecode, stacked on C.) Shows it can be done.
Sorry for the confusion about Google; I did not mean Google Groups specifically. I was not pointing to a particular place where Google uses Python, only saying that Google uses Python. Someone from Google said in an interview that they run Python on thousands of their servers, and that is the claim I was passing on. Of course it can be done. Python will be no worse than any other platform, and I bet it can be better than Java and ASP.NET if configured and programmed well. I really encourage you to use mod_python for any project; mod_python and mod_wsgi have made Python very powerful on the web side. As I said in my previous message, a web application's responsiveness depends on several things: a good web framework, server speed, database design, and so on. In this case you want to use Django, which as far as I know was built for mod_python and can also be configured to run under mod_wsgi. Suppose you have one million members on your site. That traffic simply needs several clustered web servers and clustered databases, which means you provide load balancing, so that concurrent user sessions are spread across different web servers. With such a clustered system and such a powerful language you can do your best. Really, don't worry about that.
Regards, Kutlu
On Nov 26, 7:21 am, Nick Mellor <nick.mellor.gro...@pobox.com> wrote:
> I wasn't aware that Google used Python for running their Google groups > servers. Can you confirm that? The only place > I've seen Google explicitly use Python on their web front end is in > the Google Ads tests.
> I am impressed by the responsiveness of lawrence.com, ljworld.com and > others on the Django home page (http://www.djangoproject.com/)
> They seem to do a great job of loading large, complex pages using > Django (stacked on Python, stacked on bytecode, stacked on C.) > Shows it can be done.
I also want to say something about building a better web application. If I were you I wouldn't use any web framework, Django or otherwise. If you want to build a commercial project and manage all the modules of the application yourself, I suggest you create your own framework. I don't mean a Django-like framework. Determine your needs, modules, and services, and build your own modules. Use other independent open source modules for specific tasks. A successful project requires your own business logic, and you can implement that with your own algorithms. Don't hesitate to pull in other modules for templating, XML parsing, and DB connectivity, but build your own framework from these and your own modules.
> I'm contemplating setting up a Python-powered website for the tourist > industry, which will involve a web service, a good deal of XML > processing, and a Django-powered front-end. If the project works, it > could get a lot of traffic. I'm sure it can be done, but I'm looking > to find out more about how existing high-volume Python sites have > managed their workload. Can anyone give me examples of high-volume > Python-powered websites, if possible with some idea of their > architecture?
YouTube once used quite a lot of Python, IIRC. You may be able to find relevant information on the net.
While I may disagree with Kutlu on some points[1], it's clear that the key to handling huge traffic is the ability to scale up. So it is better to avoid solutions that make it hard, or impossible, to set up load balancing, replication, etc. That doesn't mean that decent performance and reasonable memory usage are not a concern: even a simple website with moderate traffic can become a PITA if you choose the wrong tools or architecture (Plone performance problems, anyone?).
Anyway: just make sure your solution is both simple enough to avoid becoming a resource-eater and serious enough to allow for fine-grained caching, load balancing and the like.
[1] like reinventing your own framework. Whatever architecture you settle on (including a non-blocking IO / event-based server like Twisted), chances are most of the grunt work has already been done, and probably better than what you could come up with in a reasonable amount of time, unless you have a really BIG budget of course.
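To make the "fine-grained caching" point a bit more concrete, here is a minimal sketch of a per-argument cache with an expiry time, using only the standard library. The names `ttl_cache`, `render_destination_page` and `calls` are made up for illustration; a real Django site would more likely use the framework's own cache layer in front of something like memcached.

```python
import time
from functools import wraps


def ttl_cache(seconds, clock=time.monotonic):
    """Cache a function's results per argument tuple for `seconds` seconds."""
    def decorator(func):
        store = {}  # maps args -> (expiry_time, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args)
            store[args] = (now + seconds, value)
            return value
        return wrapper
    return decorator


calls = []  # records every real (uncached) rendering, for demonstration only


@ttl_cache(seconds=60)
def render_destination_page(slug):
    """Hypothetical expensive view: imagine XML parsing and DB queries here."""
    calls.append(slug)
    return "<h1>%s</h1>" % slug
```

Calling `render_destination_page("paris")` twice within a minute does the expensive work once; a different slug gets its own cache entry. Note that this cache lives inside one process, so behind a load balancer each web server would keep its own copy unless the cache is moved to a shared service.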
42.desthuilli...@websiteburo.invalid> wrote: > Nick Mellor wrote:
> > Hi all,
> > I'm contemplating setting up a Python-powered website for the tourist > > industry, which will involve a web service, a good deal of XML > > processing, and a Django-powered front-end. If the project works, it > > could get a lot of traffic. I'm sure it can be done, but I'm looking > > to find out more about how existing high-volume Python sites have > > managed their workload. Can anyone give me examples of high-volume > > Python-powered websites, if possible with some idea of their > > architecture?
> youtube once used quite a lot of Python IIRC. You may be able to find > relevant infos on the net.
> While I may disagree with Kutlu on some points[1], it's clear that the > key to handling huge traffic is the ability to scale up. So better to > avoid solutions that make it hard - or impossible - to setup load > balancing, replication etc. Now that doesn't mean than decent > performance and reasonnable memory usage are not a concern - even a > simple website with moderate traffic can become a PITA if you choose the > wrong tools / architecture (Plone perfs problems anyone ?).
> Anyway : just make sure your solution is both simple enough to avoid > becoming a resource-eater yet serious enough to allow for fine-grained > caching, load-balancing and the like.
> [1] like reinventing your own framework - whatever architecture > (including non-blocking IO/event-based server like Twisted) you settle > on, chances are most of the grunt work has already been done, and > probably better than what you could come with in a reasonable amount of > time - unless you have a really BIG budget of course.
Bruno and Kutlu,
It's a small start-up project.
By mentioning lawrence and other Django-powered websites, I was pointing out that the problem of creating a high-performance web solution using Python has already been solved in several places, and the lessons learned have been given back to us in the form of products like Django. I tend to agree with Bruno that I'm unlikely to do a better job than Django.
On Nov 25, 4:24 pm, Nick Mellor <nick.mellor.gro...@pobox.com> wrote:
> Hi all,
> I'm contemplating setting up a Python-powered website for the tourist > industry, which will involve a web service, a good deal of XML > processing, and a Django-powered front-end. If the project works, it > could get a lot of traffic. I'm sure it can be done, but I'm looking > to find out more about how existing high-volume Python sites have > managed their workload. Can anyone give me examples of high-volume > Python-powered websites, if possible with some idea of their > architecture?
> Many thanks,
> Nick
You should know that there's a Google service called Google App Engine that lets you host your website on Google's own infrastructure (this is known nowadays as "cloud computing"). It's free to start (as long as you don't exceed the minimum quotas of space and traffic, which are quite generous). The good thing is that you don't have to think about scaling issues or about your site's overall architecture or hardware. You have the whole Google infrastructure at your disposal, which can scale from one user to tens of thousands without you having to change anything on your side. Simply code your site correctly in Python or Java, using Django or any other WSGI-compliant framework, and you are set to go.
On Wed, Nov 25, 2009 at 7:33 PM, ShoqulKutlu <kursat.ku...@gmail.com> wrote: > Hi,
> Managing load of high volume of visitors is a common issue for all > kind of web technologies. I mean this is not the python issue. This > issue is mostly about server level designs. You need to supply load > balancing for both web servers and databases to make your web site > able to respond to several concurrent visitors. Of course a good > programmed website is a key performance issue but for your mention I > would also suggest considering how many hardwares, how many > webservers, how many database cluster and which database server should > be used or will be used in the future..
I don't know a lot about this issue, but take Apache + PHP. Every time a page is loaded, a new instance of PHP is loaded to run the page, so I imagine load balancing can easily be done at the page-request level by distributing instances of PHP processes. Whereas if you use Python, you don't really want to load the Python interpreter for every page request. As far as I can tell, the canonical way is to have one app for the whole website that's constantly running and communicates with the server via WSGI. Or is that wrong? And wouldn't that make load balancing a little bit more tricky, or at least different? Not sure.
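For reference, the "one constantly running app that talks to the server via WSGI" arrangement can be sketched with the standard library alone. The `application` callable below is the whole interface; the `serve` helper and port 8000 are arbitrary choices for illustration, and `wsgiref`'s server is meant for development, not production traffic.

```python
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """A complete WSGI application: the server calls this once per request."""
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from a long-running Python process, path=%s\n" % path).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(port=8000):
    """Run the app in wsgiref's development server (blocks until interrupted)."""
    with make_server("", port, application) as httpd:
        httpd.serve_forever()
```

Running `serve()` starts the interpreter once and then answers every request from the same process. As far as load balancing goes, each web server simply runs its own copy of this process and a balancer in front distributes requests among them, so the picture is not very different from the PHP case.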
> On Wed, Nov 25, 2009 at 7:33 PM, ShoqulKutlu <kursat.ku...@gmail.com> wrote: > > Hi,
> > Managing load of high volume of visitors is a common issue for all > > kind of web technologies. I mean this is not the python issue. This > > issue is mostly about server level designs. You need to supply load > > balancing for both web servers and databases to make your web site > > able to respond to several concurrent visitors. Of course a good > > programmed website is a key performance issue but for your mention > > I would also suggest considering how many hardwares, how many > > webservers, how many database cluster and which database server > > should be used or will be used in the future..
> I don't know a lot about this issue, but take apache + php. every > time a page is loaded a new instance of php is loaded to run the > page,
AFAIK that's only the case for PHP-CGI, and Python as a CGI scripting language is used the same way. Apache is very often run with mod_php, though, which embeds the PHP interpreter; mod_python does something similar for Python.
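A rough sketch of the difference, as a simulation rather than real Apache configuration: `load_application` stands in for whatever startup work an interpreter and application need, and the counter shows how often that work is paid under each model. All names here are hypothetical.

```python
startup_count = {"n": 0}  # how many times the "interpreter + app" was started


def load_application():
    """Stand-in for expensive startup work (imports, config, DB connections)."""
    startup_count["n"] += 1
    return lambda path: "page for %s" % path


def handle_cgi_style(paths):
    """CGI model: a fresh process per request, so startup is paid every time."""
    return [load_application()(path) for path in paths]


def handle_embedded_style(paths):
    """Embedded model (mod_php, mod_python, WSGI): startup is paid once per process."""
    app = load_application()
    return [app(path) for path in paths]
```

Serving three paths with `handle_cgi_style` bumps the counter three times, while `handle_embedded_style` bumps it once; the pages produced are identical either way.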
---- Rami Chowdhury "As an online discussion grows longer, the probability of a comparison involving Nazis or Hitler approaches one." -- Godwin's Law
> On Monday 30 November 2009 10:55:55 inhahe wrote: >> On Wed, Nov 25, 2009 at 7:33 PM, ShoqulKutlu <kursat.ku...@gmail.com> > wrote: >>> Hi,
>>> Managing load of high volume of visitors is a common issue for all >>> kind of web technologies. I mean this is not the python issue. This >>> issue is mostly about server level designs. You need to supply load >>> balancing for both web servers and databases to make your web site >>> able to respond to several concurrent visitors. Of course a good >>> programmed website is a key performance issue but for your mention >>> I would also suggest considering how many hardwares, how many >>> webservers, how many database cluster and which database server >>> should be used or will be used in the future.. >> I don't know a lot about this issue, but take apache + php. every >> time a page is loaded a new instance of php is loaded to run the >> page,
> AFAIK that's only the case for PHP-CGI, and Python as a CGI scripting > language is used the same way.
Yeps.
> Apache is very often run with mod_php, > though, which embeds the PHP interpreter; mod_python does something > similar for Python.
Indeed.
And FWIW, this has very little impact on load-balancing issues.
On Wed, 25 Nov 2009 21:21:25 -0800 (PST), Nick Mellor <nick.mellor.gro...@pobox.com> wrote : Hi,
> I wasn't aware that Google used Python for running their Google groups > servers. Can you confirm that? The only place > I've seen Google explicitly use Python on their web front end is in > the Google Ads tests.
> I am impressed by the responsiveness of lawrence.com, ljworld.com and > others on the Django home page (http://www.djangoproject.com/)
I'm running two sites using Django which are certainly infinitely more modest and much less visited than the ones you quote but which nevertheless are extremely responsive compared to other frameworks tested, using a single machine (http://nonlineaire.univ-lille1.fr/SNL/ and http://nonlineaire.univ-lille1.fr/GDR3070/).
The key to scaling a web site is to make sure you can load-balance to as many front ends as needed and then use a common database backend that is fast enough or possibly a common file system that is fast enough.
I can't speak to Django specifically but you can certainly get essentially unlimited scalability on the front-end side of the equation using a Python based web app.
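As a toy illustration of that layout (not a real balancer), the sketch below rotates requests across several stateless front ends that all share one backend object. The class names `SharedBackend`, `FrontEnd` and `RoundRobinBalancer` are invented for this example; in practice the balancer would be something like a dedicated proxy and the backend a real database.

```python
import itertools


class SharedBackend:
    """Stand-in for the common database every front end talks to."""

    def __init__(self):
        self.visits = 0

    def record_visit(self):
        self.visits += 1
        return self.visits


class FrontEnd:
    """A stateless web front end: all persistent state lives in the backend."""

    def __init__(self, name, backend):
        self.name = name
        self.backend = backend

    def handle(self, path):
        total = self.backend.record_visit()
        return "%s served %s (visit #%d)" % (self.name, path, total)


class RoundRobinBalancer:
    """Hands each incoming request to the next front end in turn."""

    def __init__(self, front_ends):
        self._cycle = itertools.cycle(front_ends)

    def handle(self, path):
        return next(self._cycle).handle(path)


backend = SharedBackend()
balancer = RoundRobinBalancer([FrontEnd("web%d" % i, backend) for i in (1, 2, 3)])
```

Because the front ends keep no state of their own, adding a fourth one is just a longer list; the shared backend is the part that eventually has to be made fast enough.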
Google App Engine will set up such a configuration for you automatically, but I think they are still working out some performance bugs, based on my experience here http://whiffdoc.appspot.com/ and here http://listtree.appspot.com/
I hope that helps. -- Aaron Watters
=== an apple every 8 hours will keep 3 doctors away. -kliban