Google and Python
Newsgroups: comp.lang.python
From: TheFlyingDutchman <zzbba...@aol.com>
Date: Wed, 19 Sep 2007 12:01:52 -0700
Local: Wed, Sep 19 2007 3:01 pm
Subject: Google and Python
Around 2000 I heard that Google was using Python to some extent. Now I
see that Guido van Rossum works for them, as well as Alex Martellis, who
has the title "Uber Technical Lead", which seems to imply some fairly
heavy Python usage there. I was wondering what is done at Google with
Python, which Python "environments/applications" (Zope, TurboGears,
mod_python ...) are in use, which other languages they are using, and
what is done with those.

 
Newsgroups: comp.lang.python
From: Larry Bates <larry.ba...@websafe.com>
Date: Wed, 19 Sep 2007 14:18:06 -0500
Local: Wed, Sep 19 2007 3:18 pm
Subject: Re: Google and Python

TheFlyingDutchman wrote:
> Around 2000 I heard that Google was using Python to some extent. Now I
> see that Guido Van Rossum works for them as well as Alex Martellis who
> has the title "Uber Technical Lead" which seems to imply some fairly
> heavy Python usage there. I was wondering what is done at Google with
> Python and which Python "environments/applications" (Zope, TurboGears,
> mod_python ...) are in use, and what is done with other languages, and
> which other languages are they using.

Have you tried Googling "google python"?  It turns up a lot of links for me.

-Larry


 
Newsgroups: comp.lang.python
From: TheFlyingDutchman <zzbba...@aol.com>
Date: Wed, 19 Sep 2007 12:44:54 -0700
Local: Wed, Sep 19 2007 3:44 pm
Subject: Re: Google and Python

> Have you tried Googling "google python"?  It turns up a lot of links for me.

I had searched this newsgroup, but not Google. I did find a pretty
good link:

http://panela.blog-city.com/python_at_google_greg_stein__sdforum.htm

Which says:
"A few services including code.google.com and google groups.  Most
other front ends are in C++ (google.com) and Java (gmail).  All web
services are built on top of a highly optimizing http server wrapped
with SWIG."

I am not clear on how you would use a language (whether C++, Java or
Python) to write the web app with this custom http server. Is this http
server what is referred to as an "application server", or is it the
main web server, which is usually Apache at most sites?


 
Newsgroups: comp.lang.python
From: Erik Jones <e...@myemma.com>
Date: Wed, 19 Sep 2007 15:02:25 -0500
Local: Wed, Sep 19 2007 4:02 pm
Subject: Re: Google and Python
On Sep 19, 2007, at 2:44 PM, TheFlyingDutchman wrote:

No, an HTTP server and an application server are two different things.
An HTTP server services web requests; those requests can be for static
files or for services of a local application, in which case the request
is forwarded on to the application.  An application server services
requests of an application.  They are separate concepts, often chained,
although they are sometimes implemented together.  What they are saying
here is that they have built a highly optimizing custom web server in
C++ that services web requests for services of applications written in
any of the three listed languages.  So, yes, in this case it is what is
often Apache in other installations.
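
To make the split between the two layers concrete, here is a minimal sketch using only Python's standard library. It is not how Google's server works (nothing in the thread discloses that); the `STATIC_FILES` table, the `/app/` prefix, and the `application` function are all hypothetical names chosen for illustration. The HTTP layer answers static requests itself and forwards everything else to the application.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical static content; a real server would read files from disk.
STATIC_FILES = {"/index.html": b"<h1>static page</h1>"}

# Ignore any HTTP proxy configured in the environment for local requests.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def application(path):
    """Stand-in for the application layer: builds a dynamic response."""
    return ("dynamic response for " + path).encode("utf-8")


class FrontEnd(BaseHTTPRequestHandler):
    """The HTTP server layer: serves static requests itself and
    forwards requests under /app/ to the application."""

    def do_GET(self):
        if self.path in STATIC_FILES:
            body = STATIC_FILES[self.path]
        elif self.path.startswith("/app/"):
            body = application(self.path)
        else:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(port, path):
    with _opener.open("http://127.0.0.1:%d%s" % (port, path)) as reply:
        return reply.read()


def demo():
    server = HTTPServer(("127.0.0.1", 0), FrontEnd)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        return fetch(port, "/index.html"), fetch(port, "/app/hello")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` fetches one static path and one application path through the same front end. Here the application is a function in the same process, which corresponds to the "implemented together" case; in the "chained" case it would be a separate process.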

Erik Jones

Software Developer | Emma®
e...@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com


 
Newsgroups: comp.lang.python
From: TheFlyingDutchman <zzbba...@aol.com>
Date: Wed, 19 Sep 2007 14:01:18 -0700
Local: Wed, Sep 19 2007 5:01 pm
Subject: Re: Google and Python
On Sep 19, 1:02 pm, Erik Jones <e...@myemma.com> wrote:
> is usually Apache at most sites?

> No, an HTTP server and an application server are two different things.
> An HTTP server services web requests; those requests can be for static
> files or for services of a local application, in which case the request
> is forwarded on to the application.  An application server services
> requests of an application.  They are separate concepts, often chained,
> although they are sometimes implemented together.  What they are saying
> here is that they have built a highly optimizing custom web server in
> C++ that services web requests for services of applications written in
> any of the three listed languages.  So, yes, in this case it is what is
> often Apache in other installations.

OK, thanks. Would you know what technique the custom web server uses
to invoke a C++ app (ditto for Java and Python)? CGI is supposed to be
too slow for large sites.

 
Newsgroups: comp.lang.python
From: Erik Jones <e...@myemma.com>
Date: Wed, 19 Sep 2007 16:43:52 -0500
Local: Wed, Sep 19 2007 5:43 pm
Subject: Re: Google and Python
On Sep 19, 2007, at 4:01 PM, TheFlyingDutchman wrote:

That's what SWIG is for:  interfacing C++ with other languages.

Erik Jones

Software Developer | Emma®
e...@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com


 
Newsgroups: comp.lang.python
From: "Hendrik van Rooyen" <m...@microcorp.co.za>
Date: Thu, 20 Sep 2007 09:05:36 +0200
Local: Thurs, Sep 20 2007 3:05 am
Subject: Re: Google and Python

"TheFlyingDutchman" <zz...@aol.com> wrote:
> Around 2000 I heard that Google was using Python to some extent. Now I
> see that Guido Van Rossum works for them as well as Alex Martellis

8< ------------------------------------------------------------------------

It seems that shortening "Alessandro" to "Alex" was not enough.

I wonder if the Flying Dutchman is another van der Decker?

- Hendrik


 
Newsgroups: comp.lang.python
From: Bryan Olson <fakeaddr...@nowhere.org>
Date: Thu, 20 Sep 2007 01:13:58 -0700
Local: Thurs, Sep 20 2007 4:13 am
Subject: Re: Google and Python
TheFlyingDutchman asked of someone:

> Would you know what technique the custom web server uses
> to invoke a C++ app

No, I expect he would not know that. I can tell you
that GWS is just for Google, and anyone else is almost
certainly better off with Apache.

> (ditto for Java and Python) CGI is supposed to be too slow
> for large sites.

Sort of. The more queries a site answers, the more benefit
to reducing the per-request overhead. But if one thinks
Google could not afford so much machine time:

     On average, a single query on Google reads hundreds of
     megabytes of data and consumes tens of billions of CPU
     cycles.
     http://labs.google.com/papers/googlecluster.html

Another quote from that paper:

     We also produce all our software in-house [...]

There's a saying in the Navy that there are three ways to
do anything: the right way, the wrong way, and the Navy
way.  How does GWS invoke a Java app? The Google way.

How does Google use Python? As their scripting-language
of choice. A fine choice, but just a tiny little piece.

Maybe Alex will disagree with me. In my short time at
Google, I was uber-nobody.

--
--Bryan


 
Newsgroups: comp.lang.python
From: al...@mac.com (Alex Martelli)
Date: Thu, 20 Sep 2007 23:00:08 -0700
Local: Fri, Sep 21 2007 2:00 am
Subject: Re: Google and Python
Bryan Olson <fakeaddr...@nowhere.org> wrote:

   ...

> TheFlyingDutchman asked of someone:
> > Would you know what technique the custom web server uses
> > to invoke a C++ app

> No, I expect he would not know that. I can tell you
> that GWS is just for Google, and anyone else is almost
> certainly better off with Apache.

Or lighttpd, like YouTube (cfr
<http://trac.lighttpd.net/trac/wiki/PoweredByLighttpd>).

> How does Google use Python? As their scripting-language
> of choice. A fine choice, but just a tiny little piece.

> Maybe Alex will disagree with me. In my short time at
> Google, I was uber-nobody.

YouTube (one of Google's most valuable properties) is essentially
all-Python (except for open-source infrastructure components such as
lighttpd).  Also, at Google I'm specifically "Uber Tech Lead, Production
Systems": while I can't discuss details, my main responsibilities relate
to various software projects that are part of our "deep infrastructure",
and our general philosophy there is "Python where we can, C++ where we
must".  Python is definitely not "just a tiny little piece" nor (by a
long shot) used only for "scripting" tasks; if the mutant space-eating
nanovirus should instantly stop the execution of all Python code, the
powerful infrastructure that has been often described as "Google's
secret weapon" would seize up.

The internal web applications needed to restore things, btw, would seize
up too; as I already said I can't give details of the ones I'm
responsible for (used by Google's network specialists, reliability
engineers, hardware technicians, etc), but Guido did manage to get
permission to talk about his work, Mondrian
(<http://www.niallkennedy.com/blog/archives/2006/11/google-mondrian.html>)
-- that's what we all use to review code, whatever language it's in,
before it can be submitted to the Google codebase (code reviews are a
mandatory step of development at Google).  Internal web applications are
the preferred way at Google to make any internal functionality
available, of course.

Alex


 
Newsgroups: comp.lang.python
From: David <wizza...@gmail.com>
Date: Fri, 21 Sep 2007 21:27:48 +0200
Local: Fri, Sep 21 2007 3:27 pm
Subject: Re: Google and Python

> OK, thanks. Would you know what technique the custom web server uses
> to invoke a C++ app (ditto for Java and Python)? CGI is supposed to be
> too slow for large sites.

For large sites you would have modules loaded into your web server so
that executables don't have to be shelled for each request.

Another method is for the apps to run continuously and serve on non-80
port (or on 80 from another host), and your main web server on port 80
reverse proxies to it when appropriate.
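
The second method (a long-running application on its own port, with the main server reverse proxying to it) can be sketched with the standard library alone. This is an illustration under stated assumptions, not a production setup: both servers run as threads in one process on ephemeral ports rather than on port 80, only GET is relayed, and the `Backend` and `make_proxy_handler` names are made up for the example.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Ignore any HTTP proxy configured in the environment for local requests.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class Backend(BaseHTTPRequestHandler):
    """The application: starts once and keeps running, so no executable
    has to be launched for each request."""

    def do_GET(self):
        body = ("backend handled " + self.path).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass


def make_proxy_handler(backend_port):
    class Proxy(BaseHTTPRequestHandler):
        """The main web server: relays each GET to the backend and
        copies the reply back to the client."""

        def do_GET(self):
            url = "http://127.0.0.1:%d%s" % (backend_port, self.path)
            with _opener.open(url) as upstream:
                status = upstream.status
                body = upstream.read()
            self.send_response(status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass

    return Proxy


def start(handler):
    server = HTTPServer(("127.0.0.1", 0), handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def demo(path):
    backend = start(Backend)
    front = start(make_proxy_handler(backend.server_address[1]))
    try:
        url = "http://127.0.0.1:%d%s" % (front.server_address[1], path)
        with _opener.open(url) as reply:
            return reply.read()
    finally:
        for server in (front, backend):
            server.shutdown()
            server.server_close()
```

The client only ever talks to the front server; `demo("/search?q=python")` returns whatever the backend produced for that path.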


 
Newsgroups: comp.lang.python
From: Paul Rubin <http://phr...@NOSPAM.invalid>
Date: 21 Sep 2007 13:53:57 -0700
Local: Fri, Sep 21 2007 4:53 pm
Subject: Re: Google and Python

David <wizza...@gmail.com> writes:
> Another method is for the apps to run continuously and serve on non-80
> port (or on 80 from another host), and your main web server on port 80
> reverse proxies to it when appropriate.

You can also pass the open sockets around between processes instead of
reverse proxying, using the SCM_RIGHTS message on Unix domain sockets
under Linux, or some similar mechanism under other Unixes (no idea
about Windows).  Python does not currently support this but one of
these days I want to get around to writing a patch.
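
For readers on a current interpreter: the statement above describes Python as of 2007. Since Python 3.9 the standard library has offered `socket.send_fds` and `socket.recv_fds` on Unix-like systems, which wrap `sendmsg`/`recvmsg` with an `SCM_RIGHTS` ancillary message. A minimal sketch, assuming Python 3.9+ on Linux or another Unix; to stay self-contained it passes the descriptor across a socket pair inside one process, whereas real use would have a different process at each end.

```python
import os
import socket


def pass_descriptor():
    """Send the read end of a pipe across a Unix domain socket pair
    and read from the descriptor that arrives on the other side."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        # Some ordinary data travels with the descriptor; stream sockets
        # generally need at least one real byte to carry ancillary data.
        socket.send_fds(left, [b"fd"], [read_end])
        data, fds, _flags, _addr = socket.recv_fds(right, 1024, 1)

        received = fds[0]  # a new descriptor referring to the same pipe
        try:
            os.write(write_end, b"hello through the pipe")
            message = os.read(received, 1024)
        finally:
            os.close(received)
        return data, message
    finally:
        os.close(read_end)
        os.close(write_end)
        left.close()
        right.close()
```

The receiver ends up with its own descriptor number for the same underlying kernel object, which is why reading from `received` yields what was written to the pipe.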

 
Newsgroups: comp.lang.python
From: Bryan Olson <fakeaddr...@nowhere.org>
Date: Mon, 24 Sep 2007 00:28:09 -0700
Local: Mon, Sep 24 2007 3:28 am
Subject: Re: Google and Python

Good motto. So is most of Google's code base now in
Python? About what is the ratio of Python code to C++
code? Of course lines of code is kind of a bogus measure.
Of all those cycles Google executes, about what portion
are executed by a Python interpreter?

> Python is definitely not "just a tiny little piece" nor (by a
> long shot) used only for "scripting" tasks;

Ah, sorry. I meant the choice of scripting language was
a tiny little piece of Google's method of operation.
"Scripting language" means languages such as Python,
Perl, and Ruby.

> if the mutant space-eating
> nanovirus should instantly stop the execution of all Python code, the
> powerful infrastructure that has been often described as "Google's
> secret weapon" would seize up.

And the essence of the Google way is to employ a lot of
smart programmers to build their own software to run on
Google's infrastructure. Choice of language is trivia.

I think both Python and Google are great. What I find
ludicrous is the idea that the bits one hears about how
Google builds its software make a case for how others
should build theirs. Google is kind of secretive, and
their ways are very much their own. Google's software
is much more Googley than Pythonic.

--
--Bryan


 
Newsgroups: comp.lang.python
From: Nick Craig-Wood <n...@craig-wood.com>
Date: Mon, 24 Sep 2007 03:30:11 -0500
Local: Mon, Sep 24 2007 4:30 am
Subject: Re: Google and Python

Paul Rubin <http> wrote:
>  David <wizza...@gmail.com> writes:
> > Another method is for the apps to run continuously and serve on non-80
> > port (or on 80 from another host), and your main web server on port 80
> > reverse proxies to it when appropriate.

>  You can also pass the open sockets around between processes instead of
>  reverse proxying, using the SCM_RIGHTS message on Unix domain sockets
>  under Linux, or some similar mechanism under other Unixes (no idea
>  about Windows).  Python does not currently support this but one of
>  these days I want to get around to writing a patch.

An interesting idea!  Are there any web servers which work like that
at the moment?

Passing file descriptors between processes is one of those things I've
always meant to have a go with, but the amount of code (in Advanced
Programming in the Unix Environment) needed to implement it is rather
disconcerting!  A python module to do it would be great!

--
Nick Craig-Wood <n...@craig-wood.com> -- http://www.craig-wood.com/nick


 
Newsgroups: comp.lang.python
From: al...@mac.com (Alex Martelli)
Date: Mon, 24 Sep 2007 08:40:04 -0700
Local: Mon, Sep 24 2007 11:40 am
Subject: Re: Google and Python
Bryan Olson <fakeaddr...@nowhere.org> wrote:

   ...

> > YouTube (one of Google's most valuable properties) is essentially
> > all-Python (except for open-source infrastructure components such as
> > lighttpd).  Also, at Google I'm specifically "Uber Tech Lead, Production
> > Systems": while I can't discuss details, my main responsibilities relate
> > to various software projects that are part of our "deep infrastructure",
> > and our general philosophy there is "Python where we can, C++ where we
> > must".

> Good motto. So is most of Google's code base now in
> Python? About what is the ratio of Python code to C++
> code? Of course lines of code is kind of a bogus measure.
> Of all those cycles Google executes, about what portion
> are executed by a Python interpreter?

I don't have those numbers at hand, and if I did they would be
confidential: you know that Google doesn't release many numbers at all
about its operations, most particularly not about our production
infrastructure (not even, say, how many servers we have, in how many data
centers, with what bandwidth, and so on).

Still, I wouldn't say that "most" of our codebase is in Python: there's
a lot of Java, a lot of C++, a lot of Python, a lot of Javascript (which
may not correspond to all that many "cycles Google executes" since the
main point of coding in Javascript is having it execute in the user's
browser, of course, but it's still code that gets developed, debugged,
deployed, maintained), and a lot of other languages including ones that
Google developed in-house such as
<http://labs.google.com/papers/sawzall.html> .

> > Python is definitely not "just a tiny little piece" nor (by a
> > long shot) used only for "scripting" tasks;

> Ah, sorry. I meant the choice of scripting language was
> a tiny little piece of Google's method of operation.

In the same sense in which other such technology choices (C++, Java,
what operating systems, what relational databases, what http servers,
and so on) are similarly "tiny pieces", maybe.  Considering the number
of technology choices that must be made, plus the number of other
choices that aren't directly about technology but, say, about
methodology (style guides for each language in use, mandatory code
reviews before committing to the shared codebase, release-engineering
practices, standards for unit-tests and other kinds of tests, and so on,
and so forth), one could defensibly make a case that each and every such
choice must of necessity be "but a tiny little piece" of the whole.

> "Scripting language" means languages such as Python,
> Perl, and Ruby.

A widespread terminology, but nevertheless a fundamentally bankrupt one:
when a language is used to develop an application, it's very misleading
to call it a "scripting language", as it implies that it's instead used
only to "script" something else.  When it comes time to decide which mix
of languages to use to develop a new application, it's important to
avoid being biased by having tagged some languages as "scripting" ones,
some (say Java) as "application" ones, others yet (say C++) as "system"
ones -- the natural subconscious process would be to say "well I'm
developing an X, I should use an X language, not a Y language or a Z
language", which is most likely to lead to wrong choices.

> > if the mutant space-eating
> > nanovirus should instantly stop the execution of all Python code, the
> > powerful infrastructure that has been often described as "Google's
> > secret weapon" would seize up.

> And the essence of the Google way is to employ a lot of
> smart programmers to build their own software to run on
> Google's infrastructure. Choice of language is trivia.

No, it's far from trivial, any more than choice of operating system, and
so on.  Google is a technology company: exactly which technologies to
use and/or develop for the various necessary tasks, far from being
trivial, is the very HEART of its operation.

Your ludicrous claim is similar to saying that the essence of a certain
hedge fund is to employ smart traders to make a lot of money by
sophisticated trades (so far so reasonable) and (here comes the idiocy)
"choice of currencies and financial instruments is trivia" (?!?!?!) --
it's the HEART of such a fund, to pick and choose which positions to
build, unwind, or sell-on, and which (e.g.) currencies should be
involved in such positions is obviously *crucial*, one of the many
important decisions those "smart traders" make every day, and far from
the least important of the many.  And similarly, OF COURSE, for choices
of technologies (programming languages very important among those) for a
technology company, just like, say, what horticultural techniques and
chemicals to employ would be for a company whose "essence" was
cultivating artichokes for sale on the market, and so on.

> I think both Python and Google are great. What I find
> ludicrous is the idea that the bits one hears about how
> Google builds its software make a case for how others
> should build theirs.

To each his own, I guess: what I find ludicrous is your claim about
"trivia", as I explained above.  To me, on the contrary, it seems
self-evident that if a company X enjoys great success employing
technique Y, this *DOES* make something of a case for another company Z
to seriously consider and probably try out Y, when attempting tasks
analogous to those X has had success with, to see if some of the success
could not be replicable in Z's own similar tasks.  This is the heart of
"benchmarking" and "industry best practices" -- and why many companies
in the role of X aren't all that forthcoming about publicizing all the
details of their Y's, just in case Z's endeavours should put Z in
competition with X (this always needs to be balanced with the many
_advantages_ connected to publicizing some of those Y's, of course).

Such empirical support, while of course far from infallible (one will
always have to take into consideration many details, and the devil is in
the details), tends to perform vastly better in supporting decision
making than purely abstract considerations bereft of any such empirical
underpinnings.

> Google is kind of secretive, and
> their ways are very much their own. Google's software
> is much more Googley than Pythonic.

Nevertheless, if "Python has been an important part of Google since the
beginning" (as my colleague Peter Norvig said well before I joined
Google, then Guido did, etc etc), then clearly being Pythonic can be *an
important part* (NOT "trivia"!!!) of being Googley, and it would be
seriously stupid to choose to ignore this crucial data point.  YouTube's
choice of Python, done well before anybody had even conceived of their
becoming part of Google one day, does seem to have served them
particularly well too (and they gave lots of details in their talk on
the subject at OSCON, some materials are at
<http://www.scribd.com/doc/244443/Supersising-YouTube-with-Python> and
you can search web and blogs for more), etc, etc.

One delightful part of working at Google is that top management is *NOT*
made up of pointy-haired beancounters who consider such issues as choice
of technologies "trivia" -- Eric Schmidt (the CEO) started his career by
coding "lex" (the lexical analyzer part of the yacc/lex combination),
Stu Feldman started his by writing "make" (still the best-known
semi-automated software-build approach), Urs Hölzle pioneered
just-in-time compilers, Udi Manber wrote a great book on algorithms
(using a somewhat Pascal-like pseudocode which however used indentation
to denote blocks;-), etc, etc.  They KNOW how important ("trivia"
indeed...!-) such choices are: that's part of what makes them great
leaders of passionate engineers -- they haven't and never will forget
their own engineering roots.

Alex


 
Newsgroups: comp.lang.python
From: "Hendrik van Rooyen" <m...@microcorp.co.za>
Date: Tue, 25 Sep 2007 08:34:21 +0200
Local: Tues, Sep 25 2007 2:34 am
Subject: Re: Google and Python

"Nick Craig-Wood" <ni....d.com> wrote:
> Passing file descriptors between processes is one of those things I've
> always meant to have a go with, but the amount of code (in Advanced
> Programming in the Unix Environment) needed to implement it is rather
> disconcerting!  A python module to do it would be great!

I must be missing something here.

What is the advantage of passing the open file rather than just the
fully qualified file name and having the other process open the
file itself?

I would tend to not go this route, but would opt for one "file owner"
process and use a message based protocol if heavy sharing is envisaged.
It feels "more right" to me than to have different processes read and write
to the same thing.  I can imagine big Dragons with sharp teeth...

- Hendrik


 
Newsgroups: comp.lang.python
From: Paul Rubin <http://phr...@NOSPAM.invalid>
Date: 25 Sep 2007 00:49:00 -0700
Subject: Re: Google and Python
"Hendrik van Rooyen" <m...@microcorp.co.za> writes:

> What is the advantage of passing the open file rather than just the
> fully qualified file name and having the other process open the
> file itself?

The idea is that the application is a web server.  The socket listener
accepts connections and hands them off to other processes.  That is,
the file descriptors are handles on network connections that were
opened by the remote client, not disk files that can be opened
locally.
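
The hand-off described here can be sketched as follows, assuming Python 3.9+ on a Unix-like system (for `os.fork` and `socket.send_fds`/`socket.recv_fds`, none of which existed in this form when the thread was written). The `worker` function and the reply text are hypothetical. The parent plays the socket listener: it accepts a TCP connection and passes the connected descriptor to a worker process, which answers the client directly.

```python
import os
import socket


def worker(channel):
    """Worker process: receive one connected socket and answer on it."""
    _data, fds, _flags, _addr = socket.recv_fds(channel, 16, 1)
    with socket.socket(fileno=fds[0]) as conn:  # rebuild a socket object
        request = conn.recv(1024)
        conn.sendall(b"worker got: " + request)


def demo():
    parent_chan, child_chan = socket.socketpair(socket.AF_UNIX,
                                                socket.SOCK_STREAM)
    listener = socket.create_server(("127.0.0.1", 0))  # any free port
    port = listener.getsockname()[1]

    pid = os.fork()
    if pid == 0:  # child: becomes the worker and never returns
        try:
            parent_chan.close()
            listener.close()
            worker(child_chan)
        finally:
            os._exit(0)

    child_chan.close()
    # A client connects; it stands in for a remote browser.
    client = socket.create_connection(("127.0.0.1", port))
    try:
        conn, _addr = listener.accept()
        # Hand the live connection to the worker, then drop our copy.
        socket.send_fds(parent_chan, [b"x"], [conn.fileno()])
        conn.close()

        client.sendall(b"GET /")
        return client.recv(1024)
    finally:
        client.close()
        listener.close()
        parent_chan.close()
        os.waitpid(pid, 0)
```

The worker could not have opened this connection itself: it exists only because a remote client connected to the listener, which is the point made above about why a file name would not do.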

 
Newsgroups: comp.lang.python
From: Stefan Behnel <stefan.behnel-n05...@web.de>
Date: Tue, 25 Sep 2007 09:50:26 +0200
Local: Tues, Sep 25 2007 3:50 am
Subject: Re: Google and Python

Hendrik van Rooyen wrote:
> "Nick Craig-Wood" <ni....d.com> wrote:

>> Passing file descriptors between processes is one of those things I've
>> always meant to have a go with, but the amount of code (in Advanced
>> Programming in the Unix Environment) needed to implement it is rather
>> disconcerting!  A python module to do it would be great!

> I must be missing something here.

> What is the advantage of passing the open file rather than just the
> fully qualified file name and having the other process open the
> file itself?

A "file descriptor" under Unix is not necessarily an open file that you can
find on the hard-disk. It might also be a socket connection or a pipe, or it
might be a file that was opened or created with specific rights or in an
atomic step (like temporary files).

Many things are (or look like or behave like) files in Unix, that's one of its
real beauties.
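
A small sketch of that uniformity, assuming a Unix-like system: the same descriptor-level calls (`os.write` and `os.read`) are applied to a pipe, a connected socket pair, and a temporary file. The helper names are made up for the example.

```python
import os
import socket
import tempfile


def roundtrip(write_fd, read_fd, payload):
    """Write and read using only descriptor-level calls."""
    os.write(write_fd, payload)
    return os.read(read_fd, len(payload))


def demo():
    payload = b"same calls"
    results = {}

    # 1. A pipe: two descriptors with no name on disk.
    read_end, write_end = os.pipe()
    try:
        results["pipe"] = roundtrip(write_end, read_end, payload)
    finally:
        os.close(read_end)
        os.close(write_end)

    # 2. A connected socket pair.
    a, b = socket.socketpair()
    try:
        results["socket"] = roundtrip(a.fileno(), b.fileno(), payload)
    finally:
        a.close()
        b.close()

    # 3. A temporary file; rewind before reading back through the same fd.
    with tempfile.TemporaryFile() as tmp:
        fd = tmp.fileno()
        os.write(fd, payload)
        os.lseek(fd, 0, os.SEEK_SET)
        results["tempfile"] = os.read(fd, len(payload))

    return results
```

Only the temporary file could, in principle, be reopened by name from another process, and even that is not guaranteed, since such files are typically unlinked as soon as they are created.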

Stefan


 
Newsgroups: comp.lang.python
From: Bryan Olson <fakeaddr...@nowhere.org>
Date: Tue, 25 Sep 2007 01:57:59 -0700
Local: Tues, Sep 25 2007 4:57 am
Subject: Re: Google and Python

Can you see how that motto, "Python where we can, C++ where
we must," might lead people to a false impression of how much
Google uses Python versus C++, especially on "production
systems"? I tried to Google-up that motto; your post seems
to be Google's first disclosure of it.

[...]

> To me, on the contrary, it seems
> self-evident that if a company X enjoys great success employing
> technique Y, this *DOES* make something of a case for another company Z
> to seriously consider and probably try out Y, when attempting tasks
> analogous to those X has had success with, to see if some of the success
> could not be replicable in Z's own similar tasks.

Similar tasks to what made Google a great success? I'm not
seeing many of those.

People seem to think they should duplicate the way Google
does things, but without deep understanding of how and why
they work for Google. An impossible task, because it's
about the most un-Googley thing anyone could do.

> This is the heart of
> "benchmarking" and "industry best practices" -- and why many companies
> in the role of X aren't all that forthcoming about publicizing all the
> details of their Y's, just in case Z's endeavours should put Z in
> competition with X (this always needs to be balanced with the many
> _advantages_ connected to publicizing some of those Y's, of course).

In the case of Google, there's way, way too much hidden for
people to reason based on what Google does. In this thread,
did you notice how far wrong people went about how Google's
stuff works?

> Such empirical support, while of course far from infallible (one will
> always have to take into consideration many details, and the devil is in
> the details), tends to perform vastly better in supporting decision
> making than purely abstract considerations bereft of any such empirical
> underpinnings.

Wikipedia is in PHP, Slashdot in Perl, Basecamp in Ruby. They
all rock, but more importantly, we can look under the hood. If
Wikipedia makes a weaker case for PHP than Google for Python,
it's largely because the whole story is never as neat as a
trickle of selective disclosures.

--
--Bryan


 
Bryan Olson
Newsgroups: comp.lang.python
From: Bryan Olson <fakeaddr...@nowhere.org>
Date: Tue, 25 Sep 2007 09:20:11 GMT
Local: Tues, Sep 25 2007 5:20 am
Subject: Re: Google and Python

Paul Rubin wrote:
> You can also pass the open sockets around between processes instead of
> reverse proxying, using the SCM_RIGHTS message on Unix domain sockets
> under Linux, or some similar mechanism under other Unixes (no idea
> about Windows).  Python does not currently support this but one of
> these days I want to get around to writing a patch.

Windows can do it, but differently. What a surprise.
I just looked it up: WSADuplicateSocket() is the key.
Windows and Unix modules with the same Python interface
would rock.
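
A minimal sketch of the Unix side, assuming a Python recent enough to expose socket.sendmsg() and socket.recvmsg() (3.3 and later; letting socket.socket(fileno=...) detect the socket family needs 3.7). The names send_fd, recv_fd and handoff_demo are made up for illustration, and a socketpair() stands in for a real accepted client connection. On Windows the same Python versions offer socket.share() and socket.fromshare(), not shown here.

```python
import array
import os
import socket


def send_fd(channel, fd):
    """Send one open file descriptor over a Unix domain socket."""
    fds = array.array("i", [fd])
    # SCM_RIGHTS travels as ancillary data; at least one byte of
    # ordinary data has to accompany it.
    channel.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(channel):
    """Receive one file descriptor sent by send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = channel.recvmsg(
        1, socket.CMSG_LEN(fds.itemsize))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def handoff_demo():
    # Control channel between the "listener" and the "worker".
    to_worker, from_listener = socket.socketpair(
        socket.AF_UNIX, socket.SOCK_STREAM)
    pid = os.fork()
    if pid == 0:
        # Worker: already running before the connection exists.
        try:
            to_worker.close()
            conn = socket.socket(fileno=recv_fd(from_listener))
            conn.sendall(b"hello from worker")
            conn.close()
        finally:
            os._exit(0)
    from_listener.close()
    # Created after the fork, so the worker cannot have inherited it.
    client, accepted = socket.socketpair()
    send_fd(to_worker, accepted.fileno())
    accepted.close()  # the descriptor in flight keeps the connection open
    chunks = []
    while True:
        data = client.recv(1024)
        if not data:
            break
        chunks.append(data)
    client.close()
    to_worker.close()
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

Calling handoff_demo() returns whatever the worker wrote on the connection it was handed. The listener closes its own copy right after sending, which is safe because the kernel keeps the connection alive for the receiver.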

--
--Bryan


 
Hendrik van Rooyen
Newsgroups: comp.lang.python
From: "Hendrik van Rooyen" <m...@microcorp.co.za>
Date: Wed, 26 Sep 2007 08:59:17 +0200
Local: Wed, Sep 26 2007 2:59 am
Subject: Re: Google and Python

"Paul Rubin" <http://lid> wrote:
> "Hendrik van Rooyen" <m..orp.co.za> writes:
> > What is the advantage of passing the open file rather than just the
> > fully qualified file name and having the other process open the
> > file itself?

> The idea is that the application is a web server.  The socket listener
> accepts connections and hands them off to other processes.  That is,
> the file descriptors are handles on network connections that were
> opened by the remote client, not disk files that can be opened
> locally.

Ok got it - so instead of starting a thread, as is current practice, you fork
a process (possibly on another machine) and "hand over" the client.
Can't you do this by passing the client's IP addy and the negotiated socket
on the client's machine?

Or is this where the heavy lifting comes in? - "spoofing" the original local IP
addy on the new server? - seems you would have to route to a local machine
based not on IP addy only, but on (IP,socket) tuples.  - This might work if
you have only one entry point to the local LAN, but would be harder to do
if there are two points of entry, and packets could hit from outside on either..

Might be easier to redirect the browser than to try to do this.

- Hendrik


 
Nick Craig-Wood
Newsgroups: comp.lang.python
From: Nick Craig-Wood <n...@craig-wood.com>
Date: Wed, 26 Sep 2007 04:30:17 -0500
Local: Wed, Sep 26 2007 5:30 am
Subject: Re: Google and Python
Hendrik van Rooyen <m...@microcorp.co.za> wrote:

>  "Paul Rubin" <http://lid> wrote:

> > "Hendrik van Rooyen" <m..orp.co.za> writes:
> > > What is the advantage of passing the open file rather than just the
> > > fully qualified file name and having the other process open the
> > > file itself?

> > The idea is that the application is a web server.  The socket listener
> > accepts connections and hands them off to other processes.  That is,
> > the file descriptors are handles on network connections that were
> > opened by the remote client, not disk files that can be opened
> > locally.

>  Ok got it - so instead of starting a thread, as is current practice, you fork
>  a process (possibly on another machine) and "hand over" the client.

It is trivial to pass a socket to a new thread or a forked child - you
don't need this mechanism for that.  It doesn't work on different
machines though - it has to be on the same machine.
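
The forked-child case can be sketched as follows, assuming a Unix system with os.fork(). The function name fork_handoff_demo is made up, and socketpair() stands in for a connection returned by accept(); no descriptor-passing call is needed because the child inherits the open socket.

```python
import os
import socket


def fork_handoff_demo():
    # socketpair() stands in for a connection returned by accept().
    client, conn = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: it holds `conn` simply because it was forked
        # after the connection was opened.
        try:
            client.close()
            conn.sendall(b"served by forked child")
            conn.close()
        finally:
            os._exit(0)
    # Parent: drop its copy; the child's copy keeps the connection open.
    conn.close()
    chunks = []
    while True:
        data = client.recv(1024)
        if not data:
            break
        chunks.append(data)
    client.close()
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

This only works for a child created after the connection exists, which is why a separate mechanism is needed for a process that is already running.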

It is for passing a socket to an already running process.  For example
you could implement FastCGI like this.

A FastCGI application is a process which runs continuously, which avoids
startup times and lets it track state more easily.

Instead of the client talking to the web server and the web server
talking to the FastCGI process, which is what normally happens, the web
server could first write some headers on the socket and then pass the
socket on to the FastCGI process directly, cutting out a whole lot of
copying of the data.

>  Can't you do this by passing the client's IP addy and the negotiated socket
>  on the client's machine?

No, because the state of the open TCP connection is kept in the kernel
not in the user process.

>  Or is this where the heavy lifting comes in? - "spoofing" the original local IP
>  addy on the new server? - seems you would have to route to a local machine
>  based not on IP addy only, but on (IP,socket) tuples.  - This might work if
>  you have only one entry point to the local LAN, but would be harder to do
>  if there are two points of entry, and packets could hit from
>  outside on either..

It is all done in the kernel.  The kernel has the state of the TCP
connection - it is just accessed from a different process.

--
Nick Craig-Wood <n...@craig-wood.com> -- http://www.craig-wood.com/nick


 
Hendrik van Rooyen
Newsgroups: comp.lang.python
From: "Hendrik van Rooyen" <m...@microcorp.co.za>
Date: Thu, 27 Sep 2007 08:52:03 +0200
Local: Thurs, Sep 27 2007 2:52 am
Subject: Re: Google and Python

"Nick Craig-Wood" <....od.com> wrote:
> Hendrik van Rooyen <ma.....p.co.za> wrote:
> >  "Paul Rubin" <http://lid> wrote:

> > > "Hendrik van Rooyen" <m..orp.co.za> writes:
> >  Ok got it - so instead of starting a thread, as is current practice, you fork
> >  a process (possibly on another machine) and "hand over" the client.

> It is trivial to pass a socket to a new thread or a forked child - you
> don't need this mechanism for that.  It doesn't work on different
> machines though - it has to be on the same machine.

8< ------------- nice explanation by Nick-----------------------

How does a very large volume site work then? - there must be some
way of sharing the load without bottlenecking it through one machine?

- Hendrik


 
David
Newsgroups: comp.lang.python
From: David <wizza...@gmail.com>
Date: Thu, 27 Sep 2007 09:57:58 +0200
Local: Thurs, Sep 27 2007 3:57 am
Subject: Re: Google and Python

> > It is trivial to pass a socket to a new thread or a forked child - you
> > don't need this mechanism for that.  It doesn't work on different
> > machines though - it has to be on the same machine.

> 8< ------------- nice explanation by Nick-----------------------

> How does a very large volume site work then? - there must be some
> way of sharing the load without bottlenecking it through one machine?

Large sites do load balancing (caching, DNS round robin, reverse
proxies, etc.) over many hosts. Passing connection handles around on a
single host is a different type of optimization.

See this link for info on how Wikipedia handles their load-balancing:

http://en.wikipedia.org/wiki/Wikipedia#Software_and_hardware


 
Discussion subject changed to "Load balancing and passing sockets; was: Re: Google and Python" by Bryan Olson
Bryan Olson
Newsgroups: comp.lang.python
From: Bryan Olson <fakeaddr...@nowhere.org>
Date: Thu, 27 Sep 2007 09:05:02 GMT
Local: Thurs, Sep 27 2007 5:05 am
Subject: Load balancing and passing sockets; was: Re: Google and Python

Hendrik van Rooyen wrote:
> "Nick Craig-Wood" wrote: [about passing sockets between processes]
>> It is trivial to pass a socket to a new thread or a forked child - you
>> don't need this mechanism for that.  It doesn't work on different
>> machines though - it has to be on the same machine.

> How does a very large volume site work then? - there must be some
> way of sharing the load without bottlenecking it through one machine?

Several ways. The Domain Name System can provide multiple IP
addresses for the same name. Those IP addresses often lead
to HTTP "reverse proxies" that shoot back cached replies to
common simple requests, and forward the harder ones to the
file/application servers, with intelligent load balancing.
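
As a toy model of that arrangement, and not how any real proxy such as Squid is implemented: repeated requests are answered from a cache and everything else is spread over the application servers in round-robin order, the simplest balancing policy. ReverseProxySketch and make_app are hypothetical names, and the "servers" are plain Python functions.

```python
import itertools


class ReverseProxySketch:
    """Toy model: answer repeats from a cache, spread the rest round-robin."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("need at least one backend")
        # Each backend is a (name, callable) pair; cycle() yields them in turn.
        self._backends = itertools.cycle(list(backends))
        self._cache = {}

    def handle(self, path):
        # Cached reply: no application server is involved.
        if path in self._cache:
            return "cache", self._cache[path]
        name, app = next(self._backends)
        body = app(path)
        self._cache[path] = body
        return name, body


def make_app(name):
    # Hypothetical application server: it just labels its response.
    return name, lambda path: "%s rendered %s" % (name, path)
```

With two backends, the first request for "/a" goes to the first server, "/b" goes to the second, and a second request for "/a" never leaves the proxy. A real deployment would also need cache expiry and health checks, which this sketch leaves out.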

The services are surprisingly basic, and some excellent
software is free:

   http://en.wikipedia.org/wiki/Round_robin_DNS
   http://en.wikipedia.org/wiki/Reverse_proxy
   http://en.wikipedia.org/wiki/Squid_proxy

Web apps tend to scale just great, except when they need
data that is both shared and modifiable.

--
--Bryan


 
Discussion subject changed to "Google and Python" by asdfjehqwjerhqjwljekrh
asdfjehqwjerhqjwljekrh
Newsgroups: comp.lang.python
From: asdfjehqwjerhqjwljekrh <tutu...@gmail.com>
Date: Thu, 27 Sep 2007 14:56:10 -0000
Local: Thurs, Sep 27 2007 10:56 am
Subject: Re: Google and Python
On Sep 24, 10:40 am, al...@mac.com (Alex Martelli) wrote:

> > Good motto. So is most of Google's code base now in
> > Python? About what is the ratio of Python code to C++
> > code? Of course lines of code is kind of a bogus measure.
> > Of all those cycles Google executes, about what portion
> > are executed by a Python interpreter?

> I don't have those numbers at hand, and if I did they would be
> confidential

I would be curious to know whether they do much "mixed model"
coding.  By that I mean (a) code your application in Python, and then
(b) optimize it as necessary by moving some functionality into C/C++
extension modules for Python.  (Some of (b) may happen during design, of course.)
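
One common shape for step (b), sketched under assumptions: keep a pure-Python implementation and let a compiled module replace it when one is installed. The module name _fastdot and the function dot are made up for illustration. The standard library's heapq does something similar with its _heapq accelerator. Nothing here says how Google does it.

```python
try:
    # Hypothetical compiled extension offering the same function in C/C++.
    from _fastdot import dot
except ImportError:
    def dot(xs, ys):
        """Pure-Python reference implementation of a dot product."""
        return sum(x * y for x, y in zip(xs, ys))
```

Callers import dot from one place and never need to know which version they got, so the hot spot can be rewritten later without touching the rest of the application.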

I think of this as the state of the art in programming practice, and I
wonder whether Google's doing this, or has a superior alternative.

Mike


 