
python simply not scaleable enough for google?


Robert P. J. Day

Nov 11, 2009, 4:57:28 AM
to Python Mailing List

http://groups.google.com/group/unladen-swallow/browse_thread/thread/4edbc406f544643e?pli=1

thoughts?

rday
--

========================================================================
Robert P. J. Day Waterloo, Ontario, CANADA

Linux Consulting, Training and Kernel Pedantry.

Web page: http://crashcourse.ca
Twitter: http://twitter.com/rpjday
========================================================================

samwyse

Nov 11, 2009, 8:35:25 AM
to

Terry Reedy

Nov 11, 2009, 3:15:02 PM
to pytho...@python.org
Robert P. J. Day wrote:
> http://groups.google.com/group/unladen-swallow/browse_thread/thread/4edbc406f544643e?pli=1
>
> thoughts?

Program_cost = human_writing&maintenance_cost + running_cost*number_of_runs

Nothing new here. The builtin types and many modules are written in C to
reduce running cost for frequently used components. The first killer app
for Python was scientific computing with early numerical Python, where
people often run one-time combinations of inputs through constantly
reused linear algebra functions.

Google has an unusually high number of runs for many programs, making
running_cost minimization important.

At one time, C was not 'scaleable enough' for constantly rerun apps.
Hotspots were hand-rewritten in assembler. Apparently now, with
processors more complex and compilers much improved, that is no longer
much the case.

I can imagine a day when code compiled from Python is routinely
time-competitive with hand-written C.

Terry Jan Reedy

Alain Ketterlin

Nov 11, 2009, 5:31:35 PM
to
Terry Reedy <tjr...@udel.edu> writes:

> I can imagine a day when code compiled from Python is routinely
> time-competitive with hand-written C.

Have a look at
http://code.google.com/p/unladen-swallow/downloads/detail?name=Unladen_Swallow_PyCon.pdf&can=2&q=

Slide 6 is impressive. The bottom of slide/page 22 explains why python
is so slow: in most cases "the program is less dynamic than the
language".

-- Alain.

Peter Chant

Nov 11, 2009, 7:20:36 PM
to
Terry Reedy wrote:

> I can imagine a day when code compiled from Python is routinely
> time-competitive with hand-written C.

In my very limited experience it was very informative to program in C for
PIC microcontrollers and inspect the assembly code produced. If I just
threw together loops and complex if statements, the assembly produced was
very long and impossible to follow, whereas if I thought carefully about
what I was doing and tried to parallel what I would have done in assembly,
the assembly produced looked a lot like I imagined it would if I were
programming in assembler directly. Except that the whole process was 10x
faster and actually worked.

Pete

--
http://www.petezilla.co.uk

Vincent Manis

Nov 11, 2009, 7:38:50 PM
to pytho...@python.org

I'm having some trouble understanding this thread. My comments aren't directed at Terry's or Alain's comments, but at the thread overall.

1. The statement `Python is slow' doesn't make any sense to me. Python is a programming language; it is implementations that have speed or lack thereof.

2. A skilled programmer could build an implementation that compiled Python code into Common Lisp or Scheme code, and then used a high-performance Common Lisp compiler such as SBCL, or a high-performance Scheme compiler such as Chez Scheme, to produce quite fast code; Python's object model is such that this can be accomplished (and not using CLOS); a Smalltalk-80-style method cache can be used to get good method dispatch. This whole approach would be a bad idea, because the compile times would be dreadful, but I use this example as an existence proof that Python implementations can generate reasonably efficient executable programs.

In the Lisp world, optional declarations and flow analysis are used to tell the compiler the types of various variables. Python 3's annotations could be used for this purpose as well. This doesn't impact the learner (or even the casual programmer) for whom efficiency is not that important.
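As a minimal sketch of what such annotations look like (the "int32" labels
here are hypothetical, not anything Python defines): CPython merely records
annotations and attaches no meaning to them, which is exactly what would
leave an optimizing implementation free to exploit them.

    # Python 3 annotations are stored but not interpreted by CPython itself.
    def scale(v: "int32", factor: "int32") -> "int32":
        return v * factor

    print(scale.__annotations__)
    # {'v': 'int32', 'factor': 'int32', 'return': 'int32'}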

Unladen Swallow's JIT approach is, IMHO, better than this; my point here is that we don't know what the speed limits of Python implementations might be, and therefore, again, we don't know the limits of performance scalability.

3. It is certainly true that CPython doesn't scale up to environments where there are a significant number of processors with shared memory. It is also true that similar languages for highly parallel architectures have been produced (Connection Machine Lisp comes to mind). Whether the current work being done on the GIL will resolve this I don't know. Still, even if CPython had to be thrown away completely (something I don't believe is necessary), a high-performance multiprocessor/shared memory implementation could be built, if resources were available.

4. As to the programming language Python, I've never seen any evidence one way or the other that Python is more or less scalable to large problems than Java. My former employers build huge programs in C++, and there's lots of evidence (most of which I'm NDA'd from repeating) that it's possible to build huge programs in C++, but that they will be horrible :)

-- v

Steven D'Aprano

Nov 11, 2009, 11:51:17 PM
to
On Wed, 11 Nov 2009 16:38:50 -0800, Vincent Manis wrote:

> I'm having some trouble understanding this thread. My comments aren't
> directed at Terry's or Alain's comments, but at the thread overall.
>
> 1. The statement `Python is slow' doesn't make any sense to me. Python
> is a programming language; it is implementations that have speed or lack
> thereof.

Of course you are right, but in common usage, "Python" refers to CPython,
and in fact since all the common (and possibly uncommon) implementations
of Python are as slow or slower than CPython, it's not an unreasonable
short-hand.

But of course "slow" is relative. And many applications are bound by I/O
time, not CPU time.


> 2. A skilled programmer could build an implementation that compiled
> Python code into Common Lisp or Scheme code, and then used a
> high-performance Common Lisp compiler such as SBCL, or a
> high-performance Scheme compiler such as Chez Scheme, to produce quite
> fast code;

Unless you can demonstrate this, it's all theoretical. And probably not
as straight-forward as you think:

http://codespeak.net/pypy/dist/pypy/doc/faq.html#id29


> Python's object model is such that this can be accomplished
> (and not using CLOS); a Smalltalk-80-style method cache can be used to
> get good method dispatch. This whole approach would be a bad idea,
> because the compile times would be dreadful, but I use this example as
> an existence proof that Python implementations can generate reasonably
> efficient executable programs.

I think a better existence proof is implementations that *actually*
exist, not theoretical ones. There's good work happening with Psyco and
PyPy, and you can write C extensions using almost-Python code with Cython
and Pyrex.

There's very little reason why Python *applications* have to be slow,
unless the application itself is inherently slow.


> In the Lisp world, optional declarations and flow analysis are used to
> tell the compiler the types of various variables. Python 3's annotations
> could be used for this purpose as well. This doesn't impact the learner
> (or even the casual programmer) for who efficiency is not that
> important.
>
> Unladen Swallow's JIT approach is, IMHO, better than this; my point here
> is that we don't know what the speed limits of Python implementations
> might be, and therefore, again, we don't know the limits of performance
> scalability.

Absolutely. It's early days for Python.

--
Steven

mcherm

Nov 12, 2009, 10:07:23 AM
to
On Nov 11, 7:38 pm, Vincent Manis <vma...@telus.net> wrote:
> 1. The statement `Python is slow' doesn't make any sense to me.
> Python is a programming language; it is implementations that have
> speed or lack thereof.
[...]

> 2. A skilled programmer could build an implementation that compiled
> Python code into Common Lisp or Scheme code, and then used a
> high-performance Common Lisp compiler...

I think you have a fundamental misunderstanding of the reasons why Python
is slow. Most of the slowness does NOT come from poor implementations: the
CPython implementation is extremely well-optimized; the Jython and
IronPython implementations use best-in-the-world JIT runtimes. Most of the
speed issues come from fundamental features of the LANGUAGE itself, mostly
ways in which it is highly dynamic.

In Python, a piece of code like this:
    len(x)
needs to watch out for the following:
  * Perhaps x is a list OR
  * Perhaps x is a dict OR
  * Perhaps x is a user-defined type that declares a __len__ method OR
  * Perhaps a superclass of x declares __len__ OR
  * Perhaps we are running the built-in len() function OR
  * Perhaps there is a global variable 'len' which shadows the built-in OR
  * Perhaps there is a local variable 'len' which shadows the built-in OR
  * Perhaps someone has modified __builtins__

In Python it is possible for other code, outside your module, to go in and
modify or replace some methods from your module (a feature called
"monkey-patching" which is SOMETIMES useful for certain kinds of testing).
There are just so many things that can be dynamic (even if 99% of the time
they are NOT dynamic) that there is very little that the compiler can
assume.
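To make that concrete, here is a small Python 3 sketch (not from the
original post) in which the very same call site len(x) ends up meaning
four different things at run time -- exactly the possibilities listed
above that a compiler cannot rule out:

    class Grid:
        def __len__(self):          # user-defined type declaring __len__
            return 42

    def measure(x):
        return len(x)               # which 'len'? decided at run time

    print(measure([1, 2, 3]))       # 3  -- built-in len() on a list
    print(measure(Grid()))          # 42 -- dispatches to Grid.__len__

    len = lambda x: -1              # module-global 'len' shadows the built-in
    print(measure([1, 2, 3]))       # -1 -- same call site, different meaning

    import builtins
    builtins.len = lambda x: 0      # monkey-patch the built-in itself
    del len                         # drop the module-level shadow again
    print(measure([1, 2, 3]))       # 0  -- even the "built-in" has changed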

So whether you implement it in C, compile to CLR bytecode, or translate
into Lisp, the computer is still going to have to do a whole bunch of
lookups to make certain that there isn't some monkey business going on,
rather than simply reading a single memory location that contains the
length of the list. Brett Cannon's thesis is an example: he attempted
desperate measures to perform some inferences that would allow performing
these optimizations safely and, although a few of them could work in
special cases, most of the hoped-for improvements were impossible because
of the dynamic nature of the language.

I have seen a number of attempts to address this, either by placing some
restrictions on the dynamic nature of the code (but that would change the
nature of the Python language) or by having some sort of a JIT optimize
the common path where we don't monkey around. Unladen Swallow and PyPy are
two such efforts that I find particularly promising.

But it isn't NEARLY as simple as you make it out to be.

-- Michael Chermside

Joel Davis

Nov 12, 2009, 11:35:23 AM
to

obviously the GIL is a major reason it's so slow. That's one of the
_stated_ reasons why Google has decided to forgo CPython code. As far
as how sweeping the directive is, I don't know, since the situation
would sort of resolve itself if one committed to Jython application
building or just waited until Unladen Swallow is finished.

Steven D'Aprano

Nov 12, 2009, 12:12:53 PM
to
On Thu, 12 Nov 2009 08:35:23 -0800, Joel Davis wrote:

> obviously the GIL is a major reason it's so slow.

No such "obviously" about it.

There have been attempts to remove the GIL, and they lead to CPython
becoming *slower*, not faster, for the still common case of single-core
processors.

And neither Jython nor IronPython have the GIL. Jython appears to scale
slightly better than CPython, but for small data sets, is slower than
CPython. IronPython varies greatly in performance, ranging from nearly
twice as fast as CPython on some benchmarks to up to 6000 times slower!

http://www.smallshire.org.uk/sufficientlysmall/2009/05/22/ironpython-2-0-and-jython-2-5-performance-compared-to-python-2-5/

http://ironpython-urls.blogspot.com/2009/05/python-jython-and-ironpython.html


Blaming CPython's supposed slowness on the GIL is superficially plausible
but doesn't stand up to scrutiny. The speed of an implementation depends
on many factors, and it also depends on *what you measure* -- it is sheer
nonsense to talk about "the" speed of an implementation. Different tasks
run at different speeds, and there is no universal benchmark.


--
Steven

Alf P. Steinbach

Nov 12, 2009, 12:32:28 PM
to
* Steven D'Aprano:

> On Thu, 12 Nov 2009 08:35:23 -0800, Joel Davis wrote:
>
>> obviously the GIL is a major reason it's so slow.


http://en.wikipedia.org/wiki/Global_Interpreter_Lock

Uh oh...


> No such "obviously" about it.
>
> There have been attempts to remove the GIL, and they lead to CPython
> becoming *slower*, not faster, for the still common case of single-core
> processors.
>
> And neither Jython nor IronPython have the GIL. Jython appears to scale
> slightly better than CPython, but for small data sets, is slower than
> CPython. IronPython varies greatly in performance, ranging from nearly
> twice as fast as CPython on some benchmarks to up to 6000 times slower!
>
> http://www.smallshire.org.uk/sufficientlysmall/2009/05/22/ironpython-2-0-and-jython-2-5-performance-compared-to-python-2-5/
>
> http://ironpython-urls.blogspot.com/2009/05/python-jython-and-ironpython.html
>
>
> Blaming CPython's supposed slowness

Hm, this seems religious.

Of course Python is slow: if you want speed, pay for it with complexity.

It so happens that I think CPython could have been significantly faster, but (1)
doing that would amount to creating a new implementation, say, C++Python <g>,
and (2) what for, really?, since CPU-intensive things should be offloaded to
other language code anyway.


> on the GIL is superficially plausible
> but doesn't stand up to scrutiny. The speed of an implementation depends
> on many factors, and it also depends on *what you measure* -- it is sheer
> nonsense to talk about "the" speed of an implementation. Different tasks
> run at different speeds, and there is no universal benchmark.

This also seems religious. It's like in Norway it became illegal to market lemon
soda, since umpteen years ago it's soda with lemon flavoring. This has to do
with the *origin* of the citric acid, whether natural or chemist's concoction,
no matter that it's the same chemical. So, some people think that it's wrong to
talk about interpreted languages, hey, it should be a "language designed for
interpretation", or better yet, "dynamic language", or bestest, "language with
dynamic flavor". And slow language, oh no, should be "language whose current
implementations are perceived as somewhat slow by some (well, all) people", but
of course, that's just silly.


Cheers,

- Alf

J Kenneth King

Nov 12, 2009, 12:33:36 PM
to
mcherm <mch...@gmail.com> writes:

You might be right for the wrong reasons in a way.

Python isn't slow because it's a dynamic language. All the lookups
you're citing are highly optimized hash lookups. It executes really
fast.

The OP is talking about scale. Some people say Python is slow at a
certain scale. I say that's about true for any language. Large amounts
of IO is a tough problem.

Where Python might get hit *as a language* is that the Python programmer
has to drop into C to implement optimized data-structures for dealing
with the kind of IO that would slow down the Python interpreter. That's
why we have numpy, scipy, etc. The special cases it takes to solve
problems with custom types wasn't special enough to alter the language.
Scale is a special case believe it or not.

As an implementation though, the sky really is the limit and Python is
only getting started. Give it another 40 years and it'll probably
realize that it's just another Lisp. ;)

Rami Chowdhury

Nov 12, 2009, 1:58:49 PM
to Alf P. Steinbach, pytho...@python.org
On Thu, 12 Nov 2009 09:32:28 -0800, Alf P. Steinbach <al...@start.no>
wrote:

>
> This also seems religious. It's like in Norway it became illegal to
> market lemon soda, since umpteen years ago it's soda with lemon
> flavoring. This has to do with the *origin* of the citric acid, whether
> natural or chemist's concoction, no matter that it's the same chemical.
> So, some people think that it's wrong to talk about interpreted
> languages, hey, it should be a "language designed for interpretation",
> or better yet, "dynamic language", or bestest, "language with dynamic
> flavor". And slow language, oh no, should be "language whose current
> implementations are perceived as somewhat slow by some (well, all)
> people", but of course, that's just silly.

Perhaps I'm missing the point of what you're saying but I don't see why
you're conflating interpreted and dynamic here? Javascript is unarguably a
dynamic language, yet Chrome / Safari 4 / Firefox 3.5 all typically JIT
it. Does that make Javascript non-dynamic, because it's compiled? What
about Common Lisp, which is a compiled language when it's run with CMUCL
or SBCL?


--
Rami Chowdhury
"Never attribute to malice that which can be attributed to stupidity" --
Hanlon's Razor
408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD)

Alf P. Steinbach

Nov 12, 2009, 2:24:18 PM
to
* Rami Chowdhury:

> On Thu, 12 Nov 2009 09:32:28 -0800, Alf P. Steinbach <al...@start.no>
> wrote:
>>
>> This also seems religious. It's like in Norway it became illegal to
>> market lemon soda, since umpteen years ago it's soda with lemon
>> flavoring. This has to do with the *origin* of the citric acid,
>> whether natural or chemist's concoction, no matter that it's the same
>> chemical. So, some people think that it's wrong to talk about
>> interpreted languages, hey, it should be a "language designed for
>> interpretation", or better yet, "dynamic language", or bestest,
>> "language with dynamic flavor". And slow language, oh no, should be
>> "language whose current implementations are perceived as somewhat slow
>> by some (well, all) people", but of course, that's just silly.
>
> Perhaps I'm missing the point of what you're saying but I don't see why
> you're conflating interpreted and dynamic here? Javascript is unarguably
> a dynamic language, yet Chrome / Safari 4 / Firefox 3.5 all typically
> JIT it. Does that make Javascript non-dynamic, because it's compiled?
> What about Common Lisp, which is a compiled language when it's run with
> CMUCL or SBCL?

Yeah, you missed it.

Blurring and coloring and downright hiding reality by insisting on misleading
but apparently more precise terminology for some vague concept is a popular
sport, and chiding others for using more practical and real-world oriented
terms, can be effective in politics and some other arenas.

But in a technical context it's silly. Or dumb. Whatever.

E.g. you'll find it impossible to define interpretation rigorously in the sense
that you're apparently thinking of. It's not that kind of term or concept. The
nearest you can get is in a different direction, something like "a program whose
actions are determined by data external to the program (+ x qualifications and
weasel words)", which works in-practice, conceptually, but try that on as a
rigorous definition and you'll see that when you get formal about it then it's
completely meaningless: either anything qualifies or nothing qualifies.

You'll also find it impossible to rigorously define "dynamic language" in a
general way so that that definition excludes C++. <g>

So, to anyone who understands what one is talking about, "interpreted", or e.g.
"slow language" (as was the case here), conveys the essence.

And to anyone who doesn't understand it trying to be more precise is an exercise
in futility and pure silliness -- except for the purpose of misleading.


Cheers & hth.,

- Alf

Rami Chowdhury

Nov 12, 2009, 2:42:05 PM
to Alf P. Steinbach, pytho...@python.org
On Thu, 12 Nov 2009 11:24:18 -0800, Alf P. Steinbach <al...@start.no>
wrote:

Well, sure. Can you explain, then, what sense you meant it in?

> You'll also find it impossible to rigorously define "dynamic language"
> in a general way so that that definition excludes C++. <g>

Or, for that matter, suitably clever assembler. I'm not arguing with you
there.

> So, to anyone who understands what one is talking about, "interpreted",
> or e.g. "slow language" (as was the case here), conveys the essence.

Not when the context isn't clear, it doesn't.

> And to anyone who doesn't understand it trying to be more precise is an
> exercise in futility and pure silliness -- except for the purpose of
> misleading.

Or for the purpose of greater understanding, surely - and isn't that the
point?

Alf P. Steinbach

Nov 12, 2009, 3:02:11 PM
to

I think that was in the part you *snipped* here. Just fill in the mentioned
qualifications and weasel words. And considering that a routine might be an
interpreter of data produced elsewhere in the program, the definition needs
some fixing...


>> You'll also find it impossible to rigorously define "dynamic language"
>> in a general way so that that definition excludes C++. <g>
>
> Or, for that matter, suitably clever assembler. I'm not arguing with you
> there.
>
>> So, to anyone who understands what one is talking about,
>> "interpreted", or e.g. "slow language" (as was the case here), conveys
>> the essence.
>
> Not when the context isn't clear, it doesn't.
>
>> And to anyone who doesn't understand it trying to be more precise is
>> an exercise in futility and pure silliness -- except for the purpose
>> of misleading.
>
> Or for the purpose of greater understanding, surely - and isn't that the
> point?

I don't think that was the point.

Specifically, I reacted to the statement that <<it is sheer nonsense to talk
about "the" speed of an implementation>>, made in response to someone upthread,
in the context of Google finding CPython overall too slow.

It is quite slow. ;-)


Cheers,

- Alf

Rami Chowdhury

Nov 12, 2009, 3:33:05 PM
to Alf P. Steinbach, pytho...@python.org
On Thu, 12 Nov 2009 12:02:11 -0800, Alf P. Steinbach <al...@start.no>
wrote:

> I think that was in the part you *snipped* here. Just fill in the
> mentioned qualifications and weasel words.

OK, sure. I don't think they're weasel words, because I find them useful,
but I think I see where you're coming from.

> Specifically, I reacted to the statement that <<it is sheer nonsense to
> talk about "the" speed of an implementation>>, made in response to
> someone upthread, in the context of Google finding CPython overall too
> slow.

IIRC it was "the speed of a language" that was asserted to be nonsense,
wasn't it? Which IMO is fair -- a physicist friend of mine works with a
C++ interpreter which is relatively sluggish, but that doesn't mean C++ is
slow...

Benjamin Kaplan

Nov 12, 2009, 3:44:00 PM
to pytho...@python.org
On Thu, Nov 12, 2009 at 2:24 PM, Alf P. Steinbach <al...@start.no> wrote:
>
> You'll also find it impossible to rigorously define "dynamic language" in a
> general way so that that definition excludes C++. <g>
>
> So, to anyone who understands what one is talking about, "interpreted", or
> e.g. "slow language" (as was the case here), conveys the essence.
>
> And to anyone who doesn't understand it trying to be more precise is an
> exercise in futility and pure silliness  --  except for the purpose of
> misleading.

You just made Rami's point. You can't define a language as <insert
word here>. You can, however, describe what features it has - static vs.
dynamic typing, duck-typing, dynamic dispatch, and so on. Those are
features of the language. Other things, like "interpreted" vs.
"compiled", are features of the implementation. C++ for instance is
considered a language that gets compiled to machine code. However,
Visual Studio can compile C++ programs to run on the .NET framework,
which makes them JIT-compiled. Someone could even write an
interpreter for C++ if they wanted to.

Rami Chowdhury

Nov 12, 2009, 4:08:27 PM
to Benjamin Kaplan, pytho...@python.org
On Thu, 12 Nov 2009 12:44:00 -0800, Benjamin Kaplan
<benjami...@case.edu> wrote:

> Some one could even write an
> interpreter for C++ if they wanted to.

Someone has (http://root.cern.ch/drupal/content/cint)!

Steven D'Aprano

Nov 12, 2009, 9:50:33 PM
to
On Thu, 12 Nov 2009 21:02:11 +0100, Alf P. Steinbach wrote:

> Specifically, I reacted to the statement that <<it is sheer nonsense to
> talk about "the" speed of an implementation>>, made in response to
> someone upthread, in the context of Google finding CPython overall too
> slow.
>
> It is quite slow. ;-)

Quite slow to do what? Quite slow compared to what?

I think you'll find using CPython to sort a list of ten million integers
will be quite a bit faster than using bubblesort written in C, no matter
how efficient the C compiler.
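As a rough sketch of the kind of gap meant here (with a pure-Python bubble
sort standing in for the C one, so the real contest would be closer, but
still lost for large enough N):

    # Sketch: an O(n log n) sort implemented in C (CPython's sorted()) vs.
    # an O(n**2) algorithm, written in pure Python here for brevity.
    import random
    import timeit

    def bubble_sort(seq):
        seq = list(seq)
        n = len(seq)
        for i in range(n):
            for j in range(n - 1 - i):
                if seq[j] > seq[j + 1]:
                    seq[j], seq[j + 1] = seq[j + 1], seq[j]
        return seq

    data = [random.randrange(10**6) for _ in range(2000)]
    print("built-in sorted():", timeit.timeit(lambda: sorted(data), number=5))
    print("bubble sort:      ", timeit.timeit(lambda: bubble_sort(data), number=5))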

And why are we limiting ourselves to integers representable by the native
C int? What if the items in the list were of the order of 2**100000? Or
if they were mixed integers, fractions, fixed-point decimals, and
floating-point binaries? How fast is your C code going to be now? That's
going to depend on the C library you use, isn't it? In other words, it is
an *implementation* issue, not a *language* issue.

Okay, let's keep it simple. Stick to numbers representable by native C
ints. Around this point, people start complaining that it's not fair, I'm
not comparing apples with apples. Why am I comparing a highly-optimized,
incredibly fast sort method in CPython with a lousy O(N**2) algorithm in
C? To make meaningful comparisons, you have to make sure the algorithms
are the same, so the two language implementations do the same amount of
work. (Funnily enough, it's "unfair" to play to Python's strengths, and
"fair" to play to C's strengths.)

Then people invariably try to compare (say) something in C involving low-
level bit-twiddling or pointer arithmetic with something in CPython
involving high-level object-oriented programming. Of course CPython is
"slow" if you use it to do hundreds of times more work in every operation
-- that's comparing apples with oranges again, but somehow people think
that's okay when your intention is to prove "Python is slow".

An apples-to-apples comparison would be to use a framework in C which
offered the equivalent features as Python: readable syntax ("executable
pseudo-code"), memory management, garbage disposal, high-level objects,
message passing, exception handling, dynamic strong typing, and no core
dumps ever.

If you did that, you'd get something that runs much closer to the speed
of CPython, because that's exactly what CPython is: a framework written
in C that provides all those extra features.

(That's not to say that Python-like high-level languages can't, in
theory, be significantly faster than CPython, or that they can't have JIT
compilers that emit highly efficient -- in space or time -- machine code.
That's what Psyco does, now, and that's the aim of PyPy.)

However, there is one sense that Python *the language* is slower than
(say) C the language. Python requires that an implementation treat the
built-in function (say) int as an object subject to modification by the
caller, while C requires that it is a reserved word. So when a C compiler
sees "int", it can optimize the call to a known low-level routine, while
a Python compiler can't make this optimization. It *must* search the
entire scope looking for the first object called 'int' it finds, then
search the object's scope for a method called '__call__', then execute
that. That's the rules for Python, and an implementation that does
something else isn't Python. Even though the searching is highly
optimized, if you call int() one million times, any Python implementation
*must* perform that search one million times, which adds up. Merely
identifying what function to call is O(N) at runtime for Python and O(1)
at compile time for C.
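A small sketch of both halves of that point (not from the original post):
the name int really can be rebound out from under a call site, and binding
the built-in to a local name -- resolved once instead of on every call --
is a classic CPython micro-optimization for hot loops:

    def parse(values):
        return [int(v) for v in values]

    print(parse(["1", "2"]))          # [1, 2]

    import builtins                   # __builtin__ in Python 2
    real_int = builtins.int
    builtins.int = lambda v: 0        # any conforming implementation must honour this
    print(parse(["1", "2"]))          # [0, 0] -- same source, different meaning
    builtins.int = real_int           # undo the patch

    def parse_fast(values, _int=int): # look 'int' up once, at definition time
        return [_int(v) for v in values]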

Note though that JIT compilers like Psyco can often take shortcuts and
speed up code by a factor of 2, or up to 100 in the best cases, which
brings the combination of CPython + Psyco within shouting distance of the
speed of the machine code generated by good optimizing C compilers. Or
you can pass the work onto an optimized library or function call that
avoids the extra work. Like I said, there is no reason for Python
*applications* to be slow.


--
Steven

Vincent Manis

Nov 13, 2009, 1:20:11 AM
to pytho...@python.org
When I was approximately 5, everybody knew that higher level languages were too slow for high-speed numeric computation (I actually didn't know that then, I was too busy watching Bill and Ben the Flowerpot Men), and therefore assembly languages were mandatory. Then IBM developed Fortran, and higher-level languages were not too slow for numeric computation.

When I was in university, IBM released a perfectly horrible implementation of PL/I, which dynamically allocated and freed stack frames for each procedure entry and exit (`Do Not Use Procedures: They Are Inefficient': section heading from the IBM PL/I (G) Programmer's Guide, circa 1968). Everyone knew PL/I was an abomination of a language, which could not be implemented efficiently. Then MIT/Bell Labs/GE/Honeywell wrote Multics in a PL/I subset, and (eventually) it ran quite efficiently.

When Bell Labs pulled out of the Multics effort, some of their researchers wrote the first version of Unix in assembly language, but a few years later rewrote the kernel in C. Their paper reporting this included a sentence that said in effect, `yes, the C version is bigger and slower than the assembler version, but it has more functionality, so C isn't so bad'. Everybody knew that high-level languages were too inefficient to write an operating system in (in spite of the fact that Los Alamos had already written an OS in a Fortran dialect). Nobody knew that at about that time, IBM had started writing new OS modules in a company-confidential PL/I subset.

When I was in grad school, everybody knew that an absolute defence to a student project running slowly was `I wrote it in Lisp'; we only had a Lisp interpreter running on our system. We didn't have MacLisp, which had been demonstrated to compile carefully-written numerical programs into code that ran more efficiently than comparable programs compiled by DEC's PDP-10 Fortran compiler in optimizing mode.

In an earlier post, I mentioned SBCL and Chez Scheme, highly optimizing compiler-based implementations of Common Lisp and Scheme, respectively. I don't have numbers for SBCL, but I know that (again with carefully-written Scheme code) Chez Scheme can produce code that runs in the same order of magnitude as optimized C code. These are both very old systems that, at least in the case of Chez Scheme, use techniques that have been reported in the academic literature. My point in the earlier post about translating Python into Common Lisp or Scheme was essentially saying `look, there's more than 30 years experience building high-performance implementations of Lisp languages, and Python isn't really that different from Lisp, so we ought to be able to do it too'.

All of which leads me to summarize the current state of things.

1. Current Python implementations may or may not be performance-scalable in ways we need.

2. Reorganized interpreters may give us a substantial improvement in performance. More significant improvements would require a JIT compiler, and there are good projects such as Unladen Swallow that may well deliver a substantial improvement.

3. We might also get improvements from good use of Python 3 annotations, or other pragma style constructs that might be added to the language after the moratorium, which would give a compiler additional information about the programmer's intent. (For example, Scheme has a set of functions that essentially allow a programmer to say, `I am doing integer arithmetic with values that are limited in range to what can be stored in a machine word'.) These annotations wouldn't destroy the dynamic nature of Python, because they are purely optional. This type of language feature would allow a programmer to exploit the high-performance compilation technologies that are common in the Lisp world.

Even though points (2) and (3) between them offer a great deal of hope for future Python implementations, there is much that can be done with our current implementations. Just ask the programmer who writes a loop that laboriously does what could be done much more quickly with a list comprehension or with map.

-- v


Steven D'Aprano

Nov 13, 2009, 2:19:08 AM
to
On Thu, 12 Nov 2009 22:20:11 -0800, Vincent Manis wrote:

> When I was approximately 5, everybody knew that higher level languages were too slow for high-speed numeric computation (I actually didn't know that then, I was too busy watching Bill and Ben the Flowerpot Men), and therefore assembly languages were mandatory. Then IBM developed Fortran, and higher-level languages were not too slow for numeric computation.

Vincent, could you please fix your mail client, or news client, so
that it follows the standard for mail and news (that is, it has a
hard-break after 68 or 72 characters?

Having to scroll horizontally to read your posts is a real pain.


--
Steven

Tim Chase

Nov 13, 2009, 5:48:59 AM
to Steven D'Aprano, pytho...@python.org
Steven D'Aprano wrote:
> Vincent, could you please fix your mail client, or news
> client, so that it follows the standard for mail and news
> (that is, it has a hard-break after 68 or 72 characters?

This seems an awfully curmudgeonly reply, given that
word-wrapping is also client-controllable. Every MUA I've used
has afforded word-wrap including the venerable command-line
"mail", mutt, Thunderbird/Seamonkey, pine, Outlook & Outlook
Express...the list goes on. If you're reading via web-based
portal, if the web-reader doesn't support wrapped lines, (1) that
sounds like a lousy reader and (2) if you absolutely must use
such a borked web-interface, you can always hack it in a good
browser with a greasemonkey-ish script or a user-level CSS "!"
important attribute to ensure that the <div> or <p> in question
wraps even if the site tries to specify otherwise.

There might be some stand-alone news-readers that aren't smart
enough to support word-wrapping/line-breaking, in which case,
join the 80's and upgrade to one that does. Or even just pipe to
your text editor of choice: vi, emacs, ed, cat, and even Notepad
has a "wrap long lines" sort of setting or does the right thing
by default (okay, so cat relies on your console to do the
wrapping, but it does wrap).

I can see complaining about HTML content since not all MUA's
support it. I can see complaining about top-posting vs. inline
responses because that affects readability. But when the issue
is entirely controllable on your end, it sounds like a personal
issue.

-tkc


Steven D'Aprano

Nov 13, 2009, 7:24:13 AM
to
On Fri, 13 Nov 2009 04:48:59 -0600, Tim Chase wrote:

> There might be some stand-alone news-readers that aren't smart enough to
> support word-wrapping/line-breaking, in which case, join the 80's and
> upgrade to one that does.

Of course I can change my software. That fixes the problem for me. Or the
poster can get a clue and follow the standard -- which may be as simple
as clicking a checkbox, probably called "Wrap text", under Settings
somewhere -- and fix the problem for EVERYBODY, regardless of what mail
client or newsreader they're using.


--
Steven

Message has been deleted

Aaron Watters

Nov 13, 2009, 3:37:59 PM
to

That time is now, in many cases.

I still stand by my strategy published in Unix World
ages ago: get it working in Python, profile it, optimize
it; if you need to do it faster, code the inner loops in
C.
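A minimal sketch of that workflow (a sketch only, not from
the Unix World article), using the standard-library profiler
to find the hot spots before deciding whether a C rewrite is
even needed:

    # Get it working, profile it, then optimize what the
    # profile points at.
    import cProfile
    import pstats

    def slow_part(n):
        return sum(i * i for i in range(n))

    def main():
        return [slow_part(50000) for _ in range(200)]

    cProfile.run("main()", "profile.out")
    pstats.Stats("profile.out").sort_stats("cumulative").print_stats(5)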

Of course on google app engine, the last step is not possible,
but I don't think it is needed for 90% of applications
or more.

My own favorite app on google app engine/appspot is

http://listtree.appspot.com/

implemented using whiff
http://whiff.sourceforge.net
as described in this tutorial
http://aaron.oirt.rutgers.edu/myapp/docs/W1100_2300.GAEDeploy

It is sporadically not as fast as I would like. But that's
certainly not Python's problem, because the same application
running on my laptop is *much* faster.

By the way: the GO language smells like Rob Pike,
and I certainly hope it is more successful than
Limbo was. Of course, if Google decides to really
push it then it's gonna be successful regardless
of all other considerations, just like Sun
did to Java...

-- Aaron Watters

===
Limbo: how low can you go?


Terry Reedy

Nov 13, 2009, 4:48:44 PM
to pytho...@python.org
Aaron Watters wrote:
> On Nov 11, 3:15 pm, Terry Reedy <tjre...@udel.edu> wrote:
>> Robert P. J. Day wrote:
>> I can imagine a day when code compiled from Python is routinely
>> time-competitive with hand-written C.
>
> That time is now, in many cases.

By routinely, I meant ***ROUTINELY***, as in
"C becomes the province of specialized tool coders, much like assembly is
now, while most programmers use Python (or similar languages) because
they cannot (easily) beat it with hand-coded C." We are not at
*that* place yet.

> I still stand by my strategy published in Unix World
> ages ago: get it working in Python, profile it, optimize
> it, if you need to do it faster code the inner loops in
> C.

Agreed

> By the way: the GO language smells like Rob Pike,
> and I certainly hope it is more successful than
> Limbo was. Of course, if Google decides to really
> push it then it's gonna be successful regardless
> of all other considerations, just like Sun
> did to Java...

It still has the stupid, unnecessary, redundant C brackets, given that
all their example code is nicely indented like Python. That alone is a
deal killer for me.

Terry Jan Reedy

Paul Rubin

Nov 13, 2009, 6:32:52 PM
to
Tim Chase <pytho...@tim.thechases.com> writes:
> Or even just pipe to
> your text editor of choice: vi, emacs, ed, cat, and even Notepad
> has a "wrap long lines" sort of setting or does the right thing
> by default (okay, so cat relies on your console to do the
> wrapping, but it does wrap).

No, auto wrapping long lines looks like crap. It's better to keep the
line length reasonable when you write the posts. This is Usenet so
please stick with Usenet practices. If you want a web forum there are
plenty of them out there.

Robert Brown

Nov 13, 2009, 8:42:51 PM
to
Vincent Manis <vma...@telus.net> writes:

> On 2009-11-11, at 14:31, Alain Ketterlin wrote:
> I'm having some trouble understanding this thread. My comments aren't
> directed at Terry's or Alain's comments, but at the thread overall.
>
> 1. The statement `Python is slow' doesn't make any sense to me. Python is a
> programming language; it is implementations that have speed or lack thereof.

This is generally true, but Python *the language* is specified in a way that
makes executing Python programs quickly very very difficult. I'm tempted to
say it's impossible, but great strides have been made recently with JITs, so
we'll see.

> 2. A skilled programmer could build an implementation that compiled Python
> code into Common Lisp or Scheme code, and then used a high-performance
> Common Lisp compiler such as SBCL, or a high-performance Scheme compiler

> such as Chez Scheme, to produce quite fast code ...

A skilled programmer has done this for Common Lisp. The CLPython
implementation converts Python source code to Common Lisp code at read time,
which is then compiled. With SBCL you get native machine code for every
Python expression.

http://github.com/franzinc/cl-python/
http://common-lisp.net/project/clpython/

If you want to know why Python *the language* is slow, look at the Lisp code
CLPython generates and at the code implementing the run time. Simple
operations end up being very expensive. Does the object on the left side of a
comparison implement compare? No, then does the right side implement it? No,
then try something else ....

I'm sure someone can come up with a faster Python implementation, but it will
have to be very clever.

> This whole approach would be a bad idea, because the compile times would be
> dreadful, but I use this example as an existence proof that Python
> implementations can generate reasonably efficient executable programs.

The compile times are fine, not dreadful. Give it a try.

> 3. It is certainly true that CPython doesn't scale up to environments where
> there are a significant number of processors with shared memory.

Even on one processor, CPython has problems.

I last seriously used CPython to analyze OCRed books. The code read in the
OCR results for one book at a time, which included the position of every word
on every page. My books were long (2,000 pages) and dense, and I was constantly
fighting address space limitations and CPython slowness related to memory
usage. I had to resort to packing and unpacking data into Python integers in
order to fit all the OCR data into RAM.

bob

Robert Brown

Nov 13, 2009, 9:02:06 PM
to

Vincent Manis <vma...@telus.net> writes:
> My point in the earlier post about translating Python into Common Lisp or
> Scheme was essentially saying `look, there's more than 30 years experience
> building high-performance implementations of Lisp languages, and Python
> isn't really that different from Lisp, so we ought to be able to do it too'.

Common Lisp and Scheme were designed by people who wanted to write complicated
systems on machines with a tiny fraction of the horsepower of current
workstations. They were carefully designed to be compiled efficiently, which
is not the case with Python. There really is a difference here. Python the
language has features that make fast implementations extremely difficult.

bob

Robert Brown

Nov 13, 2009, 9:13:04 PM
to

J Kenneth King <ja...@agentultra.com> writes:

> mcherm <mch...@gmail.com> writes:
>> I think you have a fundamental misunderstanding of the reasons why Python
>> is slow. Most of the slowness does NOT come from poor implementations: the
>> CPython implementation is extremely well-optimized; the Jython and Iron
>> Python implementations use best-in-the-world JIT runtimes. Most of the
>> speed issues come from fundamental features of the LANGUAGE itself, mostly
>> ways in which it is highly dynamic.
>>

>> -- Michael Chermside

> You might be right for the wrong reasons in a way. Python isn't slow
> because it's a dynamic language. All the lookups you're citing are highly
> optimized hash lookups. It executes really fast.

Sorry, but Michael is right for the right reason. Python the *language* is
slow because it's "too dynamic". All those hash table lookups are unnecessary
in some other dynamic languages and they slow down Python. A fast
implementation is going to have to be very clever about memoizing method
lookups and invalidating assumptions when methods are dynamically redefined.
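A toy sketch of the kind of memoization-plus-invalidation that implies (not
how any particular implementation actually does it; real systems tend to use
per-type version tags rather than an explicit wrapper like this):

    # Toy method-lookup cache, invalidated when a class is redefined.
    _cache = {}                              # (class, attribute name) -> resolved attribute

    def cached_lookup(obj, name):
        key = (type(obj), name)
        try:
            return _cache[key]
        except KeyError:
            attr = getattr(type(obj), name)  # the expensive MRO walk
            _cache[key] = attr
            return attr

    def redefine(cls, name, value):
        # All redefinitions must funnel through here so stale entries are
        # dropped (subclass invalidation omitted for brevity).
        setattr(cls, name, value)
        _cache.pop((cls, name), None)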

> As an implementation though, the sky really is the limit and Python is
> only getting started.

Yes, but Python is starting in the basement.

bob

Vincent Manis

Nov 13, 2009, 9:15:12 PM
to pytho...@python.org
On 2009-11-13, at 12:46, Brian J Mingus wrote:
> You're joking, right? Try purchasing a computer manufactured in this millennium. Monitors are much wider than 72 characters nowadays, old timer.
I have already agreed to make my postings VT100-friendly. Oh, wait, the VT-100,
or at least some models of it, had a mode where you could have a line width of
132 characters.

And what does this have to do with Python? About as much as an exploding penguin
on your television.

-- v

Vincent Manis

Nov 13, 2009, 9:25:59 PM
to pytho...@python.org

On 2009-11-13, at 15:32, Paul Rubin wrote:
> This is Usenet so
> please stick with Usenet practices.
Er, this is NOT Usenet.

1. I haven't, to the best of my recollection, made a Usenet post in this millennium.

2. I haven't fired up a copy of rn or any other news reader in at least 2 decades.

3. I'm on the python-list mailing list, reading this with Apple's Mail application,
which actually doesn't have convenient ways of enforcing `Usenet practices' regarding
message format.

4. If we're going to adhere to tried-and-true message format rules, I want my IBM
2260 circa 1970, with its upper-case-only display and weird little end-of-line symbols.

Stephen asked me to wrap my posts. I'm happy to do it. Can we please finish this thread
off and dispose of it?

-- v

Vincent Manis

Nov 13, 2009, 9:39:01 PM
to pytho...@python.org

On 2009-11-13, at 17:42, Robert Brown wrote, quoting me:

> ... Python *the language* is specified in a way that


> makes executing Python programs quickly very very difficult.

That is untrue. I have mentioned before that optional declarations integrate
well with dynamic languages. Apart from CL and Scheme, which I have mentioned
several times, you might check out Strongtalk (typed Smalltalk), and Dylan,
which was designed for high-performance compilation, though to my knowledge
no Dylan compilers ever really achieved it.

> I'm tempted to
> say it's impossible, but great strides have been made recently with JITs, so
> we'll see.

> If you want to know why Python *the language* is slow, look at the Lisp code


> CLPython generates and at the code implementing the run time. Simple
> operations end up being very expensive. Does the object on the left side of a
> comparison implement compare? No, then does the right side implement it? No,
> then try something else ....

I've never looked at CLPython. Did it use a method cache (see Peter Deutsch's
paper on Smalltalk performance in the unfortunately out-of-print `Smalltalk-80:
Bits of History, Words of Advice')? That technique is 30 years old now.

I have more to say, but I'll do that in responding to Bob's next post.

-- v

David Robinow

Nov 13, 2009, 9:44:49 PM
to pytho...@python.org
On Fri, Nov 13, 2009 at 3:32 PM, Paul Rubin
<http://phr...@nospam.invalid> wrote:
> ...  This is Usenet so

> please stick with Usenet practices.  If you want a web forum there are
> plenty of them out there.
Actually this is pytho...@python.org
I don't use usenet and I have no intention to stick with Usenet practices.

Vincent Manis

Nov 13, 2009, 10:15:09 PM
to pytho...@python.org
On 2009-11-13, at 18:02, Robert Brown wrote:

> Common Lisp and Scheme were designed by people who wanted to write complicated
> systems on machines with a tiny fraction of the horsepower of current
> workstations. They were carefully designed to be compiled efficiently, which
> is not the case with Python. There really is a difference here. Python the
> language has features that make fast implementations extremely difficult.

Not true. Common Lisp was designed primarily by throwing together all of the
features in every Lisp implementation the design committee was interested in.
Although the committee members were familiar with high-performance compilation,
the primary impetus was to achieve a standardized language that would be acceptable
to the Lisp community. At the time that Common Lisp was started, there was still
some sentiment that Lisp machines were the way to go for performance.

As for Scheme, it was designed primarily to satisfy an aesthetic of minimalism. Even
though Guy Steele's thesis project, Rabbit, was a Scheme compiler, the point here was
that relatively simple compilation techniques could produce moderately reasonable
object programs. Chez Scheme was indeed first run on machines that we would nowadays
consider tiny, but so too was C++. Oh, wait, so was Python!

I would agree that features such as exec and eval hurt the speed of Python programs,
but the same things do the same thing in CL and in Scheme. There is a mystique about
method dispatch, but again, the Smalltalk literature has dealt with this issue in the
past.

Using Python 3 annotations, one can imagine a Python compiler that does the appropriate
thing (shown in the comments) with the following code.

import my_module                   # static linking

__private_functions__ = ['my_fn']  # my_fn doesn't appear in the module dictionary.

def my_fn(x: python.int32):        # Keeps x in a register
    def inner(z):                  # Lambda-lifts the function, no nonlocal vars
        return z // 2              # does not construct a closure
    y = x + 17                     # Via flow analysis, concludes that y can be registerized;
    return inner(2 * y)            # Uses inline integer arithmetic instructions.

def blarf(a: python.int32):
    return my_fn(a // 2)           # Because my_fn isn't exported, it can be inlined.

A new pragma statement (which I am EXPLICITLY not proposing; I respect and support
the moratorium) might be useful in telling the implementation that you don't mind
integer overflow.

Similarly, new library classes might be created to hold arrays of int32s or doubles.

Obviously, no Python system does any of these things today. But there really is
nothing stopping a Python system from doing any of these things, and the technology
is well-understood in implementations of other languages.

I am not claiming that this is _better_ than JIT. I like JIT and other runtime things
such as method caches better than these because you don't have to know very much about
the implementation in order to take advantage of them. But there may be some benefit
in allowing programmers concerned with speed to relax some of Python's dynamism
without ruining it for the people who need a truly dynamic language.

If I want to think about scalability seriously, I'm more concerned about problems that
Python shares with almost every modern language: if you have lots of processors accessing
a large shared memory, there is a real GC efficiency problem as the number of processors
goes up. On the other hand, if you have a lot of processors with some degree of private
memory sharing a common bus (think the Cell processor), how do we build an efficient
implementation of ANY language for that kind of environment?

Somehow, the issues of Python seem very orthogonal to performance scalability.

-- v


Paul Rubin

Nov 13, 2009, 10:15:32 PM
to
Vincent Manis <vma...@telus.net> writes:
> 3. I'm on the python-list mailing list, reading this with Apple's
> Mail application, which actually doesn't have convenient ways of
> enforcing `Usenet practices' regarding message format.

Oh, I see. Damn gateway.

> Stephen asked me to wrap my posts. I'm happy to do it. Can we please
> finish this thread off and dispose of it?

Please wrap to 72 columns or less. It's easier to read that way. (I
actually don't care if you do it or not. If you don't, I'll just
stop responding to you, which might even suit your wishes.)

Paul Rubin

Nov 13, 2009, 10:53:00 PM
to
"Robert P. J. Day" <rpj...@crashcourse.ca> writes:
> http://groups.google.com/group/unladen-swallow/browse_thread/thread/4edbc406f544643e?pli=1
> thoughts?

I'd bet it's not just about multicore scaling and general efficiency,
but also the suitability of the language itself for large, complex
projects. It's just not possible to be everything for everybody.
Python is beginner-friendly, has a very fast learning curve for
experienced programmers in other languages, and is highly productive
for throwing small and medium sized scripts together, that are
debugged through iterated testing. One might say it's optimized for
those purposes. I use it all the time because a lot of my programming
fits the pattern. The car analogy is the no-frills electric commuter
car, just hop in and aim it where you want to go; if you crash it,
brush yourself off and restart. But there are times (large production
applications) when you really want the Airbus A380 with the 100's of
automatic monitoring systems and checkout procedures to follow before
you take off, even if the skill level needed to use it is much higher
than the commuter car.

Vincent Manis

Nov 14, 2009, 1:38:01 AM
to pytho...@python.org

OK. The quoted link deals with Unladen Swallow, which is an attempt to deal with the
very real performance limitations of current Python systems. The remarks above deal with
productivity scalability, which is a totally different matter. So...

People can and do write large programs in Python, not just `throwing...medium sized
scripts together'. Unlike, say, Javascript, it has the necessary machinery to build very
large programs that are highly maintainable. One can reasonably compare it with Java, C#,
and Smalltalk; the facilities are comparable, and all of those (as well as Python) are
used for building enterprise systems.

I believe that the A380's control software is largely written in Ada, which is a
perfectly fine programming language that I would prefer not to write code in. For
approximately 10 years, US DOD pretty much required the use of Ada in military (and
aerospace) software (though a couple of years ago I discovered that there is still
one remaining source of Jovial compilers that still sells to DOD). According to a
presentation by Lt. Colonel J. A. Hamilton, `Programming Language Policy in the DOD:
After The Ada Mandate', given in 1999, `We are unlikely to see a return of a programming
language mandate' (www.drew-hamilton.com/stc99/stcAda_99.pdf). As I understand it,
the removal of the Ada mandate came from the realization (20 years after many computer
scientists *told* DOD this) that software engineering processes contribute more to
reliability than do programming language structures (c.f. Fred Brooks, `No Silver
Bullet').

So: to sum up, there are lots of large systems where Python might be totally appropriate,
especially if complemented with processes that feature careful specification and strong
automated testing. There are some large systems where Python would definitely NOT be
the language of choice, or even usable at all, because different engineering processes
were in place.

From a productivity viewpoint, there is no data to say that Python is more, less, or equally
scalable than <Language X> in that it produces correctly-tested, highly-maintainable programs
at a lower, higher, or equal cost. I would appreciate it if people who wanted to comment on
Python's scalability or lack thereof would give another programming language that they would
compare it with.

-- v

Alf P. Steinbach

Nov 14, 2009, 1:51:03 AM
to
* Rami Chowdhury:

> On Thu, 12 Nov 2009 12:02:11 -0800, Alf P. Steinbach <al...@start.no>
> wrote:
>> I think that was in the part you *snipped* here. Just fill in the
>> mentioned qualifications and weasel words.
>
> OK, sure. I don't think they're weasel words, because I find them
> useful, but I think I see where you're coming from.
>
>> Specifically, I reacted to the statement that <<it is sheer nonsense
>> to talk about "the" speed of an implementation>>, made in response to
>> someone upthread, in the context of Google finding CPython overall too
>> slow.
>
> IIRC it was "the speed of a language" that was asserted to be nonsense,
> wasn't it?

Yes, upthread.

It's sort of hilarious. <g>

Alain Ketterlin:
"slide/page 22 explains why python is so slow"

Vincent Manis (response):


"Python is a programming language; it is implementations that have speed
or lack thereof"

This was step 1 of trying to be more precise than the concept warranted.

Then Steven D'Aprano chimed in, adding even more precision:

Steven D'Aprano (further down response stack):


"it is sheer nonsense to talk about "the" speed of an implementation"

So no, it's not a language that is slow, it's of course only concrete
implementations that may have slowness flavoring. And no, not really, they
don't, because it's just particular aspects of any given implementation that may
exhibit slowness in certain contexts. And expanding on that trend, later in the
thread the observation was made that no, not really that either, it's just (if
it is at all) at this particular point in time, what about the future? Let's be
precise! Can't have that vague touchy-feely impression about a /language/ being
slow corrupting the souls of readers.

Hip hurray, Google's observation annulled by the injections of /precision/. :-)


> Which IMO is fair -- a physicist friend of mine works with a
> C++ interpreter which is relatively sluggish, but that doesn't mean C++
> is slow...

Actually, although C++ has the potential for being really really fast (and some
C++ programs are), the amount of work you have to add to realize the potential
can be staggering. This is most clearly evidenced by C++'s standard iostreams,
which have the potential of being much much faster than C FILE i/o (in
particular Dietmar Kuhl made such an implementation), but where the complexity
of and the guidance offered by the "design" is such that nearly all extant
implementations are painfully slow, even compared to C FILE. So, we generally
talk about iostreams being slow, knowing full well what we mean and that fast
implementations are theoretically possible (as evidenced by Dietmar's) -- but
"fast" and "slow" are in-practice terms, and so what matters is in-practice,
like, how does your compiler's iostreams implementation hold up.


Cheers,

- Alf

Robert Brown

unread,
Nov 14, 2009, 2:20:44 AM11/14/09
to

Vincent Manis <vma...@telus.net> writes:

> On 2009-11-13, at 17:42, Robert Brown wrote, quoting me:

>> ... Python *the language* is specified in a way that
>> makes executing Python programs quickly very very difficult.

> That is untrue. I have mentioned before that optional declarations integrate
> well with dynamic languages. Apart from CL and Scheme, which I have
> mentioned several times, you might check out Strongtalk (typed Smalltalk),
> and Dylan, which was designed for high-performance compilation, though to my
> knowledge no Dylan compilers ever really achieved it.

You are not making an argument, just mentioning random facts. You claim I've
made a false statement, then talk about optional type declarations, which
Python doesn't have. Then you mention Smalltalk and Dylan. What's your
point? To prove me wrong you have to demonstrate that it's not very difficult
to produce a high performance Python system, given current Python semantics.

>> I'm tempted to say it's impossible, but great strides have been made
>> recently with JITs, so we'll see.
>
>> If you want to know why Python *the language* is slow, look at the Lisp
>> code CLPython generates and at the code implementing the run time. Simple
>> operations end up being very expensive. Does the object on the left side
>> of a comparison implement compare? No, then does the right side implement
>> it? No, then try something else ....

> I've never looked at CLPython. Did it use a method cache (see Peter
> Deutsch's paper on Smalltalk performance in the unfortunately out-of-print
> `Smalltalk-80: Bits of History, Words of Advice'? That technique is 30 years
> old now.

Please look at CLPython. The complexity of some Python operations will make
you weep. CLPython uses Common Lisp's CLOS method dispatch in various places,
so yes, those method lookups are definitely cached.

Method lookup is just the tip of the iceberg. How about comparison? Here are
some comments from CLPython's implementation of compare. There's a lot going
on. It's complex and SLOW.

;; This function is used in comparisons like <, <=, ==.
;;
;; The CPython logic is a bit complicated; hopefully the following
;; is a correct translation.

;; If the class is equal and it defines __cmp__, use that.

;; The "rich comparison" operations __lt__, __eq__, __gt__ are
;; now called before __cmp__ is called.
;;
;; Normally, we take these methods of X. However, if class(Y)
;; is a subclass of class(X), the first look at Y's magic
;; methods. This allows the subclass to override its parent's
;; comparison operations.
;;
;; It is assumed that the subclass overrides all of
;; __{eq,lt,gt}__. For example, if sub.__eq__ is not defined,
;; first super.__eq__ is called, and after that __sub__.__lt__
;; (or super.__lt__).
;;
;; object.c - try_rich_compare_bool(v,w,op) / try_rich_compare(v,w,op)

;; Try each `meth'; if the outcome it True, return `res-value'.

;; So the rich comparison operations didn't lead to a result.
;;
;; object.c - try_3way_compare(v,w)
;;
;; Now, first try X.__cmp__ (even if y.class is a subclass of
;; x.class) and Y.__cmp__ after that.

;; CPython now does some number coercion attempts that we don't
;; have to do because we have first-class numbers. (I think.)

;; object.c - default_3way_compare(v,w)
;;
;; Two instances of same class without any comparison operator,
;; are compared by pointer value. Our function `py-id' fakes
;; that.

;; None is smaller than everything (excluding itself, but that
;; is catched above already, when testing for same class;
;; NoneType is not subclassable).

;; Instances of different class are compared by class name, but
;; numbers are always smaller.

;; Probably, when we arrive here, there is a bug in the logic
;; above. Therefore print a warning.

Vincent Manis

unread,
Nov 14, 2009, 2:33:25 AM11/14/09
to pytho...@python.org
On 2009-11-13, at 22:51, Alf P. Steinbach wrote:
> It's sort of hilarious. <g>
It really is, see below.

> So no, it's not a language that is slow, it's of course only concrete implementations that may have slowness flavoring. And no, not really, they don't, because it's just particular aspects of any given implementation that may exhibit slowness in certain contexts. And expanding on that trend, later in the thread the observation was made that no, not really that either, it's just (if it is at all) at this particular point in time, what about the future? Let's be precise! Can't have that vague touchy-feely impression about a /language/ being slow corrupting the souls of readers.

Because `language is slow' is meaningless.

An earlier post of mine listed four examples where the common wisdom was `XXX is slow' and yet where that
turned out not to be the case.

Some others.

1. I once owned a Commodore 64. I got Waterloo Pascal for it. I timed the execution of some program
(this was 25 years ago, I forget what the program did) at 1 second per statement. Therefore: `Pascal
is slow'.

2. Bell Labs produced a fine programming language called Snobol 4. It was slow. But some folks at
IIT in Chicago did their own implementation, Spitbol, which was fast and completely compatible.
Presto: Snobol 4 was slow, but then it became fast.

3. If you write the following statements in Fortran IV (the last version of Fortran I learned)

DO 10 I=1, 1000000
DO 10 J=1, 1000000
A(I, J) = 0.0
10 CONTINUE

you would paralyze early virtual memory systems, because Fortran IV defined arrays to be stored
in column major order, and the result was extreme thrashing. Many programmers did not realize
this, and would naturally write code like that. Fortran cognoscenti would interchange the two
DO statements and thus convert Fortran from being a slow language to being a fast one.

4. When Sun released the original Java system, programs ran very slowly, and everybody said
`I will not use Java, it is a slow language'. Then Sun improved their JVM, and other organizations
wrote their own JVMs which were fast. Therefore Java became a fast language.

> Actually, although C++ has the potential for being really really fast (and some C++ programs are), the amount of work you have to add to realize the potential can be staggering. This is most clearly evidenced by C++'s standard iostreams, which have the potential of being much much faster than C FILE i/o (in particular Dietmar Kuhl made such an implementation), but where the complexity of and the guidance offered by the "design" is such that nearly all extant implementations are painfully slow, even compared to C FILE. So, we generally talk about iostreams being slow, knowing full well what we mean and that fast implementations are theoretically possible (as evidenced by Dietmar's) -- but "fast" and "slow" are in-practice terms, and so what matters is in-practice, like, how does your compiler's iostreams implementation hold up.

OK, let me work this one out. Because most iostreams implementations are very slow, C++ is a slow
language. But since Kuhl did a high-performance implementation, he made C++ into a fast language.
But since most people don't use his iostreams implementation, C++ is a slow language again, except
for organizations that have banned iostreams (as my previous employers did) because it's too slow,
therefore C++ is a fast language.

Being imprecise is so much fun! I should write my programs this imprecisely.

More seriously, when someone says `xxx is a slow language', the only thing they can possibly mean
is `there is no implementation in existence, and no likelihood of an implementation being possible,
that is efficient enough to solve my problem in the required time' or perhaps `I must write peculiar
code in order to get programs to run in the specified time; writing code in the way the language seems
to encourage produces programs that are too slow'. This is a very sweeping statement, and at the very
least ought to be accompanied by some kind of proof. If Python is indeed a slow language, then Unladen
Swallow and pypy, and many other projects, are wastes of time, and should not be continued.

Again, this doesn't have anything to do with features of an implementation that are slow or fast.
The only criterion that makes sense is `do programs run with the required performance if written
in the way the language's inventors encourage'. Most implementations of every language have a nook
or two where things get embarrassingly slow; the question is `are most programs unacceptably slow'.

But, hey, if we are ok with being imprecise, let's go for it. Instead of saying `slow' and `fast',
why not say `good' and `bad'?

-- v

Robert Brown

unread,
Nov 14, 2009, 2:39:48 AM11/14/09
to

Vincent Manis <vma...@telus.net> writes:

> On 2009-11-13, at 18:02, Robert Brown wrote:
>
>> Common Lisp and Scheme were designed by people who wanted to write
>> complicated systems on machines with a tiny fraction of the horsepower of
>> current workstations. They were carefully designed to be compiled
>> efficiently, which is not the case with Python. There really is a
>> difference here. Python the language has features that make fast
>> implementations extremely difficult.
>
> Not true. Common Lisp was designed primarily by throwing together all of the
> features in every Lisp implementation the design committee was interested
> in. Although the committee members were familiar with high-performance
> compilation, the primary impetus was to achieve a standardized language that
> would be acceptable to the Lisp community. At the time that Common Lisp was
> started, there was still some sentiment that Lisp machines were the way to
> go for performance.

Common Lisp blends together features of previous Lisps, which were designed to
be executed efficiently. Operating systems were written in these variants.
Execution speed was important. The Common Lisp standardization committee
included people who were concerned about performance on C-optimized hardware.

> As for Scheme, it was designed primarily to satisfy an aesthetic of
> minimalism. Even though Guy Steele's thesis project, Rabbit, was a Scheme
> compiler, the point here was that relatively simple compilation techniques
> could produce moderately reasonable object programs. Chez Scheme was indeed
> first run on machines that we would nowadays consider tiny, but so too was
> C++. Oh, wait, so was Python!

The Scheme standard has gone through many revisions. I think we're up to
version 6 at this point. The people working on it are concerned about
performance. For instance, see the discussions about whether the order of
evaluating function arguments should be specified. Common Lisp evaluates
arguments left to right, but Scheme leaves the order unspecified so the
compiler can better optimize. You can't point to Rabbit (1978 ?) as
representative of the Scheme programming community over the last few decades.

> Using Python 3 annotations, one can imagine a Python compiler that does the
> appropriate thing (shown in the comments) with the following code.

I can imagine a lot too, but we're talking about Python as it's specified
*today*. The Python language as it's specified today is hard to execute
quickly. Not impossible, but very hard, which is why we don't see fast Python
systems.

bob

sturlamolden

unread,
Nov 14, 2009, 2:51:33 AM11/14/09
to
On 14 Nov, 08:39, Robert Brown <bbr...@speakeasy.net> wrote:

> > Using Python 3 annotations, one can imagine a Python compiler that does the
> > appropriate thing (shown in the comments) with the following code.
>
> I can imagine a lot too, but we're talking about Python as it's specified
> *today*.  The Python language as it's specified today is hard to execute
> quickly.  Not impossible, but very hard, which is why we don't see fast Python
> systems.

It would not be too difficult to have a compiler like Cython recognize
those annotations instead of current "cdef"s.

With Cython we can get "Python" to run at "the speed of C" just by
adding in optional type declarations for critical variables (most need
not be declared).

With CMUCL and SBCL we can make Common Lisp perform at "the speed of
C", for the same reason.

Also a Cython program will usually out-perform most C code. It
combines the strengths of C, Fortran 90 and Python.


Vincent Manis

unread,
Nov 14, 2009, 2:55:22 AM11/14/09
to Robert Brown, pytho...@python.org
On 2009-11-13, at 23:20, Robert Brown wrote, quoting me:
> On 2009-11-13, at 17:42, Robert Brown wrote, quoting me:
>
>>> ... Python *the language* is specified in a way that
>>> makes executing Python programs quickly very very difficult.
>
>> That is untrue. I have mentioned before that optional declarations integrate
>> well with dynamic languages. Apart from CL and Scheme, which I have
>> mentioned several times, you might check out Strongtalk (typed Smalltalk),
>> and Dylan, which was designed for high-performance compilation, though to my
>> knowledge no Dylan compilers ever really achieved it.
>
> You are not making an argument, just mentioning random facts. You claim I've
> made a false statement, then talk about optional type declarations, which
> Python doesn't have. Then you mention Smalltalk and Dylan. What's your
> point? To prove me wrong you have to demonstrate that it's not very difficult
> to produce a high performance Python system, given current Python semantics.
The false statement you made is that `... Python *the language* is specified
in a way that makes executing Python programs quickly very very difficult'.
I refuted it by citing several systems that implement languages with semantics
similar to those of Python, and do so efficiently.

>> I've never looked at CLPython. Did it use a method cache (see Peter
>> Deutsch's paper on Smalltalk performance in the unfortunately out-of-print
>> `Smalltalk-80: Bits of History, Words of Advice'? That technique is 30 years
>> old now.
>
> Please look at CLPython. The complexity of some Python operations will make
> you weep. CLPython uses Common Lisp's CLOS method dispatch in various places,
> so yes, those method lookups are definitely cached.

Ah, that does explain it. CLOS is most definitely the wrong vehicle for implementing
Python method dispatch. CLOS is focused around generic functions that themselves
do method dispatch, and do so in a way that is different from Python's. If I were
building a Python implementation in CL, I would definitely NOT use CLOS, but
do my own dispatch using funcall (the CL equivalent of the now-vanished Python
function apply).

> Method lookup is just the tip if the iceburg. How about comparison? Here are
> some comments from CLPython's implementation of compare. There's a lot going
> on. It's complex and SLOW.

Re comparison. Python 3 has cleaned comparison up a fair bit. In particular, you
can no longer compare objects of different types using default comparisons.
However, it could well be that there are nasty little crannies of inefficiency
there; they could be the subject of PEPs after the moratorium is over.

> <quoting from the CLPython code>


> ;; The CPython logic is a bit complicated; hopefully the following
> ;; is a correct translation.

I can see why CLPython has such troubles. The author has endeavoured to copy
CPython faithfully, using an implementation language (CLOS) that is hostile
to Python method dispatch.

OK, let me try this again. My assertion is that with some combination of JITting,
reorganization of the Python runtime, and optional static declarations, Python
can be made acceptably fast, which I define as program runtimes on the same order
of magnitude as those of the same programs in C (Java and other languages have
established a similar goal). I am not pushing optional declarations, as it's
worth seeing what we can get out of JITting. If you wish to refute this assertion,
citing behavior in CPython or another implementation is not enough. You have to
show that the stated feature *cannot* be made to run in an acceptable time.

For example, if method dispatch required an exponential-time algorithm, I would
agree with you. But a hypothetical implementation that did method dispatch in
exponential time for no reason would not be a proof, but would rather just be
a poor implementation.

-- v

sturlamolden

unread,
Nov 14, 2009, 3:05:59 AM11/14/09
to
On 12 Nov, 18:33, J Kenneth King <ja...@agentultra.com> wrote:

> Where Python might get hit *as a language* is that the Python programmer
> has to drop into C to implement optimized data-structures for dealing
> with the kind of IO that would slow down the Python interpreter.  That's
> why we have numpy, scipy, etc.

That's not a Python-specific issue. We drop to SciPy/NumPy for certain
compute-bound tasks that operate on vectors. If that does not help,
we drop further down to Cython, C or Fortran. If that does not help,
we can use assembly. In fact, if we use SciPy linked against GotoBLAS,
a lot of compute-intensive work solving linear algebra is delegated to
hand-optimized assembly.

With Python we can stop at the level of abstraction that gives
acceptable performance. When using C, we start out at a much lower
level. The principle that premature optimization is the root of all
evil applies here: Python code that is fast enough is fast enough. It
does not matter that hand-tuned assembly will be 1000 times faster. We
can direct our optimization effort to the parts of the code that needs
it.
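
As a concrete illustration of stopping at the level of abstraction that is
fast enough, here is a minimal sketch, assuming only that NumPy is installed
(the numbers you get will of course vary by machine):

import timeit
import numpy as np

def dot_pure(xs, ys):
    # Pure-Python inner product: every iteration goes through the interpreter.
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total

def dot_numpy(xs, ys):
    # The same computation delegated to NumPy's compiled inner loop.
    return np.dot(xs, ys)

n = 100000
a = np.random.rand(n)
b = np.random.rand(n)

print(timeit.timeit(lambda: dot_pure(a, b), number=10))
print(timeit.timeit(lambda: dot_numpy(a, b), number=10))

The Python-level loop is the part worth delegating; the surrounding program
can stay in Python without anyone noticing.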


Paul Rubin

unread,
Nov 14, 2009, 3:10:50 AM11/14/09
to
sturlamolden <sturla...@yahoo.no> writes:
> With Cython we can get "Python" to run at "the speed of C" just by
> adding in optional type declarations for critical variables (most need
> not be declared).

I think there are other semantic differences too. For general
thoughts on such differences (Cython is not mentioned though), see:

http://dirtsimple.org/2005/10/children-of-lesser-python.html

Vincent Manis

unread,
Nov 14, 2009, 3:20:13 AM11/14/09
to Robert Brown, pytho...@python.org
On 2009-11-13, at 23:39, Robert Brown wrote, quoting me:
> Common Lisp blends together features of previous Lisps, which were designed to
> be executed efficiently. Operating systems were written in these variants.
> Execution speed was important. The Common Lisp standardization committee
> included people who were concerned about performance on C-optimized hardware.
Guy L Steele, Jr., `Common Lisp The Language' 1/e (1984). p. 1 `COMMON LISP
is intended to meet these goals: Commonality [...] Portability [...] Consistency
[...] Expressiveness [...] Compatibility [...] Efficiency [...] Power [...]
Stability [...]' The elided text amplifies each of the points. I repeat: the
purpose of Common Lisp was to have a standard Lisp dialect; efficiency was
less of an issue for those investigators.

As for C-optimized hardware, well, the dialects it aims to be compatible with
are ZetaLisp (Symbolics Lisp Machine), MacLisp (PDP-10), and Interlisp (PDP-10,
originally).

CLtL mentions S-1 Lisp as its exemplar of high numerical performance. Unfortunately,
S-1 Lisp, written by Richard Gabriel and Rod Brooks, was never finished. MacLisp was
a highly efficient implementation, as I've mentioned. I worked at BBN at the time
Interlisp flourished; it was many things, some of them quite wonderful, but efficiency
was NOT its goal.

> The Scheme standard has gone through many revisions. I think we're up to
> version 6 at this point. The people working on it are concerned about
> performance.

Yes, they are. You should see <a>'s rants about how <b> specified certain
features so they'd be efficient on his implementation. I had real people's
names there, but I deleted them in the interests of not fanning flamewar
flames.

> For instance, see the discussions about whether the order of
> evaluating function arguments should be specified.

That was a long time ago, and had as much if not more to do with making
arguments work the same as let forms as it had to do with efficiency.
But I'll point out that the big feature of Scheme is continuations, and
it took quite a few years after the first Scheme implementations came out
to make continuations stop being horrendously *IN*efficient.

> You can't point to Rabbit (1978 ?) as
> representative of the Scheme programming community over the last few decades.

I didn't. I used it to buttress YOUR argument that Schemers have always been
concerned with performance.

>> Using Python 3 annotations, one can imagine a Python compiler that does the
>> appropriate thing (shown in the comments) with the following code.
> I can imagine a lot too, but we're talking about Python as it's specified
> *today*. The Python language as it's specified today is hard to execute
> quickly. Not impossible, but very hard, which is why we don't see fast Python
> systems.

Python 3 annotations exist. Check the Python 3 Language Reference.
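
For concreteness, here is a minimal sketch of the kind of hinted code being
imagined (plain Python 3, runnable as-is); the annotations are only hints that
a hypothetical optimizing implementation might exploit, and CPython today
merely records them:

# CPython stores these annotations in __annotations__ and otherwise ignores
# them; a hypothetical compiler could treat them as optimization hints.
def horner(coeffs: list, x: float) -> float:
    # With trustworthy hints, the loop below could use unboxed float
    # arithmetic instead of generic object dispatch.
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc

print(horner([1.0, -2.0, 3.0], 0.5))   # ordinary call; semantics unchanged
print(horner.__annotations__)          # {'coeffs': <class 'list'>, ...}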

I notice you've weakened your claim. Now we're down to `hard to execute
quickly'. That I would agree with you on, in that building an efficient
Python system would be a lot of work. However, my claim is that that work
is engineering, not research: most of the bits and pieces of how to implement
Python reasonably efficiently are known and in the public literature. And
that has been my claim since the beginning.

-- v

Alf P. Steinbach

unread,
Nov 14, 2009, 3:22:03 AM11/14/09
to
* Vincent Manis:

:-)

You're piling up so extremely many fallacies in one go that I just quoted it all.

Anyways, it's a good example of focusing on irrelevant and meaningless precision
plus at the same time utilizing imprecision, higgledy-piggledy as it suits one's
argument. Mixing hard precise logic with imprecise concepts and confounding e.g.
universal quantification with existential quantification, for best effect
several times in the same sentence. Like the old Very Hard Logic + imprecision
adage: "we must do something. this is something. ergo, we must do this".

It's just idiocy.

But fun.


Cheers & hth.,

- Alf

sturlamolden

unread,
Nov 14, 2009, 3:36:23 AM11/14/09
to
On 12 Nov, 18:32, "Alf P. Steinbach" <al...@start.no> wrote:

> Of course Python is slow: if you want speed, pay for it by complexity.

Python is slow is really a misconception. Python is used for
scientific computing at HPC centres around the world. NumPy's
predecessor numarray was made by NASA for the Hubble space telescope.
Python is slow for certain types of tasks, particularly iterative
compute-bound work. But who says you have to use Python for this? It
can easily be delegated to libraries written in C or Fortran.

I can easily demonstrate Python being faster than C. For example, I
could compare the speed of appending strings to a list and "".join
(strlist) with multiple strcats in C. I can easily demonstrate C being
faster than Python as well.
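
For instance, here is a minimal, runnable sketch of that list-append/"".join
idiom (pure Python; the sizes are arbitrary):

import timeit

def concat_naive(n):
    s = ""
    for i in range(n):
        s += str(i)           # may copy the growing string on each iteration
    return s

def concat_join(n):
    parts = []
    for i in range(n):
        parts.append(str(i))  # cheap amortized appends
    return "".join(parts)     # one final pass builds the result

print(timeit.timeit(lambda: concat_naive(10000), number=100))
print(timeit.timeit(lambda: concat_join(10000), number=100))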

To get speed from a high-level language like Python you have to
leverage high-level data types. But then you cannot compare
algorithms in C and Python directly.

Also consider that most programs today are not CPU-bound: they are I/O
bound or memory-bound. Using C does not give you faster disk access,
faster ethernet connection, or faster RAM... It does not matter that
computation is slow if the CPU is starved anyway. We have to consider
what actually limits the speed of a program.

Most of all I don't care that computation is slow if slow is fast
enough. For example, I have a Python script that parses OpenGL headers
and writes a declaration file for Cython. It takes a fraction of a
second to complete. Should I migrate it to C to make it 20 times
faster? Or do you really think I care if it takes 20 ms or just 1 ms
to complete? The only harm the extra CPU cycles did was a minor
contribution to global warming.


Vincent Manis

unread,
Nov 14, 2009, 3:37:23 AM11/14/09
to pytho...@python.org
On 2009-11-14, at 00:22, Alf P. Steinbach wrote, in response to my earlier post.

> Anyways, it's a good example of focusing on irrelevant and meaningless precision plus at the same time utilizing imprecision, higgledy-piggledy as it suits one's argument. Mixing hard precise logic with imprecise concepts and confounding e.g. universal quantification with existential quantification, for best effect several times in the same sentence. Like the old Very Hard Logic + imprecision adage: "we must do something. this is something. ergo, we must do this".

OK, now we've reached a total breakdown in communication, Alf. You appear to take exception to
distinguishing between a language and its implementation. My academic work, before I became a computer
science/software engineering instructor, was in programming language specification and implementation,
so I *DO* know what I'm talking about here. However, you and I apparently are speaking on different
wavelengths.

> It's just idiocy.
Regretfully, I must agree.

> But fun.
Not so much, from my viewpoint.

-- v

sturlamolden

unread,
Nov 14, 2009, 3:43:19 AM11/14/09
to
On 14 Nov, 02:42, Robert Brown <bbr...@speakeasy.net> wrote:

> If you want to know why Python *the language* is slow, look at the Lisp code
> CLPython generates and at the code implementing the run time.  Simple
> operations end up being very expensive.

You can also see this by looking at the C that Cython or Pyrex
generates.

You can also see the dramatic effect by a handful of strategically
placed type declarations.


Alf P. Steinbach

unread,
Nov 14, 2009, 3:47:28 AM11/14/09
to
* sturlamolden:

> On 12 Nov, 18:32, "Alf P. Steinbach" <al...@start.no> wrote:
>
>> Of course Python is slow: if you want speed, pay for it by complexity.
>
> Python is slow is really a misconception.

Sorry, no, I don't think so.

But we can't know that without ESP powers.

Which seem to be in short supply.


> Python is used for
> scientific computing at HPC centres around the world. NumPy's
> predecessor numarray was made by NASA for the Hubble space telescope.
> Python is slow for certain types of tasks, particularly iterative
> compute-bound work. But who says you have to use Python for this? It
> can easily be delegated to libraries written in C or Fortran.

Yes, that's what I wrote immediately following what you quoted.


> I can easily demonstrate Python being faster than C. For example, I
> could compare the speed of appending strings to a list and "".join
> (strlist) with multiple strcats in C. I can easily demonstrate C being
> faster than Python as well.

That is a straw man argument (which is one of the classic fallacies), that is,
attacking a position that nobody's argued for.


> To get speed from a high-level language like Python you have to
> leverage on high-level data types. But then you cannot compare
> algorithms in C and Python directly.
>
> Also consider that most program today are not CPU-bound: They are i/o
> bound or memory-bound. Using C does not give you faster disk access,
> faster ethernet connection, or faster RAM... It does not matter that
> computation is slow if the CPU is starved anyway. We have to consider
> what actually limits the speed of a program.
>
> Most of all I don't care that computation is slow if slow is fast
> enough. For example, I have a Python script that parses OpenGL headers
> and writes a declaration file for Cython. It takes a fraction of a
> second to complete. Should I migrate it to C to make it 20 times
> faster? Or do you really think I care if it takes 20 ms or just 1 ms
> to complete? The only harm the extra CPU cycles did was a minor
> contribution to global warming.

Yeah, that's what I wrote immediately following what you quoted.

So, except for the straw man arg and to what degree there is a misconception,
which we can't know without ESP, it seems we /completely agree/ on this. :-)

sturlamolden

unread,
Nov 14, 2009, 4:02:49 AM11/14/09
to
On 12 Nov, 18:32, "Alf P. Steinbach" <al...@start.no> wrote:

> Hm, this seems religious.


>
> Of course Python is slow: if you want speed, pay for it by complexity.

Not really. The speed problems of Python can to a large extent be
attributed to a sub-optimal VM.

Perl tends to be much faster than Python.

Certain Common Lisp and Scheme implementations can often perform
comparable to C++.

There are JIT-compiled JavaScript which are very efficient.

Java's HotSpot JIT comes from Strongtalk, a fast version of Smalltalk.
It's not the static typing that makes Java run fast. It is a JIT
originally developed for a dynamic language. Without HotSpot, Java can
be just as bad as Python.

Even more remarkable: Lua with LuaJIT performs at about 80% of GCC's speed on
the Debian benchmarks. Question: Why is Lua so fast and Python so slow?
Here we have two very similar dynamic scripting languages. One beats
JIT-compiled Java and almost competes with C. The other is the slowest
there is. Why? A lot of it has to do with the simple fact that Python's
VM is stack-based whereas Lua's VM is register-based. Stack-based VMs
are bad for branch prediction and work against modern CPUs. Python
has reference counting, which is bad for cache. Lua has a tracing GC.
But these are all implementation details totally orthogonal to the
languages. Python on a better VM (LuaJIT, Parrot, LLVM, several
JavaScript engines) will easily outperform CPython by orders of magnitude.

Sure, Google can brag about Go running at 80% of C speed, after
introducing static typing. But LuaJIT does the same without any typing
at all.

Alf P. Steinbach

unread,
Nov 14, 2009, 4:11:40 AM11/14/09
to
* Vincent Manis:

> On 2009-11-14, at 00:22, Alf P. Steinbach wrote, in response to my earlier post.
>
>> Anyways, it's a good example of focusing on irrelevant and meaningless
>> precision plus at the same time utilizing imprecision, higgledy-piggledy
>> as it suits one's argument. Mixing hard precise logic with imprecise
>> concepts and confounding e.g. universal quantification with existential
>> quantification, for best effect several times in the same sentence. Like
>> the old Very Hard Logic + imprecision adage: "we must do something. this
>> is something. ergo, we must do this".
>
> OK, now we've reached a total breakdown in communication, Alf. You appear
> to take exception to distinguishing between a language and its implementation.

Not at all.

But that doesn't mean that making that distinction is always meaningful.

It's not like "there exists a context where making the distinction is not
meaningful" means that "in all contexts making the distinction is meaningful".

So considering that, my quoted comment about confounding universal
quantification with existential quantification was spot on... :-)

In some contexts, such as here, it is meaningless and just misleading to add the
extra precision of the distinction between language and implementation.
Academically it's there. But it doesn't influence anything (see below).

Providing a counter-example, i.e. a really fast Python implementation for the kind of
processing mix that Google does, available for the relevant environments, would
be relevant.

Bringing in the hypothetical possibility of the future existence of such an
implementation is, OTOH, only hot air.

If someone were to apply the irrelevantly-precise kind of argument to that, then
one could say that future hypotheticals don't have anything to do with what
Python "is", today. Now, there's a fine word-splitting distinction... ;-)


> My academic work, before I became a computer science/software engineering
> instructor, was in programming language specification and implementation,
> so I *DO* know what I'm talking about here. However, you and I apparently
> are speaking on different wavelengths.

Granted that you haven't related incorrect facts, and I don't think anyone here
has, IMO the conclusions and implied conclusions still don't follow.

Alf P. Steinbach

unread,
Nov 14, 2009, 4:17:15 AM11/14/09
to
* sturlamolden:

Good points and good facts.

And you dispensed with the word-splitting terminology discussion, writing just
"The other [language] is the slowest". Currently. He he. :-)

And it is, as you imply, totally in the in-practice domain.


Cheers,

- Alf

Vincent Manis

unread,
Nov 14, 2009, 4:48:25 AM11/14/09
to pytho...@python.org
On 2009-11-14, at 01:11, Alf P. Steinbach wrote:
>> OK, now we've reached a total breakdown in communication, Alf. You appear
>> to take exception to distinguishing between a language and its implementation.
>
> Not at all.
>
> But that doesn't mean that making that distinction is always meaningful.
It certainly is. A language is a (normally) infinite set of strings with a way of ascribing
a meaning to each string.

A language implementation is a computer program of some sort, which is a finite set of bits
representing a program in some language, with the effect that the observed behavior of the
implementation is that strings in the language are accepted, and the computer performs the
operations defined by the semantics.

These are always different things.

> It's not like "there exists a context where making the distinction is not meaningful" means that "in all contexts making the distinction is meaningful".

Because they are different things, in all cases the distinction is meaningful.

>
> So considering that, my quoted comment about confounding universal quantification with existential quantification was spot on... :-)

It was not spot on. The examples I provided were just that, examples to help people see the
difference. They were not presented as proof. The proof comes from the definitions above.

> In some contexts, such as here, it is meaningless and just misleading to add the extra precision of the distinction between language and implementation. Academically it's there. But it doesn't influence anything (see below).

Your assertion that this distinction is meaningless must be based upon YOUR definitions of words
like `language' and `implementation'. Since I don't know your definitions, I cannot respond to this
charge.

> Providing a counter example, a really fast Python implementation for the kind of processing mix that Google does, available for the relevant environments, would be relevant.

I have presented arguments that the technologies for preparing such an implementation are
basically known, and in fact there are projects that aim to do exactly that.

>
> Bringing in the hypothethical possibility of a future existence of such an implementation is, OTOH., only hot air.

Hmm...in every programming project I have ever worked on, the goal was to write code that
didn't already exist.

> If someone were to apply the irrelevantly-precise kind of argument to that, then one could say that future hypotheticals don't have anything to do with what Python "is", today. Now, there's a fine word-splitting distinction... ;-)

Python is a set of strings, with a somewhat sloppily-defined semantics that ascribes meaning to the legal strings in the language. It was thus before any implementation existed, although I imagine that the original Python before GvR wrote any code had many differences from what Python is today.

It is quite common for language designers to specify a language completely without regard to an implementation, or only a `reference' implementation that is not designed for performance or
robustness. The `good' implementation comes after the language has been defined (though again
languages and consequently implementations are almost always modified after the original release).
If you like, a language is part of (but not all of) the set of requirements for the implementation.

Alf, if you want to say that this is a difference that makes no difference, don't let me
stop you. You are, however, completely out of step with the definitions of these terms as used
in the field of programming languages.

>> My academic work, before I became a computer science/software engineering
>> instructor, was in programming language specification and implementation, so I *DO* know what I'm talking about here. However, you and I apparently
>> are speaking on different wavelengths.
>
> Granted that you haven't related incorrect facts, and I don't think anyone here has, IMO the conclusions and implied conclusions still don't follow.

The fact that you see the situation that way is a consequence of the fact that we're on different
wavelengths.

-- v

sturlamolden

unread,
Nov 14, 2009, 5:22:54 AM11/14/09
to
On 14 Nov, 09:47, "Alf P. Steinbach" <al...@start.no> wrote:

> > Python is slow is really a misconception.
>
> Sorry, no, I don't think so.

No, I really think a lot of the perceived slowness in Python comes
from bad programming practices. Sure, we can demonstrate that C or
LuaJIT is faster by orders of magnitude for CPU-bound tasks like
comparing DNA sequences or calculating the value of pi.

But let me give an example to the contrary from graphics programming,
one that we often run into when using OpenGL. This is not a toy
benchmark problem but one that is frequently encountered in real
programs.

We all know that calling functions in Python has a big overhead. There
is a dictionary lookup for the attribute name, and arguments are
packed into a tuple (and sometimes a dictionary). Thus calling
glVertex* repeatedly from Python will hurt. Doing it from C or Fortran
might still be ok (albeit not always recommended). So should we
conclude that Python is too slow and use C instead?

No!

What if we use glVertexArray or a display list instead? In the case of a
vertex array (e.g. using NumPy ndarray for storage), there is
practically no difference in performance of C and Python. With a
display list, there is a difference on creation, but not on
invocation. So slowness from calling glVertex* multiple times is
really slowness from bad Python programming. I use numpy ndarrays to
store vertices, and pass them to OpenGL as a vertex arrays, instead of
hammering on glVertex* in a tight loop. And speed-wise, it does not
really matter if I use C or Python.

But what if we need some computation in the graphics program as well?
We might use OpenCL, DirectCompute or OpenGL vertex shaders to control
the GPU. Will C be better than Python for this? Most likely not. A
program for the GPU is compiled by the graphics driver at run-time
from a text string passed to it. It is much better to use Python than
C to generate these. Will C on the CPU be better than OpenCL or a
vertex shader on the GPU? Most likely not.

So we might perhaps conclude that Python (with numpy) is better than C
for high-performance graphics? Even though Python is slower than C, we
can do just as well as C programmers by not falling into a few stupid
pitfalls. Is Python really slower than C for practical programming
like this? Superficially, perhaps yes. In practice, only if you use it
badly. But that's not Python's fault.

But if you run a CPU-bound benchmark like the Debian ones, or time thousands
of calls to glVertex*, yes, it will look like C is much better. But it
does not directly translate to the performance of a real program. The
slower can be the faster; it all depends on the programmer.


Two related issues:

- For the few cases where a graphics program really need C, we can
always resort to using ctypes, f2py or Cython. Gluing Python with C or
Fortran is very easy using these tools. That is much better than
keeping it all in C++.

- I mostly find myself using Cython instead of Python for OpenGL. That
is because I am unhappy with PyOpenGL. It was easier to expose the
whole of OpenGL to Cython than create a full or partial wrapper for
Python. With Cython there is no extra overhead from calling glVertex*
in a tight loop, so we get the same performance as C in this case.
But because I store vertices in NumPy arrays on the Python side, I
mostly end up using glVertexArray anyway.

Roel Schroeven

unread,
Nov 14, 2009, 5:28:48 AM11/14/09
to
Vincent Manis wrote:

> On 2009-11-14, at 01:11, Alf P. Steinbach wrote:
>>> OK, now we've reached a total breakdown in communication, Alf. You appear
>>> to take exception to distinguishing between a language and its implementation.
>> Not at all.
>>
>> But that doesn't mean that making that distinction is always meaningful.
> It certainly is. A language is a (normally) infinite set of strings with a way of ascribing
> a meaning to each string.

That's true, for sure.

But when people in the Python community use the word Python, the word is
not used in the strict sense of Python the language. They use it to
refer to both the language and one or more of its implementations, mostly
one of the existing and working implementations, and in most cases
CPython (and sometimes it can also include the documentation, the
website or the community).

Example: go to http://python.org. Click Download. That page says
"Download Python
The current product versions are Python 2.6.4 and Python 3.1.1
..."
You can't download a language, but you can download an implementation.
Clearly, even the project's website itself uses Python not only to refer
to the language, but also to its main implementation (and in a few
places to other implementations).

From that point of view, your distinction between languages and
implementations is correct but irrelevant. What is relevant is that all
currently usable Python implementations are slow, and it's not incorrect
to say that Python is slow.

If and when a fast Python implementation gets to a usable state and
gains traction (in the hopefully not too distant future), that changes.
We'll have to say that Python can be fast if you use the right
implementation. And once the most commonly used implementation is a fast
one, we'll say that Python is fast, unless you happen to use a slow
implementation for one reason or another.

--
The saddest aspect of life right now is that science gathers knowledge
faster than society gathers wisdom.
-- Isaac Asimov

Roel Schroeven

Alf P. Steinbach

unread,
Nov 14, 2009, 5:30:36 AM11/14/09
to
* Vincent Manis:

> On 2009-11-14, at 01:11, Alf P. Steinbach wrote:
>>> OK, now we've reached a total breakdown in communication, Alf. You appear
>>> to take exception to distinguishing between a language and its implementation.
>> Not at all.
>>
>> But that doesn't mean that making that distinction is always meaningful.
> It certainly is. A language is a (normally) infinite set of strings with a way of ascribing
> a meaning to each string.
>
> A language implementation is a computer program of some sort, which is a finite set of bits
> representing a program in some language, with the effect that the observed behavior of the
> implementation is that strings in the language are accepted, and the computer performs the
> operations defined by the semantics.
>
> These are always different things.

Well, there you have it, your basic misconception.

Sometimes, when that's practically meaningful, people use the name of a language
to refer to both, as whoever it was did up-thread.

Or, they might mean just the latter. :-)

Apply some intelligence and it's all clear.

Stick boneheadedly to preconceived distinctions and absolute context independent
meanings, and statements using other meanings appear to be meaningless or very
unclear.

[snippety]


Cheers & hth.,

- Alf

PS: You might, or might not, benefit from looking up Usenet discussions on the
meaning of "character code", which is a classic case of the confusion you have
here. There's even a discussion of that in some RFC somewhere, I think it was
MIME-related. Terms mean different things in different *contexts*.

Roel Schroeven

unread,
Nov 14, 2009, 5:35:42 AM11/14/09
to
Vincent Manis wrote:

> I notice you've weakened your claim. Now we're down to `hard to execute
> quickly'. That I would agree with you on, in that building an efficient
> Python system would be a lot of work. However, my claim is that that work
> is engineering, not research: most of the bits and pieces of how to implement
> Python reasonably efficiently are known and in the public literature. And
> that has been my claim since the beginning.

You talk about what can be and what might be. We talk about what is.

The future is an interesting place, but it's not here.

Willem Broekema

unread,
Nov 14, 2009, 9:13:47 AM11/14/09
to
On Nov 14, 8:55 am, Vincent Manis <vma...@telus.net> wrote:
> On 2009-11-13, at 23:20, Robert Brown wrote, quoting me:
> > Please look at CLPython. [...]

> Ah, that does explain it.

I bet you didn't even look at it. FWIW, I'm the author of CLPython.

> CLOS is most definitely the wrong vehicle for implementing
> Python method dispatch. CLOS is focused around generic functions that themselves
> do method dispatch, and do so in a way that is different from Python's. If I were
> building a Python implementation in CL, I would definitely NOT use CLOS, but
> do my own dispatch using funcall (the CL equivalent of the now-vanished Python
> function apply).

CLOS is way more than method dispatch, it's an infrastructure for
classes, metaclasses, slots, method combinations. And within method
dispatch there are lots of opportunities to customize the behaviour.
Ignoring all that functionality when implementing an object-oriented
language, and using "dispatch using funcall", whatever that means, just
sounds ridiculous. (And funcall != apply.)

> > Method lookup is just the tip if the iceburg. How about comparison? Here are

> > some comments from CLPython's implementation of compare. There's a lot going


> > on. It's complex and SLOW.

Right, although by special-casing the most common argument types you
can save most of that lookup.

> Re comparison. Python 3 has cleaned comparison up a fair bit. In particular, you
> can no longer compare objects of different types using default comparisons.
> However, it could well be that there are nasty little crannies of inefficiency
> there, they could be the subject of PEPs after the moratorium is over.

It might have gotten a bit better, but the central message still
stands: Python has made design choices that make efficient compilation
hard.

> OK, let me try this again. My assertion is that with some combination of JITting,
> reorganization of the Python runtime, and optional static declarations, Python
> can be made acceptably fast,

That does not contradict that, had other language design choices been
made, it could be much easier to get better performance. Python may in
general be about as dynamic as Common Lisp from a _user_ perspective,
but from an implementor's point of view Python is harder to make
run efficiently.

Here are some examples of design choices in Common Lisp that help it
perform very well, while in Python there is more freedom at the cost
of performance:

- Lisp hashtables, arrays, numbers, and strings are not subclassable.
The specialized operations on them, like function aref for array index
referencing, don't invoke arbitrary user-level code, and don't trigger
lookup of magic methods either.

- Certain Lisp sequence objects, like lists and arrays, can easily be
allocated on the stack;

- Lisp allows type declarations, like for variables, array types,
function arguments and return values.

- Even if Python had type declarations, it is possible to define a
subclass that redefines semantics. E.g. it's possible to subclass
'int' and redefine what '+' means, making a declaration that "x is of
type int" not as valuable as in Lisp (a few of these points are sketched in
code after this list).

- A recursive function defined at module level cannot assume that
its name refers to itself.

- Every function can be called using keyword arguments or positional
arguments. E.g. with the definition "def f(x,y): ..", some possible
calls are: f(1), f(1,2), f(x=1, y=2), f(1,y=2) so every function must
be prepared to do keyword argument processing. (This can be considered
a lack of separation between internal details and external interface.)

- Every function call could potentially be a call to locals() (e.g.
f=locals; f()), which means every function that contains a function
call must store the value of all locals, even of "dead" variables.

- Built-in functions can be shadowed.

- The potential fields of a Python object are often not defined, as
arbitrary attributes can be set. Accessors for fields generally
have to retrieve the value from a dict.

- When limiting the potential fields of a class instance using
__slots__, subclasses may override __slots__, so this is hardly
limiting.

- Python attribute lookup and comparison (as shown in a previous
mail) are examples of hairy behaviour that often mean the lookup of
several (!) __magic__ methods and could invoke arbitrary user code.
(Lisp in particular offers "structures" whose definition is fixed,
that inline all accessors, for maximum efficiency.)
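
A few of these points are easy to demonstrate in a handful of lines of plain
Python (a hedged sketch; the class and function names are made up for
illustration):

# Subclassing int and redefining '+': a declaration "x is an int" no longer
# pins down what x + 1 means.
class LoudInt(int):
    def __add__(self, other):
        print("adding", int(self), "and", other)
        return int(self) + other

x = LoudInt(41)
print(x + 1)                 # runs user code on the way to 42

# Shadowing a built-in: code calling len() cannot be compiled assuming the
# real built-in, because the global (or builtins) binding may change later.
def measure(s):
    return len(s)

print(measure("hello"))      # 5
len = lambda obj: -1         # shadow the built-in at module level
print(measure("hello"))      # -1: the same bytecode now calls the impostor
del len                      # restore normal lookup

# A module-level "recursive" function cannot assume its own name is stable.
def fact(n):
    return 1 if n <= 1 else n * fact(n - 1)

original, fact = fact, lambda n: 0   # rebind the global name
print(original(5))                   # 0: the old body now calls the new fact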

This is just to show how language design leads to efficiency
characteristics. That there is a need for projects like NumPy and
Cython, follows in my eyes from Python being too dynamic for its own
good, with no way to tame it. In Common Lisp there would be less need
to go to C for speed, because of user-supplied type declarations and
compiler-based type inferencing.

It has been said that CLPython is a very good counterargument for
"just write a Python to Lisp compiler to make things fast", and even I
as its developer agree. Lisp offers lots of readily available
optimization opportunities, but Python simply doesn't.

I remember reading some years ago about a Smalltalk compiler guru who
said he would come up with a ridiculously fast Python implementation
based on all the message-sending optimizations he knew. It does not
surprise me that we still haven't heard from him.

- Willem

Paul Rubin

unread,
Nov 14, 2009, 1:10:43 PM11/14/09
to
sturlamolden <sturla...@yahoo.no> writes:
> Python on a better VM (LuaJIT, Parrot, LLVM, several
> JavaScript) will easily outperform CPython by orders of magnitide.


Maybe Python semantics make it more difficult to optimize than those
other languages. For example, in
a = foo.bar(1)
b = muggle()
c = foo.bar(2)
it is not ok to cache the value of foo.bar after the first assignment.
Maybe the second one goes and modifies it through foo.__dict__ .
See "Children of a Lesser Python" (linked in another post, or websearch)
for discussion.
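
A small runnable sketch of why that caching is unsafe (the names foo, bar and
muggle are just the placeholders from the snippet above):

class Foo:
    def bar(self, n):
        return n * 2

foo = Foo()

def muggle():
    # Rebind the attribute through the instance dict -- perfectly legal.
    foo.__dict__['bar'] = lambda n: n * 100

a = foo.bar(1)    # 2
b = muggle()
c = foo.bar(2)    # 200, not 4: a cached foo.bar would give the wrong answer

print(a, c)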

Grant Edwards

unread,
Nov 14, 2009, 1:40:19 PM11/14/09
to
On 2009-11-14, David Robinow <drob...@gmail.com> wrote:
> On Fri, Nov 13, 2009 at 3:32 PM, Paul Rubin
><http://phr...@nospam.invalid> wrote:
>> ... This is Usenet so
>> please stick with Usenet practices. If you want a web forum there are
>> plenty of them out there.
> Actually this is pytho...@python.org

Actually this is comp.lang.python

> I don't use usenet and I have no intention to stick with Usenet practices.


Terry Reedy

unread,
Nov 14, 2009, 5:02:19 PM11/14/09
to pytho...@python.org
sturlamolden wrote:

> - For the few cases where a graphics program really need C, we can
> always resort to using ctypes, f2py or Cython. Gluing Python with C or
> Fortran is very easy using these tools. That is much better than
> keeping it all in C++.

In case anyone thinks resorting to C or Fortran is cheating, they should
know that CPython, the implementation, was designed for this. That is
why there is a documented C-API and why the CPython devs are slow to
change it. Numerical Python dates back to at least 1.3 and probably
earlier. The people who wrote it were some of the first production users
of Python.

Terry Jan Reedy

Edward A. Falk

unread,
Nov 14, 2009, 5:34:20 PM11/14/09
to
In article <mailman.270.1257970...@python.org>,
Terry Reedy <tjr...@udel.edu> wrote:
>
>I can imagine a day when code compiled from Python is routinely
>time-competitive with hand-written C.

I can't. Too much about the language is dynamic. The untyped variables
alone are a killer.

int a,b,c;
...
a = b + c;

In C, this compiles down to just a few machine instructions. In Python,
the values in the variables need to be examined *at run time* to determine
how to add them or if they can even be added at all. You'll never in
a million years get that down to just two or three machine cycles.
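
You can see this from Python itself with the standard dis module (the exact
bytecode names vary between CPython versions, so take the output as
illustrative):

import dis

def add(a, b):
    # The generic add opcode must inspect both operand types at run time;
    # nothing here commits a and b to being machine integers.
    return a + b

dis.dis(add)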

Yes, technically, the speed of a language depends on its implementation,
but the nature of the language constrains what you can do in an
implementation. Python the language is inherently slower than C the
language, no matter how much effort you put into the implementation. This
is generally true for all languages without strongly typed variables.

--
-Ed Falk, fa...@despams.r.us.com
http://thespamdiaries.blogspot.com/

Terry Reedy

unread,
Nov 14, 2009, 5:45:00 PM11/14/09
to pytho...@python.org
Willem Broekema wrote:

> It might have gotten a bit better, but the central message still
> stands: Python has made design choices that make efficient compilation
> hard.
>
>> OK, let me try this again. My assertion is that with some combination of JITting,
>> reorganization of the Python runtime, and optional static declarations, Python
>> can be made acceptably fast,
>
> That does not contradict that, had other language design choices been
> made, it could be much easier to get better performance. Python may in
> general be about as dynamic as Common Lisp from a _user_ perspective,
> but from an implementator's point of view Python is harder to make it
> run efficiently.

I think you are right about the design choices. The reason for those
design choices is that Guido intended from the beginning that Python
implementations be part of open computational systems, and not islands
to themselves like Smalltalk and some Lisps. While the public CPython
C-API is *not* part of the Python language, Python was and has been
designed with the knowledge that there *would be* such an interface, and
that speed-critical code would be written in C or Fortran, or that
Python programs would interface with and use such code already written.

So: Python the language was designed for human readability, with the
knowledge that CPython the implementation (originally and still today
just called python.exe) would exist in a world where intensive
computation could be pushed onto C or Fortan when necessary.

So: to talk about the 'speed of Python', one should talk about the speed
of human reading and writing. On this score, Python, I believe, beats
most other algorithm languages, as intended. It certainly does for me.
To talk about the speed of CPython, one must, to be fair, talk about the
speed of CPython + extensions compiled to native code.

On the scale of human readability, I believe Google Go is a step
backwards from Python.

Terry Jan Reedy

Robert Brown

unread,
Nov 14, 2009, 5:56:39 PM11/14/09
to

Vincent Manis <vma...@telus.net> writes:
> The false statement you made is that `... Python *the language* is specified
> in a way that makes executing Python programs quickly very very difficult'.
> I refuted it by citing several systems that implement languages with
> semantics similar to those of Python, and do so efficiently.

The semantic details matter. Please read Willem's reply to your post. It
contains a long list of specific differences between Python (CPython) language
semantics and Common Lisp language semantics that cause Python performance to
suffer.

> OK, let me try this again. My assertion is that with some combination of
> JITting, reorganization of the Python runtime, and optional static
> declarations, Python can be made acceptably fast, which I define as program
> runtimes on the same order of magnitude as those of the same programs in C
> (Java and other languages have established a similar goal). I am not pushing
> optional declarations, as it's worth seeing what we can get out of
> JITting. If you wish to refute this assertion, citing behavior in CPython or
> another implementation is not enough. You have to show that the stated
> feature *cannot* be made to run in an acceptable time.

It's hard to refute your assertion. You're claiming that some future
hypothetical Python implementation will have excellent performance via a JIT.
On top of that you say that you're willing to change the definition of the
Python language, say by adding type declarations, if an implementation with a
JIT doesn't pan out. If you change the Python language to address the
semantic problems Willem lists in his post and also add optional type
declarations, then Python becomes closer to Common Lisp, which we know can be
executed efficiently, within the same ballpark as C and Java.

bob

Vincent Manis

unread,
Nov 14, 2009, 9:42:07 PM11/14/09
to pytho...@python.org
This whole thread has now proceeded to bore me senseless. I'm going to respond
once with a restatement of what I originally said. Then I'm going to drop it, and
never respond to the thread again. Much of what's below has been said by others
as well; I'm taking no credit for it, just trying to put it together into a coherent
framework.

1. The original question is `Is Python scalable enough for Google' (or, I assume
any other huge application). That's what I was responding to.

2. `Scalable' can mean performance or productivity/reliability/maintenance quality.
A number of posters conflated those. I'll deal with p/r/m by saying I'm not familiar
with any study that has taken real enterprise-type programs and compared, e.g.,
Java, Python, and C++ on the p/r/m criteria. Let's leave that issue by saying that
we all enjoy programming in Python, and Python has pretty much the same feature
set (notably modules) as any other enterprise language. This just leaves us with
performance.

3. Very clearly CPython can be improved. I don't take most benchmarks very seriously,
but we know that CPython interprets bytecode, and thus suffers relative to systems
that compile into native code, and likely to some other interpretative systems. (Lua
has been mentioned, and I recall looking at a presentation by the Lua guys on why they
chose a register-based rather than a stack-based approach.)

4. Extensions such as numpy can produce tremendous improvements in productivity AND
performance. One answer to `is Python scalable' is to rephrase it as `is Python+C
scalable'.

5. There are a number of JIT projects being considered, and one or more of these might
well hold promise.

6. Following Scott Meyers' outstanding advice (from his Effective C++ books), one should
prefer compile time to runtime wherever possible, if one is concerned about performance.
An implementation that takes hints from programmers, e.g., that a certain variable is
not to be changed, or that a given argument is always an int32, can generate special-case
code that is at least in the same ballpark as C, if not as fast.

This in no way detracts from Python's dynamic nature: these hints would be completely
optional, and would not change the semantics of correct programs. (They might cause
programs running on incorrect data to crash, but if you want performance, you are kind of
stuck). These hints would `turn off' features that are difficult to compile into efficient
code, but would do so only in those parts of a program where, for example, it was known that
a given variable contains an int32. Dynamic (hint-free) and somewhat less-dynamic (hinted)
code would coexist. This has been done for other languages, and is not a radically new
concept.

Such hints already exist in the language; __slots__ is an example.

The language, at least as far as Python 3 is concerned, has pretty much all the machinery
needed to provide such hints. Mechanisms that are recognized specially by a high-performance
implementation (imported from a special module, for example) could include: annotations,
decorators, metaclasses, and assignment to special variables like __slots__.
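
As a minimal illustrative sketch of such a hint (assuming nothing beyond
standard Python 3; no current implementation is required to exploit it),
consider ordinary annotations. CPython stores but does not act on them, so
the semantics of correct programs are unchanged, while a hypothetical
specializing compiler could read them and emit unboxed float arithmetic:

def dot2(ax: float, ay: float, bx: float, by: float) -> float:
    # Annotations are plain Python 3 syntax; a hinted implementation could
    # specialize this function, an unhinted one simply runs it as usual.
    return ax * bx + ay * by

print(dot2(1.0, 2.0, 3.0, 4.0))   # 11.0 under any conforming implementation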

7. No implementation of Python at present incorporates JITting and hints fully. Therefore,
the answer to `is CPython performance-scalable' is likely `NO'. Another implementation that
exploited all of the features described here might well have satisfactory performance for
a range of computation-intensive problems. Therefore, the answer to `is the Python language
performance-scalable' might be `we don't know, but there are a number of promising implementation
techniques that have been proven to work well in other languages, and may well have tremendous
payoff for Python'.

-- v


Steven D'Aprano

unread,
Nov 14, 2009, 10:55:21 PM11/14/09
to
On Fri, 13 Nov 2009 18:25:59 -0800, Vincent Manis wrote:

> On 2009-11-13, at 15:32, Paul Rubin wrote:
>> This is Usenet so
>> please stick with Usenet practices.

> Er, this is NOT Usenet.

Actually it is. I'm posting to comp.lang.python.


> 1. I haven't, to the best of my recollection, made a Usenet post in this
> millennium.

Actually you have, you just didn't know it.


> 2. I haven't fired up a copy of rn or any other news reader in at least
> 2 decades.
>
> 3. I'm on the python-list mailing list, reading this with Apple's Mail
> application, which actually doesn't have convenient ways of enforcing
> `Usenet practices' regarding message format.

Nevertheless, the standards for line length for email and Usenet are
compatible.


> 4. If we're going to adhere to tried-and-true message format rules, I
> want my IBM 2260 circa 1970, with its upper-case-only display and weird
> little end-of-line symbols.

No you don't, you're just taking the piss.


> Stephen asked me to wrap my posts. I'm happy to do it. Can we please
> finish this thread off and dispose of it?

My name is actually Steven, but thank you for wrapping your posts.

--
Steven

John Nagle

unread,
Nov 14, 2009, 11:41:49 PM11/14/09
to
Steven D'Aprano wrote:
> On Wed, 11 Nov 2009 16:38:50 -0800, Vincent Manis wrote:
>
>> I'm having some trouble understanding this thread. My comments aren't
>> directed at Terry's or Alain's comments, but at the thread overall.
>>
>> 1. The statement `Python is slow' doesn't make any sense to me. Python
>> is a programming language; it is implementations that have speed or lack
>> thereof.
>
> Of course you are right, but in common usage, "Python" refers to CPython,
> and in fact since all the common (and possibly uncommon) implementations
> of Python are as slow or slower than CPython, it's not an unreasonable
> short-hand.

Take a good look at Shed Skin. One guy has been able to build a system
that compiles Python to C++, without requiring the user to add "annotations"
about types. The system uses type inference to figure it out itself.
You give up some flexibility; a variable can have only one primitive type
in its life, or it can be a class object. That's enough to simplify the
type analysis to the point that most types can be nailed down before the
program is run. (Note, though, that the entire program may have to
be analyzed as a whole. Separate compilation may not work; you need
to see the callers to figure out how to compile the callees.)

It's 10 to 60x faster than CPython.
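
To illustrate the flexibility traded away (an illustrative sketch, not code
taken from Shed Skin itself): code that keeps each variable at one inferable
type stays ordinary Python and remains compilable, while re-binding a name
to a different primitive type defeats this style of whole-program inference
even though CPython accepts it.

def total(xs):
    # xs is only ever a list of ints below, so its element type is inferable
    s = 0
    for x in xs:
        s += x
    return s

print(total([1, 2, 3]))   # 6

# Fine in CPython, but breaks the one-primitive-type-per-variable rule:
# v = 1
# v = "one"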

It's the implementation, not the language. Just because PyPy was a
dud doesn't mean it's impossible. There are Javascript JIT systems
far faster than Python.

    Nor do you really need a JIT system. (Neither does Java; GCC has
an ahead-of-time Java compiler, GCJ, that produces native code. Java is
JIT-oriented for historical reasons.
Remember browser applets?) If you're doing server-side work, the
program's structure and form have usually been fully determined by
the time the program begins execution.

John Nagle

Rami Chowdhury

unread,
Nov 15, 2009, 1:46:12 AM11/15/09
to pytho...@python.org
On Saturday 14 November 2009 18:42:07 Vincent Manis wrote:
>
> 3. Very clearly CPython can be improved. I don't take most benchmarks
> very seriously, but we know that CPython interprets bytecode, and
> thus suffers relative to systems that compile into native code, and
> likely to some other interpretative systems. (Lua has been
> mentioned, and I recall looking at a presentation by the Lua guys on
> why they chose a register rather than stack-based approach.)
>

For those interested in exploring the possible performance benefits of
Python on a register-based VM, there's Pynie
(http://code.google.com/p/pynie/)... and there's even a JIT in the works
for that (http://docs.parrot.org/parrot/1.0.0/html/docs/jit.pod.html)...


----
Rami Chowdhury
"A man with a watch knows what time it is. A man with two watches is
never sure". -- Segal's Law
408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD)

Terry Reedy

unread,
Nov 15, 2009, 1:53:06 AM11/15/09
to pytho...@python.org
John Nagle wrote:

> Steven D'Aprano wrote:
>
> Take a good look at Shed Skin. One guy has been able to build a system
> that compiles Python to C++, without requiring the user to add
> "annotations" about types.

It *only* compiles a subset of Python, as does Cython. Neither can
(currently) do generators, but that can be done and probably will be
eventually, at least for Cython. Much as I love generators, they can be
rewritten by hand as iterator classes, and even then they are not needed
for a lot of computational code.
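
For instance, a hand rewrite of the kind meant here (an illustrative sketch,
not code from either project):

def countdown_gen(n):
    while n > 0:
        yield n
        n -= 1

class Countdown:
    # Equivalent iterator class, usable where generators are unsupported.
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        return self
    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value

assert list(countdown_gen(3)) == list(Countdown(3)) == [3, 2, 1]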

I think both are good pieces of work so far.

greg

unread,
Nov 15, 2009, 2:25:39 AM11/15/09
to
John Nagle wrote:
> Take a good look at Shed Skin. ...

> You give up some flexibility; a variable can have only one primitive type
> in its life, or it can be a class object. That's enough to simplify the
> type analysis to the point that most types can be nailed down before the
> program is run.

These restrictions mean that it isn't really quite
Python, though.

--
Greg

Terry Reedy

unread,
Nov 15, 2009, 3:30:55 AM11/15/09
to pytho...@python.org

Python code that only uses a subset of features very much *is* Python
code. The author of ShedSkin makes no claim that it compiles all Python
code.

Paul Boddie

unread,
Nov 15, 2009, 12:16:58 PM11/15/09
to
On 15 Nov, 09:30, Terry Reedy <tjre...@udel.edu> wrote:
> greg wrote:
>

[Shed Skin]

> > These restrictions mean that it isn't really quite
> > Python, though.
>
> Python code that only uses a subset of features very much *is* Python
> code. The author of ShedSkin makes no claim that is compiles all Python
> code.

Of course, Shed Skin doesn't support all the usual CPython features,
but the code you would write for Shed Skin's benefit should be Python
code that runs under CPython. It's fair to say that Shed Skin isn't a
"complete" implementation of what CPython defines as being "the full
Python", but you're still writing Python. One can argue that the
restrictions imposed by Shed Skin inhibit the code from being "proper"
Python, but every software project has restrictions in the form of
styles, patterns and conventions.

This is where the "Lesser Python" crowd usually step in and say that
they won't look at anything which doesn't support "the full Python",
but I think it's informative to evaluate which features of Python give
the most value and which we could do without. The "Lesser Python"
attitude is to say, "No! We want it all! It's all necessary for
everything!" That doesn't really help the people implementing "proper"
implementations or those trying to deliver better-performing
implementations.

In fact, the mentality that claims that "it's perfect, or it will be
if we keep adding features" could drive Python into a diminishing
niche over time. In contrast, considering variations of Python as some
kind of "Greater Python" ecosystem could help Python (the language)
adapt to the changing demands on programming languages to which Go
(the Google language, not Go! which existed already) is supposedly a
response.

Paul

P.S. And PyPy is hardly a dud: they're only just getting started
delivering the performance benefits, and it looks rather promising.

Edward A. Falk

unread,
Nov 15, 2009, 2:33:10 PM11/15/09
to
In article <m2d43ke...@roger-vivier.bibliotech.com>,

Robert Brown <bbr...@speakeasy.net> wrote:
>
>It's hard to refute your assertion. You're claiming that some future
>hypothetical Python implementation will have excellent performance via a JIT.
>On top of that you say that you're willing to change the definition of the
>Python language, say by adding type declarations, if an implementation with a
>JIT doesn't pan out. If you change the Python language to address the
>semantic problems Willem lists in his post and also add optional type
>declarations, then Python becomes closer to Common Lisp, which we know can be
>executed efficiently, within the same ballpark as C and Java.

Ya know; without looking at Go, I'd bet that this was some of the thought
process that was behind it.

Paul Rubin

unread,
Nov 15, 2009, 3:01:57 PM11/15/09
to
fa...@mauve.rahul.net (Edward A. Falk) writes:
> >If you change the Python language to address the semantic problems
> >Willem lists in his post and also add optional type declarations,
> >then Python becomes closer to Common Lisp, which we know can be
> >executed efficiently, within the same ballpark as C and Java.
>
> Ya know; without looking at Go, I'd bet that this was some of the thought
> process that was behind it.

I don't have the slightest impression that Python had any significant
influence on Go. Go has C-like syntax, static typing with mandatory
declarations, and concurrency inspired by Occam. It seems to be a
descendant of Oberon and Newsqueak (Pike's earlier language used in
Plan 9). It also seems to be decades behind the times in some ways.
Its creators are great programmers and system designers, but I wish
they had gotten some PL theorists involved in designing Go.

John Nagle

unread,
Nov 15, 2009, 11:09:44 PM11/15/09
to

Yes. Niklaus Wirth, who designed Pascal, Modula, and Oberon, had
that happen to his languages. He's old and bitter now; a friend of
mine knows him.

The problem is that "Greater Python" is to some extent "the set of
features that are easy to implement if we look up everything at run time."
You can insert a variable into a running function of
another thread. This feature of very marginal utility is free in a
naive lookup-based interpreter, and horribly expensive in anything that
really compiles. Obsession with the CPython implementation as the language
definition tends to overemphasize such features.

The big headache from a compiler perspective is "hidden dynamism" -
use of dynamic features that isn't obvious from examining the source code.
(Hidden dynamism is a big headache to maintenance programmers, too.)
For example, if you had the rule that you can't use "getattr" and "setattr"
on an object from the outside unless the class itself implements or uses getattr
and setattr, then you know at compile time if the machinery for dynamic
attributes needs to be provided for that class. This allows the "slots"
optimization, and direct compilation into struct-type code.
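
A small sketch of the difference (class names are hypothetical, for
illustration only): attributes injected from the outside force the general
per-instance dictionary machinery, whereas a class whose attribute set is
fixed up front can be compiled into struct-like code, which is what
__slots__ approximates today.

class Open:
    pass

class Fixed:
    __slots__ = ('x',)     # attribute set known before the program runs

o = Open()
o.anything = 42            # legal: 'o' carries a __dict__ filled at run time

f = Fixed()
f.x = 1                    # declared slot: fine
try:
    setattr(f, 'y', 2)     # no __dict__, so external injection fails
except AttributeError as e:
    print(e)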

Python is a very clean language held back from widespread use by slow
implementations. If Python ran faster, Go would be unnecessary.

And yes, performance matters when you buy servers in bulk.

John Nagle

sturlamolden

unread,
Nov 15, 2009, 11:51:29 PM11/15/09
to
On 16 Nov, 05:09, John Nagle <na...@animats.com> wrote:

>       Python is a very clean language held back from widespread use by slow
> implementations.  If Python ran faster, Go would be unnecessary.

That boggles me.

NASA can find money to build a space telescope and put it in orbit.
They don't find money to create a faster Python, which they use for
analyzing the data.

Google is a multi-billion-dollar business. They are using Python
extensively. Yes, I know about Unladen Swallow, but why can't they put
a million dollars into making a fast Python?

And then there is IBM and Cern's Blue Brain project. They can set up
the fastest supercomputer known to man, but finance a faster Python?
No...

I saw this myself. At work I could get money to buy a €30,000 piece of
recording equipment. I could not get money for a MATLAB license.

It seems software and software development are heavily underfinanced.
The big bucks go into fancy hardware. But fancy hardware is not so
fancy without equally fancy software.


sturlamolden

unread,
Nov 16, 2009, 12:02:13 AM11/16/09
to
On 16 Nov, 05:09, John Nagle <na...@animats.com> wrote:

>       Python is a very clean language held back from widespread use by slow
> implementations.

Python is clean, minimalistic, and beautiful.

Python doesn't have bloat like special syntax for XML or SQL databases
(cf. C#) or queues (Go).

Most of all, it is easier to express ideas in Python than in any other
computer language I know.

Python's major drawback is slow implementations. I always find myself
resorting to Cython (or C, C++, Fortran 95) here and there.

But truth be told, I wrote an awful lot of C MEX files when using
MATLAB as well. MATLAB can easily be slower than Python by orders of
magnitude, but that has not prevented its widespread adoption.
What's keeping it back is an expensive license.

Paul Boddie

unread,
Nov 16, 2009, 10:03:34 AM11/16/09
to
On 16 Nov, 05:51, sturlamolden <sturlamol...@yahoo.no> wrote:
>
> NASA can find money to build a space telescope and put it in orbit.
> They don't find money to create a faster Python, which they use for
> analyzing the data.

Is the analysis in Python really what slows it all down?

> Google is a multi-billion dollar business. They are using Python
> extensively. Yes I know about Unladen Swallow, but why can't they put
> 1 mill dollar into making a fast Python?

Isn't this where we need those Ohloh figures on how much Unladen
Swallow is worth? ;-) I think Google is one of those organisations
where that Steve Jobs mentality of shaving time off a once-per-day
activity actually pays off. A few more cycles here and there are
arguably nothing to us, but they add up to a few kW when running on
thousands of Google nodes.

> And then there is IBM and Cern's Blue Brain project. They can set up
> the fastest supercomputer known to man, but finance a faster Python?
> No...

Businesses and organisations generally don't spend any more money than
they need to. And if choosing another technology is cheaper for future
work, then they'll just do that instead. In a sense, Python's
extensibility using C, C++ and Fortran has helped adoption of the
language considerably, but it hasn't necessarily encouraged a focus on
performance.

Paul

Paul Rubin

unread,
Nov 16, 2009, 9:27:05 PM11/16/09
to
sturlamolden <sturla...@yahoo.no> writes:
> >       Python is a very clean language held back from widespread use by slow
> > implementations.  If Python ran faster, Go would be unnecessary.
>
> Google is a multi-billion dollar business. They are using Python
> extensively. Yes I know about Unladen Swallow, but why can't they put
> 1 mill dollar into making a fast Python?

I don't think Python and Go address the same set of programmer
desires. For example, Go has a static type system. Some programmers
find static type systems to be useless or undesirable. Others find
them extremely helpful and want to use them.  If you're a
programmer who wants a static type system, you'll probably prefer Go
to Python, and vice versa. That has nothing to do with implementation
speed or development expenditures. If Google spent a million dollars
adding static types to Python, it wouldn't be Python any more.

Aaron Watters

unread,
Nov 17, 2009, 8:48:10 AM11/17/09
to

> I don't think Python and Go address the same set of programmer
> desires.  For example, Go has a static type system.  Some programmers
> find static type systems to be useless or undesirable.  Others find
> them extremely helpful and want to use them them.  If you're a
> programmer who wants a static type system, you'll probably prefer Go
> to Python, and vice versa.  That has nothing to do with implementation
> speed or development expenditures.  If Google spent a million dollars
> adding static types to Python, it wouldn't be Python any more.

... and I still have an issue with the whole "Python is slow"
meme. The reason NASA doesn't build a faster Python is because
Python *when augmented with FORTRAN libraries that have been
tested and optimized for decades and are worth billions of dollars
and don't need to be rewritten* is very fast.

The reason they don't replace the Python drivers with Java is
because that would be very difficult and just stupid, and I'd be
willing to bet that when they were done the result would actually
be *slower*, especially when you consider things like process
start-up time.

And when someone implements a Mercurial replacement in Go (or C#
or Java) which is faster and more useful than Mercurial, I'll
be very impressed. Let me know when it happens (but I'm not
holding my breath).

By the way, if it hasn't happened already, and if he isn't afraid
of public speaking, someone should invite Matt Mackall
to give a Python conference keynote. Or how about
Bram Cohen, for that matter...

-- Aaron Watters http://listtree.appspot.com/

===
if you want a friend, get a dog. -Truman


David Cournapeau

unread,
Nov 17, 2009, 9:28:09 AM11/17/09
to Aaron Watters, pytho...@python.org
On Tue, Nov 17, 2009 at 10:48 PM, Aaron Watters <aaron....@gmail.com> wrote:
>
>> I don't think Python and Go address the same set of programmer
>> desires.  For example, Go has a static type system.  Some programmers
>> find static type systems to be useless or undesirable.  Others find
>> them extremely helpful and want to use them them.  If you're a
>> programmer who wants a static type system, you'll probably prefer Go
>> to Python, and vice versa.  That has nothing to do with implementation
>> speed or development expenditures.  If Google spent a million dollars
>> adding static types to Python, it wouldn't be Python any more.
>
> ... and I still have an issue with the whole "Python is slow"
> meme.  The reason NASA doesn't build a faster Python is because
> Python *when augmented with FORTRAN libraries that have been
> tested and optimized for decades and are worth billions of dollars
> and don't need to be rewritten* is very fast.

It is a bit odd to dismiss "python is slow" by saying that you can
extend it with fortran. One of the most significant points of python
IMO is its readability, even for people not familiar with it, and
that's important when doing scientific work. Relying on a lot of
compiled libraries goes against it.

I think that python with its scientific extensions is a fantastic
tool, but I would certainly not mind if it were ten times faster. In
particular, the significant cost of function calls makes it quickly
unusable for code which cannot be easily "vectorized" - we have to
resort to using C, etc... to circumvent this ATM.
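
As a small illustration of that per-call and per-iteration overhead (the
function and array size here are purely illustrative): the same reduction
written as an explicit Python loop and as a single vectorized call.

import numpy as np

x = np.random.rand(1_000_000)

def loop_sum_squares(values):
    total = 0.0
    for v in values:               # one interpreted iteration per element
        total += v * v
    return total

vectorized = float(np.dot(x, x))   # one call into compiled code
assert np.allclose(loop_sum_squares(x), vectorized)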

Another point which has not been mentioned much, maybe because it is
obvious: it seems that it is possible to make high-level languages
quite fast, but doing so while keeping memory usage low is very
difficult. Incidentally, the same tradeoff appears when working with
vectorized code in numpy/scipy.

David

Paul Boddie

unread,
Nov 17, 2009, 10:53:55 AM11/17/09
to
On 17 Nov, 14:48, Aaron Watters <aaron.watt...@gmail.com> wrote:
>
> ... and I still have an issue with the whole "Python is slow"
> meme.  The reason NASA doesn't build a faster Python is because
> Python *when augmented with FORTRAN libraries that have been
> tested and optimized for decades and are worth billions of dollars
> and don't need to be rewritten* is very fast.

That's why I wrote that Python's "extensibility using C, C++ and
Fortran [has] helped adoption of the language considerably", and
Python was particularly attractive to early adopters of the language
precisely because of the "scripting" functionality it could give to
existing applications, but although there are some reasonable
solutions for writing bottlenecks of a system in lower-level
programming languages, it can be awkward if those bottlenecks aren't
self-contained components or if the performance issues permeate the
entire system.

[...]

> And when someone implements a Mercurial replacement in GO (or C#
> or Java) which is faster and more useful than Mercurial, I'll
> be very impressed.  Let me know when it happens (but I'm not
> holding my breath).

Mercurial is a great example of a Python-based tool with good
performance. However, it's still interesting to consider why the
implementers chose to rewrite precisely those parts that are
implemented using C. I'm sure many people have had the experience of
looking at a piece of code and being quite certain of what that code
does, and yet wondering why it's so inefficient in vanilla Python.
It's exactly this kind of issue that has never really been answered
convincingly, other than claims that "Python must be that dynamic and
no less" and "it's doing so much more than you think", leaving people
to try and mitigate the design issues using clever implementation
techniques as best they can.

> By the way if it hasn't happened and if he isn't afraid
> of public speaking someone should invite Matt Mackall
> to give a Python conference keynote.  Or how about
> Bram Cohen for that matter...

Bryan O'Sullivan gave a talk on Mercurial at EuroPython 2006, and
although I missed that talk for various reasons beyond my control, I
did catch his video lightning talk which emphasized performance.
That's not to say that we couldn't do with more talks of this nature
at Python conferences, however.

Paul

Rustom Mody

unread,
Nov 17, 2009, 11:41:42 AM11/17/09
to pytho...@python.org
"Language L is (in)efficient. No! Only implementations are (in)efficient"

I am reminded of a personal anecdote. It happened about 20 years ago
but is still fresh and this thread reminds me of it.

I was attending some workshop on theoretical computer science.
I gave a talk on Haskell.

I showed off all the good-stuff -- pattern matching, lazy lists,
infinite data structures, etc etc.
Somebody asked me: Isn't all this very inefficient?
Now at that time I was a strong adherent of the Dijkstra-religion and
this viewpoint "efficiency has nothing to do with languages, only
implementations" traces to him. So I quoted that.

Slowly, the venerable P S Thiagarajan got up and asked me:
Let's say that I have a language with a type 'Proposition',
and I have an operation on propositions called sat [sat(p) returns
true if p is satisfiable]...

I won't complete the tale other than to say that I've never had the wind
taken out of my sails so completely!

So Vincent? I wonder what you would have said in my place?

J Kenneth King

unread,
Nov 17, 2009, 11:42:44 AM11/17/09
to
David Cournapeau <cour...@gmail.com> writes:

I think this is the only interesting point in the whole conversation so
far.

It is possible for highly dynamic languages to be optimized, compiled,
and run really fast.

The recent versions of SBCL can compile Common Lisp into really fast and
efficient binaries. And Lisp could be considered even more dynamic than
Python (but that is debatable and I have very little evidence... so
grain of salt on that statement). It's possible, it just hasn't been
done yet.

PyPy is getting there, but development is slow and they could probably
use a hand. Instead of waiting on the sidelines for a company to back
PyPy developemnt, the passionate Python programmers worth a salt that
care about Python development should contribute at least a patch or two.

The bigger problem though is probably attention span. A lot of
developers today are more apt to simply try the next new language than
to roll up their sleeves and think deeply enough to improve the tools
they're already invested in.

>
> David

Paul Rubin

unread,
Nov 17, 2009, 3:48:05 PM11/17/09
to
Aaron Watters <aaron....@gmail.com> writes:
> ... and I still have an issue with the whole "Python is slow"
> meme. The reason NASA doesn't build a faster Python is because
> Python when augmented with FORTRAN libraries...

Do you think that numerics is the only area of programming where users
care about speed?

> And when someone implements a Mercurial replacement in GO (or C#
> or Java) which is faster and more useful than Mercurial, I'll
> be very impressed.

What about Git? Some people prefer it.

David Cournapeau

unread,
Nov 17, 2009, 5:43:44 PM11/17/09
to pytho...@python.org
On Wed, Nov 18, 2009 at 5:48 AM, Paul Rubin
<http://phr...@nospam.invalid> wrote:

>
> What about Git?  Some people prefer it.

Git is an interesting example, because it both really pushes
performance into its core structure and has reasonably complete
implementations in other languages. In particular, jgit is
implemented in Java by one of the core git developers; here is what he
has to say:

http://marc.info/?l=git&m=124111702609723&w=2

I found the comment on "optimizing 5% here and 5 % there" interesting.
It is often claimed that optimization should be done after having
found the hotspot, but that does not always apply, and I think git is
a good example of that.

In those cases, using python as the main language does not work well,
at least in my experience. Rewriting the slow parts in a compiled
language only works if you can identify the slow parts, and even in
numerical code, that's not always possible (this tends to happen when
you need to deal with many objects interacting together, for example).

David

greg

unread,
Nov 17, 2009, 6:24:37 PM11/17/09
to
David Cournapeau wrote:

> It is a bit odd to dismiss "python is slow" by saying that you can
> extend it with fortran. One of the most significant point of python
> IMO is its readability, even for people not familiar with it, and
> that's important when doing scientific work. Relying on a lot of
> compiled libraries goes against it.

If it were necessary to write a new compiled library every
time you wanted to solve a new problem, that would be true.
But it's not like that if you pick the right libraries.

NumPy, for example, is *extremely* flexible. Someone put
in the effort, once, to write it and make it fast -- and
now an endless variety of programs can be written very easily
in Python to make use of it.

--
Greg

Terry Reedy

unread,
Nov 17, 2009, 6:31:25 PM11/17/09
to pytho...@python.org
David Cournapeau wrote:
> On Tue, Nov 17, 2009 at 10:48 PM, Aaron Watters <aaron....@gmail.com> wrote:
>>> I don't think Python and Go address the same set of programmer
>>> desires. For example, Go has a static type system. Some programmers
>>> find static type systems to be useless or undesirable. Others find
>>> them extremely helpful and want to use them them. If you're a
>>> programmer who wants a static type system, you'll probably prefer Go
>>> to Python, and vice versa. That has nothing to do with implementation
>>> speed or development expenditures. If Google spent a million dollars
>>> adding static types to Python, it wouldn't be Python any more.
>> ... and I still have an issue with the whole "Python is slow"
>> meme. The reason NASA doesn't build a faster Python is because
>> Python *when augmented with FORTRAN libraries that have been
>> tested and optimized for decades and are worth billions of dollars
>> and don't need to be rewritten* is very fast.
>
> It is a bit odd to dismiss "python is slow" by saying that you can
> extend it with fortran.

I find it a bit odd that people are so resistant to evaluating Python as
it was designed to be. As Guido designed the language, he designed the
implementation to be open and easily extended by assembler, Fortran, and
C. No one carps about the fact that dictionary key lookup, say, is written
in (optimized) C rather than pretty Python. Why should Basic Linear
Algebra Subroutines (BLAS) be any different?

> One of the most significant point of python
> IMO is its readability, even for people not familiar with it, and
> that's important when doing scientific work.

It is readable by humans because it was designed for that purpose.

> Relying on a lot of compiled libraries goes against it.

On the contrary, Python could be optimized for human readability because
it was expected that heavy computation would be delegated to other code.
There is no need for scientists to read the optimized code in BLAS,
LINPACK, and FFTPACK, in assembler, Fortran, and/or C, which are
incorporated in Numpy.

It is unfortunate that there is not yet a Python 3.1 version of Numpy.
That is what 3.1 most needs in order to run faster, as fast as intended.

> I think that python with its scientific extensions is a fantastic
> tool, but I would certainly not mind if it were ten times faster.

Python today is at least 100x as fast as 1.4 (my first version) was in
its time. Which is to say, Python today is as fast as C was then. The
problem for the future is the switch to multiple cores for further speedups.

Terry Jan Reedy


greg

unread,
Nov 17, 2009, 6:33:55 PM11/17/09
to
David Cournapeau wrote:

> It is often claimed that optimization should be done after having
> found the hotspot, but that does not always apply

It's more that if you *do* have a hotspot, you had better
find it and direct your efforts there first. E.g. if there is
a hotspot taking 99% of the time, then optimising elsewhere
can't possibly improve the overall time by more than 1% at
the very most.

Once there are no hotspots left, then there may be further
spread-out savings to be made.
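
A back-of-the-envelope version of that arithmetic (an illustrative helper,
Amdahl-style reasoning):

def overall_speedup(hot_fraction, hot_speedup, cold_speedup):
    # Overall speedup when the hot part and the rest are sped up separately.
    new_time = hot_fraction / hot_speedup + (1 - hot_fraction) / cold_speedup
    return 1 / new_time

print(overall_speedup(0.99, 1, 1000))   # ~1.01x: optimising the other 1% is futile
print(overall_speedup(0.99, 10, 1))     # ~9.2x: attacking the hotspot pays off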

--
Greg

Wolfgang Rohdewald

unread,
Nov 17, 2009, 7:17:49 PM11/17/09
to pytho...@python.org
On Wednesday 18 November 2009, Terry Reedy wrote:
> Python today is at least 100x as fast as 1.4 (my first version) was
> in its time. Which is to say, Python today is as fast as C was
> then

on the same hardware? That must have been a very buggy C compiler.
Or was it a C interpreter?


--
Wolfgang

David Cournapeau

unread,
Nov 17, 2009, 7:45:28 PM11/17/09
to Terry Reedy, pytho...@python.org
On Wed, Nov 18, 2009 at 8:31 AM, Terry Reedy <tjr...@udel.edu> wrote:
> David Cournapeau wrote:
>>
>> On Tue, Nov 17, 2009 at 10:48 PM, Aaron Watters <aaron....@gmail.com>
>> wrote:
>>>>
>>>> I don't think Python and Go address the same set of programmer
>>>> desires.  For example, Go has a static type system.  Some programmers
>>>> find static type systems to be useless or undesirable.  Others find
>>>> them extremely helpful and want to use them them.  If you're a
>>>> programmer who wants a static type system, you'll probably prefer Go
>>>> to Python, and vice versa.  That has nothing to do with implementation
>>>> speed or development expenditures.  If Google spent a million dollars
>>>> adding static types to Python, it wouldn't be Python any more.
>>>
>>> ... and I still have an issue with the whole "Python is slow"
>>> meme.  The reason NASA doesn't build a faster Python is because
>>> Python *when augmented with FORTRAN libraries that have been
>>> tested and optimized for decades and are worth billions of dollars
>>> and don't need to be rewritten* is very fast.
>>
>> It is a bit odd to dismiss "python is slow" by saying that you can
>> extend it with fortran.
>
> I find it a bit odd that people are so resistant to evaluating Python as it
> was designed to be. As Guido designed the language, he designed the
> implementation to be open and easily extended by assembler, Fortran, and C.

I am well aware of that fact - that's one of the major reasons why I
decided to go the Python route a few years ago instead of MATLAB,
because MATLAB's C API is so limited.

> No one carps about the fact the dictionary key lookup, say, is writen in
> (optimized) C rather than pretty Python. Why should Basic Linear Algebra
> Subroutines (BLAS) be any different?

BLAS/LAPACK explicitly contain stuff that can easily be factored out
into a library. Linear algebra in general works well because the basic
data structures are well understood. You can deal with those as black
boxes most of the time (I for example have no idea how most LAPACK
algorithms work, except for the simple ones). But that's not always the case
for numerical computations. Sometimes, you need to be able to go
inside the black box, and that's where python is sometimes limited for
me because of its cost.

To be more concrete, one of my areas is speech processing/speech
recognition. Most current engines are based on Hidden Markov
Models, and there are a few well known libraries to deal with those,
most of the time written in C/C++. You can wrap those in python (and
people do), but you cannot really use those unless you deal with them
at a high level. If you want to change some core algorithms (to deal
with new topologies, etc.), you cannot do it without going into C. It
would be great to write my own HMM library in python, but I cannot do
it because it would be way too slow. There is no easy black-box which
I could wrap so that I keep enough flexibility without sacrificing too
much speed.

Said differently, I would be willing to be one order of magnitude
slower than, say, C, but not 2 to 3 orders of magnitude, as currently
happens in python when you cannot leverage existing libraries. When the
code can be vectorized,
numpy and scipy give me this.

>> Relying on a lot of compiled libraries goes against it.
>
> On the contrary, Python could be optimized for human readability because it
> was expected that heavy computation would be delegated to other code. There
> is no need for scientists to read the optimized code in BLAS, LINPACK, and
> FFTPACK, in assembler, Fortran, and/or C, which are incorporated in Numpy.

I know all that (I am one of the main numpy developers nowadays), and
indeed, writing blas/lapack in python does not make much sense. I am
talking about libraries *I* would write. Scipy, for example, contains
more fortran and C code than python, without counting the libraries we
wrap, and a lot of it is because of speed/memory concerns.

David

Chris Rebert

unread,
Nov 17, 2009, 8:14:22 PM11/17/09
to Rustom Mody, pytho...@python.org

I'm not Vincent, but: the sat() operation is by definition
inefficient, regardless of language?

Cheers,
Chris
--
http://blog.rebertia.com

Steven D'Aprano

unread,
Nov 17, 2009, 9:32:06 PM11/17/09
to
On Tue, 17 Nov 2009 22:11:42 +0530, Rustom Mody wrote:

> "Language L is (in)efficient. No! Only implementations are
> (in)efficient"
>
> I am reminded of a personal anecdote. It happened about 20 years ago
> but is still fresh and this thread reminds me of it.
>
> I was attending some workshop on theoretical computer science. I gave a
> talk on Haskell.
>
> I showed off all the good-stuff -- pattern matching, lazy lists,
> infinite data structures, etc etc.
> Somebody asked me: Isnt all this very inefficient? Now at that time I
> was a strong adherent of the Dijkstra-religion and this viewpoint
> "efficiency has nothing to do with languages, only implementations"
> traces to him. So I quoted that.
>
> Slowly, the venerable P S Thiagarajan got up and asked me: Let's say that
> I have a language with a type 'Proposition', and I have an operation on
> propositions called sat [sat(p) returns true if p is satisfiable]...

I assume you're referring to this:

http://en.wikipedia.org/wiki/Boolean_satisfiability_problem

which is NP-complete and O(2**N) in the worst case (although many
instances can be solved rapidly in practice).


> I wont complete the tale other than to say that Ive never had the wind
> in my sails taken out so completely!
>
> So Vincent? I wonder what you would have said in my place?

I won't answer for Vincent, but I would have made five points:

(1) The existence of one inherently slow function in a language does not
mean that the language itself is slow overall. It's not clear exactly
what "overall" means in the context of a language, but one function out
of potentially thousands obviously isn't it.

(2) Obviously the quality of implementation for the sat function will
make a major difference as far as speed goes, so the speed of the
function is dependent on the implementation.

(3) Since the language definition doesn't specify an implementation, no
prediction of the time needed to execute the function can be made. At
most we know how many algorithmic steps the function will take, given
many assumptions, but we have no idea of the constant term. The language
definition would be satisfied by having an omniscient, omnipotent deity
perform the O(2**N) steps required by the algorithm infinitely fast, i.e.
in constant (zero) time, which would make it pretty fast. The fact that
we don't have access to such deities to do our calculations for us is an
implementation issue, not a language issue.

(4) In order to justify the claim that the language is slow, you have to
define what you are comparing it against and how you are measuring the
speed. Hence different benchmarks give different relative ordering
between language implementations. You must have a valid benchmark, and
not stack the deck against one language: compared to (say) integer
addition in C, yes the sat function is slow, but that's an invalid
comparison, as invalid as comparing the sat function against factorizing
a one million digit number. ("Time to solve sat(P) -- sixty milliseconds.
Time to factorize N -- sixty million years.") You have to compare similar
functionality, not two arbitrary operations.

Can you write a sat function in (say) C that does better than the one in
your language? If you can't, then you have no justification for saying
that C is faster than your language, for the amount of work your language
does. If you can write a faster implementation of sat, then you can
improve the implementation of your language by using that C function,
thus demonstrating that speed depends on the implementation, not the
language.

(5) There's no need for such hypothetical examples. Let's use a more
realistic example... disk IO is expensive and slow. I believe that disk
IO is three orders of magnitude slower than memory access, and heaven
help you if you're reading from tape instead of a hard drive!

Would anyone like to argue that every language which supports disk IO
(including C, Lisp, Fortran and, yes, Python) is therefore "slow"? Since
the speed of the hard drive dominates the time taken, we might even be
justified as saying that all languages are equally slow!

Obviously this conclusion is nonsense. Since the conclusion is nonsense,
we have to question the premise, and the weakest premise is the idea that
talking about the speed of a *language* is even meaningful (except as a
short-hand for "state of the art implementations of that language").

--
Steven

sturlamolden

unread,
Nov 18, 2009, 6:31:56 AM11/18/09
to
On 18 Nov, 00:31, Terry Reedy <tjre...@udel.edu> wrote:

> The
> problem for the future is the switch to multiple cores for further speedups.

The GIL is not a big problem for scientists. Scientists are not as
dependent on threads as the Java/web-developer crowd:

- We are used to running multiple processes with MPI (see the sketch
after this list).

- Numerical libraries written in C/Fortran/assembler will often release
the GIL. Python threads are then OK for multicore machines.

- Numerical libraries can be written or compiled for multicores, e.g.
using OpenMP or special compilers. If FFTW is compiled for multiple
cores, it does not matter that Python has a GIL. LAPACK will use
multiple cores if you use MKL or GotoBLAS, regardless of the GIL.
Etc.

- A scientist used to MATLAB will think "MEX function" (i.e. C or
Fortran) if something is too slow. A web developer used to Java will
think "multithreading".


sturlamolden

unread,
Nov 18, 2009, 6:42:13 AM11/18/09
to
On 18 Nov, 00:24, greg <g...@cosc.canterbury.ac.nz> wrote:

> NumPy, for example, is *extremely* flexible. Someone put
> in the effort, once, to write it and make it fast -- and
> now an endless variety of programs can be written very easily
> in Python to make use of it.

I'm quite sure David Cournapeau knows about NumPy...

By the way, NumPy is not particularly fast because of the way it is
written. Its performance is hampered by the creation of temporary
arrays. But NumPy provides a flexible way of managing memory in
scientific programs.
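
A small illustration of the temporary-array point (sizes illustrative):
each intermediate in a chained expression allocates a full-size array,
whereas the in-place form reuses buffers.

import numpy as np

a = np.ones(1_000_000)
b = np.ones(1_000_000)
c = np.ones(1_000_000)

d = a + b + c              # allocates a temporary for (a + b), then the final sum

out = np.empty_like(a)     # in-place style: no temporaries beyond 'out'
np.add(a, b, out=out)
np.add(out, c, out=out)

assert np.allclose(d, out)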

