I've just spent a few hours debugging code similar to this:
d = dict()
for r in [1,2,3]:
    d[r] = [r for r in [4,5,6]]
print d
The problem is that the "r" in d[r] somehow captures the value of the
"r" in the list comprehension, and somehow kills the loop iterator. The
(unexpected) result is {6: [4, 5, 6]}. Changing r to s inside the list
comprehension leads to the correct (imo) result.
Is this expected? Is this a known problem? Is it solved in newer
versions?
This is python 2.6.4, on a stock ubuntu 9.10 x86-64 linux box. Let me
know if more detail is needed. Thanks in advance.
-- Alain.
Quoting http://docs.python.org/reference/expressions.html#id19 :
Footnotes
[1] In Python 2.3 and later releases, a list comprehension “leaks” the
control variables of each 'for' it contains into the containing scope.
However, this behavior is deprecated, and relying on it will not work
in Python 3.0
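To make the footnote concrete, here is a minimal sketch written for Python 3. The function names build_leaky and build_scoped are made up for illustration. The first unrolls the comprehension into an inner for loop that reuses the name r, which is roughly what the 2.x list comprehension did; the second uses a real comprehension, which gets its own scope in 3.x.

```python
def build_leaky():
    """Emulate the Python 2 behaviour with an explicit inner loop."""
    d = {}
    for r in [1, 2, 3]:
        tmp = []
        for r in [4, 5, 6]:  # rebinds the outer r, as the 2.x list comp did
            tmp.append(r)
        d[r] = tmp           # r is 6 by the time the key is looked up
    return d


def build_scoped():
    """Python 3: the comprehension's r lives in its own scope."""
    d = {}
    for r in [1, 2, 3]:
        d[r] = [r for r in [4, 5, 6]]  # outer r is untouched
    return d
```

Under Python 3, build_leaky() reproduces the single-key dict from the original post, while build_scoped() yields one entry per value of the outer loop.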
Cheers,
Chris
--
http://blog.rebertia.com
> Hi all,
>
> I've just spent a few hours debugging code similar to this:
>
> d = dict()
> for r in [1,2,3]:
>     d[r] = [r for r in [4,5,6]]
> print d
This isn't directly relevant to your problem, but why use a list
comprehension in the first place? [r for r in [4,5,6]] is just [4,5,6],
only slower.
I presume that is just a stand-in for a more useful list comp, but I
mention it because I have seen people do exactly that, in real code,
without knowing any better. (I may have even done so myself, once or
twice.)
> The problem is that the "r" in d[r] somehow captures the value of the
> "r" in the list comprehension, and somehow kills the loop iterator. The
> (unexpected) result is {6: [4, 5, 6]}.
Actually, no it doesn't kill the loop at all. You have misinterpreted
what you have seen:
>>> d = dict()
>>> for r in [1,2,3]:
...     print r
...     d[r] = [r for r in [4,5,6]]
...     print d
...
1
{6: [4, 5, 6]}
2
{6: [4, 5, 6]}
3
{6: [4, 5, 6]}
> Changing r to s inside the list
> leads to the correct (imo) result.
>
> Is this expected? Is this a known problem? Is it solved in newer
> versions?
Yes, yes and yes.
It is expected, because list comprehensions leak the variable into the
enclosing scope. Yes, it is a problem, as you have found, although
frankly it is easy enough to make sure your list comp variable has a
unique name. And yes, it is fixed in Python 3.1.
--
Steven
Yes, this has been fixed in later revisions, but I'm curious to know what
led you to believe that a list comprehension created a new scope. I don't
think that was ever promised.
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.
Common sense about how programming languages should work? As
confirmed by later revisions?
Where exactly does this common sense come from? A list comprehension is
basically syntactic sugar over a for loop, and...
Python 3.1.2 (r312:79360M, Mar 24 2010, 01:33:18)
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "copyright", "credits" or "license()" for more information.
>>> for x in range(10):
    pass
>>> x
9
--
--S
... p.s: change the ".invalid" to ".com" in email address to reply privately.
Common sense? About *somebody else's* idea of how a programming
language should work?
Please. Experiment and read the manual.
~Ethan~
Common sense is about practical solutions.
Since there is no practical gain from a list comprehension affecting the
bindings of outside variables, and there correspondingly is a practical pay-off
from list comprehensions not affecting the bindings of outside variables, common
sense is to expect the latter.
It's in the nature of common sense that those who possess this ability often
tend to make the same tentative assumptions when presented with the same
problem. It doesn't mean that they're consulting each other, like your "somebody
else's": it just means that they're applying similar common sense reasoning. So,
there's no great conspiracy.
> Please. Experiment and read the manual.
Common sense is applied first, as a heuristic. You really wouldn't want to drill
down into the architect's drawings in order to get office 215 in a building.
First you apply common sense.
Cheers & hth.,
- Alf
Oh goodie, bad analogies. Can I play too?
Getting to office 215 is not analogous to writing a program. It is
analogous to using the program. Writing the program is like building
the office tower. You need to know about the tools and materials that
you are working with. You don't use "common sense" to decide what
materials to use. You study the literature and the specs.
--
D'Arcy J.M. Cain <da...@druid.net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
Yes, this is a well known design error in Python 2.x. The 3.x series
fixes this error but introduces other errors of its own. It is evil
enough that I almost always use this syntax instead:
d[r] = list(r for r in [4,5,6])
That works in 3.x and in the later releases of 2.x. In early 2.x (before
2.4, which introduced generator expressions) it throws an error at
compile time rather than at run time.
>>>Yes, this has been fixed in later revisions, but I'm curious to know
>>>what led you to believe that a list comprehension created a new scope.
>>>I don't think that was ever promised.
>>
>>
>> Common sense about how programming languages should work? As confirmed
>> by later revisions?
>
> Common sense? About *somebody else's* idea of how a programming
> language should work?
Nevertheless, it is a common intuition that the list comp variable should
*not* be exposed outside of the list comp, and that the for-loop variable
should. Perhaps it makes no sense, but it is very common -- I've never
heard of anyone being surprised that the for-loop variable is exposed,
but I've seen many people surprised by the fact that list-comps do expose
their loop variable.
--
Steven
> Alain Ketterlin <al...@dpt-info.u-strasbg.fr> writes:
>> d[r] = [r for r in [4,5,6]]
>> The problem is that the "r" in d[r] somehow captures the value of the
>> "r" in the list comprehension, and somehow kills the loop iterator.
>
> Yes, this is a well known design error in Python 2.x. The 3.x series
> fixes this error but introduces other errors of its own.
Oh, do tell?
--
Steven
IMHO, the real confusion-point of the situation wasn't so much list
comps vs for loops, but that list comps did expose it, but gen comps
didn't. If one thinks about how each would most likely be implemented
they wouldn't be surprised, but I'm glad the behavior was harmonized in
3.x.
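A rough sketch of that implementation difference, assuming (as I understand CPython) that a generator expression is compiled as a hidden nested function, whereas the 2.x list comprehension was compiled inline in the enclosing code. The names _genexpr_body and collect are hypothetical stand-ins, not anything the interpreter exposes.

```python
def _genexpr_body(iterable):
    """Hand-written stand-in for the hidden function behind (r for r in iterable)."""
    for r in iterable:  # r is a local of this function only
        yield r


def collect(iterable):
    r = "outer"
    values = list(_genexpr_body(iterable))
    # The caller's r is unaffected, because the loop ran in another frame.
    return r, values
```

Seen this way, the loop variable of a generator expression has nowhere to leak to: it belongs to a different function's locals.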
That said, I can't quite imagine how anyone could really sit down and
write code which would be broken by list comps "leaking". The example
code in this thread is just nutty. Even if list comps did create a new
scope, why in the world would you intentionally shadow an enclosing
iteration variable?
Obfuscation is not a good goal to go after :)
I have a dramatic suggestion.
Why not use this syntax:
d[r] = [something_else for something_else in [4,5,6]]
Where something_else is basically any conceivable name in the whole
wide world which does not have meaning in the current local scope.
Just for clarity's sake, not sharing names is swell.
> d = dict()
> for r in [1,2,3]:
>     d[r] = [r for r in [4,5,6]]
> print d
Thanks to Chris and Paul for the details (the list comp. r actually
leaks). I should have found this by myself.
My background is more on functional programming languages, that's why I
thought the list comprehension iterator should be purely local. And yes,
I think a "classical" for-loop iterator should also be local to the
loop, but I understand this may be too counter-intuitive to many :-)
-- Alain.
>> d = dict()
>> for r in [1,2,3]:
>>     d[r] = [r for r in [4,5,6]]
>> print d
>
> This isn't directly relevant to your problem, but why use a list
> comprehension in the first place? [r for r in [4,5,6]] is just [4,5,6],
> only slower.
Sure. But I've actually spent some time reducing the real code to a
simple illustration of the problem.
>> The problem is that the "r" in d[r] somehow captures the value of the
>> "r" in the list comprehension, and somehow kills the loop iterator. The
>> (unexpected) result is {6: [4, 5, 6]}.
>
> Actually, no it doesn't kill the loop at all. You have misinterpreted
> what you have seen:
It kills the iterator, not the loop. Sorry, I used 'kill' with the
meaning it has in compiler textbooks: to assign a new value to a
variable.
> It is expected, because list comprehensions leak the variable into the
> enclosing scope.
Thanks.
-- Alain.
Actually in other programming languages, loop counter is usually local:
for (int i = 0; i < something; i++) {
....
}
foo(i); // illegal
The reason why Python's loop counter leaks is implementation
simplicity: otherwise Python would have to deal with a multi-layered
local namespace. Currently in Python, the local namespace is just sugar
for an array access (a bit of hand-waving here). In other languages, a
{} block is a namespace, and a nested {} block means a nested namespace
even if they're still in a single function; in Python there is only a
flat local namespace, and the name resolver becomes a thousand times
simpler (and faster).
This hasn't been true for a long time. Yes, local variables are
optimized to be indexed in an array, but Python has nested scopes for
functions. But it does not have true lexical scoping, no.
No, it doesn't. The compiler already has to deal with multiple
scopes for nested functions. There may be some simplification,
but not a lot.
The main reason is linguistic. Having nested blocks create new
scopes does not fit well with lack of variable declarations.
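A small sketch of both points in Python 3. The function names are made up for illustration, and the co_varnames detail is specific to CPython.

```python
def flat_locals():
    # A for block does not open a new scope: i and doubled are plain
    # locals of flat_locals and remain bound after the loop ends.
    for i in range(3):
        doubled = i * 2
    return i, doubled


def nested_scope():
    # A nested function does open a new scope: the assignment inside
    # helper creates a separate local x and leaves the outer x alone.
    x = "outer"

    def helper():
        x = "inner"
        return x

    return x, helper()


# CPython detail: a function's local names are kept in a fixed tuple,
# and the bytecode refers to them by position.
local_names = flat_locals.__code__.co_varnames
```

Without declarations, an assignment inside a block would be ambiguous if blocks had their own scopes: it could either rebind the function-level name or create a new block-level one.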
--
Greg
This has a slight performance difference, I think mainly due to the
generator's next() calls.
In [1]: %timeit list(r for r in range(10000))
100 loops, best of 3: 2.78 ms per loop
In [2]: %timeit [r for r in range(10000)]
100 loops, best of 3: 1.93 ms per loop
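Outside IPython, the same comparison can be sketched with the standard timeit module. The helper name time_both and its default arguments are my own choices; absolute numbers will vary by machine and Python version.

```python
import timeit


def time_both(n=10000, repeats=100):
    """Return (generator_seconds, listcomp_seconds) for `repeats` runs each."""
    gen_time = timeit.timeit(lambda: list(r for r in range(n)), number=repeats)
    comp_time = timeit.timeit(lambda: [r for r in range(n)], number=repeats)
    return gen_time, comp_time


if __name__ == "__main__":
    gen_time, comp_time = time_both()
    print("list(genexp): %.4f s" % gen_time)
    print("list comp:    %.4f s" % comp_time)
```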
~Rolando
I've definitely seen people surprised by the for-loop behavior.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/
"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan
It is the intended behavior in 2.x. The theory was that a list
comprehension would have the same effect as if it had been unrolled
into a regular for-loop.
In 3.x, Guido changed his mind and the induction variable is hidden.
The theory is that some folks (like you) expect the variable to be
private and that is what we already do with generator expressions.
There's no RightAnswer(tm), just our best guess as to what is the most
useful behavior for the most number of people.
Raymond
> In article <4bb92850$0$8827$c3e...@news.astraweb.com>, Steven D'Aprano
> <st...@REMOVE-THIS-cybersource.com.au> wrote:
>>
>>Nevertheless, it is a common intuition that the list comp variable
>>should *not* be exposed outside of the list comp, and that the for-loop
>>variable should. Perhaps it makes no sense, but it is very common --
>>I've never heard of anyone being surprised that the for-loop variable is
>>exposed, but I've seen many people surprised by the fact that list-comps
>>do expose their loop variable.
>
> I've definitely seen people surprised by the for-loop behavior.
What programming languages were they used to (if any)?
I don't know of any language that creates a new scope for loop variables,
but perhaps that's just my ignorance...
--
Steven
Well, technically it's the idiomatic placement of the loop variable
declaration rather than the loop construct itself, but:
//Written in Java
//Can trivially be changed to C99 or C++
for (int i = 0; i < array.length; i++)
{
    // code
}
// variable 'i' no longer accessible

//Using a for-each loop specific to Java
for (ItemType item : array)
{
    // code
}
// variable 'item' no longer accessible
MRAB has mentioned Ada, let me mention C++ ...
<code language="C++">
#include <assert.h>

int main()
{
    int const i = 42;
    for( int i = 0; i < 10; ++i )
    {
        // blah blah
    }
    assert( i == 42 );
}
</code>
Java and C# take a slightly different approach where code analogous to the above
won't compile. But it's still a nested scope. E.g. ...
<code language="Java">
class App
{
    static public void main( String[] args )
    {
        for( int i = 0; i < 10; ++i )
        {
            // blah blah
        }
        // Uncomment statement below to get compilation error:
        //System.out.println( i );
    }
}
</code>
So, yes, considering Ada, C++, Java and C# -- and so on. ;-)
But two things that changed as C evolved were where you could introduce
new variables, and the lifetime of variables introduced in the loop
control structure, rather than inside the braces. The first change was
in C++ from the start, but I think the second change was also an
evolution in C++.
1) In original C, all declarations in a given scope had to occur before
any executable code began. For example, the following was illegal:
int a=12, b=42;
myfunc(a, b);
int c = 9; /* illegal */
2) In early (pre-standard) C++, the scope of i extended past the end of
the loop. (Original C did not allow a declaration in the for header at
all.)
for (int i=0; i< limit; ++i)
{
    z += i;
}
/* i is still valid after this curly brace */
In C99, and at least in later C++, the scope of i ends with the curly,
as though there were another invisible pair of braces:
{
    for (int i=0; i< limit; ++i)
    {
        z += i;
    }
}
/* i is no longer valid here */
Because C and C++ have explicit declarations, people who need the loop
variable after the loop is done can simply declare the loop variable
before the for statement.
DaveA
I think Pascal and Modula-2 do this, Fortran does this, as well as Ada.
I'm sure derivatives of Ada like Oracle's PL/SQL also enforce this. And
of course C/C++/Java if the programmer wants it that way. Actually I
think C was the first to consider "for" as some kind of syntactic sugar
for "while" (thus blurring the notion of a for-loop forever). Python's
for is really a member of the for-each family.
May I add that having strict for-loop iterators is a good thing, at
least in languages like C/C++/Fortran/etc. Optimizing compilers usually
spend some time detecting so-called "induction variables" when they're
not given: it helps simplifying loop bodies, it reduces register
pressure, and changes many array accesses into pointer increments, among
other things.
-- Alain.
>> I don't know of any language that creates a new scope for loop
>> variables, but perhaps that's just my ignorance...
>
> I think Pascal and Modula-2 do this, Fortran does this, as well as Ada.
Pascal doesn't do this.
[steve@sylar pascal]$ cat for_test.p
program main(input, output);
  var
    i: integer;
begin
  for i := 1 to 3 do
    begin
      writeln(i);
    end;
  writeln(i);
end.
[steve@sylar pascal]$ gpc for_test.p
[steve@sylar pascal]$ ./a.out
1
2
3
3
However you can't assign to the loop variable inside the loop. Outside of
the loop, it is treated as just an ordinary variable and you can assign
to it as usual.
--
Steven
> On Sat, 17 Apr 2010 12:05:03 +0200, Alain Ketterlin wrote:
>
>>> I don't know of any language that creates a new scope for loop
>>> variables, but perhaps that's just my ignorance...
>>
>> I think Pascal and Modula-2 do this, Fortran does this, as well as Ada.
>
> Pascal doesn't do this.
[...]
> for i := 1 to 3 do
>   begin
>     writeln(i);
>   end;
> writeln(i);
[...]
At http://standardpascal.org/iso7185.html#6.8.3.9%20For-statements
(sorry, I didn't find a more readable version), I read (second
paragraph, fourth sentence) :
"After a for-statement is executed, other than being left by a
goto-statement, the control-variable shall be undefined."
So, at least, the compiler should emit a warning.
> However you can't assign to the loop variable inside the loop. Outside of
> the loop, it is treated as just an ordinary variable and you can assign
> to it as usual.
I read the excerpt above as: you have to re-assign to it before using it.
The corner-case is obvious: if the loop body is not executed at all,
you cannot assume the "control-variable" will have the first value. I'm
curious to know what gets printed if you swap 1 and 3 in the above code.
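For comparison, a sketch of the analogous corner case in Python (function names hypothetical): if the iterable is empty, the loop variable is never bound at all, so using it afterwards raises an exception rather than yielding a stale or arbitrary value.

```python
def last_item(items):
    """Return the last element seen by the loop; fails on an empty iterable."""
    for item in items:
        pass
    return item  # item was never bound if the loop body never ran


def last_item_safe(items, default=None):
    """Same idea, but bind the name first so it always has a value."""
    item = default
    for item in items:
        pass
    return item
```

Calling last_item([]) raises UnboundLocalError, while last_item_safe([]) returns the default.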
-- Alain.
+1 QOTW
> Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> writes:
>
>> On Sat, 17 Apr 2010 12:05:03 +0200, Alain Ketterlin wrote:
>>
>>>> I don't know of any language that creates a new scope for loop
>>>> variables, but perhaps that's just my ignorance...
>>>
>>> I think Pascal and Modula-2 do this, Fortran does this, as well as
>>> Ada.
>>
>> Pascal doesn't do this.
> [...]
>> for i := 1 to 3 do
>>   begin
>>     writeln(i);
>>   end;
>> writeln(i);
> [...]
>
> At http://standardpascal.org/iso7185.html#6.8.3.9%20For-statements
> (sorry, I didn't find a more readable version), I read (second
> paragraph, fourth sentence) :
>
> "After a for-statement is executed, other than being left by a
> goto-statement, the control-variable shall be undefined."
>
> So, at least, the compiler should emit a warning.
None of my Pascal text books mention this behaviour, and gpc doesn't emit
a warning by default. Possibly there is some option to do so.
Standard Pascal isn't as useful as non-standard Pascal. This was one of
the reasons for the (in)famous article "Pascal Considered Harmful" back
in the 1980s(?).
>> However you can't assign to the loop variable inside the loop. Outside
>> of the loop, it is treated as just an ordinary variable and you can
>> assign to it as usual.
>
> I read the excerpt above as: you have to re-assign to it before using
> it.
>
> The corner-case is obvious: if the loop body is not executed at all, you
> cannot assume the "control-variable" will have the first value. I'm
> curious to know what gets printed if you swap 1 and 3 in the above code.
When I try it, i is initialised to 0. That either means that gpc zeroes
integers when they're declared, or the value it just randomly happened to
pick up was 0 by some fluke. I'm guessing the first is more likely.
--
Steven
> 2) In original C, and I think in C++, the lifetime of i lasted long
> after the loop ended.
> for (int i=0; i< limit; ++i)
> {
>     z += i;
> }
> i is still valid after this curly brace
>
> In C99, and at least in later C++, the scope of i ends with the curly,
> as though there were another invisible pair of braces:
> {
>     for (int i=0; i< limit; ++i)
>     {
>         z += i;
>     }
> }
> i is no longer valid here
>
Leading to the wonderful header declaration:
#define for if(0);else for
which moves the entire for loop including the declaration inside another
statement and therefore 'fixes' the variable scope for older compilers.
Ah, those were the days. :^)
--
Duncan Booth http://kupuguy.blogspot.com