Ah yes, Python 3.
I haven't done any Python 3 development yet (though I do generally
read the release notes).
Have you any feel for how difficult it would be to convert to Python
3? This, by the way, is a long way off; it's just that it would be
nice to plan ahead a bit. I'm loath to do "from __future__ import
print_function" at the moment because that doesn't start working until
Python 2.6, and currently we still work with 2.4 and 2.5.
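One way to sidestep the print question until 2.4 and 2.5 are dropped,
sketched here as an assumption rather than anything ccc-gistemp
actually does, is to write through sys.stdout.write, which behaves the
same on every version from 2.4 to 3.x. The helper name "emit" is made
up for illustration.

```python
import sys

def emit(message, stream=None):
    """Write message plus a newline; same behaviour on Python 2.4+ and 3.x."""
    # Hypothetical helper, not part of ccc-gistemp.
    if stream is None:
        stream = sys.stdout
    stream.write(message + "\n")

emit("step 1 done")
```

Because no print statement is involved, 2to3 has nothing to rewrite on
these lines.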
Cheers,
drj
I'll email the diff to show the changes later.
-John
So with that bug fixed here are the results.
Python 2.6.4 results - I changed the color to blue in the URL:
http://img.skitch.com/20101030-qtf8igej1ab8p1b4hd7ekys12j.png
Python 3.2a2 results:
http://img.skitch.com/20101030-khnf9nyp7xck1c2ksbsm3iuk7w.png
Laying the 2.6.4 results on top of the 3.2a2 results.
http://img.skitch.com/20101030-f6p4ap9aq3q1rqrrutbk636je6.png
Laying the 3.2a2 results on top of the 2.6.4 results.
http://img.skitch.com/20101030-bbhybj5nidpx3p73kprhpmejyc.png
So they are close but not identical. What will I do with the changes I've made?
-John
Thanks,
-John
[tiny differences]
Interesting. It's known (by me), that the exact results are sensitive
to orderings that are not well specified, for example, the order in
which items are returned by a dictionary iterator. Probably those
change between Python 2.6 and Python 3, and perhaps that is
responsible for the difference. I haven't done an investigation of
the sensitivity (yet).
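To make the ordering point concrete, here is a small sketch (made-up
numbers and station names, not GISTEMP data) showing that adding the
same floats in a different order can change the last bit of the
total, and that iterating over sorted keys is one possible way to pin
the order down.

```python
def running_total(values):
    """Add floats left to right, as a plain accumulation loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

# Same three numbers, two visiting orders (as two dict orderings might give).
forward = running_total([0.1, 0.2, 0.3])
backward = running_total([0.3, 0.2, 0.1])
print(forward == backward)   # prints False: the totals differ in the last bit

# Hypothetical data: sorted keys give the same order on every Python version.
anomalies = {"station_b": 0.2, "station_a": 0.1, "station_c": 0.3}
stable = running_total(anomalies[k] for k in sorted(anomalies))
```

The explicit loop is deliberate: the builtin sum() may use a more
accurate summation on newer Pythons and hide the effect.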
> Laying the 3.2a2 results on top of the 2.6.4 results.
>
> http://img.skitch.com/20101030-bbhybj5nidpx3p73kprhpmejyc.png
>
> So they are close but not identical. What will I do with the changes I've made?
Let's create a branch on Monday and we can put the changes there.
drj
tool/compare.py will provide a more detailed comparison of two result/
directories. It's buggy in the 0.6.1 release, but I fixed it just
after, in svn: http://code.google.com/p/ccc-gistemp/source/detail?r=596
drj
I'll give this a spin soon.
-John
>
I tried to run this tool as Python 3 but there was a decode utf8 issue
which I didn't spend any time looking into. I just dropped the latest
version into the 0.6.1 release and ran with Python 2.6.4.
-John
Ah splendid. Except that all the image URLs have been shortened so they
don't work (for me!); shame, because the images would've worked without
being shortened (they're all Google Chart Tools URLs).
However, we can see from the textual reports and tables that the
differences for hemispheres and at monthly resolution are also tiny.
>
> I tried to run this tool as Python 3 but there was a decode utf8 issue
> which I didn't spend any time looking into. I just dropped the latest
> version into the 0.6.1 release and ran with Python 2.6.4.
Yes, about 5 minutes after sending my previous e-mail I thought that
compare.py probably wouldn't work with Python 3.
drj
John, I've added you as a committer to the googlecode SVN repository.
Are you okay with creating a branch and committing your changes there?
Or would you like me to do it?
New policy: branches will be called: branch/YYYY-MM-DD/name/ (for what
it's worth, this is how we name branches in Ravenbrook).
Let's call this one branch/2010-11-01/python3/
This should create the branch:
    svn cp https://ccc-gistemp.googlecode.com/svn/trunk/ \
        https://ccc-gistemp.googlecode.com/svn/branch/2010-11-01/python3/
(use "creating development branch for Python 3" or somesuch as the
submit message)
and then you can check out that branch somewhere and copy your changes
onto it, then submit that.
Let me know if that's unclear or you need me to do something. Perhaps
we can IM?
drj
-John
-John
Must be a Gmail thing that.
I've written a short blog post on the branch and linked to the results
which I have hosted on my server if you want to see the charts too.
http://keyes.ie/ccc-gistemp-python-3-branch/
-John
Excellent, thanks for that.
I looked at the diff (tedious notes in the appendix), and it basically
looks fine. I'm encouraged by 2to3. Thanks for doing this little
investigation John.
We (I) should probably try and eliminate the things that require
fixing up by hand, so that we can "just" run 2to3 and at least that
will work. Judging from the diff, that's:
- not using "long" or "list" as a variable name;
- using // instead of / when I know that I want integer division;
- those things with strings.
Cheers,
drj
From code/eqarea.py:
      z = math.sin(lat)
      c = math.cos(lat)
      long = i[1]*math.pi/180
    - x = math.cos(long) * c
    - y = math.sin(long) * c
    + x = math.cos(int) * c
    + y = math.sin(int) * c
      return (x,y,z)
This is immediately alarming. Yes, it's not a good idea for me to
call a local variable "long", but 2to3 has changed my variable "long"
to "int". Can't see how this would ever work (it won't, it's in code
that isn't routinely called). Will probably change the variable
"long" to "lon" or "longitude". I guess it's one of those things
where it never occurred to me that "long" was a Python builtin. :)
There is a similar thing in fetch.py.
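As a sketch of the rename (the function name and signature here are
hypothetical, not the actual code in eqarea.py), using "lon" leaves
nothing for 2to3 to mistake for the old builtin. The last few lines
show what the converted code would do if it were ever called.

```python
import math

def to_xyz(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a unit vector (x, y, z)."""
    lat = lat_deg * math.pi / 180
    lon = lon_deg * math.pi / 180   # "lon", not "long": no clash with a 2.x builtin
    z = math.sin(lat)
    c = math.cos(lat)
    x = math.cos(lon) * c
    y = math.sin(lon) * c
    return (x, y, z)

# Passing the int type itself, as the 2to3 output does, is rejected.
try:
    math.cos(int)
except TypeError:
    print("math.cos(int) raises TypeError")
```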
Obviously 2to3 changes map(...) to list(map(...)) and similarly for
zip. Some of these turn out to be necessary, some do not. I often
use map or zip when I could've used itertools.imap and itertools.izip
and I don't care which I get.
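A small sketch of when the list() wrapper matters, using made-up data:
a single pass over the result works whether it is a list or a lazy
iterator, while indexing, len() or a second pass need a real list on
Python 3.

```python
# Hypothetical data: years paired with anomalies.
pairs = zip([1880, 1881, 1882], [-0.2, -0.1, -0.3])

# Iterated exactly once: fine on 2.x (list) and 3.x (iterator), no list() needed.
coldest = min(pairs, key=lambda p: p[1])

# Indexing and len() need a real list on 3.x, so here list() is required.
doubled = list(map(lambda v: v * 2, [1, 2, 3]))
print(doubled[0], len(doubled))
```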
From code/step2.py:
      annual_anoms = []
      first = None
    - for y in range(len(series)/12):
    + for y in range(int(len(series)/12)):
      # Seasons are Dec-Feb, Mar-May, Jun-Aug, Sep-Nov.
      # (Dec from previous year).
I guess this was a case where you inserted "int()" by hand? I should
probably use "//12" instead of "/12". Then that will be identical in
Python 3. There are probably a few other cases where "I know" that
the division is integer (browsing through the code makes me wonder if
these are all of the form "/12", that would be amusing). Conversely
there are a few cases where I deliberately infect an arithmetic
operation with a float, and those won't be necessary in Python 3.
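For illustration, a minimal sketch of the "//12" form on a made-up
series (the variable names other than "series" are hypothetical):

```python
series = [0.0] * 30   # hypothetical: 30 monthly values, 2 full years plus 6 months

whole_years = len(series) // 12    # floor division: 2 on both Python 2 and 3
for y in range(whole_years):
    year_values = series[y * 12:(y + 1) * 12]

# On Python 3 a bare / always yields a float, which range() rejects.
print(len(series) / 12)   # 2.5 on Python 3, 2 on Python 2
```

The // operator has been available since Python 2.2, so it is safe
for 2.4 and 2.5 as well.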
I note that 2to3 does some aggressive replacement of map(lambda ...)
with list comprehension.
> I looked at the diff (tedious notes in the appendix), and it basically
> looks fine. I'm encouraged by 2to3. Thanks for doing this little
> investigation John.
You're welcome.
> We (I) should probably try and eliminate the things that require
> fixing up by hand, so that we can "just" run 2to3 and at least that
> will work. Judging from the diff, that's:
> - not using "long" or "list" as a variable name;
> - using // instead of / when I know that I want integer division;
> - those things with strings.
Yeah I was thinking about this last night when I was in bed (sad I
know). The best thing is to make the 2.x code easy to translate to
3.x. I didn't ask why I was making changes, I just wanted to get it
running and then let you have a look at the diff.
> From code/eqarea.py:
>       z = math.sin(lat)
>       c = math.cos(lat)
>       long = i[1]*math.pi/180
>     - x = math.cos(long) * c
>     - y = math.sin(long) * c
>     + x = math.cos(int) * c
>     + y = math.sin(int) * c
>       return (x,y,z)
>
> This is immediately alarming. Yes, it's not a good idea for me to
> call a local variable "long", but 2to3 has changed my variable "long"
> to "int". Can't see how this would ever work (it won't, it's in code
> that isn't routinely called). Will probably change the variable
> "long" to "lon" or "longitude". I guess it's one of those things
> where it never occurred to me that "long" was a Python builtin. :)
A simple change.
> There is a similar thing in fetch.py.
>
> Obviously 2to3 changes map(...) to list(map(...)) and similarly for
> zip. Some of these turn out to be necessary, some do not. I often
> use map or zip when I could've used itertools.imap and itertools.izip
> and I don't care which I get.
>
> From code/step2.py:
>
>       annual_anoms = []
>       first = None
>     - for y in range(len(series)/12):
>     + for y in range(int(len(series)/12)):
>       # Seasons are Dec-Feb, Mar-May, Jun-Aug, Sep-Nov.
>       # (Dec from previous year).
>
> I guess this was a case where you inserted "int()" by hand? I should
> probably use "//12" instead of "/12". Then that will be identical in
> Python 3. There are probably a few other cases where "I know" that
> the division is integer (browsing through the code makes me wonder if
> these are all of the form "/12", that would be amusing). Conversely
> there are a few cases where I deliberately infect an arithmetic
> operation with a float, and those won't be necessary in Python 3.
Yeap this was me. Again I just wanted it to work. Error message was
TypeError int needed.
> I note that 2to3 does some aggressive replacement of map(lambda ...)
> with list comprehension.
It does. I think list comprehension is clearer code than map(lambda
...) so this is something that could be changed in the 2.x code if you
agree with that.
I haven't examined the diff in detail yet, but I will.
What's the definitive way to check if a change has broken anything? I
think some validation tests are required.
-John
Yes, I totally understand, and it's a good approach.
>>
>> I guess this was a case where you inserted "int()" by hand? I should
>> probably use "//12" instead of "/12". Then that will be identical in
>> Python 3. There are probably a few other cases where "I know" that
>> the division is integer (browsing through the code makes me wonder if
>> these are all of the form "/12", that would be amusing). Conversely
>> there are a few cases where I deliberately infect an arithmetic
>> operation with a float, and those won't be necessary in Python 3.
>
> Yeap this was me. Again I just wanted it to work. Error message was
> TypeError int needed.
Again, totally understand, for the experiment of getting it working in
2to3, it doesn't really matter which one you did.
>> I note that 2to3 does some aggressive replacement of map(lambda ...)
>> with list comprehension.
>
> It does. I think list comprehension is clearer code than map(lambda
> ...) so this is something that could be changed in the 2.x code if you
> agree with that.
Well, I guess Guido agrees with you. As for the output of 2to3, I
think sometimes the list comprehension is clearer, sometimes the map
is. But when we do switch over and use 2to3, I'm not going to cry
about all those maps turning into list comprehensions. Conversely,
I'm not going to switch them all in the 2.x code either.
>
> What's the definitive way to check if a change has broken anything? I
> think some validation tests are required.
There isn't really any and this is a bug.
Conceptually it's actually quite a tricky problem. It depends if you
make a change that you expect to change the answer or not. The
published result, the two hemispheric and global temperature series
printed to 2 decimal places, depends on all sorts of things that
aren't particularly well specified (the exact result of trigonometric
functions, whether you used 32-bit or 64-bit floats, the order that
items came out of a dict iterator); it doesn't depend on these things
very much, but a little bit (for example, it's easy to show that
changing the order of items in get_longest() in step1.py will change
the result by a tiny bit).
So the published result can change for legitimate reasons: example, we
changed a dict to a set. I would like to lock these down, but that's
a different matter.
Conversely, genuine bugs can change the result by less than some of
the above effects. For example, accidentally dropping a few stations
(see Issue 84) can go unnoticed. It's definitely a bug, but one that
hardly affects the result.
Investigating tiny changes in the result (such as the ones from 2to3)
is usually a matter of poring over log files and determining that
nothing suspicious is going on. It's not easy.
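As a starting point only, and not a substitute for reading the logs,
a check along these lines could at least flag where two runs disagree
most and whether either run is missing years. This is a sketch: the
function name and the {year: anomaly} layout are assumptions, and the
numbers are made up rather than real output.

```python
def max_abs_difference(series_a, series_b):
    """Return (year, difference) for the largest absolute disagreement
    between two {year: anomaly} dicts, over years present in both."""
    common = sorted(set(series_a) & set(series_b))
    worst_year = max(common, key=lambda y: abs(series_a[y] - series_b[y]))
    return worst_year, abs(series_a[worst_year] - series_b[worst_year])

# Made-up numbers, not real GISTEMP output.
py2_run = {1880: -0.23, 1881: -0.15, 1882: -0.18}
py3_run = {1880: -0.23, 1881: -0.14, 1882: -0.18}

year, diff = max_abs_difference(py2_run, py3_run)
missing = sorted(set(py2_run) ^ set(py3_run))   # years only one run produced
print(year, round(diff, 2), missing)
```

What counts as an acceptable difference is still a judgement call; a
check like this only narrows down where to look.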
drj