ccc-gistemp 0.3.0


Nick Barnes

Jan 27, 2010, 7:24:25 AM
to Reto Ruedy, ccc-giste...@googlegroups.com
Dear Reto,

I'm writing to let you know that we made a fresh release of
ccc-gistemp last night. This has quite a bit of clarification,
especially in steps 1 and 2, and some new tools to help with comparing
results, both from different runs and also against the run you
provided in December. We've removed most of the uses of intermediate
data files. There are also workarounds for some bugs we encountered
in tar file handling on some platforms.

Here is the release:

<http://ccc-gistemp.googlecode.com/files/ccc-gistemp-0.3.0.tar.gz>

and here is a file comparing the results from the release to the ones
you provided:

<http://ccc-gistemp.googlecode.com/files/ccc-gistemp-0.3.0-comparison-2010-01-27.html>

This latter file was generated by simply running

python tool/regression.py

Please feel free to download it and run it, or just browse through the
code. There's not much overall documentation yet (that's a bug!), but
anyone familiar with GISTEMP should find the code easy to follow.
step2.py should give you some indication of the sort of work we are
doing.

Ongoing work will continue this type of clarification through more of
the code. We're also going to modify all of the code to use a single
representation of a temperature series, and move any remaining
intermediate file I/O out into the tool/ directory. That will also
include some code to round or truncate intermediate data, to match the
effect of GISTEMP intermediate files which have been removed. For
instance, GISTEMP STEP2 reads the latitude and longitude of stations
from v2.inv (where they are given in hundredths of degrees) and then
rounds them to tenths of degrees to write to an intermediate file.
Although we no longer have the file, our code explicitly does the
rounding step to maintain GISTEMP compatibility (because it affects,
in a small way, the behaviour of both step 2 and step 3).
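
A minimal sketch of that kind of rounding step, for illustration only.
The function name round_to_tenths is hypothetical, the coordinate is
assumed to arrive as decimal text such as "36.93", and the tie-breaking
rule (halves round away from zero) is an assumption: the message does
not say which rule GISTEMP or ccc-gistemp actually uses.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_tenths(coordinate_text):
    """Round a coordinate given to hundredths of a degree to tenths.

    Hypothetical helper. ROUND_HALF_UP (ties away from zero) is an
    assumed rule, not one confirmed for GISTEMP.
    """
    value = Decimal(coordinate_text.strip())
    # Decimal arithmetic avoids binary floating-point surprises on
    # ties such as 40.25.
    return float(value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


if __name__ == "__main__":
    for text in ("36.93", "40.25", "-12.35"):
        print(text, "->", round_to_tenths(text))
```

Doing the rounding in memory like this reproduces the precision lost by
writing a tenths-of-a-degree file, without needing the file itself.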

I'm hoping to get to a stage at which it is possible to run
ccc-gistemp either in a "simple" mode (in which there are no
intermediate files and no such rounding) or in a "GISTEMP
compatibility" mode - with some intermediate files, and some rounding.
If we compute and quantify the differences between these modes, then
maybe some future release will do without the compatibility mode
altogether.
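
One way the differences between two such runs might be quantified is
sketched below. This is not the project's comparison tool; the function
name summarise_differences and the use of None for missing values are
assumptions made for the example.

```python
def summarise_differences(series_a, series_b):
    """Compare two equal-length series of anomalies.

    Hypothetical helper: missing values are represented as None and
    are skipped. Returns (count compared, max abs diff, mean abs diff).
    """
    if len(series_a) != len(series_b):
        raise ValueError("series must have the same length")
    diffs = [
        abs(a - b)
        for a, b in zip(series_a, series_b)
        if a is not None and b is not None
    ]
    if not diffs:
        return 0, 0.0, 0.0
    return len(diffs), max(diffs), sum(diffs) / len(diffs)


if __name__ == "__main__":
    simple = [0.10, 0.25, None, -0.30]   # illustrative values only
    compat = [0.10, 0.20, 0.15, -0.30]
    print(summarise_differences(simple, compat))
```

A summary of this sort, computed per result file, would show how much
the rounding in a compatibility mode changes the results.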

As we arrive at a detailed understanding of the whole code base, we
will also write some accompanying documentation to describe the whole
algorithm.

Regards,

Nick Barnes
Clear Climate Code project

Nick Barnes

Jan 27, 2010, 3:48:51 PM
to rru...@giss.nasa.gov, ccc-giste...@googlegroups.com, rsch...@giss.nasa.gov
At 2010-01-27 18:21:27+0000, Reto Ruedy writes:

> I assume that the tar file bugs you mention have nothing to do with our
> coding and don't affect any results - I add that only because I'm sure
> that this email exchange will end up in the public domain and will be
> assessed by people for whom this is not obvious.

To deal with this first: yes, the bugs were a set of problems with
some corners of our own code on some older Python implementations.
They do not affect the results: the code would simply fail to run on
those systems, and would generate error messages instead. They do not
indicate any sort of difficulty with either your results or with your
original code.

Regarding the public domain, everything on the ccc-gistemp-discuss
list is published by Google, and also I exercise very little control
over membership of the list - only excluding obvious spambots. And I
neither have nor want any control over what list members do with the
messages.

In fact, here's your message now:

<http://groups.google.com/group/ccc-gistemp-discuss/browse_thread/thread/f2b1f0ccaaa98540>

Nick B

Nick Barnes

Jan 27, 2010, 4:47:09 PM
to rru...@giss.nasa.gov, ccc-giste...@googlegroups.com, rsch...@giss.nasa.gov
At 2010-01-27 18:21:27+0000, Reto Ruedy writes:

> Thank you very much for all the effort you and your people put into
> checking and rewriting our programs. I hope to switch to your version of
> that program, if it produces data files that are compatible with our web
> utilities. If we do so, we will let you know about any additional
> modifications or documentation that we include in your code.

This is excellent news. When you say "that program", do you mean just
step2.py or the whole of ccc-gistemp? In either case, you should find
that it is all highly compatible. All of our result files, and all
the remaining intermediate files, are still in your file formats.

Here are the result files:

-rw-r--r-- 1 nb nb 1000760 Jan 26 22:50 BX.Ts.ho2.GHCN.CL.PA.1200
-rw-r--r-- 1 nb nb 14647 Jan 26 22:50 GLB.Ts.ho2.GHCN.CL.PA.txt
-rw-r--r-- 1 nb nb 14647 Jan 26 22:50 NH.Ts.ho2.GHCN.CL.PA.txt
-rw-r--r-- 1 nb nb 14647 Jan 26 22:50 SH.Ts.ho2.GHCN.CL.PA.txt
-rw-r--r-- 1 nb nb 14271 Jan 26 22:50 ZonAnn.Ts.ho2.GHCN.CL.PA.txt
-rw-r--r-- 1 nb nb 765 Jan 26 22:50 google-chart.url

(google-chart.url is one of ours).

Here are the intermediate files which we still generate:

-rw-r--r-- 1 nb nb 15974 Jan 26 22:50 ANNZON.Ts.ho2.GHCN.CL.PA.1200
-rw-r--r-- 1 nb nb 5 Jan 26 22:37 GHCN.last_year
-rw-r--r-- 1 nb nb 34001576 Jan 26 22:49 SBBX.HadR2
-rw-r--r-- 1 nb nb 50240120 Jan 26 22:49 SBBX1880.Ts.GHCN.CL.PA.1200
-rw-r--r-- 1 nb nb 19226996 Jan 26 22:40 Ts.GHCN.CL.PA
-rw-r--r-- 1 nb nb 29802663 Jan 27 16:48 Ts.txt
-rw-r--r-- 1 nb nb 176152 Jan 26 22:50 ZON.Ts.ho2.GHCN.CL.PA.1200
-rw-r--r-- 1 nb nb 176152 Jan 26 22:50 ZON.Ts.ho2.GHCN.CL.PA.1200.step1
-rw-r--r-- 1 nb nb 44716441 Jan 26 22:37 v2.mean_comb

(but note that BX.Ts.GHCN.CL.PA.1200 is just a place-holder).

As previously noted, we have removed some intermediate files, internal
to steps 0, 1, 2, and 3. We have also made some small changes to log
file formats in step 2. I removed one log file entirely, simply
because generating it required me to pass some additional arguments to
a function which I could otherwise simplify. If it turns out that any
of these are in fact result files for you then we can certainly put
them back in. Or we could generate similar or related files in any
format you choose.

So if you want to use our step2.py alongside the rest of your
GISTEMP, then please do go ahead. You might need a more up-to-date
version of Python than the one you currently use for STEP1. Please
ask if this proves difficult: when originally looking at GISTEMP we
quite easily adapted STEP1 to run on a fairly current Python.

If you are thinking of running ccc-gistemp instead of the whole
GISTEMP, of course I am delighted to hear it. The plan is for
ccc-gistemp to become so compellingly clear, versatile, and convenient
to use that you will adopt the whole thing. For you, being able to
say "our algorithm is the same as that really clear code over there"
is good, but "We use this really clear code" is surely better.

However, I would suggest that you defer that for a month or two, until
our clarification process reaches a natural pause. Do please download
our code, read it, run it, play with it, experiment with it, fix bugs
in it, use it to generate web pages. But I would be wary of using it
to produce Actual Science Results, with the GISS seal of approval,
until it passes out of the current fairly intensive development phase.

Nick Barnes

Jan 27, 2010, 6:23:10 PM
to rru...@giss.nasa.gov, ccc-giste...@googlegroups.com
At 2010-01-27 22:47:28+0000, Reto Ruedy writes:
> Ideally, we would like to replace our whole code, but that will not
> happen within the next few months.

Giving us plenty of time to put air in the tires and polish the
chrome. Excellent.

> The files I was worried about were the *.dbd files that are used in the
> station data part of our web site.

These are intermediate files from internal phases of STEP 1; we
removed them as part of rewriting step1.py this month. Although
adding them back should be "a small matter of programming",
unfortunately they depend on a Python library (bsddb) which is not
available on some Python implementations (and is in fact dropped
entirely from version 3.0), so we can't retain them in the core of
ccc-gistemp for compatibility reasons.

If you provide us with the related website code, I am sure we could
adapt it to use inputs from our step1.py. Alternatively, we could add
some code to our tool/ directory which would plug into our step1 and
generate these .dbd files.
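
A rough sketch of how such an optional tool could guard against bsddb
being missing. The names dbd_supported and write_dbd are hypothetical,
and the choice of a hash-format database is an assumption: the message
does not say which Berkeley DB format the .dbd files use.

```python
try:
    import bsddb  # part of some Python 2 installations; absent from Python 3
except ImportError:
    bsddb = None


def dbd_supported():
    """Report whether this Python can write Berkeley DB files via bsddb."""
    return bsddb is not None


def write_dbd(path, records):
    """Write a dict of string keys and values to a new database file.

    Hypothetical helper. The hash format is assumed, not confirmed.
    """
    if bsddb is None:
        raise RuntimeError("bsddb is not available on this Python")
    db = bsddb.hashopen(path, "n")  # "n" creates a new, empty database
    try:
        for key, value in records.items():
            db[key] = value
    finally:
        db.close()


if __name__ == "__main__":
    print("can write .dbd files:", dbd_supported())
```

Keeping the import guard in a tool/ script means the core steps never
depend on bsddb, while sites that have it can still produce the files.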

In the meantime, our earlier release 0.2.0, if you are able to run it,
does still produce those files.

> The SBBX* data are needed to produce the maps, ZON*step1 is not needed,
> neither is BX*, the other ZON* is needed to create the tables and line
> plots. The rest is only needed for debugging or solving little
> mysteries. For historical reasons, we would also like to be able to
> produce the files without including ocean data.
>
> In the next few weeks, we will have little time for activities that are
> not absolutely necessary, since we are already inundated by requests
> from the government to write reports and reply to inquiries related to
> the upcoming climate debates.

Understood. We will keep plugging away on code improvements, and
doubtless chat with you again when we get to 0.4.0 or 0.5.0.

Nick Barnes

Feb 4, 2010, 12:36:57 PM
to rru...@giss.nasa.gov, rsch...@giss.nasa.gov, ccc-giste...@googlegroups.com
At 2010-02-03 19:37:56+0000, Reto Ruedy writes:

> The situation with the station data web utilities is complicated by the
> fact that their originator is unavailable to us. I'll ask our web master
> to try to collect the bits and pieces that make up those utilities - but
> that may take a bit.

If that's too hard, we can certainly roll our own version of this sort
of visualisation.

I am imagining a tool which presents as a map of the final subbox
anomalies, in which the subboxes are clickable, allowing one to see
the historical anomaly series for that sub-box and also to drill down,
working back through the GISTEMP steps, ultimately to the original
station records. There could be charts at each stage (e.g. click on an
urban station and see a chart showing the different rural station
records, the combined rural record, the anomaly difference series, the
two-part linear fit, and before-and-after series for the urban
station). Ideally with a "slippy map" interface, like Google Maps.

Nick B

Nick Barnes

Mar 9, 2010, 11:15:04 AM
to rru...@giss.nasa.gov, ccc-giste...@googlegroups.com
At 2010-03-09 16:08:34+0000, Reto Ruedy writes:
> Nick, Thank you for the notice. I know I owe you some responses, but
> currently I'm busy with writing reports, responding to congressional
> inquiries and assisting Dr. Hansen to finish a paper that should serve
> as reference for all such inquiries, while setting up the long delayed
> model runs for the next IPCC report.
>
> I'll get back to you as soon as things slow down a little.

Of course. In the meantime we'll keep plugging away at our code. The
blogosphere seems to have gone a bit nuts lately about GISTEMP
emulations, variations, and reconstructions; lots of people are asking
us whether we can run little experiments. We wanted to get 0.4.0 out
so that more people could run these experiments for themselves
(e.g. running with and without UHI adjustment is now trivial).

Nick B
