GRIB variable name changes in CDM/TDS 4.3


Ethan Davis

Mar 1, 2012, 4:45:33 PM
to netcd...@googlegroups.com
A quick summary of the conversation so far ...

On 27 Feb 2012 John Caron announced:
> The CDM / netCDF-Java library version 4.3 (and also TDS version 4.3) is
> considering a radical change in the way that GRIB variables are named.
> Instead of nice human readable names like
>
>    float Temperature(time=1, lat=361, lon=720);
>
> they are now like
>
>    float VAR_0-0-0_L6_I6_Hour_S194(time=1, lat=361, lon=720);
>
> with "human readable names" in the long_name:
>
>      :long_name = "Temperature (6_Hour Average) @ Maximum wind level";
>
> [snip] ... for full text see http://www.unidata.ucar.edu/mailing_lists/archives/netcdf-java/2012/msg00048.html

Don Murray responded:
> I believe keeping the human readable variable names (as in the previous 4.3
> release - with slight modifications) is much preferable and backward
> compatible.  I understand your reasons for wanting to change, but while
> that makes the programmer's life easier, it makes the user's (and other
> programmers') life harder.
>
> [snip] ... for full text see http://www.unidata.ucar.edu/mailing_lists/archives/netcdf-java/2012/msg00049.html

The conversation continued over the next few days; see the messages from 27 Feb - 1 March 2012 here:
http://www.unidata.ucar.edu/mailing_lists/archives/netcdf-java/2012/date.html

Benno Blumenthal and John briefly discussed a GRIB table registry.

There were some voices in favor of the change, but most expressed concern over the impact it would have on users and over the near-complete opacity of the proposed variable names.

There was assurance that any transition would be done methodically and with time for testing.

On 2/29/2012 10:15 AM, Ethan Davis wrote:
> First, I want to assure everyone that the transition will be done in a
> careful and controlled manner with continued input from and discussion
> with the community and, of course, plenty of time for testing.
> [snip]
>
> Either way, this is an opportunity for us to more deeply understand the
> interoperability problems we have with GRIB files (and more generally).
> It is also an opportunity to improve the flexibility of our systems and
> the robustness of these systems in the face of change.
>
> The Road Forward:
> After continued community feedback and discussion, including discussion
> with Unidata's Users Committee, we will:
>
> 1) Finalize a GRIB variable naming scheme along with the set of
> variable attributes the GRIB to netCDF mapping will include (both CF and
> GRIB specific attributes).
>
> 2) Develop a GRIB variable name mapping API that client applications can
> use to map old variable names to new variable names and vice versa.
>
> 3) Create CDM and TDS 4.3 release candidates.
>
> 4) Install a test TDS 4.3 server on motherlode to support testing and
> evaluation in parallel with the current TDS 4.2 installation. Work with
> other groups (like NCDC -- Thanks Glenn!) to set up test TDS 4.3 servers
> at their sites.
>
> 5) Work with the community and our committees to agree on a time table
> for the parallel testing and evaluation mentioned in 4 above.
>
> 6) Iterate through steps 1-5 above, as necessary.
>
> 7) At the agreed upon time, switch the main TDS on motherlode to 4.3.
> Work with the community during the transition of other servers.
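
As an illustration of item 2 above, a name-mapping API could be as small as a two-way lookup table. The sketch below is hypothetical: the class name `GribNameMap`, its methods, and the single old/new pairing are assumptions made for illustration and are not part of any CDM or TDS release.

```python
class GribNameMap:
    """Two-way lookup between old (human readable) and new (opaque) names."""

    def __init__(self, pairs):
        self._old_to_new = {}
        self._new_to_old = {}
        for old, new in pairs:
            # Refuse ambiguous tables: each name may appear only once.
            if old in self._old_to_new or new in self._new_to_old:
                raise ValueError(f"duplicate mapping for {old!r} / {new!r}")
            self._old_to_new[old] = new
            self._new_to_old[new] = old

    def to_new(self, old_name):
        """Return the new name, or None if the old name is unknown."""
        return self._old_to_new.get(old_name)

    def to_old(self, new_name):
        """Return the old name, or None if the new name is unknown."""
        return self._new_to_old.get(new_name)


# Illustrative pairing only; the real old/new correspondence may differ.
name_map = GribNameMap([("Temperature", "VAR_0-0-0_L6_I6_Hour_S194")])
```

A client such as a bundle loader could call `name_map.to_new(...)` when a stored variable name is not found in a dataset, and treat a `None` result as "no known replacement".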


Ethan Davis

Mar 1, 2012, 4:48:15 PM
to netcd...@googlegroups.com
A summary of the pros and cons of opaque and human readable variable names:


On 2/29/2012 10:15 AM, Ethan Davis wrote:
>
> - Opaque variable names:
>   - Pro: Names are very stable and accurately represent
>     the GRIB variable.
>   - Con: All IDV bundles and other scripts that access GRIB
>     data by variable name will break.
>   - Con: Users will need to be shown the netCDF variable's
>     "long_name" attribute to interpret the meaning of the
>     variable.
>
> - Human readable variable names:
>   - Pro: Users can interpret the meaning of the variable from
>     the netCDF variable name.
>   - Pro: Not all IDV bundles and other scripts that access
>     GRIB data by name will break (but some will since some
>     variable names are currently wrong).
>   - Con: The names are not stable and are fragile. As GRIB
>     tables evolve (versioned and un-versioned changes)
>     either variable names will change over time (not stable)
>     OR effort will need to be spent to track and minimize
>     visible table changes (fragile, prone to human error).

Ethan Davis

Mar 1, 2012, 4:55:16 PM
to netcd...@googlegroups.com
Jeff McWhirter asked about the scope of the impact on users.

On 3/1/2012 5:55 AM, Jeff McWhirter wrote:
>
> Wouldn't it be best to first understand the scope of the problem?
> Variable names show up in lots of places - not just in the IDV core but
> also in local user configurations and in plugins.
>
> First, there are bundles- as I mentioned before we have made great
> efforts to ensure that new versions of the IDV are always backwards
> compatible with old bundles. Bundles show up in lots of places - not
> just on the user's disks, so doing an automatic update of a bundle with
> new parameter naming just won't work. There are bundles on web sites,
> lots of bundles on RAMADDA servers, bundles in plugins and bundles
> referenced in ISL scripts.
>
> ISL scripts - You can specify parameter names in ISL scripts to
> explicitly create displays.
>
> Aliases - The IDV has an alias mechanism that is based around parameter
> names. These aliases are in the core IDV resources as well as in user's
> local configurations and in plugins.
>
> Jython scripts can have explicit references to parameters
>
> Derived quantities - the derived quantity framework is based on
> parameter names. The specification of derived quantities are in the core
> IDV as well as in user's local configurations and in plugins.
>
> Parameter groups - used by the derived quantity framework and are in
> core, user and plugins.
>
> I don't see how such a broad change can be done without major impacts to
> the IDV user community. Sometimes we are stuck with what we have. In
> this case we have 9+ years of a myriad number of dependencies on a
> particular naming convention. Transitioning away from this will be an
> incredibly difficult task and is of questionable benefit. Clearly, where
> names are just plain wrong (e.g., calling a pressure field temperature)
> then that has to be fixed. But, if it's just a matter of maintaining GRIB
> parameter tables then perhaps that would be a better solution than the
> currently proposed one.

Donald Murray

Mar 2, 2012, 9:14:08 AM
to netcd...@googlegroups.com
Thanks for setting this up, Ethan.

A couple of notes:

- This change affects more than just the IDV.  It affects the ncWMS, RAMADDA, the ncsubset service and anyone who has written a script to automatically access/download grib data from TDS servers and anyone who uses the netcdf-java library to access GRIB data.  At NOAA/ESRL/PSD, this will affect our use of TDS for serving up reforecast data.
- Unidata has a long history of maintaining GRIB tables to serve the needs of the users of gribtonc, GEMPAK, TDS/CDM and IDV.  I don't see that this has been an onerous task - the tables don't change frequently and differences can be resolved pretty easily.   As I asked previously, what is the scope of this effort?  It seems like John has done the really hard work of setting the structure in place and created a baseline.
- As for the registry service, if Unidata took this on, it would be a boon to the GRIB users community.   Perhaps you could work with the NCL group in CISL to share the burden.
- Backwards compatibility should be maintained so that a change like this is transparent to the end users.  That should include API changes - right now the 4.2 Grib table registry method does not work in 4.3.  Couldn't the old method be deprecated and just call the new method?  As a developer, it has been frustrating to deal with these API changes with each minor release of netcdf-java that are not backward compatible.

Thanks.

Don

John Caron

Mar 3, 2012, 12:30:06 AM
to netcd...@googlegroups.com
On 3/2/2012 7:14 AM, Donald Murray wrote:
> Thanks for setting this up, Ethan.
>
> A couple of notes:
>
> - This change affects more than just the IDV.  It affects the ncWMS, RAMADDA, the ncsubset service and anyone who has written a script to automatically access/download grib data from TDS servers and anyone who uses the netcdf-java library to access GRIB data.  At NOAA/ESRL/PSD, this will affect our use of TDS for serving up reforecast data.

yes, good point. we have to look at the big picture of anyone wanting to read GRIB files. We have to decide whether to make a generic reader or one that is accurate for just a subset of GRIB files. Perhaps a generic reader is too ambitious for Unidata.


> - Unidata has a long history of maintaining GRIB tables to serve the needs of the users of gribtonc, GEMPAK, TDS/CDM and IDV.  I don't see that this has been an onerous task - the tables don't change frequently and differences can be resolved pretty easily.   As I asked previously, what is the scope of this effort?  It seems like John has done the really hard work of setting the structure in place and created a baseline.

Thanks, and Unidata's long history is not as illustrious as it may seem. GEMPAK tables are specific to (I think) a subset of NCEP data. gribtonc is wildly out of date. CDM version 4.2 tables are a sincere effort but deeply flawed. CDM 4.3 is better but still not perfect or anywhere near complete. The reality is that no one has complete and accurate tables.


> - As for the registry service, if Unidata took this on, it would be a boon to the GRIB users community.   Perhaps you could work with the NCL group in CISL to share the burden.

yes, I'd like to get a registry service going. I had a first pass 2 years ago; I could probably get something useful up in a few weeks if I had a bit of time.


> - Backwards compatibility should be maintained so that a change like this is transparent to the end users.  That should include API changes - right now the 4.2 Grib table registry method does not work in 4.3.  Couldn't the old method be deprecated and just call the new method?  As a developer, it has been frustrating to deal with these API changes with each minor release of netcdf-java that are not backward compatible.

 I am sorry Dave but I can't do that. ;^)

Bill Fingerhut

Mar 3, 2012, 10:05:57 AM
to netcd...@googlegroups.com
As a long time IDV user and past member of the IDV Steering Committee I am
very disheartened with the proposed change. With many Bundles shared with
the community, used in education, and generating images for the web, the
thought of these bundles ceasing to work is quite disappointing.

One of Unidata's most important qualities has been its concern for the users.
The proposed change seems to contradict this basic quality. I can't help but
wonder if Unidata has changed or if a small group of employees has 'gone rogue'.

Perhaps there is another explanation. So, I would like to ask the Unidata
Users Committee, which has always been fair and technologically aware, to
present their thoughts. Please add the needed perspective.

Bill

Donald Murray

Mar 4, 2012, 10:09:12 AM
to netcd...@googlegroups.com
Hi John-

On Fri, Mar 2, 2012 at 10:30 PM, John Caron <ca...@unidata.ucar.edu> wrote:

> yes, good point. we have to look at the big picture of anyone wanting to
> read GRIB files. We have to decide whether to make a generic reader or one
> that is accurate for just a subset of GRIB files. Perhaps a generic reader
> is too ambitious for Unidata.

Having a framework that allows generic reading is a good goal. And it
seems like you have made good strides in this. However, having
output that is useful by end users (e.g. human readable names) and
what they have come to expect over 8 years should also be a goal.

To me, the change to generic names is like GEMPAK changing TMPK to
FOO. If that was done, it would be a huge impact on every GEMPAK
user.

> - Unidata has a long history of maintaining GRIB tables to serve the needs
> of the users of gribtonc, GEMPAK, TDS/CDM and IDV.  I don't see that this
> has been an onerous task - the tables don't change frequently and
> differences can be resolved pretty easily.   As I asked previously, what is
> the scope of this effort?  It seems like John has done the really hard work
> of setting the structure in place and created a baseline.
>
> Thanks, and Unidata's long history is not as illustrious as it may seem.
> GEMPAK tables are specific to (I think) a subset of NCEP data. gribtonc is
> wildly out of date. CDM version 4.2 tables are a sincere effort but deeply
> flawed. CDM 4.3 is better but still not perfect or anywhere near complete.
> The reality is that no one has complete and accurate tables.

Actually, I disagree with your negative portrayal of what Unidata has
accomplished. I know that gribtonc is no longer supported, but I used
it as an example of a success of a grib to netcdf converter that had
usefulness in its day. If not used directly in AWIPS, it (and the
ensuing discussions about NUWG standards) was a model for its
converter. As for GEMPAK, while the default tables focused on NCEP
grids, the framework is there to add in new tables as required which
users do routinely. As for netcdf-Java version 4.2 and earlier, I
would argue that the package opened up a wealth of data that could be
easily accessed through TDS and IDV. I would focus on the 80% that
was gotten right. Tables are always an issue and what the software
needs is a way for users to add their own tables. For the 4.2 version
(at least with GRIB1), if a table was not available, a generic name
like "var0_unknown" would be created. When users saw that (and/or
reported that to support), they would try to find a table and if one
was available, then the user had the ability to add that in and get the
right variable names. You have this facility in the 4.3, so the users
can still update what they need. Plus, it looks like you've now added
support for customized grib2 tables.
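
The lookup-with-fallback behavior described above can be sketched as follows. This is a hypothetical illustration, not the netcdf-Java implementation: the class `ParameterTables`, its methods, the table id, and the assumption that the digit in "var0_unknown" is the parameter number are all made up for the example.

```python
class ParameterTables:
    """Parameter-name lookup with user-supplied tables and a generic fallback."""

    def __init__(self):
        self._tables = {}  # table id -> {parameter number: name}

    def add_table(self, table_id, entries):
        """Register (or extend) a table, e.g. one supplied by a user."""
        self._tables.setdefault(table_id, {}).update(entries)

    def name_for(self, table_id, number):
        """Return the table name, or a generic placeholder if none is known."""
        table = self._tables.get(table_id, {})
        # The placeholder tells the user that a table is missing.
        return table.get(number, f"var{number}_unknown")


tables = ParameterTables()
before = tables.name_for("local-1", 0)   # no table registered yet
tables.add_table("local-1", {0: "My_parameter"})
after = tables.name_for("local-1", 0)    # resolved from the user's table
```

The point of the design is that a missing table degrades to a visibly generic name rather than a wrong one, and the user can repair it locally by registering the table.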

For the 20% that is wrong in 4.2, it seems like having the variables
without levels and the accumulation times are the major problems. And
for some of the latter, it's not that the values were wrong, but that
the accumulation period wasn't called out, depending on the model.
And fixing these is important. But keeping the 80% that was right
with the same variable names as before should be important also.

The key is that Unidata worked with the users to overcome these
problems and fix them in a way that didn't push the burden of changes
off to the users.

> - As for the registry service, if Unidata took this on, it would be a boon
> to the GRIB users community.   Perhaps you could work with the NCL group in
> CISL to share the burden.
>
> yes, I'd like to get a registry service going. I had a first pass 2 years ago;
> I could probably get something useful up in a few weeks if I had a bit of
> time.

I think that would be a good resource that would earn you fame and honor. ;-)

> - Backwards compatibility should be maintained so that a change like this is
> transparent to the end users.  That should include API changes - right now
> the 4.2 Grib table registry method does not work in 4.3.  Couldn't the old
> method be deprecated and just call the new method?  As a developer, it has
> been frustrating to deal with these API changes with each minor release of
> netcdf-java that are not backward compatible.
>
>
>  I am sorry Dave but I can't do that. ;^)

Actually, you could have the old package call the new, but since
you've created a whole new package (ucar.nc2.grib vs ucar.grib), I'll
let this one slide. ;-) But I would like to see the netcdf-java
package try to achieve backward compatibility without having published
APIs break.
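
For illustration, "having the old call the new" usually looks like a thin deprecated wrapper. The sketch below is in Python rather than Java, and both function names are hypothetical stand-ins, not the actual ucar.grib or ucar.nc2.grib entry points.

```python
import warnings


def register_table(path):
    """Stand-in for the new registration entry point (hypothetical)."""
    return f"registered {path}"


def add_parameter_table(path):
    """Old entry point, kept so existing callers continue to work."""
    warnings.warn(
        "add_parameter_table() is deprecated; use register_table()",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this wrapper
    )
    # Delegate so there is a single implementation to maintain.
    return register_table(path)
```

Existing code that calls `add_parameter_table(...)` keeps working and gets a deprecation warning, while new code calls `register_table(...)` directly.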

Don

Tom Whittaker

Mar 21, 2012, 10:39:41 AM
to netcd...@googlegroups.com
I would like to echo the concerns about backward compatibility....especially in the APIs.  While we can certainly change the IDV and McV code to accommodate such changes, what about all the other people who have created their own applications?  Is there no way to keep the current, published APIs around (even as 'deprecated')?

I cannot remember another time in the history of Unidata when updated versions of "core" Unidata software were released with significant changes that did not maintain backward compatibility.

One other comment about the names presented to users for identifying variables: Ethan suggested that applications should present the long_name attribute to the users so they can understand what the variable is.  This might work for things like the IDV's Data Choosers, but may not for labeling charts and graphs, where short (but descriptive) strings are needed.  Some long_name attributes are...well, _very_ long, and simply would not work for labeling...
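
To make the labeling concern concrete, here is a minimal sketch of what an application might do when it only has an opaque name and a long_name to work with. The helper `plot_label` and the 24-character limit are assumptions for illustration; plain truncation is only one possible policy and often not a good one.

```python
def plot_label(name, attributes, max_len=24):
    """Pick a label for a plot: long_name if present, shortened if too long."""
    long_name = attributes.get("long_name")
    if not long_name:
        # No long_name: the variable name is all we have.
        return name
    if len(long_name) <= max_len:
        return long_name
    # Keep the result within max_len characters, marking the cut with "...".
    return long_name[: max_len - 3].rstrip() + "..."


label = plot_label(
    "VAR_0-0-0_L6_I6_Hour_S194",
    {"long_name": "Temperature (6_Hour Average) @ Maximum wind level"},
)
```

The truncated label drops the level information ("@ Maximum wind level"), which shows why a generic shortening rule is unlikely to replace short, descriptive names.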

John Caron

Mar 22, 2012, 1:34:23 PM
to netcd...@googlegroups.com
Hi Tom:

On 3/21/2012 8:39 AM, Tom Whittaker wrote:
> I would like to echo the concerns about backward
> compatibility....especially in the APIs. While we can certainly
> change the IDV and McV code to accommodate such changes, what about
> all the other people who have created their own applications? Is
> there no way to keep the current, published APIs around (even as
> 'deprecated')?

I'm not sure what APIs you mean. The main problem is changing the
variable names, not the APIs.

If you are referring to the APIs for registering GRIB tables, we need to
leave the old ones working the way they do so we can compare new and old
GRIB processing. So new GRIB table registration needs to be different.
It's not a commonly used feature, and it's typically done in only one
spot, so I don't anticipate any real problems.

>
> I cannot remember another time in the history of Unidata when updated
> versions of "core" Unidata software were released with significant
> changes that did not maintain backward compatibility.

There's no question that the variable names have to change, just a
question of how to do that in the best way.


>
> One other comment about the names presented to users for identifying
> variables: Ethan suggested that applications should present the
> long_name attribute to the users so they can understand what the
> variable is. This might work for things like the IDV's Data Choosers,
> but may not for labeling charts and graphs, where short (but
> descriptive) strings are needed. Some long_name attributes
> are...well, _very_ long, and simply would not work for labeling...

The fundamental problem is that GRIB doesn't have short, immutable,
unique names. Really, GRIB variables don't have names at all; they have
a kind of cloud of attributes. So there's work to be done at all levels
of the software to do the best we can.

The problem of names is systemic, not just GRIB. WRF output has a
similar need in that sometimes the user needs to see both the variable
name and the long name. In terms of getting a nice name on a plot,
that's yet another problem that the application layer probably has to solve.

John
