Thanks for setting this up, Ethan.
A couple of notes:
- This change affects more than just the IDV. It also affects ncWMS, RAMADDA, the ncsubset service, anyone who has written a script to automatically access or download GRIB data from TDS servers, and anyone who uses the netcdf-java library to access GRIB data. At NOAA/ESRL/PSD, this will affect our use of TDS for serving up reforecast data.
- Unidata has a long history of maintaining GRIB tables to serve the needs of the users of gribtonc, GEMPAK, TDS/CDM and IDV. I don't see that this has been an onerous task - the tables don't change frequently and differences can be resolved pretty easily. As I asked previously, what is the scope of this effort? It seems like John has done the really hard work of setting the structure in place and created a baseline.
- As for the registry service, if Unidata took this on, it would be a boon to the GRIB users community. Perhaps you could work with the NCL group in CISL to share the burden.
- Backwards compatibility should be maintained so that a change like this is transparent to the end users. That should include API changes - right now the 4.2 Grib table registry method does not work in 4.3. Couldn't the old method be deprecated and just call the new method? As a developer, it has been frustrating to deal with these API changes with each minor release of netcdf-java that are not backward compatible.
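
To make the last point concrete, here is a minimal sketch of the kind of shim I have in mind: the old entry point stays, is marked deprecated, and simply forwards to the new one. The class and method names (OldTableRegistry, NewTableRegistry, addParameterTable, registerTable) and the argument list are made up for illustration; they are not the actual ucar.grib or ucar.nc2.grib signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a new-style table registry (NOT the real ucar.nc2.grib API).
class NewTableRegistry {
    private static final List<String> registered = new ArrayList<>();

    // Records a table under a "center:subcenter:version=path" key.
    static void registerTable(int center, int subcenter, int version, String path) {
        registered.add(center + ":" + subcenter + ":" + version + "=" + path);
    }

    // Returns a copy so callers cannot modify the registry directly.
    static List<String> registeredTables() {
        return new ArrayList<>(registered);
    }
}

// Hypothetical stand-in for the old entry point, kept only as a thin shim.
class OldTableRegistry {
    /** @deprecated use {@link NewTableRegistry#registerTable} instead. */
    @Deprecated
    static void addParameterTable(int center, int subcenter, int version, String path) {
        // No logic of its own: forward to the new registry.
        NewTableRegistry.registerTable(center, subcenter, version, path);
    }
}

class ShimDemo {
    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        // Existing client code keeps compiling and running unchanged.
        OldTableRegistry.addParameterTable(7, 0, 2, "mytable.tab");
        System.out.println(NewTableRegistry.registeredTables());
    }
}
```

With something like this, existing callers get a deprecation warning at compile time instead of a failure, and can migrate on their own schedule.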
On Fri, Mar 2, 2012 at 10:30 PM, John Caron <ca...@unidata.ucar.edu> wrote:
> yes, good point. we have to look at the big picture of anyone wanting to
> read GRIB files. We have to decide whether to make a generic reader or one
> that is accurate for just a subset of GRIB files. Perhaps a generic reader
> is too ambitious for Unidata.
Having a framework that allows generic reading is a good goal. And it
seems like you have made good strides in this. However, having
output that is useful to end users (e.g. human-readable names) and
matches what they have come to expect over 8 years should also be a goal.
To me, the change to generic names is like GEMPAK changing TMPK to
FOO. If that was done, it would be a huge impact on every GEMPAK
user.
> - Unidata has a long history of maintaining GRIB tables to serve the needs
> of the users of gribtonc, GEMPAK, TDS/CDM and IDV. I don't see that this
> has been an onerous task - the tables don't change frequently and
> differences can be resolved pretty easily. As I asked previously, what is
> the scope of this effort? It seems like John has done the really hard work
> of setting the structure in place and created a baseline.
>
> Thanks, and Unidata's long history is not as illustrious as it may seem.
> GEMPAK tables are specific to (I think) a subset of NCEP data. gribtonc is
> wildly out of date. CDM version 4.2 tables are a sincere effort but deeply
> flawed. CDM 4.3 is better but still not perfect or anywhere near complete.
> The reality is that no one has complete and accurate tables.
Actually, I disagree with your negative portrayal of what Unidata has
accomplished. I know that gribtonc is no longer supported, but I used
it as an example of a successful GRIB-to-netCDF converter that was
useful in its day. If not used directly in AWIPS, it (and the
ensuing discussions about NUWG standards) was a model for its
converter. As for GEMPAK, while the default tables focused on NCEP
grids, the framework is there to add in new tables as required which
users do routinely. As for netcdf-Java version 4.2 and earlier, I
would argue that the package opened up a wealth of data that could be
easily accessed through TDS and IDV. I would focus on the 80% that
was gotten right. Tables are always an issue and what the software
needs is a way for users to add their own tables. For the 4.2 version
(at least with GRIB1), if a table was not available, a generic name
like "var0_unknown" would be created. When users saw that (and/or
reported that to support), they would try to find a table and if one
was available, then the user had the ability to add that in and get the
right variable names. You have this facility in 4.3, so the users
can still update what they need. Plus, it looks like you've now added
support for customized grib2 tables.
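
As a rough sketch of the lookup behavior described above (user-supplied table first, then the standard table, then a generic placeholder), something like the following would do. The ParamTables class, its methods, and the exact "var<number>_unknown" pattern are my own assumptions for illustration, not the netcdf-java implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a parameter-name lookup; not the netcdf-java table classes.
class ParamTables {
    private final Map<Integer, String> standard = new HashMap<>();
    private final Map<Integer, String> userDefined = new HashMap<>();

    ParamTables() {
        // GRIB1 WMO table 2 defines parameter 11 as temperature.
        standard.put(11, "Temperature");
    }

    // Lets a user supply a name for a parameter the standard table lacks.
    void addUserEntry(int paramNumber, String name) {
        userDefined.put(paramNumber, name);
    }

    // User entries win, then the standard table, then a generic placeholder.
    String nameFor(int paramNumber) {
        String name = userDefined.get(paramNumber);
        if (name == null) {
            name = standard.get(paramNumber);
        }
        if (name == null) {
            name = "var" + paramNumber + "_unknown"; // assumed placeholder pattern
        }
        return name;
    }
}
```

A lookup for an unlisted parameter returns the placeholder until the user adds an entry, after which the same lookup returns the user's name.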
For the 20% that is wrong in 4.2, it seems like variables that lack
levels and accumulation times are the major problems. And
for some of the latter, it's not that the values were wrong, but that
the accumulation period wasn't called out, depending on the model.
And fixing these is important. But keeping the 80% that was right
with the same variable names as before should be important also.
The key is that Unidata worked with the users to overcome these
problems and fix them in a way that didn't push the burden of changes
off to the users.
> - As for the registry service, if Unidata took this on, it would be a boon
> to the GRIB users community. Perhaps you could work with the NCL group in
> CISL to share the burden.
>
> Yes, I'd like to get a registry service going. I had a first pass 2 years ago,
> I could probably get something useful up in a few weeks if I had a bit of
> time.
I think that would be a good resource that would earn you fame and honor. ;-)
> - Backwards compatibility should be maintained so that a change like this is
> transparent to the end users. That should include API changes - right now
> the 4.2 Grib table registry method does not work in 4.3. Couldn't the old
> method be deprecated and just call the new method? As a developer, it has
> been frustrating to deal with these API changes with each minor release of
> netcdf-java that are not backward compatible.
>
>
> I am sorry Dave but I can't do that. ;^)
Actually, you could have the old package call the new, but since
you've created a whole new package (ucar.nc2.grib vs ucar.grib), I'll
let this one slide. ;-) But I would like to see the netcdf-java
package try to achieve backward compatibility without having published
APIs break.
Don
On 3/21/2012 8:39 AM, Tom Whittaker wrote:
> I would like to echo the concerns about backward
> compatibility....especially in the APIs. While we can certainly
> change the IDV and McV code to accommodate such changes, what about
> all the other people who have created their own applications? Is
> there no way to keep the current, published APIs around (even as
> 'deprecated')?
I'm not sure which APIs you mean. The main problem is changing the
variable names, not the APIs.
If you are referring to the APIs for registering GRIB tables, we need to
leave the old ones working the way they do so we can compare new and old
GRIB processing. So new GRIB table registration needs to be different.
It's not a commonly used feature, and it's typically done in only one
spot, so I don't anticipate any real problems.
>
> I cannot remember another time in the history of Unidata when updated
> versions of "core" Unidata software were released with significant
> changes that did not maintain backward compatibility.
There's no question that the variable names have to change, just a
question of how to do that in the best way.
>
> One other comment about the names presented to users for identifying
> variables: Ethan suggested that applications should present the
> long_name attribute to the users so they can understand what the
> variable is. This might work for things like the IDV's Data Choosers,
> but may not for labeling charts and graphs, where short (but
> descriptive) strings are needed. Some long_name attributes
> are...well, _very_ long, and simply would not work for labeling...
The fundamental problem is that GRIB doesn't have short, immutable,
unique names. Really, GRIB doesn't have names at all; it has a kind
of cloud of attributes. So there's work to be done at all levels of the
software to do the best we can.
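
To illustrate what building a name out of that cloud of attributes can look like, here is a minimal sketch that joins whichever attributes are present (parameter, level type, statistical interval) into one identifier. The GribNameSketch class, the choice of attributes, and the underscore convention are assumptions for illustration only; this is not the naming scheme used in 4.3.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: derive a variable name from GRIB-style attributes.
class GribNameSketch {
    // Joins the attributes that are present; null or empty parts are skipped.
    static String makeName(String parameter, String levelType, String interval) {
        List<String> parts = new ArrayList<>();
        for (String p : new String[] {parameter, levelType, interval}) {
            if (p != null && !p.isEmpty()) {
                parts.add(p.replace(' ', '_')); // keep the result a single token
            }
        }
        return String.join("_", parts);
    }
}
```

The more attributes that go into the name, the more likely it is to be unique, and the less likely it is to be short, which is the tension under discussion.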
The problem of names is systemic, not just GRIB. WRF output has a
similar need in that sometimes the user needs to see both the variable
name and the long name. In terms of getting a nice name on a plot,
that's yet another problem that the application layer probably has to solve.
John