netCDF files with missing_value vectors

Jonathan Helmus

Jul 18, 2013, 2:02:54 PM
to netcdf4...@googlegroups.com
Jeff and all,

I've run into a few netCDF files which have variables where the
missing_value attribute is a vector. This is allowed according to the
NetCDF User's Guide [1] but netcdf4-python cannot automatically mask and
scale such a variable. For example, the script:

import netCDF4

# create a sample netCDF file
dset = netCDF4.Dataset('test.nc', 'w')
dset.createDimension('foo', None)
bar = dset.createVariable('bar', 'i4', ('foo', ))
bar[:] = range(10)
bar.missing_value = [8, 9]
dset.close()

# read the netCDF4 file
dset2 = netCDF4.Dataset('test.nc', 'r')
print(dset2.variables['bar'][:])
dset2.close()


fails with a traceback ending with:

File "netCDF4.pyx", line 2685, in netCDF4.Variable._toma (netCDF4.c:33506)
AttributeError: 'bool' object has no attribute 'any'

I've been working around this by calling set_auto_maskandscale(False)
and performing the conversions myself, but I wanted to see if there was
interest in supporting automatic conversion of variables with
vector-valued missing_value attributes. I am willing to volunteer to put
together a patch to accomplish this if such a feature is desired and
would be accepted. I also understand if this is a "will not fix" issue
because the files themselves are seen as the problem and not
netcdf4-python. Please let me know.
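
For anyone hitting the same error, the manual conversion mentioned above might look roughly like the sketch below. The helper name mask_missing is hypothetical, and the array here is only a stand-in for raw data read with automatic masking and scaling turned off; it uses NumPy alone and is not how netcdf4-python does the conversion internally.

```python
import numpy as np

def mask_missing(data, missing_value):
    """Mask every element of data equal to any entry of missing_value.

    Hypothetical helper, not part of netcdf4-python.
    """
    data = np.asarray(data)
    # atleast_1d lets a scalar attribute and a vector attribute share one path
    missing = np.atleast_1d(missing_value)
    return np.ma.masked_array(data, mask=np.isin(data, missing))

# Stand-in for raw data obtained with auto conversion disabled, e.g.
#   var = dset2.variables['bar']
#   var.set_auto_maskandscale(False)
#   raw, missing_value = var[:], var.missing_value
raw = np.arange(10, dtype='i4')
masked = mask_missing(raw, [8, 9])
print(masked)
```

Exact comparison like this would not catch NaN missing values, so floating-point variables may need extra handling.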

[1]
http://www.unidata.ucar.edu/software/netcdf/docs/netcdf.html#Attribute-Conventions

Cheers,

- Jonathan Helmus

Jeffrey Whitaker

Jul 18, 2013, 2:43:18 PM
to netcdf4...@googlegroups.com

Jonathan:  That would be a nice feature to have, a patch would be quite welcome.  -Jeff

Jeffrey Whitaker

Jul 19, 2013, 8:27:42 AM
to netcdf4...@googlegroups.com


On Thursday, July 18, 2013 12:02:54 PM UTC-6, Jonathan Helmus wrote:

Jonathan: I went ahead and partially implemented this in svn.  However, there's a problem.  Converting to masked arrays should work fine, but writing masked arrays to the file is problematic.  Masked arrays can only have scalar fill_values, so the round trip (read data from file, convert to masked array, write masked array back to file) will not work, since there is no way to know which of the multiple missing values to put back in the data.
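
The round-trip problem can be seen with NumPy alone. This is only an illustrative sketch using the sample values from the script earlier in the thread, not code from the svn change: once two different missing values are folded into one mask, filling the mask with a single value cannot reproduce the original data.

```python
import numpy as np

raw = np.arange(10, dtype='i4')
missing_value = np.array([8, 9], dtype='i4')

# Reading: both missing values collapse into the same mask
masked = np.ma.masked_array(raw, mask=np.isin(raw, missing_value))

# Writing back: filled() can only substitute one value for every masked element
restored = masked.filled(missing_value[0])

print(raw)
print(restored)
print(np.array_equal(raw, restored))
```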

Jeffrey Whitaker

Jul 19, 2013, 1:35:43 PM
to netcdf4...@googlegroups.com

I've decided to just raise an exception when trying to assign a masked array to a variable that has a vector-valued missing_value attribute, since there is no way to decide how to fill in the masked values in that case.
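
A guard of that kind could be sketched as follows. The function name, the exception type, and the message are all assumptions made for illustration; the actual check in netcdf4-python may look different.

```python
import numpy as np

def check_masked_assignment(data, missing_value):
    """Hypothetical guard: refuse masked data when missing_value is a vector."""
    if np.ma.isMaskedArray(data) and np.size(missing_value) > 1:
        # No single value can stand in for the masked elements
        raise ValueError(
            "cannot assign a masked array to a variable with a "
            "vector-valued missing_value attribute")
    return data

# Plain arrays, and masked arrays paired with a scalar missing_value, pass through
check_masked_assignment(np.arange(3), [8, 9])
check_masked_assignment(np.ma.masked_array([1, 2], mask=[False, True]), 9)
```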

-Jeff