_FillValue and missing_value are not permitted attributes and so cannot be edited/added

Luke Abraham

Mar 18, 2015, 11:57:26 AM
to scitoo...@googlegroups.com
We are seeing a similar problem reported here:

https://groups.google.com/d/topic/scitools-iris/OO-66QzcjHw/discussion (Global attribute in MERRA data causes problems)

although there have been no replies to this thread.

We are attempting to use iris to write out some CF-compliant netCDF files for a model intercomparison project. In the data specification both _FillValue and missing_value need to be defined.

The data is generated as a masked array, which allows _FillValue to be defined. However, missing_value cannot be manually defined, as it raises the error

ValueError: 'missing_value' is not a permitted attribute

In fact, neither _FillValue nor missing_value can be manually defined or edited (and I'm not sure how the masked array is able to create a _FillValue while I am not).

I have a number of questions:
  1. Is it possible at all to manually set missing_value (and _FillValue)?
  2. Should the masked array also be setting the missing_value attribute, as well as the _FillValue attribute?

I have produced a minimal script to highlight this problem, which I have tested in Iris 1.7.1 and 1.7.3. To use it, you first need to download hybrid_height.nc (the comments in the script give the source).

The example iris script is below, although the missing_value section will need to be commented out for the script to reach the failure in the _FillValue section.

The initial metadata from hybrid_height.nc is
        float air_potential_temperature(model_level_number, grid_latitude, grid_longitude) ;
                air_potential_temperature:standard_name = "air_potential_temperature" ;
                air_potential_temperature:units = "K" ;
                air_potential_temperature:ukmo__um_stash_source = "m01s00i004" ;
                air_potential_temperature:source = "Data from Met Office Unified Model 7.04" ;
                air_potential_temperature:grid_mapping = "rotated_latitude_longitude" ;
                air_potential_temperature:coordinates = "forecast_period forecast_reference_time level_height sigma surface_altitude time" ;

and the equivalent metadata from the MaskedArray.nc file (see script) is
        float air_potential_temperature(model_level_number, grid_latitude, grid_longitude) ;
                air_potential_temperature:_FillValue = 1.e+20f ;
                air_potential_temperature:standard_name = "air_potential_temperature" ;
                air_potential_temperature:units = "K" ;
                air_potential_temperature:um_stash_source = "m01s00i004" ;
                air_potential_temperature:grid_mapping = "rotated_latitude_longitude" ;
                air_potential_temperature:coordinates = "forecast_period forecast_reference_time level_height sigma surface_altitude time" ;

i.e. _FillValue has been added. 

Output from the script is:
{'source': 'Data from Met Office Unified Model 7.04', 'STASH': STASH(model=1, section=0, item=4), 'Conventions': 'CF-1.5'}
------------------------------
{'source': 'Data from Met Office Unified Model 7.04', 'STASH': STASH(model=1, section=0, item=4), 'Conventions': 'CF-1.5'}
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "iris_min.py", line 26, in <module>
    cube.attributes['missing_value'] = 1e+20
  File "/usr/local/shared/ubuntu-12.04/x86_64/python2.7-iris/1.7.1/local/lib/python2.7/site-packages/Iris-1.7.1-py2.7.egg/iris/_cube_coord_common.py", line 61, in __setitem__
    raise ValueError('%r is not a permitted attribute' % key)
ValueError: 'missing_value' is not a permitted attribute

and if the section trying to add missing_value is commented out, then the output is
{'source': 'Data from Met Office Unified Model 7.04', 'STASH': STASH(model=1, section=0, item=4), 'Conventions': 'CF-1.5'}
------------------------------
{'source': 'Data from Met Office Unified Model 7.04', 'STASH': STASH(model=1, section=0, item=4), 'Conventions': 'CF-1.5'}
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "iris_min.py", line 32, in <module>
    cube.attributes['_FillValue'] = 1e+20
  File "/usr/local/shared/ubuntu-12.04/x86_64/python2.7-iris/1.7.1/local/lib/python2.7/site-packages/Iris-1.7.1-py2.7.egg/iris/_cube_coord_common.py", line 61, in __setitem__
    raise ValueError('%r is not a permitted attribute' % key)
ValueError: '_FillValue' is not a permitted attribute

Many thanks for your help.

Luke

import iris
import numpy.ma as ma

filename = 'hybrid_height.nc'
file = iris.sample_data_path(filename)

# Obtainable from:
# https://github.com/SciTools/iris-sample-data/tree/master/sample_data
# Stick in pwd.

try:
    cube = iris.load_cube(file)
except IOError:
    cube = iris.load_cube(filename)

# now mask off a certain part of the cube data
cube.data = ma.masked_greater(cube.data, 288.0)

print cube.attributes
# masked array should be able to set _FillValue...
iris.fileformats.netcdf.save(cube, 'MaskedArray.nc', netcdf_format='NETCDF4_CLASSIC')

print '------------------------------'
print cube.attributes
# ...but it is impossible to add-in the missing_value attribute...
cube.attributes['missing_value'] = 1e+20
iris.fileformats.netcdf.save(cube, 'missing_value.nc', netcdf_format='NETCDF4_CLASSIC')

print '------------------------------'
print cube.attributes
# ...or indeed try to add/edit _FillValue.
cube.attributes['_FillValue'] = 1e+20
iris.fileformats.netcdf.save(cube, 'FillValue.nc', netcdf_format='NETCDF4_CLASSIC')

Andrew Dawson

Mar 18, 2015, 12:44:18 PM
to scitoo...@googlegroups.com
Hi Luke

This issue isn't actually as closely related to those others as you might think; only the error message is similar.

If you want to set the _FillValue attribute then you just need to set the fill value of the masked array containing the data. From your example:

print '------------------------------'
print cube.attributes
# ...or indeed try to add/edit _FillValue.
#cube.attributes['_FillValue'] = 1e+20
cube.data.set_fill_value(1e20)

iris.fileformats.netcdf.save(cube,'FillValue.nc',netcdf_format='NETCDF4_CLASSIC')

This works because Iris will detect the masked array and pass the fill value on to the netCDF4.Dataset.createVariable() method, which then takes care of setting the actual _FillValue attribute. Note that this means Iris never sets the _FillValue attribute itself, and therefore must prevent you from setting it manually to make sure the dataset remains consistent.
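
To illustrate the first half of that mechanism, here is a minimal NumPy-only sketch (written for Python 3, unlike the Python 2 script in this thread) of how a masked array carries its own fill value. The temperatures are made up for illustration, and nothing here calls Iris or netCDF4; it only shows the value that a writer could pick up from the array.

```python
import numpy as np
import numpy.ma as ma

# Made-up temperatures in kelvin, standing in for cube.data.
temps = np.array([280.0, 285.5, 290.1, 295.3], dtype=np.float32)

# Mask everything above 288 K, as the original script does.
masked = ma.masked_greater(temps, 288.0)

# The fill value is stored on the masked array itself.
masked.set_fill_value(1e20)

print(masked.mask.tolist())      # which points are masked
print(float(masked.fill_value))  # the value available to a file writer
print(masked.filled().tolist())  # masked points replaced by the fill value
```

Because the fill value travels with the data, changing it with set_fill_value() is enough; no cube attribute needs to be touched.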

The case for missing_value is slightly different. Like _FillValue, missing_value is considered a CF attribute, and is marked as restricted. However, unlike _FillValue, this attribute is never set automatically, which essentially forbids anyone from using it. I don't believe this was necessarily the intention. The CF conventions recommend that if both _FillValue and missing_value attributes are present then they should have the same value (section 2.5.1), but they do not require this to be true. Therefore it may be reasonable to modify Iris to allow users to set their own missing_value attribute and not worry if it is not consistent with the data contained in the cube, since this is only a recommendation of CF. I'm not sure personally, but it is an option to consider.
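
The "restricted attribute" behaviour can be pictured with a small pure-Python sketch (Python 3). This is an illustration under stated assumptions, not Iris's actual code: the class name RestrictedAttributes and its short FORBIDDEN tuple are hypothetical, and only the error message format is taken from the traceback earlier in the thread.

```python
class RestrictedAttributes(dict):
    """Hypothetical dict that refuses keys reserved for the file writer."""

    # Illustrative subset only; the real list used by Iris is not shown in this thread.
    FORBIDDEN = ("_FillValue", "missing_value")

    def __setitem__(self, key, value):
        if key in self.FORBIDDEN:
            raise ValueError("%r is not a permitted attribute" % key)
        super().__setitem__(key, value)


attrs = RestrictedAttributes()
attrs["source"] = "Data from Met Office Unified Model 7.04"  # accepted

try:
    attrs["missing_value"] = 1e20  # rejected, as in the traceback
except ValueError as err:
    message = str(err)
    print(message)
```

In this sketch, allowing users to set missing_value would amount to removing it from FORBIDDEN. Note that dict.update() bypasses __setitem__ on a dict subclass, so a fuller version would need to guard that path as well.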

Luke Abraham

Mar 18, 2015, 5:01:17 PM
to scitoo...@googlegroups.com
Hi Andrew,

Many thanks for your reply. 

You're right - only the error messages are related. Sorry about that.

I can't think of a situation where _FillValue and missing_value should be different (although they might come up of course). I would expect that they should always be the same. Indeed, they are both requested in this MIP as some groups have existing analysis scripts which use missing_value and some which use _FillValue, so here they must be the same. 

Given the CF convention it seems that iris is behaving incorrectly here. A solution could be to allow set_fill_value to also set missing_value - perhaps as an optional argument. 

Many thanks,
Luke

Andrew Dawson

Mar 19, 2015, 8:38:45 AM
to scitoo...@googlegroups.com
> I can't think of a situation where _FillValue and missing_value should be different (although they might come up of course). I would expect that they should always be the same.

In the past people have used the two to distinguish locations where no data are available (e.g. value of sea-surface-temperature over land) and points where no measurement exists (e.g. sea-surface-temperature measurement didn't meet QC standards). This isn't necessarily a great idea which is why the CF conventions recommend you don't do this, but they do not prohibit it.
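
A minimal NumPy sketch (Python 3) of that older practice, using made-up sentinel values and data: one value marks points where no data can exist, another marks failed measurements, and a reader has to mask both. The names NO_DATA and NO_MEASUREMENT are hypothetical.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical sentinels, chosen only for this illustration.
NO_DATA = -999.0         # e.g. land points in a sea-surface-temperature field
NO_MEASUREMENT = -888.0  # e.g. an observation that failed quality control

sst = np.array([291.2, NO_DATA, 289.7, NO_MEASUREMENT, 290.4])

# A reader that only knows one sentinel would treat the other as real data,
# so both have to be masked explicitly.
is_invalid = (sst == NO_DATA) | (sst == NO_MEASUREMENT)
sst_masked = ma.masked_where(is_invalid, sst)

print(sst_masked.count())        # number of valid points
print(float(sst_masked.mean()))  # mean over valid points only
```

The extra bookkeeping on the reading side is one reason a single shared value is the simpler choice.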

> Given the CF convention it seems that iris is behaving incorrectly here.

Only in that it does not allow you to set the missing_value at all, and perhaps it should.

> A solution could be to allow set_fill_value to also set missing_value - perhaps as an optional argument.

You've misunderstood the situation, set_fill_value is a method of NumPy's MaskedArray and nothing to do with Iris. We can't extend it, and doing so would do no good. The _FillValue in the netCDF file is controlled by a keyword Iris passes to the netCDF4 library, and the value used is the MaskedArray's fill value. In the netCDF4 library one must explicitly set missing_value as an attribute.

There are a few paths open though:

    1. Make Iris write a missing_value attribute with the same value as _FillValue (no good, against CF conventions)
    2. Allow the user to set missing_value themselves, the user must ensure the value of this attribute is correct.

We can't do 1 because the CF conventions say the missing_value and _FillValue attributes can be different. Option 2 is reasonable and seems to meet the specification in the CF conventions, in that it doesn't force missing_value and _FillValue to be the same. Implementing option 2 would mean modifying Iris to allow a cube to carry a missing_value attribute.

Luke Abraham

Mar 19, 2015, 8:47:09 AM
to scitoo...@googlegroups.com
Hi Andrew,

 
>> Given the CF convention it seems that iris is behaving incorrectly here.
>
> Only in that it does not allow you to set the missing_value at all, and perhaps it should.

Yes - that's what I meant here. Given that the CF conventions don't state that missing_value should not be used, iris preventing someone from using it is incorrect.
 

>> A solution could be to allow set_fill_value to also set missing_value - perhaps as an optional argument.
>
> You've misunderstood the situation, set_fill_value is a method of NumPy's MaskedArray and nothing to do with Iris. We can't extend it, and doing so would do no good. The _FillValue in the netCDF file is controlled by a keyword Iris passes to the netCDF4 library, and the value used is the MaskedArray's fill value. In the netCDF4 library one must explicitly set missing_value as an attribute.

Ah - sorry, I assumed that this was an iris method. However, iris is writing this value as an attribute to _FillValue, so somewhere it must know to set this. 
 
> There are a few paths open though:
>
>     1. Make Iris write a missing_value attribute with the same value as _FillValue (no good, against CF conventions)
>     2. Allow the user to set missing_value themselves, the user must ensure the value of this attribute is correct.
>
> We can't do 1 because the CF conventions say the missing_value and _FillValue attributes can be different. Option 2 is reasonable and seems to meet the specification in the CF conventions, in that it doesn't force missing_value and _FillValue to be the same. Implementing option 2 would mean modifying Iris to allow a cube to carry a missing_value attribute.

I agree with you - I think that option 2 is the best. 

Andrew Dawson

Mar 20, 2015, 11:59:39 AM
to scitoo...@googlegroups.com
> Ah - sorry, I assumed that this was an iris method. However, iris is writing this value as an attribute to _FillValue, so somewhere it must know to set this.

As I said, the netCDF4 library (https://github.com/Unidata/netcdf4-python) does the fill value writing, Iris never explicitly writes this _FillValue attribute which is why it is handled the way it is.

> I agree with you - I think that option 2 is the best.

I've opened a ticket for this: https://github.com/SciTools/iris/issues/1588

Jess Baker

Oct 19, 2017, 6:37:38 AM
to Iris

On Friday, March 20, 2015 at 3:59:39 PM UTC, Andrew Dawson wrote:
>> Ah - sorry, I assumed that this was an iris method. However, iris is writing this value as an attribute to _FillValue, so somewhere it must know to set this.
>
> As I said, the netCDF4 library (https://github.com/Unidata/netcdf4-python) does the fill value writing, Iris never explicitly writes this _FillValue attribute which is why it is handled the way it is.
>
>> I agree with you - I think that option 2 is the best.
>
> I've opened a ticket for this: https://github.com/SciTools/iris/issues/1588

Did this fix ever get implemented?  

Andrew Dawson

Oct 19, 2017, 8:00:20 AM
to Iris
I don't think so.

Daniel Kirkham

Oct 19, 2017, 8:32:53 AM
to Iris
It seems to me there's a problem with the suggested fix (option 2) in that Iris doesn't fully distinguish between global attributes and attributes for variables forming the data payload of a cube. It looks like on load all global attributes are added to every cube, and on save all attributes which are the same on every cube are saved as global attributes (I'm basing this on 5 minutes of playing around saving and loading netCDF files; there may be exceptions).

What this means is that currently if one saves a single cube all attributes become global. With the proposed change we'd have to make an exception for 'missing_value'. Is that something we want to do?
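
The save behaviour described above, and the exception being asked about, can be sketched in plain Python (Python 3). This is based only on the description in this post, not on Iris's source: the function split_attributes, its always_local parameter and the dictionaries standing in for cubes are all hypothetical.

```python
def split_attributes(cube_attrs, always_local=("missing_value",)):
    """Split per-cube attribute dicts into (global, per-variable) attributes.

    An attribute becomes global when every cube carries it with the same
    value, unless its name is listed in ``always_local``.
    """
    if not cube_attrs:
        return {}, []

    first = cube_attrs[0]
    global_attrs = {
        key: value
        for key, value in first.items()
        if key not in always_local
        and all(key in attrs and attrs[key] == value for attrs in cube_attrs)
    }
    local_attrs = [
        {k: v for k, v in attrs.items() if k not in global_attrs}
        for attrs in cube_attrs
    ]
    return global_attrs, local_attrs


# A single cube: without the exception, every attribute would become global.
single = [{"source": "Data from Met Office Unified Model 7.04", "missing_value": 1e20}]
global_attrs, local_attrs = split_attributes(single)
print(global_attrs)
print(local_attrs)
```

Calling split_attributes(single, always_local=()) shows the behaviour without the exception, where missing_value would end up as a global attribute rather than on the data variable.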