keeping a degenerate scalar coordinate as a dim_coord following a collapse

465 views
Skip to first unread message

Luke Abraham

unread,
May 16, 2016, 6:18:23 AM5/16/16
to Iris
Hello,

I would like to maintain the dimensionality of a cube following a collapse, e.g.

cube=iris.load_cube('/home/users/nlabraham/temp/ozone_last50yrs_3D_clim.nc')
print cube
Stash code = 34001 / (kg kg-1)      (time: 12; hybrid_ht: 60; latitude: 73; longitude: 96)
     
Dimension coordinates:
          time                           x              
-             -              -
          hybrid_ht                      
-              x             -              -
          latitude                      
-              -             x              -
          longitude                      
-              -             -              x
     
Attributes:
          CDI
: Climate Data Interface version 1.5.4 (http://code.zmaw.de/projects/cdi...
          CDO
: Climate Data Operators version 1.5.4 (http://code.zmaw.de/projects/cdo...
         
Conventions: CF-1.4
          date
: 01/12/60
          history
: Mon Dec 16 13:34:02 2013: cdo ymonmean ozone_last50yrs_3D_complete.nc ozone_last50yrs_3D_clim.nc
Thu...
          name
: tracer1
          source
: Unified Model Output (Vn 7.3):
          time
: 00:00
          title
: Stash code = 34001
cube
=cube.collapsed(['longitude'],iris.analysis.MEAN)
print cube

Stash code = 34001 / (kg kg-1)      (time: 12; hybrid_ht: 60; latitude: 73)
     
Dimension coordinates:
          time                           x              
-             -
          hybrid_ht                      
-              x             -
          latitude                      
-              -             x
     
Scalar coordinates:
          longitude
: 180.0 degrees, bound=(0.0, 360.0) degrees
     
Attributes:
          CDI
: Climate Data Interface version 1.5.4 (http://code.zmaw.de/projects/cdi...
          CDO
: Climate Data Operators version 1.5.4 (http://code.zmaw.de/projects/cdo...
         
Conventions: CF-1.4
          date
: 01/12/60
          history
: Mon Dec 16 13:34:02 2013: cdo ymonmean ozone_last50yrs_3D_complete.nc ozone_last50yrs_3D_clim.nc
Thu...
          name
: tracer1
          source
: Unified Model Output (Vn 7.3):
          time
: 00:00
          title
: Stash code = 34001
     
Cell methods:
          mean
: longitude


As you can see, following the collapse, the data array went from a (12,60,73,96) to a (12,60,73) with the longitude now being listed as a scalar coordinate.

What I would like collapse to return is a 4D array with a degenerate longitude dimension of size 1. I'm not sure if there is an argument to pass to collapse that does this, and if so, what is it?

If there isn't, is there an easy way to add the dimension. The add_dim_coord command seems to require an existing dimension in the data array of the cube, but when I tried to copy a 4D array of shape (12,60,73,1) containing the data in the cube to the existing data array the error was 

print shape(darr)
(12, 60, 73, 1)
print shape(cube.data)

(12, 60, 73)

cube
.data=darr

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-29-1aac0f45cccc> in <module>()
----> 1 o3data.data=darr


/usr/lib/python2.7/site-packages/iris/cube.pyc in data(self, value)
   
1525             if self.shape or data.shape != (1,):
   
1526                 raise ValueError('Require cube data with shape %r, got '
-> 1527                                  '%r.' % (self.shape, data.shape))
   
1528
   
1529         self._my_data = data


ValueError: Require cube data with shape (12, 60, 73), got (12, 60, 73, 1).

Is there an easy way to do this without making a new cube from scratch and adding every thing in manually?

I note that for iris.cube.interpolate there is the collapse_scalar argument which can control this. However, it isn't available for collapse. Should I raise a feature request for this?

Kwargs:


collapse_scalar
:
Whether to collapse the dimension of scalar sample points in the resulting cube. Default is True.



(if you are interested in why I want to do this, please see below)

I want to be able to do this for 2 reasons, one immediate, one more long-term.

1) I am trying to create an ozone climatology to replace an existing UM ancillary. This is defined as, when reading the ancil directly:

o3clim=iris.load_cube('/home/users/nlabraham/temp/qrclim.ozone_L85_O85')
/usr/lib/python2.7/site-packages/iris/fileformats/ff.py:707: UserWarning: The stash code m01s00i060 is on a grid 22 which has not been explicitly handled by the fieldsfile loader. Assuming the data is on a P grid.
  subgrid
))
/usr/lib/python2.7/site-packages/iris/fileformats/ff.py:721: UserWarning: Partially missing X or Y coordinate values.
 
'Partially missing X or Y coordinate values.')
/usr/lib/python2.7/site-packages/iris/fileformats/rules.py:914: UserWarning: Unable to create instance of HybridHeightFactory. The file(s) ['/home/users/nlabraham/temp/qrclim.ozone_L85_O85'] don't contain field(s) for 'orography'.
  factory=factory_name))


print o3clim
mass_fraction_of_ozone_in_air / (1) (time: 12; model_level_number: 85; latitude: 145; longitude: 1)
     Dimension coordinates:
          time                           x                       -             -               -
          model_level_number             -                       x             -               -
          latitude                       -                       -             x               -
          longitude                      -                       -             -               x
     Auxiliary coordinates:
          level_height                   -                       x             -               -
          sigma                          -                       x             -               -
     Attributes:
          STASH: m01s00i060
          source: Data from Met Office Unified Model
          um_version: 7.9
     Cell methods:
          mean: time
          mean: longitude


i.e. this contains a degenerate longitudinal dimension of shape (12,85,145,1). I then want to use regrid to convert the N48 grid to N96 (I will then use interpolate to do the vertical coordinate)

cube = cube.regrid(o3clim, iris.analysis.Linear(extrapolation_mode='error'))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-24-b49d83c7035e> in <module>()
----> 1 o3data = o3data.regrid(o3clim, iris.analysis.Linear(extrapolation_mode='error'))


/usr/lib/python2.7/site-packages/iris/cube.pyc in regrid(self, grid, scheme)
   
3532
   
3533         """
-> 3534         regridder = scheme.regridder(self, grid)
   3535         return regridder(self)
   3536


/usr/lib/python2.7/site-packages/iris/analysis/__init__.pyc in regridder(self, src_grid, target_grid)
   2149         """

   
2150         return RectilinearRegridder(src_grid, target_grid, 'linear',
-> 2151                                     self._normalised_extrapolation_mode())
   
2152
   
2153


/usr/lib/python2.7/site-packages/iris/analysis/_regrid.pyc in __init__(self, src_grid_cube, tgt_grid_cube, method, extrapolation_mode)
     
77         # Snapshot the state of the cubes to ensure that the regridder
     
78         # is impervious to external changes to the original source cubes.
---> 79         self._src_grid = snapshot_grid(src_grid_cube)
     
80         self._tgt_grid = snapshot_grid(tgt_grid_cube)
     
81         # Check the target grid units.


/usr/lib/python2.7/site-packages/iris/analysis/_interpolation.pyc in snapshot_grid(cube)
   
170
   
171     """
--> 172     x, y = get_xy_dim_coords(cube)
    173     return x.copy(), y.copy()
    174


/usr/lib/python2.7/site-packages/iris/analysis/_interpolation.pyc in get_xy_dim_coords(cube)
    147     if len(x_coords) != 1:
    148         raise ValueError('Cube {!r} must contain a single 1D x '
--> 149                          'coordinate.'.format(cube.name()))
    150     x_coord = x_coords[0]
    151


ValueError: Cube u'Stash code = 34001' must contain a single 1D x coordinate.


i.e. I need to have a degenerate longitudinal dimension to do this. 

Additionally, regrid doesn't seem to work on zonal mean fields in any case, as I get the same error if I zonally mean the o3clim variable and then use regrid.

My solution in this case will be to read-in a full N96 field and then regrid prior to zonally meaning.

2) My personal opinion is that all gridded data should be 4D, with a number of degenerate dimensions if required. This then greatly simplifies working as you know that you will always have arrays with 4 dimensions, loops are easier to set-up, working with different datasets are better etc. I really hate having to deal with zonal mean fields without a longitude. I admit that this is just my personal preference, as the cf standard doesn't explicitly state this, although it does say 

COARDS strongly recommends limiting the number of dimensions to four, but we wish to allow greater flexibility

I would to be able to easily always work with 4 dimensions, where possible.

Many thanks,
Luke 

Andrew Dawson

unread,
May 16, 2016, 6:32:42 AM5/16/16
to Iris
You need this: http://scitools.org.uk/iris/docs/latest/iris/iris/util.html#iris.util.new_axis. Call it after the collapse to add the new singleton dimension based on the longitude coordinate. I don't think there is an option to do this automatically.

My personal opinion is that all gridded data should be 4D, with a number of degenerate dimensions if required

I used to hold this exact opinion too when I first started using iris, but I now find that I don't need the explicit dimensionality. A scalar coordinate (a concept I was unused to from CDAT/Matlab/IDL) can serve the purpose just as well as a singleton dimension. I can use methods based on metadata to control looping for example, instead of array indices for looping I can used slices or slices_over methods to make sure I'm always working with what I expect without having to know the dimensionality of the cubes I'm working with up-front. You get these nicer, safer, more reliable interfaces when you have the support for metadata like scalar coordinates, although you may not be used to them if you have been used to other data analysis packages. If you always want 4 dimensions then make sure you follow collapsing with the appropriate call(s) to new_axis.

As for the CF/COARDS conventions, scalar coordinates are part of CF I believe, which is why iris makes heavy use of them. The excerpt you quoted refers to the older COARDS conventions which is actually just specifying an upper limit of four, not that there should always be four. This convention is outdated and CF is the new best practice.

Hope that helps.
Andrew

Luke Abraham

unread,
May 16, 2016, 7:16:25 AM5/16/16
to Iris
Hi Andrew,

Many thanks for your quick reply.

 
You need this: http://scitools.org.uk/iris/docs/latest/iris/iris/util.html#iris.util.new_axis. Call it after the collapse to add the new singleton dimension based on the longitude coordinate. I don't think there is an option to do this automatically.


This is certainly better than not having it, however, it creates it as the leading dimension, and not the trailing dimension. What I would like it to do would be to specify where the dimension is added. The data array is now (1, 12, 60, 73) rather than (12, 60, 73, 1). I'm now getting the following error from regrid:

ValueError: The rectilinear grid coordinates of the given cube and target grid must either both have coordinate systems or both have no coordinate system but with matching coordinate metadata.

My personal opinion is that all gridded data should be 4D, with a number of degenerate dimensions if required

I used to hold this exact opinion too when I first started using iris, but I now find that I don't need the explicit dimensionality. A scalar coordinate (a concept I was unused to from CDAT/Matlab/IDL) can serve the purpose just as well as a singleton dimension. I can use methods based on metadata to control looping for example, instead of array indices for looping I can used slices or slices_over methods to make sure I'm always working with what I expect without having to know the dimensionality of the cubes I'm working with up-front. You get these nicer, safer, more reliable interfaces when you have the support for metadata like scalar coordinates, although you may not be used to them if you have been used to other data analysis packages. If you always want 4 dimensions then make sure you follow collapsing with the appropriate call(s) to new_axis.

As for the CF/COARDS conventions, scalar coordinates are part of CF I believe, which is why iris makes heavy use of them. The excerpt you quoted refers to the older COARDS conventions which is actually just specifying an upper limit of four, not that there should always be four. This convention is outdated and CF is the new best practice.


Part of the issue is with dealing with data from multiple sources, some of which have 4D coordinates, and others have 3D or 2D, depending on what the output is and what the person making up the netCDF file chooses to do. I just think it would make it easier to do everything 4D, especially as not everyone uses python or iris, and other tools would expect 4 dimensions.

It seems like the new_axis utility is nearly there, it's just too restrictive as to where it places the new coord.

Many thanks,
Luke

marqh

unread,
May 16, 2016, 7:46:06 AM5/16/16
to Iris
Hello Luke



I note that for iris.cube.interpolate there is the collapse_scalar argument which can control this. However, it isn't available for collapse. Should I raise a feature request for this?

Kwargs:


collapse_scalar
:
Whether to collapse the dimension of scalar sample points in the resulting cube. Default is True.



I think that I can see the logic and the requirement. 

Is your need limited to the collapse operation

I think that reusing the pattern already in iris.cube.interpolate makes good sense

This is a useful forum to canvas opinion on such an API enhancement.  Hopefully a sense of consensus from developers can be reached on this approach

Is this a change that you are interested in implementing if it gets support?

all the best
mark

Andrew Dawson

unread,
May 16, 2016, 7:47:09 AM5/16/16
to Iris
You can reorder the dimensions using the transpose method of the cube. Since all this is a pain, you might want to write yourself a little wrapper function to do all this for you. Something like this should do the trick:

from iris.util import new_axis


def collapse_maintain_dims(cube, coords, aggregator, **kwargs):
    collapsed
= cube.collapsed(coords, aggregator, **kwargs)
   
for coord in coords:
        collapsed
= new_axis(collapsed, scalar_coord=coord)
    dim_map
= {coord.name(): i for i, coord in enumerate(collapsed.dim_coords)}
    transpose_order
= [dim_map[coord.name()] for coord in cube.dim_coords]
    collapsed
.transpose(transpose_order)
   
return collapsed

Luke Abraham

unread,
May 17, 2016, 4:44:44 AM5/17/16
to Iris
Hi Andrew,

This works a perfectly! Many thanks.

Luke

Luke Abraham

unread,
May 17, 2016, 4:47:44 AM5/17/16
to Iris
Hi Mark,
Unfortunately, due to work commitments, I can't commit to doing this work. I think that it would be a good idea to do though. Perhaps incorporating Andrew's collapse_maintain_dims within collapse if collapse_scalar=True is all that would be required.

Best wishes,
Luke

 
all the best
mark
Reply all
Reply to author
Forward
0 new messages