Merging a cubelist which has been saved and reloaded in netCDF format.

Mike Walker

Dec 5, 2013, 2:44:44 PM

to scitoo...@googlegroups.com

Hi all

I recently ran across an issue which I have boiled down to the code below.

I had a cubelist which merged without a problem in my script. However, when the cubelist was saved and reloaded elsewhere, the merge no longer worked. The following code is a simple example of this.

import iris

# we construct 2 simple cubes
c1 = iris.cube.Cube([0,1,2], long_name='some_parameter')
xco = iris.coords.DimCoord([11, 12, 13], long_name='x_vals')
c1.add_dim_coord(xco, 0)
c1.add_aux_coord(iris.coords.AuxCoord([100], long_name='y_vals'))
c2 = c1.copy()
c2.coord('y_vals').points = [200]

# add the cubes to a cubelist
cubelist = iris.cube.CubeList()
cubelist.append(c1)
cubelist.append(c2)

# CubeList is merged as expected
merged1 = cubelist.merge()[0]
print(merged1)

# Save and reload the cubelist
iris.save(cubelist, 'merge_list_test.nc')
test_list = iris.load('merge_list_test.nc')

# This merge does not work: the cubes stay separate, so merged2 is just c2
merged2 = test_list.merge()[0]
print(merged2)

Looking in a bit more detail, the cubes in test_list are not identical to the cubes in cubelist. For example, they have gained the attribute Conventions: CF 1.5 ... and the coordinates have each gained a var_name. The order of the cubes within the list has also been swapped.

I am guessing, therefore, that the cubelist will no longer merge because of the conventions that are applied in order to save the cubes in the netCDF format? I am still confused, however, as the changes that I have found have been applied equally to both cubes, so to me they still look like they should be able to merge: they seem to differ only by the point in the scalar coord. Clearly I am missing something here.

I have easily been able to get around saving and loading the cubes, but wanted to ask:

Is this behaviour known / intended?
For the sake of my understanding, what is preventing the list from merging here?

Many thanks

Mike

Niall Robinson

Dec 6, 2013, 4:04:20 AM

to scitoo...@googlegroups.com

Hi Mike - we'll see what the official Iris guys say, but here's my take:

I've seen this before. It's known about; whether it's intended or not is another matter.
Try iris.util.describe_diff(cube_a, cube_b). Hopefully that should easily tell you what is different between the cubes. Then just nuke accordingly :D I know there is work being done on tools to make merging (and diagnosing non-merging) much easier.

Mike Walker

Dec 6, 2013, 12:10:15 PM

to scitoo...@googlegroups.com

Thanks Niall.

This looks like a very useful function, thanks for the tip. However, in this case it has returned 'Cubes are compatible', which is somewhat unhelpful. It's not worth putting too much time into this specific case, as it is just a toy example, but it bothers me a little that I don't understand the issue!

Thanks for the advice

Mike

Andrew Dawson

Dec 6, 2013, 12:44:22 PM

to scitoo...@googlegroups.com

Hi Mike

This one is quite subtle, so I'll try and explain bit by bit:

When you save the cube list to netCDF you are saving two variables with the same name to the same file. The netCDF saver does not know that these two cubes are related, and netCDF forbids two variables from having the same name (netCDF variable name), so it renames one of the cubes 'some_parameter_0'. A similar thing happens to the scalar coordinate, since the netCDF saver has no way of knowing that the two scalar coordinates on each cube are related.
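To make the renaming concrete, here is a minimal sketch of that kind of de-duplication. It is not the Iris saver's actual code: the function assign_var_names is hypothetical, and the numbering scheme beyond the 'some_parameter_0' case mentioned above is an assumption.

```python
def assign_var_names(names):
    """Give each requested name a unique variable name (hypothetical helper).

    A name that is already taken gets a numeric suffix: _0, then _1, and so on.
    This only models the behaviour described above; it is not the Iris code.
    """
    used = set()
    assigned = []
    for name in names:
        candidate = name
        suffix = 0
        # Keep trying suffixed names until one is free.
        while candidate in used:
            candidate = "{}_{}".format(name, suffix)
            suffix += 1
        used.add(candidate)
        assigned.append(candidate)
    return assigned


# Two cubes that share the name 'some_parameter' end up as two distinct
# variables in the file.
var_names = assign_var_names(["some_parameter", "some_parameter"])
print(var_names)
```

The same reasoning applies to the two y_vals scalar coordinates, which also need distinct variable names in the file.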

What happens when you read this back in is that iris sees two cubes with different netCDF variable names, assumes them to be separate, and records those names as the var_name attribute. It therefore won't merge the two, because it thinks (and rightly so too) that they are different.
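A toy model of this may help. It assumes, as a simplification, that merge candidates are grouped by a metadata key that includes var_name; the dictionaries and the function group_for_merge are made up for illustration and are not how Iris represents cubes.

```python
from collections import defaultdict


def group_for_merge(cubes):
    """Group toy 'cubes' by the metadata that must match before merging.

    Hypothetical simplification: only long_name and var_name are compared.
    """
    groups = defaultdict(list)
    for cube in cubes:
        key = (cube["long_name"], cube["var_name"])
        groups[key].append(cube)
    return list(groups.values())


# Before saving: neither cube has a var_name, so they share one group.
in_memory = [
    {"long_name": "some_parameter", "var_name": None, "y_vals": 100},
    {"long_name": "some_parameter", "var_name": None, "y_vals": 200},
]

# After reloading: each cube carries the variable name it had in the file.
reloaded = [
    {"long_name": "some_parameter", "var_name": "some_parameter", "y_vals": 100},
    {"long_name": "some_parameter", "var_name": "some_parameter_0", "y_vals": 200},
]

groups_before = group_for_merge(in_memory)
groups_after = group_for_merge(reloaded)
print(len(groups_before), len(groups_after))
```

In this model the in-memory cubes fall into a single group, while the reloaded cubes land in two groups and so are never considered for merging with each other.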

The solution is to modify the var_name attribute of each cube that is loaded back in so that they are the same. Chances are you don't care what this is so you can just remove it by setting it to None. You must also do this with the scalar coordinate y_vals so that they can be identified as compatible.

I've modified the example you provided so that it works as you expected:

import iris

# we construct 2 simple cubes
c1 = iris.cube.Cube([0,1,2], long_name='some_parameter')
xco = iris.coords.DimCoord([11, 12, 13], long_name='x_vals')
c1.add_dim_coord(xco, 0)
c1.add_aux_coord(iris.coords.AuxCoord([100], long_name='y_vals'))
c2 = c1.copy()
c2.coord('y_vals').points = [200]

# add the cubes to a cubelist
cubelist = iris.cube.CubeList()
cubelist.append(c1)
cubelist.append(c2)

# CubeList is merged as expected
merged1 = cubelist.merge()[0]
print(merged1)

# Save and reload the cubelist
iris.save(cubelist, 'merge_list_test.nc')
test_list = iris.load('merge_list_test.nc')

# Remove var_name attributes
for cube in test_list:
    cube.var_name = None
    cube.coord('y_vals').var_name = None

# And now this merge does work.
merged2 = test_list.merge()[0]
print(merged2)
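If the reloaded cubes carry more coordinates than just y_vals, the same fix can be applied to every coordinate. The helper below is a sketch and not part of Iris: strip_var_names is a hypothetical name, and it relies only on each cube having a var_name attribute and a coords() method. The stand-in objects exist purely so the snippet runs without a netCDF file.

```python
from types import SimpleNamespace


def strip_var_names(cubes):
    """Clear var_name on each cube and on all of its coordinates.

    Hypothetical helper: works on anything with .var_name and .coords().
    """
    for cube in cubes:
        cube.var_name = None
        for coord in cube.coords():
            coord.var_name = None
    return cubes


def make_stub(var_name, coord_var_names):
    """Build a stand-in for a loaded cube (illustration only)."""
    coords = [SimpleNamespace(var_name=name) for name in coord_var_names]
    return SimpleNamespace(var_name=var_name, coords=lambda: coords)


stubs = [
    make_stub("some_parameter", ["x_vals", "y_vals"]),
    make_stub("some_parameter_0", ["x_vals", "y_vals_0"]),
]
strip_var_names(stubs)

# Collect whatever var_names are left on the stubs and their coordinates.
remaining = [stub.var_name for stub in stubs]
remaining += [coord.var_name for stub in stubs for coord in stub.coords()]
print(remaining)
```

With real data the call would be strip_var_names(test_list) before test_list.merge(). This assumes var_name is the only metadata that differs between the cubes; any other mismatch would still block the merge.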
