using iris.load on multiple files


Morwenna Griffiths

Jun 5, 2017, 9:58:59 PM
to Iris

I have two netcdf files, one with precipitation, one with air temperature.

I name these "f1" and "f2"

I can load them into cubes with iris.load


I expected the resulting cubes to be in the same order as the input files. As you can see from the code snippet below, this is not the case.


Precipitation is loaded into cubes[0] regardless of the order.


Is this expected behaviour?   Can I make it preserve the order?


thanks,

Morwenna


> f2 = '/dac_tasmax_19900101_e01.nc'
> f1 = 'dac_pr_19900101_e01.nc'

> aa = iris.load([f1, f2])
> print(aa)
0: precipitation / (mm)                (time: 217; latitude: 61; longitude: 53)
1: air_temperature / (degC)            (time: 217; latitude: 61; longitude: 53)

> a2 = iris.load([f2, f1])
> print(a2)
0: precipitation / (mm)                (time: 217; latitude: 61; longitude: 53)
1: air_temperature / (degC)            (time: 217; latitude: 61; longitude: 53)



Andrew Dawson

Jun 6, 2017, 6:33:52 AM
to Iris
This is the expected behaviour; you can't make it preserve the order. The reason is that the filenames are sorted before loading.

I'm guessing this is causing problems because you want to index the cube list to find a particular cube? There are ways around this; the best one will depend on what you want to do. For example, you could use a list comprehension to find the cube(s) you want in the cube list. Or it might be better to use load_cube to load each cube from its file separately, if that fits your needs, as this is the simplest way to know which data you have loaded into which cube.
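
A minimal sketch of the two options mentioned above, for illustration only. The helper names cubes_named and load_separately are hypothetical (they are not part of Iris); the only Iris calls assumed are cube.name() and iris.load_cube.

```python
def cubes_named(cubes, name):
    """Return the cubes whose name() equals `name`, in whatever order they appear.

    Selecting by name avoids relying on positions such as cubes[0].
    """
    return [cube for cube in cubes if cube.name() == name]


def load_separately(filenames):
    """Load one cube per file and key it by filename, so the pairing is explicit.

    Hypothetical helper: iris.load_cube raises an error if a file does not
    yield exactly one cube, so this only suits single-variable files.
    """
    import iris  # imported here so cubes_named can be used without Iris installed

    return {filename: iris.load_cube(filename) for filename in filenames}
```

With the files from the original post, usage might look like precip = cubes_named(iris.load([f1, f2]), "precipitation")[0], which gives the same cube whichever order f1 and f2 are passed in.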

If you need more advice on what to do next feel free to continue in this discussion.

Morwenna Griffiths

Jun 13, 2017, 12:47:27 AM
to Iris
Thanks Andrew,

I have got around the problem by loading each file independently (in a little loop) and identifying the cubes as I go.

I wasn't expecting this (I thought that the cubes would be in the order of the files in the input list), and it took me an embarrassingly long time to work out what was going on.  For a while there I thought I was going mad!   When I read it one way, I got different answers to when I read it a different way.

I have made a suggestion (PR # 2598) to include a Note in the documentation about this.  I can't believe that I'm the only person to have been caught this way!

Whilst I have a satisfactory solution for my current problem, when you have time, I would be interested to know what you mean by "use a list comprehension to find the cube(s) you want in the cubelist". The code snippet in my original post was not exactly what I'm doing; it was a simpler version that exhibits the same behaviour. What I've got is a set of 11 ensembles, each with precipitation. The netcdf files (and therefore the cubes) have identical metadata, so, other than the filename, the files are indistinguishable. Can I find out how iris will sort the file names? They are da_ua_19960125_e01.nc, da_ua_19960125_e02.nc, da_ua_19960125_e03.nc, da_ua_19960125_e04.nc, etc.

thanks,
Morwenna

Andrew Dawson

Jun 13, 2017, 4:07:32 AM
to Iris
It sounds like you need a load-time callback, so you can edit the metadata to add the ensemble member number to each cube as it is loaded from file. There is a gallery example in the documentation that does this. For your case it would look something like this:

import iris


def add_realization_from_filename(cube, field, filename):
    # Get the integer ensemble member number from the filename
    # (assuming same naming as in your example, adjust if needed)
    ensemble_member = int(filename[-5:-3])
    # Create a realization scalar coordinate:
    realization_coord = iris.coords.AuxCoord(ensemble_member, "realization")
    cube.add_aux_coord(realization_coord)


cubes = iris.load(['da_ua_19960125_e01.nc', 'da_ua_19960125_e02.nc', ...], callback=add_realization_from_filename)

If the metadata in each file really is identical then the result of this should be a single cube with a realization coordinate, where the coordinate values are the member number (as in the gallery example), in which case you can just use iris.load_cube instead.
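
The slice filename[-5:-3] in the callback above depends on the exact layout of the file name. As a hedged variant, the member number could instead be parsed with a regular expression. This is only a sketch: the helper member_from_filename and the callback name add_realization are hypothetical, and the pattern assumes names ending in _eNN.nc as in the examples in this thread.

```python
import re


def member_from_filename(filename):
    """Parse the ensemble member number from a name ending in '_eNN.nc'.

    Assumes the naming used in this thread, e.g. 'da_ua_19960125_e01.nc'.
    """
    match = re.search(r"_e(\d+)\.nc$", filename)
    if match is None:
        raise ValueError(f"no ensemble member number found in {filename!r}")
    return int(match.group(1))


def add_realization(cube, field, filename):
    """Load-time callback: attach the member number as a scalar coordinate."""
    import iris.coords  # imported here so the parser above works without Iris

    member = member_from_filename(filename)
    cube.add_aux_coord(iris.coords.AuxCoord(member, standard_name="realization"))
```

The callback would be passed in the same way as before, for example callback=add_realization, and an unexpected file name then fails with a clear error rather than a confusing int() failure.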