Converting array data into a cube


Niall Robinson

Dec 11, 2012, 10:45:42 AM
to scitoo...@googlegroups.com
Hello again AVD,

I have a bunch of results that I would like to save. I suppose it would be nice if these were cubes, so that all the metadata is nicely tied into the data. I see from the Iris help that iris.cube.Cube can be used to make a new cube, but the help also says this shouldn't normally be required. As you may know from reading all my other questions, I have been loading a lot of cubes from various different pp files and calculating statistics for each one in turn. What I have now is some x, y pair arrays, for instance, RMS error and a corresponding time array. What is the most efficient way for me to convert this to an RMS cube, which I can then plot and save using the Iris commands?

I suspect part of the point of Iris is that, if you are doing things properly, you never end up with arrays of data, but rather with collapsed cubes. Would you suggest I try to adapt my code so that I never start working with arrays, or should I construct some new cubes?

Thanks again

Niall

p.s. I'm aware I've been working AVD pretty hard with the pp metadata thing, so no hurry. This is just the cherry on the analysis cake.

RHattersley

Dec 12, 2012, 9:14:42 AM
to scitoo...@googlegroups.com
As you say, most of the time you shouldn't need to recreate cubes from arrays. So if you're happy to try modifying Iris, you could try defining a new analysis operator (like MAX, MEAN, STD_DEV, etc.) in iris/analysis/__init__.py. A copy of the PERCENTILE aggregator might be a good place to start. I'd be happy to help you get this all the way through a pull request and into the main code.
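
For instance, here is a minimal sketch of what a custom aggregator wrapping an RMS-style statistic might look like (the exact Aggregator signature has varied between Iris versions, so treat the details as illustrative rather than definitive):

import numpy as np
import iris.analysis

def rms(data, axis, **kwargs):
    # Root-mean-square of `data` along the given axis.
    return np.sqrt(np.mean(np.square(data), axis=axis))

# 'rms' becomes the cell method recorded on collapsed cubes.
MY_RMS = iris.analysis.Aggregator('rms', rms)

You would then use it just like the built-in operators, e.g. cube.collapsed(['latitude', 'longitude'], MY_RMS).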

Richard

Niall Robinson

Dec 12, 2012, 12:22:26 PM
to scitoo...@googlegroups.com
I think that maybe I didn't communicate my issue completely: I can calculate the RMS of a cube at the minute. However, I have, say, 100 different cubes from different times. So I am currently loading each one in turn, calculating its RMS, finding its time, and appending each of these values to a pair of x, y arrays. Should I really be combining all these different cubes into one mother cube (which has an additional dimension corresponding to the individual cube times) and then collapsing it? If so, how? And if calculating separate values for each separate cube is OK, then how should I convert these back into a cube?
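
In case it helps, this is roughly what I imagine the mother-cube version would look like (my_files is a placeholder for my list of pp files, and I'm assuming the cubes merge cleanly):

import iris
import iris.analysis

cubes = iris.load_raw(my_files)   # one 2-D cube per file, each with a scalar time
mother = cubes.merge()[0]         # merge promotes time to a new dimension
rms = mother.collapsed(['latitude', 'longitude'], iris.analysis.RMS)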

Hope that made sense

Niall

bblay

Jan 17, 2013, 8:50:14 AM
to scitoo...@googlegroups.com
Hi Niall,

I would have imagined you'd have 1 cube with a time dimension of 100 points:

cube = iris.load_cube("*.pp")
rms = cube.collapsed(["latitude", "longitude"], iris.analysis.RMS)


Where cube.shape == (100, 180, 360) and rms.shape == (100,)

Is this what you were trying to achieve?
Byron

Niall Robinson

Jan 17, 2013, 11:12:04 AM
to scitoo...@googlegroups.com
That's not what I have. I have output from someone else's analysis, which is one pp file per time. I am loading each one in turn, doing some calculations with the data, and then appending the result to a list, along with the time of the cube. Obviously, I can plot these quite happily. The only thing I can't do with them is save them with their metadata, hence me saying I would like to make them into a cube. But perhaps the answer is that I should rewrite my code so that it loads all the cubes into a cube list, merges them, and then does the calculations?

Niall

RHattersley

Jan 18, 2013, 9:14:44 AM
to scitoo...@googlegroups.com
Hi Niall,

Does the following code describe your current process?
times = []
results = []
for pp_filename in all_my_pp_filenames:
    cube = iris.load_cube(pp_filename)  # Returns a 2-dimensional Cube?
    a_single_number = my_statistic_calculation(cube.data)
    times.append(cube.coord('time').points[0])
    results.append(a_single_number)

And now you would like to save the `times` and `results` values? To what file format?

If the file format supports it, you could make a Cube from the results and save that. For example:
time_coord = iris.coords.DimCoord(times, 'time', units=<the unit from your source cubes>)
result_cube = iris.cube.Cube(results, dim_coords_and_dims=[(time_coord, 0)], standard_name=<...>, long_name=<...>, cell_methods=<...>, units=<...>)
iris.save(result_cube, result_filename)
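
To make that concrete, here is the same pattern with made-up values plugged into the placeholders (the names, units and time origin below are purely illustrative):

import numpy as np
import iris
import iris.coords
import iris.cube

times = [0.0, 6.0, 12.0]   # illustrative time points
results = [1.2, 0.9, 1.1]  # illustrative statistic values
time_coord = iris.coords.DimCoord(np.array(times), standard_name='time',
                                  units='hours since 2012-12-01 00:00:00')
result_cube = iris.cube.Cube(np.array(results),
                             long_name='rms_error',  # made-up name
                             units='K',              # made-up unit
                             dim_coords_and_dims=[(time_coord, 0)])
iris.save(result_cube, 'rms_error.nc')  # NetCDF preserves the metadata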

But it would be better to convert `my_statistic_calculation()` into an Iris aggregation operator (like iris.analysis.MEAN), because then the metadata will be maintained automatically. It also lets you process all your PP files in one go.

For example:
MY_OPERATOR = iris.analysis.Aggregator('<my-statistic> of {standard_name:s} {action:s} {coord_names:s}', '', my_statistic_calculation)

cube = iris.load_cube(all_my_pp_filenames) # Returns a 3-dimensional Cube
result_cube = cube.collapsed(['latitude', 'longitude'], MY_OPERATOR)
iris.save(result_cube, result_filename)


If I've misunderstood what you're trying to do and gone off on the wrong route, please say so!

Richard

Niall Robinson

Jan 21, 2013, 4:37:27 AM
to scitoo...@googlegroups.com
That exactly addresses what I've been trying to do. I was loading the cubes one at a time because of memory concerns, but having learnt more about Iris in the interim, I suppose the point is that Iris doesn't load the data itself until it has to, so your second way should be equally efficient.
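
As a quick sanity check of that deferred loading (I gather has_lazy_data is only available in newer Iris versions, so this is version-dependent):

import iris

cube = iris.load_cube(all_my_pp_filenames)
print(cube.has_lazy_data())  # True: nothing has been read yet
_ = cube.data                # touching .data triggers the actual read
print(cube.has_lazy_data())  # False: the data is now in memory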

Thanks
Niall