Collapsing a cube with multiple aggregators

Malcolm Brooks

Mar 5, 2013, 4:52:13 AM

to scitoo...@googlegroups.com

When collapsing a cube, multiple statistics are sometimes required (mean and standard deviation, for example).

While this is still straightforward to do with a couple of extra lines (or a list comprehension), it would be nice to be able to supply a list/tuple of aggregators and get a list/tuple of collapsed cubes back:

For example, a very simple version would be:
[time_mean, time_stdev] = new_data[0].collapsed('time', [iris.analysis.MEAN, iris.analysis.STD_DEV])
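Until something like this exists, the list-comprehension workaround can be wrapped in a small helper. This is only a sketch: `collapse_many` is a hypothetical name, not part of Iris, and it assumes nothing more than a cube object with a `collapsed(coords, aggregator)` method.

```python
def collapse_many(cube, coords, aggregators):
    """Collapse `cube` once per aggregator and return the results as a tuple.

    Hypothetical helper: `cube` only needs a collapsed(coords, aggregator)
    method. Each aggregator still makes its own pass over the data, so this
    tidies the syntax but does not share any work between aggregators.
    """
    return tuple(cube.collapsed(coords, aggregator) for aggregator in aggregators)


# Intended usage with Iris (not executed here):
# time_mean, time_stdev = collapse_many(
#     new_data[0], 'time', [iris.analysis.MEAN, iris.analysis.STD_DEV])
```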

This would be a nice improvement to the syntax and would have the potential for some minor optimisations.

On the other hand, if the aggregators were more complex (area averaging, interpolation/regridding, etc.), then supplying them as a list would make significant optimisations possible, because calculations such as the weights could be shared between them.
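As a rough illustration of sharing the weights, the sketch below computes a set of weights once and reuses them for both a weighted mean and a weighted standard deviation. It uses plain Python lists rather than cubes, and `cos_latitude_weights` and `collapse_weighted` are made-up names standing in for a more expensive area-weight calculation.

```python
import math


def cos_latitude_weights(latitudes_deg):
    """Hypothetical stand-in for an expensive area-weight calculation."""
    return [math.cos(math.radians(lat)) for lat in latitudes_deg]


def weighted_mean_and_std(values, weights):
    """Return the weighted mean and weighted (population) standard deviation."""
    total_weight = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total_weight
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total_weight
    return mean, math.sqrt(variance)


def collapse_weighted(values, latitudes_deg):
    """Compute the weights once, then reuse them for both statistics."""
    weights = cos_latitude_weights(latitudes_deg)  # done once, shared below
    return weighted_mean_and_std(values, weights)
```

With two separate calls, one per statistic, the weights would be computed twice; passing the aggregators together is what makes the sharing possible.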

Cheers,

Malcolm

RHattersley

Mar 5, 2013, 7:00:25 AM

to scitoo...@googlegroups.com

Hi Malcolm,

Thanks for posting - that's an interesting idea, especially the optimisation angle. For small datasets (i.e. those that fit in system memory) it might not make much difference overall, but for large datasets (support for these in collapsed() is on the way) it'd be much more efficient if Iris could compute the mean and standard deviation without reading the data twice. I'll add that to the to-do list! ;-)
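To make the "without reading the data twice" point concrete, here is a minimal sketch of a single-pass mean and standard deviation using Welford's online algorithm. It is not how Iris does or will do it; the chunked input, the function name and the choice of the sample standard deviation (one delta degree of freedom) are all assumptions for illustration.

```python
import math


def streaming_mean_and_std(chunks):
    """Single pass over an iterable of chunks; returns (mean, sample std dev)."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for chunk in chunks:  # each chunk is read exactly once
        for value in chunk:
            count += 1
            delta = value - mean
            mean += delta / count
            m2 += delta * (value - mean)
    if count < 2:
        raise ValueError("need at least two values for a sample standard deviation")
    return mean, math.sqrt(m2 / (count - 1))
```

Because both statistics come out of the same loop, a dataset too large for memory only has to be streamed from disk once.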

I'd suggest a small tweak to the syntax, though, to allow keyword arguments to be targeted at specific aggregation operators (MEAN, STD_DEV, etc.).

Currently we have:

percentile = cube.collapsed('time', iris.analysis.PERCENTILE, percent=90)

But if we're going to support multiple operators it might be better to have something more like:

percentile = cube.collapsed('time', iris.analysis.PERCENTILE(percent=90))
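One way the callable form could work is for calling an aggregator to return a copy of itself with the keyword arguments attached. The sketch below is a toy model of that idea on plain lists: the `Aggregator` class, `MEAN`, `PERCENTILE` and `collapse` defined here are hypothetical and are not the Iris implementations.

```python
import math


class Aggregator:
    """Hypothetical aggregator that can be bound to keyword arguments."""

    def __init__(self, name, func, **kwargs):
        self.name = name
        self.func = func
        self.kwargs = kwargs

    def __call__(self, **kwargs):
        # Return a new aggregator carrying its own keyword arguments,
        # leaving the original (unbound) aggregator untouched.
        return Aggregator(self.name, self.func, **{**self.kwargs, **kwargs})

    def aggregate(self, values):
        return self.func(values, **self.kwargs)


def _mean(values):
    return sum(values) / len(values)


def _percentile(values, percent):
    """Nearest-rank percentile (one simple definition among several)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


MEAN = Aggregator("mean", _mean)
PERCENTILE = Aggregator("percentile", _percentile)


def collapse(values, aggregators):
    """Apply each aggregator, with its own keyword arguments, to the values."""
    return [aggregator.aggregate(values) for aggregator in aggregators]
```

With this shape, `collapse(values, [MEAN, PERCENTILE(percent=90)])` keeps `percent` attached to the one operator that needs it, which is what a list of aggregators would require.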