Calculating an annual mean with proper month length weights

Martin Dix

Jan 22, 2020, 12:06:00 AM
to SciTools (iris, cartopy, cf_units, etc.) - https://github.com/scitools
I'm trying to calculate an annual mean from monthly means. The method I've seen recommended is

iris.coord_categorisation.add_year(cube, 'time')
annmean = cube.aggregated_by(['year'], iris.analysis.MEAN)

However, my data uses the Gregorian calendar and I'd like to apply the proper month-length weights. These can easily be calculated from the bounds on the time coordinate, but aggregated_by doesn't support weights and raises an exception:

ValueError: Invalid Aggregation, aggregated_by() cannot use weights.

Does anyone have an elegant way of doing this that doesn't require writing explicit loops over cube.data?
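For reference, the month-length weights mentioned above can be derived from the time bounds with plain numpy; here's a minimal sketch using made-up bounds (with a real cube they'd come from cube.coord('time').bounds):

```python
import numpy as np

# Hypothetical time bounds (days since a reference date) for Jan-Mar of a leap year
bounds = np.array([[0.0, 31.0], [31.0, 60.0], [60.0, 91.0]])

month_lengths = np.diff(bounds, axis=1).ravel()   # month lengths in days
weights = month_lengths / month_lengths.sum()     # normalised weights for a weighted mean
```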

Martin

Benjamin Harrison

Jun 29, 2020, 12:28:10 PM
to SciTools (iris, cartopy, cf_units, etc.) - https://github.com/scitools
Hi Martin,

I'm having the same problem. Did you figure this out?

Goodwin Gibbins

Jul 8, 2020, 3:39:14 AM
to SciTools (iris, cartopy, cf_units, etc.) - https://github.com/scitools
Amusing - I just joined this group to ask about something that doesn't seem to be working in my workaround for accomplishing just this. I'd be very curious to know if someone has a more elegant solution, but here's what I've managed so far.

import iris
import numpy as np

# Artificially inflate the cube values for leap-year Februaries so that, when
# weighted by a 28-day February, the weighted sums come out correct
# (note: the % 4 test ignores the century rule, so e.g. 1900 and 2100 would be wrong):
leap_rescale = (29/28 - 1) * np.logical_and(cube.coord('year').points % 4 == 0,
                                            cube.coord('month').points == 'Feb') + 1
cube.data = cube.data * leap_rescale

# Manually define the month weights. If your dataset doesn't start in January,
# you may need to reorder these.
w = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])

def weightmonths(data, w, axis):    # could perhaps be a lambda, which would be more elegant
    return np.average(data, axis=axis, weights=w)

WeightMonths = iris.analysis.WeightedAggregator('weight_months', weightmonths, w=w)
av = cube.aggregated_by('year', WeightMonths, w=w)

The bug (?) I'm worried about is that if I have an auxiliary coordinate giving the number of days in each month, the averaging doesn't apply correctly to that coordinate. I think that issue is not caused by the workaround but is more general, but there's the health warning anyway.

Goodwin Gibbins

Jul 8, 2020, 6:48:58 AM
to SciTools (iris, cartopy, cf_units, etc.) - https://github.com/scitools
Alas, the solution I offered is mistaken: the denominator in the mean is that of a 28-day February, even in leap years. You could go back and rescale the leap-year averages by a correction factor, but that gets a bit ridiculous.
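A quick numpy check makes the failure concrete (the numbers are invented; take a leap year with just a 29-day February and a 31-day March):

```python
import numpy as np

feb, mar = 2.0, 4.0  # invented monthly-mean values

# Correct weighted mean uses the real 29-day February
true_mean = np.average([feb, mar], weights=[29, 31])

# The rescale trick fixes the numerator (feb * 29/28 * 28 == feb * 29),
# but the denominator is still 28 + 31 = 59 days instead of 60
rescaled = np.average([feb * 29 / 28, mar], weights=[28, 31])
```

The two results differ by the factor 59/60, which is exactly the per-year correction described above.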

I've returned to the loop option:

import calendar

import iris
import iris.coord_categorisation
import numpy as np

iris.coord_categorisation.add_year(cube, 'time')

def days_in_month(coord, value):
    # calendar.monthrange gives the Gregorian month length, including the
    # century rule for leap years (which a bare year % 4 test gets wrong)
    date = coord.units.num2date(value)
    return calendar.monthrange(date.year, date.month)[1]

iris.coord_categorisation.add_categorised_coord(cube, 'days_in_month', 'time', days_in_month)

# Make the aggregated cube, but with improperly weighted averages
av = cube.aggregated_by('year', iris.analysis.MEAN)

# Then manually fix the data with a day-weighted mean per year
av.data = [np.sum((cube.data * cube.coord('days_in_month').points)[cube.coord('year').points == y])
           / np.sum(cube.coord('days_in_month').points[cube.coord('year').points == y])
           for y in av.coord('year').points]
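As a sanity check, the same per-year day-weighted mean can be written in plain numpy, independent of iris (all arrays here are invented stand-ins for the cube's data and coordinates):

```python
import numpy as np

# Two invented years of monthly means: 2000 (leap) and 2001
years = np.repeat([2000, 2001], 12)
days = np.array([31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]    # 2000
                + [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])  # 2001
data = np.arange(24, dtype=float)

# Day-weighted mean within each year
annual = np.array([np.average(data[years == y], weights=days[years == y])
                   for y in np.unique(years)])
```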