sum vs wsum for archive_day_rain

pravee...@gmail.com

Aug 25, 2022, 1:27:34 AM
to weewx-user
Hi,

When trying to build monthly averages for rainfall, I see that using SUM(sum) from archive_day_rain gives sensible data while doing SUM(wsum)/SUM(wtime) gives very weird data. What should be used for archive_day_rain?

Regards,
Praveen

gjr80

Aug 25, 2022, 6:11:26 AM
to weewx-user
As a general rule, if you are working with averages you should be using wsum and sumtime, but it really depends on what you are calculating. To understand why, you need to know what the daily summary wsum field contains and how it differs from the sum field. I have no idea what the wtime field is; do you perhaps mean the sumtime field?

WeeWX v3.0.0 introduced support for different archive interval values within a single database. Hand in glove with this change came the weighting of archive record values used in the daily summaries. This was done to ensure that a mix of archive intervals did not skew certain aggregates. For example, if a database contained archive records with a five minute interval during the night and archive records with a 30 minute archive interval during the day, an unweighted average temperature for the entire day would be low, because there would be up to six times more nighttime temperature values than daytime values. A temperature of 15C at 11pm that 'applied' for five minutes would carry the same weight in the day average temperature as a 30C temperature at 11am that 'applied' for 30 minutes.
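To put numbers on that example, here is a minimal sketch in plain Python. The record values are made up for illustration (30 minutes of night at 15C logged every five minutes, 30 minutes of day at 30C logged once); nothing here is WeeWX code.

```python
# Hypothetical archive records: (temperature in degrees C, archive interval in minutes)
night = [(15.0, 5)] * 6   # six 5-minute records covering 30 minutes of night
day = [(30.0, 30)]        # one 30-minute record covering 30 minutes of day
records = night + day

# Unweighted average: every record counts equally, so the six night
# records outvote the single day record.
unweighted_avg = sum(t for t, _ in records) / len(records)

# Interval-weighted average: each value counts in proportion to the
# length of time it 'applied' for.
weighted_avg = sum(t * i for t, i in records) / sum(i for _, i in records)

print(round(unweighted_avg, 2))  # 17.14
print(weighted_avg)              # 22.5
```

Both periods cover the same 30 minutes, so 22.5C is the figure you would expect; the unweighted result is dragged toward the nighttime value.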

The solution is to weight the archive record values by the archive interval. The resulting weighted sum is what is stored in the daily summary wsum field (the sum field is retained as an unweighted sum). Archive interval weighting was not properly implemented in v3.0.0, resulting in all archive records being weighted equally irrespective of archive interval. This was fixed variously in v3.7.0, v4.2.0, v4.3.0 and v4.4.0. As an aside, users with a homogeneous archive interval value in their database still use archive record weighting, but each archive record is weighted equally.
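A rough sketch of how the four fields relate, for anyone following along. The DaySummary class and its add() method are hypothetical stand-ins, not the WeeWX accumulator code, and the sketch assumes the weight is the archive interval expressed in seconds.

```python
from dataclasses import dataclass


@dataclass
class DaySummary:
    """Simplified, hypothetical stand-in for one row of a daily summary table."""
    sum: float = 0.0      # unweighted sum of the record values
    count: int = 0        # number of records added
    wsum: float = 0.0     # sum of (value * weight)
    sumtime: float = 0.0  # sum of the weights

    def add(self, value, interval_minutes):
        # Assumption: the weight is the archive interval in seconds.
        weight = interval_minutes * 60
        self.sum += value
        self.count += 1
        self.wsum += value * weight
        self.sumtime += weight


# With a homogeneous archive interval every record gets the same weight,
# so the weighted and unweighted averages agree.
uniform = DaySummary()
for temp in (10.0, 20.0, 30.0):
    uniform.add(temp, 5)

print(uniform.sum / uniform.count)     # 20.0
print(uniform.wsum / uniform.sumtime)  # 20.0
```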

For obs where you are interested in the average value for the day, such as temperature, wind speed etc, the correct daily average value is the wsum value divided by the sumtime value. If you look in the WeeWX xtypes module you will see that the queries used for daily summary based averages use the wsum and sumtime fields. For rain, however, there is little sense in 'averaging' the day rain total as is done with, say, temperature. Typically for rain we are interested in the daily total, ie the sum field. You will see in the xtypes module that the queries for the daily summary based sums use the sum field. You can still calculate the 'day average rain' using $day.rain.avg, which will use the rain daily summary wsum value, but it is a fairly meaningless aggregate.
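As an illustration of the two kinds of query, here is a small sketch against an in-memory SQLite database. The tables are cut down to the columns discussed in this thread (real daily summary tables have more), the inserted rows are made-up values, and the SQL is my own rather than a copy of what xtypes runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Cut-down daily summary tables; only the columns discussed here.
for obs in ("outTemp", "rain"):
    conn.execute(
        f"CREATE TABLE archive_day_{obs} ("
        "dateTime INTEGER PRIMARY KEY, sum REAL, count INTEGER, "
        "wsum REAL, sumtime INTEGER)"
    )

# One made-up day built from six 5-minute records and one 30-minute record.
# outTemp: 6 x 15C and 1 x 30C; rain: 6 x 0.5 and 1 x 2.0.
conn.execute(
    "INSERT INTO archive_day_outTemp VALUES (1661385600, 120.0, 7, 81000.0, 3600)")
conn.execute(
    "INSERT INTO archive_day_rain VALUES (1661385600, 5.0, 7, 4500.0, 3600)")

# Day average temperature: weighted sum divided by the total weight.
day_avg_temp = conn.execute(
    "SELECT wsum / sumtime FROM archive_day_outTemp").fetchone()[0]

# Day rain: simply the unweighted sum.
day_rain = conn.execute("SELECT sum FROM archive_day_rain").fetchone()[0]

# The 'day average rain' can be computed, but it does not mean much.
day_avg_rain = conn.execute(
    "SELECT wsum / sumtime FROM archive_day_rain").fetchone()[0]

print(day_avg_temp)  # 22.5
print(day_rain)      # 5.0
print(day_avg_rain)  # 1.25
```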

In your case, if what you refer to as 'monthly averages for rainfall' is the average monthly rainfall (ie average July rainfall, average August rainfall etc), I expect you would want to sum the rainfall for each month (eg the sum of the sum fields for each day in each July) and then average those monthly totals over the number of those months in your data (ie the number of 'Julys'). I wouldn't see you using wsum or sumtime.
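Something along these lines would do it. This is only a sketch: the table is cut down to two columns, the rainfall figures are invented, and the timestamps are treated as UTC for simplicity, whereas real daily summary rows are keyed on local midnight, so a real query would need to allow for the time zone.

```python
import sqlite3
from datetime import datetime, timezone


def ts(year, month, day):
    """Epoch timestamp for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE archive_day_rain (dateTime INTEGER PRIMARY KEY, sum REAL)")

# Invented daily rain totals spanning two Julys and two Augusts.
rows = [
    (ts(2020, 7, 1), 10.0), (ts(2020, 7, 2), 20.0),  # July 2020 total: 30
    (ts(2021, 7, 1), 50.0),                          # July 2021 total: 50
    (ts(2020, 8, 1), 5.0),                           # August 2020 total: 5
    (ts(2021, 8, 1), 15.0),                          # August 2021 total: 15
]
conn.executemany("INSERT INTO archive_day_rain VALUES (?, ?)", rows)

# Inner query: total rain for each individual month (year + month).
# Outer query: average those totals for each calendar month.
query = """
SELECT month, AVG(month_total)
FROM (
    SELECT strftime('%Y', dateTime, 'unixepoch') AS year,
           strftime('%m', dateTime, 'unixepoch') AS month,
           SUM(sum) AS month_total
    FROM archive_day_rain
    GROUP BY year, month
)
GROUP BY month
ORDER BY month
"""
avg_monthly_rain = dict(conn.execute(query).fetchall())

print(avg_monthly_rain)  # {'07': 40.0, '08': 10.0}
```

Note that only the sum field is touched; wsum and sumtime do not appear anywhere.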

Gary

Praveen Chandrasekaran

Aug 25, 2022, 6:33:32 AM
to weewx...@googlegroups.com
Thanks. That makes sense. So even for monthly average temperature I guess I can use sum instead of wsum. I am using wsum for temperature (monthly average) now and it seems to work fine. It was only when I attempted the same for rain that it went all wrong, and I then modified the query based on obs_type. I am actually updating xaggs.py to support monthly and yearly averages.

gjr80

Aug 25, 2022, 7:26:03 AM
to weewx-user
> So even for monthly average temperature I guess I can use sum instead of wsum.

Well no, because you do not know whether the archive interval for all of the records is the same. For obs such as rain, where you are interested in the absolute sum for each day (or month or year), you use the sum field. Think of it this way: it doesn't matter what the archive interval is for a cumulative measurement such as rain. The total rain for the day (or month or year) is the sum of the rain recorded in each archive interval in that day (or month or year). When considering temperature (or any other non-cumulative obs) you are using the average temperature for the day (or month or year). That calculation does change if the archive interval varies from record to record, so you need to use wsum together with sumtime.
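A sketch of how a query might switch on obs type, in case it helps. The month_query() helper and the CUMULATIVE_OBS set are hypothetical names of my own and not anything in xaggs.py or WeeWX; the tables are cut down and the data is invented (one day of 5-minute records at a constant 10C, one day of 30-minute records at a constant 20C).

```python
import sqlite3

# Hypothetical set of obs types that should be totalled rather than averaged.
CUMULATIVE_OBS = {"rain"}


def month_query(obs_type):
    """Build a query aggregating daily summaries over [start, stop)."""
    if obs_type in CUMULATIVE_OBS:
        expr = "SUM(sum)"                   # total, interval does not matter
    else:
        expr = "SUM(wsum) / SUM(sumtime)"   # interval-weighted average
    # obs_type comes from a fixed list here, never from user input.
    return (f"SELECT {expr} FROM archive_day_{obs_type} "
            "WHERE dateTime >= ? AND dateTime < ?")


conn = sqlite3.connect(":memory:")
for obs in ("outTemp", "rain"):
    conn.execute(
        f"CREATE TABLE archive_day_{obs} ("
        "dateTime INTEGER PRIMARY KEY, sum REAL, count INTEGER, "
        "wsum REAL, sumtime INTEGER)"
    )

# Day 1: 288 five-minute records at 10C. Day 2: 48 thirty-minute records at 20C.
conn.executemany("INSERT INTO archive_day_outTemp VALUES (?, ?, ?, ?, ?)", [
    (0, 2880.0, 288, 864000.0, 86400),
    (86400, 960.0, 48, 1728000.0, 86400),
])
conn.executemany("INSERT INTO archive_day_rain VALUES (?, ?, ?, ?, ?)", [
    (0, 4.0, 288, 1200.0, 86400),
    (86400, 6.0, 48, 10800.0, 86400),
])

span = (0, 2 * 86400)
avg_temp = conn.execute(month_query("outTemp"), span).fetchone()[0]
total_rain = conn.execute(month_query("rain"), span).fetchone()[0]

# For comparison: the unweighted average that sum and count would give.
unweighted_temp = conn.execute(
    "SELECT SUM(sum) / SUM(count) FROM archive_day_outTemp").fetchone()[0]

print(avg_temp)                   # 15.0
print(total_rain)                 # 10.0
print(round(unweighted_temp, 2))  # 11.43
```

The two days are the same length, so 15C is the sensible average; using sum and count instead gives 11.43C because the day with the shorter archive interval contributes six times as many records.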

Gary