Difference between aggregation types has_data and not_null

Karen K

Feb 23, 2024, 6:48:49 AM
to weewx-user
What is the difference between the aggregation types has_data and not_null, as described in the documentation at http://weewx.com/docs/5.0/reference/aggtypes/?

Tom Keffer

Feb 23, 2024, 8:01:45 AM
to weewx...@googlegroups.com
The aggregation not_null does not check first to see if the type exists. If you know the type exists, it's slightly faster. 

In practice, I don't think it's very useful. I may remove it from the documentation.

--
You received this message because you are subscribed to the Google Groups "weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to weewx-user+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/weewx-user/3eef09c7-c2fe-49a2-9e89-f1957c731a54n%40googlegroups.com.

Karen K

Feb 23, 2024, 9:20:25 AM
to weewx-user
Thank you.

Karen K

Feb 23, 2024, 10:48:51 AM
to weewx-user
Do I have to implement both the not_null and has_data aggregation in an XType extension?


Tom Keffer wrote on Friday, February 23, 2024 at 14:01:45 UTC+1:

bell...@gmail.com

Feb 23, 2024, 11:34:48 AM
to weewx-user
I found that you only have to implement ‘not_null’. I believe this is because the ‘has_data’ tag (tags.py line 389) enters the xtype system with an aggregate of ‘has_data’. This then looks for the aggregate of ‘not_null’ (xtypes.py line 156).
Looking forward to hearing if my conclusion was correct.

Tom Keffer

Feb 23, 2024, 6:00:07 PM
to weewx...@googlegroups.com
You should not have to implement either.

The implementation of has_data runs a not_null query over the aggregation period. In turn, not_null either consults the database or, if it's an xtype, runs get_scalar() over the aggregation period. The first non-null value seen causes it to return True.

Of course, that's a fairly compute-intensive way of checking for non-null xtype values, but it will work for any xtype. You could optimize the calculation with a specialized version for your type, but so far I haven't heard of a case where it's needed.

-tk
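
To illustrate the generic fallback described above, here is a minimal sketch of a "first non-null value wins" scan. It is not WeeWX code: `not_null_by_scan`, `dewpoint_spread`, and the record layout are hypothetical names chosen for this example, and the per-record `compute` callable only stands in for the role that get_scalar() plays in the description.

```python
from typing import Callable, Iterable, Optional

# One archive record, e.g. {"dateTime": 2, "outTemp": 4.0, "dewpoint": 1.5}
Record = dict


def not_null_by_scan(records: Iterable[Record],
                     compute: Callable[[Record], Optional[float]]) -> bool:
    """Return True as soon as `compute` yields a non-null value for a record."""
    for record in records:
        if compute(record) is not None:
            return True  # short-circuit: the first non-null value is enough
    return False


def dewpoint_spread(record: Record) -> Optional[float]:
    """Hypothetical derived type: outTemp minus dewpoint, None if an input is missing."""
    out_temp = record.get("outTemp")
    dewpoint = record.get("dewpoint")
    if out_temp is None or dewpoint is None:
        return None
    return out_temp - dewpoint


records = [
    {"dateTime": 1, "outTemp": None, "dewpoint": 1.0},
    {"dateTime": 2, "outTemp": 4.0, "dewpoint": 1.5},
    {"dateTime": 3, "outTemp": 5.0, "dewpoint": 2.0},
]

print(not_null_by_scan(records, dewpoint_spread))
```

The scan stops at the second record here, but in the worst case (no data at all) it has to compute the derived value for every record in the period, which is why this approach is general but potentially slow.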



Karen K

Feb 24, 2024, 2:56:07 AM
to weewx-user
Ah.

Tom Keffer wrote on Saturday, February 24, 2024 at 00:00:07 UTC+1:
You should not have to implement either.

With WeeWX 5.0, people found that the "GTS", "yearGDD", and "seasonGDD" observation types (XTypes calculated from "outTemp") were no longer graphed. After an explicit implementation of "not_null", the graphs appeared again.
tags.ObservationBinder.has_data has changed from WeeWX 4.X to 5.0. The old version only looked into the database by calling self.db_lookup(self.data_binding).has_data(self.obs_type, self.timespan). The new version is:

    @property
    def has_data(self):
        db_manager = self.db_lookup(self.data_binding)
        # First see if the type exists in the database.
        if db_manager.exists(self.obs_type):
            # Yes. Is it non-null?
            val = bool(weewx.xtypes.get_aggregate(self.obs_type, self.timespan,
                                                  'not_null', db_manager)[0])
        else:
            # Nope. Try the xtypes system.
            val = weewx.xtypes.has_data(self.obs_type, self.timespan, db_manager)
        return val


If you look into weewx.xtypes.get_aggregate(...,'not_null',...) and weewx.xtypes.has_data(), they do almost the same thing. There is a loop over xtypes, and for each xtype get_aggregate() is called. If I understand the code correctly, the result of the has_data property in WeeWX 5.0 does not depend on the outcome of the if condition. And the database lookup that was used in WeeWX 4.X is gone entirely.

Tom Keffer

Feb 24, 2024, 2:55:05 PM
to weewx...@googlegroups.com
Rich Bell brought this up in PR #929. Go there for the discussion. Try the "master" branch of the repository and see if it works better for you.

Karen K

Feb 24, 2024, 3:30:19 PM
to weewx-user
I guess PR #929 is an entirely different topic.

It seems to me that the if condition in weewx.tags.ObservationBinder.has_data is useless, because what happens afterwards is almost the same in both cases.

Maybe the intention for that function was really something like this:

    @property
    def has_data(self):
        db_manager = self.db_lookup(self.data_binding)
        # First see if the type exists in the database.
        if db_manager.exists(self.obs_type):
            # Yes. Is it non-null?
            val = bool(db_manager.has_data(self.obs_type, self.timespan))

        else:
            # Nope. Try the xtypes system.
            val = weewx.xtypes.has_data(self.obs_type, self.timespan, db_manager)
        return val


The changed line (highlighted in yellow in the original post) is the one that calls db_manager.has_data(). I guess there could be a connection to the performance issues some people are experiencing. Unfortunately, I still don't have a working 5.0 installation to run tests on.
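
To show why a direct database check can be much cheaper than scanning, here is a rough sketch of what a database-level "has data" test could look like. It is an illustration only, not the query WeeWX actually runs: `column_has_data` is a hypothetical helper, the table is a simplified stand-in for an archive table, and the time span is assumed to be exclusive at the start and inclusive at the end.

```python
import sqlite3


def column_has_data(conn, column, start_ts, stop_ts):
    """True if `column` has at least one non-null value in (start_ts, stop_ts]."""
    # Column names cannot be bound as SQL parameters, so check the name
    # against the table schema before putting it into the statement.
    known = {row[1] for row in conn.execute("PRAGMA table_info(archive)")}
    if column not in known:
        raise ValueError(f"unknown column: {column}")
    sql = (f"SELECT 1 FROM archive WHERE dateTime > ? AND dateTime <= ? "
           f"AND {column} IS NOT NULL LIMIT 1")
    # LIMIT 1 lets the database stop at the first matching row.
    return conn.execute(sql, (start_ts, stop_ts)).fetchone() is not None


# Simplified stand-in for an archive table, filled with sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE archive (dateTime INTEGER PRIMARY KEY, outTemp REAL, rain REAL)")
conn.executemany("INSERT INTO archive VALUES (?, ?, ?)",
                 [(100, None, 0.0), (200, 3.5, None), (300, None, None)])

print(column_has_data(conn, "outTemp", 0, 300))
```

A check like this only works for types that are stored as columns; derived types still need the xtypes path, which is the case the else branch handles.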

Tom Keffer

Feb 24, 2024, 6:22:46 PM
to weewx...@googlegroups.com
I think you're right that the function can be simplified. Commit de738ca.


bell...@gmail.com

Feb 25, 2024, 4:18:12 PM
to weewx-user
Well I am pretty sure that at some point in time I needed the ‘is_null’ method. I’ve decided to keep my implementation because I can implement a more efficient/specific one. I’ve started to prepend my XType so its implementation is used instead of XTypeTable.  (As an aside, it feels like XTypeTable should always be the last XType in the list.)
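
The effect of prepending can be sketched as follows, assuming (as described earlier in the thread) that aggregates are resolved by looping over the registered xtypes and taking the first one that answers. All classes and the `dispatch` function here are stand-ins written for this example, not the WeeWX implementation.

```python
class UnknownAggregation(Exception):
    """Stand-in for the exception a handler raises when it cannot answer."""


class GenericTable:
    """Stand-in for a catch-all handler that answers not_null by a slow scan."""

    def get_aggregate(self, obs_type, aggregate_type):
        if aggregate_type != "not_null":
            raise UnknownAggregation(aggregate_type)
        return "generic scan"


class MyXType:
    """Hypothetical extension with a specialized not_null for its own type."""

    def get_aggregate(self, obs_type, aggregate_type):
        if obs_type != "GTS" or aggregate_type != "not_null":
            raise UnknownAggregation(aggregate_type)
        return "specialized check"


def dispatch(handlers, obs_type, aggregate_type):
    """Ask each handler in list order; the first one that answers wins."""
    for handler in handlers:
        try:
            return handler.get_aggregate(obs_type, aggregate_type)
        except UnknownAggregation:
            continue
    raise UnknownAggregation(f"{aggregate_type} of {obs_type}")


appended = [GenericTable(), MyXType()]   # extension added at the end
prepended = [MyXType(), GenericTable()]  # extension inserted at index 0

print(dispatch(appended, "GTS", "not_null"))
print(dispatch(prepended, "GTS", "not_null"))
```

With the extension appended, the catch-all handler answers first and the specialized version is never reached; with the extension prepended, it takes precedence for its own type while other types still fall through to the generic handler.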

Karen K

Feb 27, 2024, 1:48:32 AM
to weewx-user
It turned out that this thread was only a step on the way for me. I ended up writing an issue (#939) after all.
