purge / clean mysql database?

112 views
Skip to first unread message

Mario Wesolek

unread,
Feb 11, 2024, 11:03:00 AM2/11/24
to weewx-user
hi there,

i use weewx 5.0.2 with belchertown skin 1.3.1 with mysql database. over the time there are tousends of entrys in the database. is there a good / easy way to purge them without lost of alltime records statistik? i think the min/max values came from table archives, so i can select them and delete the other - so i think this is not a good way to do that like this. does anybody knows a better way or is a tool for that job include?

thank you
mario
dg1fi

michael.k...@gmx.at

unread,
Feb 11, 2024, 11:14:45 AM2/11/24
to weewx-user
What's the problem with even millions of entries in the database?
The min/max values are stored in the corresponding archive_day_xxx-tables but derived from archive and (re)built from archive.

geoff...@gmail.com

unread,
Feb 11, 2024, 1:00:42 PM2/11/24
to weewx-user
No need to purge - why throw away the data?  My archive table contains 4229492 records, no problem.

Nate Bargmann

unread,
Feb 11, 2024, 8:39:35 PM2/11/24
to weewx...@googlegroups.com
Be careful throwing away archive_day-* data!

On my setup with the Davis driver min and max values often occur in
between archive intervals. I found that those are lost if the daily
summaries have to be rebuilt from the archive table. My archive
interval is 5 minutes.

I did some testing a couple of months back cleaning up some spurious
solar sensor data that crept in from a failing ISS transmitter. In the
process I rebuilt the daily summaries on a copy of the database and saw
every daily summary min/max value was on the five minute mark. I knew a
recent day had recorded a low, just briefly, at something like 24.4 and
in the original DB was recorded at that value at the precise time but
both values were lost in the regenerated daily summaries. I've become
quite careful to not disturb the daily summaries after that.

I think this should be prominently noted in the WeeWX documentation.

In my DB I noticed that the dailies started recording this information
when I switched to version 4.x in the first days of 2021. I don't
recall if there was something done to the DB in that upgrade or not.

- Nate

--
"The optimist proclaims that we live in the best of all
possible worlds. The pessimist fears this is true."
Web: https://www.n0nb.us
Projects: https://github.com/N0NB
GPG fingerprint: 82D6 4F6B 0E67 CD41 F689 BBA6 FB2C 5130 D55A 8819

signature.asc

michael.k...@gmx.at

unread,
Feb 12, 2024, 6:13:24 AM2/12/24
to weewx-user
Nate, this is what https://weewx.com/docs/5.0/utilities/weectl-database/#recalculate-daily-summary-weights is for. Imagine you manipulate the database an know, the average of a certain type will change after that, the highs an lows aren't affected, e.g. you backfill a gap in the database for outTemp from another source, because your primary sensors battery was empty. Last good value was at 8:00 a.m., after the morning low, you've replaced the battery at 1 p.m., before the afternoon high. After you've replaced the values in the database, you run "reweight" for that day, you highs and lows will be untouched.

Nate Bargmann

unread,
Feb 12, 2024, 9:08:34 AM2/12/24
to weewx...@googlegroups.com
* On 2024 12 Feb 00:13 -0600, 'michael.k...@gmx.at' via weewx-user wrote:
> Nate, this is
> what https://weewx.com/docs/5.0/utilities/weectl-database/#recalculate-daily-summary-weights
> is for. Imagine you manipulate the database an know, the average of a
> certain type will change after that, the highs an lows aren't affected,
> e.g. you backfill a gap in the database for outTemp from another source,
> because your primary sensors battery was empty. Last good value was at 8:00
> a.m., after the morning low, you've replaced the battery at 1 p.m., before
> the afternoon high. After you've replaced the values in the database, you
> run "reweight" for that day, you highs and lows will be untouched.

Note that I am still on 4.10 and that I found what I stated occurred a
couple of months ago before 5.0 was released. In this case the 4.10
wee_database docs should be consulted.

Try this on a copy of your DB:

Drop the daily summaries.
Rebuild the daily summaries.
Compare.

Again, perhaps this varies based on the backend, but with the Davis
backend I saw that min/wax values logged between archive intervals were
lost. Fortunately, wee_database has the capability to restore a single
archive_day record if needed.

I am very cautious working with the DB and do so on a copy if need be (I
recently filled a bunch of locally missing data from before a Raspberry
Pi crash in 2015 from Weather Underground. It worked very well). I
also dump and back up the DB offsite every six hours.
signature.asc

michael.k...@gmx.at

unread,
Feb 12, 2024, 9:20:33 AM2/12/24
to weewx-user
"reweight" was in the utils before 5.0. Of course, if you (as I do) very much value the precision of min/max values, you don't ever drop daily summaries unless it is really necessary, and even then only for the narrowest timespan necessary.

Min/Max values are from the LOOP, so any LOOP value leading to a min/max value has the timestamp of the LOOP packet in which it arrived, so this isn't anything related to Davis hardware, it depends on when the data arrives in the LOOP (if so).

Be even more paranoid than you seem to be concerning the database, I do hourly, daily, weekly and monthly backups and I run several weewx instances redundantly on independent hardware :D
Reply all
Reply to author
Forward
0 new messages