Upgrade from 4.10.2-1 to 5.0.2 - high CPU and external database load.


Bartosz Francman

Jun 1, 2024, 1:25:36 PM
to weewx-user
Hi,
For several months I had been postponing the upgrade from version 4.10.2-1 to version 5.0.2. To tell the truth, I didn't even realize that it was a jump from 4.x to 5.x, and a few days ago I did an automatic upgrade (sudo apt update && sudo apt upgrade).
When asked about the configuration file, I kept the old version, since everything had worked so far.
My system consists of a Raspberry Pi 3B, a Eurochron EFWS 2900 station and a database (MariaDB 10) located on a NAS so as not to write data to the memory card. This system has been operating for 6 years (the first entries in the database are from 2018). So far I haven't had any problems. Everything worked as it should.
As the weather station only works via WiFi, I had to use the Interceptor driver to read data from the station. Unfortunately, only port 80 can be used: the station cannot be configured at all, apart from turning it on and setting access data for 3 weather services.
Now I will briefly describe the problems with the upgrade.
Port 80 cannot normally be used by any user other than root, so I had to change weewx to run as root instead of as the weewx user. Data began to appear and weewxd no longer exited for lack of access to port 80. I thought that it was not so bad.
However, I saw that the weewx processes were using 100% of two of the Raspberry Pi's four CPU cores. I found a post in this group suggesting that the cause may be missing columns in the archive table (https://groups.google.com/g/weewx-user/c/6rl2FIbqVp4/m/S0Ek9ZVaBwAJ). I did everything according to that post, and also followed https://weewx.com/docs/5.0/utilities/weectl-database/#update-a-database to update the database and check it. All this took about 2 hours because of the large amount of data in this database. Apart from the high CPU load, I also noticed that every half minute there is a very large read from the database, and it is related to weewx. I thought that rebuilding the database would reduce the CPU load and eliminate the frequent polling, but unfortunately it didn't help. I don't have any very advanced visualizations, just the basic graphs.
I left it that way, because this Raspberry Pi has nothing else to do and can afford to have two of its four cores fully loaded. Unfortunately, about two hours after starting the system, one weather service notified me that I was not sending data to them. I tried to access the Raspberry Pi via ssh, but it was impossible: the connection timed out, although the Pi was running.
Only today did I sit down to work on it and manage to connect to the Raspberry Pi. The output of sudo systemctl status weewx, including the last log entries, is below:
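To confirm that the port 80 failure is purely a privilege issue, the bind can be tried directly from the account that runs weewxd. The following is a minimal sketch and not part of WeeWX or the Interceptor driver; the function name `bind_status` is made up for illustration, and it assumes Linux, where ports below 1024 are reserved for privileged processes by default.

```python
import errno
import socket


def bind_status(port, host="0.0.0.0"):
    """Try to bind a TCP socket to host:port and report what happened.

    Returns "ok", "permission denied" or "in use".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Avoid spurious failures caused by sockets lingering in TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
        except PermissionError:
            # The port is privileged and this user may not bind it.
            return "permission denied"
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                # Another process (perhaps weewxd itself) already holds it.
                return "in use"
            raise
    return "ok"


if __name__ == "__main__":
    print("port 80:", bind_status(80))
```

Run as the weewx user, "permission denied" would point at the privilege restriction, while "in use" would mean another process already owns the port.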

pi@raspberry-pi:~ $ sudo systemctl status weewx
× weewx.service - WeeWX
     Loaded: loaded (/lib/systemd/system/weewx.service; enabled; preset: enabled)
     Active: failed (Result: signal) since Sat 2024-06-01 10:36:12 CEST; 1min 2s ago
   Duration: 19h 10min 2.367s
       Docs: https://weewx.com/docs
    Process: 720 ExecStart=weewxd /etc/weewx/weewx.conf (code=killed, signal=KILL)
   Main PID: 720 (code=killed, signal=KILL)
        CPU: 1h 59min 21.575s


maj 31 17:51:09 raspberry-pi weewxd[720]: INFO weewx.restx: PWSWeather: Published record 2024-05-31 17:48:00 CEST (1717170480)
maj 31 17:51:22 raspberry-pi weewxd[720]: INFO weewx.restx: Wunderground-RF: Published record 2024-05-31 17:48:47 CEST (1717170527)
maj 31 17:51:28 raspberry-pi weewxd[720]: ERROR weewx.restx: WOW: Failed to publish record 2024-05-31 17:48:00 CEST (1717170480): Failed upload after 3 tries
maj 31 17:51:28 raspberry-pi weewxd[720]: INFO weewx.restx: Wunderground-RF: Published record 2024-05-31 17:49:19 CEST (1717170559)
maj 31 17:51:53 raspberry-pi weewxd[720]: INFO weewx.restx: Wunderground-RF: Published record 2024-05-31 17:49:51 CEST (1717170591)
maj 31 17:52:13 raspberry-pi weewxd[720]: INFO weewx.restx: Wunderground-RF: Published record 2024-05-31 17:50:23 CEST (1717170623)
maj 31 17:52:56 raspberry-pi weewxd[720]: INFO weewx.manager: Added record 2024-05-31 17:50:00 CEST (1717170600) to database 'weewx_metric'
cze 01 10:36:12 raspberry-pi systemd[1]: weewx.service: Main process exited, code=killed, status=9/KILL
cze 01 10:36:12 raspberry-pi systemd[1]: weewx.service: Failed with result 'signal'.
cze 01 10:36:12 raspberry-pi systemd[1]: weewx.service: Consumed 1h 59min 21.575s CPU time.

The last entry to the database was yesterday at 17:52:56, and since then "nothing" has happened. weewx only stopped today at 10:36:12, when I logged in. Yesterday I tried to log in around 18:00 and the system was frozen.
While checking the logs, I found this information regarding weewx version 5.0.2:

WARNING weewx.engine: Previous report thread has been running 601.9695854187012 seconds.  Launching report thread anyway.

INFO weewx.imagegenerator: Generated 13 images for report SeasonsReport in 458.25 seconds

INFO weewx.engine: Launch of report thread aborted: existing report thread still running

INFO weewx.cheetahgenerator: Generated 8 files for report SeasonsReport in 252.10 seconds

Generating simple images shouldn't take more than 30 seconds, right? Not, as here, over 7 minutes for the images and over 4 minutes for the files.
In version 4.10.2, generating the same images takes:

INFO weewx.cheetahgenerator: Generated 8 files for report SeasonsReport in 12.51 seconds
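The generation times can be pulled out of the log and compared automatically. This is a small sketch that assumes the summary lines have exactly the format quoted above; the names `SUMMARY` and `generation_times` are illustrative and not part of WeeWX.

```python
import re

# Matches the generator summary lines that WeeWX wrote to the log above.
SUMMARY = re.compile(
    r"weewx\.(?P<generator>\w+): Generated (?P<count>\d+) (?:files|images) "
    r"for report (?P<report>\w+) in (?P<seconds>\d+(?:\.\d+)?) seconds"
)


def generation_times(lines):
    """Return (generator, report, item_count, seconds) for each summary line."""
    found = []
    for line in lines:
        match = SUMMARY.search(line)
        if match:
            found.append((match["generator"], match["report"],
                          int(match["count"]), float(match["seconds"])))
    return found


# Log lines copied from this post.
log_5_0_2 = [
    "INFO weewx.imagegenerator: Generated 13 images for report SeasonsReport in 458.25 seconds",
    "INFO weewx.cheetahgenerator: Generated 8 files for report SeasonsReport in 252.10 seconds",
]
log_4_10_2 = [
    "INFO weewx.cheetahgenerator: Generated 8 files for report SeasonsReport in 12.51 seconds",
]

new = {gen: secs for gen, _, _, secs in generation_times(log_5_0_2)}
old = {gen: secs for gen, _, _, secs in generation_times(log_4_10_2)}
for generator in old:
    print(f"{generator}: {new[generator] / old[generator]:.1f}x slower in 5.0.2")
```

For the Cheetah generator this works out to a slowdown of roughly 20 times (252.10 s against 12.51 s).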

I came to the conclusion that my nerves could not take any more of this and went back to version 4.10.2-1. Now the system works fine, nothing loads the processor, and data is sent to the weather services continuously. The local page is generated every two minutes.
In version 5.0.2 the page was refreshed at irregular intervals; when I looked, the data was 20 minutes old.

Below I am posting the debug information from my 5.0.2 setup and a few graphs illustrating the problems. Maybe someone will be able to find out what is wrong in my configuration... Or maybe the problem is somewhere else...

BR
Bartosz


CPU load 5.0.2
weewx.png

CPU load 4.10.2
weewx_4.10.2.png

db load 5.0.2
db.png

db load 4.10.2
db_4.10.2.png

michael.k...@gmx.at

Jun 1, 2024, 2:09:54 PM
to weewx-user
Belchertown? It is known that Belchertown consumes a lot of CPU time when certain derived values are not in the database after a 4 => 5 upgrade. Check the group for solutions.

I may have had something similar a couple of weeks ago (without Belchertown). It might have started after I did an apt upgrade (but no weewx upgrade). I had to restart the RPi4 every day; after 24h I couldn't even ssh into the machine any more. The logs contained messages like "database locked" and those "report thread still running" messages. I didn't care too much because I upgraded my server hardware and installed everything from scratch, and I don't have any issues any more. I had four weewxd instances and a lot of other stuff running, and I never found out what caused the issues.

Tom Keffer

Jun 1, 2024, 2:52:50 PM
to weewx...@googlegroups.com
Permission problems tend to be "all or nothing", so I doubt running as 'root' (as opposed to 'weewx') has anything to do with the excessive use of the CPU.

The problem is almost surely caused by the need to calculate an aggregate of an xtype that is not in the database (as described in this wiki article). Because this involves many small queries to the database, using a remote database can really exacerbate the problem. 
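A toy illustration of why a missing type is so much more expensive than a stored one. This sketch does not reproduce WeeWX's xtypes code: the per-record lookup pattern, the table layout, and the `toy_derived` formula are all assumptions chosen only to make the effect visible, and it uses an in-memory SQLite database rather than MariaDB.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper that counts how many SQL statements are executed."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def execute(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params)


def build_archive(n_records):
    """Create a small archive table with made-up data, 120 s apart."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE archive "
        "(dateTime INTEGER PRIMARY KEY, outTemp REAL, outHumidity REAL)"
    )
    rows = [(1717170000 + 120 * i, 10.0 + i % 15, 40.0 + i % 50)
            for i in range(n_records)]
    conn.executemany("INSERT INTO archive VALUES (?, ?, ?)", rows)
    return CountingConnection(conn)


def max_of_stored_column(db):
    # Stored type: the database computes the aggregate in one statement.
    return db.execute("SELECT MAX(outTemp) FROM archive").fetchone()[0]


def toy_derived(temp_c, humidity):
    # Rule-of-thumb dew point estimate; only here to have something to compute.
    return temp_c - (100.0 - humidity) / 5.0


def max_of_derived_value(db):
    # Type not stored: list the timestamps, then look up every record
    # separately and compute the value in Python.
    timestamps = [row[0] for row in db.execute("SELECT dateTime FROM archive")]
    best = None
    for ts in timestamps:
        temp_c, humidity = db.execute(
            "SELECT outTemp, outHumidity FROM archive WHERE dateTime = ?", (ts,)
        ).fetchone()
        value = toy_derived(temp_c, humidity)
        if best is None or value > best:
            best = value
    return best


db = build_archive(720)  # one day at a two-minute archive interval
max_of_stored_column(db)
stored_queries = db.queries
max_of_derived_value(db)
derived_queries = db.queries - stored_queries
print(f"stored: {stored_queries} query, derived: {derived_queries} queries")
```

In this toy setup one day of data costs a single statement for the stored column and 721 for the derived one. On a remote database every one of those statements also pays a network round trip, which is the effect described above.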

I see that you are using the Seasons skin. Have you modified it at all? To check, you may want to use the stock Seasons skin and compare times. 




vince

Jun 1, 2024, 3:26:30 PM
to weewx-user
If you have 6 years of data you probably are using the old wview-compatible schema, so this 'sounds' like you are missing db elements referenced in your skins.  You are also running on a pi 3b (slow) and mentioned you have a 2 minute archive period (a little fast) making the pi work harder than usual.

As a quick test, switch to the simplest available skin (Mobile or Smartphone or Standard) and see if the problem goes away.  If so it is almost certainly missing elements in your database.

Can you post your Seasons skin.conf, weewx.conf (with no usernames/passwords in it), and your db schema ?

Tom Keffer

Jun 1, 2024, 4:02:14 PM
to weewx...@googlegroups.com
Vince, he posted the output of weectl debug, which includes the schema and weewx.conf. 

-tk


vince

Jun 1, 2024, 4:21:13 PM
to weewx-user
oops - thank you.  I missed that link in the long wording of the problem description.   Like the new weectl debug output too :-)

What I see is he's using the old original wview-compatible schema, so that certainly leans toward the db schema issue.  Again, I'd suggest temporarily switching to the old hard-coded Standard or Smartphone or Mobile skin (as a test) to see if the problem goes away.

Is there a way to instrument a test version of something under the hood in order to log which (if any) xtypes are being calculated due to not being found in the db ?   This issue seems to be hitting quite a few legacy users.  Would some test instrumentation perhaps help folks identify what's missing in their setup ?
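Short of instrumenting WeeWX itself, a rough offline check is possible by comparing the observation types a skin references against the columns in the schema. The sketch below is only that comparison: it does not read a skin or a database, the function name `missing_types` is made up, and the sample lists are hypothetical placeholders, not Bartosz's actual schema or the real contents of the Seasons skin. Both lists would have to be filled in by hand, for example from the weectl debug output and from skin.conf.

```python
def missing_types(schema_columns, referenced_types):
    """Return the referenced observation types that have no database column."""
    available = set(schema_columns)
    return sorted(t for t in set(referenced_types) if t not in available)


# Hypothetical inputs, for illustration only.
schema_columns = ["dateTime", "usUnits", "interval",
                  "outTemp", "outHumidity", "barometer"]
referenced_types = ["outTemp", "barometer", "appTemp", "cloudbase"]

for name in missing_types(schema_columns, referenced_types):
    print(f"{name}: not in the schema, would have to be calculated")
```

Any type this reports is a candidate for the slow path and worth checking first.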

Bartosz Francman

Jun 1, 2024, 7:49:51 PM
to weewx-user
Hi,

The skin has not been modified in any way.
In my long description of the problem ;) I forgot to mention that when the automatic upgrade from 4.10.2 to 5.0.2 did not work, I uninstalled weewx completely (sudo apt purge) and installed it fresh. Afterwards I copied the entries from the old configuration file into the new one. I did not replace the file, only individual entries.
No files were edited or changed except the configuration file.
I realize that changing from weewx to root shouldn't increase CPU usage. I only mentioned it to describe the problem with port 80, on which the Raspberry Pi must listen for data from the weather station.


Br
Bartosz

Tom Keffer

Jun 1, 2024, 8:46:03 PM
to weewx...@googlegroups.com
Some ideas:

1. Check the slow query log. Anything stand out?
2. Check the regular query log. Any queries that are being run repeatedly? Or, require full database searches?
3. Divide and conquer. Disable the Cheetah generator (go to the bottom of the Seasons skin.conf and remove weewx.cheetahgenerator.CheetahGenerator from the option generator_list). Now start removing plots under [ImageGenerator]. Is there one plot that is causing the problem?

In short, you're going to have to isolate the problem.
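For step 2, repeated queries are easier to spot once the literal values are masked so that statements differing only in a timestamp are counted together. This is a minimal sketch under the assumption that the SQL statements have already been extracted from the query log, one per string; parsing the MariaDB log format itself is not shown, and the function names and sample statements are made up for illustration.

```python
import re
from collections import Counter


def normalize(sql):
    """Mask literals so queries that differ only in values group together."""
    sql = re.sub(r"'[^']*'", "?", sql)             # quoted strings
    sql = re.sub(r"\b\d+(?:\.\d+)?\b", "?", sql)   # bare numbers
    return re.sub(r"\s+", " ", sql).strip()        # collapse whitespace


def most_repeated(statements, top=5):
    """Return the `top` most frequent query shapes with their counts."""
    return Counter(normalize(s) for s in statements).most_common(top)


# Hypothetical statements standing in for lines from a query log.
statements = [
    "SELECT outTemp FROM archive WHERE dateTime = 1717170480",
    "SELECT outTemp FROM archive WHERE dateTime = 1717170600",
    "SELECT MAX(outTemp)  FROM archive",
]

for shape, count in most_repeated(statements):
    print(f"{count:6d}  {shape}")
```

A query shape that shows up thousands of times per report cycle is the one to look at first.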

-tk
