Is it safe to delete NULL columns in the database?


Conz

8:10 AM (11 hours ago)
to weewx-user
I noticed there are a lot of unused columns in the database. Is it safe to just remove them, and maybe make it a little easier on weewx to do the updates every few minutes?

John Smith

9:11 AM (9 hours ago)
to weewx...@googlegroups.com
If the columns only contain NULLs, first stop weewx; then you can remove them safely.

Doing that alone shrunk my DB significantly.
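John's "only has NULLs" check can be scripted. Below is a minimal sketch using Python's built-in sqlite3 module (weewx's default store is SQLite); it is shown against an in-memory mock table with made-up column names — point `sqlite3.connect` at a copy of your own database file instead, and do the actual dropping with `weectl database drop-columns` as discussed later in the thread:

```python
import sqlite3

def all_null_columns(conn, table="archive"):
    """Return the columns of `table` whose every value is NULL."""
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    # COUNT("col") counts only non-NULL values, so 0 means the column is all NULLs.
    return [c for c in cols
            if conn.execute(f'SELECT COUNT("{c}") FROM {table}').fetchone()[0] == 0]

# Demo on an in-memory database mimicking a tiny weewx-style archive table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE archive (dateTime INTEGER, outTemp REAL, soilMoist1 REAL)")
conn.execute("INSERT INTO archive VALUES (1700000000, 12.5, NULL)")
conn.execute("INSERT INTO archive VALUES (1700000300, 12.7, NULL)")
print(all_null_columns(conn))  # ['soilMoist1'] -- never populated
```

Run it against a copy first, and stop weewx before dropping anything, as advised above.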

On Sat, 28 Feb 2026 at 00:10, Conz <conse...@gmail.com> wrote:
I noticed there's a lot of unused columns in the database, is it safe to just remove those and maybe make it a little bit easier on weewx to do the updates every few minutes ?


DR

1:23 PM (5 hours ago)
to weewx...@googlegroups.com
This is nice to know.

I think I asked many years ago whether it would help to do so, being
something of a minimalist and having had "every byte is valuable" drummed
into me in college.

I believe one of the experts suggested that the savings in storage
space, and perhaps in WeeWX processing time, would be
insignificant.


Are there any tests from those who have done this, or who know how to do
it safely, showing how much saving happens, if any? Just
curious. Dale


Pablo Sanchez

1:41 PM (5 hours ago)
to weewx...@googlegroups.com
Hi Dale,

TL;DR

I would not bother. Disk space is cheap compared to your time and there would not be any measurable performance gains.

Details

These data sets are not large and the read/write rate is peanuts.

The default value for these NULL'able columns is NULL. At INSERT time, when a column value is not specified, the database engine looks at the column constraints and, if there is a default, applies it. At this application's transaction rate, that takes a minuscule amount of time, probably microseconds.
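A quick illustration of that default-NULL behavior, sketched with SQLite (the column names here are invented for the demo, not taken from the weewx schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE archive (dateTime INTEGER, outTemp REAL, leafWet1 REAL)")

# Insert without mentioning leafWet1: the engine applies the column default (NULL).
conn.execute("INSERT INTO archive (dateTime, outTemp) VALUES (1700000000, 12.5)")
row = conn.execute("SELECT leafWet1 FROM archive").fetchone()
print(row)  # (None,) -- the unspecified column defaulted to NULL
```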

For storage, a NULL'able column does not take much space, and unless the entire table is constantly being read, the working set of data will be in RAM.

Empirically, I have been running weewx since July 1st, 2025 (ish) — roughly eight months. The last time I updated my OS (I run a rolling distro) was 38 days ago. My "server" (it is a klunker laptop) has 8 GB of RAM; 5.1 GB is in use for kernel buffers and cache, and I am only using 879 MB.

I forced a report generation, and we can see disk usage is very low: it peaked at 77 IOPS (all writes).

```
01:32:24 PM       tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util DEV
01:33:24 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00 sda
01:33:27 PM      6.33     97.33      8.00      0.00     16.63      0.00      0.32      0.17 sda
01:33:30 PM     16.00      0.00    597.33      0.00     37.33      0.01      0.54      0.53 sda
01:33:33 PM      1.00      0.00    140.00      0.00    140.00      0.00      1.67      0.13 sda
01:33:36 PM     77.33      0.00    381.33      0.00      4.93      0.01      0.16      1.17 sda
01:33:39 PM      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00 sda
```


I am running MariaDB for no reason. Probably dumb of me. I should switch back to SQLite. :)

Cheers!
---
pablo

DR

1:59 PM (5 hours ago)
to weewx...@googlegroups.com
That is in alignment with what I recall from some time ago.


Database engines have gotten much smarter and do a lot that I'm no
longer aware of. I recall the days when we packed 8 binary values into
one word and masked it to get what we wanted. That obviously won't work with
numeric values, but it shows the lengths we were taught to go to in the infancy
of software. And the magic that occurs behind the scenes is often of no
concern to programmers now. I recall when something termed "Rushmore"
techniques were used to speed indexed services after the first reference.

Thank you for sharing the info you accumulated about speed and storage.

It keeps this forum fun and fact-filled. Dale


Conz

3:52 PM (3 hours ago)
to weewx-user
I just wanted to do this because I like a neat db, in case I want to mess around with it myself. It makes things a bit clearer when staring at a table if you see numbers and not just a lot of NULLs.
I had also hoped it would flatten out the CPU spike that occurs every 5 minutes, because it's just high enough to trigger the CPU fan to speed up and become audible :-)
But it didn't really make any difference.

If anyone else decides to tidy up their db, you might need this bit of SQL, as the day tables don't seem to get cleaned up by 'weectl database drop-columns'. It checks every archive_day table and sees whether the corresponding column still exists in archive. Make sure not to execute the results blindly: take the archive_day__metadata table out of the output first.
Also be sure to stop weewx before, and restart it after.
```
mysql -N -B -e "
SELECT CONCAT('DROP TABLE \`weather\`.\`', t.TABLE_NAME, '\`;') AS drop_stmt
FROM information_schema.TABLES t
LEFT JOIN information_schema.COLUMNS c
ON c.TABLE_SCHEMA = 'weather'
AND c.TABLE_NAME = 'archive'
AND c.COLUMN_NAME = SUBSTRING(t.TABLE_NAME, CHAR_LENGTH('archive_day_') + 1)
WHERE t.TABLE_SCHEMA = 'weather'
AND t.TABLE_NAME LIKE 'archive_day\\_%' ESCAPE '\\\\'
AND c.COLUMN_NAME IS NULL
ORDER BY t.TABLE_NAME;
" > drop_archive_day_orphans.sql
```
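For SQLite users, a roughly equivalent orphan check can be sketched with Python's built-in sqlite3 module. This assumes the stock weewx convention that each archive_day_X table summarizes column X of archive; it is demonstrated on an in-memory mock, so point it at a copy of your own database file for real use, and review the output before running any DROP:

```python
import sqlite3

def orphan_day_tables(conn):
    """archive_day_* tables whose column no longer exists in archive."""
    cols = {row[1] for row in conn.execute("PRAGMA table_info(archive)")}
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' "
        "AND name LIKE 'archive_day\\_%' ESCAPE '\\'")]
    prefix = len("archive_day_")
    # Skip the bookkeeping table; report day tables with no matching column.
    return [t for t in tables
            if t != "archive_day__metadata" and t[prefix:] not in cols]

# Mock schema: archive keeps outTemp; archive_day_soilMoist1 is left over
# from a column that was already dropped.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE archive (dateTime INTEGER, outTemp REAL)")
conn.execute("CREATE TABLE archive_day_outTemp (dateTime INTEGER)")
conn.execute("CREATE TABLE archive_day_soilMoist1 (dateTime INTEGER)")
conn.execute("CREATE TABLE archive_day__metadata (name TEXT)")
for t in orphan_day_tables(conn):
    print(f"DROP TABLE {t};")  # DROP TABLE archive_day_soilMoist1;
```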


On Friday, 27 February 2026 at 19:59:47 UTC+1, DR wrote:

DR

4:42 PM (2 hours ago)
to weewx...@googlegroups.com
Interesting.

I assumed that the spike in activity would occur with any intense
computational activity.

Do you have timing data on how much shorter the process was?


And what was the size change in your database before vs. after? Dale.


John Smith

6:15 PM (26 minutes ago)
to weewx...@googlegroups.com
On Sat, 28 Feb 2026 at 05:23, DR <daleea...@gmail.com> wrote:
I believe one of the experts suggested that the amount of saving in
storage space and perhaps time for WeeWx to do processing was
insignificant.

Disk is only part of the picture, at least for me: getting the database size down also decreases the memory wasted on caching NULLs, and I also use an ARM64 CPU, which has its own limitations.

Here's the thread where I wrote about this very topic:


At the end of tuning my database, I had reduced its size to almost 1/10th with no loss of information.

John Smith

6:19 PM (22 minutes ago)
to weewx...@googlegroups.com
On Sat, 28 Feb 2026 at 05:41, Pablo Sanchez <pa...@hillsandlakes.com> wrote:
These data sets are not large and the read/write rate is peanuts.

I beg to differ: after collecting almost 20 years of data at 5-minute intervals, mine had grown to over 1.1 GB. After tuning it, I got it down to ~170 MB. I wouldn't call that peanuts.

John Smith

6:27 PM (14 minutes ago)
to weewx...@googlegroups.com
On Sat, 28 Feb 2026 at 05:59, DR <daleea...@gmail.com> wrote:
one word and masked it to get what we wanted. Obviously won't work with
numeric values but shows the lengths we were taught to do in the infancy

That's the second part of my tuning: by default, decimal numbers were being stored as doubles, which take 8 bytes and allow about 16 decimal digits of precision, when floats only use 4 bytes and only one decimal place is needed.

I also switched other columns, like usUnits and interval, to tinyints, which use only 1 byte compared to 4 bytes for an int.
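The back-of-envelope arithmetic behind that kind of column-type tuning can be sketched as follows (illustrative raw-data figures only; real on-disk savings also depend on indexes and storage-engine row overhead):

```python
# ~20 years of 5-minute archive records (ignoring leap days for a rough figure).
rows = 20 * 365 * 24 * 12
print(rows)  # 2102400 rows

# Bytes saved per converted column across all rows.
double_to_float = rows * (8 - 4)   # double (8 B) -> float (4 B)
int_to_tinyint  = rows * (4 - 1)   # int (4 B) -> tinyint (1 B)
print(f"double->float: ~{double_to_float / 1e6:.1f} MB per column")
print(f"int->tinyint:  ~{int_to_tinyint / 1e6:.1f} MB per column")
```

At roughly 8 MB per converted double column, a few dozen columns plus the dropped all-NULL ones add up quickly, which is consistent in scale with the 1.1 GB to ~170 MB reduction reported above.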