PostgreSQL support, which branch to target?

170 views
Skip to first unread message

Raoul Snyman

unread,
Dec 2, 2023, 10:46:32 PM12/2/23
to weewx-development
Hello,

I just came across WeeWX a week or so ago when I was looking for some open source Linux-based software to pull data from my AcuRite weather station. Fantastic project.

I am working on adding support for PostgreSQL (my database of choice), and I wanted to check which branch I should be targeting. I see the V5 branch, which seems to be the future of WeeWX, but the docs specify targeting the "development" branch.

Thanks!

matthew wall

unread,
Dec 3, 2023, 8:45:48 AM12/3/23
to weewx-development
hi raoul!

postgres support would be most excellent.  i think the biggest obstacle is the case sensitivity of column names.  the schema uses mixed case, and the column names are not consistent (some are snake_case, some are camelCase).  these names are used directly (case-sensitive), throughout the weewx code, and particularly in the templates and plots.  the problem i encountered when doing a half-hearted attempt at postgres support (almost 10 years ago!) was the case sensitivity of the database.  i seem to recall issues mapping the mixed-case names into postgres (and also mysql on windows?  or maybe mysql with a case-sensitivity option enabled?) queries.

anyway, from a development point of view, V5 is stable enough that you are pretty safe to target that branch.  most of the changes now are packaging and bugfixes as we get ready for a release.  the core has been pretty stable for awhile.

but i defer to tom - he might want your changes to go through the development branch before hitting V5.

m

Greg Troxel

unread,
Dec 3, 2023, 8:53:55 AM12/3/23
to matthew wall, weewx-development
matthew wall <mwall...@gmail.com> writes:

> the schema uses mixed case, and the column names are not consistent
> (some are snake_case, some are camelCase). these names are used
> directly (case-sensitive), throughout the weewx code, and particularly
> in the templates and plots. the problem i encountered when doing a
> half-hearted attempt at postgres support (almost 10 years ago!) was
> the case sensitivity of the database. i seem to recall issues mapping
> the mixed-case names into postgres (and also mysql on windows? or
> maybe mysql with a case-sensitivity option enabled?) queries.

Isn't this simply a bug in the current code, which ought to be fixed?
Or is it really hard to fix for some reason?

Tom Keffer

unread,
Dec 3, 2023, 10:32:07 AM12/3/23
to weewx-development
I'll be merging the V5 branch into the development branch in the next few days, but target the V5 branch for now.

As I recall, the "case sensitivity" problem with Postgres is that in order to achieve case-insensitivity, it converts everything to lower case, including column names.

When WeeWX starts up, it reads the schema from the database itself. So, something that started out like "outTemp" becomes "outtemp". Then if some code needs the outside temperature to calculate, say, dewpoint, it can't find it because Python string comparisons are always case-sensitive.

In the intervening years, perhaps Postgres has added a switch to prevent this. Don't know.

To work around it in the WeeWX code would require either that all string comparisons involving SQL types become case-insensitive, which would break countless 3rd party extensions, or the code that reads the schema from the database would have to be changed to read it from metadata, or someplace else. 

There are ways around the problem.

But, it's a biggish project. I'd be mighty grateful if someone took it on.

-tk

Joel Bion

unread,
Dec 3, 2023, 11:04:33 AM12/3/23
to Tom Keffer, weewx-development

Sent from my iPhone

On Dec 3, 2023, at 7:32 AM, Tom Keffer <tke...@gmail.com> wrote:

I'll be merging the V5 branch into the development branch in the next few days, but target the V5 branch for now.

As I recall, the "case sensitivity" problem with Postgres is that in order to achieve case-insensitivity, it converts everything to lower case, including column names.

When WeeWX starts up, it reads the schema from the database itself. So, something that started out like "outTemp" becomes "outtemp". Then if some code needs the outside temperature to calculate, say, dewpoint, it can't find it because Python string comparisons are always case-sensitive.

In the intervening years, perhaps Postgres has added a switch to prevent this. Don't know.

To work around it in the WeeWX code would require either that all string comparisons involving SQL types become case-insensitive, which would break countless 3rd party extensions, or the code that reads the schema from the database would have to be changed to read it from metadata, or someplace else. 

There are ways around the problem.

But, it's a biggish project. I'd be mighty grateful if someone took it on.

-tk

Detail: Postgres can accept mixed case table names but it requires always quoting them, per https://stackoverflow.com/questions/20878932/are-postgresql-column-names-case-sensitive#20880247


On Saturday, December 2, 2023 at 7:46:32 PM UTC-8 Raoul Snyman wrote:
Hello,

I just came across WeeWX a week or so ago when I was looking for some open source Linux-based software to pull data from my AcuRite weather station. Fantastic project.

I am working on adding support for PostgreSQL (my database of choice), and I wanted to check which branch I should be targeting. I see the V5 branch, which seems to be the future of WeeWX, but the docs specify targeting the "development" branch.

Thanks!

--
You received this message because you are subscribed to the Google Groups "weewx-development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to weewx-developm...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/weewx-development/9af547f0-449a-4c79-96b4-2f3559359232n%40googlegroups.com.

Tom Keffer

unread,
Dec 3, 2023, 2:00:55 PM12/3/23
to Joel Bion, weewx-development
Yeah, I'm not going to do that.

3rd party code still wouldn't work.

Cameron D

unread,
Dec 3, 2023, 9:20:38 PM12/3/23
to weewx-development
I don't understand the problem.
My MariaDB server is set to case-sensitive (file-system and table names). The column names are mainly camel case (using a trimmed down version of the old schema with original names, such as 'inTemp' and 'rain').
I have a second instance with an ecowitt DB where the column names are mixed camel and snake (I took standard CamelCase name and added identifying suffixes with an underscore). And, of course, all the archive table names are snake with possible camel suffixes, matching the original column name.
I have not noticed any problems so far. 
My  systems are all run on Debian Linux - if the issue is only a problem of compatibility with servers running on ms-windows or macs then that could be just declared as a known limitation.

Raoul Snyman

unread,
Dec 3, 2023, 9:53:44 PM12/3/23
to weewx-development
Hey Matthew,

Yeah, I read about the case-sensitivity issues. I am planning to use some look-up tables to get around the issue. A bit of a dirty hack, but hopefully it'll get me what I want :-D


Raoul Snyman

unread,
Dec 3, 2023, 10:04:55 PM12/3/23
to weewx-development
On Sunday, December 3, 2023 at 3:32:07 PM UTC Tom Keffer wrote:
I'll be merging the V5 branch into the development branch in the next few days, but target the V5 branch for now.

Thanks, I'll target V5 for now.
 
As I recall, the "case sensitivity" problem with Postgres is that in order to achieve case-insensitivity, it converts everything to lower case, including column names.

When WeeWX starts up, it reads the schema from the database itself. So, something that started out like "outTemp" becomes "outtemp". Then if some code needs the outside temperature to calculate, say, dewpoint, it can't find it because Python string comparisons are always case-sensitive.

In the intervening years, perhaps Postgres has added a switch to prevent this. Don't know.

To work around it in the WeeWX code would require either that all string comparisons involving SQL types become case-insensitive, which would break countless 3rd party extensions, or the code that reads the schema from the database would have to be changed to read it from metadata, or someplace else. 

There are ways around the problem.

Yeah, like I said to Matthew, I'm planning to put some lookup tables to get around the issue. Hopefully that'll work around the issue.

But, it's a biggish project. I'd be mighty grateful if someone took it on.

I've been working with databases (PostgreSQL, SQLite, and a few others) for many years. Honestly, I've always stuck to lower-casing all the table names and columns, due to differences in case sensitivity between databases, but when you get a fairly established project that didn't start off that way, especially one like WeeWX that implements its own DB abstraction layer, it's difficult to migrate to a standardized schema. So I understand completely not wanting to mess with the schema.

This looks simple enough to set up, thanks to the DB abstraction layer. I've got most of the DB module done, I'm just looking at writing some tests and running WeeWX with the simulator, so that I can catch any places where things might fail.
 

Raoul Snyman

unread,
Dec 3, 2023, 10:07:19 PM12/3/23
to weewx-development
On Monday, December 4, 2023 at 2:20:38 AM UTC Cameron D wrote:
I don't understand the problem.
My MariaDB server is set to case-sensitive (file-system and table names). The column names are mainly camel case (using a trimmed down version of the old schema with original names, such as 'inTemp' and 'rain').
I have a second instance with an ecowitt DB where the column names are mixed camel and snake (I took standard CamelCase name and added identifying suffixes with an underscore). And, of course, all the archive table names are snake with possible camel suffixes, matching the original column name.

PostgreSQL is NOT case-sensitive. It converts everything to lowercase in order to bypass any case sensitivity issues. The problem is that WeeWX is expecting the database to be case-sensitive.
 

Cameron D

unread,
Dec 4, 2023, 5:41:30 AM12/4/23
to weewx-development
Thanks, I understand the issue a bit better now.  It looks more like a case of whether the DB system is case-preserving. Sqlite is case-preserving but case-insensitive, whereas pg looks more like case-modifying and case-insensitive (unless identifiers are quoted).  Running the mysql server in case-insensitive mode (1) will cause similar problems to pg.

Raoul Snyman

unread,
Dec 11, 2023, 11:47:04 PM12/11/23
to weewx-development
I just wanted to give a little feedback.

Firstly, I have WeeWX logging stuff into PostgreSQL and generating the Season report without issues.

But.... I did have to change some stuff outside of the DB driver. There are a lot of assumptions built into the system, specifically around non-standard SQL which MySQL uses and SQLite allows because of MySQL.

I've tried to make things slightly more DB-agnostic, and moved a couple things to the base Connection object so that they can be overridden. It's not very pretty, but it works.

I still don't have the tests running properly, I couldn't find any docs on exactly how to set up the tests, and both the MySQL and PostgreSQL tests fail when I just run "make test"

Tom Keffer

unread,
Dec 12, 2023, 7:56:31 AM12/12/23
to Raoul Snyman, weewx-development
That looks like a good, solid start! Well done.

There are some other places in the code that are database-dependent. In particular, look at the queries in weewx/xtypes.py, dictionary "group_defs". Maybe you can figure out a way around that. 

I don't think the lookup table you supplied in postgres.py will survive the test of time. There are just too many types out there in the wild to ever capture them in a static structure. Whenever you create a table, I think it will be necessary to save the original names in database metadata, then return that when needed.

Make sure you don't drift too far away from the V5 code. I don't see any major refactoring coming up, but I can't guarantee it either.

-tk


--
You received this message because you are subscribed to the Google Groups "weewx-development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to weewx-developm...@googlegroups.com.

matthew wall

unread,
Dec 12, 2023, 9:15:31 AM12/12/23
to weewx-development
On Monday, December 11, 2023 at 11:47:04 PM UTC-5 Raoul Snyman wrote:
I still don't have the tests running properly, I couldn't find any docs on exactly how to set up the tests, and both the MySQL and PostgreSQL tests fail when I just run "make test"

i am working on the unit tests.  when we started them we just used python directly. i am converting all of it to pytest.

i would like to have the pytest conversion done when v5 comes out of beta, but it might happen soon after that.

the core in v5 is pretty solid, but we're still taking care of some packaging and integration bits
 

Raoul Snyman

unread,
Dec 12, 2023, 10:25:10 AM12/12/23
to weewx-development

There are some other places in the code that are database-dependent. In particular, look at the queries in weewx/xtypes.py, dictionary "group_defs". Maybe you can figure out a way around that. 

Thanks for the pointers, I'll take a look.

 
I don't think the lookup table you supplied in postgres.py will survive the test of time. There are just too many types out there in the wild to ever capture them in a static structure. Whenever you create a table, I think it will be necessary to save the original names in database metadata, then return that when needed.

Yeah, it's not ideal, it's really just a workaround. I created the list by combining the column names from the 3 schemas, and any time a new schema is created the lookup table would need to be modified. Saving the column names in the metadata sounds like it may work,  I'll play around with that idea.
 
Make sure you don't drift too far away from the V5 code. I don't see any major refactoring coming up, but I can't guarantee it either.

Yup, I'm rebasing on the V5 branch fairly often.
 

Raoul Snyman

unread,
Dec 12, 2023, 10:27:19 AM12/12/23
to weewx-development
Oh cool, that's good to know. I've used pytest fairly extensively, it's a good test framework.

 

Raoul Snyman

unread,
Dec 22, 2023, 1:12:30 AM12/22/23
to weewx-development
I've been going through the queries in xtypes.py, and I've come across some queries that don't make sense to me, and definitely throw errors in PostgreSQL.

For example:

SELECT SUM(sum >= %(val)s) FROM %(table_name)s_day_%(obs_key)s WHERE dateTime >= %(start)s AND dateTime < %(stop)s

SUM(sum >= 90) is not valid in PostgreSQL. Can you tell me what this is supposed to be doing, then perhaps I can translate the SQL into something that's more generic?

Thanks!

Tom Keffer

unread,
Dec 22, 2023, 8:15:09 AM12/22/23
to Raoul Snyman, weewx-development
Typically, this is used to calculate the number of days of precipitation greater than some amount during a time span, using the daily summaries. The parameter %(val)s is the value.

In retrospect, the choice of column names in the daily summaries may not have been a good one. The values "sum" in the expression "SUM(sum>90)" represent two different things. The first sum ("SUM") is an aggregation, the second sum ("sum") is a column name.

Perhaps you can escape the inner "sum"?

-tk

--
You received this message because you are subscribed to the Google Groups "weewx-development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to weewx-developm...@googlegroups.com.

Raoul Snyman

unread,
Dec 22, 2023, 3:35:34 PM12/22/23
to weewx-development
Actually, the columns names are fine for me, it was the expression inside the SUM function which was throwing me off.

I set up a test database in SQLite and ran those queries, and it looks like they are actually just doing a count (boolean true becomes 1, sum of a bunch of 1s is a count). I'm going to test changing those queries to count() and moving the expression into the where clause.

Tom Keffer

unread,
Dec 22, 2023, 4:09:04 PM12/22/23
to Raoul Snyman, weewx-development
I wish I knew more about SQL programming, but, alas, I know just enough to get by. I'd be very grateful for opinions from a true expert.

Reply all
Reply to author
Forward
0 new messages