Common database for doing multi-site comparison


Tomasz Lewicki

Feb 21, 2026, 4:19:32 AM
to weewx-user

I have an idea and a problem at the same time. I would like to build a small network of weather stations in the area, say 5-10 units. They will all be powered by WeeWX on a Raspberry Pi. Each one will publish its data on the internet independently of the others (one domain, separate subdomains), but I am also thinking about a collective presentation of data from the stations, limited to selected parameters, e.g., temperature, wind, precipitation.

Each WeeWX instance has its own weewx.sdb database with an identical structure. If I understand correctly, I would have to create a separate (collective) database in which the column name for each parameter is different per station, e.g., temp_pws01, temp_pws02, etc. The comparison would not be generated in real time, i.e., every 5-10 minutes as in a standard WeeWX installation, but, say, 4 times a day. My plan is that the database from each station would be sent via rsync to a remote server, and then specific columns would be extracted from them “somehow” and copied to a new, shared database, from which a summary would be made. Can the final step be done with WeeWX, or do I need to use other tools?
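The "somehow" step can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the function name, the `combined` table, and the choice of columns are hypothetical, and it stores one row per station and timestamp (a `station` column) rather than per-station column names such as temp_pws01, so adding a station needs no schema change. It also assumes every station records in the same unit system.

```python
import sqlite3

# Hypothetical choice of parameters; the names follow the stock WeeWX archive table.
COLUMNS = ("outTemp", "windSpeed", "rain")


def merge_station(combined, station_id, station_db_path, columns=COLUMNS):
    """Copy selected archive columns from one station database into `combined`.

    Rows are keyed by (station, dateTime), so re-running the merge after a
    fresh rsync overwrites existing rows instead of duplicating them.
    Assumes all stations use the same unit system (usUnits is not copied).
    """
    combined.execute(
        "CREATE TABLE IF NOT EXISTS combined ("
        "station TEXT NOT NULL, dateTime INTEGER NOT NULL, "
        + ", ".join(f"{c} REAL" for c in columns)
        + ", PRIMARY KEY (station, dateTime))"
    )
    col_list = ", ".join(columns)
    src = sqlite3.connect(station_db_path)
    try:
        rows = src.execute(f"SELECT dateTime, {col_list} FROM archive").fetchall()
    finally:
        src.close()
    # One placeholder each for station and dateTime, plus one per copied column.
    placeholders = ", ".join("?" for _ in range(len(columns) + 2))
    combined.executemany(
        f"INSERT OR REPLACE INTO combined (station, dateTime, {col_list}) "
        f"VALUES ({placeholders})",
        [(station_id, *row) for row in rows],
    )
    combined.commit()
    return len(rows)
```

Calling `merge_station(conn, "pws01", "/path/to/pws01.sdb")` once per rsynced file (paths are placeholders) would fill the shared table; WeeWX itself would still need a binding and schema to report from it.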

The question is, has anyone done something like this before and could point me in the right direction?



Graham Eddy

Feb 21, 2026, 4:32:15 AM
to WeeWX User
if they are all on the same box, then cheetah tags can point into another weewx database for values
e.g. in my weewx.conf, if i have another database with data_binding 'wx_other', i can use the tag $day($data_binding='wx_other').outTemp.max

i have a weewx instance ‘wxrep’ that only exists to pull data from other instances to present a consolidated whole
⊣GE⊢


Tomasz Lewicki

Feb 21, 2026, 5:00:48 AM
to weewx-user
It sounds interesting! Could you share your configuration files? 

Graham Eddy

Feb 21, 2026, 6:27:58 AM
to WeeWX User
just some excerpts will suffice.
see that station 'wxrep' has its own database 'weewx.sdb' but also external database 'vantage.sdb' referred to using data_binding 'vantage'

/home/weewx/wxrep/weewx.conf (excerpt)

[Databases]

    # A SQLite database is simply a single file.
    [[archive_sqlite]]
        database_name = weewx.sdb
        database_type = SQLite

    [[vantage_sqlite]]
        database_name = vantage.sdb
        database_type = SQLite

… … … …

[DataBindings]

    [[wx_binding]]
        # The database must match one of the sections in [Databases].
        # This is likely to be the only option you would want to change.
        database = archive_sqlite
        # The name of the table within the database.
        table_name = archive
        # The manager handles aggregation of data for historical summaries.
        manager = weewx.manager.DaySummaryManager
        # The schema defines the structure of the database.
        # It is *only* used when the database is created.
        #schema = schemas.wview_extended.schema
        schema = user.schemas.wxrep.schema

    [[vantage]]
        database = vantage_sqlite
        table_name = archive
        manager = weewx.manager.DaySummaryManager
        schema = user.schemas.vantage.schema



/home/weewx/wxrep/skins/ge-wxrep/daycharts.frag.html.tmpl (excerpt)

<figure class="chart">
    <img src="flux/images/vantage/daytempcomb.png"/>
    <div>
      <div>Out</div>
      <div>Min</div><div>$span($data_binding="vantage",$day_delta=1).outTemp.min</div><time>$span($data_binding="vantage",$day_delta=1).outTemp.mintime.format("%H:%M %d/%m")</time>
      <div>Max</div><div>$span($data_binding="vantage",$day_delta=1).outTemp.max</div><time>$span($data_binding="vantage",$day_delta=1).outTemp.maxtime.format("%H:%M %d/%m")</time>
      <div>Dew</div>
      <div>Min</div><div>$span($data_binding="vantage",$day_delta=1).dewpoint.min</div><time>$span($data_binding="vantage",$day_delta=1).dewpoint.mintime.format("%H:%M %d/%m")</time>
      <div>Max</div><div>$span($data_binding="vantage",$day_delta=1).dewpoint.max</div><time>$span($data_binding="vantage",$day_delta=1).dewpoint.maxtime.format("%H:%M %d/%m")</time>
    </div>
</figure>


cheers
⊣GE⊢

p q

Feb 21, 2026, 9:21:39 AM
to weewx-user
Something like Grafana might be good for this

Peter Quinn
(415)794-2264

Vince Skahan

Feb 21, 2026, 1:31:00 PM
to weewx-user
Take a look at the wiki page https://github.com/weewx/weewx/wiki/dashboards for some ways to do this.  There are others.  Lots of others.

For just a few readings from just a few stations, MQTT is the simplest way:
  • run a mosquitto broker on your central site
  • have each weewx instance publish MQTT to a unique topic such as station1/loop, station2/loop, etc.
  • have the central weewx instance use MQTTSubscribe to subscribe to the desired topics and save to one weewx db
Nice thing is MQTTSubscribe can run as a driver, so you could use that to get data from the incoming MQTT topics into one central weewx db.  You'll have to map those items to weewx db elements, so if you have a lot of readings from many stations, you'll need to add db columns once.

You'd have to of course create your own weewx skin to create whatever HTML and graphs you want, but you'd be totally weewx with the exception of running a Mosquitto MQTT broker (easy to do) on your central site.

That said, if you want to have 'all' possible readings from 'many' stations then it's probably best to publish to influxdb and display with something like grafana or maybe even Home Assistant.  That will get you almost infinite extensibility but it's far more work to learn, set up, and set up securely. (Note: I never looked into how to secure a grafana dashboard from Internet bots.  I was fortunately LAN-only.)

bell...@gmail.com

Feb 21, 2026, 1:59:04 PM
to weewx-user

If you want to try something on the bleeding edge, you could look into my weewx-mqtt/replicate extension, https://github.com/weewx-mqtt/replicate.


What is weewx-mqtt/replicate? It is my attempt to create a ‘light weight’ way of replicating weewx databases. (I say ‘light weight’ because I have decided that there is no such thing as a light weight replicator.) It uses the request/response functionality of the V5 MQTT protocol. It consists of a service running on a current (the primary) instance of WeeWX and a driver that runs on the secondary instance.


Why not just use one of the WeeWX extensions that publishes and another extension to subscribe to the topic? One limitation of this is that, if the secondary instance is offline, data will not be replicated. weewx-mqtt/replicate attempts to overcome this. When the secondary instance comes online, it asks the service running on the primary instance for all the data since the last record in its database (i.e., a catchup is performed).
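The "everything since my last record" idea can be illustrated locally with two SQLite connections. This is not how weewx-mqtt/replicate is implemented (it exchanges the request and response over MQTT); the function name and the reduced three-column archive table are assumptions made purely to show the catchup logic.

```python
import sqlite3


def catch_up(primary, secondary):
    """Copy archive rows newer than the secondary's last record.

    Both arguments are open sqlite3 connections whose `archive` table has
    (at least) the columns dateTime, usUnits and outTemp. Returns the number
    of rows copied.
    """
    last = secondary.execute("SELECT MAX(dateTime) FROM archive").fetchone()[0]
    # An empty secondary has no last record; epoch timestamps are positive,
    # so starting from 0 requests the complete history.
    since = last if last is not None else 0
    rows = primary.execute(
        "SELECT dateTime, usUnits, outTemp FROM archive "
        "WHERE dateTime > ? ORDER BY dateTime",
        (since,),
    ).fetchall()
    secondary.executemany(
        "INSERT INTO archive (dateTime, usUnits, outTemp) VALUES (?, ?, ?)", rows
    )
    secondary.commit()
    return len(rows)
```

Running it twice in a row copies nothing the second time, which is the property that makes a catchup safe to repeat after every reconnect.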


Disclaimer 1: It is still very much beta (maybe alpha) code. I have not moved it into production. I only use it to replicate to a development environment. 


Disclaimer 2: It has limited documentation, so one has to be willing to work through setting it up with limited help. This includes all that goes into setting up a MQTT broker.


Disclaimer 3: Not sure if it could replicate multiple databases into a single one. But, if you are interested in experimenting I would be happy to look into supporting this.


Disclaimer 4: All of the usual stuff.

rich

Graham Eddy

Feb 21, 2026, 7:00:24 PM
to WeeWX User
the question, as i understood it, was to present from single copy of data already existing, rather than to store multiple copies of the same data in multiple databases.
i use mqtt extensively for data acquisition, creating single truths in databases. then i have a presentation layer over the top of all to create a single (complex) portal (which has more than just weewx data).
as vince and others point out, there is more than one way to ’skin a cat’ and it depends on your requirements
⊣GE⊢

Vince Skahan

Feb 21, 2026, 7:28:44 PM
to weewx-user
The original post suggested the multiple sites might each rsync their db occasionally up to the central site, so Rich’s suggestion should be lighter if he gets it to production ready. Matthew has suggested in the past that using multiple db and one skin reading them might be less reliable than putting multi-site data into one big db, but I’ve had zero issues with multiple db and one consolidated skin for many years now. Definitely works either way.

bell...@gmail.com

Feb 21, 2026, 7:42:09 PM
to weewx-user

Here is my understanding of the problem.


There are multiple databases on different physical machines and it is desired to create data visualizations that combine them. 


Cat skin 1.

A central visualization process that can access the remote data sources.


Cat skin 2.

Replicate the databases “as is” to a central location and run the visualization off these replicas.


Cat skin 3.

Replicate the data of interest into a central database and run the visualization off this central database


Cat skin n.

I am sure there are many more options.


My gut says the first decision is whether to try to access the remote data sources from a central location (cat skin 1) or replicate the required data to the central location (cat skin 2 and 3). Then proceed from there.


Then, layered on top of that (or underneath, since we have skinned the cat?), come the various technologies that can be used.....

Tomasz Lewicki

Feb 22, 2026, 4:24:27 AM
to weewx-user
First of all, thank you for the wide response and the whole list of ideas. For now, I am at the stage of developing a concept for how to build such a network and manage it, at least in terms of configuration files. I assume that all stations will have access to high-speed internet (this will be one of the installation requirements, apart from the terrain conditions), so the amount of data transferred from the stations to the central server - if I need one - will not be a bottleneck.

I haven't dealt with MQTT, but since you say it's not difficult, all the better. I hope this protocol has the ability to announce/send to a remote server.

I have come up with another solution, although I am not sure if it duplicates any of the proposed ones - as I mentioned, I have not delved into the subject yet. If I am reinventing the wheel, please let me know. When setting up each station, I would add extra columns to its weewx.sdb database that are copies of other columns, e.g., archive_day_outTemp_stationID (where stationID is, of course, a unique key - the name of the station). Before sending the database to the server, a script would cut out only the *_stationID columns, save them to a new database, and send this truncated database to the central server. However, I see a problem here in the form of additional CPU overhead and, perhaps most importantly, an increased number of writes to the microSD card (as I wrote in my first post, I want to run WeeWX on a Raspberry Pi). So transferring the entire database, even with additional columns, and processing it in one place may be a better solution.

As for summaries, at first I didn't consider data visualization, only simple tabular summaries, e.g., maximum temperature in the entire area - station X, minimum temperature - station Y, etc. But I'm sure that over time, you'll want to have graphs and images :)
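A tabular summary of this kind does not strictly need a merged database. As a rough sketch, assuming each station's weewx.sdb has already been copied to the central server and that all stations record in the same unit system, the extremes can be read directly from the individual files. The function name and the dictionary of paths are hypothetical.

```python
import sqlite3


def area_extremes(station_dbs, start_ts, stop_ts, column="outTemp"):
    """Return ((max_value, station), (min_value, station)) across all stations.

    `station_dbs` maps a station name to the path of its copy of weewx.sdb.
    Only archive rows with start_ts < dateTime <= stop_ts are considered.
    Returns None when no station has data in that window.
    """
    if not column.isidentifier():
        raise ValueError(f"unexpected column name: {column!r}")
    highs, lows = [], []
    for station, path in station_dbs.items():
        conn = sqlite3.connect(path)
        try:
            hi, lo = conn.execute(
                f"SELECT MAX({column}), MIN({column}) FROM archive "
                "WHERE dateTime > ? AND dateTime <= ?",
                (start_ts, stop_ts),
            ).fetchone()
        finally:
            conn.close()
        if hi is not None:  # skip stations with no readings in the window
            highs.append((hi, station))
            lows.append((lo, station))
    if not highs:
        return None
    return max(highs), min(lows)
```

The result pairs each extreme with the station that reported it, which is enough to fill a table row such as "maximum temperature: station X".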

Vince Skahan

Feb 22, 2026, 3:25:22 PM
to weewx-user
You need to figure out your requirements.
  • there are over 100 possible elements in a default weewx schema
  • which measurements do you want to display and compare among your stations ?  
If you're just looking for the most typically used items (outTemp, wind, windGust, windDir, rain) then MQTT would be very very easy to set up.

> "I haven't dealt with MQTT, but [...] I hope this protocol has the ability to announce/send to a remote server."

That is EXACTLY how it works.  Each client publishes its data to the broker, and the central weewx instance subscribes to whatever topics it is configured to care about.


1. On your weewx client system, use Matthew's weewx-mqtt extension to publish weewx data to the MQTT server (aka 'mosquitto broker'):

[StdRESTful]
    [[MQTT]]
        enable = true
        client_id = mysite1                        # optional
        server_url = mqtt://192.168.1.69:1883/     # or the FQDN of the broker computer
        append_units_label = false                 # default is true; false publishes plain names such as outTemp
        binding = loop,archive                     # or archive or loop alone
        topic = mysite1                            # use a different topic per client computer
        log_success = false
        log_failure = true


(you can use Rich's MQTTpublish extension if you want - syntax is a little different but similar)


2. On the target server use Rich's MQTTsubscribe to subscribe to the items you want, from the clients you want, and save them to the weewx db to some column in your schema.

This example runs it as a service to add incoming data from MQTT to the normal data from my VP2 station, but it can run as a MQTTSubscribeDriver if you want.  If you run it as a driver on your central weewx server it will just save data coming in from whatever client systems you configure to publish to it.  You'll likely need some custom set of columns in your central weewx instance db but that's easy to add.

[MQTTSubscribeService]
    enable = true
    host = 192.168.1.69     # or the FQDN of the broker computer
    port = 1883
    keepalive = 60
    username = None
    password = None
    binding = loop

    [[message_callback]]
        type = json

    [[topics]]
        unit_system = US
        ignore_start_time = True
        ignore_end_time = True

        [[[mysite1/loop]]]            # subscribe to data published by the mysite1 weewx instance
            [[[[outTemp]]]]           # extract its outTemp item (the key would be outTemp_F if the publisher appended unit labels)
                name = extraTemp1     # save to the local db using extraTemp1
                units = degree_F

        [[[mysite2/loop]]]            # similarly for mysite2
            [[[[outTemp]]]]
                name = extraTemp2
                units = degree_F

3. And of course set up whatever custom skin with tabular data and/or graphs that you want for visualizing.


==> Experiment a little with MQTT while you think about your requirements.   If you want to learn MQTT this is very well documented online, but if you look at the Belchertown-from-scratch repo I mentioned this week there is a page that should walk you step-by-step.  See https://github.com/vinceskahan/belchertown-from-scratch/blob/main/configure-websockets-no-encryption.md and ignore the Belchertown mentions therein.  The rest shows how to install and test mosquitto MQTT enough to get a feel for it.
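To make the topic-to-column mapping in step 2 concrete, here is a small standalone sketch of the idea. It is not MQTTSubscribe's code, only an illustration: the function and dictionary names are hypothetical, and the field name `outTemp` assumes the publisher runs with append_units_label = false as in step 1 (with labels appended it would be `outTemp_F`).

```python
import json

# Hypothetical mapping mirroring the [[topics]] section:
# topic -> {field in the incoming JSON: column in the central database}
TOPIC_MAP = {
    "mysite1/loop": {"outTemp": "extraTemp1"},
    "mysite2/loop": {"outTemp": "extraTemp2"},
}


def map_message(topic, payload, topic_map=TOPIC_MAP):
    """Translate one JSON message into {central_column: value}.

    Topics that are not configured, and fields missing from the message,
    are ignored. Values are converted to float because publishers may send
    numbers as JSON strings.
    """
    fields = topic_map.get(topic)
    if fields is None:
        return {}
    data = json.loads(payload)
    return {
        column: float(data[field])
        for field, column in fields.items()
        if field in data
    }
```

For example, `map_message("mysite1/loop", '{"outTemp": "41.5"}')` yields a value for extraTemp1, while a message on an unconfigured topic yields an empty dict.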

Tomasz Lewicki

Feb 23, 2026, 3:23:38 AM
to weewx-user

Thank you all very much, especially Vince and Rich, for your tips and detailed descriptions. I will familiarize myself with the subject, first experimenting on my own installation with one or two stations, and when my project takes off (first I need to secure external funding for it), I will operate on a larger scale. 

Vince Skahan

Feb 23, 2026, 4:56:28 PM
to weewx-user
Just one more update for Tomasz here.  I did some more research and found that sqlite3 has a 'sqlite3_rsync' utility that might be worth a look.  It is very similar in setup to rsync over ssh but handles sqlite3 db specifically. See https://sqlite.org/rsync.html for details

Unfortunately it does not come with the debian sqlite3 packages I can find, so you'd need to compile from sources and install into someplace in $PATH on client and server computers (I copied mine to /usr/local/bin), but compiling it is fast and simple to do.

git clone https://github.com/sqlite/sqlite.git sqlite-sources
cd sqlite-sources
./configure                               # generate the Makefile on a fresh checkout
make sqlite3_rsync
sudo cp sqlite3_rsync /usr/local/bin      # get it into local $PATH
 
# for the remote server, scp the binary over there into the target account $PATH if it's the same arch
# or compile+install there too if the server arch differs (ie, raspi client => AWS server)

# also set up ssh for passwordless ssh
#     (run ssh-keygen locally, add .pub to authorized_keys far side or use ssh-copy-id)

Here are some examples of how to use it:
 
# example initial run for a small source db
sqlite3_rsync -v \
                       /home/vagrant/weewx-data/archive/weewx.sdb \
     vagrant@localhost:/home/vagrant/weewx-data/archive/replicated.sdb
sent 1,008,859 bytes, received 22 bytes, 3,577,592.20 bytes/sec
total size 1,007,616

# example running it again a few minutes later
sqlite3_rsync -v \
                       /home/vagrant/weewx-data/archive/weewx.sdb \
     vagrant@localhost:/home/vagrant/weewx-data/archive/replicated.sdb
sent 168,227 bytes, received 4,763 bytes, 609,119.72 bytes/sec
total size 1,007,616  speedup is 5.82

For my test I'm sqlite3_rsync(ing) from a source db to a target db on localhost within one vm since I can't remember how to set up vagrant to let me ssh between client vms. 

If it were me, I'd just set up a cron job on each source client system to run this command occasionally to catch up the remote server copy.  I do not know how well it would work for 'many' systems all trying to replicate onto the central server if all the clients trigger at once.   Having the cron job on each client trigger a few minutes apart might be wise, but maybe all of them starting up in parallel would work.  I didn't test that.
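One way to spread the clients out without coordinating them by hand is to derive each station's minute offset from its name. The sketch below only builds a crontab line; the function names, the 15-minute window, the four-times-a-day hours, and the paths in the usage note are all assumptions for illustration, not tested settings.

```python
import hashlib


def stagger_minute(station_id, window=15):
    """Deterministic minute offset in [0, window) derived from the station name."""
    digest = hashlib.sha256(station_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % window


def crontab_line(station_id, source_db, target, hours="0,6,12,18", window=15):
    """Build a crontab entry that runs sqlite3_rsync at the station's offset.

    `target` uses the same user@host:/path form as the examples above.
    """
    minute = stagger_minute(station_id, window)
    return f"{minute} {hours} * * * sqlite3_rsync {source_db} {target}"
```

Printing `crontab_line("pws01", "/path/to/weewx.sdb", "user@server:/path/to/pws01.sdb")` on each client gives a line that can be pasted into that client's crontab. Two stations can still land on the same minute, so this reduces collisions rather than ruling them out.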

For reporting, likely lots of ways to do that on the server. I suppose you could just install but not start up weewx there and instead run 'weectl report run' occasionally from cron to generate the output HTML.   You'd likely need to add additional database bindings on the server weewx.conf so the report generator would know which db to read when running the reports.  Something like the following perhaps.

[DataBindings]

    #--- the usual binding ---
    [[wx_binding]]
        database = archive_sqlite
        table_name = archive
        manager = weewx.manager.DaySummaryManager
        schema = schemas.wview_extended.schema

    #--- client1 ---
    [[client1_wx_binding]]
        database = archive_sqlite_client1
        table_name = archive
        manager = weewx.manager.DaySummaryManager
        schema = schemas.wview_extended.schema

    #--- client2 ---
    [[client2_wx_binding]]
        database = archive_sqlite_client2
        table_name = archive
        manager = weewx.manager.DaySummaryManager
        schema = schemas.wview_extended.schema

[Databases]

    #--- the usual db definitions ---

    [[archive_sqlite]]
        database_name = weewx.sdb
        database_type = SQLite

    [[archive_mysql]]
        database_name = weewx
        database_type = MySQL

    #----- remote client systems ----

    [[archive_sqlite_client1]]
        database_name = client1.sdb
        database_type = SQLite

    [[archive_sqlite_client2]]
        database_name = client2.sdb
        database_type = SQLite



Might be worth a look when you start experimenting on real systems....

bell...@gmail.com

Feb 23, 2026, 7:02:15 PM
to weewx-user

Very cool. Thanks for the find. I can see so many possibilities with this.
rich

Tomasz Lewicki

Feb 24, 2026, 4:20:28 AM
to weewx-user

Thank you so much, Vince. I agree, sqlite3_rsync is a very cool solution. I'll try adding it to one of my stations to see how it works. For now, I make copies of entire databases once a day, at night. But as the databases grow, an incremental sync with sqlite3_rsync looks like a great solution.

I think it would be worth adding this information to the WeeWX wiki.
