Extensible databases

Thomas Keffer

Feb 1, 2014, 6:47:32 PM
to weewx-de...@googlegroups.com
I've been thinking a bit about how to support new types, such as those used by weewx-WD.

If we're willing to settle for one constraint, it does not have to be too complicated: if it's in a database, then you can plot it or show statistics about it. It does not have to be in one of the two existing databases, the archive and stats database, just some database. 

If it's not, well, you'll have to come up with some clever search list extension to support it. Although there are already two "virtual" stats types in weewx (cooling and heating degree-days), allowing them to be added in a general, extensible way is complicated.

Plots

When it comes to plots, this ability is already there. 
  1. First, make sure your database is declared in section [Databases] of weewx.conf
  2. Then just specify the option archive_database for the database you want.  The default is whatever is set in the [StdArchive] stanza of weewx.conf, but it can be overridden.
Example skin.conf excerpt:

[[myspecial_widget]]
  archive_database = my_database
  [[[widgets]]]
  [[[gadgets]]]

where widgets and gadgets are columns in the database my_database.
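
The lookup described above could be sketched roughly as follows. This is only an illustration: the helper names and the file paths are invented here, and plain dictionaries stand in for the parsed weewx.conf and skin.conf. It is not taken from the weewx source.

```python
# Hypothetical sketch; none of these helpers exist in weewx under these names.

def resolve_archive_database(plot_options, default_binding):
    """Return the symbolic database name a plot should use.

    The plot's own archive_database option wins; otherwise fall back to
    the default taken from the [StdArchive] stanza.
    """
    return plot_options.get('archive_database', default_binding)


def lookup_database(config, binding):
    """Map a symbolic name to its entry in the [Databases] section."""
    try:
        return config['Databases'][binding]
    except KeyError:
        raise KeyError("database %r is not declared in [Databases]" % binding)


# Plain dicts standing in for the parsed configuration (paths are made up).
config = {
    'StdArchive': {'archive_database': 'archive_sqlite'},
    'Databases': {
        'archive_sqlite': {'database': 'archive/weewx.sdb'},
        'my_database': {'database': 'archive/my_database.sdb'},
    },
}
plot_options = {'archive_database': 'my_database'}

default = config['StdArchive']['archive_database']
binding = resolve_archive_database(plot_options, default)
db_entry = lookup_database(config, binding)
```

A plot without its own archive_database option would simply resolve to the [StdArchive] default.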

Templates

For the template tag substitution, it's more complicated. 

For current records (using a weewx-WD example, this would be a tag like $current.humidex) there are two approaches that I can think of:
  1. Pass in the record that triggered the reporting process to CheetahGenerator. This would presumably be augmented with everything you need. Although this breaks my rule "it's gotta be in the database," it's the simplest. There is a mild risk of a thread race condition if some other service changes the record while cheetahgenerator is using it.
  2. Have the code in cheetahgenerator search a list of archive databases, looking for one with humidex in its schema. This would be the most general, but the most complicated. 
Approach #2, as well as aggregated statistics such as $week.outTempDay, will require yet another modification to the search list machinery, but I think it can be done in a backwards compatible way.
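
As a rough illustration of approach #2, the schema search could look something like the sketch below. It assumes SQLite databases that each have a table named archive; the function names and the symbolic name wd_database are made up for the example.

```python
import sqlite3


def columns_of(connection, table='archive'):
    """Return the set of column names of a table in an SQLite database."""
    rows = connection.execute("PRAGMA table_info(%s)" % table).fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return {row[1] for row in rows}


def find_database_with(obs_type, databases):
    """Return (name, connection) of the first database whose archive
    table has a column called obs_type, or None if there is no hit."""
    for name, connection in databases.items():
        if obs_type in columns_of(connection):
            return name, connection
    return None


# Two in-memory databases stand in for the real archive databases.
main_db = sqlite3.connect(':memory:')
main_db.execute(
    "CREATE TABLE archive (dateTime INTEGER PRIMARY KEY, outTemp REAL)")
wd_db = sqlite3.connect(':memory:')
wd_db.execute(
    "CREATE TABLE archive (dateTime INTEGER PRIMARY KEY, humidex REAL)")

databases = {'archive_database': main_db, 'wd_database': wd_db}
hit = find_database_with('humidex', databases)
```

Here the search order is simply the order of the dictionary, so the first database that knows the type wins.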

Right now, the base class SearchList looks like this:

class SearchList(object):
  ...
  def get_extension(self, timespan, archivedb, statsdb):
    ...

we add another method:

  def get_extension_extended(self, timespan, archive_dict, stats_dict):
    ...

where archive_dict and stats_dict are Python dictionaries. Their keys are a symbolic name (such as the existing 'archive_database') and the values are an instance of weewx.archive.Archive or weewx.stats.StatsDb, respectively.

Right now, the code in weewx.stats assumes that all stats data is coming from the stats database with symbolic name stats_database. Passing in dictionaries instead would allow other databases to be included.
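
One way the backwards compatibility might work is for the base class to supply a default get_extension_extended that falls back on the old method. The sketch below assumes that; the class bodies are simplified stand-ins rather than the real weewx code, and the strings take the place of actual Archive and StatsDb instances.

```python
class SearchList(object):
    """Simplified stand-in for the search list base class."""

    def __init__(self, generator):
        self.generator = generator

    def get_extension(self, timespan, archivedb, statsdb):
        # Old-style entry point: one archive and one stats database.
        return self

    def get_extension_extended(self, timespan, archive_dict, stats_dict):
        # Assumed default: pick the two default databases by their
        # symbolic names and delegate to the old method, so existing
        # extensions keep working unchanged.
        return self.get_extension(timespan,
                                  archive_dict.get('archive_database'),
                                  stats_dict.get('stats_database'))


class LegacyExtension(SearchList):
    """An existing extension that only knows the old signature."""

    def get_extension(self, timespan, archivedb, statsdb):
        return {'legacy_archive': archivedb}


class MultiDbExtension(SearchList):
    """A new extension that wants to see every database."""

    def get_extension_extended(self, timespan, archive_dict, stats_dict):
        return {'known_archives': sorted(archive_dict)}


archive_dict = {'archive_database': 'main-archive',
                'wd_database': 'wd-archive'}
stats_dict = {'stats_database': 'main-stats'}

legacy = LegacyExtension(None).get_extension_extended(
    None, archive_dict, stats_dict)
multi = MultiDbExtension(None).get_extension_extended(
    None, archive_dict, stats_dict)
```

With this arrangement the generator would only ever call get_extension_extended, and old extensions would not need to change.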

So, how does weewx know which database to use?

In retrospect, I wish I had made the aggregation tags

$outTempDay.week.max

instead of

$week.outTempDay.max

That's because Cheetah could then just search a list of search list extensions, each bound to a different stats database, looking for a hit on "outTempDay" in a schema. 

But, all is not lost. With the latter syntax ($week.outTempDay.max), we would have to add some additional code in weewx.stats that would have the job of binding type to the appropriate stats database. It would have the dictionary stats_dict passed into it. Once it knows what the type is, it can then try each stats database in the dictionary, looking for a hit on that type. Basically, it has to reproduce what Cheetah would have done.
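
A minimal sketch of that binding step, assuming each stats database can report which types it holds. All class names here (FakeStatsDb, TimespanBinder, TypeBinder) are hypothetical and are not the classes in weewx.stats.

```python
class FakeStatsDb(object):
    """Stand-in for a stats database: known types plus canned results."""

    def __init__(self, values):
        self.values = values  # {(obs_type, aggregate): value}
        self.schema = {obs_type for obs_type, _ in values}

    def get_aggregate(self, timespan, obs_type, aggregate):
        return self.values[(obs_type, aggregate)]


class TypeBinder(object):
    """Bound to a timespan, a type, and the database that holds the type."""

    def __init__(self, timespan, stats_db, obs_type):
        self.timespan = timespan
        self.stats_db = stats_db
        self.obs_type = obs_type

    def __getattr__(self, aggregate):
        if aggregate.startswith('__'):
            raise AttributeError(aggregate)
        return self.stats_db.get_aggregate(
            self.timespan, self.obs_type, aggregate)


class TimespanBinder(object):
    """Plays the role of $week: knows the timespan, not yet the type."""

    def __init__(self, timespan, stats_dict):
        self.timespan = timespan
        self.stats_dict = stats_dict

    def __getattr__(self, obs_type):
        # Try each stats database in turn, looking for a hit on the type.
        for stats_db in self.stats_dict.values():
            if obs_type in stats_db.schema:
                return TypeBinder(self.timespan, stats_db, obs_type)
        raise AttributeError(obs_type)


stats_dict = {
    'stats_database': FakeStatsDb({('outTemp', 'max'): 31.5}),
    'wd_stats': FakeStatsDb({('outTempDay', 'max'): 29.0}),
}
week = TimespanBinder('this week', stats_dict)
result = week.outTempDay.max  # resolved against 'wd_stats'
```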

Hope this all makes sense.

-tk



mwall

Feb 1, 2014, 6:57:37 PM
to weewx-de...@googlegroups.com
On Saturday, February 1, 2014 6:47:32 PM UTC-5, Thomas Keffer wrote:
Example skin.conf excerpt:

[[myspecial_widget]]
  archive_database = my_database
  [[[widgets]]]
  [[[gadgets]]]

where widgets and gadgets are columns in the database my_database.

there is one small change that would make this much more powerful - defer the binding one more level so that you can do this:

[[myspecial_widget]]
    [[[widgets]]]
        archive_database = my_database
    [[[outTemp]]]

this would make comparisons *much* easier.
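
A rough sketch of what deferring the binding one more level might look like: each line looks for its own archive_database option, then the plot's, then the default. The function name is made up and plain dictionaries stand in for the parsed skin.conf sections.

```python
def binding_for_line(line_options, plot_options, default_binding):
    """Pick the database for one plot line: line first, then plot, then
    the default binding."""
    for scope in (line_options, plot_options):
        if 'archive_database' in scope:
            return scope['archive_database']
    return default_binding


# The plot from the example above: 'widgets' overrides, 'outTemp' does not.
plot_level_options = {}
lines = {
    'widgets': {'archive_database': 'my_database'},
    'outTemp': {},
}

bindings = {
    name: binding_for_line(options, plot_level_options, 'archive_database')
    for name, options in lines.items()
}
```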

would doing this cause any problems if the times are not aligned between databases?

m

Thomas Keffer

Feb 1, 2014, 7:01:17 PM
to mwall, weewx-de...@googlegroups.com
As I recall, the reason I didn't do that is that you have to know the time domain before you can do the database query. That requires hitting the database to get the last timestamp. Or, something like that.

-tk

mwall

Feb 1, 2014, 7:06:18 PM
to weewx-de...@googlegroups.com
On Saturday, February 1, 2014 6:47:32 PM UTC-5, Thomas Keffer wrote:
In retrospect, I wish I had made the aggregation tags

$outTempDay.week.max

instead of

$week.outTempDay.max

That's because Cheetah could then just search a list of search list extensions, each bound to a different stats database, looking for a hit on "outTempDay" in a schema. 

But, all is not lost. With the latter syntax ($week.outTempDay.max), we would have to add some additional code in weewx.stats that would have the job of binding type to the appropriate stats database. It would have the dictionary stats_dict passed into it. Once it knows what the type is, it can then try each stats database in the dictionary, looking for a hit on that type. Basically, it has to reproduce what Cheetah would have done.

this has always been a burr in my saddle - the first form is more intuitive for me since each word after a dot does some kind of specialization to the previous.

we could go with the first approach, and simply special case anything that starts with 'day', 'week', 'month' or 'year'.  that form would be deprecated in favor of the first form.

m

Thomas Keffer

Feb 1, 2014, 7:11:05 PM
to mwall, weewx-de...@googlegroups.com
That's one way of looking at it. 

But, if you look at it from the point of view of time, it's not. Out of the universe of all possible "current" observations, you specialize on outTempDay. And so on. 

In any case, this is minor stuff. I'm more interested in reactions to the rest of the proposal.

-tk

Oz Greg

Feb 1, 2014, 7:30:10 PM
to weewx-de...@googlegroups.com, mwall
Tom,

Just out of interest, how much of a change would it be to move to $outTempDay.week.max?

We would all take a hit changing our skins, of course.

Oz Greg

Feb 1, 2014, 7:44:53 PM
to weewx-de...@googlegroups.com
Tom,

I can see option 1 happening, as technically clientraw.txt (the base data file of WD) should be bound to the Davis loop rather than the archive. However, on a Pi unit the Cheetah engine generates the skin in roughly 3 seconds and we get a Davis loop every 2 seconds, so even with a stale tag our little Pi boxes are going to feel the pain.

My gut tells me we would be better off taking the hit and switching to $outTempDay.week.max, but I do not know what pain that would cause you.

Thomas Keffer

Feb 1, 2014, 7:59:09 PM
to weewx-de...@googlegroups.com
Being the kind of guy who eats dinner before dessert, that's what my gut was telling me too (well, maybe it was actually warning me about too many chips during tomorrow's Super Bowl).

But, it turns out to be wrong. The problem with putting the type first and the time period second,

$outTempDay.week.max
$outTempDay.current

is that you can't tell whether to bind to the stats database (first line) or the archive database (second line) until you've seen whether an aggregation is expected. So, you end up having to write extra code anyway.

Might as well keep things backwards compatible and stick with what we have.

Question, Greg: Is there anything in weewx-WD that can't be done through a database? Does my rule, "It's gotta be in the database," cause any pain?

Same question to Matthew about forecast, cmon, etc.

-tk



Oz Greg

Feb 1, 2014, 8:16:09 PM
to weewx-de...@googlegroups.com
Tom, just as an FYI, we tend to eat our salads with the main meal, not before it. :-)

Our pain is in the breadth of the extra stats (excluding the two extra measurements we need in the archive), and I just discovered we are going to need at least two additional stats, taking the additional custom stats up to 7 or 8 now.

WD is a beast. I would never have started this project knowing what I know now of its complexity...

Thomas Keffer

Feb 1, 2014, 8:23:01 PM
to Oz Greg, weewx-de...@googlegroups.com
... tell me about it. :-(

So long as all the additional stats are in their own database, it shouldn't be too bad.

So, can I take it that everything you need can be obtained from the databases?

-tk

gjr80

Feb 1, 2014, 8:30:01 PM
to weewx-de...@googlegroups.com, Oz Greg

Greg is right, the breadth of extras is large indeed. The philosophy taken with weewx-WD, though, was that where we needed a 'substantial number of stats' on an ob, we put it in either archive or stats so that we had the power of $current., $year.humidex.max etc. The one-offs, e.g. dateTime of last rain, we just handle through (a large number of) SLEs. For my 5c worth, while SLEs exist, weewx-WD would cope just fine with "It's gotta be in the database". Plots are interesting: weewx-WD has not really touched this. Plotting humidex and outTemp on the same plot would be ideal, but I think we have some ideas of a way around this for weewx-WD.

Gary

Oz Greg

Feb 1, 2014, 8:30:22 PM
to weewx-de...@googlegroups.com, Oz Greg
It should be, yes; otherwise we would add it as an additional stat and include it in the DB if required.

mwall

Feb 1, 2014, 8:30:25 PM
to weewx-de...@googlegroups.com
On Saturday, February 1, 2014 7:59:09 PM UTC-5, Thomas Keffer wrote:
Question, Greg: Is there anything in weewx-WD that can't be done through a database? Does my rule, "It's gotta be in the database," cause any pain?

cmon is straightforward.  it saves data to an 'archive' database.  data is plotted using the archive_database binding mechanism in the imagegenerator.  however, there is no 'stats' database for cmon.  if someone wanted to create a cmon search list extension, they would have to reimplement all of the stats aggregation.  it is pretty easy, as illustrated in the extstats example, but it would be even easier if all of the basic types (day, week, month, year, alltime, 7day, 30day) were set up to apply to any new data types.  (maybe they are and i just don't know that part of weewx yet?)
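
a rough sketch of the kind of generic aggregation this is asking for, computed straight from any 'archive' style table. it assumes sqlite and a dateTime column; the function name, the AGGREGATES table and the cpu_temp column are made up for illustration and are not how weewx.stats does it.

```python
import sqlite3

# Aggregate names allowed in a tag, mapped to SQL functions.
AGGREGATES = {'min': 'MIN', 'max': 'MAX', 'avg': 'AVG', 'sum': 'SUM'}


def aggregate(connection, obs_type, aggregate_type, start_ts, stop_ts,
              table='archive'):
    """Aggregate one column over the interval (start_ts, stop_ts]."""
    columns = {row[1] for row in
               connection.execute("PRAGMA table_info(%s)" % table)}
    # Only interpolate names that really are columns of the table.
    if obs_type not in columns:
        raise ValueError("unknown type %r" % obs_type)
    sql = "SELECT %s(%s) FROM %s WHERE dateTime > ? AND dateTime <= ?" % (
        AGGREGATES[aggregate_type], obs_type, table)
    return connection.execute(sql, (start_ts, stop_ts)).fetchone()[0]


# An in-memory database standing in for a cmon-style archive.
db = sqlite3.connect(':memory:')
db.execute(
    "CREATE TABLE archive (dateTime INTEGER PRIMARY KEY, cpu_temp REAL)")
db.executemany("INSERT INTO archive VALUES (?, ?)",
               [(300, 41.0), (600, 45.5), (900, 43.0), (1200, 50.0)])

peak = aggregate(db, 'cpu_temp', 'max', 0, 900)
```

day, week, month and so on would then just be different (start_ts, stop_ts) pairs fed to the same function.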

forecast uses a completely different type of database.  the data are not directly plottable, so imagegenerator cannot use the forecast database.  timestamps are not unique keys in the forecast database.  there is some aggregation that happens in the ForecastVariables search list extensions - hourly forecast data are aggregated into daily summaries.  creating and manipulating ValueHelper objects could be easier (or i need some education).

one other scenario is comparing data sets.  i have been using emoncms and other tools for this because i cannot combine data from two different databases into a single plot in weewx.

m

Oz Greg

Feb 1, 2014, 8:49:42 PM
to weewx-de...@googlegroups.com, Oz Greg
Gary raises an interesting point. Because WD outputs all its data in a file, we use a translate program to push data to jpgraphs, thus keeping the load of graphs off our very small Pi boxes.

gjr80

Feb 1, 2014, 8:55:11 PM
to weewx-de...@googlegroups.com, Oz Greg
Yes, this is the advantage of using the built-in graphing in WDL and jpGraphs to handle graphing in php to support Saratoga/Carter Lake templates. Your standard weewx site doesn't have such a luxury though.

Thomas Keffer

Feb 1, 2014, 9:03:40 PM
to gjr80, weewx-de...@googlegroups.com, Oz Greg
Thanks, all. This is what I needed.

Re: graphing. Yes, I agree. It seems silly to generate the graphs in weewx and then FTP them to a webserver. Indeed, the graphing should really be done on the client. I've thought about just including the data in the web page and then using HighCharts or something similar to do the graphing.

But, that's for another day!

-tk

Thomas Keffer

Feb 2, 2014, 11:01:54 AM
to weewx-de...@googlegroups.com
One other thought: there is an assumption here that observation types are unique across all databases. That is, a type like outTemp or outTempDay would appear in only one database.

Is this a problem?

If a type appears in more than one database I suppose we could disambiguate with a syntax such as:

$day(database="weewx-wx").outTempDay.max
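
A sketch of how such a call syntax might be supported: the object behind $day is made callable and returns a copy of itself narrowed to the named database. The class is hypothetical, sets of type names stand in for real stats databases, and 'other-db' is an invented name.

```python
class TimespanBinder(object):
    """Hypothetical object behind $day, optionally narrowed to one database."""

    def __init__(self, timespan, stats_dict, database=None):
        self.timespan = timespan
        self.stats_dict = stats_dict  # {symbolic name: set of types}
        self.database = database

    def __call__(self, database=None):
        # Supports $day(database="...") by returning a narrowed copy.
        return TimespanBinder(self.timespan, self.stats_dict, database)

    def candidates(self, obs_type):
        """Names of the databases that could supply obs_type."""
        if self.database is not None:
            names = [self.database]
        else:
            names = list(self.stats_dict)
        return [name for name in names if obs_type in self.stats_dict[name]]


stats_dict = {
    'weewx-wx': {'outTemp', 'outTempDay'},
    'other-db': {'outTemp'},
}
day = TimespanBinder('today', stats_dict)

ambiguous = day.candidates('outTemp')                    # two hits
narrowed = day(database='other-db').candidates('outTemp')  # one hit
```

Without the call, the tag would behave as it does today, so existing templates would be unaffected.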

-tk


mwall

Feb 2, 2014, 11:33:59 AM
to weewx-de...@googlegroups.com
On Sunday, February 2, 2014 11:01:54 AM UTC-5, Thomas Keffer wrote:
One other thought: there is an assumption here that observation types are unique across all databases. That is, a type like outTemp or outTempDay would appear in only one database.

Is this a problem?

If a type appears in more than one database I suppose we could disambiguate with a syntax such as:

$day(database="weewx-wx").outTempDay.max

-tk


we should plan on a syntax to disambiguate. 

for example, run two weewx instances, one with a weather station driver feeding data into a standard weewx database, a second with a one-wire sensor array feeding data into a second standard (or fairly standard) weewx database, then run wee_reports on a skin that combines data from the two.

another example is using one instance of weewx to collect data from a weather station, with the owfss service collecting data from one-wire sensors into a separate database, and a skin that combines data from the two databases.  it is quite possible the names in the owfss database overlap those in the weewx database.

also, the syntax should be capable of specifying not only database, but also table.  (my impression is that the use/consistency of database/table is a bit rough around the edges right now).

m

Thomas Keffer

Feb 2, 2014, 11:39:27 AM
to mwall, weewx-de...@googlegroups.com
Those are good examples. OK, we'll need to disambiguate.

Regarding being able to specify not only the database, but the table as well, that would only apply to the archive database. The stats database already consists of N+1 tables where N is the number of observation types. 

We could adopt a syntax similar to what MySQL uses: 'database.table', where the 'database' part would be the symbolic name of the database (e.g., 'archive_database') and 'table' is the name of the table within database.

-tk

mwall

Feb 2, 2014, 11:51:07 AM
to weewx-de...@googlegroups.com, mwall
On Sunday, February 2, 2014 11:39:27 AM UTC-5, Thomas Keffer wrote:
We could adopt a syntax similar to what MySQL uses: 'database.table', where the 'database' part would be the symbolic name of the database (e.g., 'archive_database') and 'table' is the name of the table within database.

if table is not specified, then default to table name of 'archive', since that is what is done everywhere else in weewx:

archive_database -> table 'archive' in database defined by archive_database
archive_database.archive -> table 'archive' in database defined by archive_database
archive_database.stations -> table 'stations' in database defined by archive_database
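
the three cases above could be handled by something as small as this (the function name is made up):

```python
def split_binding(spec, default_table='archive'):
    """Split 'database.table' into (database, table).

    If no table is given, fall back to the default table name.
    """
    database, _, table = spec.partition('.')
    return database, table or default_table


examples = [split_binding(spec) for spec in (
    'archive_database',
    'archive_database.archive',
    'archive_database.stations',
)]
```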

Thomas Keffer

Feb 2, 2014, 11:53:14 AM
to mwall, weewx-de...@googlegroups.com
Exactly.

Let's get v2.6 out first, then I'll start working on this. 

-tk