CSV extension writing bad data


Dave

Oct 14, 2019, 9:35:50 PM
to weewx-user
I recently upgraded weewx to 3.9.2. Now it appears the CSV extension is writing bad data. I like the fact that it stops writing the key line every time but now the data doesn't match the key. 

For those who might need a reminder, the key line looks like this:
# dateTime,altimeter,appTemp,barometer,channel,cloudbase,dewpoint,heatindex,humidex,inTemp,maxSolarRad,outHumidity,outTemp,outTempBatteryStatus,pressure,rainRate,rssi,rxCheckPercent,sensor_battery,sensor_id,usUnits,windSpeed,windchill

Next you get lines of data like so:

1571077878,75.1909422213,None,3733.59877061,57.2421654093,72.9,78.9824045747,None,58,72.9,0,0,3,100.0,0,1228,1,None,0.0,72.9
1571077931,29.9314717722,None,76.19,None,0,29.7514624062,None,0,110.4646,3,100.0,0,1228,1,180.0,3.70760684504
1571077949,74.1763675742,None,3850.99831681,57.225607406,73.4,79.472688903,None,57,73.4,0,0,3,100.0,0,1228,1,2.67886214224,73.4


Astute readers will note that "outTemp" is 3 degrees. This is Southern California. ;)

I've no idea how to begin to debug this. A wee_debug info is attached. Thanks in advance for any assistance. 
info.txt

Pat

Oct 14, 2019, 9:44:26 PM
to weewx-user
I've never played with this extension, but I just installed it and copied your weewx.conf [CSV] settings. It appears to be working fine for me on 3.9.2 - my outTemp field is correct (see below). It's 49F outside and that's what's in the CSV.

Is there anything in your syslog that would point to an error? Perhaps set debug = 1 in weewx.conf, restart and let it run for a few archive intervals and see if anything comes up. 

Are you running another skin to make sure that weewx is seeing your outTemp correctly? Ruling out a bad sensor?

What if you run weewxd manually and check that the LOOP packets are what you'd expect them to be?


# dateTime,maxSolarRad,outTemp,rainRate,txBatteryStatus,usUnits,windDir,windSpeed,windchill
1571103610,0.0,49.4,0,0,1,None,0.0,49.4
1571103612,0.0,0.0,0,0,1,None,0.0
1571103615,0.0,0.0,0,1,None,0.0
1571103617,0.0,0.0,0,0,1,None,0.0
1571103620,0.0,49.4,0,0,1,None,0.0,49.4
1571103625,0.0,0.0,0,1,None,0.0
1571103627,30.0690643936,47.1460416382,44.15,69.98,0.0,29.6138526645,0,1

Dave

Oct 15, 2019, 11:19:24 AM
to weewx-user
You'll note that debug is set to 1 in the weewx.conf I posted. So I have a parsing program that monitors the CSV file. Here's output from manual weewx:

LOOP:   2019-10-15 08:09:56 PDT (1571152196) altimeter: 29.9926902844, channel: None, dateTime: 1571152196, inTemp: 77.0, maxSolarRad: None, outTempBatteryStatus: 0, pressure: 29.8123680055, rain: None, rain_total: 110.4646, rainRate: 0, rssi: 3, rxCheckPercent: 100.0, sensor_battery: 0, sensor_id: 1228, usUnits: 1, windDir: None, windSpeed: 0.0
LOOP:   2019-10-15 08:10:14 PDT (1571152214) appTemp: 54.9009717276, channel: None, cloudbase: 428.468658946, dateTime: 1571152214, dewpoint: 52.8847379006, heatindex: 54.0, humidex: 57.6993884477, maxSolarRad: None, outHumidity: 96, outTemp: 54.0, outTempBatteryStatus: 0, rainRate: 0, rssi: 3, rxCheckPercent: 100.0, sensor_battery: 0, sensor_id: 1228, usUnits: 1, windchill: 54.0, windDir: None, windSpeed: 0.0

Here's raw CSV data corresponding to those LOOP entries:
1571152196,29.9926902844,None,77.0,None,0,29.8123680055,None,0,110.4646,3,100.0,0,1228,1,None,0.0
1571152214,54.9009717276,None,428.468658946,52.8847379006,54.0,57.6993884477,None,96,54.0,0,0,3,100.0,0,1228,1,None,0.0,54.0

If I parse this:
Tue Oct 15 08:09:56 2019 altimeter:29.9926902844 barometer:77.0 dewpoint:29.8123680055 inTemp:110.4646 maxSolarRad:3 outHumidity:100.0 outTempBatteryStatus:1228 pressure:1 rssi:0.0
 
Tue Oct 15 08:10:14 2019 altimeter:54.9009717276 barometer:428.468658946 channel:52.8847379006 cloudbase:54.0 dewpoint:57.6993884477 humidex:96 inTemp:54.0 outTemp:3 outTempBatteryStatus:100.0 rainRate:1228 rssi:1 sensor_battery:0.0 sensor_id:54.0


I parse according to the first line in the file:
# dateTime,altimeter,appTemp,barometer,channel,cloudbase,dewpoint,heatindex,humidex,inTemp,maxSolarRad,outHumidity,outTemp,outTempBatteryStatus,pressure,rainRate,rssi,rxCheckPercent,sensor_battery,sensor_id,usUnits,windSpeed,windchill



Pat

Oct 15, 2019, 12:39:42 PM
to weewx-user
I saw the debug, but didn't see any actual helpful debug output. That's why I asked for a syslog.

Also, I am assuming you're on the latest CSV extension version, but I shouldn't assume anything. Are you on the latest CSV extension version?

Pat

Oct 15, 2019, 1:26:03 PM
to weewx-user
Answered my own question. The debug shows it's 0.10, which looks like the latest available.

Sort of running out of options on why yours is acting differently than mine. Hopefully syslog has a clue or two. 

Dave

Oct 15, 2019, 1:50:48 PM
to weewx-user


On Tuesday, October 15, 2019 at 9:39:42 AM UTC-7, Pat wrote:
I saw the debug, but didn't see any actual helpful debug output. That's why I asked for a syslog.

Oct 15 08:09:55 myserver weewx[18438]: engine: Debug is 1
Oct 15 08:09:55 myserver weewx[18438]: engine: Initializing engine
Oct 15 08:09:55 myserver weewx[18438]: engine: Loading service weewx.engine.StdTimeSynch
Oct 15 08:09:55 myserver weewx[18438]: engine: Finished loading service weewx.engine.StdTimeSynch
Oct 15 08:09:55 myserver weewx[18438]: engine: Loading service weewx.engine.StdConvert
Oct 15 08:09:55 myserver weewx[18438]: engine: Finished loading service weewx.engine.StdConvert
Oct 15 08:09:55 myserver weewx[18438]: engine: Loading service weewx.engine.StdCalibrate
Oct 15 08:09:55 myserver weewx[18438]: engine: Finished loading service weewx.engine.StdCalibrate
Oct 15 08:09:55 myserver weewx[18438]: engine: Loading service weewx.engine.StdQC
Oct 15 08:09:55 myserver weewx[18438]: engine: Finished loading service weewx.engine.StdQC
Oct 15 08:09:55 myserver weewx[18438]: engine: Loading service weewx.wxservices.StdWXCalculate
Oct 15 08:09:55 myserver weewx[18438]: engine: Finished loading service weewx.wxservices.StdWXCalculate
Oct 15 08:09:55 myserver weewx[18438]: engine: Loading service user.csv.CSV
Oct 15 08:09:55 myserver weewx[18438]: engine: Finished loading service user.csv.CSV
Oct 15 08:09:55 myserver weewx[18438]: engine: Loading service weewx.engine.StdArchive
Oct 15 08:09:55 myserver weewx[18438]: engine: Use LOOP data in hi/low calculations: 1
Oct 15 08:09:55 myserver weewx[18438]: manager: Daily summary version is 2.0
Oct 15 08:09:55 myserver weewx[18438]: engine: Finished loading service weewx.engine.StdArchive
Oct 15 08:09:55 myserver weewx[18438]: engine: Loading service weewx.engine.StdPrint
Oct 15 08:09:55 myserver weewx[18438]: engine: Finished loading service weewx.engine.StdPrint
Oct 15 08:09:55 myserver weewx[18438]: engine: Loading service weewx.engine.StdReport
Oct 15 08:09:55 myserver weewx[18438]: engine: Finished loading service weewx.engine.StdReport
Oct 15 08:09:55 myserver weewx[18438]: engine: Station does not support reading the time
Oct 15 08:09:56 myserver weewx[18438]: acurite: Found station at bus= device=
Oct 15 08:09:56 myserver weewx[18438]: acurite: next read in 18 seconds
Oct 15 08:10:14 myserver weewx[18438]: acurite: Found station at bus= device=
Oct 15 08:10:14 myserver weewx[18438]: acurite: next read in 18 seconds


I don't think this is helpful, which is why I didn't post it. Nevertheless, here it is. Hopefully I'm wrong. :)

Pat

Oct 15, 2019, 7:28:49 PM
to weewx-user
You're right. Not much here. What are your other skins (like Standard or Seasons) showing? Do they seem accurate?

Dave

Oct 15, 2019, 8:26:47 PM
to weewx-user
Yes, I installed a standard report and that is accurate.

At this point, I am tempted to run weewx inside of Perl and process LOOP statements directly. :)

However, the point of all this is to get the information into a nagios monitor. I tried the graphite back end but that didn't seem to work. Is there a way to get the effect of writing LOOP data to a file so that it can be processed externally and asynchronously?

gjr80

Oct 16, 2019, 2:39:03 AM
to weewx-user
Hi,

The problem you are experiencing is due to an incompatibility between your station emitting what are known as partial loop packets (i.e., not all obs are included in all loop packets) and the mode in which you are operating the csv extension.

You are running the CSV extension such that it emits loop data to file and includes a header line. Subsequent loop packets are appended to the file. The header line is only written (1) when the csv output file is first created or (2) on each loop packet, but only if mode = 'overwrite'. In your case you are using 'append' mode, so the header line is only written when the csv output file is first created. Think now of the situation where the first loop packet has, say, observation outTemp but the next loop packet does not. The header line is written and includes 'outTemp' in the relevant place. The line of csv data for that packet is written to file and each field matches the header line. When the next loop packet is processed there is no outTemp field in the loop packet, so nothing is written in outTemp's place in the line of csv data (in fact something is written there: the next field, if there is one). Consequently, when your parser parses that line it expects to see outTemp data when in fact it sees the next value instead (perhaps outTempBatteryStatus), and hence you get nonsense values from your parser.
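The misalignment can be reproduced in a few lines of Python. This is only a minimal sketch, not code from the csv extension: the packets are shortened, made-up examples, and `row_for` is a hypothetical helper that mimics writing only the fields a packet actually contains, in sorted key order.

```python
# Header written once, from the first (complete) packet.
first_packet = {"dateTime": 1571152214, "outTemp": 54.0,
                "outTempBatteryStatus": 0, "rssi": 3}
header = sorted(first_packet)  # column names fixed when the file is created


def row_for(packet):
    """Hypothetical helper: emit only the values present, in sorted key order."""
    return [packet[k] for k in sorted(packet)]


# A later partial packet that carries no outTemp.
partial_packet = {"dateTime": 1571152196, "outTempBatteryStatus": 0, "rssi": 3}

# A parser that trusts the header pairs names and values by position.
parsed = dict(zip(header, row_for(partial_packet)))
print(parsed)
```

Here the battery status lands under `outTemp` and the `rssi` value lands under `outTempBatteryStatus`, which is the same kind of shift seen in the parsed output earlier in the thread.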

Solutions, there are a few possibilities:

1. use 'overwrite' mode instead of 'append'; in this mode the header line is written each time a packet is processed, but there is only ever one line of data in your csv output file
2. delete the csv output file after each loop packet is read by your parser (this is a pretty lame solution)
3. rewrite the csv extension to cache loop packet data so that a 'complete' loop packet is written each time
4. emit archive record data rather than loop packet data

Gary

Dave

Oct 16, 2019, 12:03:41 PM
to weewx-user
On Tuesday, October 15, 2019 at 11:39:03 PM UTC-7, gjr80 wrote:
Hi,

The problem you are experiencing is due to an incompatibility between your station emitting what are known as partial loop packets (i.e., not all obs are included in all loop packets) and the mode in which you are operating the csv extension.

You are running the CSV extension such that it emits loop data to file and includes a header line. Subsequent loop packets are appended to the file. The header line is only written (1) when the csv output file is first created or (2) on each loop packet, but only if mode = 'overwrite'. In your case you are using 'append' mode, so the header line is only written when the csv output file is first created. Think now of the situation where the first loop packet has, say, observation outTemp but the next loop packet does not. The header line is written and includes 'outTemp' in the relevant place. The line of csv data for that packet is written to file and each field matches the header line. When the next loop packet is processed there is no outTemp field in the loop packet, so nothing is written in outTemp's place in the line of csv data (in fact something is written there: the next field, if there is one). Consequently, when your parser parses that line it expects to see outTemp data when in fact it sees the next value instead (perhaps outTempBatteryStatus), and hence you get nonsense values from your parser.

Thank you very much for this explanation. I am familiar with this idea, but it seems the behavior changed between 3.5.0 and 3.9.2. It used to write a header line before each data line when mode was 'append'. Since part of the upgrade was moving weewx to another server, I still have the old server's data and binaries available. If you look in the weewx.csv for the 3.5.0 server:

dateTime,altimeter,appTemp,barometer,channel,cloudbase,dewpoint,heatindex,humidex,inDewpoint,maxSolarRad,outHumidity,outTemp,rainRate,rssi,rxCheckPercent,sensor_battery,sensor_id,txTempBatteryStatus,usUnits,windSpeed,windchill
1571075730,None,71.4621394073,None,None,3276.5581345,56.6531442082,70.3,76.0399971396,None,None,62,70.3,0,3,100.0,0,1228,0,1,1.65011333748,70.3
dateTime,altimeter,appTemp,barometer,channel,cloudbase,dewpoint,heatindex,humidex,inDewpoint,maxSolarRad,rain,rainRate,rain_total,rssi,rxCheckPercent,sensor_battery,sensor_id,txTempBatteryStatus,usUnits,windDir,windSpeed,windchill
1571075748,None,None,None,None,None,None,None,None,None,None,0.0,0,110.4646,3,100.0,0,1228,0,1,180.0,2.16448441021,None
dateTime,altimeter,appTemp,barometer,channel,cloudbase,dewpoint,heatindex,humidex,inDewpoint,inTemp,maxSolarRad,outHumidity,outTemp,pressure,rainRate,rssi,rxCheckPercent,sensor_battery,sensor_id,txTempBatteryStatus,usUnits,windSpeed,windchill
1571075766,29.9423211803,71.4621394073,29.9470837144,None,3276.5581345,56.6531442082,70.3,76.0399971396,None,74.39,None,62,70.3,29.7622563497,0,3,100.0,0,1228,0,1,1.65011333748,70.3


...it's writing a header line each time, even in 'append' mode. My config for that server was:

[CSV]
    filename = /var/db/weewx/weewx.csv
    header = true


So it's clear that something changed to exhibit the behavior you describe above instead of the old behavior.
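For reference, a file in that older layout (a header line before every data line) can be parsed without any positional guessing. The following is a minimal sketch under assumptions: `parse_headered_csv` is a hypothetical name, the sample lines are shortened from the output above, and a line is treated as a header simply because its first field is `dateTime`.

```python
def parse_headered_csv(lines):
    """Pair each header line with the data line(s) that follow it."""
    records = []
    keys = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Tolerate both "dateTime,..." and "# dateTime,..." header styles.
        fields = line.lstrip("# ").split(",")
        if fields[0] == "dateTime":
            keys = fields
        elif keys is not None and len(fields) == len(keys):
            # Values stay strings, except the literal "None" becomes None.
            records.append({k: (None if v == "None" else v)
                            for k, v in zip(keys, fields)})
        # Rows whose field count does not match the last header are skipped.
    return records


sample = [
    "dateTime,outTemp,usUnits",
    "1571075730,70.3,1",
    "dateTime,rain,usUnits,windDir",
    "1571075748,0.0,1,180.0",
]
records = parse_headered_csv(sample)
print(records)
```

Because each record carries its own keys, a packet without `outTemp` simply has no `outTemp` entry rather than a shifted value.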


Solutions, there are a few possibilities:

1. use 'overwrite' mode instead of 'append'; in this mode the header line is written each time a packet is processed, but there is only ever one line of data in your csv output file
2. delete the csv output file after each loop packet is read by your parser (this is a pretty lame solution)
3. rewrite the csv extension to cache loop packet data so that a 'complete' loop packet is written each time
4. emit archive record data rather than loop packet data

IMO option 3 provides the best solution as there's zero way to parse a partial packet without keys for the values. ;) This would also restore my nagios plugin's proper functionality. Unfortunately, I've too little experience in Python to attempt 3 in a public software repo. 

Those other solutions, for various reasons, are non-optimal. 1 and 2 will suffer from the lack of caching of data. 4 makes some sense, but for this comment:

    # If the station hardware supports data logging then the archive interval
    # will be downloaded from the station. Otherwise, specify it (in seconds).
    archive_interval = 300


I have a need for immediate data. Not knowing what my archive interval will be is a deal breaker. 

Sadly, it seems I will have to use weewx within my own script and parse its stdout myself. Thanks for all the insight and responses. :D

Pat

Oct 16, 2019, 5:44:19 PM
to weewx-user
It's probably not too bad to fork the extension and add the CachedValues method as seen in the restx extension (read: copy/paste). Once tested and working this should satisfy option 3.

Other important information on CachedValues can be found here, and here. You can see that on every loop it updates the cached dictionary of values with the update() function. Then, with the get_packet() function, it outputs the observations that just came in plus the missing ones from the cached values.
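To give a feel for the idea, here is a minimal sketch of such a cache. It is not the restx code: `PacketCache` is a hypothetical class written for illustration, it only borrows the update()/get_packet() naming described above, and the `stale_age` of 600 seconds is an arbitrary assumption.

```python
class PacketCache:
    """Hypothetical cache: remembers the last value seen for each observation."""

    def __init__(self, stale_age=600):
        self.stale_age = stale_age  # seconds after which a cached value is dropped
        self.values = {}            # observation name -> (value, timestamp)

    def update(self, packet):
        """Store every observation in the packet with the packet's timestamp."""
        ts = packet["dateTime"]
        for name, value in packet.items():
            if name != "dateTime":
                self.values[name] = (value, ts)

    def get_packet(self, ts):
        """Build a 'complete' packet from all values that are not stale at ts."""
        merged = {"dateTime": ts}
        for name, (value, seen) in self.values.items():
            if ts - seen <= self.stale_age:
                merged[name] = value
        return merged


cache = PacketCache()
cache.update({"dateTime": 100, "outTemp": 54.0, "usUnits": 1})
cache.update({"dateTime": 118, "pressure": 29.81, "usUnits": 1})
full = cache.get_packet(118)
print(full)
```

A forked csv extension could call something like update() on each loop packet and write the result of get_packet() instead of the raw packet, so every line has the same set of fields once each observation has been seen at least once.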

gjr80

Oct 16, 2019, 7:44:53 PM
to weewx-user
Can't comment on what changed on your install from 3.5.0 to 3.9.2. The only WeeWX code that could affect the csv output would be the AcuRite driver, and I see there have been no changes to this driver since the 3.5.0 release that could account for the changed behaviour. Perhaps you are using a different csv extension version. The logic for creating the csv output is wholly within the csv extension.

In any case, given the old sample output below you could mimic that output by making a simple alteration to the csv extension code. In csv.py you could change (untested):

        header = None
        if self.emit_header and (
            not os.path.exists(filename) or flag == "w"):
            header = '# %s\n' % ','.join(self.sort_keys(data))

to

        header = '# %s\n' % ','.join(self.sort_keys(data))

Be aware that this change will mean that the header is always written before each loop packet is written; none of the config options in weewx.conf will change this. Of course, the other issue is that you now have an orphan version of the csv extension, and any subsequent upgrades/installs of the extension will overwrite your changes.

Gary