Conversion from VWS to weewx

jmltech

Aug 1, 2016, 1:16:16 PM
to weewx-user
I'm planning on converting my VWS data (from 2007 to present) into weewx.  I plan on writing (modifying) a python script to read all of my daily VWS files and add each record to the archive table.  I'm using cvsConvert (by Tom Keffer, and updated by wbphelps and Jim Myers) as my inspiration.  For some reason, my VWS CSV files only have data from 2015 on.  But I do have every daily text file from 2007 onwards (almost 3500 files), so I would like to capture all those records in the archive table.  A couple of questions:

1) The VWS data does not have dewpoint, heatindex or windgust10 values.  Do I need to add these values to the archive table as I'm inputting records?

2) I have tested my installation using the simulator, then switched over to test the davis driver.  All seems to be working.  After a few minutes, I switched the VWS data logger back to VWS so that I wouldn't miss any records.  If I start with a new archive table (using sql to delete current test records) will there be any issues after I update the table with all of my VWS archived records?

3) The VWS had an archive rate of 15 minutes.  Can I switch to 5 minutes once I switch over to weewx, or will this cause problems with historical averages?

4) is storing this many records in an sqlite table okay?

5) any other gotchas in doing a conversion to weewx?

Thanks all
Joe

Thomas Keffer

Aug 1, 2016, 6:31:28 PM
to weewx-user
On Mon, Aug 1, 2016 at 10:16 AM, jmltech <joseph...@gmail.com> wrote:

1) The VWS data does not have dewpoint, heatindex or windgust10 values.  Do I need to add these values to the archive table as I'm inputting records?


If you want to produce reports and plots with dewpoint and heatindex then, yes, you should calculate them and include them.

For windgust, if you don't have it stored in the VWS database, then it's gone. Nothing you can do about it.
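For anyone calculating dew point by hand before loading records, here is a minimal sketch using the Magnus approximation. It is not taken from weewx and weewx's own calculation may differ slightly; the function name, the constants (17.625 and 243.04 °C) and the assumption that the VWS temperatures are in °F are all choices made for this illustration.

```python
import math


def dewpoint_f(temp_f, humidity_pct):
    """Approximate dew point in °F from temperature (°F) and relative humidity (%).

    Uses the Magnus approximation; returns None when an input is missing
    or the humidity is not a positive number.
    """
    if temp_f is None or humidity_pct is None or humidity_pct <= 0:
        return None
    temp_c = (temp_f - 32.0) * 5.0 / 9.0
    b, c = 17.625, 243.04  # Magnus coefficients (assumed), c in °C
    gamma = math.log(humidity_pct / 100.0) + b * temp_c / (c + temp_c)
    dew_c = c * gamma / (b - gamma)
    return dew_c * 9.0 / 5.0 + 32.0
```

At 100% humidity the result equals the air temperature, which is a quick sanity check on any record you compute this way.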
 
2) I have tested my installation using the simulator, then switched over to test the davis driver.  All seems to be working.  After a few minutes, I switched the VWS data logger back to VWS so that I wouldn't miss any records.  If I start with a new archive table (using sql to delete current test records) will there be any issues after I update the table with all of my VWS archived records?

Don't see why there would be. Am I missing something?
 

3) The VWS had an archive rate of 15 minutes.  Can I switch to 5 minutes once I switch over to weewx, or will this cause problems with historical averages?

In theory, this should not be a problem but, unfortunately, there is a flaw in the daily summaries tables (issue #61) that prevents changing the archive interval. We have a fix in the works, so it should eventually be OK. In any case, even without the fix, there is only a problem with averages that span across the change in archive interval boundary. Highs and lows will be fine.
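To see why only averages are affected, here is a small worked example. It assumes (this is an illustration, not a quote from issue #61) that the problem comes from every archive record counting equally in an average, whatever its interval. The data are made up: 12 hours of 15 minute records at 10 degrees followed by 12 hours of 5 minute records at 20 degrees.

```python
def unweighted_mean(records):
    """Average in which every record counts once, whatever its interval."""
    values = [value for value, _interval in records]
    return sum(values) / len(values)


def time_weighted_mean(records):
    """Average in which each record is weighted by its interval in minutes."""
    total_time = sum(interval for _value, interval in records)
    return sum(value * interval for value, interval in records) / total_time


# (value, interval in minutes): 48 records of 15 min, then 144 records of 5 min
records = [(10.0, 15)] * 48 + [(20.0, 5)] * 144

print(unweighted_mean(records))     # 17.5, skewed towards the 5 minute half
print(time_weighted_mean(records))  # 15.0, each half of the day counts equally
```

Highs and lows do not depend on how many records there are, which is consistent with them being unaffected.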


4) is storing this many records in an sqlite table okay?


Nine years of data at a 15 minute sampling rate is actually not that much. My own station has 10 years at a 5 minute sampling rate --- about 170 MB --- and it's never been an issue.
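As a rough back-of-envelope check on those figures, the sketch below scales the 170 MB reference linearly by record count. Linear scaling, ignoring leap days, and assuming no gaps in the data are all simplifications made here, so treat the result as an order of magnitude only.

```python
MINUTES_PER_DAY = 24 * 60


def record_count(years, interval_minutes):
    """Number of archive records for a gap-free span (leap days ignored)."""
    return years * 365 * MINUTES_PER_DAY // interval_minutes


reference_records = record_count(10, 5)   # 1,051,200 records in about 170 MB
vws_records = record_count(9, 15)         # 315,360 records

# Scale the reference size by the ratio of record counts.
estimated_mb = 170.0 * vws_records / reference_records

print(vws_records, estimated_mb)  # 315360 51.0
```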

 
5) any other gotchas in doing a conversion to weewx?

Please note that field 'rain' is the amount of rain that fell during that archive interval. That is, it's the amount of rain that fell in the five minute sampling window.

Also, see the Wiki for the differences between the three different kinds of pressure: barometer, pressure, and altimeter.

-tk

gjr80

Aug 1, 2016, 7:27:26 PM
to weewx-user
Hi,

Further to what Tom said, I have been working on a 'wee_import' utility for weewx under issue 97. Initially I included import from a single CSV file and a WeatherUnderground PWS and have recently added support for importing from Cumulus monthly log files (which are really just a series of CSV files). From your description your VWS data sounds very similar to the Cumulus monthly logs; it's just a bunch of CSV files, albeit covering a different time period and, I expect, in a different format. If you care to give me the format of the VWS files I expect it will be a simple matter to extend wee_import to work with the VWS data.


1) The VWS data does not have dewpoint, heatindex or windgust10 values.  Do I need to add these values to the archive table as I'm inputting records?

wee_import has an option to automatically calculate and include any missing derived obs, so this would take care of your missing dewpoint and heatindex and possibly a number of others too (windchill and rainrate if they are missing), depending on your source data. It will also take care of any missing pressures provided you have one of them and know what it is (barometer, pressure or altimeter).
 
3) The VWS had an archive rate of 15 minutes.  Can I switch to 5 minutes once I switch over to weewx, or will this cause problems with historical averages?

Since I am working on the fix Tom mentioned I would only suggest that you may make life easier for yourself if you can do the cutover from 15 to 5 minutes on a midnight boundary (ie your last 15 minute record is timestamped 00:00, your first 5 minute record is 00:05).
 
5) any other gotchas in doing a conversion to weewx?

Not really a gotcha (you will soon notice if your daily summaries are incomplete after you import data into the archive) but if you end up manually importing data into the archive you will need to drop then rebuild your daily summaries using the `wee_database` utility. wee_import will take care of this though.


Gary

jmltech

Aug 2, 2016, 11:04:23 AM
to weewx-user
Thanks for the responses.  I appreciate the info, especially about the rain.  Each archive record in the VWS text files carries a running total of rain for the day, and a running total of rain for the year.  So if it is raining during the archive period, the daily rain value and total rain value will be incremented by the amount of rain received during the archive period.  To see how much rain fell during an archive period, you would need to subtract the previous daily rain from the current daily rain.

Gary - the CSV file in which PWS stores the archive records is a standard comma delimited file, with no headings in row 1. I have attached a sample of 2 months' worth of records, and added a row 1 header so that you can see what the columns are.  However, I had a difficult time determining the columns, since the PWS manual doesn't align with the CSV data that I have.  I found some other sources searching the web that seem to be accurate.  I'm not sure about the columns after BO, since I couldn't find any documentation for those.  I also have no idea what columns CF and CG are.  I've been trying to match the values up with what is displayed in the Vantage display without success.
As I mentioned, my CSV archive database only goes back to 2015.  I don't know where the records went.  Maybe there is a limit to size, and PWS did something (Or maybe I did something and didn't realize it).  However, I do have a plain text file (see attached) that has a record in 15 minute increments.  One file for each day.  So I have over 3500 of these files.  Reading them in using python should be easy.  Here is my strategy to ready these files to be read in by your import utility...

Python script:
1) open directory and get all file names
2) in a loop, process each file one by one
   3) open file
   4) read in first line, to capture date (could also get this from the file name)
   5) skip to 4th line to start reading each row
   6) map each column data (using space as the delimiter) to CSV columns
   7) write record to CSV file
   8) close file
   9) continue loop until all files have been processed
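The steps above can be sketched roughly as follows. This is only an outline under several assumptions that come from the description rather than from the files themselves: file names look like yymmdd.txt, the first three lines of each file are headers, and each data row is whitespace delimited with the time as its first token. The function names and the output layout (ISO date followed by the raw tokens) are made up for this illustration, and no attempt is made to map tokens to named columns.

```python
import csv
import datetime
import glob
import os


def date_from_name(path):
    """Parse the date from a file name assumed to look like yymmdd.txt."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return datetime.datetime.strptime(stem, "%y%m%d").date()


def convert_daily_files(source_dir, csv_path, header_lines=3):
    """Copy the data rows of every daily text file into one CSV file.

    Files are processed oldest first. Each output row is the ISO date taken
    from the file name followed by the row's whitespace-separated tokens.
    Returns the number of rows written.
    """
    paths = sorted(glob.glob(os.path.join(source_dir, "*.txt")),
                   key=date_from_name)
    written = 0
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in paths:
            day = date_from_name(path)
            with open(path) as daily:
                for line_no, line in enumerate(daily, start=1):
                    if line_no <= header_lines:
                        continue  # skip date, column name and units lines
                    fields = line.split()
                    if fields:  # ignore blank lines
                        writer.writerow([day.isoformat()] + fields)
                        written += 1
    return written
```

A file name that does not parse as a date raises ValueError, which stops the run early rather than silently mixing in an unexpected file.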

By my rough calculations, this should produce 140,000+ records (not including 2016 data to date).

I'm happy to give it a go unless you think your import tool (I haven't looked yet) can do this?



160731.txt
dbase - Copy.csv

gjr80

Aug 2, 2016, 7:47:23 PM
to weewx-user
What you suggest would likely work, not sure why you are 'writing record to CSV' though, perhaps you meant 'archive' instead of 'CSV'. wee_import works in a similar manner, though it reads a file at a time, cleans up the data if required, maps the data, converts if required and then saves the records to archive using a weewx 'API' call. Then the process repeats.

Looking at the files, both should be able to work with wee_import; I just need to put together the code that reads the file. What you have is very similar to the Cumulus monthly log files so rather than re-inventing the wheel it is just a case of a slightly different sized wheel. Will have a look later today.

Gary

jmltech

Aug 2, 2016, 8:04:47 PM
to weewx-user
I was going to write all the records to a CSV file so that your import program can then read them, add the calculated columns, then write the archive records. My quick program would just be a "prep" step before I run your import.

Thanks for taking a look. I started looking at your import program. I'm a very amateur programmer, but I was able to follow it (mostly).

gjr80

Aug 2, 2016, 9:56:49 PM
to weewx-user
I was going to write all the records to a CSV file so that your import program can then read them, add the calculated columns, then write the archive records. My quick program would just be a "prep" step before I run your import.

Ah, that explains it.
 

I started looking at your import program. I'm a very amateur programmer, but I was able to follow it (mostly).


As am I. 2 comments; first, if you could mostly follow it I probably have not done too badly and second, there is some pretty good help here when it comes to programming, I have learnt heaps.

Gary

gjr80

Aug 3, 2016, 10:31:45 AM
to weewx-user
Joe,

Have had a chance this afternoon to sit down and look a bit closer at things. In terms of what data is needed for import, it really is a fairly small list as a lot of obs can be derived (eg dew point, heat index, wind chill and any missing pressures (provided you have the other pressure)). So what is beyond BO is really not required. Importing from the csv file would be the neater way to go but given your large history of data is in the daily text files these are what I have coded.

I have made a few assumptions based on your files:
  1. file name format is yymmdd.txt
  2. date format inside the file is always m/d/y
  3. 'Raw barom' is in fact what we know as barometer
I have wee_import successfully importing from multiple VWS daily text files. You will find the files in the vws branch on the wee_import repo on GitHub (I am not going to link the repo here as should wee_import eventually be included in weewx I will be deleting the repo). The files can be downloaded in an archive under the 'Releases' tab. Instructions are in the readme. Other than what is in the readme you will need to copy all of your daily text files to a folder that is accessible from your weewx machine. I would suggest the following steps in sequence:
  1. If you have any data in weewx you wish to retain then stop weewx (if running) and move your database (nominally weewx.sdb) aside.
  2. Run wee_import with the help option:
    wee_import --help
  3. Do a dry run import including any missing obs, this should give you a summary of what it found and what would be imported:
    wee_import --vws --source=/path/to/daily/files --calc-missing --dry-run
  4. If that goes well (--dry-run does everything but actually save to archive) do a full import:
    wee_import --vws --source=/path/to/daily/files --calc-missing
  5. If it all goes well resume weewx
  6. If it doesn't go well delete the newly created weewx.sdb and replace it with the copy made at step 1, then resume weewx

I have only done limited testing with the vws code, I was only importing a couple of files rather than 3500 and only did some cursory checking that the right data is going to the right place in the archive. There are also a few limitations at the moment:

  1. Only supports wind speeds in mph
  2. Only supports rain in inches and mm
  3. Only supports pressures in inHg
Any issues let us know.

Gary

jmltech

Aug 3, 2016, 5:27:08 PM
to weewx-user
Thanks Gary. This is truly above and beyond. I'll give it a try this weekend. One nice thing is that I don't have any real data in the weewx archive yet. So I can start fresh at any point. Once I have the data transferred, I'll switch over. The simulator mode is awesome for testing and playing around.

Wish I had your skills in python. My programming is usually pieces I find along the way, and lots of trial and error. I did do some programming back in the dBase days. But that was 25+ years ago (maybe more).

jmltech

Aug 3, 2016, 9:17:25 PM
to weewx-user
Hi Gary,
had a few minutes tonight, so I tried the import on a few daily text files.  Received this error on the command line:
root@knoxville1:/home/weewx/bin# ./wee_import --vws --source=/home/weewx/daily --calc-missing --dry-run
Starting wee_import...
Traceback (most recent call last):
  File "./wee_import", line 2446, in <module>
    main()
  File "./wee_import", line 631, in main
    source_obj = Source.sourceFactory(options, args)
  File "./wee_import", line 945, in sourceFactory
    options)
  File "./wee_import", line 2275, in __init__
    super(VwsSource, self).__init__(config_dict, vws_config_dict, options)
  File "./wee_import", line 836, in __init__
    self.wxcalculate = weewx.wxservices.WXCalculate(config_dict,
AttributeError: 'module' object has no attribute 'WXCalculate'

I didn't see anything in /var/log/syslog (I don't have weewx running)


jmltech

Aug 3, 2016, 9:35:05 PM
to weewx-user
Some observations on the VWS daily files and the Rain columns:

Tot Rain is a running total. It resets to 0.0 on the first archive record recorded on January 1st unless it was raining during this period. If it was raining, the first archive record on January 1st is the difference between the previous archive and new archive that is being written for the 1st of the year.

DailyRain is a running total. It resets to 0.0 on the first archive record recorded on the new day (whatever the first archive record after midnight) unless it was raining during this period. If it was raining, the value recorded is the difference between the last archive record of the previous day and the new archive record for the new day.

Hope this makes sense. I saw a comment in your code and so I looked at the records from last day of the year, and the start of the new year (same with the daily files)

I didn't see anywhere in VWS where you could designate what day to start the "rain year".
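Since weewx wants 'rain' as the amount that fell in each archive interval, the running totals described above have to be turned into differences. The sketch below is one way to do that under the reset behaviour as I have described it: when a total drops, the new value is taken to be the rain that fell in that interval. The function name, the rounding to 0.01 and the choice to return None for the very first record (there is nothing to subtract from) are assumptions made for this illustration.

```python
def interval_rain(running_totals, ndigits=2):
    """Convert a running rain total into per-interval amounts.

    A drop in the total is treated as a reset (new day or new year), in which
    case the new total itself is the rain that fell during that interval.
    The first record has no predecessor, so its amount is reported as None.
    """
    amounts = []
    previous = None
    for total in running_totals:
        if previous is None:
            amounts.append(None)
        elif total >= previous:
            amounts.append(round(total - previous, ndigits))
        else:
            amounts.append(round(total, ndigits))  # reset happened
        previous = total
    return amounts


# DailyRain over five records, with midnight falling before the fourth one
print(interval_rain([0.10, 0.10, 0.25, 0.02, 0.02]))
# [None, 0.0, 0.15, 0.02, 0.0]
```

Feeding the records in chronological order across file boundaries matters here, otherwise the last record of one day cannot be compared with the first record of the next.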


Thanks

Gary Roderick

Aug 3, 2016, 10:01:48 PM
to weewx...@googlegroups.com

Joe,

You need to get the latest wxservices.py from the weewx repo on github, it was updated since the last weewx release so has not been included in a release version yet. Just move your existing wxservices.py aside and slot in the updated version. There is a link in the readme on wee_import page if you can't find it. Unfortunately am away from PC at the moment.
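A quick way to confirm whether the installed wxservices.py is new enough is to check for the class named in the traceback before running the import. The helper below is a generic sketch (its name is made up here); it only reports whether a module can be imported and exposes a given attribute.

```python
import importlib


def module_provides(module_name, attribute):
    """Return True if the module imports cleanly and has the named attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attribute)
```

Run from the weewx bin directory, `module_provides("weewx.wxservices", "WXCalculate")` should return True once the updated file is in place.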

Gary



jmltech

Aug 4, 2016, 7:44:36 AM
to weewx-user
Thanks Gary. I updated wxservices.py and that resolved the error.  Running again produced this error:
root@knoxville1:/home/weewx/bin# ./wee_import --vws --source=/home/weewx/daily --calc-missing --dry-run
Starting wee_import...
An import from VWS daily text files has been requested.
Using database binding 'wx_binding', which is bound to database 'weewx.sdb'
This is a dry run, imported data WILL NOT be saved to archive.
Any missing derived observations WILL be calculated.

Traceback (most recent call last):
  File "./wee_import", line 2446, in <module>
    main()
  File "./wee_import", line 632, in main
    source_obj.run()
  File "./wee_import", line 973, in run
    _raw_data = self.getRawData(period)
  File "./wee_import", line 2386, in getRawData
    self._header_map['total_rain']['units'] = self._unit_lookup[_units[9] + "_rain"]
IndexError: list index out of range
root@knoxville1:/home/weewx/bin#

I thought maybe I needed to adjust the mapping in the conv file, but I didn't see any default mapping for VWS.

gjr80

Aug 4, 2016, 8:29:33 AM
to weewx-user
Hmm, the problem is that the code is trying to pick up the last units of measure value ('in' I expect) on the units line (line 3) and for some reason the code has found fewer units values on that line than I coded for. Is there any chance that the earliest daily text file has a different layout to the one you posted (the code will go through and import the daily files in order of age, oldest first, youngest last)? Failing that I will need to make the code a little more robust.

The reason there is no map is that we know what each field is in the daily text file and we know the units of each field, so in theory we should be able to get everything we need from the daily text files without the user specifying anything. On the other hand, for a plain old CSV file we don't necessarily know what fields will be where and what their names will be nor do we necessarily know the units. Hence the map for CSV.

Gary

jmltech

Aug 5, 2016, 2:17:50 PM
to weewx-user
Thanks for the info Gary. I'll take a closer look this weekend at the early daily files.

gjr80

Aug 5, 2016, 7:58:13 PM
to weewx-user
Joe,

Have given this some more thought, did a bit of googling and reading of the online VWS manual. I've developed the opinion that importing VWS data is best done through CSV files; the so called 'daily tabulated text files' are just too 'variable' and too hard to parse. The daily text files have everything in there that you need; they identify the column, provide the units, provide an unambiguous date and time and of course the observations. But trying to pull out the column names and match them with their units is difficult, and to be able to handle all cases without some user input will be very complex and I am sure will lead to errors. In googling I found a wide variety of daily text file formats, the most complex had the following columns/units:

Time Vapor Press Wind Dir A Wind Spd A WindGust A Hum In A Humidity A Temp In A Temp Out A Barom A Tot Rain A UV Avg Solar Avg WindCh AHeatIx In A HeatIx A Dew Pt A SL Barom A Wind Dir H Wind Spd H WindGust H Hum In H Humidity H Temp In H Temp Out H Barom H Tot Rain H UV Hi Solar Hi WindCh HHeatIx In H HeatIx H Dew Pt H SL Barom H Wind Dir L Wind Spd L WindGust L Hum In L Humidity L Temp In L Temp Out L Barom L Tot Rain L UV Lo Solar Lo WindCh LoHeatIx In L HeatIx L Dew Pt L SL Barom L RainRate
 
in ° mph mph % % °F °F in in W/sqm °F °F °F °F in ° mph mph % % °F °F in in W/sqm °F °F °F °F in ° mph mph % % °F °F in in W/sqm °F °F °F °F in in/hr

Would be a lot easier (and less prone to errors) if there weren't so many single spaces! Some columns having no units (eg UV) makes it more complex again.

That being said it was a fairly simple matter to rework the existing VWS import code to be a bit smarter and handle a variable number of columns rather than the fixed format I had originally coded. If you go and pull down the latest release from GitHub you should have more success.

Gary

gjr80

Aug 5, 2016, 9:07:15 PM
to weewx-user
It's easy to see why the pressure issue comes up all the time...

From the VWS manual regarding the VWS database format:

Column Number   Parameter
9               Barometric Pressure
24              Sea-level Barometric Pressure
25              Pressure Altitude

The manual goes on to define the following in its glossary:

Barometric Pressure. The pressure exerted by the atmosphere as a consequence of gravitational attraction exerted upon the "column" of air lying directly above the point in question. The measurement can be expressed in several ways. One is in millibars. Another is in inches or millimeters of mercury (Hg). Also known as atmospheric pressure.

Pressure Altitude. Atmospheric or barometric pressure expressed in terms of altitude which corresponds to that pressure in the standard atmosphere.

Sea Level Pressure. The atmospheric pressure at mean sea level either directly measured by stations at sea level or empirically determined from the station pressure and temperature by stations not at sea level. Used as a common reference for analyses of surface pressure patterns.

I can probably make sense of it so far, in VWS speak 'Barometric Pressure' is what in weewx speak we call 'pressure', 'Pressure Altitude' is what we call 'altimeter' and 'Sea level Pressure' is what we call 'barometer'. But then the VWS tabulated daily text file includes a column titled 'Raw Barom', now things start to get confusing...

Gary

jmltech

Aug 21, 2016, 11:19:11 AM
to weewx-user
Gary,
Sorry for the delays in replying. I had to go out of town unexpectedly. I'll give the conversion using your updated script a try and let you know the results.

jmltech

Sep 1, 2016, 7:49:02 PM
to weewx-user
Hi Gary,

I looked through most of the text files (towards the end I just started randomly picking files to look at).  You were correct, at least 6 different times the format of these files changed (and then changed back).  I believe this was probably due to software updates from previous years, or me playing with options in the VWS software without realizing that it was affecting the data being recorded.  I agree with you that trying to come up with a program to read through each of the files and mapping correctly in your wee_import program would not be very productive... there is no way to account for changes made in the format, or an easy way to guess which columns are which.  I think the best solution for folks that are in a similar situation is to create CSV file(s) that matches what your import program is looking for.

Once I find some time to start creating the CSV file from all of my text files (I may write a program to somewhat automate this, stopping when it encounters something odd or different) I'll give your latest wee_import a try.
I appreciate your time on looking at this.  I had no idea that the text files would be inconsistent.
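For anyone facing the same problem, a first pass that simply finds where the layout changes can save a lot of manual checking. The sketch below assumes that lines 2 and 3 of each daily file hold the column names and units (that matches my files, but may not match yours), and that the files are named yymmdd.txt with all dates in the same century so that sorting by name is chronological. The function names are made up for this illustration.

```python
import glob
import os


def layout_signature(path, header_lines=(2, 3)):
    """Return the chosen header lines, whitespace-normalised, as a tuple."""
    wanted = set(header_lines)
    last = max(wanted)
    parts = []
    with open(path) as daily:
        for line_no, line in enumerate(daily, start=1):
            if line_no in wanted:
                parts.append(" ".join(line.split()))
            if line_no >= last:
                break
    return tuple(parts)


def find_layout_changes(source_dir):
    """List (file name, signature) for each file whose headers differ
    from those of the file before it. The first file is always listed."""
    changes = []
    previous = None
    for path in sorted(glob.glob(os.path.join(source_dir, "*.txt"))):
        signature = layout_signature(path)
        if signature != previous:
            changes.append((os.path.basename(path), signature))
            previous = signature
    return changes
```

Each entry in the result marks the start of a run of files that share one layout, so every run can be given its own column mapping before the CSV files are written.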

Joe

gjr80

Sep 11, 2016, 8:14:40 PM
to weewx-user
Hi Joe,

My apologies, I did mean to reply earlier. I think that proves it, trying to import VWS data from VWS produced text files (the so called 'Date Stamped File' according to the VWS manual) without some manual intervention is fraught with danger. The preferred option is to import from the VWS produced CSV file (the so called 'Csv File' according to the VWS manual). I don't know enough about VWS (I have not used it) to know whether the VWS generated CSV files can be generated retrospectively or not, if not then I suspect importing from VWS generated CSV files will not be much help unless the user has been producing the CSV files since they started with VWS. I suspect most users would not produce them.

Back to your case, by all means massage your data into some CSV files, you should be able to use some code to do most of the repetitive work and keep the manual changes/checks to a minimum. wee_import should take care of importing the reformatted data through its CSV import. wee_import presently only imports from a single CSV file at a time, though it should be a simple leap forward to import from multiple files (it does this for Cumulus monthly log files). I will put that down as a task to do - let me know when you are ready to import. My only advice would be to keep the day, month and year in the file name; import results are better when the records are imported chronologically - in most cases it eliminates discontinuities in rainfall across file boundaries.

Gary