Importing Historical Data from Weatherlink.com

The Reckoning

Oct 29, 2023, 5:33:02 PM
to weewx-user
Hey guys, 

I have a Vantage Pro 2 and have had it connected to WeatherLink.com via a WeatherLink IP connector. 

I had to turn the upload off because I now want to store my data in Weewx. Is it possible to retrieve all of that historical data from the WeatherLink site?

The Reckoning

Oct 30, 2023, 11:20:22 AM
to weewx-user
Okay, so I have made some progress but will need the community's help on this. I have tried to include as much information as possible, but please let me know if you need anything else. A link to a folder containing the relevant files is below.

I downloaded my data from WeatherLink.com 1 year at a time in CSV format. I have data from 7/28/2019 through 10/25/2023. 

I then modified the CSV to fit the format Weewx expects. This included converting the inter-cardinal wind directions to degrees (done with an Excel formula), deleting some header rows, and renaming the column headers. I copied these column headers into my import config file and changed other settings in there. 
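For anyone who would rather script the wind direction conversion than use a spreadsheet, here is a minimal Python sketch. It assumes the export uses the usual 16-point compass labels (N, NNE, NE and so on); the function name `compass_to_degrees` is made up for illustration, and anything that is not a recognized label (a blank cell, for example) comes back as None.

```python
# 16-point compass rose, clockwise from north; each step is 22.5 degrees.
COMPASS_POINTS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
                  "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]
DEGREES = {name: index * 22.5 for index, name in enumerate(COMPASS_POINTS)}


def compass_to_degrees(label):
    """Return the bearing in degrees, or None for a blank or unknown label."""
    if label is None:
        return None
    return DEGREES.get(label.strip().upper())


for sample in ("N", "NNE", "NE", "SW"):
    print(sample, compass_to_degrees(sample))
```

Returning None instead of 0 for unknown labels avoids silently recording calm or missing readings as a north wind.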

I stopped Weewx and ran a dry run. There were no errors with --dry-run, so I ran it for real for the first year of data (July 2019 to June 2020). This worked fine. I could see all the new data on the webpage. 

Great. Now on to the next year (July 2020 to June 2021). Same process as above, but the data never showed up on the webpage. I tried it again with the same result. 

After this happened, I restored a backup of the weewx database, combined all the data into one file, and imported that. 

Again, there are huge holes in the data. I now have data from July 2019 to June 2020, no data from July 2020 to December 2022, and then normal data for all of 2023 (January to present). 

I can't figure out what's wrong. I looked the data over again but can find no differences between years. While the import was happening I saw lots of log entries like this one:
Oct 30 09:43:37 jm-Virtual-Machine wee_import[195988] INFO weewx.manager: Added record 2022-04-27 19:00:00 EDT (1651100400) to database 'weewx.sdb'
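The number in parentheses in these log lines is the record's Unix epoch timestamp, which is also what the dateTime column stores. As a rough sketch for cross-checking a log line against a row in the database, the snippet below converts an epoch value back to local time. The fixed UTC-4 (EDT) and UTC-5 (EST) offsets are assumptions taken from the zone names in the logs, and `epoch_to_local` is just an illustrative helper name.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets assumed from the zone names in the log lines.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")


def epoch_to_local(ts, tz=EDT):
    """Format a Unix epoch timestamp the way the import log prints it."""
    return datetime.fromtimestamp(ts, tz).strftime("%Y-%m-%d %H:%M:%S %Z")


print(epoch_to_local(1651100400))       # 2022-04-27 19:00:00 EDT
print(epoch_to_local(1643623200, EST))  # 2022-01-31 05:00:00 EST
```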

But I also saw lots of these: 

Oct 30 10:53:18 jm-Virtual-Machine wee_import[199232] ERROR weewx.manager: Unable to add record 2022-01-31 05:00:00 EST (1643623200) to database 'weewx.sdb': UNIQUE constraint failed: archive.dateTime

Oct 30 10:53:18 jm-Virtual-Machine wee_import[199232] ERROR weewx.manager: Unable to add record 2022-01-31 05:30:00 EST (1643625000) to database 'weewx.sdb': UNIQUE constraint failed: archive.dateTime

Oct 30 10:53:18 jm-Virtual-Machine wee_import[199232] ERROR weewx.manager: Unable to add record 2022-01-31 06:00:00 EST (1643626800) to database 'weewx.sdb': UNIQUE constraint failed: archive.dateTime

Oct 30 10:53:18 jm-Virtual-Machine wee_import[199232] ERROR weewx.manager: Unable to add record 2022-01-31 06:30:00 EST (1643628600) to database 'weewx.sdb': UNIQUE constraint failed: archive.dateTime

Oct 30 10:53:18 jm-Virtual-Machine wee_import[199232] ERROR weewx.manager: Unable to add record 2022-01-31 07:00:00 EST (1643630400) to database 'weewx.sdb': UNIQUE constraint failed: archive.dateTime

Oct 30 10:53:18 jm-Virtual-Machine wee_import[199232] ERROR weewx.manager: Unable to add record 2022-01-31 07:30:00 EST (1643632200) to database 'weewx.sdb': UNIQUE constraint failed: archive.dateTime

Oct 30 10:53:18 jm-Virtual-Machine wee_import[199232] ERROR weewx.manager: Unable to add record 2022-01-31 08:00:00 EST (1643634000) to database 'weewx.sdb': UNIQUE constraint failed: archive.dateTime

Oct 30 10:53:18 jm-Virtual-Machine wee_import[199232] ERROR weewx.manager: Unable to add record 2022-01-31 08:30:00 EST (1643635800) to database 'weewx.sdb': UNIQUE constraint failed: archive.dateTime

Oct 30 10:53:18 jm-Virtual-Machine wee_import[199232] ERROR weewx.manager: Unable to add record 2022-01-31 09:00:00 EST (1643637600) to database 'weewx.sdb': UNIQUE constraint failed: archive.dateTime

Not sure what these mean but there were a LOT of them. Maybe these are the cause of my issues. Below is a link to:
  1. csv.conf
  2. PWS_All2.csv - all of my historical weather data
  3. Weewx_log_Import_All.txt - Log of the full import showing the above mentioned errors and successes
  4. wee_debug --info output

Any help I can get in resolving this is much appreciated. 

Here is the command output from the combined import:
jm@jm-Virtual-Machine:/var/lib/weewx$ sudo wee_import --import-config=/etc/weewx/import/csv.conf
/usr/share/weewx/wee_import:719: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
from distutils.version import StrictVersion
Using WeeWX configuration file /etc/weewx/weewx.conf
Starting wee_import...
A CSV import from source file '/home/jm/Desktop/PWS_All2.csv' has been requested.
Using database binding 'wx_binding', which is bound to database 'weewx.sdb'
Destination table 'archive' unit system is '0x01' (US).
Missing derived observations will be calculated.
All WeeWX UV fields will be set to None.
All WeeWX radiation fields will be set to None.
Starting import ...
74365 records identified for import.
Proceeding will save all imported records in the WeeWX archive.
Are you sure you want to proceed (y/n)? y
Unique records processed: 74365; Last timestamp: 2023-10-25 10:00:00 EDT (1698242400)
Calculating missing derived observations ...
Processing record: 74365; Last record: 2023-10-26 00:00:00 EDT (1698292800)
Recalculating daily summaries...
Records processed: 74000; time: 2023-10-17 19:30:00 EDT (1697585400)
Finished recalculating daily summaries
Finished calculating missing derived observations
Finished import
74365 records were processed and 74365 unique records imported in 250.55 seconds.
Those records with a timestamp already in the archive will not have been
imported. Confirm successful import in the WeeWX log file.

The Reckoning

Oct 30, 2023, 1:40:06 PM
to weewx-user
Okay so I took some time to study the logs, specifically the 'unable to add record UNIQUE constraint failed: archive.dateTime' errors and I think I understand them. By comparing what was in the database at the time of the import with the errors, I determined that these only appeared when data already existed in the database. This is perfectly acceptable because the data that was already there was there from previous imports. In other words, my imports overlapped. 
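To illustrate the overlap behaviour, here is a small self-contained sketch using Python's sqlite3 module and an in-memory database. The table is a cut-down stand-in for the real archive table (it assumes dateTime is a unique primary key, which is what the error message implies), and the temperature values are made-up sample data. `insert_records` is a hypothetical helper, not part of wee_import.

```python
import sqlite3


def insert_records(conn, records):
    """Insert (dateTime, outTemp) rows; return (added, skipped) counts."""
    added = skipped = 0
    for ts, temp in records:
        try:
            with conn:  # commit on success, roll back on error
                conn.execute(
                    "INSERT INTO archive (dateTime, outTemp) VALUES (?, ?)",
                    (ts, temp),
                )
            added += 1
        except sqlite3.IntegrityError:
            # Same timestamp already present: the new row is discarded.
            skipped += 1
    return added, skipped


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE archive "
    "(dateTime INTEGER NOT NULL UNIQUE PRIMARY KEY, outTemp REAL)"
)

first_import = [(1643623200, 30.1), (1643625000, 30.4)]
second_import = [(1643625000, 30.4), (1643626800, 31.0)]  # overlaps by one row

print(insert_records(conn, first_import))   # (2, 0)
print(insert_records(conn, second_import))  # (1, 1)
```

The second call skips the overlapping timestamp and adds only the new one, which matches the mix of "Added record" and "UNIQUE constraint failed" lines in the log.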

But that doesn't solve my issue. I am unable to see any reports from July 2020 through December 2022 even though the data says that it was imported successfully. 

Is there a way to query the database or export it as a CSV so I can see what's really there? Or maybe a way I can recalculate monthly or yearly summaries? I tried --drop-daily and --rebuild-daily but those did not help.

See the attached screenshots to see what I mean. The same thing happens with the default skin.
Screenshot 2023-10-30 at 1.38.53 PM.png
screencapture-10-39-10-123-weewx-year-2020-html-2023-10-30-13_37_49.pdf

The Reckoning

Oct 30, 2023, 2:36:41 PM
to weewx-user
I exported my database (with all imports complete) to a CSV via sqlite3. 

All the data is there, but the daily/monthly/yearly summaries won't generate for certain days. I don't see anything different about the data for these days/months. See for yourself. The CSV is linked below; the file is called weewx_db_export.csv.
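A quicker check than reading through a full export is to count archive records per month and look for missing or thin months. The sketch below does that with Python's sqlite3 module. It assumes only that the archive table has an epoch dateTime column; months are bucketed in UTC to keep the result independent of the machine's time zone, and `records_per_month` is an illustrative name. The demo runs against an in-memory table; for a real check you would open a copy of the database instead (for example a copy of /var/lib/weewx/weewx.sdb).

```python
import sqlite3


def records_per_month(conn):
    """Return [(YYYY-MM, record_count), ...] for the archive table, in UTC."""
    query = (
        "SELECT strftime('%Y-%m', dateTime, 'unixepoch') AS month, COUNT(*) "
        "FROM archive GROUP BY month ORDER BY month"
    )
    return conn.execute(query).fetchall()


# Demo on an in-memory table using timestamps from the log lines above.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE archive (dateTime INTEGER NOT NULL UNIQUE PRIMARY KEY)")
demo.executemany(
    "INSERT INTO archive (dateTime) VALUES (?)",
    [(1643623200,), (1643625000,), (1651100400,)],
)
for month, count in records_per_month(demo):
    print(month, count)
```

A month that is absent from the output has no records at all, which separates a real gap in the data from a report that simply has not been regenerated.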

The Reckoning

Oct 30, 2023, 3:37:41 PM
to weewx-user
Another update. Man, have I learned a lot about weewx and databases in general through this process. I am typing all of this up in the hope that it helps someone in the future who encounters a similar issue. 

I examined the daily summary tables such as archive_day_outTemp and they all looked normal. In short, the data import worked as expected with no issues. All tables have all my data and are up to date.

To fix this issue, I deleted everything from /var/www/html/weewx, stopped and restarted weewx, and voilà! All summaries appeared. It looks like the NOAA monthly and yearly summaries had not regenerated since I did the import. I had assumed that they would have done so when I ran sudo wee_database --drop-daily and sudo wee_database --rebuild-daily, but they did not.

The Reckoning

Oct 30, 2023, 3:55:05 PM
to weewx-user
I am going to remove the files from the link that I posted above, but I am going to paste the csv.conf file that I used as well as a sample of my data to help those in the future. 
Basically, here are the steps to get historical data off of Weatherlink.com and into weewx:
  1. Go to Weatherlink.com. Log in. Click the Data tab. Set the start date and time period accordingly. Click the arrow inside the box above the time period and enter your email address. This will send you a CSV file. I had about 4 years of data so I had to repeat this 4 times.
  2. Follow the instructions here for the rest of it. That's what I did. The hints below might help, though.
  3. Modify the CSV files. Make them look like the one I have attached below. 
    1. Remove the rows above where the column headers are. 
    2. Convert the wind direction and high speed wind direction from inter-cardinal to degrees. North is 0, NNE is 22.5, NE is 45, etc. This can be done by using a reference table and a VLOOKUP formula. 
    3. Change the column headers to something simple. I don't know if this is necessary but I did it. If you're using my csv.conf file, you can use my headers. 
  4. Change the csv.conf file accordingly, making sure the units in the FieldMap section match those of your data.
  5. Follow the rest of the instructions in the link.
  6. Take a backup of your database (as recommended). It can usually be found at /var/lib/weewx/weewx.sdb
  7. Run sudo wee_import --import-config=/path/to/csv.conf --dry-run first
  8. Resolve any errors. 
  9. Run it for real: sudo wee_import --import-config=/path/to/csv.conf
  10. After it finishes, wait for the website to refresh. If you then find issues with monthly or yearly reports, remove all data from /var/www/html/weewx, restart weewx, and wait for the website to refresh again.

Hope this helps someone!
csv.conf
PWS_All2.csv

gjr80

Oct 30, 2023, 4:20:44 PM
to weewx-user
Perhaps a little late as you seem to have solved your issues but a few points post importing data.

1. Checking for a successful import by looking at the WeeWX generated web pages is a poor choice as it often provides misleading information. When folks import data it is often to obtain historical data from some days (or months or years) ago; WeeWX typically displays such data in plots or on historical stats pages. Many plots (particularly week, month and year) use aggregate periods of one, three or 24 hours when obtaining their source data. WeeWX plots are updated every aggregate period so some plots may not update for one, three or 24 hours. Stats type pages typically update each report cycle, so they may well (or may not) give an indication if data was successfully imported, but this indication will likely be for a few specific points in time rather than an extensive report (such as provided in a plot). The best measure of a successful import is to look at the data in the database (the archive table) and to look at the WeeWX log.

2. 'unable to add record UNIQUE constraint failed' log entries are entirely normal and not considered an error; it is just WeeWX saying that a record with the same timestamp as the current imported record already exists in the WeeWX database and the imported record has been discarded. Unfortunately wee_import cannot tell (well, not in an efficient manner) if an imported record already exists in the database, hence the summary reporting within wee_import states the number of records processed and the number of unique records imported but refers the user to the WeeWX log for what actually happened.

3. There is no need for dropping or rebuilding daily summaries when importing. wee_import automatically updates the daily summaries. A drop and rebuild should only be used where an import failed mid-stream (and even then it is unlikely to be required due to the transactional nature of database operations performed by wee_import (and WeeWX)). Arguably, unnecessary dropping and rebuilding actually loses data as this can reduce the granularity of highs/lows (both in terms of value and time) recorded in the daily summaries for each observation - but that is another story.

4. NOAA format reports are updated each report cycle but only the current month and year report are updated; WeeWX does not go back and recreate any earlier/missing month/year reports. The solution here is to delete all NOAA format reports from the WeeWX machine and this will force WeeWX to regenerate all NOAA format reports for all time on the next report cycle. This can take quite a few minutes (database size dependent) so for multiple imports it may be time effective to perform one import, delete the NOAA format reports and check that they regenerate correctly. Once satisfied the first import works, perform the rest of the imports then force NOAA format report regeneration once the imports are complete.

5. Plots can suffer a similar problem to the NOAA format reports but due to a different mechanism (refer 1. above). The solution, however, is the same: delete all plots on the WeeWX machine to force regeneration. Again this can take a long time so for multiple imports it may be more time effective to perform one import, delete the plots and check that they regenerate correctly, then perform the rest of the imports and force plot regeneration once the imports are complete.
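Points 4 and 5 both come down to deleting generated files so that they are rebuilt on the next report cycle. A cautious way to do that is sketched below in Python. The helper name `remove_generated` is hypothetical, and the file name pattern `NOAA-*.txt` is an assumption about how the NOAA reports are named; check your own HTML directory before deleting anything. The function defaults to a dry run that only lists matches, and the demonstration works on a throwaway temporary directory rather than a real WeeWX installation.

```python
import tempfile
from pathlib import Path


def remove_generated(html_root, pattern, dry_run=True):
    """Return files under html_root matching pattern; delete them unless dry_run."""
    matches = sorted(p for p in Path(html_root).rglob(pattern) if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


# Demonstration on a temporary directory, so nothing real is touched.
with tempfile.TemporaryDirectory() as root:
    noaa_dir = Path(root) / "NOAA"
    noaa_dir.mkdir()
    for name in ("NOAA-2020.txt", "NOAA-2020-07.txt"):
        (noaa_dir / name).write_text("placeholder")
    (Path(root) / "index.html").write_text("placeholder")

    # Dry run: list what would be removed.
    print([p.name for p in remove_generated(root, "NOAA-*.txt")])
    # Real run: delete the matches, leaving everything else in place.
    remove_generated(root, "NOAA-*.txt", dry_run=False)
    print(sorted(p.name for p in Path(root).rglob("*") if p.is_file()))
```

On a real system the first argument would be the report directory (/var/www/html/weewx in this thread), and a pattern such as `*.png` could be used for plots, assuming the skin writes its plots as PNG files.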

Gary 

The Reckoning

Oct 30, 2023, 4:25:11 PM
to weewx-user
Thanks for your detailed responses. That is great info!