Backup/Restore DATA for weewx


Yves Martin

unread,
May 30, 2014, 11:00:27 PM5/30/14
to weewx...@googlegroups.com
What is the best and simplest way to back up the data and config, then restore them on another SD card?
Along the same lines, what is a simple way to do a daily backup of data and config via FTP (or some other method) to an external site?

YM - Canada

vds

unread,
May 30, 2014, 11:15:48 PM5/30/14
to weewx...@googlegroups.com
There is no best but there are lots of simple ways.

I save my /home/weewx tree in a git repo that I snapshot and store via Dropbox whenever I make changes. I symlink the public_html and archive trees to other locations, so a tgz of the software directory doesn't include the transient files (public_html) or the big db files (archive).
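
(The symlink arrangement is just something like this - the target paths here are made up:)
ln -s /var/www/weewx /home/weewx/public_html
ln -s /data/weewx-archive /home/weewx/archive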

I run a cron job nightly to save the data to a separate partition, and scp it over to a Dropbox-enabled system whenever I remember (I run weewx on a box that has no Dropbox client available for it). The script is attached - nothing too fancy, just a quickie, but it gets the job done. It should be pretty obvious where and what to edit for your site if you want to try it.
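
In outline the script does something like this (paths and hostname below are made up - the attachment is the real thing):
#!/bin/sh
# sketch only: copy the db to a backup partition, compress it, ship it off-box
STAMP=`date +%Y%m%d`
cp /home/weewx/archive/weewx.sdb /backup/weewx-$STAMP.sdb
gzip -f /backup/weewx-$STAMP.sdb
scp /backup/weewx-$STAMP.sdb.gz backup@dropbox-host:backups/weewx/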


weewx-backup-copy.sh

William Phelps

unread,
May 31, 2014, 11:49:59 AM5/31/14
to weewx...@googlegroups.com
I posted a script I use to back up to Amazon S3 space a while back. Amazon S3 is probably the least expensive online storage available.

Yves Martin

unread,
May 31, 2014, 10:49:20 PM5/31/14
to weewx...@googlegroups.com
Thanks, I'll take a look at it.

YM

Lloyd Adams

unread,
Jun 2, 2014, 8:21:54 AM6/2/14
to weewx...@googlegroups.com
I used to FTP my backups from my Linux server to my Virgin Media account every night, but now I transfer them to my OneDrive (which means I automatically get a copy on my PC). (OneDrive space is free if you have a MS account.) Rather than just zip up the .sdb files, I dump out the tables and zip those up - should I ever need to restore, in theory that gives me more flexibility. I also zip up the whole of /home/weewx - a bit of overkill, but you cannot have too many backups.
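
The dump-and-zip step is essentially a one-liner (paths here are illustrative):
sqlite3 /var/lib/weewx/weewx.sdb ".dump" | gzip > weewx-dump.sql.gz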

Remigiusz Zukowski

unread,
Jun 16, 2014, 8:53:51 AM6/16/14
to weewx...@googlegroups.com
Thank you for the script. However, I am curious about one thing: your script does a plain copy of the sqlite database file. What if weewx writes to the file during the copy and the copy ends up inconsistent? The chance of that is very small, but not zero, is it? I am not an expert, but sqlite itself has some means for database backup. Another solution is to shut down weewx, do a file copy, and then start weewx again.
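
For example, the sqlite3 shell exposes the online backup API as a dot-command (the paths here are just examples):
sqlite3 /var/lib/weewx/weewx.sdb ".backup /var/backup/weewx.sdb"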

pterodaktil

unread,
Jun 16, 2014, 9:05:49 AM6/16/14
to weewx...@googlegroups.com
I use this script to back up data to an NFS share. It automatically removes old backups (older than one week):
#!/bin/bash
tar cvzf /media/weather_backup/weather_$(date +%d:%m:%y_%H:%M).tar.gz /etc/weewx /etc/nginx /var/lib/weewx /var/www/weewx/

find /media/weather_backup -mtime +7 -name "weather*.tar.gz" -exec rm -f {} \;
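
Scheduled from cron, e.g. (the script path here is an assumption):
# nightly at 02:00
0 2 * * * /usr/local/bin/weather_backup.sh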

On Saturday, May 31, 2014 at 7:00:27 AM UTC+4, Yves Martin wrote:

Thomas Keffer

unread,
Jun 16, 2014, 9:46:23 AM6/16/14
to weewx-user
All writes to the sqlite databases are protected by transactions. The database may not always be up to date, but it should never be inconsistent.

-tk


On Mon, Jun 16, 2014 at 5:53 AM, Remigiusz Zukowski <rola...@gmail.com> wrote:
Thank you for the script. However, I am curious about one thing: your script does a plain copy of the sqlite database file. What if weewx writes to the file during the copy and the copy ends up inconsistent? The chance of that is very small, but not zero, is it? I am not an expert, but sqlite itself has some means for database backup. Another solution is to shut down weewx, do a file copy, and then start weewx again.


William Phelps

unread,
Jul 18, 2014, 9:07:27 PM7/18/14
to weewx...@googlegroups.com
Well, it turns out that Remigiusz Zukowski is correct. The database can and will be corrupted if all you do is copy it and weewx makes an update during the copy. I just had to restore a database from a backup, and the restored database was corrupt. I had to repair it.

There is a documented procedure for backing up an sqlite3 database without damage. I am testing changes to my script now, and when I've got it done I'll post back here with the update.

William

Andrew Milner

unread,
Jul 18, 2014, 9:33:15 PM7/18/14
to weewx...@googlegroups.com
Sod's law always wins!!

William Phelps

unread,
Jul 19, 2014, 3:47:05 AM7/19/14
to weewx...@googlegroups.com
In order to get a safe backup of the database, it must be done while weewx is not running. I've tried starting up sqlite3 and running a backup command, but it will fail with "database locked" if weewx is doing an update. It can also cause weewx to quit if it finds the database locked by another program.

My backup script can stop the weewx service, wait a few seconds, then have sqlite make a backup, then restart weewx again. So far that's the only way I can think of to do this safely.
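
In outline, something like this (the init script and paths are assumptions):
#!/bin/sh
# stop weewx, let it exit cleanly, back up via sqlite, restart
/etc/init.d/weewx stop
sleep 10
sqlite3 /var/lib/weewx/weewx.sdb ".backup /var/backup/weewx.sdb"
/etc/init.d/weewx start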

Tom (or anyone else), any other ideas? Can this be done within weewx so that it makes a backup copy of the database on a schedule? I suppose even doing that there's still a race condition...

William

Andrew Milner

unread,
Jul 19, 2014, 3:53:06 AM7/19/14
to weewx...@googlegroups.com
Why make it more complex than it needs to be - KISS. Stop weewx, do housekeeping, restart weewx: that just has to be the simplest thing to do. Weewx will recover back to where it was wrt archives etc, and nothing will be lost. No need to try to be clever just for the sake of it, and no need to build housekeeping into weewx, as everyone has different methods and frequencies.



Thomas Keffer

unread,
Jul 19, 2014, 10:16:30 AM7/19/14
to weewx-user
Sorry, but I've never really thought about it. I've been doing a simple cron backup to a tarball for years without stopping weewx. AFAIK, it's never failed, nor has it ever corrupted the backup. 

The RPi SD disk can be very slow to lock and unlock with sqlite. I use a hard disk, so perhaps that is the reason I have never encountered a database locked error.
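
Something along these lines (paths illustrative - note that % has to be escaped in a crontab):
# nightly tarball of the whole weewx tree
0 3 * * * tar czf /backup/weewx-$(date +\%Y\%m\%d).tar.gz /home/weewx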

-tk



William Phelps

unread,
Jul 19, 2014, 11:55:39 AM7/19/14
to weewx...@googlegroups.com
Tom,

Have you downloaded a backup copy of the database and run an integrity check? When I had to restore from backup, an integrity check showed the database had errors:

weewx@fhpwx ~/test15 $ sqlite3 weewx.sdb.bad
SQLite version 3.7.13 2012-06-11 02:05:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> pragma integrity_check;
*** in database main ***
On tree page 1239 cell 2: Rowid 1399562700 out of order (previous was 1936433312)
Corruption detected in cell 0 on page 1239
Corruption detected in cell 1 on page 1239
Corruption detected in cell 2 on page 1239
Fragmentation of 573 bytes reported as 0 on page 1239
Page 1436: btreeInitPage() returns error code 11
On tree page 1385 cell 80: Child page depth differs
On tree page 2 cell 15: Child page depth differs
On tree page 2 cell 16: Child page depth differs
rowid 1936433312 missing from index sqlite_autoindex_archive_1
Error: database disk image is malformed
sqlite>
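
(A common repair recipe, for anyone who hits this - filenames are illustrative; the sed turns the trailing ROLLBACK that .dump emits after errors into a COMMIT so the salvaged rows are kept:)
sqlite3 weewx.sdb.bad ".dump" | sed 's/^ROLLBACK;.*$/COMMIT;/' | sqlite3 weewx.sdb.recovered
sqlite3 weewx.sdb.recovered "pragma integrity_check;"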

William Phelps

unread,
Jul 19, 2014, 11:57:25 AM7/19/14
to weewx...@googlegroups.com
Andrew,
Thanks. I agree with you completely. I've modified my backup scripts to stop the weewx service, wait 30 seconds (to allow time for weewx to end), tar the database files, and then restart weewx.
William

Thomas Keffer

unread,
Jul 19, 2014, 11:57:52 AM7/19/14
to weewx-user
I've used a backup copy but, true, I have not run an integrity check. When I get home I'll give it a try. Maybe I have a 5 year store of nothing but bad sqlite databases. :-)

-tk


Lloyd Adams

unread,
Jul 20, 2014, 6:29:07 AM7/20/14
to weewx...@googlegroups.com
Rather than simply copy the files, I use the sqlite dump command (inherited from when I ran wview) - mind you, I've never checked the integrity either.

Lloyd

William Phelps

unread,
Jul 21, 2014, 12:52:07 AM7/21/14
to weewx...@googlegroups.com
I've used the dump command to recover from a damaged database file, but I'm not sure why you would use it instead of the backup command to create a backup copy, since the dump output is significantly larger. In either case, if weewx tries to do an update while the dump (or backup) command is running, weewx will report a "database locked" error and stop. Might as well just stop weewx and copy the files.

mwall

unread,
Jul 21, 2014, 7:20:15 AM7/21/14
to weewx...@googlegroups.com
On Monday, July 21, 2014 12:52:07 AM UTC-4, William Phelps wrote:
I've used the dump command to recover from a damaged database file, but I'm not sure why you would use that instead of backup to create a backup copy, since the dump output is significantly larger.

one reason to dump is that the dump file can be loaded on any architecture.  i once had an arm cpu saving to sqlite, and i was just backing up the .sdb file.  the system crashed, but i thought "no problem, i have a backup".  unfortunately, at that time i had only intel hardware to do a recovery, so i could not get the data from the .sdb file.
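
restoring such a dump on whatever hardware is at hand is then just (filenames assumed):
gunzip -c weewx.sql.gz | sqlite3 weewx-restored.sdb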

now i always install a cron job that does a dump of every database.  for mysql the script looks like this:

#!/bin/sh
DATABASES="archive stats"
pw=`cat /root/mysqlpw`
mkdir -p /var/lib/mysql-dumps
for db in mysql $DATABASES; do
  echo `date` dumping $db
  /usr/bin/mysqldump -uroot -p$pw $db | gzip -c > /var/lib/mysql-dumps/$db.sql.gz
done

and for sqlite:

#!/bin/sh
DATABASES="weewx stats cmon forecast"
mkdir -p /var/lib/sqlite-dumps
for db in $DATABASES; do
  echo `date` dumping $db
  echo '.dump' | sqlite3 /var/lib/weewx/$db.sdb | gzip -c > /var/lib/sqlite-dumps/$db.sql.gz
done

then the incremental backups contain both platform-neutral dump files *and* the binaries.

 
In either case if weewx tries to do an update while the dump (or backup) command is running, weewx will report a "database locked" error and stop. Might as well just stop it and then just copy the files.

we should check this.

when weewx is running a report and it gets a 'database locked', it simply skips the template in the report (or perhaps the entire report?)

i would think the desired behavior for *saving* to the database should be to log the failure then continue, not to stop.  that way rapidfire or other loop-based uploads could still work.  however, this might be different than tom's original intent.  any thoughts?

m

Lloyd Adams

unread,
Jul 21, 2014, 8:16:23 AM7/21/14
to weewx...@googlegroups.com
I've so far not experienced any issues with weewx stopping during a backup (and because I don't use a fixed loop interval, statistically I think I would have had a clash writing to the databases by now).

William Phelps

unread,
Jul 21, 2014, 11:11:05 AM7/21/14
to weewx...@googlegroups.com
I run my station with a 1 minute archive interval and have a large database. When I said "weewx will report an error and stop", I was not speculating. In my testing that is what happened when I used a script to make a backup copy using sqlite3's backup command.

mwall: I agree, that's a good reason for using dump; it gives you a file that can be restored on any architecture. It can also be inspected with any editor and checked for errors. I hadn't thought about gzip'ing the output, that's a great idea - the result is much smaller than the uncompressed dump. I am going to have to add this to my script. Thanks!

Andrew Milner

unread,
Jul 21, 2014, 12:58:11 PM7/21/14
to weewx...@googlegroups.com, wbph...@gmail.com
William .... just a thought .... albeit a little radical, but one which has had me pondering a lot recently trying to work out the best solution .....

Like you I have a big database .... but do I need it to be so big I am asking myself .....

For 12 - perhaps 24 months one may look at and use detailed archive records.
After that one probably needs to know the information that is currently in the stats database ... with one or two additions/corrections which are currently in the NOAA reports .... and I would almost put money on it that the detailed information is never even looked at.  For graphs over long periods one only gets one plot per day maximum ....

How can we get weewx to keep details for 24 months, then create extended stats / summary for older data ... and therefore greatly reduce the data storage whilst greatly improving report and data access times .....

Just some food for thought ....  

Of course, whilst generating the summary records and dropping the old data - the old data from 3 years ago could just be offloaded into its own table and zipped up for genuine archival purposes. Of course most of the data is more than likely also being held on WU or elsewhere - so there is even less reason to keep it in local unused archive stores .....

Andrew

vds

unread,
Jul 21, 2014, 1:02:33 PM7/21/14
to weewx...@googlegroups.com
On Saturday, July 19, 2014 8:57:52 AM UTC-7, Tom Keffer wrote:
I've used a backup copy but, true, I have not run an integrity check. When I get home I'll give it a try. Maybe I have a 5 year store of nothing but bad sqlite databases. :-)


Tom - I checked all 85 of my daily sqlite3 archive backups and they were all good.  I have 675,000 records in the archive dating back 7.5 years.

I think the key is to not run sqlite3 commands against the live .sdb file when weewx is running.  I let the filesystem handle the locking/unlocking for me.  Thus far weewx (and previously wview) always wait gracefully.

All I do is copy the .sdb files to another partition on (spinning) disk and then gzip those offline files up. That limits the window to about 20 seconds once a day while the copy is running. I've never seen a database locked error in the logs, nor has weewx ever had any unexplained stops that I can recall. Works thus far for me.

That said, the sqlite3 'pragma integrity_check' thing is definitely great to know about. Wiki-worthy info!

I also did some timing and size tests that had interesting results. This is a (wimpy) Seagate Dockstar running current Debian...
  • 20 seconds to copy the 110 MB .sdb file to a different partition
  • 40 seconds to gzip it down to 23 MB in size
  • alternatively, it would be 310 seconds to sqlite .dump to a 230 MB file
  • and then 80 seconds to gzip the .dump file down to a 17 MB size
  • (surprising result - a gzipped .dump file is far smaller than the gzipped .sdb file - gzip loves to compress ascii files)

But I guess I agree that the most guaranteed way (sketched after this list) would be to:
  • copy the .sdb files offline
  • check their integrity
  • .dump them
  • gzip the .dump files
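
A sketch of those four steps (paths are assumptions):
#!/bin/sh
# copy offline, verify, dump, compress
cp /var/lib/weewx/weewx.sdb /backup/weewx.sdb
sqlite3 /backup/weewx.sdb "pragma integrity_check;"
sqlite3 /backup/weewx.sdb ".dump" > /backup/weewx.sql
gzip -f /backup/weewx.sql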


Thomas Keffer

unread,
Jul 21, 2014, 5:45:49 PM7/21/14
to weewx-user

Good data!

It does not surprise me that the gzipped dump file is smaller --- it does not contain as much information. It's missing the indexes.

-tk

Fat-fingered from my Android

Alan

unread,
Apr 11, 2016, 6:11:08 AM4/11/16
to weewx-user
William,
Would you be willing to share your backup script? I looked at the Amazon S3 script you posted and will plan to use that for remote storage. Your contributions are much appreciated.

Thanks,
Alan

Andrew Milner

unread,
Apr 11, 2016, 6:48:18 AM4/11/16
to weewx-user
Seeing this post got me wondering, yet again, about how often the detailed archive data is used outside of, say, a two year window (current year and previous year).  It seems logical to me that at the end of the year weewx should create a dump file of the archive data older than (year end - 2), retaining only the recent years.  The daily summary data can be retained ad infinitum.  This then opens up the possibility to be a little bit like mesowx and have three levels of record in the database:

Loop data retained for a period set by a .conf parameter (default say 3 days) - with rolling deletion as mesowx does
Archive data retained for a period set by a .conf parameter (default say 2 years) - with rolling deletion to an annual gzipped 'yeardump' file, its name including the year of the dumped data
Daily summary data retained forever - as now

What do others think?  I know that personally I compare in detail with only the previous year but frequently use history summary data (like bootstrap's history page)

A spin-off should be smaller indexes and faster database access, as well as smaller and thus faster regular backups.

Just food for thought ...... for version 4 perhaps?

vince

unread,
Apr 11, 2016, 12:23:11 PM4/11/16
to weewx-user
On Monday, April 11, 2016 at 3:48:18 AM UTC-7, Andrew Milner wrote:
Seeing this post got me wondering, yet again, about how often the detailed archive data is used outside of, say, a two year window (current year and previous year).  It seems logical to me that at the end of the year weewx should create a dump file of the archive data older than (year end - 2), retaining only the recent years.  The daily summary data can be retained ad infinitum.  This then opens up the possibility to be a little bit like mesowx and have three levels of record in the database:



Disagree.  Not seeing the benefit.  A big file on disk can corrupt as easily as a small file on disk.    Also not seeing any speed benefit.

I use old data frequently, sometimes doing queries, sometimes fixing stuff.  Users sometimes need to rebuild summary tables and noaa files from 'all' the archives.   Most frequently for me this was when I needed to clean up bad 'recent' records in the archive table, but I wanted to regenerate my summary and noaa files to not have them reference the deleted (bad) data.

That said, as long as the default behavior is the same as today, no objections to 'optional' settings to do something different.   Tom usually likes pull requests with patches/fixes/features.

(weewx doesn't save LOOP data now at all, does it?)

Andrew Milner

unread,
Apr 11, 2016, 1:01:46 PM4/11/16
to weewx-user
I am surprised if it takes you more than 2 years to find and correct archive errors!!  Under my suggestion regeneration of summaries etc would involve deletion and rebuilding for the time period covered by the current archive (eg 2 yrs).

Old data is still available (always) from the summary tables - how often do you really need to query the detailed archive instead of the daily summaries when dealing with data more than 2 yrs old??

weewx does not save Loop data - but mesowx does - and I merely suggest including mesowx approach to loop data as part of having an archive hierarchy instead of the current flat 'retain everything regardless' archive database.

vince

unread,
Apr 11, 2016, 1:40:19 PM4/11/16
to weewx-user
On Monday, April 11, 2016 at 10:01:46 AM UTC-7, Andrew Milner wrote:
I am surprised if it takes you more than 2 years to find and correct archive errors!!  Under my suggestion regeneration of summaries etc would involve deletion and rebuilding for the time period covered by the current archive (eg 2 yrs).


Nothing surprises me anymore.

Re: the fixup, yes I fixed more than 2 years of data a couple times after I switched weather stations from a lame'o 2315 to my VP2.   Nuked all the wind data for records with outlying values (Lacrosse bug) and also any ridiculously high temperatures (Lacrosse didn't shield well from direct sunlight).  So it happens.
 
Old data is still available (always) from the summary tables - how often do you really need to query the detailed archive instead of the daily summaries when dealing with data more than 2 yrs old??


Often enough. I do a lot of querying old/historical archive data looking for hi/low and special conditions from time to time.
 
weewx does not save Loop data - but mesowx does - and I merely suggest including mesowx approach to loop data as part of having an archive hierarchy instead of the current flat 'retain everything regardless' archive database.


Again, no objections for 'optional' features and behavior, but I'd suggest the default  stay with the devil we know  :-)
 

Andrew Milner

unread,
Apr 11, 2016, 2:13:27 PM4/11/16
to weewx-user
The Hi/low data queries should be in the summary data and not require a scan of, or access to, the full detailed archive.  Sounds like you don't trust weewx to put the correct data in the summary tables!!

I remain surprised that it took you two years after changing stations to 'discover' that there were errors in the 2+ year old data.  I assume you just deleted records - which of course means that those old records, whilst not containing errors, were actually incomplete.  Anyway, after your editing or massaging, with 'acceptable' values invented/generated, the value of those historic records would have been reduced.

Thomas Keffer

unread,
Apr 11, 2016, 6:03:50 PM4/11/16
to weewx-user
Weewx is not so "wee" anymore. There's no way I'd want to add archiving and backup to it. Those are clearly jobs that can be easily handled by external systems tools.

-tk


Yves Martin

unread,
Apr 12, 2016, 9:54:28 AM4/12/16
to weewx-user
I will answer my own question :) I see this question is still open, and backup seems to be an important aspect of this nice piece of software. I lost 2 years of data myself because of a corrupted SD card; since then I make regular backups of my sqlite databases and config, like this:

I'm using rsync under Debian to my Synology NAS. "cron" provides useful directories for daily, weekly, and monthly scripts...

Daily script here ... /etc/cron.daily/weewx-backup-daily

--- script start here ---

#!/bin/sh
#This script backs up weewx to NAS BACKUP (YMartin.com)
#Line added to prevent "TERM environment variable not set" error

DATE=`date +"%y%m%d"`
export TERM=${TERM:-dumb}

clear

echo "Backing up weewx databases"

cd /var/lib/weewx
cp *.sdb /var/backup-weewx/db/

tar cf /var/backup-weewx/weewx-db-$DATE.tar /var/backup-weewx/db/*
gzip -f /var/backup-weewx/weewx-db-$DATE.tar

sshpass -p 'xxxxxxxx' scp -rpC /var/backup-weewx/weewx-db-$DATE.tar.gz ad...@192.168.8.22:/volume1/BACKUP/Meteo/weewx/db-backup/

rm /var/backup-weewx/weewx-db-$DATE.tar.gz

echo "Done - weewx databases backed up and uploaded to NAS"

--- script stop here ---

... and weekly, this is my script in /etc/cron.weekly/weewx-backup-weekly:

--- script start here ---

#!/bin/sh
#This script backs up weewx to NAS BACKUP (YMartin.com)
#Line added to prevent "TERM environment variable not set" error

DATE=`date +"%y%m%d"`
export TERM=${TERM:-dumb}

clear

echo "Backing up weewx"

cd /etc/cron.daily/
cp -u weewx-backup-daily /var/backup-weewx/cron/

cd /etc/cron.weekly/
cp -u weewx-backup-weekly /var/backup-weewx/cron/

cd /etc/weewx/
cp -u weewx.conf /var/backup-weewx/conf/

cd /etc/weewx/skins/Bootstrap/
cp -r * /var/backup-weewx/Bootstrap/

cd /var/lib/weewx
cp *.sdb /var/backup-weewx/db/

cd /usr/share/weewx/user/
cp -r * /var/backup-weewx/user/

cd /var/www/weewx/
cp -r * /var/backup-weewx/www/

tar cf /var/backup-weewx/weewx-backup-$DATE.tar /var/backup-weewx/*
gzip -f /var/backup-weewx/weewx-backup-$DATE.tar

sshpass -p 'xxxxxxxx' scp -rpC /var/backup-weewx/weewx-backup-$DATE.tar.gz ad...@192.168.8.22:/volume1/BACKUP/Meteo/weewx/backup/

rm /var/backup-weewx/weewx-backup-$DATE.tar.gz

echo "Done - weewx backed up and uploaded to NAS"

--- script stop here ---

That's it!
I'm safer now ;)

Yves,
YMartin.com/meteo

Jlou 43

unread,
Apr 19, 2016, 6:17:45 PM4/19/16
to weewx-user
Thank you for the scripts, Yves - but don't you stop weewx when you are doing your backup?

Thank you.
J.L


Yves Martin

unread,
Apr 28, 2016, 8:24:05 AM4/28/16
to weewx-user
No, you don't have to.

Yves,
YMartin.com/meteo