Automated incremental backup of local git repo

Michael

Nov 4, 2014, 4:23:46 PM
to chromi...@chromium.org
Those of you who back up your local changes consistently, how do you do it? I've been digging around the git docs and StackExchange trying to find a solution. So I wanted to crowdsource this question -- I'm sure people have already written useful scripts for this.

My requirements are:
  1. Backups do not include commits in the remote (origin/master)
  2. Backs up all local branches (which may branch at different points and from different local/remote branches)
  3. Recovery is easy using an up-to-date repo. Individual branches can be recovered even if the same branch exists in the existing repo (say, if I use git squash to collapse a commit list, then want to see the original commits from yesterday's backup without digging through reflog)
  4. Can be automated
I think what I want involves "git bundle" but I haven't figured out how to not include the entire project's history in a bundle.

Thanks.
Michael

Sasha Bermeister

Nov 5, 2014, 3:08:52 PM
to mich...@chromium.org, chromi...@chromium.org
Are there really no replies to this? I've been wondering the same thing. I tried a few things from Stack Overflow, but none were convenient enough for me to run as an overnight cron job.

--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev

Scott Graham

Nov 5, 2014, 3:26:16 PM
to sas...@chromium.org, mich...@chromium.org, chromium-dev
I `git cl upload` regularly. It's not what you asked for, but given the infrequency of catastrophic failure I feel it's sufficient.

Stefan Zager

Nov 5, 2014, 3:49:26 PM
to Sasha Bermeister, Michael Giuffrida, Chromium-dev
Requirement #1 ("Backups do not include upstream commits") cannot be satisfied; git doesn't work that way. The only way to back up or archive your leaf-level commits is to archive the full repository history reachable from those commits.

I'm curious what the use case is.  I've never heard someone express a need to archive and restore branches in this way.  If it's to enable collaboration with others, or to make your personal branches available from all of your machines, we usually do that with a branch in a remote repository (possibly self-hosted).

Stefan

Dirk Pranke

Nov 5, 2014, 3:58:33 PM
to Stefan Zager, Sasha Bermeister, Michael Giuffrida, Chromium-dev
One could presumably generate a series of patches for each commit on your leaf branches, and then package those up, right? It seems like this might be what StGit and/or Quilt kinda do.
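
A minimal sketch of that patch-series idea, using plain `git format-patch` rather than StGit or Quilt. The function name `export_patches`, its arguments, and the default base `origin/master` (borrowed from the original requirements) are assumptions for illustration. `git format-patch` omits merge commits, so this only suits linear local branches.

```shell
#!/bin/sh
# Sketch: export each local branch as a series of patch files.
# export_patches, out_dir and base are hypothetical names.
export_patches() {
    out_dir=$1
    base=${2:-origin/master}   # assumed upstream ref; adjust if a branch forks elsewhere
    for branch in $(git for-each-ref --format='%(refname:short)' refs/heads/); do
        # Branch names may contain '/', so flatten them for the directory name.
        dir="$out_dir/$(printf '%s' "$branch" | tr '/' '_')"
        mkdir -p "$dir" || return 1
        # One .patch file per commit that is on the branch but not on base.
        git format-patch --quiet -o "$dir" "$base..refs/heads/$branch" || return 1
    done
}
```

The resulting files are small enough to archive anywhere, and a series can be replayed onto a fresh checkout with `git am`.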

I've occasionally wished for a way to easily be able to move the history of branches between repos without needing to be able to pull/push directly between the two.

-- Dirk

Stefan Zager

Nov 5, 2014, 4:08:06 PM
to Dirk Pranke, Sasha Bermeister, Michael Giuffrida, Chromium-dev
On Wed, Nov 5, 2014 at 12:57 PM, Dirk Pranke <dpr...@chromium.org> wrote:

> One could presumably generate a series of patches for each commit on your leaf branches, and then package those up, right? It seems like this might be what StGit and/or Quilt kinda do.
>
> I've occasionally wished for a way to easily be able to move the history of branches between repos without needing to be able to pull/push directly between the two.

Yeah, that's what I'm confused about.  Why search for third-party workarounds, when 'git push/pull' works so beautifully?  That is what git does best.

Michael Moss

Nov 5, 2014, 4:27:53 PM
to Stefan Zager, Dirk Pranke, Sasha Bermeister, Michael Giuffrida, Chromium-dev
I gather the idea (at least in the original request) is to be able to back up your work somewhere that doesn't have the capacity for multi-GB repos, like a personal cloud storage account, so push/pull between full git clones isn't an option.

Michael Moss

Nov 5, 2014, 4:33:43 PM
to Michael Giuffrida, chromium-dev
$ git bundle create localbranch.bundle origin/branch..localbranch
Will give you just the commits you've added on top of "origin/branch".

As for your other requirements, I guess you could write a script to loop over all your local branches and create a bundle like this for each. And you can read the bundle into any branch you want (e.g. git fetch localbranch.bundle localbranch:new_localbranch), so you don't need to worry about conflicts with existing branches.

I've never done this, so I don't have a canned solution for you, but it definitely seems possible with a modest amount of scripting (depending on how full-featured and robust you're looking to get).
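
A minimal sketch of such a loop, under a few assumptions: the function name `backup_branches` and its argument are hypothetical, and instead of naming one `origin/branch` per local branch it excludes everything reachable from any remote-tracking ref (`--not --remotes`), so branches forked from different upstream branches are handled the same way.

```shell
#!/bin/sh
# Sketch: one bundle per local branch, containing only commits that are
# not reachable from any remote-tracking ref.
# backup_branches and out_dir are hypothetical names.
backup_branches() {
    out_dir=$1
    mkdir -p "$out_dir" || return 1
    for branch in $(git for-each-ref --format='%(refname:short)' refs/heads/); do
        ref="refs/heads/$branch"
        # git bundle refuses to create an empty bundle, so skip branches
        # that have nothing beyond the remotes.
        count=$(git rev-list --count "$ref" --not --remotes) || return 1
        [ "$count" -gt 0 ] || continue
        # Branch names may contain '/', so flatten them for the file name.
        file=$(printf '%s' "$branch" | tr '/' '_')
        git bundle create "$out_dir/$file.bundle" "$ref" --not --remotes || return 1
    done
}
```

A bundle written this way can be read back into an up-to-date repo under a new name, in the same way as the `git fetch` example above, as long as that repo already has the upstream commits the bundle depends on.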
 


Yuri Wiitala

Nov 5, 2014, 4:47:40 PM
to mm...@chromium.org, Michael Giuffrida, chromium-dev
I had this question myself about a year ago.  My conclusion: Just back up the whole filesystem.  My rationale is that the amount of time I'd spend setting this up correctly, plus the possibility of accidentally doing it wrong, plus the amount of time I'd spend restoring my system down-the-road is not worth the savings in storage space.  Also, I really don't care if my nightly backup takes 10-15 minutes instead of 10-15 seconds.  ;-)

Scott Hess

Nov 5, 2014, 8:50:24 PM
to Yuri Wiitala, Michael Moss, Michael Giuffrida, chromium-dev
My strategy is to have a central repo which pulls from everywhere else.  If it pulls from upstream first, then all the other pulls are trivial.  It costs disk space, but mostly disk is cheap.
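
A rough sketch of that ordering, run inside the central repo. It assumes the upstream remote is called `origin` and that every other checkout has been registered as an additional remote; the function name `fetch_upstream_first` is hypothetical, and it uses `git fetch` rather than a merging pull.

```shell
#!/bin/sh
# Sketch: fetch upstream first, then every other remote, so the later
# fetches only need to transfer the commits that are not yet upstream.
# fetch_upstream_first is a hypothetical name.
fetch_upstream_first() {
    upstream=${1:-origin}      # assumed name of the upstream remote
    git fetch --quiet "$upstream" || return 1
    for name in $(git remote); do
        [ "$name" = "$upstream" ] && continue
        git fetch --quiet "$name" || return 1
    done
}
```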

The problem with backing up the whole filesystem is that git is a database running on top of the filesystem, and the filesystem backup may-or-may-not respect the guarantees git expects to see between individual files.  Same problem as with backing up a database at the filesystem level.  Back things up at the git level if you really want to be able to get your info out later.  [Generating a bunch of format-patch files and backing those up is much less problematic.]

-scott

Primiano Tucci

Nov 6, 2014, 10:02:13 AM
to Michael Giuffrida, Stefan Zager, Dirk Pranke, Sasha Bermeister, Chromium-dev, Michael Moss
If requirement #1 can be relaxed (why do you care?) it becomes fairly easy to handle in a very gitorious way.
Let's imagine that you, or the company you work for, have a git server (or a private GitHub account with enough quota). If the git server were smart enough, you could even ask it to fork an existing project (chromium's src) like you do on GitHub.
You could then fork the original chromium project (which might take a bit of time, but only the first time) and then, on every backup, push all the new objects introduced by your local branches.

In concrete words, this would become something like

Set up the backup
$ your-git-server-ctl create user/michaelpg/my-chrome-backup -fork chromium/chromium/src
# Wait for the server to clone repo, might take a while. You can check the replication status with git ls-remote git://user/michaelpg/my-chrome-backup
$ git remote add backup git://user/michaelpg/my-chrome-backup
$ git fetch backup  # This should be very quick, as you already have the same objects in the 'origin' remote

Alternatively, if you don't have a git server, you could use a filesystem location (e.g. another drive) as your backup repo, i.e.:
$ mkdir /mnt/boh/my-backup.git && git -C /mnt/boh/my-backup.git init --bare
$ git remote add backup /mnt/boh/my-backup.git
The only problem is that the first push (see below) will take a while because you will have to replicate all the chromium repo objects.

Backup all branches
$ git push --all --force backup  # This will push all the new objects to the backup repo

Backup only one branch
$ git push --force backup local_branch_name

See the diff of a branch w.r.t the last backup
$ git diff branch1..remotes/backup/branch1  # (you can strip the remotes/ part on most modern git versions)

See the new commits on a branch since the last backup
$ git log remotes/backup/branch1..branch1

Restore the state of a branch from the backup (run with that branch checked out; this discards uncommitted changes)
$ git reset --hard remotes/backup/branch1
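
For the automation requirement, the push can be wrapped in a small function and called from a nightly cron job. The sketch below is one possible variant, not part of the recipe above: instead of overwriting the previous backup, it pushes every local branch into a dated namespace on the `backup` remote, so yesterday's state of a branch stays available after a squash. The function name `backup_snapshot` and the `snapshots/<date>/` prefix are assumptions.

```shell
#!/bin/sh
# Sketch: push all local branches to the backup remote under a dated
# prefix, e.g. refs/heads/snapshots/2014-11-06/branch1.
# backup_snapshot and the snapshots/ prefix are hypothetical names.
backup_snapshot() {
    repo=$1                    # path to the local working repo
    remote=${2:-backup}        # remote name used in the steps above
    stamp=$(date +%Y-%m-%d)
    # --force only matters when the script runs twice on the same day.
    git -C "$repo" push --quiet --force "$remote" \
        "refs/heads/*:refs/heads/snapshots/$stamp/*"
}
```

After a `git fetch backup`, an older snapshot shows up as a remote-tracking ref such as `remotes/backup/snapshots/<date>/branch1` and can be inspected or checked out under a new branch name without touching the current `branch1`.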
