|Restarting interrupted git operations (clone, fetch, push, update, etc)||TSU||11/1/12 12:21 PM|
How robust are git operations against remote repos?
Specifically, I've been reading a number of Internet postings about whether an interrupted clone can be restarted, and I'm now also curious whether other network operations such as fetch and push can recover and continue an interrupted transfer of files.
Please verify that the following commands can restart a partially built clone...
$ cd project
$ git fetch
$ git reset --hard
If "git fetch" does not work, then the following commands
|Re: Restarting interrupted git operations (clone, fetch, push, update, etc)||Thomas Ferris Nicolaisen||11/2/12 2:01 AM|
The short answer is that you don't have to worry about any of these. Most Git operations are extremely reliable. And if something is missing, Git will detect it quickly with its internal consistency checks (every object in the Git database has a verifiable checksum, its SHA-1 hash, so this is both safe and fast), and then give you the information you need to fix it (if you ask for it).
In my three years of using Git every day, I've never once managed to corrupt my own or anyone else's repositories.
Some 3rd party tools have been a bit clumsy in how they operate on Git repositories and can leave one in an inconsistent state. This has nothing to do with Git's object database getting corrupted; the tool has simply messed up some pointer (reference) in the repository.
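To make the checksum point concrete, here is a minimal sketch of how an object ID can be reproduced by hand. It assumes a SHA-1 repository format and that `sha1sum` is available; the sample content is arbitrary.

```shell
#!/bin/sh
# Sketch: a blob's object ID is the SHA-1 of "blob <size>\0" plus its bytes.
content='hello world'

# ID as Git computes it (nothing is written, so no repository is needed).
via_git=$(printf '%s\n' "$content" | git hash-object --stdin)

# The same ID computed by hand.
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')
manual=$( { printf 'blob %s\0' "$size"; printf '%s\n' "$content"; } | sha1sum | cut -d' ' -f1)

echo "git:    $via_git"
echo "manual: $manual"
```

Because the ID is derived from the content, any object that was truncated or altered in transit no longer matches its name, which is what the consistency checks rely on.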
A very easy way to recreate such a problem is to edit .git/refs/heads/master inside a temporary repository, and change the SHA code in there by just one character. Suddenly we get:
> git status
fatal: bad object HEAD
In this state I can't do anything with my repository, not even run git status, so I'll just remove the reference.
I can now run git status, but when I run git log I get a slightly different message:
fatal: bad default revision 'HEAD'
My HEAD reference (.git/HEAD) points to refs/heads/master, which doesn't exist (git log implicitly runs git log HEAD).
So now, I need to change refs/heads/master to something that makes sense. Let's go looking for lost commits:
git fsck --lost-found
notice: HEAD points to an unborn branch (master)
Checking object directories: 100% (256/256), done.
Checking objects: 100% (58/58), done.
dangling commit ce6224048ea952a67678b6433ff048ac171655d9
There's a dangling commit. Let's point refs/heads/master to that and see what we get.
git update-ref refs/heads/master ce6224048ea952a67678b6433ff048ac171655d9
Ah, seems that was the commit master used to be pointing at before we "corrupted" the repository. So, we're back home where we started.
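The walkthrough above can be replayed in a throwaway repository. This is only a sketch under a few assumptions: the repository uses loose ref files under .git/refs, the demo identity and the bogus SHA are made up, and `--no-reflogs` is passed so that the reflog of a freshly made commit does not keep it "reachable" and hide it from the dangling list.

```shell
#!/bin/sh
set -e
# Throwaway repository with a single commit on master.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"
orig=$(git rev-parse HEAD)

# "Corrupt" the branch: make it name an object that does not exist.
echo 0000000000000000000000000000000000000001 > .git/refs/heads/master
git status >/dev/null 2>&1 || echo "git status fails while the ref is bogus"

# Remove the broken reference; HEAD now points at an unborn branch.
rm -f .git/refs/heads/master

# Look for the lost commit and point master back at it.
lost=$(git fsck --lost-found --no-reflogs 2>/dev/null |
       awk '/^dangling commit/ { print $3; exit }')
git update-ref refs/heads/master "$lost"
git log --oneline -1
```

With more than one dangling commit, the first match is not necessarily the right one, so in a real repository it is worth inspecting each candidate with git show before updating the ref.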
To recap: It is physically possible to end up with a Git repository that has a pointer to a commit that does not exist. It is, however, extremely rare, and has more to do with unreliable tooling or manual fiddling with the repository (like I did above). No Git commands are particularly prone to causing this. Git push and pull will not update any pointers until all the commits have been safely transferred. Local operations like commit are atomic.
|Re: Restarting interrupted git operations (clone, fetch, push, update, etc)||TSU||11/2/12 9:48 AM|
Thx for posting, some good stuff for me to chew on.
Still, for some particularly long-running operations it would be good to know whether there is a way to "resume" something in progress, rather than simply discovering the anomaly, wiping it completely and starting again from the beginning.
|Re: Restarting interrupted git operations (clone, fetch, push, update, etc)||Thomas Ferris Nicolaisen||11/2/12 1:58 PM|
Just repeat the operation with the same command you used the first time, unless it was git clone, in which case you can resume it with git fetch.
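Repeating the command by hand gets tedious on a flaky link, so a small retry wrapper may help. The sketch below is not a Git feature: the function name `retry`, the `RETRY_DELAY` variable and the remote name `origin` are all assumptions made for illustration.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
    max=$1
    shift
    attempt=1
    while :; do
        "$@" && return 0
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        # Pause between attempts; RETRY_DELAY is a hypothetical knob (seconds).
        sleep "${RETRY_DELAY:-5}"
    done
}

# Example use, assuming a remote called "origin":
#   retry 5 git fetch origin
```

Since refs are only updated once a transfer has completed, re-running the same fetch or push after a failure is safe.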
|Re: [git-users] Re: Restarting interrupted git operations (clone, fetch, push, update, etc)||Konstantin Khomoutov||11/3/12 9:29 AM|
Continuing the "resumable clone" thread, one nifty feature of
gitolite was discussed fairly recently on the main Git list.
Basically, gitolite can be configured to maintain a "Git bundle" (see
the git-bundle manual), which can then be made downloadable via rsync
or HTTP and fetched using an rsync client or an HTTP client that
supports resuming. This technique makes the "download everything" and
"make a repo out of the downloaded stuff" steps distinct, and the
first step can be carried out over any number of attempts. The
downsides are obvious:
1) This requires special setup on the server side.
2) It's unclear what happens if someone manages to update a repository
while someone is downloading its bundle, or the update happens
between the adjacent download attempts.
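For illustration, the bundle round trip can be sketched locally with plain Git, leaving gitolite's own configuration aside. The paths, the demo identity and the example URL in the comments are assumptions; only the download step would differ in real use.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# "Server" side: a source repository with one commit.
git init -q "$work/src"
git -C "$work/src" symbolic-ref HEAD refs/heads/master
git -C "$work/src" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"

# Pack every ref and its history into a single file.
git -C "$work/src" bundle create "$work/project.bundle" --all >/dev/null 2>&1

# Step 1, "download everything": in real use the file would be fetched
# with a resumable client, for example (hypothetical URL):
#   curl -C - -O https://example.com/project.bundle
# Here the file is already local, so nothing needs to be transferred.

# Step 2, "make a repo out of the downloaded stuff".
git clone -q "$work/project.bundle" "$work/clone"

# Afterwards the clone can be pointed at the real server and brought
# up to date with an ordinary fetch:
#   git -C "$work/clone" remote set-url origin <real-url>
#   git -C "$work/clone" fetch origin
```

A bundle that is somewhat stale is not a problem in itself, since the final fetch only has to transfer whatever was pushed after the bundle was created.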