Corrupted a Github repo via the Github API


Jonathan "Duke" Leto

Nov 6, 2013, 8:52:19 PM
to pdx...@googlegroups.com
Howdy,

Interesting times!

I was using PyGithub to create commits/branches on Github. No big
deal. Everything was working on a throw-away repo, so I ran it against
the production repo....

The Github API somehow let me create a corrupt tree object and now the
repo is unclonable:

https://gist.github.com/leto/7346617

I know we have some Github API experts around. Any ideas? I included
the debug info for the request that created the tree in the above
Gist. My current guess is that it is something to do with the path of
the file in my tree. I used a name that starts with a forward slash
(/study/10/10.json). Also, my previous tests did not reference a file
in a subdirectory. It could be that PyGithub did not properly set the
recursive=1 flag that is mentioned here:

http://developer.github.com/v3/git/trees/#get-a-tree-recursively

Duke

--
Jonathan "Duke" Leto <du...@leto.net>
Leto Labs LLC http://letolabs.com
209.691.DUKE http://duke.leto.net
@dukeleto LinkedIn Github

Christopher Swenson

Nov 7, 2013, 12:00:36 AM
to pdx...@googlegroups.com
Can you fork the repo on GitHub itself?

Otherwise, you might be hosed. Might be time to call the guys at GitHub to fix your tree.

--Christopher


--
You received this message because you are subscribed to the Google Groups "pdxgit" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pdxgit+un...@googlegroups.com.
Visit this group at http://groups.google.com/group/pdxgit.
For more options, visit https://groups.google.com/groups/opt_out.

Daniel Hedlund

Nov 7, 2013, 12:54:02 AM
to pdx...@googlegroups.com
> I know we have some Github API experts around. Any ideas? I included
> the debug info for the request that created the tree in the above
> Gist. My current guess is that it is something to do with the path of
> the file in my tree. I used a name that starts with a forward slash
> (/study/10/10.json). Also, my previous tests did not reference a file
> in a subdirectory. It could be that PyGithub did not properly set the
> recursive=1 flag that is mentioned here:

GitHub has switched over to using the "smart" protocol for all traffic. If there were still a way to access the remote repository using the "dumb" protocol, you might have a shot at pulling the raw pack files across to your machine and digging into what actually went wrong. It might still be possible if you have full access to your repo via SSH, but I'm guessing that GitHub uses something proprietary and wouldn't give you access to run commands anyway. See the following for some examples, though:
http://git-scm.com/book/en/Git-Internals-Transfer-Protocols

If you just want a working repo again, try deleting the branch remotely. Failing that...

It's possible to clone the repo, just not the "leto_study_10" branch.  For example, the following works:
$ git clone -vv --single-branch -b master https://github.com/OpenTreeOfLife/treenexus.git

It's also possible to initialize a new repo and fetch all commits and branches excluding the corrupt one (master is one commit behind your corrupt branch):
$ mkdir repo-temp
$ cd repo-temp
$ git init .
$ git remote add origin https://github.com/OpenTreeOfLife/treenexus.git
$ git ls-remote origin
$ git fetch origin app/synthesis draft/jimallman/study/10-jones-et-al-2001 example-annotations master modified-study-10-jimallman modified-study-999-jimallman

Whether you can do a force push at this point...no idea.
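
If typing out every healthy branch by hand gets tedious, the list for that fetch could be built from the `git ls-remote` output. This is only a rough sketch: the function name `branches_to_fetch` and the `sample` text are made up for illustration (the SHAs are copied from the verbose clone output in this message), and it assumes the usual "<sha><TAB><ref>" line format.

```python
def branches_to_fetch(ls_remote_output, skip):
    """Return branch names from `git ls-remote` output, minus those in `skip`."""
    prefix = "refs/heads/"
    names = []
    for line in ls_remote_output.splitlines():
        fields = line.split()
        # Each line is "<sha>\t<ref>"; ignore HEAD, tags, and blank lines.
        if len(fields) != 2 or not fields[1].startswith(prefix):
            continue
        name = fields[1][len(prefix):]
        if name not in skip:
            names.append(name)
    return names


# Hypothetical sample mimicking a few lines of ls-remote output.
sample = (
    "5762c8194c718e22bcf17d53818477320be3658d\tHEAD\n"
    "8afdbadb02e0fcff6df84453a618f105978dbc8f\trefs/heads/leto_study_10\n"
    "5762c8194c718e22bcf17d53818477320be3658d\trefs/heads/master\n"
)
print(branches_to_fetch(sample, {"leto_study_10"}))  # prints ['master']
```

The resulting names can then be passed as arguments to `git fetch origin`.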


For a little more info, you can clone the repo with more verbosity:

$ git clone -vv https://github.com/OpenTreeOfLife/treenexus.git
Cloning into 'treenexus'...
Server supports multi_ack_detailed
Server supports no-done
Server supports side-band-64k
Server supports ofs-delta
Server version is git/1.8.4
want 5762c8194c718e22bcf17d53818477320be3658d (HEAD)
want 54b865be4f4e646eb2de9059ad4731d0b5e6b37b (refs/heads/app/synthesis)
want 229d7a6bc914b71f5ed3a68add26036b7ea229f0 (refs/heads/draft/jimallman/study/10-jones-et-al-2001)
want 5e80b313b4ab65b3d8fd51d7e7b810b85705904a (refs/heads/example-annotations)
want 8afdbadb02e0fcff6df84453a618f105978dbc8f (refs/heads/leto_study_10)
want 5762c8194c718e22bcf17d53818477320be3658d (refs/heads/master)
want 229d7a6bc914b71f5ed3a68add26036b7ea229f0 (refs/heads/modified-study-10-jimallman)
want 56267f9688be6d7cd79a6469b03faf501b4779b9 (refs/heads/modified-study-999-jimallman)
done
POST git-upload-pack (492 bytes)
remote: fatal: corrupt tree file
remote: aborting due to possible repository corruption on the remote side.
fatal: protocol error: bad pack header


Jonathan "Duke" Leto

Nov 8, 2013, 11:58:42 AM
to pdx...@googlegroups.com, dan...@digitree.org
Howdy,

Attempting to delete the branch via the git command line didn't work,
but deleting the branch from the Github web interface *did* work. That
got the repo back into a clonable state.

It turns out that creating a tree entry whose path starts with a forward
slash causes Bad Things To Happen. Changing my code to use tree.path =
"study/N/N.json" instead of tree.path = "/study/N/N.json" made
everything work.
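
For anyone else building trees through the API, a small guard along these lines might catch the problem before the request is sent. It is a minimal sketch, not what my script actually does: the helper name `normalize_tree_path` is hypothetical, and the extra checks for empty, "." and ".." components are my own assumption about what else is worth rejecting.

```python
def normalize_tree_path(path):
    """Return a repo-relative path suitable for a Git tree entry.

    Raises ValueError for paths that cannot name a file inside the repo.
    """
    # Strip leading slashes: tree entry paths are relative to the repo root.
    relative = path.lstrip("/")
    parts = relative.split("/")
    # Reject empty components ("a//b", trailing "/") and "." / ".." segments.
    if not relative or any(p in ("", ".", "..") for p in parts):
        raise ValueError("invalid tree path: %r" % path)
    return relative


print(normalize_tree_path("/study/10/10.json"))  # prints study/10/10.json
```

The cleaned path would then be used as the path of the tree element handed to PyGithub, rather than the raw string.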

Thanks for all of your help!

Duke


