Reproducible "unpack error" since upgrading to 2.4.2


Jay Soffian

Aug 17, 2012, 12:24:46 PM
to Repo and Gerrit Discussion
I recently upgraded to 2.4.2 from 2.3.

Besides the master branch, the repo in question has an "upstream"
branch, which is pushed to hourly from a cronjob (the "upstream"
branch is a git-svn maintained mirror). All updates to the repo happen
via Gerrit.

Each morning, I perform a merge in a local clone:

1. git fetch
2. git checkout master
3. git reset --hard origin/master
4. git merge <some commit from origin/upstream>
5. git push origin HEAD:refs/for/master

If "upstream" has been updated by the cronjob in the meantime, I get
an unpacker error when attempting to push:

$ git push origin HEAD:refs/for/master
Counting objects: 2736, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (658/658), done.
Writing objects: 100% (1928/1928), 815.00 KiB, done.
Total 1928 (delta 1478), reused 1636 (delta 1214)
remote: Resolving deltas: 100% (1478/1478)
error: unpack failed: error Missing tree 489b803c87abd8c7b87158d89bcb87a47f3bbf93
fatal: Unpack error, check server log
To ssh://git/repo
! [remote rejected] HEAD -> refs/for/master (n/a (unpacker error))
error: failed to push some refs to 'ssh://git/repo'

The "fix" is just to do a git fetch, then I can push successfully:

$ git fetch
remote: Counting objects: 19, done
remote: Finding sources: 100% (12/12)
remote: Total 12 (delta 9), reused 12 (delta 9)
Unpacking objects: 100% (12/12), done.
From ssh://git/repo
3182a2d9ad..3ab70c3050 upstream -> origin/upstream

$ git push origin HEAD:refs/for/master
Counting objects: 248, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (85/85), done.
Writing objects: 100% (85/85), 11.97 KiB, done.
Total 85 (delta 83), reused 0 (delta 0)
remote: Resolving deltas: 100% (83/83)
remote: Processing changes: new: 1, done
remote:
remote: New Changes:
remote: http://git/r/18331
remote:
To ssh://git/repo
* [new branch] HEAD -> refs/for/master

This is very reproducible. Any idea what's going on here?

j.

Matthias Sohn

Aug 17, 2012, 5:54:17 PM
to Jay Soffian, Repo and Gerrit Discussion
2012/8/17 Jay Soffian <jayso...@gmail.com>
Did you check the server log mentioned in the failing push?

--
Matthias

Jay Soffian

Aug 17, 2012, 6:38:42 PM
to Matthias Sohn, Repo and Gerrit Discussion
On Fri, Aug 17, 2012 at 5:54 PM, Matthias Sohn
<matthi...@googlemail.com> wrote:
> did you check the server log mentioned in the failing push ?

I forgot to mention that. It's just an
org.eclipse.jgit.errors.MissingObjectException. But the object is not
missing on the server. I can cd into that repo and git cat-file it.
Here's the traceback from the server log:

[2012-08-17 16:01:50,719] ERROR com.google.gerrit.sshd.BaseCommand : Internal server error (user jay account 1000000) during git-receive-pack '/repo'
com.google.gerrit.sshd.BaseCommand$Failure: fatal: Unpack error, check server log
at com.google.gerrit.sshd.commands.Receive.runImpl(Receive.java:146)
at com.google.gerrit.sshd.AbstractGitCommand.service(AbstractGitCommand.java:103)
at com.google.gerrit.sshd.AbstractGitCommand.access$000(AbstractGitCommand.java:34)
at com.google.gerrit.sshd.AbstractGitCommand$1.run(AbstractGitCommand.java:69)
at com.google.gerrit.sshd.BaseCommand$TaskThunk.run(BaseCommand.java:403)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:333)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.io.IOException: Unpack error on project "repo":
AdvertiseRefsHook:
org.eclipse.jgit.transport.AdvertiseRefsHookChain@7079b2f7class
org.eclipse.jgit.transport.AdvertiseRefsHookChain

at com.google.gerrit.sshd.commands.Receive.runImpl(Receive.java:145)
... 13 more
Caused by: org.eclipse.jgit.errors.UnpackException: Exception while parsing pack stream
at org.eclipse.jgit.transport.ReceivePack.service(ReceivePack.java:877)
at org.eclipse.jgit.transport.ReceivePack.receive(ReceivePack.java:768)
at com.google.gerrit.sshd.commands.Receive.runImpl(Receive.java:95)
... 13 more
Caused by: org.eclipse.jgit.errors.MissingObjectException: Missing tree 489b803c87abd8c7b87158d89bcb87a47f3bbf93
at org.eclipse.jgit.transport.ReceivePack.checkConnectivity(ReceivePack.java:1098)
at org.eclipse.jgit.transport.ReceivePack.service(ReceivePack.java:840)
... 15 more

j.

Shawn Pearce

Aug 18, 2012, 6:05:06 PM
to Jay Soffian, Matthias Sohn, Repo and Gerrit Discussion
Frack. This repository uses branch level read access controls. When
not all branches are visible to you this hook is installed.
Unfortunately the chain class broke the toString we used here to debug
this particular failure. The object isn't missing, your client tried
to construct a reference to an object that the server believes you
cannot see. Making such a reference would let you see something you
aren't supposed to read the contents of. So the server is trying to
protect the data, even though you (or your client anyway) know its
SHA-1 name.

>
> at com.google.gerrit.sshd.commands.Receive.runImpl(Receive.java:145)
> ... 13 more
> Caused by: org.eclipse.jgit.errors.UnpackException: Exception while
> parsing pack stream
> at org.eclipse.jgit.transport.ReceivePack.service(ReceivePack.java:877)
> at org.eclipse.jgit.transport.ReceivePack.receive(ReceivePack.java:768)
> at com.google.gerrit.sshd.commands.Receive.runImpl(Receive.java:95)
> ... 13 more
> Caused by: org.eclipse.jgit.errors.MissingObjectException: Missing
> tree 489b803c87abd8c7b87158d89bcb87a47f3bbf93
> at org.eclipse.jgit.transport.ReceivePack.checkConnectivity(ReceivePack.java:1098)
> at org.eclipse.jgit.transport.ReceivePack.service(ReceivePack.java:840)
> ... 15 more

Looks like something changed about how the server based its decision
on what you can see, and what is actually referenced in the pack
stream that was received.
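
The difference between "the object exists" and "the object is visible to this user" can be checked by hand on the server. The following is only a rough sketch: it assumes shell access to the server-side repository and that you list the refs the pushing user may read yourself. `object_visible_from` is a made-up helper name, and this approximates the idea rather than reproducing what Gerrit or JGit actually runs.

```shell
#!/bin/sh
# Hypothetical helper: succeed if object $1 is reachable from any of the
# refs given as the remaining arguments (the refs the user may read).
object_visible_from() {
    obj=$1
    shift
    # rev-list --objects prints every commit, tree and blob reachable
    # from the given refs, one "<sha> [path]" entry per line.
    git rev-list --objects "$@" | cut -d' ' -f1 | grep -qxF "$obj"
}

# Usage (run inside the server-side repository):
#   git cat-file -t 489b803c87abd8c7b87158d89bcb87a47f3bbf93   # does it exist?
#   object_visible_from 489b803c87abd8c7b87158d89bcb87a47f3bbf93 refs/heads/master \
#       || echo "exists, but not reachable from the refs this user can read"
```

If `git cat-file` finds the object but the helper fails for the user's readable refs, that would be consistent with the explanation above.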

Jay Soffian

Aug 20, 2012, 2:38:19 PM
to Shawn Pearce, Matthias Sohn, Repo and Gerrit Discussion
On Sat, Aug 18, 2012 at 6:05 PM, Shawn Pearce <s...@google.com> wrote:
> Frack. This repository uses branch level read access controls. When
> not all branches are visible to you this hook is installed.
> Unfortunately the chain class broke the toString we used here to debug
> this particular failure. The object isn't missing, your client tried
> to construct a reference to an object that the server believes you
> cannot see. Making such a reference would let you see something you
> aren't supposed to read the contents of. So the server is trying to
> protect the data, even though you (or your client anyway) know its
> SHA-1 name.

All-Projects (from which the repo in question inherits) had Read
access for Anonymous Users on refs/heads/* and refs/tags/*. I just got
rid of those rules, and added Read access on refs/* instead, and that
seems to have worked around the issue.

j.

Gustaf Lundh

Jan 21, 2013, 12:21:41 PM
to repo-d...@googlegroups.com, Shawn Pearce, Matthias Sohn
Our users are still seeing this _very_ often on our 2.4.2 installation. I cannot seem to replicate it in a test setup, which makes it hard for me to fix.

Interestingly enough, I have not found anyone online complaining about the issue existing in v2.5+, so has it been fixed post v2.4.2? In JGit? I cannot seem to find relevant commits in either project.

I would really appreciate any hints on how to replicate this issue in a controlled environment. As it is today, it drives our devs crazy, and it is not obvious to me how to attack the issue.

Best regards
Gustaf

Phil Hord

Feb 4, 2013, 12:30:59 PM
to repo-d...@googlegroups.com, Shawn Pearce, Matthias Sohn, Phil Hord

On Monday, January 21, 2013 12:21:41 PM UTC-5, Gustaf Lundh wrote:
> Our users are still seeing this _very_ often on our 2.4.2 installation. I cannot seem to replicate it in a test setup, which makes it hard for me to fix.
>
> Interestingly enough, I have not found anyone online complaining about the issue existing in v2.5+, so has it been fixed post v2.4.2? In JGit? I cannot seem to find relevant commits in either project.
>
> I would really appreciate any hints on how to replicate this issue in a controlled environment. As it is today, it drives our devs crazy, and it is not obvious to me how to attack the issue.

My users see this problem every now and then, but I do not.  I assumed it might be because they use an older version of git than I do, but maybe it is that I am an administrator.  I do not know.  But it has never happened to me, even though the clones I sometimes push from can be several weeks "stale".

It seems to have begun occurring when I updated Gerrit from 2.4.1 to 2.5.  I have had the first report of it occurring this year just now, but it was from a new user who did not know the 'git fetch' workaround.  I expect the other users simply do not tell me when it occurs any more.  I am currently on v2.5.1 which I built locally in order to include a relative-submodule-fix (https://gerrit-review.googlesource.com/#/c/39372/) on 2012-11-08.

We did add branch-level ACLs back in 2011.  Maybe the pre-2.5 occurrences of this error shown in my logs below were legitimate branch-protection issues.

I have Read access on 'refs/*' in All-Projects for these affected users.

Here's a quick check of my error logs which go back to 2011-05:

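# Count MissingObjectException lines in each rotated, gzipped error log
for XX in review_site/logs/error_log.*.gz ; do
     # grep -c counts matching lines directly; quote $XX to survive odd paths
     count=$(gzip -dc "$XX" | grep -c MissingObjectException)
     # POSIX test uses a single '=' for string comparison
     test "$count" = "0" || echo "$count $XX"
done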

4 error_log.2011-10-12.gz
2 error_log.2012-02-08.gz
2 error_log.2012-03-29.gz
5 error_log.2012-03-31.gz
4 error_log.2012-09-18.gz
4 error_log.2012-10-09.gz
2 error_log.2012-10-11.gz
1 error_log.2012-10-22.gz
8 error_log.2012-10-23.gz
19 error_log.2012-10-25.gz
15 error_log.2012-10-26.gz
4 error_log.2012-10-30.gz
1 error_log.2012-10-31.gz
18 error_log.2012-11-01.gz
8 error_log.2012-11-03.gz
1 error_log.2012-11-12.gz
3 error_log.2012-11-16.gz
2 error_log.2012-11-18.gz
3 error_log.2012-11-30.gz
1 error_log.2012-12-07.gz
3 error_log.2012-12-13.gz
2 error_log.2012-12-14.gz
4 error_log.2012-12-17.gz
1 error_log.2012-12-30.gz
1 error_log.2013-01-28.gz
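
To see the trend more easily, the per-day counts can be rolled up per month. This is just a small sketch; `monthly_totals` is a made-up name, and it assumes input lines in the "<count> error_log.YYYY-MM-DD.gz" shape shown above.

```shell
#!/bin/sh
# Sum "<count> error_log.YYYY-MM-DD.gz" lines (read from stdin) per month.
monthly_totals() {
    awk '{
        split($2, part, ".")           # part[2] is the YYYY-MM-DD date
        month = substr(part[2], 1, 7)  # keep only YYYY-MM
        total[month] += $1
    }
    END { for (m in total) print m, total[m] }' | sort
}

# Usage: pipe the output of the loop above into it, e.g.
#   ./count_errors.sh | monthly_totals     # count_errors.sh is hypothetical
```

Adding up the list above by hand, October and November 2012 come to 54 and 35 log lines respectively, against single digits in most other months.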


Let me know if I can provide any more details.

Phil

 

Brandon Casey

Jul 30, 2013, 10:14:22 PM
to Phil Hord, repo-d...@googlegroups.com, Shawn Pearce, Matthias Sohn
On Mon, Feb 4, 2013 at 9:30 AM, Phil Hord <ho...@cisco.com> wrote:
>
> On Monday, January 21, 2013 12:21:41 PM UTC-5, Gustaf Lundh wrote:

<insert context>

On Fri, Aug 17, 2012 at 3:38 PM, Jay Soffian <jayso...@gmail.com> wrote:
> [2012-08-17 16:01:50,719] ERROR com.google.gerrit.sshd.BaseCommand :
> Internal server error (user jay account 1000000) during
> git-receive-pack '/repo'
> com.google.gerrit.sshd.BaseCommand$Failure: fatal: Unpack error, check
> server log
> at com.google.gerrit.sshd.commands.Receive.runImpl(Receive.java:146)
<snip>
> Caused by: java.io.IOException: Unpack error on project "repo":
> AdvertiseRefsHook:
> org.eclipse.jgit.transport.AdvertiseRefsHookChain@7079b2f7class
> org.eclipse.jgit.transport.AdvertiseRefsHookChain

>>> On Sat, Aug 18, 2012 at 6:05 PM, Shawn Pearce <s...@google.com> wrote:
>>> > Frack. This repository uses branch level read access controls. When
>>> > not all branches are visible to you this hook is installed.
>>> > Unfortunately the chain class broke the toString we used here to debug
>>> > this particular failure. The object isn't missing, your client tried
>>> > to construct a reference to an object that the server believes you
>>> > cannot see. Making such a reference would let you see something you
>>> > aren't supposed to read the contents of. So the server is trying to
>>> > protect the data, even though you (or your client anyway) know its
>>> > SHA-1 name.

</insert context>

>> Our users are still seeing this _very_ often on our 2.4.2 installation. I
>> cannot seem to be able to replicate it in a test-setup, which makes it hard
>> for me to fix.
>>
>> Interestingly enough, I have not found anyone online complaining about the
>> issue existing in v2.5+, so has it been fixed post v2.4.2? In JGit? I cannot
>> seem to find relevant commits in either project.
>>
>> I would really appreciate any hints on how to replicate this issue in a
>> controlled environment. As it is today, it drives our devs crazy, and it is
>> not obvious for me how to attack the issue.
>>
> My users see this problem every now and then, but I do not. I assumed it
> might be because they use an older version of git than I do, but maybe it is
> that I am an administrator. I do not know. But it has never happened to
> me, even though the clones I sometimes push from can be several weeks
> "stale".
>
> It seems to have begun occurring when I updated Gerrit from 2.4.1 to 2.5. I
> have had the first report of it occurring this year just now, but it was from
> a new user who did not know the 'git fetch' workaround. I expect the other
> users simply do not tell me when it occurs any more. I am currently on
> v2.5.1 which I built locally in order to include a relative-submodule-fix
> (https://gerrit-review.googlesource.com/#/c/39372/) on 2012-11-08.

We've just received this error on Gerrit 2.5.4. The tree object that
Gerrit complains about exists in the repository. Branch level read
permissions are being used.

Phil, what is the 'git fetch' workaround you mentioned?

-Brandon

Phil Hord (hordp)

Jul 31, 2013, 8:08:57 AM
to Brandon Casey, Repo and Gerrit Discussion, Shawn Pearce, Matthias Sohn
For us, the problem persisted for a particular user whenever he tried to git push. But it cleared up when they did a git fetch. It seemed to synchronize their repo with the remote.

Phil

Shawn Pearce

Jul 31, 2013, 10:12:55 AM
to Phil Hord (hordp), Brandon Casey, Repo and Gerrit Discussion, Matthias Sohn
On Wed, Jul 31, 2013 at 5:08 AM, Phil Hord (hordp) <ho...@cisco.com> wrote:
> For us, the problem persisted for a particular user whenever he tried to git push.
> But it cleared up when they did a git fetch. It seemed to synchronize their repo
> with the remote.

Yes. This workaround can resolve the issue because of how the push
protocol works.

During push with branch level read controls the server estimates a set
of reasonable objects for the client to use as a delta base. This set
is derived from the tips of every branch the server has. The client
then looks at the server branch heads and, if it already has that
object locally, enqueues it as a candidate for a delta base. If it
doesn't have that object locally, it skips it and looks for other
candidates.
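
One way to see how far a clone has drifted from the server is to list the advertised branch tips that the clone does not have yet. This is a rough sketch only; `list_unknown_tips` is a made-up helper, and it shows which tips are missing locally, not which delta bases git would actually pick.

```shell
#!/bin/sh
# Print each advertised remote ref whose tip object is absent locally.
# Reads "<sha><TAB><ref>" lines (the format printed by `git ls-remote`).
list_unknown_tips() {
    while read -r sha ref; do
        # cat-file -e exits non-zero when the object is not in the local repo
        git cat-file -e "$sha" 2>/dev/null || printf '%s %s\n' "$sha" "$ref"
    done
}

# Usage, from the clone you are about to push from:
#   git ls-remote origin | list_unknown_tips
```

An empty result means the clone already holds every tip the server advertises to you.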

By running fetch first the client catches up to the server and is more
likely to use the same set of objects. This reduces the risk the
server sees objects it does not like.
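
As a stopgap, the workaround can be wrapped in a small function. This is a minimal sketch, not a fix: `push_for_review` is a made-up name, the defaults (`origin`, `master`) are assumptions, and it retries after any push failure rather than only after an unpack error.

```shell
#!/bin/sh
# Hypothetical wrapper: push for review; if the push fails, fetch and
# retry once. $1 = remote (default origin), $2 = branch (default master).
push_for_review() {
    remote=${1:-origin}
    branch=${2:-master}
    git push "$remote" "HEAD:refs/for/$branch" && return 0
    echo "push failed; fetching from $remote and retrying once" >&2
    # Catch up with the server's branch tips before the second attempt
    git fetch "$remote" || return 1
    git push "$remote" "HEAD:refs/for/$branch"
}

# Usage:
#   push_for_review            # same as: push_for_review origin master
```

Fetching only updates the remote-tracking refs, so the commit being pushed is unchanged; only the objects the client already has locally differ on the second attempt.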

When I mentioned how this works in the Gerrit server to Junio C Hamano
he... shuddered. The server is partially dependent upon a client side
algorithm behaving the same way as the server.

Brandon Casey

Jul 31, 2013, 3:38:13 PM
to Shawn Pearce, Phil Hord (hordp), Repo and Gerrit Discussion, Matthias Sohn
On Wed, Jul 31, 2013 at 7:12 AM, Shawn Pearce <s...@google.com> wrote:
> On Wed, Jul 31, 2013 at 5:08 AM, Phil Hord (hordp) <ho...@cisco.com> wrote:
>> For us, the problem persisted for a particular user whenever he tried to git push.
>> But it cleared up when they did a git fetch. It seemed to synchronize their repo
>> with the remote.
>
> Yes. This workaround can resolve the issue because of how the push
> protocol works.
>
> During push with branch level read controls the server estimates a set
> of reasonable objects for the client to use as a delta base. This set
> is derived from the tips of every branch the server has. The client
> then looks at the server branch heads and if it has that object
> already locally enqueues that as a candidate for a delta base. If it
> doesn't have that object locally it skips and looks for other
> candidates.
>
> By running fetch first the client catches up to the server and is more
> likely to use the same set of objects. This reduces the risk the
> server sees objects it does not like.

That seems to have worked.

> When I mentioned how this works in the Gerrit server to Junio C Hamano
> he... shuddered. The server is partially dependent upon a client side
> algorithm behaving the same way as the server.

Are you saying that the reason Gerrit rejects the push is because the
client didn't use the base objects that Gerrit thinks the client
_should_ have used? If so, then yeah, that seems fragile.

-Brandon

Shawn Pearce

Aug 1, 2013, 9:40:36 AM
to Brandon Casey, Phil Hord (hordp), Repo and Gerrit Discussion, Matthias Sohn
On Wed, Jul 31, 2013 at 12:38 PM, Brandon Casey <dra...@gmail.com> wrote:
> On Wed, Jul 31, 2013 at 7:12 AM, Shawn Pearce <s...@google.com> wrote:
>
>> When I mentioned how this works in the Gerrit server to Junio C Hamano
>> he... shuddered. The server is partially dependent upon a client side
>> algorithm behaving the same way as the server.
>
> Are you saying that the reason Gerrit rejects the push is because the
> client didn't use the base objects that Gerrit thinks the client
> _should_ have used? If so, then yeah, that seems fragile.

Yes. :-(

Note this code predates bitmap indexes in JGit. If bitmap indexes are
present we can compute the full list of candidates in a reasonable
time bound and use that instead. Unfortunately this code (and many
others that could use them) has not been updated to take bitmap
indexes into consideration when they are available.