Curious regressions due to 31050f2 (Cherry-pick behavior mods)


R. Tyler Croy

Dec 14, 2011, 3:11:05 AM
to repo-d...@googlegroups.com
First, a special thanks to Martin for dealing with all my questions and putting
up with interactive debugging on IRC through most of today.

Background: yesterday, after having discovered a number of features needed in
the Gerrit SSH API, I tested and ultimately deployed a build of Gerrit based
off of b43c3b0 (current HEAD of master in the googlecode repo).

We discovered a couple issues, one of which has already been addressed in this
change: <https://gerrit-review.googlesource.com/30580>


The second issue we discovered was a *VERY* curious condition with our
projects, all of which are cherry-pick with Jenkins integration.

For some subset of changes, when a user clicked the "Submit" button, the
change would report REVISION_GONE upon submit, and the latest dummy
patch-set created by 31050f2 would be completely invalid, i.e. you could not
git fetch refs/changes/01/01/2, nor could you find the actual ref in the
gerrit repo.
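
For anyone wanting to check whether a patch-set ref like the one above actually exists on the server, here is a minimal sketch. It relies on Gerrit's change ref layout, refs/changes/<last two digits of the change number>/<change number>/<patch set>, so change 1, patch set 2 would be refs/changes/01/1/2. The helper names `change_ref` and `ref_exists` are hypothetical, and the remote name `gerrit` in the usage note is an assumption.

```shell
# Hypothetical helper: build the ref name for a change number and patch set.
# Pass plain decimal numbers without leading zeros.
change_ref() {
  change="$1"
  patchset="$2"
  # Shard directory: last two digits of the change number, zero-padded.
  shard=$(printf '%02d' $((change % 100)))
  printf 'refs/changes/%s/%s/%s\n' "$shard" "$change" "$patchset"
}

# Hypothetical helper: succeed only if the ref exists on the given remote.
# git ls-remote --exit-code exits with status 2 when no ref matches.
ref_exists() {
  git ls-remote --exit-code "$1" "$2" >/dev/null 2>&1
}
```

Something like `ref_exists gerrit "$(change_ref 1 2)" || echo "ref is missing"` would then report a patch set that has a database record but no ref behind it.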


Alternatively, the change might get stuck in "Submitted, Merge Pending" state,
which has never happened before, since the projects are all of "Cherry-Pick"
submit type after all. This is documented somewhat in issue 871, but a restart
of Gerrit would cause these changes to convert from "Submitted, Merge Pending"
to the above REVISION_GONE state.


The work-around that seems to have brought some level of sanity back to
our Gerrit install was to revert 31050f2 and deploy that build, which seems to
have corrected the regressions we initially saw.

I'm really not sure how to characterize these issues in a way that I could file
into the issue tracker; I only think that 31050f2 should be reverted in the
master branch until it's had some more testing and vetting with
cherry-pick-heavy projects. I found almost all these issues difficult to reproduce and it
was only with Martin's patience with me that we progressed far enough to
isolate this particular commit as a /potential/ culprit.

Not sure what to do with this information, but there it is.

- R. Tyler Croy
--------------------------------------
Code: http://github.com/rtyler
Chatter: http://twitter.com/agentdero
rty...@jabber.org

Brad Larson

Dec 14, 2011, 1:13:40 PM
to Repo and Gerrit Discussion

Admittedly I'm a little biased, but I'd hate for us to have to roll
back 31050f2. Are there any more specific reproduction instructions,
or just 'try a lot of cherry-picked submits'? 31050f2 solves a few
other issues in Gerrit so I'd like to see it stick around, but of
course it can't be released if it causes problems like you
experienced.

Brad


R. Tyler Croy

Dec 14, 2011, 2:08:06 PM
to Brad Larson, Repo and Gerrit Discussion


Bias, nawww :)

What was particularly frustrating about this issue was that I wasn't able to
get a good reliable reproduction case.

We use the Gerrit Trigger plugin in Jenkins, and my current unsubstantiated
theory is that the creation of that bogus patchset on cherry-pick submission
would cause Jenkins to pick up on the change but immediately fail because it
couldn't resolve the change from `git fetch gerrit refs/changes/01/01/1`. This
failure, in my hypothesis, would immediately mark the change in Gerrit as Not
Verified.

Then Gerrit would fail to actually submit the change because it was Not
Verified.


That's my current theory, but I couldn't get a solid enough reproduction case
to prove it :-/


All I know right now, unfortunately, is that reverting that commit appears to
have solved the issue. My other hunch is that actually creating that bogus
PATCH_SETS record might also be cause for trouble :-/


Sorry I can't be of more help, this was a pretty big catastrophuck yesterday
since this affected our production Gerrit (these issues didn't repro in my
testing :/).


Cheers

Brad Larson

Dec 14, 2011, 2:34:59 PM
to Repo and Gerrit Discussion

I thought about this too, but it seems unlikely that Jenkins could get
the notification, attempt the download, and score the change during
the time it takes Gerrit to cherry-pick it. I'll try to dig out some
time to experiment with this. Sorry for the trouble, and thanks for
the bug report.


Brad Larson

Dec 28, 2011, 6:19:52 PM
to Repo and Gerrit Discussion, R. Tyler Croy
I suspect we can rule out Jenkins as contributing to this issue. I
was watching the stream-events when submitting some cherry-picked
changes, and a new patchset-created event isn't sent, just the
ref-updated event. I'm fairly certain that the Gerrit Trigger plugin
doesn't listen for the ref-updated event, and even if it did I believe
it would be too late for Jenkins to impact the submit process.
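
To repeat this kind of check, one option is to capture the event stream during a submit and count events by type. The sketch below assumes each event arrives as one JSON object per line with a "type" field, which is how `gerrit stream-events` reports them; the function name `count_events` and the file name `events.log` are hypothetical.

```shell
# Hypothetical helper: count events of one type in a stream-events capture
# read from stdin (one JSON object per line).
count_events() {
  # grep -c still prints 0 when nothing matches but exits non-zero,
  # hence the "|| true" to keep the function's status clean.
  grep -cE "\"type\"[[:space:]]*:[[:space:]]*\"$1\"" || true
}
```

With the stream saved via `ssh localhost -p 29418 gerrit stream-events > events.log` while a change is submitted, `count_events patchset-created < events.log` and `count_events ref-updated < events.log` would show which events were actually emitted.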

For your failed submits, did they have negative verified scores from
your Jenkins job?

I just finished running a test of 1,000 cherry-picked submits and
didn't have any problems. I'll try throwing Jenkins into the mix to
see if that might impact things, but I don't have a lot of ideas at
this point. Here was my script:

# Append a random 20-character line to the test file and commit it.
cat /dev/urandom | tr -cd "[:alnum:]" | head -c 20 >> testfile.txt
echo "" >> testfile.txt
git add testfile.txt
git commit -m "added a line to testfile.txt"
# Push the commit for review, then approve and submit it over SSH.
git push origin HEAD:refs/for/master
ssh localhost -p 29418 gerrit review --code-review=+2 --verified=+1 $(git rev-list origin/master..HEAD)
ssh localhost -p 29418 gerrit review --submit $(git rev-list origin/master..HEAD)
# Sync to the updated master.
git fetch origin
git checkout origin/master


Still digging,
Brad

