Gerrit master *broken* (again) in NoteDB mode

28 views
Skip to first unread message

lucamilanesio

unread,
Dec 28, 2016, 2:56:42 AM12/28/16
to Repo and Gerrit Discussion
Hi Gerrit contributors and maintainers,
I had to switch off completely the changes validation on gerrit-ci.gerritforge.com because the current master is massively broken when NoteDB is enabled.

There are around 80 failures, most of them possibly for the following single reason:

1) Error in custom provider, com.google.inject.OutOfScopeException: Not in command/request

  at com.google.gerrit.sshd.SshModule.configureRequestScope(SshModule.java:121)

  while locating java.net.SocketAddress annotated with interface com.google.gerrit.server.RemotePeer


1 error

at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1028)

at com.google.gerrit.server.IdentifiedUser.newRefLogIdent(IdentifiedUser.java:327)

at com.google.gerrit.server.git.BatchUpdate$ChangeTask.stageNoteDbUpdate(BatchUpdate.java:1058)

at com.google.gerrit.server.git.BatchUpdate$ChangeTask.call(BatchUpdate.java:968)

at com.google.gerrit.server.git.BatchUpdate$ChangeTask.call(BatchUpdate.java:930)

at com.google.gerrit.server.git.BatchUpdate$ChangeTask.call(BatchUpdate.java:1)

at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)

at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)

at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

Caused by: com.google.inject.OutOfScopeException: Not in command/request

at com.google.gerrit.sshd.SshScope.requireContext(SshScope.java:152)

at com.google.gerrit.sshd.SshScope.access$0(SshScope.java:149)

at com.google.gerrit.sshd.SshScope$1$1.get(SshScope.java:189)

at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41)

at com.google.inject.internal.InjectorImpl$2$1.call(InjectorImpl.java:1019)

at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1085)

at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1015)

... 11 more


I'm bisecting now to find the offending commit ... but in the meantime I can't enable again verification because the CI would fail any incoming changes.


More details to follow.


Luca.

Edwin Kempin

unread,
Dec 28, 2016, 2:59:01 AM12/28/16
to lucamilanesio, Repo and Gerrit Discussion
Would be nice to identify the commit.
Meanwhile locally NoteDb tests succeed for me.
 


More details to follow.


Luca.

--
--
To unsubscribe, email repo-discuss+unsubscribe@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en

---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Luca Milanesio

unread,
Dec 28, 2016, 3:02:32 AM12/28/16
to Edwin Kempin, Repo and Gerrit Discussion
Hi Edwin,
did you execute them with the following mode?

export GERRIT_NOTEDB=READ_WRITE
buck test --no-results-cache --exclude flaky

Running the simple Git suite would systematically fail for me, in every condition :-(

Luca.

Luca Milanesio

unread,
Dec 28, 2016, 3:08:36 AM12/28/16
to Edwin Kempin, Repo and Gerrit Discussion
Even this would always fail (10 failures over 10 tries) for me:

export GERRIT_NOTEDB=READ_WRITE
buck test --no-results-cache //gerrit-acceptance-tests/src/test/java/com/google/gerrit/acceptance/git:git

It seems that the SshModule.configRequestScope() is getting invoked outside of an HTTP Request (real or wrapped).
Happens ONLY when NoteDB is enabled.

Let me debug and see what's going on ...

Luca.

Luca Milanesio

unread,
Dec 28, 2016, 3:09:30 AM12/28/16
to Edwin Kempin, Repo and Gerrit Discussion
I mean outside an SSH Request scope :-)

Luca.

Edwin Kempin

unread,
Dec 28, 2016, 3:14:16 AM12/28/16
to Luca Milanesio, Repo and Gerrit Discussion
Sorry, they fail for me too, I was still using the old GERRIT_ENABLE_NOTEDB=TRUE ...

Luca.


-- 
-- 
To unsubscribe, email repo-discuss+unsub...@googlegroups.com

Luca Milanesio

unread,
Dec 28, 2016, 3:14:40 AM12/28/16
to Edwin Kempin, Repo and Gerrit Discussion
So, it is not actually the bind getting invoked outside an SSH request scope, but rather the provider of remote peer getting invoked outside an scoped request.
More details to follow ...

Luca Milanesio

unread,
Dec 28, 2016, 3:54:06 AM12/28/16
to Edwin Kempin, Repo and Gerrit Discussion
Yeah, that the same problems we had on Gerrit master build :-(
The verifier job was up-to-date and detected the failure, but the Gerrit master build was still using the old var and was with a (fake) green.

Luca.

Luca Milanesio

unread,
Dec 28, 2016, 4:05:11 AM12/28/16
to Edwin Kempin, Repo and Gerrit Discussion
Started bi-secting, from 6558773 (good, 23rd of December) to master (bad).
Shouldn't take long ...

Luca.

Luca Milanesio

unread,
Dec 28, 2016, 4:11:16 AM12/28/16
to Edwin Kempin, Dave Borowitz, Repo and Gerrit Discussion
And the winner is ... Dave :-)

commit 33f08a612f3835fdc7ae449bf1dc7d7cbf4dc7a7
Author: Dave Borowitz <dbor...@google.com>
Date:   Thu Dec 22 11:28:20 2016 -0500

    NoteDbUpdateManager: Set reflog message and identity
    
    Change-Id: Iab0aae68d6f121829529c6714636eed90c3ba929


Let me work on a fix.

Luca.

Edwin Kempin

unread,
Dec 28, 2016, 4:15:30 AM12/28/16
to Luca Milanesio, Dave Borowitz, Repo and Gerrit Discussion
On Wed, Dec 28, 2016 at 10:11 AM, Luca Milanesio <luca.mi...@gmail.com> wrote:
And the winner is ... Dave :-)
Well, I should also claim part of this price, since I verified and submitted that change... :(
 

commit 33f08a612f3835fdc7ae449bf1dc7d7cbf4dc7a7
Author: Dave Borowitz <dbor...@google.com>
Date:   Thu Dec 22 11:28:20 2016 -0500

    NoteDbUpdateManager: Set reflog message and identity
    
    Change-Id: Iab0aae68d6f121829529c6714636eed90c3ba929


Let me work on a fix.
Thanks!
 

Luca.

Luca.


-- 
-- 
To unsubscribe, email repo-discuss+unsubscribe@googlegroups.com

Luca Milanesio

unread,
Dec 28, 2016, 4:18:22 AM12/28/16
to Edwin Kempin, Dave Borowitz, Repo and Gerrit Discussion
... and I was working on the CI and verification was unstable  ... so that's my contribution to the price as well !
We are three winners then :-)

Luca.

Luca Milanesio

unread,
Dec 28, 2016, 5:04:57 AM12/28/16
to Edwin Kempin, Dave Borowitz, Repo and Gerrit Discussion
Look at the code, it looks fine to me as well :-)

I reckon the code change has highlighted a problem where the BatchUpdate gets instantiated with an Identified user not linked to a remote peer.
The test-case, however, was using an SSH connection so it should have had a remote peer.

Will keep on my investigation ...

Luca.

Edwin Kempin

unread,
Dec 28, 2016, 5:08:06 AM12/28/16
to Luca Milanesio, Dave Borowitz, Repo and Gerrit Discussion
On Wed, Dec 28, 2016 at 11:04 AM, Luca Milanesio <luca.mi...@gmail.com> wrote:
Look at the code, it looks fine to me as well :-)

I reckon the code change has highlighted a problem where the BatchUpdate gets instantiated with an Identified user not linked to a remote peer.
The test-case, however, was using an SSH connection so it should have had a remote peer.

Will keep on my investigation ...
Thanks, but if it takes too long we should submit the revert change so that we at least get the test back working.
 

Luca.
 

Luca Milanesio

unread,
Dec 28, 2016, 5:14:28 AM12/28/16
to Edwin Kempin, Dave Borowitz, Repo and Gerrit Discussion
Getting there, I believe I found the problem :-)

Luca.

Luca Milanesio

unread,
Dec 28, 2016, 6:32:12 AM12/28/16
to Edwin Kempin, Dave Borowitz, Repo and Gerrit Discussion
Should be fixed in:

The point is the delayed batch-update: the user is passed over *but* not materialised. This means that when you try to invoke the providers (e.g. RemotePeer provider) it may throw an exception when invoked on a different thread.

By materialising the dependencies up-front, we resolve the issue at its root :-)

Luca.
Reply all
Reply to author
Forward
0 new messages