Gerrit 2.15.2 apparently not sending review comment emails

Richard Christie

Jul 11, 2018, 5:43:56 AM
to Repo and Gerrit Discussion
Some of our users have noticed our Gerrit server refusing(?) to send emails for particular projects, beyond the initial "You have been added as a reviewer" notification.

In one such case, a change owner did not get a notification of a +2 and comments on his own change, nor did most (but not all!) of the other reviewers of the change.

- There's nothing in gerrit's error logs.
- There's no special notify.* section in the project configuration (or any parent project).
- There are no special project watch settings in the user's preferences.
- The user's account.config has nothing more than account.fullName and account.preferredEmail set (and the email address is correct).
- Gerrit is sending some emails (e.g. when added as a reviewer) so the mail settings are correct.
- Changes involving exactly the same set of reviewers on other test/scratch repos work correctly; everyone gets reviews. The problem seems to be limited to a particular set of repositories.

Sadly we can't see the mail server logs, but last time we bothered our IT people about it, they did not see any trace of the gerrit user logging in to the smtp server, which would imply gerrit is not trying to send at all.

The project config does have

[reviewer]
        enableByEmail = true

set, but there are no anonymous users on many of the changes in question, and the problem appears to occur even without it.

The server has the following mail settings:

[sendemail]
        expiryDays = 90
        includeDiff = true
        maximumDiffSize = 256k
        smtpServer = our.smtp.mail.server



Is there any other setting which will stop gerrit trying to email when reviews/comments are added to a change?

Or is there anything we can turn on by way of verbosity / debug in gerrit to let us see its logic of when it is trying to email and to whom it is sending?


Dean Wheatley

Nov 1, 2018, 4:27:43 PM
to Repo and Gerrit Discussion
Hi Richard,

Did you find a fix for this issue? I'm also seeing this bug with 2.15.2 (not receiving review comment email notifications). Did upgrading to a later Gerrit version fix your issue?

Thanks, 

Logan Hanks

Nov 1, 2018, 4:35:21 PM
to dean.w...@gmail.com, repo-d...@googlegroups.com
Are these changes moving in and out of the Work-in-Progress state? If so, this may be the issue described by https://bugs.chromium.org/p/gerrit/issues/detail?id=6854.

Dean Wheatley

Nov 1, 2018, 5:43:26 PM
to Repo and Gerrit Discussion
This issue affects changes starting in the "In Review" state.

Richard Christie

Nov 6, 2018, 2:22:24 PM
to Repo and Gerrit Discussion
In the end we found (and it has happened three times now) that Gerrit's email thread gets stuck in some sort of deadlock talking to the mail server (we think), and then all other attempts to send email just queue up behind it. Restarting Gerrit was the only way to fix the problem, as just killing the stuck thread didn't release the waiting ones.

luca.mi...@gmail.com

Nov 6, 2018, 2:30:06 PM
to Richard Christie, Repo and Gerrit Discussion


On 6 Nov 2018, at 19:22, Richard Christie <r.d.f.c...@gmail.com> wrote:

In the end we found (and it has happened three times now) that gerrit's email thread gets stuck in some sort of deadlock talking to the mail server

Are you saying that you don't use localhost as the SMTP server?

If you don't, then that is well known to be problematic!!! We discussed it a few times with Shawn, and he said he never expected Gerrit to be used with a remote SMTP server.

Gerrit is not Postfix :-) Better to use the proper tool for the proper job ;-)

Richard Christie

Nov 6, 2018, 2:35:38 PM
to Repo and Gerrit Discussion
Sadness! I wish I had known.

Suggest that the documentation for smtpServer is updated:

Hostname (or IP address) of a SMTP server that will relay messages generated by Gerrit to end users. By default, 127.0.0.1 (aka localhost). DO NOT CHANGE, ANYTHING ELSE MAY CAUSE RANDOM DEADLOCKS.

That said it has been working absolutely fine until 2.15 which is the first time we've seen it have issues. It still works fine 99% of the time.

Anyway, thanks for the info!

Luca Milanesio

Nov 6, 2018, 3:34:41 PM
to Richard Christie, Luca Milanesio, Repo and Gerrit Discussion

On 6 Nov 2018, at 19:35, Richard Christie <r.d.f.c...@gmail.com> wrote:

Sadness! I wish I had known.

Suggest that the documentation for smtpServer is updated:

Hostname (or IP address) of a SMTP server that will relay messages generated by Gerrit to end users. By default, 127.0.0.1 (aka localhost). DO NOT CHANGE, ANYTHING ELSE MAY CAUSE RANDOM DEADLOCKS.


You can make the change and post it for review.
I suggest making the change in v2.14; it is unlikely that earlier releases are going to get any patch updates.

That said it has been working absolutely fine until 2.15 which is the first time we've seen it have issues. It still works fine 99% of the time.

If your SMTP server is super-available and super-fast (let's say like Postfix on localhost) then you're running without issues.
However, if the SMTP server is remote, intermittent or has flaky connectivity, then Gerrit will suffer :-(


Anyway, thanks for the info!

Thanks in advance, if you are planning to update the docs :-)

Luca.

Engle, Luke Edward

Nov 6, 2018, 4:42:39 PM
to Luca Milanesio, Richard Christie, Repo and Gerrit Discussion

Is this related to this issue? https://bugs.chromium.org/p/gerrit/issues/detail?id=3259&can=2&start=0&num=100&q=&colspec=ID%20Type%20Stars%20Milestone%20Status%20Priority%20Owner%20Summary&groupby=&sort=

I’ll say here what I said there:

We recently ran into this issue as well after upgrading to 2.15.2. After doing some research, the fact that it gets hung appears to be a Java issue, not a Gerrit issue. If you look at the thread dump for the sendemail thread, it's stuck in the socketRead0 method (a native Java read method). Why it initially gets stuck in that method is still unclear because there's very little information to go on (thread dumps, logs, etc. don't reveal anything).

 

Gerrit does provide you with a connectTimeout config setting under sendemail, but that setting doesn't do anything because there's a bug in the socketRead0 method. Check the description of the bug here for details of why the timeout has no effect: https://bugs.openjdk.java.net/browse/JDK-8075484

The above bug was fixed in Java 9 and was backported to many builds of Java 1.8. Unfortunately the fix was not backported to the Java version that you get when you run apt-get install java (1.8.0_181), so you first need to check whether the fix was backported to the version of Java you're running: look at the list of versions the fix *was* backported to (in the link above), then upgrade to one of those specific versions (we were running Java 1.8.0_171 and upgraded to 1.8.0_172). Or you can simply upgrade to any Java version 10+. You also need to make sure you add a sendemail.connectTimeout to your gerrit.config, because it's set to infinite by default.
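
For reference, a sketch of what that gerrit.config entry might look like. The 30 second value is an arbitrary assumption, not a recommendation from this thread, and the smtpServer line is copied from the settings in the first message. As noted above, the timeout only helps on a Java build that contains the JDK-8075484 fix.

```
[sendemail]
        smtpServer = our.smtp.mail.server
        # Assumed value for illustration; the default is an infinite timeout.
        connectTimeout = 30 sec
```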

 

I do believe there is a Gerrit bug within all this, however. When the mail thread is hung, I would expect that killing it would free up a slot for other mail tasks to run, but killing the hung thread does nothing except kill that thread. The queued mail tasks never move from waiting to running. The only way to recover after that is to restart Gerrit.

Thanks,

Luke

Richard Christie

Nov 12, 2018, 9:44:35 AM
to Repo and Gerrit Discussion
We're now seeing a new issue, again with email sending locking up. We're running Java 1.8.0_192-b12 and also using localhost for sending mail (Gerrit 2.15.3).

However, it looks from the stack trace as if it's not even trying to send, but is instead getting stuck evaluating the mail send conditions.

"SendEmail-1" #248 prio=5 os_prio=0 tid=0x00007f38300fe800 nid=0x66b8 runnable [0x00007f37362fa000]
   java.lang.Thread.State: RUNNABLE
at java.lang.Throwable.fillInStackTrace(Native Method)
at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
- locked <0x00007f5a69a314b8> (a java.lang.reflect.InvocationTargetException)
at java.lang.Throwable.<init>(Throwable.java:310)
at java.lang.Exception.<init>(Exception.java:102)
at java.lang.ReflectiveOperationException.<init>(ReflectiveOperationException.java:89)
at java.lang.reflect.InvocationTargetException.<init>(InvocationTargetException.java:72)
at sun.reflect.GeneratedMethodAccessor49.invoke(Unknown Source)

then the following repeats about 1000 times:
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.google.gerrit.index.query.QueryBuilder$ReflectionFactory.create(QueryBuilder.java:339)
at com.google.gerrit.index.query.QueryBuilder.operator(QueryBuilder.java:265)
at com.google.gerrit.index.query.QueryBuilder.operator(QueryBuilder.java:251)
at com.google.gerrit.index.query.QueryBuilder.toPredicate(QueryBuilder.java:220)
at com.google.gerrit.index.query.QueryBuilder.parse(QueryBuilder.java:187)
at com.google.gerrit.server.query.change.IsWatchedByPredicate.filters(IsWatchedByPredicate.java:53)
at com.google.gerrit.server.query.change.IsWatchedByPredicate.<init>(IsWatchedByPredicate.java:41)
at sun.reflect.GeneratedMethodAccessor49.invoke(Unknown Source)
before finally ending with:
at java.lang.reflect.Method.invoke(Method.java:498)
at com.google.gerrit.index.query.QueryBuilder$ReflectionFactory.create(QueryBuilder.java:339)
at com.google.gerrit.index.query.QueryBuilder.operator(QueryBuilder.java:265)
at com.google.gerrit.index.query.QueryBuilder.operator(QueryBuilder.java:251)

Richard Christie

Nov 12, 2018, 11:21:42 AM
to Repo and Gerrit Discussion
Further to this (having been tipped off by the "IsWatchedBy…" in the stack trace), it turns out the lockup was caused by an entry in one particular user's watched projects configuration (file: watch.config). The offending configuration is here:

[project "All-Projects"]
notify = is:watched [ABANDONED_CHANGES, ALL_COMMENTS, NEW_CHANGES, NEW_PATCHSETS, SUBMITTED_CHANGES]
[project "cpu/matterhorn/mth"]
notify = path:^logical/scpu_iside/verilog/scpu_mbuild.sv.mds [NEW_CHANGES, NEW_PATCHSETS]
notify = reviewer:self [ABANDONED_CHANGES, ALL_COMMENTS, NEW_CHANGES, NEW_PATCHSETS, SUBMITTED_CHANGES]
notify = cc:self [ABANDONED_CHANGES, ALL_COMMENTS, NEW_CHANGES, NEW_PATCHSETS, SUBMITTED_CHANGES]
notify = path:^logical/scpu_decode/verilog/scpu_dec1.sv.mds [NEW_CHANGES, NEW_PATCHSETS]
notify = is:watched [ABANDONED_CHANGES, ALL_COMMENTS, NEW_CHANGES, NEW_PATCHSETS, SUBMITTED_CHANGES]
notify = path:^logical/interfaces/.* [NEW_CHANGES, NEW_PATCHSETS]

I imagine the problem is checking whether "All-Projects" is being watched, though I haven't yet attempted to reproduce this in a test environment.
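
To find other accounts with similar entries, the watch.config files can be scanned for filters that themselves use is:watched. Below is a minimal Python sketch, assuming the file layout shown above ([project "..."] sections containing notify = <filter> [<types>] lines). The helper find_watch_filters is hypothetical, not a Gerrit tool, and whether is:watched is the actual trigger is unconfirmed.

```python
import re

# Section header such as: [project "All-Projects"]
SECTION = re.compile(r'^\s*\[project\s+"(?P<project>[^"]+)"\]\s*$')
# Entry such as: notify = is:watched [ALL_COMMENTS, NEW_CHANGES]
NOTIFY = re.compile(r'^\s*notify\s*=\s*(?P<filter>.*?)\s*\[(?P<types>[^\]]*)\]\s*$')


def find_watch_filters(config_text, operator="is:watched"):
    """Return (project, filter) pairs whose filter text mentions the operator."""
    hits = []
    project = None
    for line in config_text.splitlines():
        section = SECTION.match(line)
        if section:
            project = section.group("project")
            continue
        notify = NOTIFY.match(line)
        if notify and project is not None:
            # Plain substring check; this does not parse the query syntax.
            if operator in notify.group("filter"):
                hits.append((project, notify.group("filter")))
    return hits
```

Run against the configuration above, this would report the is:watched entries under both "All-Projects" and "cpu/matterhorn/mth".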