INTERMITTENT: "Refresh Credentials" or "Server Error" in web UI, but no corresponding error in log


motorhe...@gmail.com

Mar 21, 2022, 1:20:23 PM
to Repo and Gerrit Discussion
We have been receiving reports from our users of frequent "Refresh Credentials" messages in the web UI while using Gerrit. Once users re-enter their username and password, they can continue normally. In addition, some have reported “Server Unavailable” messages; when this happens during a long and involved code review, they lose all their existing review comments and have to start over, which costs productivity.

We have not found any corresponding error messages in the Gerrit server logs when this occurs, and we cannot reproduce it reliably (so far we see it ourselves only rarely). At first, we suspected that high user traffic and/or Java GC activity on Gerrit might be causing starvation or timeouts in authentication. However, recent reports occurred while the Gerrit server's load average was below 5/5/5, with few tasks in the queue and no GC running, so load or traffic on Gerrit may not be a factor in this issue.

One thing we tried back in January was to introduce a disk cache for web_sessions, since the default is to have no disk cache unless specified. This did not seem to have any effect since our users are still reporting it.

Can someone suggest a course of action we could take to investigate this issue further? Is the LDAP connection turnaround time, or any other such detail, logged anywhere? We see occasional errors in error_log when someone gets their password wrong or tries to use a bad username with the reviewers plugin, but so far we have found nothing in the logs that helps us understand what leads to this "Refresh Credentials" issue (or the seemingly random "Server Error" that appears in the web UI).

Gerrit version: 3.2.7
Authentication: LDAP
Arrangement: Single Master, Multi-Mirror
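
One way to look for traces of these events outside error_log is to tally HTTP responses with 4xx/5xx status codes per minute in the HTTP request log (httpd_log) and compare the timestamps with the user reports. The sketch below is only an illustration. It assumes Apache-style log lines in which the timestamp appears in square brackets and the quoted request is followed by the status code; the sample lines and the function name error_counts_per_minute are made up for the example, and the pattern may need adjusting to the actual log format of the Gerrit version in use.

```python
import re
from collections import Counter

# Assumed line shape (Apache-style); adjust for the real log format:
#   host - user [21/Mar/2022:13:20:23 +0000] "GET /path HTTP/1.1" 403 23
LINE_RE = re.compile(
    r'\[(?P<minute>\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2}[^\]]*\]'
    r'.*?"(?P<request>[^"]*)"\s+(?P<status>\d{3})\b'
)


def error_counts_per_minute(lines):
    """Count responses with status >= 400, keyed by (minute, status)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not fit the assumed shape
        status = int(match.group("status"))
        if status >= 400:
            counts[(match.group("minute"), status)] += 1
    return counts


# Hypothetical sample lines, not taken from a real server.
SAMPLE = [
    '10.0.0.1 - alice [21/Mar/2022:13:20:23 +0000] "GET /changes/1/detail HTTP/1.1" 200 512',
    '10.0.0.1 - - [21/Mar/2022:13:20:41 +0000] "POST /changes/1/revisions/2/review HTTP/1.1" 403 23',
    '10.0.0.2 - - [21/Mar/2022:13:20:59 +0000] "GET /accounts/self/detail HTTP/1.1" 403 23',
    '10.0.0.3 - bob [21/Mar/2022:13:21:05 +0000] "GET /changes/ HTTP/1.1" 503 0',
]

if __name__ == "__main__":
    for (minute, status), n in sorted(error_counts_per_minute(SAMPLE).items()):
        print(minute, status, n)
```

A burst of 403 or 5xx responses in the same minute as a user report would at least narrow down whether the request reached Gerrit at all; if nothing shows up, the failure may be happening in front of Gerrit (proxy, TLS, network).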

Matthias Sohn

Mar 21, 2022, 3:26:38 PM
to motorhe...@gmail.com, Repo and Gerrit Discussion
How did you configure the following options?

cache.web_sessions.maxAge
cache.web_sessions.memoryLimit
cache.web_sessions.diskLimit
 
-Matthias

motorhe...@gmail.com

Mar 22, 2022, 12:55:49 PM
to Repo and Gerrit Discussion
[cache "web_sessions"]
    maxAge = 2d
    memoryLimit = 16m
    diskLimit = 256m

hope it helps

Luca Milanesio

Mar 28, 2022, 2:17:27 PM
to Repo and Gerrit Discussion, Luca Milanesio, motorhe...@gmail.com

On 21 Mar 2022, at 17:20, motorhe...@gmail.com <motorhe...@gmail.com> wrote:

[...]

One thing we tried back in January was to introduce a disk cache for web_sessions, since the default is to have no disk cache unless specified. This did not seem to have any effect since our users are still reporting it.

Can you share your gerrit.config?

P.S. web_sessions are already stored on disk out of the box: what did you use for “disk cache”?

Luca.




motorhe...@gmail.com

Mar 30, 2022, 11:38:45 AM
to Repo and Gerrit Discussion
We read in our documentation that "If no disk cache is configured (or cache.web_sessions.diskLimit is set to 0) a server restart will force all users to sign-out, and need to sign-in again after the restart, as the cache was unable to persist the session information. Enabling a disk cache is strongly recommended.", which is why we thought that disk cache might be disabled by default for web_sessions.

Here is a sanitized copy of our gerrit.config, hope it helps:

[core]
    packedGitLimit = 16g
    packedGitWindowSize = 16k
    packedGitOpenFiles = 8192
[gerrit]
    basePath = xxxxxx
    canonicalWebUrl = xxxxxx
    canonicalGitUrl = xxxxxx
    reportBugUrl = xxxxxx
    serverId = xxxxxx
    ui = POLYGERRIT
    enableGwtUi = true
    enablePolyGerrit = true
[index]
    type = lucene
    threads = 17
[auth]
    type = LDAP
    gitBasicAuthPolicy = HTTP
    autoUpdateAccountActiveStatus = true
[accountDeactivation]
    startTime = Fri 22:00
    interval = 1 week
[ldap]
    server = xxxxxx
    username = xxxxxx
    accountBase = xxxxxx
    groupBase = xxxxxx
    guessRelevantGroups = false
[sendemail]
    smtpServer = xxxxxx
    connectTimeout = 2m
    threadPoolSize = 4
[container]
    user = gerritcr
    javaHome = /usr/java/jdk1.8.0_172-amd64
    heapLimit = 108g
[sshd]
    threads = 128
    listenAddress = *:29418
    backend = NIO2
    batchThreads = 16
    commandStartThreads = 5
    maxConnectionsPerUser = 48
    idleTimeout = 0
    waitTimeout = 600
[httpd]
    listenUrl = proxy-http://*:8087/
    maxThreads = 50
[cache]
    directory = cache
[cache "diff"]
    timeout = 600s
    memoryLimit = 1g
    diskLimit = 10g
[cache "diff_intraline"]
    timeout = 600s
    memoryLimit = 1g
    diskLimit = 10g
[cache "diff_summary"]
    timeout = 600s
    memoryLimit = 1g
    diskLimit = 10g

[cache "web_sessions"]
    maxAge = 2d
    memoryLimit = 16m
    diskLimit = 256m
[rules]
    reductionLimit = 2147483647
    compileReductionLimit = 2147483647
[gc]
    startTime = Sat 12:00
    interval = 1 week
[plugins]
    allowRemoteAdmin = true
[hooks]
    syncHookTimeout = 605
[download]
    scheme = ssh
    scheme = anon_git
[repository "*"]
    defaultSubmitType = FAST_FORWARD_ONLY
[commentlink "changeid"]
    match = (I[0-9a-f]{8,40})
    link = "#/q/$1"
[receive]
    timeout = 600s
    enableSignedPush = false
    threadPoolSize = 48
    changeUpdateThreads = 35
[changeCleanup]
    abandonAfter = 1 months
    startTime = Sat 01:00
    interval = 1 week
[plugin "login-redirect"]
    whitelist = /plugins/metrics-reporter-prometheus/
[plugin "metrics-reporter-prometheus"]
    excludeMetrics = ^jgit/block_cache/cache_used_per_repository/.*
    excludeMetrics = ^plugins/replication/latency_slower_than_threshold/900/.*
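
To double-check which values are actually set for a given cache, a git-config-style file like the one above can be read with a few lines of Python. This is a minimal sketch, not Gerrit's own parser: the function names parse_config and size_in_bytes are invented for the example, it ignores quoting, escapes, and line continuations, and it assumes the k/m/g size suffixes are powers of 1024.

```python
import re
from collections import defaultdict

# Matches [section] and [section "subsection"] headers.
SECTION_RE = re.compile(r'^\[(\w+)(?:\s+"([^"]*)")?\]$')


def parse_config(text):
    """Map (section, subsection) -> {key: [values]}; keys may repeat."""
    config = defaultdict(lambda: defaultdict(list))
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        header = SECTION_RE.match(line)
        if header:
            current = (header.group(1), header.group(2))
            continue
        if current is not None and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            config[current][key].append(value)
    return config


def size_in_bytes(value):
    """Convert '16m' style sizes; assumes k/m/g mean powers of 1024."""
    units = {"k": 1024, "m": 1024**2, "g": 1024**3}
    suffix = value[-1].lower()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)


# Fragment copied from the configuration above.
SAMPLE = '''
[cache "web_sessions"]
    maxAge = 2d
    memoryLimit = 16m
    diskLimit = 256m
[download]
    scheme = ssh
    scheme = anon_git
'''

if __name__ == "__main__":
    sessions = parse_config(SAMPLE)[("cache", "web_sessions")]
    print(sessions["maxAge"][-1])
    print(size_in_bytes(sessions["diskLimit"][-1]))
```

Values are kept as lists because a key such as scheme can legitimately appear more than once in the same section.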

Alexander Ost

Apr 13, 2022, 4:43:31 AM
to Repo and Gerrit Discussion
On Monday, March 21, 2022 at 6:20:23 PM UTC+1 motorhe...@gmail.com wrote:
[...] some have reported “Server Unavailable” messages; when this happens during a long and involved code review, they lose all their existing review comments and have to start over, which costs productivity. [...]

Same issue here -- sporadic "Server Unavailable" messages without any errors in the logs. Only Firefox users were affected. I traced this to sporadic TLS handshake errors, which are caused by a JDK bug (https://bugs.openjdk.java.net/browse/JDK-8241248).

The bug was fixed in OpenJDK 11.0.12 (note that Oracle JDK has the fix in 11.0.15, which isn't GA yet). In our case, the problem went away after switching to OpenJDK 11.0.14.
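
The version comparison described here can be written down as a small check. The sketch below is illustrative only: parse_version and has_tls_fix are hypothetical helper names, the only threshold used is the OpenJDK 11.0.12 figure given above, and for any other major version it returns None rather than guessing. It applies to OpenJDK builds only (Oracle JDK has a different fix version, as noted above) and assumes plain dotted version strings such as 11.0.14, optionally followed by a suffix like +9 or _172.

```python
import re

FIXED_IN_OPENJDK_11 = (11, 0, 12)  # threshold taken from the message above


def parse_version(text):
    """Turn '11.0.14' or '11.0.14.1+1' into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def has_tls_fix(version_text):
    """True/False for OpenJDK 11.x; None when no threshold is known here."""
    version = parse_version(version_text)
    if version[0] != 11:
        return None  # e.g. '1.8.0_172': not covered by the information above
    # Pad to three components so that '11' compares as 11.0.0.
    padded = version + (0,) * (3 - len(version))
    return padded >= FIXED_IN_OPENJDK_11


if __name__ == "__main__":
    for candidate in ("11.0.11", "11.0.12", "11.0.14", "1.8.0_172"):
        print(candidate, has_tls_fix(candidate))
```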

/alex
