Gerrit server is unable to fetch code properly.

Wayne SW

Jul 28, 2023, 2:17:34 AM
to Repo and Gerrit Discussion
The situation is this: our team's code is hosted in Gerrit, currently version 3.4.3. Team members often get stuck while fetching code. When we run "ssh -p 29418 user@user-domain gerrit show-queue -w", we can see that some users are occupying many processes. Checking the dates, we noticed that some of these processes are from yesterday, some from the day before, and some from even earlier.

Could this be the reason for the problem, or is there another direction that can help us?

Thank you.

Result of "ssh -p 29418 user@user-domain gerrit show-queue -w":
[attached image: 圖片 033.png]

My Gerrit configuration:
[database]
type = h2
database = db/ReviewDB
poolMaxIdle = 32 ### 48 ### Default is 4. If there are more idle connections, connections will be closed instead of being returned back to the pool.
poolLimit = 130 ###192 ### Default is 8. Should be at least higher then sshd.threads. maybe 10 units. #This limit must be several units higher than the total number of httpd and sshd threads as some request processing code paths may need multiple connections.

[container]
user = cm
javaHome = /usr/lib/jvm/java-11-openjdk-amd64
heapLimit = 170g ### 16g ### Suggestion is 1.5~2*CPU. Maximum heap size of the Java process running Gerrit.
javaOptions = "-Dflogger.backend_factory=com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance"
javaOptions = "-Dflogger.logging_context=com.google.gerrit.server.logging.LoggingContext#getInstance"
[sshd]
backend = NIO2
listenAddress = *:29418
threads = 72 ### 128 ### Default is 1.5*CPU. Probably should be around 2*CPU you have, but may go higher if you have a lot of remote WAN based connections.
batchThreads = 2 ### 0 ### Default is 0.
maxConnectionsPerUser = 16 ### default 64 ###
idleTimeout = 2s
[core]
packedGitOpenFiles = 2048 ### 128 ### That gives us plenty of breathing space for network sockets. Max=4096 Default=
packedGitLimit = 40g ### 100m ### Default is 10m. Should be at least 50% of your on disk usage for your fully packed repositories.
PackedGitWindowSize = 512k #### 64m ### Default is 8k. Too large may load more data than is required; too small may increase the frequency of read() system calls.
streamFileThreshold = 1g ### 4g ### Largest object size.
[httpd]
listenUrl = proxy-http://*:8081/
maxQueued = 30 ### default 50 ###
maxThreads = 50 ### default 25 ### NEW
[hooks]
path = /home/cm/gerrit/review_site/hooks
patchsetCreatedHook = patchset-created
changeMergedHook = change-merged
commentAddedHook = comment-added


[cache]
directory = cache
h2CacheSize = 2g ### added on 20221124
[cache "accounts"]
maxAge = 5 min ### added on 20221124
[cache "changes"]
maxAge = 5 min ### modify on 20211124
memoryLimit = 2048
[cache "diff"]
maxAge = 5 min ### modify on 20211124
memoryLimit = 5g ## added on 20211028 50m, modify on 20221031 1g
[cache "diff_intraline"]
maxAge = 5 min ### modify on 20211124
memoryLimit = 5g ## added on 20211028 50m, modify on 20221031 1g
[cache "groups"]
maxAge = 5 min ### added on 20221124
[cache "ldap_groups"]
maxAge = 5 min ### added on 20221124
[cache "ldap_usernames"]
maxAge = 5 min ### added on 20221124
[cache "projects"]
maxAge = 5 min ### modify on 20221124
memoryLimit = 2048 # added on 20211028 50m, modify on 20221031 1g


[index]
type = lucene
[recevie]
timeout = 0 ### 0 ### NEW
[pack]
threads = 8 ###NEW###
[plugins]
    allowRemoteAdmin = true
[receive]
enableSignedPush = false
[changeCleanup]
        abandonAfter = 3mon
        startTime = Sat 20:00
        interval = 2 days

Luca Milanesio

Jul 28, 2023, 3:47:27 AM
to Repo and Gerrit Discussion
Hi Wayne,

> On 28 Jul 2023, at 05:37, Wayne SW <will5...@gmail.com> wrote:
>
> The situation is this: our team's code is stored in Gerrit, and the current Gerrit version is 3.4.3. We often encounter issues where the team gets stuck while fetching code, and when we use "ssh -p 29418 user@user-domain gerrit show-queue -w" command, we can see that some users are occupying many processes. However, upon checking the dates, we noticed that some of these processes are from yesterday, some are from the day before yesterday, and even some are from even earlier.

It looks like they are stuck, possibly in a deadlock?

Are the fetches over Git/SSH?
Did you get a JVM thread dump? That would be the ultimate smoking gun for why they are stuck.

Luca.
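
For reference, a thread dump of the Gerrit JVM could be captured along these lines. This is only a minimal sketch under assumptions that are not stated in the thread: a JDK with jstack on the PATH, the script running as the OS user that owns the Gerrit process, and the PID file at $site_path/logs/gerrit.pid (adjust if your installation differs). The function name dump_gerrit_threads and the output file name are hypothetical.

```shell
#!/bin/sh
# Sketch: capture a JVM thread dump of a running Gerrit.
# Assumptions (not from the thread): PID file at $site_path/logs/gerrit.pid,
# jstack on PATH, run as the user that owns the Gerrit process.

dump_gerrit_threads() {
    site_path=$1                  # e.g. /home/cm/gerrit/review_site
    out_dir=${2:-.}               # directory to write the dump into
    pid=$(cat "$site_path/logs/gerrit.pid") || return 1
    out="$out_dir/gerrit-threads-$(date +%Y%m%d-%H%M%S).txt"
    # -l also prints lock information, which helps when hunting deadlocks
    jstack -l "$pid" > "$out" || return 1
    echo "$out"                   # print the path of the dump file
}

# Usage:
#   dump_gerrit_threads /home/cm/gerrit/review_site /tmp
```

Taking two or three dumps a few seconds apart makes it easier to tell threads that are truly stuck from threads that are merely busy.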

>
> Could this be the reason for the problem, or is there another direction that can help us?
>
> Thank you.
>
> ssh -p 29418 user@user-domain gerrit show-queue -w result:

Wayne SW

Jul 28, 2023, 4:40:44 AM
to Luca Milanesio, Repo and Gerrit Discussion
Hi Luca,

I am fetching the code through "repo", and all the fetches are run with -j8.

--
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en

---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/repo-discuss/ed8226f3-c653-4b7e-a2e9-3d6e562de7cfn%40googlegroups.com.

Matthias Sohn

Jul 28, 2023, 4:43:55 AM
to Wayne SW, Repo and Gerrit Discussion
On Fri, Jul 28, 2023 at 8:17 AM Wayne SW <will5...@gmail.com> wrote:
> The situation is this: our team's code is stored in Gerrit, and the current Gerrit version is 3.4.3. We often encounter issues where the team gets stuck while fetching code, and when we use "ssh -p 29418 user@user-domain gerrit show-queue -w" command, we can see that some users are occupying many processes. However, upon checking the dates, we noticed that some of these processes are from yesterday, some are from the day before yesterday, and even some are from even earlier.

You should consider updating to a supported release.
You should always upgrade to the latest service release of the release line you are using; for 3.4 this is 3.4.8.
Otherwise you miss available fixes (including some security fixes) and improvements.
 
> Could this be the reason for the problem, or is there another direction that can help us?

Do you run git gc on all repos on a regular schedule?
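
One way to get a regular schedule is Gerrit's own [gc] section in gerrit.config. The sketch below edits the file with "git config -f", which is just one convenient option; the schedule values (Sat 02:00, 1 week) and the function name enable_scheduled_gc are illustrative assumptions, not recommendations from this thread. Changes to gerrit.config typically need a Gerrit restart to take effect.

```shell
#!/bin/sh
# Sketch: add a [gc] section to gerrit.config so that Gerrit runs
# git garbage collection periodically. Schedule values are examples only.

enable_scheduled_gc() {
    config=$1                     # path to etc/gerrit.config
    git config -f "$config" gc.startTime "Sat 02:00" || return 1
    git config -f "$config" gc.interval "1 week"
}

# Usage:
#   enable_scheduled_gc /home/cm/gerrit/review_site/etc/gerrit.config
#
# A one-off run over all projects can also be triggered via SSH:
#   ssh -p 29418 user@user-domain gerrit gc --all
```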
 
> Thank you.
>
> ssh -p 29418 user@user-domain gerrit show-queue -w result:
> [image: 圖片 033.png]
> my gerrit configuration:
> [database]
> type = h2
> database = db/ReviewDB
> poolMaxIdle = 32 ### 48 ### Default is 4. If there are more idle connections, connections will be closed instead of being returned back to the pool.

Are you versioning your gerrit.config in Git? Then you can store such comments in the commit messages
instead of cluttering the config file. This also helps to recover from a bad config change.
 
> poolLimit = 130 ###192 ### Default is 8. Should be at least higher then sshd.threads. maybe 10 units. #This limit must be several units higher than the total number of httpd and sshd threads as some request processing code paths may need multiple connections.

Remove the [database] section; 3.x no longer uses a relational database for storing project and review metadata.
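
The section can be deleted by hand in an editor. As an alternative, here is a small sketch that does it with "git config -f" after saving a backup copy; the function name remove_database_section and the .bak suffix are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: drop the obsolete [database] section from gerrit.config,
# keeping a backup of the previous file next to it.

remove_database_section() {
    config=$1                     # path to etc/gerrit.config
    cp -p "$config" "$config.bak" || return 1
    git config -f "$config" --remove-section database
}

# Usage:
#   remove_database_section /home/cm/gerrit/review_site/etc/gerrit.config
```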
 
> [container]
> user = cm
> javaHome = /usr/lib/jvm/java-11-openjdk-amd64
> heapLimit = 170g ### 16g ### Suggestion is 1.5~2*CPU. Maximum heap size of the Java process running Gerrit.
> javaOptions = "-Dflogger.backend_factory=com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance"
> javaOptions = "-Dflogger.logging_context=com.google.gerrit.server.logging.LoggingContext#getInstance"
> [sshd]
> backend = NIO2
> listenAddress = *:29418
> threads = 72 ### 128 ### Default is 1.5*CPU. Probably should be around 2*CPU you have, but may go higher if you have a lot of remote WAN based connections.
> batchThreads = 2 ### 0 ### Default is 0.

Tune this according to how much load you get from non-interactive users (e.g. CI systems);
they should run on the batch thread pool to avoid affecting interactive users.
 
> maxConnectionsPerUser = 16 ### default 64 ###
> idleTimeout = 2s

This timeout seems pretty short.
 
> [core]
> packedGitOpenFiles = 2048 ### 128 ### That gives us plenty of breathing space for network sockets. Max=4096 Default=
> packedGitLimit = 40g ### 100m ### Default is 10m. Should be at least 50% of your on disk usage for your fully packed repositories.
> PackedGitWindowSize = 512k #### 64m ### Default is 8k. Too large may load more data than is required; too small may increase the frequency of read() system calls.
> streamFileThreshold = 1g ### 4g ### Largest object size.
> [httpd]
> listenUrl = proxy-http://*:8081/
> maxQueued = 30 ### default 50 ###
> maxThreads = 50 ### default 25 ### NEW
> [hooks]
> path = /home/cm/gerrit/review_site/hooks
> patchsetCreatedHook = patchset-created
> changeMergedHook = change-merged
> commentAddedHook = comment-added
>
> [cache]
> directory = cache
> h2CacheSize = 2g ### added on 20221124
> [cache "accounts"]
> maxAge = 5 min ### added on 20221124

Why do you use such a short maxAge for so many caches?
We only set maxAge for the web_sessions cache, since we want a longer maxAge there than the default of 12h.
Such short maxAge values cause unnecessary load and will negatively affect performance.
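
To fall back to the built-in defaults, the maxAge overrides can simply be deleted from the file. The sketch below does that with "git config -f" for the cache names that appear in the configuration quoted above; the function name drop_cache_max_age is hypothetical, and memoryLimit and other keys are left untouched.

```shell
#!/bin/sh
# Sketch: remove the per-cache maxAge overrides so that Gerrit falls back
# to its defaults. Cache names are taken from the quoted gerrit.config.

drop_cache_max_age() {
    config=$1                     # path to etc/gerrit.config
    for name in accounts changes diff diff_intraline groups \
                ldap_groups ldap_usernames projects; do
        # --unset exits non-zero when the key is absent; ignore that case
        git config -f "$config" --unset "cache.$name.maxAge" 2>/dev/null || true
    done
}

# Usage:
#   drop_cache_max_age /home/cm/gerrit/review_site/etc/gerrit.config
```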
 
> [cache "changes"]
> maxAge = 5 min ### modify on 20211124
> memoryLimit = 2048
> [cache "diff"]
> maxAge = 5 min ### modify on 20211124
> memoryLimit = 5g ## added on 20211028 50m, modify on 20221031 1g

Specify a large enough diskLimit for caches that are expensive to compute, like the diff caches.
This ensures the cache is persisted, so you don't have to recompute it after a restart.
Use the gerrit show-caches command to observe cache usage.
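
A sketch of how that might look for the two diff caches named in the quoted configuration. The limit passed in is a placeholder to be chosen from the observed cache usage, not a value recommended in this thread, and the function name set_diff_cache_disk_limit is hypothetical.

```shell
#!/bin/sh
# Sketch: set a diskLimit for the diff caches in gerrit.config.
# The limit is an assumption; size it from what show-caches reports.

set_diff_cache_disk_limit() {
    config=$1                     # path to etc/gerrit.config
    limit=$2                      # e.g. 10g (illustrative value)
    git config -f "$config" cache.diff.diskLimit "$limit" || return 1
    git config -f "$config" cache.diff_intraline.diskLimit "$limit"
}

# Usage:
#   set_diff_cache_disk_limit /home/cm/gerrit/review_site/etc/gerrit.config 10g
#
# Observe cache usage and hit ratios afterwards:
#   ssh -p 29418 user@user-domain gerrit show-caches
```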
 
> [cache "diff_intraline"]
> maxAge = 5 min ### modify on 20211124
> memoryLimit = 5g ## added on 20211028 50m, modify on 20221031 1g
> [cache "groups"]
> maxAge = 5 min ### added on 20221124
> [cache "ldap_groups"]
> maxAge = 5 min ### added on 20221124
> [cache "ldap_usernames"]
> maxAge = 5 min ### added on 20221124
> [cache "projects"]
> maxAge = 5 min ### modify on 20221124
> memoryLimit = 2048 # added on 20211028 50m, modify on 20221031 1g
>
> [index]
> type = lucene
> [recevie]

Typo: this section should be named [receive].
 
> timeout = 0 ### 0 ### NEW

Why don't you set a receive timeout? We use 5 min.
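
Both points (the misspelled section name and the missing timeout) could be addressed together. The sketch below removes the misspelled [recevie] section and sets the timeout in the correctly named [receive] section, leaving existing keys such as enableSignedPush in place. The 5 min value is the one mentioned above; the function name fix_receive_section is hypothetical.

```shell
#!/bin/sh
# Sketch: drop the misspelled [recevie] section and set a receive timeout
# in the correctly named [receive] section of gerrit.config.

fix_receive_section() {
    config=$1                     # path to etc/gerrit.config
    # The misspelled section may already be gone; ignore that case
    git config -f "$config" --remove-section recevie 2>/dev/null || true
    git config -f "$config" receive.timeout "5 min"
}

# Usage:
#   fix_receive_section /home/cm/gerrit/review_site/etc/gerrit.config
```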
 
> [pack]
> threads = 8 ###NEW###
> [plugins]
>     allowRemoteAdmin = true
> [receive]
> enableSignedPush = false
> [changeCleanup]
>         abandonAfter = 3mon
>         startTime = Sat 20:00
>         interval = 2 days


James Muir

Jul 28, 2023, 2:50:01 PM
to Repo and Gerrit Discussion

> Are you versioning your gerrit.config in git ? Then you can store such comments in the commit messages
> instead of cluttering the config file. Also this can help to recover from a bad config change.

Does Gerrit provide the ability to version gerrit.config, or do you simply create a Git repo inside "$site_path/etc" using "git init"?

Matthias Sohn

Jul 28, 2023, 4:09:19 PM
to James Muir, Repo and Gerrit Discussion
We have a clone of a config repo located outside the site directory and symlink the gerrit.config it contains to "$site_path/etc/gerrit.config".
We use a cron job to synchronize this clone periodically by fetching from the corresponding bare repo, which is hosted in Gerrit.
Config updates are pushed for review, and when the corresponding change is submitted, the cron job updates the clone a few minutes later.
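
The setup described above might be sketched as follows. Everything specific here is an assumption made for illustration: the function name link_versioned_config, the .orig suffix for the previous file, the example paths, and the five-minute cron interval. The clone directory should be given as an absolute path so that the symlink resolves.

```shell
#!/bin/sh
# Sketch: clone a config repo outside the site directory and symlink its
# gerrit.config into the site. Names and paths are hypothetical.

link_versioned_config() {
    repo_url=$1                   # URL of the config repo hosted in Gerrit
    clone_dir=$2                  # absolute path outside the site directory
    site_path=$3                  # Gerrit site directory
    target="$site_path/etc/gerrit.config"

    git clone -q "$repo_url" "$clone_dir" || return 1
    # Keep the previous plain file around instead of overwriting it
    if [ -e "$target" ] && [ ! -L "$target" ]; then
        mv "$target" "$target.orig" || return 1
    fi
    ln -sfn "$clone_dir/gerrit.config" "$target"
}

# Usage:
#   link_versioned_config ssh://user@user-domain:29418/gerrit-config \
#       /home/cm/gerrit-config /home/cm/gerrit/review_site
#
# Example crontab entry to refresh the clone every five minutes:
#   */5 * * * * git -C /home/cm/gerrit-config pull --ff-only --quiet
```

Since most gerrit.config settings are only read at startup, an updated file typically still needs a Gerrit restart before the change becomes active.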
