Gerrit HA and Index failure

lingalugari mohankrishna

May 26, 2025, 12:25:02 PM

to Repo and Gerrit Discussion

Hello Experts,

We have a 2-node Gerrit cluster running 3.6.8. We know it is EOL (an upgrade is planned).

I have frequently seen the number of HTTP requests on one node climb very high. At the same time, the task queue shows repository index tasks being triggered for certain repos, and they fail to sync the changes while the HTTP queue is high.

This is seriously impacting access to the server UI. Any idea how to fix this? I suspect HA is not replicating the changes. When I check the queue I see events like the ones below; sometimes there are fewer and sometimes more, but the HTTP queues go very high.

send event:ref-replication-scheduled => http://x.x.x.:8080 (try #0)
index change:dev%2Fman%2Fsam~517760 => http://x.x.x.:8080 (try #0)
After a service restart everything returns to normal; I had to restart each node twice today.

Before Restart:
Gerrit Code Review 3.6.8 now 10:07:28 UTC
uptime 6 days 15 hrs

  Name                          |Entries              |  AvgGet |Hit Ratio|
                                |   Mem   Disk   Space|         | Mem Disk|
--------------------------------+---------------------+---------+---------+
  adv_bases                     |                     |         |         |
  change_notes                  |   123               |  29.6ms | 97%     |
  changeid_project              |  1024               |         | 85%     |
  changes_by_project            |     1               |         |  0%     |
  default_preferences           |                     |         |         |
  external_ids_map              |     1               |         | 99%     |
  groups                        |                     |         |         |
  groups_bymember               |  1024               | 290.8us | 99%     |
  groups_byname                 |     1               | 823.5us | 99%     |
  groups_bysubgroup             |   357               | 258.0us | 99%     |
  groups_byuuid                 |  1339               |   1.6ms | 99%     |
  groups_external               |     1               |    1.7s | 99%     |
  groups_external_persisted     |                     |    1.6s |  0%     |
  ldap_group_existence          |     7               | 338.3ms | 87%     |
  ldap_groups                   |   653               | 862.9ms | 99%     |
  ldap_groups_byinclude         |  1024               |         | 96%     |
  ldap_usernames                |  1024               | 101.7us | 97%     |
  permission_sort               |  1024               |  91.9us | 97%     |
  plugin_resources              |     7               |         |  2%     |
  project_list                  |     1               |   40.2s | 98%     |
  projects                      |  1024               |   2.9ms | 37%     |
  prolog_rules                  |                     |         |         |
  soy_sauce_compiled_templates  |     1               |  35.5ms | 99%     |
  sshkeys                       |   637               |  60.5ms | 99%     |
  static_content                |    34               |   1.4ms | 85%     |
  lfs-lfs_project_locks         |                     |         |         |
  plugin-manager-plugins_list   |     1               |    3.0s |  0%     |
D accounts                      |  1024  22626  10.95m| 365.1us | 99% 100%|
D change_kind                   | 17049 891328 109.89m|   1.4ms | 97%  99%|
D comment_context               | 17950 251854 158.79m|   6.2ms | 83%  99%|
D conflicts                     | 11388 554225  67.54m|  36.4ms | 75%  99%|
D diff_intraline                |  6516 101892 128.62m|   5.3ms | 18%  98%|
D diff_summary                  |  4087 719529 504.20m|   9.8ms | 77%  99%|
D gerrit_file_diff              | 19139 472656 663.51m|   3.0ms | 52%  97%|
D git_file_diff                 |  8223 342020 505.69m|  38.3ms |  2%  37%|
D git_modified_files            |  4845  43191 174.04m|   3.4ms |  7%  43%|
D git_tags                      |  1024   2108  61.55m|   1.6ms | 66% 100%|
D groups_byuuid_persisted       |         1157   1.04m|         |     100%|
D mergeability                  |  8738 928474 128.28m|  80.7ms | 72%  99%|
D modified_files                |  7893  49839 187.39m|   1.2ms | 95%  92%|
D oauth_tokens                  |                0.00k|         |         |
D persisted_projects            |        13300 661.88m|         |      99%|
D pure_revert                   |   100   7146   1.04m|   5.1ms | 95% 100%|
D web_sessions                  |                0.00k|         |         |

SSH: 29 users, oldest session started 4 days 23 hrs ago
Tasks: 597 total = 74 running + 299 ready + 224 sleeping
Mem: 190.00g total = 125.22g used + 54.78g free + 10.00g buffers
190.00g max
707 open files

Threads: 64 CPUs available, 1564 threads
                          NEW  RUNNABLE  BLOCKED  WAITING  TIMED_WAITING  TERMINATED
ReceiveCommits              0         0        0       64              0           0
SshCommandStart             0         0        0       24              0           0
SSH-Interactive-Worker      0         0        0      114              0           0
SSH git-receive-pack        0         1        0        0              1           0
H2                          0         0        0        0             34           0
HTTP                        0       601      189        1              9           0
SSH git-upload-pack         0        39        0        1             40           0
Other                       0       110        2      211             58           0
SSH-Stream-Worker           0         0        0       65              0           0

Luca Milanesio

May 28, 2025, 3:23:14 AM

to Repo and Gerrit Discussion, Luca Milanesio

On 26 May 2025, at 18:25, lingalugari mohankrishna <lingalugari....@gmail.com> wrote:

Hello Experts,

We have a 2-node Gerrit cluster running 3.6.8. We know it is EOL (an upgrade is planned).

Yes, please do upgrade. We have fixed many issues in the HA plugin, and those fixes simply do not exist in your version.

I have frequently seen the number of HTTP requests on one node climb very high. At the same time, the task queue shows repository index tasks being triggered for certain repos, and they fail to sync the changes while the HTTP queue is high.

This is seriously impacting access to the server UI. Any idea how to fix this? I suspect HA is not replicating the changes. When I check the queue I see events like the ones below; sometimes there are fewer and sometimes more, but the HTTP queues go very high.

First: upgrade to get most of the fixes in the HA plugin.

Second: check your change.mergeabilityComputationBehavior setting in gerrit.config and make sure it is disabled (the default) [1].
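
For reference, a minimal sketch of how that setting would look in gerrit.config if mergeability computation is switched off explicitly. NEVER is one of the values documented for this option in [1]; leaving the option unset should behave the same way as long as the default has not been overridden elsewhere.

```ini
# $GERRIT_SITE/etc/gerrit.config
[change]
  # Do not compute mergeability on ref updates or on change reindexing.
  mergeabilityComputationBehavior = NEVER
```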

send event:ref-replication-scheduled => http://x.x.x.:8080 (try #0)

I don’t believe you have just two nodes: how can two nodes in HA configuration schedule replication events?

Can you please share with us the full picture?

index change:dev%2Fman%2Fsam~517760 => http://x.x.x.:8080 (try #0)
After a service restart everything returns to normal; I had to restart each node twice today.

Well, after the service restart the node will be back to normal, but you’ll have a lot of stale changes in the index: that’s not exactly *normal*, is it?

Before Restart:
Gerrit Code Review 3.6.8 now 10:07:28 UTC
uptime 6 days 15 hrs

Name |Entries | AvgGet |Hit Ratio|
| Mem Disk Space| |Mem Disk|
--------------------------------+---------------------+---------+---------+
adv_bases | | | |
change_notes | 123 | 29.6ms | 97% |
changeid_project | 1024 | | 85% |

Your changeid_project cache is maxed out. You do have more than 1024 projects I believe.

changes_by_project | 1 | | 0% |
default_preferences | | | |
external_ids_map | 1 | | 99% |
groups | | | |
groups_bymember | 1024 | 290.8us | 99% |

Your groups_bymember cache is maxed out. You do have more than 1024 groups.

groups_byname | 1 | 823.5us | 99% |
groups_bysubgroup | 357 | 258.0us | 99% |
groups_byuuid | 1339 | 1.6ms | 99% |
groups_external | 1 | 1.7s | 99% |
groups_external_persisted | | 1.6s | 0% |
ldap_group_existence | 7 | 338.3ms | 87% |
ldap_groups | 653 | 862.9ms | 99% |
ldap_groups_byinclude | 1024 | | 96% |
ldap_usernames | 1024 | 101.7us | 97% |
permission_sort | 1024 | 91.9us | 97% |
plugin_resources | 7 | | 2% |
project_list | 1 | 40.2s | 98% |
projects | 1024 | 2.9ms | 37% |

Same as above.

prolog_rules | | | |
soy_sauce_compiled_templates | 1 | 35.5ms | 99% |
sshkeys | 637 | 60.5ms | 99% |
static_content | 34 | 1.4ms | 85% |
lfs-lfs_project_locks | | | |
plugin-manager-plugins_list | 1 | 3.0s | 0% |
D accounts | 1024 22626 10.95m| 365.1us | 99% 100%|

Your accounts in-memory cache is maxed out: you have 22k users but only 1k of them are loaded in the in-memory cache.

Bottom line: your setup would need a bit of review and adjustments to become more suitable for production. With over 22k users, you would need a substantial health check of your setup to avoid them going through painful restarts.
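
As a rough sketch of the kind of adjustment meant here, in-memory cache sizes can be raised per cache with cache.<name>.memoryLimit in gerrit.config. The numbers below are illustrative assumptions only, chosen to sit above the entry counts visible in the show-caches output; they are not tuned recommendations and would need to be checked against the available heap.

```ini
# $GERRIT_SITE/etc/gerrit.config
# All limits below are assumed example values, not recommendations.
[cache "accounts"]
  # 22626 accounts are persisted on disk, but only 1024 fit in memory today.
  memoryLimit = 32768
[cache "groups_bymember"]
  memoryLimit = 4096
[cache "changeid_project"]
  memoryLimit = 4096
[cache "projects"]
  # persisted_projects shows 13300 entries on disk.
  memoryLimit = 16384
```

After a change like this, the Entries and Hit Ratio columns of show-caches are the place to confirm whether the caches are still pinned at their limit.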

HTH

Luca.

[1] https://gerrit-documentation.storage.googleapis.com/Documentation/3.6.8/config-gerrit.html#change.mergeabilityComputationBehavior

To view this discussion visit https://groups.google.com/d/msgid/repo-discuss/b5ad845d-1abe-404f-80c9-1b60d6e05cb7n%40googlegroups.com.

lingalugari mohankrishna

May 28, 2025, 8:58:11 AM

to Repo and Gerrit Discussion

Hi Luca,

Thanks for pointing that out. My current Gerrit config is below. Can you help me fine-tune it to improve performance?
GERRIT.CONFIG
[gerrit]
  basePath = /shared/git
[index]
  type = lucene
  ramBufferSize = 4096m
  maxTerms = 8192
  maxBufferedDocs = 3000
  threads = 16
  batchThreads = 16
  maxMergeCount = 100
  reuseExistingDocuments = true
  defaultLimit = 100
  maxLimit = 500
  cacheQueryResultsByChangeNum = true

[receive]
  enableSignedPush = false
  timeout = 150min
[transfer]
  timeout = 120s
[sendemail]
  smtpServer = localhost
  smtpServerPort = 25
  smtpUser = gerrit
[container]
  user = gerrit
  #javaHome = /usr/java/latest
  javaHome = /usr/lib/jvm/jre
  heapLimit = 190g
  javaOptions = "-Dflogger.backend_factory=com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance"
  javaOptions = "-Dflogger.logging_context=com.google.gerrit.server.logging.LoggingContext#getInstance"
[sshd]
  listenAddress = *:29418
  idleTimeout = 10m
  backend = MINA
  threads = 200
  batchThreads = 70
  commandStartThreads = 24
  maxConnectionsPerUser = 64
[httpd]
  listenUrl = proxy-https://*:8080/
  maxThreads = 1000
  requestLog = true
  acceptorThreads = 48
  minThreads = 49
  maxQueued = 2000
[cache]
  directory = cache
  threads = 0
[cache "project_list"]
  maxAge = 130s
[gitweb]
  cgi = /var/www/git/gitweb.cgi
  type = gitweb
[core]
  packedGitLimit = 10g
  packedGitWindowSize = 16k
  packedGitOpenFiles = 10240
#[gc]
#  startTime = Sun 00:00
#  interval = 1 w
[pack]
  threads = 24
  windowMemory = 16g
[log "channel.name"]
  level = DEBUG
[plugins]
  allowRemoteAdmin = true
[hooks]
  path = /data/gerrit/hooks
  syncHookTimeout = 900
  commitReceivedHook = commit-received
[user]
  email = ger...@gmail.com
[change]
  # mergeabilityComputationBehavior = API_REF_UPDATED_AND_CHANGE_REINDEX
  cumulativeCommentSizeLimit = 10m
  maxComments = 40000
  maxUpdates = 10000
  conflictsPredicateEnabled = true
  submitWholeTopic = true
  maxPatchSets = 1000000
[lfs]
  plugin = lfs

[accountPatchReviewDb]
  url = jdbc:postgresql://postgres:5432/and_ha?user=mohan&password=krishna
[database "h2"]
  autoServer = true

and
HA.config

[main]
  sharedDirectory = /shared/git

[peerInfo]
  strategy = static

[autoReindex]
  enabled = false
  delay = 10
  pollInterval = 0

[peerInfo "static"] (on Node2)
  url = http://Node1:8080
  url = http://Node2:8080

[peerInfo "static"] (on Node1)
  url = http://Node2:8080
  url = http://Node1:8080

[http]
  maxTries = 360
  retryInterval = 10000
  connectionTimeout = 5000
  socketTimeout = 5000

[healthcheck]
  enable = true

[cache]
  threadPoolSize = 8
[index]
  threadPoolSize = 12
  numStripedLocks = 10000
  maxTries = 10
[websession]
  cleanupInterval = 24 hours

Matthias Sohn

May 28, 2025, 10:37:39 AM

to lingalugari mohankrishna, Repo and Gerrit Discussion

On Wed, May 28, 2025 at 2:58 PM lingalugari mohankrishna <lingalugari....@gmail.com> wrote:

Hi Luca,

Thanks for pointing out and my current Gerrit config looks like below. can you help me on fine tuning it to improve performance ?
GERRIT.CONFIG
[gerrit]
basePath = /shared/git
[index]
type = lucene
ramBufferSize = 4096m

This option doesn't exist at the [index] level, but there is a per-index option called index.<index name>.ramBufferSize.
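
A minimal sketch of the per-index form, assuming the Lucene index names changes_open and changes_closed. The buffer sizes are placeholders chosen for illustration and are not tuned for this installation.

```ini
# $GERRIT_SITE/etc/gerrit.config
[index]
  type = lucene
# Assumed example values; size these from observed indexing behaviour.
[index "changes_open"]
  ramBufferSize = 64m
[index "changes_closed"]
  ramBufferSize = 32m
```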

maxTerms = 8192
maxBufferedDocs = 3000
threads = 16
batchThreads = 16
maxMergeCount = 100
reuseExistingDocuments = true
defaultLimit = 100
maxLimit = 500
cacheQueryResultsByChangeNum = true

[receive]
enableSignedPush = false
timeout = 150min

Why do you think you need a 2.5 hour timeout for receiving push requests ?

[transfer]
timeout = 120s
[sendemail]
smtpServer = localhost
smtpServerPort = 25
smtpUser = gerrit
[container]
user = gerrit
#javaHome = /usr/java/latest
javaHome = /usr/lib/jvm/jre

Which Java version are you using ?

heapLimit = 190g
javaOptions = "-Dflogger.backend_factory=com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance"
javaOptions = "-Dflogger.logging_context=com.google.gerrit.server.logging.LoggingContext#getInstance"
[sshd]
listenAddress = *:29418
idleTimeout = 10m
backend = MINA
threads = 200

sshd.threads limits the number of concurrently executed ssh requests and also the concurrently executed git requests (both ssh and http),
see https://gerrit-documentation.storage.googleapis.com/Documentation/3.6.8/config-gerrit.html#sshd.threads

As a rule of thumb, fetching from a repo can keep one CPU core busy, hence allowing up to 200 concurrent git requests on a 64 CPU machine may overload it.

batchThreads = 70
commandStartThreads = 24
maxConnectionsPerUser = 64
[httpd]
listenUrl = proxy-https://*:8080/
maxThreads = 1000
requestLog = true
acceptorThreads = 48
minThreads = 49
maxQueued = 2000
[cache]
directory = cache
threads = 0
[cache "project_list"]
maxAge = 130s

Why do you think you need that ? This means gerrit potentially has to scan the file system to find projects every 2 minutes.

[change]
# mergeabilityComputationBehavior = API_REF_UPDATED_AND_CHANGE_REINDEX
cumulativeCommentSizeLimit = 10m
maxComments = 40000
maxUpdates = 10000
conflictsPredicateEnabled = true
submitWholeTopic = true
maxPatchSets = 1000000

These options are meant to protect your gerrit server from a run-away script or a crazy user, to prevent issues caused by an excessive size or number of objects related to a change. You are effectively switching off these limits. A more cautious approach would be to start with the defaults and increase a limit when a user hits it with a reasonable use case you want to support.

Though it would be an interesting test to push 1 million patchsets for a single change and see how gerrit can cope with it.

[lfs]
plugin = lfs

[accountPatchReviewDb]
url = jdbc:postgresql://postgres:5432/and_ha?user=mohan&password=krishna
[database "h2"]

The database section ceased to exist in gerrit 3.x. It was used to configure reviewdb in gerrit 2.x.

autoServer = true

and
HA.config

[main]
sharedDirectory = /shared/git

Which type of filesystem are you using for the sharedDir ?

[peerInfo]
strategy = static

[autoReindex]
enabled = false
delay = 10
pollInterval = 0

[peerInfo "static"] (on Node2)
url = http://Node1:8080
url = http://Node2:8080

[peerInfo "static"] (on Node1)
url = http://Node2:8080
url = http://Node1:8080

[http]
maxTries = 360
retryInterval = 10000
connectionTimeout = 5000
socketTimeout = 5000

[healthcheck]
enable = true

[cache]
threadPoolSize = 8
[index]
threadPoolSize = 12
numStripedLocks = 10000

Where did you find this option ?

maxTries = 10
[websession]
cleanupInterval = 24 hours

On Wednesday, 28 May 2025 at 12:53:14 UTC+5:30 Luca Milanesio wrote:

On 26 May 2025, at 18:25, lingalugari mohankrishna <lingalugari....@gmail.com> wrote:

Hello Experts,

We have a 2-node Gerrit cluster running 3.6.8. We know it is EOL (an upgrade is planned).

Yes, please do upgrade. We have fixed many issues in the HA plugin, and those fixes simply do not exist in your version.

I have frequently seen the number of HTTP requests on one node climb very high. At the same time, the task queue shows repository index tasks being triggered for certain repos, and they fail to sync the changes while the HTTP queue is high.

This is seriously impacting access to the server UI. Any idea how to fix this? I suspect HA is not replicating the changes. When I check the queue I see events like the ones below; sometimes there are fewer and sometimes more, but the HTTP queues go very high.

First: upgrade to get most of the fixes in the HA plugin.
Second: check your change.mergeabilityComputationBehavior setting in gerrit.config and make sure it is disabled (the default).

send event:ref-replication-scheduled => http://x.x.x.:8080 (try #0)

I don’t believe you have just two nodes: how can two nodes in HA configuration schedule replication events?
Can you please share with us the full picture?

The high-availability plugin relies on a shared file system to let all nodes in the HA cluster access git repositories.
Hence there's no replication involved in that HA setup.
If you want to store the data in a separate storage for each node you should look at a multi-site setup instead.

Luca Milanesio

May 29, 2025, 5:33:41 PM

to Repo and Gerrit Discussion, Luca Milanesio

On 28 May 2025, at 16:37, Matthias Sohn <matthi...@gmail.com> wrote:

On Wed, May 28, 2025 at 2:58 PM lingalugari mohankrishna <lingalugari....@gmail.com> wrote:
Hi Luca,

Thanks for pointing out and my current Gerrit config looks like below. can you help me on fine tuning it to improve performance ?
GERRIT.CONFIG
[gerrit]
  basePath = /shared/git
[index]
  type = lucene
  ramBufferSize = 4096m

this option doesn't exist, but there is an option for each index called index.<index name>.ramBufferSize

I am curious where did you get this suggestion from? What were you trying to achieve? Did you experience some Lucene slowdowns? Do we have some outdated documentation mentioning it somewhere?

  maxTerms = 8192
  maxBufferedDocs = 3000
  threads = 16
  batchThreads = 16
  maxMergeCount = 100
  reuseExistingDocuments = true

This option is not available on Gerrit v3.8x., where did you find it?

  defaultLimit = 100
  maxLimit = 500
  cacheQueryResultsByChangeNum = true

This isn’t needed as the default is ’true’ anyway.

[receive]
enableSignedPush = false
timeout = 150min

Why do you think you need a 2.5 hour timeout for receiving push requests ?

+1, you should look at any long-running connections and why they are taking so long.

[transfer]
  timeout = 120s
[sendemail]
  smtpServer = localhost
  smtpServerPort = 25
  smtpUser = gerrit

[container]
  user = gerrit
  #javaHome = /usr/java/latest
  javaHome = /usr/lib/jvm/jre

Which Java version are you using ?

  heapLimit = 190g
  javaOptions = "-Dflogger.backend_factory=com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance"
  javaOptions = "-Dflogger.logging_context=com.google.gerrit.server.logging.LoggingContext#getInstance"
[sshd]
  listenAddress = *:29418
  idleTimeout = 10m
  backend = MINA
  threads = 200

sshd.threads limits the number of concurrently executed ssh requests and also the concurrently executed git requests (both ssh and http)
see https://gerrit-documentation.storage.googleapis.com/Documentation/3.6.8/config-gerrit.html#sshd.threads
As a rule of thumb fetching from a repo can keep one CPU core busy, hence allowing up to
200 concurrent git requests on a 64 CPU machine may overload it.

+1, you should not exceed 2x the number of CPUs, therefore 128 threads in your case.
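
Applied to this 64-CPU host, that rule of thumb would give something like the fragment below. The value is simply 2 x 64, an upper bound derived from the advice above rather than a measured optimum.

```ini
# $GERRIT_SITE/etc/gerrit.config
[sshd]
  # 2 x 64 CPUs = 128, down from the 200 currently configured.
  threads = 128
```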

  batchThreads = 70
  commandStartThreads = 24
  maxConnectionsPerUser = 64
[httpd]
  listenUrl = proxy-https://*:8080/
  maxThreads = 1000

1000 threads??? I doubt your box would be able to manage 1000 concurrent REST API calls.

  requestLog = true
  acceptorThreads = 48
  minThreads = 49
  maxQueued = 2000

How many concurrent users do you have daily? You’re allowing 1000 concurrent requests and 2000 queued to be executed, 3000 in total.

[cache]
  directory = cache
  threads = 0
[cache "project_list"]
  maxAge = 130s

Why do you think you need that ? This means gerrit potentially has to scan the file system to find projects every 2 minutes.

+1

[gitweb]
  cgi = /var/www/git/gitweb.cgi
  type = gitweb
[core]
  packedGitLimit = 10g

You have 190g of heap, and just 10g of JGit cache?

Have you analysed how efficient it is and what’s the eviction rate?

packedGitWindowSize = 16k
packedGitOpenFiles = 10240

I doubt 10k open files will suffice in your case, unless you have a small number of small repositories.

#[gc]
# startTime = Sun 00:00
# interval = 1 w

[pack]
threads = 24
windowMemory = 16g

That’s only read as part of the TransferConfig, but not for all the rest of JGit.

I’d recommend putting the JGit-specific configs in $GERRIT_SITE/etc/jgit.config, as they’ll be used everywhere.
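
A sketch of what that move could look like, reusing the values from the posted gerrit.config unchanged. Whether these particular values are sensible is a separate question; only the file location follows the advice above.

```ini
# $GERRIT_SITE/etc/jgit.config
# Values copied as-is from the [pack] section of the posted gerrit.config.
[pack]
  threads = 24
  windowMemory = 16g
```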

[log "channel.name"]
level = DEBUG

DEBUG? Are you troubleshooting some issues? DEBUG should never be used in production; what is ‘channel.name’?

[plugins]
  allowRemoteAdmin = true
[hooks]
  path = /data/gerrit/hooks
  syncHookTimeout = 900

Waiting for 15 minutes for a synchronous hook to complete means you’re blocking threads for a very very long time.

Why would a sync hook take so long to execute?

commitReceivedHook = commit-received
[user]
email = ger...@gmail.com

Do you really own that e-mail?

[change]
# mergeabilityComputationBehavior = API_REF_UPDATED_AND_CHANGE_REINDEX
  cumulativeCommentSizeLimit = 10m
  maxComments = 40000
  maxUpdates = 10000
  conflictsPredicateEnabled = true
  submitWholeTopic = true
  maxPatchSets = 1000000

These options are meant to protect your gerrit server from a run-away script
or a crazy user to prevent issues caused by excessive size/number of objects related to a change.
You are effectively switching off these limits. A more cautious approach would be to start with the defaults
and increase limits when a user hits it with a reasonable use case you want to support.

Though it would be an interesting test to push 1 million patchsets for a single change
and see how gerrit can cope with it.

+1

[lfs]
plugin = lfs

[accountPatchReviewDb]
url = jdbc:postgresql://postgres:5432/and_ha?user=mohan&password=krishna

Did you share the production credentials by mistake here?

[database "h2"]

The database section ceased to exist in gerrit 3.x. It was used to configure reviewdb in gerrit 2.x.

autoServer = true

and
HA.config

[main]
sharedDirectory = /shared/git

Which type of filesystem are you using for the sharedDir ?

I believe it should be NFS, correct?

Can you share the NFS version, mount and cache options?

Also, have you checked the $GERRIT_SITE/etc/jgit.config for the relevant NFS-specific options?

[peerInfo]
  strategy = static

[autoReindex]
  enabled = false
  delay = 10
  pollInterval = 0

[peerInfo "static"] (on Node2)
  url = http://Node1:8080
  url = http://Node2:8080


[peerInfo "static"] (on Node1)
  url = http://Node2:8080
  url = http://Node1:8080

[http]
  maxTries = 360
  retryInterval = 10000
  connectionTimeout = 5000
  socketTimeout = 5000

You are retrying the calls for 360 * 10 = 3,600 seconds (1h).

That means that if one of the nodes goes down, the one left will keep accumulating retries and the associated objects in memory.

As a result, it will run out of memory *if the retries* go on for 1h, and then you’ll end up with a global outage.

Where did you get the recommendation for those settings?

They do not seem to be randomly chosen, so there must have been a rationale behind them.
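
To make the arithmetic concrete: the retry window is maxTries x retryInterval. The fragment below shows a shorter window in the HA plugin config (called HA.config in the post). The maxTries value is an arbitrary assumption for illustration; the right number depends on how long a peer outage the surviving node is expected to ride out.

```ini
# HA plugin configuration
[http]
  # As posted:      360 tries x 10000 ms = 3600 s (1 h)
  # Assumed example: 30 tries x 10000 ms =  300 s (5 min)
  maxTries = 30
  retryInterval = 10000
  connectionTimeout = 5000
  socketTimeout = 5000
```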

[healthcheck]
enable = true

I do not recommend using the one included in the high-availability plugin; use the healthcheck plugin instead.

I confirm my final statement: I don’t believe you have *one* problem but a *series of problems* due mainly to misconfiguration of the system.

Based on the traffic you have, the data on the repositories and the users, it should be pretty straightforward to come to a much more balanced set of settings.

HTH

Luca.

Matthias Sohn

May 29, 2025, 6:19:12 PM

to Luca Milanesio, Repo and Gerrit Discussion

On Thu, May 29, 2025 at 11:33 PM Luca Milanesio <luca.mi...@gmail.com> wrote:

On 28 May 2025, at 16:37, Matthias Sohn <matthi...@gmail.com> wrote:

On Wed, May 28, 2025 at 2:58 PM lingalugari mohankrishna <lingalugari....@gmail.com> wrote:
Hi Luca,

Thanks for pointing out and my current Gerrit config looks like below. can you help me on fine tuning it to improve performance ?
GERRIT.CONFIG
[gerrit]
  basePath = /shared/git
[index]
  type = lucene
  ramBufferSize = 4096m

this option doesn't exist, but there is an option for each index called index.<index name>.ramBufferSize

I am curious where did you get this suggestion from? What were you trying to achieve? Did you experience some Lucene slowdowns? Do we have some outdated documentation mentioning it somewhere?

  maxTerms = 8192
  maxBufferedDocs = 3000
  threads = 16
  batchThreads = 16
  maxMergeCount = 100
  reuseExistingDocuments = true

This option is not available on Gerrit v3.8x., where did you find it?

This option has been available since 3.10.

Finding the optimal configuration is challenging with hundreds of Gerrit options.
Can we make this easier ?

Luca Milanesio

May 29, 2025, 6:28:58 PM

to Repo and Gerrit Discussion, Luca Milanesio

I have been advocating for years for having safer defaults :-) and I’m still all for it.

Some examples:

1) Infinite timeouts (still today, Gerrit waits indefinitely for the SMTP server to respond)

2) Unsafe defaults which may cause repository corruption (we have warnings in the release notes, but I’m unsure how many people actually read them)

Luca.

Matthias Sohn

May 31, 2025, 5:44:10 PM

to Luca Milanesio, Repo and Gerrit Discussion

ok, let's tackle this. Here's a start:

https://gerrit-review.googlesource.com/c/gerrit/+/480301

https://gerrit-review.googlesource.com/c/gerrit/+/480302

https://gerrit-review.googlesource.com/c/gerrit/+/480303

Luca Milanesio

Jun 1, 2025, 3:32:09 AM

to Repo and Gerrit Discussion, Luca Milanesio

[…]

I confirm my final statement: I don’t believe you have *one* problem but a *series of problems* due mainly to misconfiguration of the system.
Based on the traffic you have, the data on the repositories and the users, it should be pretty straightforward to come to a much balanced set of settings.

Finding the optimal configuration is challenging with hundreds of Gerrit options.
Can we make this easier ?

I have been advocating for years for having safer defaults :-) and I’m still all for it.

Some examples:
1) Infinite timeouts (still today, Gerrit waits indefinitely for the SMTP server to respond)
2) Unsafe defaults which may cause repository corruption (we have warnings in the release notes, but I’m unsure how many people actually read them)

ok, let's tackle this. Here's a start:
https://gerrit-review.googlesource.com/c/gerrit/+/480301
https://gerrit-review.googlesource.com/c/gerrit/+/480302
https://gerrit-review.googlesource.com/c/gerrit/+/480303