[cache "accounts"]
memoryLimit = 4096
[cache "diff"]
memoryLimit = 49152
[cache "diff_intraline"]
memoryLimit = 131072
[cache "diff_summary"]
memoryLimit = 303104
[cache "groups"]
memoryLimit = 20480
[cache "ldap_groups"]
memoryLimit = 4096
maxAge = 5 min
[cache "permission_sort"]
memoryLimit = 262144
[cache "projects"]
memoryLimit = 3072
[cache "sshkeys"]
memoryLimit = 3072
[cache "web_sessions"]
memoryLimit = 303104
diskLimit = 303104
maxAge = 30d
This past weekend we spent several hours upgrading from Gerrit 2.14.18 to 2.16.10. We went to the .10 version as we tested that the longest. We are now hitting some issues with slowness on our largest instance.
We are garbage collecting the All-Users and All-Projects repositories every 30 minutes. I'm seeing the All-Users repo is taking longer than 30 minutes to complete. This morning we've added some of our largest and busiest repositories to the 30 minute gc list too.

GC script runs:
REMOVE_/logs/refs/changes
git --git-dir=DIR_NAME config pack.threads 8
git --git-dir=DIR_NAME config gc.pruneExpire 15.minutes.ago
ssh GERRIT_USER\@HOSTNAME IDENTITY_FILE -p GERRIT_PORT gerrit gc PROJ_NAME

Box specs:
48 cores
386GB RAM
SSD

Gerrit config and details:
75GB Java heap
G1GC
35GB packedGitLimit
disableReverseDnsLookup = true
Old UI disabled
15 mirrors to replicate to around the world
Largest repo around 7GB packed.
We are also experiencing the random 403 invalid authentication errors (CrBug), which completely block some users for hours at a time. Just wanted to mention this in case it is somehow related.
On 22 Oct 2019, at 19:42, Matthias Sohn <matthi...@gmail.com> wrote:
On Tue, Oct 22, 2019 at 5:56 PM Doug Luedtke <douglas...@gmail.com> wrote:
> This past weekend we spent several hours upgrading from Gerrit 2.14.18 to 2.16.10. We went to the .10 version as we tested that the longest. We are now hitting some issues with slowness on our largest instance.

update to 2.16.12 to get all the fixes done since 2.16.10 ?

> We are garbage collecting the All-Users and All-Projects repositories every 30 minutes. I'm seeing the All-Users repo is taking longer than 30 minutes to complete. This morning we've added some of our largest and busiest repositories to the 30 minute gc list too.
> GC script runs:
> REMOVE_/logs/refs/changes
> git --git-dir=DIR_NAME config pack.threads 8
> git --git-dir=DIR_NAME config gc.pruneExpire 15.minutes.ago

Why such a short expire period ?

> ssh GERRIT_USER\@HOSTNAME IDENTITY_FILE -p GERRIT_PORT gerrit gc PROJ_NAME
> Box specs:
> 48 cores
> 386GB RAM
> SSD
> Gerrit config and details:
> 75GB Java heap
> G1GC

What's the CPU percentage spent on Java gc ?
How long are pause times caused by Java gc ?
Log gc activity to track this. Follow [1] for tuning G1GC.
Maybe you need to increase size of the young generation.

> 35GB packedGitLimit
> disableReverseDnsLookup = true
> Old UI disabled
> 15 mirrors to replicate to around the world

Are build servers fetching from master ?
Try to offload read-only load from build servers to slaves.

> Largest repo around 7GB packed.

Typically the large repositories above 1GB cause the biggest load.
Try to avoid versioning large binary files in git. You can limit file size via receive.maxObjectSizeLimit [2] and restrict which file types can be pushed using the uploadvalidator plugin [3].
What's the total size of the hot repositories you have traffic for ?
It may help to increase packedGitLimit and max heap size to keep the hot packfiles in the jgit cache.

> 3300 user accounts, 1700 active
> 850k code-reviews
> What can I do to reduce the gc time for repositories?
> Am I right to assume that gc.PruneExpire 15.minutes.ago is going to be a problem when the gerrit gc takes longer than 15 minutes to complete? Are there suggested settings?

I'd avoid that, to prevent another process (gerrit) from corrupting the repository, see the warning in [4].
Pruning is less important than keeping the number of loose objects and pack files reasonably small.
If there are more than 200 pack files in a repository performance typically starts degrading.
Do you generate bitmap indexes ? They can reduce fetch time and thus CPU load.

> We are also experiencing the random 403 invalid authentication errors CrBug, which completely blocks some users for hours at a time. Just wanted to mention this if that somehow related.

-Matthias
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en
---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/repo-discuss/CAKSZd3SP-MnTdxWR0T_ytkCS_JQH3Tfm5kUJDguuw1aKWQXKiw%40mail.gmail.com.
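The rule of thumb above (performance tends to degrade past roughly 200 pack files) can be checked per repository. What follows is a minimal sketch, assuming bare repositories that keep their packs under objects/pack. The helper names count_packs and check_packs are hypothetical and are not part of git or Gerrit.

```shell
#!/bin/sh
# Hypothetical helpers, not part of git or Gerrit.

# Print the number of *.pack files in a bare repository directory.
count_packs() {
    if [ ! -d "$1/objects/pack" ]; then
        echo 0
        return 0
    fi
    find "$1/objects/pack" -type f -name '*.pack' | wc -l | tr -d ' '
}

# Print OK or WARN depending on whether the pack count exceeds a limit.
# The default of 200 is the figure quoted in the thread.
check_packs() {
    repo=$1
    limit=${2:-200}
    n=$(count_packs "$repo")
    if [ "$n" -gt "$limit" ]; then
        echo "WARN $repo: $n pack files (limit $limit)"
    else
        echo "OK $repo: $n pack files"
    fi
}
```

For a single repository, `git count-objects -v` reports the same number in its `packs` field, together with the loose object count in `count`.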
> update to 2.16.12 to get all the fixes done since 2.16.10 ?
> Why such a short expire period ?

Oh man ... you are doing Git GC *inside* the Gerrit JVM?

> ssh GERRIT_USER\@HOSTNAME IDENTITY_FILE -p GERRIT_PORT gerrit gc PROJ_NAME
> Are build servers fetching from master ?
> Try to offload read-only load from build servers to slaves.

Wow, with 850k code-reviews, the migration to NoteDb has added.....
Thank you Matthias and Luca. I'm working on the majority of your answers. Sorry for the delay.

> update to 2.16.12 to get all the fixes done since 2.16.10 ?

I'm not ready to go to Gerrit 2.16.12 yet. For the first time in years, we actually build this version ourselves to only change the 403 error message to tell users to try signing in. This is related to https://crbug.com/gerrit/9797 and was our quick way to get around it, for the moment.

> Why such a short expire period ?

That was a setting that we've used for years. I have now changed it to 1.day.ago.

> Oh man ... you are doing Git GC *inside* the Gerrit JVM?

We are running gerrit gc, but setting some git configs for the repo before the gerrit gc.

> ssh GERRIT_USER\@HOSTNAME IDENTITY_FILE -p GERRIT_PORT gerrit gc PROJ_NAME
> Are build servers fetching from master ?
> Try to offload read-only load from build servers to slaves.

The build nodes are mostly pulling from a pool of four mirrors that are load balanced with HA proxy. Lucky they are using the mirrors.

> Wow, with 850k code-reviews, the migration to NoteDb has added.....

That was just my largest Gerrit master. The Online NoteDb migration completed in 7511s (2h 5m 11s). I was impressed.

I will work on the rest of the questions. Thank you so far for comments and assistance.
On 22 Oct 2019, at 22:37, Doug Luedtke <douglas...@gmail.com> wrote:
> Thank you Matthias and Luca. I'm working on the majority of your answers. Sorry for the delay.
>> update to 2.16.12 to get all the fixes done since 2.16.10 ?
> I'm not ready to go to Gerrit 2.16.12 yet. For the first time in years, we actually build this version ourselves to only change the 403 error message to tell users to try signing in. This is related to https://crbug.com/gerrit/9797 and was our quick way to get around it, for the moment.

Don't fork Gerrit :-) and just use this plugin:
It basically redirects people to login when they are accessing a Gerrit URL without a valid authentication context.

>> Why such a short expire period ?
> That was a setting that we've used for years. I have now changed it to 1.day.ago.
>> Oh man ... you are doing Git GC *inside* the Gerrit JVM?
> We are running gerrit gc, but setting some git configs for the repo before the gerrit gc.
>> ssh GERRIT_USER\@HOSTNAME IDENTITY_FILE -p GERRIT_PORT gerrit gc PROJ_NAME

Oh yes, that runs the GC inside the Gerrit JVM.
There are two major issues:
1. JVM Heap explosion
2. GC may not even finish before the SSH session times out and the GC stops

>> Are build servers fetching from master ?
>> Try to offload read-only load from build servers to slaves.
> The build nodes are mostly pulling from a pool of four mirrors that are load balanced with HA proxy. Lucky they are using the mirrors.
>> Wow, with 850k code-reviews, the migration to NoteDb has added.....
> That was just my largest Gerrit master. The Online NoteDb migration completed in 7511s (2h 5m 11s). I was impressed.

We should *ALL* thank Dave Borowitz for that, *amazing job* :-)

> I will work on the rest of the questions. Thank you so far for comments and assistance.
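The thread names the problem with `gerrit gc` (it runs inside the Gerrit JVM) but does not spell out a replacement. One option, stated here as an assumption rather than as the posters' recommendation, is to run native `git gc` against the bare repository from a separate process such as a cron job. The sketch below only prints the command instead of running it. native_gc_cmd is a hypothetical helper name, and the default of 8 threads is taken from the original GC script.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the native git command that would
# garbage collect one bare repository outside the Gerrit JVM.
# Assumption: repository paths contain no whitespace, since the printed
# command line is not quoted.
native_gc_cmd() {
    repo=$1
    threads=${2:-8}
    # -c sets pack.threads for this invocation only, so nothing is
    # written to the repository's config file.
    printf 'git --git-dir=%s -c pack.threads=%s gc\n' "$repo" "$threads"
}
```

Calling `native_gc_cmd DIR_NAME` prints `git --git-dir=DIR_NAME -c pack.threads=8 gc`. Whether a native gc is safe alongside a running Gerrit depends on the gc.pruneExpire setting discussed earlier in the thread, so a longer expire period such as the 1.day.ago value mentioned above is the more cautious choice.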
On 22 Oct 2019, at 23:09, Doug Luedtke <douglas...@gmail.com> wrote:
On Tuesday, October 22, 2019 at 4:42:28 PM UTC-5, lucamilanesio wrote:
> On 22 Oct 2019, at 22:37, Doug Luedtke <douglas...@gmail.com> wrote:
>> Thank you Matthias and Luca. I'm working on the majority of your answers. Sorry for the delay.
>>> update to 2.16.12 to get all the fixes done since 2.16.10 ?
>> I'm not ready to go to Gerrit 2.16.12 yet. For the first time in years, we actually build this version ourselves to only change the 403 error message to tell users to try signing in. This is related to https://crbug.com/gerrit/9797 and was our quick way to get around it, for the moment.
> Don't fork Gerrit :-) and just use this plugin:
> It basically redirects people to login when they are accessing a Gerrit URL without a valid authentication context.

I'm really against forking Gerrit. This was a one time solution. I know nothing about that plugin. Does it work with HTTP_LDAP? I'm not seeing much for documentation.
Also, I'm not sure how that will react with the random 403 errors, invalid rest authentication that users get in PolyGerrit UI.
And if that is a workaround, why wasn't it mentioned in any of the CrBugs?
JGit's gc implementation does not respect the option pack.threads, which means it runs single-threaded.
If you run this in-process it may need a lot of heap if you have very large repositories. We have an installation with 100k mostly smaller repositories and run gerrit gc scheduled once a day without issues. In that instance we limit repository size to 500MB.
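To see which repositories would exceed a size limit like the 500MB mentioned above, the packed size can be summed from the pack files on disk. This is a rough sketch, assuming bare repositories with packs under objects/pack. packed_bytes and over_limit are hypothetical helper names, and the sketch counts only *.pack files, not loose objects or index files.

```shell
#!/bin/sh
# Hypothetical helpers, not part of git or Gerrit.

# Print the total size in bytes of all *.pack files in a bare repository.
packed_bytes() {
    if [ ! -d "$1/objects/pack" ]; then
        echo 0
        return 0
    fi
    # wc -c prints "<bytes> <path>" per file, plus a "total" line when
    # given several files; skip the totals and sum the rest.
    find "$1/objects/pack" -type f -name '*.pack' -exec wc -c {} + |
        awk '$2 != "total" { sum += $1 } END { printf "%.0f\n", sum }'
}

# Succeed when the packed size is strictly greater than the limit in bytes.
over_limit() {
    [ "$(packed_bytes "$1")" -gt "$2" ]
}
```

For example, `over_limit DIR_NAME $((500 * 1024 * 1024))` succeeds for a repository whose packs total more than 500MB.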