Hello all,

We are running Gerrit ver 2.11.4 (with one plugin, 'javamelody'). We run it as a service by running `gerrit.sh run`. After running for about two weeks our Gerrit becomes very slow and unresponsive. We notice that when this happens Gerrit is in the middle of a Java garbage collection cycle (indicated by javamelody) and our Gerrit server's CPU utilization goes way up[1]. At the same time Gerrit seems to have eaten up all of the system memory[2]. We also noticed that Gerrit starts at the initial heap of 30 GB (as expected) but over time it seems to eat up all the server's memory[3].

I'm looking for help to determine whether this might be a memory leak or whether there is some configuration change we should make to alleviate the problem. I was also wondering whether anybody else is experiencing this same issue?

System: Ubuntu Trusty 14.04.4 LTS, 16 cores, 60 GB physical memory
java version "1.7.0_79"
OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-0ubuntu1.14.04.1)
OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode)

We replicate to 8 git slaves and most everyone clones from those slaves. We do _not_ run gerrit gc but rather run a daily cron job to repack our repos ('git repack -afd').
Here is our config:

[container]
  heaplimit = 30g
[core]
  packedGitOpenFiles = 4096
  packedGitLimit = 400m
  packedGitWindowSize = 16k
[sshd]
  threads = 100
[index]
  threads = 4
[httpd]
  maxQueued = 200
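A minimal sketch of the kind of daily repack cron entry described above; the repository root /var/gerrit/git and the schedule are assumptions, not from the original message:

# illustrative crontab entry: repack every Gerrit repository nightly at 02:00
0 2 * * * find /var/gerrit/git -type d -name '*.git' -exec git -C {} repack -afd \;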
On Sat, May 21, 2016 at 1:53 AM, Khai Do <zaro...@gmail.com> wrote:
> [full message quoted above snipped]
> We do _not_ run gerrit gc but rather run a daily cron job to repack our repos ('git repack -afd')

Are you using bitmap indexes?
> Here is our config:
>
> [container]
>   heaplimit = 30g
> [core]
>   packedGitOpenFiles = 4096
>   packedGitLimit = 400m

packedGitLimit is way too small; it's the main JGit object cache. Ideally it's as large as the sum of all your actively used repositories so that you can serve most requests from memory.
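As an illustration only: with a 30 GB heap, the cache could be raised in gerrit.config along these lines. The 20g figure is an assumption; size it to your actively used repositories and leave headroom for the rest of the heap.

[core]
  packedGitLimit = 20g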
>   packedGitWindowSize = 16k
> [sshd]
>   threads = 100
> [index]
>   threads = 4
> [httpd]
>   maxQueued = 200

Do you monitor / log the Java garbage collector runs?
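For reference, on this OpenJDK 7 runtime GC activity can be persisted to a file with the standard HotSpot flags, passed through container.javaOptions in gerrit.config; the log path below is a placeholder:

[container]
  javaOptions = -verbose:gc
  javaOptions = -XX:+PrintGCDetails
  javaOptions = -XX:+PrintGCDateStamps
  javaOptions = -Xloggc:/var/gerrit/logs/gc.log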
Do you limit max object size? We introduced this limit since we observed problems when users tried to push packs containing huge files.
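The limit in question is presumably receive.maxObjectSizeLimit in gerrit.config; the 100m value below is only an example:

[receive]
  maxObjectSizeLimit = 100m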
You may consider creating a heap dump (ensure you have enough disk space to store it) and analyzing it using MAT [2]. Run the MAT analysis on a large box using MAT's headless mode. Run the script mentioned in [3] to analyze the dump; it's contained in the MAT download:

$ ./ParseHeapDump.sh path/to/dump.dmp.zip org.eclipse.mat.api:suspects org.eclipse.mat.api:overview org.eclipse.mat.api:top_components

Then copy the dump and all the files the MAT analysis created to a folder on your local machine and inspect it using MAT's UI. Heap dumps compress well, so it may save time to zip the dump before copying it over the network.
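To capture the dump in the first place, the stock JDK jmap tool works on this runtime; the pid and output path are placeholders:

$ jmap -dump:live,format=b,file=/tmp/gerrit-heap.hprof <gerrit-pid>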
What's your cache status? Gerrit cache config and show-caches command output?
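For reference, that output comes from Gerrit's SSH command interface; the host below is a placeholder and 29418 is the default SSH port:

$ ssh -p 29418 admin@gerrit.example.com gerrit show-caches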
On 23 May 2016, at 19:29, Khai Do <zaro...@gmail.com> wrote:
On Saturday, May 21, 2016 at 1:15:30 AM UTC-7, lucamilanesio wrote:
> What's your cache status? Gerrit cache config and show-caches command output?

We are not setting anything specific for cache; we are just using the Gerrit default cache settings.
show-caches output: http://paste.openstack.org/show/498238/

Snapshots of the memory when we ran that command:
weekly average: http://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=27&rra_id=2&view_type=&graph_start=1463422789&graph_end=1464027589

Uptime is ~5 days, which means we have about another 5 days until our server starts intermittently returning 503 errors. Once we restart, everything works smoothly again.
> Do you monitor / log the Java garbage collector runs?

The javamelody plugin logs that for us; however, it's not persisted across restarts.
> Do you limit max object size? We introduced this limit since we observed problems when users tried to push packs containing huge files.

Not at the moment. We had planned to limit it to '100 m'. Was this something that was causing memory issues for you?
> You may consider creating a heap dump (ensure you have enough disk space to store it) and analyzing it using MAT [2].
> [detailed MAT instructions snipped]

Thanks for the suggestion, will look into this.
On Saturday, May 21, 2016 at 2:10:28 AM UTC-7, Matthias Sohn wrote:
> Are you using bitmap indexes?

No, we do not. Our cron repack doesn't generate those. I've heard bitmap indexes can speed up performance, but I think it's the exhaustion of memory that's causing the performance degradation.
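For reference, a full repack can write bitmaps directly, assuming a reasonably recent git (the --write-bitmap-index option and the repack.writeBitmaps setting appeared around git 2.0/2.2, so the git 1.9 shipped with Trusty may lack them):

$ git repack -a -d -f --write-bitmap-index
# or enable it permanently, so the existing repack cron job picks it up:
$ git config repack.writeBitmaps true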