

Hi all,

I am writing to both the JGit and Gerrit mailing lists because the problem could be of interest to both communities. I am able to systematically reproduce a fatal memory leak that causes the complete collapse of the JVM.

Repos: the Android repos
Test case: for (all repos); do clone and JGit GC; done

P.S. The traffic is serialised, so that we have the certainty that threads are done with their jobs when the next operation starts.

The scenario is quite simple: just generate constant traffic of continuous clones and JGit GCs and, after a few hours, the JVM is completely dead. By using G1GC and tuning it, I managed to avoid almost all the STW GC cycles. However, the memory used increases continuously and, at the end, the GC goes into an infinite loop trying to release something without being able to do so.

Sample of the GC log when the JVM has collapsed:

2019-06-11T08:15:27.340+0200: 28506.935: [Full GC (Allocation Failure) 90G->89G(100G), 106.4610300 secs]
[Eden: 0.0B(35.0G)->0.0B(35.0G) Survivors: 0.0B->0.0B Heap: 90.5G(100.0G)->89.2G(100.0G)], [Metaspace: 61559K->61559K(65536K)]

And the JVM heap utilization:

<Screenshot 2019-06-11 at 07.18.59.png>

With regards to the memory dump, see the following top offenders:

<Screenshot 2019-06-11 at 07.22.05.png>

I have explicitly set low limits for caching packfiles in gerrit.config:

[core]
  packedGitLimit = 10m
  packedGitOpenFiles = 16000
  packedGitWindowSize = 64k

In theory, JGit shouldn't hold a huge amount of memory for packfiles and should instead continuously read them from the filesystem. However, as you can see from the figures above, we have over 2M byte arrays holding over 22 GB of memory.

I also have a ~/.gitconfig with a specific [pack] configuration:

[pack]
  maxDeltaDepth = 5
  deltaSearchWindowSize = 10
  deltaSearchMemoryLimit = 0
  deltaCacheSize = 52428800
  deltaCacheLimit = 100
  compressionLevel = -1
  indexVersion = 2
  bigFileThreshold = 52428800
  threads = 1
  reuseDeltas = true
  reuseObjects = true
  deltaCompress = false
  buildBitmaps = true

What sounds strange to me is the high number of LocalObjectToPack instances that are held in memory (9M objects): ObjectToPack should only be a temporary object used by a PackWriter, but I am not a super-expert in that part of the code.

Does this ring any bell for anyone?

Thanks for the feedback.
Luca.
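The serialized test loop can be sketched roughly as follows. This is a hypothetical driver, not the actual test harness: the repo names and command strings are illustrative, and the GC step stands in for however JGit GC is actually invoked against the Gerrit site.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the serialized load described above: for each repo, a clone
// followed by a GC, strictly one operation at a time (list order is
// execution order, with no overlap between operations).
public class SerializedLoad {
    static List<String> buildCommands(List<String> repos) {
        List<String> commands = new ArrayList<>();
        for (String repo : repos) {
            commands.add("git clone " + repo);
            commands.add("git gc " + repo); // placeholder for the JGit GC step
        }
        return commands;
    }

    public static void main(String[] args) {
        for (String cmd : buildCommands(List.of("platform/build", "platform/art"))) {
            System.out.println(cmd);
        }
    }
}
```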
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en
---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/repo-discuss/A0E50AE5-874F-47E6-BCFE-56A58FA5E1D3%40gmail.com.
For more options, visit https://groups.google.com/d/optout.
On 11 Jun 2019, at 08:39, Matthias Sohn <matthi...@gmail.com> wrote:

On Tue, Jun 11, 2019 at 8:31 AM Luca Milanesio <luca.mi...@gmail.com> wrote:
> Hi all, I am writing to both the JGit and Gerrit mailing lists because the problem could be of interest to both communities. I am able to systematically reproduce a fatal memory leak that causes the complete collapse of the JVM.
> Repos: the Android repos
> Test case: for (all repos); do clone and JGit GC; done

Can you specify the exact test setup and share the test case itself?
- I guess you are serving from Gerrit ?
- Which version of Gerrit and JGit are you using ?
- Which JVM are you using ?
- Which JVM option settings are you using ?
- Do you run git gc using native git or JGit (embedded in Gerrit or standalone)?
> P.S. The traffic is serialised, so that we have the certainty that threads are done with their jobs when the next operation starts.
> The scenario is quite simple: just generate constant traffic of continuous clones and JGit GCs and, after a few hours, the JVM is completely dead. By using G1GC and tuning it, I managed to avoid almost all the STW GC cycles. However, the memory used increases continuously and, at the end, the GC goes into an infinite loop trying to release something without being able to do so.
> Sample of the GC log when the JVM has collapsed:
> 2019-06-11T08:15:27.340+0200: 28506.935: [Full GC (Allocation Failure) 90G->89G(100G), 106.4610300 secs]
> [Eden: 0.0B(35.0G)->0.0B(35.0G) Survivors: 0.0B->0.0B Heap: 90.5G(100.0G)->89.2G(100.0G)], [Metaspace: 61559K->61559K(65536K)]
> And the JVM heap utilization:
<Screenshot 2019-06-11 at 07.18.59.png>
> With regards to the memory dump, see the following top offenders:
<Screenshot 2019-06-11 at 07.22.05.png>
    private static final Map<WeakReference<PackWriter>, Boolean> instances =
        new ConcurrentHashMap<>();
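As a self-contained illustration (not JGit's actual code) of why a static registry keyed by WeakReference can keep growing: unlike WeakHashMap, a ConcurrentHashMap never purges entries whose WeakReference key has been cleared, so every writer that is not explicitly deregistered leaves a stale entry behind even after the writer itself becomes garbage.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the PackWriter registry above: the
// WeakReference keys are themselves strongly held by the map, so the
// map's size never shrinks on its own.
public class WeakKeyRegistry {
    static final Map<WeakReference<Object>, Boolean> instances =
        new ConcurrentHashMap<>();

    static void register(Object writer) {
        instances.put(new WeakReference<>(writer), Boolean.TRUE);
    }

    // Entries whose referent has been collected but which still occupy the map.
    static long staleEntries() {
        return instances.keySet().stream().filter(ref -> ref.get() == null).count();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            register(new byte[1024]); // stand-in for a PackWriter
        }
        System.gc(); // referents may be collected, but map entries remain
        // Always 1000: the WeakReference keys are strongly reachable via the map.
        System.out.println("entries still in map: " + instances.size());
        // May be greater than 0 if the collector actually ran.
        System.out.println("stale entries: " + staleEntries());
    }
}
```

This is why such a registry only stays bounded if every caller removes its entry, e.g. on close.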
> I have explicitly set low limits for caching packfiles in gerrit.config:
>
> [core]
>   packedGitLimit = 10m
>   packedGitOpenFiles = 16000
>   packedGitWindowSize = 64k
>
> In theory, JGit shouldn't hold a huge amount of memory for packfiles and should instead continuously read them from the filesystem. However, as you can see from the figures above, we have over 2M byte arrays holding over 22 GB of memory.
>
> I also have a ~/.gitconfig with a specific [pack] configuration:
>
> [pack]
>   maxDeltaDepth = 5
>   deltaSearchWindowSize = 10
>   deltaSearchMemoryLimit = 0
>   deltaCacheSize = 52428800

this means deltaCacheSize is 5 times larger than the JGit file cache, does that make sense?

>   deltaCacheLimit = 100
>   compressionLevel = -1
>   indexVersion = 2
>   bigFileThreshold = 52428800
>   threads = 1
>   reuseDeltas = true
>   reuseObjects = true
>   deltaCompress = false
>   buildBitmaps = true
>
> What sounds strange to me is the high number of LocalObjectToPack instances that are held in memory (9M objects): ObjectToPack should only be a temporary object used by a PackWriter, but I am not a super-expert in that part of the code.

do these repositories contain large files?
Hi Matthias,

Thanks for the quick reply. Answers inline.
On 12 Aug 2019, at 13:47, Sven Selberg <sven.s...@axis.com> wrote:

Hi,

I stumbled across this thread when troubleshooting an issue we just had and thought I'd mention it here in case they are related. All CPUs went full throttle for about 30 minutes, after which everything returned to normal. From a heap dump I took during the high load I got (we have a quite modest heap of 16.7 GB):

* LocalObjectToPack - 65M instances
* byte[] - 3M instances
* ObjectId - 3M instances

I cannot, however, find any evidence of a JGit GC anywhere during the time frame of the incident.

Gerrit: v2.16.8-106-g6d4cf6d026
JGit: 5.1.7.201904200442-r
Java: 1.8.0_212-b03 (OpenJDK)
On 12 Aug 2019, at 13:47, Sven Selberg <sven....@axis.com> wrote:
> I stumbled across this thread when troubleshooting an issue we just had and thought I'd mention it here in case they are related. [...]

Unfortunately, I did not have a lot of time to invest in it ... but I found a potential problem with the PackWriter. JGit keeps a full list of PackWriters in a static field until the caller closes them (not sure why). It looks like some of them are left open and, longer term, you could end up with lots of PackWriters that are the top-level objects of the ones you found.
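The "left open" failure mode can be sketched with a hypothetical writer (again, not JGit's actual code): an AutoCloseable that registers itself in a static set on construction and deregisters on close(). Wrapping it in try-with-resources guarantees the registry entry is released even when the transfer fails, which is the discipline the unclosed PackWriters appear to be missing.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a writer tracked in a static registry, like
// PackWriter's monitoring map, released only by an explicit close().
public class TrackedWriter implements AutoCloseable {
    static final Set<TrackedWriter> instances = ConcurrentHashMap.newKeySet();

    TrackedWriter() {
        instances.add(this); // registered for monitoring on construction
    }

    void writePack() {
        // ... produce the pack stream ...
    }

    @Override
    public void close() {
        instances.remove(this); // without this call the entry leaks
    }

    public static void main(String[] args) {
        try (TrackedWriter w = new TrackedWriter()) {
            w.writePack();
        } // close() runs here, even on exceptions
        // Always 0: try-with-resources guarantees deregistration.
        System.out.println("writers still registered: " + instances.size());
    }
}
```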
On Monday, August 12, 2019 at 2:51:19 PM UTC+2, lucamilanesio wrote:
> Unfortunately, I did not have a lot of time to invest in it ... but I found a potential problem with the PackWriter. JGit keeps a full list of PackWriters in a static field until the caller closes them (not sure why). It looks like some of them are left open and, longer term, you could end up with lots of PackWriters that are the top-level objects of the ones you found.

I had:
97 instances of PackWriter (and PackWriter$MutableState)

Under "normal" conditions I had:
12 instances of PackWriter (and PackWriter$MutableState)
On 12 Aug 2019, at 14:07, Sven Selberg <sven.s...@axis.com> wrote:
> I had:
> 97 instances of PackWriter (and PackWriter$MutableState)