I would like to know whether there is any performance gain from running Gerrit with native git as opposed to JGit. I am also not sure which configuration steps are needed to actually start Gerrit with native git.
Currently using Gerrit 2.14.20 with JGit 4.7.9.x.
> Very important is to have a sufficiently large core.packedGitLimit which give the size of the jgit cache mapping pack files into memory. Ideally it matches the total size of actively used repositories on that server. Max heap size should be around twice this size.

Did you mean "Ideally it matches the total size of the *packfiles* in the actively used repositories ..."?
> If this cache is too small you'll have a lot of IO to read objects from packfiles.
> Do you run gc on a regular basis on all repositories ?
> Can you provide some numbers about size of repositories you are serving and an idea about the load ?
> Install javamelody plugin to simplify monitoring
> -Matthias
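For anyone who wants to put numbers on the sizing rule quoted above, here is a rough Python sketch. It assumes the "packfiles" reading of the rule (the sum of the *.pack files under objects/pack of each bare repository). The function names, the idea of passing in a list of "active" repositories, and the rounding to whole MiB are illustrative choices, not something Gerrit or JGit provides.

```python
from pathlib import Path


def total_packfile_bytes(repo_dirs):
    """Sum the sizes of all *.pack files under objects/pack of each bare repository."""
    total = 0
    for repo in repo_dirs:
        pack_dir = Path(repo) / "objects" / "pack"
        if not pack_dir.is_dir():
            continue  # not a bare repository, or it has no packs yet
        total += sum(p.stat().st_size for p in pack_dir.glob("*.pack"))
    return total


def suggest_settings(pack_bytes):
    """Rule of thumb from this thread: limit ~= packfile size, max heap ~= 2x the limit."""
    mib = max(1, -(-pack_bytes // (1024 * 1024)))  # round up to whole MiB, at least 1
    # "max_heap" is a hypothetical label for this sketch, not a Gerrit option name.
    return {"core.packedGitLimit": f"{mib}m", "max_heap": f"{2 * mib}m"}
```

The first value would be a starting point for core.packedGitLimit in gerrit.config. How the heap limit is applied depends on how Gerrit is started, and both numbers are only a first estimate to check against show-caches and monitoring.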
--
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en
---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/repo-discuss/7883b3a1-fef3-4217-8559-b787625ff03d%40googlegroups.com.
We are looking at a number of issues with the Gerrit + JGit combination. First, we saw a repository that was about 75 GB on disk, even though a clone of it is no more than 1 MB. On investigating, around 70 GB of the contents were in preserved/<sha1>.old-pack and preserved/<sha1>.old-idx files. This was partially fixed by running "jgit gc" with the "--prune-preserved" option, i.e. the repo is now 6 GB. I found that native git has not yet implemented the "--prune-preserved" option, but "git gc" was able to clean up the repo and reduce its size to 1 MB when run after "jgit --git-dir=<repo> gc --prune-preserved".
Second, we saw that loading around 15,000 repos into Gerrit's "projects cache" took roughly 20-30 GB of JVM heap, possibly due to the way JGit loads the cache in the JVM.
Third, we found that some JGit version with gc-conductor takes too long to run GC on a repo, though the latest JGit version (4.7.9) doesn't have that issue.
Apart from these three, we often get too many SSH requests for git clone, which causes a JVM heap spike and then a JVM full GC. This ultimately makes Gerrit unstable, so we restart Gerrit. If we don't restart it, all subsequent SSH clone requests hang, and even administrative Gerrit commands don't respond. [JVM heap 128 GB, repo sizes ranging from 3 to 15 GB, SSH request rate of 10 requests within 10 seconds.]
So I was curious whether native git would behave differently in the above cases.
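In case it helps others check for the first problem, here is a small Python sketch that reports how much disk space the preserved old packs take per repository. It assumes the preserved files live under objects/pack/preserved inside each bare repository and end in .old-pack or .old-idx, as in the file names above. The directory location is an assumption (it is a parameter for that reason), and the function names are made up for this example.

```python
from pathlib import Path


def preserved_bytes(repo, preserved_rel="objects/pack/preserved"):
    """Total size of *.old-pack and *.old-idx files in one repository's preserved dir."""
    d = Path(repo) / preserved_rel  # assumed location, adjust if your layout differs
    if not d.is_dir():
        return 0
    return sum(
        p.stat().st_size
        for p in d.iterdir()
        if p.is_file() and p.name.endswith((".old-pack", ".old-idx"))
    )


def report(repos):
    """Return (repo, bytes) pairs, largest preserved footprint first, skipping empty ones."""
    sizes = [(str(r), preserved_bytes(r)) for r in repos]
    return sorted((s for s in sizes if s[1] > 0), key=lambda s: s[1], reverse=True)
```

Running report() over all repositories on the server would show which ones are candidates for a gc with preserved-pack pruning.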
On Wednesday, September 25, 2019 at 8:23:54 PM UTC-4, Matthias Sohn wrote:
> On Thu, Sep 26, 2019 at 12:57 AM Abhishek Patel <abh...@gmail.com> wrote:
>> I am interested to know if there are any performance gain with running gerrit with native git as oppose to using jgit. Though I am not sure what are the steps(configurations) I need to do in order to actually start gerrit with a native git.
> you can't as Gert wrote, since Gerrit is based on JGit and uses its Java API
>> Currently using Gerrit 2.14.20, with Jgit 4.7.9.x
> if you want better performance upgrade to Gerrit 2.16 or better 3.0, always use the latest bugfix release of a given minor release (as you are doing now with 2.14.20).
> If you are on 2.16 or higher then use a filesystem with high file timestamp resolution (e.g. ext4, btrfs, xfs, zfs which all provide 1ns resolution, on Java this is reduced depending on OS and Java version).
> Observe your caches using show-caches ssh command to learn which caches need to be tuned.
> Can you share your gerrit.config ?
> Very important is to have a sufficiently large core.packedGitLimit which give the size of the jgit cache mapping pack files into memory. Ideally it matches the total size of actively used repositories on that server. Max heap size should be around twice this size.
> If this cache is too small you'll have a lot of IO to read objects from packfiles.
> Do you run gc on a regular basis on all repositories ?
> Can you provide some numbers about size of repositories you are serving and an idea about the load ?
> Install javamelody plugin to simplify monitoring
> -Matthias
On Fri, Sep 27, 2019 at 9:11 PM Abhishek Patel <abh...@gmail.com> wrote:
> We are looking at bunch of issue with gerrit+jgit combination.
> First, we saw a repository which was something like 75GB in size(on disk) - even though actually when we clone repo is no more than 1 MB. On investigating around 70GB contents were in preserved/<sha1>.old-pack and preserved/<sha1>.old-idx files. This however partially fixed by using only "jgit gc" with "-prune-preserved" option, i.e. now repo is 6GB. I found that "native git" hasn't implemented yet the "-prune-preserved" option, but "git gc" command was able to cleanup repo to reduce size to 1MB when running after the "jgit -git_dir=<repo> gc --prune-preserved".

Do you use NFS ? These options are meant to be used as a workaround for issues which may occur on NFS.

> Second, we saw that loading around 15000 repos in gerrit "projects cache" took roughly 20-30GB JVM Heap, possibly due to jgit's way of loading cache in JVM.

Gerrit's project cache is not implemented in JGit

> Third, we found that some jgit version with gc-conductor takes too long to run GC on repo. Though latest jgit version(4.7.9) doesn't have that issue.

you didn't specify which version you are talking about and I don't know what gc-conductor is
4.7.9 is not the latest JGit version, the latest version is 5.5.0.

> Apart from these three, lot of time we get too many ssh requests for git clone which causes JVM Heap spike, causing JVM Full GC which ultimately makes gerrit unstable and we do restart gerrit. If we don't restart gerrit, next all ssh clone requests hangs, even administrative gerrit commands doesn't respond back.[JVM Heap 128 GB, Repo sizes ranging 3-15GB, ssh request rate(10 request within 10 seconds duration].

these are very large git repositories, running many concurrent clones of repositories of that size is heavy load
Why are these repositories so large and why do you need so many clone commands ?
(...)
> these are very large git repositories, running many concurrent clones of repositories of that size is heavy load
> Why are these repositories so large and why do you need so many clone commands ?

you can use git-sizer to get an idea