When did you gc last time? Did you use jgit's gc, which creates the shiny new bitmap indexes?
--Matthias
Search the list for .noz files.
On Fri, May 24, 2013 at 4:00 PM, Matthias Sohn <matthi...@gmail.com> wrote:
> On Fri, May 24, 2013 at 5:57 AM, Chunlin Zhang <zhangc...@gmail.com> wrote:
>> The details are:
>> - After running for some time, Gerrit becomes very slow on git clone and repo upload. The slow steps are "Counting objects" and "Finding sources", or "remote: Resolving deltas" on repo upload, so users have to wait a long time to get the source code.
>> - I used top/htop to check the system load, but CPU and disk I/O show nothing special: most CPUs are idle and the "wa" value in top is low.
>> - But I noticed that whenever Gerrit becomes slow, one Gerrit thread always uses a lot of CPU time.
>> - I have seen this issue, even after restarting Gerrit, for some days since the upgrade to 2.6rc0; I never saw it with Gerrit 2.5. So I tried upgrading to 2.6rc3, but it is still the same.
>> How can I diagnose this issue? I have no idea what the busy Gerrit thread is doing. I cannot find anything useful in the error log.
>
> Create a thread dump in order to find out what the busy thread is doing; kill -3 <processid> is your friend, or use jstack <processid>.
> When did you gc last time? Did you use jgit's gc, which creates the shiny new bitmap indexes?
> --Matthias

(forgot to reply all...)
Today the hang happened again.
I could not find useful information in the output of "jstack <pid>"; see p.txt in the attachment.
I also used kill -3 <pid>, but the thread still uses a lot of CPU time, so kill -3 did not work.
So in the end I had to restart Gerrit as a temporary fix.
--
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en
---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
On Tue, May 28, 2013 at 11:50 AM, Chunlin Zhang <zhangc...@gmail.com> wrote:
> Today the hang happened again.

Was the CPU usage high at the time the hang happened? Did you again identify the thread with the high CPU usage?

> I could not find useful information in the output of "jstack <pid>"; see p.txt in the attachment.

What did you search for there? You should find the thread with the ID identified in the previous step (top -H).

> I also used kill -3 <pid>, but the thread still uses a lot of CPU time, so kill -3 did not work.

Matthias mentioned "kill -3 <pid>" as an alternative way to create a thread dump. kill -3 is not a magic tool that will reduce the CPU usage.
"SSH git-receive-pack '/repo' (username)" prio=10 tid=0x0000000043508800 nid=0x57ac runnable [0x00007fefd6cea000]
java.lang.Thread.State: RUNNABLE
at java.util.zip.Inflater.inflateBytes(Native Method)
at java.util.zip.Inflater.inflate(Inflater.java:238)
(and the stack continues for quite a while...)
--Doug
On Tue, May 28, 2013 at 7:57 PM, Saša Živkov <ziv...@gmail.com> wrote:
> Was the CPU usage high at the time the hang happened? Did you again identify the thread with the high CPU usage?

Yes, I have observed this several times: once Gerrit becomes slow, there is always one thread with high CPU usage. I use htop to see the thread details; the strange thread has used, for example, 6+ hours of CPU time, while the other threads have used only XX minutes or XX seconds.

> What did you search for there? You should find the thread with the ID identified in the previous step (top -H).

Yes, I ran "jstack <pid>", but it printed an error message, so I used "jstack -F <pid>". The pid I used is that of the strange thread I saw in htop.
I think you have to use the pid of the Gerrit process, not the pid of that thread.
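To match a busy thread from top -H or htop against a thread dump, the decimal thread ID has to be converted to hex, because jstack prints it as nid=0x... in each thread header. Below is a minimal Python sketch of that lookup. It assumes the dump was taken with the pid of the Gerrit process and saved as text; find_thread is a hypothetical helper name, not part of Gerrit or the JDK, and the sample text is the thread header quoted earlier in this thread.

```python
import re

def find_thread(dump_text, tid):
    """Return the jstack entry for the thread whose native ID is `tid`
    (decimal, as shown by top -H or htop), or None if it is not in the dump.
    Hypothetical helper for illustration only."""
    # jstack prints the native thread ID in hex, e.g. nid=0x57ac
    nid = re.compile(r"nid=0x%x\b" % tid)
    # Thread entries in a dump are separated by blank lines.
    for block in re.split(r"\n\s*\n", dump_text):
        for line in block.splitlines():
            # Thread header lines start with the quoted thread name.
            if line.lstrip().startswith('"') and nid.search(line):
                return block.strip()
    return None

sample = '''"SSH git-receive-pack '/repo' (username)" prio=10 tid=0x0000000043508800 nid=0x57ac runnable [0x00007fefd6cea000]
   java.lang.Thread.State: RUNNABLE
        at java.util.zip.Inflater.inflateBytes(Native Method)
        at java.util.zip.Inflater.inflate(Inflater.java:238)
'''

print(hex(22444))                   # 0x57ac, the nid in the header above
print(find_thread(sample, 22444))   # prints the whole entry
print(find_thread(sample, 7280))    # None: no entry with nid=0x1c70
```

If the lookup returns None, the busy thread is simply not listed in that dump, so its stack cannot be read from it.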
15GB? WOW! That's a large repo! Even some of our "larger" repos here only have .git directories on the order of 2-3GB. If you have files that large, I'd say there's a pretty good chance you're running into some of the problems previously mentioned on the list[1]. The receive.maxObjectSizeLimit setting might be relevant to the issue (receiving large binaries can be a problem, from what I've seen). Also, another post[2] discusses similar issues with repositories of this size (though it also discusses some of the performance improvements made about a year ago to address concerns like that).
I think what you might find in jstack is that there are threads stuck in receive-pack. This will eat up tons of CPU, but they won't show up in the work queue or connection list (I believe the underlying ssh connection died, but Gerrit gets "stuck" waiting for data). A restart of Gerrit will clear the issue as a temporary fix, but like you found--it will come back, and it tells very little of the root cause. It would be nice to find the root cause, as this has been known to hit us, too, and the best weapon we have is a restart of Gerrit. Our symptom is the box will appear with a very high load average, but with plenty of CPU and RAM available--but the queue will become extremely long, and it will even take time to accept new items into the queue (the ssh connection will just hang for minutes at a time).
Good luck!
--Doug
Someone reported on the Git mailing list that JGit has an infinite
loop under certain inflation conditions. They managed to create a
reproduction case, but it requires a 1 GiB Git repository... and I am
still waiting on the details to be posted to the JGit bug tracker. It's
possible that is what this is.
With large Gits and lots of traffic, I’m wondering if “-Xmx7168m” is really enough. Could you run jvisualvm (included in the JDK) on the server and take a look at how hard the Java GC is working?
You could also use the jvisualvm Sampler feature to see where Gerrit spends most of its CPU cycles. That could certainly give us a hint about what’s going on.
JVisualVM Sampler Hints:
Run $JDK/bin/jvisualvm ->
Select application (Gerrit) ->
Sampler Tab ->
Sample: CPU ->
Wait a few minutes ->
Click “Snapshot” ->
Click “Hotspot Tab” at the bottom ->
[Take Screenshot 1] ->
Right click the item with most “Self time” ->
Select “Show Back Traces” ->
[Take Screenshot 2]
Upload the screenshots to some other image sharing site (like imgur) and post to this thread.
/Gustaf
After I ran gc on all the ".git" directories larger than 100MB, the issue did not happen for days, and I thought it had been solved. But today the hang happened again. I could not find useful information in the Java stack trace, so after collecting the information below I restarted Gerrit again.
1. Java stack trace, see attachment. From #3 (htop) we can see that the busy thread ID is 7280, which is 0x1c70 in hex, but there is no stack trace for this thread.