[Please help!] Gerrit HTTP threads using ALL the server CPU (2.11.3 version)


Leandro Fonseca

May 9, 2017, 5:26:56 PM
to Repo and Gerrit Discussion
Hi everyone

Our Gerrit 2.11.3 installation, which currently uses HTTPS/SSL, keeps increasing its CPU usage until all the server CPUs are at 100% and the Gerrit process effectively breaks. The only way to free up the CPU is to restart the Gerrit service.

I used jstack and found that the high CPU usage is coming from HTTP threads (yes, the Gerrit process was using ~900% of the server CPU at the time this data was collected):

Process id    CPU utilization    Thread
4156          99.9               "HTTP-688" prio=10 tid=0x00000000041cd000 nid=0x103c runnable [0x00002baa61d19000]
6476          99.9               "HTTP-785" prio=10 tid=0x00002baa5c796000 nid=0x194c runnable [0x00002baa602ff000]
17037         99.9               "HTTP-845" prio=10 tid=0x00002baa5c794800 nid=0x428d runnable [0x00002baa60703000]
8466          99.9               "HTTP-882" prio=10 tid=0x00002baa645d5800 nid=0x2112 runnable [0x00002baa6c703000]
17404         99.9               "HTTP-924" prio=10 tid=0x00002baa5c0b7000 nid=0x43fc runnable [0x00002baa6201c000]
4376          99.9               "HTTP-929" prio=10 tid=0x0000000004650000 nid=0x1118 runnable [0x00002baa6dc14000]
2965          99.9               "HTTP-963" prio=10 tid=0x0000000004f28800 nid=0xb95 runnable [0x00002baa5670a000]
8810          99.9               "HTTP-10958" prio=10 tid=0x00000000055f1000 nid=0x226a runnable [0x00002baa6edbf000]
690           99.9               "HTTP-12100" prio=10 tid=0x0000000005373800 nid=0x2b2 runnable [0x00002baa6ecae000]

I could not find much useful information in the full thread stacks (using thread 0x226a as an example; I can provide the info for the others):

"HTTP-10958" prio=10 tid=0x00000000055f1000 nid=0x226a runnable [0x00002baa6edbf000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:246)
        at org.apache.lucene.util.PriorityQueue.updateTop(PriorityQueue.java:204)
        at org.apache.lucene.search.TopFieldCollector$MultiComparatorNonScoringCollector.updateBottom(TopFieldCollector.java:398)
        at org.apache.lucene.search.TopFieldCollector$MultiComparatorNonScoringCollector.collect(TopFieldCollector.java:427)
        at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:193)
        at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:163)
        at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:35)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:581)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:533)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:510)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:378)
        at com.google.gerrit.lucene.LuceneChangeIndex$QuerySource.read(LuceneChangeIndex.java:408)
        at com.google.gerrit.server.index.IndexedChangeQuery.read(IndexedChangeQuery.java:95)
        at com.google.gerrit.server.index.IndexedChangeQuery.restart(IndexedChangeQuery.java:138)
        at com.google.gerrit.server.query.change.AndSource.readImpl(AndSource.java:133)
        at com.google.gerrit.server.query.change.AndSource.read(AndSource.java:99)
        at com.google.gerrit.server.query.change.QueryProcessor.queryChanges(QueryProcessor.java:153)
        at com.google.gerrit.server.query.change.QueryProcessor.queryChanges(QueryProcessor.java:102)
        at com.google.gerrit.server.query.change.QueryChanges.query0(QueryChanges.java:143)
        at com.google.gerrit.server.query.change.QueryChanges.query(QueryChanges.java:132)
        at com.google.gerrit.server.query.change.QueryChanges.apply(QueryChanges.java:99)
        at com.google.gerrit.server.query.change.QueryChanges.apply(QueryChanges.java:40)
        at com.google.gerrit.httpd.restapi.RestApiServlet.service(RestApiServlet.java:324)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:725)
        at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
        at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
        at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
        at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
        at com.google.gerrit.httpd.GetUserFilter.doFilter(GetUserFilter.java:82)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
        at com.google.gwtexpui.server.CacheControlFilter.doFilter(CacheControlFilter.java:73)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
        at com.google.gerrit.httpd.RunAsFilter.doFilter(RunAsFilter.java:117)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
        at com.google.gerrit.httpd.RequireSslFilter.doFilter(RequireSslFilter.java:68)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
        at com.google.gerrit.httpd.AllRequestFilter$FilterProxy$1.doFilter(AllRequestFilter.java:64)
        at com.google.gerrit.httpd.AllRequestFilter$FilterProxy.doFilter(AllRequestFilter.java:57)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
        at com.google.gerrit.httpd.RequestContextFilter.doFilter(RequestContextFilter.java:75)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
        at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
        at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
        at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
        at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
        - locked <0x00002b9a8b8561c8> (a com.google.inject.servlet.GuiceFilter$Context)
        at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
        at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:95)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
        at org.eclipse.jetty.server.Server.handle(Server.java:497)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
        at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
        at java.lang.Thread.run(Unknown Source)

All these threads had been running at 0.0% CPU since the last time the Gerrit process was restarted (some show a small spike and then return to 0.0%), but then all of a sudden they ramp up to 100% CPU usage and stay around that rate (97, 98 & 99%) for the rest of their existence, until usage reaches a critical level and I need to restart Gerrit again. For the thread detailed above, according to my monitors, this ramp-up happened around 09/May/2017:13:04:24.

I looked in the httpd_log for any suspicious activity, but found nothing wrong (not only for this specific case, but also for the others). I even accessed the changes listed below and couldn't reproduce the problem:

<ip1> - <user1> [09/May/2017:13:04:22 -0500] "GET /changes/990298/detail?O=404 HTTP/1.1" 304 - "https://<gerrit server name>/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
<ip2> - <user2> [09/May/2017:13:04:23 -0500] "GET /changes/990656/detail?O=404 HTTP/1.1" 304 - "https://<gerrit server name>/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.96 Safari/537.36"
<ip3> - <user3> [09/May/2017:13:04:23 -0500] "GET /changes/991102/detail?O=404 HTTP/1.1" 304 - "https://<gerrit server name>/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
<ip4> - <user4> [09/May/2017:13:04:23 -0500] "GET /changes/991792/detail?O=404 HTTP/1.1" 304 - "https://<gerrit server name>/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0"
100.64.193.60 - - [09/May/2017:13:04:23 -0500] "GET /changes/?n=25&O=81 HTTP/1.1" 200 28 "https://<gerrit server name>/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
<ip5> - <user5> [09/May/2017:13:04:24 -0500] "GET /changes/991116/detail?O=404 HTTP/1.1" 304 - "https://<gerrit server name>/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
<ip6> - <user6> [09/May/2017:13:04:24 -0500] "GET /changes/978360/detail?O=404 HTTP/1.1" 304 - "https://<gerrit server name>/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0"


Nothing was reported in the error_log during this specific period:

[logs]$ cat error_log | grep "13:04:20"
[logs]$ cat error_log | grep "13:04:21"
[logs]$ cat error_log | grep "13:04:22"
[logs]$ cat error_log | grep "13:04:23"
[logs]$ cat error_log | grep "13:04:24"
[logs]$ cat error_log | grep "13:04:25"
[logs]$ 


Only a few pushes were reported in the sshd_log during this specific period:

[2017-05-09 13:04:20,111 -0500] 8008524b <userA> a/1001028 LOGIN FROM <ip X>
[2017-05-09 13:04:20,526 -0500] 203c06f2 <userB> a/1010806 LOGIN FROM <ip Y>
[2017-05-09 13:04:21,456 -0500] 203c06f2 <userB> a/1010806 git-receive-pack./<project 1> 1ms 379ms 0
[2017-05-09 13:04:21,798 -0500] 8008524b <userA> a/1001028 git-receive-pack./<project 2> 1ms 1671ms 0
[2017-05-09 13:04:21,799 -0500] 8008524b <userA> a/1001028 LOGOUT
[2017-05-09 13:04:21,857 -0500] 203c06f2 <userB> a/1010806 LOGOUT
[2017-05-09 13:04:23,118 -0500] 4016da78 <userA> a/1001028 LOGIN FROM <ip X>
[2017-05-09 13:04:23,207 -0500] e0182ea3 <userB> a/1010806 LOGIN FROM <ip Y>
[2017-05-09 13:04:23,776 -0500] 4016da78 <userA> a/1001028 git-receive-pack./<project 3> 0ms 643ms 0
[2017-05-09 13:04:23,777 -0500] 4016da78 <userA> a/1001028 LOGOUT
[2017-05-09 13:04:24,178 -0500] e0182ea3 <userB> a/1010806 git-receive-pack./<project 4> 1ms 421ms 0
[2017-05-09 13:04:24,587 -0500] e0182ea3 <userB> a/1010806 LOGOUT

As I mentioned, this is happening repeatedly, so it is a very big problem.

I found a similar problem reported on this forum at the page below, but unfortunately no conclusive information on how to fix or reproduce the problem was given:

Does anyone know what could be happening here?
Thanks in advance,

Leandro.




Matthew Webber

May 10, 2017, 4:36:50 AM
to Repo and Gerrit Discussion
Excellent problem report.
You mention that there is nothing in the error log at the specific period, but is there anything of interest if you look back a bit earlier? That file typically doesn't have much in it (!), so you should be able to eyeball it.
Matthew 

Leandro Fonseca

May 10, 2017, 10:48:47 AM
to Repo and Gerrit Discussion
Hi Matthew, thanks for your answer.
The only errors reported in the error_log file are almost 30 minutes away from when the thread ramped up to 100% CPU usage (2017-05-09 13:04):

[2017-05-09 12:40:22,440] INFO  com.google.gerrit.httpd.auth.ldap.LdapLoginServlet : '<user>' failed to sign in: Incorrect username or password
[2017-05-09 13:34:59,120] ERROR com.google.gerrit.httpd.restapi.RestApiServlet : Error in POST /changes/991842/reviewers
com.google.gwtorm.server.OrmConcurrencyException: Concurrent modification detected

Matthew Webber

May 10, 2017, 12:05:57 PM
to Repo and Gerrit Discussion
That being the case, the only possibility I can think of (and this is just a guess) is that the JVM is low on memory, and is spending all its CPU on (java) garbage collection, but never recovering much memory.

We had a problem with that once, with similar symptoms to yours (Web UI unresponsive). In that case, the clue was a message in error_log:
java.lang.OutOfMemoryError: Java heap space
and the solution was to set
container.heapLimit=8g

The other thing to check is whether you are running out of threads in some pool.
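For example, something like this shows the queue and the per-pool thread states (a sketch; the admin user, host and SSH port are placeholders for your own values, and you need an account with the required capabilities):

$ ssh -p 29418 <admin>@<gerrit-host> gerrit show-queue
$ ssh -p 29418 <admin>@<gerrit-host> gerrit show-caches

The show-caches output also includes a per-pool breakdown of thread states.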

Sorry I don't have any other ideas.

Matthew

Matthias Sohn

May 10, 2017, 4:07:29 PM
to Matthew Webber, Repo and Gerrit Discussion
Configure the JVM to write GC logs in order to check whether it runs GC excessively, or use jconsole or VisualVM to check that.
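For example, in gerrit.config that could look roughly like this (a sketch; the log path is a placeholder, and these are the standard HotSpot 7/8 flags):

[container]
        javaOptions = -verbose:gc -Xloggc:/path/to/logs/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps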
Can you post details of your gerrit.config ?
Which max heap size did you configure ?
How about JGit cache size (core.packedGitLimit) ?

-Matthias

Leandro Fonseca

May 10, 2017, 4:31:17 PM
to Repo and Gerrit Discussion, mat...@unsolvable.org
Hi @Matthew,

Your explanation would make sense for what is happening, considering that during our night shift, when most (not all) of these "bad" threads are started, we receive a high number of pushes from automated systems plus usage from people in Asia, which could create a high memory demand on Gerrit. However, I have looked through our error_log files and could not find this error in any of the logs collected during the entire months of April and May.
Anyway, really appreciate your help on it.

Hi @Matthias,

Heap size is 72g and packedGitLimit is 12g; our server has 96GB of RAM in total.
Ok, I will have the VM configured as you suggested on my next restart and give you guys the details.
This is our gerrit.config file:

[gerrit]
basePath = [path to projects]
canonicalWebUrl = https://[server name]/
replicateOnStartup = false
[core]
        packedGitOpenFiles = 4096
        packedGitLimit = 12g
        packedGitWindowSize = 16k
        streamFileThreshold = 4095m

[plugins]
allowRemoteAdmin = true

[sshd]
rekeyTimeLimit = 0
rekeyBytesLimit = 107374182400

[cache]
        directory = [path to cache folder]

[cache "projects"]
        memoryLimit = 4750k
checkFrequency = 5

[cache "accounts"]
        memoryLimit = 6000k

[cache "groups"]
memoryLimit = 1600k

[cache "permission_sort"]
memoryLimit = 2750k

[cache "diff"]
timeout = 5m
        diskbuffer = 10m

[cache "web_sessions"]
diskLimit = 1600k
        maxAge = 1d

[user]
        name = [Gerrit user name]
anonymousCoward = Undefined Full Name
[auth]
        type = LDAP
[ldap]
        server = ldap://[ldap host]
        accountBase = [ldap details]
        accountPattern = [ldap details]
        accountSshUsername = ${uid.toLowerCase}
        accountFullName = ${givenName} ${sn}
        accountEmailAddress = mail
[database]
type = mysql
hostname = [db host]
database = [db name]
username = [db user]
        poolMaxIdle = 16
        poolLimit = 128
[sendemail]
smtpServer = [smtpServer]
[container]
user = [gerrit user]
javaHome = [java path]
war = [war file path]
        heapLimit = 72g
[sshd]
listenAddress = *:[gerrit port]
        threads = 96
batchThreads = 64
[httpd]
        listenUrl = proxy-https://*:[port]/
        sslKeyStore = [ssl path]
        sslKeyPassword = [ssl password]
maxQueued = 0
requestLog = true
[commentlink "[x]"]
        match = ([match rule])
        html = <a href=\"http://[x server]/$1\" target=\"_blank\">$1</a>
[receive]

changeUpdateThreads = 3
checkReferencedObjectsAreReachable = false
        allowGroup = Administrators
        allowGroup = [group A]
[upload]
        allowGroup = Administrators
        allowGroup = [group B]
        allowGroup = [group C]
[index]
type = LUCENE


Matthias Sohn

May 10, 2017, 6:34:59 PM
to Leandro Fonseca, Repo and Gerrit Discussion, Matthew Webber
How many cores does your server have to process requests ?

What's the total size of the repositories you are serving ?
Looks like you could use a larger fraction of your heap for the JGit object cache.

Are cache hit rates near 100% after caches have warmed up ?

Leandro Fonseca

May 11, 2017, 11:17:16 AM
to Repo and Gerrit Discussion, lean...@motorola.com, mat...@unsolvable.org
Hi Matthias,

Just confirming, are you asking how many CPU cores the server has? If that's what you are asking, the server has 32 cores.
We have ~1TB of active projects, but of course not all of them are used every day, and this specific server mostly receives only pushes (pulls are disabled for most of the users).

I have increased the core.packedGitLimit size to 24g (it was 12g before) and enabled the GC logs (using options: -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps). The problem happened again and 6 new "bad HTTP threads" came up during the night. Using thread 0x21b6 as an example: it was using 0% CPU at 03:29:12 AM, jumped to 100% at 03:29:14 AM, and is still at that level now, almost 6 hours later:

- cache information
Gerrit Code Review        2.11.3                    now    10:13:57   CDT
                                                 uptime    13 hrs 30 min

  Name                          |Entries              |  AvgGet |Hit Ratio|
                                |   Mem   Disk   Space|         |Mem  Disk|
--------------------------------+---------------------+---------+---------+
  accounts                      |   499               |   2.7ms | 97%     |
  accounts_byemail              |   217               |   2.3ms | 87%     |
  accounts_byname               |   502               |   1.8ms | 86%     |
  adv_bases                     |                     |         |         |
  changes                       |                     |  54.5ms |  0%     |
  groups                        |   785               |   1.4ms | 91%     |
  groups_byinclude              |                     |   1.6ms | 16%     |
  groups_byname                 |                     |         |         |
  groups_byuuid                 |   321               |   2.7ms | 44%     |
  groups_external               |                     |   3.7ms | 22%     |
  groups_members                |   535               |   3.0ms | 99%     |
  ldap_group_existence          |                     |         |         |
  ldap_groups                   |     6               |  45.9ms | 33%     |
  ldap_groups_byinclude         |                     |         |         |
  ldap_usernames                |     5               |   1.5ms | 88%     |
  permission_sort               | 18055               |         | 98%     |
  plugin_resources              |                     |         |         |
  project_list                  |     1               |    2.8s | 91%     |
  projects                      | 10552               |   3.4ms | 98%     |
  sshkeys                       |    17               |   5.0ms | 98%     |
D change_kind                   |   121    121  55.54k|   4.7ms | 65% 100%|
D conflicts                     |    18     18  16.28k|         | 57% 100%|
D diff                          |   308    308 384.38k| 111.6ms | 79% 100%|
D diff_intraline                |    41     41  19.05k|   8.0ms | 14%     |
D git_tags                      |   235    235  60.02m|         | 35%  99%|
D mergeability                  |    28     28  18.65k| 339.8ms | 68% 100%|
D web_sessions                  |   917   4039   1.66m|         | 98%   6%|

SSH:      8  users, oldest session started  8 hrs 18 min ago
Tasks:    4  total =    2 running +      0 ready +    2 sleeping
Mem: 27.22g total = 23.09g used + 1.76g free + 2.37g buffers
     64.00g max
        4096 open files

Threads: 32 CPUs available, 164 threads
                                    NEW       RUNNABLE        BLOCKED        WAITING  TIMED_WAITING     TERMINATED
  SSH git-receive-pack                0              0              0              1              0              0
  HTTP                                0              8              0              0              5              0
  SSH-Interactive-Worker              0              0              0             30              0              0
  Other                               0             11              0             44             31              0
  ReceiveCommits                      0              0              0             32              0              0
  SshCommandStart                     0              0              0              2              0              0

JVM: Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 24.76-b04
  on Linux 2.6.18-348.el5 amd64

- thread information (at 03:29:14AM)
"HTTP-4983" prio=10 tid=0x00002b98ada82800 nid=0x21b6 runnable [0x00002b98a9c4a000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.search.TermScorer.nextDoc(TermScorer.java:65)
        at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:192)
        - locked <0x00002b93ab0686a8> (a com.google.inject.servlet.GuiceFilter$Context)
        at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
        at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:95)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
        at org.eclipse.jetty.server.Server.handle(Server.java:497)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
        at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
        at java.lang.Thread.run(Unknown Source)

- gc log
2017-05-11T03:26:33.297-0500: 24190.995: [GC [PSYoungGen: 20841162K->324230K(19940352K)] 37491735K->17271351K(40344064K), 0.3186810 secs] [Times: user=1.40 sys=0.19, real=0.32 secs] 
2017-05-11T03:27:52.141-0500: 24269.839: [GC [PSYoungGen: 19939974K->64623K(18904064K)] 36887095K->17041180K(39307776K), 0.0518430 secs] [Times: user=0.55 sys=0.03, real=0.05 secs] 
2017-05-11T03:28:55.948-0500: 24333.646: [GC [PSYoungGen: 18903663K->76096K(18176000K)] 35880220K->17055388K(38579712K), 0.0550160 secs] [Times: user=0.49 sys=0.05, real=0.06 secs] 
2017-05-11T03:29:43.879-0500: 24381.577: [GC [PSYoungGen: 18175808K->103826K(17499136K)] 35155100K->17101686K(37902848K), 0.0825140 secs] [Times: user=0.72 sys=0.07, real=0.08 secs] 
2017-05-11T03:30:37.197-0500: 24434.895: [GC [PSYoungGen: 17499026K->269229K(16993792K)] 34496886K->17267089K(37397504K), 0.1586370 secs] [Times: user=1.07 sys=0.28, real=0.16 secs] 
2017-05-11T03:31:26.637-0500: 24484.335: [GC [PSYoungGen: 16993709K->245822K(16331776K)] 33991569K->17243682K(36735488K), 0.0896720 secs] [Times: user=0.94 sys=0.39, real=0.09 secs] 

- httpd_log
[ip 1] - [user a] [11/May/2017:03:29:13 -0500] "GET /accounts/[user g]%40[domain1].com/avatar?s=26 HTTP/1.1" 404 30 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 2] - [user b] [11/May/2017:03:29:13 -0500] "GET /changes/958601/revisions/a746eb84b70e90aad53ccab25fb71f54cbbb53db/comments HTTP/1.1" 200 28 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 1] - [user a] [11/May/2017:03:29:13 -0500] "GET /accounts/[user h]%40[domain2].com/avatar?s=26 HTTP/1.1" 404 30 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 2] - [user b] [11/May/2017:03:29:13 -0500] "GET /changes/958601/revisions/a746eb84b70e90aad53ccab25fb71f54cbbb53db/drafts HTTP/1.1" 200 28 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 3] - [user c] [11/May/2017:03:29:13 -0500] "GET /changes/992960/detail?O=404 HTTP/1.1" 200 2165 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
[ip 2] - [user b] [11/May/2017:03:29:13 -0500] "GET /changes/958601/revisions/a746eb84b70e90aad53ccab25fb71f54cbbb53db/files?reviewed HTTP/1.1" 200 28 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 2] - [user b] [11/May/2017:03:29:13 -0500] "GET /changes/958601/revisions/a746eb84b70e90aad53ccab25fb71f54cbbb53db/files HTTP/1.1" 200 216 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 2] - [user b] [11/May/2017:03:29:13 -0500] "GET /projects/[project path 1]%2[project path 1]/config HTTP/1.1" 200 499 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 2] - [user b] [11/May/2017:03:29:13 -0500] "GET /changes/958601/revisions/a746eb84b70e90aad53ccab25fb71f54cbbb53db/commit?links HTTP/1.1" 200 529 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 4] - - [11/May/2017:03:29:13 -0500] "GET /favicon.ico HTTP/1.1" 200 - - "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0"
[ip 5] - [user d] [11/May/2017:03:29:13 -0500] "GET /changes/992994/detail?O=404 HTTP/1.1" 304 - "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0"
[ip 6] - [user e] [11/May/2017:03:29:14 -0500] "GET /changes/969435/detail?O=404 HTTP/1.1" 304 - "https://[gerrit server]/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"
[ip 2] - [user b] [11/May/2017:03:29:14 -0500] "GET /changes/958601/revisions/1/comments HTTP/1.1" 200 28 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 2] - [user b] [11/May/2017:03:29:14 -0500] "GET /changes/?q=project:[project path 1]+change:I325d6b393bae120c184bb0426b445f779d0b5192+-change:958601+-is:abandoned&O=a HTTP/1.1" 200 28 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 7] - [user f] [11/May/2017:03:29:14 -0500] "GET /accounts/self/capabilities?q=createProject&q=createGroup&q=viewPlugins HTTP/1.1" 200 28 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 2] - [user b] [11/May/2017:03:29:14 -0500] "GET /changes/958601/revisions/a746eb84b70e90aad53ccab25fb71f54cbbb53db/related HTTP/1.1" 200 40 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 7] - [user f] [11/May/2017:03:29:14 -0500] "GET /config/server/top-menus HTTP/1.1" 200 645 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 7] - [user f] [11/May/2017:03:29:14 -0500] "GET /accounts/self/preferences HTTP/1.1" 200 315 "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
[ip 4] - - [11/May/2017:03:29:14 -0500] "GET /config/server/top-menus HTTP/1.1" 200 645 "https://[gerrit server]/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0"
[ip 4] - - [11/May/2017:03:29:14 -0500] "GET /gerrit_ui/gwt/chrome/images/hborder.png HTTP/1.1" 200 - "https://[gerrit server]/gerrit_ui/gwt/chrome/7CF1DE6EF2AABFEFAE4D469A16D60071.cache.css" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0"
[ip 7] - [user f] [11/May/2017:03:29:14 -0500] "GET /changes/958601/edit?list HTTP/1.1" 204 - "https://[gerrit server]/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"

- error_log
logs> cat error_log |grep 03:29:  
[2017-05-11 03:29:26,968] INFO  com.google.gerrit.httpd.auth.ldap.LdapLoginServlet : '[user i]' failed to sign in: Incorrect username or password
logs> cat error_log |grep 03:30:
logs> 

- sshd_log
logs> cat sshd_log |grep 03:29:10
[2017-05-11 03:29:10,400 -0500] 3d007e4b [user j] a/1001028 LOGIN FROM [ip 8]
[2017-05-11 03:29:10,581 -0500] 3d007e4b [user j] a/1001028 git-upload-pack./home/repo/scm/pre_build_scripts.git 1ms 175ms 0
[2017-05-11 03:29:10,581 -0500] 3d007e4b [user j] a/1001028 LOGOUT
[2017-05-11 03:29:10,620 -0500] 5d09b278 [user k] a/1004944 LOGIN FROM [ip 9]
[2017-05-11 03:29:10,621 -0500] 5d09b278 [user k] a/1004944 LOGOUT
[2017-05-11 03:29:10,764 -0500] bdf30e9a [user j] a/1001028 LOGIN FROM [ip 8]
[2017-05-11 03:29:10,827 -0500] bdf30e9a [user j] a/1001028 git-upload-pack./home/repo/scm/pre_build_scripts.git 1ms 56ms 0
[2017-05-11 03:29:10,865 -0500] bdf30e9a [user j] a/1001028 LOGOUT
logs> cat sshd_log |grep 03:29:11
logs> cat sshd_log |grep 03:29:12
logs> cat sshd_log |grep 03:29:13
logs> cat sshd_log |grep 03:29:14
logs> cat sshd_log |grep 03:29:15
logs> cat sshd_log |grep 03:29:16
logs> 

Matthew Webber

May 11, 2017, 11:30:26 AM
to Repo and Gerrit Discussion, lean...@motorola.com, mat...@unsolvable.org
>> 4096 open files
Suspicious. What's your ulimit? On my machine it is
$ ulimit -Hn
4096

Matthew

Matthias Sohn

May 11, 2017, 11:42:04 AM
to Leandro Fonseca, Repo and Gerrit Discussion, Matthew Webber
If that's one of the bad threads, it looks like you have a performance issue with Lucene.
Did you maybe forget to reindex during the last upgrade of your server?
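For reference, an offline reindex is roughly (a sketch; run it with Gerrit stopped, and the site path is a placeholder):

java -jar gerrit.war reindex -d /path/to/gerrit_site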

Leandro Fonseca

May 11, 2017, 12:00:56 PM
to Repo and Gerrit Discussion, lean...@motorola.com, mat...@unsolvable.org
Hi Matthew,

Our ulimit is 49152:

# ulimit -Hn
49152

So, I assume it is ok, right?

Hi Matthias,

Yes, we did a full reindex when we upgraded from 2.8.4 to 2.11.3, about 3 years ago.
I assume that you are relating this "bad thread" to Lucene because of the first lines after "java.lang.Thread.State: RUNNABLE", correct?
If that's how I should interpret this thread information: I checked the other 5 bad threads and they all look the same:

- Thread 0x653b
"HTTP-601" prio=10 tid=0x00002b98ac4bb800 nid=0x653b runnable [0x00002b98a8b0e000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:246)
        at org.apache.lucene.util.PriorityQueue.updateTop(PriorityQueue.java:204)
        at org.apache.lucene.search.TopFieldCollector$MultiComparatorNonScoringCollector.updateBottom(TopFieldCollector.java:398)
        at org.apache.lucene.search.TopFieldCollector$MultiComparatorNonScoringCollector.collect(TopFieldCollector.java:427)
        at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:193)
        at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:163)
        at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:35)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:581)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:533)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:510)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:378)

- Thread 0x6981
"HTTP-680" prio=10 tid=0x00002b98a49c0000 nid=0x6981 runnable [0x00002b989ab69000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.search.TopFieldCollector$MultiComparatorNonScoringCollector.collect(TopFieldCollector.java:424)
        at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:193)
        at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:163)
        at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:35)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:581)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:533)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:510)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:378)
        at com.google.gerrit.lucene.LuceneChangeIndex$QuerySource.read(LuceneChangeIndex.java:408)

- Thread 0x21b6
"HTTP-4983" prio=10 tid=0x00002b98ada82800 nid=0x21b6 runnable [0x00002b98a9c4a000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:246)
        at org.apache.lucene.util.PriorityQueue.pop(PriorityQueue.java:177)
        at org.apache.lucene.search.TopFieldCollector.populateResults(TopFieldCollector.java:1203)
        at org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:156)
        at org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:93)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:582)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:533)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:510)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:378)

- Thread 0xd5b
"HTTP-5822" prio=10 tid=0x00002b98a44ff800 nid=0xd5b runnable [0x00002b98b00fd000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:246)
        at org.apache.lucene.util.PriorityQueue.updateTop(PriorityQueue.java:204)
        at org.apache.lucene.search.TopFieldCollector$MultiComparatorNonScoringCollector.updateBottom(TopFieldCollector.java:398)
        at org.apache.lucene.search.TopFieldCollector$MultiComparatorNonScoringCollector.collect(TopFieldCollector.java:427)
        at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:193)
        at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:163)
        at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:35)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:581)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:533)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:510)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:378)

- Thread 0x7c5a
"HTTP-6086" prio=10 tid=0x00002b98ac06e800 nid=0x7c5a runnable [0x00002b98ab27d000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:246)
        at org.apache.lucene.util.PriorityQueue.pop(PriorityQueue.java:177)
        at org.apache.lucene.search.TopFieldCollector.populateResults(TopFieldCollector.java:1203)
        at org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:156)
        at org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:93)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:582)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:533)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:510)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:378)
        at com.google.gerrit.lucene.LuceneChangeIndex$QuerySource.read(LuceneChangeIndex.java:408)

 - Thread 0x60a8
"HTTP-640" prio=10 tid=0x000000001a288000 nid=0x60a8 runnable [0x00002b989631d000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:246)
        at org.apache.lucene.util.PriorityQueue.updateTop(PriorityQueue.java:204)
        at org.apache.lucene.search.TopFieldCollector$MultiComparatorNonScoringCollector.updateBottom(TopFieldCollector.java:398)
        at org.apache.lucene.search.TopFieldCollector$MultiComparatorNonScoringCollector.collect(TopFieldCollector.java:427)
        at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:193)

luca.mi...@gmail.com

May 11, 2017, 12:08:57 PM
to Leandro Fonseca, Repo and Gerrit Discussion, mat...@unsolvable.org
Your ulimit gets overridden at Gerrit startup. Did you already share your gerrit.config?

Sent from my iPhone

Leandro Fonseca

May 11, 2017, 4:40:03 PM
to Repo and Gerrit Discussion, lean...@motorola.com, mat...@unsolvable.org
Hi Luca,

Yes, I did, pasting it here again anyway:

Luca Milanesio

May 11, 2017, 5:06:13 PM
to Leandro Fonseca, Repo and Gerrit Discussion, mat...@unsolvable.org
There you go: core.packedGitOpenFiles overrides your ulimit and sets a lower value.

On 11 May 2017, at 21:40, Leandro Fonseca <lean...@motorola.com> wrote:

Hi Luca,

Yes, I did, pasting it here again anyway:

[gerrit]
basePath = [path to projects]
canonicalWebUrl = https://[server name]/
replicateOnStartup = false
[core]
        packedGitOpenFiles = 4096 <<<<====

(from gerrit.sh)
GERRIT_FDS=`get_config --int core.packedGitOpenFiles`
test -z "$GERRIT_FDS" && GERRIT_FDS=128
GERRIT_FDS=`expr $GERRIT_FDS + $GERRIT_FDS`
test $GERRIT_FDS -lt 1024 && GERRIT_FDS=1024

Your ulimit set at startup is then 8192 files.


Leandro Fonseca

May 11, 2017, 5:49:33 PM
to Repo and Gerrit Discussion, lean...@motorola.com, mat...@unsolvable.org
Hey Luca,

Ok, so if show-caches shows "4096 open files" and we actually have 8192 files available, that is probably not the problem, right?
Or do you suggest increasing this anyway?

Thanks,
Leandro.

Leandro Fonseca

May 15, 2017, 11:34:00 AM
to Repo and Gerrit Discussion, lean...@motorola.com, mat...@unsolvable.org
Hi Guys,

One thing that I noticed about this problem: during the weekend these "bad threads" go away.
The only two things we have differently during weekends are:

- Much lower usage of the Gerrit environment
- Git-gc running on all projects

I'm going to run this git-gc operation on all projects during the week to check whether it eliminates those threads, but I'm not sure whether that's related.
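Roughly what I plan to run (a sketch, assuming all repositories are bare *.git directories directly under one base path):

for repo in /path/to/projects/*.git; do
    git --git-dir="$repo" gc
done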
Maybe the low usage somehow lets those threads time out? Is there any setting I could use to force these threads to time out faster?

Thanks again,
Leandro.

Leandro Fonseca

Jul 7, 2017, 3:53:58 PM
to Repo and Gerrit Discussion, lean...@motorola.com, mat...@unsolvable.org
Guys,

Sorry for the long time not posting on this.
We have managed to drastically reduce the number of pushes we receive during the night shift; however, we are still seeing these bad threads appear.

Do you think it could be caused by a low limit for open files?
Do you think a new reindex would help?

Thanks,
Leandro,

Leandro Fonseca

Jul 7, 2017, 3:56:20 PM
to Repo and Gerrit Discussion, lean...@motorola.com, mat...@unsolvable.org
One more piece of information: currently we are only running git-gc on our projects, not Gerrit's built-in gc.
Would that be a problem as well?

Matthew Webber

Jul 7, 2017, 4:24:32 PM
to Repo and Gerrit Discussion, lean...@motorola.com, mat...@unsolvable.org
On Friday, July 7, 2017 at 8:56:20 PM UTC+1, Leandro Fonseca wrote:
Another information: currently we are just running git-gc on our project, not gerrit-gc.
Would that be a problem as well?

I think that there were some bugs in Gerrit/JGit gc in the release you are on, so it's definitely better to use native git gc.
However, I think I'm right in saying that you should not run git gc while Gerrit is active - since Gerrit does not know about it, there's a potential for corruption.
So, stop Gerrit, run git gc, start Gerrit.
Matthew

Martin Fick

Jul 7, 2017, 5:28:48 PM
to repo-d...@googlegroups.com, Matthew Webber, lean...@motorola.com
This is not feasible in most production environments.

Unless you are running git gc from another server to a
shared NFS filesystem, you should absolutely be safe running
git gc while Gerrit is running.

We run it that way (even from other servers, because we have
repacking patches to support this) in a large system (>1.5M
changes), continuously repacking hourly, for years now.

-Martin

--
The Qualcomm Innovation Center, Inc. is a member of Code
Aurora Forum, hosted by The Linux Foundation

Martin Fick

Jul 7, 2017, 5:49:21 PM
to repo-d...@googlegroups.com, Leandro Fonseca, mat...@unsolvable.org
On Thursday, May 11, 2017 08:17:16 AM Leandro Fonseca wrote:
> Just confirming, are you asking how many CPU cores the
> server has? If that's what you are asking, the server has
> 32 cores. We have ~ 1TB of active projects, but of course
> not all of them are used everyday, and this specific
> server will mostly receive only pushes (pulls are
> disabled for most of the users).

1TB is not small.

On Wednesday, May 10, 2017 01:31:17 PM Leandro Fonseca
wrote:
> HeapSize = 72gb and packedGitLimit = 12g, our server has
> 96gb total for RAM. Ok, I will have the VM configured as
> you suggested on my next restart and give you guys the
> details.

I don't know if this is your problem, however from your
description of your repos, and some of your settings, it
makes me think that your machine is underpowered for what you
are likely trying to do with it.

We have something closer to 1/2TB in git repos, and we have
a 48 core machine with 243GB of RAM, and we set our
heap to around 196GB. We have to limit our sshd threads to
around 36 to avoid overwhelming the heap. We do have lots
of replication threads contributing to the heap usage, though.
Do you have any replication threads, and if so, how many in
total?

I believe that 96 sshd threads is aggressive for your server
and heap size, although I think you likely need those high
git settings for your repos to work effectively, and
possibly even higher. I would be concerned that you do not
have enough heap to set them much higher.

> [core]
> packedGitOpenFiles = 4096
> packedGitLimit = 12g
> packedGitWindowSize = 16k
> streamFileThreshold = 4095m
> [container]
> heapLimit = 72g
> [sshd]
> threads = 96
> batchThreads = 64

Maybe you can post some more details about your usage? How
many uploads a day, how many users regularly use the system?
How many projects and how many changes on your system?

Matthias Sohn

Jul 7, 2017, 6:56:13 PM
to Martin Fick, Repo and Gerrit Discussion, Leandro Fonseca, Matthew Webber
I second Martin's opinion that you probably need more heap.
A 12G packedGitLimit (the JGit buffer cache) to serve 1TB of repos means
you can keep only about 1% of the git data in memory; all other objects
have to be read from disk and parsed.

Is gc configured to create bitmap indexes ?

-Matthias

Leandro Fonseca

Jul 10, 2017, 2:10:29 PM
to Repo and Gerrit Discussion, mf...@codeaurora.org, lean...@motorola.com, mat...@unsolvable.org
Hi Matthew,

Correct, version 2.11.3 has some gerrit-gc problems (ack-nak problems while trying to sync the projects if I remember correctly), and that's the reason why we use git-gc.
As per what Martin reported, we also run git-gc weekly with Gerrit running and have no problems with that. Do you think it could be causing our issue?

Hi Martin & Matthias,

Yes, our server specs are underpowered for our need, and we are already working on some upgrades.
We have a customized & independent replication system that is not disputing resources with the Gerrit JVM, so our SSH threads are not being used by the replication plugin.
For the usage data on this specific instance, which has 1.6k active users, these are the daily averages:

14k git-upload-packs
40k git-receive-packs
3.2k http logins (GET on https://<server-address>/login)
1k patch-sets created

We are not using bitmap indexes on our git-gc operations.

One more piece of information: we really started experiencing these problems after we deployed SSL on this instance. And as per Zivkov's comment on the related post below, this could be related to Jetty. Is there any setting/log we could use to confirm that? Would increasing the heap limit help in any case?

Also, Matthias pointed out before that the problem could be related to the Lucene index:

if that's one of the bad threads it looks like you have a performance issue with lucene.

Would there be any way to confirm it besides re-indexing all changes again? Perhaps something I could look for in the logs?

Thanks again,
Leandro.

Matthias Sohn

Jul 10, 2017, 5:37:14 PM
to Leandro Fonseca, Repo and Gerrit Discussion, Martin Fick, Matthew Webber
On Mon, Jul 10, 2017 at 8:10 PM, Leandro Fonseca <lean...@motorola.com> wrote:
Hi Matthew,

Correct, version 2.11.3 has some gerrit-gc problems (ack-nak problems while trying to sync the projects if I remember correctly), and that's the reason why we use git-gc.
As per what Martin reported, we also run git-gc weekly with Gerrit running and have no problems with that. Do you think it could be causing our issue?

Hi Martin & Matthias,

Yes, our server specs are underpowered for our need, and we are already working on some upgrades.
We have a customized & independent replication system that is not disputing resources with the Gerrit JVM, so our SSH threads are not being used by the replication plugin.
For the usage data on this specific instance, that has 1.6k active users, these are the daily averages:

14k git-upload-packs
40k git-receive-packs
3.2k http logins (GET on https://<server-address>/login)
1k patch-sets created

We are not using bitmap indexes on our git-gc operations.

Why? Bitmap indexes can improve performance quite a bit.
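Enabling them for native git gc is roughly (a sketch; set it inside each bare repository, or in the global git config of the user running gc):

git config repack.writeBitmaps true
git gc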
 
Just one more information: we really started experiencing this problems after we deployed SSL to this instance. And as per Zivkov comment on the related post below this could be related to Jetty. Would there be any setting/log we could use to confirm that? Increasing the heap limit would help anyhow?

Also, Matthias have pointed before that the problem could be related to the Lucene index:

if that's one of the bad threads it looks like you have a performance issue with lucene.

Would the be any way to confirm it besides re-indexing all changes again? Like something I could look into the logs?

You can try to identify which threads are creating the high load:

Use "top -H" to find out which Java thread is using the CPU. At the same time take a Java thread dump.
Take the topmost PID from the output of the top command, convert it to hexadecimal format and search for
the thread with that thread ID (nid) in the Java thread dump.
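A minimal sketch of that workflow (the PID/TID values are placeholders):

$ top -H -p <gerrit-pid>               # note the TID of the thread burning CPU
$ jstack <gerrit-pid> > threads.txt    # take a thread dump at roughly the same time
$ printf '0x%x\n' <hot-tid>            # convert the decimal TID to hex
$ grep 'nid=0x<hex-tid>' threads.txt   # locate that thread's stack in the dump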

If you want to stick to 2.11.x you should consider updating to the latest service release, 2.11.10.
Looking at the release notes, a lot of bugs were fixed between 2.11.3 and 2.11.10.

Did you check cache hit rates [1] to find out if some caches need to be tuned ?


-Matthias

Leandro Fonseca

Jul 11, 2017, 9:46:39 AM
to Repo and Gerrit Discussion, lean...@motorola.com, mf...@codeaurora.org, mat...@unsolvable.org
Hi Matthias,

Thanks for the suggestion, I will check this bitmap indexing option and will let you guys know the results.

Regarding the thread identification, I have already done that thread capture and posted it in my first message in this topic, and I guess your original concern about the Lucene index being the cause came from looking at it :)

Here's an example of these bad threads:

"HTTP-63170" prio=10 tid=0x000000000c561000 nid=0x6eae runnable [0x00002b58ea0bb000]
   java.lang.Thread.State: RUNNABLE
at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:246)
at org.apache.lucene.util.PriorityQueue.pop(PriorityQueue.java:177)
at org.apache.lucene.search.TopFieldCollector.populateResults(TopFieldCollector.java:1203)
at org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:156)
at org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:93)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:582)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:533)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:510)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:378)
at com.google.gerrit.lucene.LuceneChangeIndex$QuerySource.read(LuceneChangeIndex.java:408)
at com.google.gerrit.server.index.IndexedChangeQuery.read(IndexedChangeQuery.java:95)
at com.google.gerrit.server.index.IndexedChangeQuery.restart(IndexedChangeQuery.java:138)
at com.google.gerrit.server.query.change.AndSource.readImpl(AndSource.java:133)
at com.google.gerrit.server.query.change.AndSource.read(AndSource.java:99)
at com.google.gerrit.server.query.change.QueryProcessor.queryChanges(QueryProcessor.java:153)
at com.google.gerrit.server.query.change.QueryProcessor.queryChanges(QueryProcessor.java:102)
at com.google.gerrit.server.query.change.QueryChanges.query0(QueryChanges.java:143)
at com.google.gerrit.server.query.change.QueryChanges.query(QueryChanges.java:132)
at com.google.gerrit.server.query.change.QueryChanges.apply(QueryChanges.java:99)

I checked the Lucene code online and apparently (if I'm looking at the right code) line 246 of the PriorityQueue.java file is precisely the line that permits the object to be gc'ed:

...
244:      if (heap[i] == element) {
245:        heap[i] = heap[size];
246:        heap[size] = null; // permit GC of objects
247:        size--;
...

Thanks,
Leandro

Björn Pedersen

Jul 12, 2017, 3:09:31 AM
to Repo and Gerrit Discussion, lean...@motorola.com, mf...@codeaurora.org, mat...@unsolvable.org
Hi,

if java gc is the culprit, maybe take a look at
https://www.cloudbees.com/blog/joining-big-leagues-tuning-jenkins-gc-responsiveness-and-stability
and try to tune it.

Björn

Luca Milanesio

Jul 12, 2017, 4:20:29 AM
to Björn Pedersen, Leandro Fonseca, Repo and Gerrit Discussion, Martin Fick, Matthew Webber
On 12 Jul 2017, at 08:09, 'Björn Pedersen' via Repo and Gerrit Discussion <repo-d...@googlegroups.com> wrote:

Hi,

if java gc is the culprit, maybe take a look at
https://www.cloudbees.com/blog/joining-big-leagues-tuning-jenkins-gc-responsiveness-and-stability
and try to tune it.

Björn

I believe the problem was about threads and not heap, wasn't it?

What you call a "bad thread" is simply the execution of a Lucene Index.

Can you navigate up the stack and see what the originator is?
Have you traced it down to a set of REST API calls (or another cause) that is generating so many Gerrit queries?

Luca.

Leandro Fonseca

Jul 12, 2017, 5:02:45 PM
to Repo and Gerrit Discussion, lean...@motorola.com, mf...@codeaurora.org, mat...@unsolvable.org
Hi Björn,

Thanks a lot for your suggestion.
I will review it and provide comments.

Leandro.

Leandro Fonseca

Jul 12, 2017, 5:05:47 PM
to Repo and Gerrit Discussion, ice...@googlemail.com, lean...@motorola.com, mf...@codeaurora.org, mat...@unsolvable.org
Hi Luca,

Sorry, how exactly should I "navigate the stack up and see what is the originator"?
I have been trying to determine what is causing this; I'm pretty sure it is related to a high volume of pushes, or maybe a push to a specific project (which I couldn't identify either), but I haven't been able to pin it down yet.

Is there any safe way to identify what triggers these threads?
Would a git-receive-pack originate these HTTP/Lucene threads?
Is there any way to improve these threads' execution, or even limit the number of threads running in parallel?

Thanks a lot!
Leandro.

Luca Milanesio

Jul 12, 2017, 5:26:53 PM
to Leandro Fonseca, Repo and Gerrit Discussion, Björn Pedersen, Martin Fick, mat...@unsolvable.org
If I look at the only stack trace you produced, the highest call in the stack is:
at com.google.gerrit.server.query.change.QueryChanges.apply(QueryChanges.java:99)

This doesn't look like it was generated by a Git push or git-receive-pack.
If you get the full stack trace of this and the other threads (e.g. via a JVM thread dump), you should see which Gerrit operation is actually triggering them.
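For example (a sketch; the PID is a placeholder):

$ jstack -l <gerrit-pid> > full-thread-dump.txt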

Luca.

Leandro Fonseca

Jul 13, 2017, 10:07:20 AM
to Repo and Gerrit Discussion, lean...@motorola.com, ice...@googlemail.com, mf...@codeaurora.org, mat...@unsolvable.org
Hi Luca,

I just checked the stack traces of multiple examples of these "bad threads".
Except for the first few lines (which may differ a little but are all under Lucene) and the thread id on the "locked" line, everything else looks the same for all threads.

Any idea what that could be? Maybe a result of a REST call?

Thanks again,
Leandro.

"HTTP-4805" prio=10 tid=0x00000000093ab000 nid=0x19d6 runnable [0x00002b71e14f6000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:246)
        at org.apache.lucene.util.PriorityQueue.updateTop(PriorityQueue.java:204)
        at org.apache.lucene.search.TopFieldCollector$MultiComparatorNonScoringCollector.updateBottom(TopFieldCollector.java:398)
        at org.apache.lucene.search.TopFieldCollector$MultiComparatorNonScoringCollector.collect(TopFieldCollector.java:427)
        at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:193)
        at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:163)
        at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:35)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:581)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:533)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:510)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:378)
        at com.google.gerrit.lucene.LuceneChangeIndex$QuerySource.read(LuceneChangeIndex.java:408)
        at com.google.gerrit.server.index.IndexedChangeQuery.read(IndexedChangeQuery.java:95)
        at com.google.gerrit.server.index.IndexedChangeQuery.restart(IndexedChangeQuery.java:138)
        at com.google.gerrit.server.query.change.AndSource.readImpl(AndSource.java:133)
        at com.google.gerrit.server.query.change.AndSource.read(AndSource.java:99)
        at com.google.gerrit.server.query.change.QueryProcessor.queryChanges(QueryProcessor.java:153)
        at com.google.gerrit.server.query.change.QueryProcessor.queryChanges(QueryProcessor.java:102)
        at com.google.gerrit.server.query.change.QueryChanges.query0(QueryChanges.java:143)
        at com.google.gerrit.server.query.change.QueryChanges.query(QueryChanges.java:132)
        at com.google.gerrit.server.query.change.QueryChanges.apply(QueryChanges.java:99)
        - locked <0x00002b621d5cc4a0> (a com.google.inject.servlet.GuiceFilter$Context)
        at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
        at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:95)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
        at org.eclipse.jetty.server.Server.handle(Server.java:497)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
        at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
        at java.lang.Thread.run(Unknown Source)
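
(As a side note, the QueryChanges/QueryProcessor frames near the bottom of that trace are, as far as I can tell, the handler behind the change-query REST endpoint, so one hypothetical cross-check is to grep httpd_log for change queries around the moment a thread ramps up, e.g.:

[logs]$ grep 'GET /changes/?q=' httpd_log | grep '<ramp-up timestamp>'

The grep pattern is only a guess against the httpd_log format shown earlier in the thread.)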


Matthias Sohn

unread,
Jul 14, 2017, 8:28:18 AM7/14/17
to Leandro Fonseca, Repo and Gerrit Discussion, Björn Pedersen, Martin Fick, Matthew Webber
On Thu, Jul 13, 2017 at 4:07 PM, Leandro Fonseca <lean...@motorola.com> wrote:
Hi Luca,

I just checked the stack traces of multiple examples of these "bad threads". 
Except for the first lines, which may differ a little but are all under Lucene, and the thread id on the "locked" line (both are highlighted below), everything else looks the same for all threads.

Any idea what that could be? Maybe a result of a REST call?

Thanks again,
Leandro.

Maybe you need to increase the number of indexing threads [1]?
In 2.11.3 the default is index.threads = 1; in master the default is
1 plus half of the number of logical CPUs as returned by the JVM.
-Matthias

vista...@gmail.com

unread,
Aug 22, 2018, 4:31:54 AM8/22/18
to Repo and Gerrit Discussion
Hi all,

I am also hitting this problem, but it is not about Lucene; it is about "SSH git-upload-pack".

What can I do about that?

Thanks!

vista...@gmail.com

unread,
Jan 25, 2019, 1:44:33 AM1/25/19
to Repo and Gerrit Discussion
Hi, Leandro Fonseca:

Was this problem solved?

Thank you!

Leandro Fonseca wrote on Wednesday, May 10, 2017 at 5:26:56 AM UTC+8:
Hi everyone

Our Gerrit 2.11.3 installation, that is currently using HTTPS/SSL, is repeatedly increasing its CPU usage until a point that all the server CPUs are 100% used and the Gerrit process obviously breaks. The only solution to free up the used CPU is to restart the Gerrit service.

I used jstack to found that the CPU high usage is coming from HTTP threads (Yes, the Gerrit process was using ~ 900% of the server CPU at the time this data was collected):

Process id    CPU utilization    Thread
4156          99.9               "HTTP-688" prio=10 tid=0x00000000041cd000 nid=0x103c runnable [0x00002baa61d19000]
6476          99.9               "HTTP-785" prio=10 tid=0x00002baa5c796000 nid=0x194c runnable [0x00002baa602ff000]
17037         99.9               "HTTP-845" prio=10 tid=0x00002baa5c794800 nid=0x428d runnable [0x00002baa60703000]
8466          99.9               "HTTP-882" prio=10 tid=0x00002baa645d5800 nid=0x2112 runnable [0x00002baa6c703000]
17404         99.9               "HTTP-924" prio=10 tid=0x00002baa5c0b7000 nid=0x43fc runnable [0x00002baa6201c000]
4376          99.9               "HTTP-929" prio=10 tid=0x0000000004650000 nid=0x1118 runnable [0x00002baa6dc14000]
2965          99.9               "HTTP-963" prio=10 tid=0x0000000004f28800 nid=0xb95 runnable [0x00002baa5670a000]
8810          99.9               "HTTP-10958" prio=10 tid=0x00000000055f1000 nid=0x226a runnable [0x00002baa6edbf000]
690           99.9               "HTTP-12100" prio=10 tid=0x0000000005373800 nid=0x2b2 runnable [0x00002baa6ecae000]

I could not find much information on the thread full information (using thread 0x226a as example, can provide the info of the others):

"HTTP-10958" prio=10 tid=0x00000000055f1000 nid=0x226a runnable [0x00002baa6edbf000]
        - locked <0x00002b9a8b8561c8> (a com.google.inject.servlet.GuiceFilter$Context)
        at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
        at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:95)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
        at org.eclipse.jetty.server.Server.handle(Server.java:497)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
        at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
        at java.lang.Thread.run(Unknown Source)

All these threads were running using 0.0% of CPU since the last time Gerrit process was restarted, some have some small usage and then return to 0.0%, but then all of the sudden they ramp up to 100% CPU usage and stay around this usage rate (97, 98 & 99%) during all their existence, until it gets to a critical level and I need to restart Gerrit again. For the above detailed thread, according to my monitors, this ramp up happened around 09/May/2017:13:04:24.

I looked on the httpd_log for any suspicious activity, but there was nothing wrong (not only for this specific case, but also for the others). I even accessed the below listed changes and couldn't reproduce the problem:

<ip1> - <user1> [09/May/2017:13:04:22 -0500] "GET /changes/990298/detail?O=404 HTTP/1.1" 304 - "https://<gerrit server name>/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
<ip2> - <user2> [09/May/2017:13:04:23 -0500] "GET /changes/990656/detail?O=404 HTTP/1.1" 304 - "https://<gerrit server name>/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.96 Safari/537.36"
<ip3> - <user3> [09/May/2017:13:04:23 -0500] "GET /changes/991102/detail?O=404 HTTP/1.1" 304 - "https://<gerrit server name>/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
<ip4> - <user4> [09/May/2017:13:04:23 -0500] "GET /changes/991792/detail?O=404 HTTP/1.1" 304 - "https://<gerrit server name>/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0"
100.64.193.60 - - [09/May/2017:13:04:23 -0500] "GET /changes/?n=25&O=81 HTTP/1.1" 200 28 "https://<gerrit server name>/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
<ip5> - <user5> [09/May/2017:13:04:24 -0500] "GET /changes/991116/detail?O=404 HTTP/1.1" 304 - "https://<gerrit server name>/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
<ip6> - <user6> [09/May/2017:13:04:24 -0500] "GET /changes/978360/detail?O=404 HTTP/1.1" 304 - "https://<gerrit server name>/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0"


Nothing was reported on error_log at this specific period:

[logs]$ cat error_log | grep "13:04:20"
[logs]$ cat error_log | grep "13:04:21"
[logs]$ cat error_log | grep "13:04:22"
[logs]$ cat error_log | grep "13:04:23"
[logs]$ cat error_log | grep "13:04:24"
[logs]$ cat error_log | grep "13:04:25"
[logs]$ 


Only few pushes reported on sshd_log at this specific period:

[2017-05-09 13:04:20,111 -0500] 8008524b <userA> a/1001028 LOGIN FROM <ip X>
[2017-05-09 13:04:20,526 -0500] 203c06f2 <userB> a/1010806 LOGIN FROM <ip Y>
[2017-05-09 13:04:21,456 -0500] 203c06f2 <userB> a/1010806 git-receive-pack./<project 1> 1ms 379ms 0
[2017-05-09 13:04:21,798 -0500] 8008524b <userA> a/1001028 git-receive-pack./<project 2> 1ms 1671ms 0
[2017-05-09 13:04:21,799 -0500] 8008524b <userA> a/1001028 LOGOUT
[2017-05-09 13:04:21,857 -0500] 203c06f2 <userB> a/1010806 LOGOUT
[2017-05-09 13:04:23,118 -0500] 4016da78 <userA> a/1001028 LOGIN FROM <ip X>
[2017-05-09 13:04:23,207 -0500] e0182ea3 <userB> a/1010806 LOGIN FROM <ip Y>
[2017-05-09 13:04:23,776 -0500] 4016da78 <userA> a/1001028 git-receive-pack./<project 3> 0ms 643ms 0
[2017-05-09 13:04:23,777 -0500] 4016da78 <userA> a/1001028 LOGOUT
[2017-05-09 13:04:24,178 -0500] e0182ea3 <userB> a/1010806 git-receive-pack./<project 4> 1ms 421ms 0
[2017-05-09 13:04:24,587 -0500] e0182ea3 <userB> a/1010806 LOGOUT

As I mentioned, this is happening repeatedly, so it is a very big problem.

I have found a similar problem reported on this forum at the page below but unfortunately no conclusive information on how to fix or reproduce the problem was given:

Does anyone know what could be happening here?
Thanks in advance,

Leandro.




Leandro Fonseca

unread,
Jan 25, 2019, 8:46:23 AM1/25/19
to Repo and Gerrit Discussion
Hi,

No, I wasn't able to get it fixed :(
I'm moving to a bigger and faster server to try to work around it.

Leandro.

vista...@gmail.com

unread,
Jan 25, 2019, 8:58:37 PM1/25/19
to Repo and Gerrit Discussion
Hi, Leandro:

I am also troubled by this now. :(

Can you give me some info?

[1] How many CPU cores and how much RAM does the new machine have?

[2] Is the Gerrit database on another machine?

[3] Can you show me your gerrit.config?

Thank you very much!

Leandro Fonseca wrote on Friday, January 25, 2019 at 9:46:23 PM UTC+8: