Gerrit Unable to Process Requests on HTTPS and UI Becomes Unresponsive

lingalugari mohankrishna

Feb 27, 2025, 6:44:03 AM

to Repo and Gerrit Discussion

Hello Gerrit Experts,

We are facing some critical issues while serving traffic over HTTPS.
We have an F5 serving traffic as a reverse proxy (RP). The application handles requests on 29418 without issues, but when traffic hits 443 the UI is very slow and unresponsive.

We have set up Grafana to pull some data; every time my HTTPS traffic goes high, it fails to process.

On top of that, I ran:
netstat -an |grep 8080 |grep WAIT |wc -l
3400
That looks high to me. It fluctuates constantly, from as low as 1000 to as high as 3500. This Gerrit runs on a single node with 64 CPUs and 256 GB.
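
A note on the count above: `grep WAIT` lumps TIME_WAIT, CLOSE_WAIT and FIN_WAIT states together, and these can point to different causes. Below is a minimal sketch for breaking the number down by state. It assumes Linux `netstat -an` output, where the local address is the fourth column and the state is the sixth; `count_states` is a hypothetical helper name.

```shell
# Count TCP connections on a given local port, grouped by connection state.
# Reads `netstat -an` style lines on stdin (assumed Linux column layout:
# proto, recv-q, send-q, local address, foreign address, state).
count_states() {
    port="$1"
    awk -v port="$port" '
        # Keep only TCP lines whose local address ends in ":<port>".
        $1 ~ /^tcp/ && $4 ~ (":" port "$") { n[$6]++ }
        END { for (s in n) print s, n[s] }
    ' | sort
}

# Usage on the Gerrit host (8080 is the port from the command above):
#   netstat -an | count_states 8080
```

As a rough guide, TIME_WAIT entries are normal after the local side closes a connection, while a large and growing CLOSE_WAIT count usually means the application has not yet closed sockets that the peer already closed.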

Also, I can see these errors in error_log:
response already committed [CONTEXT request="REST /changes/" ]
org.eclipse.jetty.io.EofException

Can you please suggest how we can fix these HTTP issues? This has been constant for the past week, and every time we need to restart the service, which sometimes doesn't even help in bringing the HTTP queue down.

Current version: 3.6.5

Regards,
Mohan. L

Björn Pedersen

Feb 27, 2025, 7:02:26 AM

to Repo and Gerrit Discussion

We observed similar traffic spikes during the last few weeks, seemingly from a botnet. We got it under control by blocking a complete subnet on the firewall and implementing a rather strict fail2ban on the Apache in front of Gerrit. Just make sure you don't block your own IPs, too.

Most requests were to (public) gitiles URLs; it could well be that the bots are "trapped" in the many circular links you get there.

> Current version: 3.6.5

Independent of the problem above, you should really consider upgrading...


Matthias Sohn

Feb 27, 2025, 8:18:40 AM

to Björn Pedersen, Repo and Gerrit Discussion

On Thu, Feb 27, 2025 at 1:02 PM 'Björn Pedersen' via Repo and Gerrit Discussion <repo-d...@googlegroups.com> wrote:

> lingalugari mohankrishna wrote on Thursday, February 27, 2025 at 12:44:03 UTC+1:
>> Hello Gerrit Experts,
>>
>> We are facing some critical issues while serving traffic over HTTPS.
>> We have an F5 serving traffic as a reverse proxy (RP). The application handles requests on 29418 without issues, but when traffic hits 443 the UI is very slow and unresponsive.
>>
>> We have set up Grafana to pull some data; every time my HTTPS traffic goes high, it fails to process.
>>
>> On top of that, I ran:
>> netstat -an |grep 8080 |grep WAIT |wc -l
>> 3400
>> That looks high to me. It fluctuates constantly, from as low as 1000 to as high as 3500. This Gerrit runs on a single node with 64 CPUs and 256 GB.

I guess you have issues serving heavy fetch and clone requests together with REST API and write traffic on the same JVM process.

We have had good experience running a separate Gerrit replica process with access to the same git directory containing the repositories (e.g. by symlinking the Gerrit primary's git folder to the git folder of a replica colocated on the same host). Then block fetch and clone requests on the Gerrit primary process and serve them only from the Gerrit replica process. This way you have more control over which resources are spent on heavy fetch and clone requests and which ones are used for REST API and queries.

Then use the ParallelGC Java garbage collector for the replica and G1GC, or even better generational ZGC (for that, upgrade to Gerrit 3.11 and Java 21), for the Gerrit primary. This optimizes throughput for long-running git requests on the replica and responsiveness on the Gerrit primary. Routing HTTP requests can be automated using a load balancer, e.g. HAProxy. For SSH, use different ports to access the primary or the replica.

I think your thread pools are too large for the available compute resources.

sshd.threads should be at most 2 x number of cores.
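
The rule of thumb above can be turned into a quick check. This is a minimal sketch, assuming a Linux host where `nproc` is available. `max_sshd_threads` is a hypothetical helper, and the `$GERRIT_SITE` path in the comment is an assumption about the installation layout.

```shell
# Upper bound for sshd.threads per the rule of thumb above: 2 x number of cores.
max_sshd_threads() {
    echo $(( 2 * $1 ))
}

cores=$(nproc)
echo "cores=$cores, sshd.threads should be at most $(max_sshd_threads "$cores")"

# gerrit.config uses git-config syntax, so the configured value can be read
# with git itself (the path is an assumption about where the site lives):
#   git config -f "$GERRIT_SITE/etc/gerrit.config" --get sshd.threads
```

On the 64-core host from the original post this gives an upper bound of 128.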

Typically most of the performance issues are caused by the largest repositories; watch out for requests on repos larger than 1 GB.

Do you have repos containing large binary files?

Do you run git gc on a regular schedule?

Create a couple of thread dumps on the server while you observe slow requests; this should give you some idea where the long runners spend time.
Also, you can use tracing to get detailed performance traces.

See https://gerrit-review.googlesource.com/Documentation/user-request-tracing.html
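
Taking the thread dumps can be scripted. This is a minimal sketch, assuming the JDK's `jstack` tool is on the PATH and is run as the same OS user as the Gerrit JVM. `take_thread_dumps`, the `DUMP_CMD` override and the file naming are hypothetical choices for illustration.

```shell
# Take COUNT thread dumps of process PID, INTERVAL seconds apart, into OUTDIR.
# DUMP_CMD defaults to jstack; it is invoked as: $DUMP_CMD <pid>
take_thread_dumps() {
    pid="$1"; count="$2"; interval="$3"; outdir="$4"
    mkdir -p "$outdir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        "${DUMP_CMD:-jstack}" "$pid" > "$outdir/threaddump-$i.txt"
        # Sleep between dumps, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$(( i + 1 ))
    done
}

# Usage (the PID and output directory are placeholders):
#   take_thread_dumps <gerrit-pid> 5 10 /tmp/gerrit-dumps
```

Threads that show the same stack across several consecutive dumps are the likely long runners.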

>> Also, I can see these errors in error_log:
>> response already committed [CONTEXT request="REST /changes/" ]
>> org.eclipse.jetty.io.EofException
>>
>> Can you please suggest how we can fix these HTTP issues? This has been constant for the past week, and every time we need to restart the service, which sometimes doesn't even help in bringing the HTTP queue down.
>
> We observed similar traffic spikes during the last few weeks, seemingly from a botnet. We got it under control by blocking a complete subnet on the firewall and implementing a rather strict fail2ban on the Apache in front of Gerrit. Just make sure you don't block your own IPs, too.
>
> Most requests were to (public) gitiles URLs; it could well be that the bots are "trapped" in the many circular links you get there.
>
>> Current version: 3.6.5
>
> Independent of the problem above, you should really consider upgrading...

Yes, upgrading to the latest release is always a good idea.
