Find out which CI flow produces most of the Gerrit load

Denis Khabensky

Feb 12, 2026, 10:05:11 AM
to Repo and Gerrit Discussion
Our Gerrit instance serves a small number of developers and a large number of different CI flows. The CI flows produce 99% of the load, which consists mainly of checkouts (git-upload-pack).
When the Gerrit instance is overloaded, I need to identify the top consumers among the CI flows, so that I can terminate the most expensive or least important ones and reduce the overall load.

The problem is that all CI flows use a robot (service) account, so they are basically indistinguishable to Gerrit. The flows (and their number) change quite frequently, so I cannot give each flow its own account.

My idea is to use only HTTP checkouts and include some sort of flow identifier/tag in the User-Agent header.
That way I will be able to break down the load by User-Agent and find the flows that consume too much.
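
A minimal sketch of this idea, not a ready-made tool. It relies on Git's `http.userAgent` setting to tag each checkout, then sums response bytes per User-Agent for `git-upload-pack` requests. The tag format `git-ci/<flow>`, the function names, and the log line shape matched by `LINE_RE` are all assumptions for illustration; the pattern has to be adapted to the actual httpd_log or reverse-proxy log in use, and response bytes are only a rough proxy for server load.

```python
import re
from collections import defaultdict


def clone_command(url, flow):
    """Build a git clone command whose HTTP User-Agent carries a flow tag.

    The "git-ci/<flow>" tag format is hypothetical. Note that setting
    http.userAgent replaces git's default "git/<version>" agent string.
    """
    return ["git", "-c", f"http.userAgent=git-ci/{flow}", "clone", url]


# Assumed line shape (adjust to the real log format):
#   ... "POST /a/proj/git-upload-pack HTTP/1.1" 200 12345 ... "referer" "user-agent"
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-)'
    r'.*"(?P<agent>[^"]*)"\s*$'
)


def load_by_agent(lines):
    """Return (agent, totals) pairs for upload-pack requests, largest bytes first."""
    totals = defaultdict(lambda: {"requests": 0, "bytes": 0})
    for line in lines:
        m = LINE_RE.search(line)
        # Skip lines that do not match the assumed shape or are not checkouts.
        if not m or "git-upload-pack" not in m["path"]:
            continue
        entry = totals[m["agent"]]
        entry["requests"] += 1
        entry["bytes"] += 0 if m["bytes"] == "-" else int(m["bytes"])
    return sorted(totals.items(), key=lambda kv: kv[1]["bytes"], reverse=True)
```

A CI job would run the list returned by `clone_command(url, flow)` (for example via `subprocess.run`), and `load_by_agent(open(path))` would then rank the tags found in a log file at `path`.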

Do you have better ideas or any ready-made tools for this problem? Or is this somehow feasible for SSH checkouts, so that I can skip the migration from SSH to HTTP checkouts?

Björn Pedersen

Feb 12, 2026, 10:35:25 AM
to Repo and Gerrit Discussion
The load is mainly determined by repo size. 

Björn Pedersen

Feb 12, 2026, 10:38:50 AM
to Repo and Gerrit Discussion
See also https://groups.google.com/g/repo-discuss/c/TJ4Pdg3KHUI/m/8nH9xqgSBAAJ (or search for "tuning" in this group).

Denis Khabensky

Feb 12, 2026, 10:57:23 PM
to Repo and Gerrit Discussion
Thanks, Björn!
It seems I have already read most of the links you provided, and our Gerrit instance is technically configured well enough.
It does not die under load; the problem is that clients receive poor service (long checkout times).

We already have scaling work on our schedule, but here I am looking for short-term measures, more about how "fairly" our clients use Gerrit; maybe there are misconfigurations on the checkout side. One suggestion mentioned in https://groups.google.com/g/repo-discuss/c/TJ4Pdg3KHUI/m/8nH9xqgSBAAJ, "don't use clean dir, work in more incremental ways", is extremely relevant, but unfortunately it is not feasible for our CI servers: a clean environment is strictly enforced, and bringing in repos as zip archives is slower than doing a git clone.
