Gerrit connection issues

tech....@gmail.com

Jul 18, 2024, 5:53:27 AM
to Repo and Gerrit Discussion
Hi Experts,

We have a Gerrit HA setup with 48 CPUs and 180 GB of RAM on each node.

SSH connections are getting queued once we hit 70-80+ concurrent connections on each node.

What would be the correct values to handle 100+ connections with this number of cores? Or is it not possible to serve that many connections with 48 cores per node?

Here is our config:

[sshd]
        listenAddress = *:29418
        idleTimeout = 10m
        backend = MINA
        threads = 110
        batchThreads = 24
        commandStartThreads = 10
        maxConnectionsPerUser = 64
[httpd]
        listenUrl = proxy-https://*:8080/
        maxThreads = 800
        requestLog = true
        acceptorThreads = 48
        minThreads = 49
        maxQueued = 2000

Daniele Sassoli

Jul 18, 2024, 10:35:54 AM
to Repo and Gerrit Discussion
What kind of users are the connections queueing for? Interactive or batch? You have 86 (110 - 24) threads available for interactive users, so what you're seeing makes sense.
In general I would recommend not using Gerrit's SSH stack; you can search the mailing list for plenty of issues with it. If you can get your users to use HTTP calls,
I'm sure you'll see stability improvements.
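
A minimal sketch of the arithmetic above, assuming (as the reply describes) that batchThreads is carved out of the total sshd.threads pool and the remainder serves interactive users. The function name is hypothetical and exists only for illustration; it is not part of Gerrit.

```python
def split_ssh_threads(threads: int, batch_threads: int) -> tuple[int, int]:
    """Return (interactive, batch) pool sizes for the given sshd settings.

    Assumption: batch_threads is subtracted from threads, mirroring the
    110 - 24 = 86 calculation in the reply above.
    """
    if threads <= 0:
        raise ValueError("threads must be positive")
    if not 0 <= batch_threads < threads:
        raise ValueError("batch_threads must be in the range [0, threads)")
    return threads - batch_threads, batch_threads


# Values taken from the [sshd] section posted in this thread.
interactive, batch = split_ssh_threads(threads=110, batch_threads=24)
print(f"interactive={interactive} batch={batch}")
```

With the posted values this leaves 86 interactive threads, which is in the same range as the 70-80+ connections at which queueing was reported.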

tech....@gmail.com

Jul 18, 2024, 1:55:24 PM
to Repo and Gerrit Discussion
We have both service users and interactive SSH users.
We rely on SSH for repo sync or cloning AOSP code, so HTTPS is not ideal for such activities on our dev and build servers.
With 48 cores, what would be the ideal values of threads and batchThreads to support 130+ connections (interactive plus non-interactive) on each node?

Martin Fick

Jul 18, 2024, 4:37:39 PM
to tech....@gmail.com, Repo and Gerrit Discussion
In my experience, if you are serving AOSP, the number of cores is less of a limiting factor than your RAM. If you can handle more threads memory-wise, then go for it. But more threads generally means more memory utilization, which can be problematic if several long-running clones of your larger repositories run at the same time. So increase your thread count "at your own risk". Only testing and real-world utilization can tell you what works for your setup, but I suspect 110 threads could already put your servers at risk of crashing given just the "wrong" workload.
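
The memory trade-off described above can be sketched as a back-of-envelope budget. This is only an illustration: the function name is hypothetical, and the heap size, baseline usage, and per-clone figure below are placeholder numbers, not measurements from this thread. Real values have to come from testing on your own repositories.

```python
def max_concurrent_heavy_clones(heap_gb: float, baseline_gb: float,
                                per_clone_gb: float) -> int:
    """Estimate how many worst-case clones fit in the heap at once.

    heap_gb:      JVM heap available to Gerrit (placeholder value below)
    baseline_gb:  heap used by caches and other work (placeholder)
    per_clone_gb: peak heap used by one large clone (placeholder)
    """
    if per_clone_gb <= 0:
        raise ValueError("per_clone_gb must be positive")
    available = heap_gb - baseline_gb
    if available <= 0:
        return 0
    return int(available // per_clone_gb)


# Placeholder inputs, chosen only to show the calculation.
limit = max_concurrent_heavy_clones(heap_gb=128, baseline_gb=32, per_clone_gb=1.5)
print(f"worst-case clones that fit: {limit}")
```

If an estimate like this comes out below the configured thread count, a burst of large clones could exhaust the heap before the thread pool is full, which is the risk the reply points to.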
 

        commandStartThreads = 10

None of my testing ever showed this setting to improve performance; I think we always kept it at 2.
 
        maxConnectionsPerUser = 64
[httpd]
        listenUrl = proxy-https://*:8080/
        maxThreads = 800

A maxThreads of 800 seems excessive and could eventually become a problem if many of those threads start doing something memory-intensive.
 
        requestLog = true
        acceptorThreads = 48
        minThreads = 49
        maxQueued = 2000
 
-Martin
