On 24 Jun 2022, at 04:41, Makson Lee <cdle...@gmail.com> wrote:
Recently we upgraded our Gerrit server from 3.5.x to 3.6.1. Since then we need to restart the server every few days; otherwise we get the following error during repo sync:
[2022-06-23T18:23:28.784+08:00] [SSH git-upload-pack /platform/prebuilts/clang/host/linux-x86 (tedwu)] ERROR com.google.gerrit.sshd.BaseCommand : Internal server error (user xxx account xxx) during git-upload-pack '/platform/prebuilts/clang/host/linux-x86'
org.apache.sshd.common.channel.exception.SshChannelClosedException: write(ChannelOutputStream[ChannelSession[id=388, recipient=30]-ServerSessionImpl[xxx@/172.17.100.22:40114]] SSH_MSG_CHANNEL_DATA) len=65520 - channel already closed
--
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en
---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/repo-discuss/433f95ff-4523-4790-9b80-42d7fcb97dabn%40googlegroups.com.
On Friday, June 24, 2022 at 4:00:36 PM UTC+8 lucamilanesio wrote:
This says that the SSH connection has been closed, nothing more.
Do you collect metrics?
Can you check the SSH thread pool utilisation over time?
Can you check your SSH log and see the execution timings?
On 24 Jun 2022, at 09:10, Makson Lee <cdle...@gmail.com> wrote:
Can you give me some instructions about how to collect these?
Do you collect metrics?
See the metrics-reporter plugin at [1].

Can you check the SSH thread pool utilisation over time?
The two metrics:
queue_ssh_batch_worker_active_threads
queue_ssh_interactive_worker_active_threads

Can you check your SSH log and see the execution timings?
See [2] for the details of the sshd_log file data.

HTH
Luca.
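To make the thread pool check concrete, here is a minimal Python sketch that flags samples where either SSH pool is close to exhaustion. It only assumes that the two metrics named above have already been collected somehow; the `Sample` record, the `saturated` helper, and the 0.9 threshold are hypothetical names and values chosen for illustration, not part of Gerrit or the metrics-reporter plugin. The pool sizes are the `threads` and `batchThreads` values from the gerrit.config posted in this thread; how Gerrit splits `sshd.threads` between the two pools should be checked against the documentation for your version.

```python
from dataclasses import dataclass

# Pool sizes from the [sshd] section posted in this thread.
SSH_THREADS = 40        # sshd.threads
SSH_BATCH_THREADS = 4   # sshd.batchThreads


@dataclass
class Sample:
    """One scrape of the two metrics (hypothetical record layout)."""
    timestamp: str
    interactive_active: int  # queue_ssh_interactive_worker_active_threads
    batch_active: int        # queue_ssh_batch_worker_active_threads


def utilisation(active: int, pool_size: int) -> float:
    """Fraction of the pool that is busy."""
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    return active / pool_size


def saturated(samples, interactive_pool, batch_pool, threshold=0.9):
    """Return the samples where either pool is at or above the threshold."""
    return [
        s for s in samples
        if utilisation(s.interactive_active, interactive_pool) >= threshold
        or utilisation(s.batch_active, batch_pool) >= threshold
    ]


if __name__ == "__main__":
    # Made-up sample values, purely to show the call.
    demo = [Sample("10:00", 12, 1), Sample("10:05", 39, 4)]
    # Assumption: the interactive pool has SSH_THREADS workers; adjust this
    # if your Gerrit version carves the batch threads out of sshd.threads.
    for s in saturated(demo, SSH_THREADS, SSH_BATCH_THREADS):
        print("pool nearly exhausted at", s.timestamp)
```

If the active thread count climbs to the pool size and stays there until the restart, that would point towards stuck SSH commands rather than plain client disconnects.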
You may consider using the Gerrit monitoring setup, which gives you a nice collection of Grafana dashboards, including thread pool metrics.
and here is part of our gerrit.config,
[sshd]
rekeyTimeLimit = 0
rekeyBytesLimit = 1099511627776
You could try disabling the chacha20-poly1305 cipher via the sshd.cipher Gerrit configuration, so that one of the AES ciphers with a higher auto-computed rekey limit would be used.
sorry, fixed the wrong cipher setting,
[sshd]
listenAddress = *:29418
threads = 40
batchThreads = 4
rekeyTimeLimit = 0
rekeyBytesLimit = 1099511627776
cipher = -chacha20-poly1305
On Friday, June 24, 2022 at 5:47:13 AM UTC+2 cdle...@gmail.com wrote:
and here is part of our gerrit.config,
[...]
[sshd]
[...]
rekeyTimeLimit = 0
rekeyBytesLimit = 1099511627776

Don't know if this might be part of the problem. You are trying to avoid re-keying (that rekeyBytesLimit is 1TB). Presumably /platform/prebuilts/clang/host/linux-x86 is a large repository with more than 1GB of data transferred?

I'm not sure this is effective; Apache MINA sshd has two more settings: rekeyPacketsLimit (default 2^31), and rekeyBlocksLimit (default 0, which means auto-computed). rekeyBlocksLimit is measured in cipher blocks. For the chacha20-poly1305 cipher chosen by default with Apache MINA sshd 2.8.0 it is auto-computed to re-key every 1GB of data. For other ciphers (like the AES ciphers, which have a block size of 16 bytes) the default comes out as 2^36 bytes (2^32 blocks).

Gerrit doesn't have a property to set the rekeyBlocksLimit. OpenSSH has this to say about it: "ChaCha20 must never reuse a {key, nonce} for encryption nor may it be used to encrypt more than 2^70 bytes under the same {key, nonce}. The SSH Transport protocol (RFC4253) recommends a far more conservative rekeying every 1GB of data sent or received. If this recommendation is followed, then chacha20-poly1305@openssh.com requires no special handling in this area."[2] The "nonce" is the packet sequence number. See also RFC 4344.[3] OpenSSH chooses 1GB for chacha20-poly1305. From the aforementioned statement it appears that a much larger value would be feasible, provided that the rekeyPacketsLimit is not more than 2^31.

I also don't see chacha20-poly1305 mentioned in the Gerrit 3.6.1 docs at [1], even though it is supported by Apache MINA sshd 2.8.0.

You could try disabling the chacha20-poly1305 cipher via the sshd.cipher Gerrit configuration, so that one of the AES ciphers with a higher auto-computed rekey limit would be used.

If the fetch just hangs, as you wrote, it's perhaps because of issue 12758 [4] (which is SSHD-966 [5]), which is indeed about problems with re-keying. It should be fixed in Apache MINA sshd 2.9.0.
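The arithmetic behind these limits can be checked with a short Python sketch. It uses only the figures quoted in this thread (16-byte AES blocks, a 2^32 block limit, 1GB for chacha20-poly1305, and the configured rekeyBytesLimit). The `effective_rekey_bytes` helper is a hypothetical name, and it rests on the assumption, implied by the reasoning above, that whichever limit is reached first triggers the re-key; the packet limit is ignored here for simplicity.

```python
AES_BLOCK_BYTES = 16                    # AES block size, as stated in the thread
AUTO_BLOCKS_LIMIT = 2 ** 32             # auto-computed block limit for AES ciphers
CHACHA_REKEY_BYTES = 2 ** 30            # 1GB auto-computed limit for chacha20-poly1305
CONFIGURED_BYTES_LIMIT = 1099511627776  # rekeyBytesLimit from gerrit.config


def effective_rekey_bytes(bytes_limit: int, blocks_limit_in_bytes: int) -> int:
    """Assumption: the first limit reached wins, so take the smaller one."""
    return min(bytes_limit, blocks_limit_in_bytes)


aes_limit_bytes = AUTO_BLOCKS_LIMIT * AES_BLOCK_BYTES

if __name__ == "__main__":
    print("configured limit is 2^40:", CONFIGURED_BYTES_LIMIT == 2 ** 40)
    print("AES block limit in bytes is 2^36:", aes_limit_bytes == 2 ** 36)
    print("chacha20-poly1305 re-keys after",
          effective_rekey_bytes(CONFIGURED_BYTES_LIMIT, CHACHA_REKEY_BYTES), "bytes")
    print("AES ciphers re-key after",
          effective_rekey_bytes(CONFIGURED_BYTES_LIMIT, aes_limit_bytes), "bytes")
```

Under that assumption the configured 1TB byte limit is never the deciding factor: with chacha20-poly1305 a re-key would still happen after 2^30 bytes, and with an AES cipher after 2^36 bytes.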
On Saturday, June 25, 2022 at 3:56:46 AM UTC+2 cdle...@gmail.com wrote:
sorry, fixed the wrong cipher setting,
[sshd]
listenAddress = *:29418
threads = 40
batchThreads = 4
rekeyTimeLimit = 0
rekeyBytesLimit = 1099511627776
cipher = -chacha20-poly1305

The name of the cipher is "chacha20-poly1305", followed by an "@", followed by "openssh.com". This discussion board mangles this because it looks like an e-mail address.
If you can post a jstack -l -e thread dump of your Gerrit server when such a fetch hangs, I could take a look to see if it might indeed be issue 12758.
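Before posting a dump it can help to get a quick overview of what the SSH threads are doing. The following Python sketch lists the state of each SSH command thread found in a saved jstack dump. It assumes that those threads have names starting with "SSH ", as the thread name in the log line at the top of this thread suggests, and that the dump uses the usual jstack layout of a quoted thread name followed by a `java.lang.Thread.State:` line. The function name and the file name `gerrit-threads.txt` are hypothetical.

```python
import re
from collections import Counter

# A thread header line starts with the quoted thread name.
HEADER = re.compile(r'^"(?P<name>.*)"\s')
# The state line follows the header, indented.
STATE = re.compile(r'^\s+java\.lang\.Thread\.State: (?P<state>\w+)')


def ssh_thread_states(dump_text, prefix="SSH "):
    """Return (thread name, state) pairs for threads whose name starts with prefix."""
    result = []
    current = None
    for line in dump_text.splitlines():
        header = HEADER.match(line)
        if header:
            name = header.group("name")
            # Only remember threads we are interested in.
            current = name if name.startswith(prefix) else None
            continue
        state = STATE.match(line)
        if state and current is not None:
            result.append((current, state.group("state")))
            current = None
    return result


if __name__ == "__main__":
    # Hypothetical file produced with: jstack -l -e <pid> > gerrit-threads.txt
    with open("gerrit-threads.txt", encoding="utf-8") as fh:
        pairs = ssh_thread_states(fh.read())
    for name, state in pairs:
        print(state, name)
    print(Counter(state for _, state in pairs))
```

Many git-upload-pack threads sitting in the same waiting state for a long time would be a hint worth mentioning along with the full dump.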