RFC: sshd.idleTimeout default value != infinity

Gert van Dijk

Aug 25, 2019, 9:34:42 AM
to Repo and Gerrit Discussion
Hi,

I just noticed that the default value of sshd.idleTimeout is 0. With that default, it seems that SSH connections never time out on the server side when a client has not closed them in an orderly fashion (e.g. network change, power failure). I have confirmed this on my installation using gerrit show-connections: connections are still listed days after the client has gone away. Together with the default limit of 64 open connections per user, these stale connections may build up over time until Gerrit becomes unavailable to a user, and it stays that way until the server is restarted.

A similar change for HTTP/Jetty was introduced in stable-2.13 and up, with change 106450 [1] two years ago.

What I'd like to suggest is to change sshd.idleTimeout to a saner default as well, e.g. 30 minutes. I'm not sure what the actual default value should be, but I guess it should be on the order of minutes or hours, because SSH is usually more interactive than HTTP and some use cases rely on idling as a feature (e.g. stream-events).

FWIW, my Linux Git-over-SSH client appears to time out after ~10 minutes in its default configuration.

My proposal is to change the default for Gerrit 3.1 to 30 minutes and announce it as a possible breaking change in the release notes for users relying on the current possibility of idling indefinitely.
WDYT?

Thanks,

Gert
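
For illustration, a minimal sketch of what the proposed value would look like when set explicitly in etc/gerrit.config. The 30-minute figure is only the suggestion from the message above, not an agreed default; a value of 0 keeps the current behaviour of never timing out.

```ini
# etc/gerrit.config (sketch; the value is the one proposed in this thread)
[sshd]
	# Close SSH connections that have been idle for 30 minutes.
	# Use 0 to keep the current "never time out" behaviour.
	idleTimeout = 30 min
```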

Luca Milanesio

Aug 25, 2019, 9:39:25 AM
to Gert van Dijk, Luca Milanesio, Repo and Gerrit Discussion
+1 to that, it's a great idea.

We should get rid of many other "zero timeouts" in Gerrit v3.1, as long as we highlight them properly in the release notes.

Luca.

--
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en

---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/repo-discuss/8af2129a-c113-423b-9265-8567017b0efc%40googlegroups.com.

Gert van Dijk

Sep 21, 2019, 2:05:38 PM
to Repo and Gerrit Discussion
On Sunday, 25 August 2019 15:39:25 UTC+2, lucamilanesio wrote:

> +1 to that, it's a great idea.
>
> We should get rid of many other "zero timeouts" in Gerrit v3.1, as long as we highlight them properly in the release notes.

Almost one month later, I have received 1 positive response and 0 negative responses, 0 suggestions about what the new default should be and no other comments.

Please come forward *now* if you have objections or a better suggestion about the default timeout of 30 minutes. Thanks!

Tracking issue is Issue 11550 [1] & proposed change [2].

Matthias Sohn

Sep 21, 2019, 4:15:45 PM
to Gert van Dijk, Repo and Gerrit Discussion
On Sat, Sep 21, 2019 at 8:05 PM Gert van Dijk <gert...@gmail.com> wrote:
> On Sunday, 25 August 2019 15:39:25 UTC+2, lucamilanesio wrote:
>> +1 to that, it's a great idea.
>>
>> We should get rid of many other "zero timeouts" in Gerrit v3.1, as long as we highlight them properly in the release notes.
>
> Almost one month later, I have received 1 positive response and 0 negative responses, 0 suggestions about what the new default should be and no other comments.
>
> Please come forward *now* if you have objections or a better suggestion about the default timeout of 30 minutes. Thanks!

we use 10min for sshd.idleTimeout and 5 min for receive.timeout [3] and this works for us

 
Gert van Dijk

Sep 21, 2019, 4:22:48 PM
to Matthias Sohn, Repo and Gerrit Discussion
On Sat, Sep 21, 2019 at 10:15 PM Matthias Sohn <matthi...@gmail.com> wrote:
> we use 10min for sshd.idleTimeout and 5 min for receive.timeout [3] and this works for us

Thanks for another confirmation.

(receive.timeout seems to have a default of 4 minutes already.)

Luca Milanesio

Sep 21, 2019, 4:24:24 PM
to Repo and Gerrit Discussion, Luca Milanesio, Gert van Dijk, Matthias Sohn

On 21 Sep 2019, at 21:15, Matthias Sohn <matthi...@gmail.com> wrote:
> On Sat, Sep 21, 2019 at 8:05 PM Gert van Dijk <gert...@gmail.com> wrote:
>> Almost one month later, I have received 1 positive response and 0 negative responses, 0 suggestions about what the new default should be and no other comments.

0 negative responses is a good thing :-)
I don't think anyone can say that "waiting forever" is a good idea!

>> Please come forward *now* if you have objections or a better suggestion about the default timeout of 30 minutes. Thanks!
>
> we use 10min for sshd.idleTimeout and 5 min for receive.timeout [3] and this works for us

My suggestion is to define "good common sense" defaults from v3.1 onwards.

With regards to existing sites, during the migration the "old defaults" will be stored in the gerrit.config, so that the upgrade to v3.1 won't change their settings.
However, for new setups, a "plain vanilla" Gerrit would have finite timeout values.

WDYT as migration path?

Luca.


 

Matthias Sohn

Sep 21, 2019, 4:27:49 PM
to Luca Milanesio, Repo and Gerrit Discussion, Gert van Dijk
On Sat, Sep 21, 2019 at 10:24 PM Luca Milanesio <luca.mi...@gmail.com> wrote:
> My suggestion is to define "good common sense" defaults from v3.1 onwards.
>
> With regards to existing sites, during the migration the "old defaults" will be stored in the gerrit.config, so that the upgrade to v3.1 won't change their settings.
> However, for new setups, a "plain vanilla" Gerrit would have finite timeout values.
>
> WDYT as migration path?

for existing sites we could emit a warning during gerrit init for timeout values explicitly set to 0 

Gert van Dijk

Sep 21, 2019, 4:35:23 PM
to Luca Milanesio, Repo and Gerrit Discussion, Matthias Sohn
On Sat, Sep 21, 2019 at 10:24 PM Luca Milanesio
<luca.mi...@gmail.com> wrote:
> My suggestion is to define "good common sense" defaults from v3.1 onwards.
>
> With regards to existing sites, during the migration the "old defaults" will be stored in the gerrit.config, so that the upgrade to v3.1 won't change their settings.
> However, for new setups, a "plain vanilla" Gerrit would have finite timeout values.
>
> WDYT as migration path?

I feel it's hard to justify making it that complicated.
Moreover, it would mean that 'init' adds the setting explicitly
with a value of 0 to preserve the 'old' default. Admins would then
probably think that this is the new default, or that the setting has
become mandatory, while they are actually overriding a new default
they probably never cared about in the first place...

So, my opinion is: no migration path. If it is set, it stays set; if
it is not set, the new default applies.
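
The two upgrade policies discussed here can be contrasted with a small sketch. This is a toy model only: a plain dict stands in for gerrit.config, the function names are hypothetical, and the 30-minute default is the value proposed in this thread, not Gerrit's actual init code or an agreed default.

```python
from typing import Dict

NEW_DEFAULT_MINUTES = 30  # proposed new default (assumption from this thread)
OLD_DEFAULT_MINUTES = 0   # current default: 0 means "never time out"


def effective_idle_timeout(config: Dict[str, int],
                           default: int = NEW_DEFAULT_MINUTES) -> int:
    """'No migration path': an explicit setting wins, otherwise the default applies."""
    return config.get("sshd.idleTimeout", default)


def migrate_preserving_old_default(config: Dict[str, int]) -> Dict[str, int]:
    """Alternative policy: the upgrade writes the old default explicitly if unset."""
    migrated = dict(config)
    migrated.setdefault("sshd.idleTimeout", OLD_DEFAULT_MINUTES)
    return migrated


if __name__ == "__main__":
    untouched_site = {}  # admin never configured sshd.idleTimeout
    print(effective_idle_timeout(untouched_site))
    print(effective_idle_timeout(migrate_preserving_old_default(untouched_site)))
```

Under the first policy a site that never set the option picks up the new finite timeout; under the second, the upgrade pins such a site to 0 with an explicit entry the admin never asked for.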

Luca Milanesio

Sep 21, 2019, 4:35:24 PM
to Matthias Sohn, Luca Milanesio, Repo and Gerrit Discussion, Gert van Dijk

On 21 Sep 2019, at 21:27, Matthias Sohn <matthi...@gmail.com> wrote:
> for existing sites we could emit a warning during gerrit init for timeout values explicitly set to 0

Yes, that's also a good idea.

Luca Milanesio

Sep 21, 2019, 4:38:33 PM
to Gert van Dijk, Luca Milanesio, Repo and Gerrit Discussion, Matthias Sohn
That is going to break *many* installations :-(
Specifically, people with very slow remote connections will see systematic failures.

Matthias Sohn

Sep 21, 2019, 4:45:27 PM
to Gert van Dijk, Repo and Gerrit Discussion
Here are our settings for the other options which have a default of 0 and hence potentially wait forever:

ldap.readTimeout = 10 s
ldap.connectTimeout = 10 s
sendemail.connectTimeout = 15 s
transfer.timeout = 120 s (the documentation of the option suggests 10-30 s, but the default is 0)
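
Written out as a gerrit.config fragment, the values reported in this thread (including the sshd.idleTimeout and receive.timeout values mentioned earlier) would look roughly like this. It is a sketch of one site's choices, not a recommendation for every installation.

```ini
# etc/gerrit.config (sketch; values as reported in this thread)
[sshd]
	idleTimeout = 10 min
[receive]
	timeout = 5 min
[ldap]
	readTimeout = 10 s
	connectTimeout = 10 s
[sendemail]
	connectTimeout = 15 s
[transfer]
	timeout = 120 s
```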

Gert van Dijk

Sep 21, 2019, 4:45:56 PM
to Luca Milanesio, Repo and Gerrit Discussion, Matthias Sohn
No, they wouldn't? Why would this break any regular behavior? Long
running clones are not idle, only long-silent stream-events are.
Right?

Matthias Sohn

Sep 21, 2019, 4:48:59 PM
to Luca Milanesio, Gert van Dijk, Repo and Gerrit Discussion
but they have this problem already so at least this doesn't make it worse

Luca Milanesio

Sep 21, 2019, 4:50:55 PM
to Gert van Dijk, Luca Milanesio, Repo and Gerrit Discussion, Matthias Sohn
sshd.idleTimeout isn't applied only to the Git/SSH protocol, isn't it?

There are many operations in Gerrit that could potentially wait for many minutes, if not hours.

Matthias Sohn

Sep 21, 2019, 5:14:19 PM
to Luca Milanesio, Gert van Dijk, Repo and Gerrit Discussion
On Sat, Sep 21, 2019 at 10:50 PM Luca Milanesio <luca.mi...@gmail.com> wrote:
> On 21 Sep 2019, at 21:45, Gert van Dijk <gert...@gmail.com> wrote:
>> No, they wouldn't? Why would this break any regular behavior? Long
>> running clones are not idle, only long-silent stream-events are.
>> Right?
>
> sshd.idleTimeout isn't applied only to the Git/SSH protocol, isn't it?

if this is used also for other protocols this should be at least documented,
though I'd tend to call this a bug

for http there is http.idleTimeout

Gert van Dijk

Sep 21, 2019, 5:15:32 PM
to Luca Milanesio, Repo and Gerrit Discussion, Matthias Sohn
On Sat, Sep 21, 2019 at 10:50 PM Luca Milanesio
<luca.mi...@gmail.com> wrote:
> sshd.idleTimeout isn't applied only to the Git/SSH protocol, isn't it?

It's for everything over SSH, yes. That's why I considered stream-events.

> There are many operations in Gerrit that could potentially wait for many minutes, if not hours.

Yes, but that's also limited by sshd.waitTimeout [1], and it has a default
of 30s. So if your operation currently takes longer than 30s, it will
be killed with default settings.

Now, IIUC, it would only cause a problem if you increase the
waitTimeout to > 30 minutes and your operation indeed takes longer
*and* it does not produce any output (because then it's considered
idle for the SSH channel).

That's what I understand of it. If you or someone else thinks I
misunderstand things here, please enlighten me. :-)

[1]: https://gerrit-documentation.storage.googleapis.com/Documentation/3.0.2/config-gerrit.html#sshd.waitTimeout

Luca Milanesio

Sep 21, 2019, 5:18:53 PM
to Matthias Sohn, Luca Milanesio, Gert van Dijk, Repo and Gerrit Discussion

On 21 Sep 2019, at 22:13, Matthias Sohn <matthi...@gmail.com> wrote:
> On Sat, Sep 21, 2019 at 10:50 PM Luca Milanesio <luca.mi...@gmail.com> wrote:
>> sshd.idleTimeout isn't applied only to the Git/SSH protocol, isn't it?
>
> if this is used also for other protocols this should be at least documented,
> though I'd tend to call this a bug

Not really a bug, but I believe it is wrong to assume that sshd.idleTimeout is the timeout of a Git/SSH connection only.
The word "git" isn't in the name of the setting, because *all SSH connections* are affected by that timeout.

In Gerrit, SSH is mainly used to execute Gerrit commands remotely, and most of them are not Git operations.
*ALL* the commands documented at [1] are affected by sshd.idleTimeout.

Some of them may even take hours (e.g. GC).

I was referring to non-Git operations over SSH.
Apologies for the misunderstanding.

Luca.

Luca Milanesio

Sep 21, 2019, 5:21:59 PM
to Gert van Dijk, Luca Milanesio, Repo and Gerrit Discussion, Matthias Sohn

On 21 Sep 2019, at 22:15, Gert van Dijk <gert...@gmail.com> wrote:
> On Sat, Sep 21, 2019 at 10:50 PM Luca Milanesio <luca.mi...@gmail.com> wrote:
>> sshd.idleTimeout isn't applied only to the Git/SSH protocol, isn't it?
>
> It's for everything over SSH, yes. That's why i considered stream-events.
>
>> There are many operations in Gerrit that could potentially wait for many minutes, if not hours.
>
> Yes, but that's also limited by sshd.waitTimeout, and it has a default
> of 30s. So if your operation takes longer than 30s currently, it will
> be killed with default settings.
>
> Now, IIUC, it would only cause a problem if you increase the
> waitTimeout to > 30 minutes and your operation takes indeed longer
> *and* it does not produce any output (because then it's considered
> idle for the SSH channel).

An operation could start within 30s (thus satisfying the wait timeout) but then not produce any output for an hour
(e.g. GC again, and also stream-events).


> That's what I understand of it. If you or someone else thinks I
> misunderstand things here, please enlighten me. :-)
>
> [1]: https://gerrit-documentation.storage.googleapis.com/Documentation/3.0.2/config-gerrit.html#sshd.waitTimeout

The documentation is not accurate enough, we need to make it more explicit.
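
The scenario debated above can be made concrete with a toy model. It assumes, as suggested in this thread but not verified against Gerrit's SSH implementation, that any traffic on the connection (command output or a client keepalive) resets the idle timer and that a timeout of 0 disables it. The function name and the timestamps are hypothetical.

```python
from typing import Iterable, Optional


def idle_close_time(activity_times: Iterable[float],
                    idle_timeout: float,
                    session_end: float) -> Optional[float]:
    """Return when the server would drop the connection for idleness, or None.

    Toy model: the idle timer starts at t=0 and every timestamp in
    activity_times (seconds) resets it. idle_timeout <= 0 means "never".
    """
    if idle_timeout <= 0:
        return None
    last_activity = 0.0
    for t in sorted(activity_times):
        if t > session_end:
            break
        if t - last_activity > idle_timeout:
            return last_activity + idle_timeout
        last_activity = t
    # Check the silent stretch between the last activity and the session end.
    if session_end - last_activity > idle_timeout:
        return last_activity + idle_timeout
    return None


if __name__ == "__main__":
    one_hour = 3600
    thirty_min = 1800
    # A command that stays silent for an hour is cut off after 30 minutes.
    print(idle_close_time([], thirty_min, one_hour))
    # The same command with traffic every 60 s survives.
    print(idle_close_time(range(60, one_hour, 60), thirty_min, one_hour))
    # With the current default of 0 nothing is ever closed.
    print(idle_close_time([], 0, one_hour))
```

In this model a long-running but silent command is affected by a finite sshd.idleTimeout, whereas a connection that carries any periodic traffic is not.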

Gert van Dijk

Sep 21, 2019, 5:37:15 PM
to Luca Milanesio, Repo and Gerrit Discussion, Matthias Sohn
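On Sat, Sep 21, 2019 at 11:21 PM Luca Milanesio wrote:
> An operation could start within 30s (thus satisfying the wait timeout) but then not produce any output for an hour.
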
Hmm, if true, that sounds a bit bad for even a regular case. Thanks
for bringing up these concerns!

I see two options:

- Document that clients should set ServerAliveInterval [1] in their
SSH client configuration when invoking possibly long-running commands,
which IIUC, should avoid hitting the idleTimeout from the server in
any case by periodically pinging on the SSH channel. At least
stream-events could be fixed with that. I'll try this out.
- Make long-running Gerrit commands show some output periodically (or
find a way to send a silent no-op SSH message regularly, like the
ServerAliveInterval messages, that would also instafix long silent
stream-events).

This is because we only want to kill the actual network-disconnected
clients, and not interfere with other idle scenarios.

(Even with this info, I think those arguments do not outweigh the
case for changing the default of sshd.idleTimeout from 0.)

[1]: From Ubuntu ssh_config(5) manpage:
Sets a timeout interval in seconds after which if no data has been
received from the server, ssh(1) will send a message through the
encrypted channel to request a response from the server. The default
is 0, indicating that these messages will not be sent to the server,
or 300 if the BatchMode option is set (Debian-specific).
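
A minimal sketch of the first option as a client-side SSH configuration entry. The host name and the 60-second interval are illustrative assumptions; 29418 is Gerrit's default SSH port. Whether these keepalive messages actually reset the server-side idle timer is exactly what still needs to be tried out.

```text
# ~/.ssh/config (sketch; host name and interval are hypothetical)
Host gerrit.example.com
    Port 29418
    # Ask the server for a response after 60 s without incoming data.
    ServerAliveInterval 60
    # Give up after 3 unanswered keepalive messages (the OpenSSH default).
    ServerAliveCountMax 3
```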