Host names hashing


Dmitry Belyavskiy

Jan 5, 2022, 7:01:53 AM
to OpenSSH Devel List
Dear colleagues,

OpenSSH uses SHA1 without any alternate options for hostname hashing (it
looks like this is the last place where an alternative to SHA1 is not
available). SHA1 HMAC is considered safe enough for now, but that may
change, so it's definitely worth migrating to safer algorithms (SHA2?).
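
For readers unfamiliar with the format under discussion, here is a minimal sketch of how a hashed known_hosts host field is built: the marker `|1|`, a base64 salt, and the base64 HMAC-SHA1 of the hostname keyed with that salt. The function names `hash_hostname` and `matches` are illustrative only and are not OpenSSH identifiers.

```python
import base64
import hashlib
import hmac
import os
from typing import Optional


def hash_hostname(hostname: str, salt: Optional[bytes] = None) -> str:
    """Build a known_hosts-style hashed host field: |1|<salt>|<hmac>."""
    if salt is None:
        salt = os.urandom(20)  # same length as a SHA-1 digest
    # The salt is used as the HMAC key, the hostname as the message.
    digest = hmac.new(salt, hostname.encode("utf-8"), hashlib.sha1).digest()
    return "|1|{}|{}".format(
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    )


def matches(hashed_field: str, hostname: str) -> bool:
    """Check whether a hostname corresponds to a hashed host field."""
    parts = hashed_field.split("|")
    if len(parts) != 4 or parts[1] != "1":
        return False
    salt = base64.b64decode(parts[2])
    expected = base64.b64decode(parts[3])
    actual = hmac.new(salt, hostname.encode("utf-8"), hashlib.sha1).digest()
    return hmac.compare_digest(actual, expected)
```

Because the salt is stored next to the digest, anyone holding the file can run `matches` against any name they care to guess.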

I'd like to discuss possible options of such migration.

Many thanks!
--
Dmitry Belyavskiy
_______________________________________________
openssh-unix-dev mailing list
openssh-...@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev

Blumenthal, Uri - 0553 - MITLL

Jan 5, 2022, 9:10:01 AM
to Dmitry Belyavskiy, OpenSSH Devel List
What are the cryptographic consequences of a host name hash collision?

My point is - the only reason to consider replacing the algorithm here would be to avoid carrying around another hash that is not usable cryptographically.

Regards,
Uri

> On Jan 5, 2022, at 07:05, Dmitry Belyavskiy <dbel...@redhat.com> wrote:
>
> Dear colleagues,

Dmitry Belyavskiy

Jan 5, 2022, 9:33:21 AM
to OpenSSH Devel List
Dear Uri,

I don't think collisions are what we are trying to protect against;
the concern is disclosure of the host names.

Jochen Bern

Jan 5, 2022, 10:37:36 AM
to openssh-...@mindrot.org
On 05.01.22 15:06, Blumenthal, Uri - 0553 - MITLL wrote:
> What are the cryptographic consequences of a host name hash collision?

None, I would guess. Someone crafting a server name/IP that causes a
hash collision with an existing hashed known_hosts entry is likely to be
a nuisance to, or a source of confusion for, the user (ssh will complain
about the existing entry for a different host keypair), but not, per se,
a failure of security.

The hashing is meant to obscure info about what host it matches, so the
relevant failure mode is if the hash algo would become *reversible*.

One question about the real-world use cases, however: How many people
are there hashing known_hosts entries *autogenerated by client use out
of the very same account*? If I were to find that someone got ahold of
the known_hosts file off my workplace machine, I'd be worried about
disclosure of my *user keypairs*, while the info about the servers I
access would probably be a lost cause, anyway (that info is available in
cleartext in the config file(s), and/or the saved shell history).

A much more understandable use case would be if an organization
distributes a centrally administered/collected known_hosts file to users
but doesn't want marginally-trusted users to immediately see where to
find the company LAN's crown jewels. However, in order to make a
successful algo migration in *that* scenario, we'd have to address the
possibility that users and the central repository could end up
supporting *different* sets of algos for some length of time ...

Regards,
--
Jochen Bern
Systemingenieur

Binect GmbH

Damien Miller

Jan 6, 2022, 12:40:38 AM
to Dmitry Belyavskiy, OpenSSH Devel List
On Wed, 5 Jan 2022, Dmitry Belyavskiy wrote:

> Dear colleagues,
>
> OpenSSH uses SHA1 without any alternate options for hostname hashing (looks
> like this is the last place where an alternate option for SHA1 is not
> available). SHA1 HMAC is considered safe enough for now, but it may change
> so it's definitely worth migrating to more safe algorithms (SHA2?).
>
> I'd like to discuss possible options of such migration.

I'd prefer to remove hostname hashing. It's a pointless obscurity
measure, and the most it can ever offer is protection against casual
shoulder-surfing disclosure[*]

I wish I never added it. I consider it the most stupid thing I've ever
done to OpenSSH :(

As far as what a concrete migration plan would look like, maybe something
like:

1) Add an ObscureKnownHostnames option that, instead of hashing, simply
   base64-encodes the hostnames. This provides the same level of
   protection as the current option. Recommend this instead of
   HashKnownHosts in the manual.

2) (later) Add a deprecation warning to HashKnownHosts

3) (later still) Remove the HashKnownHosts option (or make it an alias
   to ObscureKnownHostnames)

4) (later again) Warn when known_hosts contains a hashed hostname

5) (finally) rip out the hostname hashing code entirely.
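
Step 1 of the plan above could look roughly like the sketch below. ObscureKnownHostnames is only a proposal in this thread, so the on-disk representation is an assumption, and the helper names `obscure` and `reveal` are hypothetical.

```python
import base64


def obscure(hostname: str) -> str:
    """Base64-encode a hostname so it is not readable at a glance."""
    return base64.b64encode(hostname.encode("utf-8")).decode("ascii")


def reveal(token: str) -> str:
    """Decode an obscured hostname; no secret is needed to do this."""
    return base64.b64decode(token).decode("utf-8")
```

Unlike hashing, this is trivially reversible by design, so the file remains editable with ordinary tools while still hiding names from a casual glance.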

-d

[*] Any real adversary will be able to reverse the hash via a dictionary
or brute-force, since the entropy of hostnames is so small.
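
A minimal sketch of such a dictionary attack, assuming the `|1|<salt>|<hmac>` host field format and an attacker-supplied list of candidate names. The function name `recover_hostname` is illustrative.

```python
import base64
import hashlib
import hmac
from typing import Iterable, Optional


def recover_hostname(hashed_field: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate whose HMAC-SHA1 matches the hashed field."""
    parts = hashed_field.split("|")
    if len(parts) != 4 or parts[1] != "1":
        return None
    salt = base64.b64decode(parts[2])    # stored in the clear in the file
    target = base64.b64decode(parts[3])
    for name in candidates:
        guess = hmac.new(salt, name.encode("utf-8"), hashlib.sha1).digest()
        if hmac.compare_digest(guess, target):
            return name
    return None
```

Each guess costs a single HMAC-SHA1 computation, so word lists built from DNS zones, common service names, or private address ranges can be tested very quickly.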

This cannot be fixed using usual methods (e.g. using a deliberately-slow
KDF like bcrypt or scrypt instead of a hash) because ssh needs to
consider every name in known_hosts and any KDF slow enough to present
a problem for an attacker will be far too slow to be invoked for every
name in the file, every time the user runs ssh.
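
The trade-off can be put into rough numbers. The sketch below is plain arithmetic under assumed figures (file size, per-hash cost); none of the values are measurements.

```python
def lookup_seconds(entries: int, seconds_per_hash: float) -> float:
    """Worst-case cost for one ssh invocation: every entry has its own
    salt, so the hostname must be hashed once per known_hosts line."""
    return entries * seconds_per_hash


def attack_seconds(candidates: int, entries: int, seconds_per_hash: float) -> float:
    """Cost of a dictionary attack: each candidate name must be hashed
    against each entry's salt."""
    return candidates * entries * seconds_per_hash


# Assumed figures: 500 hashed entries, a KDF tuned to 0.1 s per call.
per_login = lookup_seconds(500, 0.1)
```

With these assumed figures the user would wait on the order of 50 seconds on every connection, which is why a KDF slow enough to hinder an attacker is impractical here.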

Nico Kadel-Garcia

Jan 6, 2022, 1:10:04 AM
to Jochen Bern, OpenSSH Devel List
On Wed, Jan 5, 2022 at 10:38 AM Jochen Bern <Joche...@binect.de> wrote:

> The hashing is meant to obscure info about what host it matches, so the
> relevant failure mode is if the hash algo would become *reversible*.

And normally, it's the opposite of helpful. The known_hosts is useful
for casual review and for tuning .ssh/config as desired for more
specific uses, and the hashing obscures the commonly used SSH targets.

> One question about the real-world use cases, however: How many people
> are there hashing known_hosts entries *autogenerated by client use out
> of the very same account*? If I were to find that someone got ahold of

Many simply turn off known_hosts, because it is a burden to approve
new hostkeys and very awkward if there is an IP address overlap in a
DHCP or externally configured target. The .ssh/config settings for
this are quite old, and well documented:

Host *
    UserKnownHostsFile /dev/null
    StrictHostKeyChecking no
    LogLevel ERROR

> A much more understandable use case would be if an organization
> distributes a centrally administered/collected known_hosts file to users
> but doesn't want marginally-trusted users to immediately see where to
> find the company LAN's crown jewels. However, in order to make a
> successful algo migration in *that* scenario, we'd have to address the
> possibility that users and the central repository could end up
> supporting *different* sets of algos for some length of time ...

The burden is why I disable it in setups like, say, ansible server
setups to control wide varieties of VMs in a multi-purpose VLAN.

Brian Candler

Jan 6, 2022, 3:09:01 AM
to OpenSSH Devel List
On 06/01/2022 06:08, Nico Kadel-Garcia wrote:
> On Wed, Jan 5, 2022 at 10:38 AM Jochen Bern <Joche...@binect.de> wrote:
>
>> The hashing is meant to obscure info about what host it matches, so the
>> relevant failure mode is if the hash algo would become *reversible*.
> And normally, it's the opposite of helpful. The known_hosts is useful
> for casual review and for tuning .ssh/config as desired for more
> specific uses, and the hashing obscures the commonly used SSH targets.
>
I agree. I find HashKnownHosts annoying, and I always turn it off when I
remember to do so.  Typically this happens when I need to trim some
entries from known_hosts, and then I find it has been hashing it up to
the current point in time.

Of course, I shouldn't have to turn it off, because the default is
'no'. I guess many distros set 'HashKnownHosts yes' in
/etc/ssh/ssh_config because they want to be seen to choose the "secure"
option by default. However the threat model seems pretty pointless to
me.  If an attacker has access to my account to the level that they can
read my known_hosts file, then I have far worse problems than them
seeing a list of hostnames, which they can obtain in many other ways.

Should I care about other system users reading this info, there's always
chmod 700 (on the .ssh directory, or my whole home directory).  If
known_hosts itself were created mode 600 by default, I wouldn't object.

mark dominik bürkle

Jan 6, 2022, 7:56:59 AM
to openssh-...@mindrot.org, Nico Kadel-Garcia, Jochen Bern
hello all,

besides accessing "same" ips in a vlan env i see two more possibilities that might be in widespread use:
- vlan env
- administering home office (or friends') pcs
- customers accessed via (multiple) vpn

most of these will have different gateway ips. (or just different interfaces?)
so, for these users, finding the gw (eg via "ip route get <target>" as shell cmd) and combining this with the hostname/ip for the known_hosts lookup might be helpful.
with an option like
KnownHostsUseGw <host_list>
the known_host_entry might then be extended like
<known_host_entry> ":via_" <gw>
or
<known_host_entry> ":via_" <device>

[this may lead to problems with older implementations by ":<port>" confusion,
to be considered before implementation.]

accepting already existing matching entries and/or "unique" hosts:
additionally to searching "...:via_<gw>", if not found, check "<host_without_gw>"
or add option KnownHostsUseGwFallback (bool / pattern / behaviour_algo / external_cmd)?
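
a rough sketch of the lookup order this would imply, assuming the ":via_" suffix proposed above. KnownHostsUseGw and KnownHostsUseGwFallback do not exist in OpenSSH, and the function name `lookup_keys` is hypothetical.

```python
from typing import List, Optional


def lookup_keys(entry: str, gateway: Optional[str], fallback: bool = True) -> List[str]:
    """Return the known_hosts names to try, most specific first."""
    keys = []
    if gateway:
        # Proposed extended form: <known_host_entry>:via_<gw>
        keys.append("{}:via_{}".format(entry, gateway))
    if fallback or not gateway:
        # Plain entry, as written by existing implementations.
        keys.append(entry)
    return keys
```

with the fallback enabled, an existing plain entry would still be accepted when no gateway-specific entry is found.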

feasible, justified, any opinions?

[side note: in combination with ansible (or other mgmt software) the installation process should be changed to record the correct (new or replaced) host keys, which are then retrieved, and verification should NOT be turned off. this might involve using ssh-keygen; complex setups should have the appropriate "gw" information.]

thank you,
mdbuerkle

--
This message was sent from my Android device with K-9 Mail.

Brian Candler

Jan 6, 2022, 8:28:35 AM
to mark dominik bürkle, openssh-...@mindrot.org
On 06/01/2022 12:54, mark dominik bürkle wrote:
> besides accessing "same" ips in a vlan env i see two more possibilities that might be in widespread use:
> - vlan env
> - administering home office (or friends') pcs
> - customers accessed via (multiple) vpn
>
> most of these will have different gateway ips. (or just different interfaces?)
> so, for these users, finding the gw (eg via "ip route get <target>" as shell cmd) and combining this with the hostname/ip for the known_hosts lookup might be helpful.
> with an option like
> KnownHostsUseGw <host_list>
> the known_host_entry might then be extended like
> <known_host_entry> ":via_" <gw>
> or
> <known_host_entry> ":via_" <device>

You haven't explicitly said what problem you're trying to solve. Is it
that two different networks you use both have a host 192.168.1.123, and
these are colliding in known_hosts?  I don't really see how the gateway
comes into this; you could have two different 192.168.1.0/24 networks
both with gateway 192.168.1.1, and you may be connected directly to the LAN.

There are several solutions to this, but in any case you should be
accessing each target with a distinct name (because "ssh 192.168.1.123"
can't tell the difference between the two 192.168.1.123 hosts).

If you have names that resolve in /etc/hosts or DNS under a shared
domain, you could do this in ~/.ssh/config:

Host *.myfriend.local
    UserKnownHostsFile ~/.ssh/known_hosts_myfriend ~/.ssh/known_hosts

Or you can make explicit entries for individual hosts (which is useful
to give them shortcut names anyway):

# My friend's machines
Host foo
    Hostname 192.168.1.123
    UserKnownHostsFile ~/.ssh/known_hosts_myfriend

Host bar
    Hostname 192.168.1.124
    UserKnownHostsFile ~/.ssh/known_hosts_myfriend

# Work machines
Host qux
    Hostname 192.168.1.123
    UserKnownHostsFile ~/.ssh/known_hosts_work

Recent versions of ssh also support "KnownHostsCommand" which can
implement more sophisticated logic of your choosing, for retrieving the
expected host keys for a given host.
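
As a rough illustration, a KnownHostsCommand helper only needs to print host key lines in known_hosts format on standard output. The sketch below assumes it is invoked with the hostname as its single argument (for example via the %H token described in ssh_config(5)); the script path, the `HOST_KEYS` table and its placeholder key strings are all hypothetical.

```python
#!/usr/bin/env python3
import sys
from typing import Dict, List

# Hypothetical lookup table; the key material is a placeholder, not a real key.
HOST_KEYS: Dict[str, List[str]] = {
    "foo": ["ssh-ed25519 <base64-public-key-for-foo>"],
    "qux": ["ssh-ed25519 <base64-public-key-for-qux>"],
}


def lines_for(host: str, table: Dict[str, List[str]]) -> List[str]:
    """Format known_hosts lines ("<host> <keytype> <key>") for one host."""
    return ["{} {}".format(host, key) for key in table.get(host, [])]


def main(argv: List[str]) -> None:
    if len(argv) != 2:
        return  # nothing to look up; print no keys
    for line in lines_for(argv[1], HOST_KEYS):
        print(line)


if __name__ == "__main__":
    main(sys.argv)
```

It could then be referenced from ~/.ssh/config with something like "KnownHostsCommand /path/to/hostkeys.py %H", where the path is again only an example.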

HTH,

Brian.

Ángel

Jan 6, 2022, 4:44:15 PM
to openssh-...@mindrot.org
On 2022-01-06 at 16:37 +1100, Damien Miller wrote:
> I'd prefer to remove hostname hashing. It's a pointless obscurity
> measure, and the most it can ever offer is protection against casual
> shoulder-surfing disclosure[*]
>
> I wish I never added it. I consider it the most stupid thing I've
> ever done to OpenSSH :(
>
> As far as what a concrete migration plan would look like, maybe
> something like:
>
> 1) Add an ObscureKnownHostnames option that, instead of hashing, simply
>    base64-encodes the hostnames. This provides the same level of
>    protection as the current option. Recommend this instead of
>    HashKnownHosts in the manual.
>
> 2) (later) Add a deprecation warning to HashKnownHosts
>
> 3) (later still) Remove the HashKnownHosts option (or make it an alias
>    to ObscureKnownHostnames)
>
> 4) (later again) Warn when known_hosts contains a hashed hostname
>
> 5) (finally) rip out the hostname hashing code entirely.
>
> -d


You should have an intermediate step where hashed hosts get converted
to base64 ones when the user connects to them. I'm sure someone would
complain ("How does it dare «decrypt» it?"), but "losing" the server
fingerprint, thus forcing the user to either verify the fingerprint from
a known source (probably not available) or allow a MITM, would be worse.
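
A minimal sketch of that intermediate step, assuming the current `|1|<salt>|<hmac>` field format. The `|b64|` marker for converted entries is purely hypothetical (no such format exists in OpenSSH), as are the function names.

```python
import base64
import hashlib
import hmac

B64_PREFIX = "|b64|"  # hypothetical marker for base64-obscured names


def matches_hashed(host_field: str, hostname: str) -> bool:
    """True if host_field is a |1|salt|hmac entry for this hostname."""
    parts = host_field.split("|")
    if len(parts) != 4 or parts[1] != "1":
        return False
    salt = base64.b64decode(parts[2])
    expected = base64.b64decode(parts[3])
    actual = hmac.new(salt, hostname.encode("utf-8"), hashlib.sha1).digest()
    return hmac.compare_digest(actual, expected)


def upgrade_line(line: str, connected_host: str) -> str:
    """Rewrite a matching hashed line; leave every other line untouched."""
    host_field, sep, rest = line.partition(" ")
    if not sep or not matches_hashed(host_field, connected_host):
        return line
    encoded = base64.b64encode(connected_host.encode("utf-8")).decode("ascii")
    return B64_PREFIX + encoded + " " + rest
```

The conversion is only possible at connection time, because that is the one moment the client knows which plain hostname a hashed entry belongs to; the key material itself is carried over unchanged.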


Still, I don't much like these two options for deprecating
HashKnownHosts.

I would suggest:

- Add ObscureKnownHostnames option with values sha1 / base64 / no
(None?)
- Make HashKnownHosts a deprecated alias for ObscureKnownHostnames
- Make the value "yes" equivalent to "sha1"

- (Later) Change "yes" to mean "base64"
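
The value handling suggested above could be sketched as follows. ObscureKnownHostnames is only a proposal, so the accepted values and the `resolve_mode` helper are assumptions made for illustration.

```python
def resolve_mode(value: str, yes_means: str = "sha1") -> str:
    """Map a HashKnownHosts / ObscureKnownHostnames value to a mode.

    yes_means is the meaning of "yes": "sha1" at first, and "base64"
    after the later switch described in the list above.
    """
    v = value.strip().lower()
    if v == "yes":
        return yes_means
    if v in ("no", "none"):
        return "no"
    if v in ("sha1", "base64"):
        return v
    raise ValueError("unsupported value: {!r}".format(value))
```

Keeping "yes" as an indirection means existing configurations keep working while the default obscuring method changes underneath them.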


Optionally, the conversion might be implicit, in that hosts stored in a
non-preferred obscured format get automatically upgraded to the new one.


Regards