Maximum number of keys in 'authorized_keys' ?


assaf...@gmail.com

Oct 15, 2014, 1:31:03 PM
to gito...@googlegroups.com
Hello,

I've seen that 'gitolite' has been successfully deployed on setups with a large number of projects/repositories.

I'd like to ask if there's any practical limit (or known problems) with a large number of keys, managed through gitolite + 'authorized_keys'.

I know that technically this is an SSH server issue, but I'm hoping people in this forum have real-world experience with such setups.

I'm looking at ~4K repositories, and ~10K keys (that is: ~10K 'pub' files in "keydir").

Would such a setup work? (and work efficiently?)
The number of keys is expected to grow, perhaps even to 20K - would that present a practical problem?

I'm asking about both the gitolite administration POV,
and from the user POV when running 'git clone' or 'git push'.

Thank you for all feedback and comments,
- Gordon

Sitaram Chamarty

Oct 15, 2014, 8:49:22 PM
to assaf...@gmail.com, gito...@googlegroups.com
On 10/15/2014 11:01 PM, assaf...@gmail.com wrote:
> Hello,
>
> I've seen that 'gitolite' has been successfully deployed on setups
> with a large number of projects/repositories.
>
> I'd like to ask if there's any practical limit (or known problems)
> with a large number of keys, managed through gitolite +
> 'authorized_keys'.

Once you cross about 5000 keys the people whose keys are at the end of
the authkeys file will see some slowdown, like a second or so.

I usually call this the "ssh linear scan bottleneck" or some such
phrase.

> I know that technically this is an SSH server issue, but I'm hoping
> people in this forum have real-world experience with such setups.

So far, no one has complained :)

> I'm looking at ~4K repositories, and ~10K keys (that is: ~10K 'pub'
> files in "keydir").

Benchmark it with 10k keys on your kit. You don't need to do this on a
gitolite account, and you don't have to create 10k keys. Create 5 keys,
replicate key 1 10000 times, then insert key 2 at line 2500, key 3 at
5000, key 4 at 7500, and key 5 at 10000.

Then run "ssh ... pwd" and time it for each key. For best results, time
it on the server also. I suspect on a busy server where the file is
already in cache this won't matter so much.
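
A minimal sketch of that benchmark setup in Python, for anyone who wants to script it. It is not part of gitolite. The function names are made up for illustration, and it assumes the probe keys overwrite filler lines (rather than shifting them) so the file stays at exactly 10000 lines.

```python
import subprocess
import time


def build_benchmark_authkeys(filler_key, probe_keys, total=10000):
    """Return `total` authorized_keys lines: mostly `filler_key`, with the
    probe keys at evenly spaced 1-based positions (2500, 5000, 7500, 10000
    for four probes). Assumption: probes overwrite filler lines."""
    lines = [filler_key] * total
    if not probe_keys:
        return lines
    step = total // len(probe_keys)
    for i, key in enumerate(probe_keys, start=1):
        lines[i * step - 1] = key  # 1-based line number i * step
    return lines


def time_login(identity_file, target, command="pwd"):
    """Time one `ssh -i identity_file target command` run, in seconds.
    IdentitiesOnly=yes stops ssh from offering other keys first."""
    start = time.perf_counter()
    subprocess.run(
        ["ssh", "-i", identity_file, "-o", "IdentitiesOnly=yes", target, command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # a failing remote command still exercises the key lookup
    )
    return time.perf_counter() - start
```

Write "\n".join(lines) to the test account's ~/.ssh/authorized_keys, then call time_login once per private key (key 1 measures the first-line case) and compare the numbers.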

> Would such a setup work? (and work efficiently?)
> The number of keys is expected to grow, perhaps even to 20K - would
> that present a practical problem?

You'd need to benchmark it as I indicated above but as things stand
right now it will be an issue.

There are two possible mitigations to this, both referenced in [1]
below, and both require openssh 6.2 at least.

This version of ssh has a feature whereby ssh will run a command in
order to get the list of authorized keys, falling back to
~/.ssh/authorized_keys if the list does not contain the user's offered
key.

1. We use that feature to create a "mru" list of users so that only
the few thousand or so who are active will be in it.

2. The second solution is to patch ssh -- patch is in that email.
Jason Donenfeld sent that patch to the ssh folks but I suppose they
didn't take it. Once this patch was applied, your
"AuthorizedKeysCommand" program would simply look up the right
pubkey based on the received fingerprint (trivial to maintain a
database of such) and offer only that key.

That makes it constant time access for all the users, all the time.
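
A rough sketch of the fingerprint database idea in Python. It assumes the patched sshd hands the offered key's fingerprint to the lookup program, which stock sshd at this point does not do. The function names are hypothetical, and the "SHA256:" fingerprint form is the one newer OpenSSH releases print; older releases printed MD5-based fingerprints instead.

```python
import base64
import hashlib


def fingerprint(pubkey_line):
    """SHA256 fingerprint of a 'type base64-blob [comment]' public key line,
    formatted as 'SHA256:<unpadded base64 digest>'."""
    blob = base64.b64decode(pubkey_line.split()[1])
    digest = hashlib.sha256(blob).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def build_index(entries):
    """entries: iterable of (pubkey_line, authkeys_line) pairs.
    Returns a dict mapping fingerprint -> the one authkeys line to offer."""
    return {fingerprint(pubkey): line for pubkey, line in entries}
```

A lookup is then index.get(received_fingerprint), a single hash-table access no matter how many keys are registered.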

For the moment I'd go with solution 1. If you need help with that I'd
be happy to write the code, or if you write it please send it in and
we'll put it in contrib.

> I'm asking about both the gitolite administration POV,
> and from the user POV when running 'git clone' or 'git push'.

I've only been speaking from the "user" pov above.

For the admin, well since gitolite allows you to put keys in
subdirectories, I suppose that helps, not having 20k keys in one
directory.

Every gitolite-admin push will result in all the keys being
re-processed. If your admins make far more frequent changes to access
rights than key changes, that's a lot of waste. I'd be happy to look at
optimising that, so gitolite does not do that needlessly (i.e., not
recreate the authkeys file when the push did not touch "keydir").

Actually I might do that anyway; it looks like it can't be more than a
couple lines of code...

regards
sitaram

[1]: https://groups.google.com/forum/#!searchin/gitolite/jason$20donenfeld/gitolite/ya-YVHi_YYg/wQwKO1izBZwJ

David Pierce

Oct 15, 2014, 8:58:13 PM
to gito...@googlegroups.com
Simple answer with Source Code here:

Basically, it comes down to "no limit." But the system searches the authorized_keys file with a loop, line by line, so the longer the file, the longer logins will take.

-Dave




Assaf Gordon

Oct 15, 2014, 10:07:25 PM
to Sitaram Chamarty, gito...@googlegroups.com
Hello Sitaram, David,

Thank you for the detailed and helpful replies.

On 10/15/2014 08:49 PM, Sitaram Chamarty wrote:
> On 10/15/2014 11:01 PM, assaf...@gmail.com wrote:
>
>> I'm looking at ~4K repositories, and ~10K keys (that is: ~10K 'pub'
>> files in "keydir").
>
> Benchmark it with 10k keys on your kit.

Anecdotal results:
With ~8K pubkeys, running "ssh gitolite3@server pwd" takes about 1 to 2 seconds to log in, once the cache is hot ('pwd' fails of course, but the SSH part is already done by that point, I presume).
Committing a gitolite configuration with ~8K pubkeys takes about 15 seconds to do the "post compile".
All done on a small 1-CPU/512-MB VM, lightly loaded.
So, not too bad, even in this non-ideal state.

> There are two possible mitigations to this, both referenced in [1]
> below, and both require openssh 6.2 at least.
>
> This version of ssh has a feature whereby ssh will run a command in
> order to get the list of authorized keys, falling back to
> ~/.ssh/authorized_keys if the list does not contain the user's offered
> key.
>
> 1. We use that feature to create a "mru" list of users so that only
> about a few thousand who are active will be in it.
>
> For the moment I'd go with solution 1. If you need help with that I'd
> be happy to write the code, or if you write it please send it in and
> we'll put it in contrib.
>

Interesting idea. My approach would be to use REDIS as cache server (I'm already using it with another trigger).

But I have a technical question:
1. The trigger will get "GL_USER", which is only the user name.
2. The "AuthorizedKeysCommand" script needs to return not only the SSH pubkey, but also the correctly formatted "command=/path/to/gitolite-shell USER" part, is that correct?

So does the trigger need to re-use parts of 'optionise' from '/usr/share/gitolite3/triggers/post-compile/ssh-authkeys'?
Or what would be a reliable way, given a username, to find the correct command from the "authorized_keys" file?
Would you recommend a simple "grep"?
Also, a user might have multiple keys; would the cache need to keep them all?


Thanks,
-gordon



Dirk Heinrichs

Oct 16, 2014, 3:29:58 AM
to gito...@googlegroups.com
On Wednesday, 15 October 2014, 10:31:03, assaf...@gmail.com wrote:

> I'm asking about both the gitolite administration POV,
> and from the user POV when running 'git clone' or 'git push'.

If you are concerned about possible performance impacts, you could also let
your users use ecdsa or ed25519 keys, which are way shorter than the
"traditional" rsa or dsa keys. Of course, this only works in a setup where
only recent versions of OpenSSH are involved (PuTTY has no support for these
key types).

HTH...

Dirk

Sitaram Chamarty

Oct 16, 2014, 2:00:03 PM
to Assaf Gordon, gito...@googlegroups.com
On 10/16/2014 07:37 AM, Assaf Gordon wrote:
> Hello Sitaram, David,
>
> Thank you for the detailed and helpful replies.
>
> On 10/15/2014 08:49 PM, Sitaram Chamarty wrote:
>> On 10/15/2014 11:01 PM, assaf...@gmail.com wrote:
>>
>>> I'm looking at ~4K repositories, and ~10K keys (that is: ~10K 'pub'
>>> files in "keydir").
>>
>> Benchmark it with 10k keys on your kit.
>
> Anecdotal results:
> With ~8K pubkeys, running "ssh gitolite3@server pwd" takes about 1 to 2 seconds to login, once the cache is hot ('pwd' fails of course, but the SSH part is already done by that point I presume).

That'll get annoying for people who work on more than one project or
are "integrators" -- both cases that require much more frequent
interaction with the server than a "heads-down developer" who does one
or two pulls/pushes a day.

> Committing a gitolite configuration with ~8K pubkeys takes about 15 seconds to do the "post compile".

Could you paste the gitolite log file entries for this, from the "ssh"
line to the "END" line?

> all done on a 1-CPU/512-MB small VM, lightly loaded.
> So, not too bad, even in this not ideal state.
>
>> There are two possible mitigations to this, both referenced in [1]
>> below, and both require openssh 6.2 at least.
>>
>> This version of ssh has a feature whereby ssh will run a command in
>> order to get the list of authorized keys, falling back to
>> ~/.ssh/authorized_keys if the list does not contain the user's offered
>> key.
>>
>> 1. We use that feature to create a "mru" list of users so that only
>> about a few thousand who are active will be in it.
>>
>> For the moment I'd go with solution 1. If you need help with that I'd
>> be happy to write the code, or if you write it please send it in and
>> we'll put it in contrib.
>>
>
> Interesting idea. My approach would be to use REDIS as cache server (I'm already using it with another trigger).
>
> But I have a technical question:
> 1. The trigger will get "GL_USER", which is only the user name.
> 2. The "AuthorizedKeysCommand" script needs to return not only the SSH pubkey, but also the correctly formatted "command=/path/to/gitolite-shell USER" part, is that correct?
>
> So does the trigger need to re-use parts of 'optionise' from '/usr/share/gitolite3/triggers/post-compile/ssh-authkeys' ?

yes

> Or what would be a reliable way given a username to find the correct command from "authorized_keys" file ?
> Would you recommend a simple "grep" ?

then you're back to a linear scan, except it's grep doing this now.

> Also - a user might have multiple keys, the cache would need to keep them all ?

all, since we have no way of knowing which specific key he used.
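
To make the "keep them all" point concrete, here is a small Python sketch of a per-user cache. A plain dict stands in for the Redis server mentioned above. The shell path is a placeholder, the option string is an assumption about what the generated authkeys lines look like, and none of this is the real 'optionise' code.

```python
from collections import defaultdict

# Assumption: restrictions similar to those on gitolite-generated lines.
AUTH_OPTIONS = "no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty"


def authkeys_line(shell_path, user, pubkey):
    """Format one authorized_keys line that forces `shell_path user`."""
    return 'command="{} {}",{} {}'.format(
        shell_path, user, AUTH_OPTIONS, pubkey.strip()
    )


def build_user_cache(shell_path, keys):
    """keys: iterable of (user, pubkey) pairs.
    Returns user -> list of every formatted line for that user, since
    there is no way to know in advance which key the user will offer."""
    cache = defaultdict(list)
    for user, pubkey in keys:
        cache[user].append(authkeys_line(shell_path, user, pubkey))
    return dict(cache)
```

Looking up cache.get(user, []) then avoids the grep-style scan of the whole file.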

Sitaram Chamarty

Oct 16, 2014, 2:10:38 PM
to Assaf Gordon, gito...@googlegroups.com
[top posting]

I've been thinking about this a bit more and I think all my previous
ideas were over-engineered.

Try the attached patch to ssh-authkeys. Then take the file called "mru"
from contrib and put it somewhere, and supply its path to the rc file,
like so:

SSH_MRU => "$ENV{HOME}/gitolite/contrib/utils/mru",
# before the ENABLE list

Now push to gitolite-admin and it should put the last 1000 users who
accessed the server on top.

This has one significant deviation from what I proposed in [1]: it
directly codes the order in the authkeys file instead of putting them
into some ancillary file/table and then relying on the
AuthorizedKeysCommand program to get them out etc.

(Corollary: since it doesn't use the "AuthorizedKeysCommand" feature, it
can work with older ssh servers (pre 6.2) that don't have it.)

I think this will work just fine. Change "N" in that "mru" program to
something more than 1000 if you wish.
0001-mru-most-recent-users.patch
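
For readers without the attachment, the reordering idea can be sketched in Python roughly as follows. This is not the contents of the patch or of the contrib "mru" program. It assumes each authkeys line carries a command="... gitolite-shell USER" option and that a list of recently seen users, most recent first, is available from somewhere such as the gitolite logs.

```python
import re

# Assumption: the user name is the last word of the forced command.
USER_RE = re.compile(r'gitolite-shell ([^"\s]+)"')


def user_of(line):
    """Return the gitolite user named on an authkeys line, or None."""
    match = USER_RE.search(line)
    return match.group(1) if match else None


def mru_reorder(lines, recent_users, n=1000):
    """Move the lines of the `n` most recently seen users to the top,
    most recent user first; all other lines keep their relative order."""
    rank = {}
    for user in recent_users:  # most recent first, repeats allowed
        if user in rank:
            continue
        if len(rank) >= n:
            break
        rank[user] = len(rank)
    top = [line for line in lines if user_of(line) in rank]
    rest = [line for line in lines if user_of(line) not in rank]
    top.sort(key=lambda line: rank[user_of(line)])  # stable sort
    return top + rest
```

Because the output is just a reordered authkeys file, nothing here depends on AuthorizedKeysCommand.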

Assaf Gordon

Oct 17, 2014, 9:38:42 PM
to Sitaram Chamarty, gito...@googlegroups.com
On 10/16/2014 02:10 PM, Sitaram Chamarty wrote:
>
> I've been thinking about this a bit more and I think all my previous
> ideas were over-engineered.
>

<..>

> This has one significant deviation from what I proposed in [1]: it
> directly codes the order in the authkeys file instead of putting them
> into some ancillary file/table and then relying on the
> AuthorizedKeysCommand program to get them out etc.

Wouldn't that re-arrange the keys only when the gitolite configuration is updated, during 'post-compile'?



Sitaram Chamarty

Oct 18, 2014, 3:17:31 AM
to Assaf Gordon, gito...@googlegroups.com
Yes, but rearranging them more often is overkill. That is exactly what I meant by over-engineered :-)

If your admin repo pushes are very infrequent, you can always put 'gitolite trigger POST_COMPILE' in cron, maybe set to run once a day.
