Questions on why Gerrit is architected the way it is.


krunkosaurus

Jun 14, 2011, 1:40:30 PM
to Repo and Gerrit Discussion
Hi there, I have installed and worked with Gerrit for a few days now
and feel comfortable enough to start asking some questions on why it
was architected the way it is.
I've dug around Google and YouTube to try to find these answers myself
(a particularly good video here http://www.youtube.com/watch?v=VzkudNGUepQ&feature=autoshare),
and I hope I don't come off as offensive by asking them. I think it
would be great to have a thread of informative answers for others that
might be curious as well:

Security and authentication:
Why does Gerrit use a separate SSHd? Why not just use the existing
SSHd installed on most boxes? Gitolite does this well by
modifying .ssh/authorized_keys with its own "interceptor" script:

command="/home/git/.gitolite/src/gl-auth-command johnbrown",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQ

The end result is a similar non-interactive, secure shell that
restricts and authenticates you, knows who you are, and lets you
run only certain commands (namely git and other related
functionality).
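
To make the forced-command idea concrete, here is a minimal sketch in Python of what such an interceptor might do. It is not Gitolite's gl-auth-command; the allow-list and the function name are hypothetical. The only real mechanism it leans on is that sshd passes the client's requested command in the SSH_ORIGINAL_COMMAND environment variable when a command="..." entry matches.

```python
import shlex

# Hypothetical allow-list of Git transport commands; a real tool such as
# Gitolite performs far more checks (per-repo and per-ref permissions).
ALLOWED_COMMANDS = {"git-upload-pack", "git-receive-pack", "git-upload-archive"}


def parse_forced_command(user, original_command):
    """Return (user, git_command, repo) or raise PermissionError.

    `user` is the name baked into the authorized_keys entry, and
    `original_command` stands in for the SSH_ORIGINAL_COMMAND value.
    """
    if not original_command:
        # No command means the client asked for an interactive shell.
        raise PermissionError("interactive shells are not allowed")
    parts = shlex.split(original_command)
    if len(parts) != 2 or parts[0] not in ALLOWED_COMMANDS:
        raise PermissionError("command not allowed: %r" % original_command)
    return user, parts[0], parts[1]
```

A wrapper script would call this with the user name from its own argument and os.environ.get("SSH_ORIGINAL_COMMAND"), then run the permitted Git command itself.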

Git:
Why use a Java reimplementation of Git? Wouldn't something like 'git
push origin HEAD:refs/for/master' still work the same on a regular
build of Git?

Lastly, while I'm at it: is there a way to receive emailed diffs on a
successful merge? Sort of what git-commit-notifier does:
http://bitboxer.de/git-commit-notifier/


Thanks for the answers!

Mauvis

Shawn Pearce

Jun 14, 2011, 2:01:54 PM
to krunkosaurus, Repo and Gerrit Discussion
On Tue, Jun 14, 2011 at 10:40, krunkosaurus <switchs...@gmail.com> wrote:
> Hi there, I have installed and worked with Gerrit for a few days now
> and feel comfortable enough to start asking some questions on why it
> was architected the way it is.
>
> Security and authentication:
> Why does Gerrit use a separate SSHd? Why not just use the existing
> SSHd installed on most boxes? Gitolite does this well by
> modifying .ssh/authorized_keys with its own "interceptor" script:

This works OK for 25 people. It is slow as snot for 100 people. It's
not possible for 1000 people. It's utter garbage for 8000 people. I
know of multiple servers with more than 8000 keys enrolled on them. You
can't use ~/.ssh/authorized_keys for this.

Using our own SSHD allows us to have direct control over the entire
authentication process, and map the login system to a database that
can scale to 8000+ keys enrolled on it.
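
As a rough illustration of the scaling difference, the Python sketch below contrasts checking a presented key against a flat list of entries with looking it up in an index. This is not how OpenSSH or Gerrit is implemented; the function names and the fake key strings are made up, and it assumes the flat file is checked entry by entry.

```python
def linear_lookup(authorized_keys, presented_key):
    """Walk every (user, key) entry in order, as a flat key file would require."""
    for user, key in authorized_keys:
        if key == presented_key:
            return user
    return None


def build_index(authorized_keys):
    """Build a key -> user mapping once, standing in for a database index."""
    return {key: user for user, key in authorized_keys}


def indexed_lookup(index, presented_key):
    """Look a key up directly; the cost does not grow with the number of users."""
    return index.get(presented_key)


# 8000 fake entries, matching the scale mentioned above.
keys = [("user%d" % i, "ssh-rsa KEY%d" % i) for i in range(8000)]
index = build_index(keys)
```

With the list, the last user enrolled pays for a comparison against every earlier entry on each login, while the indexed lookup stays roughly constant.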

There was also some mild objection to everyone having the same SSH
username. Some of the Googlers I talked to in the early days of
Gerrit thought that was ugly. They wanted their own username, because
then it could match the username on their workstation, and they
wouldn't need to configure a username at all. We didn't want to hack
OpenSSH to support what we needed. It would be ugly to distribute the
patches on top of OpenSSH and keep rebasing them as upstream OpenSSH
was developed, not to mention that it would make Gerrit harder to
install on your own server. So... we went with our own embedded SSHD.

> Git:
> Why use a Java reimplementation of Git? Wouldn't something like 'git
> push origin HEAD:refs/for/master' still work the same on a regular
> build of Git?

Not really. C Git would actually create the reference
"refs/for/master", which would then block other users from pushing
their own changes to that same name because it would be a
non-fast-forward push. We could use a C Git update hook to block
creation of refs/for/master, but still receive the SHA-1 and update
the reviews. However, a non-zero exit status from the update hook to
block creation of refs/for/master would cause an error message to be
sent to the client, which is ugly.
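
For reference, the hook-based approach described above might look roughly like this Python sketch. Git does call the update hook with the ref name, the old object name, and the new object name, and a non-zero exit rejects that ref. Everything else here (the function name, the record_review callback) is hypothetical and is not what Gerrit does.

```python
REVIEW_PREFIX = "refs/for/"


def update_hook(refname, old_sha, new_sha, record_review):
    """Return the exit status an update hook would use for this ref.

    `record_review` is a hypothetical callback that stores the pushed
    commit for review; it is called with (target_branch, new_sha).
    """
    if refname.startswith(REVIEW_PREFIX):
        target_branch = refname[len(REVIEW_PREFIX):]
        record_review(target_branch, new_sha)
        # Non-zero status: Git refuses to create refs/for/<branch>, and the
        # client sees the push to that ref reported as rejected.
        return 1
    # Any other ref is left to Git's normal handling.
    return 0
```

A real hook script would pass sys.argv[1:4] to this function and hand the result to sys.exit(). The rejection reported to the client is the ugly part: the push did what the user wanted, but it looks like a failure.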

Rather than hack the C implementation to allow a richer interaction
between the hooks and the receive-pack process that is handling the
pack stream and generating the status codes back to the client, we
chose to work with the existing Java implementation that already
supported some of what we needed, and was more flexible because it had
more direct control when everything was running in the same process.

At this point we had also already settled on rewriting the Gerrit web
UI in GWT and the server in Java, so it made sense to embed the Java
implementation of Git for Git operations, rather than forking out to
the C implementation.

> Lastly, while I'm at it: is there a way to receive emailed diffs on a
> successful merge? Sort of what git-commit-notifier does:
> http://bitboxer.de/git-commit-notifier/

You may be able to plug into the change-merged hook on your server:

http://gerrit.googlecode.com/svn/documentation/2.2.0/config-hooks.html#_change_merged

Or monitor the server using stream-events:

http://gerrit.googlecode.com/svn/documentation/2.2.0/cmd-stream-events.html
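
A minimal sketch of the stream-events route in Python, assuming each event arrives as one JSON object per line with a "type" of "change-merged" plus "change" and "patchSet" objects. Those field names are my assumption about the event format and should be checked against the documentation for your server version; the function name is hypothetical, and sending the mail and generating the diff are left out.

```python
import json


def merged_notification(event_line):
    """Turn one stream-events line into mail fields, or None if not a merge.

    Field names ("type", "change", "patchSet", ...) are assumptions about
    the event JSON; verify them against your Gerrit version.
    """
    event = json.loads(event_line)
    if event.get("type") != "change-merged":
        return None
    change = event.get("change", {})
    patch_set = event.get("patchSet", {})
    return {
        "subject": "[%s] merged: %s" % (change.get("project"), change.get("subject")),
        "revision": patch_set.get("revision"),
    }
```

You would feed this the lines read from an SSH session running gerrit stream-events, then use the returned revision to produce the diff (for example with git show in a local clone) and mail it out.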

Mauvis Ledford

Jun 15, 2011, 7:58:28 PM
to Repo and Gerrit Discussion
Thanks for the quick and lengthy response. It's always nice when a
developer can shed some light for another and those all sound pretty
valid.

Re the ~/.ssh/authorized_keys method, I found this comment
(http://drupal.org/node/782764#comment-2910398) by Sitaram, the creator of
Gitolite, that states:

"(1) Sshd does a linear scan, so this becomes an issue once you cross
about 3000-5000 users. 5 digits is probably going to kill it."

I don't know who's more right there on the limitations (and there was
nothing said of speed slowdowns), but ultimately you are right in
that Gerrit can scale better because of its custom SSHd.

Thanks again for the detailed reply.

Mauvis

