poor git performance over http transport

Brandon Casey

Jun 14, 2013, 10:05:44 PM
to repo-d...@googlegroups.com
We have a number of Gerrit servers deployed. Most are version 2.2.1
and some are 2.5.4. They are all deployed using the embedded Jetty
engine and are set up in a reverse proxy configuration with Apache.

We've been mostly connecting to them for git fetches/pushes over the
ssh transport. I've heard it said that the http performance is
supposed to be much better and I've taken that at face value, but
today I felt like putting a number on it. Unfortunately, it seems
there is something wrong with our setup, since I am only getting about
3MB/s over http while I get over 20MB/s over ssh.
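
One way figures like these could be reproduced is sketched below. It is not the method used in this thread: HOST, PROJECT, and the ports are hypothetical placeholders, and the on-disk size of the bare clone is only an approximation of the bytes transferred. The result is an average over the whole clone, including the server-side counting phases, so it will read lower than the transfer-phase rate that git prints.

```shell
#!/bin/sh
# Rough clone-throughput check for comparing transports.
# HOST, PROJECT, and the ports in the examples are hypothetical placeholders.

# Print MiB/s given a byte count and a duration in seconds.
rate_mb_s() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1048576 }'
}

# Clone URL ($1) as a bare repo into a scratch directory, then report
# on-disk size divided by wall-clock time.
measure_clone() {
    dest=$(mktemp -d) || return 1
    start=$(date +%s)
    git clone --quiet --bare "$1" "$dest/repo.git" || { rm -rf "$dest"; return 1; }
    end=$(date +%s)
    elapsed=$((end - start))
    [ "$elapsed" -gt 0 ] || elapsed=1    # avoid dividing by zero on tiny repos
    bytes=$(du -sk "$dest/repo.git" | awk '{ printf "%.0f\n", $1 * 1024 }')
    echo "$1: $(rate_mb_s "$bytes" "$elapsed") MiB/s"
    rm -rf "$dest"
}

# Example invocation (not run here):
#   measure_clone "ssh://HOST:29418/PROJECT"
#   measure_clone "http://HOST:8081/r/PROJECT"
```

Running both clones against the same project, one after the other, keeps the comparison fair, since the repository size and server load are then roughly the same.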

During the http transfer, Gerrit (java) pegs 1-2 cpus at 100% during
the "Counting objects" and the "Finding sources" phases, but once it
starts transferring objects (at 3MB/s), it drops down to about 5% cpu
usage.

There is no throttling happening on the network. I get 3MB/s even if
I run the 'git clone' on the same system that Gerrit is running on.

I get the same behavior if I run Gerrit without using a reverse proxy
(just to rule it out).

I get reasonable speeds if I clone using an external git http URL,
e.g. http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

I get lightning speeds if I set up apache and smart-http using the C
git http-backend.

I get the same behavior on Gerrit 2.2.1 and 2.5.4.

That seems to rule out problems caused by the network, Apache, the
Gerrit version, and the Git version. Jetty and/or Java?

Here are the relevant parts from my gerrit.config:

[container]
javaHome = /path/to/jdk-1.6.0-29/jre # I've also tried 1.7
heapLimit = 24g

[httpd]
listenUrl = proxy-http://a.b.c.d:8081/r/
maxQueued = 0

Let me know if you want any other info.

Any ideas?

Thanks,
-Brandon

Saša Živkov

Jun 16, 2013, 6:16:08 PM
to Brandon Casey, repo-d...@googlegroups.com
What is the value of the core.streamFileThreshold in your gerrit configuration?
This parameter should be set to be larger than the largest object in your Git repositories.
When it is too low you will see a very high CPU usage.
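
For reference, a sketch of how that setting might appear in gerrit.config. The 200m value is purely illustrative and does not come from this thread; a suitable value depends on the largest object in the repositories and on the available heap.

```
[core]
streamFileThreshold = 200m
```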



Shawn Pearce

Jun 16, 2013, 6:46:20 PM
to Saša Živkov, Brandon Casey, repo-d...@googlegroups.com
I don't think the issue is as simple as core.streamFileThreshold.
Brandon stated they have better performance over SSH than over HTTP.
It's something in the Jetty stack that is having trouble here. Or
mod_proxy in Apache.

Brandon Casey

Jun 16, 2013, 7:56:52 PM
to Shawn Pearce, Saša Živkov, repo-d...@googlegroups.com
I haven't set core.streamFileThreshold, so it should have its default
value of 50MB.

But yeah, like Shawn said, I am experiencing much better performance
over ssh than over http. I also suspect it's related to Jetty. I
don't think it is related to Apache or mod_proxy. In fact, I've set
up Gerrit without using Apache at all, just to explicitly rule out
this possibility, and I still experienced the same slow performance
over http.

I've got a couple of additional data points to share though:

1. I've set up v2.5.4 on my personal laptop (both the stock .war and
a .war built myself), and I experienced the same slowness and the same
(what seems like) 3MB/s rate limit for http. So this provides a test
case from a completely different environment (Ubuntu 12.04, openjdk
1.6.0, maven 3.0.4), and one that is disconnected from anything at
$dayjob.

2. I built 8713eb859eb4d6680f7e0da55725141aebd907e2 (aka
2.7-rc1-265), which is one of the last commits that can be built using
maven, but uses an updated version of Jetty. This version uses Jetty
8.1.7.v20120910 and it did _not_ suffer from the http slowness. On my
laptop I get about 15MB/s over ssh and about 20MB/s over http, which
is what I was expecting.

3. I cherry-picked ea48b3c2502bcaa51404175f381f4f34699ee96a,
"Upgrade embedded Jetty to 8.1.7.v20120910" on top of v2.5.4 and the
resulting build also does _not_ suffer from the http slowness. The
performance is the same as 8713eb85.

So, it seems that upgrading Jetty to 8.1.7 could be a workaround.
Can you think of any negative side-effects of cherry-picking ea48b3c2
on top of v2.5.4?
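
The backport described in point 3 might look roughly like the sketch below. The branch name and the exact Maven goals are assumptions, not the commands actually used; the DRY_RUN switch is a hypothetical convenience for previewing the steps.

```shell
#!/bin/sh
# Sketch of backporting one commit onto a release tag.
# The branch name and the build command are assumptions.

# With DRY_RUN set, print each command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# backport TAG COMMIT: branch from TAG, apply COMMIT, rebuild.
backport() {
    run git checkout -b "backport-$1" "$1" &&
    run git cherry-pick "$2" &&
    run mvn clean package
}

# Example, inside a Gerrit source checkout (not run here):
#   backport v2.5.4 ea48b3c2502bcaa51404175f381f4f34699ee96a
```

Setting DRY_RUN=1 before calling backport prints the three commands without touching the checkout, which is a cheap way to check the arguments first.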

Does that trigger any thoughts about what could be causing the http
slowness in the stock 2.5.4 (and 2.2.1) using Jetty 7.2.1?

Is anyone else running Gerrit using the embedded Jetty engine and
experiencing the poor http performance I described (or not
experiencing it)? Or do people usually run Gerrit in an external
Jetty instance?

-Brandon

Shawn Pearce

Jun 16, 2013, 8:04:17 PM
to Brandon Casey, Saša Živkov, repo-d...@googlegroups.com
On Sun, Jun 16, 2013 at 4:56 PM, Brandon Casey <dra...@gmail.com> wrote:
> But yeah, like Shawn said, I am experiencing much better performance
> over ssh than over http. I also suspect it's related to Jetty. I
> don't think it is related to Apache or mod_proxy. In fact, I've set
> up Gerrit without using Apache at all, just to explicitly rule out
> this possibility, and I still experienced the same slow performance
> over http.

Yeah, I did not really suspect Apache, unless there was an obscure
throttling option enabled for mod_proxy, which I think was very
unlikely.

> 2. I built 8713eb859eb4d6680f7e0da55725141aebd907e2 (aka
> 2.7-rc1-265), which is one of the last commits that can be built using
> maven, but uses an updated version of Jetty. This version uses Jetty
> 8.1.7.v20120910 and it did _not_ suffer from the http slowness. On my
> laptop I get about 15MB/s over ssh and about 20MB/s over http, which
> is what I was expecting.
>
> 3. I cherry-picked ea48b3c2502bcaa51404175f381f4f34699ee96a,
> "Upgrade embedded Jetty to 8.1.7.v20120910" on top of v2.5.4 and the
> resulting build also does _not_ suffer from the http slowness. The
> performance is the same as 8713eb85.
>
> So, it seems that upgrading Jetty to 8.1.7 could be a workaround.

See, this sounds like it's a bug in Jetty that was fixed in a later release.

> Can you think of any negative side-effects of cherry-picking ea48b3c2
> on top of v2.5.4?

No, this should work. It hasn't been tested by anyone, but I would not
expect any issues. It should work correctly. If it's working for you in
your own testing, it's probably OK to try on a larger scale.

I am probably going to cut 2.6 final tomorrow, which includes
ea48b3c2502bcaa51404175f381f4f34699ee96a. So there will not be a 2.5.5
with this backported.

> Does that trigger any thoughts about what could be causing the http
> slowness in the stock 2.5.4 (and 2.2.1) using Jetty 7.2.1?

Bug in Jetty?

> Is anyone else running Gerrit using the embedded Jetty engine and
> experiencing the poor http performance I described (or not
> experiencing it)? Or do people usually run Gerrit in an external
> Jetty instance?

I think most people use the SSH server, so the HTTP performance is
less of an issue.

Brandon Casey

Jun 16, 2013, 8:45:52 PM
to Shawn Pearce, Saša Živkov, repo-d...@googlegroups.com
[Wow, I top-posted. My bad. Blame gmail]

On Sun, Jun 16, 2013 at 5:04 PM, Shawn Pearce <s...@google.com> wrote:
> On Sun, Jun 16, 2013 at 4:56 PM, Brandon Casey <dra...@gmail.com> wrote:

>> 3. I cherry-picked ea48b3c2502bcaa51404175f381f4f34699ee96a,
>> "Upgrade embedded Jetty to 8.1.7.v20120910" on top of v2.5.4 and the
>> resulting build also does _not_ suffer from the http slowness. The
>> performance is the same as 8713eb85.
>>
>> So, it seems that upgrading Jetty to 8.1.7 could be a workaround.
>
> See, this sounds like it's a bug in Jetty that was fixed in a later release.
>
>> Can you think of any negative side-effects of cherry-picking ea48b3c2
>> on top of v2.5.4?
>
> No, this should work. It hasn't been tested by anyone, but I would not
> expect any issues. It should work correctly. If it's working for you in
> your own testing, it's probably OK to try on a larger scale.

Ok, good, thanks.

> I am probably going to cut 2.6 final tomorrow, which includes
> ea48b3c2502bcaa51404175f381f4f34699ee96a. So there will not be a 2.5.5
> with this backported.

Yay, and boo. :)

>> Does that trigger any thoughts about what could be causing the http
>> slowness in the stock 2.5.4 (and 2.2.1) using Jetty 7.2.1?
>
> Bug in Jetty?

Ok. I was hoping there would be some Jetty configuration setting that
could be tweaked and would be applicable to the older releases too so
we could just add some info to the web site as an alternative to
patching.

>> Is anyone else running Gerrit using the embedded Jetty engine and
>> experiencing the poor http performance I described (or not
>> experiencing it)? Or do people usually run Gerrit in an external
>> Jetty instance?
>
> I think most people use the SSH server, so the HTTP performance is
> less of an issue.

Ahh. Yeah, us too, but I'd like to start moving towards using http,
especially for external users, for the higher performance (I thought),
and also so we can provide access via port 80/443 and avoid requiring
external users to tweak their firewalls.

Thanks,
-Brandon

Saša Živkov

Jun 17, 2013, 3:00:58 AM
to Shawn Pearce, Brandon Casey, repo-d...@googlegroups.com
Correct. I missed this part.