[ANNOUNCE] New high-performance cache for Gerrit v3.3

Luca Milanesio

Sep 18, 2020, 7:24:56 AM
to Repo and Gerrit Discussion, Luca Milanesio
Hi all,
I am pleased to share the first results of our (GerritForge) contribution to the performance improvements in the forthcoming Gerrit v3.3 release.

We have an initial experimental version of a brand-new high-performance persistent cache (based on ChronicleMap [1]) which aims to reduce the latency of accessing significant caches such as:
- change_notes
- diff_summary
- diff_intraline
and could potentially have a positive impact on many others.

*We do expect* this change to have a significant impact on large mono-repos with hundreds of thousands of changes, thousands of packfiles, and tens of GBs in size.

The main reason for the expected performance improvement is the change_notes cache, which avoids parsing the change meta-ref from NoteDb over and over again.
The change_notes cache was introduced a long time ago, but its use was very limited in time and it could not be persisted efficiently because of the intrinsics of H2 locking.
We tested change_notes persistence on GerritHub a few months ago, and we literally crashed one site because of the additional latency and locks introduced by the H2 reorganisation and compaction mechanism.

Now, with the new persistent cache, all the above caches would receive a significant boost, especially for large repositories where computing the actual value is very expensive: the large and beefy mono-repos.

The new libModule will be compatible with the forthcoming Gerrit v3.3 and is pluggable as an alternative implementation to the standard H2 cache.
The project is currently hosted on GerritHub.io and GitHub.com [3], and we are asking the community to move it to gerrit-review.googlesource.com, where it is going to be more discoverable and aligned with the rest of the Gerrit platform.

1. Install the chronicle-map module into the $GERRIT_SITE/lib directory.
2. Add the cache-chroniclemap module to $GERRIT_SITE/etc/gerrit.config as follows:

[gerrit]
installModule = com.googlesource.gerrit.modules.cache.chroniclemap.ChronicleMapCacheModule

— * —

Below are some preliminary E2E performance tests on a clean Gerrit with empty caches, 10 users, running for 5 minutes on a project with 3k changes, a *best case scenario* for the H2 cache. The server is a 16-thread, 2.4 GHz, 8-core Intel Core i9 with 64 GB of RAM.

Get change details REST API (relies on change_notes cache):

H2 persistent cache:
- mean: 6ms
- 95 percentile: 50ms
- 99 percentile: 91ms

ChronicleMap persistent cache:
- mean: 6ms
- 95 percentile: 43ms
- 99 percentile: 68ms

The above figures show that even with a *small repo* and a relatively small H2 persistent cache, the improvement is visible at both the 95th and 99th percentiles. That is possible thanks to the non-blocking nature of ChronicleMap compared to the RDBMS-style access of H2.
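
As a quick worked check on the figures above, the relative reduction for each metric can be computed directly from the reported numbers. This is only an illustrative sketch: the helper name `reduction_pct` is hypothetical, and the inputs are just the H2 and ChronicleMap values quoted in this message.

```python
# Latencies in ms, copied from the figures reported above.
h2 = {"mean": 6, "p95": 50, "p99": 91}
chronicle = {"mean": 6, "p95": 43, "p99": 68}

def reduction_pct(before, after):
    """Relative latency reduction, as a percentage of the 'before' value."""
    return (before - after) / before * 100

# One rounded reduction per metric, H2 as the baseline.
reductions = {k: round(reduction_pct(h2[k], chronicle[k]), 1) for k in h2}
for metric, pct in reductions.items():
    print(f"{metric}: {pct}% lower with ChronicleMap")
```

On these numbers the mean is unchanged, while the 95th and 99th percentiles drop by roughly 14% and 25% respectively.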

We are building a series of automated E2E test jobs on Gerrit-CI [2] that will verify Gerrit's performance on a set of large mono-repos and publish the resulting statistics for every commit merged to master and to the future stable-3.3 branch.

— * —

I hope you find this new module interesting and applicable to your Gerrit setup, especially those of you who are on NoteDb and have large mono-repos.
Thanks for your feedback, and feel free to ask questions or give a +1 to this e-mail to support hosting this new module on gerrit-review.googlesource.com.

Luca.

[1] https://github.com/OpenHFT/Chronicle-Map
[2] https://gerrit-ci.gerritforge.com
[3] https://github.com/GerritForge/modules_cache-chroniclemap


Luca Milanesio

Sep 18, 2020, 8:26:20 AM
to Repo and Gerrit Discussion, Luca Milanesio
Some typo fixes below (we should have a code-review for the mailing list as well :-))

> On 18 Sep 2020, at 12:24, Luca Milanesio <luca.mi...@gmail.com> wrote:
>

Below are the instructions on how to install and use the new high-performance cache:

>
> 1. Install the chronicle-map module into the $GERRIT_SITE/lib directory.
> 2. Add the cache-chroniclemap module to $GERRIT_SITE/etc/gerrit.config as follows:
>
> [gerrit]
> installModule = com.googlesource.gerrit.modules.cache.chroniclemap.ChronicleMapCacheModule
>
> — * —
>
> See below some preliminary E2E performance tests on a clean Gerrit with empty caches, 10 users, running for 5 minutes on a 3k changes project, a *best case scenario* for the H2 cache. The Server is a 16-cores 2.4 GHz 8-Core Intel Core i9 with 64GB of RAM

16-threads 2.4GHz 8-cores Intel Core i9


Here is our plan for more extensive testing on larger repos:

Nasser Grainawi

Sep 18, 2020, 10:54:20 AM
to Luca Milanesio, Repo and Gerrit Discussion

This is pretty cool and thanks for the perf data below! What made you decide to use ChronicleMap? What other solutions did you consider?


Luca Milanesio

Sep 18, 2020, 3:21:33 PM
to Nasser Grainawi, Luca Milanesio, Repo and Gerrit Discussion

On 18 Sep 2020, at 15:54, Nasser Grainawi <nas...@codeaurora.org> wrote:

> This is pretty cool and thanks for the perf data below! What made you decide to use ChronicleMap? What other solutions did you consider?

Good question :-)

The selection criteria were:
N. Non-blocking on any operation (*must*)
J. 100% Java implementation (*should*)
H. High-performance storage (*should*)
D. Distributed (*nice to have*)
E. Embeddable (*should*)

As you can see, none of the criteria included transactional integrity or lock support as a requirement: the persistent cache isn’t long-term storage, just a cache. There is no need to perform complex queries or to guarantee ACID transactions.

The ones we evaluated were:
1. LMDB for Java (N, H, E)
2. Redis (N, J, H, D)
3. ChronicleMap (N, J, H, D, E)

As you can see, ChronicleMap satisfied all of them, and it offered performance comparable to LMDB with the added advantages of being easily embeddable (you install the libModule, nothing else, no runtime) and not requiring any native code.
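
The comparison above can be written down as a small lookup, which makes it easy to see which criteria each candidate misses. This is a minimal sketch for illustration only: the criterion letters and the per-candidate ratings are taken from the lists in this message, while the names `CRITERIA`, `CANDIDATES`, and `missing` are hypothetical.

```python
# Criterion letters and labels, as listed in the message above.
CRITERIA = {
    "N": "non-blocking on any operation",
    "J": "100% Java implementation",
    "H": "high-performance storage",
    "D": "distributed",
    "E": "embeddable",
}

# Criteria each evaluated candidate was reported to satisfy.
CANDIDATES = {
    "LMDB for Java": {"N", "H", "E"},
    "Redis": {"N", "J", "H", "D"},
    "ChronicleMap": {"N", "J", "H", "D", "E"},
}

def missing(satisfied):
    """Return the criterion letters a candidate does not satisfy, sorted."""
    return sorted(set(CRITERIA) - satisfied)

# Candidates that meet every criterion.
full_matches = [name for name, met in CANDIDATES.items() if not missing(met)]

for name, met in CANDIDATES.items():
    print(name, "misses:", missing(met) or "nothing")
```

With the ratings above, only ChronicleMap ends up in `full_matches`; LMDB for Java misses D and J, and Redis misses E.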

P.S. Because Chronicle is a libModule, you could also implement LMDB or Redis as libModules and offer them as alternatives. That’s the power of Gerrit’s extensibility :-)

Luca.

lucamilanesio

Sep 24, 2020, 6:16:35 PM
to Repo and Gerrit Discussion
On 18 Sep 2020, at 15:54, Nasser Grainawi <nas...@codeaurora.org> wrote:

> This is pretty cool and thanks for the perf data below!

Anyone else feeling that this plugin would be cool?
Any objections in moving it to gerrit-review.googlesource.com?

Thanks in advance for the feedback.

Luca.

Matthias Sohn

Sep 24, 2020, 6:47:59 PM
to lucamilanesio, Repo and Gerrit Discussion
On Fri, Sep 25, 2020 at 12:16 AM lucamilanesio <luca.mi...@gmail.com> wrote:


> Anyone else feeling that this plugin would be cool?

+1, that's great

> Any objections in moving it to gerrit-review.googlesource.com?

+1 for moving it to gerrit-review.googlesource.com

-Matthias
 

Martin Fick

Sep 24, 2020, 6:49:58 PM
to repo-d...@googlegroups.com, lucamilanesio
On Thursday, September 24, 2020 3:16:35 PM MDT lucamilanesio wrote:
> On Friday, September 18, 2020 at 8:21:33 PM UTC+1 lucamilanesio wrote:
> > On 18 Sep 2020, at 15:54, Nasser Grainawi <nas...@codeaurora.org> wrote:
> > We have an initial experimental version of a brand-new high-performance
> > persistent cache (based on ChronicleMap [1]) which is targeting the
> > reduction of the latency in accessing significant caches like:
> > - change_notes
> > - diff_summary
> > - diff_intraline
> > and potentially have a positive impact on many others.
> >
> >
> > This is pretty cool and thanks for the perf data below!
> >
> > Anyone else feeling that this plugin would be cool?
>
> Any objections in moving it to gerrit-review.googlesource.com?

+1


> > Get change details REST API (relies on change_notes cache):
> >
> > H2 persistent cache:
> > - mean: 6ms
> > - 95 percentile: 50ms
> > - 99 percentile: 91ms
> >
> > ChronicleMap persistent cache:
> > - mean: 6ms
> > - 95 percentile: 43ms
> > - 99 percentile: 68ms
> >
> > The above figures show that even with a *small repo* and a relatively
> > small H2 persistent cache, the improvement is visible on both 95 and 99
> > percentile. That is possible thanks to the non-blocking nature of
> > ChronicleMap compared to the RDBS-style access of H2.

Since the mean did not improve, where were the declines?

-Martin

--
The Qualcomm Innovation Center, Inc. is a member of Code
Aurora Forum, hosted by The Linux Foundation

Luca Milanesio

Sep 24, 2020, 7:00:44 PM
to Martin Fick, Luca Milanesio, repo-d...@googlegroups.com
H2 works well on average with small data sets, hence the low mean value (6 ms).

The problem with H2 is its blocking reads and writes, which degrade performance under concurrent use.
Even in a best-case scenario (small data set, only 10 concurrent users, which is a very light load) it shows a significant difference compared with a non-blocking implementation like chronicle-map.

We are building the Gerrit v3.3 E2E performance test environment, with large mono-repos and a huge number of changes: we expect a more significant degradation of the H2 cache compared to a non-blocking cache.

Hope this clarifies the differences.

Luca.

Luca Milanesio

Sep 25, 2020, 3:55:31 AM
to Martin Fick, Luca Milanesio, repo-d...@googlegroups.com
Apologies Martin, my explanation was wrong: the point is a different one.

The numbers I’ve shown are the E2E execution times of the get change detail REST API, executed with an H2-backed cache and with a ChronicleMap-based cache.
When the numbers are identical (the mean time), it is because the impact of the cache is negligible, possibly because the requests never reached the disk persistence.

So of course it makes sense that the mean value is identical, because both are based on Caffeine when the cache lookup is resolved in memory.
In a small percentage of cases (hence the difference at the 95th and 99th percentiles) they go to disk, and that is where we see the difference.
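
The effect described above can be reproduced with made-up numbers. The sketch below is purely illustrative: the latencies are synthetic, not measurements from Gerrit, and the 940/60 split between in-memory and disk lookups, the values 3, 45 and 60 ms, and the helper `percentile` are all assumptions chosen to show the shape of the effect.

```python
import statistics

def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # ceiling of pct% of the sample count
    return ordered[rank - 1]

# Synthetic latencies in ms (NOT measurements): 940 requests are served from
# memory in both setups, 60 requests fall through to the disk-backed store.
in_memory = [3] * 940
slow_disk = in_memory + [60] * 60  # hypothetical slower persistent store
fast_disk = in_memory + [45] * 60  # hypothetical faster persistent store

for name, samples in (("slow_disk", slow_disk), ("fast_disk", fast_disk)):
    print(name,
          "mean:", round(statistics.mean(samples)),
          "p95:", percentile(samples, 95),
          "p99:", percentile(samples, 99))
```

In this synthetic data both setups report a rounded mean of 6 ms, because the in-memory lookups dominate the average, while the 95th and 99th percentiles fall entirely on the disk lookups and therefore differ (60 ms versus 45 ms).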

Does it make more sense now?
Apologies again for the initial confusion.

Luca.

Luca Milanesio

Sep 28, 2020, 1:22:32 PM
to Martin Fick, Luca Milanesio, repo-d...@googlegroups.com
On 24 Sep 2020, at 23:48, Martin Fick <mf...@codeaurora.org> wrote:

On Thursday, September 24, 2020 3:16:35 PM MDT lucamilanesio wrote:
On Friday, September 18, 2020 at 8:21:33 PM UTC+1 lucamilanesio wrote:
On 18 Sep 2020, at 15:54, Nasser Grainawi <nas...@codeaurora.org> wrote:
We have an initial experimental version of a brand-new high-performance
persistent cache (based on ChronicleMap [1]) which is targeting the
reduction of the latency in accessing significant caches like:
- change_notes
- diff_summary
- diff_intraline
and potentially have a positive impact on many others.


This is pretty cool and thanks for the perf data below!

Anyone else feeling that this plugin would be cool?

Any objections in moving it to gerrit-review.googlesource.com?

+1