Analytics plugin for Gerrit Code Review


lucamilanesio

Nov 4, 2016, 12:45:12 PM
to Repo and Gerrit Discussion
Hi Gerrit Community,
I will be presenting a new idea of building an Analytics platform based on Gerrit Code Review data, events, and Git repository commits.

But, as usual, I would like to show stuff instead of just talking about it :-)

I need a new repository on Gerrit-Review to host my Analytics plugin. Could a maintainer create it for me?

Name: analytics
Description: Plugin to aggregate information from Gerrit projects and reviews and expose them through REST and SSH API.

I remember that at past User Summits, one of the objections to moving away from the DBMS was: "How can I easily extract analytics data if I don't have a DB anymore? How can you get CSV data out of your Git repository?"
The answer is: use the API of the Analytics plugin, and then you can import the CSV or JSON results wherever you want.
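
As a rough illustration of the "import the results wherever you want" step, here is a minimal Python sketch that turns a JSON result into CSV using only the standard library. The field names (`name`, `commits`) and the sample payload are assumptions made up for this example; they are not the plugin's actual output format.

```python
import csv
import io
import json

# Hypothetical JSON payload; the real plugin output may look different.
SAMPLE = '[{"name": "Alice", "commits": 3}, {"name": "Bob", "commits": 1}]'


def json_to_csv(text):
    """Convert a JSON array of flat objects into CSV text."""
    rows = json.loads(text)
    out = io.StringIO()
    # The column names are assumptions for this sketch.
    writer = csv.DictWriter(out, fieldnames=["name", "commits"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


csv_text = json_to_csv(SAMPLE)
print(csv_text)
```

The resulting text can then be loaded into a spreadsheet or any other tool that reads CSV.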

I'll present the concept Sat 12th of Nov (15:00 at the moment) at GooglePlex in Mountain View:

Thank you in advance.

Luca.

Sébastien Douche

Nov 4, 2016, 1:02:23 PM
to repo-d...@googlegroups.com
On Fri, Nov 4, 2016, at 17:45, lucamilanesio wrote:
> Name: analytics
> Description: Plugin to aggregate information from Gerrit projects and
> reviews and expose them through REST and SSH API.

Very interesting Luca. I'm eager to test it :).


--
Sébastien Douche <s...@nmeos.net>
Twitter: @sdouche
http://douche.name

Dave Borowitz

Nov 4, 2016, 2:15:11 PM
to lucamilanesio, Repo and Gerrit Discussion

--
--
To unsubscribe, email repo-discuss+unsubscribe@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en

---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Luca Milanesio

Nov 4, 2016, 6:05:04 PM
to Dave Borowitz, Repo and Gerrit Discussion
Thanks Dave, I am looking for another "wow" factor this year :-)
See you next week.

Luca.

Luca Milanesio

Nov 4, 2016, 7:19:18 PM
to Sébastien Douche, repo-d...@googlegroups.com
My first objective is to add to Gerrit the same type of data that GitHub shows, but with more accuracy!

See below what GitHub says about, for instance, the top 6 contributors of the Gerrit project:

1. Shawn (4,150)
2. DavidP (2,114)
3. DaveB (1,805)
4. DavidO (896)
5. Edwin (492)
6. AndyB (364)

But something isn't quite right in those statistics:

a) GitHub doesn't count the people that do not have an account
b) Gerrit is not a single project but is composed of multiple repos because of the plugins
c) GitHub isn't able to understand that people change company (Edwin was at SAP and is now at Google)

With the new Gerrit Analytics, you'll be able to get the *real* picture and consider cross-repos data aggregation and cross-account grouping of stats.
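
To make the cross-repo, cross-account grouping concrete, here is a minimal Python sketch. The alias table, repository names, and commit records are all made up for the example; they do not reflect the plugin's actual data model or API.

```python
from collections import Counter

# Hypothetical alias table: every email a contributor has used maps to one identity.
ALIASES = {
    "alice@old-corp.example": "Alice",
    "alice@new-corp.example": "Alice",
    "bob@example.org": "Bob",
}

# Hypothetical (repository, author email) pairs, one per commit.
COMMITS = [
    ("core", "alice@old-corp.example"),
    ("core", "bob@example.org"),
    ("plugin-a", "alice@new-corp.example"),
    ("plugin-a", "alice@new-corp.example"),
    ("plugin-b", "carol@example.net"),
]


def aggregate(commits, aliases):
    """Count commits per contributor across all repositories."""
    totals = Counter()
    for _repo, email in commits:
        # Authors without a known alias still count, under their raw email.
        totals[aliases.get(email, email)] += 1
    return totals


totals = aggregate(COMMITS, ALIASES)
for name, count in totals.most_common():
    print(name, count)
```

Here the two addresses used by the same person collapse into one entry, and the author with no known account is still counted instead of being dropped.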

See for instance the real picture of the full Gerrit Project:

1. Shawn (7,201)
2. DavidP (4,592)
3. Edwin (4,333)
4. DaveB (2,658)
5. DavidO (1,312)
6. Luca (671)

The difference between the real figures and the ones provided by GitHub is *huge*. Shawn has almost 2x the commits, Edwin is finally in his rightful place with nearly 9x the commits reported by GitHub ... and I would say I am pleased to be at Nr. 6 with 7x the commits reported by GitHub.
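
For anyone who wants to check the ratios, this short Python sketch computes them from the two lists above. It only covers the five people who appear in both lists; the variable names are just for illustration.

```python
# Commit counts copied from the two top-6 lists in this message.
github = {"Shawn": 4150, "DavidP": 2114, "DaveB": 1805, "DavidO": 896, "Edwin": 492}
analytics = {"Shawn": 7201, "DavidP": 4592, "Edwin": 4333, "DaveB": 2658, "DavidO": 1312}

# Ratio of the aggregated count to the GitHub count, per contributor.
ratios = {name: analytics[name] / github[name] for name in github}

for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio:.1f}x")
```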

I am planning to install the Gerrit Analytics plugin on GerritHub.io, so that we could have *real* statistics of the Gerrit project in real-time.

More details will come at the User Summit :-)

Luca.

David Pursehouse

Nov 5, 2016, 1:35:32 AM
to Luca Milanesio, Sébastien Douche, repo-d...@googlegroups.com
On Sat, Nov 5, 2016 at 8:19 AM Luca Milanesio <luca.mi...@gmail.com> wrote:
> My first objective is to add to Gerrit the same type of data that GitHub shows, but with more accuracy!
>
> See below what GitHub says about, for instance, the top 6 contributors of the Gerrit project:
>
> 1. Shawn (4,150)
> 2. DavidP (2,114)
> 3. DaveB (1,805)
> 4. DavidO (896)
> 5. Edwin (492)
> 6. AndyB (364)
>
> But something isn't quite right in those statistics:
>
> a) GitHub doesn't count the people that do not have an account
> b) Gerrit is not a single project but is composed of multiple repos because of the plugins
> c) GitHub isn't able to understand that people change company (Edwin was at SAP and is now at Google)

It works if the email address(es) are correctly associated with the account. For example, the stats for gerrit should be correct for me because all the email addresses I've used (including at my previous employer) are associated with my account.

> With the new Gerrit Analytics, you'll be able to get the *real* picture and consider cross-repos data aggregation and cross-account grouping of stats.
>
> See for instance the real picture of the full Gerrit Project:
>
> 1. Shawn (7,201)
> 2. DavidP (4,592)

Eh, I'm assuming this includes commits on all the plugins? And does it include merge commits?

luca.mi...@gmail.com

Nov 5, 2016, 5:35:15 AM
to David Pursehouse, Sébastien Douche, repo-d...@googlegroups.com


On 5 Nov 2016, at 05:35, David Pursehouse <david.pu...@gmail.com> wrote:
> On Sat, Nov 5, 2016 at 8:19 AM Luca Milanesio <luca.mi...@gmail.com> wrote:
> > My first objective is to add to Gerrit the same type of data that GitHub shows, but with more accuracy!
> >
> > See below what GitHub says about, for instance, the top 6 contributors of the Gerrit project:
> >
> > 1. Shawn (4,150)
> > 2. DavidP (2,114)
> > 3. DaveB (1,805)
> > 4. DavidO (896)
> > 5. Edwin (492)
> > 6. AndyB (364)
> >
> > But something isn't quite right in those statistics:
> >
> > a) GitHub doesn't count the people that do not have an account
> > b) Gerrit is not a single project but is composed of multiple repos because of the plugins
> > c) GitHub isn't able to understand that people change company (Edwin was at SAP and is now at Google)
>
> It works if the email address(es) are correctly associated with the account. For example, the stats for gerrit should be correct for me because all the email addresses I've used (including at my previous employer) are associated with my account.

That is the issue: you may have people who don't have a GitHub account, and those aren't listed or accounted for.

Isn't Git a distributed peer-to-peer tool at the end of the day? GitHub seems to assume a more traditional client-server use.

However, I believe the GitHubbers' goal wasn't to build an Analytics screen :-) even if it appears to be one. That screen is reliable only for small projects.

> > With the new Gerrit Analytics, you'll be able to get the *real* picture and consider cross-repos data aggregation and cross-account grouping of stats.
> >
> > See for instance the real picture of the full Gerrit Project:
> >
> > 1. Shawn (7,201)
> > 2. DavidP (4,592)
>
> Eh, I'm assuming this includes commits on all the plugins?

Yes, that's the point. Most projects have more than one repo.

> And does it include merge commits?

Right, point taken :-)
I just started by considering all commits :-) It was simply a 10' exercise to show how the numbers can be significantly different.

Marcelo Ávila de Oliveira

Nov 5, 2016, 11:40:18 AM
to Luca Milanesio, David Pursehouse, Sébastien Douche, Repo and Gerrit Discussion
2016-11-05 7:35 GMT-02:00 <luca.mi...@gmail.com>:
> On 5 Nov 2016, at 05:35, David Pursehouse <david.pu...@gmail.com> wrote:
> > On Sat, Nov 5, 2016 at 8:19 AM Luca Milanesio <luca.mi...@gmail.com> wrote:
> > > See for instance the real picture of the full Gerrit Project:
> > >
> > > 1. Shawn (7,201)
> > > 2. DavidP (4,592)
> >
> > Eh, I'm assuming this includes commits on all the plugins?
>
> Yes, that's the point. Most projects have more than one repo.
>
> > And does it include merge commits?
>
> Right, point taken :-)
> I just started by considering all commits :-) It was simply a 10' exercise to show how the numbers can be significantly different.

And does it include commits from all patchsets or just the submitted ones?

Marcelo Ávila de Oliveira

Nov 10, 2016, 11:17:51 AM
to Luca Milanesio, Repo and Gerrit Discussion
Luca?

Luca Milanesio

Nov 10, 2016, 11:27:21 AM
to Marcelo Ávila de Oliveira, David Pursehouse, Sébastien Douche, Repo and Gerrit Discussion
Only the submitted ones.
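
The two filters discussed in this thread (merge commits, and submitted vs. unsubmitted patch sets) can be sketched in Python as below. The `Commit` record, its `submitted` flag, and the sample data are hypothetical and only illustrate the counting rule; this is not how the plugin is implemented. On a plain Git checkout, `git rev-list --count --no-merges <ref>` gives a comparable non-merge count.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    author: str
    parent_count: int  # more than one parent means a merge commit
    submitted: bool    # hypothetical flag: True once the change was submitted


def count_commits(commits, include_merges=False):
    """Count submitted commits per author, skipping merges unless asked."""
    totals = Counter()
    for commit in commits:
        if not commit.submitted:
            continue  # patch sets that were never submitted do not count
        if commit.parent_count > 1 and not include_merges:
            continue  # merge commit
        totals[commit.author] += 1
    return totals


# Made-up sample data.
history = [
    Commit("Alice", 1, True),
    Commit("Alice", 2, True),   # merge commit
    Commit("Alice", 1, False),  # unsubmitted patch set
    Commit("Bob", 1, True),
]

without_merges = count_commits(history)
with_merges = count_commits(history, include_merges=True)
print(without_merges, with_merges)
```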
