Dear Gerrit-community,
I would like to bring up a new feature that I am currently working on, which allows Gerrit users to subscribe to events via GCP PubSub.
Why do we need this feature?
At SAP we started to serve git fetch/clone requests from a Gerrit replica to alleviate the load on the primary Gerrit. To make this easy for our users and to enforce it, we use a load balancer to automatically route git fetch/clone requests to the Gerrit replica. However, this only works if HTTPS is being used. With SSH this is not possible, since SSH connections are handled on network layer 4 in the load balancer, at which the request content cannot be viewed. Thus, we can't make any routing decision based on the request. At the moment our stakeholders therefore have to configure a specific URL to fetch/clone from the Gerrit replica instead of the primary Gerrit. However, we can't enforce this. Thus, we would like to disable SSH. For nearly everything that can be done with SSH there is an alternative using HTTPS, namely git over HTTPS or the Gerrit REST API. The only exception is subscribing to stream-events, which is only possible via SSH. Since a lot of our stakeholders rely on this feature, e.g. because they use the gerrit-trigger plugin in Jenkins, we need to find an alternative.
What is the proposed solution?
The event-broker API [1] already supports publishing events to a message broker. It is mainly used to allow multiple Gerrit sites in a multisite setup to exchange events. Using a message broker to share events has a big advantage over using SSH: events are not lost if the subscriber is offline. Instead, they are retained and delivered as soon as the subscriber is listening again. However, the event-broker API is currently not usable for a scenario involving end users of Gerrit, for the following reasons:
- Gerrit permissions restrict what events a user is allowed to see, e.g. if a project has limited read access, users without read permission in the project should not be able to see events involving that project. The current implementation does not filter events, since the main use case (communication between Gerrit instances) does not require it.
- Subscribing to a PubSub system requires authentication with the PubSub system. In the case of GCP PubSub that means a ServiceAccount with a suitable role is required. Sharing infrastructure credentials with end users or managing ServiceAccounts for a high number of users is undesired and insecure. Thus, the subscription process should be managed by a separate service with Gerrit being the preferred candidate, since it already provides an ACL and REST API. The events-broker API and implementations however currently only create a single subscription for the Gerrit instance they are running in and don't allow managing subscriptions of users.
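To illustrate the first point, here is a minimal Python sketch of filtering events per user before they are published. It is not the plugin's code: the function name, the simplified event shape with a top-level "project" key, and the precomputed set of readable projects are all assumptions made for illustration. In Gerrit the visibility check would go through the real permission backend.

```python
def visible_events(events, readable_projects):
    """Keep only events whose project the user is allowed to read.

    Assumptions (hypothetical, for illustration only):
    - each event is a dict with a top-level "project" key
    - readable_projects is a set of project names resolved from Gerrit ACLs
    """
    return [e for e in events if e.get("project") in readable_projects]


events = [
    {"type": "patchset-created", "project": "public/tools"},
    {"type": "ref-updated", "project": "secret/payroll"},
]
# A user with read access to "public/tools" only never sees the second event.
filtered = visible_events(events, {"public/tools"})
```

Events without a project are dropped in this sketch, which errs on the side of not leaking anything.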
I have implemented a workflow for GCP PubSub in the events-gcloud-pubsub plugin [2]. The workflow is as follows:
1) The user creates a topic in PubSub using a REST API provided by the plugin in Gerrit.
Only one topic per user will be created, since multiple topics would not provide additional benefit. Separate topics per user are required, since we can only filter events based on permissions in Gerrit before publishing them.
2) Gerrit will create an EventListener that will publish user-scoped events to the topic.
On startup, Gerrit will create EventListeners for all existing topics associated with Gerrit users.
3) The user deploys a service that wants to consume Gerrit events. This service has to expose an endpoint that PubSub will use to push events to.
The endpoint has to use HTTPS and signed certificates. It is ideally publicly accessible (via the internet), but a solution for non-public endpoints using CloudRun and Serverless VPC Access as a proxy is also available.
4) The user creates one or multiple subscriptions using a REST API provided by the plugin in Gerrit.
The subscriptions will automatically be attached to the topic owned by the user. These subscriptions will be push subscriptions [3], i.e. PubSub will send a request containing the event to an endpoint defined when creating the subscription. Pull subscriptions will not be supported since those require GCP credentials to authenticate with PubSub. The request sent by PubSub contains a JWT token that can be used to authenticate the request [4].
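On the consumer side, the endpoint from step 3 receives the event wrapped in the PubSub push envelope, with the payload base64-encoded in message.data. The following Python sketch shows how a service might unwrap it. It assumes the payload is the Gerrit event serialized as JSON, and the function name is hypothetical. Verifying the JWT from the Authorization header [4] is deliberately left out here and would have to happen before the body is trusted.

```python
import base64
import json


def decode_push_request(body: bytes) -> dict:
    """Extract the event from a PubSub push request body.

    Assumption: the message payload is a JSON-serialized Gerrit event.
    """
    envelope = json.loads(body)
    # The push envelope carries the payload base64-encoded in message.data.
    payload = base64.b64decode(envelope["message"]["data"])
    return json.loads(payload)


# Example request body as a push subscription would deliver it
# (project, subscription and event values are made up).
example_event = {"type": "change-merged", "project": "public/tools"}
example_body = json.dumps({
    "message": {
        "data": base64.b64encode(json.dumps(example_event).encode()).decode(),
        "messageId": "1",
    },
    "subscription": "projects/example/subscriptions/example-sub",
}).encode()

decoded = decode_push_request(example_body)
```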
Some more notes:
- Unused subscriptions, i.e. those whose push requests fail, will be deleted by GCP PubSub after one day of inactivity. I also plan to add a scheduled job that regularly deletes topics without active subscriptions.
- This feature is optional and can be enabled in the plugin's configuration section in the gerrit.config.
- The REST APIs require a global capability.
- I plan to add support for stream-events via GCP PubSub to the gerrit-trigger plugin for Jenkins as one of the next steps.
Some more detailed documentation can be found at [5].
Is GCP PubSub ideal for this use case?
No. GCP PubSub is designed to exchange messages between services that are controlled by the same team. It is not designed for scenarios that involve sending messages to end users. This is likely the case for other message brokers as well. However, the proposed solution works around these limitations. There are certainly other message brokers that might be better suited, e.g. NATS, but we would have to operate them ourselves. That was not desired in our case, although it would certainly be possible.
What does this cost?
I did some rough estimates based on the public pricing [6] and the usage we currently see in our instances (2.5 events/s, ~125 users using SSH stream-events, 1.2 subscriptions/user, retention of all messages in topics enabled) and came up with about $3 per user per month for throughput, storage and egress (premium tier, worldwide).
Questions to you:
I would be very thankful for comments and ideas regarding this feature. Please share them!
Would this also be interesting for you?
As mentioned above, the current implementation was done in the events-gcloud-pubsub plugin. Ponch had the valid concern that this stretches the scope of the plugin quite a bit. Since this new feature uses only a very small set of classes already provided by the plugin, moving it to its own plugin is definitely an option. For users, this would only mean some duplicate configuration, like the location of the ServiceAccountKey. Do you agree that this feature should be moved into a separate plugin? If yes, are you OK with creating a project for it on gerrit-review?
Thanks and best regards,