Message Broker (GCP PubSub) based alternative to SSH stream-events command

Thomas Dräbing

Apr 19, 2023, 4:59:46 AM
to Repo and Gerrit Discussion
Dear Gerrit-community,

I would like to bring up a new feature that I am currently working on, which allows Gerrit users to subscribe to events via GCP PubSub.

Why do we need this feature?

At SAP we started to serve git fetch/clone requests from a Gerrit replica to alleviate the load on the primary Gerrit. To make this easy for our users and to enforce it, we use a load balancer to automatically route git fetch/clone requests to the Gerrit replica. However, this only works if HTTPS is being used. With SSH this is not possible, since SSH connections are handled on network layer 4 in the load balancer, at which the request content cannot be inspected. Thus, we can't make any routing decision based on the request. At the moment, our stakeholders therefore have to configure a specific URL to fetch/clone from the Gerrit replica instead of the primary Gerrit. However, we can't enforce this. Thus, we would like to disable SSH. For nearly everything that can be done with SSH there is an alternative using HTTPS, i.e. git over HTTPS or the Gerrit REST API. The only exception is subscribing to stream-events, which is only possible via SSH. Since a lot of our stakeholders rely on this feature, e.g. because they use the gerrit-trigger plugin in Jenkins, we need to find an alternative.

What is the proposed solution?

The events-broker API [1] already supports publishing events to a message broker. It is mainly used to allow multiple Gerrit sites in a multisite setup to exchange events. Using a message broker to share events has a big advantage over using SSH: events are not lost if the subscriber is offline. Instead, they are retained and will be delivered as soon as the subscriber is listening again. However, the events-broker API is currently not usable for a scenario involving end users of Gerrit for the following reasons:

- Gerrit permissions restrict what events a user is allowed to see, e.g. if a project has limited read access, users without read permission in the project should not be able to see events involving that project. The current implementation does not filter events, since the main use case (communication between Gerrit instances) does not require it.
- Subscribing to a PubSub system requires authentication with the PubSub system. In the case of GCP PubSub that means a ServiceAccount with a suitable role is required. Sharing infrastructure credentials with end users or managing ServiceAccounts for a high number of users is undesired and insecure. Thus, the subscription process should be managed by a separate service with Gerrit being the preferred candidate, since it already provides an ACL and REST API. The events-broker API and implementations however currently only create a single subscription for the Gerrit instance they are running in and don't allow managing subscriptions of users.
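
The filtering requirement from the first point can be illustrated with a minimal sketch. It is written in Python for brevity and is not the plugin's code; `Event`, `can_read` and the `acl` mapping are hypothetical stand-ins for Gerrit's event type and permission check.

```python
from dataclasses import dataclass
from typing import Iterable, List, Mapping, Set


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified stand-in for a Gerrit stream event."""
    type: str
    project: str


def can_read(acl: Mapping[str, Set[str]], user: str, project: str) -> bool:
    """Return True if `user` has read access to `project`.

    `acl` maps a project name to the set of users with read permission.
    This is an assumption standing in for Gerrit's real permission backend.
    """
    return user in acl.get(project, set())


def events_for_user(
    acl: Mapping[str, Set[str]], user: str, events: Iterable[Event]
) -> List[Event]:
    """Keep only the events the user is allowed to see."""
    return [e for e in events if can_read(acl, user, e.project)]


acl = {"public-project": {"alice", "bob"}, "secret-project": {"alice"}}
events = [
    Event("patchset-created", "public-project"),
    Event("change-merged", "secret-project"),
]

# bob has no read access to secret-project, so that event is dropped.
print([e.project for e in events_for_user(acl, "bob", events)])
```

The point of the sketch is that filtering has to happen on the Gerrit side, before an event is handed to the broker, because the broker itself knows nothing about Gerrit permissions.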

I have implemented a workflow for GCP PubSub in the events-gcloud-pubsub plugin [2]. The workflow is as follows:

1) The user creates a topic in PubSub using a REST API provided by the plugin in Gerrit.
Only one topic per user will be created, since multiple topics would not provide additional benefit. Separate topics per user are required, since we can only filter events based on permissions in Gerrit before publishing them.
2) Gerrit will create an EventListener that will publish user-scoped events to the topic.
On startup, Gerrit will create EventListeners for all existing topics associated with Gerrit users.
3) The user deploys a service that wants to consume Gerrit events. This service has to expose an endpoint that PubSub will use to push events to.
The endpoint has to use HTTPS and signed certificates. It is ideally publicly accessible (via the internet), but a solution for non-public endpoints using CloudRun and Serverless VPC Access as a proxy is also available.
4) The user creates one or multiple subscriptions using a REST API provided by the plugin in Gerrit.
The subscriptions will automatically be attached to the topic owned by the user. These subscriptions will be push subscriptions [3], i.e. PubSub will send a request containing the event to an endpoint defined when creating the subscription. Pull subscriptions will not be supported since those require GCP credentials to authenticate with PubSub. The request sent by PubSub contains a JWT token that can be used to authenticate the request [4].
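
On the consumer side, a push endpoint receives each event wrapped in a JSON envelope. The following Python sketch shows how such a request body could be unpacked. The envelope layout (`message.data` as base64) follows the Pub/Sub push format [3]; that the payload is a JSON-serialized Gerrit event is an assumption of this sketch, and `decode_push_body` is a hypothetical helper name.

```python
import base64
import json


def decode_push_body(body: bytes) -> dict:
    """Extract the event from a Pub/Sub push request body.

    Pub/Sub wraps each pushed message in a JSON envelope whose
    `message.data` field is base64-encoded. Treating the decoded payload
    as a JSON-serialized Gerrit event is an assumption of this sketch.
    """
    envelope = json.loads(body)
    payload = base64.b64decode(envelope["message"]["data"])
    return json.loads(payload)


# Build a sample request body the way Pub/Sub would wrap an event.
sample_event = {"type": "patchset-created", "project": "demo"}
sample_body = json.dumps({
    "message": {
        "data": base64.b64encode(json.dumps(sample_event).encode()).decode(),
        "messageId": "1",
    },
    "subscription": "projects/test-project/subscriptions/example",
}).encode()

event = decode_push_body(sample_body)
print(event["type"])
```

A real endpoint would additionally have to verify the JWT sent with the request [4] before trusting the body; this sketch leaves that step out.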

Some more notes:
- Unused subscriptions, i.e. those whose push requests fail, will be deleted by GCP PubSub after one day of inactivity. I also plan to add a scheduled job that regularly deletes topics without active subscriptions.
- This feature is optional and can be enabled in the plugin's configuration section in the gerrit.config.
- The REST APIs require a global capability.
- I plan to add support for stream-events via GCP PubSub to the gerrit-trigger plugin for Jenkins as one of the next steps.
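
The planned cleanup job essentially has to find topics that no subscription is attached to anymore. A minimal Python sketch of that selection step follows. The function name and the input mapping are hypothetical; in practice the data would come from the PubSub admin API rather than from a literal dictionary.

```python
from typing import Dict, List


def topics_without_subscriptions(subscriptions_by_topic: Dict[str, List[str]]) -> List[str]:
    """Return the names of topics that have no subscriptions, sorted by name."""
    return sorted(
        topic for topic, subs in subscriptions_by_topic.items() if not subs
    )


# Hypothetical snapshot: topic name -> names of attached subscriptions.
snapshot = {
    "gerrit-user-1000001": ["ci-trigger"],
    "gerrit-user-1000002": [],
    "gerrit-user-1000003": [],
}

# Candidates for deletion by the scheduled job.
print(topics_without_subscriptions(snapshot))
```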

Some more detailed documentation can be found at [5].

Is GCP PubSub ideal for this use case?

No. GCP PubSub is designed to exchange messages between services that are controlled by the same team. It is not designed for scenarios that involve sending messages to end users. This is likely the case for other message brokers as well. However, the proposed solution works around the limitations. There are certainly other message brokers that might be better suited, e.g. NATS, but they would have to be operated by ourselves, which was not desired in our case but would certainly be possible.

What does this cost?

I did some rough estimations based on the public pricing [6] and the usage we currently see in our instances (2.5 events/s, ~125 users using SSH stream-events, 1.2 subscriptions/user, retention of all messages in topics enabled) and came up with about $3 per user per month for throughput, storage and egress (premium tier, worldwide).
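
To give a feeling for the message volume behind these numbers, here is a small Python calculation. It only uses the figures quoted above and makes two simplifying assumptions that are not part of the original estimate: a 30-day month, and every user being allowed to see every event (an upper bound, since permission filtering reduces the volume). It does not compute prices.

```python
EVENTS_PER_SECOND = 2.5
USERS = 125
SUBSCRIPTIONS_PER_USER = 1.2
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: 30-day month

# Events emitted by Gerrit per month.
events_per_month = EVENTS_PER_SECOND * SECONDS_PER_MONTH

# Upper bound on messages published: one copy per user topic.
published_upper_bound = events_per_month * USERS

# Upper bound on push deliveries: one per subscription.
delivered_upper_bound = published_upper_bound * SUBSCRIPTIONS_PER_USER

print(f"events per month:     {events_per_month:,.0f}")
print(f"published (max):      {published_upper_bound:,.0f}")
print(f"push deliveries (max): {delivered_upper_bound:,.0f}")
```

Turning these volumes into a cost figure additionally requires the average event size and the current prices from [6], which are not stated here.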

Questions to you:

I would be very thankful for comments and ideas regarding this feature. Please share them!

Would this also be interesting for you?

As mentioned above, the current implementation was done in the events-gcloud-pubsub plugin. Ponch had the valid concern that this stretches the scope of the plugin quite a bit. Since only a very small set of classes already provided by the plugin are being used by this new feature, moving it to its own plugin is definitely an option. For users, this would only mean some duplicate configuration like the location of the ServiceAccountKey. Do you agree that this feature should be moved into a separate plugin? If yes, are you OK with creating a project for it on gerrit-review?

Thanks and best regards,

Luca Milanesio

Apr 19, 2023, 8:47:29 AM
to Repo and Gerrit Discussion, Luca Milanesio, Thomas Dräbing
Hi Thomas,
Thanks for sharing this with the community.
Yes.


As mentioned above, the current implementation was done in the events-gcloud-pubsub plugin. Ponch had the valid concern that this stretches the scope of the plugin quite a bit.

Which is good, I believe; that’s why we have open source, isn’t it?
Reusing, collaborating and extending the scope of the initial work to new use-cases.

Since only a very small set of classes already provided by the plugin are being used by this new feature, moving it to its own plugin is definitely an option.

What would be the benefits of having yet another similar *but different* plugin?
That would be quite confusing for other people to understand which is the one to use.

For users, this would only mean some duplicate configuration like the location of the ServiceAccountKey. Do you agree that this feature should be moved into a separate plugin? If yes, are you OK with creating a project for it on gerrit-review?

Of course anyone is free to create and fork the code, that’s why open-source exists after all :-)
I would recommend sticking to one plugin though, more for the benefit of the people using it than for us developing it.

Luca.


Matthias Sohn

Apr 19, 2023, 9:21:01 AM
to Luca Milanesio, Repo and Gerrit Discussion, Thomas Dräbing
+1, I second this opinion. Implementing this in one plugin simplifies usage.
I guess the additional functionality for managing subscriptions doesn't disturb those not using this feature.
 

Luca Milanesio

Apr 19, 2023, 9:43:48 AM
to Repo and Gerrit Discussion, Luca Milanesio, Thomas Dräbing, Matthias Sohn
Yep, agreed.

Luca.

Fabio Ponciroli

Apr 20, 2023, 12:50:08 PM
to Luca Milanesio, Repo and Gerrit Discussion, Thomas Dräbing, Matthias Sohn
My 2 cents.

I have to disagree about adding this logic to the events-gcloud-pubsub plugin. We are going to end up with a good chunk of code in the plugin just to manage infrastructure, while the purpose of the plugin is just to provide a producer/consumer for a particular technology. I would rather have more plugins with a clear scope than fewer with a broader scope. I think breaking the HA plugin into multiple plugins demonstrates the benefit of having smaller specialised plugins.

Having said that, it is not a big deal for me to have a single plugin. However, I think we should define the interfaces we are after in the events-broker and add the broker-specific implementation to each events-* plugin.
 

Thomas Dräbing

Apr 22, 2023, 4:46:10 AM
to Fabio Ponciroli, Luca Milanesio, Repo and Gerrit Discussion, Matthias Sohn
Limiting the scope of the plugin definitely has its merits. However, in this case, it would mean some duplicated configuration in the Gerrit config, e.g. the service account key or project ID, and some duplicated code, e.g. the ServiceAccountCredentialsProvider, which would be against the "Don't repeat yourself" principle. I would suggest a compromise: The events-broker module serves as an interface and the events-gcloud-pubsub is an implementation. Why not use a plugin providing shared functionality as an abstract plugin as it was done with its-base? This would be a breaking change and also require users to install more plugins, but it would eliminate duplicate configuration and code. WDYT?

Luca Milanesio

Apr 22, 2023, 8:53:40 AM
to Repo and Gerrit Discussion, Luca Milanesio, Fabio Ponciroli, Matthias Sohn, Thomas Dräbing
+1

Having the user duplicate the same information multiple times would be very confusing, on top of having to install two plugins to support publishing events to Google Cloud PubSub.

I would suggest a compromise: The events-broker module serves as an interface and the events-gcloud-pubsub is an implementation. Why not use a plugin providing shared functionality as an abstract plugin as it was done with its-base?

Which shared functionality would you put into events-broker?

This would be a breaking change and also require users to install more plugins, but it would eliminate duplicate configuration and code. WDYT?

I would avoid asking the end-user to install multiple plugins, unless they are really doing completely different things :-)

Luca.

Thomas Dräbing

Apr 24, 2023, 3:33:02 AM
to Luca Milanesio, Repo and Gerrit Discussion, Fabio Ponciroli, Matthias Sohn
- the configuration. Basically the configuration would look something like this in the end:

```
[plugin "events-gcloud-pubsub-base"]
    gcloudProject = test-project
    privateKeyLocation = /secrets/serviceAccount.json
[plugin "events-gcloud-pubsub"]
    numberOfSubscribers = 6
    subscriptionId = gerrit-subscription-0
    sendStreamEvents = true
    ackDeadlineSeconds = 10
    subscribtionTimeoutInSeconds = 10
    shutdownTimeoutInSeconds = 10
[plugin "events-gcloud-pubsub-user"]     #need to find a better name
    serviceAccountForUserSubs = service...@test-project.iam.gserviceaccount.com
```

- handling gcloud credentials, i.e. CredentialsProvider
- Test utilities like the PubSub emulator (local)
- We can likely deduplicate some code involved in communication with PubSub, e.g. creation of AdminClients

This base plugin would be small, but I think it would provide enough functionality to justify its existence. Another option would be not to put this functionality in a plugin, but in a shared library that is imported by both plugins (and packaged into the respective jars). I doubt though that this would work with the current Guice environment and how Gerrit manages plugins.


This would be a breaking change and also require users to install more plugins, but it would eliminate duplicate configuration and code. WDYT?

I would avoid asking the end-user to install multiple plugins, unless they are really doing completely different things :-)

The two functionalities are relatively decoupled at the moment, but are similar. They are similar, since both use PubSub to distribute Gerrit events.
They are different, because the target audiences (subscribers) are different. The existing functionality targets subscribers that are part of the same setup, e.g. other Gerrit instances in a multisite setup or a CI managed by the same admins. It assumes that the subscriber has permissions to see all events and has access to the GCP project (at least the PubSub Subscriber role). These subscribers are "pets" in the scope of the plugin, i.e. some manual configuration/setup can be done for each subscriber.
The new functionality targets Gerrit end users, i.e. the subscribers are not allowed to see all events but only the ones their Gerrit account has access to. The subscriber does not have access to the GCP account, since the trust is limited. The subscriber is treated as "cattle", i.e. no manual configuration in Gerrit or PubSub can be done per subscriber (self service).
To me this is a borderline case. Both having everything in one plugin and splitting it up have their merits. I can imagine that some users only want to use one of the functionalities and some want to use both. This is currently controlled via configuration, where the existing functionality can't be completely disabled. For me, this would be the main argument for splitting it into two plugins.

Thomas Dräbing

Apr 27, 2023, 4:01:26 AM
to Luca Milanesio, Repo and Gerrit Discussion, Fabio Ponciroli, Matthias Sohn
Can we decide on how to proceed in regards to creating a new plugin or adding the new functionality to the existing plugin?

Fabio Ponciroli

Apr 27, 2023, 4:20:58 AM
to Thomas Dräbing, Luca Milanesio, Repo and Gerrit Discussion, Matthias Sohn
Hi Thomas,


On Thu, 27 Apr 2023 at 10:01, Thomas Dräbing <thomas....@gmail.com> wrote:
Can we decide on how to proceed in regards to creating a new plugin or adding the new functionality to the existing plugin?

Sorry for the late reply. Shall we just have a quick call to discuss it with whoever is interested? It might be quicker. I reckon 15 minutes tops should be enough.

Thomas Dräbing

Apr 27, 2023, 5:59:41 AM
to Fabio Ponciroli, Luca Milanesio, Repo and Gerrit Discussion, Matthias Sohn

Makes sense. I have sent you a meeting invite for 3:30 PM CEST today. Let me know whether the time slot works for you.
If anybody else is interested, let me know and I will send an invitation.