What is the purpose of the two CQ pointers in the C++ async interface?


Kirill Katsnelson

Apr 17, 2021, 3:49:40 PM
to grpc.io
For an async C++ server, the method declarations generated by the C++ protoc plugin take two completion queues: new_call_cq and notification_cq. One of them is declared as the more derived ServerCompletionQueue*, the other as "just" a CompletionQueue*.  I'm in the habit of using one CQ for both, a pattern I likely picked up from the async hello world example. It works, so I have not changed it.
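
Concretely, the pattern I mean, condensed from the async hello world example (just the shape of it, with shutdown and per-call bookkeeping elided):

// Roughly the async hello world pattern: the same ServerCompletionQueue is
// passed for both the new_call_cq and notification_cq arguments.
#include <memory>
#include <grpcpp/grpcpp.h>
#include "helloworld.grpc.pb.h"

int main() {
  helloworld::Greeter::AsyncService service;
  grpc::ServerBuilder builder;
  builder.AddListeningPort("0.0.0.0:50051", grpc::InsecureServerCredentials());
  builder.RegisterService(&service);
  std::unique_ptr<grpc::ServerCompletionQueue> cq = builder.AddCompletionQueue();
  std::unique_ptr<grpc::Server> server = builder.BuildAndStart();

  grpc::ServerContext ctx;
  helloworld::HelloRequest request;
  grpc::ServerAsyncResponseWriter<helloworld::HelloReply> responder(&ctx);

  // One queue serves both roles, so the new-call notification and all
  // subsequent per-call events are drained by the single Next() loop below.
  service.RequestSayHello(&ctx, &request, &responder, cq.get(), cq.get(),
                          /*tag=*/&ctx);

  void* tag;
  bool ok;
  while (cq->Next(&tag, &ok)) {
    // ... dispatch on tag, as in the full example ...
  }
}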

But for the life of me, I could not find any information on why the async interface accepts two CQs, or on what use cases benefit from two separate CQs for a single call. Any info out there?

 -kkm

Mark D. Roth

Apr 21, 2021, 1:35:48 PM
to Kirill Katsnelson, grpc.io
A lot of the CQ-based API was designed around the idea that the application could tune performance by deciding which activity was going to occur on which CQ and then deciding which thread(s) were going to poll each CQ.  So in the case you're asking about, the API allows using one CQ to receive the notification of a new call and then another CQ to perform the operations on that individual call (which effectively controls which fds are polled on which threads).
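
For illustration, a rough sketch of that split using the hello world Greeter service (not a complete server, and the semantics are as I describe above: the new-call tag is assumed to arrive on notification_cq, while operations on the accepted call complete on new_call_cq):

// Separate queues: notification_cq receives the new-call tag, call_cq
// carries the operations on that call, and each queue is polled by its own
// thread (which is the performance knob described above).
#include <memory>
#include <thread>
#include <grpcpp/grpcpp.h>
#include "helloworld.grpc.pb.h"

int main() {
  helloworld::Greeter::AsyncService service;
  grpc::ServerBuilder builder;
  builder.AddListeningPort("0.0.0.0:50051", grpc::InsecureServerCredentials());
  builder.RegisterService(&service);
  auto call_cq = builder.AddCompletionQueue();          // per-call operations
  auto notification_cq = builder.AddCompletionQueue();  // new-call notifications
  auto server = builder.BuildAndStart();

  grpc::ServerContext ctx;
  helloworld::HelloRequest request;
  grpc::ServerAsyncResponseWriter<helloworld::HelloReply> responder(&ctx);
  service.RequestSayHello(&ctx, &request, &responder,
                          call_cq.get(), notification_cq.get(), /*tag=*/&ctx);

  // Whichever thread polls a queue ends up doing the work behind that queue,
  // so this split decides which thread handles which kind of event.
  auto poll = [](grpc::ServerCompletionQueue* cq) {
    void* tag;
    bool ok;
    while (cq->Next(&tag, &ok)) {
      // ... dispatch on tag ...
    }
  };
  std::thread accept_thread(poll, notification_cq.get());
  std::thread call_thread(poll, call_cq.get());
  accept_thread.join();
  call_thread.join();
}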

In practice, what we've found is that the CQ-based API is way too complicated, and basically no one actually needs the level of performance control that it was designed to provide.  So we're currently working on changing to a polling model where the event engine provides all polling threads and the application API is a simpler callback-based model.  We will continue to support the CQ-based API, but it will use the new polling model under the hood.  Once that happens, the CQ-based API will continue to work the same way it does today, but moving some of the work to a different CQ may no longer have the performance benefit that it might today, so there will probably be no reason to ever do that.



--
Mark D. Roth <ro...@google.com>
Software Engineer
Google, Inc.

Mark Sandan

Apr 21, 2021, 3:34:36 PM
to grpc.io

> So we're currently working on changing to a polling model where the event engine provides all polling threads and the application API is a simpler callback-based model.
Is this the PR for the callback-based model you are referring to? https://github.com/grpc/proposal/pull/180

> In practice, what we've found is that the CQ-based API is way too complicated, and basically no one actually needs the level of performance control that it was designed to provide.

I'd like to explore those performance controls in the current CQ-based API. Is there documentation around tuning these parameters? One parameter I'd be interested in is whether we can tune the number of threads used behind the relevant grpc cpp APIs. I'd like to see how latency for our application changes based on the number of threads.

Mark D. Roth

Apr 21, 2021, 5:25:53 PM
to Mark Sandan, grpc.io
On Wed, Apr 21, 2021 at 12:34 PM Mark Sandan <masan...@gmail.com> wrote:

>> So we're currently working on changing to a polling model where the event engine provides all polling threads and the application API is a simpler callback-based model.
> Is this the PR for the callback-based model you are referring to? https://github.com/grpc/proposal/pull/180

There are actually two related efforts that we're working on:

1. The new C++ callback-based API, which is described in the PR you linked.  This is currently available as an experimental API, and you're welcome to try it out and give us feedback (a rough sketch of what a server looks like under it follows this list).  It's usable today via a short-term hack involving some dedicated polling threads, and we'll probably de-experimentalize it in the not-too-distant future.

2. A new EventEngine API, which will allow implementing a custom event loop for gRPC.  We'll provide a default EventEngine implementation for each platform, but you'll be able to write your own implementation and have gRPC use that instead.  The EventEngine implementation will be able to fully control things like the number of polling threads, so if our default EventEngine implementation for your platform doesn't provide enough performance, you'll be able to write your own, tuning it however you need to.  Each EventEngine implementation will be able to expose whatever tuning knobs it wants to.  The EventEngine effort is currently in the early prototyping stages.  Once we finish moving to it, it will eliminate the short-term hack that we're using to make the C++ callback API work.
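
To make item 1 concrete, here is a rough sketch of a unary handler under the callback-based model, again using the hello world Greeter service; the names below are the de-experimentalized ones, and they differed slightly while the API was still marked experimental:

// Callback-model sketch: no completion queue to create or drain; gRPC's own
// threads invoke the reactor when the RPC arrives and when it finishes.
#include <memory>
#include <string>
#include <grpcpp/grpcpp.h>
#include "helloworld.grpc.pb.h"

class GreeterService final : public helloworld::Greeter::CallbackService {
  grpc::ServerUnaryReactor* SayHello(grpc::CallbackServerContext* ctx,
                                     const helloworld::HelloRequest* request,
                                     helloworld::HelloReply* reply) override {
    reply->set_message("Hello " + request->name());
    grpc::ServerUnaryReactor* reactor = ctx->DefaultReactor();
    reactor->Finish(grpc::Status::OK);
    return reactor;
  }
};

int main() {
  GreeterService service;
  grpc::ServerBuilder builder;
  builder.AddListeningPort("0.0.0.0:50051", grpc::InsecureServerCredentials());
  builder.RegisterService(&service);
  std::unique_ptr<grpc::Server> server = builder.BuildAndStart();
  server->Wait();
}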
 

>> In practice, what we've found is that the CQ-based API is way too complicated, and basically no one actually needs the level of performance control that it was designed to provide.
>
> I'd like to explore those performance controls in the current CQ-based API. Is there documentation around tuning these parameters? One parameter I'd be interested in is whether we can tune the number of threads used behind the relevant grpc cpp APIs. I'd like to see how latency for our application changes based on the number of threads.

If you want to have multiple threads polling, then all you need to do in the current CQ-based API is to have multiple threads calling CompletionQueue::Next() on the same CQ.  That should basically give you the same scalability that you'll continue to have once we move to EventEngine.  The more threads you have polling the CQ, the more threads you have polling the underlying fds.
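
For example, a minimal sketch of that (PollLoop and StartPollers are just illustrative helpers, and num_threads is a number you pick for your experiment, not a gRPC-provided setting):

// Several threads polling one completion queue; each thread blocked in Next()
// is available to poll the underlying fds and to run the completion it picks up.
#include <thread>
#include <vector>
#include <grpcpp/grpcpp.h>

void PollLoop(grpc::ServerCompletionQueue* cq) {
  void* tag;
  bool ok;
  while (cq->Next(&tag, &ok)) {
    // ... dispatch on tag ...
  }
}

void StartPollers(grpc::ServerCompletionQueue* cq, int num_threads) {
  std::vector<std::thread> pollers;
  for (int i = 0; i < num_threads; ++i) {
    pollers.emplace_back(PollLoop, cq);
  }
  for (std::thread& t : pollers) {
    t.join();
  }
}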

I would, however, caution you against relying on mechanisms to try to split up work within one channel or one server onto different pools of threads.  For example, don't use two different CQs in the request-call API you originally asked about, and don't try to dispatch different calls from the same server onto different CQs.  Those are the sorts of knobs that will no longer be available when we move to EventEngine.  (If you do something like that, your code will not break, but it may not perform the same way, so any tuning effort you've put into this will have been wasted.)

I hope this information is helpful.
 


