limits on publishers, publishes/second + expected latency + compression on publish


assumednormal

Jun 6, 2016, 4:39:22 PM
to Google Cloud Pub/Sub Discussions
I'm using Pub/Sub as part of a larger data pipeline that must handle 100,000 messages/second at a minimum. I'm currently sending a fraction of that volume while building the pipeline, and I'm seeing enough latency to make me concerned about the throughput of Pub/Sub. There are currently 70+ instances of a service running, each publishing batches of ~60 messages at a time to the same Pub/Sub topic.
  1. Is there a limit to the number of publishers for a given Pubsub topic?
  2. Is there a limit to the number of publishes/second for a given Pubsub topic?
  3. What are typical latency times on publishing with the Go API?
Also, does the Go API compress messages before publishing to Pubsub?

Kir Titievsky

Jun 6, 2016, 4:56:57 PM
to assumednormal, Google Cloud Pub/Sub Discussions
Hi, AssumedNormal,

Thanks for trying Pub/Sub! Usually, strange latency and throughput measurements end up being artifacts of the setup. End-to-end measurement in particular is tricky because it depends heavily on how frequently you pull the messages and with what timeout.

Could you tell me a bit more about how you are measuring latency (is it between publish request and response, or end to end?) and what numbers you are seeing?

k

Kir Titievsky | Product Manager | Google Cloud Pub/Sub

--
You received this message because you are subscribed to the Google Groups "Google Cloud Pub/Sub Discussions" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cloud-pubsub-dis...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/cloud-pubsub-discuss/9aba3c90-2f4a-4dd5-9ca5-ee2ee3292718%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.




Alex Mordkovich

Jun 9, 2016, 6:52:54 PM
to assumednormal, Google Cloud Pub/Sub Discussions, Kir Titievsky
Also, you can find information about quotas and limits at https://cloud.google.com/pubsub/quotas.

assumednormal

Jun 9, 2016, 9:16:45 PM
to Google Cloud Pub/Sub Discussions, maxwell.w...@gmail.com
Here are some sample timings (nanoseconds).

3717572181
3881225435
3965445183
3887076452
3794808345
3884464320
3785684191
3899791197
3735564428
3730783918
3773444851

Sampled using:

t := time.Now()
_, err := topic.Publish(ctx, buf...) // buf is the batch of *pubsub.Message values
timingCh <- time.Since(t)
if err != nil {
    log.Printf("publish failed: %v", err)
}

A goroutine wrote durations to a file.


assumednormal

Jun 9, 2016, 9:36:11 PM
to Google Cloud Pub/Sub Discussions, maxwell.w...@gmail.com
I forgot to add that buf always contains 1000 messages.

assumednormal

Jun 10, 2016, 2:46:24 PM
to Google Cloud Pub/Sub Discussions, maxwell.w...@gmail.com
Here are some more timings. The ones I posted earlier seemed a bit slower than normal. Same methodology as above; 1000 messages per batch publish; in nanoseconds.

1873451808
1006860688
960805166
1031071650
959431664
948844165
934066813
824746696
1073776012
893119314
4180969068
1056010609
1351117508
1436416040
1015488749
1106427037
964076157
1155006552
884198150
916935294
907859456
1229922372
1084849635
855181743
1095356648
878717297
1094342141
1264016780
943205719
870119753
921160769
941944150

Kir Titievsky

Jun 11, 2016, 1:21:55 PM
to assumednormal, Google Cloud Pub/Sub Discussions

Thank you for these. We certainly expect this to be in the tens of milliseconds, not seconds. Would you tell me where you are running the code?




--
Kir Titievsky | Product Manager | Google Cloud Pub/Sub 


assumednormal

Jun 11, 2016, 1:28:58 PM
to Google Cloud Pub/Sub Discussions, maxwell.w...@gmail.com
These were run on my Debian VM.

$ uname -a
Linux assumednormal-vm01 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux
$ go version
go version go1.6.2 linux/amd64

myk...@oden.io

Aug 19, 2016, 3:40:57 PM
to Google Cloud Pub/Sub Discussions
Hi,

For background, we are an IoT company considering migrating to GCP because of the attractive promises of Bigtable and Pub/Sub. However, we are also seeing bad latencies using Pub/Sub for a use case similar to the original poster's:

* Batch writes, 60 messages
* System must handle hundreds of thousands of messages/sec

With the gcloud Python library (which wraps the HTTP API), we are seeing latencies of 400ms min, 600ms avg, and 5000ms max. With the pubsub_pb2 Python library (which wraps the gRPC API), we are seeing slightly better latencies of 50ms, 300ms, and 1000ms, respectively.

We understand that there are trade-offs between high throughput and low latency, but anything >100ms is unacceptable for us. To be clear, these publish requests are all being made from Kubernetes containers hosted on GCE, so they should be treated as coming from inside Google's network. Are there currently any initiatives to decrease publish latency? And what is the current status of the gRPC library development? It's largely undocumented and perhaps not ready for production.

Maybe we are misunderstanding the purpose of Pub/Sub, or using it wrong? Right now we are using RabbitMQ for all of our realtime message queues and getting publish latencies under 2ms. Is there any hope of Pub/Sub matching this performance at some point? Latency is the main deal breaker for our full migration from AWS to GCP, and until the problem is solved we'll be in a holding pattern.

Kir Titievsky

Aug 22, 2016, 3:15:13 PM
to myk...@oden.io, Google Cloud Pub/Sub Discussions
Mykola,

Thanks for the detailed response! At the moment, I'm seeing ~16ms at the median and ~60-70ms at the 95th percentile for the prober jobs we run to monitor this. That's typical, which tells me something else is going on with your particular setup. Shall we set up a call to figure out what's keeping your code at 300ms?

You are right that in general we expect latencies above 10ms. You could do better with a locally hosted message broker and a "lax" replication policy. Taking the extra time allows us to guarantee message durability: we replicate each message to more than a single geographical location before acknowledging a publish request. If sub-10ms latency is critical for you, Cloud Pub/Sub is not the right solution, but of course that low latency will cost you in other dimensions.

On gRPC: at the moment, Cloud Pub/Sub's gRPC APIs are technically in alpha, which means our support for the client libraries is limited. That said, the most recent versions of the Google Cloud Platform client libraries support gRPC under the hood (see this page for Java, and navigate to Python, PHP, Ruby, Node.js, etc.). These libraries are rapidly approaching beta status.
If you need more control or different languages, our service is not particularly special: you can follow the instructions on grpc.io to compile client libraries in any of the 11 languages the framework supports.


Kir 
Product Manager
Google Cloud Pub/Sub





myk...@oden.io

Aug 22, 2016, 4:05:03 PM
to Google Cloud Pub/Sub Discussions, myk...@oden.io
Awesome! Thanks for that. I've pinged you offline to set up a review of our stack.

To your points: sub-10ms latency is not critical for us. As I mentioned, we were seeing ~600ms average latencies with the official gcloud Python library (batching enabled). I figured this was typical because it's also the number Spotify saw when they ran their own benchmark (source):

publishes       198,902 (  156,137 avg) messages/s      503.912 (     620.260 avg) ms latency    1,565,391 total

When we switched to the gRPC library, we were still seeing latencies around ~150ms. Most troubling, however, are the occasional spikes to 5,000 and even 10,000ms. Most of our statsd numbers look like this:

[statsd latency graph not preserved]

However, a bunch also look like this :(

[statsd latency graph not preserved]

If we were using the library wrong, I would expect uniformly bad performance. This seems more consistent with congested networks, or perhaps our Google Container Engine instances are not configured properly? Looking forward to working with you to resolve this.


Thanks again for your reply =)



--Mykola

myk...@oden.io

Aug 22, 2016, 4:31:37 PM
to Google Cloud Pub/Sub Discussions
For posterity, here's what 18 hours of publishing to Pub/Sub (with gRPC) looks like on a logarithmic (base 2) scale. You'll see that we're averaging >100ms, with fairly frequent spikes to insane numbers:

[latency graph not preserved]
