GRPC Monitoring Hooks

mic...@improbable.io

Jul 13, 2015, 11:28:57 AM
to grp...@googlegroups.com
Hi,

Let me start by thanking you for open-sourcing GRPC. It is amazing to see how quickly it is narrowing the gap between the Google internal RPC mechanisms and what is available in the open source world.

We're currently evaluating GRPC as the base for our set of microservices, spanning Python, Go and Java. The idea is to have a well-defined and documented protocol that would make it easy to use services from whatever language you need. GRPC seems like a natural choice for this.

However, we're very data-driven and we have a philosophy of monitoring-first. We're currently using the Prometheus stack, and we instrument our Go, Java and Python code with its client libraries to export our monitoring on a scrapable /metrics endpoint. We'd love to have a generic mechanism in GRPC that would allow similar monitoring on both the client and the server side:
 * requests issued by Service + Method
 * requests failed by Service + Method
 * handling latency by Service + Method
 * data throughput by Service + Method
This kind of monitoring inside the RPC layer itself provides a uniform way for troubleshooting decentralized services, without requiring application developers to add it explicitly.
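
A minimal sketch of the four measurements listed above, assuming a plain in-memory store keyed by (service, method). `MethodStats`, `observe` and the `demo.Echo` service are hypothetical names used for illustration only; this is neither a GRPC API nor the Prometheus client library, and payload size stands in for throughput.

```python
import time
from collections import defaultdict


class MethodStats:
    """Hypothetical in-memory counters keyed by (service, method)."""

    def __init__(self):
        self.started = defaultdict(int)            # requests issued
        self.failed = defaultdict(int)             # requests failed
        self.latency_seconds = defaultdict(float)  # total handling time
        self.bytes_out = defaultdict(int)          # response bytes (throughput)

    def observe(self, service, method, handler, request):
        """Run handler(request) and record the measurements for that call."""
        key = (service, method)
        self.started[key] += 1
        begin = time.perf_counter()
        try:
            response = handler(request)
        except Exception:
            self.failed[key] += 1
            raise
        finally:
            # Latency is recorded for failed calls as well.
            self.latency_seconds[key] += time.perf_counter() - begin
        self.bytes_out[key] += len(response)
        return response


def echo(request: bytes) -> bytes:
    return request


def broken(request: bytes) -> bytes:
    raise RuntimeError("backend unavailable")


stats = MethodStats()
stats.observe("demo.Echo", "Say", echo, b"hello")
try:
    stats.observe("demo.Echo", "Fail", broken, b"hello")
except RuntimeError:
    pass
```

Because the wrapper sits around the handler rather than inside it, the application code (`echo`, `broken`) carries no monitoring logic of its own.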

We are aware that there are many monitoring solutions in the open source world. That's why, in https://github.com/grpc/grpc-go/issues/240, we proposed a general set of hooks for GRPC Go that would allow for arbitrary integrations. However, there was pushback on that approach, citing performance concerns.

Is support for monitoring something that GRPC as a whole has on its roadmap? If so, how are you planning to support generic integrations across multiple languages?
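
To illustrate what a general set of hooks could look like, here is a rough sketch under stated assumptions. The `Hooks` registry, the `on_start`/`on_end` callbacks and `RecordingListener` are all hypothetical and are not the API proposed in the linked issue. The early return when no listener is registered is one possible way to keep the unmonitored path cheap.

```python
class Hooks:
    """Hypothetical hook registry; any monitoring backend can register."""

    def __init__(self):
        self._listeners = []

    def register(self, listener):
        self._listeners.append(listener)

    def invoke(self, full_method, handler, request):
        # Fast path: with no listeners the call is passed straight through.
        if not self._listeners:
            return handler(request)
        for listener in self._listeners:
            listener.on_start(full_method)
        error = None
        try:
            return handler(request)
        except Exception as exc:
            error = exc
            raise
        finally:
            for listener in self._listeners:
                listener.on_end(full_method, error)


class RecordingListener:
    """Example backend that just remembers what it was told."""

    def __init__(self):
        self.events = []

    def on_start(self, full_method):
        self.events.append(("start", full_method))

    def on_end(self, full_method, error):
        self.events.append(("end", full_method, error is None))


hooks = Hooks()
listener = RecordingListener()
hooks.register(listener)
result = hooks.invoke("/demo.Echo/Say", lambda req: req.upper(), "hi")
```

A Prometheus integration would then be one listener among several, with the RPC layer unaware of which backends are attached.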

Cheers,
Michal Witkowski
Head of Infrastructure, Improbable
 

Craig Tiller

Jul 13, 2015, 11:40:50 AM
to mic...@improbable.io, grp...@googlegroups.com, ave...@google.com

Alistair Veitch

Jul 15, 2015, 2:34:14 PM
to mic...@improbable.io, grp...@googlegroups.com
Hi Michal, that is a great set of questions. We too believe in monitoring first, and need to have great facilities in gRPC ourselves. To accomplish this, we are currently putting a version of our internal monitoring and resource measurement APIs into gRPC, and that will be accompanied by a "clean-room", open-source version of much of the functionality (unfortunately, we can't directly open-source our internal code, because it has too many dependencies on other internal systems and libraries). We will have APIs in place very shortly, and a full implementation will follow over the next couple of months. A rough ordering of features you can expect:
  1. Basic stats - latency, throughput, etc
  2. Distributed tracing (Dapper-style)
  3. Resource utilization stats (e.g. CPU consumed by service/method)
  4. Cross-stack stats and resource utilization (e.g. if Service A calls B calls C, then a breakdown of resources used at C by callers in A)
  5. Ability to define custom metrics
As far as data export, we expect to provide integration with services like Google Cloud Monitoring, some sample "stats on an HTML page" mechanism, and work with the wider community on things like Prometheus. 
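
Item 4 in the list above can be pictured with a toy example. This is only a sketch of one possible mechanism, assuming each call carries an `origin` tag in its metadata that intermediate services forward unchanged; the function names, the `origin` key and the cost figures are hypothetical and say nothing about how gRPC implements it.

```python
from collections import defaultdict

# Resource usage recorded at service C, broken down by the originating caller.
cpu_by_origin = defaultdict(float)


def service_c(metadata, cost):
    # C attributes its own cost to whoever started the call chain.
    cpu_by_origin[metadata.get("origin", "unknown")] += cost


def service_b(metadata, cost):
    # B forwards the caller's metadata unchanged when it calls C.
    service_c(dict(metadata), cost)


def service_a(method, cost):
    # A is the entry point, so it stamps the origin tag.
    service_b({"origin": "A/" + method}, cost)


service_a("Search", 0.25)
service_a("Search", 0.25)
service_a("Index", 1.0)
```

After these three calls, `cpu_by_origin` holds one total per originating method in A, even though C never talks to A directly.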

I'd be very interested in working with you if you have particular requirements for monitoring and tracing.

Cheers,
    Alistair

Yann Ramin

Jul 15, 2015, 7:06:28 PM
to grp...@googlegroups.com, mic...@improbable.io
Very interested in seeing what you develop, and especially how you are handling some of the roll-up call semantics. 

Michal Witkowski

Jul 19, 2015, 5:04:30 PM
to grp...@googlegroups.com, thea...@gmail.com, ave...@google.com

Alistair, thanks for the detailed plan! It sounds like you've got a solid roadmap ahead. Can you share some ETAs for when the APIs will be available?

If they're coming soon (2-3 weeks), we could possibly contribute Prometheus client metrics (http://prometheus.io/docs/instrumenting/clientlibs/) to GRPC Go and Java. They seem to be gaining popularity among big open source projects (Kubernetes, Docker and Etcd have adopted them), and are very powerful and easy to use, especially for people familiar with the Google monitoring stack :-)

Derek Perez

Sep 24, 2015, 10:20:27 PM
to grpc.io, thea...@gmail.com, ave...@google.com, mic...@improbable.io
Friendly ping, did anything come of this yet?

Michal Witkowski

Sep 25, 2015, 5:05:58 AM
to Derek Perez, grpc.io, thea...@gmail.com, ave...@google.com

I've got an outstanding PR with the grpc Go project that the maintainers are reviewing. It adds a general API framework for monitoring of RPCs on both client and server side, with an example integration with Prometheus.

Derek Perez

Sep 25, 2015, 11:50:59 AM
to Michal Witkowski, grpc.io, thea...@gmail.com, ave...@google.com

Can you link me to it? I'd like to follow along.

Michal Witkowski

Sep 28, 2015, 2:55:11 AM
to Derek Perez, grpc.io, thea...@gmail.com, ave...@google.com

Gesly George

Nov 11, 2015, 1:08:53 AM
to grpc.io, de...@derekperez.com, thea...@gmail.com, ave...@google.com, mic...@improbable.io
Hi,

Has any progress been made on adding monitoring hooks to the grpc stack in different languages? It was mentioned early in this thread that APIs would be added very shortly, so I'm wondering whether this has made it to the public repo.

Thanks,
Gesly

Gesly George

Dec 4, 2015, 6:51:47 PM
to grpc.io, de...@derekperez.com, thea...@gmail.com, ave...@google.com, mic...@improbable.io
I saw a reference to Census mentioned here:
https://github.com/grpc/grpc-go/issues/240
https://github.com/grpc/grpc/tree/master/src/core/census

Is this something that will be added to the other implementations as well?

mic...@improbable.io

May 14, 2016, 8:57:58 AM
to grpc.io, de...@derekperez.com, thea...@gmail.com, ave...@google.com, mic...@improbable.io
Since Go GRPC recently acquired Instrumentation, we decided to move our Prometheus monitoring code from a long-lasting upstream PR to a separate set of gRPC Go interceptors.

The code and query examples are here:

We'll be working on a Java interceptor for Prometheus monitoring as well.

Thanks,
M