gRFC A16: BinaryLogging


spenc...@google.com

Aug 28, 2018, 7:43:09 PM
to grpc.io
There is a gRFC describing the design of binary logging for gRPC, and feedback is welcome. The log events describe what the application sees for an RPC.
https://github.com/grpc/proposal/pull/41

Please keep the discussions on this thread. Thanks!

spenc...@google.com

Aug 28, 2018, 7:44:16 PM
to grpc.io
Correction: the title will be A18, not A16.

yas...@google.com

Aug 28, 2018, 8:32:37 PM
to grpc.io
Please use A19 instead

yas...@google.com

Aug 28, 2018, 8:44:33 PM
to grpc.io
A few observations on the proposal (especially when seen from the C++ side) -
  1. When the client receives a RST_STREAM or a GOAWAY, the deadline expires, the connection breaks, or a similar error condition occurs:
    • A fake status message is generated. The status_details reflect the error with which the RPC failed.
    • The documentation should be changed accordingly, or relaxed so that it no longer implies the trailers are always what was actually received from the wire.
  2. C++ allows the server to cancel. It differs from the other platforms on this and it probably shouldn't be able to do this, but it is the current ground truth, which means the server can also log EVENT_CANCEL.
    • This is fine, since the current documentation does not prohibit the server from logging EVENT_CANCEL.
  3. The client can initiate a Cancel but still receive the status from the server before the Cancel actually takes effect.
    • In this case, implementations should log the server trailers since that is how the RPC actually ended.
    • Since this is inherently racy, it should be fine for tools to be slightly imprecise.
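
A minimal sketch of the first observation, assuming a toy in-memory log rather than the real binary log proto: the trailer entry is recorded the same way whether the status arrived on the wire or was generated locally by the library. The class names, the method names, and the `from_wire` field are all hypothetical and exist only for illustration; nothing here is taken from the proposal or from any gRPC implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrailerLogEntry:
    """Hypothetical record of how an RPC ended, as the application saw it."""
    status_code: str
    status_details: str
    from_wire: bool  # Illustration only: False when the status was generated locally.


class ClientCallLog:
    """Toy client-side log; not the real binary logging API."""

    def __init__(self) -> None:
        self.entries: List[TrailerLogEntry] = []

    def on_wire_trailers(self, status_code: str, status_details: str) -> None:
        # Trailers that the server actually sent.
        self.entries.append(TrailerLogEntry(status_code, status_details, True))

    def on_local_failure(self, status_code: str, reason: str) -> None:
        # No trailers arrived (reset, broken connection, expired deadline, ...),
        # so the library generates a status and logs it in the same slot.
        self.entries.append(TrailerLogEntry(status_code, reason, False))


log = ClientCallLog()
log.on_local_failure("DEADLINE_EXCEEDED", "deadline expired before trailers arrived")
print(log.entries[0])
```

A tool reading such a log would treat both kinds of entry as "how the RPC ended" and could not assume that the bytes were ever on the network.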

yas...@google.com

Aug 28, 2018, 9:25:44 PM
to grpc.io
Also, this doc might be outdated but it does say that server can cancel RPCs https://grpc.io/docs/guides/concepts.html#cancelling-rpcs

Spencer Fang

Aug 28, 2018, 11:18:14 PM
to Yash Tibrewal, grpc.io
Responses inline:

On Tue, Aug 28, 2018 at 6:25 PM yashkt via grpc.io <grp...@googlegroups.com> wrote:
Also, this doc might be outdated but it does say that server can cancel RPCs https://grpc.io/docs/guides/concepts.html#cancelling-rpcs
I think C is the only implementation with a server-side API that allows this. Since the proposal already allows a C server-side cancel to be logged in a sensible way, I think we can add a comment about this possibility but not log the server's cancellation attempt itself.
 

On Tuesday, August 28, 2018 at 5:44:33 PM UTC-7, yas...@google.com wrote:
A few observations on the proposal (especially when seen from the C++ side) -
  1. When the client receives a RST_STREAM or a GOAWAY, the deadline expires, the connection breaks, or a similar error condition occurs:
    • A fake status message is generated. The status_details reflect the error with which the RPC failed.
    • The documentation should be changed accordingly, or relaxed so that it no longer implies the trailers are always what was actually received from the wire.
Updated the comments so they no longer suggest that the trailer always comes from the network.
  2. C++ allows the server to cancel. It differs from the other platforms on this and it probably shouldn't be able to do this, but it is the current ground truth, which means the server can also log EVENT_CANCEL.
    • This is fine, since the current documentation does not prohibit the server from logging EVENT_CANCEL.
Note, however, that EVENT_CANCEL on the server side would continue to mean that a cancellation has already happened. Presumably, even after a server-initiated cancellation in C, the server application will still get some signal that the call has been cancelled. Updated the comment to clarify this.
  3. The client can initiate a Cancel but still receive the status from the server before the Cancel actually takes effect.
    • In this case, implementations should log the server trailers since that is how the RPC actually ended.
 According to the (yet to be merged) gRPC call semantics, when a CANCEL happens: "inbound and outbound buffered data should be cleared. Cancellation trumps graceful completion; if the client gRPC implementation received the Trailers before the cancellation, yet the client application has not received the Trailers, then cancellation generally should win." I agree that if the implementation sees trailers it is fine to log them, but the implementation is not required to use the received status to end the RPC.
    • Since this is inherently racy, it should be fine for tools to be slightly imprecise.
Agreed.
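
A rough sketch of the server-side meaning of EVENT_CANCEL described above, assuming a toy logger with hypothetical method names: the server's own cancellation attempt writes nothing, and a single cancel entry is written only once the call is known to have been cancelled, whichever side started it. This is an illustration of the intent, not code from any gRPC implementation.

```python
from typing import List


class ServerCallLog:
    """Toy server-side call log; all names are hypothetical."""

    def __init__(self) -> None:
        self.events: List[str] = []
        self._cancel_logged = False

    def server_initiates_cancel(self) -> None:
        # Deliberately a no-op: the attempt itself is not logged.
        pass

    def on_cancelled(self) -> None:
        # Invoked when the server application is told the call has been
        # cancelled. Logged at most once per call.
        if not self._cancel_logged:
            self.events.append("CANCEL")
            self._cancel_logged = True


log = ServerCallLog()
log.server_initiates_cancel()
print(log.events)  # still empty: nothing is logged for the attempt
log.on_cancelled()
print(log.events)
```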

--
Spencer Fang

yas...@google.com

Aug 29, 2018, 7:28:42 PM
to grpc.io

On Tuesday, August 28, 2018 at 8:18:14 PM UTC-7, Spencer Fang wrote:
Responses inline:

On Tue, Aug 28, 2018 at 6:25 PM yashkt via grpc.io <grp...@googlegroups.com> wrote:
Also, this doc might be outdated but it does say that server can cancel RPCs https://grpc.io/docs/guides/concepts.html#cancelling-rpcs
I think C is the only implementation with a server-side API that allows this. Since the proposal already allows a C server-side cancel to be logged in a sensible way, I think we can add a comment about this possibility but not log the server's cancellation attempt itself.
If the gRPC guidelines allow a server to cancel, then why not log the cancellation attempt?

 
 

On Tuesday, August 28, 2018 at 5:44:33 PM UTC-7, yas...@google.com wrote:
A few observations on the proposal (especially when seen from the C++ side) -
  1. When the client receives a RST_STREAM or a GOAWAY, the deadline expires, the connection breaks, or a similar error condition occurs:
    • A fake status message is generated. The status_details reflect the error with which the RPC failed.
    • The documentation should be changed accordingly, or relaxed so that it no longer implies the trailers are always what was actually received from the wire.
Updated the comments so they no longer suggest that the trailer always comes from the network.
  2. C++ allows the server to cancel. It differs from the other platforms on this and it probably shouldn't be able to do this, but it is the current ground truth, which means the server can also log EVENT_CANCEL.
    • This is fine, since the current documentation does not prohibit the server from logging EVENT_CANCEL.
Note, however, that EVENT_CANCEL on the server side would continue to mean that a cancellation has already happened. Presumably, even after a server-initiated cancellation in C, the server application will still get some signal that the call has been cancelled. Updated the comment to clarify this.
  3. The client can initiate a Cancel but still receive the status from the server before the Cancel actually takes effect.
    • In this case, implementations should log the server trailers since that is how the RPC actually ended.
 According to the (yet to be merged) gRPC call semantics, when a CANCEL happens: "inbound and outbound buffered data should be cleared. Cancellation trumps graceful completion; if the client gRPC implementation received the Trailers before the cancellation, yet the client application has not received the Trailers, then cancellation generally should win." I agree that if the implementation sees trailers it is fine to log them, but the implementation is not required to use the received status to end the RPC. 
This would depend highly on where the binary logging implementation is in the stack. If it is closer to the application (which is what we want), the binary logging implementation would log the status with which the RPC ended as seen by the application. 

Spencer Fang

Aug 29, 2018, 8:16:19 PM
to Yash Tibrewal, grpc.io
On Wed, Aug 29, 2018 at 4:28 PM yashkt via grpc.io <grp...@googlegroups.com> wrote:

On Tuesday, August 28, 2018 at 8:18:14 PM UTC-7, Spencer Fang wrote:
Responses inline:

On Tue, Aug 28, 2018 at 6:25 PM yashkt via grpc.io <grp...@googlegroups.com> wrote:
Also, this doc might be outdated but it does say that server can cancel RPCs https://grpc.io/docs/guides/concepts.html#cancelling-rpcs
I think C is the only implementation with a server-side API that allows this. Since the proposal already allows a C server-side cancel to be logged in a sensible way, I think we can add a comment about this possibility but not log the server's cancellation attempt itself.
If the gRPC guidelines allow a server to cancel, then why not log the cancellation attempt?

Outcome of offline discussion: server cancellations are not considered a part of the semantics at the moment, so let's not log the server's initiation of the cancellation. 

  3. The client can initiate a Cancel but still receive the status from the server before the Cancel actually takes effect.
    • In this case, implementations should log the server trailers since that is how the RPC actually ended.
 According to the (yet to be merged) gRPC call semantics, when a CANCEL happens: "inbound and outbound buffered data should be cleared. Cancellation trumps graceful completion; if the client gRPC implementation received the Trailers before the cancellation, yet the client application has not received the Trailers, then cancellation generally should win." I agree that if the implementation sees trailers it is fine to log them, but the implementation is not required to use the received status to end the RPC. 
This would depend highly on where the binary logging implementation is in the stack. If it is closer to the application (which is what we want), the binary logging implementation would log the status with which the RPC ended as seen by the application.

I agree that we should log only the status given to the client application. I was just saying the client library can choose how to break a tie if there's a race and it was aware of both the client initiated cancellation and the incoming status. If what you meant by "since that is how the RPC actually ended" is "that is what the client application saw" then we are in total agreement.
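
To make the tie-break concrete, here is a minimal sketch of the race, assuming a toy call object with hypothetical names. Trailers from the wire are buffered until the application asks for the outcome; a cancellation that arrives first clears the buffer and wins; and the log records only the status that was actually handed to the application. Whether a cancelled call also gets a trailer-style entry is a choice made for this sketch, not something this thread settles.

```python
from typing import List, Optional, Tuple


class ClientCall:
    """Toy model of one client RPC; all names here are hypothetical."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, str]] = []  # (event, detail) pairs
        self._buffered_status: Optional[str] = None
        self._cancelled = False
        self._final_status: Optional[str] = None

    def on_trailers_from_wire(self, status: str) -> None:
        # Buffer the status; the application has not seen it yet.
        if self._final_status is None and not self._cancelled:
            self._buffered_status = status

    def cancel(self) -> None:
        if self._final_status is not None:
            return  # The application already saw the outcome; nothing to do.
        self._cancelled = True
        self._buffered_status = None  # Cancellation clears buffered data.
        self.log.append(("CANCEL", ""))

    def deliver_status(self) -> Optional[str]:
        """Hand the final status to the application and log exactly that."""
        if self._final_status is None:
            if self._cancelled:
                self._final_status = "CANCELLED"
            elif self._buffered_status is not None:
                self._final_status = self._buffered_status
            else:
                return None  # Nothing to deliver yet.
            self.log.append(("TRAILER", self._final_status))
        return self._final_status


# Trailers reached the library first, but the application cancelled
# before reading them, so the cancellation wins.
call = ClientCall()
call.on_trailers_from_wire("OK")
call.cancel()
print(call.deliver_status(), call.log)
```

If the application reads the status before calling `cancel()`, the same object logs only the server's status, and the later cancel is ignored.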

--
Spencer Fang

Spencer Fang

Sep 6, 2018, 8:52:48 PM
to Yash Tibrewal, grpc.io
We may need to remove the "grpc-trace-bin" trace header from the gRFC and wait until the census interaction with gRPC is more clearly defined. There may be different spans representing the client process, the stream, and the server process, and there is no single consistent behavior across languages today. Retries and hedging add further complexity because they are stream-level concepts: if each stream has a different trace span, the higher-level interceptors will not be able to see them. One reasonable approach is to log the client process span when a client RPC begins and the server process span when a server RPC begins. But this is creeping outside the scope of the binlog proto.
--
Spencer Fang

Xiaofeng Han

Jul 15, 2021, 11:51:47 PM
to grpc.io
Hello Spencer, 

This is Xiaofeng from Roblox. I just want to check with you whether binary logging is fully implemented. If so, could you please point me to the source code of the library? We would like to customize it a bit to build a debugging tool.

thanks,
Xiaofeng