Looking for some advice on a protocol defined using protobufs


Edvard Fagerholm

Feb 19, 2022, 12:12:41 PM
to Protocol Buffers
Hi everyone,

Since protobufs are used quite a bit in network protocols, I'm looking for some advice on designing extensible protocol definitions with protos. Neither I nor any of my colleagues have used them for this purpose, so we're trying to avoid blunders. The other end of the connection is a mobile device using protobuf-lite, so I'm not using "Any" fields, which don't have library support in the lite version. The use case is pushing various types of messages to a mobile device and back over WebSocket or UDP without really having to worry about transport-level issues.

I'm leaving out the authentication part of the protocol, since it's not relevant here. There are some key requirements:

1. It should be possible to ack messages (to support e.g. delivery callbacks and return errors).

2. It should be easy to add new payload types as well as remove old ones that are obsolete.

3. The protocol needs to be versioned and the server might need to handle old clients.

It would also help if types are chosen in a way that makes parsing quick without sacrificing ease of use and extensibility, since the server fleet will handle millions of concurrent users.

With these requirements in place, a somewhat simplified protocol definition with a payload type invented for this question would be:

    syntax = "proto3";
    package protocol;
   
    import "google/protobuf/timestamp.proto";
   
    // Basic protocol
   
    enum Platform {
      PLATFORM_UNSPECIFIED = 0;
      PLATFORM_ANDROID = 1;
      PLATFORM_IOS = 2;
    }
   
    enum MessageKind {
      MESSAGE_KIND_UNSPECIFIED = 0;
      MESSAGE_KIND_STATUS = 1;
      MESSAGE_KIND_NOTIFICATION = 2;
      // ... and others
    }
   
    enum StatusCode {
      STATUS_CODE_UNSPECIFIED = 0;
      STATUS_CODE_OK = 1;
      STATUS_CODE_PERMISSION_DENIED = 2;
      STATUS_CODE_INTERNAL_ERROR = 3;
    }
   
    // Every message written onto the wire is of this type.
    message ProtocolMessage {
      int32 version = 1;
      int32 sequence_number = 2;
   
      MessageKind kind = 3; // Indicates protobuf type encoded in 'data' field.
      bytes data = 4; // Serialized protobuf.
    }
   
    message Status {
      int32 sequence_number = 1;
      StatusCode code = 2;
      string error_string = 3;
    }
   
    // Notification payload
   
    message NotificationMessage {
      google.protobuf.Timestamp issued_at = 1;
      google.protobuf.Timestamp expiration = 2;
      google.protobuf.Timestamp not_before = 3;
   
      oneof kind {
        Promotion promotion = 4;
        PurchaseCompleted purchase = 5;
        // ... and others
      }
    }

Any comments or improvement ideas? Does oneof make sense here for fast parsing and extensibility/deprecation? Does it make sense to e.g. group various parts of the messages and choose field numbers from a specific range for each group? For example, in "NotificationMessage" would it make sense to start the field numbers in the oneof at e.g. 100 to make room for adding more fields before that? Of course, nothing forces you to list the fields in the protobuf definition in increasing numerical order. I remember having seen this done in some protos back in the day when I worked at Google.
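
One detail that bears on the numbering question: on the wire, each field's tag is a varint of (field_number << 3) | wire_type, so field numbers 1 to 15 fit in a one-byte tag while 16 to 2047 need two bytes. A rough sketch of a grouped layout under that constraint follows. The range boundaries are an arbitrary choice for illustration, and Promotion and PurchaseCompleted are empty placeholders defined only so the sketch compiles.

```proto
syntax = "proto3";
package protocol;

import "google/protobuf/timestamp.proto";

// Placeholder payloads, left empty for the purpose of this sketch.
message Promotion {}
message PurchaseCompleted {}

message NotificationMessage {
  // Common fields: numbers 1-15 keep the tag at one byte on the wire.
  google.protobuf.Timestamp issued_at = 1;
  google.protobuf.Timestamp expiration = 2;
  google.protobuf.Timestamp not_before = 3;
  // 4-15 are left free for future common fields (illustrative choice).

  // Payload variants: numbers 16-2047 cost a two-byte tag each.
  oneof kind {
    Promotion promotion = 100;
    PurchaseCompleted purchase = 101;
  }
}
```

Since at most one member of the oneof is set per message, starting the variants at 100 would cost one extra tag byte per message compared to numbers below 16.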

I'm planning on phasing out fields by prepending "deprecated_" to the names of fields that will be removed, and then changing the prefix to "OBSOLETE_" once the field is no longer in use anywhere.
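
For comparison with the renaming scheme, the proto language itself has a `deprecated` field option and a `reserved` statement. Field names are not part of the binary encoding, so a rename is wire-safe, but it does change generated accessors and JSON names. A minimal sketch of the built-in route, where `legacy_banner` and field number 6 are hypothetical and stand in for a field that has already been retired:

```proto
syntax = "proto3";
package protocol;

import "google/protobuf/timestamp.proto";

message NotificationMessage {
  google.protobuf.Timestamp issued_at = 1;
  google.protobuf.Timestamp expiration = 2;

  // Phase 1: still on the wire, but flagged so that generated code in
  // some languages (e.g. @Deprecated in Java) warns callers.
  google.protobuf.Timestamp not_before = 3 [deprecated = true];

  // Phase 2: the field is deleted and its number and name are reserved,
  // so protoc rejects any later attempt to reuse them.
  // 'legacy_banner' and 6 are hypothetical, for illustration only.
  reserved 6;
  reserved "legacy_banner";
}
```

Reserving the number guards against the case where an old client still sends the retired field and a newer schema would otherwise interpret those bytes as something else.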

I'd be interested to hear if anyone has lessons learned from using protobufs in network protocols, or improvement ideas to make the design more future-proof.

Best,
Edvard

Edvard Fagerholm

Feb 21, 2022, 7:36:46 AM
to Protocol Buffers
Hi,

Some specific thoughts I had on what I posted on Sunday. I currently use the following in the protocol:

    message ProtocolMessage {
      int32 version = 1;
      int32 sequence_number = 2;
   
      MessageKind kind = 3; // Indicates protobuf type encoded in 'data' field.
      bytes data = 4; // Serialized protobuf.
    }

    enum MessageKind {
      MESSAGE_KIND_UNSPECIFIED = 0;
      MESSAGE_KIND_STATUS = 1;
      MESSAGE_KIND_NOTIFICATION = 2;
      // ... and others
    }

This could also be transformed to:

    message ProtocolMessage {
      int32 version = 1;
      int32 sequence_number = 2;
   
      oneof kind {
        Status status = 3;
        NotificationMessage notification = 4;
        // ... and others
      }
    }

The benefit of the latter is that the protobuf parser only has to be invoked once per message. Are there any other concerns regarding code generation or extensibility? Using oneof does cause more clutter in the ProtocolMessage proto, and if anyone ever wanted to wrap something other than a protobuf in the protocol, that would require one more wrapper layer.
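
One possible middle ground, sketched under assumptions: a embedded message field and a `bytes` field both use the length-delimited wire type, so a nested message is encoded the same way as a `bytes` field holding its serialized form. The oneof variant could therefore keep an escape hatch for non-protobuf payloads as one more member. The `opaque` field and its number are hypothetical, and Status and NotificationMessage are reduced to empty placeholders here.

```proto
syntax = "proto3";
package protocol;

// Placeholders standing in for the real definitions.
message Status {}
message NotificationMessage {}

message ProtocolMessage {
  int32 version = 1;
  int32 sequence_number = 2;

  oneof kind {
    Status status = 3;
    NotificationMessage notification = 4;
    // Hypothetical escape hatch for payloads that are not protobufs.
    // How the receiver interprets these bytes is left undefined here.
    bytes opaque = 15;
  }
}
```

An old client that receives an unknown oneof member would see the oneof as unset, so it would still need a defined way to reject or ack such a message.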

Best,
Edvard