gRPC validation for best-practice security? (CWE-502)


Tom Hintz

unread,
Jan 13, 2021, 7:47:13 AM
to grpc.io
What infrastructure is available to validate gRPC messages prior to deserialization to protect against CWE-502 attacks?  Reference:

Eric Anderson

unread,
Jan 13, 2021, 3:44:17 PM
to Tom Hintz, grpc.io
On Wed, Jan 13, 2021 at 4:47 AM Tom Hintz <tdh...@gmail.com> wrote:
What infrastructure is available to validate gRPC messages prior to deserialization to protect against CWE-502 attacks?  Reference:


I wouldn't say there is any general support for managing attacks like that, because you shouldn't be using such marshalling formats unless you trust the remote. Protobuf, JSON (depending on parser), Flatbuffers, etc. are much more restricted than Java serialization and Python pickle, and don't allow specifying arbitrary classes.

Some languages may have APIs that would let you do some validation if you were so inclined. For example, in Java you could use an interceptor along with ServerInterceptors.useInputStreamMessages() so that the interceptor observes the message before it is deserialized. But solutions like that are very language-specific. Obviously a proxy would be a cross-language solution.
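The general pattern — inspecting the raw, undecoded bytes before the real parser ever runs — can be sketched generically. This is a toy illustration, not a gRPC API; `guard_then_parse` and `MAX_MESSAGE_BYTES` are hypothetical names:

```python
MAX_MESSAGE_BYTES = 4 * 1024 * 1024  # illustrative limit, analogous to a max inbound message size

def guard_then_parse(raw: bytes, parser, max_bytes: int = MAX_MESSAGE_BYTES):
    """Validate the undecoded payload before handing it to the deserializer."""
    if len(raw) > max_bytes:
        raise ValueError(f"message of {len(raw)} bytes exceeds limit of {max_bytes}")
    return parser(raw)

# A trivial lambda stands in for the real deserializer (e.g. protobuf parsing).
parsed = guard_then_parse(b"hello", lambda b: b.decode("ascii"))
```

The Java interceptor approach works the same way conceptually: the guard sees an `InputStream` of raw bytes and can reject before any message object is constructed.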

Tom Hintz

unread,
Jan 14, 2021, 3:21:19 AM
to grpc.io
Avoiding selection of arbitrary classes is key to CWE-502, but so is bounding arrays and strings (which are a specialized form of array). Consider attacks based on messages with an intentionally large number of objects. Before such a message is unpacked, you need to know that the object count is reasonable. Is that possible today?

Eric Anderson

unread,
Jan 14, 2021, 1:39:22 PM
to Tom Hintz, grpc.io
On Thu, Jan 14, 2021 at 12:21 AM Tom Hintz <tdh...@gmail.com> wrote:
Avoiding selection of arbitrary classes is key to CWE-502, but so is bounding arrays and strings (which are a specialized form of array).

I don't see any mention of that in CWE-502, but I agree parsing untrusted data needs to be done carefully. I'll note that bounding arrays and strings is also very different from what is described in CWE-502, as it can be done post-parse if you are only concerned with application behavior. The bigger concern with arrays and strings, in my mind, is memory consumption. In any case, this discussion is now quite specific to individual serialization formats and implementations, and starts departing from what gRPC can provide.

The main bound for arrays and strings is max message size. The default size is a bit too generous, but is still better than unbounded. You are free to reduce the max message size. That is pretty good protection for things like JSON decoded to Lists/Maps and probably for Flatbuffers.

But for Protobuf and JSON decoded to schema-specific types, that works only weakly for arrays. The problem with arrays is not the array itself. Instead, the concern is more in line with a compression attack using a message that contains many fields. The serialized form may contain only a single field but in-memory it would consume memory for all its fields. The "fix" to this is complicated, although you can reduce the risk during your schema design (which is obviously error-prone). You can audit schemas though, and many of them are probably fine (assuming we're worried about attacks on a server and not untrusted servers), although you only need one with an issue to have a problem.
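To make that "compression attack" shape concrete, here is a toy Python simulation. The `Record` class is entirely hypothetical, standing in for a generated message class with many declared fields: a payload that sets a single field still forces allocation of defaults for every declared field.

```python
NUM_FIELDS = 1000  # a schema with many declared fields (hypothetical)

class Record:
    """Stand-in for a generated message class: every declared field gets a default."""
    def __init__(self):
        for i in range(NUM_FIELDS):
            setattr(self, f"field_{i}", "")

def parse(payload: bytes) -> Record:
    """Toy 'deserializer': the payload names just one field by index."""
    record = Record()
    index = int(payload)            # e.g. b"7" sets field_7
    setattr(record, f"field_{index}", "set")
    return record

record = parse(b"7")  # a 1-byte payload yields an object carrying 1000 fields
```

The asymmetry between payload size and in-memory footprint is the point: a per-message byte limit bounds the wire size, not the expansion factor the schema allows.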

Consider attacks based on messages with an intentionally large number of objects. Before such a message is unpacked, you need to know that the object count is reasonable. Is that possible today?

That requires parsing to find out. So you basically have to parse twice or integrate your check into the parser. That is possible via a proxy or in some language-specific APIs, but it would be annoying as there's no protobuf partial-parser to my knowledge that uses schema information but doesn't actually store the fields. You can make something yourself using classes like CodedInputStream in each language, but you'd need to consume the schema information as you parse.
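A minimal version of that idea — scanning the protobuf wire format to count top-level fields without materializing them — can be hand-rolled. This sketch implements the standard varint/tag encoding from the protobuf wire-format spec; the real `CodedInputStream` APIs differ per language, and a real check would also need schema information to interpret length-delimited fields:

```python
def read_varint(buf: bytes, pos: int):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def count_top_level_fields(buf: bytes) -> int:
    """Count fields in a serialized protobuf message, skipping their values."""
    pos, count = 0, 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        wire_type = tag & 0x7
        if wire_type == 0:              # varint
            _, pos = read_varint(buf, pos)
        elif wire_type == 1:            # 64-bit fixed
            pos += 8
        elif wire_type == 2:            # length-delimited (string/bytes/submessage)
            length, pos = read_varint(buf, pos)
            pos += length
        elif wire_type == 5:            # 32-bit fixed
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        count += 1
    return count

# field 1 = varint 150, field 2 = string "hi" (hand-encoded wire bytes)
payload = bytes([0x08, 0x96, 0x01, 0x12, 0x02]) + b"hi"
assert count_top_level_fields(payload) == 2
```

This is the "parse twice" cost in miniature: the scan touches every byte, so you pay for a full pass over the payload before the real deserialization does it again.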

Protobuf has native protections for recursion depth and message size (although gRPC handles the size limit itself and disables the check in protobuf). Further detailed discussion should probably happen with the protobuf folks if you care about the protobuf perspective.

tdh...@gmail.com

unread,
Jan 14, 2021, 4:38:27 PM
to Eric Anderson, grpc.io
Networks are the number-one attack vector, yet it's always difficult to communicate the risks convincingly. This is best addressed in the protocol design, and security is easiest to implement when the framework has a natural solution. Limiting max request length isn't what the auditors would like.