I'm posting here because I found a very outdated tool linked in the
protobuf repo that appears to include code with unpatched CVEs. My
analysis and a quick fix (just removing the link to the old tool) are in
https://github.com/protocolbuffers/protobuf/pull/26689

Realizing that the old source code introduced possible vulnerabilities
inspired me to dig into that particular CVE,
https://www.cve.org/CVERecord?id=CVE-2015-5237, especially because the
2GB message limit is actually something I've run into in my regular
work.

In https://github.com/protocolbuffers/protobuf/releases/ I noticed
that many of the old official 1P releases may contain unpatched code
allowing for buffer overflow/RCE in older C++ proto deserialization
implementations that don't enforce the ~2GB max message limit, like
https://github.com/protocolbuffers/protobuf/releases/tag/v3.0.2 (see
https://nvd.nist.gov/vuln/search#/nvd/home?cpeFilterMode=cpe&cpeName=cpe:2.3:a:google:protobuf:3.0.2:*:*:*:*:*:*:*&resultType=records).
I then found that projects still seem to be using these vulnerable
releases, like in
https://github.com/Wikidata/primarysources/blob/07bf74e56ada68a211f1712cc2473a98ce92ef0c/.travis.yml#L31.

I'm not sure what the policy is for distributing old packages with
vulnerabilities (maybe they're intentionally left up without patches?)
but I think there might be a need for some cleanup in
https://github.com/protocolbuffers/protobuf to remove broken, outdated,
and vulnerable releases and any references to tools/projects that also
contain unpatched code.

Right now the protobuf project is
linking to 3P tools and unpatched 1P releases with an exploitable buffer
overflow, as far as I can tell. At the time, the CVE was dismissed as
not a severe problem because it would only affect messages > 2GB, but
I think that was too hasty. Adversarial input could trigger the 2GB
overflow by providing large junk string/blob values or tons of repeated
entries (possibly from several layers removed upstream, e.g. as with log4j)
in any message with repeated or variable-length fields, I think? That
would be quite easy to do on any exposed port running a vulnerable gRPC
service, and probably pretty straightforward to do anywhere else where
media or string input could propagate from an untrusted user to an
internal gRPC service, so I'm not sure just fixing forward and leaving
the unpatched code up on github for other projects to continue to use
was the right call. But I could be mistaken here about something (I hope
I am!) so please let me know if so!
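
To make the concern concrete, here is a toy Python sketch of the arithmetic I have in mind. It is not the actual protobuf C++ code. It only assumes that a message size is tracked in a signed 32-bit integer, and the helper names (as_int32, serialized_size_int32, exceeds_limit) and the 3000 x 1 MiB input are made up for illustration. It shows how lots of repeated entries can push a 32-bit size counter past ~2GB so it wraps negative, and what a simple guard computed with unbounded integers would look like.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit size field can hold


def as_int32(n: int) -> int:
    """Return n as a C signed 32-bit int would store it (wraps on overflow)."""
    return ctypes.c_int32(n).value


def serialized_size_int32(field_lengths):
    """Sum field lengths the way a 32-bit size counter would, with wraparound.

    Hypothetical model only; not taken from any protobuf release.
    """
    total = 0
    for length in field_lengths:
        total = as_int32(total + length)
    return total


def exceeds_limit(field_lengths, limit=INT32_MAX):
    """Guard: compute the true size with unbounded ints and compare to the limit."""
    return sum(field_lengths) > limit


# Hypothetical attacker input: 3000 repeated entries of 1 MiB each (~2.93 GiB).
lengths = [1024 * 1024] * 3000
wrapped = serialized_size_int32(lengths)

print("true size:   ", sum(lengths))
print("int32 size:  ", wrapped)  # negative once the counter has wrapped
print("over limit?: ", exceeds_limit(lengths))
```

If a buffer allocation or bounds check were based on the wrapped value rather than the true size, that is where I'd expect things to go wrong, but again this is my assumption about the general failure mode and not a reading of the vulnerable source.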