Generating code via protoc plugin without resolving indirect imports


Brandon Duffany

Apr 18, 2023, 8:10:15 PM
to Protocol Buffers
I'm writing a protoc plugin protoc-gen-protobufjs that is intended as a faster version of protobufjs-cli and which is a better fit for Bazel.

I have gotten it working, and am now trying to optimize the build rules a little bit. My understanding is that when compiling a proto file, protoc needs to locate all of the file's transitive dependencies, and gives an error if it can't find one of them.

Please correct me if I'm wrong (I'm not a proto expert), but my understanding is that I just need to know about the directly imported protos so that I can tell whether a particular type reference is a message or an enum. So I am wondering if there is a way to tell protoc to not fail if it can't find an indirect import, and instead continue code generation anyway.
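
To make the question concrete, here is a minimal sketch of resolving a type reference to "message" or "enum" using only the file itself and its direct imports. It is a toy model in plain Python, not protoc's resolver: the file names, type names, and the `visible_types` / `kind_of` helpers are all hypothetical, and it ignores `import public`, which re-exports a dependency to importers.

```python
# Toy model: each .proto file lists its direct imports and the fully
# qualified types it declares. All names here are hypothetical.
FILES = {
    "a.proto": {"imports": ["b.proto"], "types": {"pkg.A": "message"}},
    "b.proto": {
        "imports": ["c.proto"],
        "types": {"pkg.B": "message", "pkg.Color": "enum"},
    },
    "c.proto": {"imports": [], "types": {"pkg.C": "message"}},
}


def visible_types(filename, files):
    """Collect types declared in `filename` and in its direct imports only."""
    table = dict(files[filename]["types"])
    for dep in files[filename]["imports"]:
        table.update(files[dep]["types"])
    return table


def kind_of(type_name, filename, files):
    """Return "message" or "enum" for a type referenced from `filename`."""
    table = visible_types(filename, files)
    if type_name not in table:
        raise KeyError(f"{type_name} is not visible from {filename}")
    return table[type_name]
```

In this model `a.proto` can classify `pkg.Color` and `pkg.B` without ever reading `c.proto`, which is the property the question depends on.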

Thanks!

Brandon

habe...@google.com

Apr 19, 2023, 11:09:23 AM
to Protocol Buffers
Hi Brandon,

The proto compiler doesn't generally allow what you are asking for. It always wants to see the full closure of .proto files.
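
As a rough illustration of what "full closure" means, the sketch below walks an import graph and fails as soon as any file, direct or indirect, is missing. It is a simplified model in plain Python, not protoc's actual loading code; the `transitive_closure` helper and the file names are hypothetical.

```python
def transitive_closure(root, imports_by_file):
    """Return the set of files reachable from `root` via imports.

    `imports_by_file` maps a file name to the list of files it imports.
    A file that is imported but absent from the map is treated as missing.
    """
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        if name not in imports_by_file:
            raise FileNotFoundError(f"{name}: not found in the provided sources")
        seen.add(name)
        stack.extend(imports_by_file[name])
    return seen
```

With `a.proto` importing `b.proto` and `b.proto` importing `c.proto`, leaving `c.proto` out of the map makes the walk fail even though `a.proto` never mentions it directly.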

But if you are writing for Bazel, you shouldn't have to worry about any of that. As long as you are on Bazel >=5.3, you can use proto_common to handle most of the hard work for you when writing proto rules with aspects. proto_common makes it easy to write rules that work like the built-in rules cc_proto_library(), java_proto_library(), etc.

There doesn't seem to be a lot of good documentation or examples for this right now. You could take a look at my in-progress CL that migrates to using proto_common: https://github.com/protocolbuffers/upb/pull/1254/files#diff-6816023f8495e20887edd8410f0348dbf79b27761cd5cf44cbfa6f72389274af

Josh

Adam Cozzette

Apr 19, 2023, 11:46:58 AM
to habe...@google.com, Protocol Buffers
Getting protoc to work with only direct dependencies is an interesting idea because it has the potential to meaningfully speed up builds, but as Josh said, protoc is not currently set up to be able to do that.

I think the way to do this would be to start by calling AllowUnknownDependencies on the DescriptorPool in protoc. In principle that would allow protoc to work with only direct dependencies. I tried this once and it didn't immediately succeed, though, so I suspect it would take a fair bit of work and experimentation to get something working, and then even more work to set up Bazel to take advantage of it.


Brandon Duffany

Apr 19, 2023, 11:50:20 AM
to Protocol Buffers
Thanks Josh! I will have a look at proto_common and see what I can glean from it.

More generally, is there any guidance around how/whether proto codegen rules should use aspects? My current rule implementation does not use aspects at all if I'm understanding correctly - I'm just forming `--proto_path` / `-I` args to protoc based on the transitive_proto_paths of direct dep ProtoInfo (code), and it seems to work. Is there an advantage to using aspects?
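
For reference, the command line described above might be assembled roughly as follows. This is a plain-Python sketch, not the actual Starlark rule: the `protoc_args` helper and every path in it are hypothetical. The flags themselves are standard: `--proto_path`, `--plugin=protoc-gen-NAME=PATH`, and the `--NAME_out` flag that protoc derives from the plugin name.

```python
def protoc_args(sources, proto_paths, plugin_path, out_dir):
    """Build a protoc argv for a plugin named protoc-gen-protobufjs."""
    args = ["protoc"]
    # Drop duplicate search paths but keep their order, since protoc
    # searches --proto_path directories in the order given.
    for path in dict.fromkeys(proto_paths):
        args.append(f"--proto_path={path}")
    args.append(f"--plugin=protoc-gen-protobufjs={plugin_path}")
    args.append(f"--protobufjs_out={out_dir}")
    args.extend(sources)
    return args
```

Here `proto_paths` stands in for the flattened transitive proto paths collected from the direct deps.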

Brandon

Brandon Duffany

Apr 19, 2023, 11:55:03 AM
to Protocol Buffers
Thanks Adam, do you have a sense of whether it should be possible fundamentally? That is, is there any data in a FileDescriptor that can't be populated unless you have both direct and indirect deps available? If not, I might take a poke at adding a protoc flag to set that option, and see if I can chip away at the errors until it works.

habe...@google.com

Apr 19, 2023, 2:04:01 PM
to Protocol Buffers
It is strongly recommended that proto rules use aspects going forward, because it avoids the need to specify the dependency graph N times (one per language). proto_library() can specify the dependency graph once, and then any number of languages can have lang_proto_library() rules that reuse the unified graph.

Adam Cozzette

Apr 20, 2023, 5:39:36 PM
to Brandon Duffany, Protocol Buffers
As far as I know, the direct deps should be sufficient for parsing .proto files and generating descriptors from them. The one thing I'm not sure about is parsing of custom options; I don't know if we allow proto files to set custom options and reference fields that aren't in the direct deps.

For actually generating code, some of the code generators may rely on indirect deps. E.g. I believe the C++ code generator has some logic to check whether a message transitively contains any required fields.
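
A check of that kind can be sketched as a recursive walk over message definitions. This is a toy model in plain Python, not the C++ generator's code: the `MESSAGES` table, its field layout, and the `has_required_fields` helper are hypothetical. It shows why the answer can depend on messages defined in indirect deps.

```python
# Toy model: for each message, whether it declares a required field itself
# and which message types its fields refer to. All names are hypothetical.
MESSAGES = {
    "pkg.A": {"declares_required": False, "fields": ["pkg.B"]},
    "pkg.B": {"declares_required": False, "fields": ["pkg.C", "pkg.A"]},
    "pkg.C": {"declares_required": True, "fields": []},
}


def has_required_fields(name, messages, visited=None):
    """Return True if `name` or any message reachable from it declares a
    required field. Raises KeyError if a reachable definition is missing."""
    visited = set() if visited is None else visited
    if name in visited:
        return False  # already handled elsewhere in the walk (breaks cycles)
    visited.add(name)
    msg = messages[name]
    if msg["declares_required"]:
        return True
    return any(
        has_required_fields(field, messages, visited) for field in msg["fields"]
    )
```

If `pkg.A` lives in the file being compiled, `pkg.B` in a direct dep, and `pkg.C` in an indirect dep, the walk for `pkg.A` cannot finish without the definition of `pkg.C`.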
