compile issue with import statements having path info


CB

Apr 13, 2010, 10:31:49 AM
to Protocol Buffers
Part of our coding standard is that includes/imports take the form:

import "<package name>/<file name>"

Using this style yields compile errors for protobuf. Here's example
code:

--- A.proto ---

package P;

message A { }

--- B.proto ---

import "P/A.proto";

package P;

message B { optional A a = 1; }

--- compile commands, in folder P ---

$ protoc --cpp_out=. -I../P -I../P/.. ../P/a.proto
$ protoc --cpp_out=. -I../P -I../P/.. ../P/b.proto
$ g++ -I../P -I../P/.. -c b.pb.cc -o b.pb.o
b.pb.cc: In function ‘void P::protobuf_AddDesc_b_2eproto()’:
b.pb.cc:74: error: ‘protobuf_AddDesc_P_2fa_2eproto’ is not a member of
‘P’

Why all the dots? It's an automake build using $(srcdir) for the -I flags.

--- workaround ---

Remove the <package name> from the import statement.

--- problem ---

The method in question can be seen in a.pb.h as;

a.pb.h: friend void protobuf_AddDesc_a_2eproto();

which lacks the 'P_2f' package specifier that gets folded into the
method name in b.pb.cc.

What's even more fun is putting message A in foo.proto and changing
the import in b.proto to:

import "P/foo.proto";

which yields;

foo.pb.h: friend void protobuf_AddDesc_foo_2eproto();
b.pb.cc:74: error: ‘protobuf_AddDesc_P_2ffoo_2eproto’ is not a member
of ‘P’

This all seems to rely on the Java path=package and filename=classname
paradigms, which don't hold for C/C++.

Jason Hsueh

Apr 13, 2010, 1:19:58 PM
to CB, Protocol Buffers
I think if you run everything from the root of your imports this all goes away. The protobuf_AddDesc_* functions are file-level methods used to initialize various bits of static data. The names for these methods are generated based on the pathname of the .proto. When b.proto imports a.proto, it needs to invoke a.proto's AddDesc method to make sure that a's static data is initialized first. It figures out the name of the function based on the import location, which in your case is "../P/../P/a.proto" - that leads to the extra P_2f in b.pb.cc.

If instead you ran
$ protoc --cpp_out=./P -I./ P/a.proto
$ protoc --cpp_out=./P -I./ P/b.proto

then I believe everything will come out consistently. There's no requirement that the package name correspond to the path name; you just need to make sure the path names match up.
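
To illustrate the naming scheme described above, here is a minimal C++ sketch. It is not protoc's actual source. It assumes the mangling rule that the error messages in this thread suggest: alphanumeric characters are kept, and every other character becomes '_' followed by its hex code (so '/' becomes _2f and '.' becomes _2e). The helper names FilenameIdentifier and AddDescName are chosen for this illustration.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper: keep ASCII alphanumerics and replace every other
// character with '_' followed by its two-digit lowercase hex code.
std::string FilenameIdentifier(const std::string& filename) {
  std::string result;
  for (unsigned char c : filename) {
    if (std::isalnum(c)) {
      result.push_back(static_cast<char>(c));
    } else {
      char buffer[4];  // '_' + two hex digits + terminating NUL
      std::snprintf(buffer, sizeof(buffer), "_%02x", static_cast<unsigned>(c));
      result += buffer;
    }
  }
  return result;
}

// Hypothetical helper: build the per-file initializer name from the name
// protoc knows the .proto file by.
std::string AddDescName(const std::string& proto_name) {
  return "protobuf_AddDesc_" + FilenameIdentifier(proto_name);
}
```

Under that assumption, AddDescName("a.proto") gives protobuf_AddDesc_a_2eproto, while AddDescName("P/a.proto") gives protobuf_AddDesc_P_2fa_2eproto. That is the mismatch between a.pb.h and b.pb.cc in the original report: the two commands knew the same file by two different names.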




CB

Apr 13, 2010, 2:24:49 PM
to Protocol Buffers

Workarounds aside, there is still a bug to be fixed.

If you wish to argue that the .proto files or the -I options I passed
to protoc are invalid, then protoc should have declared an error.

If you wish to argue that the .proto files and the -I options I passed
to protoc are valid, then the C++ code emitted by protoc should have
compiled without error.

If the error was simply an issue of -I options passed to protoc/g++
that would be one thing, but in this case, protoc generated a call to
a function that doesn't even exist in the generated code set. That's
a major bug.

Henner Zeller

Apr 13, 2010, 2:29:27 PM
to CB, Protocol Buffers

Providing a patch that does what you think it should do instead is
probably a good way to start the discussion and get it fixed.

-h

CB

Apr 13, 2010, 3:05:47 PM
to Protocol Buffers
I appreciate the invitation. As I only downloaded the code a few
hours ago, learning it and developing a patch could take some
considerable time. Being unfamiliar with the project philosophy, the
first step would be figuring out which of the two solutions (could be
others) the committers might be willing to consider. I.e., someone
needs to tell me whether my inputs were valid or invalid. So, it
could take even longer.

If you have an issue tracking database, you might want to enter it
there, in case someone else can get to it before I can.

Kenton Varda

Apr 13, 2010, 3:38:51 PM
to CB, Protocol Buffers
So, let me explain exactly the problem here...

> $ protoc --cpp_out=. -I../P -I../P/.. ../P/a.proto
> $ protoc --cpp_out=. -I../P -I../P/.. ../P/b.proto

If you were to compile both protos with a single command:

$ protoc --cpp_out=. -I../P -I../P/.. ../P/a.proto ../P/b.proto

Then protoc would give you errors suggesting that "a.proto" and "P/a.proto" are defining conflicting symbols.  This hints at the underlying problem: protoc cannot tell that the "../P/a.proto" you specified on the command line is the same as the "P/a.proto" that b.proto is trying to import.

The problem here is that the meaning of ".." is subtle -- if foo is a symlink, then "foo/bar/.." is NOT necessarily the same as "foo".  Therefore, protoc makes no attempt to collapse ".."s in order to detect when two files are actually the same file. Instead, it relies only on their canonical name relative to the import path.  Unfortunately, in your case, when you pass "../P/a.proto" on the command line, protoc concludes that because "../P" is in the import path, this file's canonical name is "a.proto".  But b.proto imports "P/a.proto", which must be a different file.

What you actually want to do is this:

$ protoc --cpp_out=. -I.. ../P/a.proto
$ protoc --cpp_out=. -I.. ../P/b.proto

In this case, protoc will determine that the canonical names are "P/a.proto" and "P/b.proto", which match the style you are using in your import statements.

In general, you should never pass overlapping -I flags to protoc.  Arguably protoc should print a warning if you do.  Other than that, I'm not sure what else we could do to detect your problem.
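
To make the canonical-name lookup concrete, here is a minimal C++ sketch under stated assumptions. It is not protoc's implementation. It assumes the import roots are tried in command-line order and that matching is purely textual, with no collapsing of "..", as described above. CanonicalName is a hypothetical name for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: return disk_file relative to the first import root
// that is a textual prefix of it, or "" if no root matches.
// Deliberately performs no ".." collapsing and no symlink resolution.
std::string CanonicalName(const std::string& disk_file,
                          const std::vector<std::string>& import_roots) {
  for (const std::string& root : import_roots) {
    const std::string prefix = root + "/";
    if (disk_file.compare(0, prefix.size(), prefix) == 0) {
      return disk_file.substr(prefix.size());
    }
  }
  return "";
}
```

With the roots {"../P", "../P/.."}, the file "../P/a.proto" matches "../P" first and comes out as "a.proto". With the single root "..", the same file comes out as "P/a.proto", which agrees with the import statement in b.proto.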



CB

Apr 14, 2010, 5:01:33 PM
to Protocol Buffers
I understand your explanation, and have changed my automake to;

$(PROTOC) --cpp_out=$(top_srcdir) -I$(top_srcdir) $(PROTOC_FLAGS) $<

which seems to handle it.

However, there is still one point we're not connecting on, which is
this. It seems that when a.proto compiles, the name for
'protobuf_AddDesc_foo_2eproto' is derived from the canonical name for
a.proto, but when b.proto compiles, the name of that function is
derived not from the canonical name for a.proto, but from the import
statement in b.proto - because protoc assumes the two will be the same
thing.

In a Java world, the assumption that these two things will be the same
holds because the javac enforces it with strict package name and
folder name checking relative to the CLASSPATH. protoc doesn't do
this, and so it shouldn't make the same assumptions. When compiling
b.proto, it should use the canonical name for a.proto to cook the
method name. If you search the -I paths, and find two a.proto files,
then by all means, flag an error and exit.

Peace.

Kenton Varda

Apr 19, 2010, 1:48:39 PM
to CB, Protocol Buffers
On Wed, Apr 14, 2010 at 2:01 PM, CB <cn...@verizon.net> wrote:
> I understand your explanation, and have changed my automake to;
>
> $(PROTOC) --cpp_out=$(top_srcdir) -I$(top_srcdir) $(PROTOC_FLAGS) $<
>
> which seems to handle it.
>
> However, there is still one point we're not connecting on, which is
> this. It seems that when a.proto compiles, the name for
> 'protobuf_AddDesc_foo_2eproto' is derived from the canonical name for
> a.proto, but when b.proto compiles, the name of that function is
> derived not from the canonical name for a.proto, but from the import
> statement in b.proto - because protoc assumes the two will be the same
> thing.

Yes, import statements must use canonical names.
 
> In a Java world, the assumption that these two things will be the same
> holds because the javac enforces it with strict package name and
> folder name checking relative to the CLASSPATH. protoc doesn't do
> this, and so it shouldn't make the same assumptions.

How is protoc's --proto_path (-I) any different from javac's CLASSPATH?  I don't see the difference.
 
> When compiling
> b.proto, it should use the canonical name for a.proto to cook the
> method name. If you search the -I paths, and find two a.proto files,
> then by all means, flag an error and exit.

When compiling b.proto, protoc sees an import for "P/a.proto".  It searches for this file and finds it with no problems.  protoc has *no reason* to believe that this file's canonical name is anything other than "P/a.proto".  There are not "two a.proto files" -- there is "a.proto" and "P/a.proto", and protoc has no reason to ever look at the former.  Sorry, there is nothing protoc can do to detect your problem, except perhaps to notice that your two import paths overlap.
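
The import lookup described above can be sketched roughly as follows. This is an illustration under assumptions, not protoc's code. It assumes each import root is tried in order by joining it with the import string, and that the import string itself is taken as the canonical name once a file is found. ResolvedImport, ResolveImport, and the files_on_disk set (a stand-in for the real file system, listing path spellings that would open successfully) are all hypothetical.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical result type: the name the file is known by, plus the path
// spelling under which it was found.
struct ResolvedImport {
  std::string canonical_name;
  std::string disk_path;
};

// Hypothetical helper: try root + "/" + import_name for each root in order.
// The canonical name is always the import string; the -I roots only decide
// where on disk the file is read from.
ResolvedImport ResolveImport(const std::string& import_name,
                             const std::vector<std::string>& import_roots,
                             const std::set<std::string>& files_on_disk) {
  for (const std::string& root : import_roots) {
    const std::string candidate = root + "/" + import_name;
    if (files_on_disk.count(candidate) > 0) {
      return {import_name, candidate};
    }
  }
  return {"", ""};  // not found under any root
}
```

With the roots {"../P", "../P/.."} and the import "P/a.proto", the first candidate "../P/P/a.proto" does not exist, and the second, "../P/../P/a.proto", does. The file is therefore known as "P/a.proto", and nothing in this lookup ever consults the name "a.proto" that the earlier command-line invocation derived for the same file.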