Linker bug? Conflicting class name is permitted

163 views
Skip to first unread message

Keren Zhu

unread,
Jun 28, 2024, 1:17:42 PMJun 28
to cxx
I define a TestWebUIConfig class in a new unit test file. The class overrides a virtual method IsWebUIEnabled() which is called in a unit test. However, the method is unexpectedly never really called.

Upon closer look I found that there is another TestWebUIConfig in //cbui/webui/about/about_ui_unittest.cc. However the linker does not complain about the conflicting class name. That class also overrides IsWebUIEnabled(), and it seems to shadow the IsWebUIEnabled() I define in the new file.

Looking at symbols with `nm`, it seems that the ctor and dtor from both TestWebUIConfig were generated, but only one IsWebUIEnabled() is generated. Also, there is only one vtable for TestWebUIConfig.

000000000db409c0 t TestWebUIConfig::IsWebUIEnabled(content::BrowserContext*)
000000000db408c0 t TestWebUIConfig::TestWebUIConfig(std::__Cr::basic_string<char, std::__Cr::char_traits<char>, std::__Cr::allocator<char> > const&, std::__Cr::basic_string<char, std::__Cr::char_traits<char>, std::__Cr::allocator<char> > const&, bool)
000000000e183ba0 t TestWebUIConfig::TestWebUIConfig()
000000000db40990 t TestWebUIConfig::~TestWebUIConfig()
000000000db40970 t TestWebUIConfig::~TestWebUIConfig()
000000001f53e290 d vtable for TestWebUIConfig

I reproduce the issue in https://crrev.com/c/5666414. Note how TopChromeWebUIConfigTest.TestWebUIConfig failed but TopChromeWebUIConfigTest.TestWebUIConfig2 that uses a different class name passed. This reproduces on Windows and Linux but not macOS.

Is this a known linker bug?


Keren

Lei Zhang

unread,
Jun 28, 2024, 1:34:31 PMJun 28
to Keren Zhu, cxx
It sounds like you have an ODR violation. Can the TestWebUIConfig classes go in anonymous namespaces?

LLD does not have an ODR checker. I guess you can call that a bug or missing feature.

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/d7a4354e-f2c5-42cd-87b5-af49bddce25en%40chromium.org.

Keren Zhu

unread,
Jul 1, 2024, 11:44:45 AMJul 1
to Lei Zhang, cxx
Yes this is an ODR violation. Thanks for pointing that out!

I was expecting linker to detect duplicate definitions across the translation unit, that it looks like this is, as explicitly spec'd, not a requirement for C++ linkers, if I interpret it correctly.

Using an unnamed namespace is definitely a solution. However, duplicating symbols seems to be a very easy mistake to make in tests, and I am surprised we don't have a good way to forbid that in compile time.

Keren

Peter Kasting

unread,
Jul 1, 2024, 12:01:12 PMJul 1
to Keren Zhu, Lei Zhang, cxx
On Mon, Jul 1, 2024 at 8:44 AM 'Keren Zhu' via cxx <c...@chromium.org> wrote:
Yes this is an ODR violation. Thanks for pointing that out!

I was expecting linker to detect duplicate definitions across the translation unit, that it looks like this is, as explicitly spec'd, not a requirement for C++ linkers, if I interpret it correctly.

(IIUC, the reason ODR violations are no-diagnostic-required is because, especially with templates, detecting which ones are actually errors and which are not can be tricky.)

Using an unnamed namespace is definitely a solution. However, duplicating symbols seems to be a very easy mistake to make in tests, and I am surprised we don't have a good way to forbid that in compile time.

FWIW, jumbo turns many of these (ones where both instances occur in a single merged TU) into compile errors. Of course it also turns the anon-namespace version (which is perfectly fine) into a compile error. :(

This sort of problem is one reason I automatically put all file-scope stuff in test code (which is usually almost all of it) into anonymous namespaces, just like I would with production code. The downside is that if you put your test fixture classes into anon namespaces, you can't friend them. But if you're testing the contract and not the implementation (as you should be!), friending your test fixtures is often a code smell anyway.

PK

Hans Wennborg

unread,
Jul 1, 2024, 12:05:02 PMJul 1
to Keren Zhu, Lei Zhang, cxx
It's inline functions that make it tricky. It's in their nature that an inline function may have a definition in multiple translation units.

If your function wasn't inline, the linker would error about the duplicate definition as you expected.

This is indeed an easy mistake to make, especially in tests. That's why it's common to use an anonymous namespace for this kind of code in tests. There's some previous discussion here: https://groups.google.com/a/chromium.org/g/chromium-dev/c/pKY2G4GHcu0

Keren Zhu

unread,
Jul 2, 2024, 9:37:40 PMJul 2
to Hans Wennborg, Lei Zhang, cxx
If your function wasn't inline, the linker would error about the duplicate definition as you expected.

In my case IsWebUIEnabled() is not inlined. It is supposed to have symbols (see the output of nm I pasted). However, it seems that only one symbol is generated and the other is ignored in the final executable unit_tests. I was using remoteexec and it did not seem to generate .a files so I haven't checked that.

Keren

Hans Wennborg

unread,
Jul 3, 2024, 3:04:21 AMJul 3
to Keren Zhu, Lei Zhang, cxx
On Wed, Jul 3, 2024 at 3:37 AM Keren Zhu <kere...@google.com> wrote:
If your function wasn't inline, the linker would error about the duplicate definition as you expected.

In my case IsWebUIEnabled() is not inlined. It is supposed to have symbols (see the output of nm I pasted). However, it seems that only one symbol is generated and the other is ignored in the final executable unit_tests. I was using remoteexec and it did not seem to generate .a files so I haven't checked that.

I meant that it's an inline function in your code. You define it inside the class, that means it's semantically an inline function:

class TestWebUIConfig : public content::WebUIConfig {
[...]
  bool IsWebUIEnabled(content::BrowserContext* browser_context) override {     <---- This is an inline function definition.
    return true;
  }
};

That doesn't mean the compiler will inline it, but it does mean the linker will not error about multiple definitions.

Peter Kasting

unread,
Jul 3, 2024, 11:30:37 AMJul 3
to cxx, kere...@google.com, Lei Zhang, cxx, Hans Wennborg
On Tuesday, July 2, 2024 at 6:37:40 PM UTC-7 kere...@google.com wrote:
If your function wasn't inline, the linker would error about the duplicate definition as you expected.

In my case IsWebUIEnabled() is not inlined.

"inline" in C++ means "allowed to have multiple definitions, linker is free to select any". It is distinct from whether a function has been inlined by the compiler, which there is no standardized way to control (though we have ALWAYS_INLINE and NOINLINE macros to try and tell the compiler to do one or the other).

As Hans mentions, any function defined in the class declaration is implicitly an inline function, as is any function declared "inline". Whether the compiler will inline their definitions into callers is another matter.

This is extremely confusing and probably the vast majority of C++ programmers do not understand it correctly.

PK

Keren Zhu

unread,
Jul 3, 2024, 12:28:20 PMJul 3
to Peter Kasting, cxx, Lei Zhang, Hans Wennborg
Thank you Hans and Peter for your explanation. The term "inline" is apparently overloaded - it is either "definition-inline" as you have pointed out in my code, or "caller-inline" that I confused with. Let's ignore caller-inline in this discussion.

From the perspective of a linker, what would be the difference between inline and non-inlined methods? How could a linker distinguish them, and not error on the multiple definitions of inline function?

I did a quick test with godbolt on object files disassembly of inline vs. non-inline functions (diff). The only difference is that the non-inline function has an additional nop instruction after ret. 

This looks cryptic to me, and I would be surprised if the linker is leveraging nop to identify non-inline functions. Or if I might have missed something?

Keren

Peter Kasting

unread,
Jul 3, 2024, 12:37:24 PMJul 3
to Keren Zhu, cxx, Lei Zhang, Hans Wennborg
On Wed, Jul 3, 2024 at 9:28 AM Keren Zhu <kere...@google.com> wrote:
From the perspective of a linker, what would be the difference between inline and non-inlined methods? How could a linker distinguish them, and not error on the multiple definitions of inline function?

I don't know; this is beyond my level of expertise. I don't know how linkage and visibility are communicated from compiler to linker. I assume there's data about this somewhere in the .o besides just the assembly and mangled names.

PK

Sam Clegg

unread,
Jul 3, 2024, 1:12:27 PMJul 3
to Keren Zhu, Peter Kasting, cxx, Lei Zhang, Hans Wennborg
On Wed, Jul 3, 2024 at 9:28 AM 'Keren Zhu' via cxx <c...@chromium.org> wrote:
Thank you Hans and Peter for your explanation. The term "inline" is apparently overloaded - it is either "definition-inline" as you have pointed out in my code, or "caller-inline" that I confused with. Let's ignore caller-inline in this discussion.

From the perspective of a linker, what would be the difference between inline and non-inlined methods? How could a linker distinguish them, and not error on the multiple definitions of inline function?

I did a quick test with godbolt on object files disassembly of inline vs. non-inline functions (diff). The only difference is that the non-inline function has an additional nop instruction after ret. 

This looks cryptic to me, and I would be surprised if the linker is leveraging nop to identify non-inline functions. Or if I might have missed something?

In godbolt you need to show the asm directives (they are filtered out by default i guess).  I you do that, you will see that all the inline functions are marked as `weak`.  e.g.

 .weak   Bar::Bar() [base object constructor]    

This tells the linker to simply pick the first definition it finds (or any definition it likes) and ignore all the others.   It's on you (sadly) to guarantee that all those definitions have the same semantics.

cheers,
sam


Keren

On Wed, Jul 3, 2024 at 11:30 AM Peter Kasting <pkas...@chromium.org> wrote:
On Tuesday, July 2, 2024 at 6:37:40 PM UTC-7 kere...@google.com wrote:
If your function wasn't inline, the linker would error about the duplicate definition as you expected.

In my case IsWebUIEnabled() is not inlined.

"inline" in C++ means "allowed to have multiple definitions, linker is free to select any". It is distinct from whether a function has been inlined by the compiler, which there is no standardized way to control (though we have ALWAYS_INLINE and NOINLINE macros to try and tell the compiler to do one or the other).

As Hans mentions, any function defined in the class declaration is implicitly an inline function, as is any function declared "inline". Whether the compiler will inline their definitions into callers is another matter.

This is extremely confusing and probably the vast majority of C++ programmers do not understand it correctly.

PK

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.

Keren Zhu

unread,
Jul 3, 2024, 2:11:47 PMJul 3
to Sam Clegg, Peter Collingbourne, Peter Kasting, cxx, Lei Zhang, Hans Wennborg
Sam, you hit the spot. Thank you for pointing that out!

It makes sense to ignore duplicate weak definitions, I assume for linker performance reasons. On the other hand, it does not seem too far-fetched to add an optional addition check to verify if those definitions are identical.

It seems that various ODR checkers have been proposed. e.g. https://discourse.llvm.org/t/rfc-odr-checker-for-clang-and-lld/45148 by @Peter Collingbourne. I have not looked into them to understand the challenges.

Keren


Reply all
Reply to author
Forward
0 new messages