Re: [protobuf] Static initialization order fiasco involving "kGlobalEmptyTable"

73 views
Skip to first unread message

Samuel Benzaquen

unread,
Jan 29, 2024, 9:25:01 AMJan 29
to Knut Stolze, Protocol Buffers


On Sun, Jan 28, 2024 at 3:18 PM Knut Stolze <stolz...@gmail.com> wrote:
Hello folks!

Just the other day, we run into a problem in our C++ project, which we tracked down to static initialization order fiasco. Specifically, we have a global variable/constant container, which includes elements that have a proto object container a map. https://godbolt.org/z/49Mx6h7Yz

What happened is that the global variable "store_t::m_map" is initialized first. The stack trace shows that the code in protobuf UntypedMapBase::TableEntryIsNonEmptyList() gets executed.
==275==ERROR: ThreadSanitizer: SEGV on unknown address 0x0001c8c56070 (pc 0x7f347b67db2a bp 0x7ffdd006baf0 sp 0x7ffdd006ba90 T275) ==275==The signal is caused by a READ memory access. #0 google::protobuf::internal::UntypedMapBase::TableEntryIsNonEmptyList(unsigned long) const bazel-out/k8-dbg/bin/external/com_google_protobuf/src/google/protobuf/_virtual_includes/protobuf_lite/google/protobuf/map.h:514 #1 google::protobuf::internal::UntypedMapBase::NodeAndBucket google::protobuf::internal::KeyMapBase<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::FindHelper<char const*>(char const* const&, absl::container_internal::btree_iterator<absl::container_internal::btree_node<absl::container_internal::map_params<std::reference_wrapper<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const>, google::protobuf::internal::NodeBase*, google::protobuf::internal::TransparentSupport<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::less, google::protobuf::internal::MapAllocator<std::pair<std::reference_wrapper<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const> const, google::protobuf::internal::NodeBase*> >, 256, false> >, std::pair<std::reference_wrapper<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const> const, google::protobuf::internal::NodeBase*>&, std::pair<std::reference_wrapper<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const> const, google::protobuf::internal::NodeBase*>*>*) const bazel-out/k8-dbg/bin/external/com_google_protobuf/src/google/protobuf/_virtual_includes/protobuf_lite/google/protobuf/map.h:767 #2 std::pair<google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, md::redundancyType>::iterator, bool> google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, md::redundancyType>::TryEmplaceInternal<char const* const&>(char const* const&) bazel-out/k8-dbg/bin/external/com_google_protobuf/src/google/protobuf/_virtual_includes/protobuf_lite/google/protobuf/map.h:1463 #3 std::pair<google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, md::redundancyType>::iterator, bool> google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, md::redundancyType>::ArenaAwareTryEmplace<char const* const&>(std::integral_constant<bool, false>, char const* const&) bazel-out/k8-dbg/bin/external/com_google_protobuf/src/google/protobuf/_virtual_includes/protobuf_lite/google/protobuf/map.h:1532 #4 std::pair<google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, md::redundancyType>::iterator, bool> google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, md::redundancyType>::try_emplace<char const* const&>(char const* const&) bazel-out/k8-dbg/bin/external/com_google_protobuf/src/google/protobuf/_virtual_includes/protobuf_lite/google/protobuf/map.h:1300 #5 md::redundancyType& google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, md::redundancyType>::operator[]<char const*>(char const* const&) bazel-out/k8-dbg/bin/external/com_google_protobuf/src/google/protobuf/_virtual_includes/protobuf_lite/google/protobuf/map.h:1222 #6 info_t::constructTI() src/info.cpp:62 #7 info_t::info_t(int) src/info.cpp:23 #8 __static_initialization_and_destruction_0 store.cpp:37
This sigsegv happens if "kGlobalEmptyTable" in protobuf/map.cc was not yet initialized and an invalid pointer is dereferenced in UntypedMapBase::TableEntryIsNonEmptyList().

I'd like to submit a patch that move the global variable definition inside a function, i.e.a a block-scope variable with static storage duration. I'd also use an std::array to define "kGlobalEmptyTable". Are there any concerns with doing this?

kGlobalEmptyTable has a constant initializer. It does not need any runtime initialization and should not cause "static initialization order fiasco".
Even if it was dynamically initialized, it is a static storage variable so it is zero initialized before the program starts, and the zero initialization is exactly the state we want it to have.
I suspect the bug is elsewhere.

However, it is important to note that protobuf use before main() is best effort and generally not supported.

I see you are running with thread sanitizer. Have you tried using address sanitizer? ASan will diagnose "static initialization order fiasco" errors explicitly. 
https://github.com/google/sanitizers/wiki/AddressSanitizerInitializationOrderFiasco
 

Regards,
Knut Stolze

--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to protobuf+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/protobuf/b63ec52c-059d-442b-9b17-061f3da469a1n%40googlegroups.com.

Knut Stolze

unread,
Jan 31, 2024, 4:45:45 PMJan 31
to Protocol Buffers
Many thanks for the feedback, Sam!

I agree with your assessment. Indeed, I was a bit on a wrong track. The segfault is due a wrong object memory layout for a proto message. It appears as if the object starts 8 bytes too late (UntypedMapBase::num_buckets_=0; seed_=1;index_of_first_non_null_=1956680;table_=0;arena_ points to a vtable for some unrelated MapField). This occurs if -O2 and -fsanitize=thread are used together; just -O2 by itself is fine as is -O0 and -fsanitize=thread. Smells like a big in gcc 10.3 and also 13.2.0. 

Have you run across such an issue already before by any chance?

Regards,
Knut

Samuel Benzaquen

unread,
Jan 31, 2024, 4:54:35 PMJan 31
to Knut Stolze, Protocol Buffers
On Wed, Jan 31, 2024 at 4:45 PM Knut Stolze <stolz...@gmail.com> wrote:
Many thanks for the feedback, Sam!

I agree with your assessment. Indeed, I was a bit on a wrong track. The segfault is due a wrong object memory layout for a proto message. It appears as if the object starts 8 bytes too late (UntypedMapBase::num_buckets_=0; seed_=1;index_of_first_non_null_=1956680;table_=0;arena_ points to a vtable for some unrelated MapField). This occurs if -O2 and -fsanitize=thread are used together; just -O2 by itself is fine as is -O0 and -fsanitize=thread. Smells like a big in gcc 10.3 and also 13.2.0. 

Have you run across such an issue already before by any chance?

The ABI of the messages changes when `-fsanitize=thread` is used.
If some users of the message compile with `-fsanitize=thread` and some without, they will not agree on how the message looks like and have an ODR violation.
Make sure that everything is compiled with the same flags.
 

Knut Stolze

unread,
Jan 31, 2024, 5:24:16 PMJan 31
to Protocol Buffers
Awesome - that helps a lot! Building protobuf itself with `-fsanitize=thread` resolves the problem and the crash vanishes. Further, all values in the member variables look fine now.

But it is curious that things worked with `-O0 -fsanitize=thread`. I wonder why...

Samuel Benzaquen

unread,
Jan 31, 2024, 5:38:26 PMJan 31
to Knut Stolze, Protocol Buffers
On Wed, Jan 31, 2024 at 5:24 PM Knut Stolze <stolz...@gmail.com> wrote:
Awesome - that helps a lot! Building protobuf itself with `-fsanitize=thread` resolves the problem and the crash vanishes. Further, all values in the member variables look fine now.

But it is curious that things worked with `-O0 -fsanitize=thread`. I wonder why...

ODR violations cause the program to be ill-formed so it might or might not work at all.
One thing that might be happening is that with -O2 you get more inlining and templates/inline functions get duplicated in many places.
With -O0 you might not be getting inlining so you get a single copy of these functions. So everyone agrees because there is a single copy.
 

Knut Stolze

unread,
Feb 1, 2024, 5:59:22 AMFeb 1
to Protocol Buffers
Makes sense! Thanks again for uncovering my failed assumptions, Sam.
Reply all
Reply to author
Forward
0 new messages