c++11: what's the guard variable for static local variable initialization?

anho...@gmail.com

Oct 25, 2019, 8:09:06 PM

Hi experts,

Since C++11, initialization of static local variables is thread-safe.
When I used godbolt to check how gcc implements this, I found that it first accesses a special "guard variable".

I remember someone telling me that before C++11 a static singleton was not thread-safe, and people usually used the double-checked locking (DCL) trick, but that requires a memory barrier, or the is_initialized flag has to be atomic.

But in the godbolt output I didn't see any memory barrier, or keywords like "atomic". Does the "guard variable" imply atomic access or anything like that?

https://godbolt.org/z/yRtxL_

I searched the internet and found only some material saying that the guard variable is 64 bits wide and that the LSB should be set when initialization is done.

I am new to C++, so forgive me if this is a silly question.

Thanks!
Anhong
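
For readers without the godbolt link at hand, here is a minimal sketch of the construct being asked about. The names Widget, instance, run_threads and g_constructions are made up for illustration and are not taken from the linked code. It only demonstrates the C++11 guarantee that the initializer of a function-local static runs once even when several threads enter the function concurrently.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Counts how many times Widget's constructor runs (hypothetical helper).
std::atomic<int> g_constructions{0};

struct Widget {
    Widget() { ++g_constructions; }
    int value = 42;
};

// Function-local static: since C++11 its initialization is thread-safe.
// This is the place where the compiler emits the guard variable check.
Widget& instance() {
    static Widget w;
    return w;
}

// Starts n threads that all call instance(), then reports how many
// times the constructor actually ran.
int run_threads(int n) {
    assert(n > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i)
        threads.emplace_back([] { (void)instance().value; });
    for (auto& t : threads) t.join();
    return g_constructions.load();
}
```

Compiling instance() with optimizations off in godbolt should show the guard check in front of the constructor call, though the exact instructions depend on compiler and target.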

Alf P. Steinbach

Oct 26, 2019, 2:56:11 AM

The (outer) guard variable doesn't need to be atomic because the worst
that can happen in the case where it doesn't correctly reflect that
initialization has been performed, is that the code goes into needless
mutex locking and deeper checking, which is just an efficiency concern.

However, this can only happen the first few times.

Using an atomic (outer) guard variable would avoid needless rounds of
mutex locking, but accessing that variable can be slower and would then
be a cost incurred on every access of the static. Evidently the gcc devs
decided to not do that. I.e., they /support/ the maximal efficiency for
a general solution, without guaranteeing it, because it can't be
guaranteed against all kinds of user code.

For most calls, in practice all calls after the first, the overhead is
then exactly the same as before C++ got thread support, the C++03 era,
namely a simple checking of an ordinary non-atomic boolean flag.

One way to ensure that in user code is to call the function from the
main thread, before it can possibly be called from other threads.

- Alf

Öö Tiib

Oct 26, 2019, 7:08:24 AM

On Saturday, 26 October 2019 03:09:06 UTC+3, anho...@gmail.com wrote:
> Hi experts,
>
> Since C++11, initialization of static local variables is thread-safe.
> When I used godbolt to check how gcc implements this, I found that it first accesses a special "guard variable".
>
> I remember someone telling me that before C++11 a static singleton was not thread-safe, and people usually used the double-checked locking (DCL) trick, but that requires a memory barrier, or the is_initialized flag has to be atomic.
>
>
> But in the godbolt output I didn't see any memory barrier, or keywords like "atomic". Does the "guard variable" imply atomic access or anything like that?
>
> https://godbolt.org/z/yRtxL_

The code assumes that the guard variable will never be written by
any thread once it is set.
Any write to the guard variable while it is not yet set is
synchronized through the calls to __cxa_guard_acquire,
__cxa_guard_release and __cxa_guard_abort.

>
> I searched the internet and found only some material saying that the guard variable is 64 bits wide and that the LSB should be set when initialization is done.
>
> I am new to C++, so forgive me if this is a silly question.

No, it is not silly; it is the must-ask question about threads and
singletons in every programming language. C++ has done this behind the
scenes since threads were added to it in C++11.
See <https://en.wikipedia.org/wiki/Double-checked_locking>
and then verify that your godbolt assembly is doing what users of
other languages have to write out explicitly.
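
As a rough illustration of what "writing it out explicitly" looks like, here is a minimal double-checked locking sketch using std::atomic and std::mutex. The Config and LazyConfig names are hypothetical, and this is one common way to write the pattern, not a claim about what gcc generates for a local static.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct Config { int answer = 42; };  // hypothetical payload type

class LazyConfig {
public:
    ~LazyConfig() { delete ptr_.load(); }

    Config& get() {
        // First check, outside the lock. The acquire load pairs with the
        // release store below, so a non-null pointer implies that the
        // fully constructed object is visible to this thread.
        Config* p = ptr_.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> hold(mutex_);
            // Second check, under the lock: another thread may have
            // finished the initialization while we waited.
            p = ptr_.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new Config();
                ++init_count_;
                // Publish only after construction has completed.
                ptr_.store(p, std::memory_order_release);
            }
        }
        assert(p != nullptr);
        return *p;
    }

    // Number of times the initialization branch ran.
    int init_count() const { return init_count_.load(); }

private:
    std::atomic<Config*> ptr_{nullptr};
    std::mutex mutex_;
    std::atomic<int> init_count_{0};
};
```

The acquire/release pair is the part that classic pre-C++11 versions of the trick were missing.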

anho...@gmail.com

Oct 26, 2019, 6:43:49 PM

On Friday, 25 October 2019 23:56:11 UTC-7, Alf P. Steinbach wrote:

I checked here: https://opensource.apple.com/source/libcppabi/libcppabi-14/src/cxa_guard.cxx
and it seems the compiler translates the initialization of a function-local static variable as:
//  1  if ( obj_guard.first_byte == 0 ) {
//  2      if ( __cxa_guard_acquire(&obj_guard) ) {
//  3          try {
//  4              initialize_the_object();
//  5          }
//  6          catch (...) {
//  7              __cxa_guard_abort(&obj_guard);
//  8              throw;
//  9          }
// 10          ... queue object destructor with __cxa_atexit() ...;
// 11          __cxa_guard_release(&obj_guard);
// 12      }
// 13  }

So can I understand it like this?

1. obj_guard gets set in __cxa_guard_release(&obj_guard), and this MUST (why?) happen after initialize_the_object(). Otherwise we face the same problem as in double-checked locking: the thread holding the lock reorders the initialization and the guard assignment, so another thread doing the check on line 1 incorrectly returns too early.

2. After obj_guard has been set, even if another thread can't immediately see the update, it just goes through a needless lock/unlock, and this only happens the first few times.

Thanks,
AH
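
To make the ordering question in point 1 concrete, here is a small model of the guard protocol. It is a sketch under assumptions: Guard, guard_acquire, guard_release, guard_abort and init_once are hypothetical names, and this is NOT the libcppabi implementation. The model uses an explicit release store when setting the flag and an acquire load for the outer check, which is one way to guarantee that a thread seeing the flag set also sees the initialized object.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical model of a guard; not the real __cxa_guard_* code.
struct Guard {
    std::atomic<bool> done{false};
    std::mutex lock;
};

// Returns true if the caller must run the initializer. In that case
// the lock stays held until guard_release or guard_abort.
bool guard_acquire(Guard& g) {
    g.lock.lock();
    if (g.done.load(std::memory_order_relaxed)) {  // re-check under the lock
        g.lock.unlock();
        return false;
    }
    return true;
}

// Called after a successful initialization: publish the flag, then unlock.
// The release store orders the initializer's writes before the flag.
void guard_release(Guard& g) {
    assert(!g.done.load(std::memory_order_relaxed));
    g.done.store(true, std::memory_order_release);
    g.lock.unlock();
}

// Called when the initializer threw: the flag stays clear so that a
// later call retries the initialization.
void guard_abort(Guard& g) {
    g.lock.unlock();
}

// Mirrors lines 1-13 of the pseudocode above, with a caller-supplied
// initializer instead of initialize_the_object().
template <class Init>
void init_once(Guard& g, Init init) {
    if (!g.done.load(std::memory_order_acquire)) {  // line 1
        if (guard_acquire(g)) {                     // line 2
            try {
                init();                             // line 4
            } catch (...) {
                guard_abort(g);                     // line 7
                throw;                              // line 8
            }
            guard_release(g);                       // line 11
        }
    }
}
```

In this model, a thread that misses the update in the outer check only pays for one lock/unlock round trip, which matches point 2.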

Chris Vine

Oct 26, 2019, 7:24:10 PM

If you look at the code you will see that __cxa_guard_acquire() checks
initializerHasRun() again after it has acquired the mutex. So all is
fine - locking and then unlocking a mutex has acquire and release
semantics and so forces an ordering, even in a case where a race has
occurred with respect to the original check of initializerHasRun()
outside the mutex.

Chris Vine

Oct 26, 2019, 7:44:34 PM

Actually on looking further at the code it does seem to suffer from the
traditional double checked locking problem in that a second thread could
assume initialization by the first thread before it has in fact
occurred.

Presumably either there is something else which deals with that
problem or the compiler, knowing its platform (which is presumably
x86/64 in this case), has calculated that this cannot in practice occur
with its total store ordering memory model.