Volatile accesses force an ordering compared to other volatiles, but
that order is not necessarily visible to other threads. They can still
be useful for two purposes in a multi-threaded environment.
One is if you have a single CPU - no SMP or hardware multi-threading.
On such systems, different threads /will/ see the same order of volatile
accesses (even if memory and other bus masters may not see the same
order without additional fences, barriers, or synchronisation
instructions), and volatile accesses can be significantly cheaper than
atomic accesses.
The other is that volatile accesses can be used to ensure order compared
to other actions, from the viewpoint of the current thread. They can
also control optimisation, avoiding the kind of extra read and write
operation that has been a concern in this thread. And you can easily
force an access to a normal variable to be volatile using a pointer cast
(this is not guaranteed by the standards at the moment, but all
compilers allow it and I believe it is likely to be codified in the
upcoming C standards). As far as I know, you cannot use "*((_Atomic int
*) &x)" to force an atomic access to x, in the way you can force a
volatile access with "*((volatile int *) &x)". Even if it were
possible, volatile accesses can be cheaper than atomic accesses.
But as is noted below, volatile accesses do not synchronise with
multi-threading primitives or atomics (unless these are also volatile),
and their order is not guaranteed to match when viewed from different
threads.
>> If you use atomics you don't need mutexes.
>> So what's the purpose of mutexes then?
>
> This thread is getting horribly confused.
>
> Volatile is useless for multithreading. In C and C++ (as opposed to
> Java and C#), volatile does not synchronize. If you need to
> synchronize, use a mutex, an atomic variable or a fence.
Agreed. (Implementation-specific methods are also possible, but clearly
they will not be portable.)
>
> Chris Thomasson can speak for himself, but it seems clear to me that in
> the example under discussion and his subsequent example (his posting of
> 24 Nov 2018 16:30:43 -0800), he was taking it as a given that every read
> or write access to acquire_count and to his data_0, data_1 and data_2
> variables was (as written by the programmer) within a locking of the
> same mutex. That is the only reasonable explanation of his postings,
> and is how I read them. It also seems clear enough to me that that was
> the case also for the gcc bug posting to which we were referred (the
> reference to dropped increments was not to do with the fact that
> acquire_count is not an atomic variable, but to the fact that the
> compiler was reordering access to it outside the mutex for optimization
> purposes in a way forbidden by the C++ standard, although arguably not
> by posix).
>
> In such a case, using an atomic is pointless. It just results in a
> doubling up of fences or whatever other synchronization primitives the
> implementation uses.
As I see it, the fences from the mutex lock (and presumably later
unlock) protect the accesses to data_0, data_1 and data_2 in his later
example. But the earlier example from the google group link was
different - the increment code was not necessarily in the context of
having the lock. The fence from the lock would prevent movement of the
accesses to acquire_const from moving before the trylock call, but it
would not prevent optimisation after that call. So in this case,
additional effort /is/ needed to ensure there is no unexpected effects.
This could be from other fences, atomic accesses, or volatile accesses.
The following would, I believe, all work:
#include <stdatomic.h>
#include <threads.h>

extern mtx_t mutex;   /* assumed to be initialised elsewhere */

int acquire_counts;
int trylock1(void) {
    int res = mtx_trylock(&mutex);
    if (res == thrd_success) {
        atomic_thread_fence(memory_order_acquire);
        ++acquire_counts;
    }
    return res;
}

or

int acquire_counts;
int trylock2(void) {
    int res = mtx_trylock(&mutex);
    if (res == thrd_success) {
        (*(volatile int *)(&acquire_counts))++;
    }
    return res;
}

or

_Atomic int acquire_counts;
int trylock3(void) {
    int res = mtx_trylock(&mutex);
    if (res == thrd_success) {
        ++acquire_counts;
    }
    return res;
}
>
> Sure, if not all accesses to a protected variable are within a mutex,
> then it needs to be atomic.
You need to be careful about mixing accesses that are protected by a
mutex with accesses that are not protected by the same mutex, even if
those accesses are atomic. Some accesses won't be guaranteed to be
visible, or to appear in the same order, unless the reading thread also
takes the same mutex. Without that, other threads will read the atomic
variable with either the old value or the new value - but not
necessarily in a consistent order relative to other data.
> But if that is the case there is probably
> something wrong with the design.
Agreed.
> You should not design code which
> requires such a doubling up of synchronization approaches, and I cannot
> immediately visualize a case where that would be sensible.
>
I'd say trylock1 above is the best choice for this example, and avoids
unnecessary "doubling up". But too much protection is better than too
little - it is better to be a bit inefficient than to risk race
conditions.