On 26/09/2022 09:53, Juha Nieminen wrote:
> David Brown <david...@hesbynett.no> wrote:
>> There are a number of ways to implement atomics that are larger than a
>> single bus operation can handle, or that involve multiple bus
>> operations. Different processor types have different solutions, and
>> some are optimised or limited to particular setups (such as single
>> writer, single processor, etc.). Some processors can handle atomic
>> accesses for sizes that are bigger than native C/C++ types (such as
>> 128-bit accesses). Some cannot handle atomic writes for the bigger
>> native types.
>
> I think there's a bit of confusion here about what the term "atomic"
> means. You seem to be talking about a concept of "atomic" with some kind
> of meaning like "mutual exclusion supported by the CPU itself".
>
No, that's not what I am saying.
> That's not what "atomic" means in general, when talking about
> multithreaded programming. In general "atomic" merely means that the
> resource in question can only be accessed by one thread at a time
> (in other words, it implements some sort of mutual exclusion).
And that's not quite right either.
"Atomic" means that accesses are indivisible. As many threads as you
want can read or write to the data at the same time - the defining
feature is that there is no possibility of a partial access succeeding.
We've mostly mentioned reads and writes - but more complex transactions
can be atomic too, such as increments. The term can also apply to
collections of accesses, well-known from the database world. Such
atomic transactions need to be built on top of low-level atomic
accesses, using locks, lock-free algorithms, or more advanced protocols
such as software transactional memory.
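
To make the distinction concrete, here is a minimal C++ sketch (the
names are just for illustration): the fetch_add on the std::atomic
counter is an indivisible read-modify-write, while the plain increment
is a racy read-modify-write that can lose updates when two threads
interleave.

#include <atomic>
#include <cstdio>
#include <thread>

int main()
{
    std::atomic<int> atomic_counter{0};
    int plain_counter = 0;

    auto work = [&] {
        for (int i = 0; i < 100000; ++i) {
            atomic_counter.fetch_add(1, std::memory_order_relaxed);
            ++plain_counter;    // data race (undefined behaviour) -
                                // increments can be lost
        }
    };

    std::thread t1(work), t2(work);
    t1.join();
    t2.join();

    // atomic_counter is always 200000; plain_counter usually is not
    std::printf("atomic %d, plain %d\n",
                atomic_counter.load(), plain_counter);
}
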
Atomic accesses do not have to be implemented purely in hardware,
though that is the most efficient approach - and anything
software-based is going to depend on smaller hardware-based atomic
accesses. By far the most
convenient accesses are when you can read or write the memory with
normal memory access instructions, or at most by using things such as a
"bus lock prefix" available on some processors. On RISC processors,
anything beyond a single read or write of a size handled directly by
hardware typically involves load-store-exclusive sequences.
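
Whether a given size gets a single instruction, a locked or
load-store-exclusive sequence, or falls back to a hidden library lock
is entirely implementation-dependent. In C++17 you can at least query
what you got - a rough sketch (the 16-byte Pair struct is just an
example):

#include <atomic>
#include <cstdio>

struct Pair { long a; long b; };   // 16 bytes on a typical 64-bit target

int main()
{
    std::printf("atomic<int>  always lock-free: %d\n",
                (int)std::atomic<int>::is_always_lock_free);
    std::printf("atomic<long> always lock-free: %d\n",
                (int)std::atomic<long>::is_always_lock_free);

    std::atomic<Pair> p{Pair{}};
    // May be lock-free (e.g. via cmpxchg16b or a load/store-exclusive
    // sequence) or may use a hidden lock, depending on the processor
    // and the compiler options.
    std::printf("atomic<Pair> lock-free here: %d\n", (int)p.is_lock_free());
}
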
When you have to use code sequences for access, then it's common that
you end up with mutual exclusion - one thread at a time has access. But
it doesn't have to be that way, and different software sequences can be
used to optimise different usage patterns. All that matters is that if
a read sequence exits happily saying "I've read the data", then the data
it read matches exactly the data that some thread wrote at some point.
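
A sequence lock is a well-known example of such a non-mutual-exclusion
sequence, good for data that is read far more often than it is written:
the writer never blocks, and a reader simply retries if a write
overlapped its read. A rough sketch, assuming a single writer (the
names and the two-int payload are just for illustration):

#include <atomic>

struct SeqLockData {
    std::atomic<unsigned> seq{0};
    std::atomic<int> a{0};   // relaxed atomics avoid a formal data race
    std::atomic<int> b{0};
};

void write_pair(SeqLockData& d, int a, int b)
{
    unsigned s = d.seq.load(std::memory_order_relaxed);
    d.seq.store(s + 1, std::memory_order_relaxed);   // odd: write in progress
    std::atomic_thread_fence(std::memory_order_release);
    d.a.store(a, std::memory_order_relaxed);
    d.b.store(b, std::memory_order_relaxed);
    d.seq.store(s + 2, std::memory_order_release);   // even: stable again
}

bool try_read_pair(const SeqLockData& d, int& a, int& b)
{
    unsigned s1 = d.seq.load(std::memory_order_acquire);
    if (s1 & 1)
        return false;                        // write in progress
    a = d.a.load(std::memory_order_relaxed);
    b = d.b.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    return s1 == d.seq.load(std::memory_order_relaxed);   // unchanged?
}

void read_pair(const SeqLockData& d, int& a, int& b)
{
    while (!try_read_pair(d, a, b)) {
        // retry - the reader never takes a lock and never blocks the writer
    }
}

If try_read_pair() returns true, the reader got a consistent {a, b}
pair that some write actually produced - exactly the guarantee
described above, with no mutual exclusion anywhere.
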
>
> As a concrete example: POSIX requires that fwrite() be atomic (for a
> particular FILE object). This means that no two threads can write
> to the same FILE object with a singular fwrite() call at the same
> time. In other words, fwrite() implements (at some level) some kind
> of (per FILE object) mutex.
>
That's at a much higher level than has been under discussion here - but
yes, that is applying the same term and guarantees for different
purposes. (The "atomic" requirement does not force a mutex, but
fwrite() has other guarantees beyond mere atomicity.)
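
For what it's worth, POSIX also exposes the per-FILE lock directly:
flockfile()/funlockfile() let you extend the per-call guarantee over a
group of calls. A small sketch, assuming a POSIX system (the tag
strings are arbitrary):

#include <stdio.h>     // POSIX flockfile()/funlockfile() live here
#include <thread>

void writer(FILE* f, const char* tag)
{
    // Each individual fwrite() is already atomic per FILE under POSIX;
    // the explicit lock makes the *pair* of calls come out together.
    flockfile(f);
    fwrite(tag, 1, 5, f);
    fwrite(" done\n", 1, 6, f);
    funlockfile(f);
}

int main()
{
    std::thread t1(writer, stdout, "AAAAA");
    std::thread t2(writer, stdout, "BBBBB");
    t1.join();
    t2.join();
}
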
> "Atomic" is actually a stronger guarantee than merely "thread-safe".
>
> If fwrite() were merely guaranteed to be "thread-safe", it would just
> mean that it won't break (eg. corrupt its internal state, or any other
> data anywhere else) if two threads call it at the same time, but it
> wouldn't guarantee that the data written by those two threads won't
> be interleaved somehow.
>
"Thread safe" is not as well-defined a term as "atomic", as far as I see it.
> However, since fwrite() is "atomic", not just "thread-safe" (if it
> conforms to POSIX), then it implements a mutex for the entire function
> call (for that particular FILE object).
"Atomic" is not really enough to describe the behaviour of a function
like "fwrite", since the function does not act on a single "state". If
you have two threads trying to write A and B to the same object
simultaneously, atomicity means that a third thread reading the object
will see A or B, and never a mixture. It's fine if this is implemented
by a write of A then a write of B, a write of B then a write of A, a
write of A alone, a write of B alone, a lock blocking the reading
thread while the object temporarily holds some mixture of values that
gets sorted into one of A or B before the lock is released, or any
other combination. Clearly that is not the
behaviour you want from fwrite() - here there should be either A then B,
or B then A.
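
For the low-level sense of the word, here is what that "A or B, never a
mixture" guarantee looks like in C++ (the Pair type and the values are
just an illustration):

#include <atomic>
#include <cassert>
#include <thread>

struct Pair { int x; int y; };          // written and read as one unit

std::atomic<Pair> shared{Pair{0, 0}};

int main()
{
    std::thread wa([] { shared.store(Pair{1, 1}); });   // "A"
    std::thread wb([] { shared.store(Pair{2, 2}); });   // "B"

    // The reader runs concurrently with both writers.  It will see
    // {0,0}, {1,1} or {2,2} - never e.g. {1,2} - because no partial
    // write is ever visible.  Which complete value it sees, and in
    // what order the writes land, is left entirely open.
    Pair p = shared.load();
    assert(p.x == p.y);

    wa.join();
    wb.join();
}

That open ordering is precisely why plain atomicity is too weak a
description for fwrite(), which needs the stronger "one whole call,
then the other" behaviour on top of it.
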