multi-threaded data access

novic...@gmail.com

unread,

Dec 22, 2009, 7:09:46 PM12/22/09

to

lets say I have

unsigned int x = 0;

And the code

x++;

Is run a total of 1 million times but the million times is divided
across many threads.

If i try to read the value of x am i guaranteed to get a value between
0 and 1 million.

The subtle question here is, are reads and writes of a variable
atomic? Do i have to worry that i am reading the variable while it is
being updated and might get total garbage such as 9 million as the
value when i read it. Or is the worst case that i missed some
increments because the increment on 1 thread was done without taking
into account the updates from another thread?

I hope i have made the question clear.

Thank you,
Ivan Novick
http://www.mycppquiz.com

David Schwartz

unread,

Dec 22, 2009, 7:19:12 PM12/22/09

to

On Dec 22, 4:09 pm, "novicki...@gmail.com" <novicki...@gmail.com>
wrote:

> lets say I have
>
> unsigned int x = 0;
>
> And the code
>
> x++;
>
> Is run a total of 1 million times but the million times is divided
> across many threads.
>
> If i try to read the value of x am i guaranteed to get a value between
> 0 and 1 million.

No, at least not unless some document provides such a guarantee.
Neither the C nor C++ standards provide one.

> The subtle question here is, are reads and writes of a variable
> atomic? Do i have to worry that i am reading the variable while it is
> being updated and might get total garbage such as 9 million as the
> value when i read it. Or is the worst case that i missed some
> increments because the increment on 1 thread was done without taking
> into account the updates from another thread?

It depends what the documentation for the threading standard you are
using says. If POSIX, you are not guaranteed anything at all, it could
even crash.

DS

Rainer Weikusat

unread,

Dec 23, 2009, 5:24:22 AM12/23/09

to

"novic...@gmail.com" <novic...@gmail.com> writes:
> lets say I have
>
> unsigned int x = 0;
>
> And the code
>
> x++;
>
> Is run a total of 1 million times but the million times is divided
> across many threads.
>
> If i try to read the value of x am i guaranteed to get a value between
> 0 and 1 million.
>
> The subtle question here is, are reads and writes of a variable
> atomic?

The short answer is 'yes'. The longer one would be: This depends on
the machine code which is actually executed and on the properties of
your hardware platform. Provided that x++ is translated to single word
operations and is properly aligned so that it can be loaded and stored
in a single bus cycle, the answer is still yes. If you want a paper
decree regarding 'how people have to generally design hardware', you
won't find one.

Rainer Weikusat

unread,

Dec 23, 2009, 5:29:17 AM12/23/09

to

David Schwartz <dav...@webmaster.com> writes:
> On Dec 22, 4:09�pm, "novicki...@gmail.com" <novicki...@gmail.com>
> wrote:
>> lets say I have
>>
>> unsigned int x = 0;
>>
>> And the code
>>
>> x++;
>>
>> Is run a total of 1 million times but the million times is divided
>> across many threads.
>>
>> If i try to read the value of x am i guaranteed to get a value between
>> 0 and 1 million.
>
> No, at least not unless some document provides such a guarantee.

A standard is a specification of requirements and as such, doesn't
"guarantee" anything. Insofar a test suite exists and a particular
platform was found to be compliant, that would be sort-of a guarantee
(the test suite itself could be buggy).

[...]

>> The subtle question here is, are reads and writes of a variable
>> atomic? �Do i have to worry that i am reading the variable while it is
>> being updated and might get total garbage such as 9 million as the
>> value when i read it. �Or is the worst case that i missed some
>> increments because the increment on 1 thread was done without taking
>> into account the updates from another thread?
>
> It depends what the documentation for the threading standard you are
> using says. If POSIX, you are not guaranteed anything at all, it could
> even crash.

This is a statement which goes beyond the realm of the standard,
meaning, either it describes an actual example, then this example
should be named, and otherwise, it is science fiction.

Eric Sosman

unread,

Dec 23, 2009, 8:18:52 AM12/23/09

to

On 12/23/2009 5:24 AM, Rainer Weikusat wrote:
> "novic...@gmail.com"<novic...@gmail.com> writes:
>> [...]

>> The subtle question here is, are reads and writes of a variable
>> atomic?
>
> The short answer is 'yes'.

The short answer is "maybe."

--
Eric Sosman
eso...@ieee-dot-org.invalid

Rainer Weikusat

unread,

Dec 23, 2009, 8:33:28 AM12/23/09

to

Eric Sosman <eso...@ieee-dot-org.invalid> writes:
> On 12/23/2009 5:24 AM, Rainer Weikusat wrote:
>> "novic...@gmail.com"<novic...@gmail.com> writes:
>>> [...]
>>> The subtle question here is, are reads and writes of a variable
>>> atomic?
>>
>> The short answer is 'yes'.
>
> The short answer is "maybe."

That's the long answer, and it shouldn't be 'maybe' but 'very likely
so' and include a somewhat more detailed explanation. Possibly
including a list of platforms where the answer is actually no and the
addresses of the various museums where one can still see them[*].

[*] An example I remembler (from a past posting) was 'some
m68k platform' where the data bus width was smaller than the
'int size'.

BTW, there is no such thing as a good or even honest motive for trying
to conceal information by threatening others into voluntarily
blindfolding themselves.

Ralph Böhme

unread,

Dec 23, 2009, 9:05:50 AM12/23/09

to

Eric Sosman <eso...@ieee-dot-org.invalid> schrieb:

> On 12/23/2009 5:24 AM, Rainer Weikusat wrote:
>> "novic...@gmail.com"<novic...@gmail.com> writes:
>>> [...]
>>> The subtle question here is, are reads and writes of a variable
>>> atomic?
>>
>> The short answer is 'yes'.
> The short answer is "maybe."

I guess the helpful short answer is sig_atomic_t ?

-Ralph

--
s/-nsp// for mail

Eric Sosman

unread,

Dec 23, 2009, 10:47:33 AM12/23/09

to

Thanks for the clarification. I now understand that the

Rainer Weikusat

unread,

Dec 23, 2009, 10:59:06 AM12/23/09

to

The OP is using Mac OS X on Intel and this means the answer is 'yes'
for him, subject to the limitations I already wrote about, eg,
accesses crossing cacheline boundaries. This is also true for all
other OSes running on x86 platforms and for all other hardware/
software combinations I am aware of.

24.4.7.2 Atomic Types
.....................

[...]

In practice, you can assume that `int' is atomic. You can also
assume that pointer types are atomic; that is very convenient. Both of
these assumptions are true on all of the machines that the GNU C
library supports and on all POSIX systems we know of.

This is part of the GNU C library documentation and implies that the
statement is also true for a wide range of architectures I haven't
ever encountered personally. Please name a few where 'word
accesses' to properly aligned storage locations are not atomic.
That shouldn't be so difficult, should it?

Eric Sosman

unread,

Dec 23, 2009, 11:33:24 AM12/23/09

to

On 12/23/2009 10:59 AM, Rainer Weikusat wrote:
> Eric Sosman<eso...@ieee-dot-org.invalid> writes:
>> On 12/23/2009 8:33 AM, Rainer Weikusat wrote:
>>> Eric Sosman<eso...@ieee-dot-org.invalid> writes:
>>>> On 12/23/2009 5:24 AM, Rainer Weikusat wrote:
>>>>> "novic...@gmail.com"<novic...@gmail.com> writes:
>>>>>> [...]
>>>>>> The subtle question here is, are reads and writes of a variable
>>>>>> atomic?
>>>>>
>>>>> The short answer is 'yes'.
>>>>
>>>> The short answer is "maybe."
>>>
>>> That's the long answer, and it shouldn't be 'maybe' but 'very likely
>>> so' and include a somewhat more detailed explanation. Possibly
>>> including a list of platforms where the answer is actually no and the
>>> addresses of the various museums where one can still see them[*].
>>>
>>> [*] An example I remembler (from a past posting) was 'some
>>> m68k platform' where the data bus width was smaller than the
>>> 'int size'.
>>>
>>> BTW, there is no such thing as a good or even honest motive for trying
>>> to conceal information by threatening others into voluntarily
>>> blindfolding themselves.
>>
>> Thanks for the clarification. I now understand that the
>> short answer is "maybe."
>

> The OP is using Mac OS X on Intel [...]

You made that up. It's your own fabrication, not something
the O.P. wrote or implied -- not in his original post, at any
rate, and as yet my news server shows a grand total of zero count
them zero of his further contributions to the thread in the
ensuing sixteen-plus hours.

--
Eric Sosman
eso...@ieee-dot-org.invalid

Rainer Weikusat

unread,

Dec 23, 2009, 12:23:50 PM12/23/09

to

No I didn't, as you are very well aware. The relevant header which
came with the original posting was

X-HTTP-UserAgent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_8; en-US)
AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.0.249.43 Safari/532.5,gzip(gfe),gzip(gfe)

Not that this would really matter for the core parts of my
statement. So, where are your (or anyone else's) examples for
non-fictional platforms with non-atomic 'int/ unsigned' memory
accesses? This is, after all, information which could be of use, while
hand-waiving generally isn't.

Golden California Girls

unread,

Dec 23, 2009, 1:21:22 PM12/23/09

to

Rainer Weikusat wrote:
>
> No I didn't, as you are very well aware. The relevant header which
> came with the original posting was
>
> X-HTTP-UserAgent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_8; en-US)
> AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.0.249.43 Safari/532.5,gzip(gfe),gzip(gfe)
>

Which just means he has a Mac, it in no way implies that is the machine he is
writing code for.

David Schwartz

unread,

Dec 23, 2009, 2:19:05 PM12/23/09

to

On Dec 23, 2:29 am, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:

> > It depends what the documentation for the threading standard you are
> > using says. If POSIX, you are not guaranteed anything at all, it could
> > even crash.

> This is a statement which goes beyond the realm of the standard,
> meaning, either it describes an actual example, then this example
> should be named, and otherwise, it is science fiction.

Huh? The POSIX standard is quite clear that it puts no restrictions
whatsoever on what an implementation can do in this case.

DS

David Schwartz

unread,

Dec 23, 2009, 2:20:57 PM12/23/09

to

On Dec 23, 9:23 am, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:

> Not that this would really matter for the core parts of my
> statement. So, where are your (or anyone else's) examples for
> non-fictional platforms with non-atomic 'int/ unsigned' memory
> accesses? This is, after all, information which could be of use, while
> hand-waiving generally isn't.

That's sheer idiocy. Perhaps you consider it acceptable to write code
that crashes or fails in subtle ways when someone upgrades their CPU
or operating system, but I don't. Today's code is written to run
*reliably* on tomorrow's CPUs as well.

DS

Scott Lurndal

unread,

Dec 23, 2009, 2:42:44 PM12/23/09

to

Why? sig_atomic_t's significant characteristic is the volatile keyword.

There are many reasons that the "x++;" statement in the OP should
be avoided; the primary reason being cache contention. You'll get
no speedup from the additional threads if they all need to access
the same cache line. Might as well just use one thread.

One must also worry about compiler reordering (for which volatile can be
of some limited use) and C Language sequence points as well as atomicity.

While glibc documentation may imply that accesses to 'int' are "generally"
atomic, if your algorithm requires true atomicity, either protect the
variable with a pthread_mutex_t or use in-line assembly to generate the
locked prefix (later version of gcc support builtin atomic access functions,
such as __sync_fetch_and_add, etc. which automatically generate the appropriate
atomic instruction sequence for the architecture. See info gcc.)

One must also be aware of platform load and store memory ordering and use the
appropriate barriers (again, the __sync* builtins in gcc handle this for you).

scott

Rainer Weikusat

unread,

Dec 23, 2009, 2:45:15 PM12/23/09

to

David Schwartz <dav...@webmaster.com> writes:
> On Dec 23, 9:23�am, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:
>
>> Not that this would really matter for the core parts of my
>> statement. So, where are your (or anyone else's) examples for
>> non-fictional platforms with non-atomic 'int/ unsigned' memory
>> accesses? This is, after all, information which could be of use, while
>> hand-waiving generally isn't.
>
> That's sheer idiocy.

No. 'Idiocy', by its original definition, roughly means 'ignoring
facts in favor of preconceived opinions regarding how the world really
should have been instead'. Such as in persistently failing to mention
relevant examples in the domain this thread takes place, which is
'properties of hardware memory models', in favor of fairy-tales based on
what particular documents don't say about topics they were never
supposed to cover.

Rainer Weikusat

unread,

Dec 23, 2009, 2:45:54 PM12/23/09

to

In absence of other information, that's a reasonable assumption.

Rainer Weikusat

unread,

Dec 23, 2009, 2:52:27 PM12/23/09

to

POSIX/ UNIX(*) is an API specification and not concerned with hardware
properties at that level.

Eric Sosman

unread,

Dec 23, 2009, 4:01:42 PM12/23/09

to

Nor, even if he is in fact writing for that machine at the
moment, does it mean that he's interested only in that machine.
He did *not* specify any particular O/S, he did *not* specify
any particular hardware, he did *not* specify any particular
library or threading standard/framework or anything else. The
*only* available inference is that since he asked the question
in comp.unix.programmer he is presumably interested in flavors
of Unix-allied systems.

And that's all. No further information was provided to
specialize the query.

--
Eric Sosman
eso...@ieee-dot-org.invalid

Rainer Weikusat

unread,

Dec 23, 2009, 4:12:59 PM12/23/09

to

sc...@slp53.sl.home (Scott Lurndal) writes:
> Ralph =?UTF-8?Q?B=C3=B6hme?= <ralp...@rsrc.de> writes:
>>Eric Sosman <eso...@ieee-dot-org.invalid> schrieb:
>>> On 12/23/2009 5:24 AM, Rainer Weikusat wrote:
>>>> "novic...@gmail.com"<novic...@gmail.com> writes:
>>>>> [...]
>>>>> The subtle question here is, are reads and writes of a variable
>>>>> atomic?
>>>>
>>>> The short answer is 'yes'.
>>> The short answer is "maybe."
>>
>>I guess the helpful short answer is sig_atomic_t ?

[...]

> While glibc documentation may imply that accesses to 'int' are "generally"
> atomic, if your algorithm requires true atomicity, either protect the
> variable with a pthread_mutex_t or use in-line assembly to generate the
> locked prefix (later version of gcc support builtin atomic access functions,
> such as __sync_fetch_and_add, etc. which automatically generate the
> appropriate atomic instruction sequence for the architecture. See
> info gcc.)

The quoted text above is a question specifically relating to read and
write operations on objects of type 'unsigned'. Not to
read-modify-write-cycles. Not memory access ordering (or lack thereof)
in multiprocessor systems. Not to all kinds of other things which
COULD be more or less loosely related to conceivable reasons why one
could ask such a question. And non-misaligned single-word loads or
stores are atomic for all practical purposes. One could argue that
this is a hardware-related question and thus, off-topic here. OTOH,
even UNIX(*) runs on real computers.

David Schwartz

unread,

Dec 23, 2009, 4:19:11 PM12/23/09

to

Right, but most UNIX programmers write hardware-independent code, so
they don't care about the properties of specific hardware
implementations. Do you see anything about the OP's question that
suggests that he's way outside the norm?

DS

David Schwartz

unread,

Dec 23, 2009, 4:20:29 PM12/23/09

to

On Dec 23, 11:45 am, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:

> No. 'Idiocy', by its original definition, roughly means 'ignoring
> facts in favor of preconceived opinions regarding how the world really
> should have been instead'. Such as in persistently failing to mention
> relevant examples in the domain this thread takes place, which is
> 'properties of hardware memory models', in favor of fairy-tales based on
> what particular documents don't say about topics they were never
> supposed to cover.

You have taken a perfectly routine UINX programming question and
magically transformed it into a question about the details of specific
hardware implementations. You are the one cramming this question into
your ridiculous preconceived opinions.

DS

Rainer Weikusat

unread,

Dec 23, 2009, 4:32:05 PM12/23/09

to

David Schwartz <dav...@webmaster.com> writes:
> On Dec 23, 11:45�am, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:
>> No. 'Idiocy', by its original definition, roughly means 'ignoring
>> facts in favor of preconceived opinions regarding how the world really
>> should have been instead'. Such as in persistently failing to mention
>> relevant examples in the domain this thread takes place, which is
>> 'properties of hardware memory models', in favor of fairy-tales based on
>> what particular documents don't say about topics they were never
>> supposed to cover.
>
> You have taken a perfectly routine UINX programming question and
> magically transformed it into a question about the details of specific
> hardware implementations.

What is (by definition) outside the domain of an API-specification
based on C 'function calls' cannot ever be part of it ...

> You are the one cramming this question into
> your ridiculous preconceived opinions.

... and your philosophical convictions as to what people should or
shouldn't (be allowed) to know don't change this. Actually, that you
are lacking the spine to argue in favor of your conviction that
Programmers Must Be Blindfolded For The Good Of Us All, as evidenced
by the fact that your only 'argument' in favor of your opinion is
character assassination, serves as a strong point against it: One
should naturally question what you have to loose when others learn
what you yourself certainly know.

Scott Lurndal

unread,

Dec 23, 2009, 4:50:50 PM12/23/09

to

Rainer Weikusat <rwei...@mssgmbh.com> writes:
>sc...@slp53.sl.home (Scott Lurndal) writes:
>> Ralph =?UTF-8?Q?B=C3=B6hme?= <ralp...@rsrc.de> writes:
>>>Eric Sosman <eso...@ieee-dot-org.invalid> schrieb:
>>>> On 12/23/2009 5:24 AM, Rainer Weikusat wrote:
>>>>> "novic...@gmail.com"<novic...@gmail.com> writes:
>>>>>> [...]
>>>>>> The subtle question here is, are reads and writes of a variable
>>>>>> atomic?
>>>>>
>>>>> The short answer is 'yes'.
>>>> The short answer is "maybe."
>>>
>>>I guess the helpful short answer is sig_atomic_t ?
>
>[...]
>
>> While glibc documentation may imply that accesses to 'int' are "generally"
>> atomic, if your algorithm requires true atomicity, either protect the
>> variable with a pthread_mutex_t or use in-line assembly to generate the
>> locked prefix (later version of gcc support builtin atomic access functions,
>> such as __sync_fetch_and_add, etc. which automatically generate the
>> appropriate atomic instruction sequence for the architecture. See
>> info gcc.)
>
>The quoted text above is a question specifically relating to read and
>write operations on objects of type 'unsigned'. Not to
>read-modify-write-cycles.

I'll point out:

1) The O.P. used 'x++' which is a read-modify-write cycle.
2) For the problem posed by the O.P. more threads than cores
makes little sense, so the O.P. also implicitly assumed a
multiprocessor system.

scott

David Schwartz

unread,

Dec 23, 2009, 7:26:26 PM12/23/09

to

On Dec 23, 1:32 pm, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:

> What is (by definition) outside the domain of an API-specification
> based on C 'function calls' cannot ever be part of it ...

Yes, and this question is 100% inside the domain of the API
specification. The API specifies under what cases operation have
defined results and under what cases they don't.

> > You are the one cramming this question into
> > your ridiculous preconceived opinions.

> ... and your philosophical convictions as to what people should or
> shouldn't (be allowed) to know don't change this. Actually, that you
> are lacking the spine to argue in favor of your conviction that
> Programmers Must Be Blindfolded For The Good Of Us All, as evidenced
> by the fact that your only 'argument' in favor of your opinion is
> character assassination, serves as a strong point against it: One
> should naturally question what you have to loose when others learn
> what you yourself certainly know.

You've really lost your mind. And you do a major disservice to people
who listen to you.

DS

Barry Margolin

unread,

Dec 23, 2009, 9:00:27 PM12/23/09

to

In article <87fx72v...@fever.mssgmbh.com>,
Rainer Weikusat <rwei...@mssgmbh.com> wrote:

> David Schwartz <dav...@webmaster.com> writes:
> > On Dec 22, 4:09�pm, "novicki...@gmail.com" <novicki...@gmail.com>
> > wrote:
> >> lets say I have
> >>
> >> unsigned int x = 0;
> >>
> >> And the code
> >>
> >> x++;
> >>
> >> Is run a total of 1 million times but the million times is divided
> >> across many threads.
> >>
> >> If i try to read the value of x am i guaranteed to get a value between
> >> 0 and 1 million.
> >
> > No, at least not unless some document provides such a guarantee.
>
> A standard is a specification of requirements and as such, doesn't
> "guarantee" anything. Insofar a test suite exists and a particular
> platform was found to be compliant, that would be sort-of a guarantee
> (the test suite itself could be buggy).

Well, a claim of conformance to a standard is considered to be a
guarantee that all its requirements are met. If it doesn't actually
meet the guarantees, it's a bug. But that's as good a guarantee you're
going to get in anything computer-related.

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

Barry Margolin

unread,

Dec 23, 2009, 9:06:33 PM12/23/09

to

In article <87vdfxb...@fever.mssgmbh.com>,
Rainer Weikusat <rwei...@mssgmbh.com> wrote:

But POSIX *could* specify that the compiler must automatically generate
the necessary mutual exclusion to ensure that the operation works
atomically if the hardware doesn't ensure this. That's one of the
things operating systems and programming languages do, they hide the
details of the hardware and provide a common abstraction.

However, POSIX doesn't have such a requirement, so it ends up being
hardware-dependent.

novic...@gmail.com

unread,

Dec 23, 2009, 11:23:39 PM12/23/09

to

On Dec 23, 7:59 am, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:

> Eric Sosman <esos...@ieee-dot-org.invalid> writes:
> > On 12/23/2009 8:33 AM, Rainer Weikusat wrote:

> >> Eric Sosman<esos...@ieee-dot-org.invalid> writes:

> >>> On 12/23/2009 5:24 AM, Rainer Weikusat wrote:

> >>>> "novicki...@gmail.com"<novicki...@gmail.com> writes:

> The OP is using Mac OS X on Intel and this means the answer is 'yes'
> for him, subject to the limitations I already wrote about, eg,
> accesses crossing cacheline boundaries. This is also true for all
> other OSes running on x86 platforms and for all other hardware/
> software combinations I am aware of.

Just for the record, my primary platform is x86 using various OS's.
Sparc is also of some interest even though it is the minority case.

Cheers,
Ivan Novick

novic...@gmail.com

unread,

Dec 23, 2009, 11:34:45 PM12/23/09

to

On Dec 22, 4:19 pm, David Schwartz <dav...@webmaster.com> wrote:
> On Dec 22, 4:09 pm, "novicki...@gmail.com" <novicki...@gmail.com>

> > The subtle question here is, are reads and writes of a variable
> > atomic? Do i have to worry that i am reading the variable while it is

> > being updated and might get total garbage such as 9 million as the
> > value when i read it. Or is the worst case that i missed some
> > increments because the increment on 1 thread was done without taking
> > into account the updates from another thread?
>

> It depends what the documentation for the threading standard you are
> using says. If POSIX, you are not guaranteed anything at all, it could
> even crash.
>

Ok, I understand it is not guaranteed to be safe by POSIX ( yes i am
assuming posix threads).

However, I am still curious if due to the design of modern hardware in
general, it is possible to read a value for the variable that was
never actually set. The code is trying to set values between 1 and 1
million. But could you read a value of 8 million if 8 million is
never set. The reason would be because you read an incompletely
written value at some point. Or is the computer such that you can
only read complete values and never a partially written value.

The use case would be if you did not care about missing some updates
to a variable, that was not locked, but still you don't want to read a
value that was never set in the variable at all. If this could be
"guaranteed" by the bus size or something like that, you might be able
to save doing locks.

Thanks for all the info so far guys.

Cheers, and merry Christmas,
Ivan Novick

David Schwartz

unread,

Dec 23, 2009, 11:43:10 PM12/23/09

to

On Dec 23, 8:34 pm, "novicki...@gmail.com" <novicki...@gmail.com>
wrote:

> However, I am still curious if due to the design of modern hardware in
> general, it is possible to read a value for the variable that was
> never actually set. The code is trying to set values between 1 and 1
> million. But could you read a value of 8 million if 8 million is
> never set. The reason would be because you read an incompletely
> written value at some point. Or is the computer such that you can
> only read complete values and never a partially written value.

Most likely, word tearing would be the only way. I don't know of any
POSIX platform where integers are subject to word tearing. The problem
is that you would be producing software that might break when a new
CPU, operating system, compiler, or something else comes out. On the
bright side, at least on x86, a lot of software makes the assumption
that read and writes to and from aligned integers are atomic (I think
it may even be guaranteed by Intel in the x86 specification) so you
certainly wouldn't be the only thing that would break.

> The use case would be if you did not care about missing some updates
> to a variable, that was not locked, but still you don't want to read a
> value that was never set in the variable at all. If this could be
> "guaranteed" by the bus size or something like that, you might be able
> to save doing locks.

It's almost definitely not worth it. You would lose the other benefits
of locks, such as congestion avoidance.

DS

Golden California Girls

unread,

Dec 23, 2009, 11:59:32 PM12/23/09

to

novic...@gmail.com wrote:
> lets say I have
>
> unsigned int x = 0;
>
> And the code
>
> x++;
>
> Is run a total of 1 million times but the million times is divided
> across many threads.
>
> If i try to read the value of x am i guaranteed to get a value between
> 0 and 1 million.
>

> The subtle question here is, are reads and writes of a variable
> atomic? Do i have to worry that i am reading the variable while it is
> being updated and might get total garbage such as 9 million as the
> value when i read it. Or is the worst case that i missed some
> increments because the increment on 1 thread was done without taking
> into account the updates from another thread?
>

> I hope i have made the question clear.

There has to be an assumption made, that int is large enough to store 1 million.
That isn't stated in your code. Rather obviously if it isn't large enough and
as it is unsigned then no matter what happens your answer will be yes.

If your question was is there hardware that allows an interrupt in the middle of
a machine cycle of a read or write, I'm not aware of any, but that may not be
true of an emulator machine.

David Schwartz

unread,

Dec 24, 2009, 1:49:00 AM12/24/09

to

On Dec 23, 8:59 pm, Golden California Girls <gldncag...@aol.com.mil>
wrote:

> If your question was is there hardware that allows an interrupt in the middle of
> a machine cycle of a read or write, I'm not aware of any, but that may not be
> true of an emulator machine.

Well, the more likely problems would be either that a single machine
instruction would be used to accomplish the read or the write or that
one CPU would read at the same time another is writing. An interrupt
in the middle of an instruction is just one possible way it could
fail.

In theory, on x86 you could use two machine instructions to write a 32-
bit value if you wanted to. It's hard to imagine any reason a compiler
would do that. And, on x86, 32-bit aligned values are not subject to
word tearing.

There are other ways it can go wrong, but they're also very unlikely.
For example, the compiler could 'optimize':

a++;

to:

a+=10000;
j=a;
j-=9999;
a=j;

As far as the compiler knows, this will have precisely the same
effect. However, if another thread read the value at the same time as
the 'j=a;' step, and then incremented it and wrote it back after the
'a=j;' step, you could get a totally bogus value.

Now, I know what you're thinking -- there's no way the compiler would
ever make an 'optimization' like this because it actually makes the
code worse. But the scary truth is that there are optimizations like
this that compilers do make.

One example is if the compiler is just one register short. Faced with
'i++;' on a RISC platform the most optimal implementation might be to
swap the memory location with a register, increment the register, and
swap back.

Sadly, there are similar known cases on x86. If I had the time, I'd
track down the GCC bug report.

DS

novic...@gmail.com

unread,

Dec 24, 2009, 10:54:57 AM12/24/09

to

Ok, I feel convinced, that it is NOT safe to read and write to a
variable from multiple threads without locks even if you are not
concerned with missing the latest updates to the values. At the risk
of starting a whole new thread, I guess the best performance would be
to use atomic memory access functions like these:
http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html or
the equivalent.

Cheers,
Ivan Novick

David Schwartz

unread,

Dec 24, 2009, 2:38:04 PM12/24/09

to

On Dec 24, 7:54 am, "novicki...@gmail.com" <novicki...@gmail.com>
wrote:

> Ok, I feel convinced, that it is NOT safe to read and write to a
> variable from multiple threads without locks even if you are not
> concerned with missing the latest updates to the values. At the risk
> of starting a whole new thread, I guess the best performance would be
> to use atomic memory access functions like these:
> http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html or
> the equivalent.

Sadly, atomic operations on x86 CPUs tend to be ridiculously
expensive, on the order of 200 clock cycles or so. This is because
even a simple atomic increment implements a full CPU barrier. If
that's not a problem, then that's probably your best choice.

But in my experience, it's almost always possible to architect these
kinds of things out of all performance-critical paths. If your threads
need to exchange information so frequently that it's causing a
performance problem, you're probably doing something wrong.

DS

Chris M. Thomasson

unread,

Dec 24, 2009, 5:37:16 PM12/24/09

to

"David Schwartz" <dav...@webmaster.com> wrote in message
news:5053e4d8-d400-4570...@v15g2000prn.googlegroups.com...

FWIW, there is distributed counting algorithms in which each thread has it's
own counter. A thread simply mutates it's own counter. A reader thread sums
up all the counters from each thread in order to get the actual value. It
would scale far better than using a single global counter. There are places
where these types of algorithms can fit in fairly well.

David Schwartz

unread,

Dec 25, 2009, 2:35:42 AM12/25/09

to

On Dec 23, 10:49 pm, David Schwartz <dav...@webmaster.com> wrote:

> Sadly, there are similar known cases on x86. If I had the time, I'd
> track down the GCC bug report.

I found the example I was thinking of. If GCC sees code like this:

if(mutex!=NULL) acquire_lock(mutex);
for(int i=0; i<100; i++)
{
some_stuff(i);
if(mutex!=NULL) shared_variable+=i;
}
if(mutex!=NULL) release_lock(mutex);

GCC assumes that 'mutex' is most likely not NULL, so it can optimize
the code to:

saved_copy=shared_variable;
if(mutex!=NULL) acquire_lock(mutex);
for(int i=0; i<100; i++)
{
some_stuff(i);
shared_variable+=i;
}
if(mutex!=NULL) release_lock(mutex); else shared_variable=saved_copy;

See how this gets a conditional out of the loop? That can actually be
an optimization, especially since GCC assumes that a variable is
rarely NULL. However, this means that the shared variable is modified
(and put back) if mutex is NULL. Ack.

GCC does not promise that it won't write to variables your code flow
doesn't write to. :(

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31862

DS

Moi

unread,

Dec 25, 2009, 8:05:12 AM12/25/09

to

On Wed, 23 Dec 2009 20:43:10 -0800, David Schwartz wrote:

> On Dec 23, 8:34 pm, "novicki...@gmail.com" <novicki...@gmail.com> wrote:
>
>> However, I am still curious if due to the design of modern hardware in
>> general, it is possible to read a value for the variable that was never
>> actually set. The code is trying to set values between 1 and 1
>> million. But could you read a value of 8 million if 8 million is never
>> set. The reason would be because you read an incompletely written
>> value at some point. Or is the computer such that you can only read
>> complete values and never a partially written value.
>
> Most likely, word tearing would be the only way. I don't know of any
> POSIX platform where integers are subject to word tearing. The problem
> is that you would be producing software that might break when a new CPU,
> operating system, compiler, or something else comes out. On the bright
> side, at least on x86, a lot of software makes the assumption that read
> and writes to and from aligned integers are atomic (I think it may even
> be guaranteed by Intel in the x86 specification) so you certainly
> wouldn't be the only thing that would break.
>

In short: memory reads or writes are atomic for word-sized variables.
But "read-modify-write"s are not.

eg, you cannot use

while (i++) {i--;}
... critical section here ...
i--;

as a spinlock.
I would take a look at the linux kernel source code to see
how locks and spinlocks are implemented on various platforms.

HTH,
AvK

David Schwartz

unread,

Dec 25, 2009, 2:26:06 PM12/25/09

to

On Dec 25, 5:05 am, Moi <r...@invalid.address.org> wrote:

> In short: memory reads or writes are atomic for word-sized variables.
> But "read-modify-write"s are not.

Inside the CPU, yes. But there's no guarantee the compiler will issue
a read just because your code calls for one and similarly no guarantee
the compiler will issue a write just because your code calls for one.
So the fact that read and write assembly operations are atomic on the
CPU doesn't help a C programmer much unless he has some way to get the
compiler to issue the right assembly instructions. (Which, of course,
there are ways of doing.)

DS

Maxim Yegorushkin

unread,

Dec 26, 2009, 4:10:33 PM12/26/09

to

On 23/12/09 00:09, novic...@gmail.com wrote:
> lets say I have
>
> unsigned int x = 0;
>
> And the code
>
> x++;
>
> Is run a total of 1 million times but the million times is divided
> across many threads.
>
> If i try to read the value of x am i guaranteed to get a value between
> 0 and 1 million.
>
> The subtle question here is, are reads and writes of a variable
> atomic?

The C and C++ standards do not guarantee that assignment is an atomic
operation unless the left operand has type sig_atomic_t.

A particular compiler can provide more guarantees, however, portable
code can not rely on that.

Please note, however, that threads are outside of the scope of current C
and C++ standards. So that there is an unlikely chance that their
definition of atomic might be not thread-safe.

C++0x will address threading and provide atomics library. Till it is
widely available portable code has to rely on mechanisms outside of the
scope of C and C++ standards, such as gcc atomic built-ins
(http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Atomic-Builtins.html) or
other atomics libraries, such as Intel TBB and its atomic<T> template.

> Do i have to worry that i am reading the variable while it is
> being updated and might get total garbage such as 9 million as the
> value when i read it. Or is the worst case that i missed some
> increments because the increment on 1 thread was done without taking
> into account the updates from another thread?

When you are using thread-safe atomics, writes are atomic with respect
to other threads, i.e. no worries ;)

--
Max