I have a piece of code like this:
volatile long i = 4000000;
//thread 1:
while(i != 0)
{
}
//thread 2:
while(i != 0)
{
InterlockedDecrement(&i);
}
The Intel Thread Checker reports race conditions here, but I think
they are innocuous. I know I can eliminate them by reading i with
InterlockedCompareExchange(&i, 0, 0) != 0 instead,
but is it really necessary to do so?
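For concreteness, this is the variant I have in mind (just a sketch of
the two loops, assuming a Win32 toolchain; the function names are only
for illustration):

#include <windows.h>

volatile long i = 4000000;

//thread 1: InterlockedCompareExchange(&i, 0, 0) compares i with 0 and
//stores 0 only if they are equal, so it never changes i; it simply
//returns one consistent copy of the value.
void reader_thread()
{
    while(InterlockedCompareExchange(&i, 0, 0) != 0)
    {
    }
}

//thread 2: the decrement was already atomic; only the read in the
//loop condition is replaced.
void worker_thread()
{
    while(InterlockedCompareExchange(&i, 0, 0) != 0)
    {
        InterlockedDecrement(&i);
    }
}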
Thanks!
Yes, Thread Checker is right, and you should eliminate the race
condition.
Note that the volatile keyword in C/C++ has nothing to do with
multi-threaded execution; in particular, it does not guarantee
atomicity.
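A minimal sketch of the difference, assuming Win32 (the helper names
are only for illustration):

#include <windows.h>

volatile long counter = 2;

//NOT atomic: this is a separate load, subtract and store. volatile
//only forbids the compiler from caching counter in a register; it
//does not make the three steps indivisible, so two threads can both
//read the same old value and one decrement gets lost.
void unsafe_decrement()
{
    counter = counter - 1;
}

//atomic: the whole read-modify-write is one indivisible operation.
void safe_decrement()
{
    InterlockedDecrement(&counter);
}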
Best Regards,
Szabolcs
In the real application, are there any data states that depend on the
value of `i'? What are you trying to do here? The data race is innocuous
with respect to the code AS-IS. However, I think you're trying to do
something else in the "real application"... What's inside the while loops?
There's no way to answer without knowing what the purpose of this code
actually is. The race detector is correct here: there are races. There
is no way to know how many times thread 1 will loop, even after thread
2 drops the count to zero. If that matters, you have a problem.
DS
The example as-is needs volatile or the load in the while loop in thread 1
can be optimized away...
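A sketch of one transformation the compiler is then allowed to make
(actual output is compiler-specific, of course):

long i; //no volatile

void spin()
{
    //without volatile, the compiler may load i once and reuse the
    //value, effectively rewriting the loop as
    //
    //    long tmp = i;       //single load hoisted out of the loop
    //    while(tmp != 0) { } //spins forever if i was nonzero
    //
    while(i != 0)
    {
    }
}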
> The example as-is needs volatile or the load in the while loop in thread 1
> can be optimized away...
The load can still be optimized away, just not by the compiler. The
'volatile' keyword only disables compiler optimizations. It does not
disable CPU or hardware optimizations that are safe on single-threaded
code.
DS
Do you mean that in some C++ implementations this could happen:
the compiler translates the "while (i != 0)" in thread 1 into something
like this, for whatever reason?
MOV ax , WORD PTR [i+2]  ; high word of i
SHL eax , 16
MOV ax , WORD PTR [i]    ; low word of i
CMP eax , 0
(Sorry for the poor example)
Thanks.
In fact, `i' is a flag that is read and written in different threads.
The while loop here is intended to reproduce the race condition and
let the Thread Checker find it.
My real application is just like this:
volatile long closed_ = false;
volatile long outstanding_operation_ = 0;
//the producer thread
if(!closed_ && InterlockedIncrement(&outstanding_operation_) > 0)
{
post_to_message_queue_();
//deal with a failure post...
//......
}
//message queue consumer thread; there is ONLY ONE
//consumer thread in the system
while(true)
{
pull_from_message_queue();
InterlockedDecrement(&outstanding_operation_);
//may "closed = true" while processing the message
if(closed_ && InterlockedCompareExchange(
&outstanding_operation_,0,-LONG_MAX)== 0)
{
//clean-up
break;
}
}
post_to_message_queue_() and pull_from_message_queue() are
implemented in terms of I/O Completion Ports in Win32.
outstanding_operation_ is book-kept here to prevent losing
messages, and closed_ acts as a hint to prevent starvation.
The Thread Checker reports a race condition on closed_, so I want to
know whether it is serious.
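If it is serious, I suppose I could route every access to closed_
through the Interlocked API as well, e.g. (just a sketch; the helper
names close() and is_closed() are only for illustration):

#include <windows.h>

volatile long closed_ = 0;

//atomic write of the flag
void close()
{
    InterlockedExchange(&closed_, 1);
}

//atomic read: compares closed_ with 0 and stores 0 only if they are
//equal, so the value is never changed.
long is_closed()
{
    return InterlockedCompareExchange(&closed_, 0, 0);
}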
Thanks.
It can very well happen, and nothing prevents another thread of
computation from changing the value of `i' between the two `MOV'
instructions that load it.
> the compiler translates the "while (i != 0)" in thread 1 into something
> like this, for whatever reason?
>
> MOV ax , WORD PTR [i+2]  ; high word of i
> SHL eax , 16
> MOV ax , WORD PTR [i]    ; low word of i
> CMP eax , 0
>
> (Sorry for the poor example)
I think it is not a poor example, since it highlights exactly the
point: the volatile keyword only blocks the compiler from doing
certain optimisations it would otherwise do. It does not instruct the
compiler to insert any synchronisation around the individual machine
instructions, so a multi-instruction access like the one above can
still be torn. In C/C++ the volatile keyword does not imply atomic
access the way it does for certain primitive types in Java (and even
in Java, a plain long may be read and written as two 32-bit halves;
it is only the volatile qualifier that makes long access atomic there).
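For example, on a 32-bit x86 target a 64-bit object is normally
accessed as two 32-bit halves, and volatile does not change that
(a sketch; the exact code generation is compiler-specific):

volatile long long big = 0; //64 bits on a 32-bit target

void write_big()
{
    //typically compiled as two 32-bit stores; a reader running
    //between the two stores sees a torn value such as
    //0x0000000000000001 or 0x0000000100000000. volatile forbids
    //caching the value; it does not forbid splitting the access.
    big = 0x100000001LL;
}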
Best Regards,
Szabolcs
> > The example as-is needs volatile or the load in the while loop in
> > thread 1 can be optimized away...
> The load can still be optimized away, just not by the compiler.
Hummm... If there is a MOV & CMP in the predicate of a loop that does
"nothing"... Well, can you show me __existing__ hardware that would not
execute the loop __if__ the source code were in PURE assembly language?
Perhaps the loop was a delay that was important to the algorithm at hand.
> The 'volatile' keyword only disables compiler optimizations.
I know.
> It does not disable CPU or hardware optimizations that are safe on
> single-threaded code.
Indeed.
And YES... nothing == NOP instruction!
> Well, can you show me __existing__ hardware that would not
> execute the loop __if__ the source code were in PURE assembly language?
It doesn't matter. It is an utterly irresponsible programming practice
to target your software only for existing hardware. People expect, and
rightly so, that they can upgrade their CPU to a new one without
breaking their existing software.
Maybe if you were coding an OS, or an ultra-low-level library that was
very CPU specific. But otherwise, targeting to existing CPU *behavior*
rather than existing CPU *specifications* is madness.
DS
Fair enough. I was just wondering how an architecture would be able to omit
parts of assembly language code. It seems like it would be a fairly
dangerous practice. Although, if the arch could do this, it would surely
have special instructions that say "don't do that!"...
;^)
> Fair enough. I was just wondering how an architecture would be able to omit
> parts of assembly language code. It seems like it would be a fairly
> dangerous practice. Although, if the arch could do this, it would surely
> have special instructions that say "don't do that!"...
It starts out that way. But people do that anyway. And before you know
it, the CPU manufacturer effectively has no choice but to keep the
earlier 'accidental' behavior.
A good example of this is the way modern x86 CPUs invalidate
speculative fetches if their corresponding cache line is invalidated.
This was specifically documented as not to be relied on, but so much
software relied on it, it is now effectively guaranteed on x86.
When Intel made Itanium CPUs emulate x86 CPUs, they discovered that it
was the Itanium that would be blamed if software didn't "just work",
even when the software relied on behavior that was explicitly
documented as not to be relied on.
So it flows both ways.
DS