Regards,
Seenu.
Your question is very vague. What do you mean by differences in
behavior?
DS
I know one difference: acquiring a lock on a multi-CPU machine is WAY
slower than doing the same on a single-CPU one. On a G5 running OS X
Tiger, turning off one CPU made my (contrived) lock acquiring/releasing
app run 15 times faster.
I guess in this case OS X is gratuitously flushing caches, something
a uniprocessor (or a better OS?) doesn't have to do.
--
Joe Seigh
When you get lemons, you make lemonade.
When you get hardware, you make software.
The first question that comes to mind is whether you are asking about
differences in the performance of the mutexes or differences in how code
must be written to run correctly on single vs. multiple CPUs.
There are almost certainly differences in performance on every platform,
but the code you need to write in order to have a correct implementation
should not depend on how many CPUs you have.
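To make that concrete, here is a minimal sketch, assuming a C++11 compiler and standard library. The function name run_counter_demo and its parameters are made up for illustration. The mutex is created and used the same way however many CPUs the machine has; only the timing of the run changes.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical demo: several threads increment one shared counter.
// Nothing in this code asks how many CPUs are present.
long run_counter_demo(int num_threads, int iterations) {
    assert(num_threads > 0 && iterations >= 0);

    std::mutex m;      // default construction is all the setup there is
    long counter = 0;  // shared state, touched only while holding m

    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&m, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<std::mutex> lock(m);  // same mutex every time
                ++counter;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;  // num_threads * iterations when the locking is right
}
```

Calling run_counter_demo(4, 10000) should give 40000 on one CPU or many. If it ever does not, the locking is wrong, not the initialization.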
- Logan
Explanation: I have an app which works fine on a single CPU (the app is
multithreaded), but when I run the same app on a multi-CPU machine, it
dumps core at various places. I suspect that my variables
are getting corrupted even though they are protected by a mutex. So, I
wanted to know whether there is any difference in the way we initialize
a mutex on single-CPU and multi-CPU machines.
Regards,
Seenu.
> Explanation: I have an app which works fine in single CPU (the app is
> multithreaded), but when I run the same app in multi-CPU machine, it is
> dumping core at various places. I was suspecting whether my variables
> are getting corrupted even though they are locked using mutex. So, I
> wanted to know is there any difference in the way we initialize mutex in
> single-CPU and multi-CPU.
This is an indication that your program is more than likely *buggy*.
You probably have a race condition or similar that didn't occur on the
single-processor machine but shows up on the multiprocessor machine.
This is actually an excellent test for your application...
Cheers,
Loic.
> Hello Seenu,
> > Explanation: I have an app which works fine in single CPU (the app is
> > multithreaded), but when I run the same app in multi-CPU machine, it is
> > dumping core at various places. I was suspecting whether my variables
> > are getting corrupted even though they are locked using mutex. So, I
> > wanted to know is there any difference in the way we initialize mutex in
> > single-CPU and multi-CPU.
>
> This is an indication that your program is more than likely *buggy*.
> You probably have a race condition or similar that didn't occur on the
> single-processor machine but shows up on the multiprocessor machine.
I can't think of any situations where a race condition in an app
running on a single cpu machine wouldn't be a race condition on
a multi cpu machine or vice versa.
> This is actually an excellent test for your application...
Yes, the odds of the race condition manifesting itself on a
multicpu machine are much much greater. Now that I think about it,
I *need* a multicpu machine with multiple cores to develop
threaded apps. I'll put in a hardware request right now...
--
Ian Collins.
>> I can't think of any situations where a race condition in an app
>> running on a single cpu machine wouldn't be a race condition on
>> a multi cpu machine or vice versa.
>>
> Not so, threads are serialised on a single CPU, so for example if the
> first thread reads something and the second deletes it, you won't see a
> problem. If they run on a dual CPU system, you will, assuming the
> delete happens before the read.
Threads are serialized, but timeslicing and blocking behavior tends to
"mix things up", statistically speaking, far more than you apparently
might think.
In fact we found in development of the OpenVMS/Tru64 UNIX thread library
that there were very few multiprocessor races that couldn't be
reproduced on a uniprocessor if you exercised it intensively and/or ran
a long time. A few were certainly INCONVENIENT to reproduce, but far
from impossible. A single-instruction race window WILL be hit in
uniprocessor testing if you're even moderately serious about it.
We also learned that if a race condition that couldn't be identified
because it couldn't be reliably reproduced on a wide multiprocessor were
left in shipping code it would be frequently and widely reproduced
immediately -- even on uniprocessors -- by customers on the verge of
multinational production system software rollouts. (Actually, that's a
corollary: many races CANNOT be reliably reproduced on alpha or beta
software, only on identical software installed by a large customer off
official distribution media in a major production project.)
Nevertheless, if you want to develop and exercise threaded code, you
need a multiprocessor. The wider the better; but 2 cores is far more
than twice as good as one. To find bugs during development you don't
want to have to run extended saturation test loads -- or wade through
the data gathered during such a run to figure out what happened.
>> I can't think of any situations where a race condition in an app
>> running on a single cpu machine wouldn't be a race condition on
>> a multi cpu machine or vice versa.
> Not so, threads are serialised on a single CPU, so for example if the
> first thread reads something and the second deletes it, you won't see a
> problem. If they run on a dual CPU system, you will, assuming the delete
> happens before the read.
That's complete nonsense. On a single CPU machine, the first thread
could find an object, the CPU could switch to another thread which deletes
it, and then when the first thread runs again, it blows up.
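The window described above, and one common way to close it, can be sketched as follows. This is only an illustration assuming C++11; Registry, find, remove and read_after_remove_demo are hypothetical names, not anything from the original poster's app. The racy variant would hand out a raw pointer from the lookup and let the caller use it after the lock is released. The sketch returns shared ownership instead, so a concurrent remove cannot free the object while a reader still holds it.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical registry, for illustration only.
class Registry {
public:
    void add(int id, std::string name) {
        std::lock_guard<std::mutex> lock(m_);
        items_[id] = std::make_shared<std::string>(std::move(name));
    }

    void remove(int id) {
        std::lock_guard<std::mutex> lock(m_);
        items_.erase(id);  // drops the registry's reference only
    }

    // Returns a shared_ptr copied while the lock is held. The object
    // stays alive for as long as the caller keeps the returned pointer,
    // even if another thread calls remove() right after we unlock.
    std::shared_ptr<const std::string> find(int id) const {
        std::lock_guard<std::mutex> lock(m_);
        auto it = items_.find(id);
        if (it == items_.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    mutable std::mutex m_;
    std::map<int, std::shared_ptr<std::string>> items_;
};

// Replays the "find, then someone else deletes" order in one thread.
// With raw pointers the final read would touch freed memory.
std::string read_after_remove_demo() {
    Registry r;
    r.add(7, "widget");
    std::shared_ptr<const std::string> p = r.find(7);  // first thread finds it
    r.remove(7);                                       // other thread deletes it
    assert(p != nullptr);
    return *p;                                         // still valid here
}
```

The same interleaving is possible on one CPU or several, which is why the fix has nothing to do with the CPU count.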
It is extremely difficult to construct code that is technically correct
on a single CPU system but can fail on a multi-CPU system. You generally
need to use platform-specific tricks to do it.
DS
What about read & write reordering problems? DCLP, etc. They don't
happen on single CPU machines.
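For readers who have not met DCLP (the double-checked locking pattern), here is a minimal sketch of a form that is well defined under the C++11 memory model. Config, g_instance, g_init_mutex and get_instance are hypothetical names chosen for the example. The classic broken form uses a plain pointer with no atomics, so the pointer store may become visible to another CPU before the object's fields are. Note that a compiler is also free to reorder the plain-pointer version, so it is not guaranteed correct by the language on any machine.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct Config {        // hypothetical payload
    int value = 42;
};

std::atomic<Config*> g_instance{nullptr};
std::mutex g_init_mutex;

Config* get_instance() {
    // First check, without the lock. The acquire load pairs with the
    // release store below, so a non-null pointer implies the Config
    // it points to is fully constructed from this thread's view.
    Config* p = g_instance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(g_init_mutex);
        // Second check, under the lock: another thread may have won.
        p = g_instance.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Config();  // intentionally never freed in this sketch
            g_instance.store(p, std::memory_order_release);
        }
    }
    assert(p != nullptr);
    return p;
}
```

Where it is available, std::call_once or a function-local static is usually a simpler way to get the same one-time initialization.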
> We also learned that if a race condition that couldn't be identified
> because it couldn't be reliably reproduced on a wide multiprocessor were
> left in shipping code it would be frequently and widely reproduced
> immediately -- even on uniprocessors -- by customers on the verge of
> multinational production system software rollouts. (Actually, that's a
> corollary: many races CANNOT be reliably reproduced on alpha or beta
> software, only on identical software installed by a large customer off
> official distribution media in a major production project.)
>
True for all types of bugs, not just threading ones. :-)
Actually, no... I've experienced such a situation myself, where I was
(supposedly) protecting a std::vector<> using a mutex, but actually, I
was using 2 different mutexes, and didn't realize that till it started
failing consistently on a multiprocessor. On a single processor, I must
have run it 100+ times, and it failed just once. This was
happening because somehow the context switch was NOT happening in the
middle of the vector modifier function calls. If it had, then data
corruption would have occurred.
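One way to make the two-mutexes mistake harder to commit is to keep the vector and its single mutex inside one object, so callers never choose a mutex at all. A minimal sketch, assuming C++11; GuardedVector and fill_from_two_threads are hypothetical names and this is not the code from the app described above.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper: the data and the one mutex that guards it
// live together, and every access goes through the same lock.
class GuardedVector {
public:
    void push_back(int v) {
        std::lock_guard<std::mutex> lock(m_);
        data_.push_back(v);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(m_);
        return data_.size();
    }

private:
    mutable std::mutex m_;   // the only mutex for data_
    std::vector<int> data_;
};

// Two threads append concurrently; with a single shared mutex no
// push_back can interleave with another one.
std::size_t fill_from_two_threads(int per_thread) {
    assert(per_thread >= 0);
    GuardedVector gv;
    auto writer = [&gv, per_thread] {
        for (int i = 0; i < per_thread; ++i) {
            gv.push_back(i);
        }
    };
    std::thread a(writer);
    std::thread b(writer);
    a.join();
    b.join();
    return gv.size();
}
```

With the locking done in one place, fill_from_two_threads(10000) should return 20000 on a uniprocessor and on a multiprocessor alike.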
> It is extremely difficult to construct code that is technically correct
> on a single CPU system but can fail on a multi-CPU system. You generally
> need to use platform-specific tricks to do it.
I would say that "It is VERY VERY extremely difficult to construct code
that is technically correct on a single CPU system but can fail on a
multi-CPU system."
>
> DS
> David Schwartz wrote:
>> That's complete nonsense. On a single CPU machine, the first thread
>> could find an object, the CPU could switch to another thread which deletes
>> it, and then when the first thread runs again, it blows up.
> Actually, no... I've experienced such a situation myself, where I was
> (supposedly) protecting a std::vector<> using a mutex, but actually, I
> was using 2 different mutexes, and didn't realize that till it started
> failing consistently on a multiprocessor. On a single processor, I must
> have run it 100+ times, and it failed just once. This was
> happening because somehow the context switch was NOT happening in the
> middle of the vector modifier function calls. If it had, then data
> corruption would have occurred.
I'm not sure why you would say "Actually, no" and then follow it up with
an example of exactly the type of problem I was saying can happen.
DS
> It is extremely difficult to construct code that is technically correct
> on a single CPU system but can fail on a multi-CPU system. You generally
> need to use platform-specific tricks to do it.
>
Code doesn't have to be technically correct to work on a single CPU; it
can contain crap code that doesn't protect critical sections adequately.
Sure, it will eventually break, often in a difficult-to-reproduce way.
Which is why I believe all threaded code, even if it is for a
single-processor target, should be tested on a multi-core box.
OK, I admit it, the naive developer was me :)
--
Ian Collins.
> It is extremely difficult to construct code that is technically correct
> on a single CPU system but can fail on a multi-CPU system. You generally
> need to use platform-specific tricks to do it.
>
What I meant was that the probability of detecting an error when
testing on a SP-machine would be much less than that on a MP-machine.
But, as always, you can generally never rely on testing to find _all_
errors.
>
> DS
> What I meant was that the probability of detecting an error when
> testing on a SP-machine would be much less than that on a MP-machine.
It depends what you mean by "detecting". Fixing an error, for example,
is much easier on an MP-machine. The more easily you can produce an error,
the more easily you can fix it.
DS