On Fri, 17 Nov 2006 06:12:19 -0500, Joe Seigh wrote:
> Adam Warner wrote:
>> Hi all,
>> In my prior posts to comp.programming.threads [note new crosspost to
>> comp.arch] I described an algorithm where CPU A makes a number of memory
>> writes before writing to a memory location that acts as a flag. When CPU B
>> reads the changed flag (it's theoretically irrelevant whether the change
>> takes a nanosecond or a day to propagate to CPU B) all the writes that
>> were made by CPU A prior to CPU A writing to the memory flag must be
>> correctly readable by CPU B. A sequentially consistent architecture
>> satisfies this property.
>> Historically the shared memory IA-32 multiprocessor architecture has been
>> sequentially consistent:
> [...]
>> The fear initiated by Intel and co. that programmers must use special
>> serializing and locking operations to ensure future x86 compatibility does
>> not have to be amplified by the wider community. Let's assume future
>> changes to the x86 memory model do break existing programs that rely upon
>> a sequentially consistent architecture. To ensure old programs don't fail
>> in potentially mysterious ways an operating system could check all
>> executables for the export of a symbol indicating awareness of the new
>> memory model (and otherwise defaulting to program termination). Whatever
>> the marketing department called the new architecture they wouldn't be able
>> to weasel around such a stark reminder of binary incompatibility.
>> I am happy to conform to the memory model of the architecture I am
>> compiling for. I just need to confirm what it realistically is.
> I think Intel is in "pretend the problem does not exist" mode until
> they can figure out what to do about it. About the only thing you can
> do is put an abstraction layer in place to insulate your programs from
> any memory model changes. Usually this is just a bunch of memory barrier
> and atomic access macros to give you the guarantees that you need. You
> can take a look at how the Linux kernel does it or look at atomic_ops
> in http://www.hpl.hp.com/research/linux/qprof/
This is superb advice, thank you! I was wondering if a set of atomic
operations for userspace Linux programs was available.
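As a rough illustration of the kind of abstraction layer suggested above,
here is a minimal sketch of the flag hand-off from my earlier posts. It
assumes a compiler that provides C11 <stdatomic.h> plus POSIX threads (build
with -pthread). The macro names STORE_RELEASE and LOAD_ACQUIRE and the
function run_handoff are hypothetical names chosen for this example; they are
not the atomic_ops or Linux kernel interfaces.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical wrapper macros (an assumption for this sketch, not the
 * atomic_ops API). Only these would need porting if the memory model
 * of the target changed. */
#define STORE_RELEASE(p, v) atomic_store_explicit((p), (v), memory_order_release)
#define LOAD_ACQUIRE(p)     atomic_load_explicit((p), memory_order_acquire)

static int payload_a, payload_b;  /* ordinary data written by "CPU A" */
static atomic_int ready_flag;     /* the memory location acting as a flag */

static void *writer(void *arg)
{
    (void)arg;
    payload_a = 42;                 /* plain writes made before the flag */
    payload_b = 7;
    STORE_RELEASE(&ready_flag, 1);  /* publish: earlier writes ordered before this */
    return NULL;
}

static void *reader(void *arg)
{
    int *sum = arg;
    /* Spin until the flag change becomes visible, however long it takes. */
    while (LOAD_ACQUIRE(&ready_flag) == 0) {
    }
    /* The acquire load pairs with the release store, so both payload
     * writes are guaranteed to be visible here. */
    *sum = payload_a + payload_b;
    return NULL;
}

/* Runs one writer and one reader thread; returns what the reader saw. */
int run_handoff(void)
{
    pthread_t w, r;
    int sum = 0;
    int rc;

    payload_a = 0;
    payload_b = 0;
    atomic_store(&ready_flag, 0);

    rc = pthread_create(&r, NULL, reader, &sum);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);
    (void)rc;  /* silences the unused warning when built with NDEBUG */

    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return sum;
}
```

Calling run_handoff() from a small main() should return 49 (42 + 7) on any
conforming implementation, whether or not the hardware happens to be
sequentially consistent, because the ordering is requested explicitly rather
than assumed.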
The qprof README_atomic_ops.txt contains this comment:
    Note that the implementation reflects our understanding of real
    processor behavior. This occasionally diverges from the documented
    behavior. (E.g. the documented X86 behavior seems to be weak enough that
    it is impractical to use. Current real implementations appear to be
    much better behaved.) We of course are in no position to guarantee that
    future processors (even HPs) will continue to behave this way, though we
    hope they will.
Regards,
Adam