Hello...
I was aware that memory ordering is not the only issue and that you have to use fences and memory barriers. My implementations of algorithms such as MLock and AMLock were already using the correct fences and memory barriers to ensure sequential consistency on the hardware side. What I have added is that I have switched optimization off locally in some units, to prevent the Delphi and FreePascal compilers from reordering memory accesses. My C++ synchronization objects library is now more stable and fast.
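To illustrate the two levels involved (reordering by the compiler and reordering by the hardware), here is a minimal sketch in C++. It is not the MLock or AMLock code; the names SpinLock and run_counter_demo are hypothetical and used only for this example. It assumes a C++11 compiler, where the acquire and release orderings on std::atomic_flag constrain both the compiler and the CPU, so the plain counter inside the critical section cannot be moved outside the lock.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical illustration only: a simple test-and-set spinlock.
class SpinLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // Acquire: reads and writes that follow the lock cannot be
        // moved before it, neither by the compiler nor by the CPU.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release: reads and writes made inside the critical section
        // cannot be moved after the unlock.
        flag_.clear(std::memory_order_release);
    }
};

// Increments a plain (non-atomic) counter from several threads,
// using the spinlock above to protect it, and returns the final value.
long run_counter_demo(int num_threads, int iterations) {
    SpinLock lock;
    long counter = 0;  // protected by lock, deliberately not atomic
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

Calling run_counter_demo(4, 10000) should return 40000 with optimization enabled, because the ordering is stated in the code itself and does not depend on the optimization level.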
Thank you,
Amine Moulay Ramdne.