
[x86] LOCK vs MFENCE


Dmitriy V'jukov

Nov 16, 2007, 3:29:06 AM
I've measured the latency of the 'lock cmpxchg' and 'mfence' instructions
on a Pentium 4 processor and got the following results:

lock cmpxchg - 100 cycles
mfence - 104 cycles

So I conclude that they are nearly identical in terms of cycles consumed.
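
For readers who want to repeat a comparison of this kind, here is a minimal sketch in C++. It is not the code that produced the numbers above. It reports average nanoseconds per operation from std::chrono rather than cycles, and it measures uncontended, back-to-back throughput on a single thread, which may differ from true latency. The names ns_per_op, full_fence, time_locked_cas and time_full_fence are hypothetical. On x86 compilers, compare_exchange_strong is normally emitted as 'lock cmpxchg' and _mm_mfence() as 'mfence'; on other targets the sketch falls back to a sequentially consistent fence.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // _mm_mfence (SSE2 intrinsic)
inline void full_fence() { _mm_mfence(); }  // emits 'mfence' on x86
#else
// Assumption: on non-x86 targets, use the portable full fence instead.
inline void full_fence() { std::atomic_thread_fence(std::memory_order_seq_cst); }
#endif

// Average wall-clock nanoseconds per call of `op`, over `iters` calls.
// Loop overhead is included, so treat the result as an upper bound.
template <class Op>
double ns_per_op(Op op, std::uint64_t iters) {
    assert(iters > 0);
    auto t0 = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iters; ++i) op();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(t1 - t0).count() /
           static_cast<double>(iters);
}

// Uncontended compare-and-swap on a thread-local cache line.
double time_locked_cas(std::uint64_t iters) {
    std::atomic<int> x{0};
    return ns_per_op([&x] {
        int expected = 0;
        x.compare_exchange_strong(expected, 0);  // 'lock cmpxchg' on x86
    }, iters);
}

// Back-to-back full memory fences with no other memory traffic.
double time_full_fence(std::uint64_t iters) {
    return ns_per_op([] { full_fence(); }, iters);
}
```

Calling time_locked_cas(10000000) and time_full_fence(10000000) and multiplying by the core clock frequency gives a rough cycles-per-operation figure. A single-threaded loop like this says nothing about the effect on other cores, which is the open question here.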

But is there any difference between them with respect to overall system
performance, especially on modern multicore processors (Core 2 Duo,
Core 2 Quad)?

Is the following assumption correct? The LOCK prefix involves bus/cache
locking, so it affects total system performance, whereas mfence has
only a local impact on the current core.

Or, more practically: suppose I have 2 algorithms, one using the LOCK
prefix and the other using mfence. Other things being equal, which
should I prefer?
For example:
A program uses a fairly large number of mutexes. Each particular
mutex synchronizes only 2 threads.
I can implement the mutex with either:
1. The "traditional" scheme, based on "lock xchg" in the acquire
operation and a "naked store" in the release operation.
2. Peterson's algorithm, based on a #StoreLoad memory barrier (mfence)
in the acquire operation and a "naked store" in the release operation.
So the net difference is LOCK vs MFENCE.
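
The two schemes can be sketched in C++ as follows. This is an illustration under stated assumptions, not the poster's implementation. The class names XchgLock and PetersonLock and the helpers count_with_xchg_lock and count_with_peterson are hypothetical. On x86, exchange() is normally emitted as an implicitly locked 'xchg', the release stores as plain 'mov', and the seq_cst fence as 'mfence' or an equivalent locked instruction, depending on the compiler. The Peterson variant relies on x86's total store order: the memory orderings chosen here are likely too weak under the ISO C++ memory model alone, and a portable version would use sequentially consistent operations throughout.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Scheme 1: exchange-based spinlock (LOCK in acquire, naked store in release).
class XchgLock {
    std::atomic<int> locked_{0};
public:
    void lock() {
        // Atomic swap; an implicitly locked 'xchg' on x86.
        while (locked_.exchange(1, std::memory_order_acquire) != 0)
            std::this_thread::yield();
    }
    void unlock() { locked_.store(0, std::memory_order_release); }  // plain store on x86
};

// Scheme 2: Peterson lock for exactly two threads with ids 0 and 1.
// Assumption: x86 TSO; see the caveat in the text above.
class PetersonLock {
    std::atomic<int> flag_[2];
    std::atomic<int> turn_;
public:
    PetersonLock() : turn_(0) {
        flag_[0].store(0, std::memory_order_relaxed);
        flag_[1].store(0, std::memory_order_relaxed);
    }
    void lock(int me) {
        assert(me == 0 || me == 1);
        const int other = 1 - me;
        flag_[me].store(1, std::memory_order_relaxed);   // announce interest
        turn_.store(other, std::memory_order_release);   // give way to the other thread
        // #StoreLoad barrier: the stores above must be visible before the loads below.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        while (flag_[other].load(std::memory_order_acquire) != 0 &&
               turn_.load(std::memory_order_acquire) == other)
            std::this_thread::yield();
    }
    void unlock(int me) { flag_[me].store(0, std::memory_order_release); }  // naked store
};

// Two threads each increment a plain counter `iters` times under XchgLock.
long count_with_xchg_lock(int iters) {
    XchgLock m;
    long counter = 0;
    auto work = [&] {
        for (int i = 0; i < iters; ++i) { m.lock(); ++counter; m.unlock(); }
    };
    std::thread a(work), b(work);
    a.join(); b.join();
    return counter;
}

// Same workload under PetersonLock; each thread passes its own id.
long count_with_peterson(int iters) {
    PetersonLock m;
    long counter = 0;
    auto work = [&](int me) {
        for (int i = 0; i < iters; ++i) { m.lock(me); ++counter; m.unlock(me); }
    };
    std::thread a(work, 0), b(work, 1);
    a.join(); b.join();
    return counter;
}
```

If mutual exclusion holds, both helpers return 2 * iters. Timing the two helpers on the same machine gives a first, rough view of the LOCK vs MFENCE difference for a single two-thread mutex, though not of the effect on unrelated cores.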

The question is: will there be any difference in system performance on
a quad-core machine?


Thanks in advance

Dmitriy V'jukov
