InterlockedExchange And Memory Alignment


Patogenx

May 19, 2009, 10:05:11 AM

I am confused: Microsoft says memory alignment is required for
InterlockedExchange, but the Intel documentation says that memory
alignment is not required for LOCK. Am I missing something, or
what is the problem?
Thanks

From the Microsoft MSDN Library:

Platform SDK: DLLs, Processes, and Threads InterlockedExchange

The variable pointed to by the Target parameter must be aligned on a
32-bit boundary; otherwise, this function will behave unpredictably on
multiprocessor x86 systems and any non-x86 systems.

From the Intel Software Developer’s Manual:

LOCK instruction
Causes the processor’s LOCK# signal to be asserted
during execution of the accompanying instruction (turns the
instruction into an atomic instruction). In a multiprocessor
environment, the LOCK# signal ensures that the processor has exclusive
use of any shared memory while the signal is asserted.
The integrity of the LOCK prefix is not affected by the alignment of
the memory field. Memory locking is observed for arbitrarily
misaligned fields.

Memory Ordering in P6 and More Recent Processor Families
Locked instructions have a total order.

Software Controlled Bus Locking
The integrity of a bus lock is not affected by the alignment of the
memory field. The LOCK semantics are followed for as many bus cycles
as necessary to update the entire operand. However, it is recommended
that locked accesses be aligned on their natural boundaries for better
system performance:
• Any boundary for an 8-bit access (locked or otherwise).
• 16-bit boundary for locked word accesses.
• 32-bit boundary for locked doubleword accesses.
• 64-bit boundary for locked quadword accesses.
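
The natural-boundary rule quoted above amounts to "the address is a multiple of the operand size". A minimal sketch of that test in portable C follows; the helper name is_naturally_aligned is hypothetical and not part of any Windows or Intel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true when `p` sits on the natural boundary
   for an operand of `size` bytes, i.e. the address is a multiple of `size`.
   `size` must be a power of two (1, 2, 4 or 8 for the cases listed above). */
static bool is_naturally_aligned(const void *p, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return ((uintptr_t)p & (size - 1)) == 0;
}
```

For a doubleword (4-byte) target this reduces to checking that the two low address bits are zero, which is the "32-bit boundary" the MSDN text asks for.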

Alexander Grigoriev

May 19, 2009, 10:19:14 AM
If the variable is not aligned, then the cache-assisted interlocked
operation is not always possible. This means it will execute actual bus
locking, which is much slower.
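
As a rough illustration of when the slow path can come into play: a misaligned operand may end up straddling two cache lines, and that is the case that cannot be handled inside a single line. The sketch below assumes a 64-byte cache line, which is common on x86 but is an assumption here, not something stated in this thread; the names CACHE_LINE_SIZE and crosses_cache_line are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Check the target CPU before relying on it. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: true when the `size` bytes starting at `p`
   span two different cache lines. */
static bool crosses_cache_line(const void *p, size_t size)
{
    assert(size > 0);
    uintptr_t first = (uintptr_t)p / CACHE_LINE_SIZE;
    uintptr_t last  = ((uintptr_t)p + size - 1) / CACHE_LINE_SIZE;
    return first != last;
}
```

A naturally aligned 4-byte variable can never cross a line under this assumption, while a 4-byte variable at offset 62 within a line does.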

"Patogenx" <pato...@gmail.com> wrote in message
news:6631acfa-2711-4587...@o14g2000vbo.googlegroups.com...

Don Burn

May 19, 2009, 10:21:19 AM
Not all systems are x86; try this on an IA64.


--
Don Burn (MVP, Windows DDK)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr
Remove StopSpam to reply

"Patogenx" <pato...@gmail.com> wrote in message
news:6631acfa-2711-4587...@o14g2000vbo.googlegroups.com...

I am confused that Microsoft says memory alignment is required for

from Microsoft MSDN Library

__________ Information from ESET NOD32 Antivirus, version of virus signature
database 4087 (20090519) __________

The message was checked by ESET NOD32 Antivirus.

http://www.eset.com

Pavel A.

May 19, 2009, 10:46:37 PM
The InterlockedExchange operations are not directly related to the "lock"
thing. They do require alignment. Follow the latest Intel documentation;
it is correct.

Regards,
--pa

"Patogenx" <pato...@gmail.com> wrote in message
news:6631acfa-2711-4587...@o14g2000vbo.googlegroups.com...
>

> system performance: •Any boundary for an 8-bit access (locked or
> otherwise). •16-bit boundary for locked word accesses. •32-bit
> boundary for locked doubleword accesses. •64-bit boundary for locked
> quadword accesses.

m

May 20, 2009, 8:50:54 PM
What's to be confused about? There is a Windows API that has an alignment
requirement, and there is a set of instructions for some Intel CPUs that can
execute on arbitrary unaligned data.

The fact that on certain platforms the API uses one of these instructions
is useful to know, and can help to code better programs, but should not
cause any confusion!


"Patogenx" <pato...@gmail.com> wrote in message
news:6631acfa-2711-4587...@o14g2000vbo.googlegroups.com...

I am confused that Microsoft says memory alignment is required for

Pavel A.

May 21, 2009, 3:01:03 PM
m wrote:
> What's to be confused about? There is a Windows API that has an alignment
> requirement and a set of instructions for some Intel CPUs that can execute
> on arbitrary unaligned data?
>
>
>
> The fact that on certain platforms, the API uses one of these instructions
> is useful to know, and can help to code better programs, but should not
> cause any confusion!
>

IMHO the Interlocked... functions are indeed quite confusing
and could be explained better to new developers.

What is special about these functions is that they are not
really "Windows API". They represent the (relatively) new
mechanism designed by Intel for efficient multithreading in either user
or kernel mode.
These functions are implemented by the CPU alone, as C compiler intrinsics,
without needing any OS or library calls, whether in Windows, Linux
or Mac OS. So the definitive, first-hand information on these functions
comes from Intel or AMD, not from Microsoft (though MS was greatly
involved in the design, of course).

The Interlocked... APIs on unaligned addresses are not guaranteed to do
the interlocking magic; it is undefined behavior. Perhaps it would be
better if they threw exceptions.

Regards,
-- pa

Jakob Bohm

May 22, 2009, 8:15:50 AM
Pavel A. wrote:
> m wrote:
>> What's to be confused about? There is a Windows API that has an
>> alignment requirement and a set of instructions for some Intel CPUs
>> that can execute on arbitrary unaligned data?
>>
>>
>>
>> The fact that on certain platforms, the API uses one of these
>> instructions is useful to know, and can help to code better programs,
>> but should not cause any confusion!
>>
>
> IMHO the Interlocked... functions are indeed quite confusing
> and can be explained better to new developers.
>
> What is special about these functions: they are not
> really "Windows API". They represent the (relatively) new
> mechanism designed by Intel for efficient multi threading in either user
> or kernel mode.

Actually, the instructions behind InterlockedIncrement,
InterlockedDecrement and InterlockedExchange (in their 16-bit form) were
part of the 1978 8086 CPU that preceded the 8088 CPU used in the
original IBM PC. The compare-and-exchange instruction behind
InterlockedCompareExchange(), which can emulate all the other
Interlocked... functions, was introduced in 32-bit form on the 486 CPU
and in 64-bit form on the Pentium. So they are not that recent.
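
The remark that compare-exchange can emulate the other operations can be sketched with a retry loop. The example below uses C11 <stdatomic.h> as a portable stand-in for InterlockedCompareExchange; it is not how any Windows DLL actually implements these calls, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: atomic exchange built from compare-and-swap only.
   Returns the value that was stored in *target before the swap. */
static int32_t exchange_via_cas(_Atomic int32_t *target, int32_t value)
{
    assert(target != NULL);
    int32_t old = atomic_load(target);
    /* On failure, `old` is refreshed with the current contents, so retry. */
    while (!atomic_compare_exchange_weak(target, &old, value)) {
    }
    return old;
}

/* Hypothetical helper: atomic increment built the same way.
   Returns the incremented value. Overflow at INT32_MAX is not handled. */
static int32_t increment_via_cas(_Atomic int32_t *target)
{
    assert(target != NULL);
    int32_t old = atomic_load(target);
    /* `old + 1` is recomputed from the refreshed `old` on every retry. */
    while (!atomic_compare_exchange_weak(target, &old, old + 1)) {
    }
    return old + 1;
}
```

The loop only repeats when another thread changed the variable between the load and the compare-exchange (or when the weak form fails spuriously), so under low contention it usually runs once.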

> These functions are realized by the CPU alone, as C compiler intrinsics,
> without needing any OS or library calls, neither in Windows nor in Linux
> or MAC OS. So, the definitive, first hand information on these functions
> comes from Intel or AMD, not from Microsoft (though MS was greatly
> involved in the design, of course).

Actually, until the most recent compilers, the Windows versions of these
calls were actual API calls to implementations in KERNEL32.DLL,
ntdll.dll or ntoskrnl.exe (depending on program type). Those
implementations were/are tight, hand-written assembler functions wrapping
the relevant CPU instructions. The exact implementations depended on
the available (or assumed minimum) CPU for that copy of the system DLL.

For x86 CPUs, the set of available instructions was significantly
increased with the release of the 386, 486 and Pentium CPUs. For x64
CPUs, the set of available instructions differs between AMD and Intel
chips (with AMD having more instructions). For SPARC (not supported by
Windows), some instructions are only available on v9 and later CPUs.
Other CPU families have their own restrictions.

One consequence of that is that for certain older Windows versions,
InterlockedIncrement() and InterlockedDecrement() only return the sign
of the result, not the exact value. The implementation that returns the
exact value requires at least a 486 and was thus not available in
Windows versions whose original minimum system requirements allowed
installation on a 386.
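
Code that has to tolerate the sign-only return value described above can restrict itself to asking whether the result is zero, negative or positive. A minimal sketch of a reference-count release written that way follows, using C11 atomics as a stand-in for InterlockedDecrement; the name release_reference is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: drops one reference and reports whether it was the
   last one. Only the "is it zero?" property of the decremented value is
   used, never its exact magnitude, so the same logic would also work with
   a decrement that reports just the sign of the result. */
static bool release_reference(atomic_long *refcount)
{
    assert(refcount != NULL);
    long after = atomic_fetch_sub(refcount, 1) - 1;
    return after == 0;
}
```

A caller would destroy the object when release_reference returns true and otherwise do nothing with the result.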

>
> The Interlocked... APIs on unaligned addresses are not guaranteed to do
> the interlocking magic - it's undefined behavior. Perhaps, they would
> better throw exceptions.
>

To deliberately throw an exception, they would need to include extra
code to test the address, which is not acceptable for such high speed
calls. Some CPUs might throw alignment fault exceptions in hardware,
and Windows could then pass those on to the application, but I have not
seen this on x86 or x64 platforms.
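
As a rough illustration of the extra address test discussed above, here is a sketch of a wrapper that refuses misaligned targets instead of performing the exchange. It uses C11 atomics rather than the real InterlockedExchange, reports the failure through a return value rather than an exception, and the name checked_exchange is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper: performs an atomic 32-bit exchange only when
   `target` is on a 4-byte boundary. Returns false, leaving *previous
   untouched, when the address is misaligned. */
static bool checked_exchange(void *target, int32_t value, int32_t *previous)
{
    assert(previous != NULL);
    /* The extra code: one mask and one branch on every call. */
    if (((uintptr_t)target & (sizeof(int32_t) - 1)) != 0) {
        return false;
    }
    /* The cast is made only after the alignment test has passed. */
    *previous = atomic_exchange((_Atomic int32_t *)target, value);
    return true;
}
```

Whether a mask and a branch per call is an acceptable price is a judgment call; a debug-only assertion is one possible middle ground.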


--
Jakob Bøhm, M.Sc.Eng. * j...@danware.dk * direct tel:+45-45-90-25-33
Netop Solutions A/S * Bregnerodvej 127 * DK-3460 Birkerod * DENMARK
http://www.netop.com * tel:+45-45-90-25-25 * fax:+45-45-90-25-26
Information in this mail is hasty, not binding and may not be right.
Information in this posting may not be the official position of Netop
Solutions A/S, only the personal opinions of the author.
