InterlockedCompareExchange64 under VC++ 6.0: in-line assembly using cmpxchg8b?


Michael K. O'Neill

Jun 20, 2006, 3:38:01 PM
I am trying to implement an atomic compare-and-exchange of a "double" type,
under VC++ 6.0 for WinXP.

The Windows API function InterlockedCompareExchange64 is not available
for WinXP (it's only available for Vista; see
http://msdn.microsoft.com/library/en-us/dllproc/base/interlockedcompareexchange64.asp ).

There is a compiler intrinsic, _InterlockedCompareExchange64 (see
http://msdn2.microsoft.com/en-us/library/ttk2z1ws(VS.80).aspx ), but this
intrinsic seems to be available only in more recent versions of VC++ (such
as VC++ 2005); it is not available in VC++ 6.0.

So, I think that I need an in-line assembly function which, like the 2005
compiler intrinsic, uses the cmpxchg8b instruction. In-line assembly,
however, is beyond my capability.

I found user-mode code that uses in-line assembly and the cmpxchg8b
instruction at this thread in microsoft.public.win32.programmer.kernel (see
the next-to-last post):
http://groups.google.com/group/microsoft.public.win32.programmer.kernel/browse_thread/thread/ecc0a35d9fcef3b5/0e3b6a23d301a62a .
The code doesn't work for me. Here's what I mean by "doesn't work": I
implemented the function in a single-threaded environment, simply to test
whether the function stores the correct double value at the destination
location. As expected (since we're in a single-threaded environment), the
function's return value indicates that the current value of the destination
agrees with the comparand's value, and that the exchange was successful.
However, the value that actually gets stored at the destination address is
bizarrely corrupted, and does not agree with the exchange value.

Can anyone offer any insights (other than a suggestion to upgrade to VC++
2005)? Maybe someone knows of the assembly code used by the current
compiler's intrinsic?

Best,
Mike


Joe Seigh

Jun 20, 2006, 4:16:53 PM
Michael K. O'Neill wrote:
> I am trying to implement an atomic compare-and-exchange of a "double" type,
> under VC++ 6.0 for WinXP.
>
> The Windows API function of InterlockedCompareExchange64 is not available
> for WinXP (it's only available for Vista, see
> http://msdn.microsoft.com/library/en-us/dllproc/base/interlockedcompareexchange64.asp )
> .
...

> Can anyone offer any insights (other than a suggestion to upgrade to VC++
> 2005)? Maybe someone knows of the assembly code used by the current
> compiler's intrinsic?
>

Here's what I was using at one point

//-----------------------------------------------------------------------------------
// cas64_mp --
//
// returns:
// 1 - successful
// 0 - unsuccessful, xcmp updated w/ current value
//-----------------------------------------------------------------------------------
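int cas64_mp(void * dest, void * xcmp, void * xxchg) {
    // The result is left in eax by the asm block, so there is no C return
    // statement (VC++ reports warning C4035 for this).
    __asm
    {
        mov esi, [xxchg]        ; exchange value -> ecx:ebx
        mov ebx, [esi + 0]
        mov ecx, [esi + 4]

        mov esi, [xcmp]         ; comparand -> edx:eax
        mov eax, [esi + 0]
        mov edx, [esi + 4]

        mov edi, [dest]         ; destination
        lock cmpxchg8b [edi]    ; ZF=1 on success, else edx:eax = [edi]
        jz yyyy

        mov [esi + 0], eax      ; failure: store current value into *xcmp
        mov [esi + 4], edx

    yyyy:
        mov eax, 0              ; mov leaves ZF unchanged
        setz al                 ; eax = 1 on success, 0 on failure
    }
}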


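It should work unless I broke something later on. It's IBM style CAS
(returns a success flag and updates the comparand on failure), not
Microsoft style (returns the previous value of the destination).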
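
To make the two conventions concrete, here is a minimal sketch of their semantics in portable C++. These are plain, non-atomic reference models: the names cas64_ibm_model, cas64_ms_model and demo_cas_models are made up for illustration, and the code only describes what each style returns and updates. It is not a substitute for the locked cmpxchg8b above.

```cpp
#include <cassert>
#include <cstdint>

// NON-ATOMIC reference models (hypothetical names). They show the calling
// conventions only and must not be used for real synchronization.

// IBM style, as in cas64_mp: returns 1 on success; on failure returns 0
// and overwrites *xcmp with the value currently stored at *dest.
int cas64_ibm_model(std::uint64_t* dest, std::uint64_t* xcmp,
                    const std::uint64_t* xxchg) {
    if (*dest == *xcmp) {
        *dest = *xxchg;
        return 1;
    }
    *xcmp = *dest;
    return 0;
}

// Microsoft style, as in InterlockedCompareExchange64: always returns the
// value that was at *dest before the call. The caller compares the result
// with the comparand to learn whether the exchange happened.
std::uint64_t cas64_ms_model(std::uint64_t* dest, std::uint64_t exchange,
                             std::uint64_t comparand) {
    std::uint64_t old = *dest;
    if (old == comparand) {
        *dest = exchange;
    }
    return old;
}

// Single-threaded walk-through of both conventions.
void demo_cas_models() {
    std::uint64_t dest = 5, cmp = 7, xchg = 9;
    assert(cas64_ibm_model(&dest, &cmp, &xchg) == 0);  // 5 != 7: fails
    assert(cmp == 5);                                  // comparand refreshed
    assert(cas64_ibm_model(&dest, &cmp, &xchg) == 1);  // retry succeeds
    assert(cas64_ms_model(&dest, 11, 9) == 9);         // old value returned
    assert(dest == 11);
}
```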


--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.

Michael K. O'Neill

Jun 21, 2006, 12:42:01 AM

"Joe Seigh" <jsei...@xemaps.com> wrote in message
news:8M2dndexxOxmywXZ...@comcast.com...

>
> Here's what I was using at one point
>
> .......

Thank you for your response. Using it, I was able to find the following
code, which works perfectly for me. I'm using it instead of your code, for
the perhaps unprincipled reason that it seems to involve fewer instructions,
and I need the code in an inner-most loop. The code is slightly adapted,
but the original was obtained from
http://www.audiomulch.com/~rossb/code/lockfree/ATOMIC.H

#pragma warning(push)
#pragma warning(disable : 4035) // disable no-return warning

inline unsigned __int64
_InterlockedCompareExchange64(volatile unsigned __int64 *dest
                             ,unsigned __int64 exchange
                             ,unsigned __int64 comperand)
{
    //value returned in edx:eax
    __asm {
        lea esi,comperand;
        lea edi,exchange;

        mov eax,[esi];      //edx:eax = comperand
        mov edx,4[esi];
        mov ebx,[edi];      //ecx:ebx = exchange
        mov ecx,4[edi];
        mov esi,dest;
        //lock CMPXCHG8B [esi] is equivalent to the following except
        //that it's atomic:
        //ZeroFlag = (edx:eax == *esi);
        //if (ZeroFlag) *esi = ecx:ebx;
        //else edx:eax = *esi;
        lock CMPXCHG8B [esi];
    }
}

#pragma warning(pop)
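
Since the function takes unsigned __int64 arguments, a double has to be passed as its raw bit pattern; a value cast such as (unsigned __int64)d converts numerically and does not preserve the bits. The following is a rough sketch of that usage, not the code from ATOMIC.H. It assumes a C++11 compiler and uses std::atomic as a stand-in for the inline-assembly routine (neither is available in VC++ 6.0), and the helper names double_to_bits, bits_to_double, cas_double and atomic_add_double are hypothetical. Note that the comparison is bitwise, so +0.0 and -0.0 count as different values.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Copy the bits of a double into a 64-bit integer (no numeric conversion).
inline std::uint64_t double_to_bits(double d) {
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}

// Inverse of double_to_bits.
inline double bits_to_double(std::uint64_t u) {
    double d;
    std::memcpy(&d, &u, sizeof d);
    return d;
}

// Microsoft-style CAS on a double stored as raw bits (hypothetical helper).
// Returns the value seen at dest; the exchange happened if and only if the
// returned bits equal the comparand's bits.
inline double cas_double(std::atomic<std::uint64_t>& dest,
                         double exchange, double comparand) {
    std::uint64_t expected = double_to_bits(comparand);
    dest.compare_exchange_strong(expected, double_to_bits(exchange));
    // On success expected is untouched (it equals the old value); on failure
    // it was overwritten with the current value. Either way it is the value
    // observed at dest.
    return bits_to_double(expected);
}

// Typical inner-loop use: atomically add delta, returning the new value.
inline double atomic_add_double(std::atomic<std::uint64_t>& dest,
                                double delta) {
    std::uint64_t old_bits = dest.load();
    for (;;) {
        double new_val = bits_to_double(old_bits) + delta;
        if (dest.compare_exchange_weak(old_bits, double_to_bits(new_val)))
            return new_val;
        // old_bits now holds the value another thread stored; retry.
    }
}

// Single-threaded check of the helpers.
void demo_cas_double() {
    std::atomic<std::uint64_t> cell(double_to_bits(1.5));
    assert(cas_double(cell, 2.5, 1.5) == 1.5);   // succeeds
    assert(cas_double(cell, 9.0, 1.5) == 2.5);   // fails, reports 2.5
}
```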


Joe Seigh

Jun 21, 2006, 6:42:32 AM
Michael K. O'Neill wrote:
I use the IBM style cas because it's easier to program with. You should
verify that you are actually executing fewer instructions. x86 doesn't
exactly have a lot of registers for the compiler to work with. With
the Microsoft style cas, you need to make a copy of the returned value,
compare it to the comparand value, and copy it to the comparand value if
a retry is needed. So possibly an extra compare and extra memory store/load
instructions. Plus extra overhead for the parameter list, since
you're handling 64 bit values rather than 32 bit references.
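
The difference shows up in the retry loop. Below is a small sketch of the same "add delta" loop written both ways, assuming C++11 std::atomic as a stand-in for the VC++ 6.0 inline assembly; the names cas_ms, add_ibm and add_ms are invented for the example. std::atomic's compare_exchange_strong happens to follow the IBM convention (bool result, comparand refreshed on failure), so the Microsoft-style variant is a thin wrapper around it. Whether the extra compare and copy cost anything measurable depends on the compiler and would have to be checked in the generated code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Microsoft-style wrapper (hypothetical name): returns the prior value.
inline std::uint64_t cas_ms(std::atomic<std::uint64_t>& dest,
                            std::uint64_t exchange,
                            std::uint64_t comparand) {
    dest.compare_exchange_strong(comparand, exchange);
    return comparand;  // prior value of dest on both success and failure
}

// IBM-style loop: a failed CAS refreshes `seen`, so the loop needs no
// separate compare or copy.
inline std::uint64_t add_ibm(std::atomic<std::uint64_t>& dest,
                             std::uint64_t delta) {
    std::uint64_t seen = dest.load();
    while (!dest.compare_exchange_strong(seen, seen + delta)) {
        // seen was overwritten with the current value; just try again.
    }
    return seen + delta;
}

// Microsoft-style loop: copy the returned value, compare it with the
// comparand, and copy it back into the comparand when a retry is needed.
inline std::uint64_t add_ms(std::atomic<std::uint64_t>& dest,
                            std::uint64_t delta) {
    std::uint64_t seen = dest.load();
    for (;;) {
        std::uint64_t prior = cas_ms(dest, seen + delta, seen);
        if (prior == seen)      // the extra compare
            return seen + delta;
        seen = prior;           // the extra copy before retrying
    }
}

// Single-threaded check that both loops compute the same result.
void demo_retry_loops() {
    std::atomic<std::uint64_t> a(10), b(10);
    assert(add_ibm(a, 5) == add_ms(b, 5));
    assert(a.load() == b.load());
}
```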

Chris Thomasson

Jun 21, 2006, 4:43:49 PM
"Michael K. O'Neill" <mikeat...@nospam.hotmail.com> wrote in message
news:t64mg.69865$4L1....@newssvr11.news.prodigy.com...

>
> "Joe Seigh" <jsei...@xemaps.com> wrote in message
> news:8M2dndexxOxmywXZ...@comcast.com...
>>
>> Here's what I was using at one point
>>
>> .......
>
> Thank you for your response. Using it, I was able to find the following
> code, which works perfectly for me. I'm using it instead of your code,
> for
> the perhaps unprincipled reason that it seems to involve fewer
> instructions,
> and I need the code in an inner-most loop. The code is slightly adapted,
> but the original was obtained from
> http://www.audiomulch.com/~rossb/code/lockfree/ATOMIC.H

Humm, they mention atomic reference-counted pointers in there. However,
after taking a brief look, I realized that they are not atomic; they are
only as safe as boost::shared_ptr... Also, there is no support for
"fine-grained" memory barriers.


Michael K. O'Neill

Jun 21, 2006, 5:00:56 PM

"Chris Thomasson" <cri...@comcast.net> wrote in message
news:rKOdnRPzZf9vMATZ...@comcast.com...

All that might be true, and the atomic.h file indeed seems to be directed to
templated classes that are claimed to provide atomic reference-counted
pointers based on boost::shared_ptr.

However, I was uninterested in any of that.

All I wanted was an in-line assembly function that used a "locked" cmpxchg8b
instruction to perform an atomic 64-bit compare-and-exchange. I think I got
that, from code that appears in the top fifth of the page, as indicated
above.

Mike


Michael K. O'Neill

Jun 21, 2006, 5:02:30 PM

"Joe Seigh" <jsei...@xemaps.com> wrote in message
news:xZmdnS8PuZNGvATZ...@comcast.com...

You might be correct, and in all candor, I am simply unequipped to determine
if fewer instructions are in fact executed.

Mike

