Dereference or not, the argument to "m" in GCC inline assembly?

Toby Douglass

Dec 1, 2009, 2:08:31 PM

I am writing inline assembly, implementing atomic operations. I use the
"m" constraint when indicating memory (rather than registers) which is
being operated upon.

Example function (where atom_t is unsigned long int);

INLINE atom_t abstraction_increment( atom_t *value )
{
atom_t
stored_flag,
new_value;

__asm__ __volatile__
(
" mov %2, #1;" // move 1 into stored_flag
" dmb;" // memory barrier
"atomic_add:;"
" ldrex %1, [%3];" // load *value into new_value
" add %1, #1;" // add 1 to new_value
" strex %2, %1, [%3];" // try to store new_value into *value (on success, strex puts 0 into stored_flag)
" teq %2, #0;" // check if stored_flag is 0
" bne atomic_add;" // if not 0, retry (someone else touched *value after we loaded but before we stored)
" dmb;" // memory barrier

// output
: "+m" (value), "=&r" (new_value), "=&r" (stored_flag)

// input
: "r" (value)

// clobbered
: "cc"
);

return( new_value );
}

My question is; when I have a pointer to memory (e.g. atom_t *value,
where atom_t is a native type, unsigned long int in this case), should I
write;

: "+m" (value)

or should I write;

: "+m" (*value)

The docs are not clear, the HOWTO is not clear and Googling finds both
forms of use. I suspect one of them is wrong but the user silently gets
away with it because the optimiser doesn't happen to optimise in such a
way that it causes a problem.

Jan Seiffert

Dec 1, 2009, 11:47:50 PM

Toby Douglass wrote:

No, neither form is wrong in itself; it depends on what you want the compiler
to put into this constraint for you.

As always with pointers, you have two different things.
The pointer has a value, takes some storage, and so has an address (which you
could put into a "type **another_pointer"). It points to something, which also
has a value, takes some storage, and has an address (which is stored in our
pointer).
Which of the two you want to manipulate, the pointer and its value or the
pointee and its value, you decide with the dereference operator * (or the array
operator []).

These are not the same thing, even at the most basic level. You cannot get away
with mixing them up, not even in inline asm.

The "m" constraint, on the other hand, works like the & operator: it takes the
address of an object.

So when you write
int *a;
"m" (a)
you essentially get
&a == pointer to pointer to int gets put somewhere for you,
the compiler thinks you want to manipulate a.

If you write
int *b;
"m" (*b)
you get
&(*b) == no-op, pointer to int gets put somewhere for you,
the compiler thinks you want to manipulate the
pointee of b.
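
One way to see the "works like &" behaviour directly is to ask for the address of the "m" operand instead of its contents. The following is only a minimal sketch, assuming GCC or Clang on x86 (it falls back to plain C elsewhere); the function names and the SKETCH_HAVE_X86_ASM macro are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the asm path is only used with GCC-compatible compilers on x86. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define SKETCH_HAVE_X86_ASM 1
#else
#define SKETCH_HAVE_X86_ASM 0
#endif

/* Returns the address the compiler substitutes for an "m" (*p) operand. */
static void *address_named_by_m_deref(int *p)
{
    void *addr;
    assert(p != NULL);
#if SKETCH_HAVE_X86_ASM
    /* lea computes the effective address of operand %1; nothing is loaded. */
    __asm__ ("lea %1, %0" : "=r" (addr) : "m" (*p));
#else
    addr = (void *)&*p;   /* plain C stand-in for what "m" (*p) names */
#endif
    return addr;
}

/* Returns 1 if "m" (p) names the storage of the pointer variable p itself. */
static int m_without_deref_names_pointer(int *p)
{
    void *addr;
#if SKETCH_HAVE_X86_ASM
    __asm__ ("lea %1, %0" : "=r" (addr) : "m" (p));
#else
    addr = (void *)&p;    /* plain C stand-in for what "m" (p) names */
#endif
    return addr == (void *)&p;
}
```

With "m" (*p) the address handed to the asm is p itself (the pointee's address); with "m" (p) it is the address of the local variable p, which has nothing to do with the int being pointed at.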

Because of that you can not write:
"m" (42)
the constant 42 has no storage, so the compiler can not take the address of it.
What you can do is:
const int x = 42;
"m" (x)
now you may get something like this:
mov 0x80484e0,%eax
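
As a compilable version of the const example, here is a small sketch. It assumes GCC or Clang on x86 for the asm path and uses plain C otherwise; load_x_through_m and sketch_check_load are hypothetical names, not anything from the original code.

```c
#include <assert.h>

/* x has storage, so the compiler can form an address for the "m" operand. */
static const int x = 42;

static int load_x_through_m(void)
{
    int out;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* %1 is replaced by a memory reference to x, %0 by a register. */
    __asm__ ("movl %1, %0" : "=r" (out) : "m" (x));
#else
    out = x;   /* fallback for other targets: an ordinary load */
#endif
    return out;
}

/* Usage example: the value read through the "m" operand is the value of x. */
static void sketch_check_load(void)
{
    assert(load_x_through_m() == 42);
}
```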

Examples:

int foo(int *a)
{
int b;
asm (
"mov %1, %0"
: "=r" (b)
: "m" (*a)
);
return b;
}

turns into:
mov 0x4(%esp),%eax # compiler loads pointer a for me from the stack
mov (%eax),%eax # I dereference a to load the value it points to
ret # OK, that's it, eax is the return value

Now contrast this with:
int bar(int *a)
{
int b;
asm (
"mov %1, %0"
: "=r" (b)
: "m" (a)
);
return b;
}

turns into:
mov 0x4(%esp),%eax # i load a from the stack?
ret # i return a as an int? but it's a pointer...

The fact that you can do tricks like this:
int baz(int *a)
{
int b;
asm (
"mov %1, %0\n\t"
"mov (%0), %0"
: "=r" (b)
: "m" (a) /* ahhh, i don't understand this m thingy */
);
return b;
}

turns into
mov 0x4(%esp),%eax
mov (%eax),%eax
ret
doesn't mean they are the same thing.

Now, in your inline asm above, you write:
atom_t *value
"+m" (value)
But look at the asm code: you do _not_ use %0 even once.
Not using %0 is fine. The operand is meant as clear information for the compiler
("hey, I'm going to modify this memory, don't cache it"), so it knows which
parts of the program state are changed by your inline asm (as you may know, GCC
does not understand your inline asm string; it merely passes it through a
printf-like function to substitute all the % operands, and that's it).
But since you didn't use %0, you never examined first hand what the compiler
understood.
Unfortunately, you pointed the compiler at the wrong object: the pointer,
probably on the stack, not the value you atomically changed.
The only way you will notice this mistake is when the compiler miscompiles some
code because it was not clear to it that it must not cache *value. That can take
some time, since it needs the right conditions. If your atomic function is in a
separate compilation unit, you may never see it.
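
To make the "don't cache it" point concrete, here is a minimal sketch of an operand that is declared but never used in the template. It assumes a reasonably recent GCC or Clang; the empty template makes it target-independent, and sketch_touch_pointee and sketch_read_twice are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Emits no instructions. The "+m" (*obj) operand only tells the compiler
 * that the int obj points to may be read and written here, so any copy of
 * *obj held in a register must be written back before and reloaded after.
 */
static inline void sketch_touch_pointee(int *obj)
{
    __asm__ __volatile__ ("" : "+m" (*obj));
}

static int sketch_read_twice(int *obj)
{
    int first, second;
    assert(obj != NULL);
    first = *obj;
    sketch_touch_pointee(obj);  /* compiler must assume *obj changed */
    second = *obj;              /* so this load may not reuse 'first' */
    return first + second;
}
```

Had the operand been "+m" (obj), the compiler would only have been told about the pointer variable, and it would remain free to reuse the first load of *obj. The difference is only visible in the generated assembly, not in the result of this function.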

Try this:

INLINE atom_t abstraction_increment( atom_t *value )
{

int
stored_flag;
atom_t
new_value;

__asm__ __volatile__
(
" dmb;\n" // memory barrier
/*
* Do not use names as labels, otherwise you are in trouble
* when the compiler decides to duplicate the code (it is free
* to do so). Suddenly you have the same label twice and
* the assembler barfs.
* A named label also creates a symbol, which should be avoided.
*/
"1:\n\t"
" ldrex %1, %3;\n\t" // load *value into new_value
" add %1, #1;\n\t" // add 1 to new_value
" strex %2, %1, %3;\n\t" // try to store new_value into *value (strex puts 0 into stored_flag on success)
" teq %2, #0;\n\t" // check if stored_flag is 0
" bne 1b\n\t" // if not 0, retry from the nearest label 1 looking backwards
" dmb;" // memory barrier

// output
: "=m" (*value),
"=&r" (new_value),
"=&r" (stored_flag)

// input
: "Q" (*value) // the "+" makes trouble with some Ver. 3 gcc

// clobbered
: "cc"
);

return( new_value );
}
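
If you would rather keep your original template with [%3] and the separate "r" (value) input, the smallest change is to dereference only the "m" operand. The following is a sketch under assumptions, not a tested drop-in: the ARM path assumes GCC-style asm on ARMv7 or later, and on any other target it swaps in GCC's __atomic_add_fetch builtin so that the function still compiles and runs. sketch_increment is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long int atom_t;

static inline atom_t sketch_increment(atom_t *value)
{
    assert(value != NULL);
#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
    atom_t new_value;
    unsigned long stored_flag;

    __asm__ __volatile__ (
        "   dmb\n"                    /* memory barrier */
        "1: ldrex %0, [%3]\n"         /* load *value into new_value */
        "   add   %0, %0, #1\n"       /* add 1 to new_value */
        "   strex %1, %0, [%3]\n"     /* try to store; 0 in stored_flag on success */
        "   teq   %1, #0\n"           /* did the store succeed? */
        "   bne   1b\n"               /* if not, retry */
        "   dmb\n"                    /* memory barrier */
        /* %2 is never used in the template; it only tells the compiler
           that the pointee (not the pointer) is read and written. */
        : "=&r" (new_value), "=&r" (stored_flag), "+m" (*value)
        : "r" (value)
        : "cc");
    return new_value;
#else
    /* Swapped-in technique for non-ARM targets: a compiler builtin. */
    return __atomic_add_fetch(value, 1, __ATOMIC_SEQ_CST);
#endif
}
```

The address used by ldrex/strex still comes from the "r" input, exactly as in your version; the "m" operand exists purely to describe the side effect.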

Greetings
Jan

--
The only problem with troubleshooting is that sometimes,
trouble shoots back.

Toby Douglass

Dec 3, 2009, 8:01:20 PM

nomail@invalid wrote:

[snip]

> The m constrain on the other hand works like the & operand, take the
> address of an object.

[snip]

Thank you - that's exactly what I needed.