I am trying to access SSE instructions with gcc's inline assembly but I
get an error message:
tests.c: In function 'main':
tests.c:6: error: memory input 1 is not directly addressable
Here is the code:
1 #include <stdio.h>
2
3 int main(int argc, char **argv)
4 {
5 float y[4] = { 1, 2, 3, 4}, x[4] = { 21, 22, 23, 24 };
6 asm("addps %1, %0"
7 : "=x" (x)
8 : "m" (y)
9 );
10 printf("%d %d", x, y);
11 return 0;
12 }
All I want to do is add the y array with the x array using the addps
(Add Packed Single-Precision Floating-Point) instruction.
Here is what the AMD64 reference says:
ADDPS xmm1, xmm2/mem128 (NASM syntax)
Adds four packed single-precision floating-point values in
an XMM register and another XMM register or 128-bit
memory location and writes the result in the destination XMM
register.
I have tried many different ways but I just can't seem to get it working.
(by the way, I am using 64-bit debian on an athlon64 and my gcc version
is 4.3.2)
I would be grateful for any help.
- Wolfnoliir
The joys of C arrays and pointers...
They are not the same thing, only "quite" compatible.
So you can use the [] operator on every pointer.
But an array decays into a pointer if you use only its name...
> 6 asm("addps %1, %0"
> 7 : "=x" (x)
> 8 : "m" (y)
...which happens here.
GCC cannot pass in &&y[0], because there is no slot for the extra indirect pointer.
Fixing this to:
asm("addps %1, %0"
: "=x" (x)
: "m" (*y)
);
GCC can now pass in y, but is still unhappy with x: a pointer cannot be placed
in an xmm register.
Your inline asm is also wrong here, I think; it does not do what you expect.
x is only an output, so GCC does not assume it needs an initial value.
Let's fix this to:
asm("addps %2, %0"
: "=x" (*x)
: "0" (*x),
"m" (*y)
);
Now GCC will first move *x into operand 0, and then read the result back from
operand 0 into *x.
This compiles here (do not forget to add -msse or a suitable -march, or gcc will
refuse to generate code for the "x" constraint), but it generates code that is
not what you intended:
8048426: f3 0f 10 05 28 85 04 08 movss 0x8048528,%xmm0
804842e: 0f 58 45 ec addps -0x14(%ebp),%xmm0
8048432: f3 0f 11 45 dc movss %xmm0,-0x24(%ebp)
movss?? Scalar single precision?
Yep, we told gcc to use x[0], not the whole vector x.
So we first have to tell gcc that we are working with whole vectors:
#include <stdio.h>
typedef float __attribute__((vector_size(16))) vsf4;
int main(int argc, char **argv)
{
vsf4 y = (vsf4){ 1.f, 2.f, 3.f, 4.f};
vsf4 x = (vsf4){ 21.f, 22.f, 23.f, 24.f };
asm("addps %2, %0"
: "=x" (x)
: "0" (x),
"m" (y)
);
printf("%p %p\n", (void *)&x, (void *)&y);
return 0;
}
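As a side note, the "0" matching constraint used above can, as far as I know, also be written with a single read-write operand using GCC's "+" modifier. The following is only a sketch: add_ps is a made-up helper name, and the #else branch is an assumed fallback (plain vector addition through the GCC vector extension) for builds where SSE is not enabled.

```c
/* GCC vector extension: four packed floats in 16 bytes. */
typedef float __attribute__((vector_size(16))) vsf4;

/* add_ps is a hypothetical helper name, not part of any library. */
static vsf4 add_ps(vsf4 a, vsf4 b)
{
#if defined(__SSE__)
    /* "+x": operand 0 is both read and written, and lives in an xmm register.
     * __asm__ is the same as asm, but also works with -std=c99 and friends. */
    __asm__("addps %1, %0"
            : "+x" (a)
            : "m" (b));
#else
    /* Assumed fallback when SSE is not available: plain vector addition. */
    a = a + b;
#endif
    return a;
}
```

With the x and y from the program above, x = add_ps(x, y); should leave the four element-wise sums in x.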
If you now want to access single vector members, you have to use a union of a
vector and an array.
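A minimal sketch of such a union, assuming gcc (which documents reading a different union member than the one last written as supported). The names vec4, add4 and print4 are invented for this example, and the #else branch is an assumed fallback for builds without SSE.

```c
#include <stdio.h>

/* GCC vector extension: four packed floats, 16 bytes, 16-byte aligned. */
typedef float __attribute__((vector_size(16))) vsf4;

/* Hypothetical helper type: the same 16 bytes seen as a vector or an array. */
typedef union {
    vsf4  v;
    float f[4];
} vec4;

/* Adds b to a in place, element by element. */
static void add4(vec4 *a, const vec4 *b)
{
#if defined(__SSE__)
    /* The union inherits the vector's 16-byte alignment, which addps
     * requires for its memory operand. */
    __asm__("addps %2, %0"
            : "=x" (a->v)
            : "0" (a->v),
              "m" (b->v));
#else
    /* Assumed fallback when SSE is not available. */
    a->v = a->v + b->v;
#endif
}

/* Prints the four members through the array view of the union. */
static void print4(const vec4 *a)
{
    printf("%f %f %f %f\n", a->f[0], a->f[1], a->f[2], a->f[3]);
}
```

Fill the union through .f, call add4(&x, &y), and then read the single results back from x.f[0] to x.f[3].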
> (by the way, I am using 64-bit debian on an athlon64 and my gcc version
> is 4.3.2)
If you are on 64-bit, this:
> 10 printf("%d %d", x, y);
will also not work. x and y, as arrays, decay into pointers, which you cannot
print with %d. gcc with -Wall should have warned you.
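A small sketch of what the printing could look like instead, assuming you want either the address of the array or its four values. print_address and format4 are hypothetical helper names chosen for this example.

```c
#include <stdio.h>

/* Prints where the array lives: %p expects a void pointer. */
static void print_address(float *v)
{
    printf("%p\n", (void *)v);
}

/* Formats the four values into buf and returns snprintf's result.
 * A float passed to a printf-style function is promoted to double,
 * so %f is the matching conversion. */
static int format4(char *buf, size_t n, const float *v)
{
    return snprintf(buf, n, "%.1f %.1f %.1f %.1f", v[0], v[1], v[2], v[3]);
}
```

For the x array from the question, format4(buf, sizeof buf, x) followed by puts(buf) shows the values, and print_address(x) shows the pointer that %d was being handed.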
>
> I would be grateful for any help.
>
> - Wolfnoliir
Greetings
Jan