Immediate values in movdq?

Paul K. McKneely

May 24, 2011, 2:33:28 PM
Am I correct in my assessment that there are no encodings for
loading immediate value(s) into an XMM register such as in:

movdq xmm0,immediate_value(s)

Even if there were, the assembly syntax to make a list of
immediate values seems to be problematic. I conclude that
you must make a global symbol with values and reference
that location as the source such as in:

ImVals db 0,1,2,3,4,5,6,7,8,9,0Ah,0Bh,0Ch,0Dh,0Eh,0Fh

movdq xmm0,ImVals

Is this correct?


Lasse Reichstein Nielsen

May 24, 2011, 3:54:35 PM
"Paul K. McKneely" <pkmck...@nospicedham.sbcglobal.net> writes:

> Am I correct in my assessment that there are no encodings for
> loading immediate value(s) into an XMM register such as in:
>
> movdq xmm0,immediate_value(s)

Correct.

> Even if there were, the assembly syntax to make a list of
> immediate values seems to be problematic.

Syntax is malleable. I'm sure something could be done if
necessary, but a 16-byte immediate is probably out of the
question.

> I conclude that
> you must make a global symbol with values and reference
> that location as the source such as in:
>
> ImVals db 0,1,2,3,4,5,6,7,8,9,0Ah,0Bh,0Ch,0Dh,0Eh,0Fh
>
> movdq xmm0,ImVals
>
> Is this correct?

If you want to do it in one instruction, yes.

If you only want to load a single double-value, you can also move from
general purpose registers. E.g., in x64 mode:
mov rax, 0x7ff0000000000000
movq xmm0, rax ; movq (not movsd) is the GPR-to-XMM form; it copies the raw 64 bits, double or integer
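
For readers who would rather experiment from C than from an assembler, the register-only path for a single double can be sketched with SSE2 intrinsics. This is an illustration under assumptions, not code from the thread: the helper name `double_from_bits` is made up, the intrinsics come from `<emmintrin.h>`, and `_mm_cvtsi64_si128` is only available when compiling for x86-64. A compiler will usually emit a GPR-to-XMM `movq` for it, but it is free to pick other instructions.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: reinterpret a 64-bit pattern as a double by moving it
   from a general-purpose register into the low half of an XMM register. */
double double_from_bits(uint64_t bits) {
    __m128i v = _mm_cvtsi64_si128((long long)bits);  /* movq xmm, r64; upper 64 bits are zeroed */
    return _mm_cvtsd_f64(_mm_castsi128_pd(v));       /* read the low lane back as a double */
}
```

For example, `double_from_bits(0x7ff0000000000000)` yields positive infinity, the bit pattern used in the assembly above.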

If you need the full value, you can use two gp registers and merge
them in an XMM register.
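mov rax, 0x0001020304050607 ; low 64 bits
mov rbx, 0x08090a0b0c0d0e0f ; high 64 bits
movq xmm0, rax ; xmm0[63:0] = rax, upper half zeroed
pinsrq xmm0, rbx, 1 ; xmm0[127:64] = rbx (pinsrq requires SSE4.1)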
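
The two-register merge can also be tried out from C. The sketch below is hedged: the function names are hypothetical, and it deliberately swaps `pinsrq` for `punpcklqdq` (`_mm_unpacklo_epi64`) so that it needs only SSE2 and no extra compiler flags. The direct counterpart of `pinsrq` would be `_mm_insert_epi64` from `<smmintrin.h>`, which requires SSE4.1. It assumes an x86-64 target.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: build a 128-bit value from two 64-bit halves
   that start out in general-purpose registers. */
__m128i xmm_from_halves(uint64_t lo, uint64_t hi) {
    __m128i vlo = _mm_cvtsi64_si128((long long)lo);  /* movq xmm, r64 */
    __m128i vhi = _mm_cvtsi64_si128((long long)hi);  /* movq xmm, r64 */
    return _mm_unpacklo_epi64(vlo, vhi);             /* punpcklqdq: low = lo, high = hi */
}

/* Read the low and high 64-bit lanes back out, for checking the result. */
uint64_t xmm_low64(__m128i v) {
    return (uint64_t)_mm_cvtsi128_si64(v);
}

uint64_t xmm_high64(__m128i v) {
    return (uint64_t)_mm_cvtsi128_si64(_mm_unpackhi_epi64(v, v));
}
```

Note that x86 is little-endian: to reproduce the memory image `db 0,1,...,0Fh` from the original question, the halves would be `lo = 0x0706050403020100` and `hi = 0x0F0E0D0C0B0A0908`.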

In x86 mode, you'll need twice as many instructions, so memory looks
better in comparison.
Or, sometimes, when you only need one double, you are lucky and the
double value is exactly representable as a single-precision value:
mov eax, 0x7f800000
movd xmm0, eax ; movd (not movss) is the GPR-to-XMM form
cvtss2sd xmm0, xmm0
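
The widening trick can be checked from C as well. What follows is a sketch only: `widen_float_bits` is a hypothetical name, the intrinsics are the SSE2 ones from `<emmintrin.h>`, and the final 64-bit move assumes an x86-64 target. It takes the 32-bit pattern of a float and returns the 64-bit pattern of the double that `cvtss2sd` produces from it.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: widen a float, given as raw bits, to a double
   and return the double's raw bits. */
uint64_t widen_float_bits(uint32_t float_bits) {
    __m128  f = _mm_castsi128_ps(_mm_cvtsi32_si128((int)float_bits)); /* movd xmm, r32 */
    __m128d d = _mm_cvtss_sd(_mm_setzero_pd(), f);                    /* cvtss2sd */
    return (uint64_t)_mm_cvtsi128_si64(_mm_castpd_si128(d));          /* movq r64, xmm */
}
```

With the value from the post, `widen_float_bits(0x7f800000)` gives `0x7ff0000000000000`, i.e. single-precision infinity widened to double-precision infinity.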

/L
--
Lasse Reichstein Holst Nielsen
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'

wolfgang kern

May 24, 2011, 4:19:55 PM

Paul K. McKneely asked:

> Am I correct in my assessment that there are no encodings for
> loading immediate value(s) into an XMM register such as in:
>
> movdq xmm0,immediate_value(s)

Yes you're correct, unfortunately we haven't got such opcodes.

> Even if there were, the assembly syntax to make a list of
> immediate values seems to be problematic. I conclude that
> you must make a global symbol with values and reference
> that location as the source such as in:
>
> ImVals db 0,1,2,3,4,5,6,7,8,9,0Ah,0Bh,0Ch,0Dh,0Eh,0Fh
>
> movdq xmm0,ImVals
>
> Is this correct?

I'd avoid loading an XMM reg with the 'address' of a data label :)

movdqa xmm0,[VarVal]

the only way to load an XMM-reg is from memory or another XMM. Not
even constant-loads like FPU(FLD1 .. FLDPI) are foreseen with SSE.
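
One detail worth spelling out about the memory form: `movdqa` requires its memory operand to be 16-byte aligned and faults otherwise, while `movdqu` accepts any address. A minimal C sketch of both loads follows, under stated assumptions: the table `var_val` and the function names are hypothetical stand-ins for the `VarVal` label, C11 `_Alignas` is used for the alignment, and the instruction named in each comment is what compilers typically emit, not a guarantee.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical 16-byte aligned constant, the C analogue of an aligned data label. */
static _Alignas(16) const uint8_t var_val[16] = {
    0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
    0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F
};

/* Aligned load: typically movdqa, or folded into a memory source operand. */
__m128i load_aligned_constant(void) {
    return _mm_load_si128((const __m128i *)var_val);
}

/* Unaligned load: typically movdqu, valid for any address. */
__m128i load_unaligned(const void *p) {
    return _mm_loadu_si128((const __m128i *)p);
}

/* Returns 1 when all 16 bytes match: the byte-wise compare sets every lane
   to 0xFF, so the 16-bit mask of sign bits is 0xFFFF. */
int xmm_equal(__m128i a, __m128i b) {
    return _mm_movemask_epi8(_mm_cmpeq_epi8(a, b)) == 0xFFFF;
}
```

Passing a misaligned pointer to `_mm_load_si128` is the C-level equivalent of using `movdqa` on an unaligned label, so the constant has to be declared with the alignment shown.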
__
wolfgang


Paul K. McKneely

May 27, 2011, 4:37:43 PM

> movdqa xmm0,[VarVal]
>
> the only way to load an XMM-reg is from memory or another XMM. Not
> even constant-loads like FPU(FLD1 .. FLDPI) are foreseen with SSE.

Yes. I can see my error. I should have written:

movdqa xmm0,XMMWORD PTR VarVal

x86 syntax is so bad. I still think in Motorola syntax
which is much better.

Thanks.


wolfgang kern

May 28, 2011, 8:26:42 AM

Paul K. McKneely wrote:

>> movdqa xmm0,[VarVal]

It may be just a matter of familiarity. After reading all the good
documentation back and forth several times, I'm now firmly bound to
AMD's and Intel's notation and cannot find any advantage in an
upside-down translation :)

btw: I forgot to say (as Lasse already mentioned) that also:

mov RAX,imm64
movd xmm0,RAX ;load low half of an XMM from any reg64

followed by a shift/merge/load for the upper half would work,
but the optimisation guides treat this as a time-consuming detour
and recommend using mem128 source operands instead.
__
wolfgang

