proc_start:
xor eax, eax
test rax, rax
jnz code32
...
code32:
But this seems rather long-winded. Is there a standard idiom that
everyone uses already?
--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end
> Currently I use:
>
> proc_start:
> xor eax, eax
> test rax, rax
> jnz code32
> ...
> code32:
>
> But this seems rather long-winded. Is there a standard idiom that
> everyone uses already?
What should this be good for? If you XOR EAX with itself,
the upper dword is always cleared as well. Hence, a TEST
always sets the zero flag and the function never branches
to 'code32' - unless there is no other JMP or RET between
the conditional jump and 'code32'.
Greetings from Augsburg
Bernhard Schornak
He's looking for a way to distinguish 32- and 64-bit mode (why not
16-bit mode I don't know ;)
I came up with the following:
bits 16
dec ax ; 48
mov ax,strict word 0 ; B8 00 00
jmp short code16 ; EB xx
jmp short code32 ; EB xx
nop ; 90
nop ; 90
code64:
nop
code32:
nop
code16:
nop
-hpa
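To see why the sequence above lands on a different label in each mode, here is a toy Python model of the decode. It is only a sketch: it understands just the three opcodes used, and the jump displacements 06 and 03 are assumed values that fit this exact layout. The name landing_label is made up for this illustration.

```python
# Toy model of the byte stream above, not a real x86 decoder.
CODE = bytes([
    0x48,              # dec ax / dec eax / REX.W prefix, depending on mode
    0xB8, 0x00, 0x00,  # mov ax, 0 (immediate is 2, 4 or 8 bytes wide)
    0xEB, 0x06,        # jmp short code16 (displacement assumed for this layout)
    0xEB, 0x03,        # jmp short code32 (displacement assumed for this layout)
    0x90, 0x90,        # padding, swallowed by the 64-bit immediate
    0x90,              # offset 10: code64
    0x90,              # offset 11: code32
    0x90,              # offset 12: code16
])
LABELS = {10: "code64", 11: "code32", 12: "code16"}


def landing_label(mode):
    """Return the label reached when CODE is decoded in 16-, 32- or 64-bit mode."""
    ip = 0
    rex_w = False
    while ip not in LABELS:
        op = CODE[ip]
        if op == 0x48:
            # 16/32-bit: a one-byte DEC. 64-bit: REX.W for the next opcode.
            rex_w = (mode == 64)
            ip += 1
        elif op == 0xB8:
            # MOV accumulator, imm: immediate width follows the operand size.
            if rex_w:
                imm_len = 8
            elif mode == 16:
                imm_len = 2
            else:
                imm_len = 4
            ip += 1 + imm_len
            rex_w = False
        elif op == 0xEB:
            # JMP short rel8 (both displacements here are positive).
            ip += 2 + CODE[ip + 1]
        else:
            raise ValueError("opcode %#x not modelled" % op)
    return LABELS[ip]


for mode in (16, 32, 64):
    print(mode, landing_label(mode))
```

The trick is that the MOV immediate grows with the operand size, so each wider mode swallows one more of the following jumps.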
Ahhh! ;)
mov eax,1
shl rax,32
test eax,eax
jne code32
might be what he wants. In 32 bit mode, the shift count
of 32 is masked to zero, leaving the set bit in EAX. In
64 bit mode, EAX is zero (the 32 bit shift is performed as
expected). It is not a really good solution, as it creates
three dependencies (two pipes are sent to sleep while one
is working)...
AFAIK, 16 bit mode is no longer supported on 64 bit
Windoze versions (at least, Win7 refuses to launch my
old 16 bit stuff).
> mov eax,1
> shl rax,32
> test eax,eax
> jne code32
I was hoping for something shorter than the original 7 bytes. You
know, something like:
C:\gfortran\clf\mxcsr>link /dump /disasm prolog2.obj
Microsoft (R) COFF/PE Dumper Version 8.00.40310.39
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file prolog2.obj
File Type: COFF OBJECT
CODE:
00000000: 83 C8 00 or eax,0
00000003: 40 inc eax
00000004: 75 0A jne code32
00000006: 90 nop
00000007: 90 nop
00000008: 90 nop
00000009: 90 nop
0000000A: 90 nop
0000000B: 90 nop
0000000C: 90 nop
0000000D: 90 nop
0000000E: 90 nop
0000000F: 90 nop
code32:
00000010: 90 nop
Summary
11 CODE
C:\gfortran\clf\mxcsr>link /dump /disasm prolog2.obj
Microsoft (R) COFF/PE Dumper Version 8.00.40310.39
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file prolog2.obj
File Type: COFF OBJECT
CODE:
0000000000000000: 83 C8 00 or eax,0
0000000000000003: 40 75 0A jne code32
0000000000000006: 90 nop
0000000000000007: 90 nop
0000000000000008: 90 nop
0000000000000009: 90 nop
000000000000000A: 90 nop
000000000000000B: 90 nop
000000000000000C: 90 nop
000000000000000D: 90 nop
000000000000000E: 90 nop
000000000000000F: 90 nop
code32:
0000000000000010: 90 nop
Summary
11 CODE
But I guess I'm getting too fancy above: simply xor eax, eax as the
first instruction would get it down to 5 bytes.
> Ahhh! ;)
>
> mov eax,1
> shl rax,32
Won't the REX.W prefix of the shl be 0x48, which is dec eax
in 32-bit mode?
I.e., in 32-bit mode you zero eax exactly, before not shifting it.
If you moved anything but 1 into eax, it should work.
/L
--
Lasse Reichstein Holst Nielsen
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'
In 32 bit mode EAX is zero also. The REX prefix of
"shl rax,32" is 48h, which is the opcode for "dec eax" in
32b mode.
mov eax,1
test rax,rax
je code32
should work. This is interpreted as:
mov eax,1
dec eax
test eax,eax
je code32
in 32b mode.
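The difference between the shl version and the test version can be checked with a small Python model. This is a hedged sketch, not an emulator: it handles only the opcodes that appear in these two sequences, and the name branch_taken and the byte encodings written out below are this example's own assumptions about what an assembler would emit.

```python
def branch_taken(code, mode):
    """Toy model: run `code` in 32- or 64-bit mode, report whether the Jcc is taken."""
    rax, zf, ip, rex_w = 0, False, 0, False
    while True:
        op = code[ip]
        if op == 0x48:
            if mode == 32:                    # 48h is DEC EAX in 32-bit mode
                rax = (rax - 1) & 0xFFFFFFFF
                zf = rax == 0
            else:                             # 48h is REX.W in 64-bit mode
                rex_w = True
            ip += 1
            continue
        width = 64 if rex_w else 32
        mask = (1 << width) - 1
        if op == 0xB8:                        # mov eax, imm32
            rax = int.from_bytes(code[ip + 1:ip + 5], "little")
            ip += 5
        elif op == 0xC1 and code[ip + 1] == 0xE0:   # shl eax/rax, imm8
            count = code[ip + 2] & (width - 1)      # count masked to 5 or 6 bits
            if count:                               # a zero count leaves flags alone
                rax = ((rax & mask) << count) & mask
                zf = rax == 0
            ip += 3
        elif op == 0x85 and code[ip + 1] == 0xC0:   # test eax,eax / test rax,rax
            zf = (rax & mask) == 0
            ip += 2
        elif op == 0x75:                      # jne
            return not zf
        elif op == 0x74:                      # je
            return zf
        else:
            raise ValueError("opcode %#x not modelled" % op)
        rex_w = False


# mov eax,1 / shl rax,32 / test eax,eax / jne code32
SHL_VERSION = bytes([0xB8, 1, 0, 0, 0, 0x48, 0xC1, 0xE0, 0x20, 0x85, 0xC0, 0x75, 0x00])
# mov eax,1 / test rax,rax / je code32
TEST_VERSION = bytes([0xB8, 1, 0, 0, 0, 0x48, 0x85, 0xC0, 0x74, 0x00])

for name, seq in (("shl", SHL_VERSION), ("test", TEST_VERSION)):
    print(name, branch_taken(seq, 32), branch_taken(seq, 64))
```

In this model the shl version falls through in both modes, while the test version branches only in 32-bit mode.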
> But this seems rather long-winded. Is there a standard idiom that
> everyone uses already?
It's not just long-winded ...
In 64 bit mode it is completely useless and you can replace the whole
thing with a NOP or nothing, because a TEST of a previously cleared
REG, with the inherent zero-extension of the high dword, will show
a zero condition anyway.
__
wolfgang
This is a reply for Lasse's posting, as well!
I saw the prefix stuff a year ago, but had forgotten about
it in the meantime. Your objections are right, of course. Darn
memory controller of mine... ;)
Dick's solution above seems to be the shortest possible way
to determine the current mode, I think.
AMD64 programmer's manual vol 3 (24594 Rev. 3.14 September 2007) page 12:
Any other placement of a REX prefix, or any use of a REX prefix in an
instruction that does not access an extended register, is ignored.
So we can combine REX with JE as follows:
83 C8 FF or eax,-1
40 inc eax/ignored
74 xx je code32
A two-byte xor is the shortest byte sequence to clear a register that I'm
aware of. Except for push/pop, I think all the single-byte instructions
which have a hardcoded register, i.e., one byte shorter, were obsoleted for
64-bit mode. So, two-byte instructions plus REX.W are needed for 64-bit,
i.e., 3 bytes minimum for the second instruction. Instructions other than
TEST might work, like INC, maybe NEG, NOT etc. Anyway, I think 7 bytes is
probably the minimum, without something like self-modifying code. Your 5-byte
solution didn't seem to test anything, since it used "or" instead of "xor"?
> But this seems rather long-winded. Is there a standard idiom that
> everyone uses already?
>
Why are you testing for 32-bits vs. 64-bits in the prologue? Can you compute
this elsewhere and pass this information in to the function? Can you create
a global variable?
Rod Pemberton
Seems that:
33 C0 xor eax,eax ; ZF set
40 inc eax/ignored REX ; ZF clear in 32-bit mode
75 xx jne code32
would be one byte shorter.
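As a sanity check on the flag reasoning, the 5-byte sequence and the 6-byte "or eax,-1" variant can be run through a toy Python model. It is a sketch under stated assumptions: only these opcodes are modelled, a bare 40h before a Jcc is treated as an ignored REX prefix in 64-bit mode (as the manual quote says), and the names takes_code32_branch, XOR_VERSION and OR_VERSION are invented here.

```python
def takes_code32_branch(code, mode, eax=0xDEADBEEF):
    """Toy model: True if the conditional jump to code32 is taken.

    `eax` is whatever garbage the register holds on entry.
    """
    zf = False
    ip = 0
    while True:
        op = code[ip]
        if op in (0x31, 0x33) and code[ip + 1] == 0xC0:   # xor eax,eax
            eax = 0
            zf = True
            ip += 2
        elif op == 0x83 and code[ip + 1] == 0xC8:         # or eax, imm8 (sign-extended)
            imm = code[ip + 2]
            if imm >= 0x80:
                imm |= 0xFFFFFF00
            eax |= imm
            zf = eax == 0
            ip += 3
        elif op == 0x40:
            if mode == 32:                                # inc eax
                eax = (eax + 1) & 0xFFFFFFFF
                zf = eax == 0
            # 64-bit: bare REX prefix, no effect on the following Jcc
            ip += 1
        elif op == 0x75:                                  # jne
            return not zf
        elif op == 0x74:                                  # je
            return zf
        else:
            raise ValueError("opcode %#x not modelled" % op)


XOR_VERSION = bytes([0x33, 0xC0, 0x40, 0x75, 0x00])       # xor / inc-or-REX / jne
OR_VERSION = bytes([0x83, 0xC8, 0xFF, 0x40, 0x74, 0x00])  # or -1 / inc-or-REX / je

for name, seq in (("xor", XOR_VERSION), ("or", OR_VERSION)):
    print(name, takes_code32_branch(seq, 32), takes_code32_branch(seq, 64))
```

Both sequences branch to code32 only in 32-bit mode in this model, whatever EAX held on entry.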
> Seems that:
> 33 C0 xor eax,eax ; ZF set
> 40 inc eax/ignored REX ; ZF clear in 32-bit mode
> 75 xx jne code32
> would be one byte shorter.
Exactly. I switched my code over to this two days ago when I came
to the conclusion that the thread was converging on this answer.
Except I seem to be using 31 C0 but that's just xor r32, r/m32 vs.
xor r/m32, r32. It has the further advantage over my original
version that it leaves the high bits of eax clear in the code32
branch, which ends up saving code.
H. Peter Anvin's code was also interesting in that it performed the
task in just one instruction for 64-bit code and even can handle
16-bit code, but it was twice as long as the above solution and
didn't zero the high bits of eax in the code32 branch.
I tested the solution adopted and even though it follows the dubious
practice of using an ignored REX byte, it worked!