Markus Wichmann wrote:
> On 12.03.2012 13:57, Bernhard Schornak wrote:
>>
>> What a great improvement!
>>
>
> Over what exactly? I mean, at least it's stable, which is more than can
> be said about the Win64 ABI.
Which I do not like, either. On the other hand, it is
(against my own expectation) almost as stable as OS/2
(IMHO the best OS ever - until IBM gave it up...).
16 GPRs + 16 XMMs = 32. As you listed, 28 of them are
declared as 'volatile' in *nix systems, while only 12
are declared as 'volatile' in Win-64. It surely is no
issue in functions like those you posted, but some of
my functions (e.g. my DBE core) have a thousand or more
lines of contiguous code. The DBE automatically resizes
memory blocks if new dynamic strings exceed the
currently allocated block size. Allocation requires a
call to a 'dirty' API function, where six GPR and six
XMM registers are overwritten with garbage. Without a
wrapper (doing some more work than just calling dirty
API functions), I would have to reload eight of those twelve
registers at that point. This topic has many more ill
side effects, sufficient to fill entire books...
...
shrq $0x08, %r14 # r14 = sig
shrq $0x19, %r15 # r15 = sep
movl $0x0D, %ebx # RBX = loop_cnt
movq %rdi, %rcx # RCX = HWND dlg
movl $0x1500, %edx # RDX = ID
xorq %r8, %r8 # R08 = FALSE
andq $0x01, %r14 # r14 = sig BOOL
andq $0x03, %r15 # r15 = sep INDEX
0:call _SBtn
incl %edx
decl %ebx
jns 0b
xorl %eax, %eax # RAX = 0
decq %r15 # R15 = -1, 1, 2
cmovs %eax, %r15d # R15 = 0, 1, 2
movl $0x03, %ebx # RBX = loop_cnt
movl $0x1515, %edx # RDX = ID
1:call _SBtn
incl %edx
decl %ebx
jne 1b
...
A snippet out of a dialog procedure, 'clicking' radio
buttons. Without a wrapper, the Win-64 version would look
like this:
...
0:pushq %rcx
pushq %rdx
pushq %r8
pushq %r9
pushq %r10
pushq %r11
call *__imp__SendDlgItemMessageA(%rip)
popq %r11
popq %r10
popq %r9
popq %r8
popq %rdx
popq %rcx
...
For System V, R12...R15 would have to be saved and
restored, as well.
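
For comparison, a 'clean' wrapper along the lines of _SBtn
could look roughly like the sketch below. This is not my
actual code, just a minimal illustration under a few
assumptions: the wrapper takes RCX = HWND, RDX = control ID
and R8 = check state, sends BM_SETCHECK (0xF1), and returns
with RCX, RDX and R8...R11 unchanged. XMM0...XMM5 are not
saved here, and the import symbol is spelled as in the
snippet above (it may differ with other toolchains).

```asm
# Hypothetical sketch of a register-preserving wrapper (Win-64, GNU as).
# In:  RCX = HWND dlg, RDX = control ID, R8 = check state
# Out: RAX = result of SendDlgItemMessageA
#      RCX, RDX, R8...R11 restored to their values on entry
        .text
        .globl  _SBtn
_SBtn:
        subq    $0x58, %rsp          # 0x20 shadow + 8 (5th arg) + 0x30 save area
                                     # RSP is 16-byte aligned again after this
        movq    %rcx, 0x28(%rsp)     # save the six volatile GPRs
        movq    %rdx, 0x30(%rsp)
        movq    %r8,  0x38(%rsp)
        movq    %r9,  0x40(%rsp)
        movq    %r10, 0x48(%rsp)
        movq    %r11, 0x50(%rsp)
        movq    %r8, %r9             # wParam = check state
        movl    $0xF1, %r8d          # Msg    = BM_SETCHECK
        movq    $0, 0x20(%rsp)       # lParam = 0 (5th argument, on the stack)
        call    *__imp__SendDlgItemMessageA(%rip)
        movq    0x28(%rsp), %rcx     # restore the six volatile GPRs
        movq    0x30(%rsp), %rdx
        movq    0x38(%rsp), %r8
        movq    0x40(%rsp), %r9
        movq    0x48(%rsp), %r10
        movq    0x50(%rsp), %r11
        addq    $0x58, %rsp
        ret
```

With something like this in place, the calling loop can keep
its counters and parameters in registers across the call, as
in the first snippet, and the save/restore code exists only
once instead of at every call site.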
Just have a look at what GCC emits as output for AS to
get a clue how much -unnecessary- code could be saved
with 'clean' ABIs. You can reduce GCC's code by about
40 percent (running at least twice as fast) with some
simple changes, providing a 'clean' environment...