If I had to guess at the reason for the disparity, I would suggest the paucity of registers in 32-bit mode, which causes more register spills.
Dave
At this point they've received about the same amount
of attention. If you can identify a small example that
demonstrates the slowdown, it would help focus said
attentions.
Russ
Please file a bug on the issue tracker explaining how to
reproduce these crashes. Thanks.
You can generate an assembly listing while linking the binary
by using 8l's -a flag.
I can't tell if you are blaming the MOVB instructions for the
crashes of gdb and Valgrind or for the performance slowdown.
The latter seems more likely, so I am assuming that.
It is true that, to make the compilers' jobs easier, the linker
allows them to ask for instructions like MOVB BP, BX, which the
linker implements as three actual x86 instructions, swapping
registers around the actual move
(say, XCHG AX, BP; MOVB AX, BX; XCHG AX, BP).
Such a sequence appears as a single instruction in the 8l -a output,
but you can see the ruse in the actual instruction bytes displayed.
I would be very interested to see evidence that this trick
is causing a performance problem.
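One way to gather that kind of evidence is a small micro-benchmark that does the same work once with 8-bit arithmetic and once with 32-bit arithmetic. The sketch below is only an illustration, not something from this thread: the names sum8, sum32 and sink are hypothetical, and it assumes a toolchain that provides testing.Benchmark and is built for the 386 target (for example with GOARCH=386). Any difference it shows would still need to be checked against the assembly listing before blaming the MOVB workaround.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the results live so the calls below are not optimized away.
var sink uint8

// sum8 does all of its arithmetic in uint8, so the compiler has to
// use byte-sized operations (hypothetical helper for illustration).
func sum8(data []uint8) uint8 {
	var s uint8
	for _, b := range data {
		s += b ^ 0x5a
	}
	return s
}

// sum32 does the same work in uint32 and truncates only at the end.
func sum32(data []uint8) uint8 {
	var s uint32
	for _, b := range data {
		s += uint32(b) ^ 0x5a
	}
	return uint8(s)
}

func main() {
	data := make([]uint8, 1<<16)
	for i := range data {
		data[i] = uint8(i * 7)
	}

	r8 := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = sum8(data)
		}
	})
	r32 := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = sum32(data)
		}
	})

	// Both versions must agree, otherwise the comparison is meaningless.
	fmt.Println("results equal:", sum8(data) == sum32(data))
	fmt.Println("8-bit: ", r8)
	fmt.Println("32-bit:", r32)
}
```

If the 8-bit version is consistently slower on 386 but not on amd64, that would be a small example worth attaching to a bug report.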
In C this rarely comes up, because all of the "usual arithmetic
conversions" convert up to int before any work happens.
In Go the 8-bit arithmetic operations exercise this workaround
more frequently. If it does turn out that this is causing a
performance problem, the easiest solution is probably to
say that the 8-bit (and maybe 16-bit) values are still represented
as 32-bit registers and just make sure to get the operations right
(only divide and right shift would need special care, I think).
Russ
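To make the last suggestion concrete, here is a rough simulation of holding an 8-bit value in a 32-bit register whose upper 24 bits are allowed to contain garbage. It is a sketch under stated assumptions, not the compiler's actual strategy: the reg type and all function names are hypothetical, and only unsigned values are modeled. It shows that add and multiply give the right low byte without any cleanup, while right shift and divide go wrong unless the upper bits are cleared first.

```go
package main

import "fmt"

// reg models a 32-bit register that carries a uint8 in its low byte.
// The upper 24 bits may hold leftover garbage (hypothetical model).
type reg uint32

// low8 extracts the 8-bit value the register is meant to represent.
func low8(r reg) uint8 { return uint8(r) }

// For add and mul, the low 8 bits of the result depend only on the
// low 8 bits of the operands, so no masking is needed.
func add(a, b reg) reg { return a + b }
func mul(a, b reg) reg { return a * b }

// shrNaive shifts the whole register, so garbage from the upper bits
// slides down into the low byte.
func shrNaive(a reg, n uint) reg { return a >> n }

// shrCareful clears the upper bits before shifting.
func shrCareful(a reg, n uint) reg { return (a & 0xff) >> n }

// divNaive divides the full 32-bit contents, garbage included.
func divNaive(a, b reg) reg { return a / b }

// divCareful clears the upper bits of both operands first.
func divCareful(a, b reg) reg { return (a & 0xff) / (b & 0xff) }

func main() {
	var x, y uint8 = 200, 100

	// 200+100 = 300 overflows a byte; the register now holds 0x12C,
	// while the real uint8 result is 44 (0x2C).
	a := add(reg(x), reg(y))
	fmt.Println("add:        ", low8(a), "want", x+y)
	fmt.Println("mul by 3:   ", low8(mul(a, 3)), "want", (x+y)*3)
	fmt.Println("shr naive:  ", low8(shrNaive(a, 2)), "want", (x+y)>>2)
	fmt.Println("shr careful:", low8(shrCareful(a, 2)), "want", (x+y)>>2)
	fmt.Println("div naive:  ", low8(divNaive(a, 3)), "want", (x+y)/3)
	fmt.Println("div careful:", low8(divCareful(a, 3)), "want", (x+y)/3)
}
```

In this model the naive shift yields 75 instead of 11 and the naive divide yields 100 instead of 14, which matches the intuition that those are the operations where information flows from high bits to low bits.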