On 1/31/2014 10:33 AM, Stephen Sprunk wrote:
> On 30-Jan-14 15:03, BGB wrote:
>> On 1/30/2014 2:06 AM, Terje Mathisen wrote:
>>> Michael S wrote:
>>>> On Wednesday, January 29, 2014 11:56:25 PM UTC+2, Quadibloc
>>>> wrote:
>>>>> But as far as I know, there have *been* no new CISC designs
>>>>> offered to the market; the last one was the 680x0, which didn't
>>>>> succeed.
>>>>
>>>> I'd rather count x386 as separate from x86. Which makes it newer
>>>> than 68K.
>>>
>>> ...
>>> If the 386 had done a real change of the instruction encoding,
>>> maybe moving to three-operand and 16 or 32 registers, then it would
>>> have been really separate, but as it was, with everything just
>>> extended to 32 bits, I thought it was a very nice & natural
>>> extension.
>>
>> they did significantly change how the Mod/RM byte worked, added a
>> SIB byte, ...
>
> Neither really changed how the ISA worked overall, which is remarkable
> when you consider they went from a 16:16 segmented memory model to a
> flat 32-bit memory model, doubled the width of the GPRs and made the
> GPRs (mostly) orthogonal.
>
could be.
though, while segmentation could still be used with 32-bit code,
generally no one bothered (apart from using FS/GS for things like
thread-local storage).
>> in contrast while the move to 64-bits did break binary compatibility,
>> it also made fewer sweeping changes over-all to the ISA (apart from
>> the REX prefix).
>
> They could have hacked 64-bit into the existing model, but doubling the
> number of GPRs was long overdue, and that necessarily broke backward
> compatibility.
>
I don't agree on this point.
they added SSE and so on without breaking compatibility (which,
similarly, added new registers).
potentially, the extra GPRs could have been added in the same way; there
would just have been a slight lag until 32-bit OSes started preserving
them ("use at your own risk"), ...
though, it could have led to a potentially longer/less efficient
instruction encoding.
later on, the VEX prefix and similar were added, and keeping the limit of
8x 32-bit GPRs in 32-bit mode seems fairly arbitrary at the ISA level.
theoretically, they could have just allowed using a VEX prefix in place
of a REX prefix to get at the extended GPRs in 32-bit mode. but, they
didn't (my x86 interpreter actually did it this way, calling this new
construction "PREX" for "Pseudo-REX").
theoretically, this could have also altered the development path of
64-bit ISAs though.
>> though, admittedly, I am not entirely happy with the REX prefix. how
>> they implemented this single feature has caused a mess for
>> programmers which has now extended into its second decade,
>
> REX is only a "mess" for assembler authors, and there are only a handful
> of those worldwide; it's transparent to everyone else.
>
the encoding of REX (among a few other things) is a big part of the
break in compatibility between 32-bit and 64-bit code.
the alternative would be a path where there was no real break between 32
and 64-bit modes, and where things expanded more "naturally".
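to make the encoding problem concrete: REX prefixes occupy bytes 0x40-0x4F, which in 32-bit code are the one-byte INC/DEC r32 opcodes, so the same byte means different things depending on the mode. a minimal sketch of pulling the fields out of a REX byte (layout 0100WRXB) follows; the struct and function names are made up for illustration and are not taken from any real decoder.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a REX prefix byte, laid out as 0100WRXB. */
struct rex_bits {
    unsigned w; /* 1 = 64-bit operand size */
    unsigned r; /* extra high bit for ModRM.reg */
    unsigned x; /* extra high bit for SIB.index */
    unsigned b; /* extra high bit for ModRM.rm, SIB.base, or opcode reg */
};

/* In 64-bit mode, bytes 0x40..0x4F are REX prefixes. */
int is_rex_byte(uint8_t byte)
{
    return (byte & 0xF0) == 0x40;
}

/* Split a REX byte into its four flag bits. */
struct rex_bits decode_rex(uint8_t byte)
{
    struct rex_bits bits;
    assert(is_rex_byte(byte));
    bits.w = (byte >> 3) & 1;
    bits.r = (byte >> 2) & 1;
    bits.x = (byte >> 1) & 1;
    bits.b = byte & 1;
    return bits;
}

/* Hypothetical helper: what a 32-bit-mode decoder makes of the same byte
 * (0x40..0x47 are INC r32, 0x48..0x4F are DEC r32). */
const char *legacy_meaning(uint8_t byte)
{
    assert(is_rex_byte(byte));
    return (byte & 0x08) ? "dec r32" : "inc r32";
}
```

so 0x48, the very common REX.W prefix in 64-bit code, decodes as "dec eax" when the same bytes are run as 32-bit code.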
>> and had it been done more like SSE and AVX (without the otherwise
>> needless breaks in binary compatibility), the transition to 64-bits
>> could have been smoother.
>
> See above.
>
> I gotta say, though, the VEX prefix is really clever.
>
yeah.
though it is sad that it wasn't used to address the existing issues,
say, as an alternative to REX.
"hey, now you can use the 16x 64-bit GPRs in 32-bit code!".
>> though, I am less happy with how ABI people responded, making an ABI
>> (SysV/AMD64) which both doesn't really match the performance profile
>> of the CPUs and also is IMO needlessly complex. though, in my case,
>> for my projects, I was able to "simplify" it to a degree... and for
>> the most part, code doesn't notice.
>
> What's "needlessly complex" about it? The only major change from the
> 32-bit ABI is the switch to a register calling convention, which means
> less stack pressure--also long overdue. There are complicated rules for
> what goes where, but that's the nature of the beast; other platforms
> with register calling conventions have roughly the same complexity.
>
there was the Win64 convention, which was a little more sane (and
considerably simpler).
it passes and returns structs by reference, with the address in a
register (apart from structs of exactly 1, 2, 4, or 8 bytes, which go
by value in a register), ...
but, I am talking about SysV/AMD64...
the main ugly needlessly-complex case in the ABI is the rules for
passing structures by-value. small structs (up to 16 bytes) get split
into 8-byte pieces, each piece is classified as integer or SSE, and the
pieces go in separate registers, with some cases of multiple fields
being packed into a single register; anything bigger gets copied onto
the stack, ...
I was just like "screw this" and didn't bother with a lot of this.
ex, struct foo_s { int x, y; float a, b, c, d; long long z, w; };
void foo(struct foo_s foo, int s, int t, int u, int v);
would be passed as (foo is 40 bytes, so the whole struct goes on the
stack):
RDI: s
RSI: t
RDX: u
RCX: v
[RSP+0]: copy of foo (40 bytes)
in my lazy/hacked version:
RDI: &foo
RSI: s
RDX: t
RCX: u
R8: v
and, in Win64:
RCX: &foo
RDX: s
R8: t
R9: u
[RSP+32]: v
however:
struct bar_s { float x, y, z, w; };
void bar(struct bar_s bar, int s, int t, int u, int v);
both versions (of SysV/AMD64):
RDI: s
RSI: t
RDX: u
RCX: v
XMM0: x, y
XMM1: z, w
Win64:
RCX: &bar
RDX: s
R8: t
R9: u
[RSP+32]: v
the ABI also returns small structures decomposed into registers
(RAX/RDX and XMM0/XMM1), whereas my lazy version just passes an address
in a register for the callee to write the returned struct into (unless
the whole struct fits in either RAX or XMM0).
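as a rough sketch of the size side of these rules (only the size side): the functions below guess how a plain C struct gets passed under each ABI from its size alone. this is a deliberate simplification and an assumption on my part about what is worth modelling; the real SysV/AMD64 classification also looks at the field types in each 8-byte piece, unaligned fields, x87 and wide vector types, and so on. the enum and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct foo_s { int x, y; float a, b, c, d; long long z, w; };
struct bar_s { float x, y, z, w; };

enum pass_kind {
    PASS_IN_REGISTERS, /* value itself travels in one or more registers */
    PASS_IN_MEMORY,    /* value is copied into the stack argument area */
    PASS_BY_REFERENCE  /* caller makes a copy and passes its address */
};

/* SysV/AMD64, simplified: structs of up to two 8-byte pieces are split
 * across registers, larger ones are copied onto the stack. */
enum pass_kind sysv_struct_pass(size_t size)
{
    assert(size > 0);
    return size <= 16 ? PASS_IN_REGISTERS : PASS_IN_MEMORY;
}

/* Win64, simplified: only structs of exactly 1, 2, 4, or 8 bytes go by
 * value in a register; everything else goes by reference. */
enum pass_kind win64_struct_pass(size_t size)
{
    assert(size > 0);
    if (size == 1 || size == 2 || size == 4 || size == 8)
        return PASS_IN_REGISTERS;
    return PASS_BY_REFERENCE;
}
```

with these, sysv_struct_pass(sizeof(struct foo_s)) gives PASS_IN_MEMORY and sysv_struct_pass(sizeof(struct bar_s)) gives PASS_IN_REGISTERS, while Win64 gives PASS_BY_REFERENCE for both.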
>> I still remain annoyed that the ABI doesn't provide a place to spill
>> register arguments (forcing the use of temporaries for spilling
>> register arguments, severely complicating things like "va_list",
>> ...).
>
> IIRC, the ABI requires space to be reserved on the stack for register
> parameters so the callee has a place to spill them if needed.
>
> Varargs are passed the same way they were in the 32-bit ABI.
>
you sure you aren't thinking of Win64 (the Windows 64-bit ABI)?...
Win64 provides space to spill into, but SysV/AMD64 (the Linux/OSX/...
ABI) does not.
in SysV/AMD64, the first slot on the stack is used for the first
non-register argument.
like:
if you have integer args A-K, A-F will be passed in regs (SysV/AMD64
passes 6 integer args in registers, vs 4 for Win64), and the first slot
on-stack will be for G.
in a more sanely designed ABI, unused space would be left on the stack
for arguments A-F, with G following immediately afterwards, but that is
not what SysV/AMD64 does.
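the difference in stack layout can be sketched as a pair of offset calculations. this assumes a call with only integer/pointer arguments (no floats, no structs in memory), and gives offsets relative to RSP at the call site, just before CALL pushes the return address. the function and macro names are made up for illustration.

```c
#include <assert.h>

#define ARG_IN_REGISTER (-1)

/* SysV/AMD64: integer args 0..5 live only in RDI, RSI, RDX, RCX, R8, R9;
 * the stack area starts directly with arg 6. */
int sysv_int_arg_offset(int n)
{
    assert(n >= 0);
    return n < 6 ? ARG_IN_REGISTER : 8 * (n - 6);
}

/* Win64: every arg has a stack slot. Slots 0..3 are the caller-reserved
 * home space for RCX, RDX, R8, R9, and arg 4 follows at offset 32. */
int win64_arg_home_offset(int n)
{
    assert(n >= 0);
    return 8 * n;
}
```

so for args A-K, G (index 6) sits at [RSP+0] in SysV/AMD64 with nowhere reserved for A-F, whereas in Win64 the fifth argument sits at [RSP+32], right after the four home slots.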
varargs *do not* work the same as in x86 cdecl.
in x86 cdecl, you just need a pointer to the stack, and walk along
linearly. this will not work with SysV/AMD64, and the algorithm for
walking the argument list is a bit more involved.
however, in Win64, it is possible to spill the register arguments and
then read the argument list similarly to x86 cdecl.
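to show what "a bit more involved" means: in SysV/AMD64 a varargs callee dumps its register arguments into a save area, and va_list is a small record tracking how far into that area it has read plus a pointer into the stack arguments. below is a stand-alone model of that walk for integer arguments only. the field names follow the ABI document, but the struct is a simulation rather than the compiler's real va_list, the fp_offset/XMM path is left out, and next_int_arg is a made-up name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the SysV/AMD64 va_list record. */
struct sysv_va_list {
    unsigned int gp_offset;  /* byte offset of the next unread GP slot */
    unsigned int fp_offset;  /* same for XMM slots; not used in this sketch */
    const unsigned char *overflow_arg_area; /* next stack-passed argument */
    const unsigned char *reg_save_area;     /* where register args were dumped */
};

enum { GP_SAVE_BYTES = 6 * 8 }; /* six integer argument registers */

/* Fetch the next integer argument: take it from the register save area
 * while any GP slots remain, then fall over to the stack area. */
int64_t next_int_arg(struct sysv_va_list *ap)
{
    int64_t value;
    assert(ap != NULL);
    if (ap->gp_offset < GP_SAVE_BYTES) {
        memcpy(&value, ap->reg_save_area + ap->gp_offset, sizeof value);
        ap->gp_offset += 8;
    } else {
        memcpy(&value, ap->overflow_arg_area, sizeof value);
        ap->overflow_arg_area += 8;
    }
    return value;
}
```

with x86 cdecl (or Win64 after spilling the four register args into their home slots) the whole thing collapses to the else branch: one pointer, bumped by the slot size each time.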