On Wednesday, October 17, 2012 9:01:43 PM UTC-4, MitchAlsup wrote:
> On Wednesday, October 17, 2012 6:03:26 PM UTC-5, Paul A. Clayton wrote:
>> What uses are there for an exchange instruction?
>
> Do yourself a favor and do not attempt to join the orthogonal properties
> of swapping data with the property of atomicity.
>
> Exchange instructions should have the singular property that, after the
> instruction has been performed, the registers and memory locations
> have easily computable bit patterns in them. Atomic stuff has many more
> possible bit patterns and many more potential sources of "it no workie".

I thought that exchange would be relatively simple to make
coherence atomic. (Admittedly, simple and correct might mean
uselessly slow. Waiting for the store part to be ready to
commit before performing the load might be somewhat simple
but could significantly hurt performance. In a processor using
OoO mechanisms to provide a very strong consistency model,
speculatively performing the load might not add significant
complexity.)

(x86 does distinguish between coherence atomic and merely
interrupt atomic update operations with the LOCK prefix.
Using something like the acquire/release indicator in
Itanium and ARM AArch64 would probably not be appropriate
because at least some uses of coherence atomic exchange
do not need the memory barrier semantics; again a case of
unnecessarily binding orthogonal features.)
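The separation of atomicity from ordering can be sketched in C11, used here only as a convenient notation and not tied to any ISA mentioned above. All names (pending_events, record_event, drain_events, lock_word, and the lock_* helpers) are hypothetical. The first exchange needs only atomicity, so it asks for memory_order_relaxed; the second is a lock acquire, so it asks for acquire ordering as well.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static _Atomic uint32_t pending_events;  /* hypothetical statistics counter */
static atomic_bool lock_word;            /* hypothetical test-and-set lock  */

/* Count one event; only atomicity matters, not ordering. */
void record_event(void)
{
    atomic_fetch_add_explicit(&pending_events, 1, memory_order_relaxed);
}

/* Take the current count and reset it to zero in one atomic step.
 * No count can be lost or seen twice, yet no barrier is requested. */
uint32_t drain_events(void)
{
    return atomic_exchange_explicit(&pending_events, 0, memory_order_relaxed);
}

/* Here the exchange does need ordering: later accesses to the protected
 * data must not be performed before the lock is seen to be free. */
bool lock_try_acquire(void)
{
    return !atomic_exchange_explicit(&lock_word, true, memory_order_acquire);
}

void lock_acquire(void)
{
    while (!lock_try_acquire()) {
        /* spin until the previous value was "free" */
    }
}

void lock_release(void)
{
    assert(atomic_load_explicit(&lock_word, memory_order_relaxed));
    atomic_store_explicit(&lock_word, false, memory_order_release);
}
```

Whether the relaxed form is actually cheaper depends on the target; on an architecture whose exchange always carries barrier semantics the two forms may compile to the same instruction.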
IBM did recently choose to make certain update
instructions coherence atomic (if the memory address is
properly aligned) in zSeries (S/360 descendants). This
hints that such artificial binding might not be extremely
expensive (at least if the architecture has a strong
memory consistency model), though the guarantee might also
be of greater benefit for the target market of IBM's
mainframes than it would be for most general purpose
processors (the atomic guarantee might allow some software
to remain unchanged while still supporting multiprocessor
operation).

> Given one slot left in the instruction encoding and the ability to put
> an exchange instruction in that slot--don't do it--save it for something
> more valuable later on.

I was inclined to think that exchange was a less useful
instruction, which was why I asked if there were uses that
I had not thought of which could justify its inclusion in
the M88k. (Ivan Godard already mentioned part of the
attraction for a processor using core memory.)

I admit that I like that exchange facilitates certain
optimizations and avoids certain overheads. In coherence
atomic form, it could replace a three-instruction
sequence (ll, sc, branch) and would not have even
theoretical lock-up issues. (Idiom recognition of short
ll/sc sequences could be useful, but would be more
expensive [though also more flexible] than specialized
instructions.)
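The contrast between the retry sequence and a dedicated exchange can be sketched in C11. This is only an analogy: a compare-exchange loop stands in for ll/sc/branch, and the function names exchange_by_retry and exchange_direct are hypothetical. How a compiler lowers either form is target-dependent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Retry-loop form: load, attempt the conditional store, branch back on
 * failure. Another agent writing the location can force extra trips. */
uint32_t exchange_by_retry(_Atomic uint32_t *addr, uint32_t new_value)
{
    assert(addr != NULL);
    uint32_t old = atomic_load_explicit(addr, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(addr, &old, new_value,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
        /* 'old' now holds the value actually found; try again */
    }
    return old;
}

/* Dedicated form: one operation with no software-visible retry. */
uint32_t exchange_direct(_Atomic uint32_t *addr, uint32_t new_value)
{
    assert(addr != NULL);
    return atomic_exchange_explicit(addr, new_value, memory_order_relaxed);
}
```

Both functions return the previous contents of the location, so a caller cannot tell them apart by result; the difference is only in the retry behavior and instruction count.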
Implementing exchange might not be excessively complex
at any level of performance, but if its use is relatively
limited, forcing that complexity on all implementations
could be suboptimal.

I am not entirely convinced of the preciousness of
instruction encodings, especially if one has variable
length encodings. Even in a RISC with 32-bit instructions,
using one of hundreds of minor opcodes under a single major
opcode might not be excessively expensive. Still, since exchange
seems to be less useful _and_ can be effectively synthesized
with modest extra overhead, there does not seem to be a
good case for it. (Of course, I have no experience
with ISA design, much less maintenance, and I have a very
underdeveloped sense of the importance of binary compatibility;
so my perception is not exactly trustworthy in this.)
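For the non-atomic case, the "synthesized with modest extra overhead" point can be sketched as a load, a store, and a register move. This is a sketch in C with a hypothetical function name (exchange_plain); the 'reg' parameter and the return value stand in for a register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Non-atomic register/memory exchange built from ordinary operations.
 * Afterwards the memory word holds the old register value and the
 * result holds the old memory value, with no atomicity implied. */
uint32_t exchange_plain(uint32_t *mem, uint32_t reg)
{
    assert(mem != NULL);
    uint32_t tmp = *mem;  /* load the old memory value       */
    *mem = reg;           /* store the register value        */
    return tmp;           /* move the old value to the "reg" */
}
```

The cost relative to a single exchange instruction is one temporary register and up to two extra instructions.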