A thread has come up in comp.lang.c++ regarding a particular
well-optimized video decoder that uses a double jump sequence:
cmp something
jz target1
; Other code bytes
target1:
jmp target2
; Other code bytes
target2:
The OP was able to take the disassembled code and re-assemble it
after altering it to:
cmp something
jz target2
; Other code bytes
target1:
jmp target2
; Other code bytes
target2:
... indicating that target2 was not out of range of the conditional
jump. The OP asked why such a double jump would be present.
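For context on the "out of range" point: a short conditional jump on
x86 is two bytes (opcode plus a signed 8-bit displacement), and the
displacement is counted from the end of the jump instruction, so it
reaches only -128..+127 bytes. A jump to a jump is one classic way to
get around that limit. Below is a minimal C sketch of the range
check. The function name and constant are hypothetical, made up for
illustration, and this is not taken from the decoder or from any
assembler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of a short conditional jump: one opcode byte plus rel8.
   (Hypothetical constant name, for illustration only.) */
enum { SHORT_JCC_SIZE = 2 };

/* Returns true if a short Jcc placed at jcc_addr can reach target directly.
   The rel8 displacement is measured from the address of the instruction
   that follows the jump, and must fit in a signed 8-bit value. */
bool short_jcc_reaches(uint32_t jcc_addr, uint32_t target)
{
    int64_t next_insn = (int64_t)jcc_addr + SHORT_JCC_SIZE;
    int64_t disp = (int64_t)target - next_insn;
    return disp >= INT8_MIN && disp <= INT8_MAX;
}
```

If a check like this fails, the assembler has to use the longer near
form (available on the 386 and later) or the code has to hop through
an intermediate JMP. Since the OP's retargeted jz re-assembled fine,
range does not appear to be the explanation in this case.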
I responded with the possibility that the algorithm uses
self-modifying code (SMC), and the hard JMP is there to flush the
pipeline, so that any recent changes to the L1 instruction cache
will be re-loaded.
Another responder said that's no longer necessary in modern
Intel/AMD x86/x64 CPUs, as they all snoop the linear addresses for
SMC, and will automatically flush and refill the instruction
pipeline without the explicit need for a JMP instruction, as was
required on the 486 and earlier CPUs.
Is this true? I could not find a reference to that feature in the
Intel IA-32/Intel64 manual:
https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
The closest I found is on page 3710, and it reads:
On the Intel486 processor, a write to an instruction in the cache
will modify it in both the cache and memory. If the instruction was
prefetched before the write, however, the old version of the
instruction could be the one executed. To prevent this problem, it
is necessary to flush the instruction prefetch unit of the Intel486
processor by coding a jump instruction immediately after any write
that modifies an instruction.
The P6 family and Pentium processors, however, check whether a write
may modify an instruction that has been prefetched for execution.
This check is based on the linear address of the instruction. If
the linear address of an instruction is found to be present in the
prefetch queue, the P6 family and Pentium processors flush the
prefetch queue, eliminating the need to code a jump instruction
after any writes that modify an instruction.
-----
NOTE
The check on linear addresses described above is not in practice a
concern for compatibility. Applications that include self-modifying
code use the same linear address for modifying and fetching the
instruction. System software, such as a debugger, that might possibly
modify an instruction using a different linear address than that
used to fetch the instruction must execute a serializing operation,
such as IRET, before the modified instruction is executed.
I know you don't want to reply to me ... but I am still asking for
help. As I understand it, and my knowledge may be specific to the
old 486-and-prior way of doing things, whenever you use SMC you
always add a JMP $+2 in order to refill the pipeline with any
changes that may now be in the instruction cache, but not in the
pre-decoded pipeline.
Thank you,
Rick C. Hodgin