zero opcode

Joe keane

Mar 26, 2013, 3:02:38 PM

PDP-11:
000000 -> hlt
VAX:
00 -> hlt
6502:
00 -> brk

Mark Thorson

Mar 26, 2013, 6:15:49 PM

Did you want to extend this list, or were you
making some comment?

I suppose when people were programming without
any software tools, making zero = halt may have
been useful for catching a jump into memory
that had been initialized to zero. I'm not sure
it has any value these days.

When I worked at Weitek, there was a custom of
initializing memory to (hexadecimal) DEAD.
That distinguished initialized but unused memory
nicely in the various debugging tools we used.

Robert Wessel

Mar 26, 2013, 6:26:06 PM

On Tue, 26 Mar 2013 14:15:49 -0800, Mark Thorson <nos...@sonic.net>
wrote:

On the 6502 the 00 for BRK meant you could patch a fusible link PROM
by blowing the remaining fuses at the point where you wanted to insert
the BRK, and then put your patch code in the BRK handler. Never
actually saw it used like that personally, though...
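
The trick depends on PROM programming moving bits in one direction only. A minimal sketch, assuming a part whose blown fuses read as 0 (some parts read the other way, in which case this does not apply); the function names are made up for illustration:

```python
BRK = 0x00  # 6502 BRK opcode


def can_overprogram(old: int, new: int) -> bool:
    """True if `new` is reachable from `old` by only clearing bits (1 -> 0)."""
    return (old & new) == new


def patch_with_brk(rom: bytearray, addr: int) -> None:
    """Turn the opcode at `addr` into BRK by clearing every remaining bit."""
    # 0x00 is reachable from any byte value, which is the whole point.
    assert can_overprogram(rom[addr], BRK)
    rom[addr] = BRK
```

Any byte can be driven to 0x00 this way, but 0x00 can never be turned back into a useful opcode, so the patch itself has to live wherever the BRK vector points.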

Paul A. Clayton

Mar 26, 2013, 7:15:05 PM

On Mar 26, 5:12 pm, Mark Thorson <nos...@sonic.net> wrote:
> Joe keane wrote:
>
> > PDP-11:
> > 000000 -> hlt
> > VAX:
> > 00 -> hlt
> > 6502:
> > 00 -> brk
>
> Did you want to extend this list, or were you
> making some comment?

It looks like Alpha used the zero primary
opcode for CALL_PAL.

PA-RISC seems to have used the zero primary opcode
for "system_op" with all zeros with 5-bit immediate
as BREAK.

MIPS uses the zero primary opcode plus 0b001101 in
bits 0 through 5 (with a 20 bit arbitrary "code"
value) for break, but uses all zeros as the
preferred NOP (shift left logical using zero
registers) with other special values encoding
superscalar NOP (how quaint! :-) and Execution
Hazard Barrier. I.e., MIPS can slide into
"executable data" with preceding zero-initialized
memory.
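
A rough sketch of the MIPS case, decoding only the fields mentioned above (primary opcode in bits 31..26, function in bits 5..0, 20-bit break code in bits 25..6). The function name is hypothetical and the rs field is ignored for brevity:

```python
def decode_mips_special(word: int) -> str:
    """Decode just enough of a 32-bit MIPS word to tell NOP from BREAK."""
    opcode = (word >> 26) & 0x3F
    funct = word & 0x3F
    if opcode != 0:
        return "not SPECIAL"
    if funct == 0b001101:
        code = (word >> 6) & 0xFFFFF  # arbitrary 20-bit "code" value
        return f"break {code:#x}"
    if funct == 0:  # SLL rd, rt, sa
        rt = (word >> 16) & 0x1F
        rd = (word >> 11) & 0x1F
        sa = (word >> 6) & 0x1F
        if rd == 0 and rt == 0:
            # Shifts of the zero register are the NOP family.
            return {0: "nop", 1: "ssnop", 3: "ehb"}.get(sa, f"sll $0, $0, {sa}")
        return f"sll ${rd}, ${rt}, {sa}"
    return "other SPECIAL"
```

With this, the all-zero word comes back as "nop", which is exactly why zero-filled memory in front of real code acts as a slide.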

Power makes all zero instructions permanently
illegal: "An instruction consisting entirely of
binary 0s is illegal, and is guaranteed to be
illegal in all future versions of this
architecture."

OpenRISC has the zero primary opcode be a jump,
so executing an all-zero instruction will stall
the processor (jump to the jump-with-offset-zero
instruction).
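
To see why the OpenRISC case loops, here is a small sketch of the jump target computation, assuming the l.j format as I understand it (opcode 0x00 in the top six bits, 26-bit signed word offset below). It ignores the delay slot, and the function name is made up:

```python
def l_j_target(pc: int, word: int) -> int:
    """Target of an OpenRISC l.j instruction located at address `pc`."""
    assert (word >> 26) == 0x00, "not an l.j instruction"
    n = word & 0x03FFFFFF           # 26-bit immediate
    if n & 0x02000000:              # sign-extend from bit 25
        n -= 0x04000000
    return (pc + (n << 2)) & 0xFFFFFFFF
```

For the all-zero word the offset is zero, so the target equals the address of the jump itself.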

> I suppose when people were programming without
> any software tools, making zero = halt may have
> been useful for catching a jump into memory
> that had been initialized to zero. I'm not sure
> it has any value these days.

Making all zero either a perpetually undefined
instruction or an explicit trap, syscall, or the
like is still useful. It not only catches trying
to execute zeroed memory but also the common case
of a zero data value. Mitch Alsup has discussed
here his ISA which maintains many more common data
patterns as undefined instructions to help protect
against executing data.

Using such for a valid instruction avoids wasting
opcode space, but introduces the possibility of
executing privileged code on behalf of an
application when the application is misbehaving.
A debug break might be a little safer than a
break used for a syscall.

A wait-for-interrupt (halt) instruction might be
somewhat safe, but slightly less desirable when
I/O or other processors might overwrite memory
that would be helpful in determining the cause of
failure; such would also waste execution resources.
Using such for a SCHEDULE instruction (where all
zeros might be a terminate thread instruction)
might be interesting but would seem to make
detecting a failure and determining its cause even
more difficult (register values would be lost and
the detection of the error could be greatly
delayed).

I would think such would be unnecessary given the
ability to exclude execute permission from a page;
but I can sort of understand using such for
defense in depth (somewhat like having the low
address area being reserved to catch stray
references relative to a zero null-pointer).

> When I worked at Weitek, there was a custom of
> initializing memory to (hexadecimal) DEAD.
> That distinguished initialized but unused memory
> nicely in the various debugging tools we used.

I thought 0xdeadbeef was traditional (for 32-bit
systems).

Paul A. Clayton

Mar 26, 2013, 10:06:43 PM

On Mar 26, 7:15 pm, "Paul A. Clayton" <paaronclay...@gmail.com> wrote:
[snip]

> Making all zero either a perpetually undefined
> instruction or an explicit trap, syscall, or the
> like is still useful. It not only catches trying
> to execute zeroed memory but also the common case
> of a zero data value. Mitch Alsup has discussed
> here his ISA which maintains many more common data
> patterns as undefined instructions to help protect
> against executing data.

I found this mention (just in case someone is
interested):

In Message-ID: <55b753cd-2f72-4eb4-a503-
cb677d...@googlegroups.com>,
11 Aug 2012, Mitch Alsup wrote:
> In fact my current instruction set has OPcode
> faults for zeros, small negative numbers, and
> for the floating point numbers between 0.01
> and 100. Positives close to zero (in a 32-bit
> or 64-bit sense) are illegal opcodes, negatives
> of similar ilk are illegal opcodes, and the
> typical range of FP data are also illegal opcodes.

MitchAlsup

Mar 27, 2013, 12:44:56 AM

On Tuesday, March 26, 2013 6:15:05 PM UTC-5, Paul A. Clayton wrote:
> Mitch Alsup has discussed
> here his ISA which maintains many more common data
> patterns as undefined instructions to help protect
> against executing data.

And in particular, opcodes near integer zero are decoded as unimplemented,
and opcodes near FP one (~10**-5 to ~10**5) are also decoded as unimplemented,
in both positive and negative senses.

Thus jumping into data is highly likely to result in attempting to execute
an unimplemented instruction.
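
A sketch of what such a check amounts to for a 32-bit word. This is not the actual decoder: the integer window (`int_window`) is an assumed value picked for illustration, and the FP bounds are just the approximate range quoted above:

```python
import struct


def looks_like_data(word: int, int_window: int = 1 << 16,
                    fp_lo: float = 1e-5, fp_hi: float = 1e5) -> bool:
    """True if a 32-bit word resembles a small integer or a 'typical' float."""
    word &= 0xFFFFFFFF
    # Interpret as a two's-complement integer near zero (either sign).
    signed = word - (1 << 32) if word & 0x80000000 else word
    if -int_window <= signed < int_window:
        return True
    # Reinterpret the same bits as an IEEE 754 single.
    (as_float,) = struct.unpack("<f", struct.pack("<I", word))
    # NaN and infinity fail this comparison and so count as "not data-like".
    return fp_lo <= abs(as_float) <= fp_hi
```

An ISA that decodes every word for which this returns True as unimplemented gives up some encoding space in exchange for trapping on most plausible data values.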

Mitch

Paul A. Clayton

Mar 27, 2013, 2:12:25 AM

On Mar 26, 5:12 pm, Mark Thorson <nos...@sonic.net> wrote:

> Joe keane wrote:
>
>> PDP-11:
>> 000000 -> hlt
>> VAX:
>> 00 -> hlt
>> 6502:
>> 00 -> brk
>
> Did you want to extend this list, or were you
> making some comment?

Some more zero opcodes/instructions:

CDC 6600 zero opcode: program stop instruction
("This instruction stops the Central Processor at
the current step in the program. An exchange Jump
is necessary to restart the Central Processor.")

On SPARC, the zero opcode is an unimplemented
instruction (generating an exception).

Fairchild CLIPPER has the zero opcode mapped to
a 16-bit nop (with 8 bits ignored).

Motorola 88k uses the zero opcode for a load
into the extended register; since x0 is a
hardwired zero register, an all zero instruction
is a nop.

Motorola M-Core has the zero opcode as a breakpoint
instruction (16-bit all zeros).

Renesas RX has an 8-bit all-zero break instruction.

GreenArrays F18 stack processor uses the zero
opcode for return.

Itanium assigns break to the zero major and minor
opcode (with 21-bit immediate and 6-bit predicate
register ID--p0 is always true).

Infineon TriCore assigns the zero major and minor
opcode to nop.

Renesas M32R assigns the zero opcode to subtract
with overflow checking; so the all zero instruction
clears R0 (and the condition bit).

Samsung CalmRISC16 uses the zero opcode for add
immediate; so the all zero instruction is a nop
(R0=R0+0b0000000).

SEL32 has the zero opcode generating a halt.

Berkeley RISC V Compressed has the all zero
instruction as a 16-bit jump to target in R0
(with R0 hardwired to zero). If the memory
at address zero contained the (16-bit) value
zero and had execute permission, then this
would result in an unending loop as with
OpenRISC.

Lattice Semiconductor's LatticeMicro has the
zero opcode assigned to shift right unsigned
immediate; so all zero is a nop.

Xilinx PicoBlaze assigns load immediate into
register to opcode zero; so an all zero
instruction clears R0.

It looks like the Motorola 68k might leave
opcode zero undefined (so generating an
exception), but I am not certain.

It is disappointing how many ISAs define an
all zero instruction as a nop.

Andy (Super) Glew

Mar 27, 2013, 2:13:56 AM

On 3/26/2013 3:15 PM, Mark Thorson wrote:
> Joe keane wrote:
>>
>> PDP-11:
>> 000000 -> hlt
>> VAX:
>> 00 -> hlt
>> 6502:
>> 00 -> brk
>
> Did you want to extend this list, or were you
> making some comment?
>
> I suppose when people were programming without
> any software tools, making zero = halt may have
> been useful for catching a jump into memory
> that had been initialized to zero. I'm not sure
> it has any value these days.

I disagree.

If 0 is a NOP, or some other instruction that does not trap, then you
have made it quite a lot easier for an attacker to break in.

Much of memory is zero initialized.

If the attacker can find a bug that causes a wild transfer to a garbaged
address, then, if that address does not trap, nor the address after it,
nor... the attacker may eventually slide into code that he can control.

Google "NOP slide".
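
A toy model of the difference, with a made-up one-byte "ISA" in which 0xFF stands in for attacker-controlled code and every other byte simply falls through to the next address. It is only meant to illustrate the argument, not any real machine:

```python
PAYLOAD = 0xFF  # stand-in for code the attacker controls


def run(memory: bytes, pc: int, zero_traps: bool) -> str:
    """Execute from `pc` until a trap, the payload, or the end of memory."""
    while pc < len(memory):
        op = memory[pc]
        if op == PAYLOAD:
            return f"payload reached at {pc}"
        if op == 0x00 and zero_traps:
            return f"trap at {pc}"
        pc += 1  # zero (and anything else in this toy) acts as a NOP
    return "ran off end of memory"


# 64 bytes of zero-initialized memory followed by the payload.
memory = bytes(64) + bytes([PAYLOAD])
```

A wild jump to address 10 slides all the way to the payload when zero is a NOP, and stops at address 10 when zero traps.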

--
The content of this message is my personal opinion only. Although I am
an employee (currently of MIPS Technologies, which has been acquired by
Imagination Technologies; in the past of companies such as Intellectual
Ventures and QIPS, Intel, AMD, Motorola, and Gould), I reveal this only
so that the reader may account for any possible bias I may have towards
my employer's products. The statements I make here in no way represent
my employers' positions on the issue, nor am I authorized to speak on
behalf of my employers, past or present.

Robert Wessel

Mar 27, 2013, 3:30:36 AM

S/360 has opcode zero as an invalid instruction, and that's been made
formal in the ISA definition.

x86, of course, encodes a number of ADDs at 0.
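
For the x86 case, a minimal sketch of decoding opcode 00 (ADD r/m8, r8) with a ModRM byte, assuming 32-bit addressing and handling only the plain register-indirect forms. The function name is hypothetical:

```python
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
# With mod=00, rm=100 means a SIB byte follows and rm=101 means disp32.
RM32 = ["eax", "ecx", "edx", "ebx", None, None, "esi", "edi"]


def decode_add_00(code: bytes) -> str:
    """Decode `00 /r` for mod=00 register-indirect operands only."""
    opcode, modrm = code[0], code[1]
    if opcode != 0x00:
        raise ValueError("not opcode 00")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or RM32[rm] is None:
        raise NotImplementedError("form not covered by this sketch")
    return f"add byte ptr [{RM32[rm]}], {REG8[reg]}"
```

Two zero bytes therefore decode as an ADD of AL into the byte at [EAX], which faults only if EAX happens to point somewhere unwritable.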

Terje Mathisen

Mar 27, 2013, 5:27:41 AM

Andy (Super) Glew wrote:
> I disagree.
>
> If 0 is a NOP, or some other instruction that does not trap, then you
> have made it quite a lot easier for an attacker to break in.

Right.

>
> Much of memory is zero initialized.
>
> If the attacker can find a bug that causes a wild transfer to a garbaged
> address, then, if that address does not trap, nor the address after it,
> nor... the attacker may eventually slide into code that he can control.
>
> Google "NOP slide".

I wrote an "executable ascii" encoder many years ago, going for the
smallest code with the least possible amount of self-modification (i.e.
a single backwards branch, since it is impossible to write any kind of
loop in x86 using only 7-bit characters).

I wanted my code to survive reformatting, so the initial bootstrap code
had to work across a possible one- or two-byte newline. The solution was
to use a NOP slide as the jump target. (My NOP was 'E', which is INC BP
afair; I did not use BP for anything in that bootstrap.)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Megol

Mar 27, 2013, 7:01:52 AM

On Wednesday, March 27, 2013 7:12:25 AM UTC+1, Paul A. Clayton wrote:
> Some more zero opcodes/instructions:
> (snip)...

> It looks like the Motorola 68k might leave
> opcode zero undefined (so generating an
> exception), but I am not certain.

Any instruction starting with a zero byte is an ORI (OR immediate) instruction. A two-byte zero instruction is an ORI.B #imm8, D0 instruction, with the following bytes being nn (ignored by the processor) and imm8.
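
A small sketch of that decoding, assuming the usual 68k layout for the ORI group (high byte zero, size in bits 7..6, addressing mode in bits 5..3, register in bits 2..0) and covering only the byte-sized, data-register form. The function name is made up:

```python
def decode_ori(opword: int, ext: int) -> str:
    """Decode ORI.B #imm8, Dn from an opcode word and its extension word."""
    if opword >> 8 != 0:
        raise ValueError("high byte is not zero, so not in the ORI group")
    size = (opword >> 6) & 3
    mode = (opword >> 3) & 7
    reg = opword & 7
    if size != 0 or mode != 0:
        raise NotImplementedError("only ORI.B to a data register is handled")
    imm8 = ext & 0xFF  # upper byte of the extension word is ignored
    return f"ori.b #${imm8:02x}, d{reg}"
```

Four zero bytes come out as ORI.B #$00, D0, which changes no register contents, so on the 68k zeroed memory also behaves much like a run of NOPs.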

> It is disappointing how many ISAs define an
> all zero instruction as a nop.

Yeah but there are many much more disappointing features in standard ISAs IMHO.

Casper H.S. Dik

Mar 27, 2013, 7:26:28 AM

"Paul A. Clayton" <paaron...@gmail.com> writes:

>I thought 0xdeadbeef was traditional (for 32-bit
>systems).

We use 0xdeadbeef in the Solaris kernel memory allocator
in debugging mode (0xdeadbeef is for not allocated memory;
0xbaddcafe for just allocated memory)
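
The general idea can be sketched as follows. This is not the Solaris allocator, just a toy illustration with hypothetical helper names, using the two patterns mentioned above and big-endian words so the bytes read naturally in a hex dump:

```python
import struct

FREE_PATTERN = 0xDEADBEEF    # memory that is not allocated
UNINIT_PATTERN = 0xBADDCAFE  # memory just allocated, not yet written


def fill(buf: bytearray, start: int, nwords: int, pattern: int) -> None:
    """Write `pattern` into `nwords` consecutive 32-bit words."""
    for i in range(nwords):
        struct.pack_into(">I", buf, start + 4 * i, pattern)


def classify(buf: bytearray, offset: int) -> str:
    """Guess the state of the word at `offset` from its contents."""
    (word,) = struct.unpack_from(">I", buf, offset)
    return {
        FREE_PATTERN: "not allocated",
        UNINIT_PATTERN: "allocated but uninitialized",
    }.get(word, "in use")
```

A pointer or counter that turns out to contain one of these values in a crash dump points straight at a use-after-free or an uninitialized read.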

Casper

Noob

Mar 27, 2013, 7:49:39 AM

You sound like a bot. Is this some kind of Turing test?

Paul A. Clayton

Mar 27, 2013, 10:55:16 AM

On Mar 27, 7:01 am, Megol <golem...@gmail.com> wrote:
> On Wednesday, March 27, 2013 7:12:25 AM UTC+1, Paul A. Clayton wrote:
>> Some more zero opcodes/instructions:
>> (snip)...
>> It looks like the Motorola 68k might leave
>> opcode zero undefined (so generating an
>> exception), but I am not certain.
>
> Any instruction starting with a zero byte is
> an ORI (OR immediate) instruction. A two-byte
> zero instruction is an ORI.B #imm8, D0 instruction,
> with the following bytes being nn (ignored by the
> processor) and imm8.

Thank you very much for the quick correction! I
had just skimmed through a ColdFire manual and did
not see any zero instruction, but it was getting
late and I was sufficiently unmotivated to take
the time to do a more thorough search.

I wonder if there is a resource on the Internet
that has organized presentations of various ISA's
instruction encodings (possibly with some
commentary on rationales, strengths and weaknesses).
Creating such even for just the 20 (or so) most
"popular" ISAs would be quite a task. If the
format was sufficiently formal, it would even be
possible to automatically generate disassemblers.
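
As a rough idea of what "sufficiently formal" could mean, an encoding can be written as a list of (mask, match, mnemonic) entries and fed to a generic matcher. The table below is a tiny hand-written MIPS subset, an assumption for illustration rather than an extract from any existing resource:

```python
from typing import NamedTuple, Optional


class Pattern(NamedTuple):
    mask: int      # bits that must be compared
    match: int     # required value of those bits
    mnemonic: str


# Most specific entries first, so "nop" wins over the general "sll".
MIPS_SUBSET = [
    Pattern(0xFFFFFFFF, 0x00000000, "nop"),
    Pattern(0xFC00003F, 0x0000000D, "break"),
    Pattern(0xFFE0003F, 0x00000000, "sll"),
]


def disassemble(word: int, table: list[Pattern]) -> Optional[str]:
    """Return the mnemonic of the first matching pattern, or None."""
    for p in table:
        if (word & p.mask) == p.match:
            return p.mnemonic
    return None
```

The matcher knows nothing about MIPS; only the table does, so the same function would serve any fixed-width ISA described in this form.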

>> It is disappointing how many ISAs define an
>> all zero instruction as a nop.
>
> Yeah but there are many much more disappointing
> features in standard ISAs IMHO.

While many people would agree with that statement,
the statement is crying out for at least a short
list of the most disappointing features of some
common ISAs.

John Levine

Mar 27, 2013, 5:35:23 PM

>>> PDP-11:
>>> 000000 -> hlt

PDP-8:
0000 -> AND 0 (AND accumulator with location zero)

PDP-10:
000 -> UUO0 (does the same thing as system call instructions,
but reserved by convention)

--
Regards,
John Levine, jo...@iecc.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. http://jl.ly

Robert Wessel

Mar 27, 2013, 5:56:55 PM

Instruction decoding is amongst the more trivial things a disassembler
has to do. The problems of identifying code, data, routine boundaries
and control flows are very much harder (in fact impossible in general,
although liberal application of various heuristics can go a long way
towards building a useful tool).