You seem to be mixing up a range of different things here. (That is not
surprising - ARM has /seriously/ confusing terminology here, with
similar names for different things, different names for similar things,
different numbers for instruction sets, architectures, and implementations.)
The answers you got from Richard are all correct - I am just wording
things a little differently, to see if that helps your understanding.
In the beginning, ARM used a 32-bit fixed-size instruction set. It was
fast, but not very compact. Most instructions take three register
operands (so you have "Rx = Ry + Rz"), and with 16 registers each
operand needs 4 bits - that adds up to 12 bits of the instruction just
for the registers.
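To make that arithmetic concrete, here is a rough sketch in C of how the
three register numbers pack into a classic 32-bit ARM data-processing
instruction. The field positions follow the usual ADD (register) layout,
but treat the exact constants as illustrative rather than as a reference:

    #include <stdint.h>

    /* Sketch only: pack "Rd = Rn + Rm" into a 32-bit ARM instruction.
       Each register number is 4 bits, so the three operands alone take
       up 12 of the 32 bits. */
    static uint32_t arm_add(uint32_t rd, uint32_t rn, uint32_t rm)
    {
        uint32_t insn = 0xE0800000u;   /* ADD, "always" condition     */
        insn |= (rn & 0xFu) << 16;     /* first source register       */
        insn |= (rd & 0xFu) << 12;     /* destination register        */
        insn |= (rm & 0xFu) << 0;      /* second source register      */
        return insn;                   /* 3 x 4 = 12 register bits    */
    }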
When they started getting serious in the microcontroller market, this
extra space was a cost - it meant code was bigger than for other
microcontrollers, and you needed a bigger and more expensive flash. So
they invented the "Thumb" mode. Here, instructions are 16-bit only and
so much more compact. These were a subset of the ARM instructions -
instructions only cover two registers, and for many operations these
could only come from the first 8 registers (r0-r7), so the register
operands needed only 6 bits of the instruction. The CPU decoder expanded
the Thumb instructions into 32-bit ARM instructions before executing them.
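For comparison, the same kind of sketch for a 16-bit Thumb-1 instruction -
two register fields, each limited to r0-r7, so 3 bits per register. The
base opcode below is the Thumb "ANDS Rd, Rm" form as I remember it, but
again take it as an illustration of the shape, not as a reference:

    #include <stdint.h>

    /* Sketch only: a 16-bit Thumb-1 ALU instruction ("ANDS Rd, Rm").
       Only two register fields, each restricted to r0-r7, so the
       operands need just 2 x 3 = 6 bits of the 16-bit instruction. */
    static uint16_t thumb_ands(uint16_t rd, uint16_t rm)
    {
        uint16_t insn = 0x4000u;       /* ALU-operation group, op = AND */
        insn |= (rm & 0x7u) << 3;      /* source register (r0-r7)       */
        insn |= (rd & 0x7u) << 0;      /* destination register (r0-r7)  */
        return insn;
    }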
Thumb gave more compact code, but slower code, and there were some
things you simply could not do in Thumb mode. So the microcontrollers
had to support both instruction sets, and you would switch sets in code
(interrupts, for example, would be in ARM mode for speed). It was all a
bit of a mess.
So ARM invented "Thumb-2". This was a set of 32-bit instructions that
can be mixed along with the slightly re-named "Thumb-1" 16-bit
instructions. Sometimes it is now all called "Thumb2", or just "Thumb"
(since for the vast majority of ARM programmers, 16-bit-only Thumb was
ancient history from before they started using the devices). 16-bit and
32-bit instructions are called the "narrow" and "wide" forms. These new
32-bit additions to Thumb meant you could get mixed code that was
compact, fast, and covered all needs (including full access to all 16
registers).
So for the Cortex-M devices, Thumb2 is the only set needed - they don't
support "old-style" 32-bit ARM instructions. You no longer need to
worry about changing states or modes; you are always in "Thumb2" mode.
Different Cortex-M processors support different subsets and extensions
of this, so it is important that you inform your compiler of exactly
which device you are using. Do that, and you don't really need to care
about the details for the most part - the compiler does the work. But
it can be interesting to know what is going on.
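As a rough example of what "inform your compiler" looks like in practice,
assuming GCC for bare-metal ARM (the core name here is just an example -
pick the one matching your device):

    /* Example build command:

           arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -O2 -c example.c

       With -mcpu set, the compiler knows exactly which Thumb-2 subset
       and extensions it is allowed to use, and it mixes narrow and wide
       encodings as it sees fit. */
    int scale(int x)
    {
        return 5 * x + 3;
    }

The same idea applies to other toolchains - the point is that the device
selection drives the instruction-set choice, not you.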
There are also Cortex-A devices for applications processors - basically,
running Linux (including Android) or, occasionally, Windows. And there
are Cortex-R devices for safety-critical systems. But I expect you are
talking about Cortex-M devices here.