Add 1, Subtract 1


Schmitt, Michael

Mar 10, 2021, 6:27:09 PM
to ASSEMBL...@listserv.uga.edu
I was taught long ago to add 1 to a register using LA r#,1(,r#) and to subtract 1 using BCTR r#,0.

Is the fastest way now to use AHI r#,1 and AHI r#,-1?
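
For reference, the four forms being compared might be written as below. This is only a sketch: register 3 is an arbitrary choice, and the remarks summarize documented side effects rather than timings.

```hlasm
* Sketch only: register 3 is an arbitrary choice.
         LA    3,1(,3)       Add 1; CC unchanged; AMODE truncation
         BCTR  3,0           Subtract 1; CC unchanged; no branch
         AHI   3,1           Add 1; sets CC
         AHI   3,-1          Subtract 1; sets CC
```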

Charles Mills

Mar 10, 2021, 7:56:10 PM
to ASSEMBL...@listserv.uga.edu
1. "Instruction speed" is not exactly a concept anymore due to pipelining.
An instruction can take literally zero effective time because it overlaps
with something else such as a wait for cache. "Does an AHI or an LA take
longer?" is no longer a question that has an answer.

2. AHI is wonderful but not exactly the functional equivalent of LA and
BCTR. LA can add two registers plus an offset, AHI only one register. AHI
sets the condition code, which may be good for your situation, or bad. BCTR
is a branch instruction, which may affect cache performance, although modern
CPUs are probably smart enough to realize that BCTR x,0 is not really a
branch.

3. Code readability is much more important than instruction speed. CPUs are
fast; programmers are relatively very, very slow and error-prone. I like AHI
for its readability. LHI says what it does: adds an immediate value to a
register. Would a novice read BCTR as subtracting one? LHI also takes
equates, which improve maintainability. Prefix_Offset EQU
TablePrefix-TableStart / AHI R1,Prefix_Offset is a lot clearer than BCTR
R1,0 / BCTR R1,0.
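
A minimal sketch of the equate approach, assuming a hypothetical 2-byte prefix that sits immediately before the table (the layout is an illustration, not taken from any real program). It is a fragment: in a real program the data definitions and the instruction would sit in their usual separate places.

```hlasm
* Hypothetical layout: a 2-byte prefix directly before the table.
TablePrefix   DS    XL2            Prefix preceding the entries
TableStart    EQU   *              First table entry starts here
Prefix_Offset EQU   TablePrefix-TableStart      Evaluates to -2
*
* With R1 -> TableStart, back up to the prefix:
         AHI   1,Prefix_Offset     R1 -> TablePrefix; sets CC
```

If the prefix length ever changes, only the DS needs editing; a pair of BCTR instructions would have to be found and recounted by hand.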

Charles

Steve Smith

Mar 10, 2021, 7:58:19 PM
to ASSEMBL...@listserv.uga.edu
If you have to ask, it doesn't matter.

sas



Mike Hochee

Mar 10, 2021, 8:00:05 PM
to ASSEMBL...@listserv.uga.edu
Hi Michael,

You may want to check out the IBM Z optimization primers if you're super serious about this. Here's a link, and I believe there's a z15 version as well in 2020... https://community.ibm.com/HigherLogic/System/DownloadDocumentFile.ashx?DocumentFileKey=d1cdb394-0159-464c-92a3-3f74f8c545c4

I did a little very rough benchmarking a few years ago on a z13 and found that all of the instructions you mentioned performed comparably. I suspect the same is true today.

HTH,
Mike


Charles Mills

Mar 10, 2021, 8:00:39 PM
to ASSEMBL...@listserv.uga.edu
Sorry: LHI in the below should be AHI, of course.

Charles Mills

Mar 10, 2021, 8:09:12 PM
to ASSEMBL...@listserv.uga.edu
+1

Dr. Shum's papers are the best if you are serious about this stuff. There
are other factors such as data placement and instruction sequence that are
MUCH more significant than "how long does BCTR take?" Dr. Shum's paper will
let you understand what the heck is going on in a Z CPU under the covers. It
is pretty amazing!

I would not +1 on benchmarking however. The result of a particular benchmark
may say more about the benchmark than about the instructions benchmarked.
"AHI outperforms LA" may be true for a particular benchmark because of how
it is organized; it may not be true for other cases. But yes, +1 on "all of
them perform comparably." Other factors are what matter, not AHI or LA.

Charles



Robin Vowels

Mar 10, 2021, 8:20:55 PM
to ASSEMBL...@listserv.uga.edu
From: "Schmitt, Michael" <michael...@DXC.COM>
Sent: Thursday, March 11, 2021 10:26 AM


> I was taught long ago to add 1 to a register using LA r#,1(,r#) and to subtract 1 using BCTR r#,0.

> Is the fastest way now to use AHI r#,1 and AHI r#,-1?

LA and BCTR are good when you don't want to change the CC.
BCTR R,0 is especially good because it needs only 2 bytes.
LA is good for small values.



Mark Hammack

Mar 10, 2021, 9:55:31 PM
to ASSEMBL...@listserv.uga.edu
+1 on this. Readability/maintainability is much more important than
relative instruction speed on modern systems.


*Mark*

Paul Gilmartin

Mar 10, 2021, 11:56:48 PM
to ASSEMBL...@listserv.uga.edu
> On 2021-03-10, at 18:21:11, Robin Vowels wrote:
>
> From: "Schmitt, Michael"
> Sent: Thursday, March 11, 2021 10:26 AM
>
>
>> I was taught long ago to add 1 to a register using LA r#,1(,r#) and to subtract 1 using BCTR r#,0.
>
>> Is the fastest way now to use AHI r#,1 and AHI r#,-1?
>
> LA and BCTR are good when you don't want to change the CC.
> BCTR R,0 is specially good because it needs only 2 bytes.
> LA is good for small values.
>
"Small" means when you can tolerate truncation to 24 or 31 bits.

You can use LA to subtract 1 if you have a negative value
in a base register, subject to the same limit.

-- gil

rob...@dodo.com.au

Mar 11, 2021, 12:10:22 AM
to ASSEMBL...@listserv.uga.edu
Why would you do that, when BCTR R,0 will do it for free
with even zero as the source register (you get true -1
as a result). And besides, BCTR requires only 2 bytes,
LA needs 4.

Mike Hochee

Mar 11, 2021, 12:20:27 AM
to ASSEMBL...@listserv.uga.edu
Actually, I think 'small' means a 12 bit unsigned binary integer for LA and 20 bits for LAY. And with LAY you can simply express the displacement as a negative.
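
As a sketch of the LAY form (register 3 is arbitrary, and this assumes a machine with the long-displacement facility): the displacement is a signed 20-bit value, so it can range from -524288 to 524287. The result is still truncated according to the addressing mode, just as with LA.

```hlasm
* Sketch only: register 3 is an arbitrary choice.
         LAY   3,-1(,3)      Subtract 1; CC unchanged
         LAY   3,-42(,3)     Subtract 42; CC unchanged
         LAY   3,4096(,3)    Add 4096; beyond LA's 0-4095 range
```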

Paul Gilmartin

Mar 11, 2021, 8:38:50 AM
to ASSEMBL...@listserv.uga.edu
On 2021-03-10, at 22:10:14, rob...@dodo.com.au wrote:
> ...
>>
>> You can use LA to subtract 1 if you have a negative value
>> in a base register, subject to the same limit.
>
> Why would you do that, when BCTR R,0 will do it for free
> with even zero as the source register (you get true -1
> as a result). And besides, BCTR requires only 2 bytes,
> LA needs 4.
>
What if you want to subtract 42?

I saw this done in some naively machine-generated code
targeted for s/370, which had only LA as an immediate
instruction. The code dedicated a pair of base registers
for -4096 and -8192 to facilitate addressing control
block prefixes. I suspect the author was most familiar
with PDP-11.

The code worked.

-- gil

Seymour J Metz

Mar 11, 2021, 8:40:38 AM
to ASSEMBL...@listserv.uga.edu
Instruction speed is still a concept, but it is much less relevant to predicting performance than it used to be, and even in the old days the execution time of an instruction could differ based on its data. Pipelining and caches are subject to various kinds of flushes and stalls. Predicting the performance of an instruction mix on a particular processor is far too complicated to be practical.

LA behaves differently depending on the addressing mode.

The importance of code maintainability and readability cannot be overstated. If a code sequence is not crystal clear, add a comment to explain what, how and why it does what it does, in terms of the application. Pillory the programmer that writes

L R7,=A(MAGIC_NUMBER) Put magic number in register 7

Note: you can also use equated symbols in LA, and should, for the same reasons that you should use them on AHI.
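
A small sketch of that point, with a hypothetical ENTRY_LEN equate and table (neither comes from the thread). The remark describes the step in application terms instead of restating the instruction.

```hlasm
* ENTRY_LEN and the table it describes are hypothetical.
ENTRY_LEN EQU  16                 Length of one table entry
         LA    7,ENTRY_LEN(,7)    Step R7 to the next table entry
```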


--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3


Peter Relson

Mar 11, 2021, 8:41:14 AM
to ASSEMBL...@listserv.uga.edu
To add to what Charles M posted,

Don't forget that LA in AMODE 31 always zeroes bit 32, and LA in AMODE 24
zeroes bits 32-39 of the 64-bit GR.
That's another way that they are not functionally equivalent, if that
difference matters to you.
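
A sketch of where that difference shows up, assuming AMODE 31 and arbitrarily chosen registers 2 and 3. Adding a "negative" register through LA works while the result stays non-negative, but a result of -1 comes out as X'7FFFFFFF' because bit 32 is zeroed. BCTR has no such limit.

```hlasm
* Assumes AMODE 31; registers 2 and 3 are arbitrary choices.
         LHI   3,-1          R3 = X'FFFFFFFF'
         LHI   2,5           R2 = 5
         LA    2,0(2,3)      5 + (-1): R2 = 4, carry is dropped
         SR    2,2           R2 = 0
         LA    2,0(2,3)      0 + (-1): R2 = X'7FFFFFFF', not -1
         SR    2,2           R2 = 0
         BCTR  2,0           R2 = X'FFFFFFFF', a true -1
```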

A good rule of thumb is that when you have equivalent alternatives, choose
the one that has the smallest instruction byte footprint.
BCTR is a 2 byte instruction. But don't sacrifice the readability of your
code.

Peter Relson
z/OS Core Technology Design

Seymour J Metz

Mar 11, 2021, 8:57:15 AM
to ASSEMBL...@listserv.uga.edu
The question is very much hardware dependent, and is much less relevant to predicting performance than it might have been half a century ago.

The relevant questions are:

Do you want to clear the high bits?

Do you want to preserve the condition code?

What will be more readable to your fellow programmers?


--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3


rob...@dodo.com.au

Mar 11, 2021, 8:57:51 AM
to ASSEMBL...@listserv.uga.edu
On 2021-03-12 00:38, Paul Gilmartin wrote:
> On 2021-03-10, at 22:10:14, rob...@dodo.com.au wrote:
>> ...
>>>
>>> You can use LA to subtract 1 if you have a negative value
>>> in a base register, subject to the same limit.
>>
>> Why would you do that, when BCTR R,0 will do it for free
>> with even zero as the source register (you get true -1
>> as a result). And besides, BCTR requires only 2 bytes,
>> LA needs 4.
>>
> What if you want to subtract 42?

What if you don't?
BCTR was designed to decrement by 1 for loop control,
and for the special case of decrementing by 1 without branching.
That was useful in conjunction with the EX instruction.
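
The usual shape of that idiom, as a sketch: the register assignments and the MOVE label are assumptions for illustration. MVC encodes its length as count minus 1, so the byte count is decremented once before EX ORs it into the length field of the target instruction.

```hlasm
* Assumes R4 = byte count (1 to 256), R5 -> target, R6 -> source.
         BCTR  4,0             Count - 1 = machine length code
         EX    4,MOVE          Run MVC with length taken from R4
*        ...
MOVE     MVC   0(0,5),0(6)     Reached only via EX; length from R4
```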

> I saw this done in some naively machine-generated code
> targeted for s/370, which had only LA as an immediate
> instruction. The code dedicated a pair of base registers
> for -4096 and -8192 to facilitate addressing control
> block prefixes. I suspect the author was most familiar
> with PDP-11.
>
> The code worked.

Dodgy in general, and not much good when the result is negative.

rob...@dodo.com.au

Mar 11, 2021, 9:04:47 AM
to ASSEMBL...@listserv.uga.edu
If it's readability you want, do a macro DECR.
But BCTR has been in use since ... 1965 or so,
that's 55 years, so it's likely that people will
know what it does by now.
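
Such a macro could be as small as the sketch below. The name DECR comes from the suggestion above; the parameter names are arbitrary.

```hlasm
         MACRO
&LABEL   DECR  &REG
&LABEL   BCTR  &REG,0        Subtract 1; no branch, CC unchanged
         MEND
*
         DECR  3             Expands to: BCTR 3,0
```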