auipc+jalr pairs for offsets close to S32_MAX on RV64


Luke Nelson

Apr 4, 2020, 12:08:22 PM
to RISC-V ISA Dev
The RISC-V spec justifies the semantics of auipc+jalr by noting that the pair lets code jump anywhere within a 32-bit offset of the PC.
However, I was doing some experimenting, and it seems to me that some 32-bit offsets just below S32_MAX (2^31 - 1)
cannot be encoded in this scheme on RV64.

One such example is an offset of 0x7ffffffe. The normal strategy is to split the offset into its upper 20 and lower 12 bits
for auipc+jalr, adding 1 to the auipc immediate when bit 11 of the offset is set, to compensate for the sign extension of the jalr immediate. In this edge case, however,
adding that 1 flips the sign of the auipc constant, which breaks the sign extension performed by auipc. This isn't
a problem on RV32 because there are no upper bits to sign-extend into.
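
The edge case above can be sketched in Python by modeling only the immediate arithmetic of the two instructions. The helper names `sext`, `split_hi_lo`, and `auipc_jalr_target` and the starting `pc` value are illustrative assumptions, not taken from any toolchain.

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def split_hi_lo(offset):
    """Usual split: low 12 bits, and upper 20 bits plus 1 when bit 11 is set."""
    lo = offset & 0xFFF
    hi = ((offset + 0x800) >> 12) & 0xFFFFF
    return hi, lo

def auipc_jalr_target(pc, hi, lo, xlen):
    """Target of auipc+jalr on a machine with xlen-bit registers."""
    mask = (1 << xlen) - 1
    t = (pc + sext(hi << 12, 32)) & mask      # auipc adds a sign-extended 32-bit value
    return (t + sext(lo, 12)) & ~1 & mask     # jalr adds a sign-extended 12-bit value, clears bit 0

pc = 0x1000          # arbitrary example PC (assumption)
offset = 0x7FFFFFFE
hi, lo = split_hi_lo(offset)
for xlen in (32, 64):
    got = auipc_jalr_target(pc, hi, lo, xlen)
    want = (pc + offset) & ((1 << xlen) - 1)
    print(xlen, hex(got), hex(want), got == want)
```

With these inputs the split yields hi = 0x80000 and lo = 0xffe. The 32-bit run lands on the intended target, while the 64-bit run lands 2^32 bytes too low because the auipc constant is sign-extended as a negative number.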

In fact, it seems to me that there is no pair of auipc+jalr instructions that can encode 0x7ffffffe because of the double sign-extension.
For loading an immediate, you can get around this by using an addiw instruction instead of addi, but there's no such equivalent for jalr.
I tried to do some digging and see if anyone else has brought this up but didn't see any prior discussion.

Thanks for the help,

Luke


P.S. For anybody curious, I tried modeling this problem in the Rosette programming language to ask an SMT solver to come up
with immediates that encode the offset. The solver returns unsat, indicating that there are no such immediates given my model of
the instructions. This means either that there really is no way to encode this offset using an auipc+jalr pair, or my model is wrong :)

#lang rosette/safe

; Initial program counter
(define-symbolic pc (bitvector 64))

; 32-bit offset
(define offset (bv #x7ffffffe 32))

; Immediate for auipc
(define-symbolic upper (bitvector 20))
; Immediate for jalr
(define-symbolic lower (bitvector 12))

; Construct new program counter by simulating the effects of auipc+jalr
(define newpc (bvadd pc
                     (sign-extend (concat upper (bv 0 12)) (bitvector 64)) ; auipc                 
                     (sign-extend lower (bitvector 64)))) ; jalr

; Ask the SMT solver for an assignment to the immediates that produces
; the desired offset.
(solve (assert (equal? newpc (bvadd pc (sign-extend offset (bitvector 64))))))
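
As an independent cross-check of the solver result, the same model can be searched exhaustively in Python. This is a sketch under the same assumptions as the Rosette model above (auipc adds a sign-extended 32-bit value, jalr adds a sign-extended 12-bit value); `find_pairs` is a hypothetical helper name. Only the 4096 jalr immediates need to be enumerated, because each one fixes what auipc would have to contribute.

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def find_pairs(offset):
    """Return every (upper, lower) immediate pair whose RV64 auipc+jalr sum equals offset."""
    pairs = []
    for lower in range(1 << 12):
        rest = offset - sext(lower, 12)        # what auipc would have to add
        if rest % (1 << 12):
            continue                            # auipc can only add multiples of 4096
        if -(1 << 31) <= rest < (1 << 31):      # auipc adds a sign-extended 32-bit value
            pairs.append(((rest >> 12) & 0xFFFFF, lower))
    return pairs

print(find_pairs(0x7FFFFFFE))   # no pair exists for this offset
print(find_pairs(0x7FFFF7FE))   # one pair: upper 0x7ffff, lower 0x7fe
```

The empty result for 0x7ffffffe agrees with the unsat answer from the solver.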

Nick Knight

Apr 4, 2020, 6:10:48 PM
to Luke Nelson, RISC-V ISA Dev
Hi Luke,

Interesting! It looks to me like offsets 0x7ffff800 through 0x7fffffff are unobtainable by auipc+jalr in RV64I (of course, only the even offsets are relevant). I wonder if there are other holes.

In this case, I think it can be remedied by reducing the jalr immediate below 0x800 and compensating with an addi to the auipc result. Of course, the ISA manual doesn't mention a need for an extra instruction.
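
A rough Python sketch of that three-instruction idea, modeling only the arithmetic. The specific immediates (0x7ffff for auipc, 0x7ff for both addi and jalr) are one possible choice picked for illustration and are not taken from the message above; the helper names and the `pc` value are likewise assumptions.

```python
MASK64 = (1 << 64) - 1

def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def auipc(pc, imm20):
    return (pc + sext(imm20 << 12, 32)) & MASK64

def addi(rs1, imm12):
    return (rs1 + sext(imm12, 12)) & MASK64

def jalr_target(rs1, imm12):
    return (rs1 + sext(imm12, 12)) & ~1 & MASK64   # jalr clears bit 0 of the target

pc = 0x1000                       # arbitrary example PC (assumption)
t = auipc(pc, 0x7FFFF)            # pc + 0x7ffff000
t = addi(t, 0x7FF)                # pc + 0x7ffff7ff
target = jalr_target(t, 0x7FF)    # pc + 0x7ffffffe
print(hex(target - pc))
```

Both small immediates stay below 0x800, so neither is sign-extended as negative and the auipc immediate never needs the extra 1.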

Curious to hear what others think.

Best,
Nick Knight

--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+u...@groups.riscv.org.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/1f49ebe9-9636-4553-87f6-15418e7287cd%40groups.riscv.org.

Andrew Waterman

Apr 5, 2020, 10:20:04 PM
to Nick Knight, Luke Nelson, RISC-V ISA Dev
Note that binutils handles this limitation correctly.  If you try placing a callee 0x7ffff7fe bytes past its call site, it works for both RV32 and RV64, whereas a displacement of 0x7ffff800 works for RV32 but issues a link error for RV64.
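
The kind of range check this implies can be sketched as follows. This is not binutils code; `reachable` is a hypothetical helper, and the RV64 lower bound is the one given later in this thread.

```python
def reachable(displacement, xlen):
    """Can a single auipc+jalr pair cover this displacement?"""
    if xlen == 32:
        # Arithmetic wraps modulo 2**32, so any signed 32-bit displacement works.
        return -(1 << 31) <= displacement < (1 << 31)
    # RV64: the reachable window is shifted down by 2**11.
    return -(1 << 31) - (1 << 11) <= displacement < (1 << 31) - (1 << 11)

for disp in (0x7FFFF7FE, 0x7FFFF800):
    print(hex(disp), reachable(disp, 32), reachable(disp, 64))
```

The two displacements printed are the ones named above: the first is accepted for both widths, the second only for RV32.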

Tommy Murphy

Apr 5, 2020, 11:05:00 PM
to Nick Knight, Andrew Waterman, Luke Nelson, RISC-V ISA Dev
Regardless of what any specific toolchain does in practice, isn't the key issue here the inaccuracy/discrepancy in the spec that is alleged in the first post?

> The RISC-V spec justifies the semantics of auipc+jalr by that it enables code to jump anywhere in a 32-bit offset relative to the PC.

Is there a flaw in the spec that needs to be addressed or clarified?


Andrew Waterman

Apr 5, 2020, 11:35:53 PM
to Tommy Murphy, Nick Knight, Luke Nelson, RISC-V ISA Dev
On Sun, Apr 5, 2020 at 8:04 PM Tommy Murphy <tommy_...@hotmail.com> wrote:
> Regardless of what any specific toolchain does in practice, isn't the key issue here the inaccuracy/discrepancy in the spec that is alleged in the first post?

I was obliquely pointing out that this fact is not a secret.

> > The RISC-V spec justifies the semantics of auipc+jalr by that it enables code to jump anywhere in a 32-bit offset relative to the PC.
>
> Is there a flaw in the spec that needs to be addressed or clarified?

Not a flaw, but deserving of clarification anyway.  The quote is from the RV32 chapter, where it's actually a true statement.  Furthermore, it's commentary, not normative text; the semantics of the instructions are not in question.  I'm proposing adding the following note below the definition of AUIPC in the RV64 chapter:

\begin{commentary}
Note that the set of addresses or offsets that can be accessed by pairing LUI
with LD, AUIPC with JALR, etc. in RV64 include all signed 32-bit offsets
except for the range [$2^{31}{-}2^{11}$, $2^{31}{-}1$].
\end{commentary}

Tommy Murphy

Apr 5, 2020, 11:52:37 PM
to Andrew Waterman, Nick Knight, Luke Nelson, RISC-V ISA Dev
Thanks Andrew.
I was just looking for clarity/clarification.


Luke Nelson

Apr 6, 2020, 12:01:08 AM
to RISC-V ISA Dev, and...@sifive.com, nick....@sifive.com, luke....@gmail.com
Thanks everyone for the help! The new note about auipc in the RV64 section looks good to me.

I agree that the current spec is correct as written. The context is that I'm verifying code that generates
jumps using auipc+jalr for RV64, and I was getting counterexamples from my verification tool in which
the offset fell in this region. Initially I assumed my RISC-V model was wrong, until I realized that the phrase
about 32-bit offsets applies only to RV32. The new note will hopefully help future readers avoid the same mistake :)

Thanks again!

 Luke



Nick Knight

Apr 6, 2020, 12:20:29 AM
to Andrew Waterman, Tommy Murphy, Luke Nelson, RISC-V ISA Dev
Hi Andrew,

Thanks for the clarification.

On Sun, Apr 5, 2020 at 8:35 PM Andrew Waterman <and...@sifive.com> wrote:
I'm proposing adding the following note below the definition of AUIPC in the RV64 chapter:

\begin{commentary}
Note that the set of addresses or offsets that can be accessed by pairing LUI
with LD, AUIPC with JALR, etc. in RV64 include all signed 32-bit offsets
except for the range [$2^{31}{-}2^{11}$, $2^{31}{-}1$].
\end{commentary}

My only suggestion is replacing "RV64" by "RV64I", or perhaps by "RV64I and RV128I" (I prefer the latter).

Best,
Nick Knight

Paul Campbell

Apr 6, 2020, 7:16:30 AM
to isa...@groups.riscv.org
On Monday, 6 April 2020 3:35:36 PM NZST Andrew Waterman wrote:
> Not a flaw, but deserving of clarification anyway. The quote is from the
> RV32 chapter, where it's actually a true statement. Furthermore, it's
> commentary, not normative text; the semantics of the instructions are not
> in question. I'm proposing adding the following note below the definition
> of AUIPC in the RV64 chapter:

It's probably also worth making sure there's a test for this in the RV64I ISA
test suite: 64-bit implementations doing macro-op fusion may need to carry a
33rd bit.
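
To illustrate why a 33rd bit can be needed, here is a small Python sketch that computes the combined offset of a fused auipc+jalr pair at its two extremes and the signed width required to hold it. `fused_offset` and `signed_bits_needed` are hypothetical helper names, and the sketch models only the immediate arithmetic, not any particular fusion implementation.

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def fused_offset(imm20, imm12):
    """Combined PC-relative offset of an auipc+jalr pair on RV64."""
    return sext(imm20 << 12, 32) + sext(imm12, 12)

def signed_bits_needed(value):
    """Smallest two's-complement width that can hold value."""
    bits = 1
    while not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        bits += 1
    return bits

lowest = fused_offset(0x80000, 0x800)    # most negative auipc and jalr immediates
highest = fused_offset(0x7FFFF, 0x7FF)   # most positive auipc and jalr immediates
print(lowest, signed_bits_needed(lowest))
print(highest, signed_bits_needed(highest))
```

The most negative combination is -2^31 - 2^11, which does not fit in a signed 32-bit field, so corner-case tests would plausibly want to include that pair.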

Paul


Claire Wolf

Apr 6, 2020, 8:52:10 AM
to Nick Knight, Luke Nelson, RISC-V ISA Dev
Regarding "I wonder if there are other holes." No, there are not. And technically this is not a hole as all the offsets possible to encode with AUIPC+JALR are one contiguous region with 0xFFFFFFFF7FFFF800 being the smallest (most negative) and 0x000000007FFFF7FF the largest possible offset.
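
The contiguity claim can be checked with a short Python sketch. For each signed auipc immediate H, the jalr immediate covers one block of 4096 consecutive offsets, and the check below confirms that neighboring blocks abut exactly. The names `block`, `first`, `last`, and `gap_free` are illustrative.

```python
MASK64 = (1 << 64) - 1
LO_MIN, LO_MAX = -(1 << 11), (1 << 11) - 1     # range of the signed 12-bit jalr immediate

def block(h):
    """Smallest and largest offset reachable with signed auipc immediate h."""
    return (h * 4096 + LO_MIN, h * 4096 + LO_MAX)

H_MIN, H_MAX = -(1 << 19), (1 << 19) - 1        # range of the signed 20-bit auipc immediate
first = block(H_MIN)[0]
last = block(H_MAX)[1]

# Each block must end exactly one below the start of the next block.
gap_free = all(block(h)[1] + 1 == block(h + 1)[0] for h in range(H_MIN, H_MAX))

print(hex(first & MASK64), hex(last & MASK64), gap_free)
```

Viewed as 64-bit values, the endpoints are 0xffffffff7ffff800 and 0x7ffff7ff, matching the bounds stated above.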

Rogier Brussee

Apr 6, 2020, 12:06:19 PM
to RISC-V ISA Dev
Maybe I am thick, but this seems to be a corner-case limitation of the algorithm used to split the call into

AUIPC ra H; JALR ra ra L

where H is a 20-bit signed immediate and L is a 12-bit signed immediate. However, I claim:

For every -2^{31} <= A < 2^{31} there are unique integers L, H with -2^{11} <= L < 2^{11} and -2^{19} <= H < 2^{19} such that A = SEXT((H << 12) + L, 31), where SEXT(X, N) is sign extension of X from bit N (with N counting from 0, so SEXT(X, 31) = SEXTW(X)). In fact,

L = SEXT(A, 11)
H = SEXT((A - L) >> 12, 19)

Proof:
By construction, we have -2^{11} <= L < 2^{11} and A (mod 2^{12}) = L (mod 2^{12}),
and the integer satisfying both the modular and range constraints is unique, so it must be L.

In particular, A - L is divisible by 2^{12}, and H (mod 2^{20}) = ((A - L) >> 12) (mod 2^{20}) = ((A - L) / 2^{12}) (mod 2^{20}) (note: arithmetic and logical right shifts give the same answer mod 2^{20}!).
Moreover, -2^{19} <= H < 2^{19}, and again the integer satisfying the modular and range constraints is unique, so it must be H.

Let A' = SEXT((H << 12) + L, 31).
Then A' (mod 2^{32}) = H * 2^{12} + L (mod 2^{32}) = A (mod 2^{32}).
Moreover, since both -2^{31} <= A' < 2^{31} and -2^{31} <= A < 2^{31}, we again have A' = A.



Note that I just proved this!
Rogier.



Nick Knight

Apr 6, 2020, 12:06:19 PM
to Claire Wolf, Luke Nelson, RISC-V ISA Dev
Hi Claire,

Thanks for the clarification! It was not immediately clear to me from Andrew's proposed commentary that the set of possible 64-bit offsets is restricted symmetrically on the lower end.

Best,
Nick Knight

Claire Wolf

Apr 6, 2020, 12:29:10 PM
to Nick Knight, Luke Nelson, RISC-V ISA Dev
Actually it's _extended_ on the lower end. The 2048 points missing on the higher end are "tacked on" at the lower end, thus the region starts at 0xFFFFFFFF7FFFF800 instead of 0xFFFFFFFF80000000.

Luke Nelson

Apr 6, 2020, 12:30:01 PM
to RISC-V ISA Dev
You are correct that for every 32-bit immediate, there exist an H and an L such that auipc+jalr
encodes that immediate into the lower 32 bits. The problem is that the PC and registers on RV64 are 64 bits wide,
and each intermediate result is sign-extended to 64 bits.

For RV64, we would need to show
SEXT(A, 31) = SEXT(H << 12, 31) + SEXT(L, 11)
with every term sign-extended to 64 bits and the addition performed in 64 bits, and no such H and L exist when A = 0x7ffffffe.
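
The difference between the 32-bit and 64-bit views can be sketched in Python by evaluating the proposed formulas for L and H at A = 0x7ffffffe. Note that the `sext` helper here takes a bit count rather than a bit index, so SEXT(X, 11) in the thread's notation corresponds to `sext(x, 12)`; the names `split`, `sum32`, and `sum64` are illustrative.

```python
MASK64 = (1 << 64) - 1

def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def split(a):
    """L = SEXT(A, 11), H = SEXT((A - L) >> 12, 19) in the thread's notation."""
    l = sext(a, 12)
    h = sext((a - l) >> 12, 20)
    return h, l

a = 0x7FFFFFFE
h, l = split(a)

sum32 = sext((h << 12) + l, 32)   # result truncated to 32 bits, as on RV32
sum64 = (h << 12) + l             # 64-bit sum of the sign-extended terms, as on RV64

print(h, l)
print(hex(sum32), sum32 == a)
print(hex(sum64 & MASK64), sum64 == a)
```

The 32-bit sum wraps around to A, while the 64-bit sum is -2^31 - 2, which is a different 64-bit value.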

Luke

Claire Wolf

Apr 6, 2020, 12:36:42 PM
to Rogier Brussee, RISC-V ISA Dev, Luke Nelson
Here is a different way of looking at this: with an AUIPC contribution of 0xFFFFFFFF80000000 (that is, H = -2^{19}) and L = -2, you can jump by 0xFFFFFFFF7FFFFFFE. This is obviously outside the range -2^{31} <= A < 2^{31}. Because 20 bits + 12 bits are 32 bits, there are only 2^{32} possible immediate combinations, so if some combinations encode offsets outside that range, there cannot be enough combinations left to cover everything inside it.
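
The counting argument can be made concrete with a short Python sketch: it counts the immediate pairs that land below -2^31 and compares that with the number of offsets missing at the top of the signed 32-bit range. The names `rv64_offset`, `below`, and `missing_at_top` are illustrative.

```python
MASK64 = (1 << 64) - 1

def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def rv64_offset(imm20, imm12):
    """Combined offset of an auipc+jalr pair on RV64."""
    return sext(imm20 << 12, 32) + sext(imm12, 12)

def in_s32(x):
    return -(1 << 31) <= x < (1 << 31)

# Only the most negative auipc immediate (0x80000) can push the sum below -2**31,
# since any larger one contributes at least -2**31 + 4096.
below = [lo for lo in range(1 << 12) if not in_s32(rv64_offset(0x80000, lo))]

# Offsets above the largest reachable sum have no encoding at all.
largest = rv64_offset(0x7FFFF, 0x7FF)
missing_at_top = (1 << 31) - 1 - largest

print(hex(rv64_offset(0x80000, 0xFFE) & MASK64))   # the example from the message above
print(len(below), missing_at_top)
```

Both counts come out to 2048: the pairs that overshoot the low end are exactly as numerous as the offsets that cannot be reached at the high end.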


Rogier Brussee

Apr 6, 2020, 1:55:29 PM
to RISC-V ISA Dev, rogier....@gmail.com, luke....@gmail.com
You two are right. I am thick. 

Rogier

