[llvm-dev] How to calculate the offset obtained via a GEP instruction


Alberto Barbaro via llvm-dev

Jan 25, 2022, 2:00:39 AM
to llvm-dev
Hi all,
I'm trying to understand how to manually compute the offset produced by a GEP instruction. The same question was asked on Stack Overflow [1] over six years ago but never got a real answer.

Since I need exactly the same thing, is anyone willing to help me understand how to calculate the offset?

Thanks


Markus Böck via llvm-dev

Jan 25, 2022, 2:04:36 AM
to Alberto Barbaro, llvm-dev
In my experience, the easiest way to do so is to replace the base operand of the GEP with a null pointer of the same type and then convert the result of the GEP to an integer using ptrtoint. That gives you the offset in bytes, and a pass that uses the target info will constant-fold it as well.

Alberto Barbaro via llvm-dev

Jan 25, 2022, 2:12:52 AM
to Markus Böck, llvm-dev
Hi Markus,
I'm working on an LLVM plugin and have access to the GEP object, but I'm not sure how to do what you described. Would you be able to show me with a few lines of code?

Thanks

Nikita Popov via llvm-dev

Jan 25, 2022, 3:25:41 AM
to Alberto Barbaro, llvm-dev
You can use GEPOperator::accumulateConstantOffset(). Or, more generally, there is Value::stripAndAccumulateConstantOffsets(), which can look through multiple GEPs, bitcasts, etc.

Regards,
Nikita

Alberto Barbaro via llvm-dev

Jan 25, 2022, 5:17:18 AM
to Nikita Popov, llvm-dev
Thanks Nikita,
I'll try it and get back to you if I still have problems.

Thanks a lot for your help

Alberto

Alberto Barbaro via llvm-dev

Jan 25, 2022, 3:03:22 PM
to Nikita Popov, llvm-dev
Hi Nikita,
I think I made some progress, but I'm not quite there yet.

The GEP instructions that I'm interested in analyzing are:

%3 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 0
%4 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 1
%5 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 2

and the Point struct is declared in the following way:

struct Point
{
   int x;
   char y;
   long z;
};

After reading some documentation online, I tried the following:

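Module *M = I.getModule();
const DataLayout &DL = M->getDataLayout();
I.dump();
// The APInt must be as wide as the pointer's index type (64 bits on x86-64), not a hard-coded 32.
APInt ap_offset(DL.getIndexTypeSizeInBits(I.getType()), 0);
std::cout << "ap_offset: " << ap_offset.getSExtValue() << "\n";
// accumulateConstantOffset() returns true on success and adds the byte offset to ap_offset.
std::cout << "Accumulated offset: " << I.accumulateConstantOffset(DL, ap_offset) << "\n";
std::cout << "ap_offset: " << ap_offset.getSExtValue() << "\n";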

The output is something like:
  %3 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 0
ap_offset: 0
Accumulated offset: 1
ap_offset: 0
  %4 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 1
ap_offset: 0
Accumulated offset: 1
ap_offset: 4
  %5 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 2
ap_offset: 0
Accumulated offset: 1
ap_offset: 8

I think the output is almost correct because the offset increases by 4 each time, but the struct has char and long elements, so the offset does not always look right. I think this is due to the fact that the GEP refers only to i32.

How should I fix this? A few lines of code would be very helpful.

Thanks
Alberto
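
For anyone who wants to check such numbers by hand, the sketch below models how a byte offset follows from a non-packed struct layout: each member is placed at the next multiple of its alignment, and the GEP's first index steps over whole structs. All names here (`Field`, `fieldOffsets`, `structSize`, `gepOffset`) are invented for illustration, and the sizes and alignments passed in are assumptions about the target (for example i64 aligned to 8 bytes, as on x86-64). It is a simplified model, not LLVM's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one struct member: size and ABI alignment in bytes.
struct Field {
  std::uint64_t size;
  std::uint64_t align; // assumed to be a power of two
};

// Round `offset` up to the next multiple of `align`.
std::uint64_t alignTo(std::uint64_t offset, std::uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (offset + align - 1) & ~(align - 1);
}

// Byte offset of every member of a non-packed struct, in declaration order.
std::vector<std::uint64_t> fieldOffsets(const std::vector<Field> &fields) {
  std::vector<std::uint64_t> offsets;
  std::uint64_t cursor = 0;
  for (const Field &f : fields) {
    cursor = alignTo(cursor, f.align); // insert padding if needed
    offsets.push_back(cursor);
    cursor += f.size;
  }
  return offsets;
}

// Total size of the struct, padded to its largest member alignment.
std::uint64_t structSize(const std::vector<Field> &fields) {
  std::uint64_t cursor = 0, maxAlign = 1;
  for (const Field &f : fields) {
    cursor = alignTo(cursor, f.align) + f.size;
    if (f.align > maxAlign) maxAlign = f.align;
  }
  return alignTo(cursor, maxAlign);
}

// Offset of `getelementptr %S, %S* %p, i32 arrayIdx, i32 fieldIdx`:
// the first index steps over whole structs, the second selects a member.
std::int64_t gepOffset(const std::vector<Field> &fields,
                       std::int64_t arrayIdx, std::size_t fieldIdx) {
  assert(fieldIdx < fields.size());
  return arrayIdx * static_cast<std::int64_t>(structSize(fields)) +
         static_cast<std::int64_t>(fieldOffsets(fields)[fieldIdx]);
}
```

Describing `struct Point` as `{{4, 4}, {1, 1}, {8, 8}}` (int, char, long under the assumptions above) gives member offsets 0, 4 and 8: the char sits right after the int, and the long is pushed to the next 8-byte boundary.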
   

Alberto Barbaro via llvm-dev

Jan 25, 2022, 3:10:14 PM
to Nikita Popov, llvm-dev
Hello,
I did some more digging and I think that those numbers were indeed correct:

```
➜  /tmp cat struct.c
#include<stdio.h>

 
struct Point
{
   int x;
   char y;
   char z;
   long w;
};
 
int main()
{
   struct Point p1;
 
   // Accessing members of point p1
   p1.x = 1;
   p1.y = 2;
   p1.z = 3;
   p1.w = 4;

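   /* %p with a void * argument is the portable way to print an address */
   printf("p1: %p\n", (void *)&p1);
   printf("p1.x: %p\n", (void *)&p1.x);
   printf("p1.y: %p\n", (void *)&p1.y);
   printf("p1.z: %p\n", (void *)&p1.z);
   printf("p1.w: %p\n", (void *)&p1.w);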
   return 0;
}
```

```
➜  /tmp ./struct
p1: 0x9791b30
p1.x: 0x9791b30
p1.y: 0x9791b34
p1.z: 0x9791b35
p1.w: 0x9791b38
➜  /tmp
```

```
  %3 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 0
ap_offset: 0
Accumulated offset: 1
ap_offset: 0
  %4 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 1
ap_offset: 0
Accumulated offset: 1
ap_offset: 4
  %5 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 2
ap_offset: 0
Accumulated offset: 1
ap_offset: 5
  %6 = getelementptr inbounds %struct.Point, %struct.Point* %2, i32 0, i32 3
ap_offset: 0
Accumulated offset: 1
ap_offset: 8
```

Let me know if I'm missing anything.

Thanks
Alberto
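
A more direct cross-check than printing addresses is to ask the compiler for the member offsets with `offsetof`. The sketch below does that for the same four-member struct; the function name `printPointOffsets` is made up for illustration. The values depend on the target ABI, so they should agree with the numbers above only when the program is built for the same target as the IR.

```cpp
#include <cstddef>
#include <cstdio>

// Same layout as the struct in struct.c above.
struct Point {
  int x;
  char y;
  char z;
  long w;
};

// Print the byte offset of each member as laid out by the compiler.
void printPointOffsets() {
  std::printf("x: %zu\n", offsetof(Point, x));
  std::printf("y: %zu\n", offsetof(Point, y));
  std::printf("z: %zu\n", offsetof(Point, z));
  std::printf("w: %zu\n", offsetof(Point, w));
}
```

Calling `printPointOffsets()` from `main` prints one offset per member, which can be compared line by line with the `ap_offset` values reported for each GEP.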