On Mon, Sep 8, 2014 at 4:22 PM, Dan Gohman <dan4...@gmail.com> wrote:
> It looks a little silly to say this in the case of the integer constant 5, and there are some semantic gray areas around extra-VM allocation, but the same thing happens if the add were adding a dynamic integer value, and then it's difficult to find a way to separate that case from the constant 5 case.
Could we say that constant integers have no objects associated with them? If so, we need a way to bless constant integers that *do* refer to real objects, such as ASan's shadow memory base.
Then you should be able to take something like an add of a ptrtoint result and a phi of constant ints, fed to an inttoptr, and transform that to a GEP, without explicitly calling out constant integers.
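A minimal sketch of the shapes involved in that suggestion, written in the typed-pointer syntax used elsewhere in this thread. The function names @f_phi and @g_phi, the parameter %c, and the offsets 5 and 9 are hypothetical and chosen only for illustration. Whether the rewrite from the first form to the second is legal is exactly what the thread is debating; this only shows what each form looks like.

```llvm
; Integer form: the added offset is a phi of two constant ints.
define i8* @f_phi(i8* %A, i1 %c) {
entry:
  br i1 %c, label %then, label %else
then:
  br label %join
else:
  br label %join
join:
  %off = phi i64 [ 5, %then ], [ 9, %else ]
  %pti = ptrtoint i8* %A to i64
  %add = add i64 %pti, %off
  %itp = inttoptr i64 %add to i8*
  ret i8* %itp
}

; GEP form: same control flow, same offsets, no integer round trip.
define i8* @g_phi(i8* %A, i1 %c) {
entry:
  br i1 %c, label %then, label %else
then:
  br label %join
else:
  br label %join
join:
  %off = phi i64 [ 5, %then ], [ 9, %else ]
  %gep = getelementptr i8* %A, i64 %off
  ret i8* %gep
}
```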
> In any case, the general advice is that people should prefer to use getelementptr to begin with. LLVM's own optimizers were converted to use getelementptr instead of ptrtoint+add+inttoptr even when they have to do raw byte arithmetic.
I'm guessing the IR comes from C++ code that subtracts pointers, so it'd be good if we could figure this out.
On Mon, Sep 8, 2014 at 8:36 PM, Reid Kleckner <r...@google.com> wrote:
> On Mon, Sep 8, 2014 at 4:22 PM, Dan Gohman <dan4...@gmail.com> wrote:
>> It looks a little silly to say this in the case of the integer constant 5, and there are some semantic gray areas around extra-VM allocation, but the same thing happens if the add were adding a dynamic integer value, and then it's difficult to find a way to separate that case from the constant 5 case.
> Could we say that constant integers have no objects associated with them? If so, we need a way to bless constant integers that *do* refer to real objects, such as ASan's shadow memory base.
> Then you should be able to take something like add a phi of constant ints to an inttoptr and transform that to a GEP, without explicitly calling out constant integers.
It's not pretty to have situations where dynamic values permit more optimization than constants. If there's an expression which can be folded to a constant int, should the constant folder avoid doing so, because it might pessimize subsequent alias analysis?
An object can be allocated at virtual address 5 through extra-VM means (e.g. mmap), and then one can (creatively) interpret the return value of @f as being associated with whatever %A was associated with *and* 5. The return value of @g can only be associated with exactly the same set that %A was associated with. Consequently, it's not always safe to replace @f with @g.
It looks a little silly to say this in the case of the integer constant 5, and there are some semantic gray areas around extra-VM allocation, but the same thing happens if the add were adding a dynamic integer value, and then it's difficult to find a way to separate that case from the constant 5 case.
In any case, the general advice is that people should prefer to use getelementptr to begin with. LLVM's own optimizers were converted to use getelementptr instead of ptrtoint+add+inttoptr even when they have to do raw byte arithmetic.
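To make that advice concrete, here is a minimal sketch of raw byte arithmetic on an i32* with a dynamic byte offset, done both ways. The function names @advance_bytes_int and @advance_bytes_gep and the parameters %p and %n are hypothetical, not from the thread. The syntax is the typed-pointer form used elsewhere in this thread; later LLVM releases spell getelementptr with an explicit element type.

```llvm
; Integer round trip: the form the advice above steers away from.
define i32* @advance_bytes_int(i32* %p, i64 %n) {
  %pti = ptrtoint i32* %p to i64
  %add = add i64 %pti, %n
  %itp = inttoptr i64 %add to i32*
  ret i32* %itp
}

; Same byte offset expressed with getelementptr on an i8* view of %p.
define i32* @advance_bytes_gep(i32* %p, i64 %n) {
  %raw = bitcast i32* %p to i8*
  %gep = getelementptr i8* %raw, i64 %n   ; index counts bytes because the element type is i8
  %res = bitcast i8* %gep to i32*
  ret i32* %res
}
```

By the argument above, the result of the second form can only be associated with whatever %p was associated with, whereas the first form may also pick up whatever the integer operand is associated with.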
On Sat, Aug 30, 2014 at 6:01 PM, David Majnemer <david.m...@gmail.com> wrote:
Consider the two functions below:
define i8* @f(i8* %A) {
  %pti = ptrtoint i8* %A to i64
  %add = add i64 %pti, 5
  %itp = inttoptr i64 %add to i8*
  ret i8* %itp
}

define i8* @g(i8* %A) {
  %gep = getelementptr i8* %A, i64 5
  ret i8* %gep
}
What, if anything, prevents us from canonicalizing @f to @g?
I've heard that this might be in violation of http://llvm.org/docs/LangRef.html#pointeraliasing but I don't see how.
_______________________________________________ LLVM Developers mailing list LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On 09/08/2014 04:22 PM, Dan Gohman wrote:
Dan, I'm trying to follow your logic here and am not arriving at the same conclusion. Can you point out the flaw in my reasoning here?
> An object can be allocated at virtual address 5 through extra-VM means (eg. mmap), and then one can (creatively) interpret the return value of @f as being associated with whatever %A was associated with *and* 5. The return value of @g can only be associated with exactly the same set that %A was associated with. Consequently, it's not always safe to replace @f with @g.
define i8* @f(i8* %A) {
  %pti = ptrtoint i8* %A to i64    ; <-- %pti is not a pointer and is thus not based on anything
  %add = add i64 %pti, 5           ; <-- %add is not a pointer and is thus not based on anything; it is "associated with" the memory pointed to by %A
  ; --- In particular, "5" is NOT "an integer constant ... returned from a function not defined within LLVM". It is not returned by a function. As a result the pointer value of 5 is not associated with any address range.
  %itp = inttoptr i64 %add to i8*  ; <-- %itp is based on %pti only
  ret i8* %itp
}
I'm guessing the key difference in our reasoning is about the constant 5. :) I'm also guessing that you have an example in mind which motivates the need for 5 to be considered associated with the address range. Could you expand on why?
It would be nice to be able to canonicalize ptrtoint+add+inttoptr to GEPs. Having reasonable-looking, legal IR that simply doesn't optimize is not the best introduction for new frontend authors. :)