_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
This has probably been discussed somewhere, but I missed it. Can you elaborate a bit on this?
I‘m specifically trying to understand what will happen when typed pointer support is removed. How will IR with typed pointers be auto-upgraded to pure opaque pointer IR? Will the bitcode reader keep some level of typed pointer support indefinitely?
Also, do you have a plan for replacing intrinsics that currently rely on pointee types? For example, the load instruction was updated to take an explicit type operand but I don’t think we can do the same thing for an intrinsic like llvm.masked.load since there is Value for Type. This is an easy problem to work around for something like masked.load, but more complicated if anyone has a downstream GEP-like intrinsic that needs more than the size of an element (spoiler alert: I do have such an intrinsic). Would you use a metadata argument?
Thanks,
Andy
_______________________________________________
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
This has probably been discussed somewhere, but I missed it. Can you elaborate a bit on this?
- Allow bitcode auto-upgrade of legacy pointer type to the new opaque pointer type (not to be turned on until ready)
- To support legacy bitcode, such as legacy stores/loads, we need to track pointee types for all values since legacy instructions may infer the types from a pointer operand's pointee type
I‘m specifically trying to understand what will happen when typed pointer support is removed. How will IR with typed pointers be auto-upgraded to pure opaque pointer IR? Will the bitcode reader keep some level of typed pointer support indefinitely?
Also, do you have a plan for replacing intrinsics that currently rely on pointee types? For example, the load instruction was updated to take an explicit type operand but I don’t think we can do the same thing for an intrinsic like llvm.masked.load since there is Value for Type. This is an easy problem to work around for something like masked.load, but more complicated if anyone has a downstream GEP-like intrinsic that needs more than the size of an element (spoiler alert: I do have such an intrinsic). Would you use a metadata argument?
Honestly, I didn’t (and still don’t) understand enough about the intrinsics Craig mentioned to even be sure they were analogous to the case I brought up. It seemed like they might be more interested in debug types than IR types and may or may not be going down the same paths of reasoning, but I just don’t understand what the intrinsics are doing.
We’re scrambling to get our out-of-tree code caught up with the opaque pointer transition. I think the scatter/gather intrinsics are likely to be our first useful overlap with the in-tree work to be done, though now that I’ve thought about it a bit more, I think the type in that can just be inferred from non-pointer arguments. In any case, I guess talking about it and agreeing on an approach is a good first step.
I started trying to mock up some IR for how this might work, but I wasn’t entirely happy with it. Specifically, how do you describe an IR type in metadata. We could use an undef value in the metadata, similar to what Arthur proposed as an argument, but that only moves the ugliness. This is kind of what I was thinking
define void @f() {
ptr %p = <however we got %p>
call void @llvm.some.intrinsic(ptr %p, metadata !1)
<…>
return void
}
define void @llvm.some.intrinsic(ptr, metadata)
%MyElementTy = type { <whatever> }
!1 = !{ %MyElementTy undef }
I guess that would work, but I don’t really like it. I looked at the way TBAA and debug info describe types, but that’s not describing the IR type and so it seems simultaneously fragile and too verbose. Is there a better way to do this? What about introducing a parameter attribute that would be considered required for the intrinsic? Something like this:
define void @f() {
ptr %p = <however we got %p>
call void @llvm.some.intrinsic(ptr elementtype(%MyElementTy) %p)
<…>
return void
}
; Requires ‘elementype’ parameter attribute
define void @llvm.some.intrinsic(ptr)
%MyElementTy = type { <whatever> }
Does that feel like a step backward to you? I don’t know how hard it would be for front end’s to generate something like that, but it hasn’t the benefit that the elementtype would have intentional and well-defined semantic meaning.
Thanks,
Andy
Honestly, I didn’t (and still don’t) understand enough about the intrinsics Craig mentioned to even be sure they were analogous to the case I brought up. It seemed like they might be more interested in debug types than IR types and may or may not be going down the same paths of reasoning, but I just don’t understand what the intrinsics are doing.
We’re scrambling to get our out-of-tree code caught up with the opaque pointer transition. I think the scatter/gather intrinsics are likely to be our first useful overlap with the in-tree work to be done, though now that I’ve thought about it a bit more, I think the type in that can just be inferred from non-pointer arguments. In any case, I guess talking about it and agreeing on an approach is a good first step.
I started trying to mock up some IR for how this might work, but I wasn’t entirely happy with it. Specifically, how do you describe an IR type in metadata. We could use an undef value in the metadata, similar to what Arthur proposed as an argument, but that only moves the ugliness. This is kind of what I was thinking
define void @f() {
ptr %p = <however we got %p>
call void @llvm.some.intrinsic(ptr %p, metadata !1)
<…>
return void
}
define void @llvm.some.intrinsic(ptr, metadata)
%MyElementTy = type { <whatever> }
!1 = !{ %MyElementTy undef }
I guess that would work, but I don’t really like it. I looked at the way TBAA and debug info describe types, but that’s not describing the IR type and so it seems simultaneously fragile and too verbose. Is there a better way to do this? What about introducing a parameter attribute that would be considered required for the intrinsic? Something like this:
define void @f() {
ptr %p = <however we got %p>
call void @llvm.some.intrinsic(ptr elementtype(%MyElementTy) %p)
<…>
return void
}
; Requires ‘elementype’ parameter attribute
define void @llvm.some.intrinsic(ptr)
%MyElementTy = type { <whatever> }
Does that feel like a step backward to you? I don’t know how hard it would be for front end’s to generate something like that, but it hasn’t the benefit that the elementtype would have intentional and well-defined semantic meaning.
> I don't think metadata can reference an LLVM type. My previous hacky suggestion was to add a new overloaded parameter with the type you want and pass undef/poison to it. In any case, we'll have to find a way to fix these sorts of intrinsics, we shouldn't block the opaque pointers project on some intrinsics.
It looks like I just missed your response before getting my response to David in flight. Yes, I see the metadata problem. I don’t like the undef/poison approach, but I agree that it would technically work.
I completely agree that we should solve this problem rather than block the opaque pointer project because of it. Nevertheless, we’ll need to solve the problem.
> LLVM already (mostly) treats memory as untyped, what is your intrinsic attempting to do?
It’s kind of like a GEP but more specialized. It’s doing a pointer calculation based on multidimensional composite types with a shape that is known at runtime but unknown at compile time, and we have a front end that’s able to supply enough info to make this useful.
> As for the timeline, we'll have to support mixed opaque pointers and legacy pointers for a while, we don't want out of tree users to immediately break as soon as the opaque pointers work is finished up in-tree.
Is there any consensus on the scope of “for a while”? Like, how many major releases after the opaque pointer work is complete? My manager would very much like to know the answer to this question. :-) I’ve been trying to prepare for the buffer being as little as one major release after the in-tree work is done but hoping for more. I expect there are others with a similar interest.
From: Arthur Eubanks <aeub...@google.com>
Sent: Tuesday, June 22, 2021 12:15 PM
To: Kaylor, Andrew <andrew...@intel.com>
> Maybe the byref attribute is suitable for this? That is @llvm.some.intrinsic(ptr byref(%ElemTy) %p).
I’m always hesitant to repurpose existing attributes, but byref does seem very close. The line in the LangRef saying explicitly that it is intended for ABI constraints and shouldn’t be used for optimization worries me.
Also, byref assumes the pointer is pointing to a singe element. For a GEP-like intrinsic, that’s not what I’d want. It would take some time to craft a precise definition, but I was thinking something like “Within the scope of this function call, the pointer may be assumed to point to zero or more elements of the specified type. This attribute does not guarantee the amount of dereferenceable memory, but whatever memory is dereferenceable may be assumed to be in increments of the size of the element type.” It may need to say more than that.
> I don't think metadata can reference an LLVM type. My previous hacky suggestion was to add a new overloaded parameter with the type you want and pass undef/poison to it. In any case, we'll have to find a way to fix these sorts of intrinsics, we shouldn't block the opaque pointers project on some intrinsics.
It looks like I just missed your response before getting my response to David in flight. Yes, I see the metadata problem. I don’t like the undef/poison approach, but I agree that it would technically work.
I completely agree that we should solve this problem rather than block the opaque pointer project because of it. Nevertheless, we’ll need to solve the problem.
> LLVM already (mostly) treats memory as untyped, what is your intrinsic attempting to do?
It’s kind of like a GEP but more specialized. It’s doing a pointer calculation based on multidimensional composite types with a shape that is known at runtime but unknown at compile time, and we have a front end that’s able to supply enough info to make this useful.
> As for the timeline, we'll have to support mixed opaque pointers and legacy pointers for a while, we don't want out of tree users to immediately break as soon as the opaque pointers work is finished up in-tree.
Is there any consensus on the scope of “for a while”? Like, how many major releases after the opaque pointer work is complete? My manager would very much like to know the answer to this question. :-) I’ve been trying to prepare for the buffer being as little as one major release after the in-tree work is done but hoping for more. I expect there are others with a similar interest.
From: Arthur Eubanks <aeub...@google.com>
Sent: Tuesday, June 22, 2021 12:15 PM
To: Kaylor, Andrew <andrew...@intel.com>
Cc: llvm...@lists.llvm.org
Subject: Re: [llvm-dev] Opaque Pointers Help Wanted
On Mon, Jun 21, 2021 at 3:27 PM Kaylor, Andrew <andrew...@intel.com> wrote:
This has probably been discussed somewhere, but I missed it. Can you elaborate a bit on this?
- Allow bitcode auto-upgrade of legacy pointer type to the new opaque pointer type (not to be turned on until ready)
- To support legacy bitcode, such as legacy stores/loads, we need to track pointee types for all values since legacy instructions may infer the types from a pointer operand's pointee type
I‘m specifically trying to understand what will happen when typed pointer support is removed. How will IR with typed pointers be auto-upgraded to pure opaque pointer IR? Will the bitcode reader keep some level of typed pointer support indefinitely?
Yes, the plan is something along the lines of associating each Value with a possible pointee type inside the bitcode reader.
Also, do you have a plan for replacing intrinsics that currently rely on pointee types? For example, the load instruction was updated to take an explicit type operand but I don’t think we can do the same thing for an intrinsic like llvm.masked.load since there is Value for Type. This is an easy problem to work around for something like masked.load, but more complicated if anyone has a downstream GEP-like intrinsic that needs more than the size of an element (spoiler alert: I do have such an intrinsic). Would you use a metadata argument?
I don't think metadata can reference an LLVM type. My previous hacky suggestion was to add a new overloaded parameter with the type you want and pass undef/poison to it. In any case, we'll have to find a way to fix these sorts of intrinsics, we shouldn't block the opaque pointers project on some intrinsics.
LLVM already (mostly) treats memory as untyped, what is your intrinsic attempting to do?
_______________________________________________
On Tue, Jun 22, 2021 at 9:39 AM Kaylor, Andrew via llvm-dev <llvm...@lists.llvm.org> wrote:> I don't think metadata can reference an LLVM type. My previous hacky suggestion was to add a new overloaded parameter with the type you want and pass undef/poison to it. In any case, we'll have to find a way to fix these sorts of intrinsics, we shouldn't block the opaque pointers project on some intrinsics.
It looks like I just missed your response before getting my response to David in flight. Yes, I see the metadata problem. I don’t like the undef/poison approach, but I agree that it would technically work.
I completely agree that we should solve this problem rather than block the opaque pointer project because of it. Nevertheless, we’ll need to solve the problem.
> LLVM already (mostly) treats memory as untyped, what is your intrinsic attempting to do?
It’s kind of like a GEP but more specialized. It’s doing a pointer calculation based on multidimensional composite types with a shape that is known at runtime but unknown at compile time, and we have a front end that’s able to supply enough info to make this useful.
> As for the timeline, we'll have to support mixed opaque pointers and legacy pointers for a while, we don't want out of tree users to immediately break as soon as the opaque pointers work is finished up in-tree.
Is there any consensus on the scope of “for a while”? Like, how many major releases after the opaque pointer work is complete? My manager would very much like to know the answer to this question. :-) I’ve been trying to prepare for the buffer being as little as one major release after the in-tree work is done but hoping for more. I expect there are others with a similar interest.
By default I'd vote for one major release - but likely the ongoing burden of carrying it for a bit longer if some significant contributors would benefit from extra time it's probably going to be pretty harmless to keep it around a bit longer, I think?
| My expectation is that after the opaque pointer migration is complete, the time the non-opaque pointer mode remains available will be measured in days, not years.
That can be true for textual IR, but we need to retain backward compatibility (auto-upgrade) for bitcode for much longer. I don’t have a clear idea of what’s involved (haven’t really been tracking this project) but many users/scenarios require bitcode compatibility for long periods. I don’t think we’ve had a true format break since about 4.0?
--paulr
| My expectation is that after the opaque pointer migration is complete, the time the non-opaque pointer mode remains available will be measured in days, not years.
That can be true for textual IR, but we need to retain backward compatibility (auto-upgrade) for bitcode for much longer. I don’t have a clear idea of what’s involved (haven’t really been tracking this project) but many users/scenarios require bitcode compatibility for long periods. I don’t think we’ve had a true format break since about 4.0?
--paulr
On Thu, Jul 1, 2021 at 4:39 PM <paul.r...@sony.com> wrote:| My expectation is that after the opaque pointer migration is complete, the time the non-opaque pointer mode remains available will be measured in days, not years.
That can be true for textual IR, but we need to retain backward compatibility (auto-upgrade) for bitcode for much longer. I don’t have a clear idea of what’s involved (haven’t really been tracking this project) but many users/scenarios require bitcode compatibility for long periods. I don’t think we’ve had a true format break since about 4.0?
--paulr
Sure, we definitely need to retain bitcode compatibility (i.e. auto-upgrade to opaque pointers).My comment was targeted at maintainers of out-of-free frontends and targets, who shouldn't assume there will be a long-term period of dual compatibility, like there was/is with the new pass manager. If you don't want to be stuck on an old toolchain, you need to be ready by the time the switch happens.
On Thu, Jul 1, 2021 at 8:00 AM Nikita Popov <nikit...@gmail.com> wrote:On Thu, Jul 1, 2021 at 4:39 PM <paul.r...@sony.com> wrote:| My expectation is that after the opaque pointer migration is complete, the time the non-opaque pointer mode remains available will be measured in days, not years.
That can be true for textual IR, but we need to retain backward compatibility (auto-upgrade) for bitcode for much longer. I don’t have a clear idea of what’s involved (haven’t really been tracking this project) but many users/scenarios require bitcode compatibility for long periods. I don’t think we’ve had a true format break since about 4.0?
--paulr
Sure, we definitely need to retain bitcode compatibility (i.e. auto-upgrade to opaque pointers).My comment was targeted at maintainers of out-of-free frontends and targets, who shouldn't assume there will be a long-term period of dual compatibility, like there was/is with the new pass manager. If you don't want to be stuck on an old toolchain, you need to be ready by the time the switch happens.
Correct me if I'm wrong, but to clarify, I think: LLVM IR API likely won't have a long period of overlap supporting both opaque and non-opaque pointers and especially the ability to mix/match in IR may never exist (hard cutover) - as discussed with things like intrinsics recently, we'll likely have to have a hard cutover to opaque pointers. Textual IR may be able to be auto-upgraded for a time, and Bitcode certainly will be - so, in the worst case, if a downstream project needed to pickup a newer version of LLVM but didn't have time to invest it migrating their IR generation - maybe they could get away with linking an older LLVM into their frontend, then serializing roundtrip via bitcode back into a process with the newer LLVM that would convert to opaque pointers on loading.
Is that about right?
> My comment was targeted at maintainers of out-of-free frontends and targets, who shouldn't assume there will be a long-term period of dual compatibility, like there was/is with the new pass manager. If you don't want to be stuck on an old toolchain, you need to be ready by the time the switch happens.
I have serious concerns about this. I understand why you want to limit the time non-opaque pointers are supported, but having a sharp cutoff feels somewhat hostile to those who have a significant amount of out-of-tree code.
I know the opaque pointer work has been well-publicized and ongoing for quite some time, but that’s part of the problem. A lot of work has gone into making opaque pointers work, and a corresponding amount if work may be required for out-of-tree projects. My company, for example, has been following the work in LLVM and working on our own updates, but we’ve been allocating resources based on the pace of the open source work, which until recently has been fairly slow. We are adjusting our plans in accordance with the new push, but it is going to take us some time to catch up.
At a minimum, I think we need to agree upon a schedule for the transition so people can plan their work. Saying that we’re going to cut off support for a fundamental feature like this “when the migration is done” makes planning impossible. Beyond that, I would vote for keeping the duplicate tests around for some period of time.
-Andy
> My comment was targeted at maintainers of out-of-free frontends and targets, who shouldn't assume there will be a long-term period of dual compatibility, like there was/is with the new pass manager. If you don't want to be stuck on an old toolchain, you need to be ready by the time the switch happens.
I have serious concerns about this. I understand why you want to limit the time non-opaque pointers are supported, but having a sharp cutoff feels somewhat hostile to those who have a significant amount of out-of-tree code.
I know the opaque pointer work has been well-publicized and ongoing for quite some time, but that’s part of the problem. A lot of work has gone into making opaque pointers work, and a corresponding amount if work may be required for out-of-tree projects. My company, for example, has been following the work in LLVM and working on our own updates, but we’ve been allocating resources based on the pace of the open source work, which until recently has been fairly slow. We are adjusting our plans in accordance with the new push, but it is going to take us some time to catch up.
At a minimum, I think we need to agree upon a schedule for the transition so people can plan their work. Saying that we’re going to cut off support for a fundamental feature like this “when the migration is done” makes planning impossible. Beyond that, I would vote for keeping the duplicate tests around for some period of time.
I understand the hard cutover for things like intrinsics and other API that are being marked as deprecated. I’m more concerned about the ability to continue generating (from clang or other front ends) IR with pseudo-typed pointers (by which I mean pointers as they have been and mostly are now) and bitcasts that our out-of-tree passes are currently depending on and having in-tree passes continue to be able to optimize that. I’d like to especially emphasize this latter point. If key in-tree passes suddenly stop optimizing code that contains non-opaque pointers and bitcasts, that’s effectively a hard cutover for the entire optimizer even if the IR is still technically legal.
I see the problems with doing this. I really do. Once the opaque pointer work is complete it will be possible to remove a lot of code that is only there to support a “dead” form of the IR, and obviously we’d like to do that as soon as we can. I guess what I’m asking is that the non-opaque pointer IR form be considered deprecated rather than unsupported in the optimizer for at least one release after the work is complete and the code to optimize it be kept around for that long. If we make improvements that don’t also improve performance with the old form, that’s fine. I just don’t want to lose what’s there now.
It would also be very helpful if we could make a reasonable estimate of when the opaque pointer work will be complete. I know we have a list of what’s left to be done, but it would be very helpful to me if those working on this could say when they would like to have it done and when they realistically think it will be done.
Finally, if anyone else shares my concerns it would be helpful to hear that, because I realize that just one LLVM contributor (by which I mean my company, not me as an individual) can only expect to have so much influence.
Thanks,
Andy
It would also be very helpful if we could make a reasonable estimate of when the opaque pointer work will be complete. I know we have a list of what’s left to be done, but it would be very helpful to me if those working on this could say when they would like to have it done and when they realistically think it will be done.
Finally, if anyone else shares my concerns it would be helpful to hear that, because I realize that just one LLVM contributor (by which I mean my company, not me as an individual) can only expect to have so much influence.
I guess what I’m asking is that the non-opaque pointer IR form be considered deprecated rather than unsupported in the optimizer for at least one release after the work is complete and the code to optimize it be kept around for that long.
| |
Neil Henning | |
Senior Software Engineer Compiler | |
I understand the hard cutover for things like intrinsics and other API that are being marked as deprecated. I’m more concerned about the ability to continue generating (from clang or other front ends) IR with pseudo-typed pointers (by which I mean pointers as they have been and mostly are now) and bitcasts that our out-of-tree passes are currently depending on and having in-tree passes continue to be able to optimize that. I’d like to especially emphasize this latter point. If key in-tree passes suddenly stop optimizing code that contains non-opaque pointers and bitcasts, that’s effectively a hard cutover for the entire optimizer even if the IR is still technically legal.
I see the problems with doing this. I really do. Once the opaque pointer work is complete it will be possible to remove a lot of code that is only there to support a “dead” form of the IR, and obviously we’d like to do that as soon as we can. I guess what I’m asking is that the non-opaque pointer IR form be considered deprecated rather than unsupported in the optimizer for at least one release after the work is complete and the code to optimize it be kept around for that long. If we make improvements that don’t also improve performance with the old form, that’s fine. I just don’t want to lose what’s there now.
It would also be very helpful if we could make a reasonable estimate of when the opaque pointer work will be complete. I know we have a list of what’s left to be done, but it would be very helpful to me if those working on this could say when they would like to have it done and when they realistically think it will be done.
> Or at least, the only form I would be willing to allow this is that we promise to not actively/intentionally break typed pointers, but all the testing and fixing burden is on a 3rd-party.
That’s fair, and I could probably live with that.
If I understand correctly, lagging out-of-tree projects would effectively continue existing in a state between steps 3 and 4, while the main source base moved on. That state between steps 3 and 4 seems a lot like where we were for a couple of years with the new and legacy pass manager -- the support was all there and many downstream projects were using it by default but we delayed changing the default in trunk until all issues were resolved.
Honestly, I didn’t (and still don’t) understand enough about the intrinsics Craig mentioned to even be sure they were analogous to the case I brought up. It seemed like they might be more interested in debug types than IR types and may or may not be going down the same paths of reasoning, but I just don’t understand what the intrinsics are doing.
We’re scrambling to get our out-of-tree code caught up with the opaque pointer transition. I think the scatter/gather intrinsics are likely to be our first useful overlap with the in-tree work to be done, though now that I’ve thought about it a bit more, I think the type in that can just be inferred from non-pointer arguments. In any case, I guess talking about it and agreeing on an approach is a good first step.
I started trying to mock up some IR for how this might work, but I wasn’t entirely happy with it. Specifically, how do you describe an IR type in metadata. We could use an undef value in the metadata, similar to what Arthur proposed as an argument, but that only moves the ugliness. This is kind of what I was thinking
define void @f() {
ptr %p = <however we got %p>
call void @llvm.some.intrinsic(ptr %p, metadata !1)
<…>
return void
}
define void @llvm.some.intrinsic(ptr, metadata)
%MyElementTy = type { <whatever> }
!1 = !{ %MyElementTy undef }
I guess that would work, but I don’t really like it. I looked at the way TBAA and debug info describe types, but that’s not describing the IR type and so it seems simultaneously fragile and too verbose. Is there a better way to do this? What about introducing a parameter attribute that would be considered required for the intrinsic? Something like this:
define void @f() {
ptr %p = <however we got %p>
call void @llvm.some.intrinsic(ptr elementtype(%MyElementTy) %p)
<…>
return void
}
; Requires ‘elementype’ parameter attribute
define void @llvm.some.intrinsic(ptr)
%MyElementTy = type { <whatever> }
Does that feel like a step backward to you? I don’t know how hard it would be for front end’s to generate something like that, but it hasn’t the benefit that the elementtype would have intentional and well-defined semantic meaning.
Thanks,
Andy
Hi Nikita,
We saw the elementtype attribute changes and have been working on some patches to make use of it. This is what led us to the change in https://reviews.llvm.org/D108796, which I plan to commit for Chenyang tomorrow. Apart from that minor problem, it looks like it will work very well for us. Thanks!
-Andy
From: Nikita Popov <nikit...@gmail.com>
Sent: Saturday, September 04, 2021 1:39 AM
To: Kaylor, Andrew <andrew...@intel.com>
Cc: David Blaikie <dbla...@gmail.com>; craig....@gmail.com; llvm...@lists.llvm.org
Subject: Re: [llvm-dev] Opaque Pointers Help Wanted