[llvm-dev] RFC: A change in InstCombine canonical form

Ehsan Amiri via llvm-dev

Mar 16, 2016, 9:42:33 AM
to llvm...@lists.llvm.org
=== PROBLEM === (See this bug https://llvm.org/bugs/show_bug.cgi?id=26445)

The IR contains code that loads a float from a float* address and stores it to a float* address. After the load canonicalization in InstCombine [1], new bitcasts are added to the IR (see the bottom of this email for code samples). This prevents select speculation in SROA from working. In addition, after SROA we have bitcasts from i32 to float, whereas right after InstCombine the bitcasts are only on pointer types.

=== PROPOSED SOLUTION ===

[1] implies that we need load canonicalization when we load a value only to store it again. The reason is to avoid generating slightly different code (due to different ways of adding bitcasts) in different situations. In all the examples presented in [1] there is a non-zero number of bitcasts. I think that when we load a value of type T from a T* address and store it as a type T value to one or more T* addresses (and there is no other use or store), we can redefine the canonical form to mean that there should not be any bitcasts. So we still have a canonical form, but its definition is slightly different.
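As a source-level analogy for the two candidate canonical forms, the sketch below copies one float either through a float-typed load/store or through a 32-bit integer. It only illustrates that both forms move the same four bytes. The function names (copy_as_float, copy_as_i32, bits_of) are made up for this illustration, and the sketch assumes a 32-bit float.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Copy with a float-typed load and store (analogous to the proposed
// canonical form when both addresses are float*).
void copy_as_float(const float* src, float* dst) {
    float tmp = *src;
    *dst = tmp;
}

// Copy the same four bytes through a 32-bit integer (analogous to the
// current canonical form: i32 load/store plus pointer bitcasts).
void copy_as_i32(const float* src, float* dst) {
    std::uint32_t bits;
    std::memcpy(&bits, src, sizeof bits);
    std::memcpy(dst, &bits, sizeof bits);
}

// Helper (hypothetical): expose the bit pattern of a float.
std::uint32_t bits_of(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For an ordinary value such as -1.5f, both functions leave the destination with the same bit pattern as the source, so the choice between the two forms is about what later passes can see, not about the bytes copied.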

=== REASONS FOR / AGAINST ===

Hal Finkel warns that while this may be useful for PowerPC, it may hurt more than one other platform and become a very large project. Despite this, he is fine with bringing the issue to the mailing list for feedback, mostly because it seems in line with our future direction of having a unique type for all pointers. (Hal, please correct me if I misunderstood your comment.)

This is a much simpler fix than the alternatives (ignoring potential regressions).

=== ALTERNATIVE SOLUTION ===

Fix select speculation in SROA to see through bitcasts. Handle remaining bitcasts during code gen. Other alternative solutions are welcome.
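To make "see through bitcasts" concrete, here is a toy model of the idea. It is not LLVM code: the Kind enum, the Value struct, and both functions are hypothetical stand-ins written for this sketch. (LLVM itself provides Value::stripPointerCasts() for the real thing.)

```cpp
// Toy stand-in for IR values; this is not LLVM's class hierarchy.
enum class Kind { Argument, Alloca, BitCast };

struct Value {
    Kind kind;
    const Value* operand;  // source of a BitCast, nullptr otherwise
};

// Walk up through any chain of bitcasts to the underlying pointer.
const Value* strip_bitcasts(const Value* v) {
    while (v->kind == Kind::BitCast && v->operand != nullptr) {
        v = v->operand;
    }
    return v;
}

// An analysis that compares stripped pointers is not confused by
// intervening casts between the alloca and its users.
bool same_underlying_pointer(const Value* a, const Value* b) {
    return strip_bitcasts(a) == strip_bitcasts(b);
}
```

In this model, a pointer reached through two bitcasts of an alloca compares equal to the alloca itself, which is the property select speculation would need in order to treat `bitcast float* %max_value to i32*` as an access to %max_value.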

Should I implement the proposed solution or is it too risky? I understand that we may need to undo it if it breaks too many things. Comments are welcome.


[1] http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html (r226781, git commit id: b778cbc0c8)



Code Samples (only relevant part is copied):

--------------------  Before Canonicalization (contains call to std::max): -------------------- 
entry:
  %max_value = alloca float, align 4
  %1 = load float, float* %input, align 4, !tbaa !1
  store float %1, float* %max_value, align 4, !tbaa !1

for.body:
  %call = call dereferenceable(4) float* @_ZSt3maxIfERKT_S2_S2_(float* dereferenceable(4) %max_value, float* dereferenceable(4) %arrayidx1)
  %3 = load float, float* %call, align 4, !tbaa !1
  store float %3, float* %max_value, align 4, !tbaa !1

--------------------  After Canonicalization (contains call to std::max):-------------------- 

entry:
  %max_value = alloca float, align 4
  %1 = bitcast float* %input to i32*
  %2 = load i32, i32* %1, align 4, !tbaa !1
  %3 = bitcast float* %max_value to i32*
  store i32 %2, i32* %3, align 4, !tbaa !1

for.body:
  %call = call dereferenceable(4) float* @_ZSt3maxIfERKT_S2_S2_(float* nonnull dereferenceable(4) %max_value, float* dereferenceable(4) %arrayidx1)
  %5 = bitcast float* %call to i32*
  %6 = load i32, i32* %5, align 4, !tbaa !1
  %7 = bitcast float* %max_value to i32*
  store i32 %6, i32* %7, align 4, !tbaa !1

-------------------- After SROA (the call to std::max is inlined now):-------------------- 
entry:
  %max_value.sroa.0 = alloca i32
  %0 = bitcast float* %input to i32*
  %1 = load i32, i32* %0, align 4, !tbaa !1
  store i32 %1, i32* %max_value.sroa.0

for.body:
  %max_value.sroa.0.0.max_value.sroa.0.0.6 = load i32, i32* %max_value.sroa.0
  %3 = bitcast i32 %max_value.sroa.0.0.max_value.sroa.0.0.6 to float
  %max_value.sroa.0.0.max_value.sroa_cast8 = bitcast i32* %max_value.sroa.0 to float*
  %__b.__a.i = select i1 %cmp.i, float* %arrayidx1, float* %max_value.sroa.0.0.max_value.sroa_cast8
  %5 = bitcast float* %__b.__a.i to i32*
  %6 = load i32, i32* %5, align 4, !tbaa !1
  store i32 %6, i32* %max_value.sroa.0

-------------------- After SROA when Canonicalization is turned off: --------------------
entry:
  %0 = load float, float* %input, align 4, !tbaa !1

for.cond:                                         ; preds = %for.body, %entry
  %max_value.0 = phi float [ %0, %entry ], [ %.sroa.speculated, %for.body ]

for.body:
  %1 = load float, float* %arrayidx1, align 4, !tbaa !1
  %cmp.i = fcmp olt float %max_value.0, %1
  %.sroa.speculate.load.true = load float, float* %arrayidx1, align 4, !tbaa !1
  %.sroa.speculated = select i1 %cmp.i, float %.sroa.speculate.load.true, float %max_value.0
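
The C++ source behind these listings is not included in the email. A minimal sketch of a loop that could plausibly produce IR of this shape follows: a float local initialized from *input and then updated through std::max (the mangled name _ZSt3maxIfERKT_S2_S2_ in the calls above demangles to std::max<float>). The function name max_of, the parameter n, and the loop bounds are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical reconstruction; only max_value and input appear in the IR.
// Precondition (assumed): n >= 1.
float max_of(const float* input, std::size_t n) {
    // A float is loaded from input and stored to a float local: this is
    // the load/store pair that InstCombine rewrites to use i32.
    float max_value = input[0];
    for (std::size_t i = 1; i < n; ++i) {
        // std::max takes and returns references, so before inlining its
        // result is loaded through a float* and stored back to max_value.
        max_value = std::max(max_value, input[i]);
    }
    return max_value;
}
```

Once std::max is inlined, its body becomes a select between two float* values, which is what SROA's select speculation turns into a select between two float values in the last listing.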

Mehdi Amini via llvm-dev

Mar 16, 2016, 11:34:54 AM
to ehsan...@gmail.com, llvm...@lists.llvm.org
Hi,

How does it interact with the "typeless pointers" work?

Thanks,

-- 
Mehdi

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Ehsan Amiri via llvm-dev

Mar 16, 2016, 12:36:04 PM
to Mehdi Amini, llvm...@lists.llvm.org
Hal knows better. My understanding is that there is a similarity in the generated code pattern, in that there will be no intervening bitcasts between the load and the store.

Having said that, I just double-checked one of the test cases that was committed with the canonicalization work. My proposed solution may result in lowering memcpy to a non-integer load and store (see test1 in test/Transforms/InstCombine/struct-assign-tbaa.ll). This might be a blocker. (Even if we can fix it, the proposed solution is now more complicated than I thought.)


David Blaikie via llvm-dev

Mar 16, 2016, 1:12:51 PM
to Mehdi Amini, Chandler Carruth, llvm-dev
On Wed, Mar 16, 2016 at 8:34 AM, Mehdi Amini via llvm-dev <llvm...@lists.llvm.org> wrote:
Hi,

How does it interact with the "typeless pointers" work?

Right - the goal of the typeless pointer work is to fix all these bugs related to "didn't look through bitcasts" in optimizations. Sometimes that's going to mean more work (because the code is leaning on the absence of bitcasts & the presence of convenient (but not guaranteed) type information to inform optimization decisions) but if we remove typed pointers while keeping optimization quality in the cases we have today, then we should've also fixed the cases that were broken because the type information didn't end up aligning to produce the optimal output.

& I know I've been off the typeless pointer stuff for a few months working on llvm-dwp - but happy for any help (the next immediate piece is probably figuring out the right representation for byval and inalloca - there were some code reviews sent out for that that I'll need to come back around to - but also any optimizations people want to help rework/improve would be great too & I can provide some techniques/tools to help people approach those)

- Dave

Ehsan Amiri via llvm-dev

Mar 16, 2016, 2:01:24 PM
to David Blaikie, llvm-dev
David,

Could you give us an update on the status of the typeless pointer work? How much work is left, and when do you think it might be ready?

Thanks
Ehsan

David Blaikie via llvm-dev

Mar 16, 2016, 2:14:06 PM
to Ehsan Amiri, llvm-dev
On Wed, Mar 16, 2016 at 11:00 AM, Ehsan Amiri <ehsan...@gmail.com> wrote:
David,

Could you give us an update on the status of the typeless pointer work? How much work is left, and when do you think it might be ready?

It's a bit of an onion peel, really - since it will eventually involve generalizing/fixing every optimization that's currently leaning on typed pointers to keep the performance while removing the crutch they're currently leaning on. (in cases where bitcasts are literally just getting in the way, those won't require cleaning up & should just become "free performance wins" once we remove them, though)

At the moment we can roundtrip every LLVM IR test case through bitcode and textual IR (reading/writing both formats) while using only a narrow whitelist of places that request the type of a pointer (things like the verifier, the parser/printer where it actually needs the typed pointer to verify it matches the explicit type, etc).

The next thing on the list is probably figuring out the byval/inalloca representation (add an explicit pointee type? just make the number of bytes explicit with no type information?).

Then start migrating optimizations over - doing the same sort of testing I did for the IR/bitcode roundtripping - assert that the pointee type is not accessed, whitelist places that need it until the bitcasts go away, fix anything else... it'll still be a fair bit of work & I don't really know how much. It should parallelize pretty well (doing any of this work is really helpful, each optimization is independent, etc) if anyone wants to/is able to help.

- Dave

Ehsan Amiri via llvm-dev

Mar 22, 2016, 2:33:31 PM
to David Blaikie, t.p.no...@gmail.com, ric...@xmos.com, kpar...@codeaurora.org, paul_r...@playstation.sony.com, Nadav Rotem, daniel....@imgtec.com, thomas....@amd.com, uwei...@de.ibm.com, Mehdi Amini, Chandler Carruth, llvm-dev
Back to the discussion on the RFC, I still see some advantage in following the proposed solution. I see two paths forward:

1- Change canonical form, possibly lower memcpy to non-integer load and store in InstCombine. Then teach the backends to convert that to integer load and store if that is more profitable. Notice that we are talking about loads that have no use other than store. So it is a fairly simple change in the backends.

2- Do not change the canonical form. Then for this bug, we need to teach select speculation to see through bitcasts. We will probably need to teach other optimizations to see through bitcasts in the future as problems are uncovered. That is, until the typeless pointer work is complete. Once the typeless pointer work is complete, we will have some extra code in each optimization for seeing through bitcasts which is possibly no longer needed.

Based on this I think (1) is the right thing to do. But there might be other reasons for the current canonical form that I am not aware of. Please let me know what you think.

Thanks
Ehsan

Philip Reames via llvm-dev

Mar 22, 2016, 3:04:03 PM
to ehsan...@gmail.com, David Blaikie, t.p.no...@gmail.com, ric...@xmos.com, kpar...@codeaurora.org, paul_r...@playstation.sony.com, Nadav Rotem, daniel....@imgtec.com, thomas....@amd.com, uwei...@de.ibm.com, Mehdi Amini, Chandler Carruth, llvm-dev
I'm generally in support of (1), but you should definitely get active buy-in from Chandler before moving forward.  He's the most knowledgeable on the tradeoffs.

Also, are you willing to commit to the fairly large amount of work (1) implies?  I strongly suspect that doing (1) right will be *far* more work in the short term than (2).  It may still be the right long term answer, but are you ready to see the entire thing through?  Getting half way through and stopping would be much worse than (2). 

Philip

Mehdi Amini via llvm-dev

Mar 22, 2016, 3:16:39 PM
to ehsan...@gmail.com, llvm-dev, uwei...@de.ibm.com, thomas....@amd.com
I don't know enough about the tradeoff for 1, but 2 seems like a bandaid for something that is neither a correctness issue nor a regression. I'm not sure it justifies "bandaid patches" while there is a clear path forward, i.e. typeless pointers, unless there is an acknowledgement that typeless pointers won't be there before a couple of years.

-- 
Mehdi

Philip Reames via llvm-dev

Mar 22, 2016, 3:36:34 PM
to Mehdi Amini, ehsan...@gmail.com, llvm-dev, uwei...@de.ibm.com, thomas....@amd.com
I'd phrase this differently: being pointer-bitcast agnostic is a step towards supporting typeless pointers.  :)  We can either become bitcast agnostic all in one big change or incrementally.  Personally, I'd prefer the latter since it reduces the risk associated with enabling typeless pointers in the end.

Philip

Mehdi Amini via llvm-dev

Mar 22, 2016, 3:39:42 PM
to Philip Reames, David Blaikie, llvm-dev, uwei...@de.ibm.com, Tom Stellard
I don't really mind, but the intermediate stage will not be very nice: that's a lot of code and tests that need to be written with bitcasts, all while the bitcasts are destined to disappear. The added value isn't clear to me considering the added work. I'm not sure it wouldn't add more work for all the cleanup required by the "typeless pointer" work, but I'm not sure what's involved here, and if David thinks the intermediate step of handling bitcasts everywhere does not make it harder, I'm fine with it.

-- 
Mehdi

Hal Finkel via llvm-dev

Mar 22, 2016, 3:47:49 PM
to Mehdi Amini, llvm-dev, uwei...@de.ibm.com, Tom Stellard
From: "Mehdi Amini via llvm-dev" <llvm...@lists.llvm.org>
To: "Philip Reames" <list...@philipreames.com>, "David Blaikie" <dbla...@gmail.com>
Cc: "llvm-dev" <llvm...@lists.llvm.org>, uwei...@de.ibm.com, "Tom Stellard" <thomas....@amd.com>
Sent: Tuesday, March 22, 2016 2:39:36 PM
Subject: Re: [llvm-dev] RFC: A change in InstCombine canonical form

I don't really mind, but the intermediate stage will not be very nice: that's a lot of code and tests that need to be written with bitcasts, all while the bitcasts are destined to disappear. The added value isn't clear to me considering the added work. I'm not sure it wouldn't add more work for all the cleanup required by the "typeless pointer" work, but I'm not sure what's involved here, and if David thinks the intermediate step of handling bitcasts everywhere does not make it harder, I'm fine with it.

It is not clear to me that this is a particularly large change. As I understand it, we're only talking about the canonicalization of small memcpys, so there are no other uses of the relevant values. Changing the canonical form here seems like a pure win, except for the fact that doing the load and store using integer registers is likely more efficient on some architectures than using floating-point registers. For those architectures, we'd want to fix this in the backend. This seems relatively easy to do in CGP (or DAGCombine) (and, in fact, doing it later would enable us to catch more load/store cases than just those that come from memcpy expansions). Thoughts?

 -Hal
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Mehdi Amini via llvm-dev

Mar 22, 2016, 3:51:58 PM
to Hal Finkel, llvm-dev, uwei...@de.ibm.com, Tom Stellard
On Mar 22, 2016, at 12:47 PM, Hal Finkel <hfi...@anl.gov> wrote:



From: "Mehdi Amini via llvm-dev" <llvm...@lists.llvm.org>
To: "Philip Reames" <list...@philipreames.com>, "David Blaikie" <dbla...@gmail.com>
Cc: "llvm-dev" <llvm...@lists.llvm.org>, uwei...@de.ibm.com, "Tom Stellard" <thomas....@amd.com>
Sent: Tuesday, March 22, 2016 2:39:36 PM
Subject: Re: [llvm-dev] RFC: A change in InstCombine canonical form

I don't really mind, but the intermediate stage will not be very nice: that's a lot of code and tests that need to be written with bitcasts, all while the bitcasts are destined to disappear. The added value isn't clear to me considering the added work. I'm not sure it wouldn't add more work for all the cleanup required by the "typeless pointer" work, but I'm not sure what's involved here, and if David thinks the intermediate step of handling bitcasts everywhere does not make it harder, I'm fine with it.
It is not clear to me that this is a particularly-large change. As I understand it, we're only talking about the canonicalization of small memcpy,

What you describe is option 1; I have zero opinion on this since I don't know about the tradeoffs involved. Also, option 1 is not made obsolete by typeless pointers.
I was addressing the other option, which, as I understood it, would be to accept the current canonicalization and the implied bitcasts, and then fix all the places that don't look through them. This would be made obsolete by typeless pointers.

-- 
Mehdi

David Blaikie via llvm-dev

Mar 22, 2016, 4:09:59 PM
to Mehdi Amini, llvm-dev, Ulrich Weigand, Tom Stellard
Ultimately everything is going to be made to not rely on the types of pointers - that's nearly equivalent to bitcast-ignorant (the difference being that the presence of an extra instruction (the bitcast) might trip up some optimizations - but the presence of the /type/ information implied by the bitcast should not trip up or be necessary for optimizations (two sides of the same coin))

If we're talking about making an optimization able to ignore the bitcast instructions - yes, that work is unnecessary & perhaps questionable given the typeless pointer work. Not outright off limits, but the same time might be better invested in moving typeless pointers forward if the contributor is so inclined/able to shift in that direction.

But if we're talking about the work to make the optimization not use the type of pointers as a crutch - that work is a necessary precursor to the typeless pointer work and would be most welcome.

- David

Philip Reames via llvm-dev

Mar 22, 2016, 4:31:34 PM
to David Blaikie, Mehdi Amini, llvm-dev, Ulrich Weigand, Tom Stellard
I feel very strongly that blocking work on making optimizations bitcast-ignorant on the typeless pointer work would be a major mistake.  Unless we expect the typeless pointer work to be concluded in the near term (say 3-6 months maximum), we should not block any development which would be accepted if the typeless pointer work wasn't planned.

In my view, this is one of the largest mistakes we've made with the pass manager work, it has seriously cost us, and I don't want to see it happening again. 

Philip

Mehdi Amini via llvm-dev

Mar 22, 2016, 4:34:52 PM
to Philip Reames, llvm-dev, Ulrich Weigand, Tom Stellard
This is roughly what I wrote...

Mehdi Amini via llvm-dev

Mar 22, 2016, 4:41:23 PM
to Philip Reames, llvm-dev, Ulrich Weigand, Tom Stellard
Sorry, I should have been more clear (writing too many emails in parallel).

You're right. I was adding a +1 to you here.

Especially I wrote "unless there is an acknowledgement that typeless pointers won't be there before a couple of years" with  the PassManager in mind, and I was expecting from David some good indication of a timeframe for the typeless pointers.
If the typeless pointer work is stalled or if it is not planned for LLVM 3.9, I agree with Philip to not block anything.

-- 
Mehdi

On Mar 22, 2016, at 1:37 PM, Philip Reames <list...@philipreames.com> wrote:

But not what David was stating, unless I misread?  I was specifically responding to David's wording:

"If we're talking about making an optimization able to ignore the bitcast instructions - yes, that work is unnecessary & perhaps questionable given the typeless pointer work. Not outright off limits, but the same time might be better invested in moving typeless pointers forward if the contributor is so inclined/able to shift in that direction."

Both "perhaps questionable" and "not outright off limits" seem to strongly imply such work should be discouraged.  I disagree with that view which is why I wrote my response.

Philip

Philip Reames via llvm-dev

Mar 22, 2016, 4:42:28 PM
to Mehdi Amini, llvm-dev, Ulrich Weigand, Tom Stellard
But not what David was stating, unless I misread?  I was specifically responding to David's wording:
"If we're talking about making an optimization able to ignore the bitcast instructions - yes, that work is unnecessary & perhaps questionable given the typeless pointer work. Not outright off limits, but the same time might be better invested in moving typeless pointers forward if the contributor is so inclined/able to shift in that direction."

Both "perhaps questionable" and "not outright off limits" seem to strongly imply such work should be discouraged.  I disagree with that view which is why I wrote my response.

Philip

On 03/22/2016 01:34 PM, Mehdi Amini wrote:

David Blaikie via llvm-dev

Mar 22, 2016, 5:30:35 PM
to Mehdi Amini, llvm-dev, Ulrich Weigand, Tom Stellard
On Tue, Mar 22, 2016 at 1:41 PM, Mehdi Amini <mehdi...@apple.com> wrote:
Sorry, I should have been more clear (writing too many emails in parallel).

You're right. I was adding a +1 to you here.

Especially I wrote "unless there is an acknowledgement that typeless pointers won't be there before a couple of years" with  the PassManager in mind, and I was expecting from David some good indication of a timeframe for the typeless pointers.
If the typeless pointer work is stalled or if it is not planned for LLVM 3.9,

It's neither stalled nor planned, as such.
 
I agree with Philip to not block anything.

All I'm suggesting is that if there are people who want to fix these bugs, I'd really appreciate them helping out on the typeless pointer work - it's totally parallelizable/shardable/shareable work & the project as a whole seemed to agree it was the right direction. Why not just get it done?

The Pass Manager work is a bit different & harder to share at certain points (I don't think there's ever been a point at which someone could ask me "what can I do to help with the typeless pointer work" I couldn't've given some pretty large areas I wasn't anywhere near touching that they could've gotten their teeth into) - I think it's reaching the point where multiple passes can be ported independently & seen some work like that in Polly, etc. So at this point it seems like if people want to address the issues the new pass manager is aimed at addressing, they could pitch in on that effort.

That said, I'm not in a position (nor would I do so, even if I were) to block other patches, just to encourage people to help out on the bigger efforts in whatever way they can (either directly, or indirectly through ensuring stopgaps/new work is done in a way that makes that future work easier (where it's reasonable to judge what might be easier or harder, etc), etc)

- Dave

Mehdi Amini via llvm-dev

Mar 22, 2016, 5:43:09 PM
to David Blaikie, llvm-dev, Ulrich Weigand, Tom Stellard
On Mar 22, 2016, at 2:30 PM, David Blaikie <dbla...@gmail.com> wrote:



On Tue, Mar 22, 2016 at 1:41 PM, Mehdi Amini <mehdi...@apple.com> wrote:
Sorry, I should have been more clear (writing too many emails in parallel).

You're right. I was adding a +1 to you here.

Especially I wrote "unless there is an acknowledgement that typeless pointers won't be there before a couple of years" with  the PassManager in mind, and I was expecting from David some good indication of a timeframe for the typeless pointers.
If the typeless pointer work is stalled or if it is not planned for LLVM 3.9,

It's neither stalled nor planned, as such.
 
I agree with Philip to not block anything.

All I'm suggesting is that if there are people who want to fix these bugs, I'd really appreciate them helping out on the typeless pointer work - it's totally parallelizable/shardable/shareable work & the project as a whole seemed to agree it was the right direction. Why not just get it done?

Of course that's the right way to go; however, there might be a different amount of work involved in the two cases. For instance, maybe handling bitcasts everywhere is 20 times less work than finishing the typeless pointers, but maybe it is only 2 times less work. I have no clue, and I think not having this kind of information prevents someone who needs an issue solved from making the right choice, so they conservatively choose to go with the "bandaid".

-- 
Mehdi

Ehsan Amiri via llvm-dev

Mar 22, 2016, 5:45:01 PM
to David Blaikie, llvm-dev, Ulrich Weigand, Tom Stellard
Thanks.

Philip, as Hal said, I do not think (1) is a very large item. Please let me know if I am mistaken.

David, I think (1) is more in line with the typeless pointer work than (2). Contributing to the typeless pointer work will be great, but given its unknown time frame we cannot stop fixing existing problems. Of course, we should follow an approach consistent with the long-term solution.

Philip Reames via llvm-dev

Mar 22, 2016, 6:43:47 PM
to ehsan...@gmail.com, David Blaikie, llvm-dev, Ulrich Weigand, Tom Stellard


On 03/22/2016 02:44 PM, Ehsan Amiri wrote:
Thanks.

Philip, as Hal said, I do not think (1) is a very large item. Please let me know if I am mistaken.

I have no specific reason to believe it will be a large amount of work, but prior experience tells me changes to canonical form have a tendency of exposing unexpected issues.  To be clear, I am supportive of you implementing solution 1.

James Y Knight via llvm-dev

Mar 22, 2016, 6:58:26 PM
to ehsan...@gmail.com, llvm-dev, Ulrich Weigand, Tom Stellard
On Tue, Mar 22, 2016 at 5:44 PM, Ehsan Amiri via llvm-dev <llvm...@lists.llvm.org> wrote:
Thanks.

Philip, as Hal said, I do not think (1) is a very large item. Please let me know if I am mistaken.

David, I think (1) is more in line with the typeless pointer work than (2). Contributing to the typeless pointer work will be great, but given its unknown time frame we cannot stop fixing existing problems. Of course, we should follow an approach consistent with the long-term solution.

It seems to me that the question to ask is what would be the best state of the code, assuming that the typeless pointers work had already been done. Is it the current canonical form? Or the newly proposed one?

I think it'd be the current one? If so, I'd suggest that proposal #2 is more compatible with the typeless pointer work. That is because (if done properly), code which knows how to look through pointer bitcast nodes is something which will be much more easily deletable once pointer bitcast nodes cease to exist, than to change the canonicalization of memcpy back and remove any backend code which was added only to compensate for that change.

Ehsan Amiri via llvm-dev

Mar 22, 2016, 7:16:04 PM
to James Y Knight, llvm-dev, Ulrich Weigand, Tom Stellard
James,

I think (1) reduces the number of "do-not-see-through-bitcast" bugs that we need to uncover and fix between now and the time that typeless pointers are available. With (2), it is likely that we will have multiple such fixes in the code and then have to remove each one of them (which means each one has to be done properly to be easily removable).

Changing the canonicalization of memcpy will mean removing a couple of lines of code. I am not sure about the size of the backend changes to optimize load-store patterns, but I expect it to be small.

Mehdi Amini via llvm-dev

Mar 22, 2016, 7:33:48 PM
to ehsan...@gmail.com, llvm-dev, Tom Stellard, Ulrich Weigand
On Mar 22, 2016, at 4:15 PM, Ehsan Amiri via llvm-dev <llvm...@lists.llvm.org> wrote:

James,

I think (1) reduces the number of "do-not-see-through-bitcast" bugs that we need to uncover and fix between now and the time that typeless pointers are available. With (2), it is likely that we will have multiple such fixes in the code and then have to remove each one of them (which means each one has to be done properly to be easily removable).

Changing the canonicalization of memcpy will mean removing a couple of lines of code. I am not sure about the size of the backend changes to optimize load-store patterns, but I expect it to be small.

Are you saying that the canonicalization you want to change is temporary till we get the typeless pointers?

-- 
Mehdi






Ehsan Amiri via llvm-dev

Mar 22, 2016, 7:39:19 PM
to Mehdi Amini, llvm-dev, Tom Stellard, Ulrich Weigand
On Tue, Mar 22, 2016 at 7:33 PM, Mehdi Amini <mehdi...@apple.com> wrote:

On Mar 22, 2016, at 4:15 PM, Ehsan Amiri via llvm-dev <llvm...@lists.llvm.org> wrote:

James,

I think (1) reduces the number of "do-not-see-through-bitcast" bugs that we need to uncover and fix between now and the time that typeless pointers are available. With (2), it is likely that we will have multiple such fixes in the code and then have to remove each one of them (which means each one has to be done properly to be easily removable).

Changing the canonicalization of memcpy will mean removing a couple of lines of code. I am not sure about the size of the backend changes to optimize load-store patterns, but I expect it to be small.

Are you saying that the canonicalization you want to change is temporary till we get the typeless pointers?


I think typeless pointers will automatically make it obsolete. Remember, I proposed to make a change "when we load a value of type T from a type T* memory address". There will not be a type T* memory address once the typeless pointer work is in.

Hal Finkel via llvm-dev

Mar 22, 2016, 7:46:41 PM
to ehsan...@gmail.com, llvm-dev, Ulrich Weigand, Tom Stellard
From: "Ehsan Amiri via llvm-dev" <llvm...@lists.llvm.org>
To: "Mehdi Amini" <mehdi...@apple.com>
Cc: "llvm-dev" <llvm...@lists.llvm.org>, "Tom Stellard" <thomas....@amd.com>, "Ulrich Weigand" <uwei...@de.ibm.com>
Sent: Tuesday, March 22, 2016 6:38:32 PM
Subject: Re: [llvm-dev] RFC: A change in InstCombine canonical form

On Tue, Mar 22, 2016 at 7:33 PM, Mehdi Amini <mehdi...@apple.com> wrote:

On Mar 22, 2016, at 4:15 PM, Ehsan Amiri via llvm-dev <llvm...@lists.llvm.org> wrote:

James,

I think (1) reduces the number of "do-not-see-through-bitcast" bugs that we need to uncover and fix between now and the time that typeless pointers are available. With (2), it is likely that we will have multiple such fixes in the code and then have to remove each one of them (which means each one has to be done properly to be easily removable).

Changing the canonicalization of memcpy will mean removing a couple of lines of code. I am not sure about the size of the backend changes to optimize load-store patterns, but I expect it to be small.

Are you saying that the canonicalization you want to change is temporary till we get the typeless pointers?


I think typeless pointers will automatically make it obsolete. Remember, I proposed to make a change "when we load a value of type T from a type T* memory address". There will not be a type T* memory address once the typeless pointer work is in.

When we transform a small memcpy into a pair of load and store instructions, we'll still need to pick a type. Currently, as I understand it, we always pick integers. It is proposed to use the original type instead. Once we have typeless pointers, how will we pick the type? If the answer is that we'll always use integers, then I suppose this is temporary. Otherwise, not. Does that accurately represent the situation?

 -Hal
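
The "we always pick integers" behavior described above can be modeled with a toy helper that maps a small fixed copy size to an integer width. The function name pick_int_bits and the restriction to 1, 2, 4, or 8 bytes are assumptions made for this sketch, not a statement of what InstCombine actually checks.

```cpp
#include <cstddef>

// Toy model: choose an integer width (in bits) for replacing a small
// fixed-size memcpy with one load and one store. Returns 0 when the copy
// should stay a memcpy. The size limits are assumed, not taken from LLVM.
unsigned pick_int_bits(std::size_t size_in_bytes) {
    switch (size_in_bytes) {
        case 1:
        case 2:
        case 4:
        case 8:
            return static_cast<unsigned>(size_in_bytes * 8);
        default:
            return 0;
    }
}
```

Under this model, a 4-byte copy of a struct holding a single float becomes an i32 load and store. The proposal under discussion would instead use float for that pair, and the open question is which rule remains once pointers no longer carry a pointee type.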



 
-- 
Mehdi





On Tue, Mar 22, 2016 at 6:58 PM, James Y Knight <jykn...@google.com> wrote:


On Tue, Mar 22, 2016 at 5:44 PM, Ehsan Amiri via llvm-dev <llvm...@lists.llvm.org> wrote:
Thanks.

Philip, as Hal said, I do not think (1) is a very large item. Please let me know if I am mistaken.

David, I think (1) is more in line with the typeless pointer work than (2). Contributing to the typeless pointer work would be great, but given its unknown time frame we cannot stop fixing existing problems. Of course, we should follow an approach consistent with the long-term solution.

It seems to me that the question to ask is what would be the best state of the code, assuming that the typeless pointers work had already been done. Is it the current canonical form? Or the newly proposed one?

I think it'd be the current one? If so, I'd suggest that proposal #2 is more compatible with the typeless pointer work. That is because (if done properly), code which knows how to look through pointer bitcast nodes is something which will be much more easily deletable once pointer bitcast nodes cease to exist, than to change the canonicalization of memcpy back and remove any backend code which was added only to compensate for that change.

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev






Ehsan Amiri via llvm-dev

Mar 22, 2016, 7:53:34 PM
to Hal Finkel, llvm-dev, Ulrich Weigand, Tom Stellard
On Tue, Mar 22, 2016 at 7:46 PM, Hal Finkel <hfi...@anl.gov> wrote:
I think typeless pointers will automatically make it obsolete. Remember, I proposed to make a change "when we load a value of type T from a type T* memory address". There will not be a type T* memory address once the typeless pointer work is in.

When we transform a small memcpy into a pair of load and store instructions, we'll still need to pick a type. Currently, as I understand it, we always pick integers. It is proposed to use the original type instead. Once we have typeless pointers, how will we pick the type? If the answer is that we'll always use integers, then I suppose this is temporary. Otherwise, not. Does that accurately represent the situation?


The way I understand it, it is temporary (using Hal's wording).

Ehsan Amiri via llvm-dev

Mar 22, 2016, 9:54:35 PM
to Philip Reames, llvm-dev, Ulrich Weigand, Tom Stellard

On Tue, Mar 22, 2016 at 6:42 PM, Philip Reames <list...@philipreames.com> wrote:


On 03/22/2016 02:44 PM, Ehsan Amiri wrote:
Thanks.

Philip, as Hal said, I do not think (1) is a very large item. Please let me know if I am mistaken.
I have no specific reason to believe it will be a large amount of work, but prior experience tells me changes to canonical form tend to expose unexpected issues. To be clear, I am supportive of you implementing solution 1.


Thanks Philip. Do you mean exposing bugs that were unknown before, or something else?

If the problem is that some transformations will start to have problems because they do not know how to deal with the new canonical form, that's alarming. If the chance of exposing such problems is not too high, then we can possibly go ahead with (1) and fall back to (2) if (1) starts to get complicated. But if the risk is too high, then we may want to prefer (2) to (1).

Any suggestions?

Philip Reames via llvm-dev

Mar 23, 2016, 12:22:25 AM
to ehsan...@gmail.com, llvm-dev, Ulrich Weigand, Tom Stellard


On 03/22/2016 06:50 PM, Ehsan Amiri wrote:


On Tue, Mar 22, 2016 at 6:42 PM, Philip Reames <list...@philipreames.com> wrote:


On 03/22/2016 02:44 PM, Ehsan Amiri wrote:
Thanks.

Philip, as Hal said, I do not think (1) is a very large item. Please let me know if I am mistaken.
I have no specific reason to believe it will be a large amount of work, but prior experience tells me changes to canonical form tend to expose unexpected issues. To be clear, I am supportive of you implementing solution 1.


Thanks Philip. Do you mean exposing bugs that were unknown before, or something else?

If the problem is that some transformations will start to have problems because they do not know how to deal with the new canonical form, that's alarming. If the chance of exposing such problems is not too high, then we can possibly go ahead with (1) and fall back to (2) if (1) starts to get complicated. But if the risk is too high, then we may want to prefer (2) to (1).
Most are not correctness bugs, but performance bugs. Later passes (particularly CGP and later) do not do well when presented with arbitrary non-canonical IR. They won't crash, but they may not be as effective as expected and may miss obvious performance wins. When you change what canonical is, you sometimes miss the existing sweet spot and have to adjust the code to handle the new canonical form.

My suggestion would be to implement your change, run a reasonable set of performance tests, fix any problems, and iterate.  Just don't be surprised if what seems like a simple patch snowballs into a series of mostly minor patches.  I'm not trying to warn you off or scare you; I just want to make sure you have reasonable expectations going in. 

Any suggestions?

Ehsan Amiri via llvm-dev

Mar 23, 2016, 11:58:51 AM
to Philip Reames, llvm-dev, Ulrich Weigand, Tom Stellard
OK. I will do some experiments with (1) on PowerPC and will update this thread with the results.


Chandler Carruth via llvm-dev

Mar 28, 2016, 5:18:38 PM
to ehsan...@gmail.com, Philip Reames, llvm-dev, Ulrich Weigand, Tom Stellard
Sorry for my delay in responding; I finally caught up on my email to this point and read through the whole thread.

First and foremost: we should *definitely* not sit on our hands and wait for typeless pointers to arrive. However, we also shouldn't (IMO) take on lots of technical debt instead of working on causing typeless pointers to arrive sooner. But I don't think any of the options here are likely to incur large technical debt, so we should IMO feel free to pursue either approach.

I do actually think that in the face of typeless pointers, we will likely want to use integer loads and stores in the absence of some operation that makes a particular type a better fit. The reason I feel this way is because that will give us more consistent and thus "canonical" results from different input programs.

I think leaning on pointer type to pick a better type when lowering memcpy is a bad idea because it will essentially cause us to not know about the optimizations that are currently blocked by pointer bitcasts.

I have been advocating in other places that we should keep canonicalizing exactly as we currently do, and teach every optimization pass to look through pointer bitcasts (until they go away). The particular reason I advocate for this is that I expect this to make it *easier to get to typeless pointers*. Every time we fix optimization passes to look through bitcasts, we get the rest of the optimizer closer in semantics to the world with typeless pointers. As an example, some passes may need work to support this, and we can parallelize that work with the other typeless pointer work.

When typeless pointers arrive, yes, we will have lots of code that looks through bitcasts that are no longer there, but I suspect that code will be both harmless and easily found and removed. Whereas adding usage of pointer types runs the risk of continued barriers to typeless pointers creeping into the optimization layers.
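
A hand-written sketch of the kind of pattern at issue, assuming it resembles the select speculation case from bug 26445 (the function names, value names, and the dereferenceable attributes are illustrative assumptions, and the syntax is the typed-pointer IR of the time). The pointer bitcast sits between the select and the load, so a pass that only matches a load whose pointer operand is the select itself would not match here.

```llvm
; With the current canonical form, the select's only user is a bitcast,
; not a load.
define float @pick(i1 %c, float* dereferenceable(4) %a, float* dereferenceable(4) %b) {
  %p = select i1 %c, float* %a, float* %b
  %pi = bitcast float* %p to i32*
  %vi = load i32, i32* %pi, align 4
  %v = bitcast i32 %vi to float
  ret float %v
}

; What a pass that looks through the pointer bitcast could produce,
; assuming both loads are safe to execute unconditionally.
define float @pick_speculated(i1 %c, float* dereferenceable(4) %a, float* dereferenceable(4) %b) {
  %va = load float, float* %a, align 4
  %vb = load float, float* %b, align 4
  %v = select i1 %c, float %va, float %vb
  ret float %v
}
```

The second function no longer selects between pointers at all, which is the property that later promotion of the underlying memory to registers would rely on.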

So, I vote for approach #2 above FWIW.
-Chandler

On Wed, Mar 23, 2016 at 8:58 AM Ehsan Amiri via llvm-dev <llvm...@lists.llvm.org> wrote:
OK. I will do some experiments with (1) on PowerPC and will update this thread with the results.


Ehsan Amiri via llvm-dev

Mar 29, 2016, 12:47:29 PM
to Chandler Carruth, llvm-dev, Ulrich Weigand, Tom Stellard
Thanks Chandler. So I will focus on fixing bug 26445, using approach (2). If I encounter something unexpected, I will update this thread.

Ehsan
