[llvm-dev] Scalable Vector Types in IR

Renato Golin via llvm-dev

Mar 8, 2019, 11:08:39 AM

to LLVM Dev, David Greene, Chandler Carruth, Chris Lattner, Maxim Kuvyrkov

Hi folks,

We seem to be converging on how the representation of scalable vectors
will be implemented in IR, and we also have support for such vectors
in the AArch64 back-end. We're also fresh out of the release process
and have a good number of months to hash out potential problems until
next release. What are the next steps to get this merged into trunk?

Given this is a major change to IR, we need more reviews from core
developers before approving. The current quasi-consensus means now is
the time for you to look closer. :)

This change per se shouldn't change how any of the passes or lowering
behave, but it will introduce the ability to break things in the
future. Unlike the "new pass manager", we can't create a "new IR", so
it needs to be a change that everyone is aware of and willing to take
on, and to help stabilize before the next release.

Here are some of the reviews on the matter, mostly agreed upon by the
current reviewers:
https://reviews.llvm.org/D32530
https://reviews.llvm.org/D53137
https://reviews.llvm.org/D47770

And the corresponding RFC threads:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/106819.html
http://lists.llvm.org/pipermail/llvm-dev/2017-March/110772.html
http://lists.llvm.org/pipermail/llvm-dev/2017-June/113587.html
http://lists.llvm.org/pipermail/llvm-dev/2018-April/122517.html
http://lists.llvm.org/pipermail/llvm-dev/2018-June/123780.html

There is also an ongoing discussion about vector predication, which is
related to, but does not depend on, the scalable representation in IR:
https://reviews.llvm.org/D57504

And the corresponding RFC thread:
http://lists.llvm.org/pipermail/llvm-dev/2019-January/129791.html

cheers,
--renato
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Graham Hunter via llvm-dev

Mar 12, 2019, 7:35:58 AM

to Renato Golin, LLVM Dev, Chandler Carruth, David Greene, Chris Lattner, Maxim Kuvyrkov

Hi all,

Thanks Renato for the prod.

We (Arm) have had more off-line discussions with some members of the
community and they have expressed some reservations on adding scalable
vectors as a first class type. They have proposed an alternative to enable
support for C-level intrinsics and autovectorization for SVE.

While Arm's preference is still to support VLA autovec in LLVM (and not just
for SVE; we'll continue the discussion around the RFC), we are evaluating the
details of this alternative -- SVE-capable hardware will begin shipping within
the next couple of years, so we would like to support at least some
autovectorization as well as the C intrinsics by the time that happens.

This alternative proposal has two parts:

* For the SVE ACLE (C-language extension intrinsics), use an opaque type
(similar to x86_mmx, but unsized) and just pass intrinsics straight
through to the backend. This would give users the ability to write
vector length agnostic (VLA) code for SVE without resorting to assembly.
* For SVE autovectorization, use fixed length autovec -- either for a
user-specified length, or multiversioned to different fixed lengths.

I've spent some time over the last month prototyping an opaque type SVE C
intrinsic implementation to see what it would look like; here are my notes so far:

* I initially tried to use a single unsized opaque type.

* I ran into problems with using just a single type, since predicates use
different registers and I couldn't find a nice way of reassigning all
the register classes cleanly.
  - I added a second opaque type to represent predicates as a result
  - This could be avoided if we added subtype info to the opaque type
    (basically minimum element count and elt type); this would mean that
    we would either need to represent the count and element type in
    a serialized IR form, or that the IR reader would need to be able
    to reconstruct the types by reading the types from the intrinsic name

* I ran into a problem with the opaque types being unsized -- the C
intrinsic variables are declared as locals and clang plants
alloca/load/store IR instructions for them
  - Could special case alloca/load/store for these types, but that's very
    intrusive and liable to break in future
  - Could introduce a special 'alloca intrinsic', but would require quite
    a bit of code in clang to diverge as well as a custom mem2reg-like
    pass just for these types
  - I ended up making them sized, but with a size of 0. I don't know if
    there's a problem I'll run into later on by doing this.
  - While 'load' and 'store' IR instructions are fine for spill/fill memory
    operations on the stack, we need to use intrinsics for everything else
    since we need to know the size of individual elements -- while there
    might not be many big-endian systems in operation, we still need to
    support that.

* I reused the same (clang-level) builtin type mechanism that OpenCL does
for the SVE C-level types, and just codegen to the two LLVM types

I now have a minimal end-to-end implementation for a small set of SVE C
intrinsics. I have some additional observations based on our downstream
intrinsic implementation:

* Our initial downstream implementation attempted to do everything in
intrinsics, so would be similar to the opaque type version. However,
we found that we missed several optimizations in the process. Part of
this is due to the intrinsics being higher-level than the instructions
-- things like addressing modes are not represented in the intrinsics,
and with a pure intrinsic approach we miss things like LSR
optimizations.

* We also thought that the need for custom extensions for optimizations
like instcombine on SVE intrinsics would be reduced since someone using
the intrinsics is already going to the trouble of hand-optimizing their
code, but we hadn't appreciated that using C++ templates with constant
parameters and other methods of code generation would be common. As a
result, we now have user requests that operations like 'svmul(X, 1.0)'
be recognized and folded away, and are trying to find better
representations, including lowering to normal IR operations in some cases.

* Some operations can't be represented cleanly in current IR, but should
work well with Simon Moll's vector predication proposal.
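[Editor's sketch] The 'svmul(X, 1.0)' observation above can be modelled with a toy expression tree. The point it tries to show is that a generic fold written for ordinary multiply nodes does not fire on an opaque intrinsic call unless someone adds target-specific knowledge. Everything here is hypothetical: the `Expr` type is not LLVM's data structure, `"sve.fmul"` is a made-up intrinsic name, and predication is ignored entirely.

```cpp
#include <memory>
#include <string>

// Toy expression node, for illustration only.
struct Expr {
  enum Kind { Value, Const, FMul, IntrinsicCall } kind = Value;
  std::string name;      // value name, or intrinsic name for calls
  double constant = 0.0; // payload when kind == Const
  std::shared_ptr<Expr> lhs, rhs;
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr makeValue(const std::string &n) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Value;
  e->name = n;
  return e;
}

ExprPtr makeConst(double c) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Const;
  e->constant = c;
  return e;
}

ExprPtr makeBinary(Expr::Kind k, const std::string &n, ExprPtr l, ExprPtr r) {
  auto e = std::make_shared<Expr>();
  e->kind = k;
  e->name = n;
  e->lhs = l;
  e->rhs = r;
  return e;
}

bool isOne(const ExprPtr &e) {
  return e && e->kind == Expr::Const && e->constant == 1.0;
}

// Generic fold that only understands native multiply nodes: x * 1.0 -> x.
ExprPtr foldGeneric(const ExprPtr &e) {
  if (e->kind == Expr::FMul && isOne(e->rhs))
    return e->lhs;
  return e; // intrinsic calls are opaque to this fold
}

// The same fold, duplicated by hand for one (hypothetical) intrinsic name.
ExprPtr foldWithIntrinsicKnowledge(const ExprPtr &e) {
  if (e->kind == Expr::IntrinsicCall && e->name == "sve.fmul" && isOne(e->rhs))
    return e->lhs;
  return foldGeneric(e);
}
```

With a native `FMul` node the generic fold returns the left operand directly; with the `IntrinsicCall` form the expression survives `foldGeneric` unchanged and is simplified only by the hand-written extension, which is the maintenance cost the message describes.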

Any feedback? I've posted my (very rough) initial work to phabricator:

clang: https://reviews.llvm.org/D59245
llvm: https://reviews.llvm.org/D59246

-Graham

David Greene via llvm-dev

Mar 12, 2019, 11:17:57 AM

to Graham Hunter, LLVM Dev, Chandler Carruth, Chris Lattner, Maxim Kuvyrkov

Graham Hunter <Graham...@arm.com> writes:

> We (Arm) have had more off-line discussions with some members of the
> community and they have expressed some reservations on adding scalable
> vectors as a first class type. They have proposed an alternative to
> enable support for C-level intrinsics and autovectorization for SVE.

Can we get a summary of those discussions? What are the concerns?

-David

Graham Hunter via llvm-dev

Mar 13, 2019, 9:57:59 AM

to David Greene, LLVM Dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

Hi David,

> On 12 Mar 2019, at 15:17, David Greene <d...@cray.com> wrote:
>
> Graham Hunter <Graham...@arm.com> writes:
>
>> We (Arm) have had more off-line discussions with some members of the
>> community and they have expressed some reservations on adding scalable
>> vectors as a first class type. They have proposed an alternative to
>> enable support for C-level intrinsics and autovectorization for SVE.
>
> Can we get a summary of those discussions? What are the concerns?
>
> -David

I did ask them to post their arguments on the list, but I guess they've been busy for the last month (or forgot about it).

The basic argument was that they didn't believe the value gained from enabling VLA autovectorization was worth the added complexity in maintaining the codebase. They were open to changing their minds if we could demonstrate sufficient demand for the feature.

-Graham

Renato Golin via llvm-dev

Mar 13, 2019, 10:29:26 AM

to Graham Hunter, LLVM Dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On Wed, 13 Mar 2019 at 13:57, Graham Hunter <Graham...@arm.com> wrote:
> I did ask them to post their arguments on the list, but I guess they've been busy for the last month (or forgot about it).

Who is "them" and who will write up a proposal / RFC on the use of
intrinsics for both lowering and vectorisation?

It goes without saying that those discussions should have been had in
the mailing list, not behind closed doors. Agreeing to implementations
in private is asking to get bad reviews in public, as the SVE process
has shown *over and over again*.

I don't understand why, after so many problems for so many years, this
is still the modus operandi...

> The basic argument was that they didn't believe the value gained from enabling VLA autovectorization was worth the added complexity in maintaining the codebase. They were open to changing their minds if we could demonstrate sufficient demand for the feature.

In that case, the current patches to change the IR should be
abandoned, as well as reverting the previous change to the types, so
that we don't carry any unnecessary code forward.

The review you sent seems to be a mechanical change to include the
intrinsics, but the target lowering change seems to be too small to
actually be able to lower anything.

Without context, it's hard to know what's going on.

cheers,
--renato

Finkel, Hal J. via llvm-dev

Mar 13, 2019, 11:27:51 AM

to Renato Golin, Graham Hunter, LLVM Dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On 3/13/19 9:29 AM, Renato Golin wrote:
> On Wed, 13 Mar 2019 at 13:57, Graham Hunter <Graham...@arm.com> wrote:
>> I did ask them to post their arguments on the list, but I guess they've been busy for the last month (or forgot about it).
> Who is "them" and who will write up a proposal / RFC on the use of
> intrinsics for both lowering and vectorisation?
>
> It goes without saying that those discussions should have been had in
> the mailing list, not behind closed doors.

Renato, I understand your frustration, but I don't want an unproductive
conclusion to be drawn. We should encourage our community members to
talk to each other both on the mailing list and off the mailing list. We
have in-person discussions at the developers' meetings, and my
experience is that sitting in a room with someone, or sometimes talking
with someone over the phone, can really help reach a mutual
understanding more effectively than mailing-list communication. However,
the critical step is that the outcome of that conversation should be
summarized, in a timely manner, for the mailing list (or put in the
relevant code review, bug report, etc.) so that the rest of us can
provide input.

> Agreeing to implementations
> in private is asking to get bad reviews in public

+1

-Hal

> , as the SVE process
> has shown *over and over again*.
>
> I don't understand why, after so many problems for so many years, this
> is still the modus operandi...
>
>> The basic argument was that they didn't believe the value gained from enabling VLA autovectorization was worth the added complexity in maintaining the codebase. They were open to changing their minds if we could demonstrate sufficient demand for the feature.
> In that case, the current patches to change the IR should be
> abandoned, as well as reverting the previous change to the types, so
> that we don't carry any unnecessary code forward.
>
> The review you sent seems to be a mechanical change to include the
> intrinsics, but the target lowering change seems to be too small to
> actually be able to lower anything.
>
> Without context, it's hard to know what's going on.
>
> cheers,
> --renato

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Renato Golin via llvm-dev

Mar 13, 2019, 11:56:07 AM

to Finkel, Hal J., LLVM Dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On Wed, 13 Mar 2019 at 15:27, Finkel, Hal J. <hfi...@anl.gov> wrote:
> However,
> the critical step is that the outcome of that conversation should be
> summarized, in a timely manner, for the mailing list (or put in the
> relevant code review, bug report, etc.) so that the rest of us can
> provide input.

That was my point. Discussing in person at the dev meetings is
essential, but without follow-through, the people at the meeting will
be following a different path from the rest of the community.

Right now, whatever conclusions were drawn at that meeting are still
"behind closed doors", which pushed me and David to continue reviewing
patches that were essentially dead and have parallel discussions that
were not only moot, but counter-productive.

--renato

Graham Hunter via llvm-dev

Mar 13, 2019, 12:04:27 PM

to Renato Golin, LLVM Dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

Hi Renato,

> It goes without saying that those discussions should have been had in
> the mailing list, not behind closed doors.

I have encouraged people to respond on the list or the RFC many times,
but I've not had much luck in getting people to post even if they
approve of the idea.

> Agreeing to implementations
> in private is asking to get bad reviews in public, as the SVE process
> has shown *over and over again*.

There isn't an agreement on the implementation yet; I have posted two
possibilities and am trying to get consensus on an approach from the
community.

>> The basic argument was that they didn't believe the value gained from enabling VLA autovectorization was worth the added complexity in maintaining the codebase. They were open to changing their minds if we could demonstrate sufficient demand for the feature.
>
> In that case, the current patches to change the IR should be
> abandoned, as well as reverting the previous change to the types, so
> that we don't carry any unnecessary code forward.

There's no consensus on supporting the opaque types either yet. Even
if we do end up going down that route, it could be modified -- as I
mentioned in my notes, I could introduce a single toplevel type to
the IR if I stored additional data in it (making it effectively the
same as the current VectorType, just opaque to existing optimization
passes), and then would be able to lower directly to the existing
scalable MVTs we have.

> The review you sent seems to be a mechanical change to include the
> intrinsics, but the target lowering change seems to be too small to
> actually be able to lower anything.

The new patches are just meant to demonstrate the basics of the opaque
type to see if there's greater consensus in exploring this approach
instead of the VLA approach.

> Without context, it's hard to know what's going on.

The current state is just what you stated in your initial email in this
chain; we have a solution that seems to work (in principle) for SVE, RVV,
and SX-Aurora, but not enough people that care about VLA vectorization
beyond those groups.

Given the time constraints, Arm is being pushed to consider a plan B to
get something working in time for early 2020.

-Graham

Amara Emerson via llvm-dev

Mar 13, 2019, 2:45:34 PM

to Graham Hunter, Renato Golin, LLVM Dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

Disclaimer: I’m only speaking for myself, not Apple.

This is really disappointing. Resorting to multi-versioned fixed length vectorization isn’t a solution that’s competitive with the native VLA support, so it doesn’t look like a credible alternative suggestion (at least not without elaborating it on the mailing list). Without a practical alternative, it’s essentially saying “no” to a whole class of vector architectures of which SVE is only one.

Amara

Finkel, Hal J. via llvm-dev

Mar 13, 2019, 2:55:49 PM

to Amara Emerson, Graham Hunter, Renato Golin, LLVM Dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On 3/13/19 1:45 PM, Amara Emerson via llvm-dev wrote:
> Disclaimer: I’m only speaking for myself, not Apple.
>
> This is really disappointing. Resorting to multi-versioned fixed length vectorization isn’t a solution that’s competitive with the native VLA support, so it doesn’t look like a credible alternative suggestion (at least not without elaborating it on the mailing list). Without a practical alternative, it’s essentially saying “no” to a whole class of vector architectures of which SVE is only one.

To the extent that this alternative direction represents an exploration
so that we can all evaluate in a more-informed manner, I think that is
valuable. However, let me agree with Amara, I prefer the original
approach. Among many other advantages, users will expect the compiler to
perform arithmetic optimizations on VLA operations (e.g., InstCombines),
and if we can't reuse the existing logic for this purpose, we'll end up
with an inferior result.

Thanks again,

Hal

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Renato Golin via llvm-dev

Mar 13, 2019, 7:02:24 PM

to Hal Finkel, LLVM Dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

Agreed with both!

Furthermore, any temporary solution will have to be very similar to what we expect to see natively, or the transition to native may never happen.

Graham Hunter via llvm-dev

Mar 14, 2019, 6:45:46 AM

to Renato Golin, LLVM Dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

Thanks for the support.

To clarify, Arm would very much prefer to proceed with the full scalable
IR type proposal, but we're facing time pressure now.

We would like to be able to reach consensus on an approach around the end
of EuroLLVM this year so that we can begin a full implementation.

The opaque type patches were only intended to show how the third party proposal
might look; I agree it should be closer to the scalable IR proposal. The
two main points that (imo) would make it easier to switch later are:
- Embedding the element type and minimum length, which copies the basic
semantics of VectorType
- Serializing in the same way we do in the scalable IR proposal
(to a '<scalable n x ty>'). We should then just be able to switch
the Types used by the IR reader and writer.
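[Editor's sketch] The two points above (carry the element type and minimum length, and serialize to the `<scalable n x ty>` form) could be modelled roughly as follows. This is not code from the posted patches: `ScalableVecTy`, `serialize`, and `sizeInBits` are hypothetical names, and the `multiple` parameter is an assumed stand-in for the run-time factor by which the hardware vector exceeds the minimum length.

```cpp
#include <cstdint>
#include <string>

// Hypothetical description of a scalable vector type: only the minimum
// element count and the element type are known at compile time.
struct ScalableVecTy {
  unsigned minElts;  // the 'n' in '<scalable n x ty>'
  std::string eltTy; // textual element type, e.g. "i32"
  unsigned eltBits;  // width of one element in bits
};

// Textual form using the syntax quoted in this thread.
std::string serialize(const ScalableVecTy &t) {
  return "<scalable " + std::to_string(t.minElts) + " x " + t.eltTy + ">";
}

// The total size is only known once the run-time multiple is known;
// with multiple == 1 this is the minimum size.
std::uint64_t sizeInBits(const ScalableVecTy &t, unsigned multiple) {
  return static_cast<std::uint64_t>(t.minElts) * t.eltBits * multiple;
}
```

For example, a type with four 32-bit elements serializes as `<scalable 4 x i32>` and has a minimum size of 128 bits. Since an opaque type carrying the same two fields would print identically, the textual IR would not need to change if the underlying Type class were swapped later, which seems to be the intent of the second point.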

-Graham

Francesco Petrogalli via llvm-dev

Mar 14, 2019, 10:27:23 AM

to Graham Hunter, LLVM Dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

> On Mar 14, 2019, at 5:45 AM, Graham Hunter via llvm-dev <llvm...@lists.llvm.org> wrote:
>
> We would like to be able to reach consensus on an approach around the end
> of EuroLLVM this year so that we can begin a full implementation.

EuroLLVM will be a very good occasion to hear all the opinions and reach consensus on the approach.

@Graham, maybe you should set up a round table open to anyone, and make sure that key people (third parties asking for a different approach, people involved in this discussion) will receive the invite?

Francesco

David Greene via llvm-dev

Mar 14, 2019, 11:02:24 PM

to Francesco Petrogalli via llvm-dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

Francesco Petrogalli via llvm-dev <llvm...@lists.llvm.org> writes:

>> On Mar 14, 2019, at 5:45 AM, Graham Hunter via llvm-dev <llvm...@lists.llvm.org> wrote:
>>
>> We would like to be able to reach consensus on an approach around the end
>> of EuroLLVM this year so that we can begin a full implementation.
>
> EuroLLVM will be a very good occasion to hear all the opinions and reach consensus on the approach.
>
> @Graham, maybe you should set up a round table open to anyone, and
> make sure that key people (third parties asking for a different
> approach, people involved in this discussion) will receive the invite?

We tried two round tables at the Nov. LLVMDev and no serious objections
were raised, but we knew we didn't have all the right people there. I
am somewhat skeptical another roundtable without commitment to attend
from all able parties ahead of time will accomplish much.

Speaking for myself (and not Cray), it is frustrating to have had a
bunch of discussion on the mailing list and in reviews where concerns
were raised and to see a lot of radio silence to responses to those
concerns, only to see a message about a potential change in direction
driven by off-list discussions where concerns and responses to concerns
are unknown and therefore not addressable.

I completely understand that ARM needs to make progress and I very much
want to see that progress. I just don't want to see a Plan B leading to
a situation where VLA support doesn't ever make it into LLVM. It is
somewhat embarrassing that gcc already has a release with VLA support
for SVE and LLVM is stuck in the starting blocks.

-David

Graham Hunter via llvm-dev

Mar 15, 2019, 6:18:29 AM

to David Greene, Francesco Petrogalli via llvm-dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

Hi David,

> We tried two round tables at the Nov. LLVMDev and no serious objections
> were raised, but we knew we didn't have all the right people there. I
> am somewhat skeptical another roundtable without commitment to attend
> from all able parties ahead of time will accomplish much.

Agreed, but I'll try scheduling one anyway.

> Speaking for myself (and not Cray), it is frustrating to have had a
> bunch of discussion on the mailing list and in reviews where concerns
> were raised and to see a lot of radio silence to responses to those
> concerns, only to see a message about a potential change in direction
> driven by off-list discussions where concerns and responses to concerns
> are unknown and therefore not addressable.

I didn't want private meetings either, but repeatedly requesting public
feedback for the RFC or patches hadn't provided reasoning behind any
concerns that people had.

The agreement reached at the meeting was for the objectors to post their
reasons for objecting and counter-proposal in public so discussion could
take place, and Arm would investigate the details of the counter-proposal.

Unfortunately, that post never happened, so I found myself a bit stuck and
had to post it for them -- not a situation I wanted.

I have always wanted the discussion to take place in public.

> I completely understand that ARM needs to make progress and I very much
> want to see that progress. I just don't want to see a Plan B leading to
> a situation where VLA support doesn't ever make it into LLVM. It is
> somewhat embarrassing that gcc already has a release with VLA support
> for SVE and LLVM is stuck in the starting blocks.

Agreed.

-Graham

Finkel, Hal J. via llvm-dev

Mar 15, 2019, 11:30:51 AM

to Graham Hunter, David Greene, Francesco Petrogalli via llvm-dev, Chandler Carruth, nd, Maxim Kuvyrkov, Chris Lattner

On 3/15/19 5:18 AM, Graham Hunter via llvm-dev wrote:
> Hi David,
>
>> We tried two round tables at the Nov. LLVMDev and no serious objections
>> were raised, but we knew we didn't have all the right people there. I
>> am somewhat skeptical another roundtable without commitment to attend
>> from all able parties ahead of time will accomplish much.
> Agreed, but I'll try scheduling one anyway.
>
>> Speaking for myself (and not Cray), it is frustrating to have had a
>> bunch of discussion on the mailing list and in reviews where concerns
>> were raised and to see a lot of radio silence to responses to those
>> concerns, only to see a message about a potential change in direction
>> driven by off-list discussions where concerns and responses to concerns
>> are unknown and therefore not addressable.
> I didn't want private meetings either, but repeatedly requesting public
> feedback for the RFC or patches hadn't provided reasoning behind any
> concerns that people had.
>
> The agreement reached at the meeting was for the objectors to post their
> reasons for objecting and counter-proposal in public so discussion could
> take place, and Arm would investigate the details of the counter-proposal.

I've talked with a number of people about this as well, and I think that
I understand the objections. I'm happy that ARM followed through with
the alternate set of patches. Regardless, however, unless those who had
wished to object still wish to object, and then actually do so, we now
clearly have a good collection of contributors actively desiring to do
code review, and we should move forward (i.e., start committing patches
once they're judged ready).

-Hal

>
> Unfortunately, that post never happened, so I found myself a bit stuck and
> had to post it for them -- not a situation I wanted.
>
> I have always wanted the discussion to take place in public.
>
>> I completely understand that ARM needs to make progress and I very much
>> want to see that progress. I just don't want to see a Plan B leading to
>> a situation where VLA support doesn't ever make it into LLVM. It is
>> somewhat embarrassing that gcc already has a release with VLA support
>> for SVE and LLVM is stuck in the starting blocks.
> Agreed.
>
> -Graham

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Renato Golin via llvm-dev

Mar 15, 2019, 11:39:10 AM

to Finkel, Hal J., Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On Fri, 15 Mar 2019 at 15:30, Finkel, Hal J. via llvm-dev
<llvm...@lists.llvm.org> wrote:
> I've talked with a number of people about this as well, and I think that
> I understand the objections. I'm happy that ARM followed through with
> the alternate set of patches. Regardless, however, unless those who had
> wished to object still wish to object, and then actually do so, we now
> clearly have a good collection of contributors actively desiring to do
> code review, and we should move forward (i.e., start committing patches
> once they're judged ready).

Let's start by closing the three flying revisions, so that people that
weren't involved in the discussion don't waste time looking at them.

Graham, can you also add a comment to reflect the change in
implementation while closing?

Second, it would be very good if someone, anyone, would write an RFC
on the new proposal. Graham started it on this thread and on the two
reviews, but this thread is now dead. We should have a new one, with
subject "RFC etc" and the contents of the current understanding, so
that it can serve as context for the two reviews.

Thanks!
--renato

David Greene via llvm-dev

Mar 15, 2019, 11:54:29 AM

to Graham Hunter, Francesco Petrogalli via llvm-dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

Graham Hunter <Graham...@arm.com> writes:

> I have always wanted the discussion to take place in public.

I just want to make sure you know that we all know that. I was not in
any way disparaging the work you and ARM have done on this!

-David

David Greene via llvm-dev

Mar 15, 2019, 11:57:18 AM

to Finkel, Hal J., Francesco Petrogalli via llvm-dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

"Finkel, Hal J." <hfi...@anl.gov> writes:

>> The agreement reached at the meeting was for the objectors to post their
>> reasons for objecting and counter-proposal in public so discussion could
>> take place, and Arm would investigate the details of the counter-proposal.
>
>
> I've talked with a number of people about this as well, and I think that
> I understand the objections. I'm happy that ARM followed through with
> the alternate set of patches. Regardless, however, unless those who had
> wished to object still wish to object, and then actually do so, we now
> clearly have a good collection of contributors actively desiring to do
> code review, and we should move forward (i.e., start committing patches
> once they're judged ready).

I am not sure this is your intended meaning, but if those objecting
don't come forward, I would like to see Graham's current patches
supporting scalable types go in. A number of people have now stated
that they are desirable, people have reviewed them, Graham has made
changes and we're ready to review them some more and iterate on them.

It would set a bad precedent to block patches based on some vague
objections that aren't being discussed publicly. We can debate the
actual patches, that of course needs to happen. But the RFC looks
reasonable to me and apparently others. Let's start moving forward.

-David

David Greene via llvm-dev

Mar 15, 2019, 11:58:31 AM

to Renato Golin, Francesco Petrogalli via llvm-dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

Renato Golin <reng...@gmail.com> writes:

> On Fri, 15 Mar 2019 at 15:30, Finkel, Hal J. via llvm-dev
> <llvm...@lists.llvm.org> wrote:
>> I've talked with a number of people about this as well, and I think that
>> I understand the objections. I'm happy that ARM followed through with
>> the alternate set of patches. Regardless, however, unless those who had
>> wished to object still wish to object, and then actually do so, we now
>> clearly have a good collection of contributors actively desiring to do
>> code review, and we should move forward (i.e., start committing patches
>> once they're judged ready).
>
> Let's start by closing the three flying revisions, so that people that
> weren't involved in the discussion don't waste time looking at them.

See the reply I just posted to Hal. I am not sure we've made a decision
to abandon the current patches. We may in fact decide that, but I
haven't seen consensus for doing so yet. In fact I've seen the opposite
-- that people want to move forward with the scalable types.

-David

Renato Golin via llvm-dev

Mar 15, 2019, 12:11:11 PM

to David Greene, Francesco Petrogalli via llvm-dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

On Fri, 15 Mar 2019 at 15:58, David Greene <d...@cray.com> wrote:
> See the reply I just posted to Hal. I am not sure we've made a decision
> to abandon the current patches. We may in fact decide that, but I
> haven't seen consensus for doing so yet. In fact I've seen the opposite
> -- that people want to move forward with the scalable types.

I did see that reply.

While, like Hal, I do understand some concerns about introducing a
radical new concept to IR (the reason why I started this thread), I'm
unaware (mainly because I was not at that meeting) of the individual
issues and how controversial they were with those involved.

Furthermore, the current state is uncertain and people need to be
convinced more of what will work by means of hacking up more
intrinsics and more kludge into the current IR.

This means that, even if we are to implement it natively in IR, it
won't come *before* we implement it with intrinsics, which will
hopefully convince people that this makes sense, and by which time,
the code will look completely different and we'll need a completely
new patch.

Ie. the current series is already dead, no matter what we do.

cheers,
--renato

James Y Knight via llvm-dev

Mar 15, 2019, 12:50:28 PM

to Renato Golin, Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On Fri, Mar 15, 2019 at 12:11 PM Renato Golin via llvm-dev <llvm...@lists.llvm.org> wrote:

> On Fri, 15 Mar 2019 at 15:58, David Greene <d...@cray.com> wrote:
>> See the reply I just posted to Hal. I am not sure we've made a decision
>> to abandon the current patches. We may in fact decide that, but I
>> haven't seen consensus for doing so yet. In fact I've seen the opposite
>> -- that people want to move forward with the scalable types.
>
> I did see that reply.
>
> While, like Hal, I do understand some concerns on introducing a
> radical new concept to IR (the reason why I started this thread), I'm
> unaware (mainly by not being on that meeting) of the individual issues
> and how controversial they were with those involved.
>
> Furthermore, the current state is uncertain and people need to be
> convinced more of what will work by means of hacking up more
> intrinsics and more kludge into the current IR.
>
> This means that, even if we are to implement it natively in IR, it
> won't come *before* we implement it with intrinsics, which will
> hopefully convince people that this makes sense, and by which time,
> the code will look completely different and we'll need a completely
> new patch.
>
> Ie. the current series is already dead, no matter what we do

I have no opinion on the technical aspects here, not having researched this topic at all.

But this last statement seems odd. So far, there looks to be a fairly good consensus from a number of experienced llvm developers that the approach seems like a good idea, both on this thread, and from skimming the earlier threads you linked from your original message.

Doesn't that mean that the reasonable next step is to continue moving forward with the existing patch set?

Renato Golin via llvm-dev

Mar 15, 2019, 1:20:40 PM

to James Y Knight, Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On Fri, 15 Mar 2019 at 16:50, James Y Knight <jykn...@google.com> wrote:
>> Ie. the current series is already dead, no matter what we do
>

> But this last statement seems odd. So far, there looks to be a fairly good consensus from a number of experienced llvm developers that the approach seems like a good idea, both on this thread, and from skimming the earlier threads you linked from your original message.
>
> Doesn't that mean that the reasonable next step is to continue moving forward with the existing patch set?

It depends.

The previous public consensus was, indeed, that the native proposal
makes a lot of sense. I think this is ultimately where we want to be,
but I'm not clear on what the path really is.

From what Graham said, and from his current work, I guess the "new"
(not public) consensus seems to be to go with intrinsics first, then
move to native support, which is a valid path.

If the public agreement becomes that this is the path we want to take,
then that specific patch-set is dead, because even if we do native, it
will be a different set.

If the end result is that we'll stop at intrinsics (I really hope
not), the patch-set is also dead.

However, if people want to continue pushing for native support now,
the patch-set is not dead. But then we need to re-do the meeting that
happened at the US dev meeting with everyone in it, which won't
happen.

So, while I would also prefer to have native support first, and work
out the wrinkles between releases (as I proposed in this thread), I'm
ok with going the intrinsics way first, as long as the aim is to not
stop there.

Makes sense?

cheers,
--renato

PS: Until someone writes up what happened, who was involved, what the
issues were and why the current consensus has changed, we can only
guess...

Finkel, Hal J. via llvm-dev

Mar 15, 2019, 2:22:14 PM

to David Greene, Renato Golin, Francesco Petrogalli via llvm-dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

On 3/15/19 10:58 AM, David Greene wrote:
> Renato Golin <reng...@gmail.com> writes:
>
>> On Fri, 15 Mar 2019 at 15:30, Finkel, Hal J. via llvm-dev
>> <llvm...@lists.llvm.org> wrote:
>>> I've talked with a number of people about this as well, and I think that
>>> I understand the objections. I'm happy that ARM followed through with
>>> the alternate set of patches. Regardless, however, unless those who had
>>> wished to object still wish to object, and then actually do so, we now
>>> clearly have a good collection of contributors actively desiring to do
>>> code review, and we should move forward (i.e., start committing patches
>>> once they're judged ready).
>> Let's start by closing the three flying revisions, so that people that
>> weren't involved in the discussion don't waste time looking at them.
> See the reply I just posted to Hal. I am not sure we've made a decision
> to abandon the current patches. We may in fact decide that, but I
> haven't seen consensus for doing so yet. In fact I've seen the opposite
> -- that people want to move forward with the scalable types.

I agree with David. We should move forward with native support for
scalable types.

-Hal

>
> -David

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Graham Hunter via llvm-dev

Mar 15, 2019, 3:50:29 PM

to Renato Golin, Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

Hi,

> From what Graham said, and from his current work, I guess the "new"
> (not public) consensus seems to be to go with intrinsics first, then
> move to native support, which is a valid path.

There wasn't a consensus, just a proposal for a different option to
present to the community for feedback and discussion to get things
moving (whether for the full scalable IR proposal, the opaque types one,
or something in between). Sorry if I didn't make that clear enough.

Arm felt it was worth spending some time investigating an alternative
if that could help upstreaming progress, and then presenting
the findings for discussion.

> If the public agreement becomes that this is the path we want to take,
> then that specific patch-set is dead, because even if we do native, it
> will be a different set.
>
> If the end result is that we'll stop at intrinsics (I really hope
> not), the patch-set is also dead.
>
> However, if people want to continue pushing for native support now,
> the patch-set is not dead. But then we need to re-do the meeting that
> happened in the US dev meeting with everyone in it, which won't
> happen.

While there was a roundtable at the devmeeting last year, there weren't
that many people in attendance to talk purely about SVE or scalable
types -- most of the discussion revolved around predication support,
from what I remember.

The main feedback I received that led to changes in the RFC came from side chats
when people had a few minutes of spare time (since they had other sessions
to attend which clashed with the roundtable slots).

-Graham

Renato Golin via llvm-dev

Mar 15, 2019, 4:07:45 PM

to Graham Hunter, LLVM Dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

Hi Graham,

Given the extent of your "further work", I assumed you had met quite strong pushback and that this was more of an official session. That's why I wanted clarity over which reviews we should be looking at.

Honestly, so far, no one in this thread has pointed to any concrete request for non native support, so unless someone does so, the official consensus is still native.

So, with apologies to all bystanders, I repeat my original proposal: it's now time to try and push native support on trunk, while we have time before the next release.

If anyone has concerns, either on the current proposal (reviews on my first email) or on the general idea of native scalable types, please speak up.

As David said, it's a bit silly that gcc has had support for it for over a year and we're still arguing about very basic stuff.

Cheers,

Renato

Chandler Carruth via llvm-dev

Mar 15, 2019, 4:55:46 PM

to Finkel, Hal J., Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On Fri, Mar 15, 2019 at 11:22 AM Finkel, Hal J. via llvm-dev <llvm...@lists.llvm.org> wrote:

> On 3/15/19 10:58 AM, David Greene wrote:
>> Renato Golin <reng...@gmail.com> writes:
>>
>>> On Fri, 15 Mar 2019 at 15:30, Finkel, Hal J. via llvm-dev
>>> <llvm...@lists.llvm.org> wrote:
>>>> I've talked with a number of people about this as well, and I think that
>>>> I understand the objections. I'm happy that ARM followed through with
>>>> the alternate set of patches. Regardless, however, unless those who had
>>>> wished to object still wish to object, and then actually do so, we now
>>>> clearly have a good collection of contributors actively desiring to do
>>>> code review, and we should move forward (i.e., start committing patches
>>>> once they're judged ready).
>>> Let's start by closing the three flying revisions, so that people that
>>> weren't involved in the discussion don't waste time looking at them.
>> See the reply I just posted to Hal. I am not sure we've made a decision
>> to abandon the current patches. We may in fact decide that, but I
>> haven't seen consensus for doing so yet. In fact I've seen the opposite
>> -- that people want to move forward with the scalable types.
>
> I agree with David. We should move forward with native support for
> scalable types.

Sorry I haven't been as available as usual for the past few weeks, but FWIW, I still am unconvinced that scalable vector types belong in the IR.

I think this adds complexity to LLVM's IR to serve a niche use case without proven benefit to a broad spectrum of hardware or software. I think the complexity is significant and will be a net drag on all parts of the IR and IR-level transformations. But I don't really think it is useful to re-hash all these debates. Nothing relevant has changed in the years this has been discussed.

That said, if I'm the only one who feels this way (and is willing to actually state this publicly), I'm not going to stop progress.

-Chandler

Troy Johnson via llvm-dev

Mar 15, 2019, 8:09:34 PM

to Finkel, Hal J., Chandler Carruth, David Greene, Chris Lattner, llvm...@lists.llvm.org, Maxim Kuvyrkov

> I think this adds complexity to LLVM's IR to serve a niche use case without proven benefit to a broad spectrum of hardware or software. I think the complexity is significant and will be a net drag on all parts of the IR and IR-level transformations.

I view the situation differently, but I'm still relatively new to llvm-dev and may be too unfamiliar with the threshold for inclusion in trunk, so please help educate me. I see people from more than one organization saying that they'd like to see this in trunk. No one wants it to be a drag on all transforms because no one wants to have to rewrite a ton of code. So it would seem that there are two possible outcomes: it gets merged into trunk and the interested parties try hard to not adversely impact all of LLVM because that's in their best interest, too, OR it isn't merged and then we have potentially the same multiple organizations maintaining support for this on the side, out of trunk. This community tries to avoid the latter situation, right? I've always thought of LLVM trunk as where code ends up that is useful to multiple orgs and then each org maintains their own local patches for stuff that no one else would want (or that they can't share for competitive reasons).

-Troy

From: llvm-dev <llvm-dev...@lists.llvm.org> on behalf of Chandler Carruth via llvm-dev <llvm...@lists.llvm.org>
Sent: Friday, March 15, 2019 3:55:26 PM
To: Finkel, Hal J.
Cc: Francesco Petrogalli via llvm-dev; Chandler Carruth; David Greene; Chris Lattner; nd; Maxim Kuvyrkov
Subject: Re: [llvm-dev] Scalable Vector Types in IR - Next Steps?

David Greene via llvm-dev

Mar 18, 2019, 1:15:32 PM

to Finkel, Hal J., Francesco Petrogalli via llvm-dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

"Finkel, Hal J." <hfi...@anl.gov> writes:

> On 3/15/19 10:58 AM, David Greene wrote:
>> Renato Golin <reng...@gmail.com> writes:
>>
>>> On Fri, 15 Mar 2019 at 15:30, Finkel, Hal J. via llvm-dev
>>> <llvm...@lists.llvm.org> wrote:
>>>> I've talked with a number of people about this as well, and I think that
>>>> I understand the objections. I'm happy that ARM followed through with
>>>> the alternate set of patches. Regardless, however, unless those who had
>>>> wished to object still wish to object, and then actually do so, we now
>>>> clearly have a good collection of contributors actively desiring to do
>>>> code review, and we should move forward (i.e., start committing patches
>>>> once they're judged ready).
>>> Let's start by closing the three flying revisions, so that people that
>>> weren't involved in the discussion don't waste time looking at them.
>> See the reply I just posted to Hal. I am not sure we've made a decision
>> to abandon the current patches. We may in fact decide that, but I
>> haven't seen consensus for doing so yet. In fact I've seen the opposite
>> -- that people want to move forward with the scalable types.
>
>
> I agree with David. We should move forward with native support for
> scalable types.

Graham, which patch(es) would you like to concentrate on for review
first? A number are marked "not for review." Do you need to update
them before we continue reviewing?

-David

Eric Christopher via llvm-dev

Mar 18, 2019, 1:40:02 PM

to Chandler Carruth, Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On Fri, Mar 15, 2019 at 1:55 PM Chandler Carruth via llvm-dev

You're not, and I'm in the same position here. I don't think there's a
really good answer for how this is going to affect a lot of the IR and
IR-level transformations from a maintainability perspective. It mostly
seems like this is a "we need this for the new ISA support" change, and
I don't see a compelling use case here, but I do see a lot of
downside...

-eric

Bruce Hoult via llvm-dev

Mar 18, 2019, 8:26:40 PM

to Eric Christopher, Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

Three ISAs at present:

- SVE in AArch64
- MVE in ARM Cortex-M (quite different from SVE)
- RVV in RISC-V

It would not surprise me if other ISAs implement similar vector
extensions in future.

On Mon, Mar 18, 2019 at 10:40 AM Eric Christopher via llvm-dev

Jacob Lifshay via llvm-dev

Mar 18, 2019, 9:17:02 PM

to Bruce Hoult, David Greene, Francesco Petrogalli via llvm-dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

On Mon, Mar 18, 2019 at 5:26 PM Bruce Hoult via llvm-dev <llvm...@lists.llvm.org> wrote:

> Three ISAs at present:
>
> - SVE in Aarch64
> - MVE in ARM Cortex-M (quite different from SVE)
> - RVV in RISC-V
>
> It would not surprise me if other ISAs implement similar vector
> extensions in future.

We're planning on implementing scalable vector support in the SimpleV ISA extension as well. Admittedly, we will most likely need additional IR modifications (allowing vectors of vectors), but I think having scalable vector support already built in will help greatly.

Jacob Lifshay

Simon Moll via llvm-dev

Mar 18, 2019, 10:44:23 PM

to Hal Finkel, David Greene, Renato Golin, Francesco Petrogalli via llvm-dev, Chandler Carruth, nd, Maxim Kuvyrkov, Chris Lattner

On 3/16/19 3:22 AM, Finkel, Hal J. via llvm-dev wrote:

> On 3/15/19 10:58 AM, David Greene wrote:
>> Renato Golin <reng...@gmail.com> writes:
>>
>>> On Fri, 15 Mar 2019 at 15:30, Finkel, Hal J. via llvm-dev
>>> <llvm...@lists.llvm.org> wrote:
>>>> I've talked with a number of people about this as well, and I think that
>>>> I understand the objections. I'm happy that ARM followed through with
>>>> the alternate set of patches. Regardless, however, unless those who had
>>>> wished to object still wish to object, and then actually do so, we now
>>>> clearly have a good collection of contributors actively desiring to do
>>>> code review, and we should move forward (i.e., start committing patches
>>>> once they're judged ready).
>>> Let's start by closing the three flying revisions, so that people that
>>> weren't involved in the discussion don't waste time looking at them.
>> See the reply I just posted to Hal. I am not sure we've made a decision
>> to abandon the current patches. We may in fact decide that, but I
>> haven't seen consensus for doing so yet. In fact I've seen the opposite
>> -- that people want to move forward with the scalable types.
>
> I agree with David. We should move forward with native support for
> scalable types.
>
> -Hal

+1

NEC SX-Aurora will also be using scalable types when they become available.

-Simon

>> -David

-- 

Simon Moll
Researcher / PhD Student

Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.31

Tel. +49 (0)681 302-57521 : mo...@cs.uni-saarland.de
Fax. +49 (0)681 302-3065  : http://compilers.cs.uni-saarland.de/people/moll

Erich Focht via llvm-dev

Mar 19, 2019, 2:54:55 AM

to llvm...@lists.llvm.org, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On Mon, Mar 18, 2019 at 18:16:42 Jacob Lifshay via llvm-dev
<llvm...@lists.llvm.org> wrote:
>
> Three ISAs at present:
>
> - SVE in Aarch64
> - MVE in ARM Cortex-M (quite different from SVE)
> - RVV in RISC-V
>
> It would not surprise me if other ISAs implement similar vector
> extensions in future.
>
> We're planning on implementing scalable vector support in the SimpleV
> ISA extension as well. Admittedly, we will most likely need additional
> IR modifications (allowing vectors of vectors), but I think having
> scalable vector support already built in will help greatly.
>

As Simon Moll also wrote, please add the NEC SX-Aurora vector engine to
the list of architectures eagerly awaiting the SVE/AVL/VP
changes in LLVM. We have had long vectors (256x64bit) and a vector length
register for many years, with the latest CPU having been available on the
market for a year, mainly aimed at HPC and AI.

We're working on an LLVM backend and intend to open it and post an RFC
on its inclusion soon. Progress with AVL/VP is very important for this
backend and we rely on LLVM moving forward on these.

Regards,
Erich Focht

Graham Hunter via llvm-dev

Mar 19, 2019, 5:57:13 AM

to David Greene, Francesco Petrogalli via llvm-dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

Hi David,

I'll need to update the reviews (and rebase). I'll do that this week.

https://reviews.llvm.org/D32530 is the key patch. The backend codegen
patches can be safely ignored, I think -- we would want better isel
patterns.

It seems there's still some discussion to be had in this thread though.

-Graham

Graham Hunter via llvm-dev

Mar 19, 2019, 6:49:30 AM

to Bruce Hoult, David Greene, Francesco Petrogalli via llvm-dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

Hi Bruce,

> On 19 Mar 2019, at 00:26, Bruce Hoult via llvm-dev <llvm...@lists.llvm.org> wrote:
>
> Three ISAs at present:
>
> - SVE in Aarch64
> - MVE in ARM Cortex-M (quite different from SVE)
> - RVV in RISC-V

MVE isn't scalable in terms of registers (it's fixed at 128b iirc), so won't be using these types.

It can use execution units that are narrower than 128b, and start execution on partially
completed vectors in subsequent cycles (a bit like an RVV implementation might do when using
VLMul > 1, or the established vector supercomputer architectures).

Just a different set of design constraints.

As Simon and Erich point out though, SX-Aurora would like to use scalable vectors too, so
we still have three architectures intending to use the feature.

-Graham

Graham Hunter via llvm-dev

Mar 19, 2019, 7:11:27 AM

to Eric Christopher, Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

Hi Eric and Chandler,

I appreciate your concerns; I don't think the impact will be that great, but then it's
rather easy for me to keep SVE in mind when working on other parts of the codebase
given how long I've spent working on it.

Are there any additional constraints on the scalable types you think would alleviate
your concerns a little? At the moment we will prevent scalable vectors from being
included in structs and arrays, but we could add more (at least to start with) to
avoid potential hidden problems.

I'm also trying to come up with an idea of how much impact we have in our downstream
implementation; most places where there is divergence are in the AArch64 backend (as you'd
expect), followed by the generic SelectionDAG code -- but lowering and legalization for
current instructions should (hopefully) be a one-off.

Are there any specific parts of the codebase for which you'd like a report on the
extent of changes?

-Graham

Chandler Carruth via llvm-dev

Mar 19, 2019, 3:32:13 PM

to Graham Hunter, David Greene, Francesco Petrogalli via llvm-dev, Chandler Carruth, Chris Lattner, nd, Maxim Kuvyrkov

On Tue, Mar 19, 2019 at 4:11 AM Graham Hunter <Graham...@arm.com> wrote:

> Hi Eric and Chandler,
>
> I appreciate your concerns; I don't think the impact will be that great, but then it's
> rather easy for me to keep SVE in mind when working on other parts of the codebase
> given how long I've spent working on it.
>
> Are there any additional constraints on the scalable types you think would alleviate
> your concerns a little? At the moment we will prevent scalable vectors from being
> included in structs and arrays, but we could add more (at least to start with) to
> avoid potential hidden problems.

While the constraints you mention are good, and important, I don't think there are more that matter.

> I'm also trying to come up with an idea of how much impact we have in our downstream
> implementation; most places where there is divergence are in the AArch64 backend (as you'd
> expect), followed by the generic SelectionDAG code -- but lowering and legalization for
> current instructions should (hopefully) be a one-off.
>
> Do you have any specific parts of the codebase you're interested in a report into the
> extent of changes?

This is *not* about the changes required. It is about the long term (think 10-years) complexity forced onto the IR.

We now have vectors that are unlike *all other vectors* in the IR. They're basically unlike all other types. I believe we will be finding bugs with this special case ~forever. Will it be an untenable burden? Definitely not. We can manage.

But the question is: does the benefit outweigh the cost? IMO, no.

I completely understand the benefit of this for the *ISA*, and I would encourage every ISA to adopt some vector instruction set with similar aspects.

However, the more I talk with and work with my users doing SIMD programming (and my entire experience doing it personally) leads me to believe this will be of extremely limited utility to model in the IR. There will be a small number of places where it can be used. All of those where performance matters will end up being tuned for *specific* widths anyways to get the last few % of performance. Those that aren't performance critical won't provide any substantial advantage over just being 128-bit vectorized or left scalar. At that point, we pay the complexity and maintenance cost of this completely special type in the IR for no material benefit.

I've said this several times in various discussions. My opinion has not changed. No new information has been presented by others or by me. So I think debating this technical point is not really interesting at this point.

That said, it is entirely possible that I am wrong about the utility. If the consensus in the community is that we should move forward, I'm not going to block forward progress. It sounds like Hal, the Cray folks, and many ARM folks are all positive. So far, only myself and Eric have said anything to the contrary. If there really isn't anyone else concerned with this, please just move forward. I think the cost of continuing to debate this is rapidly becoming unsustainable all on its own.

Bruce Hoult via llvm-dev

Mar 20, 2019, 8:56:35 PM

to Chandler Carruth, Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On Tue, Mar 19, 2019 at 12:32 PM Chandler Carruth via llvm-dev
<llvm...@lists.llvm.org> wrote:
> However, the more I talk with and work with my users doing SIMD programming (and my entire experience doing it personally) leads to me to believe this will be of extremely limited utility to model in the IR. There will be a small number of places where it can be used. All of those where performance matters will end up being tuned for *specific* widths anyways to get the last few % of performance. Those that aren't performance critical won't provide any substantial advantage over just being 128-bit vectorized or left scalar. At that point, we pay the complexity and maintenance cost of this completely special type in the IR for no material benefit.

To me, this is nothing like SIMD programming. I've done that, with
VMX/Altivec and NEON.

I've been working with a number of kernels implemented on RISC-V
vectors recently. At least for the things we've been looking at so
far, the code is almost exactly the same as you'd use to implement the
same algorithm (possibly pipelined, unrolled etc) using 32 normal FP
registers, it's just that you work on some unknown-at-compile-time
number of different outer-loop iterations in parallel.

For example, maybe you've got a whole lot of 3x3 matrices to invert.
You load each element of the first matrix into nine registers, then
calculate the determinant, then permute the input values into their
new positions while dividing them by the determinant, and write them
all out. It's exactly the same with the vector ISA, except you might
be loading and working on 1, 2, 4, ... 1000 of the matrices in
parallel. You just don't know, and it doesn't matter.

The same for sgemm. You work on strips eight (say) wide/high. In one
dimension you have normal loads/stores, and in the other dimension you
have strided loads/stores. You're working on rectangular blocks 8
high/wide and some unknown-at-compile-time amount wide/high -- on some
small machine it might be 1 (i.e. basically a standard FP register
file, but the vector ISA works on it correctly), but presumably on
most it will be something like 4 or 8 or 16 elements.

If you unroll either of these kernels once (or software pipeline it)
then you're going to pretty much saturate your memory system or your
fma units or both, depending on the particular kernel's ratio of
compute-to-bytes, how many functional units you have, and the width of
your memory bus.
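
As a rough illustration of the matrix example (not code for any real vector ISA or compiler), the Python sketch below uses plain lists as stand-ins for vector registers. The names `invert_3x3_batch`, `vmul`, `vsub`, `vadd`, `vdiv` and the `hw_lanes` parameter are all made up for this sketch; `hw_lanes` plays the role of the vector length that is unknown at compile time. The inverse is computed from cofactors divided by the determinant, and singular matrices are not handled.

```python
# Illustrative sketch: Python lists stand in for vector registers.
# All names are hypothetical; hw_lanes models the hardware vector length.

def vmul(x, y):
    return [p * q for p, q in zip(x, y)]

def vsub(x, y):
    return [p - q for p, q in zip(x, y)]

def vadd(x, y):
    return [p + q for p, q in zip(x, y)]

def vdiv(x, y):
    return [p / q for p, q in zip(x, y)]

def invert_3x3_batch(mats, hw_lanes):
    """Invert row-major 3x3 matrices, hw_lanes matrices per strip."""
    out = []
    for start in range(0, len(mats), hw_lanes):
        strip = mats[start:start + hw_lanes]  # the last strip may be short
        # "Load" element k of every matrix in the strip into register k.
        a, b, c, d, e, f, g, h, i = ([m[k] for m in strip] for k in range(9))
        # Cofactors, computed lane-wise exactly as in the scalar algorithm.
        c00 = vsub(vmul(e, i), vmul(f, h))
        c01 = vsub(vmul(f, g), vmul(d, i))
        c02 = vsub(vmul(d, h), vmul(e, g))
        c10 = vsub(vmul(c, h), vmul(b, i))
        c11 = vsub(vmul(a, i), vmul(c, g))
        c12 = vsub(vmul(b, g), vmul(a, h))
        c20 = vsub(vmul(b, f), vmul(c, e))
        c21 = vsub(vmul(c, d), vmul(a, f))
        c22 = vsub(vmul(a, e), vmul(b, d))
        det = vadd(vadd(vmul(a, c00), vmul(b, c01)), vmul(c, c02))
        # Transposed cofactors divided by the determinant give the inverse.
        inv = [vdiv(r, det)
               for r in (c00, c10, c20, c01, c11, c21, c02, c12, c22)]
        # "Store" the nine result registers back as one tuple per matrix.
        out.extend(zip(*inv))
    return out

mats = [(1, 2, 3, 0, 1, 4, 5, 6, 0), (2, 0, 0, 0, 4, 0, 0, 0, 8)]
# The answer does not depend on how many lanes the "hardware" provides.
assert invert_3x3_batch(mats, 1) == invert_3x3_batch(mats, 1000)
```

Only the number of strips changes with `hw_lanes`; the per-lane arithmetic is the same as the scalar version, which is the point being made above.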

Maybe you're right and hand-tuned SIMD code with explicit knowledge of
the vector length might get you single-digit percentage better
performance, but it probably won't be more than that and it's a lot of
work.

As for LLVM IR support .. I don't have a firm opinion on whether this
scalable type proposal is sufficient, insufficient, or overkill.

My own gut feeling is that the existing type system is fine for
describing vector data in memory, and that all we need (at least for
RISC-V) is a new register file that is very similar to any machine
with a unified int/fp register file. LLVM needs to manage register
allocation in this register file just as it does for regular int or fp
register files. Spills and reloads of these registers would be
undesirable, but if they are needed then the compiler would have to
allocate the space for this using alloca (or malloc).

The biggest thing needed I think is understanding one unusual
instruction: vsetvl{i}. At the head of each loop you explicitly use
the vsetvl{i} instruction to set the register width (the vector
element width) to something between 8 bits and 1024 bits. The vsetvl
instruction returns an integer which you normally use only to scale by
the element width that you just set, and then use the result to bump
your input and output pointers by N elements instead of 1 element.
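
The loop shape described here is often called strip-mining. A minimal Python model follows, assuming a simplified `vsetvl` that grants `min(requested, vlmax)` elements; real hardware may choose the granted length differently. The names `saxpy_strip_mined` and `vlmax`, and the list slices standing in for vector registers, are invented for this sketch.

```python
# Toy model of a vsetvl-style strip-mined loop. Not real RISC-V code:
# vsetvl() is a simplified stand-in and vlmax is a made-up parameter.

def vsetvl(requested, vlmax):
    """Return how many elements this iteration may process."""
    return min(requested, vlmax)

def saxpy_strip_mined(alpha, x, y, vlmax):
    """Compute y[i] += alpha * x[i] in strips of at most vlmax elements.

    Returns the number of strips (loop iterations) that were needed.
    """
    n = len(x)
    i = 0        # element index; real code would bump byte pointers instead
    strips = 0
    while i < n:
        vl = vsetvl(n - i, vlmax)
        # Each statement below models one vector instruction on vl elements.
        vx = x[i:i + vl]
        vy = y[i:i + vl]
        y[i:i + vl] = [alpha * p + q for p, q in zip(vx, vy)]
        i += vl  # advance by vl elements, not by 1
        strips += 1
    return strips

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0] * 5
saxpy_strip_mined(2.0, x, y, vlmax=4)
assert y == [12.0, 14.0, 16.0, 18.0, 20.0]
```

The loop body never mentions a fixed width: the short final strip is handled by the same code path, because the granted length simply shrinks.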

So, you kind of need a new type for the registers, but it's purely for
the registers. Not only can you not include it in arrays or structs,
you also can't load it from memory or store it to memory.

The plan for RISC-V is also that all 32 vector registers will be
caller-save/volatile. If you call a function then when it returns you
have to assume that all vector registers have been trashed. There are
no functions using the standard ABI that take vector registers as
arguments or return vector registers as results. The only apparent
exception is the compiler's runtime library that will have things the
compiler explicitly knows about such as transcendental functions --
but they don't use the standard ABI.

Sebastian Pop via llvm-dev

Mar 27, 2019, 5:34:31 PM

to Chandler Carruth, Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

I am of the opinion that handling scalable vectors (SV)
as builtins and an opaque SV type is a good option:

1. The implementation of SV with builtins is simpler than changing the IR.

2. Most of the transforms in opt are scalar opts; they do not optimize
vector operations and will not deal with SV either.

3. With builtins there are fewer places to pay attention to,
as most of the compiler is already dealing with builtins in
a neutral way.

4. The builtin approach is more targeted and confined: it allows
amending one optimizer at a time.
In the alternative of changing the IR, one has to touch all the
passes in the initial implementation.

5. Optimizing code written with SV intrinsic calls can be done
with about the same implementation effort in both cases
(builtins and changing the IR.) I do not believe that changing
the IR to add SV types makes any optimizer work magically all
of a sudden: no free lunch. In both cases we need to amend
all the passes that remove inefficiencies in code written with
SV intrinsic calls.

6. We will need a new SV auto-vectorizer pass that relies less on
if-conversion, runtime disambiguation, and unroll for the prolog/epilog,
as the HW is helping with all these cases and expands the number
of loops that can be vectorized.
Having native SV types or just plain builtins is equivalent here
as the code generator of the vectorizer can be improved to not
generate inefficient code.

7. This is my point of view, I may be wrong,
so don't let me slow you down in getting it done!

Sebastian

Finkel, Hal J. via llvm-dev

Mar 27, 2019, 7:40:44 PM

to Sebastian Pop, Chandler Carruth, Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On 3/27/19 4:33 PM, Sebastian Pop via llvm-dev wrote:
> I am of the opinion that handling scalable vectors (SV)
> as builtins and an opaque SV type is a good option:
>
> 1. The implementation of SV with builtins is simpler than changing the IR.
>
> 2. Most of the transforms in opt are scalar opts; they do not optimize
> vector operations and will not deal with SV either.
>
> 3. With builtins there are fewer places to pay attention to,
> as most of the compiler is already dealing with builtins in
> a neutral way.
>
> 4. The builtin approach is more targeted and confined: it allows
> to amend one optimizer at a time.
> In the alternative of changing the IR, one has to touch all the
> passes in the initial implementation.

Interestingly, with similar considerations, I've come to the opposite
conclusion. While in theory the intrinsics and opaque types are more
targeted and confined, this only remains true *if* we don't end up
teaching a bunch of transformations and analysis passes about them.
However, I feel it is inevitable that we will:

1. While we already have unsized types in the IR, SV will add more of
them, and opaque or otherwise, there will be some cost to making all of
the relevant places in the optimizer not crash in their presence. This
cost we end up paying either way.

2. We're going to end up wanting to optimize SV operations. If we have
intrinsics, we can add code to match (a + b) - b => a, but the question
is: can we reuse the code in InstCombine which does this? We can make
the answer yes by adding sufficient abstraction, but the code
restructuring seems much worse than just adjusting the type system.
Otherwise, we can't reuse the existing code for these SV optimizations
if we use the intrinsics, and we'll be stuck in the unfortunate situation
of slowly rewriting a version of InstCombine just to operate on the SV
intrinsics. Moreover, the code will be worse because we need to
effectively extract the type information from the intrinsic names. By
changing the type system to support SV, it seems like we can reuse
nearly all of the relevant InstCombine code.

3. It's not just InstCombine (and InstSimplify, etc.), but we might
also need to teach other passes about the intrinsics and their types
(GVN?). It's not clear that the problem will be well confined.
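
To make the reuse argument in point 2 concrete, here is a minimal sketch in Python. It is a toy expression tree, not LLVM's API: the classes `VecType`, `Var`, `BinOp` and `Call` and the intrinsic names starting with `hypothetical.` are all made up for illustration. The rewrite (a + b) - b => a matches on opcodes alone, so it fires for fixed-width and scalable operands alike, while the intrinsic encoding of the same expression is not recognized unless a second matcher is written for it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VecType:
    min_lanes: int
    elem: str
    scalable: bool = False  # True models <vscale x N x elem>

@dataclass(frozen=True)
class Var:
    name: str
    ty: VecType

@dataclass(frozen=True)
class BinOp:
    opcode: str  # "add" or "sub"
    lhs: object
    rhs: object

@dataclass(frozen=True)
class Call:
    callee: str  # type information is buried in the (hypothetical) name
    args: tuple

def simplify(expr):
    """Rewrite (a + b) - b => a by matching opcodes; the type is never inspected."""
    if (isinstance(expr, BinOp) and expr.opcode == "sub"
            and isinstance(expr.lhs, BinOp) and expr.lhs.opcode == "add"
            and expr.lhs.rhs == expr.rhs):
        return expr.lhs.lhs
    return expr

fixed = VecType(4, "i32")
scalable = VecType(4, "i32", scalable=True)

# The same rule fires for fixed-width and scalable operands.
for ty in (fixed, scalable):
    a, b = Var("a", ty), Var("b", ty)
    assert simplify(BinOp("sub", BinOp("add", a, b), b)) == a

# The intrinsic encoding of the same expression is left untouched by the rule.
a, b = Var("a", scalable), Var("b", scalable)
as_calls = Call("hypothetical.sv.sub.nxv4i32",
                (Call("hypothetical.sv.add.nxv4i32", (a, b)), b))
assert simplify(as_calls) is as_calls
```

Teaching `simplify` about `Call` nodes would mean decoding the operation and element type from the callee name, which is the duplication point 2 warns about.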

>
> 5. Optimizing code written with SV intrinsic calls can be done
> with about the same implementation effort in both cases
> (builtins and changing the IR.) I do not believe that changing
> the IR to add SV types makes any optimizer work magically all
> of a sudden: no free lunch. In both cases we need to amend
> all the passes that remove inefficiencies in code written with
> SV intrinsic calls.
>
> 6. We will need a new SV auto-vectorizer pass that relies less on
> if-conversion, runtime disambiguation, and unroll for the prolog/epilog,

It's not obvious to me that this is true. Can you elaborate? Even with
SV, it seems like you still need if-conversion and pointer checking, and
unrolling the prologue/epilogue loops is handled later anyway by the
full/partial unrolling pass, so I don't see any fundamental change there.

What is true is that we need to change the way that the vectorizer deals
with horizontal operations (e.g., reductions) - these all need to turn
into intrinsics to be handled later. This seems like a positive change,
however.

> as the HW is helping with all these cases and expands the number
> of loops that can be vectorized.
> Having native SV types or just plain builtins is equivalent here
> as the code generator of the vectorizer can be improved to not
> generate inefficient code.

This does not seem equivalent because while the mapping between scalar
operations and SV operations is straightforward with the adjusted type
system, the mapping between the scalar operations and the intrinsics
will require extra infrastructure to implement the mapping. Not that
this is necessarily difficult to build, but it needs to be updated
whenever we otherwise change the IR, and thus adds additional
maintenance cost for all of us.
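
The maintenance cost described above can be sketched as follows (Python, producing pseudo-IR strings only; the intrinsic names and the `nxv` suffix scheme in the table are hypothetical and chosen just for this example). With a native scalable type the scalar opcode carries over unchanged, whereas the intrinsic route needs a side table that has to be extended whenever a new opcode is supported.

```python
def widen_native(opcode, elem_ty, min_lanes=4):
    """With a native scalable type, the opcode carries over; only the type changes."""
    return f"{opcode} <vscale x {min_lanes} x {elem_ty}> %a, %b"

# Hypothetical intrinsic names, invented for this sketch only.
INTRINSIC_FOR = {
    "add": "hypothetical.sv.add",
    "sub": "hypothetical.sv.sub",
}

def widen_intrinsic(opcode, elem_ty, min_lanes=4):
    """With intrinsics, every scalar opcode needs an entry in a side table."""
    name = INTRINSIC_FOR[opcode]  # raises KeyError when the table lags behind
    return f"call @{name}.nxv{min_lanes}{elem_ty}(%a, %b)"

assert widen_native("add", "i32") == "add <vscale x 4 x i32> %a, %b"
assert widen_intrinsic("add", "i32") == "call @hypothetical.sv.add.nxv4i32(%a, %b)"

# An opcode missing from the table works at once in the native form...
assert widen_native("mul", "i32") == "mul <vscale x 4 x i32> %a, %b"

# ...but the intrinsic form fails until someone extends the table.
table_is_stale = False
try:
    widen_intrinsic("mul", "i32")
except KeyError:
    table_is_stale = True
assert table_is_stale
```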

Thanks again,

Hal

>
> 7. This is my point of view, I may be wrong,
> so don't let me slow you down in getting it done!
>
> Sebastian

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Chandler Carruth via llvm-dev

Mar 27, 2019, 7:46:09 PM

to Finkel, Hal J., Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On Wed, Mar 27, 2019 at 4:40 PM Finkel, Hal J. <hfi...@anl.gov> wrote:

> On 3/27/19 4:33 PM, Sebastian Pop via llvm-dev wrote:
> > I am of the opinion that handling scalable vectors (SV)
> > as builtins and an opaque SV type is a good option:
> >
> > 1. The implementation of SV with builtins is simpler than changing the IR.
> >
> > 2. Most of the transforms in opt are scalar opts; they do not optimize
> > vector operations and will not deal with SV either.
> >
> > 3. With builtins there are fewer places to pay attention to,
> > as most of the compiler is already dealing with builtins in
> > a neutral way.
> >
> > 4. The builtin approach is more targeted and confined: it allows
> > to amend one optimizer at a time.
> > In the alternative of changing the IR, one has to touch all the
> > passes in the initial implementation.
>
> Interestingly, with similar considerations, I've come to the opposite
> conclusion. While in theory the intrinsics and opaque types are more
> targeted and confined, this only remains true *if* we don't end up
> teaching a bunch of transformations and analysis passes about them.

While I continue to disagree with Hal about whether we want this at all, FWIW I agree on this specific point.

*If* we are going to end up teaching a bunch of transformations about these, I think it will in many cases be preferable to have them in the IR.

Using intrinsics and an opaque type, IMO, makes the most sense as a pass-through mechanism for allowing very limited usage without investing in any significant mid-level analysis or transformation awareness.

-Chandler

Sebastian Pop via llvm-dev

Mar 27, 2019, 10:46:28 PM

to Chandler Carruth, Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On Wed, Mar 27, 2019 at 6:46 PM Chandler Carruth <chan...@gmail.com> wrote:
> Using intrinsics and an opaque type, IMO, makes the most sense as a pass-through mechanism for allowing very limited usage without investing in any significant mid-level analysis or transformation awareness.

Ok, so if there are just a few passes to be amended, we may want to go
the opaque type route.

Can we list the passes that do something to the current vector types?
InstCombine
Vectorizer
...
Those passes will be immediate candidates to be taught about SV.
Any other pass/analysis like GVN for vectors is pie in the sky.

Finkel, Hal J. via llvm-dev

Mar 28, 2019, 2:52:32 AM

to Sebastian Pop, Chandler Carruth, Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

On 3/27/19 9:45 PM, Sebastian Pop wrote:
> On Wed, Mar 27, 2019 at 6:46 PM Chandler Carruth <chan...@gmail.com> wrote:
>> Using intrinsics and an opaque type, IMO, makes the most sense as a pass-through mechanism for allowing very limited usage without investing in any significant mid-level analysis or transformation awareness.
> Ok, so if there are just a few passes to be amended, we may want to go
> the opaque type route.
>
> Can we list the passes that do something to the current vector types?
> InstCombine

I'm not sure that counting passes is meaningful so much as the amount of
code. How much of InstCombine works on vector types I don't know, but
InstCombine+InstructionSimplify is nearly 38k lines.

> Vectorizer
> ...
> Those passes will be immediate candidates to be taught about SV.
> Any other pass/analysis like GVN for vectors is pie in the sky.

Given that our GVN does not just do value numbering, but also does a lot
of our store-to-load forwarding (and, in fact, our current GVN spends
most of its time doing this), I think it's highly likely that we'd want
to teach GVN how to deal with these types. We do already have a special
infrastructure for doing this kind of thing for
target-intrinsic-returned values within EarlyCSE, and that would likely
be enhanced as well. This asymmetry in capabilities between EarlyCSE and
GVN isn't good, and you could make the argument that using it in GVN,
and thus using this mechanism to teach GVN about SV intrinsics would
also be a good thing, but regardless, one way or the other, I see it as
more likely than not that GVN is also affected.
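
As a rough illustration of why store-to-load forwarding is affected, here is a toy sketch in Python. It is not how LLVM's GVN or EarlyCSE is implemented; the tuple-based instruction format and the name `hypothetical.sv.store` are assumptions made for this example. Any instruction the pass does not understand is treated as a possible memory clobber, so an opaque SV intrinsic placed between a store and a load blocks forwarding unless the pass is taught about it.

```python
def forward_stores(instrs):
    """Replace a load with a copy when the value last stored to that pointer is known.

    instrs is a list of tuples:
      ("store", ptr, value), ("load", dest, ptr), or ("call", name)
    """
    available = {}  # ptr -> value known to be in memory at that address
    out = []
    for ins in instrs:
        if ins[0] == "store":
            _, ptr, val = ins
            # Conservative: a store may alias any other pointer, so forget the rest.
            available = {ptr: val}
            out.append(ins)
        elif ins[0] == "load":
            _, dest, ptr = ins
            if ptr in available:
                out.append(("copy", dest, available[ptr]))
            else:
                out.append(ins)
        else:
            # Unknown instruction (e.g. an opaque intrinsic call): assume it clobbers memory.
            available.clear()
            out.append(ins)
    return out

plain = [("store", "p", "x"), ("load", "y", "p")]
assert forward_stores(plain) == [("store", "p", "x"), ("copy", "y", "x")]

# An opaque call in between defeats forwarding in this toy pass.
blocked = [("store", "p", "x"), ("call", "hypothetical.sv.store"), ("load", "y", "p")]
assert forward_stores(blocked) == blocked
```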

-Hal

>
>
> Sebastian

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Sebastian Pop via llvm-dev

Mar 29, 2019, 12:34:21 PM

to Finkel, Hal J., Francesco Petrogalli via llvm-dev, Chandler Carruth, David Greene, Chris Lattner, nd, Maxim Kuvyrkov

I had a phone conversation yesterday with Graham, Francesco,
and Kristof.

There is one more reason to go with the native type change:
ARM has already written the code with the SV types, and they
have patches ready to be reviewed and integrated in LLVM.

As I don't want to stand in the way of getting SVE in LLVM
as soon as possible, I will also support the integration of the
existing patches and I will help with the review.

Sebastian

Alex Susu via llvm-dev

Jul 13, 2019, 9:32:35 AM

to llvm...@lists.llvm.org

Hello.
I am very interested in adding scalable vector support in my LLVM 8.0 compiler for
the Connex SIMD research processor - but the patches discussed on llvm-dev don't work
well. So can you help me?
The reason I need scalable vector types is that our Connex SIMD/vector processor can be
synthesized in different instances with different numbers of lanes. Using a non-scalable
vector type in our Connex LLVM compiler can result in incorrect code, for example, for the
following LLVM IR instruction:
%res = icmp ult <8 x i16> stepvector, <i16 10, i16 10, i16 10, ..., i16 10>
because the result should differ depending on whether this instruction runs on a
Connex processor with 8 or 16 lanes,
but with a non-scalable vector type the result is always a predicate vector of
all true.
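
The lane-count dependence described above can be checked with a small sketch in Python. The helpers `stepvector` and `icmp_lt` are illustrative stand-ins that model the lanes as plain lists; they are not LLVM or Connex APIs.

```python
def stepvector(lanes):
    """Lane indices 0, 1, ..., lanes - 1."""
    return list(range(lanes))

def icmp_lt(vec, bound):
    """Per-lane comparison against a splat of `bound`."""
    return [x < bound for x in vec]

pred8 = icmp_lt(stepvector(8), 10)
pred16 = icmp_lt(stepvector(16), 10)

# With 8 lanes every index is below 10, so the predicate is all true.
assert all(pred8)

# With 16 lanes only the first 10 lanes are true.
assert pred16 == [True] * 10 + [False] * 6
```

Fixing the type at 8 lanes at compile time bakes in the first result, which is wrong for a 16-lane instance.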

More exactly, I don't care right now whether the LoopVectorize pass generates scalable
vector code. All I want right now is to give an LLVM IR program with scalable
vector code (obtained normally from the ARM SVE LLVM compiler, built from the source code
https://github.com/ARM-software/LLVM-SVE)
to my standard LLVM distro + scalable vector support and have it:
- parse the .ll LLVM IR program
- then generate assembly code for my Connex back end, for which I plan to use
exclusively a scalable vector type like <vscale x 4 x i16>.
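
For reference, a type such as <vscale x 4 x i16> holds vscale * 4 elements, where vscale is a positive integer fixed by the hardware and unknown at compile time. A minimal sketch in Python of that relationship (the helper `lane_count` and its regular expression are written for this example and are not part of LLVM):

```python
import re

# Matches the textual form of a scalable vector type, e.g. "<vscale x 4 x i16>".
_SCALABLE_VEC = re.compile(r"<vscale x (\d+) x (\w+)>")

def lane_count(type_str, vscale):
    """Number of elements in a scalable vector type for a given runtime vscale."""
    m = _SCALABLE_VEC.fullmatch(type_str)
    if m is None:
        raise ValueError(f"not a scalable vector type: {type_str!r}")
    if vscale < 1:
        raise ValueError("vscale must be a positive integer")
    return int(m.group(1)) * vscale

# The same IR type covers both an 8-lane and a 16-lane machine.
assert lane_count("<vscale x 4 x i16>", vscale=2) == 8
assert lane_count("<vscale x 4 x i16>", vscale=4) == 16
```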

I applied the 3 patches discussed in
http://lists.llvm.org/pipermail/llvm-dev/2019-March/130852.html
(https://reviews.llvm.org/D32530, https://reviews.llvm.org/D47770, and
https://reviews.llvm.org/D53137).
However, it seems quite a few features needed for good scalable vector support in LLVM
are missing from these patches. For example, the patches do not parse stepvector.
So I took inspiration from the lib/AsmParser and lib/IR folders of
https://github.com/ARM-software/LLVM-SVE (the ARM SVE LLVM compiler) and changed the
following files in my LLVM 8.0 build:
llvm/lib/AsmParser/LLLexer.cpp
llvm/lib/AsmParser/LLParser.cpp
llvm/lib/AsmParser/LLParser.h
llvm/lib/AsmParser/LLToken.h
llvm/lib/IR/Instructions.cpp
llvm/include/llvm/IR/Constants.h
llvm/include/llvm/IR/Value.def
llvm/lib/IR/Constants.cpp

However, things seem to be a bit more complicated: it seems I also need to patch
files in the clang folder (https://github.com/ARM-software/Clang-SVE, for example
clang/lib/CodeGen/CGExpr.cpp), since I get errors like:
<<tools/clang/lib/CodeGen/CGExpr.cpp:3337: clang::CodeGen::Address
emitArraySubscriptGEP(clang::CodeGen::CodeGenFunction&, clang::CodeGen::Address,
llvm::ArrayRef<llvm::Value*>, clang::QualType, bool, bool, clang::SourceLocation, const
llvm::Twine&): Assertion `isa<llvm::ConstantInt>(idx) &&
cast<llvm::ConstantInt>(idx)->isZero()' failed.>>.

So, since I have difficulties adding good scalable vector support in LLVM 8.0 (taken
from the SVN repo in Mar 2019) I'd like to ask if anybody else has tried adding good
scalable vector support in LLVM recently?

Thank you,
Alex

Graham Hunter via llvm-dev

Jul 16, 2019, 5:41:34 AM

to Alex Susu, llvm...@lists.llvm.org

Hi Alex,

We've only recently managed to get the core scalable vector IR type into the codebase (so it will be present in 9.0); that allows you to write IR with scalable vector types, but there's no backend able to generate code for it yet, and as you mention no support for stepvector (or vscale). Arm will start upstreaming those soon.

-Graham


Renato Golin via llvm-dev

Jul 16, 2019, 10:12:34 AM

to Graham Hunter, llvm...@lists.llvm.org, Nico Weber

How's the Chromium build, btw?

Did you fix all remaining issues with build times?

--renato

Nico Weber via llvm-dev

Jul 16, 2019, 10:18:32 AM

to Renato Golin, llvm...@lists.llvm.org

We're currently not aware of any build time regressions, but we're also somewhat reactive here. So far, nobody has complained this time, but we didn't actively check build time graphs as far as I know. So it's possible that all regressions are fixed, or there's a slowdown but it's below the complaint threshold :)

Graham Hunter via llvm-dev

Jul 16, 2019, 11:03:16 AM

to Renato Golin, llvm...@lists.llvm.org, nd, Nico Weber

Hi Renato,

> On 16 Jul 2019, at 15:12, Renato Golin <reng...@gmail.com> wrote:
>
> How's the Chromium build, btw?
>
> Did you fix all remaining issues with build times?

I've fixed the issues I have reproducers for at least.

I *think* we're in the clear given that I removed the walk through the type maps in the verifier.

-Graham

Alex Susu via llvm-dev

Jul 16, 2019, 4:53:47 PM

to llvm...@lists.llvm.org

Hello, Graham,
Could you please tell me if you have newer public patches for the core scalable
vector IR type than the ones already mentioned at

http://lists.llvm.org/pipermail/llvm-dev/2019-March/130852.html
(https://reviews.llvm.org/D32530, https://reviews.llvm.org/D47770, and
https://reviews.llvm.org/D53137).

Also, can anybody tell me what was the official Clang SVN revision used for the Clang
SVE (https://github.com/ARM-software/Clang-SVE)?