[llvm-dev] Multi-Threading Compilers


Nicholas Krause via llvm-dev

Feb 27, 2020, 10:05:25 PM
to llvm...@lists.llvm.org
Greetings All,

For the last few months, since the GCC Cauldron, I've been researching and
helping plan multi-threading for GCC. I've posted my GCC ideas here:
https://docs.google.com/document/d/1po_RRgSCtRyYgMHjV0itW8iOzJXpTdHYIpC9gUMjOxk/edit

as I've heard there is interest on the LLVM side as well. I've talked to
some of the IBM folks from the Toronto area about it face to face, as they
have more middle-end and back-end knowledge than me.

Not sure if anyone is interested in my ideas or wants me to extend my
help to the LLVM side,

Nick
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Chris Lattner via llvm-dev

Feb 28, 2020, 12:19:49 AM
to Nicholas Krause, llvm...@lists.llvm.org
Hi Nicholas,

You might want to check out MLIR: its pass manager is already automatically and implicitly multithreaded.

-Chris

Nicholas Krause via llvm-dev

Feb 28, 2020, 12:45:16 AM
to Chris Lattner, llvm...@lists.llvm.org

On 2/28/20 12:19 AM, Chris Lattner wrote:
> Hi Nicholas,
>
> You might want to check out MLIR: its pass manager is already automatically and implicitly multithreaded.
>
> -Chris

Chris,

I was aware that LLVM was moving to MLIR at some point, partly because of
this. I'm curious how MLIR deals with IPO, as that's the problem I was
running into. Even if you have pipelines, what guarantees do we have about
IPO, if any? For example, if an analysis pass runs after a loop pass to
gather good heuristic data, does MLIR ensure that? The problem isn't
building a pipeline so much as IPO issues: rescheduling within a pipeline,
or incorrectly splitting work up into pipelines. I have yet to find a good
solution on the GCC side either, and it seems there will be issues with
this no matter what; I'm not sure what trade-offs the LLVM side is willing
to make.

The other issue is that, to my knowledge, graph data structures do not
allow insertion of multiple nodes at the same time, and it's unclear how to
scale the graphs for callers or SSA. I'm not sure if you have encapsulated
the operators and data structures in a series of classes, as that would be
the first part. The other part is figuring out how to link and interact
with build systems so that threads are launched from make -j or similar.


Thanks for the pointer to MLIR. Maybe my IPO concern is not really there,
though after reading parts of the docs it still seems to be an issue, if a
smaller one. Thanks for the prompt response,

Nick

Chris Lattner via llvm-dev

Feb 28, 2020, 12:56:43 AM
to Nicholas Krause, llvm...@lists.llvm.org, River Riddle
On Feb 27, 2020, at 9:44 PM, Nicholas Krause <xero...@gmail.com> wrote:
On 2/28/20 12:19 AM, Chris Lattner wrote:
Hi Nicholas,

You might want to check out MLIR: its pass manager is already automatically and implicitly multithreaded.

-Chris
Chris,

I was aware that LLVM was moving to MLIR at some point, partly because of this. I'm curious
how MLIR deals with IPO, as that's the problem I was running into. Even if you have
pipelines, what guarantees do we have about IPO, if any? For example, if an
analysis pass runs after a loop pass to gather good heuristic data, does MLIR ensure
that? The problem isn't building a pipeline so much as IPO issues: rescheduling
within a pipeline, or incorrectly splitting work up into pipelines. I have yet to find a good
solution on the GCC side either, and it seems there will be issues with this no matter what;
I'm not sure what trade-offs the LLVM side is willing to make.

Hi Nicholas,

It is very difficult to mix things like loop passes and IPO passes in any system, unless one is prepared to preserve the analysis results that the other depends on.  One nice thing about MLIR is that it defines this away by allowing explicit representation of loops in the IR.  This means there isn’t an analysis pass that gets “invalidated” like the existing LLVM LoopInfo analysis pass; it is just structurally impossible for this to happen.  I don’t think that all of the AnalysisPasses in LLVM have been super successful.
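
[Editorial sketch of "loops represented explicitly in the IR", written in today's upstream MLIR syntax; the scf/arith/memref dialect names postdate this thread, which would have used the loop/std dialects in early 2020:]

```mlir
// The loop is a first-class operation with an explicit induction variable
// and an explicit loop-carried value, so there is no separate LoopInfo-style
// analysis that a transformation could forget to preserve or invalidate.
func.func @sum(%n: index, %buf: memref<?xf32>) -> f32 {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %zero = arith.constant 0.0 : f32
  %r = scf.for %i = %c0 to %n step %c1 iter_args(%acc = %zero) -> (f32) {
    %v = memref.load %buf[%i] : memref<?xf32>
    %s = arith.addf %acc, %v : f32
    scf.yield %s : f32
  }
  return %r : f32
}
```

A loop pass and an interprocedural pass both see the same structure directly in the IR, rather than one of them depending on an analysis result the other silently invalidates.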

The other issue is that, to my knowledge, graph data structures do not allow insertion
of multiple nodes at the same time, and it's unclear how to scale the graphs for callers or SSA. I'm not
sure if you have encapsulated the operators and data structures in a series of
classes, as that would be the first part. The other part is figuring out how to link and
interact with build systems so that threads are launched from make -j or similar.

Yeah, MLIR handles that by factoring global use-def chains on symbols (e.g. functions) as being different than local use-def chains. This makes it much more efficient. You can find more information on MLIR symbols here.

Thanks for the pointer to MLIR. Maybe my IPO concern is not really there,
though after reading parts of the docs it still seems to be an issue, if a smaller one.
Thanks for the prompt response,

Sure, happy to help.  It would be great to see a GIMPLE dialect in MLIR someday :-)

-Chris

Nicholas Krause via llvm-dev

Feb 28, 2020, 11:35:26 AM
to Chris Lattner, llvm...@lists.llvm.org, River Riddle
Chris,

I asked the GCC community what they want to do about MLIR and am waiting for a reply. Anyhow, what is
the status, and what parts are we planning to move to MLIR in LLVM/Clang?  I've not seen any discussion
on that other than starting to plan for it.  Perhaps we should start a wiki page for that, as I don't think
all parts need to be MLIR. I don't have write access to the wiki, so unfortunately I can't start writing it
up unless I get access.

Regards, and this is one reason you do your research before writing a multi-threaded program: a lot of the
research or work has already been done at this point,
Nick

Johannes Doerfert via llvm-dev

Feb 28, 2020, 11:57:04 AM
to Nicholas Krause, llvm...@lists.llvm.org, River Riddle
On 02/28, Nicholas Krause via llvm-dev wrote:
> Anyhow what is the status and what parts are we planning to move to
> MLIR in LLVM/Clang.  I've not seen any discussion on that other than
> starting to plan for it. 

As far as I know, there is no (detailed/discussed/agreed-upon/...) plan
to move any existing functionality in LLVM-Core or Clang to MLIR. There
are some people who have expressed interest, and there is Chris's plan
for how the transition could look.

If we want to look into multi-threading for the existing LLVM
infrastructure there are certainly possibilities. Computing (function)
analyses results prior to CGSCC and Module passes for example. When it
comes to transformations it is less straight forward but there is still
interest in looking into it.

I'm traveling but if you are interested we can discuss this further next
week.

Cheers,
Johannes

Nicholas Krause via llvm-dev

Feb 28, 2020, 12:14:43 PM
to Johannes Doerfert, llvm...@lists.llvm.org, River Riddle

Johannes,

That's fine. Unfortunately I'm a student, as the GCC side knows, so I will
only be working on this in my spare time. We did get somewhere with my paper
here:
https://docs.google.com/document/d/1po_RRgSCtRyYgMHjV0itW8iOzJXpTdHYIpC9gUMjOxk/edit

I've been looking into the SSA trees currently, but we can discuss what you
want me to start researching first. The research does need to happen,
including profiling, to avoid mistakes in choosing major implementation details.

Otherwise just ping me and the list when you're ready, or other people can,

Chris Lattner via llvm-dev

Feb 28, 2020, 9:04:05 PM
to Johannes Doerfert, llvm...@lists.llvm.org, Tatiana Shpeisman, River Riddle
On Feb 28, 2020, at 8:56 AM, Johannes Doerfert <johannes...@gmail.com> wrote:
> On 02/28, Nicholas Krause via llvm-dev wrote:
>> Anyhow what is the status and what parts are we planning to move to
>> MLIR in LLVM/Clang. I've not seen any discussion on that other than
>> starting to plan for it.
>
> As far as I know, there is no (detailed/discussed/agreed-upon/...) plan
> to move any existing functionality in LLVM-Core or Clang to MLIR. There
> are some people who have expressed interest, and there is Chris's plan
> for how the transition could look.

Yep, agreed, I gave a talk a couple days ago (with Tatiana) with a proposed path forward, but explained it as one possible path. We’ll share the slides publicly in a few days after a couple things get taken care of.

-Chris

>
> If we want to look into multi-threading for the existing LLVM
> infrastructure there are certainly possibilities. Computing (function)
> analyses results prior to CGSCC and Module passes for example. When it
> comes to transformations it is less straight forward but there is still
> interest in looking into it.
>
> I'm traveling but if you are interested we can discuss this further next
> week.
>
> Cheers,
> Johannes


David Blaikie via llvm-dev

Feb 29, 2020, 5:09:22 PM
to Nicholas Krause, Chandler Carruth, Alina Sbirlea, llvm...@lists.llvm.org
+Chandler & Alina for new pass manager context

On Thu, Feb 27, 2020 at 9:45 PM Nicholas Krause via llvm-dev <llvm...@lists.llvm.org> wrote:


On 2/28/20 12:19 AM, Chris Lattner wrote:
> Hi Nicholas,
>
> You might want to check out MLIR: its pass manager is already automatically and implicitly multithreaded.
>
> -Chris
Chris,

I was aware that LLVM was moving to MLIR at some point due to this.

FWIW: I don't /think/ that's quite an accurate representation of the situation. MLIR is now part of the LLVM project, but so far I haven't seen any discussions of moving the core LLVM project itself to MLIR (admittedly I'm not super clear on the details of MLIR or what that'd look like - I would've guessed that LLVM would itself be a lower level that a given MLIR stack might lower to, etc). There's been some discussion of folks interested in experimenting with using MLIR in Clang, though that's not a project goal right now.
 
I'm curious as to how MLIR deals with IPO, as that's the problem I was running into.

FWIW I believe LLVM's new pass manager (NPM) was designed with parallelism and the ability to support this situation (that MLIR doesn't? Or doesn't to the degree/way in which the NPM does). I'll leave it to folks (Chandler probably has the most context here) to provide some more detail there if they can/have time.
 

Chris Lattner via llvm-dev

Feb 29, 2020, 5:19:40 PM
to David Blaikie, llvm...@lists.llvm.org
On Feb 29, 2020, at 2:08 PM, David Blaikie <dbla...@gmail.com> wrote:
I'm curious as to how MLIR deals with IPO, as that's the problem I was running into.

FWIW I believe LLVM's new pass manager (NPM) was designed with parallelism and the ability to support this situation (that MLIR doesn't? Or doesn't to the degree/way in which the NPM does). I'll leave it to folks (Chandler probably has the most context here) to provide some more detail there if they can/have time.

Historically speaking, all of the LLVM pass managers have been designed to support multithreaded compilation (check out the ancient history of the WritingAnLLVMPass doc if curious).

The problem is that LLVM has global use-def chains on constants, functions and globals, etc, so it is impractical to do this.  Every “inst->setOperand” would have to be able to take locks or use something like software transactional memory techniques in their implementation.  This would be very complicated and very slow.

MLIR defines this away from the beginning.  This is a result of the core IR design, not the pass manager design itself.

-Chris

David Blaikie via llvm-dev

Feb 29, 2020, 5:25:35 PM
to Chris Lattner, llvm...@lists.llvm.org
On Sat, Feb 29, 2020 at 2:19 PM Chris Lattner <clat...@nondot.org> wrote:
On Feb 29, 2020, at 2:08 PM, David Blaikie <dbla...@gmail.com> wrote:
I'm curious as to how MLIR deals with IPO, as that's the problem I was running into.

FWIW I believe LLVM's new pass manager (NPM) was designed with parallelism and the ability to support this situation (that MLIR doesn't? Or doesn't to the degree/way in which the NPM does). I'll leave it to folks (Chandler probably has the most context here) to provide some more detail there if they can/have time.

Historically speaking, all of the LLVM pass managers have been designed to support multithreaded compilation (check out the ancient history of the WritingAnLLVMPass doc if curious).

I think the specific thing that might've been a bit different in the NPM was to do with analysis invalidation in a way that's more parallelism-friendly than the previous one - but I may be misrepresenting/misunderstanding some of it.
 
The problem is that LLVM has global use-def chains on constants, functions and globals, etc, so it is impractical to do this.  Every “inst->setOperand” would have to be able to take locks or use something like software transactional memory techniques in their implementation.  This would be very complicated and very slow.

Oh, yeah - I recall that particular limitation being discussed/not addressed as yet.
 
MLIR defines this away from the beginning.  This is a result of the core IR design, not the pass manager design itself.

What does MLIR do differently here/how does it define that issue away? (doesn't have use-lists built-in?)

- Dave
 

-Chris

River Riddle via llvm-dev

Feb 29, 2020, 6:17:38 PM
to David Blaikie, llvm...@lists.llvm.org
The major thing is that constants and global-like objects don't produce SSA values and thus don't have use-lists. https://mlir.llvm.org/docs/Rationale/#multithreading-the-compiler discusses this a bit. 

For constants, the data is stored as an Attribute (context-uniqued metadata that has no use-list and is not SSA). This attribute can either be placed in the attribute list (if the operand is always constant, like the value of a switch case), or it must be explicitly materialized via some operation. For example, the `std.constant` operation will materialize an SSA value from some attribute data.

For references to functions and other global-like objects, we have a non-SSA mechanism built around `symbols`. This essentially uses a special attribute to reference the function by name, instead of by SSA value. You can find more information on MLIR symbols here
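
[Editorial sketch of the symbol mechanism, in current upstream MLIR syntax; in early 2020 the call would have been spelled via the std dialect:]

```mlir
module {
  func.func private @callee(i32) -> i32

  func.func @caller(%a: i32) -> i32 {
    // @callee is a symbol reference held in an attribute, not an SSA use,
    // so rewriting this call site never touches a use-list on the callee.
    %r = func.call @callee(%a) : (i32) -> i32
    return %r : i32
  }
}
```

Because the reference is by name, two threads can rewrite call sites in different functions without synchronizing on the callee.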

Along with the above, there is a trait that can be attached to operations called `IsolatedFromAbove`. This essentially means that no SSA values defined above a region can be referenced from within that region. The pass manager only allows scheduling passes on operations that have this property, meaning that all pipelines are implicitly multi-threaded.

The pass manager in MLIR was heavily inspired by the work on the new pass manager in LLVM, but with specific constraints/requirements that are unique to the design of MLIR. That being said, some usability features were added that would also make great additions to LLVM: instance-specific pass options and statistics, pipeline crash-reproducer generation, etc.

Not sure if any of the above helps clarify, but happy to chat more if you are interested.

-- River
 
- Dave
 

-Chris

Nicholas Krause via llvm-dev

Feb 29, 2020, 7:01:16 PM
to River Riddle, David Blaikie, llvm...@lists.llvm.org
River,
The big thing, from my reading of the pass manager in MLIR, is that it allows us to iterate through
a pass per function or module as a group, allowing it to run async. I've proposed this
on the GCC side:
https://gcc.gnu.org/ml/gcc/2020-02/msg00247.html

It walks through the IPA passes, which are similar to analysis passes on the LLVM side.

Nick

River Riddle via llvm-dev

Feb 29, 2020, 7:24:26 PM
to Nicholas Krause, llvm...@lists.llvm.org
Hi Nicholas,

I can't say anything about the GCC side, but this isn't a particularly novel aspect of the MLIR pass manager. In many ways, the pass manager is the easiest/simplest part of the multi-threading problem. The bigger problem is making sure that the rest of the compiler infrastructure is structured in a way that is thread-safe, or can be made thread-safe. This is why most of the discussion is based around how to model things like constants, global values, etc. When I made MLIR multi-threaded a year ago, a large majority of my time was spent outside of the pass manager. For a real example, I spent much more time just on multi-threaded pass timing than making the pass manager itself multi-threaded.

-- River

Nicholas Krause via llvm-dev

Feb 29, 2020, 8:15:02 PM
to River Riddle, llvm...@lists.llvm.org
Actually, in my experience the biggest problem is whether we can detect IPO and provide async-run guarantees around it. MLIR runs operations, but only for a module or set of functions,
without this. One of my dreams would be to run passes in parallel, including IPO detection, and stop if we cannot continue past an IPO pass or set of passes due to changes.

Maybe MLIR does do that, but it's the one bottleneck that is really hard to fix,

Mehdi AMINI via llvm-dev

Feb 29, 2020, 9:39:38 PM
to Nicholas Krause, llvm...@lists.llvm.org
What MLIR does (and it would require quite some work in LLVM) is make sure that you can process and transform functions in isolation, allowing *local* optimizations to run in parallel. This does not solve the IPO problem you're after. As I understand it, this is a difficult thing to design, and it requires rethinking how you structure the passes and the pass pipeline entirely.

Running function passes and "local" optimizations in parallel in LLVM isn't possible because the structures in the LLVMContext aren't thread-safe, and because the IR itself isn't thread-safe. Something as simple as DCE'ing or CSE'ing a function call requires modifying the callee (through its use-list).

-- 
Mehdi

Nicholas Krause via llvm-dev

Mar 1, 2020, 1:36:20 AM
to Mehdi AMINI, llvm...@lists.llvm.org
I've been thinking about it on the GCC side for the last few months. It's non-trivial, though something similar to this may be
possible, but I will need to research both pass managers better:
https://elixir.bootlin.com/linux/latest/source/include/linux/workqueue.h#L102

I'm looking at some RFCs for the pass manager and SSA iterators on the GCC side. One easy thing to note is that in a lot of core
classes in both GCC and LLVM there are loops that run a lot; LRA on the GCC side, for register allocation, is one example. We may
want to figure out how, and whether, it's a good idea to run these loops in small worker functions used only by the one function
that requires them. The only real question is that some loops are performance-intensive only in corner cases, so we may need
heuristics to decide when to launch, e.g. based on the number of elements being iterated over. Maybe that's a bad idea, but after
looking through the GCC and LLVM sides a little, this seems to be a later, tedious but mostly mechanical fix: figuring out which
loops are expensive enough to justify it.

I believe Johannes was going to reach out next week about the Module classes and we will go from there,

Nick

Neil Nelson via llvm-dev

Mar 1, 2020, 4:30:06 AM
to llvm...@lists.llvm.org
This is a recent desktop, running Xubuntu 19.10, compiling clang and llvm 10.0.0. See below.

For this test, I'm running 14 processors in a GUI VM. The cores are hyperthreaded (processors are twice the cores), but all the cores show negligible activity before the run.

compile_commands.json has 3022 entries.

The ninja compile run lasted 7 minutes and 43 seconds with ~99% usage across all processors throughout.

That is 7*60+43 = 463 seconds.

Compile seconds per entry in compile_commands.json: 463/3022 = 0.1532 seconds. Average compile time per processor would be about 14*0.1532 seconds.

cmake -G Ninja -DLLVM_ENABLE_PROJECTS="clang;llvm" -DLLVM_USE_LINKER=lld -DCMAKE_BUILD_TYPE="Release" -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_ENABLE_LIBPFM=OFF -DRUN_HAVE_GNU_POSIX_REGEX=0 -DRUN_HAVE_THREAD_SAFETY_ATTRIBUTES=0 -Wno-dev ../llvm &> /home/nnelson/Documents/cmake.log

ninja &> /home/nnelson/Documents/ninja.log

Here are some useful pages from Threading Building Blocks.

Task-Based Programming
https://software.intel.com/en-us/node/506100

Appendix A Costs of Time Slicing
https://software.intel.com/en-us/node/506127

The point

When the number of compiles exceeds the number of cores such that all the cores are utilized, nothing is gained by trying to multi-thread the individual compiles. In fact, loading the cores with more threads or tasks than there are cores will reduce compiling efficiency because of time slicing. And sequencing through more tasks rather than fewer, when the cores are not overloaded, will reduce compiling efficiency because more tasks have to be loaded onto and unloaded from the cores.

Neil Nelson

Chris Lattner via llvm-dev

Mar 2, 2020, 1:08:03 PM
to Neil Nelson, llvm...@lists.llvm.org
On Mar 1, 2020, at 1:29 AM, Neil Nelson via llvm-dev <llvm...@lists.llvm.org> wrote:
The point

When the number of compiles exceeds the number of cores such that all the cores are utilized, nothing is gained by trying to multi-thread the individual compiles. In fact, loading the cores with more threads or tasks than there are cores will reduce compiling efficiency because of time slicing. And sequencing through more tasks rather than fewer, when the cores are not overloaded, will reduce compiling efficiency because more tasks have to be loaded onto and unloaded from the cores.

That makes a lot of sense Neil.  Two counterpoints though:

1) In development situations, it is common to rebuild one file (the one you changed) without rebuilding everything.  Speeding that up is useful.

2) LLVM is a library that is used for a lot more than just C compilers.  Many of its use cases are not modeled by bulk “compile all the code” workloads like you describe.


You’re right that multithreading isn’t a panacea, but in my opinion, it is important to be able to provide access to multicore compilation for use cases that do benefit from it.

-Chris

Alexandre Ganea via llvm-dev

Mar 2, 2020, 3:19:26 PM
to Chris Lattner, Neil Nelson, llvm...@lists.llvm.org

In addition to what Chris said, there’s also the case of large TUs / Unity files. Given that the compilation of a single TU is currently not multi-threaded, you can get “spikes” like the one below, where only one core (out of many) is working. One minute wasted, when I have many cores available:

[image: build timeline with one core at 100% while the others sit idle]
For this specific case, not having Unity files makes the build uniform (no spikes like the one above), but it takes 10x more time to compile (20k TUs and 25k .h files compiled, which reduce down to ~600 Unity files).

 

To fix the spike, you then have to resort to an iterative optimization algorithm, running nightly, to find the best trade-off between Unity file size and build times. This complicates things further.

 

The point being, +1 for multi-threading the compiler :-)

 

Alex.

 

From: llvm-dev <llvm-dev...@lists.llvm.org> On behalf of Chris Lattner via llvm-dev
Sent: March 2, 2020 1:08 PM
To: Neil Nelson <nne...@infowest.com>
Cc: llvm...@lists.llvm.org
Subject: Re: [llvm-dev] Multi-Threading Compilers

Neil Nelson via llvm-dev

Mar 2, 2020, 4:06:23 PM
to Chris Lattner, llvm...@lists.llvm.org

Very good. I just saw that Alex is making some good points.

The TBB pages referenced earlier are an excellent reference for multi-threading, or for TBB multi-tasking. They illustrate issues that a good threading design should consider which are not commonly recognized when embarking on such a design. I would say the task scheduler is their most comprehensive approach.

Clearly, I am not familiar enough with LLVM to be making any definite prescription and recommend your more knowledgeable judgment.

But in the spirit of attempting a more complete perspective, we may consider that a threading design for LLVM is not simple. And I suspect it is not nearly as simple as what has been explored to this point, though I am not saying the current ideas are simple, in that I do not fully understand them.

For example, you remarked that there are use cases where a multi-threading compile would be useful, and having a realistic appreciation for what those use cases may be is important. But then on that point: if we were to have threading on the compile to an object file, would we then overload the cores and make them slower when using ninja to compile LLVM? How would the compile threading take this other use case into account, if that was something to consider?

Do we encumber the LLVM design with threading such that the trade-off against its use-case benefits is not well justified? My sense is that if threading were judged a reasonable goal, a preliminary design would be presented after some research, one that would initially complement non-threading goals: a general improvement that would yield a better track for threading later. This reflects the sense that a good threading design is not simple and will impact a significant portion of LLVM. And it could well be that that is what is happening now.

How would threading impact future LLVM development? My sense is that some reasonable portion of future development would be made more difficult.

You mentioned that single object compiling would have a benefit. Here are some factors that could mitigate that benefit.

Although core speeds are not expected to increase, remaining in the 4-5 gigahertz range, IPC (instructions per cycle) is expected to increase at a slow rate. Over time there will be a natural reduction in compile time for a given benchmark compile.

There is also a gain in compile time for small source files as against larger source files, and some design argument could be made for a maximum source file size, to reduce complexity, that would also help keep compile time down. Is there a reason to expect an increase in source file size that would increase compile time?

Noting that the LLVM compile test from my prior post gave an average compile time of 14*0.1532 = 2.1448 seconds, is compile time a significant or a marginal issue? What would be the target for an average compile time?

Neil Nelson

Fedor Sergeev via llvm-dev

Mar 3, 2020, 7:04:02 AM
to llvm...@lists.llvm.org, River Riddle
I can't say anything about the GCC side, but this isn't a particularly novel aspect of the MLIR pass manager. In many ways, the pass manager is the easiest/simplest part of the multi-threading problem. The bigger problem is making sure that the rest of the compiler infrastructure is structured in a way that is thread-safe, or can be made thread-safe. This is why most of the discussion is based around how to model things like constants, global values, etc. When I made MLIR multi-threaded a year ago, a large majority of my time was spent outside of the pass manager. For a real example, I spent much more time just on multi-threaded pass timing than making the pass manager itself multi-threaded.
Picking on this point, perhaps a bit off-topic for the whole discussion.
I have recently realized that for the purpose of multi-threaded pass timing currently existing LLVM timers

Fedor Sergeev via llvm-dev

Mar 3, 2020, 7:14:31 AM
to River Riddle, llvm...@lists.llvm.org
Oops.. sorry, hit send too soon..

I can't say anything about the GCC side, but this isn't a particularly novel aspect of the MLIR pass manager. In many ways, the pass manager is the easiest/simplest part of the multi-threading problem. The bigger problem is making sure that the rest of the compiler infrastructure is structured in a way that is thread-safe, or can be made thread-safe. This is why most of the discussion is based around how to model things like constants, global values, etc. When I made MLIR multi-threaded a year ago, a large majority of my time was spent outside of the pass manager. For a real example, I spent much more time just on multi-threaded pass timing than making the pass manager itself multi-threaded.
Picking on this point, perhaps a bit off-topic for the whole discussion.


I have recently realized that, for the purpose of multi-threaded pass timing, the currently
existing LLVM timers are hardly suitable, since they measure per-process time instead of
per-thread time (and there seem to be no portable LLVM interfaces for a per-thread time query :( ).

At first glance it seems that in your MLIR timing examples all the times are also per-process.
How do you handle cases when half of your threads are doing something else?
And if you handle it per-thread, can you point me to the code doing that, please :)

regards,
   Fedor.

PS as I read MLIR docs linked above I see quite a bunch of features that would be very welcome in LLVM core...

Fedor Sergeev via llvm-dev

Mar 3, 2020, 7:27:25 AM
to David Blaikie, llvm...@lists.llvm.org


On 3/1/20 1:08 AM, David Blaikie via llvm-dev wrote:
+Chandler & Alina for new pass manager context

On Thu, Feb 27, 2020 at 9:45 PM Nicholas Krause via llvm-dev <llvm...@lists.llvm.org> wrote:


On 2/28/20 12:19 AM, Chris Lattner wrote:
> Hi Nicholas,
>
> You might want to check out MLIR: its pass manager is already automatically and implicitly multithreaded.
>
> -Chris
Chris,

I was aware that LLVM was moving to MLIR at some point due to this.

FWIW: I don't /think/ that's quite an accurate representation of the situation. MLIR is now part of the LLVM project, but so far I haven't seen any discussions of moving the core LLVM project itself to MLIR (admittedly I'm not super clear on the details of MLIR or what that'd look like - I would've guessed that LLVM would itself be a lower level that a given MLIR stack might lower to, etc). There's been some discussion of folks interested in experimenting with using MLIR in Clang, though that's not a project goal right now.
 
I'm curious as to how MLIR deals with IPO, as that's the problem I was running into.

FWIW I believe LLVM's new pass manager (NPM) was designed with parallelism and the ability to support this situation (that MLIR doesn't? Or doesn't to the degree/way in which the NPM does). I'll leave it to folks (Chandler probably has the most context here) to provide some more detail there if they can/have time.
Indeed, the NPM does enable more parallelism compared to the old one.
However, this parallelism is much more useful for our "multiple JIT compilations in one process" case, where each thread
is doing its own compilation on its own module/context. Independent pass pipelines, settings per pass instance,
individual caching analysis managers: everything helps.
Yet for the kind of multi-threaded compilation discussed here, where a single context/IR unit is shared across many threads,
things become much more involved due to limitations of the IR itself, as others have rightfully pointed out here.

regards,
  Fedor.

Chris Lattner via llvm-dev

Mar 3, 2020, 12:23:21 PM
to Fedor Sergeev, llvm...@lists.llvm.org
On Mar 3, 2020, at 4:12 AM, Fedor Sergeev via llvm-dev <llvm...@lists.llvm.org> wrote:
Oops.. sorry, hit send too soon..
I can't say anything about the GCC side, but this isn't a particularly novel aspect of the MLIR pass manager. In many ways, the pass manager is the easiest/simplest part of the multi-threading problem. The bigger problem is making sure that the rest of the compiler infrastructure is structured in a way that is thread-safe, or can be made thread-safe. This is why most of the discussion is based around how to model things like constants, global values, etc. When I made MLIR multi-threaded a year ago, a large majority of my time was spent outside of the pass manager. For a real example, I spent much more time just on multi-threaded pass timing than making the pass manager itself multi-threaded.
Picking on this point, perhaps a bit off-topic for the whole discussion.

I have recently realized that for the purpose of multi-threaded pass timing currently existing LLVM timers
are hardly suitable since they measure per-process time instead of per-thread time.
(and there seem to be no portable LLVM interfaces for per-thread time query :( )
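For what it's worth, POSIX does expose a per-thread CPU clock, which is one portable way to get this. A minimal sketch, assuming a POSIX system (the helper name is made up, not an LLVM interface):

```cpp
#include <ctime>
#include <cstdint>

// Per-thread CPU time in nanoseconds via the POSIX per-thread clock.
// CLOCK_THREAD_CPUTIME_ID accounts only for time spent on the calling
// thread; CLOCK_PROCESS_CPUTIME_ID would sum time across all threads.
int64_t threadCpuTimeNs() {
  timespec ts;
  if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0)
    return -1; // clock not available
  return int64_t(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}
```

Calling something like this from each worker thread, rather than a process-wide timer, would attribute time to the thread actually running the pass.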

At first glance, it seems that in your MLIR timing examples all the times are also per-process.
How do you handle cases when half of your threads are doing something else?
And if you handle it per-thread - can you point me to the code doing that, pls :)

Indeed, how you account for things is important. Please take a look at “Multi-threaded Pass Timing” on this page:

-Chris

Chris Lattner via llvm-dev

Mar 3, 2020, 8:38:02 PM
to Johannes Doerfert, llvm...@lists.llvm.org, Tatiana Shpeisman, River Riddle
On Feb 28, 2020, at 6:03 PM, Chris Lattner <clat...@nondot.org> wrote:

On Feb 28, 2020, at 8:56 AM, Johannes Doerfert <johannes...@gmail.com> wrote:
On 02/28, Nicholas Krause via llvm-dev wrote:
Anyhow what is the status and what parts are we planning to move to
MLIR in LLVM/Clang.  I've not seen any discussion on that other than
starting to plan for it.

As far as I know, there is no (detailed/discussed/agreed upon/...) plan
to move any existing functionality in LLVM-Core or Clang to MLIR. There
are some people that expressed interest, and there is Chris's plan for
how the transition could look.

Yep, agreed, I gave a talk a couple days ago (with Tatiana) with a proposed path forward, but explained it as one possible path.  We’ll share the slides publicly in a few days after a couple things get taken care of.

Hi all,

Here is a link to the CGO presentation slides (outlining a possible path to incremental adoption of MLIR in Clang) for anyone curious.

-Chris

David Greene via llvm-dev

Mar 6, 2020, 11:52:38 AM
to Neil Nelson, Chris Lattner, llvm...@lists.llvm.org
Neil Nelson via llvm-dev <llvm...@lists.llvm.org> writes:

> For example, you remarked there are use cases, and having a realistic
> appreciation for what use cases there may be is important, where a
> multi-threading compile would be useful.

LTO could significantly benefit.

> There is also a gain in compile time for small source files as against
> larger source files and some argument with regard to design could be
> made for some maximum source file size to reduce complexity that would
> also address keeping compile time down. Is there a reason to expect an
> increase in source file size that would increase compile time?

One data point: we see many customer codes with generated sources.
Those sources just keep growing and growing.

> Noting the LLVM compile test from my prior post gave an average compile
> time of 14*0.1532 = 2.1448 seconds, is compile time a significant or
> marginal issue? What would be the target for an average compile time?

It is significant for us. Over the years we have even added special
options just to speed up compile time for customers' develop-test-debug
cycles.

-David

Nicholas Krause via llvm-dev

Mar 18, 2020, 2:23:51 AM
to Chris Lattner, Johannes Doerfert, llvm...@lists.llvm.org
Greetings,
As per David Blaikie's suggestion, I'm merging the two threads for this discussion. The original commenter is Johannes Doerfert,
starting with "Hey,":
Hey,

Apologies for the wait, everything right now is going crazy..
Compiler folks are very busy people, as there aren't many of us, unfortunately, so no need to
apologize. I've yet to hear from anyone on the GCC side and will wait until after GCC 11
is released. Also, there are the health issues around the coronavirus (COVID-19) to consider.

I think we should early in move this conversation on the llvm Dev list but generally speaking we can see three options here:
1) parallelize single passes or a subset of passes that are known to not interfere, e.g. the attributor,
2) parallelize analysis pass execution before a transformation that needs them,
3) investigate what needs to be done for a parallel execution of many passes, e.g. How can we avoid races on shared structure such as the constant pool.
I was researching this on and off for the last few months, in terms of figuring out how to make the pass manager itself async. It's not easy, and I'm not even
sure if that's possible. I'm not sure about GIMPLE, as I would have to ask the middle-end maintainer on the GCC side, but LLVM IR does not seem to have shared-state
detection in the core classes, and the same goes for the back ends. So yes, this would interest me.

The first place to start is determining which data structures are definitely shared. The biggest ones seem to be basic blocks and function definitions in terms of shared state, as
those would be shared by passes running on each function. We should start by looking at implementing locks or ref counting there first, if you're OK with that.
It also allows me to understand a little more concretely the linkage between the core classes, as would be required for multi-threading LLVM. In addition,
it allows us to look into partitioning issues with threads at the same time, in terms of how to do it.

As was discussed on the previous thread - generally the assumption is that one wouldn't try to run two function optimizations on the same function at the same time, but, for instance - run function optimizations on unrelated functions at the same time (or CGSCC passes on distinct CGSCCs). But this is difficult in LLVM IR because use lists are shared - so if two functions use the same global variable or call the same 3rd function, optimizing out a function call from each of those functions becomes a write to shared state when trying to update the use list of that 3rd function. MLIR apparently has a different design in this regard that is intended to be more amenable to these situations.
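To make the sharing concrete, here is a toy model (invented classes, not LLVM's actual Value/Use machinery) of the situation described above: two functions calling the same third function both touch that callee's single use list when a call site is added or deleted:

```cpp
#include <vector>
#include <algorithm>

// Toy model: every value keeps ONE global list of its users, as LLVM IR
// does today. Two unrelated "functions" that call the same callee both
// append to, and erase from, that single shared list.
struct ToyValue {
  std::vector<const void *> users; // one list shared by ALL users

  void addUse(const void *u) { users.push_back(u); }
  void removeUse(const void *u) {
    users.erase(std::find(users.begin(), users.end(), u));
  }
};

struct ToyFunction {
  ToyValue *callee = nullptr;

  void addCall(ToyValue &c) { callee = &c; c.addUse(this); }
  void deleteCall() { callee->removeUse(this); callee = nullptr; }
};
```

Calling `deleteCall()` on two such functions from two different threads would race on `callee->users`, which is exactly why per-function parallelism in LLVM IR needs either locking or a different use-list design.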
 

If others want to chip it as I've CCed the list, that's fine as well and this should be up for discussion with the whole community.

I've given up on the idea of an async pass manager, as it seems to require IR-level detection of changed state between passes, but maybe I'm wrong,

The other part of this discussion before Hey is already on this thread,

Nick

Nicolai Hähnle via llvm-dev

Mar 18, 2020, 9:50:11 AM
to Nicholas Krause, llvm...@lists.llvm.org

As mentioned on the other thread, the main challenge here is in the
use lists of constant values (which includes also globals and
functions). Right now, those are linked lists that are global for an
entire LLVMContext. Every addition or removal of a use of a constant
has to touch them, and doing such fine-grained locking doesn't seem
like a great idea.

So this is probably the biggest and seemingly most daunting thing
you'd have to address first, but it's feasible and seems like a good
idea to evolve LLVM IR in a direction where it ends up looking more
like MLIR and can avoid these locks.

Cheers,
Nicolai


--
Lerne, wie die Welt wirklich ist,
aber vergiss niemals, wie sie sein sollte.

Nicholas Krause via llvm-dev

Mar 18, 2020, 10:06:03 PM
to Nicolai Hähnle, llvm...@lists.llvm.org

GCC has the same issues in terms of certain core structures, so I'm not
really surprised.


>
> So this is probably the biggest and seemingly most daunting thing
> you'd have to address first, but it's feasible and seems like a good
> idea to evolve LLVM IR in a direction where it ends up looking more
> like MLIR and can avoid these locks.

Sure that makes sense I will see what Johannes wants to start with.

Nick
>
> Cheers,
> Nicolai

Johannes Doerfert via llvm-dev

Mar 19, 2020, 5:32:19 PM
to Nicholas Krause, Nicolai Hähnle, llvm...@lists.llvm.org

I think addressing this issue first makes sense. I would however start
by determining the actual impact of different design choices here. I
mean, do we know locks will be heavily contended? If I had to guess I'd
say most passes will not create or modify functions nor add or remove
calls. I further guess that passes which create/query llvm::Constant
values will do so for ConstantInt between -1 and 2, I mean most of the
time. This might be wrong but we should for sure check before we
redesign the entire constant handling (as MLIR did). My suggestion is to
profile first. What we want is to monitor the use-list of constants but
I'm not sure if that is easy off the top of my head. What we can do
easily is to print a message in the methods that are used to "create" a
constant, thus the constructors (of llvm::Constant) and the
ConstantXXX::get() methods. We print the pass names and these "constant
generation" messages in a run of the test suite and analyze the result.
What passes create constants, how often, which (kind of) constants, etc.
We should also determine if any pass ever walks the use list of
constants. I know we do it for global symbols but I don't know we do it
for others. That said, I think it is sensible to distinguish global
symbols and other constants at some point because (I think) we use them
differently.
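As a sketch of that profiling idea (the hook and counter names are hypothetical; real instrumentation would live in the llvm::Constant constructors and the ConstantXXX::get() methods):

```cpp
#include <atomic>

// Count "constant created" events, split by the guess above that most
// ConstantInts fall in [-1, 2]. Atomics keep the hook safe to call
// from multiple threads during a parallel compile.
std::atomic<long> smallIntCount{0};
std::atomic<long> otherIntCount{0};

// Hypothetical hook, imagined as called from ConstantInt::get().
void noteConstantIntCreated(long value) {
  if (value >= -1 && value <= 2)
    ++smallIntCount;
  else
    ++otherIntCount;
}
```

Dumping such counters (together with the running pass name) over a test-suite run would give the "which passes create which constants, and how often" data this paragraph asks for.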

From there we decide how to move forward. Localized constants, as MLIR
has them, some locking (or similar solution), or maybe just restrictions
on the parallel execution of passes.

I hope this makes some sense.

Cheers,
  Johannes

Nicholas Krause via llvm-dev

Mar 19, 2020, 8:38:08 PM
to Johannes Doerfert, Nicolai Hähnle, llvm...@lists.llvm.org

If you're talking about this class:
https://llvm.org/doxygen/classllvm_1_1Constant.html

Then yes, that makes sense in terms of getting the data. The only three
questions I would ask are:
1. There are methods in pass classes for getting their names. However,
the problem would be adding a switch statement to check what we're
passing into the constructor. I'm not sure what the top-level getName
etc. class is; if there is one, I'm assuming Value or ModulePass. This
makes it easier, as you don't have to wonder about data types but can
walk down the virtual hierarchy. Names, at least to my knowledge, are a
top-level feature and we should just use them there. So just do:
XXX->getName(), where XXX is the top-level class(es).
2. Lock contention is rather vague. For example, we could have multiple
readers and few writers, so we would also need to figure that out in
order to choose the right locks. Thoughts?
3. Does LLVM have something like the GCC compile farm? At some point
we're going to need to test on larger machines for scaling and race
conditions.

The only other thing: LLVM seems to be late in the rc cycle, so it doesn't matter
to me, but do we want to wait for LLVM 10 final to be released, as this is new work?
Frankly, I don't think it matters, as this is a long way from being in mainline
anytime soon.

Thanks, and sorry if I'm mistaken,
Nick

Chris Lattner via llvm-dev

Mar 20, 2020, 2:11:53 PM
to Johannes Doerfert, llvm...@lists.llvm.org


On Mar 19, 2020, at 2:31 PM, Johannes Doerfert <johannes...@gmail.com> wrote:

I think addressing this issue first makes sense. I would however start
by determining the actual impact of different design choices here. I
mean, do we know locks will be heavily contented? If I had to guess I'd
say most passes will not create or modify functions nor add or remove
calls.

The problem isn’t constants or functions themselves, it is that they are instances of llvm::Value.  Everything that walks a use/def list would have to run code that checks for this, and every call to inst->setOperand() would have to do locking or conditional locking.  This would be a significant across-the-board slowdown for the compiler even when globals and constants are not involved.

-Chris

Nicholas Krause via llvm-dev

Mar 20, 2020, 3:35:15 PM
to Chris Lattner, Johannes Doerfert, llvm...@lists.llvm.org
Hi Chris,

Thanks for the comments. Sure, that may be true, but I would prefer to get real data to see how shared data actually looks. We can
argue over it at a theoretical level; however, with any real multi-threading of a code base, real data is required. I'm not sure about
setOperand() or other llvm::Value issues, but real data is more my concern. Sure, it may turn out that is the case, but I would prefer
to have data, as guessing with multi-threaded code in terms of locking/scaling is always bad, as is assuming issues.

Thanks, and if this is a real concern we should get data for it, and for other llvm::Value issues,
Nick

Chris Lattner via llvm-dev

Mar 21, 2020, 12:29:36 PM
to Nicholas Krause, llvm...@lists.llvm.org


On Mar 20, 2020, at 12:34 PM, Nicholas Krause <xero...@gmail.com> wrote:


The problem isn’t constants or functions themselves, it is that they are instances of llvm::Value.  Everything that walks a use/def list would have to run code that checks for this, and every call to inst->setOperand() would have to do locking or conditional locking.  This would be a significant across-the-board slowdown for the compiler even when globals and constants are not involved.

-Chris

Hi Chris,

Thanks for the comments. Sure, that may be true, but I would prefer to get real data to see how shared data actually looks. We can
argue over it at a theoretical level; however, with any real multi-threading of a code base, real data is required. I'm not sure about
setOperand() or other llvm::Value issues, but real data is more my concern. Sure, it may turn out that is the case, but I would prefer
to have data, as guessing with multi-threaded code in terms of locking/scaling is always bad, as is assuming issues.

I also love and endorse the collection of real data here!

-Chris

Johannes Doerfert via llvm-dev

Mar 25, 2020, 2:52:59 AM
to Nicholas Krause, Nicolai Hähnle, llvm...@lists.llvm.org

Yes.

> Then yes, that makes sense in terms of getting the data. The only three
> questions I would ask are:
> 1. There are methods in pass classes for getting their names. However,
> the problem would be adding a switch statement to check what we're
> passing into the constructor. I'm not sure what the top-level getName
> etc. class is; if there is one, I'm assuming Value or ModulePass. This
> makes it easier, as you don't have to wonder about data types but can
> walk down the virtual hierarchy. Names, at least to my knowledge, are a
> top-level feature and we should just use them there. So just do:
> XXX->getName(), where XXX is the top-level class(es).

You lost me. I was thinking of adding a print statement in all the
interesting llvm::Constant classes. If you run opt (or clang) with
-debug-pass=Details you should see what is printed, interleaved with
messages about which passes are run. So you can attribute passes to
actions on constants. This exact approach might not work, but there
should be a way to do this.


> 2. Lock contention is rather vague. For example we could have
> multiple readers and few writers.
> So we would also need to figure that out in order to choose the right
> locks. Thoughts?

A lot, but in the end it depends on what we are dealing with. Let's
postpone this discussion.


> 3. Does LLVM have something like the GCC compile farm as at most point
> were going to need to test on larger machines for scaling and race
> conditions.

We have buildbots but most people have access to larger machines.
Testing and measuring clang on a large node or on multiple nodes will be
the least of our problems ;)


> The only other thing is LLVM seems to be late rc so it doesn't matter
> to me but do we want to wait for LLVM 10 final to be released as this
> is new work. I don't think it matters frankly as this is way off being
> in mainline anything soon.

Agreed. We need to gather data. Determine options forward, present those
options to the community and start a discussion, prototype options to
see how they pan out, hopefully finish the discussion, implement the
desired solution, go through review and testing, merge and enable it, in
roughly this order.

Doerfert, Johannes via llvm-dev

Mar 25, 2020, 3:52:47 AM
to Chris Lattner, Nicholas Krause, llvm...@lists.llvm.org
Right, actual data would be great.


I think the solution space for the value/use-list issue might be larger
than what was mentioned so far.


Some random thoughts:

If no pass ever walks the use list of a constant, except globals which
we could handle differently, we could get rid of their use-list or
overwrite their use-list interface functions to make them no-ops. We
could also do this kind of specialization in the Use class (I think).

When creating a constant we could provide the function in which the
constant is used. We have a ConstantExpr wrapper per function to make
the constants local (similar to what I understand the MLIR design is).
We would not get pointer comparison between constants used in different
functions but I suspect the places we need this we can specialize
accordingly.

We wouldn't need to lock every setOperand call but only the ones that
set a constant operand.



> -Chris

Nicolai Hähnle via llvm-dev

Mar 25, 2020, 11:36:23 AM
to Doerfert, Johannes, llvm...@lists.llvm.org
On Wed, Mar 25, 2020 at 8:52 AM Doerfert, Johannes <jdoe...@anl.gov> wrote:
> I think the solution space for the value/use-list issue might be larger
> than what was mentioned so far.
>
>
> Some random thoughts:
>
> If no pass ever walks the use list of a constant, except globals which
> we could handle differently, we could get rid of their use-list or
> overwrite their use-list interface functions to make them no-ops. We
> could also do this kind of specialization in the Use class (I think).

Okay, let's actually think through how practical that is. The class
hierarchy is:

Value
- Argument
- BasicBlock
- InlineAsm (huh, why is that not a constant?)
- MetadataAsValue (+ children)
- User
-- Instruction (+ children)
-- Constant
--- ConstantData (undef, token none, literals)
--- ConstantAggregate (non-literal aggregates)
--- BlockAddress
--- ConstantExpr
--- GlobalValue (+ children)
-- Operator (utility / facade, i.e. not real)
-- DerivedUser (extension point used by MemorySSA)

It seems to me that the only points of this hierarchy that are
guaranteed to be function-local are Argument and Instruction.
Everything else could end up having uses from multiple functions
and/or initializers of globals. DerivedUser seems particularly special
-- but it's only used by MemorySSA, and that appears to be
function-local?

Of the values that can have global references to them, I believe that
most could just be immortal and without use lists. I'd say this
applies to:
- InlineAsm
- MetadataAsValue
- Constant other than GlobalValue

This leaves as values with global references that _cannot_ be immortal:
- GlobalValue (+ children)
- BasicBlock

And values that will have use lists, and only local references are:
- Argument
- Instruction

So the following does seem like a feasible minimal step forward:

1. Build a mechanism to break the use dependencies for GlobalValue and
BasicBlock, i.e. allow immortal BlockAddress and ConstantGlobalValue
values while _also_ allowing us to delete GlobalValues and
BasicBlocks.

For this mechanism, we can let ourselves be inspired by
mlir::SymbolRefAttr, which uses strings for the linkage.

Alternatively (and perhaps preferably), we could use a weak_ptr-like
mechanism. This can be very efficient, since each GlobalValue and
BasicBlock only ever needs to have a single instance of
ConstantGlobalValue and BlockAddress referring to it, respectively, so
a simple back-link is sufficient.
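A sketch of that weak_ptr-like back-link (class names invented for illustration): the global lazily creates its single handle, and its destructor nulls the handle's pointer instead of maintaining a use list:

```cpp
// Forward declaration: the handle refers back to its Global.
struct Global;

// At most one handle exists per Global; 'target' is nulled when the
// Global dies, so stale references are observable instead of dangling.
struct GlobalHandle {
  Global *target;
};

struct Global {
  GlobalHandle *handle = nullptr; // back-link to the unique handle

  GlobalHandle *getOrCreateHandle() {
    if (!handle)
      handle = new GlobalHandle{this};
    return handle; // the single shared instance
  }

  ~Global() {
    if (handle)
      handle->target = nullptr; // the handle outlives the Global
  }
};
```

Because only one handle instance ever exists per global, the back-link makes deletion O(1) and requires no use-list walk; referrers check `target` for null before dereferencing.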

2. Change setOperand to only update use lists of Argument and
Instruction. All other use lists will always be empty -- no special
handling in the use list accessors is required, but we should place
assertions there to assert that use lists are not accidentally used on
other value types.

How does that sound for that start?


> When creating a constant we could provide the function in which the
> constant is used. We have a ConstantExpr wrapper per function to make
> the constants local (similar to what I understand the MLIR design is).
> We would not get pointer comparison between constants used in different
> functions but I suspect the places we need this we can specialize
> accordingly.
>
> We wouldn't need to lock every setOperand call but only the ones that
> set a constant operand.

MLIR distinguishes between "attributes" and "operations". So you'd
define a constant by placing an operation (aka instruction) in each
function that has the actual constant as an attribute.
mlir::Attributes don't have use lists at all, so they're like
llvm::Metadata in that sense.

There is a certain elegance to having explicit instructions to import
constants into the llvm::Function context, but also ugliness, as IR
will then be full of "constant" instructions, unlike today. Plus,
changing all existing LLVM IR to have those is an enormous amount of
churn. I'm not totally opposed to doing this, but I'd lean towards
smaller, incremental steps.

Regardless, there are some related cleanups that seem like they would
be useful. For example, getting rid of llvm::ConstantExpr seems like a
very good idea to me, though somewhat orthogonal to the problem of
multi-threading. While we're at it, we could get rid of
llvm::ConstantAggregate and move llvm::Constant to no longer be an
llvm::User. This may impact how global value initializers work.

Cheers,
Nicolai
--
Lerne, wie die Welt wirklich ist,
aber vergiss niemals, wie sie sein sollte.

Doerfert, Johannes via llvm-dev

Mar 25, 2020, 12:12:54 PM
to Nicolai Hähnle, llvm...@lists.llvm.org
Much better than my random thoughts, and reasonable to me. Let's verify the
assumptions about use-list usage first, though. (We probably also need to
allow iterating the empty use lists of constants, in case your assertion
comment was targeted to prevent this.)
You should ask for feedback on this in an explicit thread.

Cheers,
  Johannes

Florian Hahn via llvm-dev

Mar 25, 2020, 12:26:23 PM
to Nicolai Hähnle, Doerfert, Johannes, llvm...@lists.llvm.org
I think one important observation is that most passes currently probably do not really care too much about use lists of various constants, beyond the bookkeeping required to maintain them.

We might be able to break the dependencies on GlobalValue without too much work directly as an IR transformation. I’ve not thought this completely through, but we might get away with a module pass that duplicate globals per function and replace all uses in a function with the copy per function before running a set of function passes. Then merge the duplicates again before running passes that inspect global uses. It is probably not an optimal solution, but it might shorten the path towards getting into a state where we can run function passes in parallel. It would also require making ConstantInt & co function-local, but both things could be implemented in parallel.
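A rough sketch of the split/merge bookkeeping described above, on toy data structures (a "function" here is just the list of global names it references; this is not an actual LLVM pass):

```cpp
#include <map>
#include <string>
#include <vector>

// Toy module: maps a function name to the globals that function uses.
using Function = std::vector<std::string>;
using Module = std::map<std::string, Function>;

// Split: give each function a private copy "g.<fn>" of every global it
// uses, so per-function passes touch no shared use lists.
Module splitGlobals(const Module &module) {
  Module out;
  for (const auto &fn : module) {
    Function copy;
    for (const auto &g : fn.second)
      copy.push_back(g + "." + fn.first); // per-function duplicate
    out[fn.first] = copy;
  }
  return out;
}

// Merge: rewrite the private names back to the shared global.
Module mergeGlobals(const Module &module) {
  Module out;
  for (const auto &fn : module) {
    Function merged;
    for (const auto &g : fn.second)
      merged.push_back(g.substr(0, g.rfind('.'))); // drop ".<fn>"
    out[fn.first] = merged;
  }
  return out;
}
```

Between the split and the merge, function passes could run in parallel without ever sharing a global's use list; the merge restores the invariants that passes inspecting global uses rely on.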

I think this thread explores the problem in quite a few directions very well. It might be worth summarizing the problems & possible solutions somewhere, e.g. the docs, to preserve them.

Cheers,
Florian

Nicholas Krause via llvm-dev

Mar 25, 2020, 3:15:47 PM
to Florian Hahn, Nicolai Hähnle, Doerfert, Johannes, llvm...@lists.llvm.org
I don't have access to the wiki for writing, but I'm taking notes on what should be added as I read the thread. :) Probably the holy grail
for multi-threading a compiler is to make the pass manager async. I tried for months on the GCC side to figure that out by reading
the code etc.; it's almost impossible, it seems, without IR-level state detection.

Frankly, I gave up on that, but if people are curious, it's an issue of IPO between passes and pass-reordering issues,

Nick

Chris Lattner via llvm-dev

Mar 25, 2020, 4:23:22 PM
to Doerfert, Johannes, llvm...@lists.llvm.org


On Mar 25, 2020, at 12:52 AM, Doerfert, Johannes <jdoe...@anl.gov> wrote:


Some random thoughts:

If no pass ever walks the use list of a constant, except globals which
we could handle differently, we could get rid of their use-list or
overwrite their use-list interface functions to make them no-ops. 

This would be really problematic because it breaks orthogonality in the compiler.  Today, you can walk the use-list of any operand to an instruction.  This would be broken by this change, which would make it much easier to write buggy/incorrect compiler code and passes.

-Chris

Chris Lattner via llvm-dev

Mar 25, 2020, 4:27:49 PM
to Nicolai Hähnle, llvm...@lists.llvm.org, Doerfert, Johannes

On Mar 25, 2020, at 8:35 AM, Nicolai Hähnle <nhae...@gmail.com> wrote:


MLIR distinguishes between "attributes" and "operations". So you'd
define a constant by placing an operation (aka instruction) in each
function that has the actual constant as an attribute.
mlir::Attributes don't have use lists at all, so they're like
llvm::Metadata in that sense.

FWIW, the Swift SIL IR does the same thing - it uses “integer_literal” instructions to materialize constants.  In addition to the multithreading benefits explored in this thread, this also allows carrying location information more correctly.  This is important for source level analysis tools, debugging etc.

There is a certain elegance to having explicit instructions to import
constants into the llvm::Function context, but also ugliness, as IR
will then be full of "constant" instructions, unlike today. Plus,
changing all existing LLVM IR to have those is an enormous amount of
churn.

Yes, it increases the verbosity of the IR, that is very true. In practice, though, it is not a problem.

I'm not totally opposed to doing this, but I'd lean towards
smaller, incremental steps.

Regardless, there are some related cleanups that seem like they would
be useful. For example, getting rid of llvm::ConstantExpr seems like a
very good idea to me, though somewhat orthogonal to the problem of
multi-threading. While we're at it, we could get rid of
llvm::ConstantAggregate and move llvm::Constant to no longer be an
llvm::User. This may impact how global value initializers work.

I would love to kill off ConstantExpr.  It is a huge design wart: for example the fact that we have constantexpr “divide” instructions means that constants “canTrap()” which leads to all sorts of problems.

A much better design (ignoring the multithreading concerns) would be to make scalar constants be llvm::Constant’s, keep aggregate constants but only allow them in global variables, and introduce a new ConstantExpr like class which represents a linker-relocatable expressions like “symbol+constant” and “symbol1-symbol2+constant” on Darwin.

-Chris

Michael Kruse via llvm-dev

Mar 25, 2020, 6:13:01 PM
to Chris Lattner, llvm...@lists.llvm.org
On Sat., Feb. 29, 2020 at 16:19, Chris Lattner via llvm-dev
<llvm...@lists.llvm.org> wrote:

> The problem is that LLVM has global use-def chains on constants, functions and globals, etc, so it is impractical to do this.

Since analyses do not modify the IR, these look like low-hanging
fruit. At least the legacy pass manager knows in advance which
analyses a pass needs, and could run all of them in parallel before
the passes themselves.
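A sketch of that shape on toy stand-ins (the "analyses" here are just read-only functions over shared data): because analyses only read the IR, independent ones can be launched concurrently before the transformation that needs them:

```cpp
#include <future>
#include <numeric>
#include <utility>
#include <vector>

// Stand-ins for read-only analyses over shared, immutable "IR".
int countValues(const std::vector<int> &ir) {
  return static_cast<int>(ir.size());
}
int sumValues(const std::vector<int> &ir) {
  return std::accumulate(ir.begin(), ir.end(), 0);
}

// Run both analyses concurrently; this is safe only because neither
// analysis mutates `ir`, which is the property the paragraph relies on.
std::pair<int, int> runAnalyses(const std::vector<int> &ir) {
  auto a = std::async(std::launch::async, countValues, std::cref(ir));
  auto b = std::async(std::launch::async, sumValues, std::cref(ir));
  return {a.get(), b.get()};
}
```

The scheduling-by-dependence idea is the same one the legacy pass manager could exploit: gather the analysis set a pass declares up front, fan the analyses out, join, then run the transformation.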

Michael

Doerfert, Johannes via llvm-dev

Mar 25, 2020, 7:15:10 PM
to Chris Lattner, llvm...@lists.llvm.org
On 3/25/20 3:23 PM, Chris Lattner wrote:
>
>
>> On Mar 25, 2020, at 12:52 AM, Doerfert, Johannes <jdoe...@anl.gov> wrote:
>>
>>
>> Some random thoughts:
>>
>> If no pass ever walks the use list of a constant, except globals which
>> we could handle differently, we could get rid of their use-list or
>> overwrite their use-list interface functions to make them no-ops.
>
> This would be really problematic because it breaks orthogonality in
> the compiler.

I doubt it is as simple as that, and we should differentiate more
before we make blanket statements.


> Today, you can walk the use-list of any operand to an instruction.
> This would be broken by this change,

First, you would still be able to walk the use-list of any operand, so nothing
breaks just yet.


> which would make it much easier to write buggy/incorrect compiler code
> and passes.

I argued (implicitly) above that the uses of a Constant[Int/Float/...]
are really not interesting if no one ever walks them. Let's assume you
walk the uses and we "removed" the use list so there are none; what does
that mean? I'd say, nothing much. If you inspect the Value and see it's
a Constant you have to assume it has Aliases that denote the same value,
so direct uses are not really interesting anyway. If you inspect further
and see ConstantInt/Float/... you can deal with the missing use list. If
you don't inspect the Value you cannot really make much of an empty use
list, or can you? I doubt we ever call RAUW on a ConstantInt/Float/...

Feel free to elaborate on your concerns.

Thanks,
  Johannes

Chris Lattner via llvm-dev

Mar 25, 2020, 7:47:27 PM
to Doerfert, Johannes, llvm...@lists.llvm.org
On Mar 25, 2020, at 4:14 PM, Doerfert, Johannes <jdoe...@anl.gov> wrote:
>
>> Today, you can walk the use-list of any operand to an instruction.
>> This would be broken by this change,
>
> First, you would still be able to walk the use-list of any operand, so nothing
> breaks just yet.

If I understand correctly, you are suggesting that you can walk it, but you don’t get all the uses. This is extremely dangerous, it would be better to abort.

>> which would make it much easier to write buggy/incorrect compiler code
>> and passes.
>
> I argued (implicitly) above that the uses of a Constant[Int/Float/...]
> are really not interesting if no-one ever walks them.

There is a difference between “no one ever walks them in practice” and “no one can walk them”. :-)

> Let's assume you
> walk the uses and we "removed" the use list so there are none, what does
> that mean. I'd say, nothing much. If you inspect the Value and see it's
> a Constant you have to assume it has Aliases that denote the same value
> so direct uses are not really interesting anyway. If you inspect further
> and see ConstantInt/Float/... you can deal with the missing use list. If
> you don't inspect the Value you cannot really make much of an empty use
> list, or can you? I doubt we ever call RAUW on a ConstantInt/Float/...
>
> Feel free to elaborate on your concerns.


Have you tested your thesis that no one walks the use-def chains? You could just add an assertion to the compiler (the methods like use_begin() hasOneUse() etc), that aborts on constants, then run the test suite.
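A toy sketch of the suggested experiment (invented classes; the real assertions would go into llvm::Value's use_begin(), hasOneUse(), etc.):

```cpp
#include <cassert>

// Make the use-list entry points assert on constants, so any pass
// that walks a constant's uses aborts loudly during a test-suite run.
struct ToyValue {
  bool isConstant;
  int numUses;

  int use_size() const {
    assert(!isConstant && "use-list walked on a constant!");
    return numUses;
  }
};
```

Running the test suite with such assertions enabled would turn the "does anyone walk these?" question from a guess into a measurement.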

Doerfert, Johannes via llvm-dev

Mar 25, 2020, 8:14:45 PM
to Chris Lattner, llvm...@lists.llvm.org
On 3/25/20 6:47 PM, Chris Lattner wrote:
> On Mar 25, 2020, at 4:14 PM, Doerfert, Johannes <jdoe...@anl.gov> wrote:
>>
>>> Today, you can walk the use-list of any operand to an instruction.
>>> This would be broken by this change,
>>
>> First, you would be able to walk the use-list of any operand. So nothing
>> breaks just yet.
>
> If I understand correctly, you are suggesting that you can walk it,
> but you don’t get all the uses.  This is extremely dangerous, it would
> be better to abort.

Correct, that is what I suggested.


>>> which would make it much easier to write buggy/incorrect compiler code
>>> and passes.
>>
>> I argued (implicitly) above that the uses of a Constant[Int/Float/...]
>> are really not interesting if no-one ever walks them.
>
> There is a difference between “no one ever walks them in practice” and
> “no one can walk them”. :-)

Sure. That is why I argued below that walking the incomplete list is not a
problem. At least, I'm not aware of a problem yet.


>> Let's assume you
>> walk the uses and we "removed" the use list so there are none, what does
>> that mean. I'd say, nothing much. If you inspect the Value and see it's
>> a Constant you have to assume it has Aliases that denote the same value
>> so direct uses are not really interesting anyway. If you inspect further
>> and see ConstantInt/Float/... you can deal with the missing use list. If
>> you don't inspect the Value you cannot really make much of an empty use
>> list, or can you? I doubt we ever call RAUW on a ConstantInt/Float/...
>>
>> Feel free to elaborate on your concerns.
>
>
> Have you tested your thesis that no one walks the use-def chains?  You
> could just add an assertion to the compiler (the methods like
> use_begin() hasOneUse() etc), that aborts on constants, then run the
> test suite.

That is what I suggested a few emails back. However, I never said it is
safe because no one walks it; what I suggested is that it will
actually not have much impact if we don't remember the uses of Constants
(except globals).
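The experiment described above can be sketched with a toy model. Everything here (the `Value` layout, the `IsConstantData` flag) is illustrative, not the real LLVM hierarchy: the idea is simply that the use-list accessors assert the value is not a constant, so the test suite flags every pass that still walks constant use-lists.

```cpp
#include <cassert>
#include <vector>

// Toy model of the proposed experiment -- not the real LLVM classes.
struct Value {
  bool IsConstantData;        // stand-in for isa<ConstantData>(this)
  std::vector<Value *> Uses;  // stand-in for the intrusive use-list

  explicit Value(bool IsConst = false) : IsConstantData(IsConst) {}

  // The accessors that would grow the assertion (use_begin, hasOneUse, ...).
  const std::vector<Value *> &uses() const {
    assert(!IsConstantData && "walked the use-list of a constant");
    return Uses;
  }
  bool hasOneUse() const {
    assert(!IsConstantData && "queried the use-list of a constant");
    return Uses.size() == 1;
  }
};
```

Running the regression suite against such assertions would turn "no one walks constant use-lists in practice" into measured data.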

Chris Lattner via llvm-dev

unread,
Mar 26, 2020, 12:33:30 AM3/26/20
to Doerfert, Johannes, llvm...@lists.llvm.org


On Mar 25, 2020, at 5:14 PM, Doerfert, Johannes <jdoe...@anl.gov> wrote:

>>> Let's assume you
>>> walk the uses and we "removed" the use list so there are none, what does
>>> that mean. I'd say, nothing much. If you inspect the Value and see it's
>>> a Constant you have to assume it has Aliases that denote the same value
>>> so direct uses are not really interesting anyway. If you inspect further
>>> and see ConstantInt/Float/... you can deal with the missing use list. If
>>> you don't inspect the Value you cannot really make much of an empty use
>>> list, or can you? I doubt we ever call RAUW on a ConstantInt/Float/...
>>>
>>> Feel free to elaborate on your concerns.


>> Have you tested your thesis that no one walks the use-def chains?  You
>> could just add an assertion to the compiler (the methods like
>> use_begin() hasOneUse() etc), that aborts on constants, then run the
>> test suite.

> That is what I suggested a few emails back. However, I never said it is
> safe because no one walks it but what I suggested is that it will
> actually not impact much if we don't remember the uses of Constants
> (except globals).

Sure, I was just curious what you base that opinion on.  Have you done an experiment?

-Chris

Nicolai Hähnle via llvm-dev

unread,
Mar 26, 2020, 6:19:36 AM3/26/20
to Chris Lattner, llvm...@lists.llvm.org, Doerfert, Johannes
On Wed, Mar 25, 2020 at 5:26 PM Florian Hahn <floria...@apple.com> wrote:
> I think one important observation is that most passes currently probably do not really care too much about use lists of various constants, beyond the bookkeeping required to maintain them.

Agreed, though as Chris points out it's something that can and should
be tested experimentally fairly easily.


> We might be able to break the dependencies on GlobalValue without too much work directly as an IR transformation. I’ve not thought this completely through, but we might get away with a module pass that duplicates globals per function and replaces all uses in a function with the per-function copy before running a set of function passes. Then merge the duplicates again before running passes that inspect global uses.

That feels like a really awkward hack :(

Now you have to keep track of whether the IR is currently in the state
where function pass parallelization is okay or not, which seems rather
fragile.

It also doesn't solve the problem of Functions themselves -- those are
also GlobalValues...


On Thu, Mar 26, 2020 at 12:47 AM Chris Lattner <clat...@nondot.org> wrote:
> > There is a certain elegance to having explicit instructions to import
> > constants into the llvm::Function context, but also ugliness, as IR
> > will then be full of "constant" instructions, unlike today. Plus,
> > changing all existing LLVM IR to have those is an enormous amount of
> > churn.
>
> Yes, it increases the verbosity of the IR, this is very true. In practice, it is not a problem though.

Okay, it seems like it could be a viable long-term goal to do this.
What I'm trying to figure out is how to incrementally evolve LLVM IR,
and it feels like this change to enable "constant materialization
instructions" is extremely invasive in terms of churn in the code base
on a level comparable to opaque pointers -- and it's not strictly
necessary for the goal of multi-threading. I'm just trying to be
pragmatic here.


[snip]


> On Mar 25, 2020, at 4:14 PM, Doerfert, Johannes <jdoe...@anl.gov> wrote:
> >
> >> Today, you can walk the use-list of any operand to an instruction.
> >> This would be broken by this change,
> >
> > First, you would be able to walk the use-list of any operand. So nothing
> > breaks just yet.
>
> If I understand correctly, you are suggesting that you can walk it, but you don’t get all the uses. This is extremely dangerous, it would be better to abort.

Agreed.


> >> which would make it much easier to write buggy/incorrect compiler code
> >> and passes.
> >
> > I argued (implicitly) above that the uses of a Constant[Int/Float/...]
> > are really not interesting if no-one ever walks them.
>
> There is a difference between “no one ever walks them in practice” and “no one can walk them”. :-)

Yes, but you've got to ask yourself: what's the point of walking the
use list of a scalar constant? It feels like code that would do that
is quite likely broken to begin with and should probably abort already
today... which is exactly the experiment that you suggest below, and I
agree that doing this experiment would be a useful step in all this :)

Cheers,
Nicolai

--
Lerne, wie die Welt wirklich ist,
aber vergiss niemals, wie sie sein sollte.

Florian Hahn via llvm-dev

unread,
Mar 26, 2020, 6:53:45 AM3/26/20
to Nicolai Hähnle, llvm...@lists.llvm.org, Doerfert, Johannes

> On Mar 26, 2020, at 10:19, Nicolai Hähnle via llvm-dev <llvm...@lists.llvm.org> wrote:
>
> On Wed, Mar 25, 2020 at 5:26 PM Florian Hahn <floria...@apple.com> wrote:
>> I think one important observation is that most passes currently probably do not really care too much about use lists of various constants, beyond the bookkeeping required to maintain them.
>
> Agreed, though as Chris points out it's something that can and should
> be tested experimentally fairly easily.
>
>
>> We might be able to break the dependencies on GlobalValue without too much work directly as an IR transformation. I’ve not thought this completely through, but we might get away with a module pass that duplicates globals per function and replaces all uses in a function with the per-function copy before running a set of function passes. Then merge the duplicates again before running passes that inspect global uses.
>
> That feels like a really awkward hack :(
>

> Now you have to keep track of whether the IR is currently in the state
> where function pass parallelization is okay or not, which seems rather
> fragile.
>

Agreed, it is not a production solution. I think it was not really clear from my previous mail, but I see it mostly as a relatively cheap approach (in terms of work required to get it working) to get to a state where we can run experiments with function passes in parallel, get a better idea of the benefits, and turn up other issues.


> It also doesn't solve the problem of Functions themselves -- those are

> also GlobalValues…

I am not sure why not. Function passes should only rely on the information at the callsite & from the declaration IIRC. For functions, we could add extra declarations and update the call sites. But I might be missing something.

Cheers,
Florian

Nicolai Hähnle via llvm-dev

unread,
Mar 26, 2020, 6:55:01 AM3/26/20
to Chris Lattner, llvm...@lists.llvm.org, Doerfert, Johannes
On Thu, Mar 26, 2020 at 11:19 AM Nicolai Hähnle <nhae...@gmail.com> wrote:
> On Thu, Mar 26, 2020 at 12:47 AM Chris Lattner <clat...@nondot.org> wrote:
> > > There is a certain elegance to having explicit instructions to import
> > > constants into the llvm::Function context, but also ugliness, as IR
> > > will then be full of "constant" instructions, unlike today. Plus,
> > > changing all existing LLVM IR to have those is an enormous amount of
> > > churn.
> >
> > Yes, it increases the verbosity of the IR, this is very true. In practice, it is not a problem though.
>
> Okay, it seems like it could be a viable long-term goal to do this.
> What I'm trying to figure out is how to incrementally evolve LLVM IR,
> and it feels like this change to enable "constant materialization
> instructions" is extremely invasive in terms of churn in the code base
> on a level comparable to opaque pointers -- and it's not strictly
> necessary for the goal of multi-threading. I'm just trying to be
> pragmatic here.

I feel like I should clarify this after thinking a bit more about this.

It sounds like "constant materialization instructions" are a useful
feature to have for better location information (for diagnostics and
debug info). And _introducing_ constant materialization instructions
should be quite feasible to do incrementally. For example, if Clang or
some other frontend sees a use case for adding those in order to do
more work in IR rather than on the AST, then that should be
supportable.

The thing that I think is "extremely invasive in terms of churn in the
code base" is making constant materialization instructions the _only_
representation of constants inside a function. Which means that this
path is not very viable if your goal is to enable multi-threaded
compilation.

However, if your goal is to move towards constant materialization
instructions for the other benefits that they can bring, then there is
an incremental path available that can quickly bring you gains -
something along the lines of:

1. The initial MVP: Introduce constant materialization instructions as
a concept. Add a switch to the Builder so that constants are
optionally emitted as materialization instructions, start using that
switch in at least one frontend, add/convert some initial passes to
make use of them, and add a very simple pass that strips them out
again for the remainder of the compilation pipeline.

2. Teach an increasing number of passes, analyses, matching
infrastructure, etc. about the constant materialization instructions.

3. Eventually, get to the point where the pass that strips them out
can be removed. Somewhere around that time, we can make the Builder
unconditionally emit them everywhere.

4. Flip the switch that starts aborting when old-style Constants are
used instead.

5. Cleanup any remains of old constant support e.g. in the matching
infrastructure.

This seems viable, but very much a marathon and not a sprint, and I
feel that this is straying a bit from the topic of multi-threading.
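Step 1 above can be sketched as follows. All names here (`EmitConstantInsts`, the `Inst` and `Builder` shapes) are hypothetical, not existing LLVM API: the builder either hands back a module-level uniqued constant (today's behaviour) or emits a per-function materialization instruction (the proposed one).

```cpp
#include <map>
#include <memory>
#include <vector>

// Hypothetical sketch of step 1; none of these names exist in LLVM today.
struct Inst { long Imm; };                         // a materialize-constant op

struct Function {
  std::vector<std::unique_ptr<Inst>> EntryBlock;   // function-local constants
};

struct Builder {
  bool EmitConstantInsts = false;                  // the proposed switch
  std::map<long, std::unique_ptr<Inst>> ModuleConstants; // today's uniquing
  Function *F = nullptr;

  Inst *getInt(long V) {
    if (!EmitConstantInsts) {
      // Current behaviour: one shared, module-owned constant per value,
      // so its use-list is touched by every function that uses it.
      auto &Slot = ModuleConstants[V];
      if (!Slot)
        Slot = std::make_unique<Inst>(Inst{V});
      return Slot.get();
    }
    // Proposed behaviour: a materialization instruction owned by the
    // current function, so all of its uses are function-local.
    F->EntryBlock.push_back(std::make_unique<Inst>(Inst{V}));
    return F->EntryBlock.back().get();
  }
};
```

The stripping pass of step 1 would then be the inverse: rewrite uses of each materialization instruction back to the uniqued module constant and delete it.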

Nicolai Hähnle via llvm-dev

unread,
Mar 26, 2020, 6:56:29 AM3/26/20
to Florian Hahn, llvm...@lists.llvm.org, Doerfert, Johannes
On Thu, Mar 26, 2020 at 11:53 AM Florian Hahn <floria...@apple.com> wrote:
> > It also doesn't solve the problem of Functions themselves -- those are
> > also GlobalValues…
>
> I am not sure why not. Function passes should only rely on the information at the callsite & from the declaration IIRC. For functions, we could add extra declarations and update the call sites. But I might be missing something.

Function passes can remove, duplicate, or just plain introduce call
sites (e.g. recognize a memset pattern), which means the use lists of
Functions can be changed during a function pass...

Cheers,
Nicolai
--
Lerne, wie die Welt wirklich ist,
aber vergiss niemals, wie sie sein sollte.

Florian Hahn via llvm-dev

unread,
Mar 26, 2020, 7:06:37 AM3/26/20
to Nicolai Hähnle, llvm...@lists.llvm.org, Doerfert, Johannes

> On Mar 26, 2020, at 10:55, Nicolai Hähnle <nhae...@gmail.com> wrote:
>
> On Thu, Mar 26, 2020 at 11:53 AM Florian Hahn <floria...@apple.com> wrote:
>>> It also doesn't solve the problem of Functions themselves -- those are
>>> also GlobalValues…
>>
>> I am not sure why not. Function passes should only rely on the information at the callsite & from the declaration IIRC. For functions, we could add extra declarations and update the call sites. But I might be missing something.
>
> Function passes can remove, duplicate, or just plain introduce call
> sites (e.g. recognize a memset pattern), which means the use lists of

> Functions can be changed during a function pass…


Sure, but a single function won’t be processed in parallel by a function pass and would just work on the 'local version' of the globals it uses, including called functions. So a function pass adding/removing calls to existing ‘local versions’ of functions should not be a problem I think.

One problem is that function passes can add new globals, e.g. new declarations if they introduce new calls. I guess that would require some locking, but it should happen quite infrequently.

Cheers,
Florian

David Chisnall via llvm-dev

unread,
Mar 26, 2020, 7:09:35 AM3/26/20
to llvm...@lists.llvm.org
On 26/03/2020 11:06, Florian Hahn via llvm-dev wrote:
> One problem is that function passes can add new globals, e.g. new declarations if they introduce new calls. I guess that would require some locking, but should happen quite infrequently.

Can they? I have had to make a pass a ModulePass in the past, because
it added a global to hold a cache. The global was used in only the
function being modified, but the fact that it was a global prevented the
pass from being a FunctionPass.

David

Florian Hahn via llvm-dev

unread,
Mar 26, 2020, 7:26:15 AM3/26/20
to David Chisnall, llvm-dev

> On Mar 26, 2020, at 11:09, David Chisnall via llvm-dev <llvm...@lists.llvm.org> wrote:
>
> On 26/03/2020 11:06, Florian Hahn via llvm-dev wrote:
>> One problem is that function passes can add new globals, e.g. new declarations if they introduce new calls. I guess that would require some locking, but should happen quite infrequently.
>
> Can they? I have had to make a pass a ModulePass in the past, because it added a global to hold a cache. The global was used in only the function being modified, but the fact that it was a global prevented the pass from being a FunctionPass.

Maybe they should not, but I think in practice plenty of function passes insert new declarations.

For example, I think Intrinsic::getDeclaration inserts a new declaration if the requested intrinsic is not yet declared (https://github.com/llvm/llvm-project/blob/master/llvm/lib/IR/Function.cpp#L1117) and a lot of function passes use it.

Cheers,
Florian

Doerfert, Johannes via llvm-dev

unread,
Mar 26, 2020, 11:11:29 AM3/26/20
to Chris Lattner, llvm...@lists.llvm.org

I haven't, but I suggested a few emails back that we should gather such data
(and I was hoping Nicholas was going to do it).


Could you elaborate on what you said earlier? Why is it problematic to
remove use-list tracking from Constants, except globals? If you
have a use case in which an always empty use list for such constants
would be problematic, it would be beneficial for us to consider it sooner
rather than later.

Thanks,
  Johannes

Doerfert, Johannes via llvm-dev

unread,
Mar 26, 2020, 11:27:04 AM3/26/20
to Florian Hahn, Nicolai Hähnle, llvm...@lists.llvm.org
On 3/26/20 5:53 AM, Florian Hahn wrote:
>> It also doesn't solve the problem of Functions themselves -- those are
>> also GlobalValues…
>
>
> I am not sure why not. Function passes should only rely on the information at the callsite & from the declaration IIRC. For functions, we could add extra declarations and update the call sites. But I might be missing something.

FWIW, I think deleting and creating functions (or globals in general)
should be hidden behind an API that would allow us to synchronize things
in parallel runs.

I don't believe there are many passes that would be affected and we
could "easily" make this happen. Hide the constructors and destructors
so that only the friend API can access it.
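A minimal sketch of such an API, under the assumption that constructors are hidden and a friend class mediates all mutations (`GlobalUpdater` here is a hypothetical name, not an existing LLVM class):

```cpp
#include <mutex>
#include <string>
#include <vector>

// Illustrative sketch: globals can only be created through a synchronized
// friend API; the class layout is a stand-in, not the real llvm::Module.
class Module {
  friend class GlobalUpdater;        // the only code allowed to mutate
  std::vector<std::string> Globals;  // stand-in for the real global list
  std::mutex GlobalsLock;

public:
  size_t numGlobals() {
    std::lock_guard<std::mutex> Guard(GlobalsLock);
    return Globals.size();
  }
};

// Parallel function passes would share one of these; it serializes the
// (rare) global-creation events while the rest of the pass stays lock-free.
class GlobalUpdater {
  Module &M;

public:
  explicit GlobalUpdater(Module &Mod) : M(Mod) {}

  void createGlobal(const std::string &Name) {
    std::lock_guard<std::mutex> Guard(M.GlobalsLock);
    M.Globals.push_back(Name);
  }
};
```

Since creating a declaration (e.g. for an intrinsic) happens far less often than ordinary IR mutation, the lock should rarely be contended.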

Nicolai Hähnle via llvm-dev

unread,
Mar 26, 2020, 11:56:58 AM3/26/20
to Florian Hahn, llvm...@lists.llvm.org, Doerfert, Johannes
On Thu, Mar 26, 2020 at 12:06 PM Florian Hahn <floria...@apple.com> wrote:
> > On Mar 26, 2020, at 10:55, Nicolai Hähnle <nhae...@gmail.com> wrote:
> > On Thu, Mar 26, 2020 at 11:53 AM Florian Hahn <floria...@apple.com> wrote:
> >>> It also doesn't solve the problem of Functions themselves -- those are
> >>> also GlobalValues…
> >>
> >> I am not sure why not. Function passes should only rely on the information at the callsite & from the declaration IIRC. For functions, we could add extra declarations and update the call sites. But I might be missing something.
> >
> > Function passes can remove, duplicate, or just plain introduce call
> > sites (e.g. recognize a memset pattern), which means the use lists of
> > Functions can be changed during a function pass…
>
>
> Sure, but a single function won’t be processed in parallel by a function pass and would just work on the 'local version' of the globals it uses, including called functions. So a function pass adding/removing calls to existing ‘local versions’ of functions should not be a problem I think.

Oh, I see what you're saying now. Yes, agreed that that should work.

Though by the time you're done implementing the conversion from and to
that representation and made sure you've covered all the corner cases,
I'm not so sure you've really saved a lot of effort relative to just
doing it right in the first place :)


> One problem is that function passes can add new globals, e.g. new declarations if they introduce new calls. I guess that would require some locking, but should happen quite infrequently.

Agreed.

Cheers,
Nicolai

--
Lerne, wie die Welt wirklich ist,
aber vergiss niemals, wie sie sein sollte.

Alexandre Ganea via llvm-dev

unread,
Mar 26, 2020, 2:58:01 PM3/26/20
to Nicolai Hähnle, Florian Hahn, Doerfert, Johannes, llvm...@lists.llvm.org

Hello everyone,

 

Just to add a bit of spice to the discussion about “Multi-Threading Compilers”: (sorry for just bringing high-level ideas)

 

We are heavy users of unity files (aka blobs or jumbos).

 

Unity files are a big pain, they add extra complexity, but at the same time they provide tremendous build-time reductions, 10x or more. Our game projects typically read >50,000 files during the full build of a single target, out of which 20,000 .CPPs. The same unity target compiles only 600 unity .CPPs, which themselves aggregate all of the 20,000 initial .CPPs. Building locally the 20,000 TUs on a modern 3.7 GHz 6-core PC takes more than 2 hours 30 min. With unity files, it takes 20 minutes. Distributing it remotely on pool of machines takes 5 min. Caching everything and rebuilding takes 45 sec.

 

However, we’re now dependent on the order of files in the unities. If files or folders are added or removed in the codebase, the contents of the unity can change, thus the cache is invalidated for that unity CPP. And that happens quite often in production.

Unities also cause build-time spikes in some cases, as I showed in a previous post in this thread. Without inspecting the AST, it is hard to determine an optimal “cutting” point when building the unity .CPPs. We can end up with unities including template-heavy .CPPs which will take a lot longer than other unity files.

 

If we are to discuss multi-threading, this means we are discussing compile-time performance and how compilation will scale in the future. I think we should consider supporting the functionality of unity files in the compiler (maybe behind a flag if it’s non-conformant).

 

While I don't know exactly how that fits in this (multi-threading) discussion, efficiently coalescing compilation of several TUs should be the compiler's responsibility, and will likely be more efficient than doing it with a pre-build tool, like we do today.

 

In essence, if we were to provide a large number of files to Clang, let's say with the same options: (the /MP flag is still WIP, I'll get back to that soon, [1])

 

                clang-cl /c a.cpp b.cpp c.cpp d.cpp ... /MP

 

And then expect the compiler to (somehow) share tokenization, lexing, file caching, preprocessing, compilation, optimizations, computations, etc. across TUs, preferably in a lock-free manner. Overlapped/duplicated computations across threads, in the manner of transactions, would probably be fine, if computations are small and if we want to avoid locks (but this needs to be profiled). Also, the recent trend of NUMA processor “tiles”, as well as HBM2 memory on-chip per “tile”, could change the way multi-threaded code is written. Perhaps state would need to be duplicated in the local NUMA memory for maximum performance. Additionally, I’m not sure how (or if) lock-based programming will scale past a few hundred, or a few thousand, cores in a single image without major contention. Maybe, as long as locks don’t cross NUMA boundaries. This needs to be considered in the design.

 

So while the discussion seems to be around multi-threading single TUs, it’d be nice to also consider the possibility of sharing state between TUs, which maybe means retaining runtime state in global hash table(s), and possibly persisting that state on disk, or in a DB, after the compilation -- we could maybe draw a parallel with work done by SN Systems (Program Repo, see [2]), or zapcc [3].

 

Thanks!

Alex.

 

[1] https://reviews.llvm.org/D52193

[2] https://github.com/SNSystems/llvm-project-prepo

[3] https://github.com/yrnkrn/zapcc

 

Nicholas Krause via llvm-dev

unread,
Mar 26, 2020, 5:29:37 PM3/26/20
to Nicolai Hähnle, Florian Hahn, llvm...@lists.llvm.org, Doerfert, Johannes

On 3/26/20 11:56 AM, Nicolai Hähnle via llvm-dev wrote:
> On Thu, Mar 26, 2020 at 12:06 PM Florian Hahn <floria...@apple.com> wrote:
>>> On Mar 26, 2020, at 10:55, Nicolai Hähnle <nhae...@gmail.com> wrote:
>>> On Thu, Mar 26, 2020 at 11:53 AM Florian Hahn <floria...@apple.com> wrote:
>>>>> It also doesn't solve the problem of Functions themselves -- those are
>>>>> also GlobalValues…
>>>> I am not sure why not. Function passes should only rely on the information at the callsite & from the declaration IIRC. For functions, we could add extra declarations and update the call sites. But I might be missing something.
>>> Function passes can remove, duplicate, or just plain introduce call
>>> sites (e.g. recognize a memset pattern), which means the use lists of
>>> Functions can be changed during a function pass…
>>
>> Sure, but a single function won’t be processed in parallel by a function pass and would just work on the 'local version' of the globals it uses, including called functions. So a function pass adding/removing calls to existing ‘local versions’ of functions should not be a problem I think.
> Oh, I see what you're saying now. Yes, agreed that that should work.
>
> Though by the time you're done implementing the conversion from and to
> that representation and made sure you've covered all the corner cases,
> I'm not so sure you've really saved a lot of effort relative to just
> doing it right in the first place :)
>
>
>> One problem is that function passes can add new globals, e.g. new declarations if they introduce new calls. I guess that would require some locking, but should happen quite infrequently.
> Agreed.
>
> Cheers,
> Nicolai

CCing myself back into the discussion. As to Johannes' comments: I have
no problem gathering data, but the discussion seems to be around
implementation and what data we actually care about.
That should be discussed first, IMO.

Nick

Chris Lattner via llvm-dev

unread,
Mar 26, 2020, 6:59:13 PM3/26/20
to Doerfert, Johannes, llvm...@lists.llvm.org
On Mar 26, 2020, at 8:10 AM, Doerfert, Johannes <jdoe...@anl.gov> wrote:


>> Sure, I was just curious what you base that opinion on.  Have you
>> done an experiment?

> I haven't but, I suggested a few emails back we should gather such data
> (and I was hoping Nicholas was going to do it).


> Could you elaborate on what you said earlier. Why is problematic to
> remove the use-lists tracking from Constants, except globals? If you
> have a use case in which an always empty use list for such constants
> would be problematic it would be beneficial for us to consider it sooner
> than later.

Sure. I think this has changed, but at one point LLVM had a simple PRE algorithm based on lexical equivalence. The way this works is that you have to find all instances of a computation (e.g. “x+y”) in a function, then see if any are redundant, and move things around to eliminate them.

There is a simple way to implement this, in pseudo code, for binary operators:

findCandidatesLexicallyEquivalentTo(BinaryOperator *Inst) {
  for (auto *User : Inst->getOperand(0)->users())
    if (User->getOperand(1) == Inst->getOperand(1) &&
        User->getFunction() == Inst->getFunction())
      Found.push_back(User);
}

This fails for an operation like “sub(42, x)” when 42 doesn’t have a use-def chain.

This has clear advantages and disadvantages, including:

1) It is sparse, defined over use-def chains instead of requiring dense scans of the IR.  As such, it scales to larger functions better.
2) There can be a constant on the LHS and you end up scanning everything in the universe if you aren’t careful.

The right fix for this is to make constants instructions and unique them into the entry block of the function (which is what MLIR does :-).

That said, my point isn’t about this algorithm in particular; it is that use-def chains are a pervasive part of the IR and are used in lots of different ways.  Making them work differently will introduce bugs and inconsistencies in the compiler that only show up in weird cases, which is not great for the long term health of the project.

-Chris
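To make the hazard concrete, here is a toy model (illustrative names, not real LLVM classes) of the sparse candidate scan above: with ordinary values it finds the lexical duplicate, but once constants stop recording their users, duplicates of `sub(42, x)` are silently missed rather than diagnosed.

```cpp
#include <cassert>
#include <vector>

// Toy model of the sparse PRE candidate scan -- not real LLVM classes.
struct BinOp;

struct Val {
  bool IsConstant = false;
  std::vector<BinOp *> Users;  // stays empty for constants in the proposal
};

struct BinOp {
  Val *Ops[2];
  BinOp(Val *A, Val *B) : Ops{A, B} {
    // A constant without a use-list never records this user.
    if (!A->IsConstant) A->Users.push_back(this);
    if (!B->IsConstant) B->Users.push_back(this);
  }
};

// Walk the use-list of operand 0 to find lexically identical operations,
// mirroring the pseudocode in the message above.
static std::vector<BinOp *> findLexicalDuplicates(BinOp *Inst) {
  std::vector<BinOp *> Found;
  for (BinOp *User : Inst->Ops[0]->Users)
    if (User != Inst && User->Ops[1] == Inst->Ops[1])
      Found.push_back(User);
  return Found;
}
```

With two copies of `x + y` the duplicate is found; with two copies of `42 - x` the scan returns nothing. That is a silent loss of optimization rather than a crash, which is exactly the kind of inconsistency being debated.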

Chris Lattner via llvm-dev

unread,
Mar 26, 2020, 7:02:30 PM3/26/20
to Doerfert, Johannes, llvm...@lists.llvm.org

> On Mar 26, 2020, at 8:26 AM, Doerfert, Johannes <jdoe...@anl.gov> wrote:
>
> On 3/26/20 5:53 AM, Florian Hahn wrote:
>>> It also doesn't solve the problem of Functions themselves -- those are
>>> also GlobalValues…
>>
>>
>> I am not sure why not. Function passes should only rely on the information at the callsite & from the declaration IIRC. For functions, we could add extra declarations and update the call sites. But I might be missing something.
>
> FWIW, I think deleting and creating functions (or globals in general)
> should be hidden behind an API that would allow us to synchronize things
> in parallel runs.
>
> I don't believe there are many passes that would be affected and we
> could "easily" make this happen. Hide the constructors and destructors
> so that only the friend API can access it.

Right, this is what MLIR does. However, keep in mind that this is just one piece of the puzzle. You also have things like ValueHandles which get owned by IPA passes. The CallGraph used by the inliner persists across function pass manager runs, and needs to be correctly updatable in order to run CallGraphSCC passes.

Getting to a multithread clean optimizer is not one bug fix away - this is a multiyear journey, and I would prefer not to see big changes with unclear tradeoffs made in an effort to get a quick win.

Instead, let’s break down the problems and fix them (the right way!) one at a time. For example, it seems that the thread agrees that the overall design of constants are a problem: let's talk about (incrementally!) moving constants to the right architecture. There are good suggestions about this upthread.

-Chris

Johannes Doerfert via llvm-dev

unread,
Mar 27, 2020, 3:01:52 AM3/27/20
to Chris Lattner, Doerfert, Johannes, llvm...@lists.llvm.org

So syntactic matching, if performed by traversing the use list, will not
work as well as it could in the presence of constants. How is that a
correctness issue (or why does it matter otherwise)?

> This has clear advantages and disadvantages, including:
>
> 1) It is sparse, defined over use-def chains instead of requiring
> dense scans of the IR.  As such, it scales to larger functions better.
> 2) There can be a constant on the LHS and you end up scanning
> everything in the universe if you aren’t careful.

I'm unsure what you want to say here. I think/hope we don't do the
matching you mention anymore. Even if we do, you can easily make all of
it, points 1) and 2), work just fine in the absence of use lists for
constants: only follow users of non-constants, which is the only thing
you will be able to do anyway, as constants won't have users.
If all operands are constants, chances are the operation is constant as
well, or it is not well-suited for syntactic-only transformations anyway.


> The right fix for this to make constants have instructions and unique
> them into the entry block of the function (which is what MLIR does
> :-).

I know that MLIR does the right thing already. Nevertheless, I would
still like to talk about alternatives in a constructive way.


> That said, my point isn’t about this algorithm in particular, it is
> that use-def chains are a pervasive part of the IR and are used in
> lots of different ways.  Making them work different will introduce
> bugs and inconsistencies in the compiler that only show up in weird
> cases, which is not great for the long term health of the project.

This is where I am not so sure. So far, and including the example above,
I'm still not aware of a soundness problem or drawback that would come
from constants without use-lists (except globals). Anyway, we will gather
more data once we perform some experiments.

FTR, I'm not even sure this is a good idea but I think it is
     worth/interesting to discuss and investigate anyway.

Cheers,

Johannes Doerfert via llvm-dev

unread,
Mar 27, 2020, 3:04:30 AM3/27/20
to Florian Hahn, David Chisnall, llvm-dev

On 3/26/20 6:25 AM, Florian Hahn via llvm-dev wrote:
>
>
>> On Mar 26, 2020, at 11:09, David Chisnall via llvm-dev <llvm...@lists.llvm.org> wrote:
>>
>> On 26/03/2020 11:06, Florian Hahn via llvm-dev wrote:
>>> One problem is that function passes can add new global, e.g. new
>>> declarations if they introduce new calls. I guess that would require
>>> some locking, but should happen quite infrequently.
>>
>> Can they?  I have had to make a pass a ModulePass in the past,
>> because it added a global to hold a cache.  The global was used in only
>> the function being modified, but the fact that it was a global prevented
>> the pass from being a FunctionPass.
>
> Maybe they should not, but I think in practice plenty of function
> passes insert new declarations.
>
> For example, I think Intrinsic::getDeclaration inserts a new
> declaration if the requested intrinsic is not yet declared
> (https://github.com/llvm/llvm-project/blob/master/llvm/lib/IR/Function.cpp#L1117)
> and a lot of function passes use it.

I think you are unfortunately both correct. As I said before, creating,
modifying, and deleting globals will most likely need explicit
synchronization. It is not too hard to implement though and unlikely to
be a performance bottleneck (IMHO).

Cheers,
  Johannes

Johannes Doerfert via llvm-dev

unread,
Mar 27, 2020, 3:24:33 AM3/27/20
to Chris Lattner, Doerfert, Johannes, llvm...@lists.llvm.org

On 3/26/20 6:02 PM, Chris Lattner via llvm-dev wrote:
>
>
>> On Mar 26, 2020, at 8:26 AM, Doerfert, Johannes <jdoe...@anl.gov> wrote:
>>
>> On 3/26/20 5:53 AM, Florian Hahn wrote:
>>>> It also doesn't solve the problem of Functions themselves -- those are
>>>> also GlobalValues…
>>>
>>>
>>> I am not sure why not. Function passes should only rely on the
>>> information at the callsite & from the declaration IIRC. For functions,
>>> we could add extra declarations and update the call sites. But I might
>>> be missing something.
>>
>> FWIW, I think deleting and creating functions (or globals in general)
>> should be hidden behind an API that would allow us to synchronize things
>> in parallel runs.
>> in parallel runs.
>>
>> I don't believe there are many passes that would be affected and we
>> could "easily" make this happen. Hide the constructors and destructors
>> so that only the friend API can access it.
>
> Right, this is what MLIR does.
> However, keep in mind that this is just one piece of the puzzle.

I am aware.


> You also have things like ValueHandles which get owned by IPA passes.

Why are they more problematic?


> The CallGraph used by the inline persists across function pass manager
> runs, and needs to be correctly updatable in order to run CallGraphSCC
> passes.

Sure. We make everyone use a single `CallGraphUpdater` in which we can
synchronize appropriately. Again, the argument is the same as with
global creation/modification/deletion: it does not happen much in most
passes.


> Getting to a multithread clean optimizer is not one bug fix away -
> this is a multiyear journey, and I would prefer not to see big changes
> with unclear tradeoffs made in an effort to get a quick win.

I think I missed the suggestion of "big changes with unclear tradeoffs
made in an effort to get a quick win". So far I thought this was a
discussion of ideas, methods to gather data, and potential pitfalls.


> Instead, let’s break down the problems and fix them (the right way!)
> one at a time.  For example, it seems that the thread agrees that the
> overall design of constants is a problem: let's talk about
> (incrementally!) moving constants to the right architecture. There
> are good suggestions about this upthread.

I'm not sure why you think anyone so far suggested anything to "fix it
all" or to do it "not-incrementally". For example, I think we are
talking about the design of constants and how it can be incrementally
improved (for different purposes). I also think there are interesting
suggestions upthread but I'm not so sure there was any kind of
agreement on "the right architecture".

Cheers,
  Johannes
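As a rough illustration of the single-`CallGraphUpdater` idea above, here is a standalone toy model. The names and interfaces are invented for this sketch; LLVM's real `CallGraphUpdater` looks quite different. The point is only the shape: reads stay on the graph, while all mutation funnels through one synchronized facade.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <mutex>
#include <set>
#include <string>
#include <utility>

// Toy call graph: edges are (caller, callee) name pairs.
class CallGraph {
  friend class CallGraphUpdater; // only the updater may mutate edges
  std::set<std::pair<std::string, std::string>> Edges;

public:
  bool hasEdge(const std::string &From, const std::string &To) const {
    return Edges.count({From, To}) != 0;
  }
  std::size_t numEdges() const { return Edges.size(); }
};

// The single synchronized facade parallel passes would share.
class CallGraphUpdater {
  CallGraph &CG;
  std::mutex Lock; // one lock: mutation is rare, so contention is low

public:
  explicit CallGraphUpdater(CallGraph &G) : CG(G) {}

  void addCall(const std::string &Caller, const std::string &Callee) {
    std::lock_guard<std::mutex> Guard(Lock);
    CG.Edges.insert({Caller, Callee});
  }

  // Deleting a function removes every edge it participates in.
  void removeFunction(const std::string &F) {
    std::lock_guard<std::mutex> Guard(Lock);
    for (auto It = CG.Edges.begin(); It != CG.Edges.end();)
      It = (It->first == F || It->second == F) ? CG.Edges.erase(It)
                                               : std::next(It);
  }
};
```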

Chris Lattner via llvm-dev

unread,
Mar 27, 2020, 12:22:57 PM3/27/20
to Johannes Doerfert, llvm...@lists.llvm.org, Doerfert, Johannes

On Mar 27, 2020, at 12:23 AM, Johannes Doerfert <johannes...@gmail.com> wrote:


> Getting to a multithread clean optimizer is not one bug fix away -
> this is a multiyear journey, and I would prefer not to see big changes
> with unclear tradeoffs made in an effort to get a quick win.

I think I missed the suggestion of "big changes with unclear tradeoffs
made in an effort to get a quick win". So far I thought this was a
discussion of ideas, methods to gather data, and potential pitfalls.

Making use-def chains work differently based on the dynamic type of a Value* is very problematic to me. It breaks the Liskov substitution principle and is likely to lead to widespread bugs.

> Instead, let’s break down the problems and fix them (the right way!)
> one at a time.  For example, it seems that the thread agrees that the
> overall design of constants is a problem: let's talk about
> (incrementally!) moving constants to the right architecture. There
> are good suggestions about this upthread.

I'm not sure why you think anyone so far suggested anything to "fix it
all" or to do it "not-incrementally". For example, I think we are
talking about the design of constants and how it can be incrementally
improved (for different purposes). I also think there are interesting
suggestions upthread but I'm not so sure there was any kind of
agreement on "the right architecture".

It sounds like there is agreement that whole-module/LLVMContext use-def chains for constants are a bad idea. There is one specific proposal to fix that which is incremental and avoids the problem I mention above: use instructions to materialize constants, so Constant stops deriving from Value. I would love to see this explored as a step along the way, instead of spending time pondering how bad a break to the core APIs in the compiler would be in practice.

-Chris
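A standalone toy of the materialization proposal above, with invented class names (this is not the LLVM hierarchy): the shared constant payload is not a Value and carries no use list; only the function-local materializing instruction does, so two functions using the same constant never touch shared use-def state.

```cpp
#include <cassert>
#include <vector>

struct Instruction; // forward declaration for the use list

// Only things that participate in use-def chains derive from Value.
struct Value {
  std::vector<Instruction *> Uses; // function-local use list
};

// The shared constant payload is immutable and is NOT a Value:
// it has no use list, so parallel passes never mutate shared state.
struct ConstantData {
  int Payload;
};

struct Instruction : Value {
  std::vector<Value *> Operands;
  void addOperand(Value *V) {
    Operands.push_back(V);
    V->Uses.push_back(this); // use-list update stays inside one function
  }
};

// The materializer: imports a ConstantData into one function's body.
// Each function gets its own ConstInst, hence its own use list.
struct ConstInst : Instruction {
  const ConstantData *Data;
  explicit ConstInst(const ConstantData *D) : Data(D) {}
};
```

This is where the verbosity Chris concedes comes from: every function that uses `42` carries its own materializing instruction, but in exchange the constant itself becomes immutable shared data.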


Nicolai Hähnle via llvm-dev

unread,
Mar 27, 2020, 4:54:16 PM3/27/20
to Chris Lattner, llvm...@lists.llvm.org, Doerfert, Johannes
On Fri, Mar 27, 2020 at 5:22 PM Chris Lattner via llvm-dev
<llvm...@lists.llvm.org> wrote:
> On Mar 27, 2020, at 12:23 AM, Johannes Doerfert <johannes...@gmail.com> wrote:
>
>
> > Getting to a multithread clean optimizer is not one bug fix away -
> > this is a multiyear journey, and I would prefer not to see big changes
> > with unclear tradeoffs made in an effort to get a quick win.
>
> I think I missed the suggestion of "big changes with unclear tradeoffs
> made in an effort to get a quick win". So far I thought this was a
> discussion of ideas, methods to gather data, and potential pitfalls.
>
>
> Making use-def chains work differently based on the dynamic type of a Value* is very problematic to me. It breaks the Liskov substitution principle and is likely to lead to widespread bugs.

That's why I'm also wary of the idea of just having use lists empty
for certain types without any other special handling. However, I would
argue that if Value::use_begin() etc. contain an assertion that fails
when called on one of the value types that don't have use lists, then
the Liskov substitution principle is de facto not broken. It basically
leads to a situation that is as-if Value didn't have use lists in the
first place, and only certain sub-types had use lists.

If we go down that route, maybe the inheritance hierarchy could be
reorganized in a way that makes that more obvious.

Cheers,
Nicolai
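One way a reorganized hierarchy could make the distinction static, as a hedged standalone sketch (all names invented): only a subclass of Value exposes the use-list API, so generic code that walks uses fails to compile for constants instead of asserting at runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

struct Use {};

// Base Value has no use-list API at all.
struct Value {
  virtual ~Value() = default;
};

// Only TrackedValue (think instructions, arguments) carries use lists.
struct TrackedValue : Value {
  std::vector<Use *> Uses;
  std::vector<Use *>::iterator use_begin() { return Uses.begin(); }
  std::vector<Use *>::iterator use_end() { return Uses.end(); }
};

struct UntrackedConstant : Value { // deliberately no use_begin/use_end
  int Payload;
  explicit UntrackedConstant(int P) : Payload(P) {}
};

// Generic code that walks uses can only be instantiated with types
// that really have them; countUses(someUntrackedConstant) would be
// a compile-time error rather than a runtime assertion.
template <typename T> std::size_t countUses(T &V) {
  return static_cast<std::size_t>(
      std::distance(V.use_begin(), V.use_end()));
}
```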


>
> > Instead, let’s break down the problems and fix them (the right way!)
> > one at a time. For example, it seems that the thread agrees that the
> > overall design of constants is a problem: let's talk about
> > (incrementally!) moving constants to the right architecture. There
> > are good suggestions about this upthread.
>
> I'm not sure why you think anyone so far suggested anything to "fix it
> all" or to do it "not-incrementally". For example, I think we are
> talking about the design of constants and how it can be incrementally
> improved (for different purposes). I also think there are interesting
> suggestions upthread but I'm not so sure there was any kind of
> agreement on "the right architecture".
>
>
> It sounds like there is agreement that whole-module/LLVMContext use-def chains for constants are a bad idea. There is one specific proposal to fix that which is incremental and avoids the problem I mention above: use instructions to materialize constants, so Constant stops deriving from Value. I would love to see this explored as a step along the way, instead of spending time pondering how bad a break to the core APIs in the compiler would be in practice.
>
> -Chris
>
>


--

Learn how the world really is,
but never forget how it should be.

David Blaikie via llvm-dev

unread,
Mar 27, 2020, 4:59:54 PM3/27/20
to Nicolai Hähnle, llvm...@lists.llvm.org, Doerfert, Johannes
On Fri, Mar 27, 2020 at 1:53 PM Nicolai Hähnle via llvm-dev <llvm...@lists.llvm.org> wrote:
On Fri, Mar 27, 2020 at 5:22 PM Chris Lattner via llvm-dev
<llvm...@lists.llvm.org> wrote:
> On Mar 27, 2020, at 12:23 AM, Johannes Doerfert <johannes...@gmail.com> wrote:
>
>
> > Getting to a multithread clean optimizer is not one bug fix away -
> > this is a multiyear journey, and I would prefer not to see big changes
> > with unclear tradeoffs made in an effort to get a quick win.
>
> I think I missed the suggestion of "big changes with unclear tradeoffs
> made in an effort to get a quick win". So far I thought this was a
> discussion of ideas, methods to gather data, and potential pitfalls.
>
>
> Making use-def chains work differently based on the dynamic type of a Value* is very problematic to me. It breaks the Liskov substitution principle and is likely to lead to widespread bugs.

That's why I'm also wary of the idea of just having use lists empty
for certain types without any other special handling. However, I would
argue that if Value::use_begin() etc. contain an assertion that fails
when called on one of the value types that don't have use lists, then
the Liskov substitution principle is de facto not broken. It basically
leads to a situation that is as-if Value didn't have use lists in the
first place, and only certain sub-types had use lists.

But it doesn't - it means you write some generic code, test it with some cases, and it looks like it generalizes to other cases (you don't want to/can't test generic code with all possible generic arguments - that's why substitutability is important); then the code breaks when it hits a constant Value because it doesn't conform to the contract.
 

Chris Lattner via llvm-dev

unread,
Mar 27, 2020, 11:34:47 PM3/27/20
to David Blaikie, llvm...@lists.llvm.org, Doerfert, Johannes


On Mar 27, 2020, at 1:55 PM, David Blaikie <dbla...@gmail.com> wrote:

That's why I'm also wary of the idea of just having use lists empty
for certain types without any other special handling. However, I would
argue that if Value::use_begin() etc. contain an assertion that fails
when called on one of the value types that don't have use lists, then
the Liskov substitution principle is de facto not broken. It basically
leads to a situation that is as-if Value didn't have use lists in the
first place, and only certain sub-types had use lists.

But it doesn't - it means you write some generic code, test it with some cases & looks like it generalizes to other cases (you don't want to/can't test generic code with all possible generic arguments - that's why substitutability is important) then the code breaks when it hits a constant Value because it doesn't conform to the contract.

David is exactly right.  To say the same thing in another way, such a design point would break library based design, one of the core principles of LLVM that makes it so powerful.  

To make this explicit, a library that walks use-lists internally would have implicit dependencies in its APIs that some things are not allowed to be constants.

-Chris

Nicholas Krause via llvm-dev

unread,
Mar 27, 2020, 11:43:41 PM3/27/20
to Chris Lattner, David Blaikie, llvm...@lists.llvm.org, Doerfert, Johannes
I'm not sure who is removing me from this discussion but please keep me CCed to it.
Thanks,
Nick

Nicolai Hähnle via llvm-dev

unread,
Mar 28, 2020, 5:43:45 AM3/28/20
to David Blaikie, llvm...@lists.llvm.org, Doerfert, Johannes
On Fri, Mar 27, 2020 at 9:55 PM David Blaikie <dbla...@gmail.com> wrote:
> On Fri, Mar 27, 2020 at 1:53 PM Nicolai Hähnle via llvm-dev <llvm...@lists.llvm.org> wrote:
>>
>> On Fri, Mar 27, 2020 at 5:22 PM Chris Lattner via llvm-dev
>> <llvm...@lists.llvm.org> wrote:
>> > On Mar 27, 2020, at 12:23 AM, Johannes Doerfert <johannes...@gmail.com> wrote:
>> >
>> >
>> > > Getting to a multithread clean optimizer is not one bug fix away -
>> > > this is a multiyear journey, and I would prefer not to see big changes
>> > > with unclear tradeoffs made in an effort to get a quick win.
>> >
>> > I think I missed the suggestion of "big changes with unclear tradeoffs
>> > made in an effort to get a quick win". So far I thought this was a
>> > discussion of ideas, methods to gather data, and potential pitfalls.
>> >
>> >
>> > Making use-def chains work differently based on the dynamic type of a Value* is very problematic to me. It breaks the Liskov substitution principle and is likely to lead to widespread bugs.
>>
>> That's why I'm also wary of the idea of just having use lists empty
>> for certain types without any other special handling. However, I would
>> argue that if Value::use_begin() etc. contain an assertion that fails
>> when called on one of the value types that don't have use lists, then
>> the Liskov substitution principle is de facto not broken. It basically
>> leads to a situation that is as-if Value didn't have use lists in the
>> first place, and only certain sub-types had use lists.
>
>
> But it doesn't - it means you write some generic code, test it with some cases & looks like it generalizes to other cases (you don't want to/can't test generic code with all possible generic arguments - that's why substitutability is important) then the code breaks when it hits a constant Value because it doesn't conform to the contract.

It's not a break of the Liskov substitution principle, though. For
example: In the proposal, you can iterate over uses of an Instruction,
and you can iterate over uses of all sub-types of an Instruction. You
can't iterate over Constant, but then Constant isn't a sub-type of
Instruction.

It's a bit of an academic point. What you're really trying to say is
that with this particular version of the proposal, it is not
**statically** checked at compile-time whether a Value has iterable
use lists, and that makes it easier to write incorrect code
accidentally. Yes, that's a weakness, and precisely the reason why I
suggested that we may want to re-organize the inheritance hierarchy of
Values.

Cheers,
Nicolai

Nicholas Krause via llvm-dev

unread,
Mar 28, 2020, 3:47:44 PM3/28/20
to Nicolai Hähnle, David Blaikie, llvm...@lists.llvm.org, Doerfert, Johannes

Nicolai,

Please stop removing me from the CC. I asked in a previous email to be
CCed so I can follow the discussion.

Thanks,
Nick

David Blaikie via llvm-dev

unread,
Mar 28, 2020, 4:38:32 PM3/28/20
to Nicholas Krause, llvm...@lists.llvm.org, Doerfert, Johannes
I don't think anyone's /removing/ you; we're just replying to the relevant emails/subthreads, which you generally aren't already on and won't get on this way. It might be best to flag this thread in your mail client somehow rather than asking others to handle email threads differently for your use case.

- Dave

David Blaikie via llvm-dev

unread,
Mar 28, 2020, 4:45:01 PM3/28/20
to Nicolai Hähnle, llvm...@lists.llvm.org, Doerfert, Johannes
On Sat, Mar 28, 2020 at 2:40 AM Nicolai Hähnle <nhae...@gmail.com> wrote:
On Fri, Mar 27, 2020 at 9:55 PM David Blaikie <dbla...@gmail.com> wrote:
> On Fri, Mar 27, 2020 at 1:53 PM Nicolai Hähnle via llvm-dev <llvm...@lists.llvm.org> wrote:
>>
>> On Fri, Mar 27, 2020 at 5:22 PM Chris Lattner via llvm-dev
>> <llvm...@lists.llvm.org> wrote:
>> > On Mar 27, 2020, at 12:23 AM, Johannes Doerfert <johannes...@gmail.com> wrote:
>> >
>> >
>> > > Getting to a multithread clean optimizer is not one bug fix away -
>> > > this is a multiyear journey, and I would prefer not to see big changes
>> > > with unclear tradeoffs made in an effort to get a quick win.
>> >
>> > I think I missed the suggestion of "big changes with unclear tradeoffs
>> > made in an effort to get a quick win". So far I thought this was a
>> > discussion of ideas, methods to gather data, and potential pitfalls.
>> >
>> >
>> > Making use-def chains work differently based on the dynamic type of a Value* is very problematic to me. It breaks the Liskov substitution principle and is likely to lead to widespread bugs.
>>
>> That's why I'm also wary of the idea of just having use lists empty
>> for certain types without any other special handling. However, I would
>> argue that if Value::use_begin() etc. contain an assertion that fails
>> when called on one of the value types that don't have use lists, then
>> the Liskov substitution principle is de facto not broken. It basically
>> leads to a situation that is as-if Value didn't have use lists in the
>> first place, and only certain sub-types had use lists.
>
>
> But it doesn't - it means you write some generic code, test it with some cases & looks like it generalizes to other cases (you don't want to/can't test generic code with all possible generic arguments - that's why substitutability is important) then the code breaks when it hits a constant Value because it doesn't conform to the contract.

It's not a break of the Liskov substitution principle, though.

It does in the sense that all Values have a list of uses in the current API - yes, if you're proposing /changing the API/ so that Values don't have a use-list API, that's one thing - but adding an assertion/runtime failure is not that. Having a runtime failure whereby accessing the uses of /some/ Values is a contract violation/assertion failure/etc - that makes it not substitutable; you can't just treat all Values as equal, and have to special-case to avoid asking for the uses of some Values.

Yes, if you move the use-list API down from Value, so Values don't have use lists, that's different/not what Chris is objecting to, I don't think - it was the "Making use-def chains work differently based on the dynamic type of a Value* is very problematic to me" that was specifically the objection/Liskov-breaking issue.
 

Chris Lattner via llvm-dev

unread,
Mar 28, 2020, 7:57:02 PM3/28/20
to David Blaikie, llvm...@lists.llvm.org, Doerfert, Johannes


On Mar 28, 2020, at 1:41 PM, David Blaikie <dbla...@gmail.com> wrote:

>
> But it doesn't - it means you write some generic code, test it with some cases & looks like it generalizes to other cases (you don't want to/can't test generic code with all possible generic arguments - that's why substitutability is important) then the code breaks when it hits a constant Value because it doesn't conform to the contract.

It's not a break of the Liskov substitution principle, though.

It does in the sense that all Values have a list of uses in the current API - yes, if you're proposing /changing the API/ so that Values don't have a use-list API, that's one thing - but adding an assertion/runtime failure is not that. Having a runtime failure whereby accessing the uses of /some/ Values is a contract violation/assertion failure/etc - that makes it not substitutable; you can't just treat all Values as equal, and have to special-case to avoid asking for the uses of some Values.

Yes, if you move the use-list API down from Value, so Values don't have use lists, that's different/not what Chris is objecting to, I don't think - it was the "Making use-def chains work differently based on the dynamic type of a Value* is very problematic to me" that was specifically the objection/Liskov-breaking issue.
 

Exactly, very well put, David. I would be much less concerned if the use-list APIs were sunk out of Value, but I don't see how that could be accomplished in a realistic way.

-Chris

Owen Anderson via llvm-dev

unread,
Apr 12, 2020, 4:05:12 AM4/12/20
to Nicolai Hähnle, llvm...@lists.llvm.org, Doerfert, Johannes

> On Thu, Mar 26, 2020 at 12:47 AM Chris Lattner <clat...@nondot.org> wrote:
>>> There is a certain elegance to having explicit instructions to import
>>> constants into the llvm::Function context, but also ugliness, as IR
>>> will then be full of "constant" instructions, unlike today. Plus,
>>> changing all existing LLVM IR to have those is an enormous amount of
>>> churn.
>>
>> Yes, it increases the verbosity of the IR; this is very true. In practice, it is not a problem though.
>
> Okay, it seems like it could be a viable long-term goal to do this.
> What I'm trying to figure out is how to incrementally evolve LLVM IR,
> and it feels like this change to enable "constant materialization
> instructions" is extremely invasive in terms of churn in the code base
> on a level comparable to opaque pointers -- and it's not strictly
> necessary for the goal of multi-threading. I'm just trying to be
> pragmatic here.

What if we modeled constants as implicit arguments to functions rather than introducing a constant materialization instruction? That would solve the problem of making their use lists function-local, while avoiding some of the unnecessary complexity introduced by materialization instructions (de-duplication, code placement, etc.).

--Owen
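A standalone toy of that implicit-argument idea (names invented for illustration; real LLVM arguments and constants look nothing like this): each function lazily creates one implicit-argument stand-in per distinct constant, and that stand-in owns a purely function-local use list, so no module-wide use list exists to synchronize.

```cpp
#include <cassert>
#include <map>
#include <vector>

struct ImplicitArg {
  int ConstantPayload;        // the constant this argument stands for
  std::vector<int> UserInsts; // function-local "use list" (inst ids)
};

struct Function {
  // One implicit argument per distinct constant the function uses,
  // created lazily on first request.
  std::map<int, ImplicitArg> ImplicitArgs;

  ImplicitArg &getConstantArg(int Payload) {
    ImplicitArg &A = ImplicitArgs[Payload];
    A.ConstantPayload = Payload;
    return A;
  }
};
```

Compared with materialization instructions, this avoids per-use placement and de-duplication questions, at the cost of changing what a function signature conceptually contains.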
