[llvm-dev] LLVM data structures between modules

Mohannad Ismail via llvm-dev

Feb 24, 2021, 3:28:34 PM

to llvm-dev

Greetings everyone,

I currently have a pass (Pass 2) that does some transformations. What I want to do is to have a pass (Pass 1) that runs before Pass 2, collects some IR information, stores it in a data structure and passes the data structure to Pass 2 so that I can use it for specific transformations. I think this can be done with getAnalysisUsage, but I'm not sure how. I would like to know how to do that exactly, if it's possible.

Another thing, and this is the tricky part, is that I want Pass 1 to run on all the source files I have first before Pass 2 runs and pass a collective data structure to Pass 2. In other words, I want Pass 1 to run across all the modules and source files first, collect information, pass it to Pass 2 then Pass 2 runs. Is there a way to tell LLVM to do this type of "double compilation"?

Hope I was able to explain this well enough. Please let me know if I wasn't clear or if you have any questions. Thank you very much!

Best regards,

Mohannad Ismail

David Blaikie via llvm-dev

Feb 24, 2021, 4:01:35 PM

to Mohannad Ismail, llvm-dev

If you want to do cross-file optimization, you're looking for/want to use something like LTO or ThinLTO. (see, for instance, the whole program devirtualization work done with ThinLTO recently)

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Mohannad Ismail via llvm-dev

Feb 24, 2021, 4:25:31 PM

to David Blaikie, llvm-dev

Thank you for your reply!

Correct me if I'm wrong, but doesn't LTO kick in after the compilation phase? My current pass makes instrumentations and the pass I want to run before it won't make any instrumentations. So what I am thinking is that one would be an analysis pass and the other would be a transformation pass. From what I understand, instrumentation passes run first before optimization and linking, thus LTO might not help with that. Do I need to convert my passes to be optimization passes to take advantage of LTO? How do I do that? Thank you for your help and support!

David Blaikie via llvm-dev

Feb 24, 2021, 4:47:04 PM

to Mohannad Ismail, llvm-dev

On Wed, Feb 24, 2021 at 1:25 PM Mohannad Ismail <imoh...@vt.edu> wrote:

> Thank you for your reply!
>
> Correct me if I'm wrong, but doesn't LTO kick in after the compilation phase?

"compilation phase" is a bit vague when it comes to LTO.

In simple full LTO, yes: the compiler runs and does some optimizations but keeps the representation in LLVM IR rather than lowering it to machine code. The linker then runs, realizes it has been given IR instead of machine code, and passes all the IR files back to the compiler. The compiler links all the IR together, does more optimization on that one big IR file, and eventually does code generation on it.
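To make that data flow concrete, here is a minimal sketch of the full LTO shape in plain C++. It is a toy model only: it uses no LLVM headers, and `ToyFunction`, `ToyModule`, `linkModules`, `countCallers` and `selectForInstrumentation` are all hypothetical names invented for illustration. The point is that once every per-file module has been merged into one, an analysis ("Pass 1") sees the whole program and a later transformation ("Pass 2") can consume its result.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins for IR; these are NOT LLVM types.
struct ToyFunction {
    std::string name;
    std::vector<std::string> callees;  // names of functions this one calls
};

struct ToyModule {
    std::string name;
    std::vector<ToyFunction> functions;
};

// Link step: concatenate every per-file module into one big module.
ToyModule linkModules(const std::vector<ToyModule>& modules) {
    ToyModule merged{"merged", {}};
    for (const ToyModule& m : modules)
        merged.functions.insert(merged.functions.end(),
                                m.functions.begin(), m.functions.end());
    return merged;
}

// "Pass 1" (analysis): count how many call sites target each defined function.
std::map<std::string, int> countCallers(const ToyModule& m) {
    std::map<std::string, int> callers;
    for (const ToyFunction& f : m.functions)
        callers.emplace(f.name, 0);
    for (const ToyFunction& f : m.functions) {
        for (const std::string& callee : f.callees) {
            auto it = callers.find(callee);
            if (it != callers.end()) ++it->second;  // ignore undefined externals
        }
    }
    return callers;
}

// "Pass 2" (transformation input): choose functions with at least
// minCallers call sites. The threshold is a made-up policy.
std::vector<std::string> selectForInstrumentation(
        const std::map<std::string, int>& callers, int minCallers) {
    assert(minCallers >= 1);
    std::vector<std::string> picked;
    for (const auto& entry : callers)
        if (entry.second >= minCallers) picked.push_back(entry.first);
    return picked;
}
```

Calling `countCallers` on a single unmerged module would miss call sites in other files; running it on the result of `linkModules` is what gives the analysis a cross-file view.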

This is the reality of the compilation model: the conversion to IR happens without global knowledge, otherwise that step couldn't be distributed/parallelized and builds would be very slow. But if you want to configure your own optimization pipelines etc., there's no reason that first (isolated) stage of compilation has to do much work. It could do no work at all, to ensure that whatever important properties you care about are preserved until LTO for discovery by your analyses.

ThinLTO does all this, but its merge step is more custom built. During the first stage of compilation, again some IR transformations/optimizations are done and then the IR is emitted, along with a side-file summary of important details. Those summary files are sent to the "thin link" step, which does the cross-module/whole-program analysis using the summaries and then produces a report of sorts that says what cross-module compilation should be done (e.g., "import thing X from file A into file B, so that B can see details of X (for analysis, inlining, etc.)"). Backend compilations, running distributed/in parallel, then consume those reports, load the originally emitted IR, perform further optimization based on the reports, and eventually emit machine code. That then goes to the traditional linker for the normal linking step.
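The summary/thin-link/report flow can be sketched as a toy model in plain C++. This is not LLVM's actual summary index or import logic: `ToySummary`, `ImportDecision`, `thinLink` and the size cut-off are hypothetical and exist only to show the structure. The thin link sees summaries alone, never function bodies, and emits a per-module report that each backend compilation could consume independently.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy per-module summary emitted next to the IR (hypothetical layout).
struct ToySummary {
    std::string module;
    std::map<std::string, int> definedSizes;   // defined function -> rough size
    std::vector<std::string> calledFunctions;  // functions this module calls
};

// One line of the thin link's report: "import function from fromModule".
struct ImportDecision {
    std::string function;
    std::string fromModule;
};

// Thin link: whole-program analysis over summaries only.
// Returns, for each module, the list of functions it should import.
std::map<std::string, std::vector<ImportDecision>>
thinLink(const std::vector<ToySummary>& summaries, int sizeLimit) {
    assert(sizeLimit >= 0);

    // Global index: function name -> (defining module, size).
    std::map<std::string, std::pair<std::string, int>> index;
    for (const ToySummary& s : summaries)
        for (const auto& def : s.definedSizes)
            index[def.first] = std::make_pair(s.module, def.second);

    std::map<std::string, std::vector<ImportDecision>> report;
    for (const ToySummary& s : summaries) {
        std::vector<ImportDecision>& imports = report[s.module];
        for (const std::string& callee : s.calledFunctions) {
            auto it = index.find(callee);
            if (it == index.end()) continue;             // not defined anywhere
            if (it->second.first == s.module) continue;  // already local
            if (it->second.second > sizeLimit) continue; // too big (made-up policy)
            imports.push_back(ImportDecision{callee, it->second.first});
        }
    }
    return report;
}
```

Because each entry in the returned map depends only on the summaries, a backend job for one module needs just its own entry plus the originally emitted IR, which is what lets the backends run in parallel.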

> My current pass makes instrumentations and the pass I want to run before it won't make any instrumentations. So what I am thinking is that one would be an analysis pass and the other would be a transformation pass. From what I understand, instrumentation passes run first before optimization and linking, thus LTO might not help with that.

Instrumentation passes aren't "special" in any way I know of; they're another kind of transformation. But, yes, if they need to run early because optimizations would destroy the properties they want to discover, then you'd probably need a custom optimization pipeline that does no optimizations (or none that would destroy the invariants you care about) before the whole-program analysis you want to do. That is, don't destroy the invariants in the first optimization pipeline, so that they are preserved until LTO or ThinLTO time, where they can be discovered and the backend compilation(s) can act on them.
