[llvm-dev] "Ordered" Instruction Iterator?

616 views
Skip to first unread message

Zhang via llvm-dev

unread,
May 31, 2021, 10:42:56 PM5/31/21
to llvm-dev
Hi:

In my analysis Pass, I'm trying to visit CallInsts to a special Intrinsic, in the order that they'll be executed, starting from the EntryBlock.
My naive initial assumption was to start from entryBlock and custom handle Br Instructions, but apparently this doesn't work well with more complicated CFGs like loop or Switch.

Is there any LLVM builtin infrastructure that could be used for this? Any hint would be appreciated.


Zhang

David Chisnall via llvm-dev

unread,
Jun 1, 2021, 9:25:42 AM6/1/21
to llvm...@lists.llvm.org
Hi,

It's probably worth stepping back a bit and explaining a bit more what
you mean by 'ordered' and why that's what you want. Instructions within
a basic block that do not have side effects are guaranteed to be
executed in some order that satisfies their dependencies but there is no
guarantee within code generation that the order in the IR is preserved.

Once you get to three basic blocks, there's no single execution order of
basic blocks within the function. The entry block will be executed
first, but then the next block to execute depends on the branch
conditions and some blocks may be executed multiple times.

You can trivially visit all blocks in an undefined order. You can do a
breadth-first traversal by looking at the successors found in the
terminator in every block at a given depth (thought it's a graph, so you
need to maintain a set of BBs you've visited before), but for any
non-trivial function that will only very loosely approximate execution
order.

If you want to track dependencies between invocations of an intrinsic,
then you might want to consider having that intrinsic return and consume
a token. You can then walk def-use chains to get a set of paths through
the function.

David

> _______________________________________________
> LLVM Developers mailing list
> llvm...@lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Cranmer, Joshua via llvm-dev

unread,
Jun 1, 2021, 12:10:54 PM6/1/21
to Zhang, llvm...@lists.llvm.org

As David points out, there’s some ambiguity as to what “ordered” means with respect to basic blocks. There are three main different traversals that LLVM automatically provides, and which one is appropriate depends on your use case.

 

The first is regular traversal over all basic blocks in the function:

for (auto &BB : F) {}

 

This will iterate over every basic block in the function in an arbitrary order (roughly, the order that they are actually laid out during emission), and the only guarantee you have is that the entry block is first.

 

Alternatively, you can choose to use the depth-first traversal:

for (auto BB : depth_first(F)) {}

 

This will execute a depth-first preorder, which means every basic block will be visited after one (but not necessarily all) of its predecessors have been visited. This is sufficient to guarantee that you visit every dominator of a block before you visit the block itself.

 

Another alternative is the reverse postorder traversal (ReversePostOrderTraversal is actually expensive to compute, so you may want to save it to a variable if you intend to do multiple traversals in a pass):

for (auto BB : ReversePostOrderTraversal(F)) {}

 

Reverse post-order visits every block after all of its predecessors (excluding loops) have been visited. From the short snippet of an explanation you have provided, this may suffice for your use-case. However, it may also be overkill for your needs, and you might instead investigate a cheaper alternative traversal.

Zhang via llvm-dev

unread,
Jun 1, 2021, 12:21:48 PM6/1/21
to Cranmer, Joshua, llvm-dev
Hi:
Basically in our backend, some domain specific information are encoded as calls to a custom intrinsic. For example our semantic supports encoding a struct in multiple calls (It's a DSL for JIT-Like operations, in case you are wondering), as following:

```
int main(int argc){
 uint32_t xxx = 0x1;
uint16_t yyy = 0x1;
uint16_t zzz = 0x1;
 __builtin_save(xxx);
 if(argc % 2 ==0){
     __builtin_save(yyy);
   }
else{
 __builtin_save(zzz);
}
........
}

What I'm trying to accomplish here is basically to iterate the IR from entry and reconstruct the structure layout:
```
{
 uint32_t xxx;
 uint16_t yyy;
}
```

I'm aware directly using a StructType would be a way much simpler solution, however this is currently limited by other parts in our usage pipeline.

Is this feasible to implement at IR Level? Or am I better off with libClang based solutions?

Thanks a lot for the info.


Zhang
 
 
------------------ Original ------------------
From:  "Cranmer, Joshua"<joshua....@intel.com>;
Date:  Wed, Jun 2, 2021 00:12 AM
To:  "Zhang"<ad...@mayuyu.io>; "llvm-dev"<llvm...@lists.llvm.org>;
Subject:  RE: [llvm-dev] "Ordered" Instruction Iterator?

Stefanos Baziotis via llvm-dev

unread,
Jun 1, 2021, 12:40:34 PM6/1/21
to Zhang, llvm-dev
Hi Zhang,

What has been explained in previous replies is that at *compile-time*, you don't know whether at *runtime* argc % 2 == 0 will be true or false. So, you don't know whether __builtin_save(yyy) or __builtin_save(zzz) will be reached first. Are you maybe interested in *text order* (I don't know if this is a standard term but I mean the order at which entities appear on the text of the source file) instead of *execution order* (the order at which entities are reached during the dynamic execution)?

To make that clear, consider this:
```
for (int i = 0; i < 100; ++i) {
  if (i > 0) {
    foo();  // executed at the 2nd iteration
  }
  bar(); // executed at the 1st iteration
}
```

`foo()` is executed *after* `bar()` but appears *before* it in text order.

Best,
Stefanos

_______________________________________________
Reply all
Reply to author
Forward
0 new messages