Arbitrary Regions in MLIR
Historically, MLIR has been designed as an SSA/CFG IR, with a special case for mlfunc which used to encapsulate in a function what is now the affine dialect. As part of removing the special case of mlfunc, and unifying the Operation class as a generic container, the concept of Region was introduced.
A very simplistic definition is that a Region is a list of blocks nested under an operation. The first use-cases for Region have been affine.for and affine.if, and more recently in-line GPU kernels. In both cases the attached region(s) is(are) assumed to contain a single block. A generalization/formalization of the Region concept is not implemented, one of the original motivation included: “the nested (lists of) blocks should be black-boxable from the perspective of code generation and traditional control- and data-flow analyses”.
However this raises the question about the common semantics for this list of blocks and to which point does the “black-boxable” property extend. This RFC advocates for pushing the "black-boxable" aspect further: regions are nothing else than a list of blocks whose semantics can only be inferred with knowledge of the region (either through attributes or through the enclosing Operation itself), in particular the list of blocks should not be guaranteed to form a CFG.
We can explore the “black-boxable” aspect of a Region through the anticipated use-cases we know about, and speculate on useful properties to express. The current state of region includes the ability for dialects to specify ”attribute verification hooks [...] to verify the additional constraints they impose on the regions”. Such constraints can be, for example, about a region limited to affine properties, or about a Region expressing a computation offloaded to a device like a TPU or a GPU.
The consequence of having dialect specific semantics on a Region limits the kind of transformations that can be conservatively applied on a region generically (in other words: without specific knowledge about this region).
Some views in favor of preserving CFGs is that passes could process regions as isolated CFGs (like a function body) and that having to verify if the region is a CFG or not is unnecessary boilerplate.
On the other hand, even regions that look like CFGs on the surface may not be compatible with traditional CFG transformations in subtle ways. A good example is GPU kernels and in general modeling SPMD/SIMT programming model where multiple concurrent threads execute simultaneously and implicitly the region. This model offers a simplified view of the device to the programmer, which seems to fit a traditional CFG and is represented as such in LLVM. However this is incorrect in general and LLVM has a history of subtle mis-compiles of GPU kernels. LLVM introduced the convergent attribute in 2015 to try to retrofit some of these properties on a CFG but it is known to be not enough, and even today LLVM Jump-Threading pass is still reported to be incorrect with respect to GPU kernels. The same reference indicates that HSAIL “attempts to define restrictions on program transformations in terms of immediate dominators and post-dominators”, basically restricting the definition of a kernel Region as something that does not hold the usual properties of a CFG. Embracing the "GPU extensions" to CFG as first-class concept like SPIR-V/HSAIL to make MLIR more convenient for supporting GPU programming models does not seem like a path compatible with the generality and extensibility promised by MLIR.
In the absence of the ability to process a region in a generic way by any transformation or analyzes, there isn’t much left to lose to give up on forcing every region to have a “CFG”. The main aspect that is likely a core concept is the dominance property and queries. Not having a CFG in every region implies that dominance cannot be queried in arbitrary regions. One possible solution around handling queries on dominance between two operations (or blocks) could be to consider undefined behavior (assertion in the compiler) to try to query dominance between two operations whose closest enclosing region isn’t a CFG. This seems reasonable since any consumer of dominance should already check if the Region is having the right set of properties before processing it. Without dominance, a region (or the enclosing op) should implement a dialect-specific verifier to ensure properties, such as that a value is “visible” where it is used.
On the aspects motivating making the CFG properties pluggable, on top of removing the false sense of security about working on a CFG (with the aforementioned lesson learnt from LLVM and GPU kernel for example), we open-up MLIR to be used to more accurately represent data-flow graphs like TensorFlow and/or XLA, explore high-level State-Machine IR, and finally revisit our representation of Module/Function in MLIR (RFC coming soon).
"foo"() : () -> () { // CFG Region
^bar:
%bar = "bar"()
br ^non_cfg
^non_cfg:
"non_cfg"() {std.noncfg} : () -> () {
^baz:
%baz = "baz"()
^nested:
"nested"() : () -> () { // CFG Region
^foobar:
%foobar = "foobar"(%bar)
br ^foobaz
^foobaz:
"foobaz"(%baz, %foobar)
return
}
return
}
}
In this example, there are three levels of nesting: the outer region is a CFG, the ^non_cfg block in this region has a “non_cfg”() operation which encloses a non-CFG region. This region contains two blocks which don’t have any dominance relationships. However any block or operation in this region is dominated by “bar”() since it dominates the whole region by dominating the enclosing operation in a CFG region.
The “nested”() operation has CFG region attached which contains two blocks (and two operations). The "foobar"(%bar) operation dominates "foobaz"(%baz, %foobar) in this region. For the same reason as before, the whole region is dominated by “bar”(). However the tricky question is about the %baz = "baz"() dominance status with respect to "foobaz"(%baz, %foobar). In this proposal we consider that such question is ill-formed as the closest enclosing region is not a CFG.
Regarding the question of using the value %baz, we view the implicit capture mechanism as a convenience over passing explicit arguments to an operation. So to consider identically the case above to the situation where %baz is explicitly passed to the “nested” operation, and so it is the responsibility of the verifier on the "non_cfg" region to ensure that %baz is visible to the “nested” operation.
We propose to introduce a new unit-type attribute in the standard dialect: `std.noncfg`. In the absence of this attribute on the enclosing operation, the region is assumed to be a CFG. We consider this as a pragmatic choice: CFG-based representation are pervasive and making it the default seems sensible. In the future we may consider adding an enum attribute (or a string attribute) to record specific type of regions as needed. For instance if/when we get to a point where the GPU kernel representation requires different semantics (around convergence for example) than the traditional CFG, we could introduce a `SIMT` or `SPMD` region kind.
LangRef will be updated such that Regions are defined as “a list of MLIR Blocks,'' as opposed to “a CFG of MLIR Blocks”. A new section is introduced to specify CFG regions as a special case of region, defining the properties of CFG regions.
Blocks are redefined to be: “a list of operations without any defined execution order.” The CFG region specification adds the existing constraints on block: “In a CFG Region, a block is a sequential list of operations without control flow (calls are not considered control flow for this purpose) that are executed from top to bottom. The last operation in a block is a terminator operation, which ends the block.”
The successor-list part of an operation is renamed to block-operands, the semantic is specific to the operation.
We expect other changes, the list above is non-exhaustive.
The potential restrictions on Region (which includes any attribute independently of the CFG aspects, like a SIMT GPU kernel for instance) affect many pieces of the infrastructure: for example it is no longer clear if the canonicalization patterns and pattern-rewriter are always legal regardless of the dialect restriction on a Region. Similarly the use of a walker can’t be as simple in general, so we need to provide convenient helpers for passes to traverse the nested IR and filter out CFG Regions. Although most dialect-specific traversals filter for dialect-specific operations which are known to be exclusively in CFG Region and thus can’t get away with the current walkers.
It is likely that some added complexity will be warranted by the fact that traversals in general must express their intent with respect to regions in general. We plan to keep convenient walker for the common case.
--
Mehdi
I worked with River on two proposals that are related, here is the first one. We'll send another RFC for the second. We believe this first one shouldn't be a very intrusive change at this point and we think this will open-up the design field for MLIR users. We're looking forward to some feedback here.
However this raises the question about the common semantics for this list of blocks and to which point does the “black-boxable” property extend. This RFC advocates for pushing the "black-boxable" aspect further: regions are nothing else than a list of blocks whose semantics can only be inferred with knowledge of the region (either through attributes or through the enclosing Operation itself), in particular the list of blocks should not be guaranteed to form a CFG.
Motivation and Use-Cases for Restrictions on Regions
We can explore the “black-boxable” aspect of a Region through the anticipated use-cases we know about, and speculate on useful properties to express. The current state of region includes the ability for dialects to specify ”attribute verification hooks [...] to verify the additional constraints they impose on the regions”. Such constraints can be, for example, about a region limited to affine properties, or about a Region expressing a computation offloaded to a device like a TPU or a GPU.The consequence of having dialect specific semantics on a Region limits the kind of transformations that can be conservatively applied on a region generically (in other words: without specific knowledge about this region).
In the absence of the ability to process a region in a generic way by any transformation or analyzes, there isn’t much left to lose to give up on forcing every region to have a “CFG”. The main aspect that is likely a core concept is the dominance property and queries. Not having a CFG in every region implies that dominance cannot be queried in arbitrary regions. One possible solution around handling queries on dominance between two operations (or blocks) could be to consider undefined behavior (assertion in the compiler) to try to query dominance between two operations whose closest enclosing region isn’t a CFG. This seems reasonable since any consumer of dominance should already check if the Region is having the right set of properties before processing it. Without dominance, a region (or the enclosing op) should implement a dialect-specific verifier to ensure properties, such as that a value is “visible” where it is used.
On the aspects motivating making the CFG properties pluggable, on top of removing the false sense of security about working on a CFG (with the aforementioned lesson learnt from LLVM and GPU kernel for example), we open-up MLIR to be used to more accurately represent data-flow graphs like TensorFlow and/or XLA, explore high-level State-Machine IR, and finally revisit our representation of Module/Function in MLIR (RFC coming soon).
Region Kind
We propose to introduce a new unit-type attribute in the standard dialect: `std.noncfg`. In the absence of this attribute on the enclosing operation, the region is assumed to be a CFG. We consider this as a pragmatic choice: CFG-based representation are pervasive and making it the default seems sensible.
Specification ChangesLangRef will be updated such that Regions are defined as “a list of MLIR Blocks,'' as opposed to “a CFG of MLIR Blocks”. A new section is introduced to specify CFG regions as a special case of region, defining the properties of CFG regions.Blocks are redefined to be: “a list of operations without any defined execution order.” The CFG region specification adds the existing constraints on block: “In a CFG Region, a block is a sequential list of operations without control flow (calls are not considered control flow for this purpose) that are executed from top to bottom. The last operation in a block is a terminator operation, which ends the block.”The successor-list part of an operation is renamed to block-operands, the semantic is specific to the operation.We expect other changes, the list above is non-exhaustive.
Practical Aspects (Walkers, Traversing the IR, etc.)
The potential restrictions on Region (which includes any attribute independently of the CFG aspects, like a SIMT GPU kernel for instance) affect many pieces of the infrastructure: for example it is no longer clear if the canonicalization patterns and pattern-rewriter are always legal regardless of the dialect restriction on a Region. Similarly the use of a walker can’t be as simple in general, so we need to provide convenient helpers for passes to traverse the nested IR and filter out CFG Regions. Although most dialect-specific traversals filter for dialect-specific operations which are known to be exclusively in CFG Region and thus can’t get away with the current walkers.It is likely that some added complexity will be warranted by the fact that traversals in general must express their intent with respect to regions in general. We plan to keep convenient walker for the common case.
On May 16, 2019, at 5:45 PM, Mehdi AMINI <joke...@gmail.com> wrote:I worked with River on two proposals that are related, here is the first one. We'll send another RFC for the second. We believe this first one shouldn't be a very intrusive change at this point and we think this will open-up the design field for MLIR users. We're looking forward to some feedback here.Fantastic, thank you for putting this together, I think this is a really important step to move the conversation forward!However this raises the question about the common semantics for this list of blocks and to which point does the “black-boxable” property extend. This RFC advocates for pushing the "black-boxable" aspect further: regions are nothing else than a list of blocks whose semantics can only be inferred with knowledge of the region (either through attributes or through the enclosing Operation itself), in particular the list of blocks should not be guaranteed to form a CFG.+1, this is great.Motivation and Use-Cases for Restrictions on Regions
We can explore the “black-boxable” aspect of a Region through the anticipated use-cases we know about, and speculate on useful properties to express. The current state of region includes the ability for dialects to specify ”attribute verification hooks [...] to verify the additional constraints they impose on the regions”. Such constraints can be, for example, about a region limited to affine properties, or about a Region expressing a computation offloaded to a device like a TPU or a GPU.The consequence of having dialect specific semantics on a Region limits the kind of transformations that can be conservatively applied on a region generically (in other words: without specific knowledge about this region).Right.In the absence of the ability to process a region in a generic way by any transformation or analyzes, there isn’t much left to lose to give up on forcing every region to have a “CFG”. The main aspect that is likely a core concept is the dominance property and queries. Not having a CFG in every region implies that dominance cannot be queried in arbitrary regions. One possible solution around handling queries on dominance between two operations (or blocks) could be to consider undefined behavior (assertion in the compiler) to try to query dominance between two operations whose closest enclosing region isn’t a CFG. This seems reasonable since any consumer of dominance should already check if the Region is having the right set of properties before processing it. Without dominance, a region (or the enclosing op) should implement a dialect-specific verifier to ensure properties, such as that a value is “visible” where it is used.I don’t think we necessarily have to give up this far.I agree with you that we need an ‘open’ ecosystem of ops that have regions, and given that, generic code should make no assumptions about the semantics or behavior of regions. I’ve had conversations with at least one hardware team where regions are a good way to express things like VLIW instruction bundles, which have clearly different semantics than your usual CFG. Similar, as you’re aware, regions are also a great way to represent generic DAGs like a TensorFlow graph, even when it doesn’t have normal control flow semantics.That said, even though I think we want the ability for a region to interpret its contents in an arbitrary way, we can still impose restrictions on the form of the operations and blocks in the region, and I think that control flow semantics (through extensible terminators, which we are already set for) and therefore dominance is a good thing to preserve. We don’t need to define the semantics of the operations and behavior of the code, but we can still require that any values are defined before use in a CFG or block if they exist.
The reason I raise this is that we do have CFG-like semantics baked into our IR due to block arguments, and I pretty strongly feel that block arguments are the right way to go.Do you see any problems with that?
On the aspects motivating making the CFG properties pluggable, on top of removing the false sense of security about working on a CFG (with the aforementioned lesson learnt from LLVM and GPU kernel for example), we open-up MLIR to be used to more accurately represent data-flow graphs like TensorFlow and/or XLA, explore high-level State-Machine IR, and finally revisit our representation of Module/Function in MLIR (RFC coming soon).Yes, absolutely. In addition to the open op/region ecosystem, I think it is really important that we standardize a few “properties” that these operations can buy into, because the affine.if/for, the std.func operations, and many others *will* have sensible LLVM-like semantics, and we want dialect independent passes like LICM, CSE, and others to be able to apply to these sorts of regions.I (still) think that this is a problem for another day, and will become very clear when we have more sorts of users and resolve some of the other more pressing issues.Region Kind
We propose to introduce a new unit-type attribute in the standard dialect: `std.noncfg`. In the absence of this attribute on the enclosing operation, the region is assumed to be a CFG. We consider this as a pragmatic choice: CFG-based representation are pervasive and making it the default seems sensible.Why encode this as an attribute? These properties seem like something that is true for all instances of a given operation, so they should be a bit on AbstractOperation, not an attribute. This allows you to invert the sense of the bit. We generally want the absence of a set bit on AbstractOperation to be the conservative interpretation of something, so this will work out better.
That said, it isn’t clear what “CFG semantics” means. Can you define that better? What sorts of transformations are allowed and prohibited? You gave a large number of GPU issues from LLVM - are those sorts of use cases not “CFG like”?Would it make more sense to have a small family of orthogonal bits to model the independent properties that various CFG-like regions have?For example, we could have a bit for things like “unreachable blocks are dead”.
--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/CANF-O%3DYbZv%3De5Y7eo0W%2BekmQVYEOeAxy0hCHOXfSr%2BBuCRq2AA%40mail.gmail.com.
The reason I raise this is that we do have CFG-like semantics baked into our IR due to block arguments, and I pretty strongly feel that block arguments are the right way to go.Do you see any problems with that?I don't see a problem with not changing anything in the way we model CFGs: block arguments are nice and successors on terminator seem also like a well understood design.I'm not sure I understand your point about "dominance is a good thing to preserve". If a region is non-CFG the concept of dominance may not be applicable.If we talk about a data flow graph (like TensorFlow), we may want to express a non-CFG region with the sequence:%add1 = tf.Add(%0, %1)%add2 = tf.Add(%2, %3)%mul = tf.Mul(%add1, %add2)%add1 and %add2 are not ordered one with respect to each other, and there isn't any dominance between the two (%add2 does not post-dominate %add1 here): we would be able to have a block with unordered operations.
Similarly a TensorFlow loop could be expressed in a single block without breaking up the "back-edge":%loop.body.init = tf_executor.Merge(%loop.init, %next_count)%dec_count = "tf.Add"(%loop.body.init, -1)%loop_cond = "tf.NotEqual"(%dec_count, 0) :%true, %false = tf_executor.Switch (%loop_cond, %dec_count)%next_count = tf_executor.NextIteration(%false)
If we must preserve dominance, then it seems we always have a CFG somehow?
Region Kind
We propose to introduce a new unit-type attribute in the standard dialect: `std.noncfg`. In the absence of this attribute on the enclosing operation, the region is assumed to be a CFG. We consider this as a pragmatic choice: CFG-based representation are pervasive and making it the default seems sensible.Why encode this as an attribute? These properties seem like something that is true for all instances of a given operation, so they should be a bit on AbstractOperation, not an attribute. This allows you to invert the sense of the bit. We generally want the absence of a set bit on AbstractOperation to be the conservative interpretation of something, so this will work out better.I just didn't think about using a bit on the AbstractOperation: I intuitively try to design things in a generic way through the dialect interface: stealing a bit on AbstractOperation requires hard-coding the concept in MLIR right? Using attributes is the only way for dialects to express such concepts (without patching MLIR itself).
This is likely enough of a "core" concept in this case that I'm fine with baking it inside AbstractOperation though.We went with "default to CFG" motivated by a prettier syntax and expecting the overall majority of operation to hold a CFG-region, but in retrospect it wasn't justified since registered op can omit such attribute in their custom printer/parser.That said, it isn’t clear what “CFG semantics” means. Can you define that better? What sorts of transformations are allowed and prohibited? You gave a large number of GPU issues from LLVM - are those sorts of use cases not “CFG like”?Would it make more sense to have a small family of orthogonal bits to model the independent properties that various CFG-like regions have?For example, we could have a bit for things like “unreachable blocks are dead”.So far we were thinking about starting with a single "bit" expressing that we're in a CFG (in the LLVM sense) and so dominance queries are valid, and add more properties/traits as we encounter use-cases (SIMT/SIMD for GPU kernels could be one). We haven't done the work of expressing the various "bits" we would need to slice the useful properties of CFG we would want to express independently.
On May 16, 2019, at 10:18 PM, Mehdi AMINI <joke...@gmail.com> wrote:The reason I raise this is that we do have CFG-like semantics baked into our IR due to block arguments, and I pretty strongly feel that block arguments are the right way to go.Do you see any problems with that?I don't see a problem with not changing anything in the way we model CFGs: block arguments are nice and successors on terminator seem also like a well understood design.I'm not sure I understand your point about "dominance is a good thing to preserve". If a region is non-CFG the concept of dominance may not be applicable.If we talk about a data flow graph (like TensorFlow), we may want to express a non-CFG region with the sequence:%add1 = tf.Add(%0, %1)%add2 = tf.Add(%2, %3)%mul = tf.Mul(%add1, %add2)%add1 and %add2 are not ordered one with respect to each other, and there isn't any dominance between the two (%add2 does not post-dominate %add1 here): we would be able to have a block with unordered operations.Right, but we can already model this case.To be clear, what I mean by “dominance is a good thing to preserve” are structural properties of the IR. Regardless of this discussion, it sounds like we agree that regions can contain a graph of blocks,
and the edges of that graph are defined by terminators. If so, a dominator tree may be formed, because a dominator tree can be formed from any graph.
Similarly, I assume we agree that operands to terminators are expected to line up with the bbargs of the corresponding successor.
Given this (which I think we agree on) then these are the sorts of structural properties that emerge:1. Operations in the same block can only refer to each other in lexical order (lower ones can refer to earlier ones)
2. An operation in a block is allowed to refer to a bb argument defined on its block.3. An operation is only allowed to refer to a value defined in another block (by a bbarg or operation) when it is dominated (in the domtree sense) by it.
1) dominance queries are always fully defined based on structural properties. 2) dialects can choose the best way to [ab]use the structural restrictions of the system to model their problems. 3) regions by themselves do not imply any specific interpretation to the blocks or operations within them. 4) passes that know about specific regions (e.g. cfg.func, affine.for, etc) can use fully domain specific information to do good things. 5) We carefully define some systemic bits to capture the properties we need to make important basic passes (CSE, DCE, etc) dialect independent.
%add1 and %add2 are not ordered one with respect to each other, and there isn't any dominance between the two (%add2 does not post-dominate %add1 here): we would be able to have a block with unordered operations.Right, but we can already model this case.To be clear, what I mean by “dominance is a good thing to preserve” are structural properties of the IR. Regardless of this discussion, it sounds like we agree that regions can contain a graph of blocks,The key here to me here to know if we have the same things in mind is "can": I agree that a regions can contain a graph of blocks, but it does not imply that a region containing multiple blocks is necessarily chaining them like a CFG.
and the edges of that graph are defined by terminators. If so, a dominator tree may be formed, because a dominator tree can be formed from any graph.Right, but let's make sure we don't over assume: "any graph" to me implies that there can be multiple entries for instance (two different parallel entry blocks with side-effects that can't be ordered for instance).
Similarly, I assume we agree that operands to terminators are expected to line up with the bbargs of the corresponding successor.The point of non-CFG region is to not make this assumption about edges between blocks and terminator. This leave everything up to the verifier. As such nothing would prevent a modeling like:my_streaming_region() ({^b0():%val1 = "op"()"send"(%val1) [^b1] // Non terminator here.%val2" = "op"()"send"(%val2) [^b2]yield^b1(%arg1):...^b2(%arg2):...})Where every block is conceptually executing in parallel and wait for its arguments to be "ready" before executing.Of course you can model this with separate functions instead of blocks for instance, but then we force an inter-procedural design. Could also use future and explicit loop to model the runtime behavior, but that is obfuscating the high level semantics of the program here.I'm arguing for leaving the possibility of the IR to generalize here, without losing anything that exists today in the current (CFG) regions.
Given this (which I think we agree on) then these are the sorts of structural properties that emerge:1. Operations in the same block can only refer to each other in lexical order (lower ones can refer to earlier ones)Just a nit: this is already broken in unreachable blocks in LLVM, which means you can't make this assumption locally in a block.We're back to the notion of dominance and control flow, if the region isn't a CFG, you don't (necessarily) have an entry block (at least you don't know opaquely which one is it) or you can have multiple.
2. An operation in a block is allowed to refer to a bb argument defined on its block.3. An operation is only allowed to refer to a value defined in another block (by a bbarg or operation) when it is dominated (in the domtree sense) by it.This ties into CFG semantics as far as I can tell.The whole point of non-CFG region is to escape these constraints and allow:non_cfg_region() ({^b0():%val1 = "op"()yield^b1():%val2 = "op"()yield^b2():op(%val2, %val1)yield...})
So far we were thinking about starting with a single "bit" expressing that we're in a CFG (in the LLVM sense) and so dominance queries are valid, and add more properties/traits as we encounter use-cases (SIMT/SIMD for GPU kernels could be one). We haven't done the work of expressing the various "bits" we would need to slice the useful properties of CFG we would want to express independently.The design point I’m arguing for is what we currently have:To summarize: you advocating entirely against the non-CFG regions proposal entirely (most of what you're saying is exactly what Albert has been arguing about).
1) dominance queries are always fully defined based on structural properties. 2) dialects can choose the best way to [ab]use the structural restrictions of the system to model their problems. 3) regions by themselves do not imply any specific interpretation to the blocks or operations within them. 4) passes that know about specific regions (e.g. cfg.func, affine.for, etc) can use fully domain specific information to do good things. 5) We carefully define some systemic bits to capture the properties we need to make important basic passes (CSE, DCE, etc) dialect independent.I'm advocating for leaving the possibility of the IR to generalize, without losing anything that exists today in the current (CFG) regions.I believe this makes MLIR more generic and applicable to a larger range of domain. It embraces the "multi" level part of the goal. I believe that restricting to CFGs and SSA implies that representation outside of the middle-end (close to source-languages) will have more impedance mismatch and likely will have to rewrite another infrastructure because MLIR won't fit their need.
So far we were thinking about starting with a single "bit" expressing that we're in a CFG (in the LLVM sense) and so dominance queries are valid, and add more properties/traits as we encounter use-cases (SIMT/SIMD for GPU kernels could be one). We haven't done the work of expressing the various "bits" we would need to slice the useful properties of CFG we would want to express independently.The design point I’m arguing for is what we currently have:To summarize: you advocating entirely against the non-CFG regions proposal entirely (most of what you're saying is exactly what Albert has been arguing about).I haven’t followed Albert’s discussions, so I can’t comment on that.I’m arguing that we keep the structural properties of the IR, and define a set of bits that provide semantic properties needed by target-independent passes.1) dominance queries are always fully defined based on structural properties. 2) dialects can choose the best way to [ab]use the structural restrictions of the system to model their problems. 3) regions by themselves do not imply any specific interpretation to the blocks or operations within them. 4) passes that know about specific regions (e.g. cfg.func, affine.for, etc) can use fully domain specific information to do good things. 5) We carefully define some systemic bits to capture the properties we need to make important basic passes (CSE, DCE, etc) dialect independent.I'm advocating for leaving the possibility of the IR to generalize, without losing anything that exists today in the current (CFG) regions.I believe this makes MLIR more generic and applicable to a larger range of domain. It embraces the "multi" level part of the goal. I believe that restricting to CFGs and SSA implies that representation outside of the middle-end (close to source-languages) will have more impedance mismatch and likely will have to rewrite another infrastructure because MLIR won't fit their need.I think you’re potentially overly dramatizing the situation here. :-)I’m not saying that we should never do this. I’m saying that I would like to see large evidence that we “have to” do this, e.g. that any problems encountered by a narrower design cannot be solved or worked around in a reasonable way. With that understanding, I’d understand the cost/benefit tradeoff better.
We are progressively generalizing the MLIR design, and nothing is set in stone. I think it makes sense to take things incrementally here, and the next logical step is to start nailing down the existing region semantic bits that specify the dialect independent properties that do actually hold for certain of our existing regions.-Chris
--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/0BD2E32C-82E4-407C-A552-24F3948AB8C8%40google.com.
On Fri, May 17, 2019 at 9:28 AM Chris Lattner <clat...@google.com> wrote:On May 16, 2019, at 10:18 PM, Mehdi AMINI <joke...@gmail.com> wrote:2. An operation in a block is allowed to refer to a bb argument defined on its block.3. An operation is only allowed to refer to a value defined in another block (by a bbarg or operation) when it is dominated (in the domtree sense) by it.This ties into CFG semantics as far as I can tell.The whole point of non-CFG region is to escape these constraints and allow:non_cfg_region() ({^b0():%val1 = "op"()yield^b1():%val2 = "op"()yield^b2():op(%val2, %val1)yield...})
On Friday, May 17, 2019 at 10:47:07 AM UTC-7, Mehdi AMINI wrote:On Fri, May 17, 2019 at 9:28 AM Chris Lattner <clat...@google.com> wrote:On May 16, 2019, at 10:18 PM, Mehdi AMINI <joke...@gmail.com> wrote:2. An operation in a block is allowed to refer to a bb argument defined on its block.3. An operation is only allowed to refer to a value defined in another block (by a bbarg or operation) when it is dominated (in the domtree sense) by it.This ties into CFG semantics as far as I can tell.The whole point of non-CFG region is to escape these constraints and allow:non_cfg_region() ({^b0():%val1 = "op"()yield^b1():%val2 = "op"()yield^b2():op(%val2, %val1)yield...})Can you clarify this further? Is the intended semantics of a "non-CFG" region that every block executes in parallel?
Earlier in this thread, it seemed like another implication was that operations within a specific block would be understood to be an unordered sea of ops.
The functions-as-ops proposal seems to generalize this to ops being "declarative" rather than representing something with runtime execution semantics. That seems like a major shift. Is there precendent for that? What are the ramifications of "declarative ops" and "declarative regions"?
-- Sean SilvaThe CFG region specification adds the existing constraints on block: “In a CFG Region, a block is a sequential list of operations without control flow (calls are not considered control flow for this purpose) that are executed from top to bottom. The last operation in a block is a terminator operation, which ends the block.”
The successor-list part of an operation is renamed to block-operands, the semantic is specific to the operation.
We expect other changes, the list above is non-exhaustive.
Practical Aspects (Walkers, Traversing the IR, etc.)
The potential restrictions on Region (which includes any attribute independently of the CFG aspects, like a SIMT GPU kernel for instance) affect many pieces of the infrastructure: for example it is no longer clear if the canonicalization patterns and pattern-rewriter are always legal regardless of the dialect restriction on a Region. Similarly the use of a walker can’t be as simple in general, so we need to provide convenient helpers for passes to traverse the nested IR and filter out CFG Regions. Although most dialect-specific traversals filter for dialect-specific operations which are known to be exclusively in CFG Region and thus can’t get away with the current walkers.
It is likely that some added complexity will be warranted by the fact that traversals in general must express their intent with respect to regions in general. We plan to keep convenient walker for the common case.
--
Mehdi
--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/15a29729-e903-4e4d-a3eb-a70014af2e45%40tensorflow.org.
The functions-as-ops proposal seems to generalize this to ops being "declarative" rather than representing something with runtime execution semantics. That seems like a major shift. Is there precendent for that? What are the ramifications of "declarative ops" and "declarative regions"?I don't think this proposal is changing anything here: non-CFG region seems orthogonal to have "declarative operations".
For instance we should be able to move forward with the current proposal to turn module and function to be operations without the non-CFG regions, and at this point function operation inside a module would be declarative.Other things that will be possible (independently of the non-CFG region) with the "function as operations" proposal is the ability to have declarative operations for module-level metadata.Best,--Mehdi
---- Sean SilvaThe CFG region specification adds the existing constraints on block: “In a CFG Region, a block is a sequential list of operations without control flow (calls are not considered control flow for this purpose) that are executed from top to bottom. The last operation in a block is a terminator operation, which ends the block.”
The successor-list part of an operation is renamed to block-operands, the semantic is specific to the operation.
We expect other changes, the list above is non-exhaustive.
Practical Aspects (Walkers, Traversing the IR, etc.)
The potential restrictions on Region (which includes any attribute independently of the CFG aspects, like a SIMT GPU kernel for instance) affect many pieces of the infrastructure: for example it is no longer clear if the canonicalization patterns and pattern-rewriter are always legal regardless of the dialect restriction on a Region. Similarly the use of a walker can’t be as simple in general, so we need to provide convenient helpers for passes to traverse the nested IR and filter out CFG Regions. Although most dialect-specific traversals filter for dialect-specific operations which are known to be exclusively in CFG Region and thus can’t get away with the current walkers.
It is likely that some added complexity will be warranted by the fact that traversals in general must express their intent with respect to regions in general. We plan to keep convenient walker for the common case.
--
Mehdi
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ml...@tensorflow.org.
On Thursday, May 16, 2019 at 5:45:53 PM UTC-7, Mehdi AMINI wrote:Specification ChangesLangRef will be updated such that Regions are defined as “a list of MLIR Blocks,'' as opposed to “a CFG of MLIR Blocks”. A new section is introduced to specify CFG regions as a special case of region, defining the properties of CFG regions.
Blocks are redefined to be: “a list of operations without any defined execution order.”
Ops have always represented some sort of "execution". The functions-as-ops proposal seems to generalize this to ops being "declarative" rather than representing something with runtime execution semantics. That seems like a major shift. Is there precendent for that? What are the ramifications of "declarative ops" and "declarative regions"?-- Sean Silva
The functions-as-ops proposal seems to generalize this to ops being "declarative" rather than representing something with runtime execution semantics. That seems like a major shift. Is there precendent for that? What are the ramifications of "declarative ops" and "declarative regions"?I don't think this proposal is changing anything here: non-CFG region seems orthogonal to have "declarative operations".As this RFC is proposing a strict generalization of what we have, it's trivially true that it doesn't change anything here. So I don't think that is a valuable criterion by which to judge the RFC (any generalization would trivially satisfy that property). I think the point that needs more elaboration is what constraints still remain in place and why.For example, is the following op / non-cfg region permitted? here, "my_dialect.change_op_meaning" encloses a region where the "add" op should be interpreted as having the semantics of the "sub" op:```"my_dialect.change_op_meaning"() { old_op: "std.add", new_op: "std.sub", std.noncfg } : () -> () {^foo%val = "std.add"(%0, %1) : (f32, f32) -> f32print(%val) // Should print the numerical value `%0 - %1`, not %0 + %1`}```
Currently, our IR has both structural restrictions (e.g. a block holds a list of ops) and semantic restrictions (e.g. SSA, CFG). The way I'm interpreting this proposal is that by putting {std.noncfg} on the op/region, one opts out of all (?) the semantic restrictions. The RFC says that these become "dialect defined" but doesn't seem to bound or define what that means.
Partially, I think that our current situation is not great, since if one looks at LangRef now it seems to rely a lot on one having an intuitive idea of what an SSA CFG IR is. I would like to understand more what a "dialect-defined" definition for our current SSA CFG semantics would look like.Sorry for the unorganized thoughts on this. I'm trying to dig into some gut feelings about this proposal (and the "functions as operations" one, which I think is intimately related) and so I don't have a fully coherent statement on it yet.
--For instance we should be able to move forward with the current proposal to turn module and function to be operations without the non-CFG regions, and at this point function operation inside a module would be declarative.Other things that will be possible (independently of the non-CFG region) with the "function as operations" proposal is the ability to have declarative operations for module-level metadata.Best,--Mehdi---- Sean SilvaThe CFG region specification adds the existing constraints on block: “In a CFG Region, a block is a sequential list of operations without control flow (calls are not considered control flow for this purpose) that are executed from top to bottom. The last operation in a block is a terminator operation, which ends the block.”
The successor-list part of an operation is renamed to block-operands, the semantic is specific to the operation.
We expect other changes, the list above is non-exhaustive.
Practical Aspects (Walkers, Traversing the IR, etc.)
The potential restrictions on Region (which includes any attribute independently of the CFG aspects, like a SIMT GPU kernel for instance) affect many pieces of the infrastructure: for example it is no longer clear if the canonicalization patterns and pattern-rewriter are always legal regardless of the dialect restriction on a Region. Similarly the use of a walker can’t be as simple in general, so we need to provide convenient helpers for passes to traverse the nested IR and filter out CFG Regions. Although most dialect-specific traversals filter for dialect-specific operations which are known to be exclusively in CFG Region and thus can’t get away with the current walkers.
It is likely that some added complexity will be warranted by the fact that traversals in general must express their intent with respect to regions in general. We plan to keep convenient walker for the common case.
--
Mehdi
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ml...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/15a29729-e903-4e4d-a3eb-a70014af2e45%40tensorflow.org.
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/0301b7af-bcf3-414a-9452-1ed56c8d156f%40tensorflow.org.