affine.graybox op - RFC


Uday Bondhugula

unread,
Sep 28, 2019, 11:28:18 AM
to MLIR
Hi all,

Here is a proposal on adding a new op to the affine dialect. Please let me know what you think! Although I'm appending it below, the formatting may be messed up. Below is the same thing on GitHub; please feel free to comment there if that's better - I'll be updating/refining the GitHub version based on comments.

Thanks,
Uday

-------------------------------------------------------------------------------------------------------------------------
# RFC - affine.graybox op

This proposal is on adding a new op named *affine.graybox* to MLIR's [affine dialect](https://github.com/tensorflow/mlir/blob/master/g3doc/Dialects/Affine.md).
The op allows the polyhedral form to be used without the need for outlining to
functions, and without the need to lower affine ops such as affine.for and
affine.if into standard unrestricted loops or into "list of basic blocks"
control flow, respectively. In particular, with an *affine.graybox*, it is
possible to represent *every* load and store operation using an *affine.load*
and *affine.store* respectively.

Some form of this op was prototyped by @Alex Zinenko a while ago, as discussed
in an earlier thread, but that implementation is dated and there is no publicly
available documentation on it. This proposal has aspects that differ from the
previous design.


## Op Description

1. The *affine.graybox* op has zero results, zero or more operands, and holds
   a single region, which is a list of zero or more blocks. The op's region can
   have zero or more arguments, each of which can only be a *memref*. The
   operands bind 1:1 to its region's arguments.  The op can't use any memrefs
   defined outside of it, but can use any other SSA values that dominate it. Its
   region's blocks can have terminators the same way as current MLIR functions 
   (FuncOp) can. Control from any *return* ops in its region returns to right 
   after the *affine.graybox* op.  The op will have the trait *FunctionLike*.

2. The requirement for an SSA value to be a valid symbol
   changes so that it also includes (a) symbols at the top-level of any
   *affine.graybox*, and (b) those values that dominate an affine graybox. In
   the latter case, symbol validity is sensitive to the enclosing graybox. As
   such, there has to be an additional method: mlir::isValidSymbol(Value \*v,
   Operation \*op) to check for symbol validity for use in the specific op. See
   design alternatives towards the end.

3. An *affine.graybox* is not isolated from above. All SSA values (other than
   memrefs) that dominate the op can be used in the graybox. Constants and
   other replacements can freely propagate into it. Since memrefs from outside
   can't be used in the graybox and have to be explicitly provided as operands,
   canonicalization or replacement for them will only happen through rewrite
   patterns registered on *affine.graybox* op. More on this further below.

4. *affine.graybox* is a non-executing op that is eventually discarded by
   -lower-affine.

## Syntax

```mlir {.mlir}
// Custom form syntax.
affine.graybox `[` memref-use-list `]` `:` memref-type-list `{`
  block+
`}`
```

## Examples

Here are three examples: one involving non-affine control flow, one involving
non-affine loop bounds, and a third involving non-affine data accesses. Each can
be represented via *affine.graybox* without having to outline code into a
separate function and without having to use std loads/stores; those two are the
only ways of representing such code at present.

### Example 1

Here is how simple, typical non-affine control flow is represented with an
*affine.graybox*; note that all loads/stores in it are affine. This example
is used in the Rationale document to show how outlining to a separate function
allows representation using affine constructs:

```
// A simple linear search in every row of a matrix.
for (i = 0; i < N; i++) {
  for (j = 0; j < N; j++) {
    // Dynamic/non-affine control flow
    if (a[i][j] == key) {
      s[i] = j;
      break;
    }
  }
}
```

```mlir {.mlir}
func @search(%A : memref<?x?xi32>, %S : memref<?xindex>, %key : i32) {
  %c0 = constant 0 : index
  %c1 = constant 1 : index
  %ni = dim %A, 0 : memref<?x?xi32>
  // This loop can be parallelized.
  affine.for %i = 0 to %ni {
    affine.graybox [%A, %S] : (memref<?x?xi32>, memref<?xindex>) {
      %nj = dim %A, 1 : memref<?x?xi32>
      br ^bb1(%c0 : index)

    ^bb1(%j: index):
      %p1 = cmpi "slt", %j, %nj : index
      cond_br %p1, ^bb2, ^bb5

    ^bb2:
      %v = affine.load %A[%i, %j] : memref<?x?xi32>
      %p2 = cmpi "eq", %v, %key : i32
      cond_br %p2, ^bb3(%j : index), ^bb4

    ^bb3(%jv: index):
      affine.store %jv, %S[%i] : memref<?xindex>
      br ^bb5

    ^bb4:
      %jinc = addi %j, %c1 : index
      br ^bb1(%jinc : index)

    ^bb5:
      return
    }
  }
  return
}
```

### Example 2

Non-affine loop bounds:

```
for (i = 0; i < N; i++)
  for (j = 0; j < N; j++)
    // non-affine loop bound for k loop
    for (k=0; k<pow(2,j); k++)
       for (l=0; l<N; l++) {
        // block loop body
        ...

```


```mlir {.mlir}
func @nest(%n : index) {
  %c2 = constant 2 : index
  affine.for %i = 0 to %n {
    affine.for %j = 0 to %n {
      affine.graybox [] {
        %pow = call @powi(%c2, %j) : (index, index) ->  index
        affine.for %k = 0 to %pow {
          affine.for %l = 0 to %n {
            ...
          }
        }
        return
      }  // graybox end
    }
  }
  return
}
```

### Example 3

Non-affine data accesses:

```
for (i = 0; i < N; ++i) {
  A[B[i]]++;
}
```

```mlir {.mlir}
func @non_affine_load_store(%A : memref<100xf32>, %B : memref<100xindex>) {
  %cf1 = constant 1.0 : f32
  affine.for %i = 0 to 100 {
    %v = affine.load %B[%i] : memref<100xindex>
    affine.graybox [%A] : (memref<100xf32>) {
      // %v is now a valid symbol here.
      %s = affine.load %A[%v] : memref<100xf32>
      %o = addf %s, %cf1 : f32
      affine.store %o, %A[%v] : memref<100xf32>
      return
    }
  }
  return
}
```


## Utilities and Passes

* **Hoist or eliminate affine grayboxes**

   There will be a function pass that hoists or eliminates unnecessary
   *affine.graybox* ops, i.e., whenever an *affine.graybox* can be eliminated or
   hoisted without violating the dimension and symbol requirements. The
   propagation of constants and other simplification performed by scalar
   optimizations / canonicalization helps get rid of affine.grayboxes. As such,
   it's useful to have non-memref SSA values implicitly captured rather than
   isolated.

* **Walkers**

   There has to be a new walkAffine method that will walk everything except
   regions of an affine.graybox. Most polyhedral/affine passes will see
   *affine.graybox* as opaque *for any walks from above*.

   An affine pass's runOnFunction should be changed to run on that function as
   well as on every *affine.graybox* op in it. Unfortunately, these have to be
   processed sequentially, because the "declaration" of the *affine.graybox*
   and the "imperative" call to it are one and the same; the affine grayboxes
   could otherwise have been processed in parallel. In summary, there can be an
   AffineFunctionPass that only needs to implement a runOnOp(op), where op is
   either a FuncOp or an AffineGrayBoxOp.

   Some of the current polyhedral passes/utilities can continue using walk
   (e.g., [normalizeMemRefs](https://github.com/tensorflow/mlir/blob/331c663bd2735699267abcc850897aeaea8433eb/include/mlir/Transforms/Utils.h#L89)), while many will just have to be changed to use walkAffine.

* **Simplification / Canonicalization**

   There has to be a simplification that drops unused block arguments from
   regions of ops that aren't function ops (since this is easy for non function
   ops) - in case this isn't already in place. This will allow removal of dead
   memrefs that could otherwise be blocked by operand uses in affine.graybox ops
   with the corresponding region arguments not really having any uses inside.
   Given this, no additional bookkeeping is needed as a result of having memrefs
   as explicit operands for gray boxes. [MemRefCastFold](https://github.com/tensorflow/mlir/blob/ef77ad99a621985aeca1df94168efc9489de95b6/lib/Dialect/StandardOps/Ops.cpp#L228) is the only canonicalization pattern
   that the *affine.graybox* has to implement, and this is easily/cleanly done
   (by replacing the argument and its uses with a memref of a different type).
   Overall, having memrefs as explicit arguments is a good middle ground to
   make it easier to let standard SSA passes / scalar optimizations /
   canonicalizations work unhindered in conjunction with polyhedral passes, and
   with the latter not worrying about explicitly checking for escaping/hidden
   memref accesses. There is more discussion in the next section.
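
   As a sketch of that cast-fold pattern (the types here are illustrative), the
   cast folds by retyping the graybox operand and its region argument:

   ```mlir {.mlir}
   // Before: a memref_cast feeding an affine.graybox operand.
   %M = memref_cast %A : memref<100xf32> to memref<?xf32>
   affine.graybox [%M] : (memref<?xf32>) {
     // ... region argument of type memref<?xf32> used here ...
     return
   }

   // After folding: the cast is bypassed; the operand, the region argument,
   // and all uses inside are updated to the more specific type.
   affine.graybox [%A] : (memref<100xf32>) {
     // ... region argument of type memref<100xf32> used here ...
     return
   }
   ```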


*  There are situations/utilities where one can consistently perform
   rewriting/transformation/analysis cutting across grayboxes. One example is
   [normalizeMemRefs](https://github.com/tensorflow/mlir/blob/331c663bd2735699267abcc850897aeaea8433eb/include/mlir/Transforms/Utils.h#L89), which turns all non-identity layout maps into identity
   ones. Having memrefs explicitly captured is a hindrance here, but
   mlir::replaceAllMemrefUsesWith can be extended to transparently perform the
   replacement inside any affine grayboxes encountered if the caller says so.
   In other cases like scalar replacement, memref packing / explicit copying,
   DMA generation, pipelining of DMAs, transformations are supposed to be
   blocked by those boundaries because the accesses inside the graybox can't be
   meaningfully analyzed in the context of the surrounding code. As such, the
   memrefs there are treated as escaping / non-dereferencing.

*  In the presence of affine constructs, the inliner can now simply inline 
   functions by putting the callee inside an affine graybox, without having to 
   worry about symbol restrictions.
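
   For instance (a sketch; @callee and the types are made up), a call inside an
   affine.for could be inlined by wrapping the callee's body in a graybox, so
   that values that were valid symbols at the top level of @callee remain valid
   symbols at the top level of the graybox's region:

   ```mlir {.mlir}
   // Before inlining.
   affine.for %i = 0 to %n {
     call @callee(%A) : (memref<?xf32>) -> ()
   }

   // After inlining: the callee's body sits inside a graybox, so the
   // caller's symbol restrictions need not hold for it.
   affine.for %i = 0 to %n {
     affine.graybox [%A] : (memref<?xf32>) {
       // ... inlined body of @callee, with its argument bound to %A ...
       return
     }
   }
   ```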

* There has to be a mlir::getEnclosingAffineGrayBox(op) that returns the closest
  enclosing *affine.graybox* op or null if it hits a function op.


## Other Benefits and Implications

1. The introduction of this op allows arbitrary control flow (list of basic
   blocks with terminators) to be used within and mixed with affine.fors/ifs
   while staying in the same function. Such a list of blocks will be carried by
   an *affine.graybox* op whenever it's not at the top level.

2. Non-affine data accesses can now be represented through
   *affine.load/affine.store* without the need for outlining.

3. Symbol restrictions for affine constructs will no longer restrict inlining:
   any function can now be inlined into another by enclosing the just inlined
   function into a graybox.

4. Any access to a memref can be represented with an *affine.load/store*. This
   is quite useful for reusing existing passes more widely (e.g., to perform
   scalar replacement on affine accesses): there is no reason
   memref-dataflow-opt won't work on or shouldn't be reused on straight-line
   code, which is always a valid *affine.graybox* region (at the limit, the
   innermost loop body is a valid *affine.graybox* under all circumstances).

5. Any countable C-style 'for' without a break/continue can be represented as an
   affine.for (irrespective of bounds). Any if/else without a continue/break
   inside can be represented as an affine.if. The rest are just represented as a
   list of basic blocks with an enclosing *affine.graybox* if not at the
   top-level.

6. SSA scalar optimizations work across "affine boundaries" without having to
   be interprocedural.
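
   For example (a sketch), a constant defined in the enclosing function
   propagates into the graybox with no argument rewriting, since non-memref
   values are implicitly captured:

   ```mlir {.mlir}
   func @g(%A : memref<?xf32>) {
     %c = constant 100 : index
     affine.graybox [%A] : (memref<?xf32>) {
       // %c is implicitly captured: constant propagation into this region
       // needs none of the interprocedural machinery that the
       // outlined-function representation would require.
       affine.for %i = 0 to %c {
         %v = affine.load %A[%i] : memref<?xf32>
       }
       return
     }
     return
   }
   ```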

## Rationale and Design Alternatives - What to Capture as Arguments?

An alternative design is to allow all SSA values including memrefs to be
implicitly captured, i.e., zero operands and arguments for the op. This is
however inconvenient for all polyhedral transformations and analyses which will 
have to check and scan any affine.grayboxes encountered to see if any memrefs
are being used therein, and if so, they would most likely treat them as if the
memrefs were being passed to a function call. This would be the case with
dependence analysis, memref region computation/analysis, fusion, explicit
copying/packing, DMA generation, pipelining, scalar replacement and anything 
depending on the former analyses (like tiling, unroll and jam). Having memrefs
as explicit operands/arguments is a good middle ground to make it easier to let 
standard SSA passes / scalar optimization / canonicalization work
unhindered in conjunction with polyhedral passes, and with the latter not
worrying about  explicitly checking for escaping/hidden memref accesses.

Furthermore, a memref never folds to a constant anyway. The only
canonicalization currently related to a memref is the memref_cast fold, which
can easily be extended to fold with an *affine.graybox* op (update its
argument and all uses inside). As such, there aren't any cases where the
argument list has to be shrunk or grown from the outside. And for the cases
where the types have to be updated, it's straightforward, since there is
effectively only a single use of that op instance (it's not declarative or
"callable" from elsewhere like a FuncOp).

Another design point could be of requiring symbols associated with the
affine constructs used in a graybox, but defined outside, to be explicitly
listed as operands/arguments, in addition to the memrefs used. This makes
isValidSymbol really simple. One won't need isValidSymbol(Value \*v,
AffineGrayBoxOp op). Anything that is at the top-level of an *affine.graybox* op
or its region argument will become a valid symbol. However, other than this, it
doesn't simplify anything else. Instead, it adds and duplicates bookkeeping
with respect to the propagation of constants, similar to some extent to the
argument rewriting done for interprocedural constant propagation.
Similarly, the other extreme of requiring everything from the outside used in an
*affine.graybox* to be explicitly listed as its operands and region arguments is
even worse on this front.

In summary, it appears that the requirement to explicitly capture only the
memrefs inside an affine.graybox's region is a good middle ground and better
than other options.

Mehdi AMINI

unread,
Sep 28, 2019, 3:39:59 PM
to Uday Bondhugula, MLIR
Hi Uday,

Thanks! This is great overall.

I'm not convinced by the "FunctionLike" trait: in general, a function-like op defines a block of code that can be referenced by a symbol later.
We can likely express the kind of property you want, possibly with another trait; what exactly are you trying to achieve here?
affine.graybox not being isolated from above is the main reason we can't have a pass operating on these ops.

 
- the affine grayboxes could have
   otherwise been processed in parallel. In summary, there can be an
   AffineFunctionPass that needs to only implement an runOnOp(op) where op is
   either a FuncOp or an AffineGrayBoxOp.

   Some of the current polyhedral passes/utilities can continue using walk (for
   eg. [normalizeMemRefs](https://github.com/tensorflow/mlir/blob/331c663bd2735699267abcc850897aeaea8433eb/include/mlir/Transforms/Utils.h#L89), while many will just have to be changed to use walkAffine.

* **Simplification / Canonicalization**

   There has to be a simplification that drops unused block arguments from
   regions of ops that aren't function ops (since this is easy for non function
   ops) - in case this isn't already in place.

This would have to be a canonicalization on the op itself (here a canonicalization on the affine.graybox). 
(We can implement this as a trait, but that is just an implementation detail).

 
This will allow removal of dead
   memrefs that could otherwise be blocked by operand uses in affine.graybox ops
   with the corresponding region arguments not really having any uses inside.
   Given this, no additional bookkeeping is needed as a result of having memrefs
   as explicit operands for gray boxes. [MemRefCastFold](https://github.com/tensorflow/mlir/blob/ef77ad99a621985aeca1df94168efc9489de95b6/lib/Dialect/StandardOps/Ops.cpp#L228) is the only canonicalization pattern
   that the *affine.graybox* has to implement, and this is easily/cleanly done
   (by replacing the argument and its uses with a memref of a different type).
   Overall, having memrefs as explicit arguments is a good middle ground to
   make it easier to let standard SSA passes / scalar optimizations /
   canonicalizations work unhindered in conjunction with polyhedral passes, and
   with the latter not worrying about explicitly checking for escaping/hidden
   memref accesses. More discussion a little below in the next section.


*  There are situations/utilities where one can consistently perform
   rewriting/transformation/analysis cutting across grayboxes. One example is
   [normalizeMemRefs](https://github.com/tensorflow/mlir/blob/331c663bd2735699267abcc850897aeaea8433eb/include/mlir/Transforms/Utils.h#L89), which turns all non-identity layout maps into identity
   ones. Having memrefs explicitly captured is a hindrance here, but
   mlir::replaceAllMemrefUsesWith can be extended to transparently perform the
   replacement inside any affine grayboxes encountered if the caller says so.

It seems to me that this would have to be done with some sort of op-interface to avoid layering violation between core utilities (like replaceAllMemrefUsesWith) and a dialect specific Operation (affine.graybox).

 
   In other cases like scalar replacement, memref packing / explicit copying,
   DMA generation, pipelining of DMAs, transformations are supposed to be
   blocked by those boundaries because the accesses inside the graybox can't be
   meaningfully analyzed in the context of the surrounding code. As such, the
   memrefs there are treated as escaping / non-dereferencing.

*  In the presence of affine constructs, the inliner can now simply inline 
   functions by putting the callee inside an affine graybox, without having to 
   worry about symbol restrictions.

* There has to be a mlir::getEnclosingAffineGrayBox(op) that returns the closest
  enclosing *affine.graybox* op or null if it hits a function op.

It already exists in generic form as Operation::getParentOfType<AffineGrayboxOp>(): https://github.com/tensorflow/mlir/blob/master/include/mlir/IR/Operation.h#L134

 


## Other Benefits and Implications

1. The introduction of this op allows arbitrary control flow (list of basic
   blocks with terminators) to be used within and mixed with affine.fors/ifs
   while staying in the same function. Such a list of blocks will be carried by
   an *affine.graybox* op whenever it's not at the top level.

2. Non-affine data accesses can now be represented through
   *affine.load/affine.store* without the need for outlining.

3. Symbol restrictions for affine constructs will no longer restrict inlining:
   any function can now be inlined into another by enclosing the just inlined
   function into a graybox.

In particular, the "funcop" is just another operation: my take was already that any region held by a non-affine operation should be treated, from the perspective of affine, the same as one held by a func op.
Since the removal of MLFunction and the move to generic regions, there was really nothing specific left about FuncOp as far as I can tell; the doc was just blindly updated to replace MLFunction with Function.

This is something that we should clarify with respect to symbols: either we always force these to be defined inside an affine region (a region attached to an affine operation), in which case you have to use a graybox to materialize symbols through explicit capture, or we should consider that any SSA value defined in a non-affine region can be used as a symbol inside an affine region.

I'm referring to this section: https://github.com/tensorflow/mlir/blob/master/g3doc/Dialects/Affine.md#restrictions-on-dimensions-and-symbols ; which, after defining what an affine region is, I would update along the lines of:

> A symbolic identifier can be bound to an SSA value that is either:
> - defined in a non-affine region, 
> - defined in a region separated from the current region by an affine.graybox operation in the nesting structure (the value is "captured", possibly implicitly, by an affine.graybox)
> - the result of a constant operation,
> - or the result of an affine.apply operation that recursively takes as arguments any symbolic identifiers. 

It isn't clear to me why this would be costly though. This is something that can be cached easily in a map during analysis/transformation, and is trivial to compute without any ambiguity or loss of precision. The information is encoded structurally in the IR even with implicit capture. The explicit capture looks like "caching the result of an analysis in the IR itself" to me right now.
 

Furthermore, a memref never folds to a constant anyway. The only
canonicalization currently related to a memref is the memref_cast fold, which
can easily be extended to fold with an *affine.graybox* op (update its
argument and all uses inside). As such, there aren't any cases where the
argument list has to be shrunk or grown from the outside. And for the cases
where the types have to be updated, it's straightforward, since there is
effectively only a single use of that op instance (it's not declarative or
"callable" from elsewhere like a FuncOp).

Because this explicit capture is a pure "passthrough" (or am I missing something?), it acts as an inconvenience in any patterns though. You can't just blindly apply patterns or follow use-def chains across the affine.graybox boundary. This is the kind of restriction when doing explicit capture that should be more motivated. Our (short and recent) experience over the last few months developing other dialects is that so far implicit capture is in general more convenient and explicit capture should be motivated by the need to block otherwise problematic canonicalization/optimization across the region boundary.

 

Another design point could be of requiring symbols associated with the
affine constructs used in a graybox, but defined outside, to be explicitly
listed as operands/arguments, in addition to the memrefs used. This makes
isValidSymbol really simple. One won't need isValidSymbol(Value \*v,
AffineGrayBoxOp op). Anything that is at the top-level of an *affine.graybox* op
or its region argument will become a valid symbol. However, other than this, it
doesn't simplify anything else. Instead, it adds/duplicates a
bookkeeping with respect to propagation of constants, similar, to some
extent, to the argument rewriting done for interprocedural constant propagation.
Similarly, the other extreme of requiring everything from the outside used in an
*affine.graybox* to be explicitly listed as its operands and region arguments is
even worse on this front.

In summary, it appears that the requirement to explicitly capture only the
memrefs inside an affine.graybox's region is a good middle ground and better
than other options.

--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/b229eeaf-2a53-4df3-a690-2ef7f0946232%40tensorflow.org.

Uday Bondhugula

unread,
Sep 28, 2019, 11:39:24 PM
to MLIR
Hi Mehdi,

Thanks very much for your quick and detailed response! I've updated and incorporated some of the obvious comments/fixes, and provided better explanations in other places. I'll wait for other feedback before responding inline in detail. In short, I want to emphasize that although the op is not marked "IsolatedFromAbove", it is *effectively isolated from above for polyhedral/affine purposes*, since the walkAffine used by such passes treats affine.graybox ops opaquely, and because all memrefs from the outside used inside a graybox op have to appear as its operands and arguments. (Regarding the FunctionLike op trait, nothing crucial here: I just wanted to inherit most of its accessors due to the properties it shares with FuncOp. What you mention below as the defining property of FunctionLike doesn't appear in the bulleted list of its doc comment, and it isn't obvious from reading its methods. It would be great to either document this in g3doc or augment the doc comment. I was interested in inheriting the accessors stemming from the 2nd, 4th, and 5th properties in the doc comment list on FunctionLike.)

I'd like to request anyone interested to please only use the GitHub version for further reference (it's significantly updated from, and better than, the one I had originally pasted here, and it has revision history):

Thanks again,
Uday

Mehdi AMINI

unread,
Sep 29, 2019, 12:13:04 AM
to Uday Bondhugula, MLIR
On Sat, Sep 28, 2019 at 8:39 PM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
Hi Mehdi,

Thanks very for your quick and detailed response! I've updated and incorporated some of the obvious comments/fixes, and provided better explanation at other places. I'll wait for other feedback before responding inline in detail. In short, I want to emphasize that although the op is not marked "IsolatedFromAbove", it is *effectively isolated from above for polyhedral/affine purposes* since walkAffine used by such passes treats affine.graybox ops opaquely and because all memrefs from the outside used inside a graybox op have to appear as its operands and arguments.

I'd still want to see a better motivation for this. It isn't clear to me at the moment that not having explicit capture of memref would be a real issue in practice (I don't see my previous comment addressed from this point of view: you're duplicating information that is trivially gathered in the IR).
 
(Reg. FunctionLike op trait, nothing crucial here: I just wanted to inherit most of its accessors due to the properties it shares with FuncOp. What you mention below as the defining property of FunctionLike doesn't appear in the bulletted list of its doc comment, and it isn't obvious from reading its methods. It would great to either document this in g3doc or augment the doc comment. I was interesting in inheriting accessors stemming from the 2nd, 4th, and 5th property from the doc comment list on FunctionLike). 

The bullet points come as a whole and you can't cherry-pick, which is why I asked exactly what you're after in order to refactor it / split it to expose what makes sense.
The first bullet point is "Ops can be used with SymbolTable in the parent Op and have names", and the 3rd bullet point is "the absence of a region corresponds to an external function"; both seem problematic to me.

Uday Bondhugula

unread,
Sep 29, 2019, 1:01:31 AM
to MLIR


On Sunday, September 29, 2019 at 9:43:04 AM UTC+5:30, Mehdi AMINI wrote:


On Sat, Sep 28, 2019 at 8:39 PM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
Hi Mehdi,

Thanks very for your quick and detailed response! I've updated and incorporated some of the obvious comments/fixes, and provided better explanation at other places. I'll wait for other feedback before responding inline in detail. In short, I want to emphasize that although the op is not marked "IsolatedFromAbove", it is *effectively isolated from above for polyhedral/affine purposes* since walkAffine used by such passes treats affine.graybox ops opaquely and because all memrefs from the outside used inside a graybox op have to appear as its operands and arguments.

I'd still want to see a better motivation for this. It isn't clear to me at the moment that not having explicit capture of memref would be a real issue in practice (I don't see my previous comment addressed from this point of view: you're duplicating information that is trivially gathered in the IR).

This is exactly what the first two paragraphs under "Rationale and Design Alternatives" discuss in detail. They are basically arguing that:

(a) having to inspect, scan, and gather memrefs accessed within grayboxes from above, in all affine passes, is not worth the special casing needed just for the affine.graybox op. As an example, consider a pass that's computing memref regions and generating packing code for a memref. The scan of uses that currently happens via methods like getUses() and replaceAllMemRefUsesWith() will all just work transparently and do the work: the non-dereferencing uses of that memref on an affine.graybox op just make things like double buffering, data copy generation, etc. bail out on those (just because it isn't polyhedrally analyzable unless the graybox can be eliminated and you get a larger encompassing affine region), the same way they currently bail out on any call ops taking memrefs as arguments or return ops returning memrefs. The same is true for memref dependence analysis: there isn't a way to represent dependences between an affine access and another one that is inside a graybox dominated by it; for all these purposes, the latter access is like one happening on a memref that has escaped at the graybox op boundary. With explicit capture of memrefs, an affine graybox gets treated as any other op (e.g., like a 'call' op that takes a memref as an operand), and so an affine pass doesn't even have to know that the affine.graybox exists, and one won't even have to change a line of code in any of the passes: that's the upshot! walkAffine simply won't walk their regions, and "operand uses" consistently have all that the affine pass needs for *every* op.

The isolation of a graybox's region for polyhedral passes running from above is necessary (this is why it's not a white box), and to do so cleanly, we need to explicitly capture memrefs used inside as operand uses on the graybox.

(b) the bookkeeping needed to maintain the memref arguments is trivial, and there is barely any benefit to having them implicitly captured (unlike scalar SSA values that do not refer to memory, where the tradeoffs are different and so we implicitly capture), and it's definitely not worth the trouble of the special casing/handling described in (a). 

I can augment the rationale in those paras in case these weren't already clear. 

~ Uday


 

Uday Bondhugula

unread,
Sep 29, 2019, 1:04:14 AM
to MLIR


The bullet points come as a whole and you can't cherry-pick, which is why I asked exactly what you're after in order to refactor it / split it to expose what makes sense.
The first bullet point is "Ops can be used with SymbolTable in the parent Op and have names", and the 3rd bullet point "the absence of a region corresonds to an external function" seem problematic to me.

I see - got it, thanks! (Dropping that line from the proposal.)

~ Uday 

Mehdi AMINI

unread,
Sep 29, 2019, 2:13:54 PM
to Uday Bondhugula, MLIR
On Sat, Sep 28, 2019 at 10:01 PM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:


On Sunday, September 29, 2019 at 9:43:04 AM UTC+5:30, Mehdi AMINI wrote:


On Sat, Sep 28, 2019 at 8:39 PM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
Hi Mehdi,

Thanks very for your quick and detailed response! I've updated and incorporated some of the obvious comments/fixes, and provided better explanation at other places. I'll wait for other feedback before responding inline in detail. In short, I want to emphasize that although the op is not marked "IsolatedFromAbove", it is *effectively isolated from above for polyhedral/affine purposes* since walkAffine used by such passes treats affine.graybox ops opaquely and because all memrefs from the outside used inside a graybox op have to appear as its operands and arguments.

I'd still want to see a better motivation for this. It isn't clear to me at the moment that not having explicit capture of memref would be a real issue in practice (I don't see my previous comment addressed from this point of view: you're duplicating information that is trivially gathered in the IR).

This is exactly what the first two paragraphs under "Rationale and Design Alternatives" discuss in detail. They are basically arguing that:

(a) having to inspect, scan, gather memrefs accessed within grayboxes from above within all affine passes is not worth the special casing needed just for the affine.graybox op

"is not worth" is just a judgement call you're making, and it isn't clear to me what leads to this conclusion.
 
. As an example, consider a pass that's computing memref regions and generating packing code for a memref. The scan of uses that currently happens via methods like getUses() and replaceAllMemRefUsesWith() will all just work transparently and do the work: the non-dereferencing uses of that memref on an affine.graybox op just make things like double buffering, data copy generation, etc. bail out on those (because it isn't polyhedrally analyzable unless the graybox can be eliminated and you get a larger encompassing affine region)

That is probably the part that wasn't clear to me actually from reading the RFC. From the outside you actually want a blackbox more than a graybox. My mental model so far was that the main purpose of the graybox was to create the possibility to have symbol when analyzed from the inside. 
That said, the extra check needed in replaceAllMemRefUsesWith() to account for uses in a graybox does not seem very difficult; there can be more cumbersome cases though, I'm not sure.
 

-- the same way they currently bail out on any call ops taking memrefs as arguments or return ops returning memrefs.  The same is true for memref dependence analysis: there isn't a way to represent dependences between an affine access and another one that is inside another graybox dominated by it - for all these purposes, the latter access is like one happening on a memref that has escaped at the graybox op boundary. With explicit capture of memref's, an affine graybox gets treated as any other op (for eg. like a 'call' op that takes memref as an operand), and so for an affine pass, you don't even have to know that the affine.graybox exists, and one won't even have to change a line of code in any of the passes: that's the upshot! walkAffine simply won't walk their regions, and "operand uses" consistently have all that the affine pass needs for *every* op.
 
The isolation of a graybox's region for polyhedral passes running from above is necessary (this is why it's not a white box), and to do so cleanly, we need to explicitly capture memrefs used inside as operand uses on the graybox.

(b) the bookkeeping needed to maintain the memref arguments is trivial, and there is barely any benefit to having them implicitly captured (unlike scalar SSA values that do not refer to memory, where the tradeoffs are different and so we implicitly capture), and it's definitely not worth the trouble of the special casing/handling described in (a). 

I can augment the rationale in those paras in case these weren't already clear. 

Thanks, that would be great. Also, if you can provide clearer examples as well to really make an argument supporting your assessment that it is "definitely not worth the trouble of the special casing/handling", it would be nice. When reading the doc (especially the first version) it felt to me that you were jumping to a conclusion based on your experience, and for people that don't have this experience (so most people really) it is hard to see all the implications if you don't go the extra mile to help :)

Something also that isn't totally clear to me is if you always want the graybox to be fully opaque from above: the graybox may be needed because of a symbol needed for a small subset of the IR contained in the graybox for example (after inlining). This may be rare enough that it wouldn't be an issue in practice?

Thanks,

-- 
Mehdi



Uday Bondhugula

unread,
Sep 30, 2019, 5:31:26 AM9/30/19
to MLIR
Hi Mehdi,

Comments / responses on your first post inline below. 


On Sunday, September 29, 2019 at 1:09:59 AM UTC+5:30, Mehdi AMINI wrote:
Hi Uday,

Thanks! This is great overall

On Sat, Sep 28, 2019 at 8:28 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
Hi all,

1. The *affine.graybox* op has zero results, zero or more operands, and holds
   a single region, which is a list of zero or more blocks. The op's region can
   have zero or more arguments, each of which can only be a *memref*. The
   operands bind 1:1 to its region's arguments.  The op can't use any memrefs
   defined outside of it, but can use any other SSA values that dominate it. Its
   region's blocks can have terminators the same way as current MLIR functions 
   (FuncOp) can. Control from any *return* ops in its region returns to right 
   after the *affine.graybox* op.  The op will have the trait *FunctionLike*.

I'm not convinced by the "FunctionLike" traits: in general a function-like defines a block of code that can be referenced by a symbol later.
We can likely express the kind of property you want, possibly with another traits, what are you trying to achieve here? 

I think this has been cleared up - I have removed the reference to this. 


 

* **Simplification / Canonicalization**

   There has to be a simplification that drops unused block arguments from
   regions of ops that aren't function ops (since this is easy for non function
   ops) - in case this isn't already in place.

This would have to be a canonicalization on the op itself (here a canonicalization on the affine.graybox). 
(We can implement this as a trait, but that is just an implementation detail).

You are right. In addition, other ops that have a region will benefit from such canonicalization - so we could have something generic to reuse for this. Instead of a trait, we could just register the same canonicalization pattern on multiple ops (just like MemRefCastFolder is registered on 6-7 ops); so eventually, this isn't something specially needed just for affine.graybox.

 

 
This will allow removal of dead
   memrefs that could otherwise be blocked by operand uses in affine.graybox ops
   with the corresponding region arguments not really having any uses inside.
   Given this, no additional bookkeeping is needed as a result of having memrefs
   as explicit operands for gray boxes. [MemRefCastFold](https://github.com/tensorflow/mlir/blob/ef77ad99a621985aeca1df94168efc9489de95b6/lib/Dialect/StandardOps/Ops.cpp#L228) is the only canonicalization pattern
   that the *affine.graybox* has to implement, and this is easily/cleanly done
   (by replacing the argument and its uses with a memref of a different type).
   Overall, having memrefs as explicit arguments is a good middle ground to
   make it easier to let standard SSA passes / scalar optimizations /
   canonicalizations work unhindered in conjunction with polyhedral passes, and
   with the latter not worrying about explicitly checking for escaping/hidden
   memref accesses. More discussion a little below in the next section.
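As a sketch of what that one fold could look like (hypothetical graybox syntax, not from the RFC; it mirrors how memref_cast already folds into other ops):

```mlir
// Before: a memref_cast result feeds the graybox operand.
%1 = memref_cast %0 : memref<8xf32> to memref<?xf32>
affine.graybox [%1] : (memref<?xf32>) {
^bb0(%m : memref<?xf32>):
  %x = affine.load %m[0] : memref<?xf32>
  return
}

// After folding: the original memref is used directly; the region argument
// and its uses inside are updated to the more precise type.
affine.graybox [%0] : (memref<8xf32>) {
^bb0(%m : memref<8xf32>):
  %x = affine.load %m[0] : memref<8xf32>
  return
}
```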


*  There are situations/utilities where one can consistently perform
   rewriting/transformation/analysis cutting across grayboxes. One example is
   [normalizeMemRefs](https://github.com/tensorflow/mlir/blob/331c663bd2735699267abcc850897aeaea8433eb/include/mlir/Transforms/Utils.h#L89), which turns all non-identity layout maps into identity
   ones. Having memrefs explicitly captured is a hindrance here, but
   mlir::replaceAllMemrefUsesWith can be extended to transparently perform the
   replacement inside any affine grayboxes encountered if the caller says so.

It seems to me that this would have to be done with some sort of op-interface to avoid layering violation between core utilities (like replaceAllMemrefUsesWith) and a dialect specific Operation (affine.graybox).

That's right. Currently, replaceAllMemRefUsesWith is part of Transforms, and it composes maps directly into the affine ops; so, it does know about affine ops, and Transforms/ depends on the affine dialect. So we should be fine layering-wise for now, but as you point out, if/when Transforms and this utility are freed from knowing about affine ops, we'll need to use an op-interface. 
 

In particular, the "funcop" is just another operation: my take was already that any region held by a non-affine operation should be treated the same as one held by a func-op from the perspective of affine.
Since the removal of MLFunc and the use of generic regions, there was really nothing specific left about FuncOp as far as I can tell, but the doc was blindly updated to replace MLFunction with Function.

This is something that we should clarify with respect to symbols: either we always force these to be defined inside an affine region (a region attached to an affine operation), in which case you have to use a graybox to materialize symbols through explicit capture, or we should consider that any SSA value defined in a non-affine region can be used as a symbol inside an affine region.

I'm referring to this section: https://github.com/tensorflow/mlir/blob/master/g3doc/Dialects/Affine.md#restrictions-on-dimensions-and-symbols ; which after defining what an affine region is, I would update along the line of :

> A symbolic identifier can be bound to an SSA value that is either:
> - defined in a non-affine region, 
> - defined in a region separated from the current region by an affine.graybox operation in the nesting structure (the value is "captured", possibly implicitly, by an affine.graybox)
> - the result of a constant operation,
> - or the result of an affine.apply operation that recursively takes as arguments any symbolic identifiers. 

This sounds accurate to me except for some of the terminology. "non-affine region" is confusing - because there isn't any such thing as a non-affine region! Every op is part of some affine region, either of the closest enclosing affine.graybox op or of the func op. The "affine region" term is usable only to demarcate such a region (anything above the closest enclosing affine.graybox and anything inside the graybox ops that are in this region aren't part of the region). IIUC, your first bullet should just be replaced by:

- defined at the top level of a function or of an affine graybox op

(just like for a function, the top level of an affine graybox is the top part of its region's entry block)
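A sketch of this symbol rule (hypothetical graybox syntax): a value defined at the top level of a graybox's region becomes a valid symbol inside it, even though it is defined under a surrounding loop where it could not be a symbol:

```mlir
func @symbol_at_graybox_top(%A : memref<?xf32>) {
  affine.for %i = 0 to 100 {
    affine.graybox [%A] : (memref<?xf32>) {
    ^bb0(%rA : memref<?xf32>):
      // Top level of the graybox's region: %n is a valid symbol here and
      // can appear in affine.for bounds, although it is defined under %i.
      %n = dim %rA, 0 : memref<?xf32>
      affine.for %j = 0 to %n {
        %x = affine.load %rA[%j] : memref<?xf32>
      }
      return
    }
  }
  return
}
```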

 

## Rationale and Design Alternatives - What to Capture as Arguments?

An alternative design is to allow all SSA values including memrefs to be
implicitly captured, i.e., zero operands and arguments for the op. This is
however inconvenient for all polyhedral transformations and analyses which will 
have to check and scan any affine.grayboxes encountered to see if any memrefs
are being used therein, and if so, they would most likely treat them as if the
memrefs were being passed to a function call. This would be the case with
dependence analysis, memref region computation/analysis, fusion, explicit
copying/packing, DMA generation, pipelining, scalar replacement and anything 
depending on the former analyses (like tiling, unroll and jam). Having memrefs
as explicit operands/arguments is a good middle ground to make it easier to let 
standard SSA passes / scalar optimization / canonicalization work
unhindered in conjunction with polyhedral passes, and with the latter not
worrying about  explicitly checking for escaping/hidden memref accesses.

It isn't clear to me why this would be costly though? This is something that can be cached easily during analysis/transformation in a map, and is trivial to compute without any ambiguity or loss of precision. The information is encoded structurally in the IR even with implicit capture. The explicit capture looks like "caching the result of an analysis in the IR itself" to me right now.

I'm working on addressing this better in the RFC.


 

Furthermore, a memref anyway never folds to a constant. The only
canonicalization related to a memref currently is a memref_cast fold,
which can easily be extended to fold with an *affine.graybox* op (update its
argument and all uses inside).  As such, there aren't any cases where the
argument list has to be shrunk/grown from the outside. And for the cases where
the types have to be updated, it's straightforward since there is sort of only a
single use for that op instance (it's not declarative or "callable" from
elsewhere like a FuncOp).

Because this explicit capture is a pure "passthrough" (or am I missing something?), it acts as an inconvenience in any patterns though. You can't just blindly apply patterns or follow use-def chains across the affine.graybox boundary. This is the kind of restriction when doing explicit capture that should be more motivated. Our (short and recent) experience over the last few months developing other dialects is that so far implicit capture is in general more convenient and explicit capture should be motivated by the need to block otherwise problematic canonicalization/optimization across the region boundary.

This isn't actually motivated by blocking canonicalization patterns across region boundaries. There are more than pattern rewrites that are going on here :-). None of the polyhedral passes are actually written as pattern rewrites. As I mentioned, I'll work on explaining this better in the RFC. Your questions are very valid - thanks! 

~ Uday

Alex Zinenko

unread,
Sep 30, 2019, 11:28:06 AM9/30/19
to Uday Bondhugula, MLIR
Thanks for the proposal, it is very similar to what I had in mind and what we experimented originally.

I share Mehdi's analysis about explicit capture of memrefs. Similarly, to me, the idea of a greybox is simply to make some SSA values bindable to symbols. It should be relatively straightforward to filter the uses of a memref inside a given region by checking if the region is an ancestor of the use's owner op. Regarding the affine passes, my thinking was that many of them should operate on a region regardless of it being a function body or a greybox. In a sense, a function is-a greybox, rather than a greybox being a function-like region. This can be pushed further to drop the special treatment for functions in affine passes and require a greybox as a top-level operation in a function in order to make SSA values bindable to symbols.

There is also a deeper concern about the implicit notion of "affine regions". In the same way there is no "non-affine" region, there also is no affine region.  In fact, there is no "whatever" region because regions don't have semantics outside of the enclosing op (they did in the first iteration of the regions proposal, that used affine regions and greyboxes as an example). Unless we explicitly restrict the bodies of affine.for and affine.if to only contain a set of well-behaved, registered ops, we have no control over the affine-ness of the regions enclosed by these ops.  One can have

func @foo(%m : memref<...>) {
  affine.for %i = 0 to 42 {
    %0 = "crazy.op"() ({
      %1 = "crazy.load"(%m) : (memref<...>) -> index
      %2 = "crazy.more_load"(%m, %1) : (memref<...>, index) -> f32
      "crazy.yield"(%2) : (f32) -> ()
    }) : () -> (f32)
  }
}

where we have no idea about the effect of "crazy.op" or any of its nested ops on the memrefs it is allowed to capture implicitly.  Without restricting the set of ops supported inside `affine.for`, we should treat all unknown ops conservatively.  "affine.greybox" wouldn't be any different from "crazy.op" in this sense, so it does not make sense to me why it would get some special handling.

--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.


--
-- Alex

Uday Bondhugula

unread,
Sep 30, 2019, 11:34:21 AM9/30/19
to MLIR
Hi Alex,

It looks like our updates might have crossed each other. I just pushed a large update to the RFC. I had earlier missed the fact that an affine.graybox does return results (matching with the returns from its blocks). I've also added a much more detailed discussion on the trade-offs on implicit vs explicit captures to the rationale section. It's still evolving though.

Uday Bondhugula

unread,
Sep 30, 2019, 11:46:00 AM9/30/19
to MLIR
Hi Alex,

On Monday, September 30, 2019 at 8:58:06 PM UTC+5:30, Alex Zinenko wrote:

func @foo(%m : memref<...>) {
  affine.for %i = 0 to 42 {
    %0 = "crazy.op"() ({
      %1 = "crazy.load"(%m) : (memref<...>) -> index
      %2 = "crazy.more_load"(%m, %1) : (memref<...>, index) -> f32
      "crazy.yield"(%2) : (f32) -> ()
    }) : () -> (f32)
  }
}

where we have no idea about the effect of "crazy.op" or any of its nested ops on the memrefs it is allowed to

Code like this is not an issue and will just work correctly, both the way things are now and with the affine.graybox proposal. In the above example, the '%m' operand uses on the "crazy.*" ops will be seen
as non-dereferencing uses of %m, and all affine utilities/passes will correctly bail out on / act conservatively on those as far as that memref goes. In effect, the ops where you are using a memref as an operand (but not dereferencing it like in affine.load, affine.store, affine.dma_start/wait) are all similar to, say, a call where %m is an operand. There is no need to separately/specially restrict what's inside affine.for or affine.if - they all get treated conservatively for the parts that have to be. However, if you want to do something special/advanced that is less conservative, we'll have to think about what that is depending on the use case.

~ Uday

Uday Bondhugula

unread,
Sep 30, 2019, 12:16:23 PM9/30/19
to MLIR
Only if affine.graybox explicitly captures the memref would it be no different from "crazy.op". The special handling argument in the proposal is for the design point where an affine.graybox implicitly captures the memrefs used inside. In your case, %m is an operand on "crazy.load", and not an implicitly used memref in its region. 

~ Uday

Mehdi AMINI

unread,
Sep 30, 2019, 11:13:07 PM9/30/19
to Uday Bondhugula, MLIR
Hi Uday,

Thanks for your answers, appreciate it.



On Mon, Sep 30, 2019 at 2:31 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
On Sunday, September 29, 2019 at 1:09:59 AM UTC+5:30, Mehdi AMINI wrote: 
This is something that we should clarify with respect to symbols: either we always force these to be defined inside an affine region (a region attached to an affine operation), in which case you have to use a graybox to materialize symbols through explicit capture, or we should consider that any SSA value defined in a non-affine region can be used as a symbol inside an affine region.

I'm referring to this section: https://github.com/tensorflow/mlir/blob/master/g3doc/Dialects/Affine.md#restrictions-on-dimensions-and-symbols; which after defining what an affine region is, I would update along the line of :

> A symbolic identifier can be bound to an SSA value that is either:
> - defined in a non-affine region, 
> - defined in a region separated from the current region by an affine.graybox operation in the nesting structure (the value is "captured", possibly implicitly, by an affine.graybox)
> - the result of a constant operation,
> - or the result of an affine.apply operation that recursively takes as arguments any symbolic identifiers. 

This sounds accurate to me except for some of the terminology. "non-affine region" is confusing - because there isn't any such thing as a non-affine region! Every op is part of some affine region, either of the closest enclosing affine.graybox op or of the func op. The "affine region" term is usable only to demarcate such a region (anything above the closest enclosing affine.graybox and anything inside the graybox ops that are in this region aren't part of the region).


Sorry for being sloppy with terminology here :)
I had defined (my first sentence above): "an affine region (a region attached to an affine operation)". My idea was that you could bind a symbol whenever the symbol is defined in a region that isn't attached to an affine operation. For example

spv.func @foo() {
  not_affine.async_launch() {
      %symbol = ....
       affine.for .... {
            // here %symbol can be bound as a symbol since it is defined in a region not attached to an affine operation.
       }
  }
}

 
IIUC, your first bullet should just be replaced by:

- defined at the top level of a function or of an affine graybox op
(just like for a function, the top level of an affine graybox is the top part of its region's entry block)

I'm not comfortable referring to "function" here: we don't have "first class" functions anymore in MLIR (the SpirV dialect is using a different operation for representing functions, and GPU kernels would likely do the same).
Any region attached to an operation that isn't affine should be considered identical to a function body if it encloses an affine operation (cf. my example above).
This is the language I was trying to get at with "non-affine region", of course I'm fine with any terminology that would represent the same concept :)

 
It isn't clear to me that we have the same reading of the code here: inside the affine.for you have "crazy.op", which is defining a new enclosing region, just like affine.graybox would.
Inside this region we have a use of a memref in "crazy.load", this is an implicit capture for the "crazy.op" and does not seem different to me than if affine.graybox was allowing to implicit capture memrefs.
Can you help clarify here?

Thanks,

-- 
Mehdi



Uday Bondhugula

unread,
Oct 1, 2019, 3:43:44 AM10/1/19
to MLIR


On Tuesday, October 1, 2019 at 8:43:07 AM UTC+5:30, Mehdi AMINI wrote:
Hi Uday,

Thanks for your answers, appreciate it.



On Mon, Sep 30, 2019 at 2:31 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
On Sunday, September 29, 2019 at 1:09:59 AM UTC+5:30, Mehdi AMINI wrote: 
This is something that we should clarify with respect to symbols: either we always force these to be defined inside an affine region (a region attached to an affine operation), in which case you have to use a graybox to materialize symbols through explicit capture, or we should consider that any SSA value defined in a non-affine region can be used as a symbol inside an affine region.

I'm referring to this section: https://github.com/tensorflow/mlir/blob/master/g3doc/Dialects/Affine.md#restrictions-on-dimensions-and-symbols; which after defining what an affine region is, I would update along the line of :

> A symbolic identifier can be bound to an SSA value that is either:
> - defined in a non-affine region, 
> - defined in a region separated from the current region by an affine.graybox operation in the nesting structure (the value is "captured", possibly implicitly, by an affine.graybox)
> - the result of a constant operation,
> - or the result of an affine.apply operation that recursively takes as arguments any symbolic identifiers. 

This sounds accurate to me except for some of the terminology. "non-affine region" is confusing - because there isn't any such thing as a non-affine region! Every op is part of some affine region, either of the closest enclosing affine.graybox op or of the func op. The "affine region" term is usable only to demarcate such a region (anything above the closest enclosing affine.graybox and anything inside the graybox ops that are in this region aren't part of the region).


Sorry for being sloppy with terminology here :)
I had defined (my first sentence above): "an affine region (a region attached to an affine operation)". My idea was that you could bind a symbol whenever the symbol is defined in a region that isn't attached to an affine operation. For example

This doesn't really cover it all. In Example 2 and Example 3 of the RFC, respectively, %pow and %v are valid symbols that are defined inside regions held by affine operations. They become symbols for the inner affine regions (for the affine.graybox ops nested inside), not for the one they are contained in. Second, a symbol defined at the top level of an affine.graybox is also valid for that graybox's region (just like the current rule for the top level of a function op).  I've added a terminology section to the RFC - we probably need another term instead of overloading 'region'.



spv.func @foo() {
  not_affine.async_launch() {
      %symbol = ....
       affine.for .... {
            // here %symbol can be bound as a symbol since it is defined in a region not attached to an affine operation.
       }
  }
}

 
IIUC, your first bullet should just be replaced by:

- defined at the top level of a function or of an affine graybox op
(just like for a function, the top level of an affine graybox is the top part of its region's entry block)

I'm not comfortable referring to "function" here: we don't have "first class" functions anymore in MLIR (the SpirV dialect is using a different operation for representing

By "function", I just meant the function op here ('FuncOp'): just didn't want to use the class name.
 
functions, and GPU kernel would likely do the same).
Any region attached to an operation that isn't affine should be considered identical to a function body if it encloses an affine operation (cf. my example above).
This is the language I was trying to get at with "non-affine region", of course I'm fine with any terminology that would represent the same concept :)

 
On Mon, Sep 30, 2019 at 9:16 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:


On Monday, September 30, 2019 at 9:16:00 PM UTC+5:30, Uday Bondhugula wrote:
Hi Alex,

On Monday, September 30, 2019 at 8:58:06 PM UTC+5:30, Alex Zinenko wrote:

func @foo(%m : memref<...>) {
  affine.for %i = 0 to 42 {
    %0 = "crazy.op"() ({
      %1 = "crazy.load"(%m) : (memref<...>) -> index
      %2 = "crazy.more_load"(%m, %1) : (memref<...>, index) -> f32
      "crazy.yield"(%2) : (f32) -> ()
    }) : () -> (f32)
  }
}

where we have no idea about the effect of "crazy.op" or any of its nested ops on the memrefs it is allowed to

Code like this is not an issue and will just work correctly, both the way things are now and with the affine.graybox proposal. In the above example, the '%m' operand uses on the "crazy.*" ops will be seen
as non-dereferencing uses of %m, and all affine utilities/passes will correctly bail out on / act conservatively on those as far as that memref goes. In effect, the ops where you are using a memref as an operand (but not dereferencing it like in affine.load, affine.store, affine.dma_start/wait) are all similar to, say, a call where %m is an operand. There is no need to separately/specially restrict what's inside affine.for or affine.if - they all get treated conservatively for the parts that have to be. However, if you want to do something special/advanced that is less conservative, we'll have to think about what that is depending on the use case.

~ Uday

 
capture implicitly.  Without restricting the set of ops supported inside `affine.for`, we should treat all unknown ops conservatively.  "affine.greybox" wouldn't be any different from "crazy.op" in this sense, so it does not

Only if affine.graybox explicitly captures the memref would it be no different from "crazy.op". The special handling argument in the proposal is for the design point where an affine.graybox implicitly captures the memrefs used inside. In your case, %m is an operand on "crazy.load", and not an implicitly used memref in its region. 

It isn't clear to me that we have the same reading of the code here: inside the affine.for you have "crazy.op", which is defining a new enclosing region, just like affine.graybox would.
Inside this region we have a use of a memref in "crazy.load", this is an implicit capture for the "crazy.op" and does not seem different to me than if affine.graybox was allowing to implicit capture memrefs.
Can you help clarify here?

Affine passes will not want to normally walk through the affine.graybox op (since the symbol list of its region is different than that of the affine region being walked/analyzed/transformed). This is why one would want to explicitly capture %m if "crazy.op" had been affine.graybox. In this case, "crazy.op" isn't defining a symbol for itself and dereferencing the memref. Consider the case where you dereference %m inside, i.e., you have the following as well in the crazy.op's region:

    %idx = affine.load %I[...]
     ...     = affine.load %m [ %idx ] 

This will fail IR verification with "crazy.op", but succeed with affine.graybox. And if you let the walk from the top get inside the graybox op, you'll be incorrectly treating the affine.load on %m as affine, because it's not really affine for the outer region (%idx is not a symbol for the outer region) - it's conceptually an indirect access!
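Spelled out a bit more (hypothetical graybox syntax; the memrefs %m and %I and their shapes are made up for illustration):

```mlir
func @indirect(%m : memref<100xf32>, %I : memref<100xindex>) {
  affine.for %i = 0 to 100 {
    affine.graybox [%m, %I] : (memref<100xf32>, memref<100xindex>) {
    ^bb0(%rm : memref<100xf32>, %rI : memref<100xindex>):
      // %idx is defined at the top level of the graybox's region, so it is
      // a valid symbol inside the graybox, and the second load verifies.
      %idx = affine.load %rI[0] : memref<100xindex>
      // Seen from the enclosing affine.for, however, this is an indirect,
      // non-affine access to %m - which is why the walk must stop at the
      // graybox boundary rather than treat this load as affine.
      %v = affine.load %rm[%idx] : memref<100xf32>
      return
    }
  }
  return
}
```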

~ Uday
    
 

Thanks,

-- 
Mehdi



Alex Zinenko

unread,
Oct 1, 2019, 8:08:12 AM10/1/19
to Uday Bondhugula, MLIR
It feels like there are multiple conflated points here.
1. Ability to use a value as a symbol
Indeed, we need a way to make values suitable for use as symbols. Supporting values defined immediately within the `affine.greybox` region sounds good and mostly uncontroversial.

2. Stopping the pre-order traversal
In the pre-order (aka going from the top) traversal of ops, one may want to ignore certain regions attached to the op, or treat them differently.  This sounds like a reusable IR traversal pattern that can be parameterized by a condition functor used to analyze the op and decide whether to enter its regions. This does not have to be tied to greybox or affine.

3. Conservative analysis through implicit capture
Affine passes would treat any unknown op conservatively. This means that an unknown op with regions that implicitly captures any values is treated by those passes as an opaque function call that takes all implicitly captured values as arguments. This means that the passes should compute the set of implicitly captured values for unknown ops. I don't see what benefits we get by avoiding this computation for `affine.greybox` at the expense of explicit capture.  Yes, we don't have to traverse the region attached to the greybox, but it sounds like we just use the IR as a medium for what looks like a (premature) performance optimization, as we would have to traverse all other types of regions. While in manually constructed examples you would only get greyboxes and loops, it is unclear that this would be the case in a larger setting. 
 
--
-- Alex

Uday Bondhugula

unread,
Oct 1, 2019, 8:42:04 AM10/1/19
to MLIR
Just responding to these parts since I agree with the rest.

> 3. Conservative analysis through implicit capture
> Affine passes would treat any unknown op conservatively. This means
> that
> an unknown op with regions that implicitly captures any values is
> treated by those passes as an opaque function call that takes all
> implicitly captured values as arguments. This means that the passes
> should compute the set of implicitly captured values for unknown ops.

Actually, the answer is "No" here - they should not! For the unknown ops, why would you want to compute the set of implicitly captured values when you are going to walk through the op anyway?! The walkAffine method in the proposal walks through the inner regions of *all* region-holding ops except the affine.graybox. And if you don't want to block an inner traversal, there is neither a need to explicitly capture nor a need to compute on the fly! It'd just work as is with the walk. The question really isn't about a "known" vs "unknown" op here, but about an "op within the current polyhedral symbol context" vs an "op outside the current polyhedral symbol context". affine.graybox has its own symbol context, and so it falls into the latter class. In your earlier example, "crazy.op" (irrespective of whether it's known or unknown) doesn't define a new polyhedral context, and so there isn't a need to block a walk into it nor to explicitly capture.


> 2. Stopping the pre-order traversal
> In the pre-order (aka going from the top) traversal of ops, one may want to ignore certain regions attached to the op, or treat them differently. This sounds like a reusable IR traversal pattern that can be parameterized by a condition functor used to analyze the op and decide whether to enter its regions. This does not have to be tied to greybox or affine.

Yes, there may be a future scenario where we don't want to traverse the inner regions of certain ops even though they aren't affine.grayboxes and fall within the current region's symbol context (for efficiency or other reasons) - you'd have to compute summaries for them on the fly irrespective of whether they have an explicit capture, and this would in fact be the case for 'call' ops if you want to do something more advanced than being super-conservative. But I don't see why that affects the current design choices on affine.graybox - the unique thing about the latter is that it introduces a new symbol context, or has a list of basic blocks with their terminators so that the control flow in it isn't affine. Please think of the affine.graybox as bringing in polyhedral information from another "domain" that doesn't compose with the polyhedral information of its surrounding affine region.

~ Uday

Uday Kumar Reddy B

unread,
Oct 1, 2019, 11:53:39 AM10/1/19
to Nicolas Vasilache, MLIR
Hi Nicolas,

Thank you for commenting. Responses inline.

On 30/09/2019 20:06, 'Nicolas Vasilache' via MLIR wrote:
> Hi Uday, thanks for the proposal!
>
> I see 3 principal changes that `affine.greybox` brings:
> 1. ability to bypass the restriction on symbol definition (i.e. you can insert an `affine.greybox` anywhere)
> 2. context sensitivity (i.e. behaves like an encapsulated inlined function call)
> 3. the decision on what is implicitly captured vs passed as argument
>
> This brings clear expressivity benefits as your examples show.
> However I believe there is also a case to be made about new analysis
> and transformations that can cross the `affine.greybox` boundary. I'd
> be curious to hear your thoughts on this topic and how
> `affine.greybox` improve the situation, what new
> analysis/transformation it allows.

Hoisting invariant code out of the graybox is one of the most
practically useful transformations I can think of. Imagine arbitrary
control flow (a list of basic blocks inside) and an affine.for
immediately surrounding the graybox, with something that could be
hoisted out. This would have been hard interprocedurally. Although this
isn't purely polyhedral, we could leverage affine references. This is
also low-hanging fruit to try out in the near future. But on this note,
nearly all pattern rewrites will work across graybox boundaries as is,
since we implicitly capture scalars and the op isn't marked
IsolatedFromAbove (so constant folding, constant propagation into maps,
composition of maps, CSE, etc.). I've listed the new ones we'll need
here:
https://github.com/bondhugula/mlir/blob/graybox/rfc/rfc-graybox.md#maintaining-memref-operandsarguments
MemRef normalization will also work across boundaries. If your question
was about polyhedral transformations like interchange and affine
unroll-and-jam, those will have to either start with conservative info
and refine it, or exploit some other domain info.
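As a sketch of the hoisting case (names hypothetical; graybox syntax as in the RFC's examples), a value computed inside the graybox but invariant in the surrounding affine.for could be moved out, since scalar SSA values are implicitly captured:

```mlir
// Before: the call result is invariant in %i but is recomputed
// on every iteration, inside the graybox.
affine.for %i = 0 to %n {
  affine.graybox [%A] {
    %c = call @expensive(%n) : (index) -> index
    // ... arbitrary control flow using %c and %A ...
    return
  }
}

// After hoisting: %c is defined once above the loop; the graybox
// keeps using it because scalars are implicitly captured.
%c = call @expensive(%n) : (index) -> index
affine.for %i = 0 to %n {
  affine.graybox [%A] {
    // ... uses %c ...
    return
  }
}
```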

>
> Regarding `affine.greybox` itself, in my opinion the real limiter is
> the restriction on symbol definition and I think there may be a
> simpler way of addressing the problem. Consider a pair of ops to
> allow injecting/capturing symbols more freely, resembling:
> ```mlir {.mlir}
> ssa-value `=` affine.bind_symbol ssa-value `:` index-type
> affine.release_symbol ssa-value `:` index-type
> ```
>
> In some sense, this could be thought of as an exact complement to
> `affine.greybox`: 1. the values specified are ssa-value that are
> explicitly turned into symbols 2. there is no region/nesting
> involved
>
> The implication for affine passes is that symbol validity is defined
> by dominance by `affine.bind_symbol` (and optionally postdominance by
> `affine.release_symbol`) + rules induced by special ops (e.g. dim).
> `affine.bind_symbol` / `affine.release_symbol` act as natural
> boundary to affine transformations (with behavior varying on a
> case-by-case basis, like you mention for memref normalization and
> DMA). The part about Helpers, Utilities, and Passes seems to be
> mostly transparent (I have not thought deeply about those points but
> did not see particular points of concern). Inlining seems to work
> with minimal effort. Expressiveness benefits are the same in the 3
> examples you highlight.
>
> Besides simplicity of these 2 ops without regions, there is value in
> explicitly tracking symbol bindings. Consider your `pow` example:
>
> ```
> func @nest(%n : i32) {
>   %c2 = constant 2 : index
>   affine.for %i = 0 to %n {
>     affine.for %j = 0 to %n {
>       %pow = call @powi(%c2, %j) : (index, index) -> index
>       %pow_sym = affine.bind_symbol %pow : index
>       affine.for %k = 0 to %pow_sym {
>         affine.for %l = 0 to %n {
>           ...
>         }
>       }
>     }
>   }
>   return
> }
> ```
>
> It is trivial to compute that `i` and any `affine.bind_symbol` are in
> different program slices and that `i` can be structurally
> stripmined-and-sunk below `k` (assuming it is legal and profitable).
> I don't see a particular issue with doing the same with
> `affine.greybox` on this example, except that
> analyses/transformations/walkers/etc need to know about and work
> across the new special region you propose.
>
> So bottom-line, what do you think are the cost/benefits of regions vs
> just using ops? Where do you think something like
> `affine.bind_symbol` would break? Can we think of some new

As you point out, the design you describe is the opposite approach:
instead of using valid/verified regions marked in the IR and implicitly
inferring symbols based on those, you are inferring the regions after
having explicitly marked symbols. The real problem I see with your
proposal is that, given the symbols, there isn't a unique way to
partition into regions -- you'll have many valid choices for your
regions given certain symbol definitions. Even if you try to define it
maximally in some way, "maximal" is neither clearly defined in general
nor necessarily the desired choice. Consider the case where you have a
5-d loop nest and there could be a choice to split it 2 + 3 or 3 + 2
based on the symbol op defs you have; which one would you choose? But
the bigger concern is that the choice is made by the
analysis/transforms infrastructure, and a user/developer doesn't see
the regions/units being worked on by analyses/transforms.
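A sketch of the ambiguity (hypothetical nest; bind_symbol syntax as in Nicolas's proposal): nothing in the IR below says where the inferred region boundaries lie, so several partitions of the 5-d nest are consistent with the same symbol definitions, and none of them is materialized anywhere:

```mlir
affine.for %i0 = 0 to %n {
  affine.for %i1 = 0 to %n {
    %s = affine.bind_symbol %v : index
    affine.for %i2 = 0 to %s {
      affine.for %i3 = 0 to %n {
        affine.for %i4 = 0 to %n {
          // One inferred partition: {%i0, %i1} + {%i2, %i3, %i4};
          // another: {%i0, %i1} + {%i2} + {%i3, %i4}. The choice is
          // made inside the analysis infrastructure, invisible in
          // the IR.
        }
      }
    }
  }
}
```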

And finally, even if there is a unique/canonical splitting for an
explicit symbol definition, I feel it's really uncomfortable from a dev
standpoint not having the regions clearly materialized/visible in the
IR. I sort of see a region as the elephant and the symbol as the tail
:-) But this is my quick impression.

> analyses/transformations that cross `affine.greybox` boundaries and
> for which the abstraction helps (vs makes transformations more
> intricate to write because of a new special region)?

I think it'd be great to think of some use cases and see how the
proposed designs make it easier/difficult. Let me know if you have
something in mind. To move incrementally, for now, I wanted to make sure
that:

(a) all existing pattern rewrites work across the graybox op
(b) all existing affine passes work correctly as is, which they will
with the current RFC,
(c) everything that was forcing us to outline due to symbol restrictions
no longer requires outlining, which is the case irrespective of the
choice of capture

The composition of these three by itself will make the whole thing
significantly more powerful than the state of the art. We could try
things like hoisting, scalar replacement across the graybox, etc. next.

~ Uday


>
> Thanks!
>
> On Saturday, September 28, 2019 at 11:28:18 AM UTC-4, Uday Bondhugula
> wrote:
>
> Hi all,
>
> Here is a proposal on adding a new op to the affine dialect. Please
> let me know what you think! Although I'm appending it below, the
> formatting may be messed up. Below is the same thing on GitHub;
> please feel free to comment there if that's better - I'll be
> updating/refining the GitHub version based on comments.
> https://github.com/bondhugula/mlir/blob/graybox/rfc/rfc-graybox.md
>
> Thanks, Uday
>
> -------------------------------------------------------------------------------------------------------------------------
>
>
> # RFC - affine.graybox op
> https://github.com/bondhugula/mlir/blob/graybox/rfc/rfc-graybox.md
>
> This proposal is on adding a new op named *affine.graybox* to MLIR's
> [affine dialect](https://github.com/tensorflow/mlir/blob/master/g3doc/Dialects/Affine.md).
> [normalizeMemRefs](https://github.com/tensorflow/mlir/blob/331c663bd2735699267abcc850897aeaea8433eb/include/mlir/Transforms/Utils.h#L89),
> while many will just have to be changed to use walkAffine.
>
> * **Simplification / Canonicalization**
>
> There has to be a simplification that drops unused block arguments
> from regions of ops that aren't function ops (since this is easy for
> non function ops) - in case this isn't already in place. This will
> allow removal of dead memrefs that could otherwise be blocked by
> operand uses in affine.graybox ops with the corresponding region
> arguments not really having any uses inside. Given this, no
> additional bookkeeping is needed as a result of having memrefs as
> explicit operands for gray boxes.
> [MemRefCastFold](https://github.com/tensorflow/mlir/blob/ef77ad99a621985aeca1df94168efc9489de95b6/lib/Dialect/StandardOps/Ops.cpp#L228)
> is the only canonicalization pattern that the *affine.graybox* has to
> implement, and this is easily/cleanly done (by replacing the argument
> and its uses with a memref of a different type). Overall, having
> memrefs as explicit arguments is a good middle ground to make it
> easier to let standard SSA passes / scalar optimizations /
> canonicalizations work unhindered in conjunction with polyhedral
> passes, and with the latter not worrying about explicitly checking
> for escaping/hidden memref accesses. More discussion a little below
> in the next section.
>
>
> * There are situations/utilities where one can consistently perform
> rewriting/transformation/analysis cutting across grayboxes. One
> example is
>
> [normalizeMemRefs](https://github.com/tensorflow/mlir/blob/331c663bd2735699267abcc850897aeaea8433eb/include/mlir/Transforms/Utils.h#L89),
> which can easily be extended to fold with an *affine.graybox* op
> (update its argument and all uses inside). As such, there aren't any
> cases where the argument list has to be shrunk/grown from the
> outside. And for the cases where the types have to be updated, it's
> straightforward since there is sort of only a single use for that op
> instance (it's not declarative or "callable" from elsewhere like a
> FuncOp).
>
> Another design point could be of requiring symbols associated with
> the affine constructs used in a graybox, but defined outside, to be
> explicitly listed as operands/arguments, in addition to the memrefs
> used. This makes isValidSymbol really simple. One won't need
> isValidSymbol(Value \*v, AffineGrayBoxOp op). Anything that is at the
> top-level of an *affine.graybox* op or its region argument will
> become a valid symbol. However, other than this, it doesn't simplify
> anything else. Instead, it adds/duplicates a bookkeeping with respect
> to propagation of constants, similar, to some extent, to the argument
> rewriting done for interprocedural constant propagation. Similarly,
> the other extreme of requiring everything from the outside used in
> an *affine.graybox* to be explicitly listed as its operands and
> region arguments is even worse on this front.
>
> In summary, it appears that the requirement to explicitly capture
> only the memrefs inside an affine.graybox's region is a good middle
> ground and better than other options.
>
> -- You received this message because you are subscribed to the Google
> Groups "MLIR" group. To unsubscribe from this group and stop
> receiving emails from it, send an email to
> mlir+uns...@tensorflow.org
> <mailto:mlir+uns...@tensorflow.org>. To view this discussion on
> the web visit
> https://groups.google.com/a/tensorflow.org/d/msgid/mlir/26b131b6-45a4-4740-873d-5021c75e885d%40tensorflow.org
>
> <https://groups.google.com/a/tensorflow.org/d/msgid/mlir/26b131b6-45a4-4740-873d-5021c75e885d%40tensorflow.org?utm_medium=email&utm_source=footer>.

Mehdi AMINI

unread,
Oct 1, 2019, 1:04:05 PM10/1/19
to Uday Bondhugula, MLIR
On Tue, Oct 1, 2019 at 12:43 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:


On Tuesday, October 1, 2019 at 8:43:07 AM UTC+5:30, Mehdi AMINI wrote:
Hi Uday,

Thanks for your answers, appreciate it.



On Mon, Sep 30, 2019 at 2:31 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
On Sunday, September 29, 2019 at 1:09:59 AM UTC+5:30, Mehdi AMINI wrote: 
This is something that we should clarify with respect to symbols: either we always force these to be defined inside an affine region (a region attached to an affine operation), in which case you have to use a graybox to materialize symbols through explicit capture, or we should consider that any SSA value defined in a non-affine region can be used as a symbol inside an affine region.

I'm referring to this section: https://github.com/tensorflow/mlir/blob/master/g3doc/Dialects/Affine.md#restrictions-on-dimensions-and-symbols; which after defining what an affine region is, I would update along the line of :

> A symbolic identifier can be bound to an SSA value that is either:
> - defined in a non-affine region, 
> - defined in a region separated from the current region by an affine.graybox operation in the nesting structure (the value is "captured", possibly implicitly, by an affine.graybox)
> - the result of a constant operation,
> - or the result of an affine.apply operation that recursively takes as arguments any symbolic identifiers. 

This sounds accurate to me except for some of the terminology. "non-affine region" is confusing - because there isn't any such thing as a non-affine region! Every op is part of some affine region - either that of the closest enclosing affine.graybox op or that of the func op. The "affine region" term is usable only to demarcate such a region (anything above the closest enclosing affine.graybox, and anything inside the graybox ops that are in this region, isn't part of the region).
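To illustrate the terminology (hypothetical names; graybox syntax as in the RFC's examples), every op belongs to the affine region of the closest enclosing affine.graybox or func op:

```mlir
func @f(%M : memref<?xf32>, %n : index) {
  // In the func op's affine region; %n is a valid symbol here.
  affine.for %i = 0 to %n {
    affine.graybox [%M] {
      // In the graybox's affine region: a new symbol context.
      %c = constant 0 : index
      affine.graybox [%M] {
        // In the inner graybox's affine region.
        return
      }
      return
    }
  }
  return
}
```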


Sorry for being sloppy with terminology here :)
I had defined (my first sentence above): "an affine region (a region attached to an affine operation)". My idea was that you could bind a symbol whenever the symbol is defined in a region that isn't attached to an affine operation. For example

This doesn't really cover it all. For example, in Example 2 and Example 3 of the RFC respectively, %pow and %v are valid symbols that are defined inside regions held by affine operations

Just a nit but the way you phrased "regions held by affine operations" makes me think we don't use the same terminology: a region is held by only one operation in the IR. This is not a transitive property, the region semantic is defined by the operation it is attached to.
 
 
. They become symbols for the inner affine regions (for the affine.graybox nested), not for the one they are contained in.

Again terminology maybe, but they are defined inside the affine.graybox as far as I can tell, so I can't connect to "not for the one they are contained".
(SSA values defined in the graybox can't even be directly referred to outside of the graybox region in any way in MLIR; this is structural.)
 
Second, a symbol defined at the top level of an affine.graybox is also valid for that graybox's region

I agree.
 
(just like the current rule of top-level of a function op).  I've added a terminology section to the RFC - we probably need another term instead of overloading 'region'. 



spv.func @foo() {
  not_affine.async_launch() {
      %symbol = ....
       affine.for .... {
            // here %symbol can be bound as a symbol since it is defined in a region not attached to an affine operation.
       }
  }
}

 
IIUC, your first bullet should just be replaced by:

- defined at the top level of a function or of an affine graybox op
(just like for a function, the top level of an affine graybox is the top part of its region's entry block)

I'm not comfortable referring to "function" here: we don't have "first class" functions anymore in MLIR (the SPIR-V dialect is using a different operation for representing functions)

By "function", I just meant the function op here ('FuncOp'): just didn't want to use the class name.

I know, but I feel you're missing my point: FuncOp is nothing special, so you should not refer to it in any way with respect to defining the rule for binding symbols.

I think I gave some extensive reasoning and included an example as to why that is.
As far as I can tell this seems like a matter of implementation, and nothing fundamental here. We *can* specify that it is valid (crazy.op and affine.graybox are implicitly opaque to affine analysis from the outside): everything here can be structurally checked on the IR.

-- 
Mehdi


Mehdi AMINI

unread,
Oct 1, 2019, 1:58:24 PM10/1/19
to Uday Bondhugula, MLIR
FYI I'm fully in agreement with Alex's points at the moment.


On Tue, Oct 1, 2019 at 5:42 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
Just responding to these parts since I agree with the rest.

> 3. Conservative analysis through implicit capture
> Affine passes would treat any unknown op conservatively. This means
> that
> an unknown op with regions that implicitly captures any values is
> treated by those passes as an opaque function call that takes all
> implicitly captured values as arguments. This means that the passes
> should compute the set of implicitly captured values for unknown ops.

Actually, the answer is "No" here - they should not! For the unknown ops, why would you want to compute the set of implicitly captured values when you are going to walk through the op anyway?!
The walkAffine method in the proposal walks through inner regions of *all* region-holding ops except the affine.graybox.
And if you don't want to block an inner traversal, there is neither a need to explicitly capture nor a need to compute on-the-fly! It'd just work as is with the walk. The question really isn't about "known" vs "unknown" op here, but about an "op within the current polyhedral symbol context" vs an "op outside the current polyhedral symbol context". affine.graybox has its own symbol context, and so it falls into the latter class. In your earlier example, "crazy.op" (irrespective of whether it's known or unknown) doesn't define a new polyhedral context and so there isn't a need to block a walk into it nor explicitly capture.

Actually that's probably the point I was missing: I'm seeing an unknown op with a region like an opaque call. Basically my take on an invariant of MLIR currently is that if this is valid:


func @foo(%m : memref<...>) {
  not.crazy {
    crazy.op(%m) : memref<...>
  }
}

Then the following should always be valid regardless of what is done with %m (this is from the point of view of looking at the `not.crazy` op, assuming that the innermost region is valid for crazy.op, of course).

func @foo(%m : memref<...>) {
  not.crazy {
    crazy.op {
      // use of %m implicitly captured by crazy.op
    }
  }
}


The important property here is that the enclosing `not.crazy` cannot constrain what `crazy.op` can do within its enclosed region. 
When exposing the region inside crazy.op, it can provide better analysis than if it wasn't there (for instance you could know that %m is never stored to for the purpose of side-effect analysis), but the first form should already "assume the worst" and so if the first form is valid the second must be.
(This goes beyond affine)

-- 
Mehdi



-- 
Mehdi





> 2. Stopping the pre-order traversal
> In the pre-order (aka going from the top) traversal of ops, one may want to ignore certain regions attached to the op, or treat them differently. This sounds like a reusable IR traversal pattern that can be parameterized by a condition functor used to analyze the op and decide whether to enter its regions. This does not have to be tied to greybox or affine.

Yes, there may be a future scenario where we don't want to traverse the inner regions of certain ops even though they aren't affine.grayboxes and fall within the current region's symbol context (for efficiency or other reasons) - you'd have to compute summaries for them on the fly irrespective of whether they have an explicit capture, and this would in fact be the case for 'call' ops if you want to do something more advanced than being super-conservative. But I don't see why that affects the current design choices on affine.graybox - the unique thing about the latter is that it introduces a new symbol context, or has a list of basic blocks with their terminators so that the control flow in it isn't affine. Please think of the affine.graybox as bringing in polyhedral information from another "domain" that doesn't compose with the polyhedral information of its surrounding affine region.

~ Uday


Alex Zinenko

unread,
Oct 1, 2019, 2:19:04 PM10/1/19
to Nicolas Vasilache, MLIR
I am not convinced about the benefits of using a pair of ops instead of a region. It requires building a dominance tree to check for symbol validity instead of a simple list traversal of regions (and we do this check in more places than just the op verifier, which also builds the tree). It makes the symbol validity scope non-obvious from the textual representation. Indeed, affine transformations will have to be made aware of `affine.greybox` semantics to operate properly, but this is also the case for `affine.bind_symbol`.

If there are benefits of explicit symbol bindings, we can make `affine.greybox` take the values that should become symbols and forward them to region arguments...
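A sketch of what that extension might look like (the operand/argument syntax here is hypothetical, extrapolated from the RFC's memref capture list): the graybox would take the would-be symbols as extra operands and forward them to region arguments, which are then trivially valid symbols at its top level:

```mlir
%v = call @powi(%c2, %j) : (index, index) -> index
// Hypothetical syntax: %v is explicitly forwarded to the region
// argument %s, which is then a valid symbol inside the graybox.
affine.graybox [%A], [%v](%s : index) {
  affine.for %k = 0 to %s {
    ...
  }
  return
}
```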

On Tue, Oct 1, 2019 at 4:56 AM 'Nicolas Vasilache' via MLIR <ml...@tensorflow.org> wrote:
Hi Uday, thanks for the proposal!

I see 3 principal changes that `affine.greybox` brings:
  1. ability to bypass the restriction on symbol definition (i.e. you can insert an `affine.greybox` anywhere)
  2. context sensitivity (i.e. behaves like an encapsulated inlined function call)
  3. the decision on what is implicitly captured vs passed as argument

This brings clear expressivity benefits as your examples show.
However I believe there is also a case to be made about new analysis and transformations that can cross the `affine.greybox` boundary.
I'd be curious to hear your thoughts on this topic and how `affine.greybox` improve the situation, what new analysis/transformation it allows.

Regarding `affine.greybox` itself, in my opinion the real limiter is the restriction on symbol definition and I think there may be a simpler way of addressing the problem.
Consider a pair of ops to allow injecting/capturing symbols more freely, resembling:
```mlir {.mlir}
ssa-value `=` affine.bind_symbol ssa-value `:` index-type
affine.release_symbol ssa-value `:` index-type
```

In some sense, this could be thought of as an exact complement to `affine.greybox`:
  1. the values specified are ssa-value that are explicitly turned into symbols
  2. there is no region/nesting involved

The implication for affine passes is that symbol validity is defined by dominance by `affine.bind_symbol` (and optionally postdominance by `affine.release_symbol`) + rules induced by special ops (e.g. dim).
`affine.bind_symbol` / `affine.release_symbol` act as natural boundary to affine transformations (with behavior varying on a case-by-case basis, like you mention for memref normalization and DMA).
The part about Helpers, Utilities, and Passes seems to be mostly transparent (I have not thought deeply about those points but did not see particular points of concern).
Inlining seems to work with minimal effort.
Expressiveness benefits are the same in the 3 examples you highlight.

Besides simplicity of these 2 ops without regions, there is value in explicitly tracking symbol bindings.
Consider your `pow` example:

```
func @nest(%n : i32) {
  %c2 = constant 2 : index
  affine.for %i = 0 to %n {
    affine.for %j = 0 to %n {
      %pow = call @powi(%c2, %j) : (index, index) ->  index
      %pow_sym = affine.bind_symbol %pow : index
      affine.for %k = 0 to %pow_sym {
        affine.for %l = 0 to %n {
           ...
        }
      }
    }
  }
  return
}
```

It is trivial to compute that `i` and any `affine.bind_symbol` are in different program slices and that `i` can be structurally stripmined-and-sunk below `k` (assuming it is legal and profitable).
I don't see a particular issue with doing the same with `affine.greybox` on this example, except that analyses/transformations/walkers/etc need to know about and work across the new special region you propose.

So bottom-line, what do you think are the cost/benefits of regions vs just using ops?
Where do you think something like `affine.bind_symbol` would break?
Can we think of some new analyses/transformations that cross `affine.greybox` boundaries and for which the abstraction helps (vs makes transformations more intricate to write because of a new special region)?

Thanks!

On Saturday, September 28, 2019 at 11:28:18 AM UTC-4, Uday Bondhugula wrote:



--
-- Alex

Uday Bondhugula

unread,
Oct 2, 2019, 10:08:52 AM10/2/19
to MLIR

On Mon, Sep 30, 2019 at 2:31 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
On Sunday, September 29, 2019 at 1:09:59 AM UTC+5:30, Mehdi AMINI wrote: 
This is something that we should clarify with respect to symbols: either we always force these to be defined inside an affine region (a region attached to an affine operation), in which case you have to use a graybox to materialize symbols through explicit capture, or we should consider that any SSA value defined in a non-affine region can be used as a symbol inside an affine region.

I'm referring to this section: https://github.com/tensorflow/mlir/blob/master/g3doc/Dialects/Affine.md#restrictions-on-dimensions-and-symbols; which after defining what an affine region is, I would update along the line of :

> A symbolic identifier can be bound to an SSA value that is either:
> - defined in a non-affine region, 
> - defined in a region separated from the current region by an affine.graybox operation in the nesting structure (the value is "captured", possibly implicitly, by an affine.graybox)
> - the result of a constant operation,
> - or the result of an affine.apply operation that recursively takes as arguments any symbolic identifiers. 

This sounds accurate to me except for some of the terminology. "non-affine region" is confusing - because there isn't any such thing as a non-affine region! Every op is part of some affine region - either that of the closest enclosing affine.graybox op or that of the func op. The "affine region" term is usable only to demarcate such a region (anything above the closest enclosing affine.graybox, and anything inside the graybox ops that are in this region, isn't part of the region).


Sorry for being sloppy with terminology here :)
I had defined (my first sentence above): "an affine region (a region attached to an affine operation)". My idea was that you could bind a symbol whenever the symbol is defined in a region that isn't attached to an affine operation. For example

This doesn't really cover it all. For example, in Example 2 and Example 3 of the RFC respectively, %pow and %v are valid symbols that are defined inside regions held by affine operations

Just a nit but the way you phrased "regions held by affine operations" makes me think we don't use the same terminology: a region is held by only one operation in the IR. This is not a transitive property, the region semantic is defined by the operation it is attached to.

As I mentioned, by "region" as in "affine region", I didn't mean the same thing as the region held by an op. I defined it separately in the RFC under "Terminology". Maybe I should change it to "affine zone" or "affine sub-region".
 

 
 
. They become symbols for the inner affine regions (for the affine.graybox nested), not for the one they are contained in.

Again terminology maybe, but they are defined inside the affine.graybox as far as I can tell, so I can't connect to "not for the one they are contained".
(SSA values defined in the graybox can't even be directly referred to outside of the graybox region in any way in MLIR; this is structural.)
 
Second, a symbol defined at the top level of an affine.graybox is also valid for that graybox's region

I agree.
 
(just like the current rule of top-level of a function op).  I've added a terminology section to the RFC - we probably need another term instead of overloading 'region'. 



spv.func @foo() {
  not_affine.async_launch() {
      %symbol = ....
       affine.for .... {
            // here %symbol can be bound as a symbol since it is defined in a region not attached to an affine operation.
       }
  }
}

 
IIUC, your first bullet should just be replaced by:

- defined at the top level of a function or of an affine graybox op
(just like for a function, the top level of an affine graybox is the top part of its region's entry block)

I'm not comfortable referring to "function" here: we don't have "first class" functions anymore in MLIR (the SPIR-V dialect is using a different operation for representing functions)

By "function", I just meant the function op here ('FuncOp'): just didn't want to use the class name.

I know, but I feel you're missing my point: FuncOp is nothing special, so you should not refer to it in any way with respect to defining the rule for binding symbols.

A FuncOp has all the properties of an affine graybox since it starts a new symbol context and explicitly captures (at least) the memrefs used. It's almost as if it were also an affine.graybox. I still don't see how the rule for defining a symbol can be free of a reference to FuncOp unless you use other terminology that indirectly covers it. Your earlier list used "non-affine region" in its first bullet, which isn't meaningful - there isn't such a thing as a non-affine region, as it would fail IR verification. Could you clarify what your first bullet's replacement would be if it's free from referring to FuncOp?
This contradicts the original example Alex started from, where "crazy.op" was an unknown op - you can't specify traits for an unknown op! :-) Second, irrespective of whether it's an unknown or a known op, if the goal was to start a new symbol context at "crazy.op", one would just surround it with an affine.graybox; and if one's only goal was to start a new symbol context, one would just use an affine.graybox instead of a "crazy.op", since you'd already have all the infrastructure around it: scan and get hold of the memrefs inside, and build the graybox. affine.graybox is a known op whose only goal is to start a new polyhedral symbol context for its "affine region". If you think there are cases where various known ops would want to be isolated polyhedrally (i.e., behave like affine.graybox), defining a PolyhedralIsolateFromAbove trait is an option (walkAffine won't walk into them). But even this isn't as readable as just inserting the special affine.graybox op wherever you want to start a new symbol context, i.e., I tend to prefer having affine.graybox meant for this unique purpose instead of going around sticking its trait onto other (known) ops of interest.
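Concretely (hypothetical op names; graybox syntax as in the RFC's examples), if one wanted a new symbol context around "crazy.op", one would simply wrap it:

```mlir
affine.for %i = 0 to %n {
  affine.graybox [%M] {      // starts the new symbol context
    "crazy.op"() ({
      // values defined at the top level of the graybox are valid
      // symbols for affine ops nested here
    }) : () -> ()
    return
  }
}
```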

~ Uday

Caballero, Diego

unread,
Oct 2, 2019, 5:04:33 PM10/2/19
to Uday Bondhugula, MLIR

Hi Uday,

 

Thanks for the RFC! The graybox idea sounds interesting to me. I see the discussion is already pretty advanced but I would have some basic questions, mostly for my learning/understanding:

 

1.      It seems that with the graybox op we are making all the affine ops context sensitive. affine.load/affine.store/affine.if/affine.for may not be actually affine if they are within a graybox. I’m trying to understand the upsides and downsides of this approach vs. using plain load/store/if/for and perhaps using graybox only for the more arbitrary non-affine control flow cases. That would keep, at least, affine semantics context free, I guess.

2.      Based on example 2, it seems that we may have affine constructs that are actually affine within a graybox (loop %l). Not sure if my question makes sense but, are affine algorithms expected to be applied on these "nested" affine constructs or, on the contrary, a graybox "downgrades" all the nested construct to be treated as non-affine? In practice, for example, would it be possible to fuse loop %l and loop %m using an affine transformation?

 

  affine.for %i = 0 to %n {
    affine.for %j = 0 to %n {
      affine.graybox [] {
        %pow = call @powi(%c2, %j) : (index, index) -> index
        affine.for %k = 0 to %pow {
          affine.for %l = 0 to %n {
              ...
          }
          affine.for %m = 0 to %n {
              ...
          }
        }
      }  // graybox end
    }  // %j
  }  // %i

 

Thanks!

Diego

--

You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.

Uday Kumar Reddy B

unread,
Oct 2, 2019, 5:46:36 PM10/2/19
to Caballero, Diego, MLIR
Hi Diego,

On 03/10/19 2:34 AM, Caballero, Diego wrote:
> Hi Uday,
>
> Thanks for the RFC! The graybox idea sounds interesting to me. I see the
> discussion is already pretty advanced but I would have some basic
> questions, mostly for my learning/understanding:
>
> 1.It seems that with greybox op we are making all the affine ops context
> sensitive. affine.load/affine.store/affine.if/affine.for may not be
> actually affine if they are within a greybox. I’m trying to understand

If they are inside a graybox, they are conceptually non-affine when
viewed at the function op level, but they are affine locally (within the
boundary of the graybox), and such affine structure can be fully exploited.


> the upsides and downsides of this approach vs using plain
> load/store/if/for and perhaps using graybox only for more arbitrary
> non-affine control flow cases. That would keep, at least, affine
> semantics context free, I guess.

The option to use std load/store/if/for already exists. But the affine analysis/transform infrastructure won't work with
std ops since, as you know, there isn't a structure with them to exploit
- e.g., for analyzing subscripts, composing maps, rewriting subscripts,
calculating footprints, and performing transformations while maintaining the affine
properties.

>
> 2.Based on example 2, it seems that we may have affine constructs that
> are actually affine within a graybox (loop %l). Not sure if my question
> makes sense but, are affine algorithms expected to be applied on these
> "nested" affine constructs or, on the contrary, a graybox "downgrades"

Yes, of course; all of the affine algorithms apply recursively on these
nested grayboxes. For an affine pass, walkAffine() will be called on
each of the grayboxes in addition to each of the functions.
https://github.com/bondhugula/mlir/blob/graybox/rfc/rfc-graybox.md#helpers-utilities-and-passes
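To make the traversal rule concrete, here is a toy Python sketch (not MLIR's actual C++ API; `Op`, `walk_affine`, and `run_affine_pass` are invented names) of how a walkAffine-style pass driver could visit everything in the current symbol context, stop at graybox boundaries, and then run again inside each graybox:

```python
class Op:
    """Toy IR node: an op name plus its held regions (lists of ops)."""
    def __init__(self, name, regions=None):
        self.name = name
        self.regions = regions or []

def walk_affine(op, visit):
    """Visit op and everything in its regions, but do not cross into the
    body of an affine.graybox: that starts a new symbol context."""
    visit(op)
    if op.name == "affine.graybox":
        return
    for region in op.regions:
        for inner in region:
            walk_affine(inner, visit)

def run_affine_pass(op, visit):
    """Apply the visitor to the current context, then separately to the body
    of each graybox found - mirroring 'walkAffine will be called on each of
    the grayboxes in addition to each of the functions'."""
    grayboxes = []
    def collect(o):
        visit(o)
        if o.name == "affine.graybox":
            grayboxes.append(o)
    walk_affine(op, collect)
    for gb in grayboxes:
        for region in gb.regions:
            for inner in region:
                run_affine_pass(inner, visit)

# func { affine.for { affine.graybox { affine.for { affine.load } } } }
inner_loop = Op("affine.for", [[Op("affine.load")]])
func = Op("func", [[Op("affine.for", [[Op("affine.graybox", [[inner_loop]])]])]])

seen = []
run_affine_pass(func, lambda o: seen.append(o.name))
# seen: ["func", "affine.for", "affine.graybox", "affine.for", "affine.load"]
```

The graybox body is not skipped; it is simply processed as its own context rather than as part of the enclosing one.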


> all the nested construct to be treated as non-affine? In practice, for

There is no such thing as non-affine with the grayboxes, everything is
graybox-local affine!


> example, would it be possible to fuse loop %l and loop %m using an
> affine transformation?

Yes, this is exactly what it enables! You should be able to apply all
existing passes - for eg., unroll-and-jam on %k, packing/explicit data
copying on memrefs right above %k or %l, etc.

~ Uday

>
>   affine.for %i = 0 to %n {
>     affine.for %j = 0 to %n {
>       affine.graybox [] {
>         %pow = call @powi(%c2, %j) : (index, index) -> index
>         affine.for %k = 0 to %pow {
>           affine.for %l = 0 to %n {
>               ...
>           }
>           affine.for %m = 0 to %n {
>               ...
>           }
>         }
>       }  // graybox end
>     }  // %j
>   }  // %i
>
> Thanks!
>
> Diego
>
> *From:*'Uday Bondhugula' via MLIR [mailto:ml...@tensorflow.org]
> *Sent:* Wednesday, October 2, 2019 7:09 AM
> *To:* MLIR <ml...@tensorflow.org>
> *Subject:* Re: [mlir] affine.graybox op - RFC

Caballero, Diego

unread,
Oct 2, 2019, 6:59:00 PM10/2/19
to Uday Kumar Reddy B, MLIR
Thanks Uday! Much clearer now. I think this is the key point I was missing:

> There is no such thing as non-affine with the grayboxes, everything is graybox-local affine!

Everything makes sense now.

Thanks!
Diego

Mehdi AMINI

unread,
Oct 2, 2019, 8:36:21 PM10/2/19
to Uday Bondhugula, MLIR
You're using circular logic here: I read this as justifying using a FuncOp for affine purposes by attaching an affine purpose to the FuncOp in the first place.
"Starting a new symbol context" is not something that should be defined in terms of FuncOp: this is arbitrary.
 
and explicitly captures (at least) the memrefs used. It's almost as if it was also an affine.graybox. I still don't see how the rule for defining a symbol is free from a reference to FuncOp unless you use other terminology that indirectly covers it.

Yes, the terminology is "any op not in the affine dialect holding a region" (which I called a "non-affine region" after defining it). I'm sticking closer to the MLIR terminology for region here: a region corresponds to the "class mlir::Region", and its semantics are defined by the operation that it is directly attached to.

Your earlier bullet list used "non-affine region" in its first bullet, which isn't meaningful.

I don't understand what isn't meaningful here. A FuncOp is a non-affine op (an op not in the affine dialect) which is holding a region (its body), which matches the definition I gave for a "non-affine region". Feel free to propose another definition/terminology if you don't find it convenient.
 
There isn't such a thing as a non-affine region as it would fail IR verification. Could you clarify what your first bullet replacement would be if it's free from referring to FuncOp?

I hope I clarified here.
Sorry if I'm unclear: I'm talking about the *affine* specification with respect to operations not in the affine dialect. There is no trait involved; you *have* to specify your treatment of opaque operations - this is what I'm referring to.

 
Second, irrespective of whether it's an unknown or known op

To be clear "unknown" as in "unknown from your affine dialect" as well as "not holding any particular traits", not "unregistered with MLIR". Not that it makes any difference actually.
 
, if the goal was to start a new symbol context at "crazy.op", one would just surround it by an affine.graybox, and if one's only goal was to start a new symbol context, one would just use affine.graybox instead of a "crazy.op", since you'd already have all the infra around it. Scan and get hold of the memrefs inside, and build the graybox. affine.graybox is a known op whose only goal is to start a new polyhedral symbol context for its "affine region". If you think there are cases where various known ops would want to be isolated polyhedrally (i.e., behave like affine.graybox), defining a PolyhedralIsolateFromAbove trait is an option (walkAffine won't walk their inside).

Having a trait is a possibility, but in the absence of trait you need to be conservative: so the trait should be reversed: the default is "PolyhedralIsolateFromAbove" and you can introduce a trait "PolyhedralTransparentFromAbove" as needed.
 
But even this isn't as readable as just inserting the special affine.graybox op wherever you want to start a new symbol context, i.e., I tend to prefer having affine.graybox meant for this unique purpose instead of going around sticking its trait to other (known) ops of interest. 

Forcing an affine.graybox to be present to create a symbol context is a possible design; however, if you'd like to go in this direction, then since a FuncOp body is nothing special, we would use an affine.graybox inside a FuncOp to begin any new affine context. The affine verifier could check that any bound symbol is always defined in a region attached to an affine.graybox.

However I still have some doubts about this possible design, as I expressed in my last email (which you haven't addressed yet, I believe).

-- 
Mehdi

Mehdi AMINI

unread,
Oct 2, 2019, 8:46:08 PM10/2/19
to Caballero, Diego, Uday Bondhugula, MLIR
On Wed, Oct 2, 2019 at 2:04 PM Caballero, Diego <diego.c...@intel.com> wrote:

Hi Uday,

 

Thanks for the RFC! The graybox idea sounds interesting to me. I see the discussion is already pretty advanced but I would have some basic questions, mostly for my learning/understanding:

 

1.      It seems that with the graybox op we are making all the affine ops context sensitive. affine.load/affine.store/affine.if/affine.for may not be actually affine if they are within a graybox. I’m trying to understand the upsides and downsides of this approach vs. using plain load/store/if/for and perhaps using graybox only for the more arbitrary non-affine control flow cases. That would keep, at least, affine semantics context free, I guess.

2.      Based on example 2, it seems that we may have affine constructs that are actually affine within a graybox (loop %l). Not sure if my question makes sense but, are affine algorithms expected to be applied on these "nested" affine constructs or, on the contrary, a graybox "downgrades" all the nested construct to be treated as non-affine? In practice, for example, would it be possible to fuse loop %l and loop %m using an affine transformation?

 

  affine.for %i = 0 to %n {
    affine.for %j = 0 to %n {
      affine.graybox [] {
        %pow = call @powi(%c2, %j) : (index, index) -> index
        affine.for %k = 0 to %pow {
          affine.for %l = 0 to %n {
              ...
          }
          affine.for %m = 0 to %n {
              ...
          }
        }
      }  // graybox end
    }  // %j
  }  // %i

 


The mental model I have (Uday please correct me if I'm wrong) is that your example is equivalent to:

func @graybox_func(%c2 : index, %j : index, %n : index, .... /* memref */) {
  %pow = call @powi(%c2, %j) : (index, index) -> index
  affine.for %k = 0 to %pow {
    affine.for %l = 0 to %n {
      ...
    }
    affine.for %m = 0 to %n {
      ...
    }
  }
}


And separately : 

  affine.for %i = 0 to %n {
    affine.for %j = 0 to %n {
      call @graybox_func(%c2, %j, %n, ... /* memref */)
    }  // %j
  }  // %i

 

The graybox content can be analyzed / operated on the same way as the separate function here.
The gray-box operation is indeed (also) intended to enable inlining inside a region-attached-to-an-affine-operation (I'm afraid to use "affine region" now). This isn't valid today in cases like this one, so we can't inline such cases at all.
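The outlining half of this mental model can be sketched as a toy transformation (the dict-based IR encoding and the `outline` helper are invented for illustration, not MLIR's API): hoist the graybox body into a new function whose arguments are the values the body uses from the enclosing scope, and replace the graybox with a call.

```python
def outline(graybox_body, free_values, module, fname):
    """Hoist a graybox body into a new function in `module` whose arguments
    are exactly the values the body uses from the enclosing scope; return
    the call op that replaces the graybox at the original site."""
    module[fname] = {"args": list(free_values), "body": graybox_body}
    return {"op": "call", "callee": fname, "operands": list(free_values)}

module = {}
# The graybox body from the example, with %c2, %j, %n (and memrefs) free.
body = ["%pow = call @powi(%c2, %j)", "affine.for %k = 0 to %pow { ... }"]
call_site = outline(body, ["%c2", "%j", "%n"], module, "graybox_func")
# call_site: {"op": "call", "callee": "graybox_func", "operands": ["%c2", "%j", "%n"]}
```

The point of the equivalence is that whatever a conservative analysis would conclude about the opaque call, it may also conclude about the graybox in place.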

-- 
Mehdi


 

Uday Bondhugula

unread,
Oct 3, 2019, 12:27:41 AM10/3/19
to MLIR
A FuncOp has always started a new symbol context and thus always had an affine purpose - even without this proposal - simply because the definition of a symbol in the MLIR spec has always included "SSA values of index type defined at the top level of a function". "Values of index type defined at the top level of an affine.graybox or a function op" thus has no circular logic - both are well-defined known ops. 
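To make the rule being debated concrete, here is a toy sketch of a symbol-validity check (the encoding is invented, not MLIR's implementation): a value is a valid symbol at a use site if it is a constant, or is of index type and is either defined at the top level of the closest enclosing symbol-context op (func or affine.graybox) or defined entirely outside that op.

```python
SYMBOL_CONTEXT_OPS = {"func", "affine.graybox"}

def is_valid_symbol(value, use_contexts):
    """use_contexts: the symbol-context ops enclosing the use, innermost
    first. value['def_contexts']: the symbol-context ops enclosing its
    definition, innermost first; value['top_level'] is True if the value is
    defined directly at the top level of value['def_contexts'][0]."""
    if value.get("constant"):
        return True
    if value.get("type") != "index":
        return False
    closest = use_contexts[0]
    if closest not in value["def_contexts"]:
        # Defined entirely outside the closest enclosing context; since it
        # dominates the use, it acts as a symbol, like a function argument.
        return True
    # Defined within the closest context: it must sit at its top level.
    return value["def_contexts"][0] is closest and value["top_level"]

func = {"name": "func"}
gb = {"name": "affine.graybox"}
# %v: an affine.load result inside a loop in the function (not top level).
v = {"type": "index", "def_contexts": [func], "top_level": False}
is_valid_symbol(v, [func])      # False: not a symbol at function level
is_valid_symbol(v, [gb, func])  # True: "%v is now a symbol" inside a graybox
```

This also captures the Example 3 behavior discussed later in the thread: a value that is not a symbol at the function level becomes one inside a graybox, because the graybox starts a fresh context for which the value is external.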

 
 
and explicitly captures (at least) the memrefs used. It's almost as if it was also an affine.graybox. I still don't see how the rule for defining a symbol is free from a reference to FuncOp unless you use other terminology that indirectly covers it.

Yes, the terminology is "any op not in the affine dialect holding a region" (which I called a "non-affine region" after defining it). I'm sticking closer to the MLIR terminology for region here: a region corresponds to the "class mlir::Region", and its semantics are defined by the operation that it is directly attached to.

Your earlier bullet list used "non-affine region" in its first bullet, which isn't meaningful.

I don't understand what isn't meaningful here. A FuncOp is a non-affine op (an op not in the affine dialect) which is holding a region (its body), which matches the definition I gave for a "non-affine region". Feel free to propose another definition/terminology if you don't find it convenient.

Calling FuncOp a "non-affine" op is highly confusing because "not from the affine dialect" and "non-affine" can be two very different things. It's clear now that you meant the former, but "non-affine" is a behavior (subscripts or bounds that involve non-affine functions - e.g., std.for, loop.for, etc. allow such subscripts). With or without this graybox proposal, there is no non-affine behavior associated with FuncOp itself - it's std.for/std.if/std.load/std.store/'lists of more than one block' that are potentially non-affine in behavior. 
 
 
There isn't such a thing as a non-affine region as it would fail IR verification. Could you clarify what your first bullet replacement would be if it's free from referring to FuncOp?

I hope I clarified here.

Yes, but your definition would still miss the case of SSA values (of index type) defined at the top-level of an affine.graybox itself. But I think there is clarity on this, and everyone is in agreement on what symbols are for affine purposes. 
If there is no trait or if it's an unknown op with a region, polyhedral passes will walk through all of it - just like they do for affine.for/affine.if. There is no need to insert a trait here - they'll do the desired / semantically correct thing as is. This is what I was describing in my most recent response to Alex. I'm not sure I understand what you have in mind here.
 
 
But even this isn't as readable as just inserting the special affine.graybox op wherever you want to start a new symbol context, i.e., I tend to prefer having affine.graybox meant for this unique purpose instead of going around sticking its trait to other (known) ops of interest. 

Forcing an affine.graybox to be present to create a symbol context is a possible design; however, if you'd like to go in this direction, then since a FuncOp body is nothing special, we would use an affine.graybox inside a FuncOp to begin any new affine context. The affine verifier could check that any bound symbol is always defined in a region attached to an affine.graybox.

As I mentioned, a FuncOp has all the properties of an affine.graybox as is. One could assume an affine.graybox right under it in their minds or think of it as being elided right under the function definition. 
 

However I still have some doubt about this possible design, as I expressed in my last email (which you haven't addressed yet I believe).

I'm a bit lost now on which doubt you are referring to :-) - whether it's the explicit capture of memrefs thing or something else.

~ Uday
 

-- 
Mehdi

Mehdi AMINI

unread,
Oct 3, 2019, 1:13:42 AM10/3/19
to Uday Bondhugula, MLIR
This is a quite long sequence of back-and-forth, to end up at what I called out in my very first email in this thread; direct quote here:

> In particular, the "funcop" is just another operation: my take was already that any region held by a non-affine operation should be treated the same as one held by a func-op from the perspective of affine.
> Since the removal of MLFunc and the use of generic region there was really nothing specific left about FuncOp as far as I can tell, but the doc was blindly updated to replace MLFunction with Function.

There was a `sed` (pure text replacement) when MLFunc was removed, and it was not properly updated when Function was removed from being first class and replaced with an op (I should have looked at it more closely at the time, really). I claim this is incorrect (a bug to fix in the spec); it has no principled reason to exist today.

 
"Values of index type defined at the top level of an affine.graybox or a function op" thus has no circular logic - both are well-defined known ops. 

You're misquoting here: the circular logic is there: "A FuncOp has all the properties of an affine graybox since it starts a new symbol context". Basically I'm saying that there is no reason FuncOp has any specific properties that would make it more special than another op (like spv.func) with respect to affine/symbols, and that FuncOp is special because it creates a symbol context. So the only thing special about it is that you make it special (arbitrarily, as far as I can tell).
This is not conservative; I don't understand why you believe this would be correct without a trait.
I explained earlier why I doubt that this is consistent with an important MLIR invariant with respect to implicit captures and regions in general, but you seem to have missed this email: https://groups.google.com/a/tensorflow.org/d/msg/mlir/O5PXVbtlSng/sUE4bMXwAgAJ


 
 
 
But even this isn't as readable as just inserting the special affine.graybox op wherever you want to start a new symbol context, i.e., I tend to prefer having affine.graybox meant for this unique purpose instead of going around sticking its trait to other (known) ops of interest. 

Forcing an affine.graybox to be present to create a symbol context is a possible design; however, if you'd like to go in this direction, then since a FuncOp body is nothing special, we would use an affine.graybox inside a FuncOp to begin any new affine context. The affine verifier could check that any bound symbol is always defined in a region attached to an affine.graybox.

As I mentioned, a FuncOp has all the properties of an affine.graybox as is.

So do potentially 1000 other ops? Is your intent to update the affine spec with the list of all the other ops that could behave the same way (I gave the example of spv.Func; should we update the affine spec to add this next to FuncOp)?

 
One could assume an affine.graybox right under it in their minds or think of it as being elided right under the function definition. 
 

However I still have some doubt about this possible design, as I expressed in my last email (which you haven't addressed yet I believe).

I'm a bit lost now on which doubt you are referring to :-) - whether it's the explicit capture of memrefs thing or something else.

Uday Bondhugula

unread,
Oct 3, 2019, 2:04:24 AM10/3/19
to MLIR


On Tuesday, October 1, 2019 at 11:28:24 PM UTC+5:30, Mehdi AMINI wrote:
FYI I'm fully in agreement with Alex's points at the moment.

Looks like I missed all your comments underneath below. 
 


On Tue, Oct 1, 2019 at 5:42 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
Just responding to these parts since I agree with the rest.

> 3. Conservative analysis through implicit capture
> Affine passes would treat any unknown op conservatively. This means
> that
> an unknown op with regions that implicitly captures any values is
> treated by those passes as an opaque function call that takes all
> implicitly captured values as arguments. This means that the passes
> should compute the set of implicitly captured values for unknown ops.

Actually, the answer is "No" here - they should not! For the unknown ops, why would you want to compute the set of implicitly captured values when you are going to anyway walk through the op?!
The walkAffine method in the proposal walks through inner regions of *all* region-holding ops except the affine.graybox.
And if you don't want to block an inner traversal, there is neither a need to explicitly capture nor a need to compute on-the-fly! It'd just work as is with the walk. The question really isn't about "known" vs "unknown" op here, but about an "op within the current polyhedral symbol context" vs an "op outside the current polyhedral symbol context". affine.graybox has its own symbol context, and so it falls into the latter class. In your earlier example, "crazy.op" (irrespective of whether it's known or unknown) doesn't define a new polyhedral context and so there isn't a need to block a walk into it nor explicitly capture.

Actually that's probably the point I was missing: I'm seeing an unknown op with a region like an opaque call. Basically my take on an invariant of MLIR currently is that if this is valid:

That's right; an unknown op with a region will get fully walked, just like an affine.for/affine.if.
 


func @foo(%m : memref<...>) {
  not.crazy {
    crazy.op(%m) : memref<...>
  }
}

Then the following should always be valid regardless of what is done with %m (this is from the point of view of looking at the `not.crazy` op, assuming that the innermost region is valid for crazy.op, of course).

Yes of course (as long as you don't use affine.load %m [ ...] in a way that violates symbol restrictions). 
 
 

func @foo(%m : memref<...>) {
  not.crazy {
    crazy.op {
      // use of %m implicitly captured by crazy.op
    }
  }
}


The important property here is that the enclosing `not.crazy` cannot constrain what `crazy.op` can do within its enclosed region. 
When exposing the region inside crazy.op, it can provide better analysis than if it wasn't there (for instance you could know that %m is never stored to for the purpose of side-effect analysis), but the first form should already "assume the worst" and so if the first form is valid the second must be.
(This goes beyond affine)

It's not clear to me what point you are making here. The 'not.crazy' op may not constrain "crazy.op" (I don't know what traits/properties you have on it), but the affine.load/store's in crazy.op, if any, are still constrained by the symbol restrictions based on their closest enclosing affine.graybox or func op. Any non-dereferencing uses of %m (those outside of affine.load/store/dma_start/dma_wait) in "crazy.op" will be correctly dealt with by all affine passes - use %m the way you like.

~ Uday
 

-- 
Mehdi



-- 
Mehdi





>2. Stopping the pre-order traversal
>In the pre-order (aka going from the top) traversal of ops, one may want to ignore certain regions attached to the op, or treat them differently. This sounds like a reusable IR traversal pattern that can be parameterized by a condition functor used to analyze the op and decide whether to enter its regions. This does not have to be tied to greybox or affine.

Yes, there may be a future scenario where we don't want to traverse the inner regions of certain ops even though they aren't affine.grayboxes and even though they fall within the current region's symbol context (for efficiency or other reasons) - you'd have to compute summaries for them on the fly irrespective of whether they have an explicit capture on them, and this would in fact be the case for 'call' ops if you want to do something more advanced than being super-conservative. But I don't see why that affects the current design choices on affine.graybox - the unique thing about the latter is that it introduces a new symbol context or has a list of basic blocks with their terminators, so the control flow inside it isn't affine. Please think of the affine.graybox as bringing in polyhedral information from another "domain" that doesn't compose with the polyhedral information of its surrounding affine region. 

~ Uday


Uday Bondhugula

unread,
Oct 3, 2019, 2:28:01 AM10/3/19
to MLIR

But the LangRef doc is only about the core IR ops - the ones that reside in lib/IR/. Which other op has the property of FuncOp and still resides there? If you are suggesting changing the symbol def rules to say "any op with the FunctionLike trait", then (a) we need to define the FunctionLike trait in LangRef, and (b) update the definition of a symbol there first - irrespective of this affine.graybox proposal. I still can't understand why the FuncOp defined in lib/IR/ isn't a special op - it's the only thing there that has the property. I'd suggest starting a separate thread to discuss why it's a bug and how it should be fixed.



 
"Values of index type defined at the top level of an affine.graybox or a function op" thus has no circular logic - both are well-defined known ops. 

You're misquoting here: the circular logic is there: "A FuncOp has all the properties of an affine graybox since it starts a new symbol context". Basically I'm saying that there is no reason FuncOp has any specific properties that would make it more special than other op (like spv.func) with respect to affine/symbol, and that FuncOp is special because it create a symbol context. So the only thing special about it is that you make it special (arbitrarily as far as I can tell).

Same as above. 
I missed the inline comments there - responded now. As I point out, you don't need any traits to be correct for the polyhedral passes. 
 


 
 
 
But even this isn't as readable as just inserting the special affine.graybox op wherever you want to start a new symbol context, i.e., I tend to prefer having affine.graybox meant for this unique purpose instead of going around sticking its trait to other (known) ops of interest. 

Forcing an affine.graybox to be present to create a symbol context is a possible design; however, if you'd like to go in this direction, then since a FuncOp body is nothing special, we would use an affine.graybox inside a FuncOp to begin any new affine context. The affine verifier could check that any bound symbol is always defined in a region attached to an affine.graybox.

As I mentioned, a FuncOp has all the properties of an affine.graybox as is.

So do potentially 1000 other ops? Is your intent to update the affine spec with the list of all the other ops that could behave the same way (I gave the example of spv.Func, should we update the affine spec to add this next to FuncOp?)?

Of course not, because LangRef can't refer to dialect ops! It was written with FuncOp in mind, and if there are other ops which polyhedral passes should know about as starting a new symbol context, there are two options here: (1) always generate an affine.graybox right at their top level, or (2) create a trait meant to match the properties of an affine.graybox and stick that on the op. As I mentioned, I tend to prefer option (1) - because that way an MLIR compiler developer will "see" in the IR what's going to happen instead of relying on knowledge of the traits of known ops. (E.g., a developer of an affine pass may or may not know that the spv.Func op has a trait that allows it to start a new symbol context at that level.)

~ Uday

Mehdi AMINI

unread,
Oct 3, 2019, 3:26:18 AM10/3/19
to Uday Bondhugula, MLIR
I am trying to not consider affine at all here. I wrote these examples to try to illustrate how MLIR region/op interactions are structured opaquely, so that cross-dialect invariants can be derived in general.
The invariant I am presenting above is independent of any dialect; let me abstract the type further:

func @foo(%value : !dialect.type) {  
  op.with_region {
    any.op(%value) : (!dialect.type) -> ()
  }
}

If I look at it generically, here is my take on it:
a) The `op.with_region` defines the semantic of its immediate region, so it can either accept or reject `any.op`. 
b) Let's assume that `op.with_region` does not know anything about `any.op` (no traits, no prior knowledge, the `any.op` could be unregistered at this point).
c) For this IR to be valid, `op.with_region` must accept unknown ops (like `any.op`). 
d) From the perspective of `op.with_region`, the `any.op` is like an opaque call to some code it cannot see.

But what if `any.op` has a region?

func @foo(%value : !dialect.type) {
  op.with_region {
    any.op(%value) ({
      ^bb(%value_inside):
        // do something with %value_inside (explicitly captured)
    }) : (!dialect.type) -> ()
  }
}

Here what changes is that:
e) any.op has a region now. Unless `op.with_region` forbids unknown operations from having a region in its verifier, and it does not have specific handling for `any.op`, the IR should be valid.
f) Since %value_inside is explicitly captured, without knowing `any.op` specifically, the uses of %value_inside cannot be restricted by `op.with_region` but only by `any.op`. 
g) For any practical purpose here, there should not be any difference between this form and the first one above.

Finally, what if `any.op` has implicit capture?

func @foo(%value : !dialect.type) {
  op.with_region {
    any.op() ({
      // do something with %value (implicit capture instead of explicit)
    }) : (!dialect.type) -> ()
  }
}

Now:
h) `any.op()` is implicitly capturing %value.
i) Without more information about `any.op` (traits, etc.), this should be equivalent to the explicit capture case: if the IR was valid in the first and second cases, then it should be valid here.

If we don't have these properties, and if `op.with_region` can constrain the validity of the region attached to `any.op`, then `any.op` is no longer in control of the semantics of the enclosed region. No transformation can operate on `any.op` without knowing all of the enclosing operations, since these can add arbitrary restrictions. 
For example, this is a valid IR (you can pipe this in mlir-opt right now):

module {
  "d1.op1" () ({
    "d2.op2" () ({
      module {
        func @bar() {
          return
        }
        func @foo() {
          call @bar() : () -> ()
          return
        }
      }
      "d2.done" () : () -> ()
     }) : () -> ()
  }) : () -> ()
}

If I get the inner @foo function, and would like to inline the call to @bar, what do I need to check to ensure I can?
If the FuncOp defines the semantics of the region, then the FuncOp should itself control whether it allows inlining or not, and I should query the FuncOp for @foo, the CallOp for the call site, the FuncOp for @bar, and likely the ops inside @bar that I am about to inline.

If you allow putting restrictions on what can happen inside @foo() based on the enclosing operations, then you can't inline unless you ensure that all the enclosing operations will be happy with it (so you need to check the enclosing modules, but also "d1.op1" and "d2.op2").

Basically, this would break the composability of the IR: you couldn't assemble independent pieces and reason about them independently. I don't know why we would want that; here we really want to reason about the functions in the inner module independently of the fact that they are surrounded by "d1.op1"() and "d2.op2"() like here (otherwise none of the current passes in MLIR are correct).

To go back to affine: if you treat every unknown operation as a blackbox (or gray, or green, or ...), wouldn't it be possible to fit into this model? Wouldn't the polyhedral analyses be just as efficient as now, while being conservative and reusable when facing un-modeled parts of the IR?
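The conservative treatment under discussion can be sketched concretely: compute the values a region implicitly captures (used inside, defined outside) and treat the unknown op as an opaque call taking exactly those values. This is a toy sketch; the dict-based IR encoding and `implicit_captures` are invented for illustration.

```python
def implicit_captures(region_ops, defined=()):
    """Return the set of values used in `region_ops` (including nested
    regions) that are neither defined within it nor passed in via
    `defined`."""
    defined = set(defined)
    captured = set()
    for op in region_ops:
        for operand in op.get("operands", ()):
            if operand not in defined:
                captured.add(operand)
        defined.update(op.get("results", ()))
        for nested in op.get("regions", ()):
            # Values the nested region captures from outside this one
            # propagate up; values defined here are already in `defined`.
            captured |= implicit_captures(nested, defined)
    return captured

# An unknown op's region that uses %m and %i from the enclosing scope:
body = [
    {"results": ["%x"], "operands": ["%m", "%i"]},
    {"operands": ["%x", "%m"]},
]
caps = implicit_captures(body)
# caps: {"%m", "%i"} - %x is defined inside, so it is not captured.
```

A conservative pass could then model the unknown op as a call taking `caps` as arguments, reusing whatever opaque-call handling it already has.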

-- 
Mehdi

Caballero, Diego

unread,
Oct 3, 2019, 10:04:36 PM10/3/19
to Mehdi AMINI, Uday Bondhugula, MLIR

 

        > The mental model I have (Uday please correct me if I'm wrong) is that your example is equivalent to:

 

Thanks, Mehdi. That mental model was very useful!

 

I have another example, based on Example 3, that would help my basic understanding of the new picture graybox brings and of the limitations of the affine representation.

For this particular example, would graybox provide any extra value vs representing the indirect load with a std.load?

  %cf1 = constant 1.0 : f32
  affine.for %i = 0 to 100 {
    %v = affine.load %B[%i] : memref<100xf32>
    affine.graybox [%A] {
      // %v is now a symbol here.
      %s = affine.load %A[%v] : memref<100xf32> // Indirect load
      affine.store %s, %C[%i] : memref<100xf32> // Note change in the subscript here.
      return
    }
  }

  affine.for %j = 0 to 100 {
    %l = affine.load %C[%j] : memref<100xf32>
  }

 

Using affine loop fusion as an example, I understand that fusion of loops %i and %j couldn't happen because the store is only considered affine within the graybox.

However, if we changed loop %i to something like this, it would work:

 

    …
    %t = affine.graybox [%A] … {
      // %v is now a symbol here.
      %s = affine.load %A[%v] : memref<100xf32> // Indirect load
      return %s
    }
    affine.store %t, %C[%i] : memref<100xf32> // Note change in the subscript here.
    …

 

Thanks!

Diego

 

Uday Bondhugula

unread,
Oct 3, 2019, 10:42:12 PM10/3/19
to MLIR


On Friday, October 4, 2019 at 7:34:36 AM UTC+5:30, Caballero, Diego wrote:

 

        > The mental model I have (Uday please correct me if I'm wrong) is that your example is equivalent to:

 

Thanks, Mehdi. That mental model was very useful!

 

I have another example, based on Example 3, that would help my basic understanding of the new picture graybox brings and of the limitations of the affine representation.

For this particular example, would graybox provide any extra value vs representing the indirect load with a std.load?


In this case, since the access to %A in the graybox is the only one to it, there isn't a difference between using std.load and affine.load. But had there been multiple accesses to %A, it could have made a difference.
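To make that difference concrete — a hypothetical sketch in the proposed syntax — suppose there were both a store and a load to %A at the same subscript inside the graybox. Keeping them as affine.store/affine.load lets dependence analysis relate the two accesses to each other (e.g., forward the stored value to the load), whereas with std.store/std.load, analyses would have to treat them conservatively:

```mlir
affine.graybox [%A] {
  // %v is a symbol here, so both accesses below are affine and can be
  // analyzed for dependences against each other.
  %cf = constant 2.0 : f32
  affine.store %cf, %A[%v] : memref<100xf32>
  %s = affine.load %A[%v] : memref<100xf32> // forwardable from the store
  return
}
```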

 

  %cf1 = constant 1.0 : f32
  affine.for %i = 0 to 100 {
    %v = affine.load %B[%i] : memref<100xf32>
    affine.graybox [%A] {
                   ^^^
This should be "affine.graybox [%A, %C] {".
 
 

      // %v is now a symbol here.
      %s = affine.load %A[%v] : memref<100xf32> // Indirect load
      affine.store %s, %C[%i] : memref<100xf32> // Note change in the subscript here.
      return
    }
  }

  affine.for %j = 0 to 100 {
    %l = affine.load %C[%j] : memref<100xf32>
  }

 

Using affine loop fusion as an example, I understand that fusion of loops %i and %j couldn't happen because the store is only considered affine within the graybox.


It will actually happen, because the affine.store on %C can be hoisted out of the graybox (this has to be implemented, but as I mentioned in the discussion on @Nicolas' question, it is one of the most practically useful things to implement first). Once it's hoisted out, %C will be dropped from the graybox's argument list and fusion will work; in fact, store-to-load forwarding (-memref-dataflow-opt) will also work on it subsequently and eliminate the references to %C altogether.
 

However, if we changed loop %i to something like this, it would work:

 

    …
    %t = affine.graybox [%A] … {
      // %v is now a symbol here.
      %s = affine.load %A[%v] : memref<100xf32> // Indirect load
      return %s
    }
    affine.store %t, %C[%i] : memref<100xf32> // Note change in the subscript here.
    …

    …


Yes, exactly. These are the kind of things affine.graybox indirectly enables. By having a way to represent these first, one could implement transformations that enable optimization of at least the affine parts to start with (here, the affine.store to %C didn't actually need to be in the graybox).
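For concreteness, here is a sketch (in the proposed, not-yet-implemented syntax) of what the two loops could look like once the affine.store to %C sits outside the graybox and fusion has run; store-to-load forwarding (-memref-dataflow-opt) could then replace the fused load with %t and eliminate %C:

```mlir
affine.for %i = 0 to 100 {
  %v = affine.load %B[%i] : memref<100xf32>
  %t = affine.graybox [%A] {
    %s = affine.load %A[%v] : memref<100xf32> // Indirect load
    return %s
  }
  affine.store %t, %C[%i] : memref<100xf32>
  %l = affine.load %C[%i] : memref<100xf32> // fused in from the %j loop
}
```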
 
~ Uday

Uday Bondhugula

unread,
Oct 14, 2019, 11:12:56 PM10/14/19
to MLIR
Hi Mehdi,

Getting back to this - inline response. 


On Thursday, October 3, 2019 at 12:56:18 PM UTC+5:30, Mehdi AMINI wrote:

g) Without more information about `any.op` (traits, etc.), this should be equivalent to the explicit capture case: if the IR was valid in the first and second cases, then it should be valid here.

If we don't have these properties, and if `op.with_region` can constrain the validity of the region attached to `any.op`, then `any.op` is no longer in control of the semantics of the enclosed region. No transformation can operate on `any.op` without knowing all of the enclosing operations, since these can add arbitrary restrictions.
For example, this is a valid IR (you can pipe this in mlir-opt right now):

module {
  "d1.op1" () ({
    "d2.op2" () ({
      module {
        func @bar() {
          return
        }
        func @foo() {
          call @bar() : () -> ()
          return
        }
      }
      "d2.done" () : () -> ()
     }) : () -> ()
  }) : () -> ()
}



Yes, this may be valid IR, but not even canonicalization or dead code elimination works on this. Running -canonicalize on it:

$ mlir-opt -canonicalize test.mlir

module {
  "d1.op1"() ( {
    %c0_i32 = constant 0 : i32
    %c0_i32_0 = constant 0 : i32
    "d2.op2"() ( {
      module {
        func @bar() {
          %c1_i32 = constant 1 : i32
          %c1_i32_1 = constant 1 : i32
          return
        }
        func @foo() {
          call @bar() : () -> ()
          return
        }
      }
      "d2.done"() : () -> ()
    }) : () -> ()
  }) : () -> ()
}

I'll respond to the connection to the affine.graybox proposal in another post, but I think merely being able to represent such IR, without the basic infra working correctly on it, has little meaning. A separate thread/issue should be started to discuss/fix this before we discuss how affine or other higher-order passes should handle these, because the latter use runOnFunction as well.

~ Uday

River Riddle

unread,
Oct 14, 2019, 11:48:20 PM10/14/19
to Uday Bondhugula, MLIR
On Mon, Oct 14, 2019 at 8:12 PM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
Hi Mehdi,

Getting back to this - inline response. 

On Thursday, October 3, 2019 at 12:56:18 PM UTC+5:30, Mehdi AMINI wrote:

g) Without more information about `any.op` (traits, etc.), this should be equivalent to the explicit capture case: if the IR was valid in the first and second cases, then it should be valid here.

If we don't have these properties, and if `op.with_region` can constrain the validity of the region attached to `any.op`, then `any.op` is no longer in control of the semantics of the enclosed region. No transformation can operate on `any.op` without knowing all of the enclosing operations, since these can add arbitrary restrictions.
For example, this is a valid IR (you can pipe this in mlir-opt right now):

module {
  "d1.op1" () ({
    "d2.op2" () ({
      module {
        func @bar() {
          return
        }
        func @foo() {
          call @bar() : () -> ()
          return
        }
      }
      "d2.done" () : () -> ()
     }) : () -> ()
  }) : () -> ()
}



Yes, this may be valid IR, but not even canonicalization or dead code elimination works on this. Running -canonicalize on it:
The only reason -canonicalize isn't running here is that it is still a function pass and there are no functions directly within the top-level module. If/when it becomes a generic operation pass, it will work as expected.

-- River
 



Mehdi AMINI

unread,
Oct 15, 2019, 1:22:42 PM10/15/19
to Uday Bondhugula, MLIR
I see it the other way around: we're defining the semantics of the IR, and that needs to be soundly defined independently of the infrastructure (which we are also building, by the way). I am not sure how to build an infrastructure without a solid and sound foundation. This is why I abstracted the question away from affine and into structural aspects that are dialect-independent.

-- 
Mehdi
