--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/31231477-2774-4b02-8d3e-c40cacd7125b%40tensorflow.org.
How would you represent memref<2x5xf32, [3, 17]>?
Independently of that, the current implementation of the Linalg transformations relies on stride information being fully dynamic, which is not possible with affine maps.
-- Alex
On Tuesday, September 10, 2019 at 6:39:07 PM UTC+5:30, Alex Zinenko wrote:

> How would you represent memref<2x5xf32, [3, 17]>?

Are 3 and 17 the strides along the two respective dimensions? Why is this special? This should just work. The normalizeMemRefs utility will turn this memref into memref<4x29xf32> with an identity layout, and a load on the original one will be rewritten to:

affine.load %0[%arg0 * 3, %arg1 * 7] : memref<4x29xf32>

> Independently of that, the current implementation of the Linalg transformations relies on stride information being fully dynamic, which is not possible with affine maps.

Do you mean strides that are SSA values (and unknown at compile time)? This is just represented using semi-affine maps and normalized to identity maps. If your strides are (s0, s1) along the two dimensions, you'll have:

memref<128 x 128 x f32, (d0, d1) -> (s0 * d0, s1 * d1)>
Thanks for the proposal! I appreciate the unification effort, and this looks like a step in the right direction given the tensor+memref layout discussions. I have several questions:

1. If affine map composition and stride list are mutually exclusive, is it possible to just alternate them in the type signature to make it syntactically impossible to mix? Like in memref<42x16xf32, [1, 64], /*memory space=*/2>. It does not seem difficult to parse. More conceptually, do you intend for stride lists and affine maps to eventually mix, or are strides more of an alternative layout representation for memrefs (the tiled layout proposed for tensors and MKL layouts can be other supported layout kinds)?

2. (syntax nitpick) Is it possible to use `?` instead of -1 in the stride list for dynamic strides? `?` better reflects the fact that the stride is unknown, while `-1` can be interpreted as going in the backward direction.

3. The proposed lowering to LLVM seems to differ from what we currently have; in particular, we only store _dynamic_ sizes, whereas what you propose looks like storing _all_ sizes. What is the rationale for that?
On Tue, Sep 10, 2019 at 6:01 AM 'Alex Zinenko' via MLIR <ml...@tensorflow.org> wrote:

> 3. The proposed lowering to LLVM seems to differ from what we currently have; in particular, we only store _dynamic_ sizes, whereas what you propose looks like storing _all_ sizes. What is the rationale for that?

What about function boundaries? If you don't store all memrefs of a given rank the same way, you need to make new copies across function boundaries, I guess? Inside a local scope, nothing prevents constant-propagating and compressing a locally used struct by dropping unused fields.
On Tue, Sep 10, 2019 at 3:54 PM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
> On Tuesday, September 10, 2019 at 6:39:07 PM UTC+5:30, Alex Zinenko wrote:
> > How would you represent memref<2x5xf32, [3, 17]>?
> Are 3 and 17 the strides along the two respective dimensions? Why is this special? This should just work. The normalizeMemRefs utility will turn this memref into memref<4x29xf32> with an identity layout, and a load on the original one will be rewritten to:
> affine.load %0[%arg0 * 3, %arg1 * 7] : memref<4x29xf32>

I don't quite get where the 4x29 comes from. I suspect there might be a terminology confusion: "stride" in this proposal doesn't mean the same thing as in the affine world. It is the number of _linearized_ elements you need to step over to get to the next element along the given dimension. So stride [3, 17] is not equivalent to (i, j) -> (3*i, 17*j); stride [51, 17] would have been equivalent to that. In this particular case, the linearized offsets are as follows:

[0,0] -> 0    [0,1] -> 17   [0,2] -> 34   [0,3] -> 51   [0,4] -> 68
[1,0] -> 3    [1,1] -> 20   [1,2] -> 37   [1,3] -> 54   [1,4] -> 71

> Do you mean strides that are SSA values (and unknown at compile time)? This is just represented using semi-affine maps and normalized to identity maps. If your strides are (s0, s1) along the two dimensions, you'll have:
> memref<128 x 128 x f32, (d0, d1) -> (s0 * d0, s1 * d1)>

It seems to rather be memref<4096 x f32, (d0, d1) -> (s0*d0 + s1*d1)> for the stride model in question, which leads to premature linearization of accesses.
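The linearized-offset arithmetic described above can be sketched in a few lines of Python (the helper name is made up for illustration; this is not part of any MLIR API):

```python
# Linearized element offset of an index under a stride list, using the
# proposal's convention: offset = sum(index[d] * stride[d]).
def linearized_offset(index, strides):
    return sum(i * s for i, s in zip(index, strides))

# Reproduce the offset table for memref<2x5xf32, [3, 17]>.
table = {(i, j): linearized_offset((i, j), (3, 17))
         for i in range(2) for j in range(5)}
```

Evaluating the table gives exactly the offsets listed above, e.g. index [0,1] lands at 17 and [1,4] at 71.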
-- Alex
On Tue, Sep 10, 2019 at 7:34 PM Mehdi Amini <ami...@google.com> wrote:

> What about function boundaries? If you don't store all memrefs of a given rank the same way, you need to make new copies across function boundaries, I guess? Inside a local scope, nothing prevents constant-propagating and compressing a locally used struct by dropping unused fields.

I don't think this is anyhow different from the sizes. If the stride is static, it is present in the type itself. If the stride is dynamic, we need a field in the struct to store it.

On concrete examples, `memref<42x?xf32>` is currently equivalent to `{float*, i64}`, where the integer stores the dynamic size and 42 always remains a constant. What this seems to propose (unless I am misreading the section) is to change that to `{float*, i64, i64}`, with the integers containing _both_ sizes, regardless of them being static or dynamic in the type. On the function boundary, you are not allowed to pass a memref<42x?xf32> where a memref<?x?xf32> is expected, and vice versa; you need an exact match.

For strides, we can use exactly the same kind of encoding: `memref<2x2xf32, [-1,1]>` will have one field for the dynamic stride, represented by -1 in the type, and `memref<2x2xf32, [2,1]>` will not have any. This will pass function boundaries just as well as the current encoding of dynamic sizes does. There is not necessarily a link between sizes being dynamic and strides being dynamic.

In fact, I originally proposed to store all sizes when going to LLVM. We decided against it with the following arguments:
(1) it creates a hazard of inconsistency between the static value visible in the type and the dynamic value stored; although we could declare such a situation undefined behavior, we did not want to have undefined behavior at that point;
(2) it makes it difficult to interface with C functions; without storing dynamic sizes, a statically-shaped memref is a bare pointer;
(3) it creates storage overhead, and we did not (still don't) have CSE on the LLVM dialect.
Strided tensors should be allowed to have negative or zero strides. Also, an offset that points to the first element of the tensor should be added.

1) Zero strides are needed to represent broadcasting operations.
2) Negative strides are needed to reverse a tensor dimension without a copy.

This also makes marking dynamic strides with -1 impossible, but we can use a flag instead. The main advantage is that it would allow MLIR to support all shape operations supported by NumPy, DLPack (a cross-framework tensor representation effort - https://github.com/dmlc/dlpack), PyTorch, and all other frameworks that follow the "rank + shape + strides + offset + data" paradigm.
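The two cases above can be illustrated over a flat Python list following the same "rank + shape + strides + offset + data" model (purely illustrative; the helper name is made up):

```python
def strided_get(data, offset, strides, index):
    # Address = offset + sum(index[d] * stride[d]); zero strides broadcast,
    # negative strides walk the underlying buffer backwards.
    return data[offset + sum(i * s for i, s in zip(index, strides))]

buf = [10, 20, 30]

# Reversed 1-D view: shape (3,), stride (-1,), offset at the last element.
reversed_view = [strided_get(buf, 2, (-1,), (i,)) for i in range(3)]

# Broadcast 2-D view of the length-3 row: shape (2, 3), strides (0, 1);
# the zero stride repeats the row without copying it.
broadcast_view = [[strided_get(buf, 0, (0, 1), (i, j)) for j in range(3)]
                  for i in range(2)]
```

Neither view copies or mutates `buf`; both are just different stride/offset interpretations of the same storage.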
On Tue, Sep 10, 2019 at 2:02 PM 'Alex Zinenko' via MLIR <ml...@tensorflow.org> wrote:

> For strides, we can use exactly the same kind of encoding: `memref<2x2xf32, [-1,1]>` will have one field for the dynamic stride, represented by -1 in the type, and `memref<2x2xf32, [2,1]>` will not have any. This will pass function boundaries just as well as the current encoding of dynamic sizes does. There is not necessarily a link between sizes being dynamic and strides being dynamic.

Right, what I was alluding to is that if you have a memref<8x?x?x?x?xf32> and you want to call a function that takes a memref<?x?x?x?x?xf32>, you have to reallocate and copy the structure. The advantage of always materializing the shape in the struct is that you don't need any conversion across function calls.

> In fact, I originally proposed to store all sizes when going to LLVM. We decided against it with the following arguments:
> (1) it creates a hazard of inconsistency between the static value visible in the type and the dynamic value stored; although we could declare such a situation undefined behavior, we did not want to have undefined behavior at that point;

Such undefined behavior does not seem terrible to me, because it is one that is trivial to catch with runtime checks (we should avoid any UB that we don't have a "cheap" and principled way of detecting reliably).

> (2) it makes it difficult to interface with C functions; without storing dynamic sizes, a statically-shaped memref is a bare pointer;

I'm not sure I totally follow what is more difficult here: if you have a static type, then you can extract the pointer from the structure and pass it to a C function. If you don't have a static shape, then you can't pass this to a C function anyway?

> (3) it creates storage overhead, and we did not (still don't) have CSE on the LLVM dialect.

This one seems short-sighted to me: I would not design core components of the system based on the current state of the optimizer (or lack thereof). (And there is CSE at the LLVM level; we should likely try to emit the code and improve LLVM if it does not match our IR patterns.)
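The "cheap" runtime check alluded to above, catching a descriptor whose stored sizes contradict the static sizes in the type, could be sketched like this (hypothetical helper; nothing of the sort exists in the lowering being discussed):

```python
def check_descriptor(static_sizes, stored_sizes):
    """Verify that sizes materialized in the descriptor agree with the
    static sizes visible in the type; `None` marks a dynamic dimension,
    which may hold any stored value."""
    for static, stored in zip(static_sizes, stored_sizes):
        if static is not None and static != stored:
            raise ValueError(
                f"descriptor stores size {stored}, but type requires {static}")

# memref<42x?xf32> whose descriptor stores (42, 7): consistent.
check_descriptor((42, None), (42, 7))
```

A mismatched descriptor, e.g. one storing 41 in the first slot, would raise instead of silently proceeding into undefined behavior.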
Thanks for your replies, all; sorry for the delay, I was out yesterday. Here are some answers below.

On Tue, Sep 10, 2019 at 9:01 AM 'Alex Zinenko' via MLIR <ml...@tensorflow.org> wrote:

> 1. If affine map composition and stride list are mutually exclusive, is it possible to just alternate them in the type signature to make it syntactically impossible to mix? Like in memref<42x16xf32, [1, 64], /*memory space=*/2>

Sure, consider it done.

> More conceptually, do you intend for stride list and affine maps to eventually mix, or are strides more of an alternative layout representation for memrefs?

Technically they can mix during lowering, as is pointed out a bit later by Volodymyr Arbatov. However, they may create undecidable problems for dependence analysis and force the insertion of case disjunctions early in the compilation chain, which seems undesirable. Since a strided memref is a (non-contiguous) view into an existing contiguous buffer, it is subject to aliasing, and mixing is safer to disallow for now.

> 2. (syntax nitpick): is it possible to use `?` instead of -1 for the stride list for dynamic strides?

We can, but we would have to use a syntax such as `memref<42x16xf32, 1x?, /*memory space=*/2>`, because an integer array attribute won't parse `?`. I don't personally find this much better, but I am open to suggestions. Maybe someone has a better proposal; in any case it will be easy to modify once implemented.
--N
How exactly does this compose with the affine compositions already present on memrefs?
I'm also a bit concerned with adding a bunch of things to types, especially given the recent discussion about adding a layout attribute to tensor types.
Are we ever going to inter-mix affine and strides?
On Wednesday, September 11, 2019 at 9:09:04 AM UTC-7, Nicolas Vasilache wrote:

> We can, but we would have to use a syntax such as `memref<42x16xf32, 1x?, /*memory space=*/2>`, because an integer array attribute won't parse `?`. I don't personally find this much better, but I am open to suggestions.

I'd rather not bake in '-1' as dynamic in the IR; this is the exact reason why dimension lists use `?`.
Side note: in case it wasn't clear from Nicolas' initial email, this work is a subset of work from the same working group that is proposing changes to both the Tensor and MemRef types. We'll be giving a public talk in a couple weeks to detail the proposal.
The cblas interface is quite outdated, it only works with contiguous matrices and unit-stride in the non-leading dimension.
The BLIS interface, which allows any striding (including zero and negative strides), provides key benefits for many use cases:
- LLVM Polly bases its polyhedral optimization of kernels on BLIS
- Einsum and tensor contraction can work without copies or transposition external to the BLIS call for most contraction types
- spaCy, the leading NLP framework, now uses BLIS by default
Now, if we had a higher-level dialect that exposes NumPy-like memrefs, which high-level frameworks could target and which lowers to the dialect in this specification, that would also work, but it seems like extra work that we could avoid.
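To make concrete why arbitrary row/column strides matter for a BLIS-style interface, here is a toy matmul over flat buffers where each matrix is addressed through its own stride pair (illustrative Python only; BLIS itself is a C library and this mirrors only the addressing idea of its object API):

```python
def gemm_strided(m, n, k, a, rs_a, cs_a, b, rs_b, cs_b, c, rs_c, cs_c):
    # C[i,j] += sum_p A[i,p] * B[p,j], with every matrix addressed through
    # its own (row stride, column stride) pair, so row-major, column-major,
    # and transposed operands all work without copies.
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i * rs_a + p * cs_a] * b[p * rs_b + j * cs_b]
            c[i * rs_c + j * cs_c] += acc

A = [1.0, 2.0, 3.0, 4.0]   # row-major 2x2: [[1,2],[3,4]]
B = [5.0, 7.0, 6.0, 8.0]   # column-major 2x2 storing [[5,6],[7,8]]
C = [0.0] * 4              # row-major 2x2 output
gemm_strided(2, 2, 2, A, 2, 1, B, 1, 2, C, 2, 1)
```

Note that B is consumed column-major (rs=1, cs=2) without any transposition step before the call, which is exactly the copy-free flexibility argued for above.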
--N
If I am reading this correctly, then memref<5x5xf32> and memref<5x5xf32, [25, 1]> are different types.
1) Is it the responsibility of the operation on memrefs to check whether a given stride matches the semantics of the operation? For example, 'addf' currently doesn't have any notion of strides. Presumably, if it gets an operand with a stride specified, the verifier should (would?) fail. If the op truly doesn't care about the strides, then the operation verifier would have to be extended to allow that.
2) One could interpret memref<5x5xf32, [25, 1]> as a specialization of the memref<5x5xf32>.
So any op that has an operand of type memref<5x5xf32> should be happy with a memref<5x5xf32, [25, 1]>. Is there a general mechanism to allow for this? One way would be to have a MemRefStridedType that derives from MemRefType. That would capture the specialization, and ops that just need a MemRef-type operand would be fine getting a MemRefStridedType operand. Are there downsides to that approach?
3) This might be outside the scope of this proposal, but is there a notion of "default strides"? I realize that having a default with dynamic shapes is tricky, but I just want to get some clarity on what is being considered along this line.
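For static shapes, the natural candidate for "default strides" would presumably be the contiguous row-major layout, whose computation is standard (illustrative Python, not an MLIR utility):

```python
def default_strides(shape):
    # Row-major (C-order) contiguous strides: the last dimension has
    # stride 1, and each earlier stride is the product of all later
    # dimension sizes.
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return strides

# e.g. a contiguous 2x5 buffer has strides [5, 1], and a contiguous
# 4x2x3 buffer has strides [6, 3, 1].
```

As the question notes, this breaks down as soon as any trailing dimension is dynamic, since every earlier stride then becomes dynamic too.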
On Tue, Sep 10, 2019 at 9:54 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
> On Tuesday, September 10, 2019 at 6:39:07 PM UTC+5:30, Alex Zinenko wrote:
> > How would you represent memref<2x5xf32, [3, 17]>?
> Are 3 and 17 the strides along the two respective dimensions? Why is this special? This should just work. The normalizeMemRefs utility will turn this memref into memref<4x29xf32> with an identity layout, and a load on the original one will be rewritten to:
> affine.load %0[%arg0 * 3, %arg1 * 7] : memref<4x29xf32>

`memref<2x5xf32, [3, 17]>` is a strided view into a non-contiguous region of at least (4 * 17) + (1 * 3) + 1 elements (indexed as Alex explains below). The memref itself is a reference to 2x5 elements. `memref<4x29xf32>` is a contiguous region of 4x29 elements.
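The footprint count above generalizes: the farthest element a strided view touches sits at offset sum over dimensions of (size - 1) * stride, so the underlying buffer needs that offset plus one element (sketched in Python; the helper name is made up):

```python
def min_buffer_elements(shape, strides):
    # The farthest element touched by the view is at linear offset
    # sum((size - 1) * stride); the buffer must hold that offset plus one.
    return sum((s - 1) * st for s, st in zip(shape, strides)) + 1

# memref<2x5xf32, [3, 17]>: (2-1)*3 + (5-1)*17 + 1 == 72 elements,
# matching the "(4 * 17) + (1 * 3) + 1" count above.
```

For a contiguous memref such as memref<4x29xf32> with row-major strides [29, 1], the same formula gives exactly 4*29 elements, i.e. the full dense region.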
--N
On Wed, Sep 11, 2019 at 1:51 PM 'River Riddle' via MLIR <ml...@tensorflow.org> wrote:How exactly does this compose with the affine compositions already present on memrefs?They are mutually exclusive as explicitly stated. Taking into account Alex's comment the ebnf looks like:``` {.ebnf}
memref-type ::= `memref` `<` dimension-list-ranked tensor-memref-element-type
((`,` semi-affine-map-composition) | (`,` stride-list))?
(`,` memory-space)? `>`
semi-affine-map-composition ::= (semi-affine-map `,` )* semi-affine-map
stride-list ::= `[` integer (`,` integer)* `]`
memory-space ::= integer-literal /* | TODO: address-space-id */
```However, as Volodymyr points out: "Index mapping functions, which in my understanding are conceptually similar to affine maps, work on top of that within dimensions of a tensor shape."This indeed opens the door to possible interactions during lowering but I do not expect such interactions will happen during analyses.I'm also a bit concerned with adding a bunch of things to types, especially given the recent discussion about adding a layout attribute to tensor types.Concerned how?I am much more concerned that memrefs just cannot represent non-continuous input memory regions.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/f2932fc7-40a5-4335-8fb2-50e4b8059b9b%40tensorflow.org.
--N
On Wednesday, September 11, 2019 at 9:47:30 PM UTC+5:30, Nicolas Vasilache wrote:On Tue, Sep 10, 2019 at 9:54 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
On Tuesday, September 10, 2019 at 6:39:07 PM UTC+5:30, Alex Zinenko wrote:How would you represent memref<2x5xf32, [3, 17]> ?Are 3 and 17 the strides along the two respective dimensions? Why is this special? This should just work. The normlizeMemRefs utility will turn this memref into memref<4x29xf32> with an identity layout, and a load on the original one will be rewritten to:affine.load %0[%arg0 * 3, %arg1 * 7] : memref<4x29xf32>`memref<2x5xf32, [3, 17]>` is a strided view into a non-contiguous region of at least (4 * 17) + (1 * 3) + 1 elements (indexed as Alex explains below).The memref itself is a reference to 2x5 elements.`memref<4x29xf32>` is a contiguous region of 4x29 elements.It became clear later that memref<2x5xf32, [3,17]> as per this proposal is the same asmemref<2x5xf32, (d0, d1) -> (3*d0 + 17*d1) >with the current memref with affine map. Is this accurate? There is no linearization of the access here. You would still have load/stores represented as %A[%i, %j] since the logical space is 2 x 5 -- so it does represent a non-contiguous region. The linearization hides in the non-identity affine map until one normalizes the memref.
On Wed, Sep 11, 2019 at 5:26 PM 'Mahesh Ravishankar' via MLIR <ml...@tensorflow.org> wrote:If I am reading this correctly, then memref<5x5xf32> and memref<5x5xf32, [25, 1]> are different types.Yes, although in this very particular corner case we could promote memref<5x5xf32, [25, 1]> to memref<5x5xf32>
1) Is it the responsibility of the operation on memrefs to see if a given stride matches the semantics of the operation. For example, currently 'addf` doesn't have any notion of strides. Presumably if it gets an operand with the stride specified, the verifier should (would?) fail. If the op truly doesn't care about the strides, then the operation verifier would have to be extended to allow thatYes, it is the responsibility of each op that takes memrefs to specify what they want and verifiers should be adapted accordingly.affine.load/store, std.load/store and std.dim work fine with strided memrefs.2) One could interpret memref<5x5xf32, [25, 1]> as a specialization of the memref<5x5xf32>.So any op that has an operand of type memref<5x5xf32> should be happy with a memref<5x5xf32, [25, 1]>. Is there a general mechanism to allow for this. One way would be to have MemrefStridedType which derives from a MemrefType. That would capture the specialization and ops that need just need a Memref type operand would be fine getting a MemrefStridedType opeand. Are there downsides of that approach?In general a strided memref is not a specialization of a continuous memref, they are different types.The case you point out is a special case that we may want to just promote automatically.3) This might be outside of the scope of this proposal, but is there a notion of "default strides". I realize that having a default with dynamics shapes is tricky, but just want to get some clarity on what is being considered along this line.Yes a MemRef without strides implicitly has "default strides" which is what you'd get from a continuous buffer.By abusing notations, memref<MxNxPxf32> has strides [N * P, P, 1].This is already implicit in the existing memref address calculation and does not change.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/CAO%3D1vNNkRqz7U1L9%2BA1iutrVt_hnknv3Z499%2BscsaEXWskk6Yw%40mail.gmail.com.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/8ea8642f-c6c1-4c8f-ac7d-c941b0d31c5f%40tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/8ea8642f-c6c1-4c8f-ac7d-c941b0d31c5f%40tensorflow.org.
---- Alex
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/b18a8cc4-bdd4-46d7-bd44-141148aba89b%40tensorflow.org.
This sounds interesting. As I understand the affine map on memref is the way to express data layout, right?
One of potential problems with strides is that they correspond to specific, though most commonly used, data layouts (or, in other words, it's a particular form of index mapping). When it comes to more complex data layouts, for example multi-level nested tiled data layout, these parameters might become useless or reflect memory organization connected to original dimensions in a complex way.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/b18a8cc4-bdd4-46d7-bd44-141148aba89b%40tensorflow.org.
The symbols associated with affine layout maps serve multiple purposes, as you know, not just strides. They could represent parametric tile sizes for data tiling, or they could be parametric shifts/padding for example. You get all at once (including compositions) without having to specialize/reimplement support for each separately.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/b18a8cc4-bdd4-46d7-bd44-141148aba89b%40tensorflow.org.
On Thu, Sep 12, 2019, 01:03 'Nicolas Vasilache' via MLIR <ml...@tensorflow.org> wrote:On Wed, Sep 11, 2019 at 5:26 PM 'Mahesh Ravishankar' via MLIR <ml...@tensorflow.org> wrote:If I am reading this correctly, then memref<5x5xf32> and memref<5x5xf32, [25, 1]> are different types.Yes, although in this very particular corner case we could promote memref<5x5xf32, [25, 1]> to memref<5x5xf32>This sounds like a reasonable behavior to me. We already do the same to eliminate identity maps from the memref.1) Is it the responsibility of the operation on memrefs to see if a given stride matches the semantics of the operation. For example, currently 'addf` doesn't have any notion of strides. Presumably if it gets an operand with the stride specified, the verifier should (would?) fail. If the op truly doesn't care about the strides, then the operation verifier would have to be extended to allow thatYes, it is the responsibility of each op that takes memrefs to specify what they want and verifiers should be adapted accordingly.affine.load/store, std.load/store and std.dim work fine with strided memrefs.2) One could interpret memref<5x5xf32, [25, 1]> as a specialization of the memref<5x5xf32>.So any op that has an operand of type memref<5x5xf32> should be happy with a memref<5x5xf32, [25, 1]>. Is there a general mechanism to allow for this. One way would be to have MemrefStridedType which derives from a MemrefType. That would capture the specialization and ops that need just need a Memref type operand would be fine getting a MemrefStridedType opeand. Are there downsides of that approach?In general a strided memref is not a specialization of a continuous memref, they are different types.The case you point out is a special case that we may want to just promote automatically.3) This might be outside of the scope of this proposal, but is there a notion of "default strides". 
I realize that having a default with dynamics shapes is tricky, but just want to get some clarity on what is being considered along this line.Yes a MemRef without strides implicitly has "default strides" which is what you'd get from a continuous buffer.By abusing notations, memref<MxNxPxf32> has strides [N * P, P, 1].This is already implicit in the existing memref address calculation and does not change.Can we specify this in documentation, for clarity? This is also relevant for lowering to LLVM.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/4c2a4e6e-e0b1-e249-b10b-a2c48524ed45%40iisc.ac.in.
While this may be profitable in some case, I can imagine that the increased code-size can be undesirable, so is the potentially very large amount of cloning and specialization on compile time.But more importantly, this does not address the use of memref as a type that can cross module boundaries. Not being able to have some modularity in the codegen (through the ability to specify an ABI) seems overly restrictive to me.It seems that the lowering and the structure that Nicolas mentioned is allowing to express an external function taking a memref for a given shape (or even including some dynamic shapes), and call this external function with memref arguments that have various stride configurations without having to specialize this function.While admittedly this is only for a subset of the expressiveness power of affine maps, the modularity aspect still seems valuable depending on your use-case.Do you believe that there is no applicability for this?
--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/ad37f7db-e80f-4859-9e12-66372803c32d%40tensorflow.org.
On Tue, Sep 17, 2019 at 10:22 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
On Tuesday, September 17, 2019 at 12:46:03 PM UTC+5:30, Mehdi AMINI wrote:
Hi Uday,Thanks for the very thorough email!
On Mon, Sep 16, 2019 at 1:14 PM 'Uday Kumar Reddy B' via MLIR <ml...@tensorflow.org> wrote:
...
Thanks for the detailed arguments on both sides. From my perspective, it looks like there are many valid points _and_ some missing aspects in both cases.For the "just keep the memref", there is not enough practical considerations in the LangRef or in the lowering for it to be compared to the proposal in question. In particular, the following questions1. How does memref normalization work for layout maps with symbols?def @foo(%A: memref<?x?xf32, (d0,d1)[s0] -> (s0*d0 + d1)>, %i: index, %j: index) {load %A[%i, %j] : memref<?x?xf32, (d0,d1)[s0] -> (s0*d0 + d1)>/*...*/}I checked the current implementation, and it does not do anything.
(Side note, it crashes on `alloc()[%s] : memref<10x10xf32, (d0,d1)[s0] -> (10*d0 + d1)>`).
We cannot move [s0] to the load arguments because it has been bound at allocation, and we obviously cannot just drop
it. If it is somehow present in the memref, how do we differentiate, after normalization, between memref<?x?xf32> that was normalized from a map with a symbol, and form a map without a symbol?
2. How does normalization work with external functions, which are defined in another module and cannot be duplicated in principle (be it beneficial to performance or not)?
3. If we don't normalize, we need an LLVM or C runtime data structure to store the data associated with this memref and the related affine map. What would that be? Assuming it still has a base pointer, what is the formula to compute the linear offset in number of elements of a specific element given the indices and the data present in the dynamic structure?
Hi Nicolas
I'm going to respond to your numerous questions/comments in bulk here
instead of inline :-), since I think they are all inter-related, and I
wanted a particular order to it. (I'm skipping some of the questions
that I found digressing and unrelated to this strided memref proposal.)
1) I think my first question was on clarifying separately on
*representational power* and *ease of representation/transformation*.
Although I don't see an explicit response to the first part, it's clear
(and also with Alex confirming) that this proposal deals with a subset
of what affine maps can already represent. In your earlier response to
River's question, you stated
"I am much more concerned that memrefs just cannot represent
non-continuous input memory regions.", i.e., the current affine map
layout couldn't represent non-contiguous regions and that this proposal
did; but that is clearly inaccurate. It would be good to reconsider
things in view of that. Your previous email has thus been all about ease
of representation - so let's get to that.
2) Coming to the ease of representation: I think your point that it is
hard to pull out individual aspects from compound maps like strides,
tile sizes, offsets (for whatever reasons you need them) doesn't make
sense at all because you are really comparing apples to oranges there!
:-) This proposal in discussion is on strided memrefs, and if you look
at how those strides get into and are represented in affine maps, it's
just trivial to extract them out if the strides are what you wanted to
represent (just with a pattern match on the affine maps - see further
below).
Now, if the affine map is a compound map representing a composition of
several things, the comparison should be made to a design that is
roughly as powerful (representational power-wise), and is able to
represent all of that (strides, tiled layouts, nested tiled layout, all
useful compositions thereof such as strided intra-tile + permuted tile
space), and then evaluate which one is better.
The representation in
this proposal is really adding a very thin layer of abstraction to
represent a proper subset of what's possible with affine maps, and at
the expense of new syntax and a new representation, and providing an
alternate/duplicate lowering path.
3) Reg. your comment on the different number of dimensional arguments
and symbolic arguments to the AllocOp: I'm not sure why that's an issue.
One should let the compiler do its job here! :-) If there are really
going to be 42 symbolic arguments binding to a layout maps (albeit
unrealistic), there's probably a reason, and let the machinery propagate
stuff. The key is that the current affine map based design is unified
with the rest of the SSA IR and as such SSA values bound to the
dimensions, symbols of the alloc all benefit from the existing infra on
constant propagation into affine maps, and algebraic simplification and
canonicalization of maps.
4) Reg. your comment on hearing about a counter proposal on the affine
dialect, this is a bit surprising, since the counter proposal you are
asking for is already the current memref design we have! :-)
(b) keep track of the symbols the same way as the shape SSA values
associated with the dynamic dimensions in the LLVM lowering.
You need
this design only to preserve/hide complex non-identity layouts *all the
way* up until the LLVM lowering - normally, one would like to just
normalize during -lower-affine, because the linearized stuff can be left
alone as we are anyway out of affine.load/stores. Or you could do that
just before LLVM lowering in order to separate concerns and not make the
really nice LLVM lowering we currently have heavy.
With all these design points, the LLVM lowering will NOT see any symbolic operands on the
alloc's or views!
5) At several points, you claim that affine is limiting in many ways and
give orthogonal examples: but the thing in discussion is *this* layout
proposal and what is being proposed here is a small subset of what can
already be achieved with affine maps. It doesn't make sense to make the
affine restriction argument unless the proposal can deal with something
that can't already be represented with affine maps (for eg. for
affine.for, the case of non-constant steps is a valid one, but this
proposal isn't about loop structure). So let's stick to the topic at hand.
6) The danger with the proposed representation is that if someone needs
something more in the future, this specialized representation and syntax
are hard to adapt/extend to it, and you'll have to go back to the
drawing board. And consequently, one gets a pot pourri of different
syntax/representations attached with the memref grammar looking like:
memref<? x ? x f32, rep1 | rep2 | rep3 | ...>,
To conclude, I think this proposal is really adding a very thin layer of
abstraction at the expense of new syntax and a new representation; it
also only achieves what is already possible with current memref design,
and it is also in a manner not as unified with the rest of MLIR as
current memrefs are. Without augmenting it further to represent more
stuff, I think it doesn't pay the price of changing the IR and the type
system.
(b) keep track of the symbols the same way as the shape SSA values
associated with the dynamic dimensions in the LLVM lowering.If symbol information exists in the memref descriptor when lowering to LLVM, then I can indeed use and extract that from an (asserted) normal form and get it into the shape I want to pass to external library calls and achieving the unification I am after.Thanks for raising that point, it wasn't clear to me that this information would be present in the type.
You need
this design only to preserve/hide complex non-identity layouts *all the
way* up until the LLVM lowering - normally, one would like to just
normalize during -lower-affine, because the linearized stuff can be left
alone as we are anyway out of affine.load/stores. Or you could do that
just before LLVM lowering in order to separate concerns and not make the
really nice LLVM lowering we currently have heavy.With all these design points, the LLVM lowering will NOT see any symbolic operands on the
alloc's or views!
5) At several points, you claim that affine is limiting in many ways and
give orthogonal examples: but the thing in discussion is *this* layout
proposal and what is being proposed here is a small subset of what can
already be achieved with affine maps. It doesn't make sense to make the
affine restriction argument unless the proposal can deal with something
that can't already be represented with affine maps (for eg. for
affine.for, the case of non-constant steps is a valid one, but this
proposal isn't about loop structure). So let's stick to the topic at hand.This is precisely the topic at hand: "unify existing codegen strategies (Affine and Linalg dialects) by using the same standard type".I am not sure how this comment applies.
6) The danger with the proposed representation is that if someone needs
something more in the future, this specialized representation and syntax
are hard to adapt/extend to it, and you'll have to go back to the
drawing board. And consequently, one gets a pot pourri of different
syntax/representations attached with the memref grammar looking like:
memref<? x ? x f32, rep1 | rep2 | rep3 | ...>,Yes, I am sympathetic to that.The corollary seems to be that no other layout information should be attached to memref unless it encompasses / generalizes the layout map ?
To conclude, I think this proposal is really adding a very thin layer of
abstraction at the expense of new syntax and a new representation; it
also only achieves what is already possible with current memref design,
and it is also in a manner not as unified with the rest of MLIR as
current memrefs are. Without augmenting it further to represent more
stuff, I think it doesn't pay the price of changing the IR and the type
system.I think a similar result can be achieved without modifying the syntax by asserting a layout map in strided form (as you pointed out) and bailing out.The downside is there is no guarantee by construction and extra work will be needed but at least progress could be made.I'll give that a shot.
--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/fbb9e2e9-250b-486f-9f37-615ff7f93b4a%40tensorflow.org.
Thanks Uday. In your example%0 = alloc() : memref<10 x 300 x f32, (d0, d1) -> (d0, d1 floordiv 8, d1 mod 8)I think your calculation for the allocation size also assumes implicit padding when the dim size is not a multiple of the tile size (similar to XLA's tiled layout).
By using the view, you can "record" the scalar shape information in the base memref type, but the issues we've always had with derived views like this is how they cross function boundaries. If you pass this vectorized view memref type into a function, then lowering inside this function will likely still need to see the "base" memref type to see where the padding is.Don't get me wrong, I've argued for views for quite some time. We tend to get stuck with the function boundary issue, how pas the chain of types. Here is an example where I append the base memref type onto the view type (not saying this is the right thing to do):%0 = alloc() : memref<10 x 300 x f32, (d0, d1) -> (d0, d1 floordiv 8, d1 mod 8)%1 = view %0 : memref<10 x 300 x f32, (d0, d1) -> (d0, d1 floordiv 8, d1 mod 8), memref<10 x 38 x vector<8xf32>>call @func(%1)And the function signature would look like this:func (%arg0 : memref<10 x 300 x f32, (d0, d1) -> (d0, d1 floordiv 8, d1 mod 8), memref<10 x 38 x vector<8xf32>>) {%v = load %arg0[%i, %j] : memref<10 x 300 x f32, (d0, d1) -> (d0, d1 floordiv 8, d1 mod 8), memref<10 x 38 x vector<8xf32>>}This would work, but if we had dynamic dims/symbols, then the runtime struct for memref would need to keep a stack of dims/symbols for the chain of derived memref types.
Alternatively, if we could compose the chain of maps for these derived memref types, into a single memref type that could pass function boundaries, but also supported padding, that would be great.
Andy
--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/7c10a0da-3b2c-45eb-9d3f-918fce07f1c9%40tensorflow.org.
Thanks Uday. In my example above, the view memref with vector element type, has its innermost dimension indexed by tile index in that dimension (not scalar element dim index), so we've lost information and I'm not sure how we would compose the view map with the base memref map.
I suppose we could allow the just use the base memref as is, and use the affine map "range_sizes" property (that we got rid of) to represent the vector element shape. For example:A. %0 = alloc() : memref<10 x 300 x f32, (d0, d1) -> (d0, d1 floordiv 8, d1 mod 8), range_sizes = [*, *, 8]>
Alternatively, we could add element type "transfer size" explicitly:B. %0 = alloc() : memref<10 x 300 x f32, (d0, d1) -> (d0, d1 floordiv 8, d1 mod 8), vector<8xf32>>
To unsubscribe from this group and stop receiving emails from it, send an email to ml...@tensorflow.org.