LLVM lowering of memrefs


mlir.de...@gmail.com

Dec 5, 2019, 10:18:12 AM
to MLIR
Hi,

The standard-to-LLVM conversion for memref types always produces the MemRefDescriptor structure, and memory is allocated either by malloc or by alloca (via clUseAlloca). But this seems to be a restriction for the following cases.

1. Using malloc and llvm::alloca interchangeably based on the static/dynamic shape of the memref, etc. Is there a plan to implement a configurable Alloc/Dealloc op?

2. Representing Global variables: 

c code example:

int global_var;

int func() {
     int local = <some value>;
     global_var = local;
    ......
}


For the above case, I have created a "global" operation in my own dialect. The new dialect works alongside the standard and affine dialects. I am using the std.load/std.store operations for the "local" variable and plan to use the same for "global_var". Currently, the memref lowering in the LLVM dialect conversion is hard-coded to the MemRefDescriptor structure even for scalar variables. This makes it impossible (where would I alloc/initialize?) to lower the global operation to an LLVM global variable while using std.load and std.store at the MLIR level.

I do not want to create a new set of loads/stores in my new dialect, as I am able to use the standard dialect for most of the other operations I need. I am also using the affine dialect in a few cases.

Is there any plan to make the memref/alloc/dealloc lowering customizable? If not, how should I make use of the transformations written for the standard and affine dialects (which are mostly based on memrefs) if I write a new dialect of my own with new operations/types/lowering?

Thanks!

Alex Zinenko

Dec 5, 2019, 2:02:20 PM
to MLIR
Hi,

It feels like there are multiple questions here.

Allow me first to reiterate the argument I have been advancing in several discussions regarding alloc/dealloc lowering. The idea that alloc/dealloc can be converted to different calls or platform-specific instructions is certainly necessary, but it is a _lowering_ concern rather than an _operation_ concern. That is, I find the proposition of an "alloc" that somehow encodes how it should be lowered to be a leaky abstraction. It creates unnecessarily tight coupling between std.alloc and the LLVM dialect, which defeats the purpose of dialect separation. In my opinion, the right approach here is to put the lowering specifics into the lowering itself. The recent commit that added the use-alloca option went exactly that way: it's a lowering option that doesn't leak into the operation itself.

With the MLIR conversion infrastructure, it is actually really easy to inject any kind of lowering you might need. Implement the custom pattern with any conditions and behavior you want (it's C++ after all), give it a higher benefit than the default one, and populate the set of conversions with the rest. Yes, it requires defining a new lowering pass. But it keeps the "main" pass simple and clean of any target- or project-specific constraints. 
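A pseudocode-level sketch of that approach, with hypothetical names (`MyAllocLowering` is illustrative; the exact base class and hook signatures depend on your MLIR revision):

```cpp
// Hypothetical pattern that takes priority over the default std.alloc
// lowering by declaring a higher benefit (default patterns use benefit 1).
struct MyAllocLowering : public ConversionPattern {
  MyAllocLowering(MLIRContext *ctx)
      : ConversionPattern(AllocOp::getOperationName(), /*benefit=*/2, ctx) {}

  PatternMatchResult
  matchAndRewrite(Operation *op, ArrayRef<Value *> operands,
                  ConversionPatternRewriter &rewriter) const override {
    auto allocOp = cast<AllocOp>(op);
    // Any project-specific condition goes here, e.g. static shapes go to
    // llvm.alloca, dynamic shapes go to a malloc call.
    ...
    return matchSuccess();
  }
};
```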

I am open to reconsidering the "main" lowering (FWIW, my original implementation called __mlir_alloc instead of malloc to allow for dispatching), but I would ask to see more than one specific example where trying to inject patterns wouldn't work.

The rest of the answers are inline.

On Thu, Dec 5, 2019 at 4:18 PM <mlir.de...@gmail.com> wrote:
Hi,

The standard-to-LLVM conversion for memref types always produces the MemRefDescriptor structure, and memory is allocated either by malloc or by alloca (via clUseAlloca). But this seems to be a restriction for the following cases.

1. Using malloc and llvm::alloca interchangeably based on the static/dynamic shape of the memref, etc. Is there a plan to implement a configurable Alloc/Dealloc op?

No, there's no such plan. But you can implement your own lowering patterns and mix them with the existing ones. We can decouple `populateStandardToLLVMLoweringPatterns` into `populateStdToLLVMArithmPatterns` and `populateStdToLLVMAllocationPatterns` if that helps.
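Mixing custom and default patterns could then look roughly like this (a sketch; `MyAllocLowering`/`MyDeallocLowering` are hypothetical pattern names):

```cpp
// Sketch: custom allocation patterns first (with a higher benefit),
// then the default standard-to-LLVM patterns for everything else.
OwningRewritePatternList patterns;
patterns.insert<MyAllocLowering, MyDeallocLowering>(&context);
populateStdToLLVMConversionPatterns(typeConverter, patterns);
// ... then run applyPartialConversion with an LLVM conversion target ...
```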

If you need a dramatically different runtime representation of memrefs than the one we currently adopted, I'd suggest you implement your own lowerings for any std operations on memrefs.
 

2. Representing Global variables: 

c code example:

int global_var;

int func() {
     int local = <some value>;
     global_var = local;
    ......
}


For the above case, I have created a "global" operation in my own dialect. The new dialect works alongside the standard and affine dialects. I am using the std.load/std.store operations for the "local" variable and plan to use the same for "global_var".

Is your global variable creating a pointer/memref equivalent? Otherwise, it does not make sense to load from it using std.load/std.store. If it has value semantics, you need custom operations for it.
 
Currently, the memref lowering in the LLVM dialect conversion is hard-coded to the MemRefDescriptor structure even for scalar variables. This makes it impossible (where would I alloc/initialize?) to lower the global operation to an LLVM global variable while using std.load and std.store at the MLIR level.

The approach I've adopted and recommended so far is to define a global variable and provide an operation that "takes the address" of it locally. See for example, llvm.global and llvm.addressof that model LLVM globals. If you follow this scheme, the "addressof" operation can return a memref<1 x your-type>, which will integrate smoothly with the rest of memref lowering as long as you can provide a lowering for the "addressof" operation that builds the memref descriptor.
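As a sketch, the scheme could look like this in a custom dialect (op names are hypothetical):

```mlir
// Global definition plus an op that "takes its address" as a memref.
"mydialect.global"() {sym_name = "global_var", type = i32} : () -> ()

func @f() {
  %0 = "mydialect.addressof"() {global_name = @global_var} : () -> memref<1xi32>
  %c0 = constant 0 : index
  // Regular std.load/std.store now apply unchanged.
  %v = load %0[%c0] : memref<1xi32>
  ...
}
```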

MemRefDescriptor is merely a utility class that abstracts away some repeated parts of the lowering.

Initializing global variables seems orthogonal to the memref lowering. You need to do it somewhere, which can be a special operation in your dialect, or a lowering convention specific to your pipeline, e.g. a `func __global_initialization` that is executed before `func main`.
 

I do not want create a new set of loads/ stores in the my new dialect as I am able to use standard dialect for most of other operations which I need. I am also using the Affine dialect in few cases.

You don't have to create new ops; different lowering patterns could suffice.
Affine loads and stores are lowered to standard loads and stores without changing the type. I don't think this lowering should be affected in any way.
 

Is there any plan on making the memref/ alloc/ dealloc lowering custom? If not, how should I make use of the transformations written for standard and Affine dialects (which are mostly based on memrefs) if I write a new dialect of my own with new operations/ types/ lowering?

It would help if you could elaborate on the operations/types you want to introduce, and what transformations you want to reuse.
In general, new operations on existing types work well and lower equally well if the lowering is aware of how types should be handled. New types integrate reasonably well with existing types if you provide a cast operation, the lowering of which is aware of how both types should be lowered. Core transformations are often based on OpInterfaces and will work if your Ops implement those interfaces.
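For instance, a hypothetical cast op can bridge a custom type to memref, so that existing memref-based transformations apply on the memref side (names and types below are illustrative):

```mlir
// Hypothetical cast from a custom buffer type to a standard memref.
%m = "mydialect.to_memref"(%buf) : (!mydialect.buffer<10xi32>) -> memref<10xi32>
%c0 = constant 0 : index
%v = load %m[%c0] : memref<10xi32>
```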
 

Thanks!

--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/a18e2022-a8fe-4800-ba3c-f1c13756ee1a%40tensorflow.org.


--
-- Alex

Compiler Developer

Dec 6, 2019, 12:31:54 AM
to MLIR
Thanks for the reply; please find my comments inline.


On Friday, December 6, 2019 at 12:32:20 AM UTC+5:30, Alex Zinenko wrote:
Hi,

It feels like there are multiple questions here.

Allow me first to reiterate the argument I have been advancing in several discussions regarding alloc/dealloc lowering. The idea that alloc/dealloc can be converted to different calls or platform-specific instructions is certainly necessary, but it is a _lowering_ concern rather than an _operation_ concern. That is, I find the proposition of an "alloc" that somehow encodes how it should be lowered to be a leaky abstraction. It creates unnecessarily tight coupling between std.alloc and the LLVM dialect, which defeats the purpose of dialect separation. In my opinion, the right approach here is to put the lowering specifics into the lowering itself. The recent commit that added the use-alloca option went exactly that way: it's a lowering option that doesn't leak into the operation itself.

With the MLIR conversion infrastructure, it is actually really easy to inject any kind of lowering you might need. Implement the custom pattern with any conditions and behavior you want (it's C++ after all), give it a higher benefit than the default one, and populate the set of conversions with the rest. Yes, it requires defining a new lowering pass. But it keeps the "main" pass simple and clean of any target- or project-specific constraints. 

I agree with handling the alloc() as part of the LLVM lowering and not introducing any changes to std.alloc itself.

I am open to reconsidering the "main" lowering (FWIW, my original implementation called __mlir_alloc instead of malloc to allow for dispatching), but I would ask to see more than one specific example where trying to inject patterns wouldn't work.
 
The rest of the answers are inline.

On Thu, Dec 5, 2019 at 4:18 PM <mlir.d...@gmail.com> wrote:
Hi,

The standard-to-LLVM conversion for memref types always produces the MemRefDescriptor structure, and memory is allocated either by malloc or by alloca (via clUseAlloca). But this seems to be a restriction for the following cases.

1. Using malloc and llvm::alloca interchangeably based on the static/dynamic shape of the memref, etc. Is there a plan to implement a configurable Alloc/Dealloc op?

No, there's no such plan. But you can implement your own lowering patterns and mix them with the existing ones. We can decouple `populateStandardToLLVMLoweringPatterns` into `populateStdToLLVMArithmPatterns` and `populateStdToLLVMAllocationPatterns` if that helps.


It looks like I can override the existing patterns for LLVM lowering by adding a custom lowering pattern later(?) in the OwningRewritePatternList. Is it guaranteed that the conversion will always apply my custom pattern and not the existing one in ConvertStandardToLLVM.cpp?

If not, exposing all the lowering patterns (in ConvertStandardToLLVM.cpp) as utilities would help in picking the ones I need.

 
If you need a dramatically different runtime representation of memrefs than the one we currently adopted, I'd suggest you implement your own lowerings for any std operations on memrefs.


2. Representing Global variables: 

c code example:

int global_var;

int func() {
     int local = <some value>;
     global_var = local;
    ......
}


For the above case, I have created a "global" operation in my own dialect. The new dialect works alongside the standard and affine dialects. I am using the std.load/std.store operations for the "local" variable and plan to use the same for "global_var".

Is your global variable creating a pointer/memref equivalent? Otherwise, it does not make sense to load from it using std.load/std.store. If it has value semantics, you need custom operations for it.
 
MemRefType is the perfect type for the global variables which I am creating.
 
 
Currently, the memref lowering in the LLVM dialect conversion is hard-coded to the MemRefDescriptor structure even for scalar variables. This makes it impossible (where would I alloc/initialize?) to lower the global operation to an LLVM global variable while using std.load and std.store at the MLIR level.

The approach I've adopted and recommended so far is to define a global variable and provide an operation that "takes the address" of it locally. See for example, llvm.global and llvm.addressof that model LLVM globals. If you follow this scheme, the "addressof" operation can return a memref<1 x your-type>, which will integrate smoothly with the rest of memref lowering as long as you can provide a lowering for the "addressof" operation that builds the memref descriptor.

I tried to use the same GlobalOp/AddressOfOp scheme in my own dialect with MemRefType for globals. The operation semantics are the same as their LLVM counterparts.

Case 1:

"MyDialect.global"() {sym_name = "global_var", type = i32} : () -> ()


%1 = "MyDialect.addressOf"() {global_name = @global_var} : () -> memref<i32>


The above representation, lowered to LLVM using LLVM::GlobalOp and LLVM::AddressOfOp, fails because AddressOfOp must have the pointer type of the LLVM::GlobalOp it points to, not a memref descriptor structure.

Also, why do I need to create a memref descriptor structure for a scalar value? All I need is a pointer to the llvm::GlobalVariable value to do loads and stores. The current implementation forces me to have a memref structure for scalar values.


Case 2:

Let's say I want to create a static global array with a custom memory layout. I have created a custom operation for it.


"MyDialect.global"() {sym_name = "global_var", type = memref<10xi32>} : () -> ()


For the above case, the current lowering only gives me the allocated memref structure. What about the allocation of the actual array? In the case of linking multiple files, I would not know where to initialize the global variable!


MemRefDescriptor is merely a utility class that abstracts away some repeated parts of the lowering.

Yes. Can this, and also the actual memref structure representation, be moved into a utility so that it can be useful for custom lowering?
 

Initializing global variables seems orthogonal to the memref lowering. You need to do it somewhere, which can be a special operation in your dialect, or a lowering convention specific to your pipeline, e.g. a `func __global_initialization` that is executed before `func main`.

This cannot work if I link multiple files.
 
 

I do not want to create a new set of loads/stores in my new dialect, as I am able to use the standard dialect for most of the other operations I need. I am also using the affine dialect in a few cases.

You don't have to create new ops; different lowering patterns could suffice.
Affine loads and stores are lowered to standard loads and stores without changing the type. I don't think this lowering should be affected in any way.
 

Is there any plan to make the memref/alloc/dealloc lowering customizable? If not, how should I make use of the transformations written for the standard and affine dialects (which are mostly based on memrefs) if I write a new dialect of my own with new operations/types/lowering?

It would help if you could elaborate on the operations/types you want to introduce, and what transformations you want to reuse.
In general, new operations on existing types work well and lower equally well if the lowering is aware of how types should be handled. New types integrate reasonably well with existing types if you provide a cast operation, the lowering of which is aware of how both types should be lowered. Core transformations are often based on OpInterfaces and will work if your Ops implement those interfaces.
 
In general, I do want to create new types and operations which already exist in the standard dialect but need a different LLVM lowering.
 

Thanks!


Alex Zinenko

Dec 6, 2019, 5:44:46 PM
to Compiler Developer, MLIR
On Fri, Dec 6, 2019 at 6:31 AM Compiler Developer <mlir.de...@gmail.com> wrote:
Thanks for the reply; please find my comments inline.

On Friday, December 6, 2019 at 12:32:20 AM UTC+5:30, Alex Zinenko wrote:
Hi,

It feels like there are multiple questions here.

Allow me first to reiterate the argument I have been advancing in several discussions regarding alloc/dealloc lowering. The idea that alloc/dealloc can be converted to different calls or platform-specific instructions is certainly necessary, but it is a _lowering_ concern rather than an _operation_ concern. That is, I find the proposition of an "alloc" that somehow encodes how it should be lowered to be a leaky abstraction. It creates unnecessarily tight coupling between std.alloc and the LLVM dialect, which defeats the purpose of dialect separation. In my opinion, the right approach here is to put the lowering specifics into the lowering itself. The recent commit that added the use-alloca option went exactly that way: it's a lowering option that doesn't leak into the operation itself.

With the MLIR conversion infrastructure, it is actually really easy to inject any kind of lowering you might need. Implement the custom pattern with any conditions and behavior you want (it's C++ after all), give it a higher benefit than the default one, and populate the set of conversions with the rest. Yes, it requires defining a new lowering pass. But it keeps the "main" pass simple and clean of any target- or project-specific constraints. 

I agree with handling the alloc() as part of the LLVM lowering and not introducing any changes to std.alloc itself.

I am open to reconsidering the "main" lowering (FWIW, my original implementation called __mlir_alloc instead of malloc to allow for dispatching), but I would ask to see more than one specific example where trying to inject patterns wouldn't work.
 
The rest of the answers are inline.

On Thu, Dec 5, 2019 at 4:18 PM <mlir.d...@gmail.com> wrote:
Hi,

The standard-to-LLVM conversion for memref types always produces the MemRefDescriptor structure, and memory is allocated either by malloc or by alloca (via clUseAlloca). But this seems to be a restriction for the following cases.

1. Using malloc and llvm::alloca interchangeably based on the static/dynamic shape of the memref, etc. Is there a plan to implement a configurable Alloc/Dealloc op?

No, there's no such plan. But you can implement your own lowering patterns and mix them with the existing ones. We can decouple `populateStandardToLLVMLoweringPatterns` into `populateStdToLLVMArithmPatterns` and `populateStdToLLVMAllocationPatterns` if that helps.


It looks like I can override the existing patterns for LLVM lowering by adding a custom lowering pattern later(?) in the OwningRewritePatternList. Is it guaranteed that the conversion will always apply my custom pattern and not the existing one in ConvertStandardToLLVM.cpp?

If not, exposing all the lowering patterns (in ConvertStandardToLLVM.cpp) as utilities would help in picking the ones I need.

You shouldn't rely on the order of the patterns in the list. However, all patterns have a benefit (passed in as a constructor argument), and those with a higher benefit take priority.

Otherwise, I am happy to review a patch that splits the population function into pieces. I'm not convinced we want to expose each pattern individually, mainly because there are likely more patterns to be added as we expand the dialects, but having logical groups sounds reasonable to me. Something like AllocationPatterns, MemoryAccessPatterns, ArithmeticPatterns, etc. I'd proceed on a per-need basis.
 

 
If you need a dramatically different runtime representation of memrefs than the one we currently adopted, I'd suggest you implement your own lowerings for any std operations on memrefs.


2. Representing Global variables: 

c code example:

int global_var;

int func() {
     int local = <some value>;
     global_var = local;
    ......
}


For the above case, I have created a "global" operation in my own dialect. The new dialect works alongside the standard and affine dialects. I am using the std.load/std.store operations for the "local" variable and plan to use the same for "global_var".

Is your global variable creating a pointer/memref equivalent? Otherwise, it does not make sense to load from it using std.load/std.store. If it has value semantics, you need custom operations for it.
 
MemRefType is the perfect type for the global variables which I am creating.

MemRef is a structured pointer with sizes attached. You seem to insist a lot on having a pointer to a scalar value, which may not be exactly what memref was designed for. That being said, MLIR does not have raw pointers.
 
 
 
 
Currently, the memref lowering in the LLVM dialect conversion is hard-coded to the MemRefDescriptor structure even for scalar variables. This makes it impossible (where would I alloc/initialize?) to lower the global operation to an LLVM global variable while using std.load and std.store at the MLIR level.

The approach I've adopted and recommended so far is to define a global variable and provide an operation that "takes the address" of it locally. See for example, llvm.global and llvm.addressof that model LLVM globals. If you follow this scheme, the "addressof" operation can return a memref<1 x your-type>, which will integrate smoothly with the rest of memref lowering as long as you can provide a lowering for the "addressof" operation that builds the memref descriptor.

I tried to use the same GlobalOp/AddressOfOp scheme in my own dialect with MemRefType for globals. The operation semantics are the same as their LLVM counterparts.

Case 1:

"MyDialect.global"() {sym_name = "global_var", type = i32} : () -> ()


%1 = "MyDialect.addressOf"() {global_name = @global_var} : () -> memref<i32>


The above representation, lowered to LLVM using LLVM::GlobalOp and LLVM::AddressOfOp, fails because AddressOfOp must have the pointer type of the LLVM::GlobalOp it points to, not a memref descriptor structure.

The lowering is not one-to-one for addressof. You can get the address of the global and use it to populate a memref descriptor. See https://github.com/tensorflow/mlir/commit/4e7d67e778541165b29e6c39afd6273a1c6ca00e#diff-5a8a4487c6fe7f9f478b6981b405fd71R542 for an example.
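Roughly, the addressof lowering can expand to something like the following sketch for a memref<i32>, whose descriptor reduces to {allocated pointer, aligned pointer, offset}. Exact op names and type syntax depend on the MLIR revision:

```mlir
// Take the address of the LLVM global and wrap it in a descriptor.
%ptr = llvm.mlir.addressof @global_var : !llvm<"i32*">
%d0  = llvm.mlir.undef : !llvm<"{ i32*, i32*, i64 }">
%d1  = llvm.insertvalue %ptr, %d0[0] : !llvm<"{ i32*, i32*, i64 }">
%d2  = llvm.insertvalue %ptr, %d1[1] : !llvm<"{ i32*, i32*, i64 }">
%c0  = llvm.mlir.constant(0 : index) : !llvm.i64
%d   = llvm.insertvalue %c0, %d2[2] : !llvm<"{ i32*, i32*, i64 }">
```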
 

Also, why do I need to create a memref descriptor structure for a scalar value? All I need is a pointer to the llvm::GlobalVariable value to do loads and stores. The current implementation forces me to have a memref structure for scalar values.


We used to have a lowering from a statically-shaped memref to a bare pointer, and we removed it. The rationale was that consistency is more important at this stage. In the default flow, an n-D memref is always lowered to a specific structure type. Moreover, we can interface with C functions regardless of whether memrefs have dynamic shapes. We decided that we could reconsider this choice if provided with compelling evidence that using structures significantly degrades runtime performance. Does it?
 

Case 2:

Let's say I want to create a static global array with a custom memory layout. I have created a custom operation for it.


"MyDialect.global"() {sym_name = "global_var", type = memref<10xi32>} : () -> ()


For the above case, the current lowering only gives me the allocated memref structure. What about the allocation of the actual array? In the case of linking multiple files, I would not know where to initialize the global variable!

The lowering of your dialect should know how to allocate memory for it. Standard-to-LLVM cannot possibly know about the semantics of operations in another dialect. We have not had to deal with linking in MLIR, so there is no support for it in the standard dialect. I would suggest introducing some notion of external symbols in your dialect and lowering them to LLVM globals that have "available_externally" and "external" linkage for declarations and definitions, respectively. I added support for linkage types in the LLVM dialect just recently.

A proposal on handling linking in the standard dialect is very welcome.
 


MemRefDescriptor is merely a utility class that abstracts away some repeated parts of the lowering.

Yes. Can this, and also the actual memref structure representation, be moved into a utility so that it can be useful for custom lowering?

It's exposed in ConvertStandardToLLVM.h and should be usable without any other patterns. It cannot be trivially detached from that file because it depends on the type converter as well.
 
 
 

Initializing global variables seems orthogonal to the memref lowering. You need to do it somewhere, which can be a special operation in your dialect, or a lowering convention specific to your pipeline, e.g. a `func __global_initialization` that is executed before `func main`.

This cannot work if I link multiple files.
 
 

I do not want to create a new set of loads/stores in my new dialect, as I am able to use the standard dialect for most of the other operations I need. I am also using the affine dialect in a few cases.

You don't have to create new ops; different lowering patterns could suffice.
Affine loads and stores are lowered to standard loads and stores without changing the type. I don't think this lowering should be affected in any way.
 

Is there any plan to make the memref/alloc/dealloc lowering customizable? If not, how should I make use of the transformations written for the standard and affine dialects (which are mostly based on memrefs) if I write a new dialect of my own with new operations/types/lowering?

It would help if you could elaborate on the operations/types you want to introduce, and what transformations you want to reuse.
In general, new operations on existing types work well and lower equally well if the lowering is aware of how types should be handled. New types integrate reasonably well with existing types if you provide a cast operation, the lowering of which is aware of how both types should be lowered. Core transformations are often based on OpInterfaces and will work if your Ops implement those interfaces.
 
In general, I do want to create new types and operations which already exist in the standard dialect but need a different LLVM lowering.

Make a list of what you want to reuse from the existing lowering and we'll see how to restructure it.
 
 
 

Thanks!




--
-- Alex

Mehdi AMINI

Dec 6, 2019, 7:41:38 PM
to Compiler Developer, MLIR
On Fri, Dec 6, 2019 at 6:31 AM Compiler Developer <mlir.de...@gmail.com> wrote:
The "addressof" operation needs to materialize the memref descriptor in an alloca and populate it (LLVM should constant-fold this away; it can also be constant-folded in MLIR if we add support for such an optimization). Am I missing something?
 

Also, why do I need to create a memref descriptor structure for a scalar value? All I need is a pointer to the llvm::GlobalVariable value to do loads and stores. The current implementation forces me to have a memref structure for scalar values.


Case 2:

Let's say I want to create a static global array with a custom memory layout. I have created a custom operation for it.


"MyDialect.global"() {sym_name = "global_var", type = memref<10xi32>} : () -> ()


For the above case, the current lowering only gives me the allocated memref structure. What about the allocation of the actual array? In the case of linking multiple files, I would not know where to initialize the global variable!

You can use the same strategy as above (AddressOf materializing a descriptor), or you can emit two globals: one for the array and another for the descriptor.
 


MemRefDescriptor is merely a utility class that abstracts away some repeated parts of the lowering.

Yes. Can this, and also the actual memref structure representation, be moved into a utility so that it can be useful for custom lowering?
 

Initializing global variables seems orthogonal to the memref lowering. You need to do it somewhere, which can be a special operation in your dialect, or a lowering convention specific to your pipeline, e.g. a `func __global_initialization` that is executed before `func main`.

This cannot work if I link multiple files.

I am not sure what the problem with linking is, but just in case: LLVM has support for global initialization: https://llvm.org/docs/LangRef.html#the-llvm-global-ctors-global-variable
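For reference, that mechanism registers constructor functions through a special global, as described in the linked LangRef section. A minimal example (names and the stored value are illustrative):

```llvm
; @llvm.global_ctors runs @init_globals before main, even across
; separately compiled and linked modules.
@global_var = global i32 0

@llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }]
  [{ i32, void ()*, i8* } { i32 65535, void ()* @init_globals, i8* null }]

define internal void @init_globals() {
  store i32 42, i32* @global_var
  ret void
}
```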
 
 
 

I do not want to create a new set of loads/stores in my new dialect, as I am able to use the standard dialect for most of the other operations I need. I am also using the affine dialect in a few cases.

You don't have to create new ops; different lowering patterns could suffice.
Affine loads and stores are lowered to standard loads and stores without changing the type. I don't think this lowering should be affected in any way.
 

Is there any plan to make the memref/alloc/dealloc lowering customizable? If not, how should I make use of the transformations written for the standard and affine dialects (which are mostly based on memrefs) if I write a new dialect of my own with new operations/types/lowering?

It would help if you could elaborate on the operations/types you want to introduce, and what transformations you want to reuse.
In general, new operations on existing types work well and lower equally well if the lowering is aware of how types should be handled. New types integrate reasonably well with existing types if you provide a cast operation, the lowering of which is aware of how both types should be lowered. Core transformations are often based on OpInterfaces and will work if your Ops implement those interfaces.
 
In general, I do want to create new types and operations which already exist in the standard dialect but need a different LLVM lowering.

This may not compose well with the existing lowering: if you change the lowering for a type, you need to override the lowering of all the operations for this type, since the default lowering patterns will assume a specific lowering for the type.

-- 
Mehdi

 
 

Thanks!



Uday Bondhugula

Dec 6, 2019, 8:52:59 PM
to MLIR


On Thursday, December 5, 2019 at 8:48:12 PM UTC+5:30, mlir.d...@gmail.com wrote:
Hi,

The standard-to-LLVM conversion for memref types always produces the MemRefDescriptor structure, and memory is allocated either by malloc or by alloca (via clUseAlloca). But this seems to be a restriction for the following cases.

1. Using malloc and llvm::alloca interchangeably based on the static/dynamic shape of the memref, etc. Is there a plan to implement a configurable Alloc/Dealloc op?

I've had an std.alloca op for experimentation for some time now (which is also what I needed/used for my article here: https://github.com/bondhugula/mlir/blob/hop/g3doc/HighPerfCodeGen.md )
I've made that available in an 'alloca' branch here:
I just postponed submitting it upstream because it needs two cleanups:
1) use an "AllocLikeOpInterface" to share methods between AllocOp and AllocaOp (besides rewrite patterns and pass behavior)
2) refactor to reuse the common code between AllocOpLowering and AllocaOpLowering in the LLVM lowering (pretty straightforward)
Let me know if you are interested in contributing to it to submit upstream.

Using a separate alloca op with the op interface is, I believe, the right approach (as opposed to using an attribute on the alloc op -- mainly because a DeallocOp isn't needed for an alloca, as argued well by @aminim on this thread: https://github.com/tensorflow/mlir/pull/55 -- but OpInterfaces weren't available at that time, so the thinking was different).

I think the -use-alloca option was added recently just for quick testing; for all the reasons you mentioned and more, it's not really a solution/option for lowering.

~ Uday
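Sketched in IR, the proposed op would contrast with alloc/dealloc roughly like this (hypothetical syntax; the actual op in the branch above may differ):

```mlir
func @local_buffer() {
  // Hypothetical stack allocation: no matching dealloc is required;
  // the storage dies when the enclosing region exits.
  %a = alloca() : memref<64xf32>

  // Heap allocation: must be paired with an explicit dealloc.
  %b = alloc() : memref<64xf32>
  dealloc %b : memref<64xf32>
  return
}
```

The asymmetry in deallocation is exactly what makes transformations like hoisting simpler for the stack-allocated form.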


Compiler Developer

Dec 7, 2019, 12:25:14 AM12/7/19
to MLIR


On Saturday, December 7, 2019 at 4:14:46 AM UTC+5:30, Alex Zinenko wrote:


On Fri, Dec 6, 2019 at 6:31 AM Compiler Developer <mlir.d...@gmail.com> wrote:
Thanks for the reply please find the comments inline.

On Friday, December 6, 2019 at 12:32:20 AM UTC+5:30, Alex Zinenko wrote:
Hi,

it feels like there are multiple questions here.

allow me first to reiterate the argument I have been advancing in several discussions regarding alloc/dealloc lowering. The idea that alloc/dealloc can be converted to different calls or platform-specific instructions is certainly necessary, but it is a _lowering_ concern rather than an _operation_ concern. That is, I find the proposition of an "alloc" that somehow encodes how it should be lowered a leaky abstraction. It creates unnecessarily tight coupling between std.alloc and the LLVM dialect, which defies the purpose of dialect separation. In my opinion, the right approach here is to put the lowering specifics into the lowering itself. The recent commit that added the use-alloca option went exactly that way: it's a lowering option that doesn't leak into the operation itself.

With the MLIR conversion infrastructure, it is actually really easy to inject any kind of lowering you might need. Implement the custom pattern with any conditions and behavior you want (it's C++ after all), give it a higher benefit than the default one, and populate the set of conversions with the rest. Yes, it requires defining a new lowering pass. But it keeps the "main" pass simple and clean of any target- or project-specific constraints. 

I agree with handling alloc() as part of the LLVM lowering and not introducing any changes to std.alloc itself.

I am open to reconsidering the "main" lowering (FWIW, my original implementation was calling __mlir_alloc instead of malloc to allow for dispatching), but I would ask to see more than one specific example where trying to inject patterns wouldn't work.
 
The rest of the answers are inline.

On Thu, Dec 5, 2019 at 4:18 PM <mlir.d...@gmail.com> wrote:
Hi,

Standard-to-LLVM conversion for memref types always produces the MemRefDescriptor structure. Memory is allocated either by malloc or by alloca (using clUseAlloca). But this seems to be a restriction for the following cases.

1. Using malloc/llvm::alloca interchangeably based on the static/dynamic shape of the memref, etc. Is there a plan to implement a configurable Alloc/Dealloc op?

No, there's no such plan. But you can implement your own lowering patterns and mix them with the existing ones. We can decouple `populateStandardToLLVMLoweringPatterns` into `populateStdToLLVMArithmPatterns` and `populateStdToLLVMAllocationPatterns` if that helps.


Looks like I can override the existing patterns for LLVM lowering by adding a custom lowering pattern to the OwningRewritePatternList. Is it always guaranteed to apply my custom pattern and not the existing one in ConvertStandardToLLVM.cpp?

If not, exposing all the lowering patterns (in ConvertStandardToLLVM.cpp) as utilities would help in picking the ones I need.

You shouldn't rely on the order of the patterns in the list. However, all patterns have a benefit (passed in as a constructor argument), and those with a higher benefit will take priority.
I will try to use this.
 

Otherwise, I am happy to review a patch that splits the population function into pieces. I'm not convinced we want to expose each pattern individually, mainly because there are likely more patterns to be added as we expand the dialects, but having logical groups sounds reasonable to me. Something like AllocationPatterns, MemoryAccessPatterns, ArithmeticPatterns, etc. I'd proceed on a per-need basis.
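The benefit-based override described above could look roughly like this (a sketch only: it assumes the standard-to-LLVM conversion APIs as of this thread and is not compilable standalone; MyAllocOpLowering is a hypothetical pattern class):

```cpp
// Sketch: mix the stock std-to-LLVM patterns with a custom alloc
// lowering that outranks them.
OwningRewritePatternList patterns;
populateStdToLLVMConversionPatterns(typeConverter, patterns);
// A benefit of 2 outranks the default benefit of 1, so the driver
// attempts the custom alloc lowering before the stock one.
patterns.insert<MyAllocOpLowering>(typeConverter, /*benefit=*/2);
```

The key point is that priority comes from the declared benefit, not from insertion order.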
 

 
If you need a dramatically different runtime representation of memrefs than what we have currently adopted, I'd suggest you implement your own lowerings for any std operation on memrefs.


2. Representing Global variables: 

c code example:

int global_var;

int func() {
     int local = <some value>;
     global_var = local;
    ......
}


For the above case, I have created a "global" operation in my own dialect. The new dialect I created works along with standard and affine dialects.  I am using the std.load/ std.store operations for the "local" variable and planning to use the same for "global_var". 

Is your global variable creating a pointer/memref equivalent? Otherwise, it does not make sense to load from it using std.load/std.store. If it has value semantics, you need custom operations for it.
 
MemRefType is the perfect type for the global variables I am creating.

MemRef is a structured pointer with sizes attached. You seem to insist a lot on having a pointer to a scalar value, which may not exactly be what memref was designed for. That being said, MLIR does not have raw pointers. 

Lowering a memref of scalars to raw LLVM values/pointers makes perfect sense for my language. Currently, LLVMTypeConverter::convertMemRefType() is private. Why not expose it so that I can continue using the standard dialect at the MLIR level?
 
 
 
Currently the memref lowering in the LLVM dialect conversion is hard-coded to the MemRefDescriptor structure even for scalar variables. This makes it impossible (where to alloc/initialize?) to lower the global operation to an LLVM global variable while using std.load and std.store at the MLIR level.

The approach I've adopted and recommended so far is to define a global variable and provide an operation that "takes the address" of it locally. See for example, llvm.global and llvm.addressof that model LLVM globals. If you follow this scheme, the "addressof" operation can return a memref<1 x your-type>, which will integrate smoothly with the rest of memref lowering as long as you can provide a lowering for the "addressof" operation that builds the memref descriptor.
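Concretely, the scheme described above could look like this (hypothetical MyDialect ops and syntax, following the llvm.global / llvm.addressof model):

```mlir
// A global plus an op that "takes its address" and exposes it as a
// memref; the lowering of "addressof" builds the memref descriptor.
"MyDialect.global"() {sym_name = "g", type = memref<1xf32>} : () -> ()

func @use() {
  %m = "MyDialect.addressof"() {global_name = @g} : () -> memref<1xf32>
  %c0 = constant 0 : index
  %v = load %m[%c0] : memref<1xf32>
  return
}
```

Because %m is an ordinary memref value, std.load/std.store and the rest of the memref lowering apply unchanged.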

I tried to use the same GlobalOp/AddressOfOp approach in my own dialect with MemRefType for globals. The operation semantics are the same as their LLVM counterparts.

Case 1:

"MyDialect.global"() {sym_name = "global_var", type = i32} : () -> ()


%1 = "MyDialect.addressOf"() {global_name = @global_var} : () -> memref<i32>


The above representation, lowered to LLVM using LLVM::GlobalOp and LLVM::AddressOfOp, fails because AddressOfOp must have the pointer type of the LLVM::GlobalOp it points to, not a memref descriptor structure.

The lowering is not one-to-one for addressof. You can get the address of the global and use it to populate a memref descriptor. See https://github.com/tensorflow/mlir/commit/4e7d67e778541165b29e6c39afd6273a1c6ca00e#diff-5a8a4487c6fe7f9f478b6981b405fd71R542 for an example.
 

Also, why do I need to create a memref descriptor structure for a scalar value? All I need is a pointer to the llvm::GlobalVariable value to do loads and stores. The current implementation forces me to have a memref structure for scalar values.


We used to have a lowering from a statically shaped memref to a bare pointer some time ago, and we removed it. The rationale had been that consistency was more important at this stage. In the default flow, an n-D memref is always lowered to a specific structure type. Moreover, we can interface with C functions regardless of whether memrefs have dynamic shapes. We decided that we could reconsider this choice if we are provided with compelling evidence that using structures significantly degrades runtime performance. Does it?
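For illustration, the structure a statically shaped 1-D memref lowers to can be modeled in C++ roughly as follows (a sketch of the layout discussed in this thread -- allocated/aligned pointers, offset, sizes, strides -- not the generated code itself; MemRefDescriptor1D and loadAt are made-up names):

```cpp
#include <cassert>
#include <cstdint>

// Rough model of the struct that a statically shaped memref<10xf32>
// lowers to in the default flow: allocated pointer, aligned pointer,
// offset, and per-dimension sizes and strides.
struct MemRefDescriptor1D {
  float *allocated; // pointer returned by the allocation call
  float *aligned;   // possibly realigned pointer used for accesses
  int64_t offset;   // linear offset into the aligned buffer
  int64_t sizes[1];
  int64_t strides[1];
};

// A std.load %m[%i] lowers to roughly this address computation.
float loadAt(const MemRefDescriptor1D &m, int64_t i) {
  return m.aligned[m.offset + i * m.strides[0]];
}
```

This makes concrete why a "bare pointer" lowering and the default descriptor lowering cannot be mixed freely: every operation must agree on which of the two representations a memref value carries.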
 
Representing a scalar value or an llvm::ArrayType as a user-defined structure would create an AliasAnalysis problem and might restrict some optimizations at the LLVM IR level.


Case 2:

Let's say I want to create a static global array with a custom memory layout. I have created a custom operation for it.


"MyDialect.global"() {sym_name = "global_var", type = memref<10xi32>} : () -> ()


For the above case, the current lowering only gives me the allocated memref structure. What about the allocation of the actual array? In the case of linking multiple files, I would not know where to initialize the global variable!

The lowering of your dialect should know how to allocate memory for it. Standard-to-LLVM cannot possibly know about the semantics of operations in another dialect.
We did not have to deal with linking in MLIR, so there is no support for it in the standard dialect. I could suggest introducing some notion of external symbols in your dialect and lowering them to LLVM globals that have "available_externally" and "external" linkage for declarations and definitions, respectively. I've added support for linkage types in the LLVM dialect just recently.

A proposal on handling linking in the standard dialect is very welcome.
 


MemRefDescriptor is merely a utility class that abstracts away some repeated parts of the lowering.

Yes. Can this, and also the actual memref structure representation, be moved into a utility so that it can be used for custom lowering?

It's exposed in ConvertStandardToLLVM.h and should be usable without any other pattern. It cannot be trivially detached from that file because it depends on the type converter as well.
 
 
 

Initializing global variables seems orthogonal to the memref lowering. You need to do it somewhere, which can be a special operation in your dialect, or a lowering convention specific to your pipeline, e.g., `func __global_initialization` is executed before `func main`.

This cannot work if I link multiple files.
 
 

I do not want to create a new set of loads/stores in my new dialect, as I am able to use the standard dialect for most of the other operations I need. I am also using the Affine dialect in a few cases.

You don't have to create new Ops, different lowering patterns could suffice.
Affine loads and stores are lowered to standard loads and stores without changing the type. I don't think this lowering should be affected in any way.
 

Is there any plan on making the memref/ alloc/ dealloc lowering custom? If not, how should I make use of the transformations written for standard and Affine dialects (which are mostly based on memrefs) if I write a new dialect of my own with new operations/ types/ lowering?

It would help if you could elaborate on the operations/types you want to introduce, and what transformations you want to reuse.
In general, new operations on existing types work well and lower equally well if the lowering is aware of how types should be handled. New types integrate reasonably well with existing types if you provide a cast operation, the lowering of which is aware of how both types should be lowered. Core transformations are often based on OpInterfaces and will work if your Ops implement those interfaces.
 
In  general,  I do want to create new types and operations which are already in standard dialect but needs different LLVM lowering.

Make a list of what you want to reuse from the existing lowering and we'll see how to restructure it.

A separate populate list for memref-related patterns (Alloc, MemoryAccess patterns) and a way to override LLVMTypeConverter::convertMemRefType().

 
 
 


Compiler Developer

Dec 7, 2019, 1:27:20 AM12/7/19
to MLIR

I can handle this by creating two LLVM global variables (the actual array and the memref structure). But isn't that a workaround?

The current use case (supporting globals) contains:

1. Type defined in the std dialect (MemRefType): perfect for my need.
2. A new operation from my new dialect (myDialect.global): currently, there are no globals supported in the standard or any other existing dialect.
3. Loads and stores from the std dialect: I am using these for local variables, and defining new ones just for global variables doesn't make sense.

I am currently using transformations implemented for standard / affine dialects. 

In general, what's the problem with creating a custom LLVM lowering that retains all the semantics of the standard (or any existing) dialect which are valid for my language implementation?

My suggestion is to expose the lowering patterns implemented in ConvertStandardToLLVM.cpp so that I can make use of the existing code in that file along with my custom patterns. A way to override the existing std-to-LLVM conversion just for the ops/types which need custom lowering would help a lot.

Thanks!

On Saturday, December 7, 2019 at 6:11:38 AM UTC+5:30, Mehdi AMINI wrote:
This doesn't work for dynamic memrefs such as memref<?xi32>.
 
or you can emit two globals: one for the array and another for the descriptor

That is another option. But wouldn't it create an AliasAnalysis problem in LLVM?
 
 



Initializing global variables seems orthogonal to the memref lowering. You need to do it somewhere, which can be a special operation in your dialect, or a lowering convention specific to your pipeline, e.g., `func __global_initialization` is executed before `func main`.

This cannot work if I link multiple files.

I am not sure what the problem with linking is, but just in case: LLVM has support for global initialization: https://llvm.org/docs/LangRef.html#the-llvm-global-ctors-global-variable
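For reference, the llvm.global_ctors mechanism pointed to above looks roughly like this in LLVM IR (a sketch following the LangRef; @init_globals and the initial values are made up):

```llvm
@global_var = global i32 0

; Functions listed in @llvm.global_ctors run before main(), in
; priority order (65535 is the default priority).
@llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }]
    [{ i32, void ()*, i8* } { i32 65535, void ()* @init_globals, i8* null }]

define internal void @init_globals() {
  store i32 42, i32* @global_var
  ret void
}
```

Since the linker concatenates the appending @llvm.global_ctors arrays from all modules, this works across multiple linked files.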

I just meant that a __global_initialization-like function doesn't work when there are multiple files. But now I have realized that I can do it with the "linkonce" linkage type.
 
 
 


Compiler Developer

Dec 7, 2019, 1:47:25 AM12/7/19
to MLIR
Isn't this an LLVM lowering problem that needs to be solved at the conversion-to-LLVM level rather than in MLIR itself? Creating a new std dialect operation for alloca (with similar semantics) in MLIR for non-LLVM lowerings doesn't mean much, right?

Uday Bondhugula

Dec 7, 2019, 2:36:15 AM12/7/19
to MLIR


On Saturday, December 7, 2019 at 12:17:25 PM UTC+5:30, Compiler Developer wrote:
Isn't this an LLVM lowering problem that needs to be solved at the conversion-to-LLVM level rather than in MLIR itself?

It shouldn't be - that's the point the other discussion was trying to make. 

> Creating a new std dialect operation for alloca (with similar semantics) in MLIR for non-LLVM lowerings doesn't mean much, right?

For a different target, you would convert it to whatever notion of stack allocation you have in the target dialect. alloca has semantics that are partly similar to and partly different from heap allocation. Consider a transformation that hoists an alloc op out of a loop (or sinks it into a call): you won't need to worry about a dealloc if it is an alloca. With a heap alloc, you'll have to worry about moving its dealloc op (all its dealloc ops, in the general case) along with the alloc op.
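Sketched in IR, the hoisting example reads:

```mlir
// Before: the heap allocation and its dealloc live inside the loop.
affine.for %i = 0 to 128 {
  %buf = alloc() : memref<32xf32>
  // ... use %buf ...
  dealloc %buf : memref<32xf32>
}

// After hoisting: the dealloc must be moved together with the alloc.
// With an alloca-style op, no dealloc would need to move at all.
%buf = alloc() : memref<32xf32>
affine.for %i = 0 to 128 {
  // ... use %buf ...
}
dealloc %buf : memref<32xf32>
```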

~ Uday

Alex Zinenko

Dec 8, 2019, 10:42:47 AM12/8/19
to Uday Bondhugula, MLIR
I would consider a different name than "alloca"; it's too similar to "alloc".
 



--
-- Alex

Alex Zinenko

Dec 8, 2019, 10:58:31 AM12/8/19
to Compiler Developer, MLIR
What is your definition of "memrefs of scalars"? memref<f32> is technically supported but not clearly specified in the documentation. Saying it's just a scalar (i.e., a value) goes against the idea of memrefs IMO; it should rather be a pointer.

I wouldn't object to making `convertMemRefType` protected virtual (convertType is already virtual, so LLVMTypeConverter has a vtable).
 
 
 
 
 
Also, why do I need to create a memref descriptor structure for a scalar value? All I need is a pointer to the llvm::GlobalVariable value to do loads and stores. The current implementation forces me to have a memref structure for scalar values.


We used to have a lowering from a statically shaped memref to a bare pointer some time ago, and we removed it. The rationale had been that consistency was more important at this stage. In the default flow, an n-D memref is always lowered to a specific structure type. Moreover, we can interface with C functions regardless of whether memrefs have dynamic shapes. We decided that we could reconsider this choice if we are provided with compelling evidence that using structures significantly degrades runtime performance. Does it?
 
Representing a scalar value or an llvm::ArrayType as a user-defined structure would create an AliasAnalysis problem and might restrict some optimizations at the LLVM IR level.

I am not sure I follow you here. If you have a scalar value, represent it as such, not as a memref. If you have a fixed-size array (i.e., something llvm::ArrayType is suitable for), maybe you should consider using a custom type for such values, because llvm::ArrayType is way too restrictive for memrefs (no dynamic rank, no dynamic size, no layout). Or use a non-default memory space that will guide the lowering.

"Might restrict some optimization" is a speculation. It might or might not affect some optimization, which may or may not be fixed by canonicalization/constant propagation passes in advance. In particular, a pass that expands aggregate types to sets of values and a constant propagation + dce are likely to nivelate the effects of using structs for memref descriptors. While we are careful about not hindering the entire pipeline, we need to trade off real code complexity today against potential future performance gains, hence the decision about concrete compelling evidence, as in "I replaced structs by raw pointers, and saw a 5% performance increase of vectorized non-parallel code in resnet on x86".
 


In general, I do want to create new types and operations that already exist in the standard dialect but need a different LLVM lowering.

Make a list of what you want to reuse from the existing lowering and we'll see how to restructure it.

A separate populate list for memref-related patterns (Alloc, MemoryAccess patterns) and a way to override LLVMTypeConverter::convertMemRefType().

I'd suggest splitting out all memref-related operations (load/store/alloc/dealloc/prefetch, etc.) from the rest. Would you be willing to implement the change?
 

 
 
 

--
-- Alex

Alex Zinenko

Dec 8, 2019, 11:11:39 AM12/8/19
to Compiler Developer, MLIR
On Sat, Dec 7, 2019 at 7:27 AM Compiler Developer <mlir.de...@gmail.com> wrote:


In general, what's the problem with creating a custom LLVM lowering that retains all the semantics of the standard (or any existing) dialect which are valid for my language implementation?

I think there may be a problem if your lowering _diverges_ from the semantics of standard operations. It should be okay to have a set of lowering patterns that only apply to a subset of standard operations.
You need to define the size of the allocated memory somewhere and, from that place, communicate it to the descriptor. The "how" of it is up to your language/lowering semantics. For example, the size can be fixed when the global is allocated; in this case, the global allocation should write it somewhere addressof can read it and put it into the descriptor. This can be done with two globals, as Mehdi suggested, or by lowering yourdialect.global : !yourdialect.foo to llvm.global !llvm<"{ i64, lowering-of(foo) }">. Or it can be a null-terminated string, at which point addressof can call an equivalent of strlen.
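One possible shape of the size-carrying lowering mentioned above (hypothetical dialect, types, and syntax; the exact llvm.global form may differ):

```mlir
// Source form (hypothetical dialect):
"yourdialect.global"() {sym_name = "g", type = !yourdialect.foo} : () -> ()

// Lowered sketch: the element count travels next to the payload, so
// the lowering of "addressof" can read it back into a descriptor.
llvm.global @g : !llvm<"{ i64, [10 x i32] }">
```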
 


--
-- Alex

Alex Zinenko

Dec 8, 2019, 11:17:32 AM12/8/19
to Uday Bondhugula, MLIR
On Sat, Dec 7, 2019 at 8:36 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:


On Saturday, December 7, 2019 at 12:17:25 PM UTC+5:30, Compiler Developer wrote:
 Isn't this the LLVM lowering problem which needs to be solved at conversion-to-llvm level rather than in MLIR itself? 

It shouldn't be - that's the point the other discussion was trying to make. 

I agree with the motivation of having allocations that are deallocated automatically (e.g., alloca) and manually (e.g., malloc/free). I would argue that the location, on heap or on stack, is an implementation detail of a lower-level dialect. What's trickier in defining an "alloca" equivalent in MLIR is the absence of first-class functions or a consistent notion of a call stack. Should alloca'ed values be deallocated at the end of the containing region? What about closures (e.g., a region in a following operation that uses alloca'ed memory is "stored" in a variable that survives the scope of the alloca)? If there is an "alloca" in, e.g., a loop-like region, how does it behave across different executions of that region?
 

> Creating a new std dialect operation for alloca (with similar semantics) in MLIR for non-LLVM lowerings doesn't mean much, right?

For a different target, you would convert it to whatever notion you have for stack allocation in the target dialect. alloca has certain similar and certain different semantics to heap allocation. Consider a transformation that hoists an alloc op out of a loop (or sinks it into a call) - you won't need to worry about a dealloc it if it was an alloca. With a heap alloc, you'll have to worry about moving around its dealloc op (all its dealloc ops in the general case) along with the alloc op.

~ Uday

 

On Saturday, December 7, 2019 at 7:22:59 AM UTC+5:30, Uday Bondhugula wrote:


On Thursday, December 5, 2019 at 8:48:12 PM UTC+5:30, mlir.d...@gmail.com wrote:
Hi,

Standard to LLVM conversion for memref types is always MemRefDescriptor structure. Memory is allocated either by malloc/ or alloc (using clUseAlloca). But this seems to be a restriction for following cases.

1. Using malloc/ llvm::alloca interchangeably based on the static/dynamic shape of the Memref, etc.  Is there a plan to implement the configurable Alloc/Dealloc op ? 

I've had an std.alloca op for experimentation for some time now (it is also what I needed/used for my article here: https://github.com/bondhugula/mlir/blob/hop/g3doc/HighPerfCodeGen.md).
I've made that available in an 'alloca' branch.
I just postponed submitting it upstream because it needs two clean-ups:
1) use an "AllocLikeOpInterface" to share methods between AllocOp and AllocaOp (besides rewrite patterns and pass behavior), and
2) refactor to reuse the common code between AllocOpLowering and AllocaOpLowering in the LLVM lowering (pretty straightforward).
Let me know if you are interested in contributing to it to submit upstream.

Using a separate alloca op with the op interface is, I believe, the right approach (as opposed to using an attribute on the alloc op), mainly because a DeallocOp isn't needed for the latter, as argued well by @aminim on this thread: https://github.com/tensorflow/mlir/pull/55 (but OpInterfaces weren't available at that time, and so the thinking was different).

I think -use-alloca was added recently just for quick testing; for all the reasons you mentioned and more, it's not really a solution/option for lowering.

~ Uday


--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.

Alex Zinenko

unread,
Dec 17, 2019, 5:22:04 PM12/17/19
to Compiler Developer, MLIR
FYI, https://github.com/tensorflow/mlir/commit/291a309d7164d9d38a8cc11cfdf88ca2ff709da0 implemented a separation between memory-related conversion patterns and the remaining patterns. From there, it should suffice to derive LLVMTypeConverter, override its convertType() to lower MemRefType differently while deferring to the base class for the other types, and provide the lowering patterns for std operations listed in populateStdToLLVMMemoryConversionPatterns that you care about.
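Following that commit's structure, the derivation could look roughly like the sketch below. This is a non-compilable sketch tied to no particular revision: `MyTypeConverter` and `convertMyMemRefType` are made-up names, and only `LLVMTypeConverter`/`convertType` come from the message above.

```cpp
// Sketch only: derive the type converter and override convertType() to
// give memrefs a custom LLVM representation, deferring all other types
// to the base class.
class MyTypeConverter : public LLVMTypeConverter {
public:
  using LLVMTypeConverter::LLVMTypeConverter;

  Type convertType(Type t) override {
    if (auto memref = t.dyn_cast<MemRefType>())
      return convertMyMemRefType(memref); // custom representation
    return LLVMTypeConverter::convertType(t); // defer for other types
  }

private:
  Type convertMyMemRefType(MemRefType t); // e.g., a bare pointer
};

// Then register your own patterns for the std memory operations you
// care about, instead of the ones that
// populateStdToLLVMMemoryConversionPatterns would add.
```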
--
-- Alex

Compiler Developer

unread,
Dec 19, 2019, 10:47:06 PM12/19/19
to MLIR
Thanks a lot!



