On Apr 11, 2019, at 9:21 AM, Stella Laurenzo <laur...@google.com> wrote:

Hey mlir team, I have a question about how we are planning to handle signed vs unsigned integer types in the various dialects. The way I see it, there are two rough categories of dialects currently, which make different assumptions here:
- "Source" dialects like TF, XLA, and TFLite often come from a history where the arithmetic ops are generic (e.g. Add, Mul, Sub, Div) and use the type to determine signedness.
- "Machine" dialects like LLVM, IREE, etc typically have discrete ops for the combination of (arithmetic operation, integral type, signedness) where the distinction matters and the raw type is just an IntegerType with a width.
MLIR is currently biased towards the "Machine" dialects, which I think is a fine starting point. It does leave us in a funny place, though, with respect to the "Source" dialects which need another place in the type system to carry the bit.
The TensorFlow dialect has taken an opinion on this and has defined its types to include element types that model the classic DType structure (here). To my reading of it, though, it appears to only model the unsigned ref types, not the actual signed/unsigned distinction on the primitives. This makes sense to me because it means that we don't have, for example, two types in the type system that can represent an integer. Are we just kicking the signedness issue down the road, or is there a plan here?
In my mind, this issue is similar to the previous discussion about tuple types, where we decided that it is important to faithfully model the source dialect conventions and that, even at the cost of some purity, type conventions that are pretty standard may make sense in the core vs. each dialect (while leaving interpretation and operations on them to the dialect). Like the tuple, having MLIR take an opinion on this, where it matters (i.e. not in the "Machine" dialects), could help interop. The opinion could be as simple as: in dialects where there is a signed/unsigned distinction, assume that IntegerType is signed (which MLIR already does in its representation) and use a (new) wrapper type "unsigned<i8>" to signal that the integer needs to be interpreted as an unsigned quantity. We could also take it a step further and say that IntegerType implies nothing about signedness and, if it matters to you, we provide both signed and unsigned wrapper types (i.e. signed<i8> and unsigned<i8>). I think I prefer the latter, but it would be a bigger change.
Putting such wrapper types in the core type system would also allow us to make sure that the way they are used is straightforward in the core (i.e. teach attributes and printers/parsers how to operate on them, vs. the patchwork of assumptions that are in place now).
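[Editor's note: as a toy illustration of the second option — hypothetical names only, not MLIR's actual type API — the same width-only integer type can be wrapped to carry the signedness bit, letting a "Source" dialect keep a single generic Div op whose behavior is chosen by the operand type:]

```python
# Hypothetical sketch: a width-only IntegerType plus signed/unsigned
# wrapper types, roughly as proposed above. One generic "div" op
# dispatches on the wrapper instead of having DivS/DivU variants.
from dataclasses import dataclass

@dataclass(frozen=True)
class IntegerType:
    width: int          # e.g. 8 for i8; says nothing about signedness

@dataclass(frozen=True)
class SignedType:       # would print as signed<i8>
    inner: IntegerType

@dataclass(frozen=True)
class UnsignedType:     # would print as unsigned<i8>
    inner: IntegerType

def generic_div(ty, lhs_bits: int, rhs_bits: int) -> int:
    """A single generic Div whose semantics come from the type."""
    width = ty.inner.width
    mask = (1 << width) - 1
    if isinstance(ty, UnsignedType):
        return ((lhs_bits & mask) // (rhs_bits & mask)) & mask
    # Signed: reinterpret the bit patterns as two's complement first.
    def to_signed(v):
        v &= mask
        return v - (1 << width) if v >> (width - 1) else v
    q = int(to_signed(lhs_bits) / to_signed(rhs_bits))  # C-style truncation
    return q & mask

i8 = IntegerType(8)
# The same bit pattern 0xF0 divided by 2: unsigned sees 240, signed sees -16.
print(generic_div(UnsignedType(i8), 0xF0, 2))  # 120
print(generic_div(SignedType(i8), 0xF0, 2))    # 248 (the bit pattern of -8)
```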
Thoughts? I'm asking because we're starting to bump up against this with XLA. Also, unsigned i8 types are important for imaging models that operate on framebuffers (and the ops that go into that) -- and of course there are other reasons. We're adjacent to needing some resolution for both of those.

Thanks,
- Stella
--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/D8CBF119-FC2B-4BF9-ADBC-9ED5623A52F4%40google.com.
On Thu, Apr 11, 2019 at 9:26 AM 'Chris Lattner' via MLIR <ml...@tensorflow.org> wrote:

> - "Source" dialects like TF, XLA, and TFLite often come from a history where the arithmetic ops are generic (e.g. Add, Mul, Sub, Div) and use the type to determine signedness.
> - "Machine" dialects like LLVM, IREE, etc. typically have discrete ops for the combination of (arithmetic operation, integral type, signedness) where the distinction matters and the raw type is just an IntegerType with a width.

This is a great way to think about this.

> MLIR is currently biased towards the "Machine" dialects, which I think is a fine starting point. It does leave us in a funny place, though, with respect to the "Source" dialects which need another place in the type system to carry the bit.

The general philosophy of MLIR is to allow dialects to model their domain as closely as possible, so I think it is important to represent things like "signed integer 8", because there are certainly dialects that need that. We have room within (e.g.) the std dialect to be opinionated about things, but when interfacing with existing systems we should try to do things the way they want.

> The TensorFlow dialect has taken an opinion on this and has defined its types to include element types that model the classic DType structure (here). [...] Are we just kicking the signedness issue down the road, or is there a plan here?

Dunno.

> We could also take it a step further and say that IntegerType implies nothing about signedness and, if it matters to you, we provide both signed and unsigned wrapper types (i.e. signed<i8> and unsigned<i8>). I think I prefer the latter, but it would be a bigger change.

I think I prefer the latter too. To me, the important principle is that the dialect type system can model the behavior the dialect wants. We want to have tf.Div, not tf.DivU and tf.DivS or something like that. There is space to decide *how* we want to model this, but I think a signed<> and unsigned<> modifier on integer types makes perfect sense and will compose nicely.

> Putting such wrapper types in the core type system would also allow us to make sure that the way they are used is straightforward in the core (i.e. teach attributes and printers/parsers how to operate on them, vs. the patchwork of assumptions that are in place now).

Yes, agreed, I think it is important to standardize this, for the same reason we have standardized tuple types.

On Apr 11, 2019, at 11:48 AM, Mehdi AMINI <joke...@gmail.com> wrote:

> Yes, agreed, I think it is important to standardize this, for the same reason we have standardized tuple types.

It isn't clear to me what it means to "standardize" something.
It may come back to a more fundamental question about the "standard" dialect itself, though (why isn't it a combination of an "arithmetic dialect" and others?).

I wonder if we can see it as having dialects that are designed for extensibility and are more generic, and others that are more specific. The existing extensibility/composability of dialects seems to be that a new dialect is able to re-use types and extend the operations defined by another dialect with its own set of operations, but is not able to inject new types for use by the operations defined by an existing dialect. This is because operations from a dialect usually have a verifier enforcing the set of supported types (for example, the standard "add" operation may not like to operate on a custom dialect type), while types themselves don't have a verifier (my dialect can happily operate on types from the LLVM dialect).

It seems to me that this hints toward a model where a dialect designed for extensibility/reuse should have rich types rather than rich operations (i.e. be designed as a "source dialect" in the definition Stella gave). With this logic, a dialect like the "standard" one would be better off with signed/unsigned integers modeled at the type level, and with generic operations to operate on them.

I'm not sure it makes sense to define the "standard" dialect with generic ops and encourage reusing them unless we manage to have dialects plug their verifiers into the use of types (i.e. using a "standard" add operation with TF dialect types should call into the TF dialect verifier); we don't have such a mechanism today.
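[Editor's note: the asymmetry described here — ops verify their operand types, types have no verifier — can be sketched with a toy model (hypothetical names, not MLIR's implementation): another dialect can freely re-use an existing type, but cannot make an existing op accept a new type.]

```python
# Toy model of "ops have verifiers, types don't": StdAddOp's verifier
# whitelists its operand types, so a foreign dialect type is rejected
# even though any dialect may re-use IntegerType itself.
class IntegerType:
    def __init__(self, width):
        self.width = width

class StdAddOp:
    """A 'standard'-style add whose verifier fixes the accepted types."""
    SUPPORTED = (IntegerType,)

    def __init__(self, lhs_type, rhs_type):
        self.lhs_type, self.rhs_type = lhs_type, rhs_type

    def verify(self):
        for ty in (self.lhs_type, self.rhs_type):
            if not isinstance(ty, self.SUPPORTED):
                raise TypeError(f"std.add does not support {type(ty).__name__}")

class TFStringType:  # a type belonging to some other dialect
    pass

StdAddOp(IntegerType(32), IntegerType(32)).verify()    # fine: re-used type
try:
    StdAddOp(TFStringType(), TFStringType()).verify()  # rejected by the op
except TypeError as err:
    print(err)
```

There is no hook in this sketch by which the TF dialect could extend `StdAddOp.SUPPORTED` from outside, which is the missing "plug your verifier on the use of types" mechanism the paragraph refers to.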
> My general tendency is to push back on the notion of "built-in" until it is very clear that it can't be modeled as a dialect. Another way to put it: builtins can be seen as poking through abstraction layers to work around limitations of the abstractions. Similarly, whenever there is a need for a first-class printer/parser that isn't just plugged in via a dialect (the standard dialect is unfortunately doing this right now, but I don't see why it couldn't be refactored), I tend to see an indication of a missing extensibility aspect of MLIR.
>
> In general, this is why I regret the name "standard" for this dialect: I am wary of considering it as "built-in" and taking advantage of that to bypass building the right abstractions. LangRef isn't yet totally clear on these aspects, by the way: the only built-in type is the function type, and separately it mentions standard types.
>
> So if your point is that the standard dialect should expose a distinction between signed/unsigned, then I agree: this is what I was trying to carry in my previous email. Re-use and interoperability across dialects is on types, not operations, so the types exposed by a dialect intended for re-use like "standard" should be richer (and include the distinction between signed/unsigned).
>
> If you're advocating for something more built-in about signed/unsigned than a dialect type, then I'm not sure I understand.

I'll strike my mention of "built-in". I'm really referring to first-class support in some fuzzy way, which comes down to "other dialects can use it freely and it feels like a part of MLIR, not a bolt-on". In addition, we would want to be able to store instances of this type in attributes (i.e. so that you can define a const of "unsigned<i8>"). There are some limitations in the current implementation of custom types that I think bias us against putting types like this in a dialect.

Your feedback is taken: that is a bug we should probably fix, versus kicking it down the road and continuing to put things like this at the wrong level.

Agreed on the standard dialect exposing a signed/unsigned distinction. From an implementation standpoint, I don't know how to do that by relying only on the dialect type hooks.
On Apr 11, 2019, at 11:48 AM, Mehdi AMINI <joke...@gmail.com> wrote:

>> Putting such wrapper types in the core type system would also allow us to make sure that the way they are used is straightforward in the core (i.e. teach attributes and printers/parsers how to operate on them, vs. the patchwork of assumptions that are in place now).
>>
>> Yes, agreed, I think it is important to standardize this, for the same reason we have standardized tuple types.
>
> It isn't clear to me what it means to "standardize" something. It may come back to a more fundamental question about the "standard" dialect itself, though (why isn't it a combination of an "arithmetic dialect" and others?).

In this case, we're talking about types, not operations. We've standardized tuple types by moving them to being built-in types (just like basic integers and floats are) but provide no operations on them. This is described here:

> I wonder if we can see it as having dialects that are designed for extensibility and are more generic, and others that are more specific. The existing extensibility/composability of dialects seems to be that a new dialect is able to re-use types and extend the operations defined by another dialect with its own set of operations, but is not able to inject new types for use by the operations defined by an existing dialect. This is because operations from a dialect usually have a verifier enforcing the set of supported types (for example, the standard "add" operation may not like to operate on a custom dialect type), while types themselves don't have a verifier (my dialect can happily operate on types from the LLVM dialect).

Yeah, I'm generally skeptical about saying that one dialect can extend some other dialect's operations. I don't see how this can work: in addition to the operations themselves, there will be some number of transformations and some number of accessors defined on the operations, and extending them retroactively will break assumptions about invariants that the operations are supposed to hold.

That said, it is possible that the invariants are defined in terms of classes (e.g. this works on any "floating point" type) and that the class could be extended (e.g. by adding float80) safely. I'm not sure that is a common enough thing to worry about.

> With this logic, a dialect like the "standard" one would be better off with signed/unsigned integers modeled at the type level, and with generic operations to operate on them.

I pretty strongly believe that the "standard ops" (whatever they end up being called) should model things as non-signed integers and should have floating point types separate. This is described here:

That section doesn't explain the signed vs. unsigned design, but I can add that if it is useful.
> In this case, we're talking about types, not operations. We've standardized tuple types by moving them to being built-in types (just like basic integers and floats are).

I'm not sold on the concept of "built-in": could the standard dialect's types use the regular dialect parsing/plugin mechanism instead of being builtins? Having the absence of a dialect prefix in the assembly be routed to the standard dialect, as the only thing different from other dialects, for the sake of not having to write "std." in front of "i32", seems OK to me if this is the only layer that is hard-coded.

> I pretty strongly believe that the "standard ops" (whatever they end up being called) should model things as non-signed integers and should have floating point types separate. This is described here: [...] That section doesn't explain the signed vs. unsigned design, but I can add that if it is useful.

It would be. I know this section, but it seems very oriented toward a "machine dialect" rather than a "source dialect" in Stella's taxonomy. It also talks about "The MLIR operation set", which gives what I believe is a false impression about MLIR: the converter from TF to TFLite is not related to this, for example.
On Apr 11, 2019, at 9:29 PM, Mehdi AMINI <joke...@gmail.com> wrote:
> I think that there is some confusion here. Are you suggesting that the types be dialect types or builtin types (like tuple, function, integers, complex, vector, etc.)? I agree that we should not have builtin operations, and I agree that standard ops are poorly considered (we should probably eventually split them into arithmetic ops, control flow ops, etc.), but standardizing structural type concepts into the core seems principled to me -- there is no benefit to forcing dialects to duplicate these concepts, at least if they are simple, easy to specify, and have non-debatable semantics.

Since dialects can re-use types defined by other dialects, you don't need "builtin types" to avoid duplicating concepts, right? An "arithmetic" dialect (or "scalar" dialect, or whatever we can name it) in MLIR can provide a set of types re-usable by other dialects.

I think we agree on providing re-usable components (or types, here); I'm just not sure what's in your mind behind "builtin types" and "standardizing [...] into the core", beyond dialects that are checked into the MLIR repo.
On Apr 11, 2019, at 9:03 PM, Mehdi AMINI <joke...@gmail.com> wrote:

>> In this case, we're talking about types, not operations. We've standardized tuple types by moving them to being built-in types (just like basic integers and floats are).
>
> I'm not sold on the concept of "built-in": could the standard dialect's types use the regular dialect parsing/plugin mechanism instead of being builtins? Having the absence of a dialect prefix in the assembly be routed to the standard dialect, as the only thing different from other dialects, for the sake of not having to write "std." in front of "i32", seems OK to me if this is the only layer that is hard-coded.

These types are built in; they are not part of the standard dialect or any other dialect. They are spelled "f32" and "complex<t>", not "!std.f32" or "!std.complex<t>".
On Apr 11, 2019, at 10:28 PM, Mehdi AMINI <joke...@gmail.com> wrote:

On Thu, Apr 11, 2019 at 9:50 PM Chris Lattner <clat...@google.com> wrote:

> These types are built in; they are not part of the standard dialect or any other dialect. They are spelled "f32" and "complex<t>", not "!std.f32" or "!std.complex<t>".

I think we can make a distinction between the "pretty" IR printing and the structural representation in the IR. For instance, the operations in the standard dialect aren't prefixed with "std." either; we just assume that std. is the default: https://github.com/tensorflow/mlir/blob/master/lib/Parser/Parser.cpp#L3407. The same thing could apply to types, and we could forward types without a prefix to the standard dialect as well.

Is there anything other than the lack of an explicit prefix that makes these types more "built in" than any other dialect-defined types? What limitations are OK on dialect types that "built in" types don't have? (I don't think we document this clearly, do we?)
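[Editor's note: the "default dialect" routing Mehdi describes can be sketched as a toy name-resolution step — illustrative names only, not MLIR's parser internals — where an unprefixed name is forwarded to the standard dialect:]

```python
# Toy sketch of default-dialect resolution: a name with no
# "dialect." prefix is routed to a default (here "std") dialect,
# so "addf" and "std.addf" resolve identically.
def resolve(name, registry, default_dialect="std"):
    """Map an op/type name to a (dialect, short_name) pair."""
    dialect, sep, short = name.partition(".")
    if not sep:  # no prefix: assume the default dialect
        dialect, short = default_dialect, name
    if dialect not in registry:
        raise KeyError(f"unknown dialect '{dialect}' for '{name}'")
    return dialect, short

registry = {"std": None, "tf": None}  # stand-ins for dialect objects
print(resolve("addf", registry))      # ('std', 'addf')
print(resolve("tf.Div", registry))    # ('tf', 'Div')
```

Under this scheme, the only hard-coded piece is the choice of default namespace; everything else goes through the ordinary dialect registry, which is the point being argued.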
On Apr 11, 2019, at 10:40 PM, Mehdi AMINI <joke...@gmail.com> wrote:

>> I'm sorry, my point isn't really the spelling here; it is the structure. These are builtin types, defined in include/mlir/IR/Types.h, not in a dialect. The point of doing it this way (as described in the rationale doc) is so that dialects don't have to redundantly specify these. It is possible that we could push these into some new dialect, but if so, that dialect shouldn't have any ops, so I don't know how that would be better than what we have now.
>
> The way I see it, having these types defined in a dialect instead of "built in" would help ensure that they aren't exploited to poke through any abstraction layer at the dialect level (nor bypass any limitation that applies to dialect-defined types). Doing the exercise of moving these types to a dialect wouldn't make them better than what we have now, but it may help make sure that the MLIR type system is as extensible as possible through the dialect abstraction.

I think I understand what you mean: you're saying that if we have a privileged set of types implemented in a different way, then we may have things that only they can do, which works against one of the goals of MLIR.

I can see that, and I can see that we already have one small aspect of this (likely to get fixed soon): the dialect type parser can't really call into the standard parser right now, so you can't implement something like complex<> or unsigned<> conveniently as a dialect type. I think that specific issue will be fixed for a variety of reasons, but there may be others.

OTOH, the first time a dialect-specific type wants to do something (like the above), we can easily generalize the system on the fly, as we're about to do. As such, I don't see any specific reason that fixing this is a priority.

Are you specifically interested in designing and fixing all of this (i.e., designing the dialect that contains these types, moving all code to depend on it, possibly introducing a dialect dependence system, moving all the code over, etc.), or is this something you're observing about the system but agree doesn't cause active harm?
On Apr 12, 2019, at 8:16 AM, Mehdi AMINI <joke...@gmail.com> wrote:

> OTOH, the first time a dialect-specific type wants to do something (like the above), we can easily generalize the system on the fly, as we're about to do. As such, I don't see any specific reason that fixing this is a priority. Are you specifically interested in designing and fixing all of this, or is this something you're observing about the system but agree doesn't cause active harm?

I agree it does not cause active harm, and even though I'd be interested in giving this a try, it is not enough of a priority to fix right now. The reason I am bringing this up is just the usual "look at where we want to be in the long term, and make sure everyone is aligned on the direction". Here it is mainly to make sure we're aligned on how to view these types, and that we operate with the right mindset and don't take advantage of the fact that these types aren't (yet) in a dialect.