Hi,
Regarding your first question, it's true that there's no undef instruction in
LLVM. Undef is a value, hence it *has* to be able to produce a different
result on each use (due to how RAUW works).
If undef were an instruction, it could have the semantics that all uses
observe the same value.
There are performance considerations when you impose that all uses see the
same value. In particular, you may need to materialize the value and keep it
in a register so that all uses observe the same one.
Having undef take different values on different uses introduces a lot of
complexity in the optimizers.
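For intuition, here is a toy Python simulation (invented names, purely illustrative, not LLVM's implementation) of the two models contrasted above: undef-as-*value*, where every use may observe a fresh bit pattern, versus a hypothetical undef *instruction* whose result is fixed once and shared by all uses:

```python
import random

class UndefValue:
    """LLVM-style undef value: each use may see a different bit pattern."""
    def read(self):
        return random.getrandbits(8)

class UndefInstruction:
    """Hypothetical undef instruction: the value is fixed when the
    instruction 'executes'; all uses observe that same value."""
    def __init__(self):
        self._value = random.getrandbits(8)
    def read(self):
        return self._value

ui = UndefInstruction()
assert ui.read() ^ ui.read() == 0      # all uses agree, so x ^ x == 0

uv = UndefValue()
print(uv.read() ^ uv.read())           # may be any 8-bit value, not just 0
```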
--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/0B9518F3CE2F46BCA702ACBFC94C6C34%40PC07655.
On Mon, Oct 7, 2019 at 2:58 PM 'Mahesh Ravishankar' via MLIR
<ml...@tensorflow.org> wrote:
>
> I wanted some guidance on how to handle undef values in the SPIR-V dialect. Please see below for details. Some of the details below require some basic understanding of the SPIR-V dialect.
>
> Issue:
>
> The aim here is to model the OpUndef instruction in the SPIR-V dialect. According to the spec, this instruction creates an SSA value, and each use of that SSA value may produce a different bit pattern (similar to LLVM's undef, with one difference highlighted below). We could model this directly in the SPIR-V dialect, i.e. a spirv::UndefOp that produces an SSA value with the semantics that every use produces a different bit pattern.
> I find that hard to reason about, since it makes an SSA value produced by a spirv::UndefOp different from other SSA values.
>
> Current solution I have in mind:
>
> An alternative way I was thinking of doing this is to define the semantics of spirv::UndefOp as producing an SSA value which represents the same bit-pattern on every use. With this, some special handling is needed during deserialization of a SPIR-V binary into the SPIR-V dialect in MLIR.
This isn't needed for correctness though, right? Since every use of
OpUndef _could have_ had the same value. It would even be correct to
"deserialize" each OpUndef into a constant zero.
However, I think we'll have a problem going the other way, from the
SPIR-V dialect to OpUndef. If spirv::UndefOp is defined to produce a
consistent value then it cannot be lowered into OpUndef which will
produce a different value on each use.
-- Sanjoy
> To stay consistent with the SPIR-V spec, a new UndefOp is created at every use of the result <id> generated by an OpUndef instruction. This stays consistent with the SPIR-V spec, and we don't need to treat every use of the result of a spirv::UndefOp as a different bit pattern.
>
> Questions
> 1) I was looking at what LLVM does for undef. The description seems to be consistent with SPIR-V, but there is a slight difference AFAICS: there is no "undef instruction". The undef value is directly embedded into the IR where needed, so this issue does not arise. Could someone confirm this?
> 2) The LLVM dialect has an llvm.mlir.undef operation whose specification is similar to that of the spirv::UndefOp. I think (haven't verified) that when lowering to LLVM IR an undef value is created at every use of the result SSA value. This effectively means that the LLVM dialect models the undef value as having a different bit pattern on every use. Is there any guidance/opinion on how undef values should be modeled in MLIR?
>
> Finally, though, I am not really sure there is any observable difference between the two approaches, so this might all be a moot point. Some input here would be very useful to cover any blind spots on my side.
>
> --
> Mahesh
Yes, technically we can, but by doing this at the very beginning we are limiting the optimization opportunity there, right? My take on undef values is that they are meant to provide flexibility to compilers, so compilers can choose a suitable value interpretation as they see fit.
On Mon, Oct 7, 2019 at 3:28 PM Nuno Lopes <nuno....@ist.utl.pt> wrote:
> Hi,
> Regarding your 1st question, it's true that there's no undef instruction in
> LLVM. Undef is a value, hence it *has* to be able to produce a different
> result on each use (due to how RAUW works).
> If undef was an instruction, it could have the semantics that all uses
> observe the same value.
> There are performance considerations to be had when you impose that all uses
> see the same value. In particular, you may need to materialize a value and
> keep it in a register so that all uses observe the same value.
> Having undef take different values on different uses introduces a lot of
> complexity in the optimizers.
But it also simplifies other optimizers, right? Not trying to downplay the other problems; your paper did a good job at it :)
It seemed to me that many peepholes can take a local decision and propagate undef without having to "remember" or encode the value they based their earlier decision on.
It means that `if x == 0 && x == 1` is allowed to evaluate to true when x is undef in the IR, because the two conditions can be evaluated in isolation.
True. Though there aren’t that many peephole optimizations that require this. In the past we’ve experimented with disabling undef altogether and checking which optimizations became wrong. And it’s a very small % (I don’t have the concrete number, sorry).
Nuno
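The `x == 0 && x == 1` point above can be simulated with a throwaway toy model (random draws standing in for per-use undef; not real IR evaluation): when each use of x may observe a different value, the conjunction is allowed to come out true.

```python
import random

def undef_use():
    # each *use* of x draws a fresh value
    return random.choice([0, 1])

# Evaluate each comparison's use of x independently, many times.
# With per-use undef, (x == 0) may see 0 while (x == 1) sees 1.
results = {(undef_use() == 0) and (undef_use() == 1) for _ in range(1000)}

# Over 1000 trials, both outcomes almost surely appear.
print(sorted(results))
```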
I have no clue about SPIR-V, but I just had a quick look at the spec here: https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpUndef
The semantics of their undef op seem identical to LLVM’s: each use *may* observe a different value.
Some examples:

    x = spirv::UndefOp()
    y = xor x, x   // can yield any value
    for (...) {
      print(x);    // can print any value in each iteration, including all equal or all different
      print(y);    // likewise
    }
BTW, there’s another twist in LLVM semantics, which I’m not sure SPIR-V follows: any expression based on undef may yield a different value each time it’s used, i.e. undefs can flip values even within sub-expressions. In the example above, “xor x, x” *may* yield a different value in each iteration of the loop.
Nuno
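This loop example can be mimicked with a toy model (random numbers stand in for per-use undef; purely illustrative, not real IR semantics): even the derived expression `xor x, x` may change between iterations, because x is re-observed on every use.

```python
import random

def use_of_x():
    return random.getrandbits(4)   # a fresh 4-bit pattern per use

seen_y = set()
for _ in range(100):
    y = use_of_x() ^ use_of_x()    # "y = xor x, x", re-evaluated per use
    seen_y.add(y)

# Almost surely more than one distinct y was observed across iterations.
print(len(seen_y) > 1)
```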
This is true for SPIR-V too.

The SPIR-V dialect only uses FuncOp from the standard dialect. As far as I can tell, the SPIR-V dialect is using the standard types for its arithmetic: https://github.com/tensorflow/mlir/blob/master/test/Dialect/SPIRV/arithmetic-ops.mlir
I'm not sure I'd agree with considering a standard dialect i32 SSA value as having a different bit pattern on every use.

So it can implement any semantics needed to help with optimizations/SPIR-V (de)serialization. But on this point, what would happen in the LLVM IR dialect with the example above?

    %0 = llvm.mlir.undef : i32
    %1 = llvm.xor %0, %0 : i32

If no analysis is done here in the LLVM dialect, and it is just lowered to LLVM IR, I understand that it implements the LLVM semantics. But if I had to write a peephole optimizer for the LLVM dialect, should I then handle llvm.xor differently based on whether %0 is an undef value or not?

If you export this, you would have the undef being folded as a constant, not an SSA value, I believe.

-- Mehdi

-- Mahesh
On Tue, Oct 8, 2019 at 9:51 AM Mahesh Ravishankar
<ravish...@google.com> wrote:
> Agree with (1) and (2). Good example again about (3). I don't really know how to fix the issue though. You suggested using freeze (from the paper). I think the way I was thinking of spirv::UndefOp already incorporates the freeze semantics: the value returned by a spirv::UndefOp is a freeze of an "undefined" value, with every spirv::UndefOp generating a (potentially) different bit pattern. That still has the same issue that the semantics of the code as expressed in the SPIR-V dialect differ from the semantics of the serialized SPIR-V binary. Is the suggestion that we mimic "freeze" in the SPIR-V binary as well?
Yes, that is what I was suggesting. spirv::UndefOp() can be lowered
into Freeze(OpUndef()) in the SPIR-V binary.
> We can do this by doing the following transformation at the time of serialization to make the example above:
>
> x = spirv::UndefOp
> y = spirv::AndOp x, 0
I assume you meant `spirv::OrOp x, 0`.
In any case, I'm not sure this is sufficient: `Or undef, 0` needs
to be `undef`, as otherwise the optimizer cannot replace `Or x, 0`
with `x` without proving that x is not undef (which can be difficult).
However, if SPIR-V binaries do not need to be optimized further and
the `Or x, 0 => x` transform is not important, then we could define
`Or 0, x` to be freezing `x` and get this property.
I was thinking about using some special intrinsic that SPIR-V
optimizers do not optimize. Perhaps there is a way to inject a "no
op" inline assembly that is opaque to optimizers?
On Tue, Oct 8, 2019 at 1:49 PM Sanjoy Das <san...@google.com> wrote:
> I was thinking about using some special intrinsic that SPIR-V
> optimizers do not optimize. Perhaps there is a way to inject a "no
> op" inline assembly that is opaque to optimizers?

I am not aware of any such special intrinsic. Maybe @Lei Zhang knows about some way of freezing it.

Since we cannot assume that the compilers within Vulkan drivers (which convert a SPIR-V binary to machine code) implement freeze semantics, there might be an issue supporting spirv::UndefOps. The only thing we can be sure of is that converting a SPIR-V binary to the SPIR-V dialect preserves semantics. I am not sure that, going from SPIR-V binary -> SPIR-V dialect -> SPIR-V binary, we can guarantee that the semantics of the initial and final SPIR-V binaries are the same if the original binary had an OpUndef instruction, unless undef is supported the way LLVM does it, either within MLIR or the SPIR-V dialect in MLIR.
"… the ‘undef’ “variable” can arbitrarily change its value over its “live range”."
On Tue, Oct 8, 2019 at 8:58 PM Mahesh Ravishankar
<ravish...@google.com> wrote:
> I can't think of a concrete case where the SPIR-V dialect will do something that will change the semantics of the original SPIR-V binary, but I don't know if I can rule it out either.
Maybe I'm reading too much into this, but I don't think this is the
best way to think about the problem. The only two questions we should
ask are:
- How can we correctly convert SPIR-V binaries to MLIR?
- How can we correctly convert MLIR to SPIR-V binaries?
Ideally these would be well-posed questions because both MLIR and
SPIR-V have well defined semantics.
If either of the two steps are incorrect we have a miscompile in a
SPIR-V -> MLIR -> SPIR-V pipeline, period. It does not matter if we
happen to have an optimization today that manifests this miscompile or
not.
The second-order way of reasoning ("is this MLIR transform valid for
the 'original' SPIR-V binary?") is unnecessarily confusing, I think.
On Tue, Oct 8, 2019 at 9:19 PM Sanjoy Das <san...@google.com> wrote:
> If either of the two steps are incorrect we have a miscompile in a
> SPIR-V -> MLIR -> SPIR-V pipeline, period.

I am sorry if I wasn't being clear; I am agreeing with what you are saying. It seems to me that there is no way of getting the semantics right between MLIR and SPIR-V unless we have a mechanism to handle undef values in MLIR consistent with SPIR-V/LLVM (or SPIR-V gets a more robust mechanism for "freeze"). The immediate use case for me is to make sure that SPIR-V -> MLIR -> SPIR-V works fine. But as you pointed out, if MLIR -> SPIR-V miscompiles, there is no round trip. Using a dummy Or operation to mimic freeze does not actually seem to solve the problem: a compiler within the Vulkan driver might optimize the Or operation away. AFAICS, there is no solution for the correct compilation of MLIR -> SPIR-V at this point...
Hi,

It seems like some email answers to the list were lost; we misconfigured it to just reject emails from non-members instead of sending them through moderation. This should be fixed now. Sorry for the inconvenience, feel free to resend the messages!

Best,
-- Mehdi
On Wed, Oct 9, 2019 at 8:28 AM John Kessenich <johnke...@google.com> wrote:
> Yes, we are discussing having other operations that produce undefined results produce the same kind of undef as OpUndef.
So is it true that we can't rely on all the undefs being explicit
OpUndefs in the graph, and hence can't necessarily replace them with a
safe constant value?
> BTW, I don't think turning undef into ConstantNull is satisfactory: The optimization allowed by undef isn't an aside, but is a primary purpose.
OTOH we're talking about serializing an already optimized (?) MLIR
program so maybe there isn't a need to re-optimize it once we have it
as a SPIR-V binary? And we don't need to fold it to `0` necessarily,
we could fold it to any constant "convenient" for optimization.
However, this is a moot point unless we have a way to reliably replace
all undefs, not just the ones that are explicitly represented as
OpUndef.
-- Sanjoy
>
> JohnK
>
>
> On Wed, Oct 9, 2019 at 8:30 AM Sanjoy Das <san...@google.com> wrote:
>>
>> Is OpUndef the only way to get undef? Or can (e.g.) loading from
>> uninitialized memory also produce undef?
>>
>> -- Sanjoy
>>
>> On Wed, Oct 9, 2019 at 7:24 AM David Neto <dn...@google.com> wrote:
>> >
>> > When you are translating *into* SPIR-V, then you can make your life simple by using OpConstantNull instead of OpUndef.
>> > Then there is no confusion. The worst effect of this, as far as I know, is that it may force downstream compilers to save and restore a register value, which *might* slow down the code. OpUndef, and undef values in general, is a way of saying to the compiler "hey you can ignore the current value of the register you're about to use because my code is going to overwrite it soon anyway".
>> >
>> > You can think of OpConstantNull as a "freeze" to a specific value, i.e. zero. :-)
(I later saw that Mehdi already pointed out the "you can always replace with null" trick for the one-direction conversion.)

- FYI: as part of the WebGPU effort, some folks surveyed sources of undefined behaviour and undefined values in SPIR-V. See the catalog here: https://github.com/gpuweb/gpuweb/issues/34 It's two years old but a good start.
- On "when we go from SPIR-V binary -> SPIR-V dialect -> SPIR-V binary. If there is an undef operation used in the original binary, and we go from a more general semantics of undef to a narrower semantics of constant value everywhere, and then serialize it back to the SPIR-V binary, I am not able to decide if that implies an incorrect change to the program semantics." This is not incorrect. Yes, it is a narrowing of possible behaviours. However, think of it this way: every valid execution of the final form of the module is also a valid execution of the original module.
- Undefined behaviour vs. undef value: these are two different things. Undefined behaviour is "the world can blow up" level of weirdness. An "undef value" is "you get lots of possibilities for this one value, each time you evaluate/look at it", and that's the whole possible scope of weirdness.
- Example of undefined behaviour: writing outside the bounds of a storage buffer. A data race is also undefined behaviour.
- Example of an undef value: an OpUndef result. Or the value you get from division by zero. Or a bit shift by too many bits. Or OpVectorExtractDynamic with an out-of-bounds index.

david
Thanks David for the note. I may be overthinking this, but if we are going just from {something} -> SPIR-V dialect -> SPIR-V binary, then I think we are OK with using a constant value all the time (...@Mehdi Amini: you pointed this out as well). My concern is when we go from SPIR-V binary -> SPIR-V dialect -> SPIR-V binary. If there is an undef operation used in the original binary, and we go from the more general semantics of undef to the narrower semantics of a constant value everywhere, and then serialize it back to the SPIR-V binary, I am not able to decide whether that implies an incorrect change to the program semantics. Reasoning about it the way ...@Sanjoy Das suggested, as two separate steps that are individually correct, suggests the roundtripping is OK. But this relies on some undefined behavior: given that drivers have a compiler that consumes the SPIR-V binary, which often is an LLVM-based compiler, we might have a situation where the original SPIR-V binary and the final SPIR-V binary do different things.
--Mahesh
Sounds good to me. And +1 to the example relating undef to undefined behaviour.

I agree with John that there is a potential for a perf loss in converting Undef to ConstantNull, and that the potential for performance gain is the main reason to have Undef in the first place. That said:

- My team's Clspv compiler lowers LLVM's undef into SPIR-V OpConstantNull. We did that to work around a driver bug, but nobody has complained about performance so far, and I made this the default.
- One of the main places I see an undef value in LLVM code is in constructing a vector by beginning with an undef vector value, then repeatedly inserting scalars to fill all the slots. In this case the existence of the undef is an artifact of the fact that LLVM doesn't have a "create a vector all at once" instruction. But SPIR-V does have OpCompositeConstruct to do exactly that. I wrote a pass in Clspv to transform those LLVM chains-of-inserts into a single OpCompositeConstruct (effectively). It actually shortened the code and made it clearer.
- JF Bastien (C++ lead at Apple, formerly NaCl lead at Google) will talk at the upcoming LLVM Dev Meeting about automatically initializing all local variables. The idea is to do this for security reasons, but I've seen him report that the perf hit is small. See https://llvmdevmtg2019.sched.com/event/W2tz/making-ub-hurt-less-security-mitigations-through-automatic-variable-initialization

I'd love to see updated perf data for the undef -> ConstantNull transform, or more specifically perf data focused on GPU compute. In case anyone gets bored. :-)
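The chain-of-inserts rewrite described above can be sketched on an invented mini-IR (not Clspv's actual data structures): a sequence of scalar inserts into an initially-undef vector collapses into one composite-construct once every lane has been written, eliminating the undef entirely.

```python
def fold_insert_chain(chain, width):
    """chain: list of (lane, scalar) inserts applied to an undef vector.
    Returns the equivalent one-shot construct if all lanes are written,
    else None (some lane would still be undef, so keep the chain)."""
    lanes = {}
    for lane, scalar in chain:
        lanes[lane] = scalar           # a later insert to a lane wins
    if set(lanes) == set(range(width)):
        return ("composite_construct", [lanes[i] for i in range(width)])
    return None

chain = [(0, "a"), (1, "b"), (2, "c"), (3, "d")]
print(fold_insert_chain(chain, 4))     # one construct, no undef left
```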
david

On Wed, Oct 9, 2019 at 6:04 PM Mahesh Ravishankar <ravish...@google.com> wrote:
> Great! Thanks everyone for all the feedback! I think the discussion gives me a better idea of how to go about this. As it stands, the deserializer in the SPIR-V dialect already does the right thing while going from SPIR-V binary to MLIR. I think the consensus is to use a ConstantNull value for OpUndef during serialization. I plan to take that approach for now unless something else comes up.
This is the rewrite inserts pass in LLVM. It handles a bunch of non-obvious cases. Unfortunately I don't have separate per-pass tests for it, as it preceded Alan's infra work for per-pass tests.