Hi Reid,
Thanks for sharing this. I plan to work on improving debug-info for variables
living in memory as my next "project", so I am very interested to hear what
others have to say about "Idea 1".
There is one part of the idea that confuses me. You say we could "keep the
dbg.declare instructions", but I don't see where dbg.declare instructions - at
least with their current semantics - fit into the design if the frontend is
emitting dbg.values after every assignment. Could you please expand on this part
a little?
Thanks,
Orlando
> I think what I meant is that we need to keep the association between the alloca and the local variable somewhere. The current implementation of dbg.declare is not what we want: when we have one, it overrides any dbg.values, and the alloca is used as the variable location for the full scope. That's not what we want, but we do need an association between alloca and variable+scope somewhere. The dbg.values in the design as we discussed it would not contain the alloca, but the SSA value which is stored to the alloca. Maybe if we augmented the dbg.values with the store destination, we could get by without tracking that information separately.
> The main design goal here was to have the variable location information be correct even when DSE occurs, without updating every pass that deletes stores, because there are many. Thinking about it today, I'm not sure this design is complete yet. Even if the frontend effectively emits two stores for every assignment - a real store and a dbg.value - the backend needs to determine if the real store survived optimization. If it did, then the variable value lives in memory. If it did not, then the variable value is the value which would've been stored, if it is available at this program point, or if it is available somewhere nearby. Maybe that's acceptable, but it seems difficult. However, maybe that's similar to what we already do for dbg.value uses that precede definitions.
That makes sense, thank you. I had also thought about extending dbg.value to keep the store destination. One step further would be to introduce a mechanism that refers to the store itself rather than to the store destination (possibly similar to Jeremy's recent MIR debug instruction referencing work [1], but applied to this situation in IR). If the referenced store has been removed (e.g. by DSE) then we know to use the value instead, which may itself have been salvaged or be undef by this point. And if the store has not been removed then we can grab the destination from the store itself.
I'm starting to form the opinion that the ideal solution will provide correctness by default without passes having to do anything, but allow the onus of improving coverage to fall upon those passes. We've recently taken a turn in this direction in D80264 [2] whereby we now replace an instruction's ValueAsMetadata uses with undef on deletion. This provides correctness by default for debug intrinsics using the instruction, and the coverage may be improved by a pass appropriately calling salvageDebugInfo beforehand.
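To illustrate with a sketch (variable metadata abbreviated; the arithmetic-folding rewrite is what salvageDebugInfo can do for simple binary ops today):
=== Before %add is deleted as dead ===
%add = add i32 %a, 1
dbg.value(%add, "x", DIExpression())
=== Deleted without salvage: correct by default, but coverage is lost ===
dbg.value(undef, "x", DIExpression())
=== Deleted after a salvageDebugInfo call: coverage preserved ===
dbg.value(%a, "x", DIExpression(DW_OP_plus_uconst, 1, DW_OP_stack_value))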
> The alternative to this design would be to say that stores to static allocas are special, and cannot be deleted without updating debug info. This is basically
https://llvm.org/pr34136#c25. Thinking about it again today, maybe this approach is feasible. We could build tooling to audit for passes that delete these special "assigning stores". Maybe that's better than bending
over backwards just to make it easy for the optimizers.
The "static allocas are special" [3] approach is appealing because it seems like it may give promising results without too much work (compared with these other ideas, anyway). It does rely on passes to provide debug-info correctness, but tooling may be an appropriate stand in for correctness by default if we decide to go down this route. Did you have anything in mind for this?
Thanks,
Orlando
[1] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html
[2] https://reviews.llvm.org/D80264
[3] https://llvm.org/pr34136#c25
The "static allocas are special" [3] approach is appealing because it seems like it may give promising results without too much work (compared with these other ideas, anyway). It does rely on passes to provide debug-info correctness, but tooling may be an appropriate stand in for correctness by default if we decide to go down this route. Did you have anything in mind for this?
Hi Reid,
Sorry for the long delay in getting back to you. I've been thinking about this problem a lot; I have some thoughts on the existing ideas, and I've also come up with another partial solution.
# Thoughts on existing ideas
Currently I prefer the idea you proposed earlier in this thread (the "Coffee Chat" method, which I summarise later) to the "Bugzilla" method [1], because there is less chance of us generating wrong debug info: as you mentioned, most optimisation passes won't need to worry about updating any debug info when messing with stores.
Additionally, regarding tooling for the Bugzilla method, you said:
> All I had in mind was something like:
> - add a pass that finds all "local variable stores" (stores to a static alloca used by a dbg.declare), and inserts markers (llvm.donothing?) carrying the local variable metadata after every such store
> - after optimization, check that every marker is preceded by either a store to a local variable or a dbg.value for the same local variable. Any marker with no dbg.value or store before it is a bug, unless the variable is available nowhere (I forget how that is represented)
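If I've understood, that would look something like this (a sketch only; the marker encoding here is invented for illustration):
=== After the marker-inserting pass ===
store i32 %one, i32* %dest ; "local variable store" for my_var
call void @llvm.donothing() ["my_var"] ; marker carrying the variable metadata
=== The post-optimisation check ===
; Walking backwards from each marker, finding either a store to the variable's
; alloca or a dbg.value for "my_var" is fine; finding neither is a bug, unless
; the variable is explicitly unavailable.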
This itself sounds similar to the Coffee Chat method implementation, which makes the Bugzilla method less appealing IMO since we'd almost be implementing both solutions. Summarising the version of the Coffee Chat method from this email [2] as I understand it:
00. Remove LowerDbgDeclare.
01. Frontends emit a new kind of debug intrinsic (dbg.assign?) after every store, tracking both the stored value and store destination.
02. When mem2reg promotes allocas it converts dbg.assign(value, dest) -> dbg.value(value).
03. If any dbg.assigns remain when we get to ISel then we know the alloca is still used for at least some of the variable's lifetime. We must choose to convert each remaining dbg.assign(value, dest) intrinsic to either:
A) DBG_VALUE value
B) DBG_VALUE dest, DIExpression(DW_OP_deref)
where B is preferable if we know the dest address holds the correct value.
To be able to choose correctly in step 03 we need a way of knowing if a specific store has been deleted or moved. To keep the desirable property of allowing optimisation passes to remain ignorant of that requirement, this needs to be handled automatically in the background. This would involve linking the store instruction to the dbg.assign, and I believe we can (and should) take advantage of existing infrastructure to do this.
For instance, we could change the store instruction to return a special DebugToken type value which dbg.assign intrinsics take as an argument. If the store is deleted, the store-token argument gets RAUW(undef) as per D80264 [2], and then in step 03 we simply look to see whether that argument is undef. We can also safely reason about movement of the store relative to the position of the dbg.assign in step 03, because we know exactly which store the dbg.assign is associated with.
Here is an example showing what that process looks like with DSE:
=== IR snippet from the frontend ===
%dest = alloca i32
%one = ...
%two = ...
%token.first = store i32 %one, i32* %dest
dbg.assign(%token.first, %one, %dest, "my_var", DIExpression())
%token.second = store i32 %two, i32* %dest
dbg.assign(%token.second, %two, %dest, "my_var", DIExpression())
=== DSE the first store ===
%dest = alloca i32
%one = ...
%two = ...
; ### First store DSE'd, RAUW(undef)
dbg.assign(undef, %one, %dest, "my_var", DIExpression())
%token.second = store i32 %two, i32* %dest
dbg.assign(%token.second, %two, %dest, "my_var", DIExpression())
=== Lower to MIR ===
...
DBG_VALUE %one, "my_var", DIExpression() ; Use the value component because the dbg.assign's token argument is undef.
MOV32mr %dest, %two
DBG_VALUE %dest, "my_var", DIExpression(DW_OP_deref) ; Use the dest component + deref because the dbg.assign's token argument is not undef.
Finally, imagine that %one also gets DCE'd after the store using it is removed. Its metadata use in the dbg.assign will get RAUW(undef), and when we lower to MIR we'll insert a DBG_VALUE $noreg, just like we would for a dbg.value(undef, ...), correctly indicating that there is no location available for the variable here.
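Continuing the example, the final state would look something like this:
=== DCE %one, then lower to MIR ===
; ### %one DCE'd, its use in the first dbg.assign RAUW(undef)
DBG_VALUE $noreg, "my_var", DIExpression() ; no location available here
MOV32mr %dest, %two
DBG_VALUE %dest, "my_var", DIExpression(DW_OP_deref)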
# An alternative idea
I also hoped to get your opinion on an idea for an alternative solution. I was trying to come up with something that would involve less work than these other ideas. It works in a similar way to your Coffee Chat idea, but doesn't require that link back to stores. The idea is basically to use stores to track assignments to alloca variables, and solve the DSE problem by inserting a new intrinsic dbg.store(value, dest) every time a store is deleted.
When mem2reg promotes allocas it converts the dbg.declare to dbg.values as per usual, and also inserts dbg.values after each dbg.store using the alloca. Later, if the alloca hasn't been promoted, we can determine whether a dbg.declare using it is still valid for the entire lifetime of the variable by looking for dbg.stores using the alloca. If there are none then no stores have been removed and we know the stack location is valid. Otherwise, we convert the dbg.stores to dbg.values and insert dbg.value+deref after the remaining (real) stores. In more detail (with a worked example after the steps):
00. Remove LowerDbgDeclare.
01. Keep dbg.declare and dbg.value semantics as they are now.
02. When a store is deleted, insert a dbg.store in its place:
        store value, dest -> dbg.store(value, dest, DIExpression())
03. mem2reg keeps its existing pipeline for dbg.declare -> dbg.value when promoting, with this addition:
    04. For each dbg.store(value, dest, expr) using the alloca:
        05. Insert a dbg.value(value, var, expr) before the dbg.store, with var taken from the dbg.declare.
    06. Delete all dbg.store(_, dest, _) using the alloca.
07. For each dbg.declare(dest, var, expr) that remains when we get to ISel:
    08. If there are no dbg.store(_, dest, _) using dest, emit a frame-index location for the variable.
    09. Else, for each dbg.store(value, dest, expr) using dest:
        10. Insert a DBG_VALUE(value, var, expr) before the dbg.store.
    11. For each (real) store to dest:
        12. Insert DBG_VALUE(dest, var, DIExpression(DW_OP_deref)) after the store.
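As I understand the steps, the earlier DSE example plays out like this under this scheme (same abbreviated style as before):
=== IR snippet from the frontend ===
%dest = alloca i32
dbg.declare(%dest, "my_var", DIExpression())
%one = ...
%two = ...
store i32 %one, i32* %dest
store i32 %two, i32* %dest
=== DSE the first store (step 02) ===
%dest = alloca i32
dbg.declare(%dest, "my_var", DIExpression())
%one = ...
%two = ...
dbg.store(%one, %dest, DIExpression()) ; ### inserted in place of the deleted store
store i32 %two, i32* %dest
=== Lower to MIR (steps 07-12) ===
DBG_VALUE %one, "my_var", DIExpression() ; step 10: use the dbg.store's value
MOV32mr %dest, %two
DBG_VALUE %dest, "my_var", DIExpression(DW_OP_deref) ; step 12: deref after the real store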
This solves the DSE problem, and gives us better location coverage than LowerDbgDeclare for the same reasons that the Coffee Chat and Bugzilla methods do: we'd end up with dbg.value+deref after stores and dbg.values after removed dead stores, instead of what we get now, which is dbg.values after both stores and removed stores.
The advantage of this method is that it shouldn't involve too much implementation work. The biggest downside is that it doesn't account for code motion moving stores, which is a regression from LowerDbgDeclare and a problem that the Coffee Chat method doesn't suffer from. If both dbg.stores and (real) stores represent assignments to alloca variables, moving stores around causes the program location of the assignments to differ from those in the source code.
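For example (a sketch; @observe stands in for any code between the assignment and wherever the store ends up):
=== Source order ===
store i32 %one, i32* %dest ; my_var is assigned here in the source
call void @observe()
=== After a pass sinks the store ===
call void @observe()
store i32 %one, i32* %dest ; my_var now appears to be assigned later than in the source, and nothing records the original position
Unfortunately, I don't think this solution is viable unless we can adjust it to account for that. What do you think?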
Many thanks,
Orlando
[1] Bugzilla method:
https://bugs.llvm.org/show_bug.cgi?id=34136#c25
[2] Coffee Chat method:
https://groups.google.com/g/llvm-dev/c/HX8HL1f3sw0/m/9ylo8-CnDAAJ