We will be giving a quick introduction to summarize the current
state of debug info handling in LLVM and highlight some of the
biggest problem areas. To get a productive discussion going,
here are some points we'd like to invite you to think about ahead
of time:
* What are the major pain points that you or your customers
experience? What works — what doesn't?
* Do you have out-of-tree patches that improve the optimized
debugging experience? What are your experiences with them, and
what are the challenges in upstreaming them?
* We would really like to start tracking debug info quality the
same way we track code size and performance numbers. What are
useful metrics for your use-case?
See you all at the dev meeting!
Adrian Prantl
Fred Riss
Our custom processor is a VLIW processor, and I think there is a growing trend in CPU design toward VLIW as transistor counts increase but clock speeds do not. When we compile at '-O0' we do not use more than one functional unit at a time (except for the predication unit plus one other, because we have to). This means that each "instruction" corresponds to a particular line of code, and an assembly '.loc' directive can be emitted for that instruction, which produces a reasonably straightforward correspondence between the instruction and the source location.
Optimisation for scalar processors introduces all sorts of fun - code-motion, code-duplication, code-elimination, you name it - but even so, an instruction generally corresponds to some line of code in the source even if it is not obvious to the person debugging their code why the line-tracking jumps around all the time.
But VLIW optimisation introduces a new twist in that each of the functional units can correspond to a different line of code, and this is really very hard to understand when debugging optimised code. The more functional units the VLIW architecture supports, the more difficult this becomes. To complicate matters further, predication units can control some functional units and not others, so a single instruction may contain elements which are conditional and others which are not, and which of course correspond to several different lines of code.
I do not even pretend to know much about DWARF and the representation of debug information, but it does appear that there is little or no support for the idea that a single "instruction" can correspond to multiple diverse lines in the source file. It would be useful for an assembly '.loc' directive to take a list of source locations, and for the debug metadata to represent this one-to-many mapping ('.debug_{[a]ranges|line}' etc.). At the moment we are limited to picking one of the several possible locations and hoping that it is helpful to the programmer when debugging their optimised program.
Perhaps this is simply my misunderstanding of the degree of metadata support in DWARF, but I don't see any obvious way of representing the one-to-many mapping in LLVM for bundles, or the selective predication of the elements in the bundle. This kind of bundling happens very late at the MI level, and by then the DebugLoc abstractions are getting quite difficult to work with. For instance, how do I tell LLVM that some of the instructions in a bundle containing a predicate are predicated by that predicate, and which instructions are always active? And how does this carry through to the debug metadata? The DWARF support in LLVM is (thankfully) largely independent of the target, but it still needs to have support for these abstractions and use-cases.
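As far as I can tell, each IR instruction can carry at most one '!dbg' attachment, i.e. a single DILocation, so even before bundling there is nowhere to hang a second source position. A simplified sketch of what I mean (the metadata numbers are just placeholders):
; Two operations from different source lines; each instruction can
; reference only a single DILocation via its !dbg attachment.
%sum  = add i32 %a, %b, !dbg !10
%prod = mul i32 %c, %d, !dbg !11
!10 = !DILocation(line: 42, column: 7, scope: !5)
!11 = !DILocation(line: 57, column: 3, scope: !5)
; Once the backend bundles these into one VLIW instruction, only one
; of the two locations can survive into the line table for that address.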
Another area that is really very painful for programmers, whether on a VLIW or a scalar target, is C++ templates. This is an enormously complex area to debug, in particular because the optimised program usually involves extensive use of inlining and multiple tiers of template specialisations. Even reasoning about the specialisations, whether at '-O0' or fully optimised, is incredibly hard.
I don't yet have out-of-tree contributions to make, because I have to admit I struggle with this too, but I have some ideas which I would like to promote in the future once I have ironed out the details. But it would be good to have the topic of VLIW and C++ templates added to the hot-topics for the BoF.
Thanks, and I am looking forward to reading about the outcome of this BoF,
MartinO
There is. There is even a patch for LLVM:
https://reviews.llvm.org/D16697
-Krzysztof
The patch refers to the target providing an 'op_index' register, but this seems like something that can only be handled by an integrated assembler. We use an external assembler, and I am curious whether there are new directives that we need to support for this. At the moment our assembler is unable to accept '.loc' directives between the individual operations of a VLIW instruction; is this something that we need to implement to get this level of VLIW debug support?
Thanks,
MartinO
Yes, it was me. It was pointed out (in conversations after the BoF) that we already have some pass (SROA?) that builds expressions for things; but that's pretty limited. We'll need utilities to build more-general expressions (and maybe some kind of SCEV visitor to build them), and also for full generality, debug intrinsics that take multiple value operands so that we can write DWARF expressions that refer to multiple values (which is currently not possible).
Thanks again,
Hal
Thanks for writing this up, Paul, and thanks to everyone who participated in the session! I found it to be a very productive discussion.
>>
>>
>> Unpacking that a bit...
>>
>> Induction variable tracking
>> ---------------------------
>> Somebody (Hal?) observed that in counted loops (I = 1 to N) the
>> counter
>
> Yes, it was me. It was pointed out (in conversations after the BoF) that we already have some pass (SROA?) that builds expressions for things; but that's pretty limited.
Yes that was SROA. There is also a patch lying around in review limbo that does a similar thing for the type legalizer.
> We'll need utilities to build more-general expressions (and maybe some kind of SCEV visitor to build them), and also for full generality, debug intrinsics that take multiple value operands so that we can write DWARF expressions that refer to multiple values (which is currently not possible).
To expand on this, the problem is that we cannot refer to IR from metadata, so in order to support this we could, for example, extend llvm.dbg.value() to accept multiple operands:
; Straw man syntax for calculating *(ptr+ofs).
; The first argument is pushed implicitly; for the subsequent ones we'll need a placeholder.
call @llvm.dbg.value(metadata i64* %ptr, metadata i64 %ofs, i64 0,
!DIExpression(DW_OP_LLVM_push_arg, DW_OP_plus, DW_OP_deref))
or something like that.
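For contrast, the best we can do today is a dbg.value with a single value operand that the attached expression implicitly operates on, so *(%ptr) is expressible but *(%ptr + %ofs) is not (eliding the variable operand as in the straw man above, and glossing over the exact intrinsic signature):
; Single value operand; the DIExpression implicitly starts from %ptr.
call @llvm.dbg.value(metadata i64* %ptr, i64 0,
     !DIExpression(DW_OP_deref))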
One topic that also came up in the discussion (after the?) session was the interest in making -O1 enable only optimizations that are known not to have an adverse effect on debuggability, or even in introducing a dedicated -Og mode like GCC has.
-- adrian