improving gc debug info

Ryan Brown

Jul 15, 2015, 8:34:04 PM

to golan...@googlegroups.com

I've been working on adding support for Go programs (compiled with gc) to the lldb debugger.

There's a few issues with the generated dwarf info that make things really difficult:

- There's no scope information for variables. It is very confusing seeing uninitialized values if you try to print a local variable before it comes into scope. Or even worse, when variables are shadowed there's no way to know which one to print.

- The imports for a file are not represented in dwarf. This makes expression parsing difficult when you need to refer to a type.
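To make the shadowing case concrete, here is a minimal Go sketch (the function name `shadow` is invented for illustration). Two distinct variables are both named `x`. Without lexical-block information in the debug info, a debugger stopped inside the `if` block sees two locals called `x` and has no way to decide which one is live.

```go
package main

import "fmt"

// shadow returns the values of "x" observed at three points in the function.
// Two distinct variables share the name x: the outer one and the one declared
// inside the if block.
func shadow(n int) []int {
	var seen []int
	x := n // outer x, in scope until the end of the function
	seen = append(seen, x)
	if n > 0 {
		x := n * 2 // inner x, shadows the outer x only inside this block
		seen = append(seen, x)
	}
	seen = append(seen, x) // outer x again
	return seen
}

func main() {
	fmt.Println(shadow(3)) // prints [3 6 3]
}
```

A debugger asked to print `x` would ideally return 6 inside the block and 3 outside it, which requires knowing the address range each declaration covers.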

There's also a few smaller annoyances:

- I believe the addresses for variables are only guaranteed to be correct at function calls. Things mostly work if you disable optimizations, but it would be nice to have correct location expressions for each line.

- The dwarf could contain information on inlined functions so you don't have to compile with inlining disabled.

Part of the problem seems to be that the dwarf info is generated entirely in the linker, but the linker doesn't have information about scopes or register contents.

I don't know the toolchain well enough but it seems like either the compiler needs to pass this information to the linker, or the dwarf generation needs to be moved into the compiler.

Ian Lance Taylor

Jul 16, 2015, 2:17:59 AM

to Ryan Brown, golan...@googlegroups.com

On Wed, Jul 15, 2015 at 3:55 PM, Ryan Brown <rib...@google.com> wrote:
>
> - There's no scope information for variables. It is very confusing seeing
> uninitialized values if you try to print a local variable before it comes
> into scope. Or even worse, when variables are shadowed there's no way to
> know which one to print.
> - The imports for a file are not represented in dwarf. This makes
> expression parsing difficult when you need to refer to a type.

Yes.

>
> There's also a few smaller annoyances:
> - I believe the addresses for variables are only guaranteed to be correct
> at function calls. Things mostly work if you disable optimizations, but it
> would be nice to have correct location expressions for each line.

This is conceptually complex and I find it less useful in practice
than it would appear. Alexandre Oliva went to considerable effort
adding this kind of thing to GCC. In my experience, the main effect
was that where in the past "print v" would show garbage, now it shows
an error message "v has no value at this point." That is certainly
nicer, but I'm not sure it's worth the amount of work that is
required.

> - The dwarf could contain information on inlined functions so you don't
> have to compile with inlining disabled.

Yes.

> Part of the problem seems to be that the dwarf info is generated entirely in
> the linker, but the linker doesn't have information about scopes or register
> contents.
> I don't know the toolchain well enough but it seems like either the
> compiler needs to pass this information to the linker, or the dwarf
> generation needs to be moved into the compiler.

Yes.

I don't know of anybody actively working on this.

Increased binary size due to extra debug info would be a concern, as
would increased compile time.

Ian

Austin Clements

Jul 16, 2015, 9:17:20 AM

to Ian Lance Taylor, Keith Randall, Ryan Brown, golan...@googlegroups.com

On Thu, Jul 16, 2015 at 2:17 AM, Ian Lance Taylor <ia...@golang.org> wrote:

> On Wed, Jul 15, 2015 at 3:55 PM, Ryan Brown <rib...@google.com> wrote:
> > There's also a few smaller annoyances:
> > - I believe the addresses for variables are only guaranteed to be correct
> > at function calls. Things mostly work if you disable optimizations, but it
> > would be nice to have correct location expressions for each line.
>
> This is conceptually complex and I find it less useful in practice
> than it would appear. Alexandre Oliva went to considerable effort
> adding this kind of thing to GCC. In my experience, the main effect
> was that where in the past "print v" would show garbage, now it shows
> an error message "v has no value at this point." That is certainly
> nicer, but I'm not sure it's worth the amount of work that is
> required.

The other problem in gc is that we don't output any debug info for registerization; the DWARF only gives the stack locations of variables. As a result, there are many cases where gdb will currently print out an old (but legal!) value from the stack copy of a variable, but its current value is perfectly accessible and just sitting in a register. If we could fix this, the output would change from (subtly wrong!) garbage to correct output, not "no value". I thought about this a bit when I was working on the ppc64 registerizer and I think it would be fairly easy to track the necessary information in the current registerizer. However, I don't have any sense how it would be in the new SSA world.

Keith, is SSA being designed with correct debug info for registerized variables in mind, or do you have a sense of how hard it would be to add? It seems like something that might not be hard if considered from the get-go, but will be very hard to add once there are serious optimization passes.

("v has no value at this point" is certainly annoying, but at least you know that you don't know the value of v. I hate it when debuggers lie to me because it often results in a disproportionate amount of wasted time; I'm not just debugging my program, I'm debugging the debugger's output.)
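The difference between a stale stack copy, a live register value, and "no value" can be sketched as a lookup over per-variable address ranges, which is roughly what a DWARF location list expresses. The types and names below (`Location`, `LocRange`, `Lookup`, the register name `AX`) are assumptions made for this illustration, not anything gc or the linker emits today.

```go
package main

import "fmt"

// Location says where a variable's current value lives. Both fields are
// illustrative; real DWARF encodes this as a location expression.
type Location struct {
	Register    string // non-empty if the value is in a register
	StackOffset int64  // used when Register is empty
}

// LocRange states that Loc is valid for PCs in [LowPC, HighPC).
type LocRange struct {
	LowPC, HighPC uint64
	Loc           Location
}

// Lookup returns the location covering pc. ok is false when no range covers
// pc, which a debugger would report as "no value at this point".
func Lookup(ranges []LocRange, pc uint64) (loc Location, ok bool) {
	for _, r := range ranges {
		if pc >= r.LowPC && pc < r.HighPC {
			return r.Loc, true
		}
	}
	return Location{}, false
}

func main() {
	// Hypothetical ranges for one variable: on the stack, then in a register.
	v := []LocRange{
		{LowPC: 0x10, HighPC: 0x20, Loc: Location{StackOffset: -8}},
		{LowPC: 0x20, HighPC: 0x40, Loc: Location{Register: "AX"}},
	}
	for _, pc := range []uint64{0x18, 0x28, 0x48} {
		loc, ok := Lookup(v, pc)
		fmt.Printf("pc=%#x ok=%v loc=%+v\n", pc, ok, loc)
	}
}
```

With only a single stack location recorded, a debugger would read offset -8 at every PC, including the range where the up-to-date value is in the register.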

Keith Randall

Jul 16, 2015, 11:40:00 AM

to Austin Clements, Ian Lance Taylor, Ryan Brown, golan...@googlegroups.com

On Thu, Jul 16, 2015 at 6:17 AM, Austin Clements <aus...@google.com> wrote:

> On Thu, Jul 16, 2015 at 2:17 AM, Ian Lance Taylor <ia...@golang.org> wrote:
> > On Wed, Jul 15, 2015 at 3:55 PM, Ryan Brown <rib...@google.com> wrote:
> > > There's also a few smaller annoyances:
> > > - I believe the addresses for variables are only guaranteed to be correct
> > > at function calls. Things mostly work if you disable optimizations, but it
> > > would be nice to have correct location expressions for each line.
> >
> > This is conceptually complex and I find it less useful in practice
> > than it would appear. Alexandre Oliva went to considerable effort
> > adding this kind of thing to GCC. In my experience, the main effect
> > was that where in the past "print v" would show garbage, now it shows
> > an error message "v has no value at this point." That is certainly
> > nicer, but I'm not sure it's worth the amount of work that is
> > required.
>
> The other problem in gc is that we don't output any debug info for registerization; the DWARF only gives the stack locations of variables. As a result, there are many cases where gdb will currently print out an old (but legal!) value from the stack copy of a variable, but its current value is perfectly accessible and just sitting in a register. If we could fix this, the output would change from (subtly wrong!) garbage to correct output, not "no value". I thought about this a bit when I was working on the ppc64 registerizer and I think it would be fairly easy to track the necessary information in the current registerizer. However, I don't have any sense how it would be in the new SSA world.
>
> Keith, is SSA being designed with correct debug info for registerized variables in mind, or do you have a sense of how hard it would be to add? It seems like something that might not be hard if considered from the get-go, but will be very hard to add once there are serious optimization passes.

The regalloc pass knows what lives in registers, so it would not be hard in principle to figure out ranges when values are in registers and dirty. The harder problem is probably mapping from SSA values back to Go variable names, I'm not sure how that is going to happen yet.

Josh Bleecher Snyder

Jul 22, 2015, 1:21:16 PM

to Keith Randall, golan...@googlegroups.com

> The regalloc pass knows what lives in registers, so it would not be hard in principle to figure out ranges when values are in registers and dirty. The harder problem is probably mapping from SSA values back to Go variable names, I'm not sure how that is going to happen yet.

Having an SSA value to Go variable mapping would also be very helpful for debugging SSA issues directly; I have been manually tracing vars back to the AST. I don't want to volunteer (yet, at least) to convert that into DWARF, but I'm motivated to do this first step for my own sanity.

The obvious thing to do is to add a *Node field to Value and propagate it through. However, Ian commented that gcc's experience was that the size of the Value struct has a significant impact on compile times. What do you think?
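The size concern is easy to see with a toy struct. The sketch below uses a made-up, simplified stand-in for an SSA value (the real `Value` struct has different fields) and compares its size with and without an extra pointer back to the AST. The helper name `sizes` is also invented for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// node stands in for the compiler's AST node; its contents do not matter here.
type node struct{ name string }

// value is a made-up, simplified stand-in for an SSA value.
type value struct {
	ID   int32
	Op   int32
	Line int32
	Args []*value
}

// valueWithNode is the same struct with an extra pointer back to the AST.
type valueWithNode struct {
	ID   int32
	Op   int32
	Line int32
	Args []*valueWithNode
	Node *node
}

// sizes reports the size in bytes of both variants on the current platform.
func sizes() (without, with uintptr) {
	return unsafe.Sizeof(value{}), unsafe.Sizeof(valueWithNode{})
}

func main() {
	without, with := sizes()
	fmt.Println("without Node:", without)
	fmt.Println("with Node:   ", with)
}
```

Every value pays for the extra word whether or not it corresponds to a named Go variable, which is why a side table or a reused field may be more attractive than a new field.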

-josh

Keith Randall

Jul 22, 2015, 2:49:12 PM

to Josh Bleecher Snyder, golan...@googlegroups.com

It definitely needs more thought. Rather than a generic *Node, I would think a specific SSA type would be better (*ssa.DebugVar?).

How would this value be propagated through SSA rewrite passes?

We already have line numbers propagated through SSA. Maybe we can overload that field somehow so we don't make Values bigger.
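One way to read "overload that field" is bit packing: keep the line number in the low bits of a 32-bit field and use the remaining bits as an index into a table of debug variables. This is only a sketch under assumed widths (20 bits for the line, 12 for the index). The actual type and width of the line field in the SSA package are not stated in this thread, and `pack` and `unpack` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	lineBits = 20            // assumed: low 20 bits hold the line number
	varBits  = 32 - lineBits // remaining 12 bits index a debug-variable table
	lineMask = 1<<lineBits - 1
)

var errRange = errors.New("line or variable index out of range")

// pack combines a line number and a debug-variable index into one uint32.
func pack(line, varIdx uint32) (uint32, error) {
	if line > lineMask || varIdx >= 1<<varBits {
		return 0, errRange
	}
	return varIdx<<lineBits | line, nil
}

// unpack splits a packed field back into its line number and variable index.
func unpack(p uint32) (line, varIdx uint32) {
	return p & lineMask, p >> lineBits
}

func main() {
	p, err := pack(1234, 7)
	if err != nil {
		fmt.Println("pack failed:", err)
		return
	}
	line, varIdx := unpack(p)
	fmt.Println(line, varIdx) // prints 1234 7
}
```

The trade-off is a hard cap on both the largest representable line number and the number of tracked variables per function, in exchange for keeping the struct the same size.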

Ian Lance Taylor

Jul 22, 2015, 4:02:53 PM

to Keith Randall, Josh Bleecher Snyder, golan...@googlegroups.com

On Wed, Jul 22, 2015 at 11:49 AM, 'Keith Randall' via golang-dev
<golan...@googlegroups.com> wrote:
> It definitely needs more thought. Rather than a generic *Node, I would
> think a specific SSA type would be better (*ssa.DebugVar?).
> How would this value be propagated through SSA rewrite passes?

If you are thinking in those terms, you may want to take a look at
https://gcc.gnu.org/wiki/Var_Tracking_Assignments and
http://www.fsfla.org/~lxoliva/papers/vta/slides.pdf .

Ian

Ryan Brown

Jul 30, 2015, 5:19:56 PM

to Keith Randall, Austin Clements, Ian Lance Taylor, golan...@googlegroups.com

Maybe it's worth looking at how llvm handles debug info. I think they have it set up to automatically preserve the debug info from one pass to the next without the pass being involved.

I assume we want to continue generating dwarf info in the linker. How should we pass the extra info to the linker? I see that there's already VARDEF and VARKILL instructions which are used for other things. We could add DEBUGVARDEF and DEBUGVARKILL instructions to encode variable scope. I'm not sure what to do for the imports. Maybe for each file you could add an init function which calls a fake function to record the imported packages and their names.
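As a rough sketch of how the linker could turn such markers into scope ranges: walk the instruction stream, open a range at each def marker, and close the innermost open range for that name at each kill marker. The `OpDebugVarDef` and `OpDebugVarKill` constants model the proposed DEBUGVARDEF and DEBUGVARKILL instructions, which do not exist. The `Inst`, `Scope`, and `Scopes` names are likewise invented for this example.

```go
package main

import "fmt"

type Op int

const (
	OpOther        Op = iota // any ordinary instruction
	OpDebugVarDef            // hypothetical DEBUGVARDEF: variable enters scope
	OpDebugVarKill           // hypothetical DEBUGVARKILL: variable leaves scope
)

type Inst struct {
	PC  uint64
	Op  Op
	Var string // variable name, set only for the two marker ops
}

type Scope struct {
	Var       string
	Low, High uint64 // variable is in scope for PCs in [Low, High)
}

// Scopes pairs each def marker with a later kill marker for the same name.
// Definitions of the same name nest, so a kill closes the innermost open def,
// which separates a shadowing variable from the one it shadows.
// Defs that are never killed are closed at endPC.
func Scopes(insts []Inst, endPC uint64) []Scope {
	open := map[string][]int{} // name -> stack of indexes into out
	var out []Scope
	for _, in := range insts {
		switch in.Op {
		case OpDebugVarDef:
			open[in.Var] = append(open[in.Var], len(out))
			out = append(out, Scope{Var: in.Var, Low: in.PC, High: endPC})
		case OpDebugVarKill:
			stack := open[in.Var]
			if len(stack) == 0 {
				continue // unmatched kill; ignored in this sketch
			}
			out[stack[len(stack)-1]].High = in.PC
			open[in.Var] = stack[:len(stack)-1]
		}
	}
	return out
}

func main() {
	insts := []Inst{
		{PC: 0x10, Op: OpDebugVarDef, Var: "x"},
		{PC: 0x20, Op: OpDebugVarDef, Var: "x"}, // inner x shadows outer x
		{PC: 0x30, Op: OpDebugVarKill, Var: "x"},
		{PC: 0x40, Op: OpDebugVarKill, Var: "x"},
	}
	for _, s := range Scopes(insts, 0x50) {
		fmt.Printf("%s [%#x, %#x)\n", s.Var, s.Low, s.High)
	}
}
```

Nested ranges like these map naturally onto nested lexical blocks in DWARF, so a debugger can pick the innermost block containing the current PC when a name is shadowed.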

-- Ryan Brown

Josh Bleecher Snyder

Jul 30, 2015, 5:30:55 PM

to Ryan Brown, Keith Randall, Austin Clements, Ian Lance Taylor, golan...@googlegroups.com

> Maybe it's worth looking at how llvm handles debug info. I think they have
> it set up to automatically preserve the debug info from one pass to the next
> without the pass being involved.

Do you happen to have a reference handy?

I've been playing with this, and it looks to me so far like something
like the gcc approach that Ian linked to is the most promising.

(The most interesting alternative I played with was a named gc.Node
type that implemented ssa.Type by calling through to its underlying
type. It was pretty handy for debugging because so many ssa values had
rich descriptions, like "x != f()". However, it was fundamentally
rather a hack.)

-josh

Ryan Brown

Jul 30, 2015, 5:55:10 PM

to Josh Bleecher Snyder, Keith Randall, Austin Clements, Ian Lance Taylor, golan...@googlegroups.com

I've only found documentation that explains how to use their apis to produce debug info. I haven't found any explanation of how it's implemented.