Thoughts around writing a Go debugger


Rocky Bernstein

Jun 19, 2013, 8:29:45 AM
to golan...@googlegroups.com
As a hobby I've been considering writing a debugger for Go. Since talk is cheap, I'll lay out here some thoughts behind writing one.

There are two extremes in the overall build process that a debugger can hook into, and each has its own benefits and disadvantages.

At one extreme, one can run "go build" (possibly with some additional options) and debug the final code using whatever symbol-table or debugger information that is available in the executable. This has the advantage of having the program act most like what gets run when the debugger is not around. From the programmer's standpoint, this is also the most desirable as it reduces the chance of the debugger causing a Heisenbug.

But depending on the level of expressive power the debugger provides and the compiler optimization used in generating the program, this can be the hardest to do. Source code can get transformed in all sorts of ways that the casual Go programmer may not be aware of. As a result, a Go expression might map to several non-contiguous regions of the generated code, or might not appear in the code at all. On the other hand, a single instruction might map to fragments of several distinct locations in the source code, or to no identifiable place in the source at all. The value of a variable might be either in storage or in a register, or might not exist explicitly but instead be derivable from other values. Or maybe it's not available at all.

Although this sounds dire, in practice I've debugged optimized C code using gdb and it's generally not too bad.

Before leaving this extreme, I'd like to ask those who have used the Go Windows debugger what its capabilities are. I assume one can set breakpoints at specific lines or functions in the program? And step statements? How about the ability to evaluate expressions, or to look at or modify variables? Can one get a call stack trace with the names and values of parameters? Or view the state of all goroutines? Does going into the debugger pause goroutines?

At the other extreme, a debugger can hook into the AST and provide an interpreter for that. The AST most closely resembles the source code.

In practice I think having debuggers at both ends is ideal. But let me start at the AST side, since for me that's more fun.

Having written debuggers for many scripting languages -- POSIX Shell (bash, ksh, zsh), Python, Perl, and Ruby -- the way all of these types of debuggers work is that at various "events" in the course of executing the debugged program, a routine is called. That routine handles debugging, profiling, execution coverage, or tracing. And rather than have some fixed code, what is typically done is to provide a function to register a callback function.

The most common "events" occur before:
- reaching a user-specified breakpoint location
- starting to execute a new statement
- entering a function
- leaving a function
- accepting a message
- a fatal exception

One could also add things like before evaluation of the expression part of an if, a loop test, and so on.
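
To make the callback idea concrete, here is a minimal sketch of registering a hook per event kind. The names Event, Hook, Tracer, Register and Fire are made up for illustration and don't correspond to anything in go.tools or any other existing package; a real interpreter would call something like Fire from its evaluation loop.

```go
package main

import "fmt"

// Event identifies a point in execution at which an interpreter
// would call back into a debugger, profiler, or tracer.
// (Hypothetical type, for illustration only.)
type Event int

const (
	Breakpoint Event = iota
	Statement
	Call
	Return
)

// String names the event for display.
func (e Event) String() string {
	return [...]string{"breakpoint", "statement", "call", "return"}[e]
}

// Hook is the callback signature; pos is a source position string.
type Hook func(ev Event, pos string)

// Tracer holds at most one hook per event kind.
type Tracer struct {
	hooks map[Event]Hook
}

// NewTracer returns a Tracer with no hooks registered.
func NewTracer() *Tracer {
	return &Tracer{hooks: make(map[Event]Hook)}
}

// Register installs h for ev, replacing any earlier hook for ev.
func (t *Tracer) Register(ev Event, h Hook) {
	t.hooks[ev] = h
}

// Fire is what the interpreter would call at each event.
// Events without a registered hook cost one map lookup and nothing more.
func (t *Tracer) Fire(ev Event, pos string) {
	if h, ok := t.hooks[ev]; ok {
		h(ev, pos)
	}
}

func main() {
	t := NewTracer()
	t.Register(Statement, func(ev Event, pos string) {
		fmt.Printf("%s at %s\n", ev, pos)
	})
	t.Fire(Call, "main.go:3:1")      // no hook registered for Call
	t.Fire(Statement, "main.go:4:2") // invokes the Statement hook
}
```

Keeping the hooks in a table rather than hard-wiring the debugger means the same mechanism could serve tracing, coverage, or profiling just by registering a different function.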

As others have mentioned, right now there is a pretty usable interpreter in go.tools. Although that works off of SSA rather than AST, to start off with that's probably okay for a debugger. But it would be useful to turn off the lifting code, and use the "Naive SSA form" which is not on by default. And right now the "emit" part of the SSA builder lacks these kinds of marks and callbacks, but that is easily added.

One trepidation I have here is that the stated goal of the interpreter seems to be higher performance, which for the reasons I've stated might be a hindrance in a debugger. In fact, one of the problems I've had in trying to get better debugger run-time support into programming languages is the necessary concern about degrading performance. My own take when this occurs is to provide two run-time environments, possibly with a way to switch back and forth.

The Rubinius implementation of Ruby is kind of interesting in this regard. It does optimization by JIT'ing. So it always keeps a simple, stupid, but close-to-high-level representation of the program, which it uses whenever it detects that a debugger is in use.

In this last section, let me close with some aspects that Go currently lacks that would make it easier to write a debugger but might also be of use in Go otherwise.

The first thing is adding an "eval()" function. This for example allows the debugger to simply see or set values of variables and expressions.

In an interpreter such as the one using SSA, I think this is pretty straightforward to add.

Since Go is strongly typed, I guess one should provide a strongly-typed version in addition to one that returns a generic interface{} or unsafe.Pointer that one has to reflect on or convert with a type assertion or conversion. As an alternative to an eval() that returns interface{} or unsafe.Pointer, eval() could just assume that whatever needs to be done is handled inside the string it is passed. That is, if you want to set a variable, then do that inside the eval.

For the purpose of a debugger, it is possible to work up other mechanisms to set and see variables. A debugger towards the gdb end of the spectrum will probably still have to do this.

But for an AST-like interpreter, I think eval() will add lots of power with very little mechanism. And one might imagine a gdb-like debugger somehow using its more cumbersome variable access mechanism hooked into an embedded AST/SSA-like interpreter.

- - - -

The last aspect I want to bring up is dealing with imports in an AST interpreter, and with mixed-mode interaction between interpreted code and compiled Go/C code.

An AST-style Go interpreter of the kind I've been describing *can* easily provide the ability to dynamically import code. To do so it would pull in the source and compile it on the fly.

I believe there's also a way, in a limited fashion, to use the compiled versions of some of the imported code that is known to the interpreter, perhaps because the interpreter needs to import it too. Here one gives up, I guess, the ability to debug such compiled code in favor of speed.

In a gdb-level debugger, dynamic imports become more important to be able to run eval code.

John Nagle

Jun 19, 2013, 3:01:55 PM
to golan...@googlegroups.com
On 6/19/2013 5:29 AM, Rocky Bernstein wrote:
> As a hobby I've been considering writing a debugger for go. Since talk is
> cheap, I'll lay out here some thoughts behind writing a debugger for Go.

> The first thing is adding an "eval()" function. This for example allows the
> debugger to simply see or set values of variables and expressions.

Debuggers which allow the user to alter the program at run time
are rarely necessary. The machinery needed to support them clogs up
the language.

John Nagle

Jan Mercl

Jun 19, 2013, 3:36:10 PM
to John Nagle, golang-nuts

My pleasure to agree.

-j

Nick Owens

Jun 19, 2013, 7:06:52 PM
to golan...@googlegroups.com
This is one of the reasons I love acid [1]. It doesn't hurt the language
really, and there is already support in the go toolchain. (see -a flag
of 8c).

However, acid from plan9port on linux doesn't really work in my
experience (some kind of ptrace problem), and using acid to debug go on
Plan 9 is a hassle because acid knows nothing about goroutines, or Ms or
Gs or Ps. Has anyone bothered writing acid definitions for the purpose
of debugging go?

Using gdb is a hassle too.. but that's probably PEBKAC. gdb doesn't seem
to recognize the 'end' of a go function, so things like 'finish' just
run right off the end of the current function. The only way I've figured
out how to step single lines is by manually setting a breakpoint on a
file:line.

A new debugger would be nice, but the existing ones could certainly be
fixed.

mischief

[1] http://plan9.bell-labs.com/sys/doc/acid.html

rocky

Jun 21, 2013, 12:18:25 PM
to golan...@googlegroups.com
Thanks for the pointer to acid. It is something to look at and investigate more.


> This is one of the reasons I love acid [1]. It doesn't hurt the language
> really,

I think there has been some fuzzy thinking in this thread. Let me start with the notion that acid "doesn't hurt the language really".

One doesn't have to change the Go language at all to implement eval(). No one seems to cringe at writing:

    s := fmt.Sprintf("%s", v)

So suppose one instead writes something like:

    v1 := 1
    v2 := 2
    var s int
    s = fmt.Eval("%d + %d", v1, v2)


and one gets back 3. Let's say fmt.Eval() returns an interface{} to make the types work out.

Ok.

Going further, one could imagine passing in a mapping of variable names to their current values, so instead of writing "%d" one might write "$var1" or "#{var1}" or something like that. Going further still, maybe one would like to get this mapping easily. Well, it could come from reflection on local and global variables.

How to get variable names? As mentioned in a previous post there is go/parser. How about the variable values though? In a debugger don't you want to be able to see the value of variables? So there probably needs to be a way to get that.
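
As a rough illustration of eval with an explicit name-to-value mapping, here is a small sketch that leans on go/parser for the expression syntax. It is not the fmt.Eval imagined above: the Eval function and its env parameter are hypothetical, it handles only integer literals, identifiers, parentheses and + - * /, and it returns an int rather than interface{} to keep things short.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// Eval parses expr as a Go expression and evaluates it against env,
// a map from variable name to current value. (Hypothetical helper.)
func Eval(expr string, env map[string]int) (int, error) {
	node, err := parser.ParseExpr(expr)
	if err != nil {
		return 0, err
	}
	return eval(node, env)
}

// eval walks the expression tree recursively.
func eval(n ast.Expr, env map[string]int) (int, error) {
	switch e := n.(type) {
	case *ast.BasicLit:
		if e.Kind != token.INT {
			return 0, fmt.Errorf("unsupported literal %s", e.Value)
		}
		v, err := strconv.ParseInt(e.Value, 0, 64)
		return int(v), err
	case *ast.Ident:
		v, ok := env[e.Name]
		if !ok {
			return 0, fmt.Errorf("undefined: %s", e.Name)
		}
		return v, nil
	case *ast.ParenExpr:
		return eval(e.X, env)
	case *ast.BinaryExpr:
		x, err := eval(e.X, env)
		if err != nil {
			return 0, err
		}
		y, err := eval(e.Y, env)
		if err != nil {
			return 0, err
		}
		switch e.Op {
		case token.ADD:
			return x + y, nil
		case token.SUB:
			return x - y, nil
		case token.MUL:
			return x * y, nil
		case token.QUO:
			if y == 0 {
				return 0, fmt.Errorf("division by zero")
			}
			return x / y, nil
		}
		return 0, fmt.Errorf("unsupported operator %s", e.Op)
	}
	return 0, fmt.Errorf("unsupported expression %T", n)
}

func main() {
	env := map[string]int{"v1": 1, "v2": 2}
	v, err := Eval("v1 + v2", env)
	fmt.Println(v, err)
}
```

The hard part for a debugger is not this walk but filling in env, which is exactly the "how to get the variable values" question.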

I'll come back to that a little later, but let me finish off this "hurting the language" view. The place where this eval() was proposed was inside the debugger, specifically to simplify its interface with other debugger commands such as those to show a call stack or show values of variables. That it might be used outside of a debugger is a different issue, and one that I'm cool with letting individual programmers decide for themselves.

In an acid debugger for Go, one might imagine writing an embedded go evaluator written in the acid language. And that way one doesn't have to learn the acid language but rather use Go to express what they want to do.

In fact, the first paragraph of [1] mentions as a weakness of acid the fact that the programmer has to learn acid, yet another language.

So which is worse: "hurting" (or extending) this Go interpreter language -- and I argue that it's more like a built-in function than a language change -- or having to write in the acid language in order to debug a program? For those that prefer writing in the acid language, that doesn't bother me one bit. Likewise, I hope you respect that there may be others who choose differently.

Now let me come back to a comment that I also find a little weird, if not fuzzy thinking:


> Debuggers which allow the user to alter the program at run time are rarely necessary.

If one doesn't alter the program at run time, how do you think breakpoints are set? Or do your debuggers not provide breakpoints? Do they stop execution of the program? If so, that too is altering the program at run time.

Lest one think that acid is sanitized here, it isn't. Acid does let one alter the value of memory locations and registers. The difference is that the mechanism by which one does that is more like writing assembly code than writing Go expressions. With acid, variable storage mapping is the responsibility of the programmer rather than of the debugger. Again, if that's what you want, I'm cool with your doing that or not using a debugger at all. I merely suggest again that others may have another view.

- - -

Now with all of this out of the way, acid does look like a prospect worth pursuing. Again, in my view, ideally there would be two kinds of debuggers: something at a higher level, working from the AST with a really crappy but source-code-faithful interpreter, and something like acid, which minimally changes the code for things like stepping and breakpoints or allowing the user to alter values in memory.

It looks like ogle was pulled from the go source tree. And where is there a version of acid that runs say on Ubuntu that has the support for go 8c reading?


[1] http://www2.informatik.hu-berlin.de/~apolze/LV/plan9.docs/acidpaper.html

rocky

Jun 21, 2013, 7:28:20 PM
to golan...@googlegroups.com
I had said one should think of eval as a "built-in function". On reflection, I think I should have said to think of eval as a function inside an implementation-specific run-time reflection package.

That run-time reflection package might also include routines to get: 

  1. a map of parameter variable/values, 
  2. a map of local variables/values, 
  3. a map of global variables/values,
  4. a map of function signatures (in a way that's easier than using go/parser)
and so on.

For an AST-like interpreter, I think I know how to write the code for such a reflection package. For gc or gccgo that's less clear. Or it may be that gc/gccgo would have such a run-time reflection package that just does not contain as many of these things.
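
For comparison, here is roughly what item 4 looks like when done from source with go/parser, which is the route the proposed package is meant to make easier. This is only a sketch: FuncSignatures is a made-up name, it works on source text rather than on a running program, it skips methods, and it uses types.ExprString from the go/types package in current Go releases to render each signature.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// FuncSignatures parses src and returns a map from each top-level
// function name to its signature rendered as source text.
// Methods (declarations with a receiver) are skipped.
func FuncSignatures(src string) (map[string]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	sigs := make(map[string]string)
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Recv != nil {
			continue
		}
		sigs[fn.Name.Name] = types.ExprString(fn.Type)
	}
	return sigs, nil
}

func main() {
	src := "package p\n\nfunc Add(a, b int) int { return a + b }\n\nfunc Hello() {}\n"
	sigs, err := FuncSignatures(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for name, sig := range sigs {
		fmt.Printf("%s: %s\n", name, sig)
	}
}
```

Maps of parameter, local, and global values (items 1 to 3) can't be had this way at all, since they need the running program's state, which is why they belong in an implementation-specific package.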

rocky

Jun 25, 2013, 4:44:55 PM
to golan...@googlegroups.com
I've started extending the SSA interpreter for high-level tracing and then debugging. I am rather pleased, so far, with how it's going. I think it's possible to do much more than what I've done previously in other environments in terms of reporting positions, the granularity of stepping, and perhaps other things.

I really like the compact token.Pos type [1]. I like the fact that AST nodes have both start and end positions. I think one should go further and have a compact position range that combines the compactness of Pos with the start/end information that AST nodes expose as Pos() and End(). That way, one could combine start and end in the AST and replace them with PosRange() or some such thing.

In Go, as in most other programming languages, one can't begin a syntactic construct in one source file or source container and end it in another. So a range like this needs only one filename field.

For my little project though, I have my own version for position intervals. I've extended the Position String syntax from


     file:line:column    valid position with file name
     line:column         valid position without file name
     ...

To add:

     file:line:column - line:column   # different line and column
     file:line:column - column        # same line, different column
     line:column - column

This is not incompatible with the way Emacs tracks error positions.
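
A minimal sketch of how the extended range syntax above could be produced from two token.Position values. FormatRange and At are hypothetical helpers written for this post, not part of go/token; the sketch assumes both positions are in the same file and that the end does not precede the start.

```go
package main

import (
	"fmt"
	"go/token"
)

// At builds a token.Position; file may be empty.
// (Hypothetical convenience helper.)
func At(file string, line, col int) token.Position {
	return token.Position{Filename: file, Line: line, Column: col}
}

// FormatRange renders start..end in the extended syntax:
//
//	file:line:column - line:column   (different lines)
//	file:line:column - column        (same line, different column)
//
// The file name is printed once because a construct cannot span files.
func FormatRange(start, end token.Position) string {
	s := start.String()
	switch {
	case start.Line == end.Line && start.Column == end.Column:
		return s // empty range: plain position
	case start.Line == end.Line:
		return fmt.Sprintf("%s - %d", s, end.Column)
	default:
		return fmt.Sprintf("%s - %d:%d", s, end.Line, end.Column)
	}
}

func main() {
	fmt.Println(FormatRange(At("x.go", 3, 1), At("x.go", 5, 2)))
	fmt.Println(FormatRange(At("x.go", 3, 1), At("x.go", 3, 9)))
	fmt.Println(FormatRange(At("", 3, 1), At("", 3, 9)))
}
```

Because the prefix is whatever token.Position.String() already prints, tools that only understand file:line:column can still pick out the start of the range.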

I also like that there is a documented AST [2]. In Ruby, the lack of even a semi-official AST and the incompatible changes over the various releases made source-code analysis tools break between versions.

The fact that I'm debugging off of SSA rather than AST is less of an issue than I originally thought. It allows me to piggyback off of the existing interpreter effort and saves me the effort of writing such a thing. It also allows me to gently tackle how to handle more and more optimization in terms of debugging.

But going the other way around, starting with the least common denominator of what can be done using a process that pushes information all the way through and hoping to expand on that, is less satisfying. Looking at the SSA interpreter, there is already this notion of a "canonical" position for an instruction, which feels to me like we've already lost information. Perhaps this could be replaced by such a compact range position.

The implementation of this high-level tracing adds a pseudo Trace instruction. It works, but it is far from ideal. For example, it bloats the code. I'd prefer to have optional information kept in parallel to the instruction stream and attached to only some instructions. Setting breakpoints would work that way too. But given the organization of the ssa builder code, which issues high-level emit() calls that ultimately map their way to instructions, I don't see an easy way to track this that isn't really hacky. Perhaps as I get to know the code better I will.
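
One reading of the "parallel information" idea, sketched with toy types: instead of inserting pseudo instructions, keep a side table keyed by instruction index and consult it before each instruction. Instr, Note and Run are stand-ins invented for this sketch and have nothing to do with the actual types in the ssa package.

```go
package main

import "fmt"

// Instr is a stand-in for an interpreter instruction.
type Instr struct {
	Op string
}

// Note is optional debugging data kept beside the instruction
// stream rather than inside it.
type Note struct {
	Pos        string // source position of the construct
	Breakpoint bool   // stop here when set
}

// Run walks code and consults notes (keyed by instruction index)
// before each instruction. It returns the positions at which a
// breakpoint was hit. Instructions without a note are unaffected.
func Run(code []Instr, notes map[int]*Note) []string {
	var hits []string
	for i := range code {
		if n := notes[i]; n != nil && n.Breakpoint {
			hits = append(hits, n.Pos)
		}
		// A real interpreter would execute code[i] here.
	}
	return hits
}

func main() {
	code := []Instr{{Op: "load"}, {Op: "add"}, {Op: "ret"}}
	notes := map[int]*Note{
		1: {Pos: "a.go:2:1", Breakpoint: true},
		2: {Pos: "a.go:3:1"},
	}
	fmt.Println(Run(code, notes))
}
```

Setting or clearing a breakpoint is then just flipping a field in the table, and the instruction stream stays the same size whether or not tracing is on.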


On Wednesday, June 19, 2013 8:29:45 AM UTC-4, rocky wrote: