with-continuation-marks in errortrace

Sorawee Porncharoenwase

Jul 26, 2020, 11:22:10 PM
to Racket list

Hi everyone,

I have a question about the implementation of errortrace.

Consider the classic factorial program, except that the base case is buggy:

(define (fact m)
  (let loop ([n m])
    (cond
      [(zero? n) (/ 1 0)]
      [else (* (loop (sub1 n)) n)])))

(fact 5)

Running this program with racket -l errortrace -t fact.rkt gives the following output:

/: division by zero
  errortrace...:
   /Users/sorawee/playground/fact.rkt:9:17: (/ 1 0)
   /Users/sorawee/playground/fact.rkt:10:12: (* (loop (sub1 n)) n)
   /Users/sorawee/playground/fact.rkt:10:12: (* (loop (sub1 n)) n)
   /Users/sorawee/playground/fact.rkt:10:12: (* (loop (sub1 n)) n)
   /Users/sorawee/playground/fact.rkt:10:12: (* (loop (sub1 n)) n)
   /Users/sorawee/playground/fact.rkt:10:12: (* (loop (sub1 n)) n)

I find this result subpar: it doesn’t indicate which call at the top-level leads to the error. You can imagine another implementation of fact that errors iff m = 5. Being able to see that (fact 5) at the top-level causes the error, as opposed to (fact 3), would be very helpful.

Not only that, (* (loop (sub1 n)) n) also looks weird. There’s nothing wrong with multiplication, so I don’t find this information useful.

The tail-recursive factorial is similarly not helpful:

(define (fact m)
  (let loop ([n m] [acc 1])
    (cond
      [(zero? n) (/ 1 0)]
      [else (loop (sub1 n) (* n acc))])))

(fact 5)

produces:

/: division by zero
  errortrace...:
   /Users/sorawee/playground/fact.rkt:9:17: (/ 1 0)

I have been toying with another way to instrument the code. It roughly expands to:

(define-syntax-rule (wrap f)
  (call-with-immediate-continuation-mark
   'errortrace-k
   (λ (k)
     (let ([ff (thunk f)])
       (if k
           (ff)
           (with-continuation-mark 'errortrace-k 'f
             (ff)))))))

(define (handler ex)
  (continuation-mark-set->list (exn-continuation-marks ex) 'errortrace-k))

(define (fact m)
  (wrap (let loop ([n m])
          (wrap (cond
                  [(wrap (zero? n)) (wrap (/ 1 0))]
                  [else (wrap (* (wrap n) (wrap (loop (wrap (sub1 n))))))])))))

(with-handlers ([exn:fail? handler])
  (wrap (fact 5)))

which produces:

'((loop (wrap (sub1 n)))
  (loop (wrap (sub1 n)))
  (loop (wrap (sub1 n)))
  (loop (wrap (sub1 n)))
  (loop (wrap (sub1 n)))
  (fact 5))

This result is more aligned with the traditional stacktrace, and gives useful information that I can use to trace to the error location.

It is also safe-for-space:

(define (fact m)
  (wrap (let loop ([n m] [acc 1])
          (wrap (cond
                  [(wrap (zero? n)) (wrap (/ 1 0))]
                  [else (wrap (loop (wrap (sub1 n)) (wrap (* n acc))))])))))

(with-handlers ([exn:fail? handler])
  (wrap (fact 5)))

produces:

'((fact 5))

Now, the question: why is the current errortrace implemented in that way? Am I missing any downside of this new strategy? Would switching and/or integrating with the new strategy be better?

Thanks,
Sorawee (Oak)

Sorawee Porncharoenwase

Jul 26, 2020, 11:39:50 PM
to Racket list
(By "integrating" with the new strategy, I meant having two keys: one for the new strategy and one for the old strategy. I can see that the first entry of the old strategy is useful, and it's missing in the new strategy).

Shu-Hung You

Jul 27, 2020, 1:14:33 AM
to Sorawee Porncharoenwase, Racket list
By changing (fact 5) to (* 2 (fact 5)), the stack information becomes

/: division by zero
  errortrace...:
   /Volumes/ramdisk/fact.rkt:6:17: (/ 1 0)
   /Volumes/ramdisk/fact.rkt:7:12: (* (loop (sub1 n)) n)
   /Volumes/ramdisk/fact.rkt:7:12: (* (loop (sub1 n)) n)
   /Volumes/ramdisk/fact.rkt:7:12: (* (loop (sub1 n)) n)
   /Volumes/ramdisk/fact.rkt:7:12: (* (loop (sub1 n)) n)
   /Volumes/ramdisk/fact.rkt:7:12: (* (loop (sub1 n)) n)
   /Volumes/ramdisk/fact.rkt:9:0: (* 2 (fact 5))

Here, the difference is that (fact 5) is no longer in tail position. I
believe errortrace aims to preserve proper tail-call behavior.
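
The tail-position effect can be seen in isolation with plain continuation marks. The following is a minimal sketch, not errortrace's own code; the key 'k and the helper name marks-for are made up for illustration. A mark set in tail position replaces the mark already on the frame, while a mark set in a non-tail position lands on a new frame and both survive.

```racket
#lang racket

;; Hypothetical helper: collect every value stored under `key` in the
;; current continuation, innermost frame first.
(define (marks-for key)
  (continuation-mark-set->list (current-continuation-marks) key))

;; Tail position: the inner mark replaces the outer one on the same frame.
(define tail-marks
  (with-continuation-mark 'k 'outer
    (with-continuation-mark 'k 'inner
      (marks-for 'k))))

;; Non-tail position: `list` extends the continuation, so the inner mark
;; goes on a new frame and the outer mark is still visible.
(define non-tail-marks
  (with-continuation-mark 'k 'outer
    (list (with-continuation-mark 'k 'inner
            (marks-for 'k)))))

tail-marks      ; '(inner)
non-tail-marks  ; '((inner outer))
```

This is the same reason (fact 5) disappears from the trace when it is the last thing evaluated, and reappears once it is an argument to *.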
> --
> You received this message because you are subscribed to the Google Groups "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to racket-users...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/racket-users/CADcuegto9%2BDtFTwAVmiReOcCwpARzBSbFhF0knyexb7UhoHQiA%40mail.gmail.com.

Shu-Hung You

Jul 27, 2020, 8:53:44 AM
to Sorawee Porncharoenwase, Racket list
Thinking about your example again, is the idea here to preserve the
first (so perhaps outermost) continuation mark information, instead of
the innermost continuation mark? I don't yet fully understand how this
approach interacts with the evaluation of tail position expressions,
but keeping both seems pretty useful.

Regarding the (* (loop (sub1 n)) n) information, as I understand it,
errortrace does wrap subexpressions. Here is the instrumentation
result of (* (loop (sub1 n)) n):

(with-continuation-mark ek:errortrace-key '((* (loop (sub1 n)) n) SRCLOC)
  (#%app
   *
   (with-continuation-mark ek:errortrace-key '((loop (sub1 n)) SRCLOC)
     (#%app
      loop
      (with-continuation-mark ek:errortrace-key '((sub1 n) SRCLOC)
        (#%app sub1 n))))
   n))

My guess is that the continuation mark value '((loop (sub1 n)) SRCLOC)
is overwritten once the call enters the body of loop, whose tail
expression (* (loop (sub1 n)) n) sets the mark again on the same
frame. This provides error-centric backtrace information.

One effect I can think of from keeping only the outermost
continuation mark is that the control-flow information w.r.t. tail
expressions will be lost. In the following (contrived) example, there
will be no chance to identify which (/ y 0) caused the error.

Matthew Flatt

Jul 27, 2020, 9:27:02 AM
to Sorawee Porncharoenwase, Racket list
At Sun, 26 Jul 2020 20:21:56 -0700, Sorawee Porncharoenwase wrote:
> I have been toying with another way to instrument the code. It roughly
> expands to:
>
> (define-syntax-rule (wrap f)
>   (call-with-immediate-continuation-mark
>    'errortrace-k
>    (λ (k)
>      (let ([ff (thunk f)])
>        (if k
>            (ff)
>            (with-continuation-mark 'errortrace-k 'f
>              (ff)))))))

This variant probably generates faster code:

(define-syntax-rule (wrap f)
  (call-with-immediate-continuation-mark
   'errortrace-k
   (λ (k)
     (with-continuation-mark 'errortrace-k (or k 'f)
       f))))


> Now, the question: why is the current errortrace implemented in that way?
> Am I missing any downside of this new strategy? Would switching and/or
> integrating with the new strategy be better?

I don't recall there was any careful study of the alternatives. Always
setting the mark is easiest, and so that's probably why the current
implementation always sets the mark. Maybe keeping the first expression
for a frame instead of the last is consistently more useful.

At Sun, 26 Jul 2020 20:39:35 -0700, Sorawee Porncharoenwase wrote:
> (By "integrating" with the new strategy, I meant having two keys: one for
> the new strategy and one for the old strategy. I can see that the first
> entry of the old strategy is useful, and it's missing in the new strategy).

Instead of a separate mark, `or` above could be replaced by some
combinator that keeps more information in the mark value, such as a
first and last call using a pair:

(define-syntax-rule (wrap f)
  (call-with-immediate-continuation-mark
   'errortrace-k
   (λ (k)
     (with-continuation-mark 'errortrace-k (let ([here 'f])
                                             (cons (if k (car k) here)
                                                   here))
       f))))
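
To see what the pair-valued mark records, the macro above can be run against the non-tail-recursive factorial from the first message. This is only a sketch: the handler and the marks binding are scaffolding added here, and the quoted expressions are plain data rather than errortrace's source-location records.

```racket
#lang racket

;; Pair-valued marks: (first-expression . latest-expression) per frame.
(define-syntax-rule (wrap f)
  (call-with-immediate-continuation-mark
   'errortrace-k
   (λ (k)
     (with-continuation-mark 'errortrace-k (let ([here 'f])
                                             (cons (if k (car k) here)
                                                   here))
       f))))

(define (handler ex)
  (continuation-mark-set->list (exn-continuation-marks ex) 'errortrace-k))

(define (fact m)
  (wrap (let loop ([n m])
          (wrap (cond
                  [(wrap (zero? n)) (wrap (/ 1 0))]
                  [else (wrap (* (wrap n) (wrap (loop (wrap (sub1 n))))))])))))

(define marks
  (with-handlers ([exn:fail? handler])
    (wrap (fact 5))))

(map car marks)      ; first expression of each frame: the calls to loop, then (fact 5)
(cdr (first marks))  ; latest expression of the innermost frame: '(/ 1 0)
```

The car of each pair gives the call chain of the new strategy, and the cdr of the innermost pair gives the failing expression that the current strategy reports first.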

Something other than a pair could keep the first plus up to 5 most
recent calls. But then you'd probably want the errortrace annotator to
be a little smarter and not uselessly report syntactically enclosing
expressions, like a sequence of `begin`s in something like

(begin
  ....
  (begin
    ....
    (begin
      ....
      ....)))

Overall, your simple change seems clearly worth trying out in
errortrace, and maybe other variants would be interesting to explore.


Matthew

John Clements

Jul 27, 2020, 10:36:13 AM
to Matthew Flatt, Sorawee Porncharoenwase, Racket list
Let me jump in here and say a few things that maybe everyone already knows :).

The stepper’s annotation places a *ton* of annotation on a computation, and allows the reconstruction of the full computation. Errortrace does less, and provides less.

They both share a goal of allowing the programmer to see “where you are in the computation”, by capturing “what remains to be done in the computation”, and neither one tries to capture “how we got here”.

Specifically, in the (* (loop (sub1 n)) n) expression, a mark on the application of loop is currently overwritten immediately by the mark on the body of the called function. In this case, the still-present mark on the multiplication is telling you “this multiplication still remains to be done”, but in fact you’ve lost important information on *which* of the subterms of the multiplication is currently being evaluated. In the stepper, this is essentially converted to A-normal form so that the mark isn’t just “the multiplication isn’t done yet” but rather “the second argument hasn’t yet been evaluated.”

My understanding of your proposal is that it suggests preserving the first mark that is associated with a continuation, rather than (or perhaps in addition to) the last.

I do think that this could be helpful in some situations.

I also think that the fundamental problem that you point to in the beginning isn’t one of what information to store, though, but rather how it’s presented. Specifically, the presence of the mark around the multiplication captures the following information: “we’re currently in a call to loop, and we’re evaluating either the first or second argument of this multiplication”. I’m not sure what a good way to present this information would be:

> in call to ‘loop’, at either (* [] n) or (* (loop (sub1 n)) []).

yeah, that was terrible. Ugh. Surely someone can do better than that.

Anyhow, my point is that for the example that I believe you’re examining, the information you want can be extracted from the information that’s present.

I would like to acknowledge though, that that’s definitely *not* always the case; if you have a bunch of calls that only make tail calls, then you can easily lose information about how you got here. I understand that from a programmer’s standpoint, the question “how did we get here” may be more relevant than “where are we now,” and I do think that it might make sense to take the same approach that (IIRC) Andrew Tolmach & others took in the ML debugger of saving a rolling sequence of up to five marks (or six or ten or whatever you choose).

Put differently, I think that there is potentially a “lost information” situation here, but I think that if you want to preserve that information, you should do so by using a rolling buffer, rather than by supplementing “last expression” with “first expression”.
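
A rolling buffer along these lines could be sketched on top of the wrap macro from earlier in the thread. This is an illustration under assumptions, not a proposal for errortrace's actual implementation: the cap max-recent and the helper remember are hypothetical names, and the marks hold quoted expressions rather than source locations.

```racket
#lang racket

(define max-recent 5) ; illustrative cap: "five or six or ten or whatever"

;; Hypothetical helper: prepend `here` to a frame's history and drop the
;; oldest entries once the cap is exceeded. Most recent entry comes first.
(define (remember history here)
  (define extended (cons here history))
  (if (> (length extended) max-recent)
      (take extended max-recent)
      extended))

;; Each frame's mark is a bounded list of the expressions it has evaluated
;; in tail position, instead of only the first or only the last.
(define-syntax-rule (wrap e)
  (call-with-immediate-continuation-mark
   'errortrace-k
   (λ (k)
     (with-continuation-mark 'errortrace-k (remember (or k '()) 'e)
       e))))

(define (handler ex)
  (continuation-mark-set->list (exn-continuation-marks ex) 'errortrace-k))

;; The tail-recursive factorial from the first message.
(define (fact m)
  (wrap (let loop ([n m] [acc 1])
          (wrap (cond
                  [(wrap (zero? n)) (wrap (/ 1 0))]
                  [else (wrap (loop (wrap (sub1 n)) (wrap (* n acc))))])))))

(define marks
  (with-handlers ([exn:fail? handler])
    (wrap (fact 5))))

(length marks)         ; one frame, since every call is a tail call
(first (first marks))  ; most recent entry in that frame: '(/ 1 0)
```

Because each frame stores at most max-recent entries, a loop made of tail calls still runs in constant space, at the cost of forgetting the oldest steps, including the original (fact 5).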

More broadly, I also think that errortrace’s display of information is probably not the best; I think it’s been a long time since anyone was thinking hard about how best to convey the errortrace information to the programmer. Apologies if I’m wrong about that.

AND, as always, apologies if I’m misunderstanding something important about your proposal!

Best,

John


