Debugging passes: Returned language expression not contained in pass result

panos.st...@gmail.com

Apr 18, 2019, 5:29:45 PM
to nanopass-framework
Hi all!

The nanopass framework is conceptually and technically a great achievement, many thanks for making it available to the public!

I do however currently struggle to debug a quite simple pass in the context of the Chez Scheme compiler. The pass looks similar to this:
(define-pass instrument-Lsrc-pass : Lsrc (e instrument-function) -> Lsrc ()
  (Expr : Expr (e) -> Expr ()
    ((case-lambda ,preinfo ,(cl))
     (instrument-function e preinfo name (preinfo-src preinfo)))))

In the instrument-function I return a freshly constructed call Expr (e.g. (with-output-lang (Lsrc Expr) `(call ....))).

I expect that any case-lambda within the supplied expression tree will be replaced by the return value of instrument function. This does however not seem to be the case.
Inspecting the generated code by means of echo-define-pass didn't bring enlightenment so far as it is quite convoluted with many magic numbers (tags).

I would highly appreciate any tips on how to approach the problem.

Thanks,
Panos Stergiotis

Andy Keep

Apr 21, 2019, 11:12:54 AM
to nanopass-framework
Thanks for the nice words on the nanopass framework, sorry to hear you're running into trouble.

I'm not sure whether you're running into this problem when the pass is compiled or at run time.  Since it sounds like echo-define-pass gave you some output, the pass itself must have expanded, though this does not rule out an issue with one of the output patterns if it didn't match the output language.

Assuming this is a bug happening at run time, though, there are a few tools I might try when debugging something like this.

First, I'd try using the trace so that at run-time I could see what the input and output for each pass is.  You can do this by using trace-define-pass, which traces the input and output for the full pass, or by adding trace before the name of your processor.  For instance the Expr processor would be:

(trace Expr : Expr (e) -> Expr () ---)

The trace will attempt to unparse the output of the pass or the processor, so if it doesn't match the output language, you'll see an exception like:

Exception in unparse-Lsrc2: unrecognized language record with irritant #[#{Lsrc:lambda:Expr.3 hkn51ufr3zxv6q0nsurarrjpf-29} 2 x x]

Note that unparse-Lsrc2 is complaining that it didn't recognize the output record, and we can see from the name of the output record that this is an Lsrc:lambda:Expr, which means it is an Lsrc form instead of an Lsrc2 form.  If the output can be unparsed, you'll see the S-expression output instead.
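As a rough illustration of tracing a whole pass, here is a minimal sketch. Ltoy, parse-Ltoy, and double-constants are hypothetical names invented for this example (this is not the Chez Scheme Lsrc), and it assumes the nanopass library is on the Chez Scheme library path.

```scheme
(import (chezscheme) (nanopass))

;; Hypothetical toy language, not the Chez Scheme Lsrc.
(define-language Ltoy
  (terminals (number (n)))
  (Expr (e)
    n
    (add e0 e1)))

(define-parser parse-Ltoy Ltoy)

;; trace-define-pass takes the same form as define-pass and traces the
;; input and output of the full pass each time it runs.
;; The clause for (add e0 e1) is left for define-pass to auto-generate.
(trace-define-pass double-constants : Ltoy (e) -> Ltoy ()
  (Expr : Expr (e) -> Expr ()
    [,n (* 2 n)]))

(define result (double-constants (parse-Ltoy '(add 1 (add 2 3)))))
(unparse-Ltoy result)
```

Swapping trace-define-pass back to define-pass once the pass behaves as expected removes the trace output without any other change.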

Another option is to add a body in the pass itself, to bind the result of calling the expression:

(define-pass instrument-Lsrc-pass : Lsrc (e instrument-function) -> Lsrc ()
  (Expr : Expr (e) -> Expr ()
    ((case-lambda ,preinfo ,(cl))
     (instrument-function e preinfo name (preinfo-src preinfo))))
  (let ([e (Expr e)])
    ;; check that e is what you expect here
    e))

The define-language form creates predicates that also allow for some explicit checking of language form and nonterminal types.  So, if you have a language named Lsrc with an Expr nonterminal, define-language will produce Lsrc? and Lsrc-Expr? predicates.
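A small sketch of using these predicates to tell a real language record from a look-alike list. Ltoy, built, and look-alike are hypothetical names for this example, and it assumes the nanopass library is on the Chez Scheme library path.

```scheme
(import (chezscheme) (nanopass))

;; Hypothetical toy language standing in for Lsrc.
(define-language Ltoy
  (terminals (symbol (x)))
  (Expr (e)
    x
    (call e0 e1 ...)))

;; Built inside with-output-language, so this is a language record.
(define built
  (let ([f 'f] [arg* '(a b)])
    (with-output-language (Ltoy Expr)
      `(call ,f ,arg* ...))))

;; Has the same shape as the unparsed form above, but is only a list.
(define look-alike '(call f a b))

(Ltoy-Expr? built)       ; a record of the Expr nonterminal
(Ltoy-Expr? look-alike)  ; a plain list, so not an Expr
```

Dropping a check like (Ltoy-Expr? x) into a pass body or into a helper such as instrument-function is a cheap way to find out where a non-language value first appears.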

A couple of things to be aware of:

When the define-pass form auto-generates the body, it will expect that the output form will be of the entry nonterminal type of the language.  This is either the nonterminal named explicitly with the entry form or is the first nonterminal listed in the language.  You can override this by specifying the nonterminal in the pass type:

(define-pass instrument-Lsrc-pass : (Lsrc Expr) (e instrument-function) -> (Lsrc Expr) () ---)

When writing the pattern match part of a processor or nanopass-case form, the names of the metavariables are used for matching, and in the case of catamorphisms, they are also used to determine the output type of the processor to recur to.  So, these should name metavariables that match the forms in the language.

Unless you are running at optimize-level 3, the constructors for language forms will check their input types, and if they do not match what is expected they will complain.  One of the tricks with this is that an s-expression can leak in: for instance, if one of the arguments to the `(call ....) you are constructing is built outside of the with-output-language form, it might look like the value you expected, but it is actually an s-expression and not a language form.  Language nonterminals are always records.

This can also happen if a part of a language form is not processed into the output language (probably not a problem here, since the input and output language are the same).

Finally, even though we use quasiquote to construct the output terms, we are not quite as flexible as s-expression construction would be.  For instance if you have a nonterminal production like:

(define-language L
  ---
  (Expr (e)
    ---
    (let ([x* e*] ...) e)))

You cannot construct a let form as `(let ,binding* ,e), where binding* is a list of (x e) lists; you need to separate out the x and e: `(let ([,(map car binding*) ,(map cadr binding*)] ...) ,e).
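A minimal sketch of that construction in a complete, if tiny, setting. Ltoy and make-let are hypothetical names for this example, the terminals are chosen only to keep it short, and it assumes the nanopass library is on the Chez Scheme library path.

```scheme
(import (chezscheme) (nanopass))

;; Hypothetical toy language; only the let production mirrors the text.
(define-language Ltoy
  (terminals
    (symbol (x))
    (number (n)))
  (Expr (e)
    x
    n
    (let ([x* e*] ...) e)))

;; binding* is a list of (x e) lists, e.g. ((a 1) (b 2)).
;; The names and the expressions are split apart before the template
;; is filled in, because the template needs one list per metavariable.
(define (make-let binding* body)
  (with-output-language (Ltoy Expr)
    `(let ([,(map car binding*) ,(map cadr binding*)] ...) ,body)))

(define example (make-let '((a 1) (b 2)) 'a))
(unparse-Ltoy example)
```

Here the right-hand sides are bare numbers, which are already valid Expr terminals in this toy language; in a real pass they would need to be output-language records.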

Good luck!  If you can share more of what is going on, I might be able to help more.  It would be good to get the specific error, the language definition, and the pass---or at least a version of code that demonstrates the problem you're seeing.

-andy:)

panos.st...@gmail.com

Apr 24, 2019, 3:52:17 AM
to nanopass-framework
Thank you very much for your quick and exhaustive answer! It indeed helped me fix my code, although my mistakes were caused by various minor misconceptions and not by issues with nanopass itself.

As always: RTM and "With great power ..."!

For later reference:
What I found confusing
1) Cata-morphism:
Returning the input term (expr) directly bypasses the (catamorphic) recursion, because expr still holds the unprocessed subterms:
(Expr : Expr (expr) -> Expr ()
  [(or ,[e*] ...)
   (if (good-day?)
       `(my-or ,e* ...)
       expr)])
does not return the recursively processed subterms in the else branch, while
(Expr : Expr (expr) -> Expr ()
  [(or ,[e*] ...)
   (if (good-day?)
       `(my-or ,e* ...)
       `(or ,e* ...))])
does.

2) Transformers: It was not clear how the different transformers interact with each other as there seem to be certain limitations when one transformer emits non-terminals handled in another transformer (i.e. 'CaseLambdaClause' transformer emitting a 'call' nonterminal handled in 'Expr' transformer). Mutually recursive passes seem to better fit my mental model (as a np novice).
3) Error messages: Transforming the Lsrc within the Chez Scheme compiler to a semantically invalid AST caused errors in the cpnanopass passes. By chance my transformers had the same names as those in cpnanopass. The error messages therefore led to wrong conclusions, until I compiled Chez with debug-level 3 to get better stack traces.
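
Point 1) can be reproduced with a small sketch. Lor, parse-Lor, and both pass names are hypothetical, invented for this example, and it assumes the nanopass library is on the Chez Scheme library path. Both passes bind the same catamorphism; they differ only in what the clause returns.

```scheme
(import (chezscheme) (nanopass))

;; Hypothetical toy language with just numbers and an or form.
(define-language Lor
  (terminals (number (n)))
  (Expr (e)
    n
    (or e* ...)))

(define-parser parse-Lor Lor)

;; Rebuilds the or form from the catamorphism results e*.
(define-pass bump-rebuilding : Lor (e) -> Lor ()
  (Expr : Expr (expr) -> Expr ()
    [,n (+ n 1)]
    [(or ,[e*] ...) `(or ,e* ...)]))

;; Returns the input term, so the processed e* never reach the output.
(define-pass bump-returning-input : Lor (e) -> Lor ()
  (Expr : Expr (expr) -> Expr ()
    [,n (+ n 1)]
    [(or ,[e*] ...) expr]))

(define input (parse-Lor '(or 1 (or 2 3))))
(unparse-Lor (bump-rebuilding input))       ; numbers are incremented
(unparse-Lor (bump-returning-input input))  ; numbers are unchanged
```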