Taming Macros
This overall goal of this paper is to provide a new way of debugging macros in Scheme. Though macros are very powerful in letting users add domain specific language features to their programs, they are extremely difficult to de-bug in a program. Macros don't automatically perform type checking on their inputs, nor check the number of inputs. In the case where a string is passed into a macro instead of an integer, or when three inputs are given when only two are accepted, the program will execute unexpectedly. Sometimes an error message will appear, and other times the user will simply get the wrong output. Even when the error messages do appear, they usually don't occur until later on when the output of the macro is used, and they usually show little indication of where the error originated from. As everyone from skilled programmers to novices make mistakes, this is a problem for the Scheme language. This paper attempts to implement type checking in such a way that the effects on the expressiveness of macros are affected minimally at best.
The writer then goes on to discuss adding in assertions to macros. He uses the term “eager language” describe Scheme. I know this has been mentioned before in class, but I don't recall what it means. Regarding the exceptions, he gives an example of a divide by zero assertion, where (/ 1 0) would return true, otherwise false. These assertions/exceptions cannot be defined as functions, but rather must be defined as macros. The example they demonstrate uses the syntax-rules operation. After their example they go on to show how macros are so powerful that they can create un-parse-able syntax. This can occur in two ways: in the macro use or in the macro definition. The first relates to user inputs the macro was not expecting. This could come in the form of too many inputs, inputs that are the wrong type, or inputs that act as patterns. Some might think that this is just like when functions are mis-used, but there is a big difference; functions have advanced debugging features that allow the user to quickly locate and fix the bug. Macros on the other hand have no such debugging support. If an error like that occurs in a macro, the error is only reported once something uses the invalid result. At that point the error often states that there is invalid syntax, but gives no reference to the actual source of the problem. Certain debugging tools such as DrScheme might give a hint to the location of the source of the bug, but that is it.
The model of certain features of Scheme macros are then discussed. It is interesting that they mention that hygiene and referential transparency are not relevant for their work. The syntax of the macros is briefly discussed. The surface language macros are expanded to become features of the core language. The don't allow local macros, hey have made identifiers a different lexical class than the macro keywords, their lambda term has been modified to contain a single formal parameter, and they don't support literals or ellipses (what are ellipses?) in their macro patterns. Though they have made these modifications they state that their model may be scaled to the full Scheme language. Next the author goes on to the formal semantics of the modified version of Scheme languages. Their new language features allow macro definitions to be collected into a macro environment. This environment holds the type annotations and all other information surrounding the macro. All of the top level terms are expanded into the macro environment using what they call the reduction relation. Additionally they use an app tag to represent procedure applications. These applications are found in parenthesized terms in expansion positions that are not special form applications. Additionally they cannot expand into keywords. In this new model, auxiliary functions are used to describe single-step macro expansion. These auxiliary functions are called match and transcribe. The match function maps pattern variables to terms. The transcribe function produces a replacement term for the matched pattern. The author then goes on to discuss stuck terms. Stuck terms are the terms which cause unexpected behavior from macros. This is because they are not in the core language, but rather come from three different groups. The first group consists terms used in macro templates or applications of the right form, which include: pattern variables, macro keywords, and primitive keywords. The second group includes all macro applications that don't match a given clause. The example they give is (lambda x x), which only recognizes a single parameter. The third group consists of expanded terms that are not part of the core language.
The shape (shape types) of the program is then discussed. They state that their language consists of two syntactic categories into which macro applications can expand; expressions and definitions. The shape types are used to describe the type system. They then go through the syntax of the shape types. In order to ensure that the shape types work properly they put restrictive guards on the inputs to macros. This prevents the macros from ever becoming stuck. Their type checking occurs before the macro expansion, allowing the shape types of special forms to control recursive type checking of the arguments to those special forms. It is important to note that information needed to type check the macros without expansion is found in the macros shape. In order to type check the entire program, the macro type environment is constructed and the macro templates and subsequent top-level terms are type checked. The semantics for this type checking are then stated, along with a few rules for their proper execution.
The soundness of their model is then proven using the reduction technique. They do this using the preservation and progress theorems. The preservation theorem proves that an expanded program has the same type as the original program. The progress theorem proves that typed terms consist of either terms from the core language, or expandable terms. They then step through the separate proofs of the theorems. After the proofs, they discuss the issues with converting their model to standard Scheme.
The work put forth by this paper is a model for a restricted version of Scheme that allows for type checking on macros. This provides valuable debugging features to macro languages, though it does restrict their possible inputs. Most of the paper was clear up until the author started talking about the shape type system, where it really dove into the semantics. I assume this is where we will spend a majority of our time in class. I thought it was interesting to see the trade-offs he was willing to make to implement his model, so it will be nice to discuss it all tomorrow.