Fortifying Macros
The overall idea of this paper revolves around an improvement to MBE macros. As macros have become more and more complex over the years, the errors they produce have become worse and worse. The errors are reported not at the input to the macro, but rather later, when one of the operations in its expansion fails. We talked briefly about the let example introduced shortly after, which is used as the basis for all of the errors throughout the rest of the paper. By varying the ways the let macro is misused, many different errors are introduced, most of which produce misleading error messages.
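To make the running example concrete, here is a minimal sketch (my own reconstruction, not the paper's exact code) of a Macro-By-Example definition of let and a misuse that slips past it:

    #lang racket
    ;; A pattern/template (MBE) definition of let, roughly as in the paper.
    (define-syntax my-let
      (syntax-rules ()
        [(_ ([var rhs] ...) body)
         ((lambda (var ...) body) rhs ...)]))

    (my-let ([x 1] [y 2]) (+ x y))   ; => 3

    ;; Misuse: a non-identifier in binding position still matches the
    ;; pattern, so the complaint comes later, from lambda, not from my-let:
    ;; (my-let ([3 1]) 3)            ; error reported by lambda, not my-let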
One of the main motivations for this paper was that macro writers are lazy. They are eager to move on to the next problem as soon as “it works.” Because of this, they will continue to write sloppy, unsafe code unless they are given easy-to-use tools that help them create robust macros. One of those tools was Macro-By-Example (MBE). MBE uses a pattern-based form (syntax-rules) to try to ensure robustness. It replaces procedural code with a sequence of clauses, each consisting of a pattern and a template. The pattern describes the shape of the macro's input, and the template describes its expansion. One of the main innovations of MBE was the use of ellipses. Ellipses give patterns additional expressiveness, allowing a pattern to accept a wider range of inputs without having to make recursive calls. We also talked about the ...+ pattern in class, which says that there will be one or more arguments rather than zero or more. Additionally we talked about ellipsis depth, which is how many ellipses deep a certain piece of syntax sits. The example given was that in code such as ((lambda (var ... ...) body) rhs ...), rhs would have a depth of one, body would have a depth of one, and var would have a depth of 2. Why was it that body would have a depth of 1 when there are no ellipses next to it? If it is because of the ellipsis to the right of rhs, then why doesn't var have an ellipsis depth of 3?
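As a hedged illustration of ellipsis depth (my own example, not the exact one from class), consider a let-values-style macro whose bindings each name a group of variables:

    #lang racket
    (define-syntax my-let-values
      (syntax-rules ()
        ;; In this pattern, var sits under two ellipses (depth 2),
        ;; rhs under one (depth 1), and body under none (depth 0).
        [(_ ([(var ...) rhs] ...) body)
         (let-values ([(var ...) rhs] ...) body)]))

    (my-let-values ([(q r) (quotient/remainder 7 2)]) (list q r))  ; => '(3 1)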
Though MBE is powerful, it lacks the ability to handle certain errors. The two let examples at the top of the second column of the second page were discussed; neither error was detectable under MBE. Other solutions can be put in place, but they are tedious and time consuming. The first option is to insert guards into the code. Guards check that certain pieces of the input, such as identifiers, have the correct syntax. The biggest problem with guards is that they generally don't tell you the cause of the error beyond “bad syntax.” The other problem is that they often require duplicating code. The example in class required us to duplicate the pattern section of the code and check whether all of the syntax was correct, failing otherwise. Since this requires duplicating code, it is as if some of the code were running twice, which is inefficient and tedious. Another option the authors mention is explicit error checking: inserting syntax-checking statements for each possible misuse. Since not every possible misuse may be anticipated ahead of time, this is a long, tedious process with no guarantee of completeness, which is why most people don't do it. Two other options they mention are simply ignoring the problem or hand-coding the parser. Hand-coding the parser basically turns the entire macro into its own parser: the macro is passed some sort of syntax object, parses the input itself, decides whether the input is valid, and determines what should be done.
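Here is a minimal sketch of the guard approach, assuming a syntax-case style fender like the one we saw in class (the exact code differed):

    #lang racket
    (require (for-syntax racket/base))

    (define-syntax guarded-let
      (lambda (stx)
        (syntax-case stx ()
          [(_ ([var rhs] ...) body)
           ;; The guard re-traverses the bindings to confirm each var is an
           ;; identifier, duplicating work the pattern already did.
           (andmap identifier? (syntax->list #'(var ...)))
           #'((lambda (var ...) body) rhs ...)])))

    (guarded-let ([x 1]) x)     ; => 1
    ;; (guarded-let ([3 1]) 3)  ; rejected, but only as "bad syntax"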
Their new system is called syntax-parse, a domain-specific language that supports parsing, validation, and error reporting. They list three new contributions, but it was said in the discussion that it was more like two: an expressive language of syntax patterns with their corresponding operations, and a matching algorithm that tracks the progress of a match in order to rank and report failures, where each failure carries error information. We talked a little about defining syntax classes, and then moved on to error reporting.
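A hedged sketch of the syntax-parse version of let, close in spirit to the paper's example (the syntax class and its names are my own simplification):

    #lang racket
    (require (for-syntax racket/base syntax/parse))

    (begin-for-syntax
      ;; A syntax class gives the "binding pair" shape a name and a
      ;; description that can show up in error messages.
      (define-syntax-class binding
        #:description "binding pair"
        (pattern (var:id rhs:expr))))

    (define-syntax my-let
      (syntax-parser
        [(_ (b:binding ...) body:expr ...+)
         #'((lambda (b.var ...) body ...) b.rhs ...)]))

    (my-let ([x 1] [y 2]) (+ x y))   ; => 3
    ;; (my-let ([3 1]) 3)            ; error: expected identifier (roughly)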
The error reporting system works in two steps. First, the matching algorithm chooses what it considers the appropriate error to report. The error chosen comes from the pattern that matched the input best, i.e. the one that made the most progress before failing. We had a little discussion about this, because the user might be trying to do one thing while their syntax is closer to another of the input options; in that case the wrong error message would be reported. For this reason the class thought that all patterns that match at all should be printed, and it should be up to the user to determine which one applies. The second step pinpoints the place within that pattern where the match failed, which amounts to reporting what part of the syntax it failed on. We then went through the syntax patterns listed in the paper, along with many more that have since been added to the Racket source code. We went over action patterns like ~parse and ~fail, and reviewed head patterns such as ~seq. We concluded our discussion of this paper with the formal model, looking at some of the semantics more seriously.
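As one hedged illustration of a head pattern (my own toy example, not one from the paper): ~seq matches a run of consecutive terms, such as a multi-term keyword option, without requiring them to be wrapped in parentheses, and ~optional makes the whole run omissible.

    #lang racket
    (require (for-syntax racket/base syntax/parse))

    (define-syntax def-const
      (syntax-parser
        ;; (~seq #:doc doc:str) matches two consecutive terms; ~optional
        ;; lets the whole option be left out.
        [(_ name:id (~optional (~seq #:doc doc:str)) rhs:expr)
         #'(define name rhs)]))

    (def-const answer #:doc "the answer" 42)
    (def-const pi-ish 3.14)
    answer    ; => 42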
Debugging Hygienic Macros
This paper was summarized by Jay at the end of class due to time limitations. The whole idea of this paper is that the authors added extra code to the macro expander that allows the user to see expansions as they take place. The graphical interface was amazing to users when it was released; it was said that the first time people saw it demonstrated at conferences, they actually clapped. We concluded this paper by talking about how clever the authors' proof was.
Summary
We began our discussion with the Fortifying Macros paper. We breezed through the beginning and really began talking at the “Expressing Macros” section. Here the paper began highlighting some of the many problems that can occur when macros are misused - or rather, highlighting what should happen and what actually does happen when they are misused. We reviewed each of the ways the example they gave, the let example, could be misused and how each case was handled. I had not fully realized that these were not just examples of how a macro implementation might misbehave, but examples of how the current definitions of let in the various Lisp/Scheme implementations actually did misbehave. We went over a little of why this was the case and briefly looked again at some of the reasons why Macro-By-Example was, and still is, the primary method for macro definitions. During all of this it was also noted that this paper had succeeded in finding an extremely simple and expressive example of the problems with macros, in that it was easy to understand and could itself demonstrate the various problems macros face - this is a great type of example to look for in general when writing a paper.
We then started talking about some of the ways these problems with macros were “handled” at the time. Guards could be used to check for erroneous arguments, but they produced very unhelpful messages (“bad syntax”). We then looked at another method, where the macro can be littered with error-checking code. This method can produce better errors, but it also quickly creates extremely messy macro definitions that are less intuitive and duplicate code. Interestingly, this error-handling code can still be found in some more modern code - we looked at some of the macro definitions for the Racket teaching languages. There are extremely detailed error-handling cases in place there to try to pinpoint what kind of mistake was made and give more useful information to the student trying to use the macro. Some of these definitions seemed to be an order of magnitude larger than they needed to be (perhaps not that much, but it was significant).
We moved on to the even more complicated define-struct macro, an even better example of why something more powerful than the existing macro-definition tools was needed. I then asked a question about command line parsing. For some reason it sounded like a strange idea (parsing... from the command line???), and then it was explained that it just meant parsing the command line input/options/etc... doh. I felt stupid.
We talked more in depth about the details of syntax-parse, the proposed solution to the limitations of current macro definitions. Eventually we got into the details of how the more powerful parsing method using head patterns, needed for the more interesting macros like define-struct, was defined and how it worked to create more interesting and more powerful specifications/descriptions.
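To make the head-pattern idea a bit more concrete, here is a hedged sketch (the names and the simplifications are mine, not the paper's define-struct) of a splicing syntax class, which packages a head pattern behind a reusable name:

    #lang racket
    (require (for-syntax racket/base syntax/parse))

    (begin-for-syntax
      ;; A head pattern behind a name: each field is an identifier
      ;; optionally followed by a #:mutable keyword.
      (define-splicing-syntax-class field-spec
        #:description "field specification"
        (pattern (~seq name:id (~optional #:mutable)))))

    (define-syntax def-record
      (syntax-parser
        ;; This simplified template ignores the per-field #:mutable flag.
        [(_ name:id (f:field-spec ...))
         #'(struct name (f.name ...) #:transparent)]))

    (def-record point (x #:mutable y))
    (point 1 2)   ; a transparent point instance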
We had an interesting discussion about the “moral implications” of the error tracking discussed in this paper - namely, whether it is actually a good idea to have your parsing code predict which branch of a macro the user was trying to use, or whether a more general error that makes no assumptions should be used. Very interesting - from a logical perspective I can see why one would favor the latter, but I can't help but wonder if statistically the former would be “more useful.” Actually, that method Jay mentioned for having a color-coded description of how much of each branch was matched was fascinating. That would be super cool.
Anyway, we eventually got to the second paper - the amazing introduction of being able not only to view macro expansion, but to control the level at which it occurs. The main takeaways I got from this were just how ridiculously useful this can be, along with the takeaway from our discussion about the idea of adding a piece to a proven system/theorem, proving it is still sound, and then using that added piece to attack some other target. I'll have to keep this idea in mind; perhaps I'll find a useful instance in Coq where this concept obliterates some problem.
lingering questions
- Why again was the macro code for the learning dialect of Racket so heavily laden with error-checking code? Was the syntax class and syntax-parse machinery not available when it was written? Or does it not provide as much granularity as they were shooting for with that beginner audience?