Writing an interpreter with SableCC - avoiding switching on type

242 views

interpretersablecc

Skip to first unread message

richard...@cantab.net

unread,

Sep 9, 2013, 9:35:06 AM9/9/13

to sab...@googlegroups.com

Hi,

Is there a canonical example for writing an interpreter (as opposed to a compiler) with SableCC?

I have been writing an interpreter for a simple language, and I have been using SableCC 3.7. I found SableCC very nice to work with, with two minor exceptions

1. Some of the error messages are not as helpful as they could be.

Can I submit a patch to make these more helpful? I notice that the git repos at https://github.com/egagnon/sablecc is for v4 and doesn't appear to have a 3.x branch. Where should I send patches for 3.x bug fixes? Or is 3.x now at the end of its life?

2. The design choice to disallow adding code to production objects leads to ugly "switch on type" statements

I understand some of the reasons that SableCC does not allow you to add code to the generated classes, but I find it makes my interpreter code ugly.

I wonder if I'm missing a trick, and there is a better way to achieve this?

Let me show you what I've written so far:

The grammar is a fairly standard expression tree with multiple productions to support operator precedence. Here's a small snippet:

        expression =
                  {single} [single]:multdiv |
                  {plus} [left]:expression plus [right]:multdiv |
                  {minus} [left]:expression minus [right]:multdiv;
        multdiv =
                  {single} [single]:prefixop |
                  {mult} [left]:multdiv mult [right]:prefixop |
                  {div} [left]:multdiv div [right]:prefixop;

In the interpreter, I need some code to evaluate an expression production.

I can't add code to the PExpression class (as it's final), and I don't want to use the DepthFirstAdapter since all the visitors are "void" typed, so I would have to use an untyped stack to hold the intermediate values, which I don't like (it seems like spooky action at a distance to me). Perhaps there is a nice way to make the DepthFirstAdapter work for this case. Does anyone have a worked example I can have a look at?

This is what I have written:

public class Evaluator
{
   ...

   public BigDecimal evaluate(PExpression expression)
   {
      if (expression instanceof ASingleExpression) {
         return evaluate(((ASingleExpression) expression).getSingle());
      }
      if (expression instanceof APlusExpression) {
         return evaluate((APlusExpression) expression);
      }
      if (expression instanceof AMinusExpression) {
         return evaluate((AMinusExpression) expression);
      }
      throw new CompileException(String.format(
            "Unexpected type: %s at %s",
            expression.getClass(),
            Printer.nodeToString(expression)));
   }

   public BigDecimal evaluate(APlusExpression expression)
   {
      BigDecimal left = evaluate(expression.getLeft());
      BigDecimal right = evaluate(expression.getRight());
      return left.add(right, STORED_PRECISION);
   }

   ...
}

This works well, and it is typesafe, but it has several problems:

It's not object-oriented, in that the code associated with a "APlusExpression" is not in the "APlusExpression" class.
I need to have a lot of "if" chains to detect the specific type of a production. If I could add code to the production classes, I could use dynamic dispatch to handle this much more cleanly.
Every time I rework the grammar, I have to make corresponding changes to all my "if chains" in the Evaluator class.

Is there a better way to write this code?

I have some ideas about how SableCC could be extended to support adding code to the Node classes. Is that option completely off the table?

What if there was an option to have the Node classes generated by SableCC marked as "abstract", and you could write your own concrete subclass of the generated node class? Then there would need to be a way to make SableCC aware of which concrete type to use in the Parser and voila - you'd have clean auto-generated files plus the option to add user-written action code to the Nodes.

Thanks,

Rich

Etienne Gagnon

unread,

Sep 12, 2013, 3:25:22 PM9/12/13

to sab...@googlegroups.com

Hi Richard,

You might be interested to look at the Mino interpreter: https://github.com/egagnon/mino . It is an interpreter for a minimal object language.

I devote my development efforts to SableCC 4. Its error messages should be much clearer.

The clean approach to add functionalities to generated classes is the use class refinement (like in the Nit language, http://nitlanguage.org) or aspect programming (like in AspectJ, http://www.sable.mcgill.ca/abc).

In plain Java/SableCC, the clean solution is the use of tree walkers (DepthFirstAdapter, Switchable) along with HashMaps for storing node-specific data.

As for evaluating expressions, please look at Mino ( InterpreterEngine.getExpEval() ). It has a very elegant and simple solution.

Have fun!

Etienne

Etienne Gagnon, Ph.D.
http://sablecc.org

On 2013-09-09 09:35, richard...@cantab.net wrote:

Is there a canonical example for writing an interpreter (as opposed to a compiler) with SableCC?

I have been writing an interpreter for a simple language, and I have been using SableCC 3.7. I found SableCC very nice to work with, with two minor exceptions

1. Some of the error messages are not as helpful as they could be.

Can I submit a patch to make these more helpful? I notice that the git repos at https://github.com/egagnon/sablecc is for v4 and doesn't appear to have a 3.x branch. Where should I send patches for 3.x bug fixes? Or is 3.x now at the end of its life?

2. The design choice to disallow adding code to production objects leads to ugly "switch on type" statements

I understand some of the reasons that SableCC does not allow you to add code to the generated classes, but I find it makes my interpreter code ugly.

I wonder if I'm missing a trick, and there is a better way to achieve this?

Let me show you what I've written so far: