The top-level of a Magpie program

58 views
Skip to first unread message

Bob Nystrom

unread,
Jun 2, 2012, 3:53:30 PM6/2/12
to magpi...@googlegroups.com
I've been mulling over a language change that I'd like to get some feedback on.
The existing Java interpreter for Magpie is pretty loose about the language's
semantics. Now that I'm working on a real VM with a bytecode compiler, I'm
trying to nail down some stuff more precisely.

One question is how things like mutual recursion work. It's important that both
mutually recursive methods and classes work:

defclass Foo
var bar is Bar
end

defclass Bar
var foo is Foo
end

def (n is Num) isOdd
not (n - 1) isEven
end

def (n is Num) isEven
if n == 0 then true else not n isOdd
end

But that doesn't play nice with the idea that def and defclass are just
expressions that get evaluated in the middle of a block. In the above example,
when we execute "defclass Foo", "Bar" hasn't been defined yet even though it
uses it in a pattern.

One simple way to fix this would be to make the top-level of a program special.
Right now, a program is just an expression and things like defclass and def are
expressions that can appear anywhere. That means that they're imperative and
execute in the order that they appear. But that doesn't play nice with mutual
recursion.

What I could do instead is say that the top level of a program only contains
definitions, and that that's the only place definitions of methods and classes
can appear. This is what Java, C#, and Dart do.

The downside is that that means you can't easily have normal imperative code at
the top level of a program. Instead, you would probably have to put it inside
a main() method.

I'm trying to give Magpie a "scripty" feel and that seems like it might get in
the way of that. On the other hand, having only static definitions at the top
level means compilation and mutual recursion are much simpler. Also, if I (or
someone else) ever decides to write an ahead-of-time compiler for Magpie,
having a non-imperative top level will make that simpler.

I can think of a couple ways to possibly have the top-level be definitions
without requiring a full "main()" method in every program:

1. Use a do block instead. So a program at the top level is a set of definitions
and one "do" block whose body is executed after all of the definitions. Like:

def sayHi()
print("Hi!")
end

do
sayHi()
end

If you allow that block to appear anywhere, though, it gets a little
confusing:

do
print(pi) // Is this valid?
end

// Define a top-level constant.
val pi = 3.14159

This leads do:

2. No "do" block. Instead, a program is a series of definitions followed by
a series of expressions. The definitions must come first. So:

def sayHi()
print("Hi!")
end

sayHi()

This is actually how most Magpie programs look now, so that's good. A program
with no definitions is just imperative code, so the scriptyness is preserved.
The two downsides I can see are that a) it enforces an ordering to your code
which you might not like and b) it's a bit ambiguous with "var" and "val"
which are both definitions and expressions.

(a) doesn't actually seem that bad to me. And actually, if the top level of
a program is just expressions like it is now, you have the same restriction
since you have to define something before it can be used.

(b) can be easily resolved by just saying that "var" and "val" will be
parsed as definitions if possible.

OK, having written all of that out, I think I've talked myself into a solution.
Here goes:

The top level of a Magpie program is, in order:
- A series of imports and exports.
- A series of definitions ("def", "defclass", "val", and (maybe?) "var").
- A series of expressions.

Any of these sections may be omitted, of course. Definitions are "simultaneous"
and can refer to each other. Circular dependencies between initializers will be
detected and flagged as an error (at compile time? runtime?).

After that, the expressions are evaluated.

I believe this will make it easier to handle mutual recursion and compiling
multimethods to bytecode. The main thing you lose is the ability to have "local"
multimethods or classes: you can't do "def" or "defclass" inside a block. I
don't think you would do that in practice anyway, so that doesn't seem like a
big loss.

Thoughts?

- bob

Mike Austin

unread,
Jun 2, 2012, 5:33:42 PM6/2/12
to magpi...@googlegroups.com
What about using forward declarations like other languages?  For example:

   decclass Bar


   defclass Foo
       var bar is Bar
   end

This would add a 'null' binding for Bar, and would be replaced when Bar is actually defined.  You could later go through the AST and ensure there are no 'null' bindings.  I imagine I'd do something similar in Ruby or Python.

Mike

Paulo Ferreira

unread,
Jun 3, 2012, 9:48:55 AM6/3/12
to magpi...@googlegroups.com
Another option would be to follow the path of other expression-based languages. I'm not familiar with many alternatives, but the one that comes to mind is a "letrec"-like expression, as in Scheme or OCaml: 

  defrec
    defclass Foo
      var bar is Bar
    end

    defclass Bar
      var foo is Foo
    end
  end

Or alternatively, something like:

  defclass Foo
    var bar is Bar

  and defclass Bar
    var foo is Foo
  end

Paulo

Bob Nystrom

unread,
Jun 3, 2012, 3:00:46 PM6/3/12
to magpi...@googlegroups.com
On Sat, Jun 2, 2012 at 2:33 PM, Mike Austin <mike.aus...@gmail.com> wrote:
> What about using forward declarations like other languages?

Hmm, that would work, but it feels hackish to me. My understanding was
that C only has explicit forward declarations because compilers at the
time didn't have enough memory to do anything more user-friendly.

> This would add a 'null' binding for Bar, and would be replaced when Bar is
> actually defined.  You could later go through the AST and ensure there are
> no 'null' bindings.

Having to deal with that redundancy feels a bit hairy to me. This is
one of the things I was hoping to move away from by having top-level
definitions appear "simultaneous". The compiler right now does
something similar to what you suggest when it compiles method calls.
If it sees a call to a method it doesn't know (yet), it implicitly
defines it and then hopes that later the real method will come along
to fill that definition.

In other words, a method call is implicitly treated like a forward
declaration for that method. But that seems kind of nasty to me. I'd
rather do the simpler solution of:

1. Walk through all of the definitions and declare them.
2. Then compile their bodies.

I can do that in two passes if definitions aren't considered
expressions and occur "simultaneously".

>  I imagine I'd do something similar in Ruby or Python.

They (like the current Magpie interpreter in Java) handle this by
basically late binding everything I think.

- bob

Bob Nystrom

unread,
Jun 3, 2012, 3:05:18 PM6/3/12
to magpi...@googlegroups.com
On Sun, Jun 3, 2012 at 6:48 AM, Paulo Ferreira <paul...@gmail.com> wrote:
> Another option would be to follow the path of other expression-based
> languages. I'm not familiar with many alternatives, but the one that comes
> to mind is a "letrec"-like expression, as in Scheme or OCaml:
>
>   defrec
>     defclass Foo
>       var bar is Bar
>     end
>
>     defclass Bar
>       var foo is Foo
>     end
>   end
>
> Or alternatively, something like:
>
>   defclass Foo
>     var bar is Bar
>
>   and defclass Bar
>     var foo is Foo
>   end

Right. One way to frame what I'm proposing is that top-level
definitions work like letrec and nested ones work like normal let. I
don't think I'd want to force users to explicitly write things in a
letrec form because most other OOP languages don't require that.

This does give me an idea, though. If we wanted to, we could later
extend blocks so that they can support letrec-style definitions too
(sort of like how a (begin) block in Scheme can have (define) in it as
long as they're before any other expressions.

That would basically mean that a block in Magpie is:
1. A series of definitions (defclass and def), followed by:
2. A series of expressions.

The definitions can be mutually recursive (i.e. like letrec). Then the
top-level of a program is just a block in this way.

I think for now, I'll try just allowing this at the top-level to see
how that feels, but it's nice to be open to extending it in this way
in the future.

Cheers!

- bob

Grant Posner

unread,
Dec 19, 2012, 11:19:08 PM12/19/12
to magpi...@googlegroups.com
I think that F#-style (the language I know that has it) mutual recursion defintions are a good idea. They tell the user that what they are writing might be "dangerous" and needs to be treated slightly differently, due to recursion. The let-rec definitions, with the "and" keyword, makes sense, and I, at least, like it. Relating to multimethods within method bodies: they are a very important part of functional programming. Unless multimethod != function... I think that there should be local multimethods.

Bob Nystrom

unread,
Dec 29, 2012, 4:04:34 AM12/29/12
to magpi...@googlegroups.com
On Wed, Dec 19, 2012 at 8:19 PM, Grant Posner <feral...@gmail.com> wrote:
I think that F#-style (the language I know that has it) mutual recursion defintions are a good idea. They tell the user that what they are writing might be "dangerous" and needs to be treated slightly differently, due to recursion. The let-rec definitions, with the "and" keyword, makes sense, and I, at least, like it.

It does make sense, but I found that to be the most annoying quirk of programming in F#. Maybe this is just me, but I don't consider recursion "dangerous" or tricky.
 
Relating to multimethods within method bodies: they are a very important part of functional programming. Unless multimethod != function... I think that there should be local multimethods.

Methods and functions are different in Magpie. The former are created with "def" and the latter with "fn". Methods are what you usually create: they are the named, directly invokable bread and butter procedures of the language.

Functions are more like lambdas, even moreso like blocks in Smalltalk and Ruby: they are anonymous, first-class callable objects with non-local returns. You create a method when you want an abstraction you can invoke. You create a function when you want an object that you can pass around that contains a chunk of code.

That being said, I think local methods may still be useful. The main corner case I worry about is confusion about shadowing. For now, the C++ VM doesn't support nested methods, but the grammar and most of the code is organized such that it wouldn't be hard to add support at some point if we find it missing.

Cheers!

- bob


Mark Janssen

unread,
Jun 3, 2014, 4:24:42 PM6/3/14
to magpi...@googlegroups.com
What about using forward declarations like other languages?  For example:

   decclass Bar

   defclass Foo
       var bar is Bar
   end
 
This looks like the right way to me....

Just $0.01 worth...
mark


Reply all
Reply to author
Forward
0 new messages