Looking for some pointers/guidance for extending the language

35 views
Skip to first unread message

Juan Manuel Barreneche

unread,
Feb 18, 2014, 7:48:48 PM2/18/14
to mozart-...@googlegroups.com
Hi everybody!,

I'm a student from "Universidad de Buenos Aires", and I'm starting my final student project (kind of a thesis). I'm currently exploring the idea of extending the Mozart language and I'd like some pointers on where would be better to start digging or hacking this idea.

My first step is to restrict function definition's body to one expression only (currently it allows multiple statements and returns the last expression).

The current syntax for functions (from the docs) is:
fun { <atom> } "{" <expression> { <pattern> } "}"
  <in expression>
end
I'm thinking on introducing a keyword (in the <atom> list) to mark this pure functions and limit the <in expression> to a <expression>. Does it make sense?

This kind of changes will definitely need that I work in fork of the repository, right? Or is there some way I can do this kind of extension/hook I can create/use without modifying the core?

Thanks!
Juan Barreneche

yjaradin

unread,
Mar 1, 2014, 7:14:35 AM3/1/14
to mozart-...@googlegroups.com


First, I'd like to welcome you. Even if what I write after seems like harsh criticism, pleas take it only as constructive and feel free to make any experiment you wish with Oz, it's one of the top reasons for the language's existence in my mind.

I see several problems with what you propose.

The first is the way you propose to identify your extension. The atom list in functions and procedures is part of the grammar. In order for a specific atom (let's say 'pure') to be used to mark a grammatical change, the rule for function and procedure as to be changed so that the current rule accepts  n(>=0) atom-which isn't-pure and your new rule accepts n(>=0) atom-which-isn't-pure followed by 'pure' follwed by m(>=0) atom-which-isn't-pure. The defintion of the non-terminal atom-which-isn't-pure is left as a (non-trivial) exercise. The reason this is difficult is that nowhere in the language, is the value of an atom used to decide on the grammar. In essence, this breaks the abstract grammar of the language. A must better alternative would be to introduce a new keyword (let's say purefun) which is unlikely to be used much as an atom in existing code to replace the keyword fun in your new rule.

The second problem is purely grammatical. I don't think that forcing the body to be an <expression> will be satisfactory. As an example, one of the rule for <expression> is "(" <in-expression> ")" which would basically allows to write any function-body that one wants by just adding parenthesis around it. and there are many more ways to "escape" out of the expression into a whole in-expression.

The third is more pragmatic. In the implementation of the Oz grammar, there is no actual non-terminal <expression>, <in-expression>, <statement> or <in-statement>. The real non-terminals are <phrase> and <in-phrase> and the check for expression-in-statement-position or the opposite is done after the parsing in one of the first passes of the compiler. The reason for this is that is actually very involved to decide whether an expression or statement is expected in certain places (e.g. body of a procedure with a $ deeply nested in one of the formal argument pattern) and that it would almost double the size of the grammar (most constructions are allowed in both sides)

The fourth problem is more philosophical. By disallowing multiple statements in a pure function, you may break the equivalence between the full language and the kernel-language including the unnesting of expression and the definition of n-ary function as n+1-ary procedures. (More not so relevant philosophy: Even if it is presented as an extension of the language, it still looks to me more like a restriction than an extension...)

Finally, there is a semantic problem. I guess that your intent with pure functions isn't just grammatical but attempts to capture the idea of functions in a more mathematical sense (i.e. side-effect free). This would be quite difficult because Oz is (from my point of view) not a functional language at all but an imperative language done rightly. At a certain level (in particular at the level at which the Oz semantic is defined) there are no functions at all and all 'returns' are just side-effects (binding a variable that was an input to the procedure) This can possibly be avoided by rewriting the semantic (and kernel translation rules) in a more functional (even if strictly equivalent) way in order to get a better definition of what a pure function would be. In any case this definition would probably be semantic and not grammatical and potentially extremely non-trivial (e.g. encapsulated state) to check. One of the most evident properties should be that a pure function doesn't call impure functions in ways which can't be proven pure and this may require flagging pure functions. You'd end up having to do some runtime check (e.g. is the result of the higher-order Compose function pure or not? This depends on pureness of the arguments) and clearly be outside of anything remotely grammatical.

A much less controversial example (in my opinion) would be the addition of a while loop even if this has been done a number of times already ;-)

On practical matters, changing the Oz grammar is relatively easy. The API of the Oz compiler gives access to the parser as an independent component and allows to plug another parser as the compiler front-end, as long as the same AST type is produced (http://mozart.github.io/mozart-v1/doc-1.4.0/compiler/node7.html#appendix.syntax). Of course writing a parser from scratch is probably not what you want to do. In Mozart 2.0, the lexer and parser are for the moment (performance is terrible right now) written in Oz, so you can just get the modules you need from the sources and hack away. You could even do the same with the compiler but that is made of a lot more files. You can of course also fork the repo and work on your own branch, this isn't as difficult as it sounds.

Good luck!
Yves
Reply all
Reply to author
Forward
0 new messages