Name resolution is always a complex part of any object-oriented language. It's even more complex in a dynamically typed one that can't rely on a static type system to help, and more so still in a language like Wren with a single-pass compiler.
I had a couple of high-level goals:
- When possible, report a compile-time error on an unknown variable. They're a common mistake, and if it's possible for the language to help you, it should do so.
- Don't require an explicit "this." or "self." when accessing fields and members inside a method of a class. That syntax *really* simplifies things, which is why JavaScript and Python do it, but I just think it's too verbose for a language like Wren that wants to put object-oriented programming front and center.
- Support "getters"—computed methods that don't require parentheses.
- Make classes efficient. In particular, creating an instance of a class should be a single allocation of a known fixed size. Unlike JS, Python, etc., instances aren't just arbitrary property bags that grow and mutate over time because that's slow.
- As much as possible, make the code look familiar to users coming from other languages.
- Avoid unnecessary ceremony. Wren is a lightweight scripting language, so you shouldn't have to jump through hoops to get things done.
- As always, keep the implementation simple.
Local variables
For local variables, I require explicit declaration mainly because it pins down the scope of the variable. If you do implicit declaration, consider code that assigns to a name in an outer block and then assigns to the same name again inside a nested block.
It's not clear if the user intends the second assignment to create a new variable inside the block, or assign to the existing variable. Requiring variables to be explicitly declared means an assignment is always assigning to an existing variable. Also, it avoids common bugs where a typo in a variable name creates a new variable instead of causing an error.
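To make that concrete, here is a small sketch of how the two intents look once declaration is explicit. The variable names are only for illustration. "var" always introduces a new variable in the current block, and a bare assignment always targets one that already exists:

```wren
var a = "outer"
{
  // Bare assignment: nothing is declared here, so this can only
  // mean the existing `a` from the enclosing scope.
  a = "changed"
}
System.print(a) // changed

var b = "outer"
{
  // `var` declares a new variable that shadows the outer `b`
  // until the end of this block.
  var b = "inner"
  System.print(b) // inner
}
System.print(b) // outer
```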
Fields
I wanted Wren to support getters, but that opens up an ambiguity. If the compiler sees an identifier expression inside a class body, it could be:
- reading a field
- calling a getter
- accessing a local variable inside the method
- accessing a variable declared outside the class
It's even more complex when you consider that the getter it might be calling could be an inherited one. Since the superclass reference is itself evaluated at runtime, we don't know the superclass members at compile time. This is one of the most difficult design challenges I ran into with the language. I carved it up into a few pieces:
First, if we can see that a local variable with that name is in scope, which we can tell at compile time, it must be that. Locals shadow members and variables declared outside the class. This is pretty straightforward.
If we walk the local scopes and eventually hit the outermost scope of the method, then it must not be a local in there. Now what? Do we treat it like a member or do we keep looking in the lexical scopes surrounding the class definition? Either answer is obviously wrong in some cases.
Take a method that refers to "System", say to print something. There, it would be pretty weird if the compiler treated "System" as "this.System". (Fun trivia, that *is* how Ruby resolves calls to "puts" and calls to methods declared at the top level of a file outside of any class.)
Now take a method in a subclass that calls "baseMethod()", a method defined on its superclass. There, it would be pretty weird if the compiler *didn't* treat "baseMethod()" as "this.baseMethod()". These two examples seem diametrically opposed. The trick is this: Most of the time, when you want to access an identifier defined in the scope surrounding a class and not look it up on "this", it's either a class name or a constant. Those both start with capital letters. So the compiler treats those differently. Capitalized identifiers are resolved lexically, and lowercase ones that aren't locals are treated as members on "this".
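Here is a minimal sketch of both cases side by side. The class names "Base" and "Derived" and the method "derivedMethod" are made up for the example:

```wren
class Base {
  construct new() {}

  baseMethod() {
    // `System` is capitalized, so it's looked up in the scopes
    // surrounding the class, not on `this`.
    System.print("base method")
  }
}

class Derived is Base {
  construct new() {}

  derivedMethod() {
    // `baseMethod` is lowercase and isn't a local, so this is compiled
    // as `this.baseMethod()` and found on the superclass at runtime.
    baseMethod()
  }
}

Derived.new().derivedMethod() // base method
```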
Note that the latter also means we can't give you a compile error on an unknown variable when you use it inside the class. For all the compiler knows, that variable name will refer to a getter on the superclass. You only get unknown variable errors for capitalized names, or for lowercase names used outside of a class.
That leaves distinguishing getters/setters from fields. One option would be for the compiler to always compile them to getter/setter calls. Then a field declaration would implicitly define a getter/setter pair that accesses the value. However, that puts a real performance penalty on every field usage.
The compiler can't statically tell which identifiers are fields and which aren't because, again, it doesn't know the inherited superclass ones until runtime. So the solution I came up with was to use a naming convention. An identifier with a leading underscore is always a field. The compiler can compile it to a bytecode instruction that directly reads the field from the instance.
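As a rough illustration of how the convention reads in practice, here is a sketch with a hypothetical "Point" class. The underscore names are fields, and the bare lowercase names inside "sum" are getter calls on "this":

```wren
class Point {
  construct new(x, y) {
    // Leading underscore: these are fields, stored directly
    // in the instance. The bare `x` and `y` are the parameters.
    _x = x
    _y = y
  }

  // Getters: methods with no parameter list, called without parentheses.
  x { _x }
  y { _y }

  // No underscore and no locals with these names, so `x` and `y`
  // here are getter calls on `this`.
  sum { x + y }
}

var p = Point.new(3, 4)
System.print(p.sum) // 7
```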
We don't require explicit declaration of fields because it's not really needed. There is no ambiguity around which block scope a field goes into. A field always refers to the containing class. When compiling the class, it's easy to note the name of every field encountered and implicitly declare them. We also know how many there are, which gives us the performance we want. We could require fields to be declared, but doing so wouldn't help the compiler. It might help the user if they tried to assign to a field that wasn't declared. But supporting that is a good bit more bookkeeping in the compiler. In general, to keep Wren minimal, the compiler doesn't do much checking that only benefits the user. Most of the compile-time features of Wren are there because they affect the bytecode and performance.
Hope that helps,
– bob