What's the reasoning behind implicit instance variable declaration?

Seòras Macdonald

Dec 18, 2018, 4:09:11 PM
to Wren
Hi,

It seems odd to me that instance variables are declared implicitly and read as null if they are undefined (and that methods are likewise read as null if undefined).
A lot of languages seem to do things this way, but seeing it in Wren was strange, since variables need to be declared before being assigned to.
Is there a performance reason for this? Or are there benefits for the programmer with implicit instance variable declaration that I'm not seeing?

I've only read through the language guide so far, but Wren seems like a lovely language :)

iLiquid

Jan 1, 2019, 11:16:56 AM
to Wren
What I think Wren does behind the scenes is not declare any implicit variables; it just returns null when you try to access an undeclared variable.

Thorbjørn Lindeijer

Jan 1, 2019, 2:49:25 PM
to wren-lang@googlegroups.com
Actually Wren requires variables to be declared. The program "System.print(a)" yields a compile-time "Error at 'a': Undefined variable.". That's different from just getting null when trying to access an undeclared variable.

Variables always need to be declared before they are used, so even though Wren uses a single-pass compiler, it can immediately output an error if you try to use a variable that wasn't previously declared.

This is different for instance variables, which may be read in some method that is declared before another method that writes to the variable. So as far as I am aware, any mention of a class variable will cause a field for that variable to exist.

In principle it would be possible for Wren to output an error if it detects that a certain instance variable is only ever read and never written to, but to do this it would need to keep track of a bunch of information and check for this when it gets to the end of the class definition. Doing this would cost a bit of performance, yet it would be solely for the purpose of reporting read-only or write-only variables as warnings (since I guess neither makes sense), so it is potentially better suited as an optional check that could be enabled while developing.

Btw, for method use it would get even more complicated to check for the use of methods that are not declared, since an instance may call a method that is declared in a subclass. This is not the case for member variables, which are always private.

Cheers,
Bjørn
> --
> You received this message because you are subscribed to the Google
> Groups "Wren" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to wren-lang+...@googlegroups.com.
> To post to this group, send email to wren...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/wren-lang/f2295332-6d76-4f7e-9193-241051e4c2cb%40googlegroups.com <https://groups.google.com/d/msgid/wren-lang/f2295332-6d76-4f7e-9193-241051e4c2cb%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout.
>

Michel Hermier

Jan 2, 2019, 12:17:20 AM
to wren-lang


On Tue, Jan 1, 2019 at 20:49, Thorbjørn Lindeijer <bj...@lindeijer.nl> wrote:
> Actually Wren requires variables to be declared. The program "System.print(a)" yields a compile-time "Error at 'a': Undefined variable.". That's different from just getting null when trying to access an undeclared variable.
>
> Variables always need to be declared before they are used, so even though Wren uses a single-pass compiler, it can immediately output an error if you try to use a variable that wasn't previously declared.

The language is a little inconsistent here. If the variable name starts with a capital letter, I think it should pass, because resolution of the definition is deferred. It would still require a later definition to compile, though, and would evaluate to null prior to its proper initialisation.


> This is different for instance variables, which may be read in some method that is declared before another method that writes to the variable. So as far as I am aware, any mention of a class variable will cause a field for that variable to exist.

This was decided in order to make class definitions as fluent as possible.
While I'm not a big fan of the _ notation, it is effective.
The biggest issue for me is that we lose control over member variable declaration and layout, meaning that any method can use or declare an undefined variable too easily for my taste. That introduces a huge class of bugs.


> In principle it would be possible for Wren to output an error if it detects that a certain instance variable is only ever read and never written to, but to do this it would need to keep track of a bunch of information and check for this when it gets to the end of the class definition. Doing this would cost a bit of performance, yet it would be solely for the purpose of reporting read-only or write-only variables as warnings (since I guess neither makes sense), so it is potentially better suited as an optional check that could be enabled while developing.

Or the variables could be initialised to an UNDEFINED value, and we could make use of that somehow: either by exposing it for use in the language, or by raising an error when it is read.

The latter seems safer, but since it could impact every variable read, I would do that only in debug builds.

The former has some advantages, but Bob has stated explicitly that he doesn't want another null. I think it is interesting because it helps to distinguish between a null and an optional/undefined value.

At the end of the day, it all comes down to how to encode/manage *bogus* values: letting them flow in and out, or aborting/throwing when they are detected.


> Btw, for method use it would get even more complicated to check for the use of methods that are not declared, since an instance may call a method that is declared in a subclass. This is not the case for member variables, which are always private.

This is not possible at compile time because of the dynamic nature of the language. Even if optional type declarations were introduced, they would need to be evaluated at runtime, so it would only be possible to reason about strongly defined signatures.

The best that can be done here is a static analysis of a whole project to try to detect invalid values being passed.

Cheers,
Michel

Bob Nystrom

Jan 10, 2019, 12:41:12 AM
to wren-lang
Name resolution is always a complex part of any object-oriented language. It's even more complex in a dynamically-typed one that can't rely on the static type system to help, and even more so in a language like Wren with a single pass compiler.

I had a couple of high level goals:

  • When possible, do try to report a compile-time error on an unknown variable. They're a common mistake, and if it's possible for the language to help you, it should do so.
  • Don't require an explicit "this." or "self." when accessing fields and members inside a method of a class. That syntax *really* simplifies things, which is why JavaScript and Python do it, but I just think it's too verbose for a language like Wren that wants to put object-oriented programming front and center.
  • Support "getters"—computed methods that don't require parentheses.
  • Make classes efficient. In particular, creating an instance of a class should be a single allocation of a known fixed size. Unlike JS, Python, etc., instances aren't just arbitrary property bags that grow and mutate over time because that's slow.
  • As much as possible, make the code look familiar to users coming from other languages.
  • Avoid unnecessary ceremony. Wren is a lightweight scripting language, so you shouldn't have to jump through hoops to get things done.
  • As always, keep the implementation simple.

Local variables

For local variables, I require explicit declaration mainly because it pins down the scope of the variable. If you do implicit declaration, then in code like:

a = "outer"
{
  a = "inner"
}

It's not clear if the user intends the second assignment to create a new variable inside the block, or assign to the existing variable. Requiring variables to be explicitly declared means an assignment is always assigning to an existing variable. Also, it avoids common bugs where a typo in a variable name creates a new variable instead of causing an error.
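
With explicit declaration, the two possible intentions are spelled differently. A small sketch (the variable names are only for illustration):

```wren
var a = "outer"
{
  var a = "inner" // `var` declares a new variable scoped to this block
  System.print(a) // inner
}
System.print(a) // outer

var b = "outer"
{
  b = "inner" // no `var`, so this assigns to the existing variable
}
System.print(b) // inner
```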

Fields

I wanted Wren to support getters, but that opens up an ambiguity. If the compiler sees an identifier expression inside a class body, it could be:
  • reading a field
  • calling a getter
  • accessing a local variable inside the method
  • accessing a variable declared outside the class

It's even more complex when you consider that the getter it might be calling could be an inherited one. Since the superclass reference is itself evaluated at runtime, we don't know the superclass members at compile time. This is one of the most difficult design challenges I ran into with the language. I carved it up into a few pieces:

First, if we can see that a local variable with that name is in scope, which we can tell at compile time, it must be that. Locals shadow members and variables declared outside the class. This is pretty straightforward.

If we walk the local scopes and eventually hit the outermost scope of the method, then it must not be a local in there. Now what? Do we treat it like a member or do we keep looking in the lexical scopes surrounding the class definition? Either answer is obviously wrong in some cases.

class Foo {
  sayHi() {
    System.print("Hi")
  }
}

Here, it would be pretty weird if the compiler treated "System" as "this.System". (Fun trivia, that *is* how Ruby resolves calls to "puts" and calls to methods declared at the top level of a file outside of any class.)

class Base {
  baseMethod() {}
}

class Derived is Base {
  derivedMethod() {
    baseMethod()
  }
}

Here, it would be pretty weird if the compiler *didn't* treat "baseMethod()" as "this.baseMethod()". These two examples seem diametrically opposed. The trick is this: Most of the time, when you want to access an identifier defined in the scope surrounding a class and not look it up on "this", it's either a class name or a constant. Those both start with capital letters. So the compiler treats those differently. Capitalized identifiers are resolved lexically, and lowercase ones are not.

Note that the latter also means we can't give you a compile error on an unknown variable when you use it inside the class. For all the compiler knows, that variable name will refer to a getter on the superclass. You only get unknown variable errors for capitalized names, or for lowercase names used outside of a class.
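
A minimal sketch of the capitalization rule in action. The names Greeting, greeting, and Foo are made up for this example:

```wren
var Greeting = "Hi" // capitalized: resolved lexically, even inside a class
var greeting = "hi" // lowercase: not visible by bare name inside a method

class Foo {
  construct new() {}

  sayHi() {
    // Capitalized name, so this finds the variable declared above.
    System.print(Greeting)

    // A bare lowercase `greeting` here would be compiled as a getter call
    // on `this`. Foo has no such getter, so it would fail at runtime
    // rather than at compile time:
    // System.print(greeting)
  }
}

Foo.new().sayHi() // Hi
```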

That leaves distinguishing getters/setters from fields. One option would be for the compiler to always compile them to getter/setter calls. Then a field declaration would implicitly define a getter/setter that accesses the value. However, that puts a real performance penalty on every field usage.

The compiler can't statically tell which identifiers are fields and which aren't because, again, it doesn't know the inherited superclass ones until runtime. So the solution I came up with was to use a naming convention. A leading underscore is always a field. The compiler can compile it to a bytecode instruction that directly reads the field from the instance.

We don't require explicit declaration of fields because it's not really needed. There is no ambiguity around which block scope a field goes into. A field always refers to the containing class. When compiling the class, it's easy to note the name of every field encountered and implicitly declare them. We also know how many there are, which gives us the performance we want. We could require fields to be declared, but doing so wouldn't help the compiler. It might help the user if they tried to assign to a field that wasn't declared. But supporting that is a good bit more bookkeeping in the compiler. In general, to keep Wren minimal, the compiler doesn't do much checking that only benefits the user. Most of the compile-time features of Wren are there because they affect the bytecode and performance.
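
To make the implicit declaration concrete, here is a small sketch. The Counter class and its members are hypothetical, chosen only to show a field that is read before anything has been written to it:

```wren
class Counter {
  // _count is never assigned here, so it starts out as null.
  construct new() {}

  // Getter: no parentheses at the call site, reads the field directly.
  count { _count }

  increment() {
    // Merely mentioning _count is enough to give the class that field.
    _count = (_count == null ? 0 : _count) + 1
  }
}

var c = Counter.new()
System.print(c.count) // null
c.increment()
System.print(c.count) // 1
```

A misspelled field name such as _cuont would silently become a second field under this scheme, which is the trade-off described above.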

Hope that helps,

– bob

