Feature request: single-dispatch generic functions

pcxunl...@gmail.com

unread,

Apr 22, 2014, 9:43:31 AM4/22/14

to strati...@googlegroups.com

I have an idea for a new feature for StratifiedJS. First, I will explain the problem that this feature is trying to solve.

In traditional OOP languages, you have "classes" which can have "attributes" (data) and "methods" (behavior). JavaScript does not have a traditional concept of classes, but its prototypical model is essentially the same: objects have attributes and methods, just like in a classical language.

However, putting methods into classes is brittle. I will now demonstrate this claim. Consider these classes:

function FileInput() {}

FileInput.prototype.read = function () { ... }

function Iterator() {}

Iterator.prototype.next = function () { ... }

function Book() {}

Book.prototype.read = function () { ... }

If we have some object "foo" and we want to treat it as an Iterator, we can just use "foo.next()". If "foo" is an Iterator it succeeds, but if "foo" isn't an Iterator, the method won't exist, so it will throw an error, which is exactly what we want. This is duck typing, hurray!

But what happens when we call "foo.read()"? Well, the answer is that it depends on whether "foo" is a FileInput or a Book. Since reading a Book is very different from reading a FileInput, we really want to distinguish these two cases. We can do so by using instanceof:

assert(foo instanceof FileInput)

foo.read()

By doing this we have guaranteed that "foo" is a FileInput, and thus the correct method will be called. And we can do the same whenever we want to treat "foo" as a Book:

assert(foo instanceof Book)

foo.read()

At first, this seems to work... until we define a new class:

function MyBook() {}

We want MyBook to behave like a Book, an Iterator, and a FileInput, all at the same time. Since we're using instanceof to distinguish between FileInput and Book, we can only make it work if MyBook is a subclass of FileInput:

MyBook.prototype = Object.create(FileInput.prototype)

But now it *isn't* a subclass of Book, so it will *only* work as a FileInput, and never as a Book. Some languages (like Python) introduce multiple inheritance to solve this problem. I think this is an insane hack: we already have classes, instances, attributes, methods, single-dispatch, single-inheritance, and now you want to add in multiple-inheritance too?! The language keeps getting more and more complicated.

What about duck typing? Well, if we just call "foo.read()" without checking the type, it *may* throw a nice error, or it may fail silently, or it may go and munge data... who knows. Duck typing isn't safe. We really would like our methods to fail fast if you call them with the wrong inputs.

And neither of those strategies solve the problem. Methods are named using strings, meaning `foo.read` is the same as `foo["read"]`. This means that *even* with multiple inheritance and/or duck typing, `Book.prototype.read` and `FileInput.prototype.read` will conflict with each other, because they use the same name.

It is hopeless. OOP failed. Multiple inheritance failed. Duck typing failed. But there is a solution! It just requires you to no longer treat methods as being defined inside classes (or in the case of JavaScript, defined on prototypes).

Well, if a method isn't defined on a class/prototype, where is it, then? And here is where my feature request comes in. I propose a new keyword called "generic". It looks like this:

generic foo;

Notice that we only gave it a variable name, nothing more. What this does is, it creates a new *generic function* and assigns it to the variable "foo". A generic function is *exactly* like a normal function... except the behavior of a generic function changes depending on the type of its first argument.

To define new behavior, we use the keyword "extend":

extend foo(x is Foo) {

return 1;

}

What the above means is, whenever you call "foo" with an argument which is instanceof "Foo"...

foo(new Foo())

...it will return 1. As a convenience, you can write this...

generic foo(x is Foo) {

return 1;

}

...which is exactly the same as this:

generic foo;

extend foo(x is Foo) {

return 1;

}

Now, let's add another rule:

extend foo(x is Bar) {

return 2;

}

Notice that we used "extend" rather than "generic". Using "generic" creates a *new* generic function. Using "extend" changes an *existing* generic function.

Now if we call "foo" with an argument that is instanceof "Bar", it will return 2. In other words...

foo(new Foo()) => 1

foo(new Bar()) => 2

If you think about it, this is exactly like methods! To understand what I mean, take the method "foo.read()" and replace it with the generic function "read(foo)". In both cases, depending on the type of "foo", we end up with different behavior.

The difference between generic functions and methods is... methods use strings to name things, so you end up with name collisions. But the generic function is put into an actual variable, so we can use the language's module system to prevent name conflicts!

Let's now solve the FileInput/Iterator/Book/MyBook problem using generic functions:

// file.sjs

function FileInput() {}

generic read(x is FileInput) { ... }

exports.FileInput = FileInput

exports.read = read

// iter.sjs

function Iterator() {}

generic next(x is Iterator) { ... }

exports.Iterator = Iterator

exports.next = next

// book.sjs

function Book() {}

generic read(x is Book) { ... }

exports.Book = Book

exports.read = read

Notice I'm now defining each type in a separate file. This wasn't necessary with methods, but is now necessary because both file.sjs and book.sjs define a variable called "read". Now, let us create MyBook:

// mybook.sjs

var file = require("./file")

var book = require("./book")

function MyBook() {}

extend file.read(x is MyBook) { ... }

extend book.read(x is MyBook) { ... }

exports.MyBook = MyBook

As you can see, we use StratifiedJS's built-in module system to allow us to extend both file.read and book.read. Now, when you treat a MyBook object as a book, it will behave in one way, and when you treat it as a file, it will behave in another way. This solves the problem completely! And unlike duck-typing, this is *safe*: if you try to call a generic function on something that it doesn't understand, it throws an error.

Now, let's suppose you release MyBook as a library, and people start to use it. People want to use MyBook as an iterator, but you (the author of mybook.sjs) didn't extend the iter.next function. That's okay, other people can extend it for you!

var iter = require("./iter")

var mybook = require("./mybook")

extend iter.next(x is mybook.MyBook) { ... }

With generic functions, it's trivial to define new behavior on old types, and new types that work with old behavior. Oh yeah, and it works really nicely with StratifiedJS's .. syntax:

@ = require(["./mybook", "./file", "./iter"])

new @MyBook() .. @read() .. @next()

Woah, it's almost like we're doing method calls, except it's safe and there's no name conflicts!

Here's the best part. You can get all this with ridiculous speed to boot. The ClojureScript team discovered a neat trick when they were trying to implement Clojure Protocols in JavaScript. I shamelessly stole their idea and applied it to generic functions. Here's how it works. This...

generic foo;

extend foo(x is Foo) { ... }

...will get compiled to this JavaScript:

// The "generic" and "extend" functions only need to be defined once.

// They can then be used to create/extend multiple generic functions.

var generic = (function () {

var id = 0;

return function (name) {

var key = "__unique_#{name}_#{++id}__";

function f(x) {

var method = x[key];

assert(typeof method === "function", "#{name} called on invalid type");

return method.apply(null, arguments);

}

f.__generic_key__ = key;

return f;

};

})();

var extend = function (gen, Type, f) {

var key = gen.__generic_key__;

assert(typeof key === "string", "extend can only be used on generic functions");

Type.prototype[key] = f;

};

var foo = generic("foo");

extend(foo, Foo, function (x) { ... });

What the?! Okay, to explain, let's look at some simplified code that does the same thing:

var foo = function (x) {

return x.__foo__.apply(null, arguments);

}

Foo.prototype.__foo__ = function (x) { ... }

So, what's going on is that a generic function is just a normal function that calls a method on its first argument (in this case, the method is "__foo__"). And then, extend will attach a __foo__ method to Foo.prototype. This is crazy, but it works and is really smoking fast.

Now, how is this different from the big blob of code up there? Well, first off, we can't use the method name "__foo__" because there might be other generic functions named "foo". So that's why we give it a unique method name, "__unique_foo_1__".

But now that we have given it a unique name, how does extend know which method name to use? Answer: the unique name is stored in foo.__generic_key__.

And then the "generic" and "extend" functions do a bit of checking so they throw errors if you try to call "foo" on something that it doesn't understand, or if you try to call "extend" on something that isn't a generic function.

I ran some benchmarks: https://github.com/Pauan/programming-notes/blob/master/Benchmarks/Single%20dispatch%20multimethods.rst

Raw methods ended up being 3.63 times faster than generic functions with assertions. In my opinion, this is very acceptable.

Note: the benchmark was run with normal synchronous JavaScript. I should test its speed in StratifiedJS as well, because it may perform very differently due to SJS having continuations.

P.S. As you can see in the compiled JavaScript, it's actually possible to implement this right now as a library without any new syntax, it's just a bit clunky.

The syntax I proposed in here is decent, but I'm totally open to different syntaxes! Having good idiomatic support for generic functions is what's important, the exact syntax doesn't matter much.

Alexander Fritze

unread,

Apr 22, 2014, 5:51:19 PM4/22/14

to strati...@googlegroups.com

Hiya,

You don't really need to convince me that OOP has turned out to be a
bad idea; I wholeheartedly agree with your statements :-)

If you look at SJS you'll notice that in designing the standard
libraries we're actually trying to get away from traditional OO
abstractions as much as possible.

It's not only the conflation of data & behaviour which is problematic,
but also the fact that objects are often used to encapsulate behaviour
that relies on some implicit imperative protocol that the user needs
to manually execute.

E.g. you might have a class:

class File {
open: function(path) { ... },
close: function() { ... },
read: function(...) { ... },
write: function(...) { ... }
}

To use this correctly, there is the implicit assumption that the
programmer call `open` before any `read` or `write`, and `close` when
they are done. This is unsafe, because e.g. the programmer might
forget to call `close`, or issue a `read` before the file is open or
after it is closed.

In SJS we try to use functional abstractions instead. E.g. for a file
we might have:
withFile(path) {
|file_interface|

file_interface.write(...);
file_interface.read(...);
}

Here, no matter how the code block is being exited (whether normally
or by exception), the file will always be closed, and the file is
guaranteed to be open when the code block starts. We have effectively
factored out the management of the 'protocol state', which makes this
abstraction safer to use than the corresponding object.

The way this would be implemented would be something like this:

function withFile(path, f) {
var handle = fopen(path);
try {
f({
write: x -> fwrite(handle, x),
read: -> fread(handle, x)
})
}
finally {
close(handle);
}
}

Anyway, getting to your generic function proposal:

I think this could be a great addition to SJS, but my main gripe is
that it relies on prototype-backed objects for the type information.
This is bad for two reasons.

Firstly we have several types that can't be backed by prototypes (e.g.
Streams - https://conductance.io/reference#sjs:sequence::Stream -
which are functions).

Secondly we are trying to get away from prototype-backed objects in
general (note e.g. how the file_interface is implemented above).

I thus think any implementation of generic functions would have to go
hand in hand with a rethink of how we can systematically attach type
information to *any* construct - objects, functions, arrays, quasis.
Ideally even numbers (although i don't see how that's possible).

Hmm, maybe we could go down the route of scheme and leave the
indentification of types up to normal 'isType' functions. We already
have functions like
https://conductance.io/reference#sjs:quasi::isQuasi,
https://conductance.io/reference#sjs:string::isString, or
https://conductance.io/reference#sjs:sequence::isStream, it is just
not very systematic yet.

Of course then we couldn't use the prototype-based dispatch mechanism
that you are proposing, but maybe we could go down the route of this
proposal for scheme: http://www.standarddeviance.com/p45-tessman.pdf.

Generic methods would then be coded something like:

extend foo(x:isStream, y:isString) { ... }

Need to think about this for a bit... keep the ideas coming :-)

Cheers,
Alex

> --
> You received this message because you are subscribed to the Google Groups
> "StratifiedJS" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to stratifiedjs...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

pcxunl...@gmail.com

unread,

Apr 23, 2014, 10:48:04 AM4/23/14

to strati...@googlegroups.com

On Wednesday, April 23, 2014 5:51:19 AM UTC+8, Alexander Fritze wrote:

> If you look at SJS you'll notice that in designing the standard

> libraries we're actually trying to get away from traditional OO

> abstractions as much as possible.

This is one of the reasons I am proposing generic functions: I noticed that SJS already chooses a more functional style rather than the traditional OOP model, so I figured generic functions would fit in naturally with SJS.

> The way this would be implemented would be something like this:

>

> function withFile(path, f) {

> var handle = fopen(path);

> try {

> f({

> write: x -> fwrite(handle, x),

> read: -> fread(handle, x)

> })

> }

> finally {

> close(handle);

> }

I wholeheartedly agree. As an example, I recently added in generic functions to my own language. There are three generic functions: "empty", "push", and "traverse". Together, they form a "list-like" protocol, meaning that any data-type that extends those three functions will behave like a list. The generic functions have this behavior:

// return an empty list of the same type as x

empty(x)

// add 1 to the end of x and return x

// think of it like Array.prototype.push, except it's functional

// it returns x so it can work on immutable things like strings

// push may not always mutate its first argument, so you shouldn't rely on it mutating

push(x, 1)

// return an iterator for x

traverse(x)

And here's the implementation for arrays:

extend empty(x is Array) {

return []

}

extend push(x is Array, y) {

x.push(y)

return x

}

// alternate implementation if you want to enforce functionalness, but this is a lot slower

extend push(x is Array, y) {

return x.concat([y])

}

extend traverse(x is Array) {

var len = x.length

return {

next: function () {

if (i < len) {

return { done: false, value: x[i] }

} else {

return { done: true }

}

And then there are a bunch of normal (non-generic) functions that use empty/push/traverse. Examples include len, foldl, map, keep, and zip. As a programmer, you won't usually call empty/push/traverse directly, instead you'll use other functions (like map). And when you want to create a new data-type, all you have to do is extend push/empty/traverse and you get all those awesome functions for free!

So I'm not suggesting that everything be generic. Instead, I'm suggesting that for each interface (file-like, list-like, string-like, etc.) you have a handful of generic functions that essentially form the "axioms". And then you create some normal (non-generic) functions that abstract over the axioms.

Your example is a perfect demonstration of that: fopen, fwrite, fread, and close would be generic functions that serve as the axioms, while the withFile function would be a normal function. The benefit of generic functions is that somebody can create a new data-type that extends fopen/fwrite/fread/close, and then gets withFile for free.

> Firstly we have several types that can't be backed by prototypes (e.g.

> Streams - https://conductance.io/reference#sjs:sequence::Stream -

> which are functions).

Looking at the source for sequences/streams... I think this is a great example of where generic functions would be preferable. In particular, the "each" function does branching based on isStream, isArrayLike, and typeof. The way I would implement that with generic functions is to have empty/push/traverse, and then the "each" function would use those three functions without any type-checking or branching. The type-checking would automatically be done by empty/push/traverse simply because they are generic functions.

> Secondly we are trying to get away from prototype-backed objects in

> general (note e.g. how the file_interface is implemented above).

That's entirely understandable given how bad OOP is, but I think the problem of prototype-backed objects goes away if your language has generic functions.

> I thus think any implementation of generic functions would have to go

> hand in hand with a rethink of how we can systematically attach type

> information to *any* construct - objects, functions, arrays, quasis.

> Ideally even numbers (although i don't see how that's possible).

Absolutely. What I'm suggesting is not a small little feature that can be tacked on. It would need to be an idiomatic thing that programmers use regularly, and so it would have to be supported by the entire language and base libraries. That is not a small change.

However, I do think that even though it would cause short-term pain to migrate the language to generic functions, it may be worth it in the long run. And getting them in now while the language is young is a lot easier than trying to shove them in later once the language has matured (backwards compatibility is hell).

Whether you think that short-term pain is worth it or not, is entirely up to you. I completely understand if you think the benefits of generic functions are not worth the amount of effort that would be required for the language to support them.

P.S. Generic functions work even with strings and numbers, thanks to JavaScript doing auto-boxing. Generic functions just attach functions to the prototype, so you could do things like this:

generic foo;

extend foo(x is String) { ... }

That will get compiled into something equivalent to this:

function foo(x) {

return x.__unique_foo_1__.apply(null, arguments);

}

String.prototype.__unique_foo_1__ = function (x) { ... }

And now if you use foo("bar") JavaScript will automatically coerce "bar" into a String object, and so it works. Although it works, it will also probably be slower than working with string/number values directly, so that is a concern.

Also, something that you need to be careful about is that the strings passed to the function will be String objects and not string values. In other words, the above system treats everything as an object. This is natural for Python or Ruby programmers, but JavaScript programmers are not used to the "everything is an object" idea, so it will trip up some people.

Also, this technique won't work with null, arguments, or undefined, because there is no Null, Arguments, or Undefined constructor. You could work around this by injecting some extra logic into every generic function, which would make every generic function slower (by a constant amount):

function Null() {}

function Undefined() {}

function Arguments() {}

function isArguments(x) {

return {}.toString.call(x) === "[object Arguments]";

}

function getMethod(x) {

if (x === null) {

return Null.prototype.__unique_foo_1__;

} else if (x === void 0) {

return Undefined.prototype.__unique_foo_1__;

} else if (isArguments(x)) {

return Arguments.prototype.__unique_foo_1__;

} else {

return x.__unique_foo_1__;

}

function foo(x) {

return getMethod(x).apply(null, arguments);

}

String.prototype.__unique_foo_1__ = function (x) { ... }

This is hacky, but it's an implementation detail that is hidden from the programmer. My concern with this is that it does add some extra overhead... so it may be better to just say, "we support extending everything except null/undefined/arguments".

If your concern is that you want to support calling "each" on arguments objects, you can do this:

extend empty(x is Object) {

if (isArguments(x)) {

return empty([])

} else {

throw new TypeError()

}

extend push(x is Object, y) {

if (isArguments(x)) {

return push([].slice.call(x), y)

} else {

throw new TypeError()

}

extend traverse(x is Object) {

if (isArguments(x)) {

return traverse([].slice.call(x))

} else {

throw new TypeError()

}

This works because arguments objects inherit from Object. And we can then do some manual type-checking using isArguments. It's hacky, but it has *no* performance penalty. Unfortunately, this technique won't work on null/undefined because they don't inherit from anything (not even Object).

> Hmm, maybe we could go down the route of scheme and leave the

> indentification of types up to normal 'isType' functions. We already

> have functions like

> https://conductance.io/reference#sjs:quasi::isQuasi,

> https://conductance.io/reference#sjs:string::isString, or

> https://conductance.io/reference#sjs:sequence::isStream, it is just

> not very systematic yet.

Right, that is a very general approach, but the downside is that it's slower. Single dispatch generic functions have the benefit that they use JavaScript's existing prototype system, which has been optimized by VMs.

By the way, it is possible to use the isStream/isQuasi pattern with generic functions...

generic isQuasi;

extend isQuasi(x is Foo) { return true }

extend isQuasi(x is Bar) { return true }

But one of the nice things about generic functions is that most of the time you don't need to even have isFoo functions: you just call the generic function, and if it is called on the wrong type, you get an immediate error.

This is like in Python where you just call the method and don't bother doing type-checking. The difference with Python is that duck-typing in Python is not safe, while generic functions are safe.

> Of course then we couldn't use the prototype-based dispatch mechanism

> that you are proposing, but maybe we could go down the route of this

> proposal for scheme: http://www.standarddeviance.com/p45-tessman.pdf.

>

> Generic methods would then be coded something like:

>

> extend foo(x:isStream, y:isString) { ... }

Right, what you're talking about is called "predicate dispatch" (http://en.wikipedia.org/wiki/Predicate_dispatch). It is by far the most powerful dispatch method, since you are dispatching based on arbitrary predicate functions.

There's three reasons I chose single-dispatch generic functions rather than the super-powerful predicate dispatch system...

1) Predicate dispatch is slooooooooooooow. Every time you call a function, you need to call the predicate functions to determine the right behavior. This tends to be (at best) linear with regard to the number of extensions. With single-dispatch generic functions, you're relying on the VM's optimizations for prototype lookup, so it tends to be constant time.

There are some optimizations that you can do for predicate dispatch, and I'm sure more techniques will be discovered in the future. If you're creating a new language specifically built around predicate dispatch, you can use various tricks to make it pretty damn fast. But StratifiedJS compiles to JavaScript and uses JS idioms, so that severely limits its ability to optimize.

As an example of that, if your language can determine (at compile-time) the exact generic function that you're extending, it can make it faster:

generic foo;

extend foo(...) { ... }

We can easily tell at compile-time that "extend foo" refers to the "foo" generic function, so we can optimize that, hurray! But here's an example where that *doesn't* work:

generic foo;

var bar = {

foo: foo

};

extend bar.foo(...) { ... }

Because we can't determine at compile-time that "extend bar.foo" is extending the generic function "foo", we lose out on some optimizations. Since SJS's module system is based around objects, that means we *have* to extend the generic function at run-time.

That isn't the end of the world, and there's plenty of other optimizations we can do. But no matter what, without some kind of compile-time information, predicate dispatch is going to be slower than single-dispatch.

Ultimately, the only way to be sure is to add predicate dispatch to SJS and see if it's "fast enough". If it is, then you should use it because it can do everything single dispatch can do, and then some.

If it isn't fast enough, single dispatch is an alternative that is significantly faster while still being more flexible than traditional OOP techniques.

2) Another concern is... consider this:

function isa(x, y) {

return x instanceof y

}

generic qux;

function Foo() {}

extend qux(x:isa(Foo)) { return 1 }

function Bar() {}

Bar.prototype = Object.create(Foo.prototype)

extend qux(x:isa(Bar)) { return 2 }

Now the problem is, when you call "qux", it will check each predicate in the order that you called "extend". So if you call "qux(new Bar())" it will first check "isa(x, Foo)" which is true, and so it returns 1. But we wanted it to return 2!

We could instead make it check them in the reverse-order that they were extended, so it checks "isa(x, Bar)" first, which solves that particular problem... but there are other situations where reverse-order isn't correct either!

We don't really want our behavior to vary depending on the order that you called "extend", we want it to choose the *most specific* predicate that matches. But how will you determine which predicate is "most specific"? You would need some system where a programmer can specify the specifity of a predicate function. Going down that route will obviously add more complexity to the language. Whether you think that extra complexity is worth it or not is up to you.

With single-dispatch, choosing the most specific method is easy, because single-dispatch is equivalent to predicate-dispatch where the *only* predicate is "instanceof". And the system then says that a sub-type is more specific than a super-type. Bam, done.

3) Another concern is... let's say you have two different libraries that both extend the same function:

// library1.sjs

extend foo(x:isStream) { ... }

// library2.sjs

extend foo(x:isStream) { ... }

You will get no error or warning for this. Whichever library is loaded last will "win". You might say, "okay we'll throw an error if you try to extend with the same predicate function". But now what about this?

// library1.sjs

extend foo(x:isStream) { ... }

// library2.sjs

function isMyStream(x) {

return isStream(x)

}

extend foo(x:isMyStream) { ... }

And what about the situations where you have two extensions that only overlap partially?

// library1.sjs

extend foo(x:isStream) { ... }

// library2.sjs

function isMyStream(x) {

if (someCondition) {

return isStream(x)

} else {

return false

}

extend foo(x:isMyStream) { ... }

Now if you call "foo" with a stream, sometimes it will call the first extension, and sometimes the second, based on whether "someCondition" is true or not.

One of the things that I like about single-dispatch is that it doesn't have this problem: if you try to extend a function in a way that overlaps with another extension, you get an error. The only overlap allowed is a data-type inheriting from another data-type (single inheritance), which is easy to reason about.

When you have multiple different parts of your program (e.g. libraries) extending the same function, single-dispatch makes it easier to reason about what's going on (i.e. which behavior will be used when you call the generic function).

Perhaps this isn't a problem in practice... I don't know. I haven't used any languages that use predicate dispatch, so it's possible this is a non-concern.

And even if it *is* a concern, there may be ways for the language to compensate, like having the programmer specify that a dispatch function is a subset of another dispatch function. In which case you're essentially re-implementing object inheritance... but with predicate functions instead of objects. That may actually be a pretty neat idea, I'll have to do some research on that.

pcxunl...@gmail.com

unread,

Apr 23, 2014, 11:27:38 PM4/23/14

to strati...@googlegroups.com, pcxunl...@gmail.com

I read that Scheme article you linked to (great stuff, by the way), and it had a couple neat ideas for implementing predicate dispatch efficiently. One of those ideas is to compare the arguments left-to-right rather than giving all the arguments equal priority. From a theoretical basis this isn't as good, but from a *practical* basis I think it's a great idea.

So I decided to try implementing predicate dispatch, if solely to see just how *much* slower it is than single dispatch... the results shocked me. First, the code:

function __Method__(pattern, behavior) {

this.pattern = pattern

this.behavior = behavior

}

function generic() {

var methods = []

function f() {

top:

for (var i = 0; i < methods.length; ++i) {

var x = methods[i]

var a = x.pattern

if (arguments.length === a.length) {

for (var i2 = 0; i2 < a.length; ++i2) {

if (!a[i2](arguments[i2])) {

continue top

}

return x.behavior.apply(null, arguments)

}

throw new Error("no valid behavior")

}

f.__methods__ = methods

return f

}

function extend(generic, pattern, behavior) {

var a = generic.__methods__

var x = new __Method__(pattern, behavior)

top:

for (var i = 0; i < a.length; ++i) {

var y = a[i]

if (x.pattern.length === y.pattern.length) {

for (var i2 = 0; i2 < x.pattern.length; ++i2) {

var pat1 = x.pattern[i2]

, pat2 = y.pattern[i2]

, sort1 = pat1.__generic_subtype__

, sort2 = pat2.__generic_subtype__

if (sort1 && sort2) {

if (sort1(pat1, pat2)) {

a.splice(i, 0, x)

return

} else if (sort2(pat1, pat2)) {

continue top

}

} else if (pat1 === pat2) {

throw new Error("overlapping predicate functions")

} else {

continue top

}

a.push(x)

}

function isa(x) {

function f(y) {

return y instanceof x

}

f.__type__ = x

f.__generic_subtype__ = function (self, other) {

if (other.__type__) {

return self.__type__.prototype instanceof other.__type__

} else {

return false

}

return f

}

You would use it like this:

var foo = generic()

function Foo() {}

extend(foo, [isa(Foo)], function (x) {

return 1

})

function Bar() {}

Bar.prototype = Object.create(Foo.prototype)

extend(foo, [isa(Bar)], function (x) {

return 2

})

Now, here's the benchmark results...

Plain ol' functions x 43,713,041 ops/sec ±0.24% (99 runs sampled)

Plain ol' methods x 48,136,394 ops/sec ±0.15% (101 runs sampled)

Single dispatch (fast) x 18,407,276 ops/sec ±0.19% (99 runs sampled)

Single dispatch (slow) x 4,002,476 ops/sec ±1.08% (94 runs sampled)

Predicate dispatch x 18,468,231 ops/sec ±0.14% (100 runs sampled)

Crazy! Predicate dispatch is faster than single dispatch?! This absolutely shocked me. Now, let's have the "foo" function vary its behavior based on 2 types rather than 1:

Single dispatch (fast) x 18,865,773 ops/sec ±1.33% (97 runs sampled)

Single dispatch (slow) x 3,924,047 ops/sec ±0.34% (100 runs sampled)

Predicate dispatch x 9,340,151 ops/sec ±0.49% (95 runs sampled)

As expected, single dispatch is constant time, while predicate dispatch is linear time. With dispatching on 5 different types:

Single dispatch (fast) x 18,154,164 ops/sec ±0.27% (97 runs sampled)

Single dispatch (slow) x 4,025,299 ops/sec ±1.46% (95 runs sampled)

Predicate dispatch x 4,698,704 ops/sec ±0.45% (97 runs sampled)

If these results are correct, then predicate dispatch with 5 different extensions is the same speed as single dispatch. This is *very* acceptable speed. This nullifies at least one of my concerns with predicate dispatch. It also raises the question as to why the single dispatch method is so slow. I will have to do more research on this.

pcxunl...@gmail.com

unread,

Apr 24, 2014, 1:38:33 AM4/24/14

to strati...@googlegroups.com, pcxunl...@gmail.com

Alright, I did some digging and found out why single-dispatch is so slow. In V8, these two expressions are SUPER fast:

foo.bar

foo["bar"]

But this expression is drastically slower:

var bar = "bar"

foo[bar]

Also, running the benchmarks with "use strict" caused the "Single dispatch (fast)" test to run a LOT faster. These are the new results:

Plain ol' functions x 51,441,379 ops/sec ±0.24% (96 runs sampled)

Plain ol' methods x 55,661,889 ops/sec ±0.42% (100 runs sampled)

Single dispatch (fast) x 52,136,062 ops/sec ±0.72% (88 runs sampled)

Single dispatch (slow) x 4,523,854 ops/sec ±0.32% (98 runs sampled)

Single dispatch (eval) x 20,890,359 ops/sec ±0.16% (99 runs sampled)

Predicate dispatch x 20,660,212 ops/sec ±0.18% (101 runs sampled)

So, it is exactly as I said: single dispatch is going to be faster than predicate dispatch.

But the story is a bit more nuanced than that... notice the huge disparity between "Single dispatch (fast)" and "Single dispatch (slow)".

If SJS can determine at compile-time all the generic functions and extensions... it would be possible to compile to the same code as "Single dispatch (fast)". Which means generic functions would be *just as fast as normal functions*!

On the other hand, if SJS can't determine that at compile-time, you have two options: "Single dispatch (slow)" will always work, but is ridiculously slow. Or you can use "Single dispatch (eval)" which uses "new Function" to generate an optimized generic function, but that won't always work (e.g. Chrome Extensions don't allow for eval).

I don't think "Single dispatch (slow)" is viable in the slightest. And "Single dispatch (eval)" isn't really much faster than predicate dispatch, so it's not really viable either. So the only two choices are between "Single dispatch (fast)" and predicate dispatch.

It will come down to two things:

1) Is it even possible to determine all uses of generic/extend at compile-time? If not, how many restrictions do we have to add in to make it possible? Are those restrictions acceptable, or are they too limiting for the programmer?

2) Assuming we CAN determine generic functions at compile-time, is it worth it to use single dispatch (twice as fast, less flexible, less powerful), or use predicate dispatch (half as slow, but more powerful)? My gut says predicate dispatch is better, given that it performed vastly better than in my wildest dreams.

Alexander Fritze

unread,

Apr 25, 2014, 8:00:05 AM4/25/14

to strati...@googlegroups.com

Yeah, I get the appeal of generic functions. In our case `each` would
make a good candidate for a generic function.
We could have alternatively used what you call `traverse` - we call it
`consume`. The difference is that in `each`, the downstream is paced
by the upstream (it has "push" semantics), whereas in
`traverse`/`consume` the downstream paces the upstream (it has "pull"
semantics).

If you have a stream S in SJS, this gives you "push" semantics:

S .. @each {
|x|
handle_x(x);
}

And this gives you "pull" semantics:

var eos_token = {};
S .. @consume(eos_token) {
|get_next|
while (1) {
var x = get_next();
if (x === eos_token) break;
handle_x(x);
}
}

We chose the "push" version as default, because for finite streams it
is often more elegant: end-of-stream is implicit in this form. In the
"pull" version, you have to handle end-of-stream manually, as shown
above.

As for `empty` and `push` - I'm curious what you use those for? I can
see that you can build up a sequence library from `empty`, `push`, and
`head`, `tail` - but how does `traverse` go with `empty` and `push`?
Doesn't `traverse` already abstract the process of creating a
sequence? In SJS, we've got a lot of functions that operate on streams
(folds, maps, parallel maps, etc), and `each` is the only axiom they
need. E.g. this is our `transform` function which generates a stream
f(x1), f(x2), ... from x1, x2, ...:

var transform = (seq, f) -> Stream(r -> seq .. each { |x| r(f(x)) })

>> Secondly we are trying to get away from prototype-backed objects in
>> general (note e.g. how the file_interface is implemented above).
>
> That's entirely understandable given how bad OOP is, but I think the problem
> of prototype-backed objects goes away if your language has generic
> functions.

Well, you were proposing tagging the type of object via the prototype.
If we generally don't use prototypes, that is problematic :-)
We want a way of saying that this object:

{
first_name: '...',
last_name: '...'
}

is of type "Person" and this object:

{
street: '...',
city: '...',
country: '...'
}

is of type "Address". And we want to be able to say those things
without setting any prototype other than the standard "Object"
prototype on them.

>> I thus think any implementation of generic functions would have to go
>> hand in hand with a rethink of how we can systematically attach type
>> information to *any* construct - objects, functions, arrays, quasis.
>> Ideally even numbers (although i don't see how that's possible).
>
> Absolutely. What I'm suggesting is not a small little feature that can be
> tacked on. It would need to be an idiomatic thing that programmers use
> regularly, and so it would have to be supported by the entire language and
> base libraries. That is not a small change.
>
> However, I do think that even though it would cause short-term pain to
> migrate the language to generic functions, it may be worth it in the long
> run. And getting them in now while the language is young is a lot easier
> than trying to shove them in later once the language has matured (backwards
> compatibility is hell).
>
> Whether you think that short-term pain is worth it or not, is entirely up to
> you. I completely understand if you think the benefits of generic functions
> are not worth the amount of effort that would be required for the language
> to support them.
>
> P.S. Generic functions work even with strings and numbers, thanks to
> JavaScript doing auto-boxing.

That's very true; I didn't think of boxing!

Yeah, it's not so much the speed I'd worry about but the non-constant scaling.

Hmm, yeah, it seems like predicate dispatch is opening a Pandora's
box. So if we go for generic functions, it looks like it would have to
be done by tagging objects with a single type, and *possibly* simple
subtypes. It's a shame really... because an object might of course be
different things in different contexts :/

Alexander Fritze

unread,

Apr 25, 2014, 8:10:14 AM4/25/14

to strati...@googlegroups.com

Nice comprehensive analysis :-)

I don't think it's possible without significant limitations :/

>
> 2) Assuming we CAN determine generic functions at compile-time, is it worth
> it to use single dispatch (twice as fast, less flexible, less powerful), or
> use predicate dispatch (half as slow, but more powerful)? My gut says
> predicate dispatch is better, given that it performed vastly better than in
> my wildest dreams.

But what about those other concerns about predicate dispatch that you
mentioned (besides performance) - like resolution order or clashes
between modules?

Cheers,
Alex

pcxunl...@gmail.com

unread,

Apr 25, 2014, 11:41:36 PM4/25/14

to strati...@googlegroups.com

On Friday, April 25, 2014 8:00:05 PM UTC+8, Alexander Fritze wrote:

> We could have alternatively used what you call `traverse` - we call it

> `consume`. The difference is that in `each`, the downstream is paced

> by the upstream (it has "push" semantics), whereas in

> `traverse`/`consume` the downstream paces the upstream (it has "pull"

> semantics).

Right, it's like the difference between iterators and event listeners. In one case the caller pulls in the changes, while in the other case the event pushes the changes to the listeners.

> We chose the "push" version as default, because for finite streams it

> is often more elegant: end-of-stream is implicit in this form. In the

> "pull" version, you have to handle end-of-stream manually, as shown

> above.

Yup, that makes sense. In my language, foldl is trivially defined in terms of traverse, and then map/each/len are defined in terms of foldl. So it ends up being reasonably clean and elegant, even with the extra overhead of pull semantics. But the decision made by SJS is totally fine too.

> As for `empty` and `push` - I'm curious what you use those for? I can

> see that you can build up a sequence library from `empty`, `push`, and

> `head`, `tail` - but how does `traverse` go with `empty` and `push`?

> Doesn't `traverse` already abstract the process of creating a

> sequence? In SJS, we've got a lot of functions that operate on streams

> (folds, maps, parallel maps, etc), and `each` is the only axiom they

> need.

"traverse" only handles actually traversing over the list. In my language, if you implement traverse but not empty/push, you can use things like foldl, each, and len. But stuff like map actually returns a new list, so you need empty/push.

If you call map on an object, you get back an object. If you call map on an array, you get back an array. If you call map on a cons cell, you get back a cons cell. If you call map on a stream, you get back a stream, etc.

The model you're talking about is more like in Clojure, where map always returns a sequence (regardless of what the input is), and then you would use "into" to convert the sequence back into a concrete data-type (like an array). That's perfectly valid too! It's just not how I designed my language.

In the case of SJS, you would make "each" a generic function, and define everything else in terms of "each". No need for "empty" or "push".

> Well, you were proposing tagging the type of object via the prototype.

> If we generally don't use prototypes, that is problematic :-)

That's just an implementation detail to make single-dispatch work with JavaScript. My system only needs a way to:

Create new data types.

Create new data types that inherit from existing data types.

Create new generic functions.

Extend generic functions to change their behavior based on the data type of their argument(s).

Whether the implementation uses classes or prototypes or whatever doesn't really matter.

> We want a way of saying that this object:

>

> {

> first_name: '...',

> last_name: '...'

> }

>

> is of type "Person" and this object:

>

> {

> street: '...',

> city: '...',

> country: '...'

> }

>

> is of type "Address". And we want to be able to say those things

> without setting any prototype other than the standard "Object"

> prototype on them.

What's wrong with doing this?

function Person(first, last) {

this.first_name = first

this.last_name = last

}

function Address(street, city, country) {

this.street = street

this.city = city

this.country = country

}

generic foo;

extend foo(x is Person) { ... }

extend foo(x is Address) { ... }

foo(new Person(...))

foo(new Address(...))

You might say, "but now everything only has a single concrete type! An object that is a Person is ONLY a Person, and never an Address!" That is true. But it also doesn't matter, because you can extend any existing generic functions to accept a Person. So if you want a Person to behave like an Address, you can, just by extending generic functions.

The point is that with this way of thinking, you rarely do any type checking: you don't care whether the thing is a Person or an Address or whatever, you just call the generic function "foo". If the generic function accepts the type, it'll work. If not, you get an error. It's just like Python's duck typing, but without the problems of Python's duck typing.

> Yeah, it's not so much the speed I'd worry about but the non-constant scaling.

Thankfully most generic functions would have <5 extensions. Unfortunately, there *will* be generics that end up with more than 5 extensions... I can see math operators like + / * - being especially problematic in that regard.

> Hmm, yeah, it seems like predicate dispatch is opening a Pandora's

> box. So if we go for generic functions, it looks like it would have to

> be done by tagging objects with a single type, and *possibly* simple

> subtypes. It's a shame really... because an object might of course be

> different things in different contexts :/

I believe generic functions (even with single dispatch) perfectly solves the problem of an object meaning different things in different contexts. Could you please give an example of what situation you think is not solved by generic functions?

P.S. You *definitely* want a subtype system, regardless of whether you use single dispatch or predicate dispatch. I discovered that pretty quickly when adding in generic functions to my language.

> I don't think it's possible without significant limitations :/

Yeah that's the rub of it. In my language, we have compile-time macros and solid ways of dealing with compile-time stuff, so making generic functions compile-time-only would be a completely reasonable thing to do.

But SJS builds on top of JavaScript where "do everything at run-time" is the norm, so compile-time generic functions just wouldn't work.

> But what about those other concerns about predicate dispatch that you

> mentioned (besides performance) - like resolution order or clashes

> between modules?

Yeah, those are still concerns. But! My benchmarks showed predicate dispatch being faster than (run-time) single dispatch!

That means you don't have to deal with all these complex problems right now. You can add in a restricted predicate dispatch system where the only predicate allowed is "instanceof". This will have the same (sane) behavior as single dispatch. And then you can *later* add in more predicates in a backwards-compatible way.

Essentially, we can punt on dealing with all this stuff, by restricting the predicate system to behave the same as a single dispatch system.

Of course, that won't play well with SJS's existing isString/isQuasi/isStream system, which is unfortunate. But I'm not convinced yet that type predicates are the best way to deal with these situations: I wonder if isString/isQuasi/isStream are just hacks to work around the lack of generic functions.

So, since I haven't delved too deeply into SJS's libraries, I have to ask: if "each" was a generic function that you could extend for each list-like data type, would you still have implemented isStream? Or was the isStream function added solely so "each" could do type dispatch? Basically, are there situations that are handled by isStream that couldn't be easily handled by generic functions?

Alexander Fritze

unread,

Apr 27, 2014, 5:16:49 PM4/27/14

to strati...@googlegroups.com

On Sat, Apr 26, 2014 at 5:41 AM, <pcxunl...@gmail.com> wrote:
> On Friday, April 25, 2014 8:00:05 PM UTC+8, Alexander Fritze wrote:
>> We could have alternatively used what you call `traverse` - we call it
>> `consume`. The difference is that in `each`, the downstream is paced
>> by the upstream (it has "push" semantics), whereas in
>> `traverse`/`consume` the downstream paces the upstream (it has "pull"
>> semantics).
>
> Right, it's like the difference between iterators and event listeners. In
> one case the caller pulls in the changes, while in the other case the event
> pushes the changes to the listeners.

Yeah, kind of. `each` is a bit different to listeners in that a) it
manages the lifetime of the producer and b) it can pace the producer,
e.g.:

@integers() .. @transform(x->x*x) .. @each { |x| console.log(x); hold(1000); }

This will print squares of the integers, one every second, ad infinum.
integers is just implemented like this:

var integers = -> @Stream(function(receiver) { var i=0; while (true)
receiver(i++) })

>
>> We chose the "push" version as default, because for finite streams it
>> is often more elegant: end-of-stream is implicit in this form. In the
>> "pull" version, you have to handle end-of-stream manually, as shown
>> above.
>
> Yup, that makes sense. In my language, foldl is trivially defined in terms
> of traverse, and then map/each/len are defined in terms of foldl. So it ends
> up being reasonably clean and elegant, even with the extra overhead of pull
> semantics. But the decision made by SJS is totally fine too.

Out of curiosity, does your implementation work with infinite
sequences like @integers? What about streams of events?

Well, as I say in SJS we try to get away from `new X()`.

For starters, sub classing doesn't work with things other than
objects, like arrays, or numbers. E.g. we might want a generic
function that does something different for 'ordinal' integers than for
'cardinal' integers. Or we might want to say that a particular array
is a set. These things are not possible because we cannot (without
hacks) extend the builtin Array and Number classes.

Secondly, it's not very nice to require special syntax for the sole
reason of getting `instanceof` to work. It's too easy for programmers
to forget to write `new`.

Thirdly, the way subclassing is conflated with constructors in JS
feels like a big hack. E.g. how would you subclass your Person class?
Where do you call your Person constructor? You have to do something
like:

var EmployeePrototype = new Person('dummy', 'dummy');
function Employee(name, address) {
Person.apply(this, arguments);
}
Employee.prototype = EmployeePrototype;

Try to explain this to someone new to the language :-)

Finally, prototypes don't remote easily. E.g. in conductance (the web
app stack built on sjs), you can call functions from client to server
and vice versa. Conductance can handle complex types (objects, arrays,
functions), but prototypes are really difficult to marshal in a way
that does what the user expects under all circumstances.

> You might say, "but now everything only has a single concrete type! An
> object that is a Person is ONLY a Person, and never an Address!" That is
> true. But it also doesn't matter, because you can extend any existing
> generic functions to accept a Person. So if you want a Person to behave like
> an Address, you can, just by extending generic functions.
>
> The point is that with this way of thinking, you rarely do any type
> checking: you don't care whether the thing is a Person or an Address or
> whatever, you just call the generic function "foo". If the generic function
> accepts the type, it'll work. If not, you get an error. It's just like
> Python's duck typing, but without the problems of Python's duck typing.
>
>
>> Yeah, it's not so much the speed I'd worry about but the non-constant
>> scaling.
>
> Thankfully most generic functions would have <5 extensions. Unfortunately,
> there *will* be generics that end up with more than 5 extensions... I can
> see math operators like + / * - being especially problematic in that regard.
>
>
>> Hmm, yeah, it seems like predicate dispatch is opening a Pandora's
>> box. So if we go for generic functions, it looks like it would have to
>> be done by tagging objects with a single type, and *possibly* simple
>> subtypes. It's a shame really... because an object might of course be
>> different things in different contexts :/
>
> I believe generic functions (even with single dispatch) perfectly solves the
> problem of an object meaning different things in different contexts. Could
> you please give an example of what situation you think is not solved by
> generic functions?

Well, let's say you have an object that represents an employee record,
like this:

{ name: xxx,
title: xxx
}

Different parts of my program might want to see this object as
different things. E.g. to my business logic module this could be of
type 'Employee', a subtype of 'Person'. To the database module it
would be be database record, e.g. of type 'PlainJsonDBRecord', a
subtype of 'DBRecord'. And to the caching module it might be of type
'Purgeable', a subtype of 'Cacheable'.

So ideally I want a 'traits-like' system that can express that my
object is *all* of these things. A simple subtype hierarchy cannot
capture this.

> P.S. You *definitely* want a subtype system, regardless of whether you use
> single dispatch or predicate dispatch. I discovered that pretty quickly when
> adding in generic functions to my language.
>
>
>> I don't think it's possible without significant limitations :/
>
> Yeah that's the rub of it. In my language, we have compile-time macros and
> solid ways of dealing with compile-time stuff, so making generic functions
> compile-time-only would be a completely reasonable thing to do.
>
> But SJS builds on top of JavaScript where "do everything at run-time" is the
> norm, so compile-time generic functions just wouldn't work.

Yeah :/

>
>> But what about those other concerns about predicate dispatch that you
>> mentioned (besides performance) - like resolution order or clashes
>> between modules?
>
> Yeah, those are still concerns. But! My benchmarks showed predicate dispatch
> being faster than (run-time) single dispatch!
>
> That means you don't have to deal with all these complex problems right now.
> You can add in a restricted predicate dispatch system where the only
> predicate allowed is "instanceof". This will have the same (sane) behavior
> as single dispatch. And then you can *later* add in more predicates in a
> backwards-compatible way.
>
> Essentially, we can punt on dealing with all this stuff, by restricting the
> predicate system to behave the same as a single dispatch system.
>
> Of course, that won't play well with SJS's existing
> isString/isQuasi/isStream system, which is unfortunate. But I'm not
> convinced yet that type predicates are the best way to deal with these
> situations: I wonder if isString/isQuasi/isStream are just hacks to work
> around the lack of generic functions.

Yeah, not sure 'hack' is the right word, but if you have generic
functions coupled with a powerful enough traits/types system (see
Employee/DB/Cache example above), then I'm pretty sure you could get
rid of them.

> So, since I haven't delved too deeply into SJS's libraries, I have to ask:
> if "each" was a generic function that you could extend for each list-like
> data type, would you still have implemented isStream? Or was the isStream
> function added solely so "each" could do type dispatch? Basically, are there
> situations that are handled by isStream that couldn't be easily handled by
> generic functions?

No, if we had generic functions, then `each` definitely wouldn't
require `isStream`!

pcxunl...@gmail.com

unread,

Apr 28, 2014, 5:34:46 AM4/28/14

to strati...@googlegroups.com

> This will print squares of the integers, one every second, ad infinum.

Sure, because SJS has continuations.

> integers is just implemented like this:

>

> var integers = -> @Stream(function(receiver) { var i=0; while (true) receiver(i++) })

Looking through the source, I'm not sure how Stream is implemented... It obviously needs to do some continuation hackery, or it would go into an infinite blocking loop. But I'm not seeing where it actually uses continuations. Are they just implicit in the second argument to "each"?

> Out of curiosity, does your implementation work with infinite

> sequences like @integers? What about streams of events?

I haven't tried, but it should. It's no different from ECMAScript 6/Python iterators that produce an infinite sequence. You could write an infinite sequence like this:

function Infinite(state, f) {

this.state = state

this.next = f

}

extend traverse(x is Infinite) {

return {

next: function () {

return { done: false, value: x.next(x.state) }

}

function integers() {

return new Infinite([0], function (state) {

return state[0]++

})

}

Not as elegant as in SJS, for sure! Though it'd be a lot nicer if all iterators/generators were automatically traversable. Then you could write "integers" like this:

function* integers() {

var i = 0;

while (true) {

yield i++

}

And bam it would just work. That is possible in my system, though it's not very clean...

extend traverse(x) {

if (Symbol.iterator in x) {

return x[Symbol.iterator]()

} else {

throw new TypeError()

}

With predicate dispatch, it becomes very clean:

function isIterator(x) {

return Symbol.iterator in x

}

extend traverse(x:isIterator) {

return x[Symbol.iterator]()

}

I'm not sure what you mean about "streams of events".

> Well, as I say in SJS we try to get away from `new X()`.

>

> For starters, sub classing doesn't work with things other than

> objects, like arrays, or numbers. E.g. we might want a generic

> function that does something different for 'ordinal' integers than for

> 'cardinal' integers. Or we might want to say that a particular array

> is a set. These things are not possible because we cannot (without

> hacks) extend the builtin Array and Number classes.

Yeah, I admit JavaScript's object system isn't great.

> Secondly, it's not very nice to require special syntax for the sole

> reason of getting `instanceof` to work. It's too easy for programmers

> to forget to write `new`.

Sure. My language *requires* the use of "new", and it's idiomatic to write a helper function that creates the object. As an example, you'd have a type "Foo" which must be called with "new", and a function "foo" which calls "new Foo", perhaps with some default state or whatever. It'd look like this:

(type Foo state)

(def foo ->

(new Foo 1))

(foo)

But that's different from what JavaScript (and SJS) do. The above is roughly equivalent to this, in JavaScript:

function Foo(state) {

this.state = state

}

function foo() {

return new Foo(1)

}

foo()

> Thirdly, the way subclassing is conflated with constructors in JS

> feels like a big hack. E.g. how would you subclass your Person class?

> Where do you call your Person constructor? You have to do something

> like:

>

> var EmployeePrototype = new Person('dummy', 'dummy');

> function Employee(name, address) {

> Person.apply(this, arguments);

> }

> Employee.prototype = EmployeePrototype;

>

> Try to explain this to someone new to the language :-)

Absolutely. I'll be the first to admit JavaScript's object model is bad. But SJS can fix that. CoffeeScript has "class" syntax which handles the fine minutae of creating/subclassing objects. And in fact, ECMAScript 6 will have "class" syntax too.

P.S. You'd actually do it like this:

function Employee(name, address) {

Person.apply(this, arguments)

}

Employee.prototype = Object.create(Person.prototype)

Employee.prototype.constructor = Employee

Which just proves your point that subclassing is non-trivial in vanilla JS.

> Finally, prototypes don't remote easily. E.g. in conductance (the web

> app stack built on sjs), you can call functions from client to server

> and vice versa. Conductance can handle complex types (objects, arrays,

> functions), but prototypes are really difficult to marshal in a way

> that does what the user expects under all circumstances.

How does it handle functions? And why doesn't its strategy for functions work for types/constructors?

> Well, let's say you have an object that represents an employee record,

> like this:

>

> { name: xxx,

> title: xxx

> }

>

> Different parts of my program might want to see this object as

> different things. E.g. to my business logic module this could be of

> type 'Employee', a subtype of 'Person'. To the database module it

> would be be database record, e.g. of type 'PlainJsonDBRecord', a

> subtype of 'DBRecord'. And to the caching module it might be of type

> 'Purgeable', a subtype of 'Cacheable'.

Why would different parts of your program even care what type it is? The database module would have a generic function called "marshal". Anything that can be marshalled to the format the database expects would extend the "marshal" function. The caching module would have a generic function called "cache". Anything that can be cached would extend the "cache" generic function.

So you'd just extend the marshal and cache generic functions so they recognize a Person. And then an Employee will automatically work, because it's a subtype of Person. And if an Employee needs different marshal/cache behavior than a Person, you can extend marshal/cache again, this time for Employees.

You wouldn't have a "PlainJsonDBRecord" type, or a "Cacheable" type or whatever. There's no need. The database/cache modules don't care what type it is: they just call the marshal/cache functions. If the marshal/cache functions have been properly extended, it will work. If they haven't been extended, you get an error.

Basically, types are just about bundling up state. All the behavior goes into generic functions. "PlainJsonDBRecord" doesn't bundle up state. "Cacheable" doesn't bundle up state. They describe *behavior*. The fact you called it "Cacheable" in the first place proves that. But with generic functions, you wouldn't need a "Cacheable" type at all! The behavior of caching would be put into a generic function, rather than a type.

> So ideally I want a 'traits-like' system that can express that my

> object is *all* of these things. A simple subtype hierarchy cannot

> capture this.

Yes, a type hierarchy cannot express that. But generic functions *can* express all those things. You just have to think of it in terms of behavior (generic functions) and not in terms of types (or traits, or whatever).

> Yeah, not sure 'hack' is the right word

To me, "hack" implies that it's a non-ideal solution that nonetheless works for now. In other words, something that immediately solves the problem, but in a way that will end up causing problems later on. The way SJS implements "each", the different cases are hardcoded: the only way to add new behavior to "each" is to change the source code of "each". I think "hack" is a pretty good way to describe that. But I'm open for a better word to describe that situation.

> If you have generic functions coupled with a powerful enough traits/types

> system (see Employee/DB/Cache example above), then I'm pretty sure you could get

> rid of them.

I'm not convinced you need a powerful traits/types system. In my view, types are just a way to conveniently extend generic functions. That's all. Types don't need to be powerful, because the language revolves around generic functions, not types.

Basically, you don't care about whether something is of type Duck, you only care that it *quacks* like a duck, *walks* like a duck, and *swims* like a duck. To put it more concretely, you don't care what type something is, you only care that it works with the "quack", "walk", and "swim" generic functions. *Any* type that wants to behave duck-like can extend those functions. No need for a powerful type system, no need for type checking.

> No, if we had generic functions, then `each` definitely wouldn't

> require `isStream`!

Sure, but are there any *other* functions (besides each) that need isStream? I'm just trying to figure out whether SJS could completely get rid of the isFoo functions and replace them with generic functions. If not, that would be a sign that generic functions are weak.

Alexander Fritze

unread,

Apr 28, 2014, 8:47:36 AM4/28/14

to strati...@googlegroups.com

On Mon, Apr 28, 2014 at 11:34 AM, <pcxunl...@gmail.com> wrote:
>> This will print squares of the integers, one every second, ad infinum.
>
> Sure, because SJS has continuations.
>
>
>> integers is just implemented like this:
>>
>> var integers = -> @Stream(function(receiver) { var i=0; while (true)
>> receiver(i++) })
>
> Looking through the source, I'm not sure how Stream is implemented... It
> obviously needs to do some continuation hackery, or it would go into an
> infinite blocking loop. But I'm not seeing where it actually uses
> continuations. Are they just implicit in the second argument to "each"?

No, @Stream doesn't do anything besides tagging its argument as being
of 'stream' type - it doesn't play any part in how data gets moved
around.

There is no real magic; streams just takes advantage of the fact that
SJS code is allowed to block (unlike JS).

`integers` is basically nothing more than the function:

function integers(receiver) {
var i=0;
while (true)
receiver(i++);
}

So this function expects an argument `receiver` which itself is a
function. If you e.g. have:

function logNumberAndWaitOneSecond(x) {
console.log(x);
hold(1000);
}

Then you can iterate over integers like this:

integers(logNumberAndWaitOneSecond);

The function `each` just abstracts this by one level (with the
intention that we then can also call `logNumberAndWaitOneSecond` with
other things, like arrays).
For streaming functions (like `integers`), `each` just does this:

function each(src, dest) {
src(dest);
}

So the code

each(integers, logNumberAndWaitOneSecod);

or written in SJS-specific syntax:

integers .. each(logNumberAndWaitOnSecond)

or

integers .. each { |x| logNumberAndsWaitOneSecond(x) }

or

integers .. each { |x| console.log(x); hold(1000) }

is just equivalent to

integers(logNumberAndWaitOneSecond);

For arrays `each` would do:

function each(src, dest) {
for (var i=0; i<src.length; ++i)
dest(src[i]);
}

Then we can do:

[1,2,3,4] .. each { |x| console.log(x); hold(1000); }

So as you can see `each` makes an excellent candidate for a generic function :-)

>
>> Out of curiosity, does your implementation work with infinite
>> sequences like @integers? What about streams of events?
>
> I haven't tried, but it should. It's no different from ECMAScript 6/Python
> iterators that produce an infinite sequence. You could write an infinite
> sequence like this:
>
> function Infinite(state, f) {
> this.state = state
> this.next = f
> }
>
> extend traverse(x is Infinite) {
> return {
> next: function () {
> return { done: false, value: x.next(x.state) }
> }
> }
> }
>
> function integers() {
> return new Infinite([0], function (state) {
> return state[0]++
> })
> }
>
> Not as elegant as in SJS, for sure! Though it'd be a lot nicer if all
> iterators/generators were automatically traversable. Then you could write
> "integers" like this:
>
> function* integers() {
> var i = 0;
> while (true) {
> yield i++
> }
> }

Just as an aside, one of the reasons why I think SJS is more powerful
than generators, is because of it allows you to construct abstractions
that are more "compositional". E.g. we can have a stream of the lines
of a file:

function lines(file) {
return Stream(function(receiver) {
var handle = fopen(file);
try {
while (!is_eos(handle))
receiver(read_line(handle));
}
finally {
close(handle);
}
}

Now we can use this e.g. like this:

lines('foo.txt') .. transform(line -> line.length) .. each { |length|
if (length > 80) {
console.log('warning, line width exceeded');
break;
}
}

So here, if we hit the length>80 condition, the `break` will bail out
of the {|| ... } blocklamba, which will in turn bail out of the
`receiver(read_line(.))` statement. This causes the `finally` handler
to be run.
So the file handle will always be automatically be cleaned up.

I haven't found a generators construction yet which can do something
similar... you always have to clean up stuff manually (like call
`close` somewhere).

> And bam it would just work. That is possible in my system, though it's not
> very clean...
>
> extend traverse(x) {
> if (Symbol.iterator in x) {
> return x[Symbol.iterator]()
> } else {
> throw new TypeError()
> }
> }
>
> With predicate dispatch, it becomes very clean:
>
> function isIterator(x) {
> return Symbol.iterator in x
> }
>
> extend traverse(x:isIterator) {
> return x[Symbol.iterator]()
> }
>
> I'm not sure what you mean about "streams of events".

I mean e.g. a stream of 'click' events on a dom element:

dom_elem .. events('click') .. each { |ev|
console.log('you clicked '+dom_elem);

If you remote a server function to the client, then, when the client
calls it, it will be executed on the server. (And vice versa).
We could remote prototypes, but because the underlying constructors
wouldn't be remoted, `instanceOf` will not work on the other side.

>
>> Well, let's say you have an object that represents an employee record,
>> like this:
>>
>> { name: xxx,
>> title: xxx
>> }
>>
>> Different parts of my program might want to see this object as
>> different things. E.g. to my business logic module this could be of
>> type 'Employee', a subtype of 'Person'. To the database module it
>> would be be database record, e.g. of type 'PlainJsonDBRecord', a
>> subtype of 'DBRecord'. And to the caching module it might be of type
>> 'Purgeable', a subtype of 'Cacheable'.
>
> Why would different parts of your program even care what type it is? The
> database module would have a generic function called "marshal". Anything
> that can be marshalled to the format the database expects would extend the
> "marshal" function. The caching module would have a generic function called
> "cache". Anything that can be cached would extend the "cache" generic
> function.
>
> So you'd just extend the marshal and cache generic functions so they
> recognize a Person.

Right, but it might be non-trivial to write a Person-method for the
generic `cache` function. E.g. let's say my `cache` function allows me
to enter two things into the cache: 'purgeable' items, and items that
should always stay in the cache whatever happens. My methods might
look like this:

generic cache;

extend cache(item /* non purgeable;default */, theCache) {
theCache.add({data: item, purgeable: false});
}

extend cache(item is purgeable, theCache) {
var size = calculateSize(item);

var priority;
// ... 100 lines of complicated logic to decide purging priority ...

theCache.add({data: item, purgeable: true, size: size, priority: priority});
}

So if I now want to make cache "aware" of 'Persons', and I want
Persons to be purgeable, then I have to copy the whole implementation
of `cache(item is purgeable, theCache)`. And because other things that
the `cache` method calls might be generic too, I'll have to extend
those too (e.g. `calculateSize`):

extend cache(item is Person, theCache) {
... copy code from `extend cache(item is purgeable, theCache)` here ...
}

extend calculateSize(item is Person) {
...
}

It looks like it would be much easier if we could just say that Person
is also purgeable.

Yes, I agree :-)
But note the `cache` example above... if you just have a simple
type/subtype system, then you might need to delve into the `each`
implementation anyhow. As far as I can see, only a 'traits' system
would solve this.

>
>> If you have generic functions coupled with a powerful enough traits/types
>> system (see Employee/DB/Cache example above), then I'm pretty sure you
>> could get
>> rid of them.
>
> I'm not convinced you need a powerful traits/types system. In my view, types
> are just a way to conveniently extend generic functions. That's all. Types
> don't need to be powerful, because the language revolves around generic
> functions, not types.
>
> Basically, you don't care about whether something is of type Duck, you only
> care that it *quacks* like a duck, *walks* like a duck, and *swims* like a
> duck. To put it more concretely, you don't care what type something is, you
> only care that it works with the "quack", "walk", and "swim" generic
> functions. *Any* type that wants to behave duck-like can extend those
> functions. No need for a powerful type system, no need for type checking.

Right, agreed - I only worry that it might be difficult to extend the
generic functions, and that often it would be easier to say to the
generic function, 'hey, i am a duck' and to another, 'hey, i have
feathers'.

>
>> No, if we had generic functions, then `each` definitely wouldn't
>> require `isStream`!
>
> Sure, but are there any *other* functions (besides each) that need isStream?
> I'm just trying to figure out whether SJS could completely get rid of the
> isFoo functions and replace them with generic functions. If not, that would
> be a sign that generic functions are weak.

As far as I can see, you could get rid of isFoo given generic functions :-)

pcxunl...@gmail.com

unread,

Apr 29, 2014, 4:47:34 AM4/29/14

to strati...@googlegroups.com

> There is no real magic; streams just takes advantage of the fact that

> SJS code is allowed to block (unlike JS).

Ah, I see, so the continuations are implicit in the second argument to "each".

> Just as an aside, one of the reasons why I think SJS is more powerful

> than generators, is because of it allows you to construct abstractions

> that are more "compositional". E.g. we can have a stream of the lines

> of a file:

Generators can do that too, I believe:

function* lines(file) {

var handle = fopen(file);

try {

while (!is_eos(handle))

yield read_line(handle);

} finally {

close(handle);

}

for (let line of lines('foo.txt')) {

let length = line.length;

if (length > 80) {

console.log('warning, line width exceeded');

break;

}

I haven't tested the above, but I think it should work.

By the way, I'm NOT trying to defend generators! I think SJS's system is much better, in part because of continuations, and in part because it's created using the basic building blocks of the language rather than requiring a new generator/iterator system. I'm just pointing out that generators should be able to handle similar use cases, so the difference isn't necessarily about power, but about elegance.

> I mean e.g. a stream of 'click' events on a dom element:

Ahhh, I see, you're talking about FRP signals. In my language I tried to get signals to use the same map/each/traverse/etc. system that arrays use... but I found that it ended up interacting weirdly, so I decided to make signals/events into a separate data type with a separate API. Perhaps SJS's "each" works better.

> If you remote a server function to the client, then, when the client

> calls it, it will be executed on the server. (And vice versa).

> We could remote prototypes, but because the underlying constructors

> wouldn't be remoted, `instanceOf` will not work on the other side.

I see! So the function is basically just a tiny wrapper that calls the remote function and blocks until it's done. I can see why that wouldn't work well with prototypes. You'd essentially need to use some kind of unique tag (like UUID) rather than instanceof...

> Right, but it might be non-trivial to write a Person-method for the

> generic `cache` function. E.g. let's say my `cache` function allows me

> to enter two things into the cache: 'purgeable' items, and items that

> should always stay in the cache whatever happens. My methods might

> look like this:

I would actually consider those two separate caches. So you'd have a "purgeable cache" module and a "permanent cache" module. What's the benefit of having a single cache that contains both purgable and non-purgable items?

> So if I now want to make cache "aware" of 'Persons', and I want

> Persons to be purgeable, then I have to copy the whole implementation

> of `cache(item is purgeable, theCache)`.

Why would you have 100 lines of complicated logic in the "cache" generic function? That should be put into a helper function. Then when extending "cache" to work with Person, you would call the helper function.

> And because other things that the `cache` method calls might be generic too, I'll have to extend

> those too (e.g. `calculateSize`):

Yes it is possible that there are more than one axioms that you would have to extend (in this case, cache and calculateSize). I don't think that's any worse than a trait system, though.

> It looks like it would be much easier if we could just say that Person

> is also purgeable.

And if you "just say" that a Person is purgable, how would you specify what it means to be purgable? If purgability can apply to any object, then you can just extend cache and calculateSize so they work on Objects. If it can't be applied to any object, that implies that there are different ways of purging things, which implies you'll have to write code that applies to that specific use case. That's true regardless of whether you use traits or generic functions or whatever.

> As far as I can see, only a 'traits' system would solve this.

>

> Right, agreed - I only worry that it might be difficult to extend the

> generic functions, and that often it would be easier to say to the

> generic function, 'hey, i am a duck' and to another, 'hey, i have

> feathers'.

I think that can be greatly alleviated by carefully choosing good axioms. Ideally, in order to say "I am a duck" you would only have to specify the bare minimum of what it means to be a duck.

However, you do bring up a good point in that *even* with carefully chosen axioms, you can still end up with code duplication.

e.g. a Person, Employee, and Car might have similar cache/purging behavior, but because a Car doesn't inherit from a Person, they have no choice but to copy the behavior.

This is greatly helped by putting the common behavior into helper functions... but that feels like a bit of a hack. So I guess you're right: traits CAN be better than generic functions + single inheritance.

Before adding in generic functions, I actually had another idea, which is more powerful... the idea was that rather than using strings as keys (like JavaScript does), you would instead use "unique keys". A "unique key" is something which is guaranteed to be unique: it can never collide with other unique keys, and cannot collide with strings either.

ECMAScript 6 will have unique keys, which it calls "Symbols". In any case, here's how it works:

var bar = Symbol()

var foo = {}

foo[bar] = 5

We used "Symbol" to create a new unique key, which we then put into the variable "bar". We then slapped the unique key into the object "foo", giving it a value of 5. Now if we use "foo[bar]" we get back 5. So why not just use "foo.bar" instead? Well, the difference is that unique keys are guaranteed to never collide with anything else ever.

This is an alternate way of solving the Python problem of "we have different methods with the same name that mean different things". Each method would be a different unique key (rather than a string), so there's no more collision.

So, using the cache example... the cache module would export a unique key called "cacheable". Any object with the "cacheable" key can specify custom cache behavior:

@ = require("path/to/cache")

var foo = {}

foo[@cacheable] = function () {

...

}

This is analogous to extending the "cache" generic function, except we're specifying custom behavior on a particular object, rather than on a type. Of course, if you want it to apply to an entire type, you can do that too:

function Person() {}

Person.prototype[@cacheable] = function () {

...

}

And now all Persons are cacheable. And let's say you wanted a Car to be cachable using the same caching strategy as a Person...

function Car() {}

Car.prototype[@cacheable] = Person.prototype[@cacheable]

Bam, now a Car has the same cacheability as a Person, without needing to inherit from Person. This is a very flexible and powerful system. It feels a lot like traits/mixins, without the complexity. It's also very fast too.

The reason I decided to go with generic functions rather than unique keys is that I felt unique keys were too verbose/clunky to use. I really wanted to use *functions* rather than *methods*. But maybe unique keys would be a better fit for SJS rather than generic functions. It doesn't hurt that they're official in ECMAScript 6, whereas generic functions aren't.

> As far as I can see, you could get rid of isFoo given generic functions :-)

Okay, great!

Alexander Fritze

unread,

Apr 29, 2014, 6:10:05 AM4/29/14

to strati...@googlegroups.com

On Tue, Apr 29, 2014 at 10:47 AM, <pcxunl...@gmail.com> wrote:
>> There is no real magic; streams just takes advantage of the fact that
>> SJS code is allowed to block (unlike JS).
>
> Ah, I see, so the continuations are implicit in the second argument to
> "each".
>
>
>> Just as an aside, one of the reasons why I think SJS is more powerful
>> than generators, is because of it allows you to construct abstractions
>> that are more "compositional". E.g. we can have a stream of the lines
>> of a file:
>
> Generators can do that too, I believe:
>
> function* lines(file) {
> var handle = fopen(file);
> try {
> while (!is_eos(handle))
> yield read_line(handle);
> } finally {
> close(handle);
> }
> }
>
> for (let line of lines('foo.txt')) {
> let length = line.length;
> if (length > 80) {
> console.log('warning, line width exceeded');
> break;
> }
> }
>
> I haven't tested the above, but I think it should work.

You would think so, but no, it doesn't.

E.g. try this in a recent version of firefox:

<html>
<head>
<script>
function* foo() {
try {
for (var i=0; i<10; ++i)
yield i;
}
finally {
console.log('generator cleaned up');
}
}

for (var i in foo()) {
console.log(i.value);
if (i.value == 5) break;
}

console.log('done');
</script>
</head>
</html>

The finally block will never be called!

> By the way, I'm NOT trying to defend generators! I think SJS's system is
> much better, in part because of continuations, and in part because it's
> created using the basic building blocks of the language rather than
> requiring a new generator/iterator system. I'm just pointing out that
> generators should be able to handle similar use cases, so the difference
> isn't necessarily about power, but about elegance.

No, I think it is actually about expressive power, not only elegance -
see example above :-)

>
>> I mean e.g. a stream of 'click' events on a dom element:
>
> Ahhh, I see, you're talking about FRP signals. In my language I tried to get
> signals to use the same map/each/traverse/etc. system that arrays use... but
> I found that it ended up interacting weirdly, so I decided to make
> signals/events into a separate data type with a separate API. Perhaps SJS's
> "each" works better.

Yeah, it seems to work quite well with both events/behaviours/signals
and list-like streams. The key to make it work is the implicit
continuation & retraction handling (the latter is what makes the
cleanup automatic; the thing that iterators can't do).

>
>> If you remote a server function to the client, then, when the client
>> calls it, it will be executed on the server. (And vice versa).
>> We could remote prototypes, but because the underlying constructors
>> wouldn't be remoted, `instanceOf` will not work on the other side.
>
> I see! So the function is basically just a tiny wrapper that calls the
> remote function and blocks until it's done. I can see why that wouldn't work
> well with prototypes. You'd essentially need to use some kind of unique tag
> (like UUID) rather than instanceof...

Exactly (and this is what e.g. @Stream / @isStream uses)!

>
>> Right, but it might be non-trivial to write a Person-method for the
>> generic `cache` function. E.g. let's say my `cache` function allows me
>> to enter two things into the cache: 'purgeable' items, and items that
>> should always stay in the cache whatever happens. My methods might
>> look like this:
>
> I would actually consider those two separate caches. So you'd have a
> "purgeable cache" module and a "permanent cache" module. What's the benefit
> of having a single cache that contains both purgable and non-purgable items?

Yeah, it's a really bad made-up example. I'll try to come up with
something better :-)

>
>> So if I now want to make cache "aware" of 'Persons', and I want
>> Persons to be purgeable, then I have to copy the whole implementation
>> of `cache(item is purgeable, theCache)`.
>
> Why would you have 100 lines of complicated logic in the "cache" generic
> function? That should be put into a helper function. Then when extending
> "cache" to work with Person, you would call the helper function.

Yeah, true.

>
>> And because other things that the `cache` method calls might be generic
>> too, I'll have to extend
>> those too (e.g. `calculateSize`):
>
> Yes it is possible that there are more than one axioms that you would have
> to extend (in this case, cache and calculateSize). I don't think that's any
> worse than a trait system, though.

Hmm, maybe not. I have to think about this for a bit.

>
>> It looks like it would be much easier if we could just say that Person
>> is also purgeable.
>
> And if you "just say" that a Person is purgable, how would you specify what
> it means to be purgable? If purgability can apply to any object, then you
> can just extend cache and calculateSize so they work on Objects. If it can't
> be applied to any object, that implies that there are different ways of
> purging things, which implies you'll have to write code that applies to that
> specific use case. That's true regardless of whether you use traits or
> generic functions or whatever.

Well, 'purgability' could be a business-logic related thing. I might
want to say that Person-records are so infrequently accessed that they
should be purgable. But as I say, it's a really bad made-up example :/

>
>> As far as I can see, only a 'traits' system would solve this.
>>
>> Right, agreed - I only worry that it might be difficult to extend the
>> generic functions, and that often it would be easier to say to the
>> generic function, 'hey, i am a duck' and to another, 'hey, i have
>> feathers'.
>
> I think that can be greatly alleviated by carefully choosing good axioms.
> Ideally, in order to say "I am a duck" you would only have to specify the
> bare minimum of what it means to be a duck.

Yeah, that's true, but that relies on the author of a module to have a
lot of foresight. I'm just trying to play through in my mind which
type/traits system hits the sweet spot in terms of
design/coding/understanding complexity.

We do use something like this already in SJS to add marshaling
information to objects
(https://conductance.io/reference#mho:rpc/bridge::setMarshallingDescriptor).
I like your generic function idea much better though :-)

>
>> As far as I can see, you could get rid of isFoo given generic functions
>> :-)
>
> Okay, great!
>

pcxunl...@gmail.com

unread,

Apr 29, 2014, 8:51:48 AM4/29/14

to strati...@googlegroups.com

> E.g. try this in a recent version of firefox:

Fascinating, I did not know that. I tried it in Chrome and got the same result. So, it seems generators are weak!

> Yeah, it seems to work quite well with both events/behaviours/signals

> and list-like streams.

In light of that, I'm going to have to rethink how I do signals.

> But as I say, it's a really bad made-up example :/

It's really hard to come up with good examples. The best examples are always real code, but since we're talking about a theoretical construct that hasn't been implemented yet, we don't have any examples of real code.

> Yeah, that's true, but that relies on the author of a module to have a lot of foresight.

I would argue that *no* system automagically gives you good design. Good design is hard, period. Plus, like any system, it will grow and evolve over time. Things will be made generic functions that previously did not need to be. Things that used to be generic functions will no longer be. APIs will change. That's fine, and normal.

> I'm just trying to play through in my mind which

> type/traits system hits the sweet spot in terms of

> design/coding/understanding complexity.

Indeed, I'm struggling to come up with a good system as well. I thought generic functions would work, and I still think they would work well, but this discussion has shown me that even predicate dispatch isn't necessarily the end-all-be-all. Perhaps I should look more into traits.

By the way, something that's somewhat relevant is the idea of Contracts: http://disnetdev.com/contracts.coffee/

Depending on how you use it, it can be used as a hybrid between Python's loose duck typing and generic function's strictness.

> We do use something like this already in SJS to add marshaling information to objects

Ah yes I see, the good ol' __oni_ property name. Unique keys would let you do the same thing but with *no* collisions as opposed to merely making collisions unlikely. Though from a practical stand-point the __oni_ prefix is Good Enough.

The important thing about my idea was to structure the entire language around unique keys: no single inheritance or types or anything like that. Whether an object "is-a" duck is determined by what unique keys it has. Like I said, though, it ended up being a bit too verbose for my tastes, but I'm still interested in researching it. Maybe there's still some good ideas in that direction.

Hmm... come to think of it... what if a "type" was a unique key? The mere existence of the key on an object would qualify it as belonging to that type. Something like this:

function Type(parent) {

function f(o) {

if (parent != null) {

o[parent.__type__] = true

}

o[f.__type__] = true

}

f.__type__ = Symbol()

f.__parent__ = parent

return f

}

var Person = Type()

var Employee = Type(Person)

var foo = Employee({})

Now the object "foo" is both a Person and an Employee. The above is equivalent to this:

var Person = Symbol()

var Employee = Symbol()

var foo = {}

foo[Person] = true

foo[Employee] = true

And the generic function/extend system wouldn't use instanceof, but would instead use these unique keys:

extend cache(x is Person) {

...

}

extend cache(x is Employee) {

...

}

Since an object can have an arbitrary number of these keys, we aren't limited to just single-inheritance. This isn't very fleshed out, I'll need to think about this more...

pcxunl...@gmail.com

unread,

May 2, 2014, 5:31:30 AM5/2/14

to strati...@googlegroups.com, pcxunl...@gmail.com

After thinking about it a bit more, I came up with the idea of using unique keys as an interface:

var Person = new Interface([], {

age: Number

})

var Company = new Interface([], {

})

var Employee = new Interface([Person], {

company: Company

})

var fooInc = {}

fooInc[Company.name] = "Foo Incorporated"

var foo = {}

foo[Person.name] = "Foo"

foo[Person.age] = 50

// foo is an employee at fooInc...

foo[Employee.company] = fooInc

// ...but they're also a self-employed freelancer in their spare time

foo[Company.name] = "Foo Freelance"

So, what's going on here is... an interface specifies what keys an object must have, and what type those keys must be. So as an example, the above says that a Person is an object that has a "name" key which must be a String, and an "age" key which must be a Number.

The twist is that each key is a *unique key*, meaning "name" isn't the string "name", but is instead the unique key "Person.name". Basically, the above is equivalent to this...

var Person = {

age: Symbol()

}

var Company = {

}

var Employee = {

company: Symbol()

}

...except that Interface verifies that an object has all of the appropriate keys, and they also does type-checking.

So, this is a lot like structural typing. But because it uses unique keys, collisions do not happen: Person and Company both specify a "name" key, so ordinarily you couldn't have an object be both a Person and a Company at the same time, but because they're unique keys, you can!

It also prevents false positives: it's impossible for a random object to *accidentally* be treated as a Person or a Company or whatever. Because they're unique keys, you have to *intentionally* implement the Person/Company/whatever interface. This system gives a similar flavor to structural typing while being superior to structural typing.

Also, you can specify that an interface *inherits* from another interface, which simply means that an object must fulfill *both* interfaces. As an example, an Employee inherits from Person, which means an object is only treated as an Employee if it has the Employee.company, Person.name, and Person.age keys.

Of course an interface can inherit from multiple interfaces. As an added bonus... because we're using unique keys, I'm pretty sure that the diamond problem does not occur.

Now that we have these interfaces, generic functions can use them to dispatch:

generic walk(x:Person) {

...

}

When you call the "walk" function, it will verify that "x" is an object that fulfills the Person interface. In other words, that "x" has a Person.name key which is a String, and a Person.age key which is a Number.

This idea is pretty crazy, and I'm not actually suggesting it for SJS, but maybe it will be of use to somebody somewhere.

pcxunl...@gmail.com

unread,

May 4, 2014, 4:41:40 AM5/4/14

to strati...@googlegroups.com, pcxunl...@gmail.com

I've spent the past week trying to come up with a powerful type system.

Ideally we would want our types to be something like mathematical sets: a number is the set of all values that are like this... an integer is the set of all numbers that are like this... etc.

This can be achieved through contracts:

contract isNumber(x) {

return typeof x === "number"

}

contract isInteger(x) subset isNumber {

return Math.round(x) === x

}

A "contract" is like a function that takes in a single argument and returns true/false if the argument matches. And a contract can also be a *subset* of another contract, which means both contracts must be true: something is an integer if and only if it is also a number.

This works out really well, until we add in generic functions. Consider these two contracts:

contract isLowerThan5(x) subset isNumber {

return x < 5

}

contract isGreaterThan0(x) subset isNumber {

return x > 0

}

And now we try using the contracts to extend a generic function...

generic foo;

extend foo(x:isLowerThan5) { ... }

extend foo(x:isGreaterThan0) { ... }

Now, if we call "foo" with 10 it will work fine, and if we call "foo" with -10 it will work fine... but what if we call "foo" with 3? Now *both* the isLowerThan5 and isGreaterThan0 contracts apply, and so we have to choose between them.

Unfortunately, since there's no obvious way to choose, we end up using whichever one was extended first, which in this case means isLowerThan5. Having the behavior change depending on the order that the generic function was extended is bad: we don't want that.

In order to prevent this, if we compare a contract to another contract, it must be in one of four states:

1) they are the same contract

2) one contract is a subset of the other contract

3) one contract is a superset of the other contract

4) they are disjoint (that is, they do not overlap in any way)

The problem with isLowerThan5 and isGreaterThan0 is that they are not in any of these states: they are not the same, neither is a subset/superset of the other, and they are not disjoint (because they overlap).

Since we cannot reliably tell when two extensions partially overlap, this means the behavior will change depending on the extension order, which is a huge problem for predicate dispatch!

The contracts that we created up above are a powerful form of structural typing: we don't care whether something is-a Integer, we only care that it happens to have the properties of an integer. But structural typing always allows for partial overlap.

Thinking about it further, OOP languages also have the same problem, so how do they solve it? Answer: nominal typing. So in order to prevent partial overlap, we have to change the contract system to use nominal typing instead of structural typing.

A nominal typed contract system might look like this:

contract Number(x) {

return typeof x === "number"

}

contract Integer(x) subset Number {

return Math.round(x) === x

}

contract LowerThan5(x) subset Number {

return x < 5

}

contract GreaterThan0(x) subset Number {

return x > 0

}

Wait a second, this looks exactly the same! All we did is rename "isFoo" to "Foo". That's right, but there's an extra twist: a contract is no longer a function that returns true/false when given a value. Instead, a contract is a function that takes a single argument and *wraps* it.

In other words, rather than saying this:

extend foo(x:isNumber) { ... }

foo(5)

You would instead say this:

extend foo(x:Number) { ... }

foo(Number(5))

We're taking values and wrapping them in contracts. Since a value can only be wrapped by a single contract at any given time, you cannot have a value which fulfills both the LowerThan5 and GreaterThan0 contracts, and so the problem of partial overlaps is solved!

And of course you can *coerce* from one contract to another:

var bar = LowerThan5(3)

var qux = GreaterThan0(bar)

We took the value LowerThan5(3) and coerced it into GreaterThan0(3). So rather than having both LowerThan5 and GreaterThan0 applying at the same time, we have to choose which contract will apply to the value. This is nominal typing rather than structural typing. And so it solves the problem of partial overlap.

What are the downsides? Well, firstly, a value can only fulfill a single contract at once. But that's not actually a problem, since contracts can be subsets and supersets of other contracts.

The other downside is that you have to manually wrap values. That is a pain, yeah, but that's the price you have to pay if you want generic functions to behave sanely.

P.S. As an optimization, we can verify that the value fulfills the contract at the time we wrap it. And then, generic functions would just dispatch based on the wrapper. This should be faster than full predicate dispatch while still having most of the power of predicate dispatch.

The downside is that a mutable object might fulfill the contract at the time that it is wrapped, then you mutate it in such a way that it no longer fulfills the contract, but it's still accepted by the generic function.

pcxunl...@gmail.com

unread,

May 4, 2014, 8:59:11 AM5/4/14

to strati...@googlegroups.com, pcxunl...@gmail.com

So, after trying to add in nominal predicate dispatch to my language, I realized something... it won't work with multiple inheritance!

contract Foo(x) { return true }

contract Bar(x) { return true }

contract Qux(x) subset Foo, Bar { return true }

generic foo;

extend foo(x:Foo) { ... }

extend foo(x:Bar) { ... }

If we call "foo(Foo())" everything is okay, and if we call "foo(Bar())" everything is okay... but if we call "foo(Qux())" suddenly we don't know which extension to call, since either one is acceptable!

So this means nominal predicate dispatch can only allow for single inheritance: a contract can be a subset of at most 1 other contract.

This has severe consequences for composability. If we only have single inheritance, then we can't have a contract called Positive and a contract called Integer... and then combine them together to make PositiveInteger. You'd have to copy code from either the Positive or Integer contract.

Well, back to the drawing board...

pcxunl...@gmail.com

unread,

May 6, 2014, 10:27:39 AM5/6/14

to strati...@googlegroups.com

Eureka! I think I got it! It has been very fascinating, studying generic functions, predicates/contracts, mathematical sets, and structural/nominal types.

This is my new proposal. First off, types! A type is basically a function that takes a single argument and returns true/false based on whether the argument is of that type or not:

type Number(x) {

return typeof x === "number"

}

type String(x) {

return typeof x === "string"

}

So we're saying if something is typeof "number" then it belongs to the Number type. And if it's typeof "string" it belongs to the String type.

But there's a catch: it isn't just enough for it to be typeof "number" or typeof "string" or whatever, you must also wrap it with the type:

Number(5)

The above verifies that 5 is in fact typeof "number", and then it returns a wrapper. This wrapper simply says that it is of type Number.

We can create generic functions that dispatch on the type of something:

generic multiply;

extend multiply(x:Number, y:Number) {

return x * y

}

extend multiply(x:String, y:Number) {

return new Array(y + 1).join(x)

}

multiply(Number(10), Number(5)) // 50

multiply(String("foo"), Number(5)) // "foofoofoofoofoo"

Gosh, having to wrap things is a huge pain in the butt. Well, what you can do is make Number and String special magical built-in types that don't require you to wrap them. Now we can use the "multiply" function like so:

multiply(10, 5)

multiply("foo", 5)

Ahhh, much better! If somebody creates their own custom type, they'll still have to wrap it. It's only the special built-in types that don't need to be wrapped.

You can also have a type that inherits from another type:

type Foo(x) {

return true

}

type Bar(x) subset Foo {

return true

}

Foo(5)

Bar(5)

Anything that expects a Foo will work with a Bar, but not vice versa. Also, a type can inherit from only one type: no multiple inheritance.

It's important to note that the above creates two different types: if you want something to be treated as a Bar, you must wrap it with Bar. Wrapping it with Foo is not enough to have it treated as a Bar. This is nominal typing, just like with OOP classes.

Also, you can freely coerce from one type to another:

var foo = Foo(5)

var num = Number(foo)

The above wraps the number 5 with type Foo, then passes it to the Number type. The Number type first unwraps it, verifies that it is a Number, then rewraps it with the Number type. And we can of course coerce back to a Foo:

var foo2 = Foo(num)

Or we can coerce it to a Bar...

var bar = Bar(foo2)

You can freely coerce things as much as you like, as long as the type returns true.

Now, the above system is pretty neat, and very powerful. But it's a bit clunky... I mean, let's say you wanted to have an Integer type:

type Integer(x) subset Number {

return Math.round(x) === x

}

Okay, cool, but now we have to manually wrap things with "Integer(...)" to have them be treated as an integer! What a pain. Why can't we just say that an Integer is any Number that happens to be an integer? Well, you can! Introducing, contracts.

contract Integer(x) matches Number {

return Math.round(x) === x

}

The above says that any Number that happens to be an integer is an Integer. This doesn't create a new type, and you don't have to wrap anything in "Integer(...)". So, why not just use contracts all the time? Why even bother with types? Two reasons:

1) Sometimes you want to distinguish between two things that happen to be the same: just because something has a "length" property doesn't necessarily mean it's an Array. Types let you clearly state that this is *definitely* an Array and not just something that *happens* to be *similar to* an Array.

2) You can't use contracts with generic functions.

Say whaaaaaat? Yeah, you can't use contracts with generic functions. To explain why, first I'll define a new contract:

contract Positive(x) matches Number {

return x >= 0

}

Okay, cool, any positive Number matches the Positive contract. Now let's create a generic function:

generic foo;

extend foo(x:Integer) { ... }

extend foo(x:Positive) { ... }

Now if we call "foo(-5)" it will choose the Integer path, if we call "foo(5.5)" it will choose the Positive path... but which path should it choose if we call "foo(5)"? Answer: it can't choose. So we have to disallow these situations.

Generic functions can dispatch on types because types behave sanely. Contracts do not behave sanely, and so they cannot be used with generic functions.

But there's a little twist... rather than saying that a contract matches a type, we can say that a contract is a subset of a type:

contract Integer(x) subset Number {

return Math.round(x) === x

}

contract Positive(x) subset Number {

return x >= 0

}

All we did is change "matches" to "subset". Now, it's important to note that this does *not* create new types. Instead, if you call "Integer(5)" it will behave exactly the same as if you had called "Number(5)". The only difference is that it will also verify that it is, in fact, an integer.

If we had used "type" instead of "contract", it would have created Integer and Positive types. But since we used "contract", that means the type of Integer and Positive is Number.

Now, since generic functions dispatch based on types, and both Integer and Positive have a type of Number... that means that doing this...

extend foo(x:Integer) { ... }

extend foo(x:Positive) { ... }

...is the same as doing this:

extend foo(x:Number) { ... }

This is an obvious error that can be easily detected by the generic function system. And so the system is guaranteed to behave sanely.

----

So, to recap: a type is a function that takes an argument and returns true/false. You must manually wrap a value with the type in order for it to be treated as that type.

A value can only belong to a single type at one time, but you can freely coerce it to another type at any time, as long as the type returns true.

A type can inherit from another type, which means that anywhere the parent type works, the child type works too.

Generic functions always dispatch based on the type.

A contract is a function that takes an argument and returns true/false. You don't need to wrap anything, but they can't be used with generic functions because they don't have a type.

A contract can inherit from a type, which simply means that the type of the contract is the type of the parent. Now it can be used with generic functions, but you can't combine it with other contracts that inherit from the same type.

In the end, you use "type" to create... well... new types. If you aren't creating a new type, but instead just want to restrict an existing type, you should probably use "contract".

pcxunl...@gmail.com

unread,

May 10, 2014, 2:45:39 AM5/10/14

to strati...@googlegroups.com, pcxunl...@gmail.com

Okay, new type system!

type Number(x) {

return typeof x === "number"

}

type Integer(x::Number) {

return Math.round(x) === x

}

type Positive(x::Number) {

return x > 0

}

generic foo

extend foo(x::Integer) {

return "integer"

}

extend foo(x::Positive) {

return "positive"

}

foo(-5) // returns "integer"

foo(5.5) // returns "positive"

foo(5) // error because it's ambiguous

foo(5::Integer) // returns "integer"

foo(5::Positive) // returns "positive"

extend foo(x::Positive Integer) {

return "positive integer"

}

foo(5) // returns "positive integer"

foo(5::Integer) // returns "integer"

foo(5::Positive) // returns "positive"

foo(5::Positive Integer) // returns "positive integer"

So first off, we use "type" to create new types. In this system, a "type" means a predicate that returns true/false based on whether its argument matches the type or not.

You can read "foo::Bar" as "foo is-a Bar". So, an Integer is-a Number that is an integer. Now, when I say "foo is-a Bar" I don't mean in the normal OOP sense: anything that happens to match the Bar predicate is-a Bar. So any Number that matches the Integer predicate is *automatically* treated as an Integer. This is structural typing.

You can of course use the :: operator with generic functions to dispatch to different behavior. Wait a minute... I said earlier that structural typing and generic functions don't fit well together... That's true, but this time, rather than trying to disallow partial overlapping, I embrace it. It's totally fine to define partially overlapped predicates like Integer and Positive. Instead of throwing an error when extending "foo", it'll throw an error when calling "foo", as you can see above.

Now, when you have two predicates that both match the same value, you have to *disambiguate* between them. You do that by using the :: operator. In a function's argument list, :: means "check that this value matches this type". When not in a function's argument list, it means "check that this value matches this type, then wrap it in such a way that it is always treated as this type".

In other words, if you say "5::Integer" you're using nominal typing: you're saying that 5 is definitely an Integer and definitely NOT a Positive, even though both the Positive and Integer types match. This allows you to choose which branch will be taken, and thus resolve the ambiguity. This means that every type can be used both structurally and nominally, depending on what you want to do with it.

There's another way to resolve the ambiguity... notice that we extended "foo" with "x::Positive Integer". That means that the argument must match both the Positive AND Integer types. Now there's no more ambiguity, since either the Positive behavior will be called, the Integer behavior will be called, or the Positive+Integer behavior will be called. But you can of course still use :: to choose whichever behavior you want.

That about covers it. To recap, types are predicates that return true/false, you can create generic functions that dispatch on types. Types behave structurally by default, but you can use :: to treat a type as if it were nominal. This lets you manually choose which behavior to dispatch on. This is very useful not only for resolving ambiguities, but also because it lets you treat the same data differently depending on the context.

In JavaScript, you'd use "new Person(...)" to create a Person type, but in this system, you'd say "{ ... }::Person". Notice that we created a normal hash table, then annotated it with type Person. This can be easily encapsulated into a function:

function person(name, age) {

return { name, age }::Person

}

person("foo", 50)

Of course, even if you didn't annotate it with ::Person, it would still work with functions expecting a Person... but annotating it makes *sure* that it'll work without any ambiguities. You could also choose to not annotate it, and only annotate things when there's an ambiguity.

This system thus lets you seamlessly blend structural and nominal typing together: you can annotate everything (nominal typing), or annotate nothing (structural typing), or use some mixture between them.

Why didn't I use this system before? Welllll, because it's tricky to make it fast... but I have some ideas that might make it fast. If they work, then I think this system will be the most powerful and flexible.

Alexander Fritze

unread,

May 22, 2014, 3:56:20 AM5/22/14

to strati...@googlegroups.com

Heh, you're really prolific with the type system ideas - it's taking
me a while to catch up :)

I've had a couple of thoughts on the latest type system you're proposing:

- Predicate-based types don't work well for situations where your type
identifies some feature that is difficult to test.

E.g. in SJS we have 'Streams', which are just functions that implement
a particular protocol (they take a functional parameter to which they
sequentially promise to pass a number of values). We also have
'Observables', which are 'Streams' that fulfil a few additional
constraints (like passing an initial value after a finite time).

Testing whether a function is a 'Stream' or an 'Observable' would
basically mean that you would have to run it and study its behaviour.

- I think the global nature of generic functions makes predicate-based
typing extremely brittle.

Consider e.g. that you are using a function `multiply(::number,
::number)` is module A. Now you add a module B to your program where
you extend `multiply` to work for integers. Because integers are also
numbers, you'll know that you now need to take great care to
disambiguate any calls to `multiply` by explicitly naming the desired
type, e.g. `multiply(5::Integer, 6::Integer)`.

The problem is that you don't only need to do this disambiguation in
new code; you also need to do it all existing code that calls the
generic function in question, such as module A.

Worse, because your types are predicate-based, nothing can be inferred
at compile-time, so you'll only spot any problems at runtime. If code
in module A is only seldomly used, then it might be a long time before
you actually spot a runtime error :(

Cheers,
Alex

pcxunl...@gmail.com

unread,

May 22, 2014, 3:02:54 PM5/22/14

to strati...@googlegroups.com

> Predicate-based types don't work well for situations where your type

> identifies some feature that is difficult to test.

Sure, which is why you can explicitly tag something with the type. In essence, you're stating that your *intention* is that it belongs to the type. Trying to precisely articulate *all* the requirements for *all* types is unreasonable. Instead, you would just lay out some general restrictions and trust that if a user says it's an Observable, then it's an Observable, dammit.

> I think the global nature of generic functions makes predicate-based

> typing extremely brittle.

That particular example you gave is actually not a problem, because an Integer is a subset of Number. The only time ambiguity arises is with partial overlap.

However, you are right that *in general* it is possible for an extension in one module to break code in a different module. The only way to avoid that is to require everything to be explicitly tagged. And even explicit tagging isn't enough when you factor in multiple types and subtypes... Basically, there's no real way to avoid having one module muck with another module. You can just make it more likely or less likely in practice.

> Worse, because your types are predicate-based, nothing can be inferred

> at compile-time, so you'll only spot any problems at runtime.

This is actually not true if you always explicitly tag types. As an example, if you have this code...

generic foo(x::Integer) {

...

}

foo(5::Integer)

...then it's quite possible to detect at compile-time which behavior will be called. It won't be perfect (there will still be some situations where it's too dynamicy to do the analysis at compile-time) but you can still understand a lot more at compile-time than you might think. Of course that only applies if you explicitly tag everything...

In any case, you're right that the type system I proposed does have certain issues with regard to conflicting extensions. Interestingly, even a single-inheritance single-dispatch system (as seen in a lot of OOP languages) can have conflicting extensions... it just doesn't happen very much in practice.

I've since discovered the amazingness of tables (think SQL databases), so now I'm trying to figure out a way to integrate them seamlessly into my language. If I succeed, tables may end up replacing types/generic functions. I don't know what that would mean for SJS.

Alexander Fritze

unread,

May 22, 2014, 3:17:06 PM5/22/14

to strati...@googlegroups.com

On Thu, May 22, 2014 at 9:02 PM, <pcxunl...@gmail.com> wrote:
>> I think the global nature of generic functions makes predicate-based
>> typing extremely brittle.
>
> That particular example you gave is actually not a problem, because an
> Integer is a subset of Number. The only time ambiguity arises is with
> partial overlap.
>
> However, you are right that *in general* it is possible for an extension in
> one module to break code in a different module. The only way to avoid that
> is to require everything to be explicitly tagged. And even explicit tagging
> isn't enough when you factor in multiple types and subtypes... Basically,
> there's no real way to avoid having one module muck with another module. You
> can just make it more likely or less likely in practice.
>

I think it is just the predicate-based typing bit that makes it
brittle, because new predicates that you introduce might match
existing code: Objects in module A that were previously of type T1,
might now also be of type T2, just by virtue of adding a new type
definition in module B.
If, on the other hand, types would always have to be explicitly
assigned to objects, as in:

// module B
type T2;

var foo = { ... some object } :: T2

Then, because T2 isn't visible in module A, none of the objects in
module A will accidentally acquire that type.

> I've since discovered the amazingness of tables (think SQL databases), so
> now I'm trying to figure out a way to integrate them seamlessly into my
> language. If I succeed, tables may end up replacing types/generic functions.
> I don't know what that would mean for SJS.

Ohh, sounds interesting! You mean you're integrating relational
calculus into your language?

pcxunl...@gmail.com

unread,

May 22, 2014, 8:20:03 PM5/22/14

to strati...@googlegroups.com

> Then, because T2 isn't visible in module A, none of the objects in

> module A will accidentally acquire that type.

Right, but that can happen even with fully-qualified types:

type Integer(x::Number) {

...

}

type Positive(x::Number) {

...

}

// module A

generic foo(x::Number) {

...

}

foo(x::Integer) // expects this to use the behavior for Number

// module B

// oops, now the behavior is changed for module A!

extend foo(x::Integer) {

...

}

And it can also happen if you allow for something to be multiple types at once:

// module A

extend foo(x::Integer) {

...

}

foo(x::Positive Integer) // expects this to use the behavior for Integer

// module B

// oops, now the behavior of foo is ambiguous

extend foo(x::Positive) {

...

}

It's just that the above two situations are unlikely to happen in practice.

> Ohh, sounds interesting! You mean you're integrating relational

> calculus into your language?

I guess you could say that, though I'm not being really formal about it or anything. I just really like how tables combine the best aspects of both lists and dictionaries. And that they let you trivially describe many-to-many relationships, which is something that OOP is super bad at.

Also, in an application I wrote, I currently use a very simple FRP signal implementation to bind the UI to the backend logic. This works okay, but I think using tables will make it a lot cleaner, if I can work out some issues.

Reply all

Reply to author

Forward