non syntax hell?

46 views
Skip to first unread message

Raoul Duke

unread,
Dec 18, 2016, 4:25:35 PM12/18/16
to PiLuD
One way or another, the syntax and semantics of records seems to never
have been done in a way that doesn't somehow majorly suck. Haskell,
Go, C++, whatever, sometimes even if you stick to records that aren't
composed/subtypes of others.

Is there a language anywhere that gets it as least wrong as possible
vs. all the others?

David Barbour

unread,
Dec 18, 2016, 7:03:18 PM12/18/16
to PiLuD

Part of the issue has been that records have rather ad-hoc and second class type in most languages.

It isn't easy to work with them, in terms of abstraction, composition, decomposition, renaming of labels, enumeration of labels, use as a case dispatch for labeled variants, dynamic label construction, etc.. They also need special attention from the type checker.

Row types can help a bit, but still seem very ad-hoc.

For Awelon, I've only very recently found a satisfactory model of labeled data - records (labeled products) and variants (labeled sums). It's easiest to explain starting with sums.

The basic algebraic sum type (A+B) for types A and B is easy to model. We might describe this as "A in left or B in right". The type 0 is identity for sums, that is (A+0) has the same number of values as just A. But it clearly has different structure.

We can model arbitrarily "deep" sum path like (((0+(0+A))+0)+0) without adding information.  But doing so gives us a lot of slots for extension - e.g. (((0+(0+A))+0)+(B+0)). Further, this left-right-left path can encode arbitrary bit strings into the type - a formal basis for labels without any semantics beyond trivial algebraic sums.

There are a lot of ways to encode labels in a bit field. One option I favor is "(RR|RL)*(LL)" to encode the bit field with a terminal (a self synchronizing code), then use simple UTF-8 or ASCII. But ideally the runtime or compiler should know about this pattern and heavily optimize it.

Given labeled variants as bit fields encoded in a deep sum type, a natural encoding for records is a trie using bit field keys. The trie eliminates irrelevant ordering information. Depending on the encoding of this trie, we might be able to enumerate keys or combine records in ad-hoc ways like row types.

A trie is a bit awkward syntactically. So what I actually use for records is a function that add keys to a trie. This conflates the model of "record" and "record update function" since the latter is equivalent to the former if applied to an initially empty trie. Record update functions can easily be sparse, and the order of distinct keys is irrelevant to semantics. As it should be.

Importantly, labels are not the "symbol" values common to so many languages. A (symbol, value) pair requires dependent types. Labeled values only need algebraic types. It's a huge semantic difference.

OTOH, if we have labels, they can easily serve the same role as symbols. We can also systematically rewrite labels with simply typed functions. So labels are more general and first class.

The main remaining issue is syntax. A language or editable view will need to provide good syntactic sugar for these labels. Something like :foo could work well enough.


--
You received this message because you are subscribed to the Google Groups "PiLuD" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pilud+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pilud/CAJ7XQb7thsD-5ZFZ8GL%2Bk8fqzXm0cibF7CVv8QAy0mD-fNUu2A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Raoul Duke

unread,
Dec 18, 2016, 11:10:55 PM12/18/16
to PiLuD

thanks, i am hopeful such experiments will over time help pull us out of our current nadirs.

Reply all
Reply to author
Forward
0 new messages