Virtual Sums for Typechecking?


David Barbour

Mar 19, 2014, 6:15:05 PM
to reactiv...@googlegroups.com, Kyle Blake
This afternoon, I had an idea for a new ABC operator, which creates a virtual fork on a value - something like:

        X :: a * e → (a + a) * e  -- in right

In this case, the active runtime sum will always be in the right branch, but the static type information is also reflected into the left branch. The resulting sum can trivially be merged with M or eliminated with K; XM or XK would reduce to identity. (The difference between M and K is that operator M requires merging of future-compatible types, whereas K asserts we're in the right branch and drops the left.)
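To make the semantics concrete, here is a minimal Haskell sketch, with Either standing in for ABC's sum type and the environment `e` elided; the function names are mine, not part of ABC:

    -- X: fork a value into a virtual sum. At runtime we are always in
    -- the right branch; the left branch exists only for the typechecker.
    virtualFork :: a -> Either a a                   -- X
    virtualFork = Right

    -- M: merge a sum whose branches have compatible types.
    merge :: Either a a -> a                         -- M
    merge = either id id

    -- K: assert we are in the right branch, dropping the left.
    assertRight :: Either b a -> a                   -- K
    assertRight = either (error "assertion failure") id

    -- XM and XK reduce to identity:
    --   merge (virtualFork x)       == x
    --   assertRight (virtualFork x) == x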

The primary utility of this 'virtual fork' would be to support typechecking or type inference. Developers can essentially use it to assert that a value is compatible with a given function, without actually applying the function. Or, similarly, they could assert that two functions are compatible with respect to merge.

I'm interested in any second opinions on this approach.

A little extra context:

Last week, I chose to remove the introspection operators P,S,B,N from ABC. It is no longer possible to ask whether a value is a product; you can only assume it. This simplifies type inference and erasure, but comes at a cost: it's now more difficult to assert structural properties at certain boundaries. (Most notably, it's very useful to assert type information around `{capabilityText}` invocations.)

So these virtual sums might serve as a replacement. I hypothesize virtual forks will serve the typechecking role much more effectively than introspection.


Kyle Blake

Mar 20, 2014, 3:17:08 AM
to reactiv...@googlegroups.com, Kyle Blake
What's the motivation for this as a primitive? It seems like you should be able to implement X using the existing operators. That said, this strategy of asserting structure seems like a good idea.

Matt McLelland

Mar 20, 2014, 9:17:33 AM
to reactiv...@googlegroups.com
> Last week, I chose to remove the introspection operators P,S,B,N from ABC. It is no longer possible to ask whether a value is a product, you can only assume it.

Sounds good to me.  Parametricity is preserved and assertions can be kept at the type level.

> X :: a * e → (a + a) * e  -- in right

So this is analogous to code that looks like this?

    x = ...
    if false
        foo x -- "assert" that we're allowed to call foo with x

Seems kind of hokey.






David Barbour

Mar 20, 2014, 10:27:52 AM
to reactiv...@googlegroups.com

Hmm. I suppose it might be possible using some combination of v, V, and D. I'll see what I can do.

On Mar 20, 2014 2:17 AM, "Kyle Blake" <klkb...@gmail.com> wrote:
What's the motivation for this as a primitive? It seems like you should be able to implement X using the existing operators. That said, this strategy of asserting structure seems like a good idea.


David Barbour

Mar 20, 2014, 10:32:10 AM
to reactiv...@googlegroups.com


On Mar 20, 2014 8:17 AM, "Matt McLelland" <mclella...@gmail.com> wrote:
>
> So this is analogous to code that looks like this?
>
>     x = ...
>     if false
>         foo x -- "assert" that we're allowed to call foo with x

Yes, effectively.

>
> Seems kind of hokey.
>

Why so?

David Barbour

Mar 20, 2014, 10:47:45 AM
to reactiv...@googlegroups.com
Okay, you're right. I don't need a primitive for this, though it might be a good candidate for ABCD in the future. Something like:

    intro1 inR swap distrib [%c] .left [%c] .right

Should do the job as is.

Best,

Dave

Matt McLelland

Mar 20, 2014, 10:48:44 AM
to reactiv...@googlegroups.com
>> Seems kind of hokey.

> Why so?

Well, that was more of a gut reaction, but thinking about it more, it's because you're introducing real code, even if it is just a dead branch, as a roundabout way of providing annotations for the type checker. I'd rather see something more direct:

x = ...

assert  well-formed (foo x)

Or something like that. Can you not have something like

assert : a * (a->Prop) * e -> a*e

And then allow the construction of Props to do type-level introspection things?




David Barbour

Mar 20, 2014, 11:57:12 AM
to reactiv...@googlegroups.com
On Thu, Mar 20, 2014 at 9:48 AM, Matt McLelland <mclella...@gmail.com> wrote:
Well, that was more of a gut reaction, but thinking about it more, it's because you're introducing real code, even if it is just a dead branch, as a roundabout way of providing annotations for the type checker. I'd rather see something more direct:

x = ...

assert  well-formed (foo x)

Or something like that. Can you not have something like

assert : a * (a->Prop) * e -> a*e

And then allow the construction of Props to do type-level introspection things?


I suppose such a primitive would be possible. Introduction of a new 'Prop' concept (essentially, a new sub-language) seems excessive, especially for a bytecode (ABC is Awelon Bytecode). But it wouldn't be difficult to skip the intermediate (a+a) structure and simply use:

        X :: [a→b] * (a*e) → (a*e)

OTOH, I feel the intermediate (a+a) structure is more elegant. ABC currently includes the following operators:

        ? :: [a→a'] * (a+b)*e → (a'+b)*e -- conditional apply
        K :: (a+b)*e → b*e  -- assert
        M :: (a+a)*e → a*e -- merge

The earlier formulation for `X` would allow something like `X[foo]?K` or `X[foo]?M` to express various kinds of assertions. The latter, using merge, would also require that the output of [foo] be compatible with its input type, so it is a bit more expressive than your `assert well-formed`. The reason I find this a bit more elegant is that the intermediate `(a'+a)` type after the `?` operator (but before the `K` operator, which kills the dead branch) provides a clear, typeful indication of exactly what's happening.
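To sketch how these compose (Haskell again, with Either for sums; this only models the typing behavior, and the names are mine):

    -- ? : conditionally apply a function on the left branch.
    condApply :: (a -> a') -> Either a b -> Either a' b
    condApply f = either (Left . f) Right

    -- X[foo]?K : the value passes through untouched at runtime, but
    -- the typechecker must verify that foo accepts it.
    assertCompat :: (a -> b) -> a -> a
    assertCompat foo = either (error "dead branch") id . condApply foo . Right

    -- X[foo]?M : merge additionally forces foo's output type to unify
    -- with its input type.
    assertEndo :: (a -> a) -> a -> a
    assertEndo foo = either id id . condApply foo . Right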

Turns out I don't need a new operator for this. I can use the D (distrib) and V (intro void) operators in addition to a few others to achieve the same strategy. One implementation for X is:

        X = vvrwlcVwrzDVR[c]?W[c]?LCl

Anyhow, the use of dead code to support type inference and static analysis has long been an idea I've toyed with, but I haven't yet moved it into idiom or convention. I have some sense that your preference for a separate 'assert well-formed' is just a way to clearly document this idiom for a human user, which seems somewhat less relevant for a bytecode.

David Barbour

Mar 20, 2014, 11:59:58 AM
to reactiv...@googlegroups.com
On Thu, Mar 20, 2014 at 10:57 AM, David Barbour <dmba...@gmail.com> wrote:
One implementation for X is:

        X = vvrwlcVwrzDVR[c]?W[c]?LCl

This is actually a variation for the AO stack: X :: ((a*s)*e)→(((a+a)*s)*e)
 

Matt McLelland

Mar 20, 2014, 1:37:12 PM
to reactiv...@googlegroups.com
> I have some sense that your preference for a separate 'assert well-formed' is just a way to clearly document this idiom for a human user, which seems somewhat less relevant for a bytecode.

Yeah, that's quite possible.


Matthew Blount

Jul 6, 2014, 10:10:44 PM
to reactiv...@googlegroups.com, ky...@klkblake.com

> Given any existing parsing expressions e, e1, and e2, a new parsing
> expression can be constructed using the following operators:
> Sequence: e1 e2
> Ordered choice: e1 / e2
> Zero-or-more: e*
> One-or-more: e+
> Optional: e?
> And-predicate: &e
> Not-predicate: !e

Sequencing and choice are already in ABC and AMBC, but I want to draw
your attention to and-predicate and not-predicate.

> The and-predicate expression &e invokes the sub-expression e, and then
> succeeds if e succeeds and fails if e fails, but in either case never
> consumes any input.
> The not-predicate expression !e succeeds if e fails and fails if e
> succeeds, again consuming no input in either case.

In the context of program execution, we might say that `&p` "succeeds"
if `p` can be applied to the environment, but does not actually apply
`p` to the environment. `!p` "succeeds" if `p` cannot be applied to
the environment, and again leaves the environment unchanged.
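As a toy model of these predicates lifted to programs (a Haskell
sketch, reading "can be applied" dynamically as "runs without
failing"; the names are mine):

    -- A program modeled as a partial transformation of an environment.
    type Prog env = env -> Maybe env

    -- &p : succeeds exactly when p would, leaving the env unchanged.
    andPred :: Prog e -> Prog e
    andPred p env = env <$ p env

    -- !p : succeeds exactly when p would fail; env again unchanged.
    notPred :: Prog e -> Prog e
    notPred p env = maybe (Just env) (const Nothing) (p env)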

This has the same "dead code assertion" feel as the `X` operator, but
it's handled at a lower level, and more explicitly. If I'm
understanding your description correctly, the semantics of `X` do not
actually require that the implementation pay any attention to the
code on the left branch at all, and since ABC is latently typed, that
code does not even need to be type correct at compile time. It may be
that a clever implementation will recognize this dead code as an
opportunity for implicit structural assertions, but things would work
just fine if the code on the left branch were complete garbage that
couldn't even be executed. In contrast, `&e` will fail if `e` could
not be run, and so is a more explicit structural assertion.

I think the parallels with PEGs can be taken even further: there's a
nice implementation for Kleene star and the derived operators `+` and
`?`, and an argument for making these and `|` deterministic (i.e.,
making `|` ordered choice and banning backtracking in general).

If `|` is ordered choice and backtracking is banned under `|`, `*`,
`+` and `?`, then programs of the form `(a | b)*` (that is,
alternation with Kleene repetition) are basically machines of the same
kind as the one Gowers describes.

Gowers' argument for banning backtracking is:

> The ‘extreme human’ approach we are trying to adopt can be summarized
> as follows: we do not allow our programs to do anything that a good
> human mathematician wouldn’t do. A serious difficulty in trying to
> design a program that satisfies this constraint is that while humans
> do their very best to avoid search and backtracking, they
> undoubtedly do at least some search and backtracking when they are
> trying to solve difficult problems. What is the distinction between
> the kind of search that humans do and the kind that humans would never
> do? Indeed, is there a distinction, or is it just a matter of degree?
> These are difficult questions, and rather than try to answer them
> immediately, it seems more sensible to try to isolate them by
> concentrating first on other difficulties and facing up to this one
> only when we are forced to do so. Accordingly, we define a routine
> problem to be one that a good human mathematician will typically solve
> easily without backtracking. If a program is to satisfy the main
> constraint, then it too will have to solve routine problems without
> backtracking, so we can simply ban it. Although this is a significant
> scaling down of ambition, it also makes the project far more realistic
> in the short term. Furthermore, the class of routine problems is large
> and diverse enough that the challenges it raises are still very
> interesting.
>
> Move types are ranked in order of attractiveness or priority, and the
> basic operation of the program consists of repeatedly choosing the
> most attractive move type that can be applied, generating the moves of
> that type, and applying the most attractive one. In practice with the
> problems we are considering, it is rare for the most attractive move
> type to generate more than one move, except in cases where the moves
> generated are related by some simple symmetry.

With ordered choice and no backtracking, `(a | b | c)*` is the program
that repeatedly executes the first applicable function among `a`, `b`,
and `c`. After looking at section 2.4 in the paper, you'll see that
Gowers' program is basically

    (deleteDone | deleteDoneDisjunct | deleteDangling | ...)*
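For concreteness, here is that no-backtracking loop as a Haskell
sketch (a step is a partial function on the proof state; all names
are mine):

    import Data.Maybe (listToMaybe, mapMaybe)

    type Step s = s -> Maybe s

    -- Ordered choice: commit to the first step that applies.
    choice :: [Step s] -> Step s
    choice steps s = listToMaybe (mapMaybe ($ s) steps)

    -- Kleene star: repeat until no step applies, then return the state.
    star :: Step s -> s -> s
    star step s = maybe s (star step) (step s)

    -- Gowers' prover, schematically:
    --   star (choice [deleteDone, deleteDoneDisjunct, deleteDangling])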

David Barbour

Jul 7, 2014, 12:52:18 AM
to reactiv...@googlegroups.com, Kyle Blake, drda...@gmail.com
On Sun, Jul 6, 2014 at 9:10 PM, Matthew Blount <drda...@gmail.com> wrote:
In the context of program execution, we might say that `&p` "succeeds"
if `p` can be applied to the environment, but does not actually apply
`p` to the environment. `!p` "succeeds" if `p` cannot be applied to
the environment, and again leaves the environment unchanged.

This has the same "dead code assertion" feel as the `X` operator, but
it's handled at a lower level, and more explicitly.

Thanks, Matthew. This is an interesting observation. There certainly are advantages to a more explicit type assertion, especially since dead code is often not evaluated. 

Translated for ABC, we might have:

    & :: (a'→b) * a → a ; assert a is compatible with a'
    ! :: (b→c) * a → a ; assert a is incompatible with b

Not every ABC runtime is required to perform typechecking, though, so these might be better as annotations. And we could certainly explore them via annotations without adding them as primitive operators.

    {&applies} :: (a'→b) * a → (a'→b) * a ; a,a' compatible
    {&does not apply} :: (b→c) * a → (b→c) * a ; a,b incompatible

My intuition is that `!` or `{&does not apply}` would be awkward operations to use effectively. 
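The `&` form, at least, has a direct analog in a statically typed language: a function that mentions its argument only at the type level. A Haskell sketch (the name is mine):

    -- & : the function is never applied, but its input type must
    -- unify with the value's type for this to typecheck.
    assertApplies :: (a -> b) -> a -> a
    assertApplies _ x = x

Notably, there is no corresponding term-level analog for `!`: "this must not typecheck" can't be expressed as an ordinary function, which matches the intuition above.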


If I'm understanding your description correctly, the semantics of `X`
do not actually require that the implementation pay any attention to
the code on the left branch at all

Yes, this is intentional. I don't expect every implementation of ABC to perform or contribute to typechecking. 

In this sense, a more implicit approach like operations on dead code or annotations can be useful in part because they can be ignored by some runtimes. And certain kinds of types - e.g.  'does not observe' types like parametric polymorphism or 'negative' types we might see in property testing - might be better supported by declarations in a higher level language. (I've been contemplating how to declare types in AO.)


an argument for making these and `|` deterministic (i.e., making
`|` ordered choice and banning backtracking in general).

With parser combinators and PEGs, we can avoid backtracking by trying to parse one character then committing to that search path. I can't think of any clear 'point of commitment' for AMBC. I suppose we could use irreversible side-effects as the boundary, but that seems rather ad-hoc.

In any case, the motivation for AMBC is exploratory and generative programming, describing a large but well defined and finite 'space' of programs such that we can search for high quality programs based on external heuristics - not just type safety (though an inconsistent program isn't of very high quality). Syntactically ordered search doesn't contribute to this use case, and actually interferes. A 'soft' ordering orthogonal to syntax, e.g. based on weighted grammars, would be a better basis. 

I'm not sure I'll use AMBC. I'm tempted to shift the exploratory programming role primarily into the IDE, supporting ambiguous operations or words and refinement of them with human support. I plan to focus on ABC and RDP for now, and come back to this later.


> The ‘extreme human’ approach we are trying to adopt can be summarized
> as follows: we do not allow our programs to do anything that a good
> human mathematician wouldn’t do.

Machines are useful in part because they can perform tasks that humans should not, cannot, or would not do. Exploring a large space of programs by hand is, I think, among these. :)

David Barbour

Nov 23, 2014, 12:57:54 AM
to reactiv...@googlegroups.com, Kyle Blake, Matthew Blount, Matt McLelland
(Adapted from a recent discussion with Kyle):

If an error occurs in dead code, should it make a sound? 

I think a warning makes a good default, but it seems feasible to annotate void types for whether they should suppress warnings or promote them to errors. We might have some concept of unchecked voids (errors suppressed) vs. weakly checked voids (errors reduced to warnings) vs. strongly checked voids (errors block compilation). In concrete terms, something like:

    {&void suppress} :: (a+0)*e → (a+0')*e
    {&void checked}  :: (a+0)*e → (a+0')*e

For example, {&void suppress} would both fail if the target input was not a void, and mark the void such that subsequent unification errors are not reported. These annotations could all be stripped by the compiler.
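One way to picture the three levels in a checker implementation (a hypothetical Haskell sketch; the names are mine):

    -- Severity attached to a void by annotation.
    data VoidCheck
      = Suppressed  -- {&void suppress}: unification errors ignored
      | Weak        -- default: errors reported as warnings
      | Checked     -- {&void checked}: errors block compilation

    -- How a checker might route a unification error on a void.
    report :: VoidCheck -> String -> Maybe String
    report Suppressed _   = Nothing
    report Weak       msg = Just ("warning: " ++ msg)
    report Checked    msg = Just ("error: " ++ msg)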

Earlier, Matt McLelland called it 'hokey' to implicitly use dead branches and argued in favor of a more explicit model. I generally agree that explicit is preferable to implicit, though I don't care for the second-class introspective proposition logic he suggested. But it seems to me that explicitly marking some voids as 'checked' would serve the role of keeping type assertions more explicit.

    typedVoid :: a -- (a+a); with left `a` in void
    typedVoid = vvrwlcV{&void checked}wrzDVR[c]?W[c]?LCl    (bytecode)

    (typedVoid is what I called `X` earlier.)

    assert :: (0+a) -- a
    .left :: (a+b) [a→a'] -- (a'+b)

    assertProp :: a [a→b] -- a; assert compatible with prop
    assertProp = swap typedVoid swap .left assert

Matthew Blount suggested an interesting sort of 'type error expected' option, the `!p` proposition in his notation. This would require we raise a type error UNLESS we have the expected type error on the void. This also seems feasible to indicate via annotation, e.g. something like:

    {&void should fail} :: (a+0)*e → (a+0')*e

    badlyTypedVoid :: a -- (a + a)
    badlyTypedVoid = vvrwlcV{&void should fail}wrzDVR[c]?W[c]?LCl    (bytecode)

    assertPropFail :: a [b→c] -- a; assert argument incompatible with prop
    assertPropFail = swap badlyTypedVoid swap .left assert

This probably isn't as useful for static tests as it is for PEGs, but I can imagine some use cases for negative assertions about types. And this technique decouples the logic for introducing an expected type failure from the code that causes the type error, which results in a much more compositional and less awkward structure than what I described to Matthew earlier.

It seems to me that typechecking on voids is a very simple and effective tool for carrying type annotations and contracts in a bytecode without requiring a secondary logic. Further, it seems it only takes a few simple annotations - oriented around concrete behaviors such as what to report or suppress - to gain a lot of useful control over this idea.

Thank you Matt, Matthew, Kyle, for your input to this idea.
