Staging for PL/UI Synthesis [was Re: Static Choices and Offerings]


David Barbour

Sep 14, 2013, 6:21:28 PM
to reactiv...@googlegroups.com

On Sat, Sep 14, 2013 at 1:08 PM, Ross Angle <rok...@gmail.com> wrote:

I tend to think of 'staging' this way: there is a static stage, then there is a dynamic stage. But, there are some dynamic behaviors, and they can have their static stage at runtime.

Oh, "static stage at runtime." So you'd alternate between a layer of continuous reactive side effects and a layer of code generation side effects. This seems a bit arbitrary to me, but it's also a reasonable-looking choice to make.

Well, I must admit: a richer staging model is looking more and more tempting. If I ever figure out how to do it while keeping it conceptually simple, it could become something very powerful. I think the sort of staging you've been describing could be a much better basis for PL/UI synthesis.

My intuition, at this point, is that Awelon's basic structure with controlled support for "program history editing" is almost powerful enough to be a basis for universal-scale reactive PL/UI unification. Unfortunately, the staging I've been contemplating seems to interfere! It hinders keeping snapshots and clearing unneeded history, and it makes a problematic semantic distinction:

* when users want a reactive effect, they use a live-programming or history-editing concept
* when anything else (sensors, databases, etc.) has a reactive effect, it uses time-varying signals

How frustrating, my model has such a harsh edge! But I wonder: What if...

* What if sensor inputs are also modeled as editing a program history?
* What if all agents and behaviors are modeled this way? 

Conceptually, this idea shifts everything (including the act-of-programming) into a conceptual RDP runtime. In practical terms, I think it simply means my static values must also bind spatial-temporal information, such that they can model the distribution and locations of program fragments. (<-- for some reason, discovering that I might want even more precision, uniformity, and symmetry surprises me every time.)

But what does this mean for types? Or, to ask a couple of related questions that matter for PL/UI synthesis:

* under which conditions should a user be constrained against editing the past?
* how can an agent *protect* its prerogative to edit particular parts of the program?

I feel that the underlying nature of types is bound within these questions. A type is a way of describing what can and cannot be changed. A subprogram can safely be altered so long as no downstream agency is depending on its structure. An agent can protect its agency to alter a program by representing type information such that downstream agencies must code conservatively against it - e.g. we can say: don't assume this value won't change, just because it hasn't changed for the last few hours.  

A good UI/IDE should help users rewrite history in a way that protects dependents.

So getting back to staged programming: staging is modeled in terms of operations for constructing program fragments (behaviors), perhaps wrapping them up with acquired signals, and applying combinators (first, left). I should be able to drop `beval` entirely, and switch to dependently-typed 'static' behaviors as you seem to have been suggesting all along.

So, please, keep trying to make me understand this. I think it could be very useful.
 

An interpreter for untyped code will have to deal with at least as many issues as an interpreter for typed code. Namely, both kinds of interpretation need to avoid instantaneous feedback loops through external resources. We might be able to address this with some explicit delay and/or a system of "tainted" types.

Ah, I see. I hadn't been considering the feedback loop issue. Is this the main reason you're using explicit stages for your constructs?
 

What do you think of sending a self-interpretation request to an external resource and waiting for it to come back with a result?

The idea of an intermediate compiler/optimizer as-a-service has always been in my designs, in part due to the greater ability to leverage memoization, caching, and machine-learning strategies, and potential hot-spot analysis or profiling.
 

I'm not sure if this is what you're pondering about, but... since Underreact doesn't have dynamic behaviors yet, it doesn't have any way to expand and retract the reactive node graph at run time.

Yes, that's also a big use-case for dynamic behaviors. RDP without some ability to programmatically manipulate the graph at runtime is only half the paradigm. Also, without such an explanation, it seems 

Many expand/retract operations could be handled by something simpler, though. Like:

      bforeach :: b x y ~> b [x] [y]
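
(For intuition, here's how such an operator can be written for ordinary Haskell Arrows. This is the classic ArrowChoice construction, not Sirea's API; in RDP proper it would presumably need to be a primitive, since a behavior can't recurse over signal structure the way a list can be scrutinized.)

    import Control.Arrow

    listcase :: [a] -> Either () (a, [a])
    listcase []       = Left ()
    listcase (x : xs) = Right (x, xs)

    -- bforeach analogue: apply one behavior to every element of a list.
    mapA :: ArrowChoice b => b x y -> b [x] [y]
    mapA f = arr listcase
         >>> (arr (const []) ||| ((f *** mapA f) >>> arr (uncurry (:))))

For the plain function arrow, mapA f coincides with map f.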




It's interesting that you're encountering a desire to have closures at the static level, but it makes complete sense because you're trying to have open-system composition at that level too.

Well, that's part of it. But I think the tipping point was when I realized I can provide static signal inputs to a dynamic behavior, i.e. like configuration variables. I imagine this could make high-performance dynamic behaviors quite feasible.   


Which of these type signatures is closest to what you mean by "provide static signal inputs to a dynamic behavior"?

  • Use (Dyn (Code (Sta x ~~> y))) and (Dyn x) to get `y`.
  • Use (Dyn (Code (Sta x ~~> y))) and (Sta x) to get `y`.
  • Use (Sta (Code (Sta x ~~> y))) and (Sta x) to get `y`.

The middle one is closest. But none of them are right! In all cases you listed, `y` would also need to be a static constant, and the behavior wouldn't actually do anything. (I'm assuming no encapsulation of signals in type Code.)

Closer would be:   

     Dyn (Code (Sta x * Dyn y ~~> z)) * (Sta x * Dyn y)  ~> z

Also, it only makes sense if we accept that the code is polymorphic to a range of static values.
 

Awelon's sealer/unsealer pairs currently are only good for static computations. If I figure out how to 'serialize' a sealed value for partition crossing (without unsealing it first) perhaps I'll address cryptography then. :)

Sounds more like the converse to me. If we ever have a satisfactory way to serialize a sealed value, that's probably because we're using some kind of cryptography.

Well, the conceptual difficulty is this:

1) Sealed sealer (Atom p1 x1 * Atom p2 x2)
2) cross :: Atom p1 x1 ~> Atom p3 x1

Somehow, I would need to shift the cross into the sealer without unsealing the value. This suggests some kind of transform, but not one for which I have a good answer at the moment. And not just conceptually, I also don't have a good cryptographic answer for this scenario. 

Anyhow, I'm thinking I might try to model the PL/UI synthesis ideas in Haskell or Agda a bit before pursuing it.

Warm Regards,

Dave

 

David Barbour

Sep 14, 2013, 10:12:59 PM
to reactiv...@googlegroups.com

Ok. I spent the last few hours modeling part of this proposed design in Haskell.

One challenge I'm having difficulty with is figuring out 'where' the raw literals (text, numbers, quoted behaviors) should initially go.

It seems that, if I want to model this as a multi-agent program editing process, I'll need to be very explicit and precise about it - modeling the editing and distribution behaviors.

I'm also thinking I'll switch to Agda. I'm modeling Awelon as it might be, and I already miss text and numbers for metaprogramming and typing. Dealing with partition constraints, and with substructural types (like copy operations where one copy of a relevant-typed value is not itself relevant), is painful.

On the success side:

1) It turns out that keeping latency info in the 'static' model is a great fit. It really contributes to the intuition that, in live programming, it takes time for updates to distribute. The normal benefits with respect to looping and scheduling should also apply.

2) In many ways the model seems simpler. I don't need a distinct signal concept (though I can use one in the implementation). I think state will be a better fit too, more explicitly part of the environment.

David Barbour

Sep 15, 2013, 4:40:29 PM
to reactiv...@googlegroups.com
Another update.

It seems statics don't really have a clear spatial location except for "wherever this program fragment is physically represented", which is currently implicit in the RDP model. I'm beginning to think I should formalize some of the 'small-step semantics' for how a program is rewritten into a lot of smaller program fragments for distribution. 

With regards to PL/UI synthesis, the 'isolation' of statics is problematic. 

We cannot treat extension and expression of the program as having clean staging, a barrier to changes from the outside world; instead, we must build on the world as we can observe it, accepting some risk that our programs will break when the world changes, and developing more adaptive/flexible programs to mitigate this risk. I think dependent types won't scale to a system where a million user/programmers are adjusting and extending their code in real-time. 

When I initially considered how PL/UI synthesis might scale to multiple users, I was thinking in terms of DVCS, with a bunch of programmers sharing code transactionally. In retrospect, this is a terrible idea: it defeats most of the benefits of RDP. What I need instead is an RDP-based "static interaction" between agents, and the ability to model resources as agents.

In Sirea, I modeled agents using independent behaviors operating in a shared environment.

     (foo |*| bar |*| baz)
           where
     (|*|) :: B x y -> B x z -> B x x
     f |*| g = bvoid f >>> bvoid g
     bvoid :: B x y -> B x x
     bvoid f = (f &&& id) >>> bsnd
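
(As a rough illustration of that pattern, here it is transcribed to plain Haskell Arrows. Kleisli IO stands in for Sirea's B type and bsnd becomes arr snd, so treat this as a sketch rather than Sirea's actual implementation.)

    import Control.Arrow

    -- Run a behavior for its effects, discarding its output.
    bvoid :: Arrow b => b x y -> b x x
    bvoid f = (f &&& returnA) >>> arr snd

    -- Compose two "agents" that observe the same input environment.
    (|*|) :: Arrow b => b x y -> b x z -> b x x
    f |*| g = bvoid f >>> bvoid g

    foo, bar :: Kleisli IO Int ()
    foo = Kleisli (\x -> putStrLn ("foo sees " ++ show x))
    bar = Kleisli (\x -> putStrLn ("bar sees " ++ show x))

    main :: IO ()
    main = runKleisli (foo |*| bar) 42 >> return ()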

I think all that's needed for PL/UI synthesis is to support 'static' interaction between software agents modeled within an RDP application. If I do so, I can then (in a metacircular fashion) treat each programmer as a software agent in an even larger model.

Right now, Awelon applications take two arguments: (1) a powerbox, (2) an 'activity' signal starting in some application. I believe that I can model static interactions through capabilities in the powerbox, and I can remove the 'activity' signal. If developers want to "turn off" their application (or any subprogram) it is no longer modeled externally (by powering down a staged activity signal); instead, it is treated as an act of live programming. 

I always love it when I can remove to improve. So, I'm going to remove the staged activity signal and see if I can make things work.

One area where it might not work out so well is the issue of 'choice' and conditional expressions. The relevant issue is:

* If a capability that influences a resource is used in a choice... (e.g. in just one branch)
* What shall other software agents and resources observe?
* Can I effectively control the number of 'potential' signals an agent must deal with?

The earlier staged model makes it easy to ensure that shared resources only observe actual choices, even if I compute both branches in advance to support rapid switching. There was never an issue of effects as part of the compilation process.  Now, if I compute both branches, I must somehow model some statics as being 'potential' values. Or if I only compute one branch, I won't have as much stability. 

Perhaps this problem can be adequately addressed within capabilities themselves. 

Getting back to the notion of partitions and modeling agents within agents, it seems that it should not be difficult to have distinct concepts for isolated effects - a different logical 'this' partition for each agent (perhaps leveraging an affine uniqueness source). Under these conditions, I could perhaps track the greatest scope any behavior or capability might directly affect. Purity is just a local partition without resources. This could potentially enable a staged compilation based on isolation. The whole notion of 'partitions' must be internalized within RDP (as opposed to being treated as 'external' to the program, which is what I have been thinking). So every program is modeling logical internal partitions.

Then we simply stretch the idea so the entire Internet is conceptually internal to RDP. :)

To use capabilities to model communication between RDP agents or to resources modeled within an application... I've been thinking a bit about how to accomplish that. Do I create some state and share it? Do I model some kind of continuous channel leveraging linear types? I've a few ideas. The only critical issue is that I must model this sort of connectivity as a 'side-effect', enabling a dimension of composition for concurrent (independently developed and maintained) behavior.

Regarding temporal semantics for statics, it might be interesting to focus on temporal bounds (e.g. upper and lower bounds) for consumption of a static value. 

David Barbour

Sep 15, 2013, 10:38:17 PM
to reactiv...@googlegroups.com
If I want to model an RDP system consisting of multiple 'agents' (human and otherwise) live-programming their own little subprograms, then I need a very precise model for how things will fail. Because sometimes people will write bad programs. And sometimes dependencies will break, or change due to independent maintenance. And sometimes our assumptions about what is 'stable' are wrong, and our systems will not readily adapt. The idea of 'whole program' failure is unacceptable for a combined multi-user PL/UI system.

Fortunately, RDP already has some awesome failure handling built in: I use continuous values/signals instead of fire-and-forget events, and I use precise logical time, so it is uniformly trivial to trace those signals down and say "oops, my bad; scrub that signal starting at time T". This is by design, for smooth logical transitions and failovers. (It is very similar to the Time Warp protocol.) Of course, there are limits on how far back one can scrub. Fortunately, RDP systems aren't "long-running" the same way as traditional processes; they are intended to operate within a bounded latency. Usefully, this 'bounded latency' concept can be strictly enforced by applying the 'temporal bounds' concept to capabilities and the powerbox itself, forbidding super-long pipelines except where we really want to allow them.
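
(To make the "scrub that signal starting at time T" idea concrete, here is a toy model. It is only an illustration; Sirea's real update model also carries stability information and uses its own notion of logical time.)

    type T = Double                -- stand-in for logical time
    type Sig a = [(T, a)]          -- a signal as time-ordered updates

    -- Discard every update at or after time t, so corrected updates
    -- can be re-sent from that point on.
    scrub :: T -> Sig a -> Sig a
    scrub t = takeWhile (\(tu, _) -> tu < t)

    -- e.g. scrub 5 [(0,"a"),(3,"b"),(6,"c")]  ==  [(0,"a"),(3,"b")]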

But smooth failover, logical time, and latency control are insufficient.

When a program fails due to some *internal* issue (e.g. a type violation), an RDP program would just be continuously shoving its insanity into the world. What I need to do is somehow disable the problematic subprogram until conditions change. This presents challenges:

1) How do I identify the scope of "the problematic subprogram"?
2) How can I precisely shut down "the problematic subprogram"?
3) How do I determine when conditions change to try again?

For problem 1, I want some way of declaring: "this subprogram will be an independent unit-of-failure". 

I've thought of a few ways to express this. I'll list them worst to best.

* Treat the declaration as a manipulation of the powerbox. 
* Treat the declaration as a modifier for a block. 
* Treat the declaration as a new primitive combinator.
* Model the declaration as an evaluation capability in the powerbox.

The problem with the 'powerbox' approach is that membranes are awkward and difficult to express in terms of the capability security model, and it's really easy to do them badly (especially when interacting with sealer/unsealer pairs), and it's very difficult to validate that you've done them correctly (nobody even tries).

The middle two options bother me in the same way dynamic typing usually does: they "bake in" the decision to tolerate sanity failures at too fine a granularity. From the outside, it can be difficult to determine whether a program is operating correctly or dying inside, or control the granularity for failure.

The last option is to model the compiler/interpreter as an external service that is being accessed. This offers two features: (a) the ability to withhold the interpreter from a subprogram, (b) the ability to audit how a subprogram is using interpreters/evaluators and get some coarse-grained idea of health. 

(I'm interested in any other ideas.)

Problem 2 is fortunately much easier to address: "shutting down" is built right into RDP's reactive concept. Capabilities are implicitly revoked when a program stops sharing them. So if an evaluator stops evaluating a behavior, the system will properly clean up all downstream behaviors.

Problem 3 is challenging to do precisely, but easy to brute force.

If RDP was "pure" I could simply wait for inputs to change, then try again. But RDP is not pure; it has its own simple form of highly constrained effects and observations, some communications flowing orthogonally to the primary dataflow path. The sanity failure may partially belong to the external world, e.g. someone might be upgrading a dependency service. So unless we have strong reason to believe the failure is internal, we should recover automatically.

A viable approach is brute force: try again after some time has passed, leveraging some sort of heuristic or backoff algorithm to avoid burdening the system. If the problems happen again, back out again.
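
(A sketch of such a backoff policy in Haskell - the delays, the ceiling, and the tryReenableSubprogram action in the usage comment are all illustrative, not part of any existing API.)

    import Control.Concurrent (threadDelay)

    -- Retry a failed re-activation, doubling the wait after each failure
    -- up to a fixed ceiling, so we don't burden the system.
    retryWithBackoff :: Int -> Int -> IO Bool -> IO ()
    retryWithBackoff delayUs maxUs attempt = do
      ok <- attempt
      if ok
        then return ()
        else do
          threadDelay delayUs
          retryWithBackoff (min maxUs (delayUs * 2)) maxUs attempt

    -- e.g. retryWithBackoff 100000 30000000 tryReenableSubprogram
    --      starts at 0.1s and backs off to a 30s ceiling.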

A more interesting approach might be to record precisely the resources and queries that caused the failure before, and surgically prod just those using RDP's signal/cancel model - this technique could be very efficient, simple, and usable. The surgical approach could be made even lighter weight by heuristically and probabilistically testing a partially random subset of the external resources that might have been involved with the problem (if there was more than one).

This semi-precise polling approach seems promising, and the development of it should also result in more precise error descriptions.

One (very bad) idea I initially came up with for problem 3 was to model standard resources as having a sort of 'sign up to hear about updates' capability. However, this idea would result in stateful, fragile systems, and the 'sign-up' concept is difficult to compose (especially when a capability composes many services).

Anyhow, I think the use of these evaluator 'membranes' could offer most of the benefits of static typing without hindering dynamism where developers don't really trust certain subprograms.

Ross Angle

Sep 17, 2013, 2:04:41 AM
to reactiv...@googlegroups.com
Hi David,

This reply is extremely long, so good luck reading it!

Each time you sent another email, I just kept tacking it to the end of this one. I kept noticing that you were going down pretty weird paths, and I wonder if I could have persuaded you toward a different direction had I just hit send earlier. :-p


On Sat, Sep 14, 2013 at 3:21 PM, David Barbour <dmba...@gmail.com> wrote:

On Sat, Sep 14, 2013 at 1:08 PM, Ross Angle <rok...@gmail.com> wrote:

I tend to think of 'staging' this way: there is a static stage, then there is a dynamic stage. But, there are some dynamic behaviors, and they can have their static stage at runtime.

Oh, "static stage at runtime." So you'd alternate between a layer of continuous reactive side effects and a layer of code generation side effects. This seems a bit arbitrary to me, but it's also a reasonable-looking choice to make.

Well, I must admit: a richer staging model is looking more and more tempting. If I ever figure out how to do it while keeping it conceptually simple, it could become something very powerful. I think the sort of staging you've been describing could be a much better basis for PL/UI synthesis.

I was mentioning this to a non-programming friend on Friday.

oh, in slightly less recent news, I had a very productive conversation with David Barbour about programming languages

we've encountered some pretty amazing new ideas

like a programming language that would work with a clear physical metaphor, carrying items around and using them with each other

and (relatedly) a programming language that means that the things being done now are really just a blueprint for doing things later

I was trying to describe your main focus in the first part and my main focus in the second, but I finally tacked on "(relatedly)" because it really helped clarify both of our trajectories. :)



My intuition, at this point, is that Awelon's basic structure with controlled support for "program history editing" is almost powerful enough to be a basis for universal-scale reactive PL/UI unification. Unfortunately, the staging I've been contemplating seems to interfere! It hinders keeping snapshots and clearing unneeded history, and it makes a problematic semantic distinction:

* when users want a reactive effect, they use a live-programming or history-editing concept
* when anything else (sensors, databases, etc.) has a reactive effect, it uses time-varying signals

How frustrating, my model has such a harsh edge!

This edge is just fine. Not everything can be done in real time. Some actions on the world can take preparation time before they're worthy of being deployed, and that preparation will take the form of manipulating an abstract representation of the project (doing "program history editing" etc.).


But I wonder: What if...

* What if sensor inputs are also modeled as editing a program history?
* What if all agents and behaviors are modeled this way? 

If I understand correctly, you're talking about taking every higher-order API and littering it with a stateful protocol for program editing, just so that humans' stateful workspaces blend in with the rest of the system. Besides being complex, this would work against your own live programming vision, because it would make ahead-of-time workspaces the norm. (It seems you've figured this out in your latest follow-up.)



Conceptually, this idea shifts everything (including the act-of-programming) into a conceptual RDP runtime. In practical terms, I think it simply means my static values must also bind spatial-temporal information, such that they can model the distribution and locations of program fragments. (<-- for some reason, discovering that I might want even more precision, uniformity, and symmetry surprises me every time.)

Careful. In my multi-stage model, for every stage-preceding-a-stage relationship, the later stage lives within a *single* partition of the earlier one.

Fortunately, I don't expect to be able to serialize code if its type captures stage or partition labels. Most typed code will have to have a type that begins with some "for all stages..." and "for all partitions..." quantifiers. This means it'll be possible to combine code from multiple partitions after all. If all else fails, just do pointwise reasoning: Instantiate both fragments in a fresh local sub-stage, compose them, and generalize this construction to any stage.

 

I feel that the underlying nature of types is bound within these questions. A type is a way of describing what can and cannot be changed. A subprogram can safely be altered so long as no downstream agency is depending on its structure. An agent can protect its agency to alter a program by representing type information such that downstream agencies must code conservatively against it - e.g. we can say: don't assume this value won't change, just because it hasn't changed for the last few hours.  

A good UI/IDE should help users rewrite history in a way that protects dependents.

It's reasonable to think of types that way, but I recommend we "help users rewrite history" by making it so the thing they're exposing is a time-varying *immutable* behavior in the first place. They can still use a stateful workspace to help them change the behavior's implementation over time, but that statefulness is external to the network/API protocol.

Sirea's dynamic behaviors were like this, so I'm kinda encouraging you to go back to that. :-p

 

So getting back to staged programming: staging is modeled in terms of operations for constructing program fragments (behaviors), perhaps wrapping them up with acquired signals, and applying combinators (first, left). I should be able to drop `beval` entirely, and switch to dependently-typed 'static' behaviors as you seem to have been suggesting all along.

So, please, keep trying to make me understand this. I think it could be very useful.

You'll need something like `beval` in order to execute the program fragments at all.

The "static" behaviors in Underreact are expanded at the moment where untyped code is interpreted to become typed code. The multi-stage system, with the use of getSourceCode, would make it easy to construct typed code values in the first place.

Maybe I can clarify what these operations might look like, in broad strokes:

eval :
  -- For any type...
  (t : Type) ->
  
  -- ...for any serializable value of this type...
  Code t ->
  
  -- ...we can simply not serialize it.
  -- (TODO: Come up with a type signature that guarantees the output
  -- is equivalent to the input.)
  t

-- Given a string, we have a possibly nonterminating way to extract
-- either a parse error or a serializable value of some type.
-- (NOTE: The string is just one way we might represent untyped code.
-- We could also use binary data, collections of independently
-- developed modules, or a system of lisp macros, for instance.)
imbueType :
  -- For any expected type...
  (t : Type) ->
  
  -- ...for any piece of untyped code....
  String ->
  
  -- ...we have a possibly nonterminating, possibly failing way to
  -- extract a serializable value of that type.
  -- (TODO: Come up with a type signature that guarantees the output
  -- is equivalent to the String input.)
  -- (TODO: See if it would be useful to return a type-tagged value
  -- rather than taking an expected type upfront.)
  Partial (Maybe (Code t))

getSourceCode :
  -- For any type...
  (t : Type) ->
  
  -- ...if we can prove this type and its values have no extra
  -- dependencies...
  -- (TODO: Figure out how to implement `isolated`.)
  isolated t ->
  
  -- ...then we can serialize any value of this type.
  -- (TODO: Come up with a type signature that guarantees the output
  -- is equivalent to the input.)
  t -> Code t

Notice that these operations work even outside any stage or partition. If a program uses them in that way, they'll be applied during the `imbueType` step that's parsing that very program. If they're lifted into a particular stage and/or partition, they'll be applied then and there.

Sorry, my notation keeps drifting all over the place. In this example, I'm using ((_ : _) -> _) for dependent function types and (_ -> _) for non-dependent function types, like Agda. These (_ -> _) types correspond to the (_ ~~> _) types I was originally using to describe Underreact's first-class behaviors.
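
(A rough single-stage Haskell approximation of these three operations, as a gloss only: it drops the dependent types, the Partial effect, and the `isolated` proof obligation, but it shows the intended relationship between a value and its serialized form.)

    -- Serializable code paired with the value it denotes.
    data Code t = Code { source :: String, value :: t }

    -- eval: simply stop treating the value as serialized.
    eval :: Code t -> t
    eval = value

    -- imbueType: parse untyped code against an expected type. The parser is
    -- passed in as a stand-in; Maybe models parse/type failure, and possible
    -- nontermination is not modeled at all.
    imbueType :: (String -> Maybe t) -> String -> Maybe (Code t)
    imbueType parse src = Code src <$> parse src

    -- getSourceCode: recover the serialized form (the `isolated` side
    -- condition is what would make this safe in the real system).
    getSourceCode :: Code t -> String
    getSourceCode = source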


 

An interpreter for untyped code will have to deal with at least as many issues as an interpreter for typed code. Namely, both kinds of interpretation need to avoid instantaneous feedback loops through external resources. We might be able to address this with some explicit delay and/or a system of "tainted" types.

Ah, I see. I hadn't been considering the feedback loop issue.

Well, for context, you and I started talking about feedback loops through external resources a long time ago. I think you brought it up again when I talked about Underreact's termination guarantee:

~~


Any level of call nesting is possible, but there's no way for a (~~>) behavior to receive itself as a parameter.

I see. Even if you eliminate pure recursion or fixpoints, you still have demand monitors. Effects make it easy to dodge around such limitations. That's a cost of effects, and a benefit.

Yeah. Underreact currently only supports serializable data in demand monitors, but I know what you mean.

~~

Is [the feedback loop issue] the main reason you're using explicit stages for your constructs?

No, I think that's unrelated. You're making it sound like I would have one stage that executes the first part of a loop and then another stage that continues on from there. The way I imagine it, a later stage never executes unless an earlier stage decides it's time to eval that code, at which point it executes *during* the earlier stage.

We could build a framework that evals over and over a few times, but that's a special case.


 

I'm not sure if this is what you're pondering about, but... since Underreact doesn't have dynamic behaviors yet, it doesn't have any way to expand and retract the reactive node graph at run time.

Yes, that's also a big use-case for dynamic behaviors. RDP without some ability to programmatically manipulate the graph at runtime is only half the paradigm. Also, without such an explanation, it seems 

It seems you've only written half a sentence? :p



Many expand/retract operations could be handled by something simpler, though. Like:

      bforeach :: b x y ~> b [x] [y]

Yeah, that's interesting....

 




It's interesting that you're encountering a desire to have closures at the static level, but it makes complete sense because you're trying to have open-system composition at that level too.

Well, that's part of it. But I think the tipping point was when I realized I can provide static signal inputs to a dynamic behavior, i.e. like configuration variables. I imagine this could make high-performance dynamic behaviors quite feasible.   


Which of these type signatures is closest to what you mean by "provide static signal inputs to a dynamic behavior"?

The middle one is closest. But none of them are right! In all cases you listed, `y` would also need to be a static constant, and the behavior wouldn't actually do anything. (I'm assuming no encapsulation of signals in type Code.)

Closer would be:   

     Dyn (Code (Sta x * Dyn y ~~> z)) * (Sta x * Dyn y)  ~> z

Makes sense. Thanks for clarifying. :) And you're right that I wouldn't expect to encapsulate anything in type Code.


(I'm skipping your second email without comment and moving on to the third.)


On Sun, Sep 15, 2013 at 1:40 PM, David Barbour <dmba...@gmail.com> wrote:
Another update.

It seems statics don't really have a clear spatial location except for "wherever this program fragment is physically represented", which is currently implicit in the RDP model. I'm beginning to think I should formalize some of the 'small-step semantics' for how a program is rewritten into a lot of smaller program fragments for distribution. 

Good luck. At this point I really don't expect either of us to implement something that has the full power of the multi-stage model I'm describing, so let's not be too ambitious. I don't expect it to be all that much better than a single-stage RDP with dynamic behaviors, so it'll be fine if we keep shooting for that.

To really complete the multi-stage model, I expect we'd have to take several difficult steps:

  1. Start with a sound static type system.
  2. Add staging in this impure style we've been discussing. This will probably mean adding dependent typing, adding tweaked versions of observational equality to formalize index completions, and maybe even adding effect types to account for the code generation side effect. (Or maybe linear types would work for that side effect! Ooooh...)
    1. Add getSourceCode and eval operators so our stages can interact with each other.
  3. Add partitions. This will probably mean putting complex side conditions on several of our primitive operations so that they don't implicitly communicate between partitions.
  4. Implement a concrete infrastructure that uses partitions to represent individual devices (or simulated devices) on a network.
  5. Add linearly typed resources on the network. Obviously, we'll have to add linear types for this if we haven't done so already.

I just Googled [location type theory], and apparently we might benefit from doing some reading (and really dense reading, wow):


With any luck, maybe these modal logics will avoid some of the complexity of Underreact's provisos, or at least give us clear guideposts for managing it. Giuseppe Primiero's paper looks like it might come closer to what we're doing, with our explicit use of location labels.

...As I indicated just above ("Ooooh..."), I just thought of using linear types to account for the code generation side effect. It's not that we can't use contraction and weakening for later-stage values, but when we *do so*, it's not quite a free operation. As a side effect, we record our use of contraction or weakening onto that stage's program history. So we might be able to harness the code generation side effect by handling our values with linear types at certain points in our code, maybe like this:


getSourceCode :
  -- For any input and output types...
  (x : Type) -> (y : Type) ->
  
  -- ...if we consume an isolated subprogram...
  (Hose x -o Hose y) -o
  
  -- ...then we get a serializable program.
  -- (TODO: Come up with a type signature that guarantees the (Code (x -> y)) is
  -- equivalent to the (Hose x -o Hose y).)
  Code (x -> y)

We might have a special syntax that takes a program of type (x -> y) with no free variables and returns a program of type (Hose x -o Hose y).

That was last night. This morning I tried ironing out the formalism a bit, and I'm not sure it's working out. I don't have the hang of this linear type approach yet, so I'm having trouble getting a foothold on any one starting point. I might have to abandon linear types just to constrain my creativity and get myself back to a familiar basis.



With regards to PL/UI synthesis, the 'isolation' of statics is problematic. 

We cannot treat extension and expression of the program as having clean staging, a barrier to changes from the outside world; instead, we must build on the world as we can observe it, accepting some risk that our programs will break when the world changes, and developing more adaptive/flexible programs to mitigate this risk. I think dependent types won't scale to a system where a million user/programmers are adjusting and extending their code in real-time. 

(It's odd that you throw in dependent types at the end, as though dependent types themselves imply all the properties of the multi-stage systems we're discussing.)

I would agree only if you mean "we cannot treat ... as having ... a barrier" as a guideline, but not as a rule. There are many cases where a barrier seems natural to me, and therefore where clean staging could come in handy:

  • For RDP, you're already modeling a formal barrier between one partition and another.
  • Humans set up soft barriers to clarify their personal identities and rights.
  • Technology has a soft barrier where one side is the reusable tool and the other side is any of the contexts where it's useful.
    • Mathematics produces technology where that barrier is very precisely engineered for maximal reusability.
  • I think when we make a decision, that decision has a barrier proportional to its importance and subtlety, and inversely proportional to its urgency: One side of the barrier is our full range (subtlety) of choices and all their external consequences (importance), and the other side is all the internal simulation we can do before we're forced to take action (urgency).

Personally, I'm interested in making a system that can represent hard barriers, just so people don't struggle in vain to harden their barriers past the expressive limits of the system.

Even if people do choose to specify hard barriers, I expect these barriers to "break when the world changes" under many conditions: The hard spec may be dissatisfying for the authors' own purposes, the spec's implementation may have unaccounted-for qualities (e.g. susceptibility to hardware sabotage), or the world may have moved on to some other kind of technological infrastructure. For instance, Facebook and its app developers coordinate their efforts over an elaborate, precise(-ish?) API, but a Facebook app developer is in a tough spot if Facebook has trouble holding up their part of the API bargain or if Facebook's audience has migrated to other platforms.



When I initially considered how PL/UI synthesis might scale to multiple users, I was thinking in terms of DVCS, with a bunch of programmers sharing code transactionally. In retrospect, this is a terrible idea: it defeats most of the benefits of RDP.

I agree. >.>



Right now, Awelon applications take two arguments: (1) a powerbox, (2) an 'activity' signal starting in some application. I believe that I can model static interactions through capabilities in the powerbox, and I can remove the 'activity' signal. If developers want to "turn off" their application (or any subprogram) it is no longer modeled externally (by powering down a staged activity signal); instead, it is treated as an act of live programming.

This seems kinda sorta reasonable. You're using two stages: One is an expressive stage which needs a lot of dynamic checking, and the other is a stage where programs have a rich static structure that lets them distribute across a network without latent errors. It makes sense to do all programming in the most expressive stage, and therefore to build modularity features that support such a habit.

On the other hand, I wonder if it's really advantageous to use those two stages rather than a single-stage RDP system with dynamic behaviors. Can't we do all the dynamic checking in a convenience library that helps us build dynamic behaviors? (Maybe that isn't efficient enough?)



One area where it might not work out so well is the issue of 'choice' and conditional expressions. The relevant issue is:

* If a capability that influences a resource is used in a choice... (e.g. in just one branch)
* What shall other software agents and resources observe?
* Can I effectively control the number of 'potential' signals an agent must deal with?

The way I've been looking at it, they won't observe anything until you eval the generated program. Even if you consider it to be eval'ed implicitly, a metacircular implementation would eval it explicitly at some level.

If you're not only using an implicit eval, but you're trying to perform impure dynamic computations incrementally as the impure static computations go along, that could be much more complicated. The multi-stage system is complicated enough, so I kinda don't even want to think about that. XD



Getting back to the notion of partitions and modeling agents within agents, it seems that it should not be difficult to have distinct concepts for isolated effects - a different logical 'this' partition for each agent (perhaps leveraging an affine uniqueness source). Under these conditions, I could perhaps track the greatest scope any behavior or capability might directly affect. Purity is just a local partition without resources. This could potentially enable a staged compilation based on isolation. The whole notion of 'partitions' must be internalized within RDP (as opposed to being treated as 'external' to the program, which is what I have been thinking). So every program is modeling logical internal partitions.

Then we simply stretch the idea so the entire Internet is conceptually internal to RDP. :)

I'm not sure what meaning you attribute to the word "internal" in that case.

I'd say the meaning of externalizing state is that it's no longer under the program's exclusive access, so it's available for debugging, mashups, and so on. (As a great bonus, it also means the program is unlikely to be construed as an independent moral agent, so we avoid some ethical complexity.)

If we can dynamically model partitions which correspond to external machines, the machines are still external, in the sense of being accessible in a way that circumvents our specified program semantics.



To use capabilities to model communication between RDP agents or to resources modeled within an application... I've been thinking a bit about how to accomplish that. Do I create some state and share it? Do I model some kind of continuous channel leveraging linear types? I've a few ideas. The only critical issue is that I must model this sort of connectivity as a 'side-effect', enabling a dimension of composition for concurrent (independently developed and maintained) behavior.

At this point I don't understand how anything you're saying goes beyond single-stage RDP with dynamic behaviors.


(Now for the fourth email.)


On Sun, Sep 15, 2013 at 7:38 PM, David Barbour <dmba...@gmail.com> wrote:

But smooth failover, logical time, and latency control are insufficient.

When a program fails due to some *internal* issue (e.g. a type violation), an RDP program would just be continuously shoving its insanity into the world. What I need to do is somehow disable the problematic subprogram until conditions change. This presents challenges:

1) How do I identify the scope of "the problematic subprogram"?
2) How can I precisely shut down "the problematic subprogram"?
3) How do I determine when conditions change to try again?

I recommend modeling the data-control flow for this, by using static types. If a subprogram could fail, it'll have to account for that possibility in its output type. If a program is ill-typed, we won't be able to execute it in the first place. (The imbueType operation will fail.)

This doesn't mean programmers always have to deal with fully precise static typing. They can model things with more coarse-grained types, or even a fully untyped approach, as long as they're willing to handle dynamic errors at every turn.
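
(For example - with illustrative names, not an Awelon/Underreact API - putting failure into the output type looks like this:)

    -- A subprogram that might fail says so in its output type...
    data Failure = TypeViolation String | ResourceUnavailable String
      deriving Show

    riskyQuery :: Int -> Either Failure String
    riskyQuery n
      | n < 0     = Left (TypeViolation "negative index")
      | otherwise = Right ("result " ++ show n)

    -- ...so every caller is forced to decide what happens on failure.
    useQuery :: Int -> String
    useQuery n = either (\_ -> "fallback") id (riskyQuery n)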



For problem 1, I want some way of declaring: "this subprogram will be an independent unit-of-failure". 

I've thought of a few ways to express this. I'll list them worst to best.

* Treat the declaration as a manipulation of the powerbox. 
* Treat the declaration as a modifier for a block. 
* Treat the declaration as a new primitive combinator.
* Model the declaration as an evaluation capability in the powerbox.


Technically the evaluation capability is what I have in mind, but I also don't intend to run programs unless they're "sane" throughout.

(Looks like you commit to this option by the time you talk about problem 2. "So if an evaluator stops evaluating a behavior, the system will properly clean up all downstream behaviors.")


Problem 3 is challenging to do precisely, but easy to brute force.

If RDP was "pure" I could simply wait for inputs to change, then try again. But RDP is not pure; it has its own simple form of highly constrained effects and observations, some communications flowing orthogonally to the primary dataflow path. The sanity failure may partially belong to the external world, e.g. someone might be upgrading a dependency service. So unless we have strong reason to believe the failure is internal, we should recover automatically.

A viable approach is brute force: try again after some time has passed, leveraging some sort of heuristic or backoff algorithm to avoid burdening the system. If the problems happen again, back out again.

Ah, that's an interesting problem. Evaluation potentially takes a lot of computation to perform, and at this point we're even talking about modeling it as an external resource that takes an unknown amount of real time. We'll want to be extra careful about triggering reevaluation.

Notice that I'm proposing imbueType, a pure operation, which just translates untyped code to typed code. I think this covers all the worst parts of eval, and at the same time it means you can avoid dealing with weird side effect semantics. This step does all the parsing and preprocessing, and afterward, we can reason about resource usage almost as though we're dealing with a single program again.

I don't really have any feedback to address your particular concern that "The sanity failure may partially belong to the external world." I think I do understand what you're going for: If we're interpreting code that's combined from internal and external sources, then it'll inherit the instability of each source, and it would be nice if we could filter out external updates that are irrelevant to our internal code. If there is a good solution here, I think it will have a lot to do with the way we choose to represent untyped code. (For instance, I would probably represent it using Era modules to achieve other goals, but I don't think that would actually help here.)

As far as brute-force heuristics go, a very specific design does come to mind:

If I were building this right now, I would redesign imbueType so that it's an external RDP resource with a demand capability (String -> ()) and an observation capability (() -> Maybe (String * Maybe (Code t))). (The output contains its associated input value so we can tell how out-of-date it is.) This external resource would behave according to the following algorithm:

Whenever the input changes, execute it as a new thread. Whenever we're executing more than two threads, kill all but the oldest, the newest, and maybe a few random ones in between. Whenever a thread completes, update the output to be (Just (input * threadResult)), and kill any threads older than this one.

(Boilerplate: Ignore all inputs except the first lexicographically. The output starts out as Nothing, and it remains stable until we update it. Whenever the input is inactive, kill all threads, and update the output to be Nothing again.)

(Optimizations: If the input changes, but we already have a thread for that input, just promote that thread so it counts as the newest thread. If the input changes, but the output already shows the result for this input, just kill all threads.)

Thanks to the survival of the oldest thread, long-running evaluations will eventually be able to respond even if the input changes frequently. Thanks to the survival of the newest thread, a short-running evaluation will be able to knock out any long-running evaluations that have overstayed their welcome. If the oldest evaluation is taking forever and the newest threads are replaced too frequently to get a word in edgewise, then a thread in between may help break this pattern.
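
(A pure sketch of just the culling rule - randomness and the actual thread bookkeeping are omitted, and "a few random ones in between" is reduced to a single middle thread:)

    -- Given threads ordered oldest-to-newest, keep the oldest, the newest,
    -- and one representative in between.
    cullThreads :: [t] -> [t]
    cullThreads ts
      | length ts <= 2 = ts
      | otherwise      = [head ts, ts !! (length ts `div` 2), last ts]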

 

A more interesting approach might be to record precisely the resources and queries that caused the failure before, and surgically prod just those using RDP's signal/cancel model - this technique could be very efficient, simple, and usable. The surgical approach could be made even lighter weight by heuristically and probabilistically testing a partially random subset of the external resources that might have been involved with the problem (if there was more than one).

This semi-precise polling approach seems promising, and the development of it should also result in more precise error descriptions.

Oh yes, that does sound like a nice way to filter.



Anyhow, I think the use of these evaluator 'membranes' could offer most of the benefits of static typing without hindering dynamism where developers don't really trust certain subprograms.


I believe it!

As I've been making clear above, I was already on track for this approach, thanks in large part to this very discussion we've been having. :-p

Warm Regards,
Ross

David Barbour

Sep 17, 2013, 3:20:35 AM
to reactiv...@googlegroups.com
Staged IDEs and Coding; User-Defined Language and Syntax

first a quick recap & the vision so far

For PL/UI synthesis, my current vision is that each 'agent' (human or otherwise) has control of an RDP subprogram that continuously observes and influences the world, and action is conceptually modeled by live-editing this subprogram. The actual implementation for sensors, state, etc. may not involve editing such a program, but the result should be compatible with that interpretation. The ability to observe and influence the world is determined by use of capabilities.

There would be an initial set of capabilities - i.e. a 'human rights' package. In addition to network and personal-cloud access, this set would include features like a linear-typed uniqueness source, which enables developers to create 'new' initially exclusive state (after giving it a stable identifier). Since this state is initially exclusive, there may be utility in hoarding it or sharing it with other humans and devices. 

In an AR, VR, or telepresence environment, agents (human or otherwise) may implicitly grant capabilities based on proximity, eye contact, or social interactions. Of course, this set would be configurable, and some of the granted capabilities may be revoked once the conditions are no longer fulfilled (e.g. acting like a personal area network). Using such capabilities, people could share values, information, objects, and views of their environments.

RDP is primarily designed for long-running behaviors. An RDP program would maintain concurrent relationships with people, services, and devices - potentially thousands of them, built over a user's lifetime.  However, even 'volatile' actions can be modeled in terms of 'temporary' live program edits, and may be useful if they adjust a stateful resource (like turning on a light, or turning up the volume). RDP doesn't do events (a very good reason for this is that events are difficult to undo or cancel in open systems, compared to signals), but  a temporary program can often serve the same roles.    

Last week, my vision of RDP included a staged 'go signal' that clearly delineated the compile-time from the runtime. 

Yesterday, I made the decision to eliminate that stage. The benefit of doing so is that it allows me to more directly model interaction between "partially edited" code by different agents. Without this sort of static interaction, I kept feeling that the PL was too isolated from the world to also serve as an effective UI - i.e. you can't really see the results as you edit because the observable changes belong to a future stage. 

But there are also costs of eliminating this isolation: we can also build stronger interdependencies between agents. If one system breaks, it can break those depending on it. To address this, users must establish proper failure boundaries. For the technical aspect of those boundaries, I plan to use an 'executive' capability. For the more social/didactic aspects, I think visualization (e.g. coloring objects based on established trusted-path relationships and frequency-of-update) may be a very effective mechanism. 

I initially believed that the loss of a compile-time might also hurt my ability to reason typefully about my code. Even if this were true, it wouldn't be a terrible loss, seeing as RDP has plenty of protective padding in the form of capability security, sealer/unsealer patterns, and RDP's time-warp approach to anticipating and preventing errors before they occur (and disabling the problem program until conditions change). Further, the ability for users to reason about code more concretely and get immediate, real-world feedback would probably be better than type-safety. But I'm starting to think that I can still get most of static typing for my code, so long as I include some sort of 'static assertion' model which may statically describe my assumptions and enable me to validate some of them and color others in red.

But my vision is incomplete: I still need to clearly understand the editing and visualization processes.

STAGED PROGRAMMING

Stage Zero: The Program of a Lifetime

In Awelon, all static value manipulations can be described strictly in terms of appending to a concatenative program - i.e. one built on generalized arrow primitives like: swap, assocl, rot3, intro1, elim1, first. Plus a few literals. Above these primitives, as part of manipulating the pure value, I can also build concepts like: navigation, hands, tools, macro extensions, and so on. So long as these manipulations are "pure" - i.e. if we don't supply a powerbox quite yet - then we could even forget the history by which we extend the program, perhaps keeping snapshots in an exponential-decay-of-history model to support review and undo.

This model could easily last a full human lifetime of continuous extension. What I'm going to assume is that users (potential programmers) are given access to physical devices that consistently turn human actions into meaningful actions on a program. For purposes of my PL/UI synthesis, this pure value manipulation at the lowest layer will be the stage zero program. Stage zero is still formalized as a program for the following reasons:

* easy to recognize patterns (which reduce to sequences of Awelon words)
* easy to start anticipating user actions on this program (Markov models, ML)
* supports recording and reviewing at a semantic layer, rather than a syntactic one
* supports programming-by-example, and construction of macros
* macros and tools for the "pure" stage zero will also be useful in higher layers

At any instant in time, the stage zero program is just a pure Awelon value, containing four kinds of things: blocks, numbers, text, and pairs. Technically, it also contains the `(x + y)` sum types as well, but those would usually only have a temporary existence within a macro. There would be some default visualizations for stage zero, but these might later be augmented by 'lens'-based blocks.

The most important concern for stage zero is: how do we interpret the stage-zero value into a stage-one program?

Stage zero is the lowest layer for live programming, but I imagine it will also be extremely personal. Each user may have his or her own private languages and dialects for stage zero - different ways of thinking about the world and organizing it, different toolkits. This language may be a mix of graphical and textual. People can, of course, easily share their values and macros, tips and tricks and ways of thinking, creating subcultures. 

(I'm not asking people to think in tacit concatenative programming; I just think it's a promising foundation for PL/UI synthesis.)

But regardless of its personal meaning, the value must also be interpreted into a program. And, honestly, I'm not entirely sure of the best way to do so. One possibility is that a linear-typed block be held in a fixed relative location in the environment (i.e. such that it is not disturbed by navigation), that this block provide a little bootstrapping (how to interpret some text into blocks) and that by some means we build up a set of parser-combinators on the larger data structure. 

The basic Awelon environment consists of a pair (s*e), with literals (numbers, blocks, etc.) being added to `s` in order to keep `e` in a stable location. This was mostly to optimize text programming - writing `7 assocl "foo" assocl` etc. just to maintain access to `e` would quickly grow annoying. A more graphical environment doesn't have this issue, so it can inject new literals wherever the designer pleases. Either way, we can provide an invariant location for the aforementioned block.
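
(To make this concrete, here is a toy interpreter for a few of those primitive words over a simple value model. It's a sketch only: the orientation of intro1/elim1, the handling of literals, and the treatment of ill-typed uses are assumptions, not Awelon's actual specification.)

    data V = N Rational | T String | B [Op] | P V V | Unit

    data Op = Swap | Assocl | Intro1 | Elim1 | First [Op] | Lit V

    step :: Op -> V -> V
    step Swap      (P a b)       = P b a
    step Assocl    (P a (P b c)) = P (P a b) c
    step Intro1    v             = P Unit v           -- assumed orientation
    step Elim1     (P Unit v)    = v
    step (First f) (P a b)       = P (run f a) b
    step (Lit x)   (P s e)       = P (P x s) e        -- literals land on `s`
    step _         v             = v                  -- ill-typed use: no-op here

    run :: [Op] -> V -> V
    run ops v = foldl (flip step) v ops

For example, run [Lit (N 7), Lit (T "foo")] (P Unit Unit) leaves both literals stacked on `s`.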

Interestingly, I don't need a separate module system. Long term, Awelon might bootstrap away its textual representation, leaving it primarily to analyzers. That might be neat.

Stage One Programming: Observation and Influence

I'm assuming users get their capabilities at this point. It's important that we're able to control the total size of the program: a correctly written RDP program is bounded in time and space, meaning it takes a bounded amount of time for a change at one end of the program to percolate to the other. In an ideal world, the time bound might be zero (instantaneous propagation) but formalizing a larger time-bound helps model real-world communication delays, and helps stabilize feedback loops. Developers don't need to inject delays by hand: capabilities may reactively adjust their own delays - e.g. long-distance communication might try to choose a latency that is just slightly conservative based on actual observations, to ensure consistency and support a more logical model. In a practical program, the time bound is probably a few seconds (and bounds can be enforced). The spatial model might grow pretty large, too, depending on the number of relationships or the richness of the modeled environment.

The idea I have is that users can conceptually 'enter' their stage-one model.

Actually, that isn't quite right. It's more that stage one or two is where users normally live, and users can 'exit' the stage-one model to back up and see their stage-zero program. Or maybe the stage-zero output and program is always observable, but more in the background. (The visualization modes should also be modeled.)

Regardless, users will have various ways of visualizing the resulting value of their stage-one program, and of manipulating it. Just like with the stage-zero program, users can develop an extended vocabulary for such manipulations. The difficulty is that this stage is also entangled with external state, so some of these manipulations may be fragile to changes in that state. But as I mentioned earlier, a robust vocabulary of manipulations, plus use of visualization to help users understand relative stability and trust, should help mitigate this fragility.

The ability to navigate and extend the stage-one program will have corresponding actions for extending and manipulating the stage-zero program. Actually, how these manipulations are applied will need to be carefully considered when I design how the stage-zero program is compiled into a block.

Stage Two Programming: Sharing and Interaction

In stage one, developers can begin to share and gather capabilities. One thing they can do with these capabilities is build a bigger powerbox, apply value transformers, and begin to construct another program - one built upon the shifting sands of time-varying state. The output of doing so is the new stage two. Developers can then 'enter' stage two just as they did with stage one.

I believe stage two enables a more direct representation of user interactions and negotiations. 

In stage one, the underlying program obeys a closed set of simple rules to form a spatially-temporally bounded value - a sort of fixed potential structure. To reach stage two from stage one, we model a full interpreter, just like the one we modeled to get from stage zero to stage one. We might use the same interpreter.

After we 'enter' stage two, the values we see will contain the result of interpreting state. Any direct manipulation (e.g. use of the 'hands' to pick these up) will only be robust if the underlying state and other inputs are stable. In general, I think stage two would encompass higher stages unless we also enforce staged state models. But there isn't any big technical limit on how many stages we build, so long as they have a stability much greater than their latency.

David Barbour

unread,
Sep 17, 2013, 3:22:38 AM9/17/13
to reactiv...@googlegroups.com
Heh, I was in the middle of writing my message I sent just now when I received yours. I'll reply after some sleep. :)

Best,

Dave



David Barbour

unread,
Sep 17, 2013, 2:28:57 PM9/17/13
to reactiv...@googlegroups.com
On Mon, Sep 16, 2013 at 11:04 PM, Ross Angle <rok...@gmail.com> wrote:

* when users want a reactive effect, they use a live-programming or history-editing concept
* when anything else (sensors, databases, etc.) have a reactive effect, they use time-varying signals

How frustrating, my model has such harsh edges!

This edge is just fine. Not everything can be done in real time. Some actions on the world can take preparation time before they're worthy of being deployed, and that preparation will take the form of manipulating an abstract representation of the project (doing "program history editing" etc.).

The problem with this harsh edge is that, for a PL/UI synthesis, I think it would be very awkward to force everything into the mold of 'prepare then deploy'. I think some actions need to be real-time. 

Naturally, the act of constructing a tool or object (function, block) to apply later can also be modeled in such a system. 
 

But I wonder: What if...

* What if sensor inputs are also modeled as editing a program history?
* What if all agents and behaviors are modeled this way? 

If I understand correctly, you're talking about taking every higher-order API and littering it with a stateful protocol for program editing, just so that humans' stateful workspaces blend in with the rest of the system.

These two points hold even now. They don't require littering APIs with stateful protocols. They do require modeling the act of programming. 

But the idea is that (a) humans should fit nicely alongside each other and software agents with regards to real-time programs in a near-continuous state of editing, and (b) a human developer should be able to represent, within their program, concrete actions based on actually observed real-world state and relationships. 

I think in a good PL/UI synthesis, it's often pointless to handle all potential conditions, because the human is free to edit his or her program when conditions change. And, of course, also free to add new conditional options to it to make the program more robust/stable to such changes in the future. (Keeping good state histories for review would probably be very useful.)
 

Careful. In my multi-stage model, for every stage-preceding-a-stage relationship, the later stage lives within a *single* partition of the earlier one.

Yeah, I think the same might fall out naturally in this case. In a sense, each human or software agent can only directly manipulate code in a 'self' partition. That code, of course, may also push some more code up to a cloud. 

This is relatively easy if:

* a block is a value that lives in only one partition
* a block can only be composed with a block in the same partition
* serializable or mobile blocks are effectively forall partitions

Though a block can still require arguments from multiple partitions, and can still output values to many partitions, it can only exist within one partition. 
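
(A rough sketch, in Haskell, of how those three rules might be expressed - the partition names and the `Block`/`compose` identifiers are purely illustrative, not part of any existing design:)

    {-# LANGUAGE DataKinds, KindSignatures #-}

    data Partition = Local | Cloud

    -- a block is tagged with the single partition it lives in
    newtype Block (p :: Partition) x y = Block (x -> y)

    -- composition is only defined for blocks sharing a partition `p`
    compose :: Block p x y -> Block p y z -> Block p x z
    compose (Block f) (Block g) = Block (g . f)

    -- a serializable/mobile block is polymorphic in its partition,
    -- i.e. effectively "forall partitions"
    mobileId :: Block p a a
    mobileId = Block id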

ASIDE:

Thinking about serializability has just now reminded me about work I did a few years back (with the actors variations preceding RDP) regarding 'mobility' annotations on code. Basically, it reduced to constraints of the form: "if you can prove you can acquire sealed value foo", and perhaps a converse proof: "if you can prove you can unseal this challenge". In many cases, using sealer-unsealer pairs to control serialization may be both cheaper and more effective than using them to control encryption (an attacker must handle a time-varying challenge in real time, rather than mount a brute-force offline attack).

This gives me an old idea to incorporate! Yay! 

Up to now, I've been treating sealed values as atomic entities (all you can do is unseal them). But maybe I should LOGICALLY represent the full gamut of homomorphic encryption and Chaumian blinding... without worrying about whether it can be directly implemented using encryption, since I can achieve similar benefits by simply controlling serialization.

    % deconstruction for data plumbing
    sealedJoin   :: (Sealed u x * Sealed u y) -> (Sealed u (x * y))
    sealedJoin+  :: (Sealed u x + Sealed u y) -> (Sealed u (x + y))
    sealedSplit  :: (Sealed u (x * y)) -> Sealed u x * Sealed u y
    sealedSplit+ :: (Sealed u (x + y)) -> Sealed u x + Sealed u y
    sealedApply  :: (Sealed u ((x->y)*x)) -> Sealed u y

    % also, we can do zero-information manipulations without the sealer
    sealedIntro1 :: Sealed u x -> Sealed u (1 * x)
    sealedElim1  :: Sealed u (1*x) -> Sealed u x
    sealedIntro0 :: Sealed u x -> Sealed u (0 + x)
    sealedElim0  :: Sealed u (0+x) -> Sealed u x

Anyone with the sealer can push stuff into the barrier for operating on a sealed value (subject to type-safety). In some sense, 'sealedApply' might be treated as an instruction to be performed later by the unsealer. Using multiple distinct sealers could further control against blind manipulation. With this design, the idea of 'Sealed' is closer to a membrane or barrier than to PKI encryption.
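
(For intuition, a plain-Haskell toy model of this membrane reading - ignoring partitions and signals, and ignoring the fact that the real mechanism is about controlling serialization rather than anything Haskell's type system enforces; the constructor would simply not be exported:)

    newtype Sealed u x = Sealed x   -- constructor kept private

    sealedJoin :: (Sealed u x, Sealed u y) -> Sealed u (x, y)
    sealedJoin (Sealed x, Sealed y) = Sealed (x, y)

    sealedSplit :: Sealed u (x, y) -> (Sealed u x, Sealed u y)
    sealedSplit (Sealed (x, y)) = (Sealed x, Sealed y)

    -- "an instruction to be performed later by the unsealer"
    sealedApply :: Sealed u (x -> y, x) -> Sealed u y
    sealedApply (Sealed (f, x)) = Sealed (f x)

    -- only the holder of the unsealer capability gets this function
    unseal :: Sealed u x -> x
    unseal (Sealed x) = x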

As a side benefit, sealed values now have fewer responsibilities. In particular, they no longer model the responsibility of "tight coupling". This creates more space for something like a dedicated encapsulation type, e.g. for potentially modeling 'objects'. If I still need them.


Fortunately, I don't expect to be able to serialize code if its type captures stage or partition labels.

If the block type itself captured partition labels of its inputs and outputs, that would be a problem. 

I think it would be okay to lift any atomic value to a block of the form `1->value` in the same partition, and to compose blocks that are in the same partition. This is a highly constrained form of closure/capture that I (without having proven it) think would be safe for blocks.
 
 
I recommend we "help users rewrite history" by making it so the thing they're exposing is a time-varying *immutable* behavior in the first place. They can still use a stateful workspace to help them change the behavior's implementation over time, but that statefulness is external to the network/API protocol.

Let me know what you think of the design in my previous e-mail. 

From the stage-two programming described above, you can probably see that I seriously believe that building a behavior *above* the unstable sands of time-varying state is essential to the unification of PL/UI. But stage-zero behavior is pure (and one-dimensional: append/undo/review only), while its output is the stage-one program. The stage-one program incorporates real-time observations in a relatively static manner, and optionally uses these to build a stage-two program.
 

The "static" behaviors in Underreact are expanded at the moment where untyped code is interpreted to become typed code. The multi-stage system, with the use of getSourceCode, would make it easy to construct typed code values in the first place.

Maybe I can clarify what these operations might look like, in broad strokes: [..]

Thanks, that code (and especially the comments) helped me understand what you're doing. Your 'imbue' was big-step; my intuition is that if you made it small-step (e.g. using concatenative bytecode) you should be able to get your proof of equivalence in a dependently typed language.

I finally grok `getSourceCode`. Yay!

I think you cannot actually `getSourceCode` for most interesting types, though, such as a pair of integers in two different partitions. But I also think the only place it's important to getSourceCode is a block...


Ah, I see. I hadn't been considering the feedback loop issue.

Well, for context, you and I started talking about feedback loops through external resources a long time ago. I think you brought it up again when I talked about Underreact's termination guarantee:

It's so hard to keep this in my head all at once. RDP is simple, but the list of requirements that went into its design was very long.
 

The way I imagine it, a later stage never executes unless an earlier stage decides it's time to eval that code, at which point it executes *during* the earlier stage.

I like that way of imagining it.
 



Many expand/retract operations could be handled by something simpler, though. Like:

      bforeach :: b x y ~> b [x] [y]

Yeah, that's interesting....

For some reason, the tone I get from your words is sarcasm. Maybe it's the ellipses. Do you actually find this boring? :)

For the Sirea-like staged model, I felt that collection-oriented operations would provide a useful balance between dynamic behavior and performance. For the updated design, I don't think they're as necessary (since I can model time-varying lists and vectors using (*)), but even so, a type and operators for dynamic-sized collections might offer serious optimization benefits. 
 

To really complete the multi-stage model, I expect we'd have to take several difficult steps:
  1. Start with a sound static type system.
  2. Add staging in this impure style we've been discussing. This will probably mean adding dependent typing, adding tweaked versions of observational equality to formalize index completions, and maybe even adding effect types to account for the code generation side effect. (Or maybe linear types would work for that side effect! Ooooh...)
    1. Add getSourceCode and eval operators so our stages can interact with each other.
  3. Add partitions. This will probably mean putting complex side conditions on several of our primitive operations so that they don't implicitly communicate between partitions.
  4. Implement a concrete infrastructure that uses partitions to represent individual devices (or simulated devices) on a network.
  5. Add linearly typed resources on the network. Obviously, we'll have to add linear types for this if we haven't done so already.


I plan to start with partitions early on. When I start without them, I am ineffective at designing with the idea of adding them later. Actually, I think this design-order network effect might be one of the reasons partitions haven't been done well in other PLs. 

Substructural types are much easier to add later. Though, I think they only make sense for block types. I don't really understand what it might mean to have 'linear' text or numbers with respect to, say, concatenation or multiplication (does that consume a number? is multiplication even a valid operation on an affine type?). With blocks, substructural typing for composition is easy to grok (the block obviously hasn't been used yet), and there is a very clear concept of 'using' a block. 

 
I just Googled [location type theory], and apparently we might benefit from doing some reading (and really dense reading, wow):

Oh, nice! I'll definitely give these a try.


...As I indicated just above ("Ooooh..."), I just thought of using linear types to account for the code generation side effect. It's not that we can't use contraction and weakening for later-stage values, but when we *do so*, it's not quite a free operation.

I'm finding many uses for substructural types. Keeping all drop and copy operations explicit was and remains the current plan for Awelon. With the somewhat more dynamic design Awelon has now, I think this is even more critical.



We cannot treat extension and expression of the program as having clean staging, a barrier to changes from the outside world

I would agree only if you mean "we cannot treat ... as having ... a barrier" as a guideline, but not as a rule. There are many cases where a barrier seems natural to me, and therefore where clean staging could come in handy.

Oh, without a doubt. 

What I meant (in context) was: for the specific purpose of PL/UI synthesis, forcing this staging is a serious problem. Essentially, people are forced to include conditions in their code for all possible worlds instead of the one they're actually looking at and living in, i.e. because they're dealing with 'future' values instead of 'present' values. For a PL/UI synthesis, I believe we must have the ability to manipulate the program based only on concrete, actually observed conditions, without always thinking "but what if the world wasn't this way, what then?" just to satisfy a type checker. The cost is that our programs become more fragile to unexpected changes in real-world state.

Of course, it's perfectly reasonable that the visualization highlight the conditions that aren't handled, so users are at least aware of concerns.

But even within such a system, we could develop subprograms/blocks whose behavior doesn't depend on any external state. As you mention, when applied to the real world, these blocks can still be fragile because they don't account for changes. 
 
 

I'd say the meaning of externalizing state is that it's no longer under the program's exclusive access, so it's available for debugging, mashups, and so on. (As a great bonus, it also means the program is unlikely to be construed as an independent moral agent, so we avoid some ethical complexity.)

It seems to me that "external" and "exclusive" are orthogonal - i.e. we can have (initially) exclusive capabilities to external state, and we could model construction of these bindings via substructural types. 


I plan to provide a uniqueness source to every powerbox for humans. Humans, then, could model for themselves some user-agents that might grow some independent moral awareness and capacity. :)

Yet, we still benefit from the state being formally 'external' with regards to runtime update, orthogonal persistence, auditing, etc..


I recommend modeling the data-control flow for this, by using static types. If a subprogram could fail, it'll have to account for that possibility in its output type.

The problem with this design is that the 'failure' is too fine-grained, at the subprogram level. 

Instead of dealing with coarse-grained atomic success vs. atomic failure, users have to comprehend all the different ways a program could "partially" fail. And, in general, there are a lot more ways for things to go wrong than for things to go right. The problem with partial failure is that you've also partially succeeded, and you don't want to be left in this 'limbo' state, so you're forced to handle fallbacks locally. Error handling in a system with partial failure tends to become the bulk of the programming experience.

Conversely, probing the system explicitly to prevent partial failure deeper in the program (the 'gatekeeper' pattern) has the problem of not being compositional. Developers are forced to explicitly build a set of pre-tests, and the data gathered in those tests is no longer directly accessible for use.

For the purpose of PL/UI synthesis - i.e. regarding the 'user experience' - even being asked on every action to account for how subprograms might fail due to states that aren't observable in the real world seems difficult to accept. Dealing with fine-grained partial failures without explicitly asking for them? That would be a new brand of hell. It seems useful that I can specify how much of the program is shut down when failure is anticipated, such that I can handle fallbacks at much coarser granularity.


I'm proposing imbueType, a pure operation, which just translates untyped code to typed code. I think this covers all the worst parts of eval, and at the same time it means you can avoid dealing with weird side effect semantics. This step does all the parsing and preprocessing, and afterward, we can reason about resource usage almost as though we're dealing with a single program again.

I think you're assuming you can provide static types for all your capabilities without at least partially invoking them. This might be difficult to achieve in practice, without overgeneralizing types (which would bring back the 'partial failure' problems). 

Consider, for example: we might want to assume some shared state has a particular type structure (e.g. an XML file). This might not be something you can properly determine without loading the file - i.e. the capability might not have an 'XML file type'. 
 

Whenever the input changes, execute it as a new thread. Whenever we're executing more than two threads, kill all but the oldest, the newest, and maybe a few random ones in between. Whenever a thread completes, update the output to be (Just (input * threadResult)), and kill any threads older than this one.


I'd recommend a larger basic buffer. A common update pattern is "bursts of updates with long periods of silence", e.g. when pressing buttons on a keyboard each press might correspond to an update or two (depending on how code was written). Using the exponential-decay-of-history pattern might be a good way to decide which thread is killed each time you add a new one.
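
(A sketch of that pattern in Haskell - the names `History`, `decayPrune`, and `push` are mine, not Sirea's: keep a dense buffer of the newest entries to absorb bursts, then keep only one survivor per exponentially sized age bucket, so coverage thins out with age.)

    import Data.Function (on)
    import Data.List (nubBy)

    -- entries are newest-first, tagged with a monotonic sequence number
    type History a = [(Integer, a)]

    decayPrune :: History a -> History a
    decayPrune [] = []
    decayPrune hist@((newest, _) : _) =
        nubBy ((==) `on` bucket) hist   -- keep the newest entry in each bucket
      where
        bucket (seqNo, _) = floorLog2 (newest - seqNo + 1)
        floorLog2 n = length (takeWhile (<= n) (iterate (* 2) 1)) - 1

    -- keep the newest `buffer` entries verbatim; decay only the older tail
    prune :: Int -> History a -> History a
    prune buffer hist = recent ++ decayPrune older
      where (recent, older) = splitAt buffer hist

    -- e.g. push with a buffer of 8: bursts are kept whole, and beyond
    -- that we retain roughly one entry per doubling of age
    push :: Integer -> a -> History a -> History a
    push seqNo x hist = prune 8 ((seqNo, x) : hist)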

Warm Regards,

Dave

David Barbour

unread,
Sep 17, 2013, 10:06:30 PM9/17/13
to reactiv...@googlegroups.com
It seems to me that a simple use of partitions could offer most of the sealer/unsealer features: Rather than modeling sealed values as a `Sealed u x` type, model sealed values as living in a (unique, logical) `sealed` partition that can only be accessed by a pair of capabilities. This would eliminate most of the operational and semantic overhead for sealer/unsealer pairs. 

I wonder if staging could also be modeled well enough by something similar, without getting too obtuse. Maybe a dedicated 'semi-permeable' kind of logical partition type (free entry, no exit) would work. 










David Barbour

unread,
Sep 17, 2013, 11:21:45 PM9/17/13
to reactiv...@googlegroups.com
The main strength of the staged model was that it provided a very static structure. One consequence was that I needed a different kind of collection type, such as a vector or set, to account for observed state. Another consequence is that it becomes very difficult to program against a system's actual state; we must target the general possibility (a vector or set of any size with ad-hoc elements) instead of a specific possibility. In order to block direct dependency on unstable signals, users would also be unable to effectively treat dependable signals as part of their environment. 

I'm still fond of the staged model. I just don't see a good way to make it fit with the PL/UI synthesis, at least for many use-cases.

I wonder if there's a good intermediate. Maybe capabilities and users can both mark types with known trusted-path and stability information, and typefully prevent any accidental dependencies. If we require capabilities to add and remove certain type information, we could perhaps also control what dependencies an ad-hoc subprogram can have upon structure. Most type information seems to be associated with the atoms and endpoints. But staging and such are probably something the `*` and `+` types should also support, indicating the stability and trust in that `*` structure. 

Ross Angle

unread,
Nov 19, 2013, 3:11:37 AM11/19/13
to reactiv...@googlegroups.com
Hi David,

I'm simultaneously starting a new thread, "Philosophy of errors and state," to cover the deeper topics we had been talking about here. I've changed my position in some ways, and I might have misunderstood you in this thread too, so it's a whole new discussion. :-p

I still want to reply to some minor points in this thread, so here goes....


Your "blocks" sound like exactly what I was calling typed code, so at least we're on a similar page. The ability to coerce a value to a block of type (1 -> value) is pretty interesting to think about. I wonder if it's only valid if blocks can't perform side effects.

Those operations on `Sealed` values look like they're incomplete. You have `sealedJoin`, but there's no `sealedSealedJoin` or `sealedSealedSealedJoin`. :) I think you might be able to cover most of this with an operator of type ((a -> b -> c) -> (Sealed u a -> Sealed u b -> Sealed u c)). Since that's such an obvious generalization, I bet you ruled it out for being too expressive.

Back when I was finding those academic papers about modal types, I noticed there were papers about modal types for sealed values, so your idea to represent sealed values in terms of partitions seems doable. ^_^

I didn't intend any sarcasm when I said `bforeach` was interesting. I just don't have anything to say about it at this point.

In response to my roadmap for the multi-stage system, you said "I plan to start with partitions early on." I think staging is even harder to get right, and it has pretty much all the same semantic challenges as partitions. Come to think of it, if we do a `bdisjoin` that involves two later stages (let's call them "child stages") such that neither generates the other (let's say they're "cousins"), then even if one child can observe the branch condition, the other child might not be able to. From this point of view, Sirea's partitions are just a bunch of cousin stages where every two cousins have a way to pass values to each other.




I'm proposing imbueType, a pure operation, which just translates untyped code to typed code. I think this covers all the worst parts of eval, and at the same time it means you can avoid dealing with weird side effect semantics. This step does all the parsing and preprocessing, and afterward, we can reason about resource usage almost as though we're dealing with a single program again. 
 
I think you're assuming you can provide static types for all your capabilities without at least partially invoking them. This might be difficult to achieve in practice, without overgeneralizing types (which would bring back the 'partial failure' problems). 

The `imbueType` operation takes a static type and a coarsely typed value, and it tries to parse the value to construct a representative of the type. It doesn't need to invoke the value to determine its type, because the type is a separate input. (The coarsely typed value would be something like a string or binary blob anyway, rather than a capability.)
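
(A toy rendering of that idea in Haskell, far simpler than Underreact and with invented names - the point being only that the type is a separate input and the blob is parsed against it, purely:)

    {-# LANGUAGE GADTs #-}
    import Text.Read (readMaybe)

    data TypeRep t where
      TInt  :: TypeRep Int
      TList :: TypeRep a -> TypeRep [a]

    -- parse an untyped blob against an explicitly supplied type
    imbueType :: TypeRep t -> String -> Maybe t
    imbueType TInt      s = readMaybe s
    imbueType (TList a) s = mapM (imbueType a) (words s)

    -- imbueType (TList TInt) "1 2 3"  == Just [1,2,3]
    -- imbueType TInt         "hello"  == Nothing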


Consider, for example: we might want to assume some shared state has a particular type structure (e.g. an XML file). This might not be something you can properly determine without loading the file - i.e. the capability might not have an 'XML file type'. 

In this example, the file content itself needs to be parsed, but the capability to it can't be. I guess that's what you mean by an overgeneralized type; our programs have to pass around values of type (File (Maybe Xml)) rather than (File Xml).


On Tue, Sep 17, 2013 at 8:21 PM, David Barbour <dmba...@gmail.com> wrote:
The main strength of the staged model was that it provided a very static structure. One consequence was that I needed a different kind of collection type, such as a vector or set, to account for observed state. Another consequence is that it becomes very difficult to program against a system's actual state; we must target the general possibility (a vector or set of any size with ad-hoc elements) instead of a specific possibility. In order to block direct dependency on unstable signals, users would also be unable to effectively treat dependable signals as part of their environment. 

I'm still fond of the staged model. I just don't see a good way to make it fit with the PL/UI synthesis, at least for many use-cases.

I wonder if there's a good intermediate. Maybe capabilities and users can both mark types with known trusted-path and stability information, and typefully prevent any accidental dependencies. If we require capabilities to add and remove certain type information, we could perhaps also control what dependencies an ad-hoc subprogram can have upon structure. Most type information seems to be associated with the atoms and endpoints. But staging and such are probably something the `*` and `+` types should also support, indicating the stability and trust in that `*` structure. 


It took me a few tries to read this, so I'll summarize for myself and others: If state is stable and trustworthy enough, you want developers to be able to build upon it the same way they would build upon a programming language's core utilities. The unchanging and trustworthy properties of a core utility are described by its static type, so you want programmers to be able to extract statically typed values from dynamically occurring state, even if it's not 100% stable or trustworthy.

Pretty nice idea. :) The basic intuition of this idea is unsound, but we can keep the contradictions under modalities like "Hopefully an X." :-p

Those modalities should be enough to mark * and + types too, as long as the Bool signal is explicit. Well, you might not be able to make the Bool signal explicit without dealing with dependent types ((X * Y) becomes ((b : Bool) -> if b then X else Y)), so that's not necessarily an easy change to make.

Warm Regards,
Ross

David Barbour

unread,
Nov 19, 2013, 9:40:35 AM11/19/13
to reactiv...@googlegroups.com
Hi, Ross.

My thoughts have been refined a lot since my discussion in this thread. I've primarily been working on Awelon Bytecode (ABC) for the last couple months. So I'm sure my position has changed a fair bit as well, since I wrote the above.

On Tue, Nov 19, 2013 at 2:11 AM, Ross Angle <rok...@gmail.com> wrote:

Your "blocks" sound like exactly what I was calling typed code, so at least we're on a similar page. The ability to coerce a value to a block of type (1 -> value) is pretty interesting to think about. I wonder if it's only valid if blocks can't perform side effects.

I've decided to call this coercion "quoting". If I quote the value 3, I get a block [#3c]. Where: 

    # :: x → N(0) * x
    3 :: N(x)*e → N(10x+3)*e
    c :: x*1 → x

    [#3c] :: 1 → N(3)
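
(Reading the composition step by step from the unit input: `#` turns 1 into N(0)*1, `3` turns that into N(3)*1, and `c` drops the unit, leaving N(3) - hence `[#3c] :: 1 → N(3)`.)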

(I finally looked up how to write unicode characters by codepoint in Ubuntu. :)

Quoting is limited with regards to the kinds of values that can be quoted. I.e. you cannot quote a value that is distributed across two partitions. And if you quote blocks, the quotation inherits substructural types.
 

Those operations on `Sealed` values look like they're incomplete. You have `sealedJoin`, but there's no `sealedSealedJoin` or `sealedSealedSealedJoin`. :)

True. I was probably thinking those would be addressed using sealedJoin + sealedApply. But I think I won't be using those operators. 

I'm still contemplating exactly what I want with respect to sealed values, how (and whether) they're to be represented at the bytecode level. The notion of sealed logical overlay partitions instead of sealed values also remains promising as the basis for rights amplification.


In response to my roadmap for the multi-stage system, you said "I plan to start with partitions early on." I think staging is even harder to get right, and it has pretty much all the same semantic challenges as partitions.

Agreed, especially if staging is implicit. At this point, I'm leaning more towards explicit staging, modeled within the program in terms of constructing blocks, and shifted implicitly back into compile time via simple, universal techniques such as deep partial evaluation. I think this would cover the common use-cases for staging (e.g. modeling of DSLs) pretty well.

To support a high degree of partial evaluation, I've decided to explicitly assert/assume the following properties of ABC:

* causal commutativity: `[foo] first [bar] second = [bar] second [foo] first`.  Subprograms (even effectful ones) can commute so long as there is no direct input/output relationship between them. 
* spatial idempotence: `dup [foo] first [foo] second = [foo] first dup`. There is no observable impact from duplicating a computation. 
* fast-and-loose reasoning: we require and assume that every subprogram terminates (e.g. with respect to optimizations, strictness, partial evaluation). If we perform a termination analysis and prove a subprogram doesn't terminate, we can call that an error. If we cannot prove a subprogram terminates, we may issue a warning.

ABC can be used for RDP, functional code, or even capability-secure imperative code. In the latter case, to ensure imperative code is compatible with causal commutativity and spatial idempotence, I can model programmable objects or threads using linear blocks such that effects are sequenced wherever needed.

Anyhow, these assumptions make it easy to perform a high level of compile-time evaluation and optimization, without explicitly addressing staging.


If state is stable and trustworthy enough, you want developers to be able to build upon it the same way they would build upon a programming language's core utilities. The unchanging and trustworthy properties of a core utility are described by its static type, so you want programmers to be able to extract statically typed values from dynamically occurring state, even if it's not 100% stable or trustworthy.

Pretty nice idea. :) The basic intuition of this idea is unsound, but we can keep the contradictions under modalities like "Hopefully an X." :-p

That's a good way of phrasing it. I hadn't thought about a modal approach to reasoning about it, but that might work at a higher level. I'm not sure what it would mean to include "hopefully" semantics in the bytecode. :)

At the moment, ABC does contain a partiality/assertion operator:

        K :: (a + b) * e -> b * e  -- prove me 'right', kill the left

The current interpretation is that, first we try to prove statically that we're in the right condition. If we prove we're in the left condition, that's an error. If we fail to prove it either way, we may issue a warning, but we compile anyway. In the latter case, the operator results in full-program failure (up to the boundaries of an interpreter/executive) if we ever observe the left condition at runtime.

K provides a simple basis for modeling partial functions, from which we can infer dependent types. But it may also be a suitable basis for making assumptions explicit in the code. To the extent observed conditions are based on state, perhaps these assumptions could be regression-tested against *past observed states* (especially those known to cause errors before).

Warm Regards,

Dave 