Why a Tacit Style for Awelon? [was Re: Static Choices and Offerings]


David Barbour

Aug 31, 2013, 8:49:21 PM
to reactiv...@googlegroups.com
On Sat, Aug 31, 2013 at 3:57 PM, Ross Angle <rok...@gmail.com> wrote:
Incidentally, could you describe the primary reasons you're pursuing tacit programming rather than programming with variables? I could make some guesses, but I don't want to put words in your mouth. :)

It isn't quite correct to say I'm "not programming with variables". I can model variables with named stacks.

     "foo" load  -- move top element of "foo" stack onto current stack
     "foo" store -- move top element of current stack onto "foo" stack

From there I can use a little polymorphic compile-time metaprogramming to access a list of named stacks in the environment, find a stack named "foo", and load or store values. Similarly, I can model records for keyword argument lists. And I can model ADTs or existential objects by clever use of sealers/unsealers and a few blocks.
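To make the load/store idea concrete, here is a minimal runtime sketch in Python. It is only an analogy: Awelon resolves the stack name at compile time, while this looks it up in a dictionary at runtime. The representation (lists with the top at the end, a dict of named stacks) and the function names are assumptions for illustration, not Awelon's actual implementation.

```python
# Hypothetical model: the environment is (current_stack, named_stacks).
# Stacks are Python lists with the top element at the end.

def store(env, name):
    """Move the top element of the current stack onto the stack called `name`."""
    current, named = env
    *rest, top = current
    target = named.get(name, [])          # assumed: a missing stack starts empty
    return rest, {**named, name: target + [top]}

def load(env, name):
    """Move the top element of the stack called `name` onto the current stack."""
    current, named = env
    *rest, top = named[name]
    return current + [top], {**named, name: rest}

env = ([1, 2, 3], {})
env = store(env, "foo")   # 3 moves from the current stack to "foo"
env = load(env, "foo")    # and back again
```

The name only matters at the two points where a value crosses between stacks; everything in between can stay tacit.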

But I am marginalizing use of variables. They certainly have a second-class status in Awelon. As for why? 

In some ways, variables are too expressive. A lot of syntactically valid expressions don't make sense. Consider an input type `(x * (y + z))`. There are several potential issues here. First, if 'x' is named, it would be easy to combine with y or z by accident ('disjoin'). If 'y' or 'z' are named independently, that's even worse: we might try to combine y with z. Substructural types are another area where variables are often too expressive. With variables, we need to do a lot of analysis, and say 'no' to a lot of syntactically valid expressions.  

I imagine that abstraction with parameters would encourage unnecessary use of dynamic behaviors. I have a goal to discourage dynamic behaviors, because they are relatively expensive in RDP.

Also, I often feel the idea of "programs with holes in them" is a bad way of thinking for composition. Developers spend their time "filling the holes" rather than "composing the objects". These two different modes of thinking are realized in a very noisy syntax, i.e. where functions are both parameterized and composed, and dataflows are all over the place. There is much less consistency across modules.

When developers have variables, they want to understand the variables, so they give them named types.  Named types are one of the most common failure modes for linear types (cf. Plaid language), because we need new types after every non-trivial action. I would prefer developers focus on structures - the understanding of structure, the fine-grained manipulation of structure, but NOT the preserving of structure. 

I believe that many uses of names (variable names, field names, method names, pattern constructors, type names, etc.) in common programming practice are syntactic clutter that results in unnecessarily verbose and rigid programs with little cognitive benefit.

Finally, Awelon is intended as a distribution language, and an intermediate language (for visual programming). In both of these roles, the simple syntax and easy analyses are very useful.
 

David Barbour

Sep 1, 2013, 2:18:08 PM
to reactiv...@googlegroups.com
I'll extend what I said earlier.

I find over-use of names very unnatural. Outside of programming, I get through most sentences, even most paragraphs, sometimes even full days, without using specific names for objects, people, places, or data. I might speak of a hammer, a nail, a chair, a workbench - but these things are not specifically named; they are simply available in context. I sometimes wonder whether this experience could be made part of programming.

The word "tacit" means "to be understood without being stated". Tacit programming really is a step closer to a more natural environment in which we manage objects without explicitly naming them. When I extended the tacit environment with a user-model - with hands, navigation, and an ability to extend our models further - I believe I took another giant leap forward to natural programming. (I recently discussed this on the augmented-programming and fonc-vpri mailing lists.) At this point, we can almost *directly* write programs in an AR/VR environment in terms of simple gestures, without ever using a name.

Another powerful tacit approach is the use of zippers and document-like objects: we can represent documents, diagrams, game-worlds, higher-order programs or IDEs, tree-structured databases, spreadsheets, cellular automata, etc. - and we can navigate and manipulate artifacts without naming them. Again, we would rely on structure - e.g. tags in a document - but a lot of structure can be repeated, and a lot of operations can be implicit relative to our location, or potentially to cursors or flags we place.

I had initially considered a zipper as the basis for Awelon's environment, but I decided against it. I didn't like the intuition of wandering around a gigantic programming environment, potentially losing track of where you left your tools, so now I simply have the current stack, hand, and a list of named stacks.  

However, there is still a use for zippers: I can easily model an "open document" at the top of the current stack. Instead of navigation, the intuition becomes: I am a demented data-surgeon, sifting around an open document, occasionally transplanting, cloning, or removing the still-beating live data. I can perform transclusion surgery with the open documents on one of my other named stacks.  
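For readers who haven't met zippers, a minimal Python sketch of one over a flat list follows. A real document zipper would be tree-structured; the flat version and all function names here are illustrative assumptions. The point is that every edit is relative to the cursor, so no element ever needs a name.

```python
# A list zipper: (items before the cursor, item under the cursor, items after).

def open_doc(items):
    """Open a non-empty list with the cursor on its first element."""
    return ([], items[0], list(items[1:]))

def right(z):
    """Move the cursor one element to the right."""
    before, focus, after = z
    return (before + [focus], after[0], after[1:])

def left(z):
    """Move the cursor one element to the left."""
    before, focus, after = z
    return (before[:-1], before[-1], [focus] + after)

def replace(z, value):
    """Swap out whatever is under the cursor."""
    before, _, after = z
    return (before, value, after)

def close_doc(z):
    """Rebuild the plain list from the zipper."""
    before, focus, after = z
    return before + [focus] + after

doc = open_doc(["a", "b", "c"])
doc = replace(right(doc), "B")   # operate on "the thing to my right"
result = close_doc(doc)
```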

But names aren't useless. 

Names are useful when I have objects I don't use often, but that I want readily accessible at all times. A uniqueness source is a good example. So is a clock signal, if one isn't an ambient authority. Global information about preferences, configurations, etc. could all be accessed by name.  Indeed, it is these roles for which I decided to support named stacks as a standard part of Awelon's environment.

I had mentioned keyword arguments, but I think those require too much use of names. Keywords are often too specialized for their use-case, and thus the data-plumbing that builds the keyword list becomes highly specialized. Something closer to a document-passing style, where we have a single document-like structure used by a whole library of words, seems more compositional because it doesn't require specialized data manipulation for each use case.

Names for complex types are a terrible mistake. I had mentioned this is true for linear types, but linear types really just shove the issue in our face. Names for complex types are a mistake even without linear types, because they hinder operating on intermediate structures in a type-safe way. I can't even imagine naming the environment-types of Awelon. Now, programmers do have a *reason* to name these types: to help understand them. But names aren't necessary to understand types. One could provide assertions on structure, render in an IDE, or use unique values or sealer/unsealer pairs to ensure certain data flows expectedly. 

If I can cut explicit name usage - relative to Java, C++ - down to 5%, I think that would be a great thing for a more natural programming experience. If names can be cut down further than that, we might even manage to integrate PL with AR/VR such that every single gesture can be understood as a program extension (perhaps rewriting the history to eliminate redundancies).

Best,

Dave

Matt McLelland

Sep 1, 2013, 9:43:25 PM
to reactiv...@googlegroups.com
Hello!

> I might speak of a hammer, a nail, a chair, a workbench - but these things are not specifically named.

What does "specific name" mean?  Those are names, and I find your use of the word "name" confusing.  Do you mean GUID or something?  Usually in conversation we first establish a referent unambiguously ("the red chair over there", pointing if necessary) and then settle into a shorter reference ("the chair"), which acts as a name for the thing in further conversation.  This is vaguely similar to how one imports a fully qualified name in Java under a shorter local name.

Even if you do add a mechanism for constraining "the longest string in this list that begins with the letter B", the first thing you'll want to do is put a local name to that constraint so that you can more easily reference it. 

I'm sorry I haven't been following along too closely, but by "stack" do you mean that you're using de Bruijn indices (or worse, dip bloop blarp stack manipulation schemes)?   If so, how is keeping track of the stack depth of what you're interested in easier than keeping track of a name?

Finally, could you say another sentence or two about why naming types is bad?

Best,
Matt






David Barbour

Sep 2, 2013, 1:02:34 AM
to reactiv...@googlegroups.com
On Sun, Sep 1, 2013 at 6:43 PM, Matt McLelland <mclella...@gmail.com> wrote:
Hello!

> I might speak of a hammer, a nail, a chair, a workbench - but these things are not specifically named.

What does "specific name" mean?  Those are names

Hammer, nail, chair - we might say these 'name' classes of things. But I would not say they are specific names. I.e. they do not name a specific hammer, nail, or chair. A specific name is any symbol, word, or short phrase assigned to uniquely identify an object (or value, or function) within a context. Names, by their usual nature, are not complete descriptors: the association between a name and an object must be stored somewhere, and an explicit act of naming is often necessary to build this association.

 
Usually in conversation we first establish a referent unambiguously ("the red chair over there", pointing if necessary) and then settle into a shorter reference ("the chair"), which acts as a name for the thing in further conversation.

This may be the case when we need to reference the same chair many times in one conversation. But how often do you create conversation about specific chairs? If you were asked to move the chair, you wouldn't name it. You'd move it. 

I think very few things in your life are named. The last t-shirt you wore? The last bar of soap you used? Your car keys? The path you walk in the morning? We do so much without names.
 

Even if you do add a mechanism for constraining "the longest string in this list that begins with the letter B", the first thing you'll want to do is put a local name to that constraint so that you can more easily reference it. 

Nope! Not true. More likely - in actual practice - your constraint is just one of many in a bulleted list, and you never assigned a specific name to it. Later, if you determine that you actually need to reference it, you might develop a name for it. But the act of "future-coding" just in case you need to "easily reference it" later? You must be a contract lawyer if you think that's common.


I'm sorry I haven't been following along too closely, but by "stack" do you mean that you're using de Bruijn indices (or worse, dip bloop blarp stack manipulation schemes)?

I am using an arrow-based programming model. In arrow models, the type of the program between any two arrows can be described by a massive product. Programmers have several primitives that can manipulate these arrows:

        first  :: (a ~> a') -> ((a * b) ~> (a' * b))
        swap   :: (a * b) ~> (b * a)
        assocl :: (a * (b * c)) ~> ((a * b) * c)
        intro1 :: a ~> (Unit * a)
        elim1  :: (Unit * a) ~> a

For tacit programming, I had to tweak 'first' to operate on a pair:  
  
       first :: ((a ~> a') * (a * b)) ~> (a' * b)

Which unfortunately damages some useful properties of arrows (notably, I can now describe Turing-complete fixpoint combinators unless I constrain by types), but I'm okay with that for my language.
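As a rough illustration of the primitives above, here is a sketch in Python that models a product (a * b) as a 2-tuple and Unit as the empty tuple. This is an untyped runtime analogy with assumed names (`first_tacit` in particular is just a label for the tweaked form), so it shows the data shuffling but none of the static typing.

```python
UNIT = ()

def first(f):
    """first :: (a ~> a') -> ((a * b) ~> (a' * b))"""
    return lambda p: (f(p[0]), p[1])

def swap(p):
    """swap :: (a * b) ~> (b * a)"""
    a, b = p
    return (b, a)

def assocl(p):
    """assocl :: (a * (b * c)) ~> ((a * b) * c)"""
    a, (b, c) = p
    return ((a, b), c)

def intro1(a):
    """intro1 :: a ~> (Unit * a)"""
    return (UNIT, a)

def elim1(p):
    """elim1 :: (Unit * a) ~> a"""
    _unit, a = p
    return a

def first_tacit(p):
    """Tweaked form: first :: ((a ~> a') * (a * b)) ~> (a' * b)"""
    f, (a, b) = p
    return (f(a), b)
```

The difference between the two forms of `first` is where the block comes from: the original takes it as a parameter from outside the dataflow, while the tweaked form finds it sitting in the product like any other value, which is what lets the program stay tacit.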

Anyhow, "the stack" in this context would simply be a way of arranging/conceptualizing objects in this product type. A stack of six items might look like:

       (a * (b * (c * (d * (e * (f * Unit))))))

If we wish, we can define FORTH-like operators (roll, dup, pick, etc.) to manipulate this stack, in addition to operators that can open up 'a' or apply blocks to one or two items in the stack. I assume this is what you meant by "dip bloop blarp stack manipulation schemes". 
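A sketch of what such FORTH-like operators might look like over the nested-pair stack, again using Python tuples as a stand-in for the product type. The operator names follow FORTH convention; their exact definitions here are assumptions, not Awelon's library.

```python
UNIT = ()  # the empty stack

def dup(s):
    """Copy the top item: (x * s) ~> (x * (x * s))"""
    x, rest = s
    return (x, (x, rest))

def swap_top(s):
    """Exchange the top two items: (x * (y * s)) ~> (y * (x * s))"""
    x, (y, rest) = s
    return (y, (x, rest))

def pick(n, s):
    """Copy the item at depth n (0 = top) onto the top of the stack."""
    cursor = s
    for _ in range(n):
        cursor = cursor[1]
    return (cursor[0], s)

def roll(n, s):
    """Move the item at depth n (0 = top) to the top of the stack."""
    if n == 0:
        return s
    x, rest = s
    y, rest2 = roll(n - 1, rest)
    return (y, (x, rest2))

stack = ("a", ("b", ("c", UNIT)))
rolled = roll(2, stack)   # brings "c" to the top, leaving "a" and "b" in order
```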

Since we're working with types, we actually aren't limited to a simple FORTH stack. A rather useful concept is to have a 'hand' so that we can model take/put actions instead of just rolling stuff to the top all the time. If we add a hand, the environment might look like:

      (stack * hand)
      take :: ((x * s) * h) ~> (s * (x * h))
      put :: (s * (x * h)) ~> ((x * s) * h)

And with a hand to carry things, we could feasibly explore other environments - e.g. based on a zipper. But more complex environments aren't necessarily 'good' if it means we can't figure out how to use them, or if developers are likely to get "lost". 
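The take/put signatures translate almost directly into tuple destructuring. A minimal Python sketch, with the environment modeled as a `(stack, hand)` pair of nested tuples (the representation is an assumption for illustration):

```python
def take(env):
    """take :: ((x * s) * h) ~> (s * (x * h))"""
    (x, s), h = env
    return (s, (x, h))

def put(env):
    """put :: (s * (x * h)) ~> ((x * s) * h)"""
    s, (x, h) = env
    return ((x, s), h)

env = (("a", ("b", ())), ())   # stack holds a, b; hand is empty
held = take(env)               # "a" is now in the hand, "b" is on top of the stack
restored = put(held)           # and back where we started
```

While "a" is in the hand, anything that operates on the top of the stack sees "b" instead, which is the alternative to rolling items up and down.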

Anyhow, my language augments the simple arrow model with some compile-time introspection and static data types (i.e. text and number literals are, by default, carried in the type) which enables some non-parametric polymorphism and compile-time metaprogramming.  So I can use a more powerful notion of "named stacks" in that extended environment. 

Awelon's environment:

    (stack * (hand * (stackName * listOfNamedStacks)))

With this, I can `goto` stacks by name (carrying whatever is in the hand), or load/store from named stacks (which treats them much like memory locations or registers). I've also found that zippers seem useful for manipulating the top object on the current stack. (I had previously tried zippers for the environment itself, but I didn't like how it felt after a couple libraries and thought experiments.)
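Here is one way `goto` could behave, sketched at runtime in Python. In Awelon the lookup happens at compile time against the static type; here the list of named stacks is an ordinary Python list of `(name, stack)` pairs, and treating an unknown name as a fresh empty stack is my assumption rather than something stated above.

```python
UNIT = ()

def goto(name, env):
    """Switch the current stack to the one called `name`; the hand comes along."""
    stack, (hand, (current_name, named)) = env
    if name == current_name:
        return env
    # File the current stack away under its own name.
    named = [(current_name, stack)] + named
    # Pull out the target stack, or start an empty one (assumed behaviour).
    target = UNIT
    remaining = []
    for n, s in named:
        if n == name:
            target = s
        else:
            remaining.append((n, s))
    return (target, (hand, (name, remaining)))

env = (("x", UNIT), (("h", UNIT), ("main", [("tools", ("t", UNIT))])))
at_tools = goto("tools", env)    # current stack is now the "tools" stack
back = goto("main", at_tools)    # the hand still holds "h" throughout
```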


 how is keeping track of the stack depth of what you're interested in easier than keeping track of a name?

It isn't! :D

But you should ask a different, more complete question: "how is keeping track of the stack depth of what you're interested in easier than inventing names, typing them, managing relationships between names and objects, AND keeping track of names for multiple objects?" The burden of names is much greater than merely keeping track of one of them.

Keeping track of a few objects on the stack, maybe a few in the hand - is not difficult. And, relevantly, that covers 90%+ of use-cases: the vast majority of functions take three arguments or fewer, and return three items or fewer, so we can easily develop functions that operate on only the top of the stack. (And, with an arrowized model, we can precisely control the number of input and output arguments by use of 'first', creating a fresh environment for a subprogram, etc.)

The "five to seven items" limit of human memory can be amplified by use of an IDE. Basically, since the environment is a static type, we can render the static type, or even animate how it changes between two positions in a program. If the type is rendered and developers have continuous visual feedback, I think (based on my experience as a gamer) developers wouldn't have difficulty working with a ring of perhaps six to eight items in the hand, and objects five to seven items deep on the stack, plus multiple stacks for inventories or other locations. 

How hard would it be to invent names for all those things? How hard would it be to keep track of those names? 

Given that you lack physical constraints, how often would you choose to name a chair... rather than simply pick it up and carry it, or drop it on the current stack to operate on it?
 

Finally, could you say another sentence or two about why naming types is bad?

Sure.

My argument primarily regards complex types, i.e. named structures, unions. 

When we name these types as a matter of course, the systems they are part of become rigid and less compositional. The types can't easily be picked apart, rearranged, or composed because we lack named types for these different sub-structures and rearrangements. Similarly, constructors for these types cannot readily and safely be decomposed, because partial-constructors don't have a well defined type. 

This problem is something I began to recognize after a bunch of different encounters:

1) In dataflow models, I often want to send different "fields" of an object on completely different paths. But with named types, the pressure always exists to just send the whole object (because it's easier than coming up with a new type). Interestingly, the same is true for sums/unions (which, in a dataflow system, can represent switching networks). 

2) In linear typed systems, developers often struggle with how to deal with objects and structs composed of linear types. In many cases the right (safe, natural) thing to do is to change the type after each operation. But this requires a lot of names. Worse, it is very painful to propagate this named-change-in-type up through the superstructure that holds the linear type, especially since the number of such type changes (and names) would tend to be combinatorial in the number of linear elements held. To avoid this pain, users of the language will begin to ignore relevant changes in type, or (using Plaid as an example) use dynamic mechanisms to check the 'typestate'.

3) When we deal with compositional data, there are always different ways to name it and arrange it. Within an isolated library, we can often pretend that our arbitrary choice is the right one. But if we ever need to integrate or translate between protocols, tuple spaces, pubsub buses, databases, etc. the arbitrariness quickly makes itself known. Dealing with named types in these cases is especially painful because we generally use automated code generators, and the names and fields they pick aren't obvious... and then we *still* can't easily pick them apart or rearrange them without dealing with a bunch of intermediate states. It is both easier and more robust to focus on self-describing anonymous structures that can be decomposed, studied, and recomposed within the type system. (To make this work effectively and efficiently we need a little introspection and compile-time metaprogramming.)

4) When I became interested in distributed programming, I found named types problematic because distributed systems might not agree on the meaning of names, e.g. due to different versions of libraries. Later, I learned the same problem exists with respect to versioned libraries even within local applications (especially if plugins are involved). These days I much prefer to use sealer/unsealer pairs (which might exist only in the static type system) to understand ADTs in a first-class manner, rather than using the module system.
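For anyone unfamiliar with sealer/unsealer pairs from point 4, a small dynamic sketch in Python follows. It is only an approximation: the pairs described above may exist purely in the static type system, and Python cannot actually prevent a determined caller from reaching into the box. `make_sealer_pair` is a hypothetical name.

```python
def make_sealer_pair():
    """Return a fresh (seal, unseal) pair; only the matching unsealer opens a box."""

    class Sealed:
        # A new class is created per call, so boxes from different pairs
        # are distinguishable by type.
        __slots__ = ("_payload",)

        def __init__(self, payload):
            self._payload = payload

    def seal(value):
        return Sealed(value)

    def unseal(box):
        if type(box) is not Sealed:
            raise TypeError("value was not sealed by the matching sealer")
        return box._payload

    return seal, unseal

seal_a, unseal_a = make_sealer_pair()
seal_b, unseal_b = make_sealer_pair()
opened = unseal_a(seal_a(42))   # the matching pair round-trips the value
# unseal_a(seal_b(42)) would raise TypeError: wrong pair
```

The ADT boundary is then whoever holds the pair, a first-class value, rather than whatever the module system happens to export under a given name.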

There might be a few more scenarios that left me with this strong negative impression on named types, but that's all I recall right now. I feel that they've been an anchor. Names have their use-cases, but they're rarer than most programmers might imagine. We'd be better using them sparingly, and carefully.

"Why is programming hard?" 
"In part because you're chaining yourself down with all those names." 
"Huh? Names are great! I use them all the time. That couldn't be a problem." 
"What do you name in real life?" 
"Um. My dog, my wife, my job, me." 
"Really. What is the name of your job?"
"Uh, Software Engineer? Okay, that's lame. But my dog's name is Shoe."
"Please don't tell me your pet name for your wife..."
"Restaurants and streets have names."
"How often do you use them?"
"When I'm giving or receiving directions."
"Indeed. How often is that?"
"A few times a month?"
"You said you use names all the time."
"Well, sure! I write Java code!"

Even using names as a shorthand for structural types could be problematic if it leads to us ascribing that type incorrectly, resulting in functions less polymorphic than they could be. We should really stick with anonymous, structural types for the most part, and use specific structural types where we really want the rigidity.

Warm Regards,

Dave

Matt McLelland

Sep 2, 2013, 10:06:25 AM
to reactiv...@googlegroups.com
Dave,

I still think we use "specific names" (by your definition) more than you're letting on.  Consider:

"Have you seen my Hooters T-shirt?"
"Oh, please don't wear *that shirt*"     
"I'm just gonna wear *it* in the back while I do yard work"

The emphasized terms would seem to be specific names under your definition.  I take your point, though, that a new name wasn't introduced as in "Have you seen my Hooters T-shirt (which I will henceforth refer to as Mandy)?".

One thing I do in my language that eliminates naming things is sort of a generalization of what Haskell does with type classes.  When you write 'print', the system infers that there is an IO parameter, but you don't have to name it (frustratingly, in Haskell, you can't name it ... though I think someone has an extension where you can).  So I can get behind the problem that you've identified, but as you might have inferred from my derisive "dip bloop blarp", I hate FORTH style stack manipulation and I think you are making a terrible terrible mistake if you go that route.  It's on the order of magnitude of deciding to use S-Expressions for your syntax... which is to say it's not actually a huge deal, and probably has some advantages, but most programmers (including me!!) will still hate your language because of it.  :D   But seriously.


Thanks for elaborating on your problem with naming complex types.

Point by point:

1) I agree this is a problem.  I emphasize a style of programming where objects are modeled as a set of abstract IDs and an open set of properties.  The system can track / infer what properties of the data are used.  Saying "I emphasize" is a little simplistic -- I think such a scheme is frequently the right way to model data, because a dependency on a closed defining set of objects is unnecessary.   As I've mentioned before, my approach is to try to track (but also minimize) logical assumptions.

2) I'm not experienced with linear types.  My weakly held opinion is that sub-structural types can usually be promoted to structural types by modeling (linear types become something like processes or maybe map to my representative selection mechanism), but I'm having trouble mapping your complaint about linear types into one about processes (or algebraic effects).

3) The problem of arbitrariness of naming still exists as arbitrariness of ordering in your scheme, though, right?  And both of those problems are shallow compared to the problem of minor but pervasive semantic differences.   It seems clear to me that you're never going to have API compatibility for free.  You can make a tool that compares two structures and tries to find a correspondence based on the types, but you can do that with names just as easily as with stack ordering.

4) Names should be shorthand for structural definitions.  We use names to get the machine to know what we're talking about, but it should always check what we attempt to do with those names based on the underlying structural properties.  Names can be used in a distributed setting, but you have to be careful to ensure that the structural definitions match on either side of the wire.
 
Best,
Matt


David Barbour

Sep 2, 2013, 1:22:11 PM
to reactiv...@googlegroups.com
On Mon, Sep 2, 2013 at 7:06 AM, Matt McLelland <mclella...@gmail.com> wrote:
Dave,

I still think we use "specific names" (by your definition) more than you're letting on.  Consider:

"Have you seen my Hooters T-shirt?"
"Oh, please don't wear *that shirt*"     
"I'm just gonna wear *it* in the back while I do yard work"

The emphasized terms would seem to be specific names under your definition.  I take your point, though, that a new name wasn't introduced as in "Have you seen my Hooters T-shirt (which I will henceforth refer to as Mandy)?".

I don't consider the use of anaphora ("that shirt" and "it" in this case) to be names. There is no act of naming, and they certainly have different properties than names - unstable, often vague or ambiguous, humans do them wrong all the time. Though they are convenient. I would be happy to use more anaphora; they are very natural to humans. I would prefer validation against local ambiguity, though. 
 
 
as you might have inferred from my derisive "dip bloop blarp", I hate FORTH style stack manipulation and I think you are making a terrible terrible mistake if you go that route.  It's on the order of magnitude of deciding to use S-Expressions for your syntax... which is to say it's not actually a huge deal, and probably has some advantages, but most programmers (including me!!) will still hate your language because of it.  :D   But seriously.

I think a lot of people who "hate" FORTH style manipulations have not seriously tried it. But that's their prerogative. Haters gonna hate. 

I think this won't be a problem. 

My language started as Awelon Byte Code (ABC), intended as an intermediate and distribution language for an idealized, heterogeneous, RDP-based virtual machine. Initially, primitives were just assigned to UTF-8 characters. (UTF-8 because I like to see the code, and because I felt 256 might not be enough for a heterogeneous machine: GPU vs. CPU, client vs. server, FPGA and DSP, etc.). A concatenative language was an outstanding fit for an arrowized byte code (and also has the nice property that applications are directly useful as software components). ABC can be easily compressed by modeling reusable blocks as first-class values in the type system.

I had, at the time, envisioned that Awelon would be built using variables and a desugaring operation that would translate local use of names into proper arrows. And anyone who wants such a syntax could easily develop one. (Indeed, the bulk of the desugaring computation could even be pushed to Awelon.) I had also envisioned that Awelon would be used in visual programming environments - e.g. boxes and wires, or something more exotic for augmented reality - which is something I'm still interested in.

As I worked with ABC, I realized it's quite expressive, so I tweaked and fiddled until I had a complete language. Awelon, as it exists now, is really just a trivial expansion of ABC: I use words with spaces between them instead of individual UTF-8 codes, and I support module imports. Imports can be cyclic and definitions can be out of order, but definitions cannot be cyclic: the meaning of a word is simple substitution. That's it. The module system doesn't provide any special support, e.g. for recursive definitions (developers must still use fixpoint combinators). It would be trivial to compile any Awelon application into ABC. 
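The "meaning of a word is simple substitution" rule can be sketched as a small expander in Python. The dictionary format, the word names (`swapd`, `twice`), and the choice to treat any undefined word as a primitive are all illustrative assumptions; this is not Awelon's actual compiler.

```python
def expand(word, definitions, _active=()):
    """Expand `word` into a flat list of primitives by repeated substitution."""
    if word not in definitions:
        return [word]                 # assumed: undefined words are primitives
    if word in _active:
        chain = " -> ".join(_active + (word,))
        raise ValueError("cyclic definition: " + chain)
    result = []
    for w in definitions[word]:
        result.extend(expand(w, definitions, _active + (word,)))
    return result

# Definitions may appear in any order, since lookup is by word.
definitions = {
    "twice": ["swapd", "swapd"],
    "swapd": ["take", "swap", "put"],
}
flat = expand("twice", definitions)
```

Because expansion is plain substitution, a word that mentions itself (directly or through other words) has no finite expansion, which is why the sketch rejects cycles and why recursion has to go through a fixpoint combinator instead.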

Anyhow, anyone who doesn't like tacit programming will have other options available. Whether those options are good for them is a different question. Syntactic sugar sometimes leads to syntactic obesity.

Yet, I sincerely believe that developers will find it easy to work within such an environment - assuming a decent IDE that provides continuous visual feedback (which is possible due to the static nature of this environment).  Also, Awelon's environment is much richer than a FORTH stack. Developers do have access to names when they want them (via named stacks) and thus can easily access those rare-use objects (like uniqueness sources) without painfully threading them through the 'one and only' stack.  
 

3) The problem of arbitrariness of naming still exists as arbitrariness of ordering in your scheme, though, right?

Yes and no. Before I listed my main points I described one of the issues: "The types can't easily be picked apart, rearranged, or composed because we lack named types for these different sub-structures and rearrangements". I can pick apart, reorder, and rearrange anonymous structures incrementally. (I can even use *simple* introspection to help perform this incremental mapping in a robust, type-sensitive way.) 

The problem is much more painful with named types - because now we need a bunch of intermediate types, and to eventually integrate with the named-type constructors. Unless our languages have very powerful metaprogramming facilities, this can be very painful. (Indeed, it's often done by hand, which results in fragility to changes in the protocol.)


 
you're never going to have API compatibility for free.  You can make a tool that compares two structures and tries to find a correspondence based on the types, but you can do that with names just as easily as with stack ordering.

True. But we can get a solution for a much lower price if we favor heterogeneous structural types instead of traditional nominative types. And that has a lot of real value. Also, RDP is designed to address other aspects of the same issue (such as the scatter/gather and maintaining consistent views problems) in other ways.
 

4) Names should be shorthand for structural definitions. We use names to get the machine to know what we're talking about, but it should always check what we attempt to do with those names based on the underlying structural properties.  

I also would prefer we use names to refer to structural types, rather than using nominative types. Though, in many cases, names should actually be names. We should model the relationship between reference and referent, and the distinction between them.

