Programming in the Large


Joshua Marinacci

Oct 28, 2013, 2:55:32 PM
to augmented-...@googlegroups.com
I'm catching up with the latest flurry of discussions on Smalltalk, Self, etc. I love all of the ideas and good efforts to make old concepts work in the real world (Smalltalk would be so much better if recreated today). I'd like to suggest something else we should talk about. Something orthogonal to the current discussions. Programming in the large.

Most discussions of types, objects, and syntax focus on the individual programmer: how can I create a program that best represents the algorithm and will be as future-proof as possible? Something that best represents what is inside my head right now. This is good, but it doesn't help much with building large software: the kind of software that spans tens or hundreds of programmers across multiple organizations. The most recent example is the new national health care website, but I think we all have our own favorite examples from industry of a huge system that was behind schedule, over budget, and never worked quite right, if at all. Our current tools just don't help with building software on this scale.

During my career the biggest advances in building large software seem to have been unit tests and continuous integration. They help express programmer intent to others (unit tests) and provide an early warning system when things are going awry (continuous integration). Distributed version control systems act as multipliers that make both work even better.

So, what are some other pie-in-the-sky systems that would make programming in the large better? Here are a few ideas:


* visual programming at the component level and higher: see how components are connected, attach probes for performance measurement, work across multiple processes and machines. A way to see the design and health of your whole system.

* pattern finders that look for semi-duplicated code, indicating places where a common lib could be introduced and shared by multiple modules and programmers.

* an automated repo of common open source components that can be searched by memory usage, language, performance, etc. not just ratings.

* screensharing and social networking built into the IDE. call up a friend to help you at any moment.

* wysiwyg docs integrated into the IDE.

* wysiwyg for DSLs

* a tool to calculate the real dependencies of any 3rd party library. no more pulling in a lib and then discovering it has 18 other dependent libs.


What are your ideas?
- J



David Barbour

Oct 28, 2013, 3:29:26 PM
to augmented-...@googlegroups.com
The best approach for programming in the large is to focus on composition before all else, even before correctness. I mean composition in a formal sense. 

There is a set of compositional operators, components, and compositional properties. 

* The compositional operators are universal. They apply to every component.
* The set of components is closed over the compositional operators. Every composite is a component.
* Compositional properties must be mathematically inductive. I.e. P(A*B) = f(P(A), `*`, P(B)).

Compositional properties are exactly those we can reason about without knowing the implementation details of the components, and thus exactly those that are useful once a system grows too big for our heads or even for whitebox analysis. If we want to scale, then we need compositional properties useful for scaling - e.g. for performance, progress, type safety, open extensibility, resource control, process control, security, runtime update, failure modes. Compositional properties offer a much more constructive, formally justifiable approach to modularity and scalability. 
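
As a rough illustration (a fresh Haskell sketch with made-up names and numbers, not anything from RDP), here is a component model where worst-case latency is a compositional property: the latency of a composite is computed only from the latencies of its parts and the operator used to combine them.

data Component a b = Component
  { run     :: a -> b
  , latency :: Double  -- illustrative worst-case latency bound, in milliseconds
  }

-- Sequential composition: P(A >>> B) = P(A) + P(B)
(>>>) :: Component a b -> Component b c -> Component a c
f >>> g = Component (run g . run f) (latency f + latency g)

-- Parallel composition on pairs: P(A *** B) = max(P(A), P(B))
(***) :: Component a b -> Component c d -> Component (a, c) (b, d)
f *** g = Component (\(x, y) -> (run f x, run g y)) (max (latency f) (latency g))

Nothing about the implementations of f and g is needed to reason about the latency of their composite; that is what makes the property useful once the system no longer fits in one head.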

But composition can be challenging to do well: how do we achieve more compositional properties without sacrificing expressiveness? How do we keep the system suitable for general purpose programming, while still supporting certain kinds of compositional reasoning? I've spent many years working on this problem, and my best approach by far is Reactive Demand Programming (RDP). More recently, the pursuit of composition has led me to push it all the way to the syntax, i.e. focusing on tacit concatenative language.

I am also interested in the social aspects of in-the-large programming. I think wikis to share code and refactor across hundreds of projects would be a good thing. However, I don't believe we can effectively solve any of the social challenges without first addressing the technical challenges.

NOTE: *Type systems*, by nature, tend to be compositional, i.e. we can know the type of a structure by knowing the types of its elements. However, *types* individually are often not compositional - e.g. there is no generic way to create a larger class from two smaller classes. In this sense, traditional type systems have often failed us. Something similar can be said of module systems.

NOTE: Graphical programming, historically, has been a unilateral failure for in-the-large programming. My hypothesis is that this is mostly due to the first-order nature of graphical PLs, historically. And partly due to the closed nature of most visual programming environments, i.e. it's often difficult to build extra tools that didn't come with the IDE. But graphical PLs do get one thing right, which is the focus on components and a standard means to compose them.





John Nilsson

Oct 29, 2013, 9:23:23 AM
to augmented-...@googlegroups.com
The one question that keeps popping up in my head is "what breaks if I change this?"
In large systems that involves tracking currently deployed subsystems, current requirements, and such.
Tools like database compare, for deploying and versioning structural changes, help somewhat.
I guess some form of high-level requirements modelling, insisting on keeping referential integrity intact between current design decisions and current design motivations, would keep the design honest.
Every decision needs to be negotiable to avoid legacy debt from stopping you.

BR ,
John
Skickat från min iPhone

David Barbour

Oct 29, 2013, 9:49:00 AM
to augmented-...@googlegroups.com
It would be convenient if configurations or environments were modeled formally within the language, accessible for source-control and cross-compilation, marginalizing use of autoconf, environment variables, ad-hoc versioning, and similar. Even better if configurations were executable, representing a mockup environment for regression testing. We could still decide to stop supporting certain legacy configurations, but at least it would be a conscious decision.
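
As a minimal sketch of what that might look like (Haskell; the Environment record and all field names here are made up for illustration), a configuration modeled as an ordinary typed value can be checked by the compiler, diffed under source control, and handed to a regression test as an executable mockup:

-- A configuration as a first-class, version-controllable value,
-- instead of ad-hoc environment variables and autoconf output.
data Environment = Environment
  { dbUrl       :: String
  , workerCount :: Int
  , legacyMode  :: Bool
  } deriving (Show, Eq)

-- An executable mockup environment, usable directly in regression tests.
stagingMockup :: Environment
stagingMockup = Environment
  { dbUrl       = "postgres://localhost/test"
  , workerCount = 2
  , legacyMode  = False
  }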

Compositional properties can also be useful for understanding what *could* break when you make certain changes, and more usefully you could ensure that some properties won't break.

Matt M

Oct 29, 2013, 1:10:08 PM
to augmented-...@googlegroups.com
> The best approach for programming in the large is to focus on composition before all else, even before correctness. I mean composition in a formal sense.

Composition is easy if you're willing to sacrifice correctness :). I take your point, though, to be that sometimes "correct behavior" only means "desirable behavior", and choosing a slightly undesirable behavior for improved compositional reasoning can be a net win. I agree. But in my opinion, we're never going to find a fixed set of compositional operators that works (even "well enough") for all projects. We're in no-magic-bullet territory here. Programming-in-the-large should be about choosing or building a framework of abstractions that allows the kinds of composition that are appropriate for the problem domain. As you and I have discussed ad nauseam in the past, standardization of libraries is a big problem to solve here, both technically and socially.

Matt

David Barbour

Oct 29, 2013, 2:06:55 PM
to augmented-...@googlegroups.com
On Tue, Oct 29, 2013 at 12:10 PM, Matt M <mclella...@gmail.com> wrote:
> The best approach for programming in the large is to focus on composition before all else, even before correctness. I mean composition in a formal sense.

Composition is easy if you're willing to sacrifice correctness :).

I don't believe so. Unless by 'correctness' you mean "ignoring any specification or requirements other than composition".

One might say that correctness is about "which properties are necessary for this subprogram to fulfill its purpose?" (or "meet its requirements", or "implement its specification", depending on which definition of 'correctness' you favor). Composition, meanwhile, is about "which useful properties can I reason about compositionally across a standard set of operators?" These sets of properties often intersect, but they don't intersect for every subprogram. Not every correctness property is compositional. I think most aren't. Similarly, not every compositional property seems immediately, locally useful for every subprogram. We suffer the limits of human focus and foresight. 

Anyhow, what I was trying to say is that we should favor compositional properties even when it hinders our ability to express what we (with our limited vision and foresight) understand to be a correct solution. Whenever composition conflicts with specifications or requirements, the latter are in the wrong. Go back to your client and clarify your true requirements.

 
we're never going to find a fixed set of compositional operators that works (even "well enough") for all projects

True. We can have multiple layers of composition. Documents, diagrams, graphs, geometries, relations, grammars, etc. can compose in ways that your general purpose computation model might not. They can have different operators and properties. I'm not suggesting we use a fixed set of operators and component models. But I believe we should discard every model that isn't compositional, and we should also enforce that these problem-specific structures are built upon a more general purpose compositional model.


Programming-in-the-large should be about choosing or building a framework of abstractions that allows the kinds of composition that are appropriate for the problem domain.

If you're focused on a single problem domain, I would call that "programming-in-the-small". In my mind, "programming-in-the-large" is by nature multi-disciplinary, cross-domain. 

Best,

Dave



Justin Chase

Oct 29, 2013, 10:46:58 PM
to augmented-...@googlegroups.com
I think the importance of domain specific languages is not to be underestimated. They are currently quite expensive to make and maintain, which is prohibitive. But I believe that if they were easy enough to make, it would dramatically increase productivity and the maximum scale of applications.

More generally I believe scale is about abstraction. Your scale is restricted by the abstraction techniques made available by your tools. I believe we will continue to learn more abstraction techniques but I believe the full power of DSLs has yet to be tapped.




David Barbour

Oct 29, 2013, 11:34:24 PM
to augmented-...@googlegroups.com

DSLs can be useful, but more of a good thing isn't always better. DSLs can become translation barriers between subsystems, requiring widespread interpreters, hindering reuse, integration, and metaprogramming. Productivity and scale can be hurt.

Somewhere between domain specific and general purpose we can find sweet spot languages that are good for common "classes" of problems. Examples include relational calculus, linear algebra, constraint models, tuple spaces. We should find these.

But I think there aren't so many of them that making DSLs easier to build will necessarily pay for itself. Especially not if the result, in practice, is a bunch of poorly designed DSLs.

Jake Brownson

Oct 30, 2013, 11:54:00 AM
to augmented-...@googlegroups.com
> DSLs can be useful, but more of a good thing isn't always better. DSLs can
> become translation barriers between subsystems, requiring widespread
> interpreters, hindering reuse, integration, and metaprogramming.
> Productivity and scale can be hurt.

I think the answer to this is generating rather than interpreting. If you generate from a DSL to a more common language, then two DSLs that don't know about each other directly, but do know the common language, can still be integrated at some common level. If you're working in a structured environment rather than text, this becomes even more effective.
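
A toy sketch of the idea (Haskell; the two DSLs and the target language here are invented purely for illustration): two DSLs that know nothing about each other both generate into a shared target, and integration happens at that common level.

-- Shared target: a tiny imperative core.
data Target = Emit String | Block [Target]

-- DSL 1: a logging DSL.
data LogSpec = Info String | Warn String

compileLog :: LogSpec -> Target
compileLog (Info m) = Emit ("log INFO " ++ m)
compileLog (Warn m) = Emit ("log WARN " ++ m)

-- DSL 2: a retry-policy DSL, which wraps any target program.
data Retry = Attempts Int

compileRetry :: Retry -> Target -> Target
compileRetry (Attempts n) body = Block (replicate n body)

-- The two DSLs compose only through the common target:
program :: Target
program = compileRetry (Attempts 3) (compileLog (Warn "disk nearly full"))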

> But I think there aren't so many of them that making DSLs easier to build
> will necessarily pay for itself. Especially not if the result, in practice,
> is a bunch of poorly designed DSLs.

We'll get those either way. There will just be more of both if it's
easier. Easy home recording made for a bunch more terrible music, but
there are definitely some gems out there that are worth finding and
rise to the top.

David Barbour

Oct 30, 2013, 1:56:29 PM
to augmented-...@googlegroups.com
I agree that the generative approach is more promising. This is essentially "bring your own interpreter". 

Unfortunately, it isn't "the answer". An 'interpreter' for a DSL can effectively be a whole framework in the target language. The translation barriers still exist, they simply shift form: instead of syntactic barriers, there are barriers in the form of protocols, callbacks, variations in data types, update models, interference, failure handling, and resource management. Ultimately, unless you address the problem of composing independently developed frameworks, use of DSLs can still damage integration, productivity, and scalability.

I believe that bi-directional reactive programming models are a promising basis for composable frameworks. In particular, such models address a lot of the callback, connectivity, and resource management challenges. 

Josh Marinacci

Nov 1, 2013, 12:45:56 PM
to augmented-...@googlegroups.com
This is exactly the sort of thing I’m thinking of. Imagine we have a giant database which stores and understands your code. Not just the code of your current project, but your other projects and open source projects. As much code as it can get. This magic database understands the code structure and can answer questions about all of the code. What sort of questions would we ask it?

John Nilsson

Nov 1, 2013, 4:11:34 PM
to augmented-...@googlegroups.com
Not only the code. The state of instances using it, too. E.g. adding a NOT NULL constraint to a database is easy if there are no null values.
The question is, do we need the magic database, or should our tools demand that all changes be statically verified to be deployable? The add-not-null syntax could require an 'if null' clause, for example.

BR
John

Skickat från min iPhone

Justin Chase

Nov 1, 2013, 8:47:39 PM
to augmented-...@googlegroups.com
I humbly and respectfully, completely disagree with you :)

It's true that DSLs can be a hindrance, but I believe that is due to two factors: they are expensive to make and they are difficult to make well. But it is my hypothesis that neither of these issues necessarily needs to be true, and that if both of these limitations were removed it would usher in a new age of software development.

At each increase in the level of abstraction we have seen dramatic improvements in scale and productivity (which are strongly related, in my opinion). This is why we seem to be peaking at certain scales: today the time it takes to build a large-scale application can exceed its intended life-span, which leaves the project obsolete before it is even deployed.

And it's all about abstraction... The lower the level of abstraction a programmer has to deal with, the more time consuming it is. The value of domain specific languages is that they bring a new type of abstraction (new to general purpose languages; DSLs themselves are old, of course), what I'm calling syntactic abstraction. Additionally, DSLs bring in more constraint, which I also think is key to scaling. With a DSL you are able to constrain the developer to the correct design pattern at the layer of the application the developer is currently working on. In modern programming the highest level of abstraction really available to you is a class (or component or something similar), and to express a design pattern you often need many classes. When you then go to change or add to the application, it can be quite time consuming to understand all of the classes needed to implement the pattern, and even more time consuming to know what not to do. This is why, I believe, general purpose languages are not sufficient beyond a certain scale. No matter how terse your language is, no matter whether you like curly brackets or not, if you don't have the ability to add syntactic abstractions in your language you are not fundamentally that much more awesomer than all of the other general purpose languages that came before you.

Pattern matching seems to be the glue that will bind it all together. 

Here's an example of something I've been working on:

This is a grammar file, written in the grammar it defines. It's based on pattern matching as defined by OMeta, but in the flavor of .NET plus some of my own interpretations. The interesting thing about it is that it has "inheritance": you'll notice that this grammar inherits from LangParser, which is the grammar for a general purpose language.

In order to gain the ability to parse statements and expressions for free you can simply inherit from LangParser and modify its grammar rules as you see fit. This is technically an external DSL but it allows you to gain (or drop) features from a general purpose language at low cost, and have a lot of cohesion with other languages that use the same grammars.

Additionally you can add metadata to your rules and use that metadata to generate language services for tools for almost free:

That workbench I have is also interactive: you can create the grammar, add input, and interactively see the results (including the highlighting!).

I'm not saying this is "the way", but it's an example of how there are still new ideas in the world of DSLs and new avenues to experiment with; please don't disregard it just yet. I hypothesize that the full realization of DSLs will be as important a step in computing as general purpose languages were relative to assembly. I have many more thoughts on this subject, but this will have to do for now.

Here are some other really interesting projects that are similar in spirit to check out:





David Barbour

Nov 2, 2013, 12:10:53 AM
to augmented-...@googlegroups.com
On Fri, Nov 1, 2013 at 7:47 PM, Justin Chase <justn...@gmail.com> wrote:

It's true that DSL's can be a hindrance but I believe that is due to two factors: they are expensive to make and they are difficult to make well. But it is my hypothesis that neither of these issues necessarily needs to be true

I believe the issue is more fundamental to the nature of DSLs: first, by their very nature of being "domain specific", the cross-domain integration issues aren't well addressed, or (even worse) aren't addressed consistently. Second, by the normal process of development, we have integration or translation issues even within a domain. If these integration issues are well addressed, the result is a general purpose language.

I'm not ignorant of the DSL concept or movement. I spent most of 2006-2007 focusing on syntactic abstraction, thinking that DSLs were great, that making them easy to make was "the" answer. I studied things like Christiansen grammars, John Shutt's thesis on Adaptive Grammar models, Dinechin's XL language. 


As I pursued them further, I found DSLs tend to create more problems than they solve. Not only is there a great duplication of design effort across domains, but even within domains there are many arbitrary and subtly incompatible decisions on the data and update models (e.g. about which data is delivered in summarized form, or which data is delivered together in an update message). The amount of integration code tends to grow in a combinatorial fashion with the number of DSLs, and the *quality* of the integration tends to drop to the lowest common denominator.

Now, perhaps what you meant by "difficult to make well" is some of these integration issues, as opposed to merely being effective within their specific domain. But unless we find some way to overcome the narrow focus of DSL developers, these problems will stubbornly persist. A potential way to address these problems is to focus on compositional properties in a general purpose language in which the DSLs are expressed to start with. 

The most systematic use of DSLs I've seen is (surprisingly) in the REBOL community, and they accomplish it by centralizing the development of DSLs so there is (effectively) one for each class of problems. But they also sacrificed a lot of other useful properties, such as modularity, concurrency, and control.

 
at each increase in the level of abstraction we have seen dramatic improvements in scale and productivity

If that were true, we'd have seen dramatic improvements in scale and productivity from people who use logic or constraint-based programming. But it isn't true. It isn't the LEVEL of abstraction that determines scalability and productivity. It is the CHOICE of abstractions - our choice to express and enforce some properties, while hiding others. The choice between what is relevant and what is irrelevant is easy to get wrong, and we can get it wrong at any 'level' of abstraction. 

When we make wrong choices regarding the relevance of modularity, extensibility, composability, portability, concurrency, resource control, failure tolerance, failure handling, progress and fairness, predictable performance, integration, persistence, update, security, etc., we will be hurt by them.

 
The lower the level of abstraction a programmer has to deal with the more time consuming it is.

Similarly, if a problem is low level then trying to deal with it at a high level can be very painful or even impossible. We should be able to express solutions with an appropriate level of abstraction, be it high or low. Some general purpose languages (Forth, Scheme) are good at enabling programmers to move easily upon the ladder of abstraction. Many are not. Are there any DSLs that have the same property without essentially extending a general purpose language? If so, I have not encountered them.

 
In modern programming the highest level of abstraction really available to you is a class (or component or something similar), and to express a design pattern you often need many classes. When you then go to change or add to the application, it can be quite time consuming to understand all of the classes needed to implement the pattern, and even more time consuming to know what not to do.

Classes are not compositional. There is no generic operator to usefully combine two classes into a larger one. There are no useful compositional properties to support reasoning about changes. Classes also have far too many responsibilities - name, namespace, interface, implementation, constructor, encapsulation, information hiding - and even more in ambient authority systems. This hinders fine-grained decomposition of responsibilities. 

It isn't a good idea to generalize from classes to other component models. Components should at least be composable, so when you use many components that pattern becomes a reusable abstraction - another component.  A good component model should also support reasoning about useful compositional properties. When you go to change the application, you'll have a good understanding of exactly how far some of the more important properties will propagate. 


No matter how terse your language is, no matter whether you like curly brackets or not, if you don't have the ability to add syntactic abstractions in your language you are not fundamentally that much more awesomer than all of the other general purpose languages that came before you.

Abstraction is important, but it doesn't need to be syntactic. Abstraction can be expressed in types, structures, substructure, and vocabulary.
 

Pattern matching seems to be the glue that will bind it all together. 

I've programmed in Haskell for a few years, and I still find that kind of pattern matching painful and first-order compared to `either` and `maybe`, and even those are painful compared to (+++) and (|||). Patterns are so much pointless repetition of structure. Have you ever tried arrows, lenses, zippers?
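
For a small, purely illustrative comparison of the styles (standard Haskell; (|||) and (+++) come from Control.Arrow):

import Control.Arrow ((+++), (|||))

-- Explicit pattern matching: the structure of Either is restated at every use site.
describe :: Either Int String -> String
describe e = case e of
  Left n  -> "number: " ++ show n
  Right s -> "text: " ++ s

-- The same function, point-free, using (|||) (which for plain functions is `either`):
describe' :: Either Int String -> String
describe' = (("number: " ++) . show) ||| ("text: " ++)

-- (+++) maps both branches while preserving the Either structure:
normalize :: Either Int String -> Either Int Int
normalize = (+ 1) +++ length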


This is a grammar file [..] The interesting thing about it is that it has "inheritance" [..] to parse statements and expressions for free you can simply inherit from LangParser and modify its grammar rules as you see fit.

I've used some similar approaches to building grammars, back in 2007, though more functional in nature: a grammar was modeled as a data structure, I could modify a grammar, and extend and redact rules by name. Even though these days I favor tacit concatenative syntax (and thus, I don't need a grammar for parsing), I still find grammars a valuable way to model state machines and generators. Generative, grammar-based programming, where I generate type-safe sentences (for context sensitivity), seems a very powerful alternative to logic programming.
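
Roughly, the idea looks something like this (a fresh Haskell sketch, not the original code; the rule representation and names are invented for illustration):

import qualified Data.Map as M

-- A grammar as a first-class value: a map from rule names to rule bodies.
data Rule = Terminal String | Seq [Rule] | Alt [Rule] | Ref String
  deriving Show

type Grammar = M.Map String Rule

-- Extend (or override) a rule by name.
extend :: String -> Rule -> Grammar -> Grammar
extend = M.insert

-- Redact a rule by name.
redact :: String -> Grammar -> Grammar
redact = M.delete

baseLang :: Grammar
baseLang = M.fromList
  [ ("expr",   Alt [Ref "number", Seq [Ref "expr", Terminal "+", Ref "expr"]])
  , ("number", Terminal "[0-9]+") ]

-- A derived grammar that "inherits" from baseLang and adds a string literal rule.
myLang :: Grammar
myLang = extend "string" (Terminal "\"[^\"]*\"") baseLang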
 

Additionally you can add metadata to your rules and use that metadata to generate language services for tools for almost free:

You've done some good work. I never had the idea of automatically generating language scripts/plugins for my favorite text editors or IDEs.
 

there are still new ideas in the world of dsl's and new avenues to experiment with, please don't disregard it just yet.

I haven't disregarded DSLs. I've regarded them very carefully. But even if we solved the problems of integrating DSLs, DSLs are ultimately an expression of narrowly focused, in-the-small programming. I don't understand the logical step that so many people seem to assume between "I can solve a lot of small, specific problems" and "I can solve one big problem". Systemic problems can't be effectively addressed by curing the symptoms.


Joshua Marinacci

Nov 2, 2013, 2:40:29 AM
to augmented-...@googlegroups.com
Funny you should mention OMeta. I’ve been playing with OMeta/JS. For certain tasks it is absolutely the most compact way to describe an algorithm.  Here is a PNG metadata parser I wrote in less than 20 lines:



ometa BinaryParser <: Parser {
    //entire PNG stream
    start  = [header:h (chunk+):c number*:n] -> [h,c,n],
   
    //chunk definition
    chunk  = int4:len str4:t apply(t,len):d byte4:crc
        -> [#chunk, [#type, t], [#length, len], [#data, d], [#crc, crc]],
   
    //chunk types
    IHDR :len  = int4:w int4:h byte:dep byte:type byte:comp byte:filter byte:inter
        -> {type:"IHDR", data:{width:w, height:h, bitdepth:dep, colortype:type, compression:comp, filter:filter, interlace:inter}},
    gAMA :len  = int4:g                  -> {type:"gAMA",value:g},
    pHYs :len  = int4:x int4:y byte:u    -> {type:"pHYs", x:x, y:y, units:u},
    tEXt :len  = repeat('byte',len):d    -> {type:"tEXt", data:toAscii(d)},
    tIME :len  = int2:y byte:mo byte:day byte:hr byte:min byte:sec
        -> {type:"tIME", year:y, month:mo, day:day, hour:hr, minute:min, second:sec},
    IDAT :len  = repeat('byte',len):d    -> {type:"IDAT", data:"omitted"},
    IEND :len  = repeat('byte',len):d    -> {type:"IEND"},
   
    //useful definitions
    byte    = number,
    header  = 137 80 78 71 13 10 26 10    -> "PNG HEADER",        //mandatory header
    int2    = byte:a byte:b               -> bytesToInt2(a,b),    //2 bytes to a 16bit integer
    int4    = byte:a byte:b byte:c byte:d -> bytesToInt(a,b,c,d), //4 bytes to 32bit integer
    str4    = byte:a byte:b byte:c byte:d -> toAscii([a,b,c,d]),  //4 byte string
    byte4   = repeat('byte',4):d -> d,
    END
}




John Nilsson

Nov 2, 2013, 7:21:42 AM
to augmented-...@googlegroups.com
It seems to me that DSLs and grammars just add a lot of noise to the problem. At some level a DSL is really about a domain-specific meta-model, while a grammar is really only about how to translate streams of tokens into structures of tokens, usually a tree structure.
If you instead start from a sufficiently rich structure like a graph, and then define how nodes and edges in this graph relate to domain-specific meta-models, you essentially bypass all the noise of DSLs and can focus entirely on how those meta-models relate to each other, and thus their compositional properties.
In practice I believe this should be done by solving two problems:
1. Create a usable editing environment for arbitrary graph structures (described by metamodels)
2. Make the base graph metacircular so all model reasoning can be done within the graph itself (i.e. support arbitrary levels of meta-meta-modelling within the graph)
I tried to look at what OMG had done in this space in an effort to implement this, but I guess they are too focused on generating Java code to produce anything usable for a general purpose infrastructure.

BR,
John


David Barbour

Nov 2, 2013, 9:34:05 AM
to augmented-...@googlegroups.com
Good point, John. I hadn't even been thinking about the challenge of reading DSLs when I was describing their problems. I was assuming they could start in a structured form (like a Haskell EDSL, or a composite object pattern), or were trivial to parse (like an XML-based DSL). It may be that the deeper problems of DSLs are less obvious to someone focused on the surface pains of reading them or adding syntax highlighting.

But I'm not following your vision regarding the use of a graph. Are you envisioning some kind of graph rewrite model? or a dataflow graph?

Best,

Dave



 

Justin Chase

Nov 2, 2013, 3:38:08 PM
to augmented-...@googlegroups.com
I think this is one of the innovations of OMeta actually. When you typically think about grammars you think about translating a stream of tokens as you say. But OMeta is actually designed to transform graphs into other graphs, where a stream of tokens is just another, flatter, kind of graph. 

OMeta’s key insight is the realization that all of the passes in a traditional compiler are essentially pattern matching operations...

For example, in meta# there are actually two grammars for the Grammar DSL. One (the one I showed you) translates text into a graph; the second transforms that graph into the Lang model graph, then that transforms into another graph, etc. And the interesting part is that each of these transformations is expressed as a pattern matching operation. In a traditional compiler you would write each of these steps in multiple distinct ways, sometimes using a tool (e.g. lex/yacc) or manually creating visitor / state machine steps in a general purpose language. It can be quite difficult to even know how to do all of this, and it can be even more difficult to get it right. But with a pattern matching tool they can all be written in the same language with the same concepts.

Which means if you can model each layer of your application, you can then describe their relationships as patterns and have tiers of abstractions.
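
For instance (an illustrative Haskell sketch, not meta# itself; the two tiny languages are made up), a whole compiler pass can be nothing but a pattern match from one tree to another:

-- Surface syntax and a smaller core language.
data Surface = Num Int | Add Surface Surface | Neg Surface
data Core    = Lit Int | BinOp String Core Core

-- The lowering pass is just pattern matching over the input structure.
lower :: Surface -> Core
lower (Num n)   = Lit n
lower (Add a b) = BinOp "+" (lower a) (lower b)
lower (Neg a)   = BinOp "-" (Lit 0) (lower a)  -- negation desugared to subtraction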




John Nilsson

Nov 2, 2013, 5:50:49 PM
to augmented-...@googlegroups.com
My thinking on the graph front is that, as you point out, an EDSL lets you work with your DSL within a predefined structure. The host language provides reusable semantics and possibly the features needed to combine several EDSLs. On the other hand, the host also interferes by imposing its own metamodel on the DSL. Thus I was thinking that the host language should have as little semantic interference as possible by not being a general purpose programming language, and instead being a model specifically designed to host other models and to model their relations to each other. I just assume that a graph would be the top-level metamodel for this.

In a general purpose programming language I don't think you have sufficient flexibility in adding levels of meta for this. In general you have a metametamodel defined by the compiler, your metamodel defined by the types, and lastly the runtime model.

To allow composing various DS-models I think you need to be able to abstract over, and work directly with, at least the metametamodel. To have both Haskell, Java and SQL in the same model, and to allow some forms of code reuse for editing and validity checking, there needs to be a common metamodel at some level. Currently the common base used is streams of characters. In some sense there is also a model shared between them that describes common elements such as identifiers, value bindings and such, but mostly this has to be formalized again for each implementation (e.g. within an ORM library).

The common metametamodel should be somewhat more structured than a stream of characters, hence the graph-based substrate. XML and the surrounding standards are somewhat there, but still seem suboptimal once you leave documents or tree structures.

BR,
John



John Nilsson

Nov 2, 2013, 8:37:12 PM
to augmented-...@googlegroups.com
Just to contrast things with parsing a bit more: I envision an editing environment that enforces integrity between model elements, including between versions of the model. A full model-to-model translation on each change would probably not be viable from a performance perspective, so parsing doesn't really seem relevant.
Instead I imagine something like model-binding (as used in angularjs and such frameworks), but across many more models in various directions.
In my mind I see something like current SQL server systems, i.e. a live system in which you can execute something like DDL-scripts to transform and extend models and model derivations. Only much more flexible in what kinds of models can be expressed.
Btw, this kind of ties in with David's idea of every user action being an act of meta programming. With SQL systems this is exactly what you do: even if the GUI tools offer some convenience while working with the structure of the database, they are implemented by executing the user actions as DDL-scripts.

BR,
John

David Barbour

Nov 2, 2013, 8:47:06 PM
to augmented-...@googlegroups.com
On Sat, Nov 2, 2013 at 4:50 PM, John Nilsson <jo...@milsson.nu> wrote:
the host also interferes by imposing its own metamodel on the DSL

Interference can be a good thing, when it takes the form of "guidance" rather than "obstacles". DSL developers wish to focus on their own domain, and don't want to think about interop. I believe guidance from the host language - i.e. to force developers to address interop concerns (composition, extension, update, concurrency) - will be essential if we are ever to adequately support DSLs. But we also need to find better metamodels than procedural abstraction.


In a general purpose programming language I don't think you have sufficient flexibility in adding levels of meta for this. In general you have a metametamodel defined by the compiler, your metamodel defined by the types, and lastly the runtime model.

General purpose programming languages can be very flexible, much more so than Java and Haskell. 

Historically, we sacrificed a lot of flexibility in the name of reasoning: e.g. structured programming, lexical scope. These days, we have the tools to reason about code without losing flexibility: e.g. dependent types, substructural types, first-class continuations instead of gotos. 

I find concatenative languages (Forth, Factor, PostScript, etc.) especially fascinating due to their ability to absorb so many paradigms and metamodels while keeping such a simple syntax and base semantics. 

Best,

Dave

David Barbour

Nov 2, 2013, 8:56:07 PM
to augmented-...@googlegroups.com

On Sat, Nov 2, 2013 at 7:37 PM, John Nilsson <jo...@milsson.nu> wrote:
I imagine something like model-binding (as used in angularjs and such frameworks), but across many more models in various directions [..] a live system in which you can execute something like DDL-scripts to transform and extend models and model derivations. Only much more flexible in what kinds of models can be expressed.

This sounds similar to what I'm pursuing with RDP. I describe some of the challenges here: 



 

John Nilsson

Nov 2, 2013, 9:10:39 PM
to augmented-...@googlegroups.com
Yeah, I should have been more clear about that. I do agree fully that there has to be interference from the host language. I'm just saying that the host language should be especially designed to be a host language, and be very picky about exactly what semantics it chooses to impose on its hosted models. Most importantly, it should allow for adding and composing such semantics into sub-hosts.

Also, I think many languages fight too much with problems relating to the limitations of their representation as streams of text as the primary interface to be fertile ground for innovation in this area. Instead it would be interesting to handle even the editing interface as pluggable metamodels.


Josh Marinacci

Nov 2, 2013, 10:00:36 PM
to augmented-...@googlegroups.com
Indeed. I really hope we aren’t programming in ASCII text files in the year 2100.

Matt M

Nov 3, 2013, 10:11:17 AM
to augmented-...@googlegroups.com
Of course not.  Everyone will be using UTF8 text files by then.


Paul Tarvydas

Nov 3, 2013, 9:19:20 PM
to augmented-...@googlegroups.com
On 02/11/2013 8:47 PM, David Barbour wrote:
> ...
> I find concatenative languages (Forth, Factor, PostScript, etc.)
> especially fascinating due to their ability to absorb so many
> paradigms and metamodels while keeping such a simple syntax and base
> semantics.
>
David,

Are you aware of S/SL (syntax/semantic language, Holt, et al)? PT
Pascal, Concurrent Euclid, Turing, etc.

pt

David Barbour

Nov 4, 2013, 12:46:51 AM
to augmented-...@googlegroups.com
I wasn't. I'll check them out.



David Barbour

Nov 4, 2013, 7:53:55 AM
to augmented-...@googlegroups.com
S/SL and TXL were interesting. Thanks for pointing me to them.