Little language design/implementation guidance


Stephen De Gabrielle

Feb 6, 2019, 11:25:54 AM
to Racket Users

Hi,


Dave Herman recently tweeted[1] that consulting a PL specialist was a good idea for little languages to avoid common foundational mistakes, specifically mentioning templating systems and configuration files.


So when designing (or evolving) a little language:

  • What beginner's mistakes should you avoid? (Some are in the subsequent tweets I’ve copied below.)
  • When should you consult a PL specialist? (And how do you identify a PL specialist?)
More generally, can you recommend resources a developer (without a background in language design) should refer to if they are building a simple templating system, application configuration file, etc. - that may grow into a little language?

Kind regards,

Stephen 

PS please forgive me if this is the wrong list for this question.



[1] https://twitter.com/littlecalculist/status/1092160821944213504?s=21


It's frustrating that PL is considered such a specialization that PL people only get brought in for big languages. There are vastly more little languages everywhere. People often don't realize their little language needs better underpinnings until very late, if at all.



Common mistakes tweeted by Dave Herman in the subsequent thread:


Scenario-solving: language design at its heart is figuring out what your composable primitives are. Too often, people think of use cases one scenario at a time and just sort of glue them together w/o generalizing and simplifying.



Reinventing lexical scope: many systems start by using string concatenation as their core model, instead of composing modules with a rationalized notion of scope. Another common scoping mistake is exposing variables as mutating a global scope.
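The scoping point can be illustrated with a minimal sketch in Python (not from the tweets). It assumes a hypothetical template tree made of `text`, `var`, `let` and `seq` nodes; the node names and the `render` function are invented for illustration. Each `let` creates a child environment instead of mutating a shared global one, so a binding is visible only inside its body.

```python
from collections import ChainMap


def render(node, env):
    """Render a template tree against a lexically scoped environment."""
    kind = node[0]
    if kind == "text":
        return node[1]
    if kind == "var":
        # Lookup walks from the innermost scope outward.
        return str(env[node[1]])
    if kind == "let":
        # ("let", name, value, body): the binding exists only while
        # rendering body; the enclosing environment is never mutated.
        _, name, value, body = node
        return render(body, env.new_child({name: value}))
    if kind == "seq":
        return "".join(render(child, env) for child in node[1])
    raise ValueError(f"unknown node kind: {kind!r}")


template = ("seq", [
    ("text", "Hello, "),
    ("let", "name", "inner", ("var", "name")),
    ("text", " and "),
    ("var", "name"),
])

globals_env = ChainMap({"name": "outer"})
result = render(template, globals_env)
print(result)  # Hello, inner and outer
```

The second `name` still resolves to the outer value because the inner binding never leaked out of its `let`.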



Lack of abstraction mechanisms: when you don't think of yourself as designing a language, you put up with boilerplate and copy-pasta.



Lack of strategy for general-purpose logic: ultimately most DSLs end up needing general-purpose programming language—often it's in a minority of cases but when you need it you really need it. The best ones IMO have clear extension points and contracts with a general-purpose PL.
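One way to read "clear extension points and contracts" is sketched below in Python, under assumptions of my own: a hypothetical config format in which a value is either plain data or a `{"call": ..., "args": [...]}` form. The `HOST_FUNCTIONS` registry, the `host_function` decorator and `evaluate` are invented names. The config language stays declarative, and general-purpose logic lives in registered host functions.

```python
HOST_FUNCTIONS = {}


def host_function(name):
    """Register a host-language function under a name the config may call."""
    def register(fn):
        HOST_FUNCTIONS[name] = fn
        return fn
    return register


@host_function("upper")
def upper(text):
    return text.upper()


@host_function("join")
def join(sep, *parts):
    return sep.join(parts)


def evaluate(expr):
    """Evaluate a config value: plain data, or a call to a registered function."""
    if isinstance(expr, dict):
        if "call" in expr:
            name = expr["call"]
            if name not in HOST_FUNCTIONS:
                # The contract: only explicitly registered functions are callable.
                raise KeyError(f"config calls unregistered function {name!r}")
            args = [evaluate(arg) for arg in expr.get("args", [])]
            return HOST_FUNCTIONS[name](*args)
        return {key: evaluate(value) for key, value in expr.items()}
    if isinstance(expr, list):
        return [evaluate(item) for item in expr]
    return expr


config = {
    "title": {"call": "upper", "args": ["little languages"]},
    "path": {"call": "join", "args": ["/", "srv", "app"]},
    "retries": 3,
}
settings = evaluate(config)
print(settings)
```

The design choice here is that the little language never grows its own string functions or conditionals; anything beyond data is delegated through one narrow, checkable door.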




Neil Van Dyke

Feb 6, 2019, 8:14:18 PM
to Racket Users
Stephen De Gabrielle wrote on 2/6/19 11:25 AM:
>
> Dave Herman recently tweeted

BTW, Dave Herman was a great contributor of Racket (PLT Scheme)
packages, and has been missed, though I understand the attraction of
Mozilla at the time.

> More generally, can you recommend resources a developer (without a
> background in language design) should refer to if they are building a
> simple templating system, application configuration file, etc. - that
> may grow into a little language?

If I could offer a few quick general guidelines instead:

* Consider whether declarative or imperative is better.  Declarative can
let you do more static reasoning (even though you might implement by
syntactic transformation to imperative Racket), and is often the thing
to do.  However, sometimes imperative is what you want even when
initially it looks like declarative, especially when you get into the
next guideline...

* If you ever find yourself writing bits of an imperative language,
consider just exposing a real language, like Scheme.  Even if you don't
go for Emacs-like extensibility, there is a long-long history of
everyone almost always reinventing a half-butted language when they
would've been in a much better position by taking a real language off
the shelf.  Perhaps a real language with extensions (like, if you really
want a prototype object system, go Scheme with a small layered prototype
object system).

* Consider whether your DSL, which the two previous guidelines might
have you build as a mix of declarative code with scattered bits in an
off-the-shelf real imperative language, would be even better as the
off-the-shelf imperative language at the top level, with a couple of
really well-chosen syntax extensions or procedures.

* Don't be overinfluenced by the above guidelines.  Often, the less we
know, the easier it is to make blanket generalizations. :)

Sorry I have to run and can't give a more thoughtful brain dump right
now, but quick anecdotes of my own (criticize oneself first) from
Racket: two non-public DSLs I made, one for high-performance XML
transformation, and one for processing some cool and big tabular data,
both make me cringe -- partly because of how I implemented them
(`syntax-rules` rather than `syntax-parse` or `syntax-case`), and
partly, with the latter, a bit less contrived declarative would've been
more flexible yet looked about the same.

Another DSL, I might be the only one who likes it, but I really wanted
to make it as part of rethinking Web development a bit, and you can see
that I used the DSL only where I got benefit (moving some trivial
computation and checking to syntax expansion time), and I expressly (see
"By design...") avoided doing things like inventing my own declarative
iterators, or reinventing a half-butted Scheme. There are some static
checking things that I can't do this way (e.g., full HTML5 validation),
but it seemed a good tradeoff, and I could always *layer* that crazy
static checking later, if I really wanted to do things like make
declarative language for mapping database results.  One DSL decision I
made was to sacrifice the S-expression representation as data accessible
to the programmer, in favor of emphasizing writing the byte stream as
you go (yes, I've seen this get massive, in practice), and also using
normal Racket ports, so that it could be combined with other Racket
ways, for flexibility (again, not reinventing or restricting the
language when it didn't benefit me).
https://www.neilvandyke.org/racket/html-template/

BTW, related, a few places in the list archives, I talk about an
alternative to configuration files, like thinking of the application
instead as a framework, and the configuration file is actually an
instantiation of the framework.  And some simple practical startup
mechanics for doing that (e.g., executable that falls back to the
default instantiation if no config file available, and maybe writes a
"config file").  You don't have to do it all that way, but you see how this
could be powerful.  If you do it right, in a lightweight (in the
developer burden sense) way, and, together with other factors, you could
also encourage an ecosystem and culture of very casual extension of your
application (like was part of what made Emacs so loved and powerful).
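The "configuration file as an instantiation of the framework" idea can be sketched roughly as follows. This is in Python rather than Racket, and everything in it is an assumption for illustration: the `App` class, its fields, the `on_start` hook, and the convention that a config file must bind a variable named `app`. The loader falls back to a default instantiation when no config file exists.

```python
import runpy
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class App:
    """The 'framework': a config file instantiates and customizes it."""
    name: str = "demo"
    port: int = 8080
    hooks: list = field(default_factory=list)

    def on_start(self, fn):
        # Extension point: a config file can register ordinary functions.
        self.hooks.append(fn)
        return fn


def default_instantiation():
    return App()


def load_app(config_path):
    """Run the config file as code; fall back to the default if it is missing."""
    path = Path(config_path)
    if not path.exists():
        return default_instantiation()
    # The config is a normal program in the host language that sees App
    # and is expected (by this sketch's convention) to bind `app`.
    namespace = runpy.run_path(str(path), init_globals={"App": App})
    return namespace["app"]


fallback = load_app("does-not-exist.py")
print(fallback)
```

A config file under this convention could be as small as `app = App(name="custom", port=9000)`, and it can grow into real code (for example, functions decorated with `@app.on_start`) without the application inventing a new language. Note that running a config file as code means trusting whoever wrote it.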

George Neuner

Feb 7, 2019, 12:43:15 PM
to racket users

This is a repost of a message sent through Gmane that seems to have
gotten lost.  The original may show up at some point - apologies if you
see this twice.



On Wed, 6 Feb 2019 16:25:47 +0000, Stephen De Gabrielle
<spdegabrielle-Re5J...@public.gmane.org> wrote:

>Dave Herman recently tweeted[1] that consulting a PL specialist was a
>good idea for little languages to avoid common foundational mistakes,
>specifically mentioning templating systems and configuration files.
>
>So when designing(or evolving) a little language:
>
>  - what beginners mistakes should you avoid?
>    (some in the subsequent tweets I’ve copied below)
>  - When should you consult a PL specialist?
>    (and how do you identify a PL specialist?)
>
>More generally, can you recommend resources a developer (without a
>background in language design) should refer to if they are building a
>simple templating system, application configuration file, etc. - that
>may grow into a little language?
>
>Kind regards,
>
>Stephen
>
>PS please forgive me if this is the wrong list for this question.


No offense to Herman, but I think the problem with consulting experts
is that there are relatively few language experts who are available to
consult.  Certainly there are those who are willing to answer
questions (at least well defined questions) but most likely are busy
with their own work and can't be expected to devote a whole lot of
time to helping someone design a new language.


I agree with all the pitfalls Herman mentions, and to them I would
add:

 - fascination with C-like or natural language syntax

 - avoidance of parser tools/libraries

 - picking a programming paradigm that is not well suited to
   the problem domain

 - not using an IR - trying to interpret or compile straight
   from source text

 - forgetting about debugging

These are just off the cuff - if I were to think about it for a while
I probably would come up with more.
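To make the IR and debugging items concrete, here is a minimal sketch in Python for a made-up arithmetic language (the grammar and all names are assumptions for illustration, not anything from this thread). Source text is parsed once into a small tree, each node keeps its source offset, and the evaluator works only on the tree, so errors can point back at the source.

```python
import operator
import re
from dataclasses import dataclass


@dataclass
class Num:
    value: float
    pos: int  # source offset, kept for error messages


@dataclass
class BinOp:
    op: str
    left: object
    right: object
    pos: int


TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|(\S))")
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def tokenize(src):
    tokens = []
    for m in TOKEN.finditer(src):
        if m.group(1) is not None:
            tokens.append(("num", m.group(1), m.start(1)))
        else:
            tokens.append(("op", m.group(2), m.start(2)))
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.i = 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else ("eof", "", -1)

    def next(self):
        tok = self.peek()
        self.i += 1
        return tok

    def expr(self):  # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek()[1] in ("+", "-"):
            _, op, pos = self.next()
            node = BinOp(op, node, self.term(), pos)
        return node

    def term(self):  # term := atom (('*' | '/') atom)*
        node = self.atom()
        while self.peek()[1] in ("*", "/"):
            _, op, pos = self.next()
            node = BinOp(op, node, self.atom(), pos)
        return node

    def atom(self):  # atom := number | '(' expr ')'
        kind, text, pos = self.next()
        if kind == "num":
            return Num(float(text), pos)
        if text == "(":
            node = self.expr()
            if self.next()[1] != ")":
                raise SyntaxError(f"expected ')' for '(' at offset {pos}")
            return node
        raise SyntaxError(f"unexpected {text!r} at offset {pos}")


def parse(src):
    parser = Parser(tokenize(src))
    tree = parser.expr()
    if parser.peek()[0] != "eof":
        raise SyntaxError(f"unexpected {parser.peek()[1]!r} at offset {parser.peek()[2]}")
    return tree


def evaluate(node):
    """Evaluate the IR; the source text is no longer needed at this point."""
    if isinstance(node, Num):
        return node.value
    left, right = evaluate(node.left), evaluate(node.right)
    if node.op == "/" and right == 0:
        raise ZeroDivisionError(f"division by zero at offset {node.pos}")
    return OPS[node.op](left, right)


print(evaluate(parse("1 + 2 * 3")))  # 7.0
```

Because the tree is plain data, the same IR could later feed a pretty-printer, a checker, or a compiler without touching the parser.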


I think every *serious* developer should read at least an introductory
text on compilers.  Even if you never try to implement your own
language, understanding what happens to your code when it's compiled
makes you a better programmer.

A developer serious about creating DSLs might want deeper study of
both compilers and virtual machines [keeping in mind that programming
languages always are defined over a "virtual" machine (that then is
implemented on a real machine)].

YMMV,
George


 -----

Neil Van Dyke

Feb 7, 2019, 7:32:53 PM
to racket users
George Neuner wrote on 2/7/19 12:43 PM:
> No offense to Herman, but I think the problem with consulting experts
> is that there are relatively few language experts who are available to
> consult.

I suspect there's so very little *market* for such little language
experts.  Some non-exhaustive suggestions of why:

* Reasons like you suggest.  (Is this person going to be sufficiently
engaged to do it well.)

* Egos/opinionatedness/fun.  (What developer thinks they can't design a
language and doesn't have some ideas about that?  And possible morale
hit to team if pushed by manager?)

* How do you even evaluate skill in little language design, to find the
expert.  (This doesn't seem to be as easily objective as "have added a
new target architecture to GCC or LLVM".  It's soft/vague, and industry
even has trouble vetting developer candidates for skills that supposedly
the organization does understand, like basic software development.)

We can talk all day about how industry people should value some
intellectual niche for consultants, but it might just be easier to just
get a professorship instead (also nontrivial). :)  If you solve the
consulting market problem, "https://www.neilvandyke.org/racket-money/"
wants to know.

(Also, if someone has expertise in some niche technical
(non-management-consulting) area, do they really want to be running a
niche technical consulting business, or would they rather focus on doing
great work in those technical areas, and just have gobs of money appear
in their bank account (without dividing attention to constantly drumming
up business, invoicing and work reports, tax accounting, professional
insurance, sector compliances, solo 401k, etc.).)

> I think every *serious* developer should read at least an introductory
> text on compilers.  Even if you never try to implement your own
> language, understanding what happens to your code when its compiled
> makes you a better programmer.

Definitely.  (Do pretty much all CS undergrad curricula currently
include some kind of compilers/translation-to-arch?  And it's accessible
to all CS undergrads, not an elective, nor a hated weedout?)

Separate from compilers and underlying architecture, and perhaps
especially with "little languages" and DSLs, there's also the
linguistics or even HCI side.  Which (depending on the language intent)
is possibly entirely orthogonal to implementation.  What makes a good
language for a purpose.  One thing that I suspect helps with this is
experience with many different languages and approaches, so you have a
breadth from which to draw, and some insight into them (not just
book-learnin').  I assume it also helps to understand users of your
language, what they know and want to do, and what you can convey&convince.

For the linguistic side, evaluation criteria might be hard.  (Do people
qualitatively like using it?  What's the adoption, and how is that
involved in merit or telling us what's good or bad about it? Controlled
productivity/maintainability/defect metrics?  Does a particular employer
use it, for whatever reason?   Does some quality of formal semantics or
syntax say people should like this little language, and if they don't,
they don't deserve such a fine language?  etc.).

Matthias Felleisen

Feb 8, 2019, 1:46:41 PM
to racket users


> On Feb 7, 2019, at 7:32 PM, Neil Van Dyke <ne...@neilvandyke.org> wrote:
>
> but it might just be easier to just get a professorship instead (also nontrivial). :)


Absolutely.

* How many academic PL experts do you know that design languages?
* How many of their languages reach an audience of more than 7?
* How often do they repeat this?

While there’re clearly some academic PL researchers who do all of this, the de facto careerization of the research job has pushed people to where the glory is:

- papers
- more papers
- yet more papers
- and as many citations as possible.

So to become a professor with a high citation count, you must follow the herd and toe the consensus line. And what is the consensus line at the moment:

- verification via proof assistants
- verification via model checking
- verification of static properties
- synthesis, and
- the desperate need to apply ML-AI to PL (and I don’t mean OCaml here)

In short, there really aren’t many experts. Most of these experts probably focus on general-purpose language design not DSL design.

And all of this despite the fact that you can easily identify 2,500 publications on DSLs [see the annotated bibliography of van Deursen et al. from 2000 and the survey of Mernik et al. from 2005].

;; - - -

Now having said that, I would also provide some food for thought since y’all live in the Racket universe:

- match is a DSL; most of its patterns aren’t meaningful outside of match
- syntax pattern variables are a DSL, an extremely small one
- syntax-case patterns are a DSL that interacts with the DSL of syntax pattern variables
- syntax-case templates are a DSL, different from the patterns, that interacts with the DSL of syntax pattern variables
- syntax-parse is a different DSL
- . . . and up the scale
- #lang info is a DSL
- #lang datalog is yet a different kind of DSL

While a course on compilers may help (parsing, static checking, code generation, and optimization are useful ideas), how well do those ideas apply to the DSLs above?
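As a loose analogy only, and in Python rather than Racket: the standard library has the same kind of small embedded notations that are meaningless outside their host construct. The sketch below uses two real ones, regular-expression patterns and the format specification mini-language; the variable names and sample strings are made up.

```python
import re

# The pattern is a program in the regex little language; "(?P<year>...)"
# means nothing to Python itself outside of re.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})")
fields = DATE.match("2019-02").groupdict()

# "03d", ">8" and "<4" are programs in the format-spec mini-language.
label = format(int(fields["month"]), "03d")
row = "{:>8}|{:<4}".format("feb", label)

print(fields)  # {'year': '2019', 'month': '02'}
print(repr(row))
```

Neither notation has its own file format or code generator, yet each has a grammar, binding rules (named groups), and a contract with the host language, which is roughly the territory the list above covers.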

OK. Back to work — Matthias

Stephen De Gabrielle

Feb 13, 2019, 10:24:56 AM
to Matthias Felleisen, Neil Van Dyke, gneu...@comcast.net, racket users
Thank you Neil, George and Matthias.

I'm interested in the HCI advice, where it impacts both the language developer and the language user(s):
  • [avoid] "Lack of abstraction mechanisms" [DH]
  • "figuring out what your composable primitives are." [DH]
  • [avoid] "Reinventing lexical scope" and "exposing variables as mutating a global scope" [DH]
  • [avoid] "picking a programming paradigm that is not well suited to the problem domain" [GN]
  • "Consider whether declarative or imperative is better." [NVD]
I'm assuming the above advice counts as the HCI side of language design (for example, choosing between global and lexical scope, as opposed to implementing one or the other). I'm setting aside the choice of syntax, be it C-style, sexp, 'blocks', or something else, which is driven by the need to suit the intended audience.

As for when to consult a specialist/expert: I didn't mean a programming languages researcher; I meant someone to turn to when designing a little language like a templating system or config file.  Would that be covered by a compilers course? My alma mater doesn't offer a compilers course in its compsci degree.

I suppose there is a scale: 
  1. programmer (without compilers course)
  2. did compilers at degree level (can still remember it and it covered design decisions, as opposed to algorithms and data structures)
  3. Beautiful Racket and/or https://school.racket-lang.org/#brw
  4. Racket School 'How to design Languages' https://school.racket-lang.org/#htdl 
  5. http://cs.brown.edu/courses/cs173/ , PLAI, PAPL or a postgraduate level (masters?)
  6. PhD and beyond
( I have no idea if the order I made up is right - and I don't remember the dragon book covering the HCI aspects of language design) 

I suppose if you are at (1) and you need help, you ask someone at (2) or above.

I'd pay if PLT/Racket offered certification in DSL/language design at different levels, or licensed someone to do it on their behalf for a percentage of the income.  (I regret not doing the MOOC version of cs173/PLAI when it was offered.)

S.



Matthias Felleisen

Feb 13, 2019, 4:36:29 PM
to Stephen De Gabrielle, Neil Van Dyke, gneu...@comcast.net, racket users


> On Feb 13, 2019, at 10:24 AM, Stephen De Gabrielle <spdega...@gmail.com> wrote:
>
> • programmer (without compilers course)
> • did compilers at degree level (can still remember it and it covered design decisions, as opposed to algorithms and data structures)
> • Beautiful Racket and/or/ https://school.racket-lang.org/#brw
> • Racket School 'How to design Languages' https://school.racket-lang.org/#htdl
> • http://cs.brown.edu/courses/cs173/ , PLAI, PAPL or a postgraduate level (masters?)
> • PhD and beyond
> ( I have no idea if the order I made up is right - and I don't remember the dragon book covering the HCI aspects of language design)
>
> I suppose if you are at (1) and you need help you ask someone at (2) or above.


I don’t think this is a linear order. It’s more like a landscape with dots and connections and hyper-edges and such. In particular,

— the PLAI course at Brown belongs into the “compiler” equivalence class,
— a PhD does not a good DSL designer or expert consultant (on this topic) make, and
— the first bucket is way too coarse (some programmers have good “native” taste for DSLs, others should never ever program).

And I understood that you were asking about experts who can do DSL x Domain stuff. But NVD’s remark and my long-time observation of where PL research is going triggered my rant. Sorry.

— Matthias


Stephen De Gabrielle

Feb 13, 2019, 5:30:27 PM
to Matthias Felleisen, Neil Van Dyke, gneu...@comcast.net, racket users
On Wed, 13 Feb 2019 at 21:36, Matthias Felleisen <matt...@felleisen.org> wrote:
[...]

> I don’t think this is a linear order. It’s more like a landscape with dots and connections and hyper-edges and such. In particular,

Thanks. I feel like I need to do a ‘topic map’ (like my niece does in school) to get my head around it.

> - the PLAI course at Brown belongs into the “compiler” equivalence class,
> - a PhD does not a good DSL designer or expert consultant (on this topic) make,
> - the first bucket is way too coarse
>   (some programmers have good “native” taste for DSLs, others should never ever program).

(I think I’m in the ‘should never ever program’ portion of that bucket)

> And I understood that you were asking about experts who can do DSL x Domain stuff.
> But NVD’s remark and my long-time observation of where PL research is going triggered my rant. Sorry

Don’t be sorry - I’m sure I’m not alone in appreciating how forthright you and Neil and George and others are in sharing your experience and perspective. Don’t stop.

- Stephen

> - Matthias


George Neuner

Feb 16, 2019, 2:51:47 PM
to racket...@googlegroups.com
On Wed, 13 Feb 2019 15:24:41 +0000, Stephen De Gabrielle
<spdega...@gmail.com> wrote:

>As far as when to consult a specialist/expert I didn't mean a programming
>languages researcher, I was more referring to when you were designing a
>little language like a templating system or config file. Would that be
>covered by the compilers course? My alma-mater doesn't offer a compilers
>course in their compsci degree
><https://www.cdu.edu.au/study/bachelor-computer-science-wcoms1-2018#!study-plan>


Disclaimer: I'm not an academic, and my own CS education was ~30 years
ago ... but I do look up occasionally to see what is being taught to
modern CS students.

Compiler courses usually are much more about the compiler than the
language being compiled. Most often the students work to implement
some subset of an existing language, rather than designing a new one.

IMO, a compiler course or three is not going to make a good language
designer. What makes a good designer is a lot of personal experience
with many different languages - and with different KINDS of languages
- seeing what works, what doesn't work, and what is confusing to
themselves and others.


>I suppose there is a scale:
>
> 1. programmer (without compilers course)

I began as a self taught programmer: I knew several languages already
and had read texts on compilers and operating systems before I took my
first CS course.

At one time or another I have learned and used:

  BASIC         BASIC-11 [PDP], Commodore, Apple, Microsoft
  assembler     6502, 8086, 68K, ADSP21K
  Pascal        UCSD, Borland
  C             K&R .. current
  Modula-2
  Scheme        R3RS .. R5RS, PLT, Racket
  Smalltalk     PARC Place
  C++           2.0 (pre ANSI) .. current
  C*            Connection Machine C
  SQL           SQL-86 .. current
  Common Lisp
  ML            Edinburgh
  Prolog        Edinburgh
  Modula-3
  Java          1.4 .. current
  Python        2.6 / 3.0 .. current

This list includes only languages I've actually learned enough to use:
for school, for work, or for my own hacking. I've investigated more
languages than I've bothered to really learn.

The order I learned them in sort of is reflected in the list above.
Obviously variants were picked up along the way as languages evolved
and as I encountered new implementations. But I knew 3 varieties of
BASIC, 6502 assembler, UCSD and Borland Pascal, and ANSI C(89) before
I took any CS courses.


Even with this experience, I would hesitate before taking on design of
a new language. Certainly I would draw from existing languages I
thought were relevant, and unless I strongly disagreed with some
choice they made, I would try to stick to their existing syntax and
semantics as much as possible.



> 2. did compilers at degree level (can still remember it and it covered
> design decisions, as opposed to algorithms and data structures)

My course covered compiler implementation, not language design.
Obviously, by implementing a compiler for a language you can get a
feel for what is right and wrong with it.


> 3. Beautiful Racket and/or/ https://school.racket-lang.org/#brw
> 4. Racket School 'How to design Languages'
> https://school.racket-lang.org/#htdl
> 5. http://cs.brown.edu/courses/cs173/ , PLAI, PAPL or a postgraduate
> level (masters?)
> 6. PhD and beyond
>
>( I have no idea if the order I made up is right - and I don't remember the
>dragon book covering the HCI aspects of language design)

I have the 2nd edition and I've seen the 1st. There was little or
nothing said in either about user issues. I haven't seen the 3rd
edition. [Are there more now?]


On my shelf, I have 7 texts on compilers ranging from beginner to
advanced; 3 texts covering computation theory, type theory, and
semantics; 2 texts on garbage collection; and the classics: SICP and
EoPL.

For the most part, the compiler books cover only how various language
design decisions affect the compiler and/or runtime support for the
language. The only mention of how language features affect the user
is to note that if a feature presents ambiguity problems for the
compiler [in parsing or understanding], then that feature probably
also will be confusing to the user.


>I suppose if you are at (1) and you need help you ask someone at
>(2) or above.
>
>I'd pay if PLT/Racket offered different levels of certification in
>DSL/Language design, or licensed someone to do it on
>their behalf for a percentage of the income. (I regret not doing the MOOC
>version of cs173/PLAI <http://cs.brown.edu/courses/cs173/> when it was
>offered)
>
>S.


I would have little problem coding a standalone compiler in Racket, or
translating some other language into Racket as the target.

But so far as #lang or other advanced macrology intended to integrate
with other Racket code ... there I still would have to call myself an
amateur: for whatever reason I have yet to fully grasp Racket's syntax
objects and (meta)programming with them.


YMMV,
George
