How to make sure a new language is 'complete'

James Harris

Dec 14, 2021, 5:58:47 AM
How can someone make sure a new language is 'complete', i.e. that no
essential is left out and that it will allow a programmer to do anything
he might need to do?

The reason for asking is that I recently realised that I had to add a
new parameter mode. Because of certain choices I found I couldn't get
away with just in, inout and out so I had to add a new mode. I'm
avoiding going in to the details as they could divert from the topic but
the point is that I found that something was missing and it got me
wondering what else could be needed.

So is there a way to make sure a new language is complete?

AISI first of all there's computational or 'Turing' completeness. For
that, perhaps it's enough to ensure that the language has selections and
loops. But then there are other things - such as the parameter-mode
example, above. How does one make sure nothing is missing?

One approach is probably to base a new language on an existing one. Then
as long as the earlier language is complete it should be easier to make
sure the new one is, too. But even that has its weaknesses. For example,
one might base a new language on C but then find that some things cannot
be done - or cannot be done reasonably - without the preprocessor.

If attempting to create a new language without an antecedent then the
situation is even more challenging. There will be no prior model to
guide the design.

So, any suggestions?


--
James Harris

Dmitry A. Kazakov

Dec 14, 2021, 8:14:29 AM
On 2021-12-14 11:58, James Harris wrote:
> How can someone make sure a new language is 'complete', i.e. that no
> essential is left out and that it will allow a programmer to do anything
> he might need to do?

1. supporting the corresponding programming paradigms and methods of
software decomposition.

2. supporting methods of software design and software engineering, e.g.
programming in the large.

3. providing interoperability with and abstraction from other/alien
software components and hardware.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Bart

Dec 14, 2021, 8:28:37 AM
On 14/12/2021 10:58, James Harris wrote:
> How can someone make sure a new language is 'complete', i.e. that no
> essential is left out and that it will allow a programmer to do anything
> he might need to do?

I'm not sure any practical language can be that complete.

There will always be something it can't do, or can't do easily.

For example, mine does most of the things I want, but it doesn't do OOP; or
have lambdas, or closures, or do currying, or have gradual typing, or
have contracts...

There are dozens of advanced features, or simply different paradigms,
that people might expect. Although they may not expect them all in one
language.

> The reason for asking is that I recently realised that I had to add a
> new parameter mode. Because of certain choices I found I couldn't get
> away with just in, inout and out so I had to add a new mode. I'm
> avoiding going in to the details as they could divert from the topic but
> the point is that I found that something was missing and it got me
> wondering what else could be needed.
>
> So is there a way to make sure a new language is complete?
>
> AISI first of all there's computational or 'Turing' completeness.

(What parameter passing modes did the original Turing Machine support?
Just out of interest...)

> For
> that, perhaps it's enough to ensure that the language has selections and
> loops. But then there are other things - such as the parameter-mode
> example, above. How does one make sure nothing is missing?

> One approach is probably to base a new language on an existing one. Then
> as long as the earlier language is complete it should be easier to make
> sure the new one is, too. But even that has its weaknesses. For example,
> one might base a new language on C but then find that some things cannot
> be done - or cannot be done reasonably - without the preprocessor.
>
> If attempting to create a new language without an antecedent then the
> situation is even more challenging. There will be no prior model to
> guide the design.
>
> So, any suggestions?

Create the language for a particular area of use, for example, systems
programming (it helps if you've previously used a similar language).

Put in the things you reckon you will need. Then use it to create real
applications, and preferably get other people to use it.

That will help discover bugs, and holes in coverage, but every so
often someone will get stuck trying to do something, then you need to
tweak it or add a feature. Or just say it's not possible in this language.

C gets by with a pile of missing features; first by people being so
forgiving of it: it's been around forever, and it's everywhere, so it's
fine you have to do things the hard way.

But it also has a preprocessor to get you out of trouble, with some
half-baked, tacky workaround (but it's C, so that's acceptable).

In a new language, you really need to do it properly.

So you can say a systems language should at least be able to do what C
can. But people do crazy things with the C preprocessor like implement a
complete functional language. I just wouldn't go that far.


David Brown

Dec 14, 2021, 8:34:53 AM
On 14/12/2021 11:58, James Harris wrote:
> How can someone make sure a new language is 'complete', i.e. that no
> essential is left out and that it will allow a programmer to do anything
> he might need to do?
>
> The reason for asking is that I recently realised that I had to add a
> new parameter mode. Because of certain choices I found I couldn't get
> away with just in, inout and out so I had to add a new mode. I'm
> avoiding going in to the details as they could divert from the topic but
> the point is that I found that something was missing and it got me
> wondering what else could be needed.
>
> So is there a way to make sure a new language is complete?
>
> AISI first of all there's computational or 'Turing' completeness. For
> that, perhaps it's enough to ensure that the language has selections and
> loops. But then there are other things - such as the parameter-mode
> example, above. How does one make sure nothing is missing?
>
> One approach is probably to base a new language on an existing one. Then
> as long as the earlier language is complete it should be easier to make
> sure the new one is, too. But even that has its weaknesses. For example,
> one might base a new language on C but then find that some things cannot
> be done - or cannot be done reasonably - without the preprocessor.
>

(Quibble - the preprocessor in C is part of C, just as its standard
library is part of it. There are different phases of translation from
source to final binary, but it is wrong to think "you should be able to
do this in C without the preprocessor". That is like thinking that
since you bought your car's tires from a different shop than the car,
the car should be able to run without tires. It makes no sense. This
does not, of course, mean a preprocessor architecture is a good idea for
a new language going forward, merely that it is part of C.)

> If attempting to create a new language without an antecedent then the
> situation is even more challenging. There will be no prior model to
> guide the design.
>
> So, any suggestions?
>
>

Define "complete". What the programmer /needs/ is not the same as what
he/she /wants/. Being able to do something is not the same as being
able to do something easily, or safely, or conveniently.

One thing I think can be pretty much guaranteed, is that a language
cannot ever be "complete". There are far too many contradictions. A
language is not "complete" for many of my uses if it does not have
strong static typing. It is not complete for some other uses if it does
not have dynamic typing. It can't have both.

For /your/ purposes, your language is not complete without 4 parameter
passing modes (in, out, inout, and a fourth one - shake it all about?).
C, on the other hand, manages perfectly well with just one mode for
parameters. Does that mean C is incomplete? Yet people use it for all
sorts of things.

I think basically you have to try out the language, and see what you
want to do with it. If you can manage what you want, you can call it
"complete".

And always remember that to some extent, the strength of a language is
not in what you can do with it, but what you /cannot/ do with it.
Stopping people from writing incorrect code can be at least as important
as letting them write correct code.

Bart

Dec 14, 2021, 8:51:42 AM
On 14/12/2021 13:34, David Brown wrote:
> On 14/12/2021 11:58, James Harris wrote:
>> How can someone make sure a new language is 'complete', i.e. that no
>> essential is left out and that it will allow a programmer to do anything
>> he might need to do?

> (Quibble - the preprocessor in C is part of C, just as its standard
> library is part of it. There are different phases of translation from
> source to final binary, but it is wrong to think "you should be able to
> do this in C without the preprocessor". That is like thinking that
> since you bought your car's tires from a different shop than the car,
> the car should be able to run without tires. It makes no sense.

You've lost me with that analogy.

A better one I think is that C is like buying a car that comes with a
trailer that contains a workshop for creating ad-hoc, home-made versions
of all those features that on other cars are built-in.


> For /your/ purposes, your language is not complete without 4 parameter
> passing modes (in, out, inout, and a fourth one - shake it all about?).
> C, on the other hand, manages perfectly well with just one mode for
> parameters.

Value-passing mode, which in the case of arrays, actually is
incomplete. There it actually switches to passing by reference.

> Does that mean C is incomplete? Yet people use it for all
> sorts of things.

As they do assembly. Completeness in terms of being able to accomplish
any task is not enough.

David Brown

Dec 14, 2021, 10:59:59 AM
On 14/12/2021 14:28, Bart wrote:
> On 14/12/2021 10:58, James Harris wrote:
>> How can someone make sure a new language is 'complete', i.e. that no
>> essential is left out and that it will allow a programmer to do
>> anything he might need to do?
>
> I'm not sure any practical language can be that complete.
>
> There will always be something it can't do, or can't do easily.

Agreed. (Sorry, it happens sometimes :-) )

>>
>> AISI first of all there's computational or 'Turing' completeness.
>
> (What parameter passing modes did the original Turing Machine support?
> Just out of interest...)

Parameters are passed on the tape (a stack), so that's the parameter
passing mode - value only, on the stack. It's a bit like Forth in that
respect.

> So you can say a systems language should at least be able to do what C
> can. But people do crazy things with the C preprocessor like implement a
> complete functional language. I just wouldn't go that far.
>

You can't do that with the C preprocessor, because there is no way to
implement recursion or loops. (In this sense, pre-processors or macro
support in many assemblers is more powerful.) Often it would be very
useful if C /had/ support for loops or recursion of some sort in the
preprocessor - it would allow significantly more compile-time
calculation. However, there are much nicer ways to handle compile-time
calculation, so perhaps it's good that the C preprocessor is limited
like that. (Say what you like about C++ syntax, I think most people
will agree that constinit, constexpr and consteval in C++20 are less
ugly than compile-time calculations with the C preprocessor!)


David Brown

Dec 14, 2021, 11:09:19 AM
On 14/12/2021 14:51, Bart wrote:
> On 14/12/2021 13:34, David Brown wrote:
>> On 14/12/2021 11:58, James Harris wrote:
>>> How can someone make sure a new language is 'complete', i.e. that no
>>> essential is left out and that it will allow a programmer to do anything
>>> he might need to do?
>
>> (Quibble - the preprocessor in C is part of C, just as its standard
>> library is part of it.  There are different phases of translation from
>> source to final binary, but it is wrong to think "you should be able to
>> do this in C without the preprocessor".  That is like thinking that
>> since you bought your car's tires from a different shop than the car,
>> the car should be able to run without tires.  It makes no sense.
>
> You've lost me with that analogy.
>
> A better one I think is that C is like buying a car that comes with a
> trailer that contains a workshop for creating ad-hoc, home-made versions
> of all those features that on other cars are built-in.
>

No, that's not helpful at all. (But to be fair, my analogy was not
great either.)

The point is that you can build a C implementation with a preprocessor
from one supplier, a core language compiler from somewhere else, and a C
standard library from a third supplier - but you don't have /C/ unless
they are all together and all compatible with each other. It makes no
sense to say "C without the preprocessor" or "C without the standard
library". "C" is the programming language described in the C standards
- no more, and no less.

>
>> For /your/ purposes, your language is not complete without 4 parameter
>> passing modes (in, out, inout, and a fourth one - shake it all about?).
>> C, on the other hand, manages perfectly well with just one mode for
>> parameters.
>
> Value-passing mode, which in the case of arrays, actually is
> incomplete. There it actually switches to passing by reference.

In practice, it appears that way - technically, however, the array
expression decays to a pointer expression and the pointer is passed by
value. You can see this from the way functions are declared - even if
you make a parameter that appears to be an array, it is equivalent to a
pointer to the first element. So it is a pointer that is passed, not a
reference to the array.

(Of course passing by reference is typically implemented as passing a
pointer by value in most compiled language implementations. But that's
an implementation detail, rather than the semantics of the language.)

>
>>  Does that mean C is incomplete?  Yet people use it for all
>> sorts of things.
>
> As they do assembly. Completeness in terms of being able to accomplish
> any task is not enough.

Indeed.

Bart

Dec 14, 2021, 12:45:17 PM
On 14/12/2021 15:59, David Brown wrote:
> On 14/12/2021 14:28, Bart wrote:

>> So you can say a systems language should at least be able to do what C
>> can. But people do crazy things with the C preprocessor like implement a
>> complete functional language. I just wouldn't go that far.
>>
>
> You can't do that with the C preprocessor, because there is no way to
> implement recursion or loops.

How about this one:

https://github.com/camel-cdr/bfcpp/blob/main/TUTORIAL.md

(Brainf*ck interpreter using CPP)

David Brown

Dec 14, 2021, 2:23:44 PM
Now that is /really/ interesting - thank you for that link. It will
take a bit of time to study it to see how it all works. But at first
glance it suggests that loops /are/ possible. (Prior to C99, you could
not have a variable number of arguments in a macro, and that seems an
essential part of the implementation here.)

I have previously wanted more advanced C pre-processor features for
compile-time calculation of CRC tables - I used to do it with macros
when I did assembly programming, and I can do it easily in C++, but for
C I use an external Python script to generate the table and then
#include it. But with that link, I can write a Brainf*ck program and
use the pre-processor interpreter. The guy who has to maintain my code
is going to be praising my name :-)

Rod Pemberton

Dec 18, 2021, 9:15:35 PM
On Tue, 14 Dec 2021 10:58:43 +0000
James Harris <james.h...@gmail.com> wrote:

> How can someone make sure a new language is 'complete', i.e. that no
> essential is left out and that it will allow a programmer to do
> anything he might need to do?
>

I'd guess that depends on how you define "complete". I.e., in what way
do you mean "complete"?

E.g., if the language is very close to the assembly language of a
particular microprocessor, or even similar to many of them, then the
language will be able to do whatever can be done in assembly. I.e.,
it's "complete" as far as having a high-level language that can easily
convert into the functionality of the microprocessor as represented by
the assembly language, which should be Turing complete.

E.g., if the language is meant to be a high level language abstracted
entirely from the machine representation, then you can probably never
assure that the language is "complete" in the sense that there may be
some new algorithm or math or implementation that is difficult to
implement on specific machines, e.g., think shoehorning C onto old
mainframes with non-standardized architectures, bizarre character
sizes, word addressing, and non-contiguous memory spaces.

While you can design in known language features you desire, it's really
only through testing the language by writing code for it, that the
errors and mistakes and missing features become truly apparent, and what
gets tested by the latter activity is truly a matter of an individual's
IQ or that of many individuals. I.e., you don't want to choose only "dumb"
people to test a language, nor would you want to limit the language to
home computers or smart phones while ignoring mainframes, etc,
especially if you want a broad appeal. There is nothing wrong with a
domain specific or even machine specific language, as long as that is
your goal, but most people seem to want a good general purpose language.

One path you could take is to compare similar languages (*), and strip
the languages down to their fundamental features held in common. That
could provide a base language to build up from. Then, you might
compare the features to those required by critical concepts such as
spaghetti code, structured programming, procedural programming,
object-oriented programming, code density and function points (**), and
Turing completeness.

(*)
https://en.wikipedia.org/wiki/Programming_paradigm

(**)
http://web.archive.org/web/20061231170804/http://www.theadvisors.com/langcomparison.htm

> The reason for asking is that I recently realised that I had to add a
> new parameter mode. Because of certain choices I found I couldn't get
> away with just in, inout and out so I had to add a new mode. I'm
> avoiding going in to the details as they could divert from the topic
> but the point is that I found that something was missing and it got
> me wondering what else could be needed.
>
> So is there a way to make sure a new language is complete?
> [...]
> One approach is probably to base a new language on an existing one.

You'd inherit defects that way, but you won't design in new defects, at
least until you extend the base language with the new elements.

I'm thinking of the Tesla car. Supposedly, it was a from-scratch or
from the ground-up design. So, it failed to inherit defects from other
long established car platforms designed by other automotive
manufacturers, which is a good thing. However, they designed in a
bunch of defects too (which are usually dismissed as slander or libel
by Tesla, e.g., whompy-wheels that pop-off when you hit pot-holes, total
burn down from damaged batteries, incompetent or retarded AI driver to
help drunk drivers not get a ticket, etc. I'm not making any statement
towards or against the validity of these claims about Tesla. That's
just what's floating around the Internet.).

> Then as long as the earlier language is complete it should be easier
> to make sure the new one is, too. But even that has its weaknesses.
> For example, one might base a new language on C but then find that
> some things cannot be done - or cannot be done reasonably - without
> the preprocessor.

Unfortunately, C as I know it, really isn't C without the preprocessor.

I ran into this issue with some C lexers and parsers, as I needed C
preprocessor features, but they weren't available. Even the very
minimal Small C by Ron Cain had to implement some C preprocessor
features in order to compile a subset of C.

> If attempting to create a new language without an antecedent then the
> situation is even more challenging. There will be no prior model to
> guide the design.
>
> So, any suggestions?

Formal design? You need to know what you want in the language, which
means you need to know about the purpose of the language (general
purpose, domain specific), which means you need to know a lot about the
functionality of the host platform(s), which means you need to choose a
programming paradigm (procedural, object-oriented), etc. You may need
to learn how to do the formal proofs for such things too.

Ad-hoc design? Attempt to do something, find out what you need, add it
to the language as you go. Again, this will depend on your own
abilities, as you won't find stuff that is missing in the language, for
things you never use or never do.

E.g., I wouldn't notice if unsigned integers were missing from C, or
floating point either, or if a language has no complex numbers (unless
I coded something with Mandelbrot sets), nor GUI mechanisms, etc.

But, GUI mechanisms are usually in an external library. So, maybe the
question should be where does the language end and the libraries begin?
I.e., the more that you can push out of the language proper and into
the language libraries, the more likely the language will be "complete".
Or, at least, the language will be "complete" enough to implement the
language libraries ... I guess this goes towards your other thread
about OS and library separation.


--

Rod Pemberton

Dec 18, 2021, 9:16:13 PM
What? Yes, it does. That's the "C language" proper. The C
preprocessor is not the C language, nor are the C libraries the C
language. The C preprocessor is the C preprocessor. The C libraries
are the C libraries. They are complete without each other. It might
not be a complete C compiler without all three, but to say each is part
of the "C language" is erroneous (IMO).

> >> For /your/ purposes, your language is not complete without 4
> >> parameter passing modes (in, out, inout, and a fourth one - shake
> >> it all about?). C, on the other hand, manages perfectly well with
> >> just one mode for parameters.
> >
> > Value-passing mode, which in the case of arrays, actually is
> > incomplete. There it actually switches to passing by reference.
>
> In practice, it appears that way - technically, however, the array
> expression decays to a pointer expression and the pointer is passed by
> value.

Many decades ago, when first learning C, I believed the same, because
that is how C is taught to newbs, but that isn't how C actually works.


TRIGGER WARNING:
David, I know you don't accept what I'm about to say below. From the
fact that you believe something decays into something else, I know it's
outside your mental model of C and what I say will be rejected
outright, but maybe you'll remember this in the future. Usually, this
statement has resulted in heated argument with others on the Internet
over the past 3 to 4 decades. As such, I really don't wish to discuss
this further here. You can search my past Usenet posts to various
newsgroups, if you so wish. I think I've discussed it with James on
three or four different newsgroups including here. Anyway, you've been
diverting comp.lang.misc into comp.lang.c for a while now ...


ARRAYS IN C:
FYI, any time you see wishy-washy language like "decaying into" in the
language specifications or books on C, it's because C has pasted a thin
veneer over some low-level feature.

Under the hood, C is predominantly pointer based, i.e.,
pass-by-reference, like certain early versions of PL/1. In C, the
array is just a pointer to storage. There is no "decaying" of anything
actually happening. The only thing that happens is storage is
allocated for the pointer representing the "array", and occasionally,
the type is checked, e.g., prohibited casting.

In this case, C has array declarations to allocate storage for an
"array", but there are no arrays in the C language proper. This is a
result of the subscript operator [], which takes a pointer and index in
either order. The effect of the subscript operator is that there
appears to be array syntax in C, which mimics usage of an actual array
in other languages.

For your reference (and James too), the concepts of C declaration syntax
mimicking expression syntax and of array and pointer equivalence are
discussed in one of D.M. Ritchie's papers, "The Development of the C
Language", April 1993:

"Two ideas are most characteristic of C among languages of its class:
the relationship between arrays and pointers, and the way in which
declaration syntax mimics expression syntax. ... C evolved from
typeless languages."

...

"For example, the empty square brackets in the function declaration
int f(a) int a[]; { ... }
are a living fossil, a remnant of NB's way of declaring a pointer; a
is, in this special case only, interpreted in C as a pointer."


So, that's straight from the horse's mouth, that an array in C is just
a pointer, no "decaying" into anything.

> You can see this from the way functions are declared - even if
> you make a parameter that appears to be an array, it is equivalent to
> a pointer to the first element.

... because it is a pointer. It's not equivalent to. See above.

--

David Brown

Dec 19, 2021, 9:38:23 AM
On 19/12/2021 03:17, Rod Pemberton wrote:

> ARRAYS IN C:
Array types are defined in the standards, 6.2.5p20.

Please look that up.

Expressions of type "array of type" are converted to an expression of
type "pointer to type" in most uses. This is colloquially referred to
as "decaying". This is described in 6.3.2.1p3 - please look that up
too. (And look up the word "colloquially" if that also troubles you.)

Declarations of objects with array type are handled in 6.7.6.2.

Declarations of parameters of type "array of type" in a function
declaration are /adjusted/ to "qualified pointer to type". 6.7.6.3p7


If you have difficulty looking up these references, let me know and I
can paste the relevant quotations. Otherwise, I recommend you base your
understanding of the language on the actual definition of the language,
rather than half-remembered misconceptions about C not having arrays or
arrays in C being pointers.

Bakul Shah

Dec 21, 2021, 1:07:09 PM
Have you read "Hints on Programming Language Design" by Tony Hoare?
http://flint.cs.yale.edu/cs428/doc/HintsPL.pdf
Published in 1974 but still worth reading.

