Discussion on removing punctuation in programming languages

D...@psuvma.bitnet

Oct 8, 1986, 3:55:07 PM

Punctuation or separators (; , .) have long been a part of programming
language design. Anyone who has programmed for even the shortest amount of
time will realize that these little demons are responsible for a large number of
possible errors, so the question is: why have them at all?

The obvious answer is that they are there not for the programmer, but for the
compiler writer and his compiler. This is true for at least two reasons.
First, the separator (usually a list separator such as , or ;) signals to
the parser that another item follows and should be included in the
list (i.e. repeat while the next symbol is a separator), instead of having to
repeat until you reach an end-of-list symbol.
Second, error recovery is easy, for instance with Pascal's ; because
on an error the parser can go into panic mode until it sees a ; at which
point it can assume that things are stable again and resume parsing.
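
The two reasons above can be sketched in a few lines of Python. This is
only an illustration under simplifying assumptions: the token list stands
in for a real lexer's output, and the names parse_list and skip_to_sync are
made up for this example, not taken from any compiler.

```python
# Illustrative sketch only; the token list stands in for a lexer's output.

def parse_list(tokens, pos=0, sep=","):
    """Parse `item (sep item)*`: repeat while the next symbol is a separator."""
    items = [tokens[pos]]
    pos += 1
    while pos < len(tokens) and tokens[pos] == sep:
        if pos + 1 >= len(tokens):
            raise SyntaxError("separator with no item after it")
        items.append(tokens[pos + 1])
        pos += 2
    return items, pos


def skip_to_sync(tokens, pos, sync=";"):
    """Panic mode: discard tokens up to and including the next sync token."""
    while pos < len(tokens) and tokens[pos] != sync:
        pos += 1
    # Resume just past the sync token, or at end of input if none was found.
    return min(pos + 1, len(tokens))
```

For example, parse_list(["a", ",", "b", ",", "c", ";"]) stops at the ; without
needing a dedicated end-of-list symbol, and skip_to_sync lets a parser resume
at the statement after a broken one.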

The question is whether these or any other reasons are valid enough to keep
these gremlins in programming language specifications. Maybe so; I assume not.

What is it that we really want from separators?
1) To know when a parse unit begins and ends
2) ???????

By a parse unit I mean such things as statements, blocks, declarations, etc.
How do we know when a unit ends and the next begins? Until now we used the
separator.

I consider Wirth's Modula-2 an attempt to take a step away from all this
separator nonsense. However, he chose not to go all the way, as the semicolon
is still prevalent.

I propose that we can remove the semicolon, without losing its effectiveness,
by insisting that statements (and parse units in general) be fully
bracketed. By this I mean that every construct has some sort of
end marker at the end of the construct. In Modula-2, for instance,
the for statement is for id := expr to expr do <seqstmt> END. Similarly,
if's, case's, repeat's and while's all have end "brackets". Thus when any two
of these compounds are butted up against each other, the beginning and end of
each statement can be found even if one of the middle symbols is missing
(a missing symbol being the ending symbol of the first compound statement
or the beginning symbol of the second compound statement). This is
exactly what the semicolon was used for. Ah, but there is a problem: the
assignment statement! Consider the statements a := b + c e := f + g
Without a separator, you don't know whether the e is part of the first
statement (with a missing operator) or part of the second statement, UNTIL you
read the second :=. Thus I will require an end-of-assignment token to be
introduced (notice a weakness in my proposal here). But the assignment
statement is the only case where this is needed.
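
The assignment problem can be made concrete with a small Python sketch. It
is an illustration only: it assumes whitespace-separated tokens and that
every statement is a plain assignment, and split_assignments is a
hypothetical helper. The point is that the statement boundary is only
discovered by looking one token ahead for the next := symbol.

```python
def split_assignments(tokens):
    """Split a separator-free token stream into assignment statements.

    A token starts a new statement only if the token AFTER it is ':=',
    so the boundary is not known until that lookahead has been read.
    """
    statements, current = [], []
    for i, tok in enumerate(tokens):
        starts_new = i + 1 < len(tokens) and tokens[i + 1] == ":="
        if starts_new and current:
            statements.append(current)
            current = []
        current.append(tok)
    if current:
        statements.append(current)
    return statements
```

Note that this only works because the input is assumed to be well formed; if
an operator were missing, the sketch would silently accept the stream, which
is exactly the lost error detection being discussed.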

Is this argument clear? Correct?

Example program with proposed syntax.

Program Dummy

Const
(v p x) = 5
Type
thetype = array[0 5] of integer
Var
(x1 x2 x3 x4 x5):integer
(y1 y2 y3 y4 y5):real

Procedure proc1(var (x3 x2):integer)
begin
x1 := x3 + x2 END
for x1 := 1 to x5 do
x4 := 1 div x1 END
while x4 < 10 do
x4 := 1 + x4
end
end
end

begin
x3 = 0
repeat
proc1(x3,x5)
until x3 > x5 end
end


Don't try to analyse this program, it does nothing at best.

Admittedly, this looks an awful lot like swapping ; for end, and that is
probably so. But aren't end's a lot more intuitive than ; ? Maybe not, maybe
so. Which is better? That's up to you. What do you think?


As a side note, does any language allow the mathematically normal syntax of
IF (4 < x < 8) THEN blabla

I don't think so, but why not?


dave brosius
dmb at psuvma.bitnet

Cesar Quiroz

Oct 9, 1986, 11:58:05 PM


In a very interesting article, d...@psuvm.bitnet writes:
>> Punctuation or separators (; , .) have long been a part of programming
>> language design. Anyone who has programmed for even the shortest amount of
>> time will realize that these little demons are responsible for a large number of
>> possible errors, so the question is: why have them at all?
>>

>> ... A proposal follows, removing separators in favor
>> of fully-bracketed syntax ...
>>
The idea is certainly interesting. A few random ramblings
before I forget I wanted to follow up:

1- Bravo! You are about to rediscover S-expressions (and suggest
we abandon syntax as a problem?)

2- I don't think separators (or, in general, punctuation) are an
unmitigated evil. As with case sensitivity, type equivalence and
other areas, I think one can find people who intensely prefer
one of the extremes over the other, as well as people who will
be indifferent.

As an example: Some languages "free" you from semicolons by
expecting all your statements to end properly in a single line.
The language processor supplies the delimiters needed. This
causes on occasion the following interesting situation:

x := y || z
x := y ||
z
x := y
|| z
x := y
||
z

The statements above are not equivalent. For the second, the
system should recognize that the statement is incomplete and,
after looking in the next line, will produce the same results as
the first statement. The third and fourth actually might produce
syntax errors, or worse, unintended behavior, because this way
of fixing the "semicolon problem" is sensitive to indentation...
I consider it aesthetically repugnant, but I am pretty sure
there are quite reasonable people who will find it the "right"
tradeoff.

3- BASICs used to require LET as an assignment initiator (the
end was bracketed by a newline). That gave the whole language a
proper nesting syntax (punctuation had uses, though). I seem to
recall a similar structure for Cobol procedure division
statements: <verb> <things>. But I bet there were enough
irregularities to mask off this sane intention. And again, see
my note 1. What you *really* want is Lisp...



>>
>>
>> As a side note does any language allow the mathematically normal syntax of
>> IF (4 < x < 8) THEN blabla
>>

ICON (Griswold&Griswold) allows it. You might (or might not)
like to take a look at its syntax (it contains the ugly example
I gave above) but is a generally sane language. I would count
it along with 'awk' for writing text processing stuff under
UNIX, but that alone doesn't do enough justice to the concepts
offered in the language (the syntax, then again, ...) I don't
quite remember if Cobol allows it (perhaps *some* Cobol?)
Common Lisp extends some of the comparisons to take more than
two arguments, in which case the predicate is true if the
arguments form a monotonic sequence. Sort of: if (0<x<y<1)
then...

This question has recurred already. Perhaps somebody who
followed the previous round could post a Definitive List of
Languages Whose Comparisons Bind the Usual Way?
--
Cesar Augusto Quiroz Gonzalez
Department of Computer Science {allegra|seismo}!rochester!quiroz
University of Rochester or
Rochester, NY 14627 quiroz@ROCHESTER

han...@mips.uucp

Oct 10, 1986, 3:42:56 PM
> As a side note does any language allow the mathematically normal syntax of
> IF (4 < x < 8) THEN blabla
>
> I don't think so, but why not?
>
> dave brosius
> dmb at psuvma.bitnet

Yes. I have implemented a language where this is permitted.
In fact, it will accept expressions of this kind of arbitrary length,
such as (a < b < c <= d), which is equivalent to
(a < b) & (b < c) & (c <= d). The only reason against implementing
it is that it is harder to optimize the expression.

For example, if it is known that (b < c) is true, the expression
above reduces to (a < b) & (c <= d), which can only be represented
clearly by explicitly inserting the AND operator.
If the expression is limited to three terms, of course,
this doesn't happen.

The construct is mostly useful just for its expressive power.
It's shorter to write (4 < (x+(y*z)) < 6) than to write
(4 < (x+(y*z))) & ((x+(y*z)) < 6), though it is
straightforward, using common subexpression optimization,
to make either form generate equivalent code.
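
As a rough illustration of the equivalence described above (and not the
implementation mentioned in this post), here is a Python sketch. The helper
chained and the OPS table are names invented for the example. Each operand
is passed in as an already-computed value, so a middle term such as
x+(y*z) is evaluated once however many comparisons it takes part in.

```python
import operator

# Comparison operators supported by this sketch.
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def chained(first, *rest):
    """Evaluate `first op1 x1 op2 x2 ...` as (first op1 x1) & (x1 op2 x2) & ..."""
    left = first
    # rest alternates operator symbols and operand values.
    for op, right in zip(rest[0::2], rest[1::2]):
        if not OPS[op](left, right):
            return False
        left = right
    return True
```

Usage: chained(a, "<", b, "<", c, "<=", d) gives the same answer as
(a < b) and (b < c) and (c <= d). Python itself happens to accept the
chained form a < b < c <= d directly, with the same meaning.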

--

Craig Hansen | "Evahthun' tastes
MIPS Computer Systems | bettah when it
...decwrl!mips!hansen | sits on a RISC"

deb...@megaron.uucp

Oct 11, 1986, 1:00:57 PM
> I propose we can remove the semicolon, without losing its effectiveness
> by insisting that statements (and in general parse units) be equally
> bracketed. By this I mean that every construct have some sort of
> end-construct at the end of the construct.

Try lisp. Personally, I find that the high density of parentheses gets
in the way of readability.

--
Saumya Debray
University of Arizona, Tucson

deb...@arizona.edu
{allegra, cmcl2, ihnp4, ucbvax}!arizona!debray

John Rager

Oct 11, 1986, 3:11:28 PM


>Punctuation or separators (; , .) have long been a part of programming
>language design. Anyone who has programmed for even the shortest amount of
>time will realize that these little demons are responsible for a large number of
>possible errors, so the question is: why have them at all?

We had a long discussion about this topic quite recently. I move we not
repeat it again.

John Rager

Barry Shein

Oct 11, 1986, 6:02:55 PM

This comes up fairly often, doesn't it?

In teaching University courses I've observed that people find the
'separator' approach very confusing (I refer to our Intro Pascal
course as "Semicolons 101") while those languages which use them as
terminators (C, PL/I) people seem to find fairly intuitive (I've
taught programming here at BU in Asm, PL/I, Pascal and C, as well as
Survey of Programming Languages where I have taught a lot of others,
fast!) I remember when people started discovering "free-format"
languages (well, more or less, let's say I watched communities migrate
from Fortran, Cobol and Assembler in the mid-late 70's), I think they
preferred the free format approach even if they had to now understand
semi-colons (usually, of course Cobol had both format restrictions and
the '.' terminator.)

Some statements need to be broken into several lines, either you
will have to re-introduce the 'continuation' syntax a la Fortran
(*some* improvement) or impose other convolutions on your language
to make sure you can parse multi-line lines unambiguously (eg. not
allowing assignment within an expression.)

I personally think you are solving a non-problem tho I encourage you
to experiment. It's languages which use punctuations as separators
that are the real problem, not the punctuations.

-Barry Shein, Boston University

Lambert Meertens

Oct 12, 1986, 3:39:43 PM
to rnews@mcvax
>>> As a side note does any language allow the mathematically normal syntax of
>>> IF (4 < x < 8) THEN blabla

> This question has recurred already. Perhaps somebody who
> followed the previous round could post a Definitive List of
> Languages Whose Comparisons Bind the Usual Way?

The languages that I know of that allow this are ABC (formerly called B),
BCPL, COBOL and ICON. That Common Lisp has this too was new to me; I have
never seen a definition of that language.

--

Lambert Meertens, CWI, Amsterdam; lam...@mcvax.UUCP

d25...@mic.uucp

Oct 12, 1986, 4:09:00 PM

> Punctuation or separators (; , .) have long been a part of programming
>language design. Anyone who has programmed for even the shortest amount of
>time will realize that these little demons are responsible for a large number of
>possible errors, so the question is: why have them at all?
>

[ Example of proposed punctuationless language omitted. ]


>
>Admittedly, this looks an awful lot like swapping ; for end, and that is
>probably so. But aren't end's a lot more intuitive than ; ? Maybe not, maybe
>so. Which is better? That's up to you. What do you think?
>

> dave brosius
> dmb at psuvma.bitnet

Hoary old FORTRAN used the newline character instead of ";" or "END".
One could make a case that this was more intuitive than either.
There is certainly no reason to suppose that our choices are limited to
just these three possibilities.

Carrington Dixon
UUCP: { convex, infoswx, texsun!rrm }!mcomp!mic!d25001

sin...@spar.uucp

Oct 13, 1986, 12:04:45 PM
The language BCPL uses semi-colon as the statement separator, but the
compiler is documented as being very relaxed about them being missing;
basically they are only required when there would otherwise be an
obvious ambiguity. Partly it does this (I think) by treating end-of-line
as a possible 'implied semi-colon', ignoring it if not appropriate, and
taking it if appropriate. Multiple statements on one line still had
to be separated, I think.

It also has a notation to join two statements in a single block
a := 4 <> b := 5
$( a :=4 ; b := 5 $)
a legible syntax, nice loop constructs, and also allows (5 < a < 10).

Obviously it has problems wrt typing, addressing, and independent
compilation, all fixed in C; but C broke legibility, loop constructs,
and much that was beautiful in the language in a terrible attempt to go
for minimum typing/maximum illegibility (second only to APL, in my
opinion).

ku...@fluke.uucp

Oct 13, 1986, 12:25:16 PM
Reasons for continuing to have delimiters in languages:

1. It makes error recovery easier/possible in the compiler.
Don't whine and snivel and say "if the compiler-writers were any good
they could do adequate error recovery without noise tokens like
delimiters." That simply isn't true. What have you gained if you have
a language that is a tiny bit quicker to type in but the compiler can
only mark the first syntax error, then gives up? Making error recovery
easier for the compiler improves its ability to find all your syntax
errors in a single pass.
2. Delimiters reduce the ambiguities in a grammar.
This permits the compiler to use more expressive syntactic forms. You
can use more powerful forms if they can be separated and made distinct
by delimiters.
3. Delimiters reduce the tendency for the compiler to accept incorrect
sentences as correctly formed syntax.
I recently worked on a compiler that allowed function invocations using
a named argument notation. To invoke the function

func (arg1, arg2, ... argN)

You used the invocation

func arg1 <exp1>, arg2 <exp2>, ... argN <expN>

However, since expressions can also be function invocations you could
have invocations like

func1 arg11 func2 arg21 <exp21>

To make a long story short, comments looked like

! any characters following the "!" up to the end of the line.

And if you accidentally forgot the "!", the resulting sentence
frequently was accepted as a bizarre function invocation. If the
argument list had been delimited by "("...")", or the argument name and
argument expression had been separated by ":=", this would have been
avoided.

Noise words in languages add valuable redundancy that aids the human
reader, compiler, and compiler writer. I always think of the words
spoken by C.A.R. Hoare in his 1983(?) Turing Lecture address, where he
said approximately "Wouldn't it be wonderful if your Fairy Godmother
would wave her magic wand over your program and pronounce it correct,
and all you had to do was type it in three times."

Garry Wiegand

Oct 13, 1986, 4:58:05 PM
In a recent article r...@brl-sem.ARPA (Ron Natalie <ron>) wrote:
>Punctuation is a very natural part of any language computer or
>natural the reason nearly all computer languages use such syntactic
>conventions either explicit by ending the statements in semicolons
>periods or by enclosing them in parenthesis or implied by using end
>of card or line is because this is the way most printed natural
>languages work I have never seen any attempt to simplify English
>language by leaving out the punctuation it makes for easier understanding
>of the printed word for humans and punctuation in computer programs makes
>understanding easier for both the programmer and the program

Punctuation is *not* necessarily "very natural". The addition of
punctuation to written English is historically recent, is it not?
(My mental archives are whispering "16th century" at me.) Anybody
know the facts?

garry wiegand (garry%cadi...@cu-arpa.cs.cornell.edu)

lude...@ubc-cs.uucp

Oct 14, 1986, 2:52:01 AM
You anti-punctuators might consider looking at the BCPL book (Strachey
et al if I remember correctly). BCPL allows leaving out semicolons
before end-of-line. If the last item on a line is an operator, then
the statement is assumed to continue on the next line. Thus,

a := b + c
d := e

a := b +
c
would be correct but

a := b
+ c
would be an error.
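
The rule described above can be sketched roughly as follows. This is a
simplified Python illustration, not how any BCPL compiler actually works:
tokens are assumed to be whitespace-separated, the operator set is an
arbitrary small sample, explicit semicolons are not handled, and
join_statements is a made-up name.

```python
# A small, arbitrary sample of operators for the purposes of the sketch.
BINARY_OPS = {"+", "-", "*", "/", ":=", "<", ">", "="}


def join_statements(lines):
    """Group source lines into statements using an end-of-line rule.

    A line ending in an operator continues onto the next line; any other
    line ending acts as an implied semicolon.
    """
    statements, pending = [], []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        if not pending and tokens[0] in BINARY_OPS:
            raise SyntaxError(f"statement cannot start with {tokens[0]!r}")
        pending.extend(tokens)
        if tokens[-1] not in BINARY_OPS:
            statements.append(" ".join(pending))
            pending = []
    if pending:
        raise SyntaxError("statement is incomplete at end of input")
    return statements
```

With this sketch the first two layouts above are accepted (the second is
joined into a single statement), while the third raises a SyntaxError on
the line that begins with +.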

So, the only use of semicolons would be something like:
temp := a; a := b; b := temp; /* swap a and b */
(before anyone starts flaming about lvalues and rvalues, I know
that this is NOT how it is written in BCPL)

SASL (St. Andrews Static Language - Turner et al) has a method of
avoiding punctuation by paying attention to indenting. I could look
up the details if anyone is interested.

Incidentally, a nice feature of BCPL is the ability to dynamically
allocate a vector on the stack. (This feature is provided in 4.2bsd
C by a function that is described as non-portable.)

Also, the question was raised if any language allowed
if a < b < c then ...
and the answer is (of course) COBOL.

Tue Bertelsen

Oct 14, 1986, 12:40:33 PM
In article <7796DMB@PSUVMA>, D...@PSUVMA.BITNET writes:
>
> Punctuation or separators (; , .) have long been a part of programming
> language design. Anyone who has programmed for even the shortest amount of
> time will realize that these little demons are responsible for a large number of
> possible errors, so the question is: why have them at all?
>
> The obvious answer is not for the programmer, but for the compiler writer
> and his compiler. This is true for at least two reasons.

Once upon a time there was a programming language called PLZ/SYS. It was
developed by Zilog for their Z80 processor (and later became available
under UNIX on the Zilog System 8000 computers).

Besides being an extremely efficient language for microprocessor programming,
it had virtues that we still fail to see in so-called 'high-level'
languages:

- true modular programming
- true structured programming
- a concise syntax eliminating the need for punctuation
- concise semantics that were logical and readable
- machine independence
- strong typing
- no gotos

Programs in PLZ/SYS contained only declarations:

- declarations of data
- declarations of actions to be performed on data (i.e. procedures)

The actual writing of programs required no punctuation, except that each
token had to be separated from other tokens by delimiters. Delimiters could be
anything (spaces, commas, semicolons, linefeeds, comments). This meant that
programs could be written very readably (no inconsistent use of END), indented
and spread across several lines.

This actually made it possible to write programs faster and with fewer errors
on the first try.
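
The "delimiters can be anything" idea can be illustrated with a tiny Python
tokenizer. It is only a sketch under stated assumptions: it is not PLZ/SYS
syntax, it treats just whitespace, commas and semicolons as delimiters
(comments are left out), and tokenize is a name chosen for the example.

```python
import re


def tokenize(source):
    """Split source text into tokens, treating whitespace, commas and
    semicolons as interchangeable delimiters that carry no meaning."""
    return [tok for tok in re.split(r"[\s,;]+", source) if tok]
```

Under this scheme "x := 1; y := 2" and the same text spread over two lines
with no semicolon produce identical token streams, so punctuation becomes a
matter of taste for the reader rather than a requirement of the parser.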

Based on this experience, I consider a language design that depends on
punctuation a shame, and an unnecessary constraint imposed on the programmer
just to ease the compiler writer's work.

Still, there are old programmers who cannot live without punctuation. In
PLZ/SYS they were free to use it if they liked, just for the purpose of
improving readability.

So hopefully, the next generation of languages will be free of such
useless things.

To conclude, here are the 3 rules of programming:

We DO NOT write programs in high level languages in order
to instruct the computer (nor the compiler!)

(if we did, we would be using hexadecimal instruction
codes)

We write programs in order to enable OTHER people to read them.

Other people read programs in order to UNDERSTAND what the
computer does when it executes the program.

In other words: write, so it can be read.

For further information on PLZ:

"Report on the programming language PLZ/SYS"
Tod Snook, Charlie Bass et al.
Springer-Verlag 1979
ISBN 3-540-90374-7

Sincerely yours,

Tue Bertelsen
AmbraSoft A/S

Comfy chair

Oct 14, 1986, 12:48:51 PM
One thing that is easily forgotten is that when humans look at a
program listing we see indentation and spacing and all that.
Unfortunately the compiler only sees a single stream. If you think that
"any decent compiler should be able to error recovery in the absence of
delimiters", try reading your program on ticker tape. And no
backtracking, either.

Ken

Ed Segall

Oct 14, 1986, 8:24:20 PM
Here's a suggestion about the role of punctuation in improving
readability and reducing errors: I think that the reason semicolons,
etc cause problems is that 1) they don't stand out very well, so it's
not obvious when they're in proper vs. improper places, 2) they can be
confused with each other upon casual reading, and 3) (perhaps most
importantly) they don't have the nice symmetric appearance that
brackets do. Anyone who has used a programming interface like
Macintosh Pascal knows that the same function performed by the
punctuation marks can be done more nicely by font changes and
indentation. This, I believe, can preserve all the information
provided by punct. marks in a *much* more natural, readable way.

Substituting { and } for begin and end is one method I've seen to
help with this. Something similar can be done for statements (see
prev posting on s-expressions). BTW, there's no reason that you can't
have a visual programming environment preprocess a program into the
cryptic, punctuation-laden languages preferred by compilers.

(sorry - I didn't see the previous discussion on the subject. Hope
this is not a rehash)

Ed Segall

Don Steiny

Oct 15, 1986, 1:47:15 PM
In article <12...@batcomputer.TN.CORNELL.EDU>, ga...@batcomputer.TN.CORNELL.EDU (Garry Wiegand) writes:
>
> Punctuation is *not* necessarily "very natural". The addition of
> punctuation to written English is historically recent, is it not?

No.

> (My mental archives are whispering "16th century" at me.) Anybody
> know the facts?

I certainly know that Old English (before 1066) was punctuated,
as was Middle English (Chaucer - 14th century).

Punctuation of some sort is natural because it seeks to
represent the natural intonation of spoken language. All written
natural language is a symbolic representation of spoken natural language.

Before there was even writing, recorded events were punctuated
in a sense by being poems. George Miller's work seems to show that
humans can remember a limited number of "chunks" of information.
Poems group the chunks by having rhyming units. Prose breaks the
text into chunks by indicating sentence and phrase boundaries with
commas, periods, and so on.

Maybe we should invent a programming language that delineated
statements with iambic pentameter. Or sonnets.

"when in disgrace with fortune and men's eyes,
I, all alone, beweep my outcast state. . ."


--
scc!steiny
Don Steiny @ Don Steiny Software
109 Torrey Pine Terrace
Santa Cruz, Calif. 95060
(408) 425-0382

D...@psuvma.bitnet

Oct 15, 1986, 2:56:53 PM

My point to all those who say delimiters aid in compiler writing is that
fully bracketed statement syntax does just the same thing that separators do.
Plus it removes some ambiguities (the dangling else, as in Modula-2's
if elsif elsif elsif else form)


dave

Brad Templeton

Oct 18, 1986, 11:44:40 PM
The real question to be answered here is,

"How are programs going to be created and compiled in the future?"

Now, with traditional technology, punctuation (like semicolons) is a
tremendous aid to error recovery.

In the future, more programming may be done with language based editors
and incremental compilation systems. There are two basic ways of doing
this (although they can be combined). The first is the template style.

In this case, the punctuation isn't even typed in normally. Things like
semicolons and braces are all provided by the system. The user edits only
the real things in the syntax tree. This is fine, (superb for learners, in
fact) but not all experienced users enjoy it.

Thus the other method, where the user works with the editor like a regular
text editor, but using incremental parsing techniques, it understands what
is going on anyway. With this method, punctuation is very useful because
it simplifies (and speeds up) the parsing. It also makes it more reliable.

So if you decide there shall be no punctuation in the next programming
language, you make things more difficult for those who want LBEs.

Since punctuation is for the parser (mostly) you have to examine future
compiler methodologies to discuss it.
--
Brad Templeton, Looking Glass Software Ltd. - Waterloo, Ontario 519/884-7473

Wayne A. Christopher

Oct 19, 1986, 12:26:28 AM
I think that a lot of the "errors" people complain that punctuation
causes are syntax errors -- in my view, fixing syntax errors is the
easiest part of debugging, and if there is any way that a more
insidious error can be made into a syntax error, that's great. Without
delimiters, unless your language forces you to type a lot of extra
stuff or program in an unnatural way, it's easy to write something
that's syntactically valid but not what you meant... Sure, readability
is fine, but you have to make sure that the computer is reading it the
same way...

Wayne

d...@psuvm.bitnet.uucp

Oct 20, 1986, 12:17:50 AM

I disagree! I think the subtleties of a programming language SHOULD be
designed with the compiler writer and associated compiler in mind. A language
description should be such that the compiler can report the best description
of errors, report the most errors in a single compile session, and aid in
optimization.
My problem is that semicolons are not the way to do this.

dave


Apology: Sorry I touched such a hot topic to begin with, those growing tired
can hurl abuse at me, but then again what else is there to talk about....

Bob Hathaway

Oct 20, 1986, 7:10:12 PM

Indentation is an excellent way to delimit control structures.
It takes advantage of the natural structure programmers impose
on their programs, and does away with a lot of cumbersome punctuation.
Ignoring this information is unnecessary and error-prone.


Bob Hathaway

Mark H. Colburn

Oct 22, 1986, 12:38:22 AM

I would tend to agree, except that I have seen so many different ways to
indent the same piece of code. If indentation were the delimiter,
programmers would all have to use the same indentation style (since it would
have to be part of the grammar of the language). I would think that this might
outrage more programmers than it placates.


--
Mark H. Colburn UUCP: ihnp4!rosevax!ems!mark
EMS/McGraw-Hill ATT: (612) 829-8200
9855 West 78th Street
Eden Prairie, MN 55344

Comfy chair

Oct 22, 1986, 11:40:17 AM


Let's follow a more productive line of discussion (or argument :-)):

In a previous article I said that punctuation was needed by parsers for
various reasons. What I failed to mention and what some other people
pointed out is that other things can be made the cues instead of
punctuation, like indentation. All you have to do in conventional
parser technology is to make these explicit tokens instead of
whitespace to be ignored. In interactive systems, it may even be easier
to do it with indentation. (I know of one language that uses
indentation, actually.)
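
Making indentation an explicit token could look something like the Python
sketch below. It is a minimal illustration under assumptions of my own
choosing: only leading spaces are counted (tabs are deliberately not
handled), each non-blank line becomes one LINE token, and the token names
INDENT and DEDENT are just labels picked for the example.

```python
def indent_tokens(lines):
    """Yield (kind, text) tokens, turning changes in leading whitespace
    into explicit INDENT and DEDENT tokens for a conventional parser."""
    stack = [0]  # indentation widths of the currently open blocks
    for line in lines:
        if not line.strip():
            continue  # blank lines carry no structure
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            yield ("INDENT", "")
        while width < stack[-1]:
            stack.pop()
            yield ("DEDENT", "")
        if width != stack[-1]:
            raise SyntaxError(f"inconsistent dedent: {line!r}")
        yield ("LINE", line.strip())
    while len(stack) > 1:  # close any blocks still open at end of input
        stack.pop()
        yield ("DEDENT", "")
```

The parser then treats INDENT and DEDENT exactly as it would treat begin
and end, so the grammar itself does not need to change much.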

Now I think this is an idea with promise but let's not jump into this
blindly. I think a discussion about the ramifications is worthwhile.

I can think of several things:

1. What do you do about tabs? Are they equivalent to space to the next
stop or what? What happens if you move to another environment with
different tab stops? (One solution, use tabs only.)

2. What do you do when the user runs out of intermediate columns to add
another level of nesting? Sort of like running out of numbers for lines
in BASIC. (One solution, automatic reformatting.)

Well, I think that gives you the idea. There are more good new ideas to
be explored instead of flaming each other about the right way it used
to be done, etc. Language design isn't dead yet. Let's have more
ideas.

Ken

Larry Wall

Oct 23, 1986, 4:18:47 PM
I think different statements should be different colors. Or perhaps stand
out from the screen differently in three dimensions. Instead of top-down
programming we can have front-to-back, or some such. I think you ought
to be able to hide one statement behind another. Maybe you could do
conditionals that way. Hey, did I just invent 3 dimensional programming
languages?

Larry Wall
{allegra,burdvax,cbosgd,hplabs,ihnp4,sdcsvax}!sdcrdcf!lwall

and...@hammer.uucp

Oct 24, 1986, 12:49:09 PM
If you think it's easy to misplace a curly brace, you ain't seen
nothing until you go looking for the subtle program misbehavior caused
by incorrect indentation!

An alternate proposal, which would lead to a more redundant, robust
language, would be to require that the programmer use BOTH punctuation
and indentation. If one or the other is omitted, the compiler can
sense and publish the error.
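
A rough Python sketch of such a redundancy check follows. It is only an
illustration: it assumes curly braces as the block punctuation, a fixed
indentation unit of four spaces, and no braces inside strings or comments.
The name check_redundant_layout is hypothetical.

```python
def check_redundant_layout(lines, unit=4):
    """Return the line numbers where indentation disagrees with brace depth."""
    mismatches, depth = [], 0
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue
        # A line that starts with a closer belongs to the enclosing level.
        expected = depth - 1 if text.startswith("}") else depth
        actual = len(line) - len(line.lstrip(" "))
        if actual != expected * unit:
            mismatches.append(number)
        depth += text.count("{") - text.count("}")
    return mismatches
```

A statement that was meant to sit inside a block but was typed at the outer
indentation shows up as a mismatch, instead of compiling quietly into
something the programmer did not intend.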

"Language design isn't dead yet."

These syntactic sugar discussions (how do you spell block begin/end?)
are the least interesting and most trivial part of language design. If
you're interested in language design-by-committee, check out your
favorite ANSI standards group. There are still a lot of meaty issues
to argue over.

-=- Andrew Klossner (vice chair, ANSI BASIC committee)
(decvax!tektronix!tekecs!andrew) [UUCP]
(tekecs!andrew.tektronix@csnet-relay) [ARPA]

Brad Templeton

Oct 25, 1986, 5:41:35 PM
In article <26...@hammer.TEK.COM> and...@hammer.UUCP writes:
>If you think it's easy to misplace a curly brace, you ain't seen
>nothing until you go looking for the subtle program misbehavior caused
>by incorrect indentation!

True, and very important. Indentation is guaranteed to balance out by the
end of the program. There's no such thing as an error, as far as the compiler
can detect (except in a few cases where extra keywords like else appear).

If you're using a regular text editor, using indentation can be very dangerous.
As Mr. Klossner suggests, the correct suggestion is to use both closers and
indentation, redundantly. Since the indentation supplies the visual cues,
the closers should be unobtrusive, but present.


>
>These syntactic sugar discussions (how do you spell block begin/end?)
>are the least interesting and most trivial part of language design.

Anything that people want to debate is (I say by definition) an important
issue in language design. For a language to be good, it must be liked.
If there is something, even syntactic sugar, that has a strong effect on
public perception of the language, it is an important issue.

One of the primary purposes of high level languages is human readability.
That's what syntax is all about.

der Mouse

Oct 26, 1986, 6:26:51 PM
In article <21...@rochester.ARPA>, k...@rochester.ARPA (Comfy chair) writes:
> One thing that is easily forgotten is that when humans look at a
> program listing we see indentation and spacing and all that.
> Unfortunately the compiler only sees a single stream.

*Current* compilers see only a stream of tokens. Compilers could be
and probably have been written which used/use the indentation to intuit
intended nesting, assisting in such things as error recovery.

> If you think that "any decent compiler should be able to error
> recovery

[sic]


> in the absence of delimiters", try reading your program on ticker
> tape. And no backtracking, either.

Why no backtracking? A compiler usually has access to enough virtual
memory to hold the entire source file available for random access.
Even when this is impossible, the compiler can normally seek around in
the file, which gives the same capability with a performance penalty.
I am ignoring smaller machines (eg, PCs) here; but I think you would
find that a backtracking compiler doesn't look at very much when it
backtracks - just things like "what was the indentation at the
beginning of this loop?" - and hence doesn't need to remember anything
like the entire source file.
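
As a minimal sketch of that idea (in Python, for brevity; the function name
and the rule "a closer lines up with the line that opened its block" are
assumptions made for illustration, not taken from any existing compiler),
a brace-matching pass only has to keep one indentation value per open block:

```python
def find_unclosed_blocks(source):
    """Return line numbers of block openers that seem to lack a closer.

    Assumes indentation is written with spaces and that a '}' is meant to
    line up with the line that opened its block.
    """
    open_blocks = []   # stack of (line_number, indent), one entry per open '{'
    suspects = []
    for number, line in enumerate(source.splitlines(), start=1):
        text = line.strip()
        if not text:
            continue
        indent = len(line) - len(line.lstrip(" "))
        if text.startswith("}"):
            # Openers indented deeper than this closer were never closed.
            while open_blocks and open_blocks[-1][1] > indent:
                suspects.append(open_blocks.pop()[0])
            if open_blocks:
                open_blocks.pop()
        if text.endswith("{"):
            open_blocks.append((number, indent))
    # Whatever is still open at end of file is unclosed as well.
    suspects.extend(number for number, _ in open_blocks)
    return sorted(suspects)
```

A parser that looks only at tokens would report the missing brace at end of
file; with the remembered indentation the report can point at the line that
opened the unclosed block.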

der Mouse

USA: {ihnp4,decvax,akgua,utzoo,etc}!utcsri!mcgill-vision!mouse
think!mosart!mcgill-vision!mouse
Europe: mcvax!decvax!utcsri!mcgill-vision!mouse
ARPAnet: think!mosart!mcgill-vision!mo...@harvard.harvard.edu

Aren't you glad you don't shave with Occam's Razor?

TepperL

Oct 27, 1986, 2:26:47 PM
In article <6...@looking.UUCP>, br...@looking.UUCP writes:
> In article <26...@hammer.TEK.COM> and...@hammer.UUCP writes:
> >If you think it's easy to misplace a curly brace, you ain't seen
> >nothing until you go looking for the subtle program misbehavior caused
> >by incorrect indentation!
>
> True, and very important. Indentation is guaranteed to balance out by the
> end of the program. There's no such thing as an error, as far as the compiler
> can detect (except in a few cases where extra keywords like else appear).

I think that the issue isn't so much removing all punctuation, but
merely making a judicious choice in what is required and what is
optional.

In the vast majority of C code I've ever seen, there is usually just
one statement per line. I've looked at a lot of the UNIX source code.
That means to me that in a new C-like language, one could:

1) Add new-line as a statement terminator

2) Let semi-colon stand as a statement terminator, but drop
the <requirement> for it. Semi-colon would then be used
for multi-statement lines. I guess that makes it a statement
separator.

3) Add something like <back-slash><new-line> to indicate a
statement continued across line boundaries.
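
The three rules above can be sketched as a small statement splitter. This is
only an illustration in Python: split_statements is a hypothetical name, and
the sketch assumes no semicolons inside string literals or for(;;) headers.

```python
def split_statements(source):
    """Split source text into statements using the three proposed rules."""
    statements = []
    pending = ""   # text carried over from lines that ended in a backslash
    for line in source.split("\n"):
        if line.endswith("\\"):
            # Rule 3: backslash-newline continues the statement.
            pending += line[:-1] + " "
            continue
        logical = pending + line
        pending = ""
        # Rule 1: the newline ends the statement.
        # Rule 2: ';' separates statements that share a line.
        for piece in logical.split(";"):
            piece = " ".join(piece.split())   # normalize whitespace
            if piece:
                statements.append(piece)
    if pending.strip():
        statements.append(" ".join(pending.split()))
    return statements
```

Existing code that ends every line with a semicolon still splits the same way,
because the empty piece after a trailing ';' is discarded.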

I've been bitten by the bad-indentation bug myself. Problems like
that are tough to find. The typical way that it happens to me goes
something like this:

Write a while loop with a one-statement body:

    while (condition)
        do something;

The program is working right, so I add some debug:

    while (condition)
        fprintf(stderr, "Hi, I got to the loop, n = %d\n", n);
        do something;

You get the idea. I've introduced another bug with my debug code.
What I've found out recently is that a lot of times when I write
single statement loops without curly braces, it turns out that
sooner or later I have to add them anyway as the loop grows.
Here's what I propose for a new C-like language:

1) Require curly braces in all loops.

2) Drop the requirement for parentheses in the condition
statement.

This means you wouldn't have to do any more typing than before,
and the language would prevent you from getting bitten by the
incorrect-indentation bug.
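
For existing C code, the same bug can be hunted mechanically. The following is
a rough sketch in Python, not a real lint tool: misleading_lines is a
hypothetical name, and it assumes one statement per line, space indentation,
and a loop or if header that fits on a single line.

```python
import re

# A control header with no opening brace on the same line, e.g. "while (x)".
CONTROL = re.compile(r"^(while|for|if)\b.*\)$")


def indent_of(line):
    return len(line) - len(line.lstrip(" "))


def misleading_lines(source):
    """Return 1-based numbers of lines that look like part of a brace-less
    loop or if body (same indentation as the body) but are not."""
    lines = source.splitlines()
    flagged = []
    for i in range(len(lines) - 2):
        head, body, after = lines[i], lines[i + 1], lines[i + 2]
        if not CONTROL.match(head.strip()):
            continue
        if not body.strip() or not after.strip():
            continue
        if body.strip().startswith("{"):
            continue   # the body is a braced block after all
        if indent_of(body) > indent_of(head) and indent_of(after) == indent_of(body):
            flagged.append(i + 3)
    return flagged
```

Run over the debug example above, it flags the "do something" line, which is
indented like the loop body but executes only after the loop ends.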
--
Larry Tepper {ihnp4 | allegra}!drutx!druil!lat +1-303-538-1759

Guido van Rossum

Oct 30, 1986, 8:17:46 PM
to rnews@mcvax
[Context appended at end of article]

The difference between misplaced braces bugs and misplaced indentation
bugs is that misplaced (often: missing) braces are hard to find.
See posted example (loop with one-line body + added debug statement).
On the other hand, misplaced indentation is immediately spotted by the
human reader, e.g. (using a language I am familiar with as example):

    PUT "" IN name
    GET.CHAR c
    WHILE 'a' <= c <= 'z':
        WRITE c
        PUT name^c IN name
    GET.CHAR c \ Bug! should be indented more

Here, I would say, you notice the bug even while you are typing it, or
if not, you'll catch it immediately when you peruse the program again
after you find it gets stuck in an infinite loop. Thus, I find
indentation sufficient.
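
How a parser can read block structure from indentation alone can be sketched
in a few lines. This is an illustration in Python under stated assumptions
(space indentation only, one statement per line, a hypothetical function name
block_tokens); it is not a description of how any particular implementation
works.

```python
def block_tokens(source):
    """Turn indentation changes into explicit INDENT and DEDENT tokens."""
    levels = [0]    # indentation widths of the currently open blocks
    tokens = []
    for line in source.splitlines():
        text = line.strip()
        if not text:
            continue   # blank lines carry no structure
        width = len(line) - len(line.lstrip(" "))
        if width > levels[-1]:
            levels.append(width)
            tokens.append(("INDENT", ""))
        while width < levels[-1]:
            levels.pop()
            tokens.append(("DEDENT", ""))
        if width != levels[-1]:
            raise ValueError("dedent does not match any outer level: " + text)
        tokens.append(("LINE", text))
    # Close every block that is still open at end of input.
    tokens.extend(("DEDENT", "") for _ in levels[1:])
    return tokens
```

Fed the example above, the DEDENT token appears before the final GET.CHAR, so
the parser sees exactly the structure the reader sees on the page.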

Sure, it's too late to add this, or any of the other proposed
"improvements", to an existing language like C, or even to a new language
that wants to be compatible, like C++. With computer languages, there
is no "next year's model" which you can load with all the features you
find missing in this year's model. Instead, there are longer-lasting
"generations" where members of two or three generations live together,
the young ones learning from the mistakes of old timers, some old
timers picking up a few new ideas (Fortran seems to be in its third
youth!), and the real future of the country being in the hands of
youngsters that nobody takes seriously yet.

Sometimes there is a real "wave movement": the third generation goes
back to ideas of the first, that were utterly rejected by the second
generation. An example is free formatting. When I was a kid, I learned
that the free format of Algol was an improvement over Fortran's rigid
"one statement per line" structure. (In Algol you could use a minimum
number of punched cards for your program by formatting it as one massive
block, but it also allowed the liberal insertion of spaces and
indentation to display the program's intended meaning, as opposed to
the interpretation understood by the compiler.) Very-high level
languages such as Prolog and ABC again have one statement per line, and
I've seen newly designed languages *with* goto statements! Also, in C,
two statements on a line feel like a cardinal sin to me.

Some more mundane points: the issues of mixing spaces and tabs are
utterly system-dependent; when moving a file from one system to another
with a different tab stop interpretation you should do a conversion,
just like when converting from ASCII to CDC DISPLAY CODE. A practical
problem may be that the conventions of the originating system are not
known, and that users may set their own conventions (":se ts=4") that are
not always understood by all tools in the programming environment.
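
A small illustration of the tab-stop problem, in Python; indent_width is a
hypothetical helper, and the only library call used is the standard
str.expandtabs method. The same two lines line up under one tab-stop
convention and not under another:

```python
def indent_width(line, tabstop):
    """Width of the leading whitespace once tabs are expanded to spaces."""
    expanded = line.expandtabs(tabstop)
    return len(expanded) - len(expanded.lstrip(" "))


# One line indented with two tabs, one with eight spaces.
mixed = ["\t\tfoo();", "        bar();"]

widths_ts4 = [indent_width(line, 4) for line in mixed]   # as the author saw it
widths_ts8 = [indent_width(line, 8) for line in mixed]   # as another tool sees it
```

With a tab stop of 4 both lines sit at column 8; with a tab stop of 8 the
first moves to column 16. A conversion step that expands tabs using the
originating system's tab stop removes the ambiguity.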

Luckily, auto-indent text editors are commonplace so typing in the
indentation is no problem. If your indentation gets so deep there is no
room for your statements on the line it is probably time to introduce
another level of subroutines anyway. A form of indentation commonly
found in Pascal programs is ridiculous: small indentation steps (one or
two spaces) with one step for each syntactical production:
case i in
 1:
  begin
   if i=j then
    begin
     stuff;
    end
   else
    morestuff
  end
end
This makes it very hard to find matching begin/end pairs, especially if
they span a page or two. Of course, this is neither an argument against
Pascal nor against indentation: it only shows that religious application
of any principle by small minds doesn't create great software. (Cf. the
goto debate.) In my opinion, indentation should be displayed as at
least three or four spaces. The 8 commonly used in C are excessive, but
are the only solution that is portable *and* easy to type.

Syntax-directed editors can suggest an increase of indentation after
certain statements like IF, FOR, WHILE. Decreasing the indentation
again must be done manually, and I've struggled with at least three
different systems without finding the ideal solution. VI is the worst:
it requires a ^D to backspace over suggested indentation but ^H to
retreat over indentation you've just typed. Emacs allows you to use ^H at all
times (I'm not talking of the various electric-C modes, which all too
often impose a particular, often uncomfortable, formatting style, but
about a normal auto-indent mode). A structured editor I've written
myself interprets a [return] with the cursor at the beginning of a line
(after the indentation, that is) as a "dedent" operator. After a short
learning period this feels very pleasant, especially since you can
"dither" the return key: one return gives a new line, two returns in a
row give a dedented line. Unfortunately, this does not extend to
documents that may contain blank lines, or you'll need kludges like
typing a space followed by a return to keep a blank line. Ideas, anyone?

--
Guido van Rossum, CWI, Amsterdam <gu...@mcvax.uucp>

[Some context:]

Craig Wylie

Oct 31, 1986, 7:25:12 AM
In article <21...@rochester.ARPA> k...@rochester.UUCP (Comfy chair) writes:
>
>.... In interactive systems, it may even be easier

>to do it with indentation. (I know of one language that uses
>indentation, actually.)

Miranda uses indentation -- if you know of any other interactive languages
using indentation let me know. Occam uses indentation but is compiled.

The indentation scheme in Occam is forced on the user as two spaces per level
while Miranda uses the offside rule (Peter Landin's idea I believe).


One of the big problems with indentation as a structure notation is that
of multi line statements. The suggestion of using a backslash at the end
of a line to be continued just feels like a backward step. I know it is used
in many UNIX utilities (e.g. make and sh) but that doesn't mean it is elegant :-).
I can not help but feel that continuation symbols in a programming language
should be left in Fortran. We should be looking for a way of automatically
determining structure. It should be possible for a parser to determine if
the next line is a continuation. A combination of indentation and whether
the next input token would be legal as a continuation should be enough.
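
One way such a combined test might look, as a rough sketch in Python. The
token lists and the function name join_continuations are assumptions made for
illustration; a real parser would consult its grammar rather than a fixed list
of operators.

```python
OPEN_ENDED = ("+", "-", "*", "/", "=", ",", "(")   # cannot end a statement
CANNOT_START = ("+", "*", "/", "=", ",", ")")      # cannot begin a statement


def indent_of(line):
    return len(line) - len(line.lstrip(" "))


def join_continuations(lines):
    """Merge a line into the previous statement when it is indented deeper
    than that statement's first line AND the token context says the
    statement cannot be complete yet."""
    statements = []
    start_indent = 0   # indentation of the first line of the last statement
    for line in lines:
        text = line.strip()
        if not text:
            continue
        deeper = bool(statements) and indent_of(line) > start_indent
        unfinished = bool(statements) and (
            statements[-1].endswith(OPEN_ENDED) or text.startswith(CANNOT_START)
        )
        if deeper and unfinished:
            statements[-1] += " " + text
        else:
            statements.append(text)
            start_indent = indent_of(line)
    return statements
```

Indentation alone is not enough, since a loop body is also indented deeper
than its header; requiring both signals keeps the body a separate statement.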

Certainly syntax-directed editors would make things much easier.

Craig.

--
UUCP: ...!seismo!mcvax!ukc!dcl-cs!craig| Post: University of Lancaster,
DARPA: craig%lancs.comp@ucl-cs | Department of Computing,
JANET: cr...@uk.ac.lancs.comp | Bailrigg, Lancaster, UK.
Phone: +44 524 65201 Ext. 4146 | LA1 4YR
Project: Cosmos Distributed Operating Systems Research Group

cds...@alberta.uucp

Nov 3, 1986, 3:37:40 PM
In article <71...@boring.mcvax.UUCP> gu...@boring.uucp (Guido van Rossum) writes:
>The difference between misplaced braces bugs and misplaced indentation
>bugs is that misplaced (often: missing) braces are hard to find.

No they aren't. Use an editor which finds matching braces. They are
common and very reliable.

>On the other hand, misplaced indentation is immediately spotted by the
>human reader, e.g. (using a language I am familiar with as example):
>
> PUT "" IN name
> GET.CHAR c
> WHILE 'a' <= c <= 'z':
      begin
>     WRITE c
>     PUT name^c IN name
      end
> GET.CHAR c \ Bug! should be indented more
>
>Here, I would say, you notice the bug even while you are typing it, or
>if not, you'll catch it immediately when you peruse the program again
>after you find it gets stuck in an infinite loop. Thus, I find
>indentation sufficient.

Sure, but the begin/end pair (in lower case) makes it much more obvious
because you have two visual cues. The whole point of indentation is
communication with the HUMAN reader. The purpose of begin/end is for
the machine. The combination of both is a definite plus.

I have great difficulty with indentation-only simply because it is traditional
that whitespace is meaningless, and it certainly seems far too easy to
innocently delete whitespace and entirely change the program.

Swapping begin/end for indentation simply makes the compiler less able to
catch bugs. Also, what about this:

    x := 4 ;
    for( i := 1 to 10 )do
    {
        x := 2*i;
printf( "This is a debug line\n" );
        y := 3 ;
    }

The purpose of left-justifying the printf is so I can see it as garbage and
remove it once the program works. My basic point is that indentation rules
add the WRONG kind of "stiffness" to programming input. It requires you
to be continually counting whitespace.

All in all, while I can see the point of making the programmer's job easier,
it seems foolish to make everything easy simply for the writer's benefit.
The point of a program is for others to read as well, so anything which
makes the reading job harder is a clear lose.

I don't believe that indentation alone is enough to adequately communicate
the author's intention.

> Guido van Rossum, CWI, Amsterdam <gu...@mcvax.uucp>

Chris Shaw cdshaw@alberta
University of Alberta
CatchPhrase: Bogus as HELL !

lib...@uiucdcsb.cs.uiuc.edu

Nov 3, 1986, 5:12:00 PM

I rather like the editor that comes with Macintosh Pascal, and its sibling,
Lightspeed Pascal from Think, Inc. It allows character-level editing
but reformats whenever you hit ";" return or start a more distant
modification. The reformatting is syntax-dependent but only changes the
following text. Since the display is kept consistent, a complete reparse
is not needed.

This periodic reformatting provides almost instant feedback as to whether your
brackets match up since you, the human, can quickly tell whether the
indentation is right. Even if the "end" is several pages after the
"begin", you can see that your procedure doesn't end right, or whatever.
The only problem is if you don't like how it indents. But you can turn
off indenting if you want to go it alone.


Dan LaLiberte
lib...@b.cs.uiuc.edu
lib...@uiuc.csnet
ihnp4!uiucdcs!liberte

D...@psuvma.bitnet

Nov 6, 1986, 9:30:48 PM

Yes, I quite agree with you about Think's editor. One thing that would
really be nice is if you could collapse constructs into just the control
section (sort of like an outline processor). There's nothing worse than
scanning through a lot of comments and unimportant (presently, anyway) code to
find what you're after.


dave

Daniel R. Levy

Nov 7, 1986, 1:07:30 AM
In article <1...@pembina.alberta.UUCP>, cds...@alberta.UUCP writes:
>In article <71...@boring.mcvax.UUCP> gu...@boring.uucp (Guido van Rossum) writes:
>>The difference between misplaced braces bugs and misplaced indentation
>>bugs is that misplaced (often: missing) braces are hard to find.
>
>No they aren't. Use an editor which finds matching braces. They are
>common and very reliable.
>Chris Shaw cdshaw@alberta

Ah, almost very reliable :-). Vi does this, but I had an awful time
once with a fairly long C function (in someone else's code) which just
would not compile on the machine I tried it on; kept getting strange
error messages about bogus identifiers. Well, finally I idly put my cursor
on the top curly brace of that function and hit '%'. Vi beeped at me! "HMMM,
no match!" I said; "so that's why the code wouldn't compile!".

Well, I went to the bottom of the code, put the cursor on the curly brace
there, and hit '%' again. It matched with the second curly brace in from
the top. I start moving in, matching pair against pair, for several iter-
ations. Then '%' put me right in the middle of a printf statement:

(void) printf("junkjunkjunk{trashtrashtrash\n");
                           ^
                           |-- see the "hidden" curly brace?

Oh, well... *klunk*
--
------------------------------- Disclaimer: The views contained herein are
| dan levy | yvel nad | my own and are not at all those of my em-
| an engihacker @ | ployer or the administrator of any computer
| at&t computer systems division | upon which I may hack.
| skokie, illinois |
-------------------------------- Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
go for it! allegra,ulysses,vax135}!ttrdc!levy

TepperL

Nov 7, 1986, 10:47:30 AM
In article <13...@ttrdc.UUCP>, le...@ttrdc.UUCP writes:
> Well, I went to the bottom of the code, put the cursor on the curly brace
> there, and hit '%' again. It matched with the second curly brace in from
> the top. I start moving in, matching pair against pair, for several iter-
> ations. Then '%' put me right in the middle of a printf statement:
>
> (void) printf("junkjunkjunk{trashtrashtrash\n");
>                            ^
>                            |-- see the "hidden" curly brace?
>
> Oh, well... *klunk*

There's an easy way around this. I've even used it before:

(void) printf("junkjunkjunk{trashtrashtrash\n");

/* This } matches the one in the printf above. Vi is happy. */
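
The comment trick keeps a character-counting matcher happy. The other way out
is a matcher that knows about literals. Here is a minimal sketch in Python
(match_braces is a hypothetical name; it assumes C-style string and character
literals with backslash escapes and /* */ comments, and it is not a claim
about how vi or any other editor works):

```python
def match_braces(source):
    """Map the index of each '{' to the index of its matching '}',
    ignoring braces inside string literals, character literals and
    /* ... */ comments."""
    pairs = {}
    stack = []
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        if ch in "\"'":
            # Skip the whole literal, honouring backslash escapes.
            quote = ch
            i += 1
            while i < n and source[i] != quote:
                i += 2 if source[i] == "\\" else 1
            i += 1   # step over the closing quote
            continue
        if source.startswith("/*", i):
            end = source.find("*/", i + 2)
            i = n if end == -1 else end + 2
            continue
        if ch == "{":
            stack.append(i)
        elif ch == "}" and stack:
            pairs[stack.pop()] = i
        i += 1
    return pairs
```

On the printf line above, the brace inside the string is skipped, so the
function's opening brace pairs with its real closing brace.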
