
Re: A Plain English Compiler

Message has been deleted

Torben Ægidius Mogensen

Feb 17, 2006, 4:56:29 PM
danr...@gmail.com writes:


> Well, I speak english, and I found this compiler at www.osmosian.com
> that actually lets me use regular english sentences to program. I
> didn't have to learn any cryptic syntax or weird combinations of
> puncuation. It's just plain english.

Which reminds me of the eighth law of computer programming:

Laws of Computer Programming

8. Make it possible for programmers to write programs in English,
and you will find that programmers cannot write in English.

-- SIGPLAN Notices, Vol 2 No 2


Torben Mogensen
[But wait, it gets better. -John]

Derek M. Jones

Feb 17, 2006, 4:57:29 PM
All,

> Since this is the 21st century, shouldn't we be able to talk to our
> computers in our own language?

For those of you with a slightly less ambitious aim, I am collecting
information and links for an English lint-like tool (i.e., a grammar
checker).

Suggestions welcome.

http://www.knosof.co.uk/cbook/grambugs.html

Louis Krupp

Feb 17, 2006, 4:56:53 PM
danr...@gmail.com wrote:
> Well, I speak english, and I found this compiler at www.osmosian.com
> that actually lets me use regular english sentences to program. I
> didn't have to learn any cryptic syntax or weird combinations of
> puncuation. It's just plain english.
>
> It compiles to native Windows Executables and was written entirely in
> itself (Plain English). It's pretty cool.

I'm sure it's cool, but are *you* sure you just "found this compiler
at www.osmosian.com"? The WHOIS information for osmosian.com lists
"Dan Rzeppa" as the administrative and technical contact.

Louis

HansO

Feb 17, 2006, 4:58:26 PM
>Well, I speak english, and I found this compiler at www.osmosian.com ...

Nice, very nice.

Lucky for me my anti-virus protection prevented the virus infected
sample to execute.

Hans, http://www.hansotten.com
[Oops. I don't run Windows, so I didn't look at it. -John]

Aaron Gray

Feb 19, 2006, 2:00:27 AM
> Nice, very nice.

Yes, very, very intriguing. I think I may well buy it just to see it own
compilers source code !

> Lucky for me my anti-virus protection prevented the virus infected
> sample to execute.

I am relatively sure it has no virus or trojan. It scans clean using McAfee.
It does however want internet access to download some pictures it uses.

Aaron

Oliver Wong

Feb 19, 2006, 2:00:13 AM
<danr...@gmail.com> wrote

> Since this is the 21st century, shouldn't we be able to talk to our
> computers in our own language?
>
> Well, I speak english, and I found this compiler at www.osmosian.com
> that actually lets me use regular english sentences to program. I
> didn't have to learn any cryptic syntax or weird combinations of
> puncuation. It's just plain english.
>
> It compiles to native Windows Executables and was written entirely in
> itself (Plain English). It's pretty cool.

It is pretty cool... for a toy application though... not so interesting
in terms of compiler theory.

The site claims to have a program which can draw anything you
describe. When I was connected to the internet, it worked like a
charm, drawing pictures of George Bush, for example. Unfortunately,
when I *wasn't* connected to the internet, it would only draw strings
of the form "Could not connect to
http://images.google.com/images?q=roses"

- Oliver

Keith Thompson

Feb 19, 2006, 2:02:41 AM

Interesting. My anti-virus program (Symantec AntiVirus with the
latest updates) didn't find anything bad; either I had a false
negative or you had a false positive.

I'm certainly not going to try it, though; the claim by the site's
registered owner that he "found" the site is enough to destroy any
possible trust I might have had.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Pascal Bourguignon

Feb 19, 2006, 1:59:49 AM
Louis Krupp <lkr...@pssw.nospam.com.invalid> writes:

Don't be mean, Louis, he may just be amnesic.

--
__Pascal Bourguignon__ http://www.informatimago.com/

"You can tell the Lisp programmers. They have pockets full of punch
cards with close parentheses on them." --> http://tinyurl.com/8ubpf

Aaron Gray

Feb 19, 2006, 2:02:22 AM
> Nice, very nice.

Yes, very, very intriguing.

I bought a copy of CAL-3036 and it's a "minor work of genius"; it's a great
"toy" (BTW I call Java and Java VM a toy).

CAL-3036 "understands" a limited subset of English, you have to know
how to say things and you have to know how to think within its
"mindset". You have to be a programmer to be able to program it as it
does not understand most normal English statements, having said that I
would have loved it as a teenager. It would be a great present for
kids who are learning to program.

As a "scripting" language within applications it would be really good
for power users.

The compiler is "Meta" and is written in itself, everything needed to
use and port it is included within it. All its tools are written in
its "English". It can manipulate and communicate with the binary level
and with Windows API.

> Lucky for me my anti-virus protection prevented the virus infected
> sample to execute.

I am 100% sure it has no virus or trojan. It scans clean using McAfee.
It does however want internet access to download some pictures it uses from
Google Images.

Aaron

Hans Aberg

Feb 19, 2006, 2:01:28 AM
danr...@gmail.com wrote:

> Since this is the 21st century, shouldn't we be able to talk to our
> computers in our own language?

> Well, I speak english, and I found this compiler at www.osmosian.com


> that actually lets me use regular english sentences to program. I
> didn't have to learn any cryptic syntax or weird combinations of
> puncuation. It's just plain english.

One should be aware that the historical development of mathematics
is going in the opposite direction, going from using natural language to
symbols, because it is more expressive to the human.

For example, in the beginning, one might have said "add the first unknown
quantity to the second unknown quantity", but after a while, symbols "x",
"+", and "y" are introduced, resulting in the more succinct, "x + y".

So, one can go ahead with mathematics, and simply write out in words
the mathematical language, and then design a grammar for that. For
example, "f(x)" would be "the function f applied to x", and so on. But
very simple formulas would quickly become unparsable by humans.

Also, when inventing new notation, one should consider this quote by Mark
Twain, about English spelling reform:

For example, in Year 1 that useless letter "c" would be dropped
to be replased either by "k" or "s", and likewise "x" would no longer
be part of the alphabet. The only kase in which "c" would be retained
would be the "ch" formation, which will be dealt with later. Year 2
might reform "w" spelling, so that "which" and "one" would take the
same konsonant, wile Year 3 might well abolish "y" replasing it with
"i" and Iear 4 might fiks the "g/j" anomali wonse and for all.
Jenerally, then, the improvement would kontinue iear bai iear
with Iear 5 doing awai with useless double konsonants, and Iears 6-12
or so modifaiing vowlz and the rimeining voist and unvoist konsonants.
Bai Iear 15 or sou, it wud fainali bi posibl tu meik ius ov thi
ridandant letez "c", "y" and "x" -- bai now jast a memori in the maindz
ov ould doderez -- tu riplais "ch", "sh", and "th" rispektivli.
Fainali, xen, aafte sam 20 iers ov orxogrefkl riform, wi wud
hev a lojikl, kohirnt speling in ius xrewawt xe Ingliy-spiking werld.
--Mark Twain

--
Hans Aberg

DEÁK JAHN, Gábor

Feb 19, 2006, 11:05:16 AM
On 19 Feb 2006 02:02:41 -0500, Keith Thompson wrote:

Keith,

> Interesting. My anti-virus program (Symantec AntiVirus with the
> latest updates) didn't find anything bad; either I had a false
> negative or you had a false positive.

Kaspersky gives a possible virus alert, too. That, coupled with the
overall appearance of the web site (a couple of graphics instead of
text, the FAQ, the "description" about Plain English Programming, the
fact that it offers absolutely no evidence of its claims, description,
information, whatever, just an opportunity to pay with your credit
card but it doesn't even seem to tell you how much), makes me want to
stay clear of it. Either a Trojan or a hoax, but it can't be
serious. Probably a rare situation when we must ask our esteemed
moderator to act... :-)

Bye,
Gábor

-------------------------------------------------------------------
DEÁK JAHN, Gábor -- Budapest, Hungary
E-mail: d...@tramontana.co.hu
[I must admit that the whole thing looks rather fishy, but it does say
the price is $100. -John]

toby

Feb 20, 2006, 7:45:48 PM
Aaron Gray wrote:
> > Nice, very nice.
>
> Yes, very, very intriguing. I think I may well buy it just to see it own
> compilers source code !

According to your post on comp.programming you bought it three days
before making this post.

>
> > Lucky for me my anti-virus protection prevented the virus infected
> > sample to execute.
>
> I am relatively sure it has no virus or trojan. It scans clean using McAfee.
> It does however want internet access to download some pictures it uses.

This grass looks just like astroturf.

he...@osmosian.com

Feb 20, 2006, 9:42:23 PM
[The guy who said he just stumbled across this thing last week says:]

>> It is pretty cool... for a toy application...

We don't understand what you mean by "toy". It compiles itself. We
wrote the editor in it. We wrote the page-layout program in it, then
wrote the documentation in the page-layout program. Then we
spell-checked it. We drew the pictures for our website in it. We wrote
our JAVASCRIPT and our CGI applications in it.

>> Not so interesting in terms of compiler theory.

Are you quite sure?

>> When I was connected to the internet, it worked like a
>> charm, drawing pictures of George Bush, for example. Unfortunately,
>> when I *wasn't* connected to the internet...

The "Cal Monet" draws like a person draws. It "remembers" or "sees" an
image, then renders an original "dab, dab, dab" work of art based on
the image. How does it "remember" and "see"? By looking in its memory,
which, in this case, is stored on various computers around the world.

Scott Wyatt

Feb 24, 2006, 6:00:30 PM
Languages like AppleScript / Xtalk (sp?) found in Runtime Revolution,
SuperCard (HyperCard), and throughout the Mac world are a complete pain
when you try to figure out what some routines will do.

The problem with "English" is meaning. The "Definitive Guide to
AppleScript" begins with a discussion of just how confusing English can
be as a programming language.

Sure, I want an "easy to read" programming language, but that doesn't
mean it should be like English. English is often mangled -- I admit as a
college English teacher.

A computer language should never be ambiguous. English thrives on its
ambiguity. Sorry, but programming needs rules and limits.

- Scott
[My understanding was that the hope for Cobol was not that it would be
any easier to write than other languages, but that it would be easier
for non-experts to read. One can debate how well they met that
goal. -John]

Gene Wirchenko

Mar 12, 2006, 1:54:14 PM
Scott Wyatt <tam...@comcast.net> wrote:

[snip]

>[My understanding was that the hope for Cobol was not that it would be
>any easier to write than other languages, but that it would be easier
>for non-experts to read. One can debate how well they met that
>goal. -John]

Reading is one thing. Understanding is another.

Sincerely,

Gene Wirchenko
[The hierarchically structured data was great. The verbose arithmetic less
so. -John]

gerry....@pobox.com

Oct 24, 2014, 8:54:51 PM
[Note that this is a followup to some messages from 2006. See
http://compilers.iecc.com/comparch/article/06-02-122 - John]

We've decided to take this thing to its logical conclusion. See
https://www.indiegogo.com/projects/the-hybrid-programming-language


leslie...@gmail.com

Oct 25, 2014, 11:14:50 AM
I wonder if the OP has a clue how compilers work? If he has, he might
want to try his hand at writing a BNF grammar for English.

And I wish him the best of luck.

[As I said back in 2006, it's not hard to make an easy to parse subset of English,
but that's COBOL. -John]

Derek M. Jones

Oct 27, 2014, 3:02:36 PM
John,

> I wonder if the OP has a clue how compilers work? If he has, he might
> want to try his hand at writing a BNF grammar for English.
>
> [As I said back in 2006, it's not hard to make an easy to parse subset of English,
> but that's COBOL. -John]

The closest I have seen so far is Attempto:
http://attempto.ifi.uzh.ch/site/

Martin Ward

Oct 27, 2014, 3:04:10 PM
John said (back in 2006) "You can certainly chop English down to
a small unambiguous subset, but then you've just reinvented Cobol"

As someone who has had to work with COBOL somewhat, it is interesting
to compare the two languages:

"Plain English" is even more verbose than COBOL, for example:

put the picture's gpbitmap's width minus 1 times the tpp into the
picture's box's right.

I assume that this means:

picture.box.right = (picture.gpbitmap.width - 1) * tpp

and not:

picture.box.right = picture.gpbitmap.width - (1 * tpp)

since the latter has a redundant multiplication.
This means that multiplication does not take precedence
over addition in "Plain English" (unlike in mathematics).
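
The two readings can be compared with a small Python sketch. It is not taken from the "Plain English" sources: the token list, both evaluator functions and the sample values for the width and tpp are made up for illustration, and strict left-to-right evaluation is only the behaviour inferred above.

```python
import operator

OPS = {"plus": operator.add, "minus": operator.sub, "times": operator.mul}

def eval_left_to_right(tokens):
    """Apply each operator as it is met, ignoring precedence."""
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        result = OPS[tokens[i]](result, tokens[i + 1])
    return result

def eval_with_precedence(tokens):
    """Do every 'times' first, then 'plus' and 'minus' left to right."""
    reduced = [tokens[0]]
    for i in range(1, len(tokens), 2):
        op, value = tokens[i], tokens[i + 1]
        if op == "times":
            reduced[-1] = reduced[-1] * value   # fold into the previous operand
        else:
            reduced += [op, value]
    return eval_left_to_right(reduced)

# "width minus 1 times the tpp", with made-up values width = 101, tpp = 15
tokens = [101, "minus", 1, "times", 15]
print(eval_left_to_right(tokens))    # (101 - 1) * 15 = 1500
print(eval_with_precedence(tokens))  # 101 - (1 * 15) = 86
```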

The language does not have formulae at all (unlike COBOL), although
these are planned to be added. When formulae are added, I assume that
they will follow the usual rules of arithmetic: in which case the
discrepancy between the "verbose" style and the "formula" style will
become even more confusing!

In COBOL, the above statement could be written:

COMPUTE RIGHT OF BOX OF PICTURE = (WIDTH OF GPBITMAP OF PICTURE - 1) * TPP

COBOL has a large number of machine-independent data structures:
strings, packed decimal, binary decimal and floating point.
For example, packed decimals can have up to 31 digits plus
a (fixed) decimal point and a sign. So you can process numbers
up to 9,999,999,999,999,999,999,999,999,999,999.
"Modern" COBOL (i.e. COBOL 2002 and later) includes bit manipulation,
recursion, user-defined functions and other features.
(Although, IBM has not yet implemented COBOL 2002 in their mainframe
compiler: having only had 12 years or so to work on it!).

"Plain English" only has 8 bit unsigned and 16 and 32 bit signed
binary numbers, 32 bit pointers, strings and records.
So "Plain English" cannot handle any number bigger than 2,147,483,647.
There are also "ratios" which appear to be represented as a signed
numerator and unsigned denominator. I don't know what you do if you
want to address more than 4GB of RAM, or, for that matter, work with
floating point numbers!
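
For reference, the limits being compared can be worked out in a few lines of Python. This is a sketch that assumes two's-complement representation for the signed types; the helper names are invented for this example.

```python
def signed_range(bits):
    """Smallest and largest value of a two's-complement integer."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def unsigned_max(bits):
    """Largest value of an unsigned integer of the given width."""
    return 2 ** bits - 1

print(unsigned_max(8))        # 255
print(signed_range(16))       # (-32768, 32767)
print(signed_range(32))       # (-2147483648, 2147483647)
print(10 ** 31 - 1)           # largest 31-digit packed decimal
print(unsigned_max(32) + 1)   # 4294967296 addresses, i.e. 4 GiB of bytes
```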

Strings appear to be represented using the Windows-1252 character set
-- no Unicode for you!

In old-fashioned (pre-2002) COBOL you had to use a whole byte to
implement a flag: or encode several flags into one data item with some
complicated coding to access and update them. In "Plain English" a
single flag takes 32 bits of storage: which seems a step backwards to
me!
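
The "several flags in one data item" technique looks like this in Python. This is a generic sketch, not COBOL or "Plain English" code, and the flag names are hypothetical.

```python
# Hypothetical flags, each occupying one bit of a single integer.
ACTIVE = 1 << 0
LOCKED = 1 << 1
DIRTY = 1 << 2

def set_flag(word, flag):
    return word | flag

def clear_flag(word, flag):
    return word & ~flag

def has_flag(word, flag):
    return (word & flag) != 0

word = 0
word = set_flag(word, ACTIVE)
word = set_flag(word, DIRTY)
word = clear_flag(word, ACTIVE)
print(has_flag(word, DIRTY), has_flag(word, ACTIVE))  # True False
```

Packed this way, a 32-bit word holds 32 independent flags rather than one.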

"Plain English" has numeric operations: plus, minus, times and divided
by (with optional remainder), but no power operation.

Finally: there are no nested IF statements and no nested loops.
How do you process a two-dimensional array without a nested loop?
Well, since there are no two-dimensional arrays in the language,
(or even one-dimensional arrays, for that matter), you don't have
to worry about that!

Routines can declare local data: but since each routine has to be
extremely short (due to the lack of nesting of conditionals
or loops), any time you need a nested conditional or loop
you have to split the code into separate routines: which means
that any shared data has to be global variables!

--
Martin

Dr Martin Ward STRL Principal Lecturer & Reader in Software Engineering
mar...@gkc.org.uk http://www.cse.dmu.ac.uk/~mward/ Erdos number: 4
G.K.Chesterton web site: http://www.cse.dmu.ac.uk/~mward/gkc/
Mirrors: http://www.gkc.org.uk and http://www.gkc.org.uk/gkc

Kartik Agaram

Oct 27, 2014, 3:56:35 PM
> John said (back in 2006) "You can certainly chop English down to
> a small unambiguous subset, but then you've just reinvented Cobol"
>
> As someone who has had to work with COBOL somewhat, it is interesting
> to compare the two languages:

Thanks for that detail! The discussion in 2006 seemed too dismissive of
this idea, and I worried that this go-around would subside similarly.
Comparing all attempts at english-like pidgins to cobol seems like a cheap
and overly broad shot. Hopefully now we can have a more substantive
discussion.

I unpacked the zipfile in the original kickstarter link (
http://www.osmosian.com/cal-3040.zip) and found the thousands of lines of
punctuation-free code startlingly alien yet attractive (see attached
screenshot). I had a similar reaction to the 100+ page manual written in
comic sans (http://www.osmosian.com/instructions.pdf). This isn't cobol at
all, and it's worth engaging with on its own terms.

I wish I could play with building the sources without resorting to an
untrusted .exe on a windows machine. How was the executable generated? Does
each new version bootstrap using the previous one?

Kartik
http://akkartik.name/about
[COBOL has an undeserved poor reputation largely among people who've
never used it. Yes, it's wordy, deliberately so, and its facilities
for control structure are weak by modern standards, but it invented
the structured data we now take for granted in languages like C and
C++ and has a wider range of datatypes than most of its successors.
For its time and its intended application it was a huge success.
-John]

Kaz Kylheku

Oct 28, 2014, 1:26:12 PM
On 2014-10-27, Kartik Agaram <a...@akkartik.com> wrote:
>> John said (back in 2006) "You can certainly chop English down to
>> a small unambiguous subset, but then you've just reinvented Cobol"
>>
>> As someone who has had to work with COBOL somewhat, it is interesting
>> to compare the two languages:
>
> Thanks for that detail! The discussion in 2006 seemed too dismissive of
> this idea, and I worried that this go-around would subside similarly.
> Comparing all attempts at english-like pidgins to cobol seems like a cheap
> and overly broad shot. Hopefully now we can have a more substantive
> discussion.

Compiling English to code isn't going to make programmers of those who
are unwilling, and especially unable, to learn a programming language.
They will only face additional struggles caused by ambiguities which
don't exist in ordinary programming languages.

Most people who are not endowed in the right way to be program
designers are also not endowed in the right way to handle precise
language, such as what is used when writing detailed requirements
specifications. In such language, many terms do not have their
ordinary meaning. The language might look like English, but it isn't.

A programming language which looks like English will suffer from the
same problem as the formal version of English used in very technical
documents. Those who don't absorb the special definitions of ordinary
terms, and the detailed semantics they entail, will not comprehend
properly what is going on, and will not wield that language
effectively to express solutions. Those who do understand will find
the language meddlesome.

People from the humanities will be dismayed at not being able to write
a sentence endowed with several simultaneous meanings, as they are used to.
"Gee, why doesn't the machine understand the metaphor which makes the poignant
irony come alive in what I just wrote ..."

[Man, there's a programming language I'd like to use.
Re COBOL, I am fairly sure that the point of making it look like stilted
English was not that they'd thought it'd make it easier to program, but
that it'd be possible for non-programmers, e.g. auditors, to look at the
code and figure out what it did. -John]

Ivan Godard

Oct 28, 2014, 1:27:23 PM
On 10/27/2014 12:25 PM, Kartik Agaram wrote:
>> John said (back in 2006) "You can certainly chop English down to
>> a small unambiguous subset, but then you've just reinvented Cobol"

> [COBOL has an undeserved poor reputation largely among people who've
> never used it. Yes, it's wordy, deliberately so, and its facilities
> for control structure are weak by modern standards, but it invented
> the structured data we now take for granted in languages like C and
> C++ and has a wider range of datatypes than most of its successors.
> For its time and its intended application it was a huge success.
> -John]

Agreed. COBOL is the Rodney Dangerfield of languages; it don't get no
respect. It's really organized around name/value pairs (think MOVE
CORRESPONDING), but with a much nicer syntax than raw XML.

Martin Ward

Oct 28, 2014, 1:29:23 PM
On 27/10/14 19:25, Kartik Agaram wrote:
> Comparing all attempts at english-like pidgins to cobol seems like a cheap
> and overly broad shot. Hopefully now we can have a more substantive
> discussion.

Gerry Rzeppa pointed me to this thread:

http://forums.anandtech.com/showthread.php?t=2358744

which raises these questions:

1. Is it easier to program when you don't have to translate your
natural-language thoughts into an alternate syntax?

2. Can natural languages be parsed in a relatively sloppy manner (as
humans apparently parse them) and still provide a stable enough
environment for productive programming?

3. Can low-level programs (like compilers) be conveniently and
efficiently written in high level languages (like English)?

Gerry claims that all these questions can be answered in the
affirmative: I would disagree. But if some of the above *can* be done,
the effort is much greater than with a traditional "Algol-like"
programming language with a limited set of keywords and unambiguous,
precisely defined syntax and semantics.

Let's take one of the most basic concepts in imperative programming:
the assignment. In "algol like" languages you have to learn
a new symbol ":=" and learn that the thing on the right of the symbol
is assigned to the thing on the left, for example:

foo := bar

copies the value of variable "bar" into variable "foo".

Gerry claims that "you don't have to translate your natural-language
thoughts into an alternate syntax". What are *your* "natural-language
thoughts" for the assignment operation? Mine include:

MOVE bar TO foo (perhaps I have been reading too much COBOL!)
LET foo = bar (my BASIC background is showing...)
copy bar to foo
set foo to be equal to bar
make foo equal bar
assign bar to foo
assign foo from bar
get bar and put it into foo
put foo into bar (although this suggests that bar is a set)

As far as I can tell from the manual, none of these will work.
Instead you need to write:

put the foo into the bar

(The little words "the", "a", "an" which can often be omitted
in natural English appear to be absolutely required in this language:
"the" indicates a global variable, while "a" or "an" indicates
a local variable).

This reminds me of the old fashioned text-based adventure games
where you knew what you wanted to do: but getting the parser
to accept your command turned into a game of "guess the verb":
http://tvtropes.org/pmwiki/pmwiki.php/Main/GuessTheVerb

Another example: you can "divide the foo by the bar"
but cannot "divide the bar into the foo".

So: you *do* need to translate your natural language thoughts
into an alternative syntax: the minimalist subset of English
accepted by the parser.
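
The "guess the verb" effect is easy to reproduce with a toy recogniser. The Python sketch below is not the "Plain English" parser: it assumes, for illustration only, that "put the X into the Y" (with a mandatory article) is the single accepted form, and `parse_assignment` is a made-up helper.

```python
import re

# Assumed for illustration: one accepted sentence shape, articles required.
PUT = re.compile(
    r"put (?:the|a|an) (\w+) into (?:the|a|an) (\w+)\.?", re.IGNORECASE
)

def parse_assignment(sentence):
    """Return (source, target) for an accepted sentence, else None."""
    match = PUT.fullmatch(sentence.strip())
    return match.groups() if match else None

for attempt in ["put the foo into the bar",
                "copy bar to foo",
                "assign bar to foo",
                "put foo into bar"]:
    print(f"{attempt!r:28} -> {parse_assignment(attempt)}")
```

Only the first attempt is accepted; the other three come back as None even though a human reader would understand them.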

Of course, since you have the source code, you can extend
the parser by adding different ways of saying the same thing.

Which leads us to the next problem:

Gerry writes that there are (at least) three ways of calling
the same routine:

Draw the text with the Osmosian font.
Draw the text using the Osmosian font.
Given the Osmosian font, draw the text.

A later commentator asks what happens if the user says:

Write the words with Osmosian.
Type the text with Osmosian.
Print the page in Osmosian.

If you are the only programmer, writing programs for yourself,
you would probably pick one of the options and stick to it.
(The provided source code appears to use a fairly consistent style).
However, if you are working on a team and have to read other
people's code (as most programmers are most of the time),
then you will need to be able to recognise *all* the different
ways of writing a standard statement or call: in this case,
adding more variations just adds to the amount of "legalese"
you have to remember.

You will also have to remember not just the syntax but the semantics
of each statement: this semantics may be subtly different from
the semantics of natural language. For example, in English:
"A plus B times C" means, as any schoolchild will tell you,
multiply B and C together and add the result to A.
But in "Plain English" it is different:

put an A plus a B times a C into the total

means add A to B and then multiply the result by C.

In fact, the manual is ambiguous on this point:
page 74 says that 'Say I find the word PLUS between a snoz and a froz.
I look for a routine that tells me how "to add a froz to a snoz",
and then I use that routine to reduce the expression'.
A little further on the manual adds: 'To process "a snoz TIMES a froz",
I use "to multiply a snoz by a froz"'. In the expression
"the A plus the B times the C" both the above conditions apply,
and the manual does not say whether to do "add an A to a B" first,
or to do "multiply a B by a C" first.

The problem is that English is ambiguous and the ambiguity
in the "Plain English" source code is not resolved by
the ambiguous English sentences in the manual!

The whole "Plain English" language is actually quite limited:
containing integer, string and record data types (no arrays,
floating point, hash tables, regular expressions, lists etc.)
together with simple IF statements, assignments, loops, and simple
functions and subroutines with parameters. This makes it slightly
more powerful than, say, the original Dartmouth BASIC or PL/0,
and rather *less* powerful than BBC BASIC or PASCAL.
Such a small language could probably be learned in its entirety
in a few days: but with "Plain English" we also have to memorise
all the different ways to write the same statement,
and all the subtle semantic differences between it and English.

A real test of the utility of "Plain English" would be to compare
the effort of implementing the "Plain English" compiler in
"Plain English" with the effort of implementing a compiler for,
say, a small extension to PL/0 or similar in itself: since PL/0
is roughly equivalent in computational power with the "Plain English"
language. For example, the chapter "Oberon0: A Case Study" in the book
"Object-Oriented Programming" by Prof. Dr. Hanspeter Mössenböck claims
that Oberon0 was implemented under Oberon in 1,300 lines of code.
An implementation of Oberon0 in Simpl ("Implementing Oberon0 Language
with Simpl DSL Tool" by Margus Freudenthal) required 1,987 lines.
The "Plain English" compiler and "noodle" require over 15,000 lines
of code (686kB of source code).

In this context the "LDTA 2011 Tool Challenge" looks interesting:
http://ldta.info/2011/tool.html

Stefan Monnier

Oct 28, 2014, 10:18:51 PM
> 3. Can low-level programs (like compilers) be conveniently and

I don't think compilers are low-level. They may generate output which
is low-level code, but the compiler itself is a rather pure function
which can only benefit from being written in a high-level language.


Stefan

Stefan Monnier

Oct 28, 2014, 10:20:14 PM
> Re COBOL, I am fairly sure that the point of making it look like stilted
> English was not that they'd thought it'd make it easier to program, but
> that it'd be possible for non-programmers, e.g. auditors, to look at the
> code and figure out what it did. -John]

I think the better solution to this problem, nowadays, would be to
transliterate the source code into plain-english.

This has been done fairly successfully for formal proof in proof
assistants, so I assume it shouldn't be too hard to do for your average
programming language, tho maybe it might be useful/necessary to tweak
the source language to make the transliteration more readable.
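
As a rough illustration of what such a transliteration could look like, here is a Python sketch built on the standard `ast` module. It is an assumption-laden toy, not an existing tool: it handles only names, constants, the four arithmetic operators and simple assignment, and the "put ... into ..." wording is chosen arbitrarily.

```python
import ast

OPS = {ast.Add: "plus", ast.Sub: "minus",
       ast.Mult: "times", ast.Div: "divided by"}

def describe(node):
    """Render a small subset of Python syntax as an English phrase."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    if isinstance(node, ast.BinOp):
        return f"{describe(node.left)} {OPS[type(node.op)]} {describe(node.right)}"
    if isinstance(node, ast.Assign):
        return f"put {describe(node.value)} into {describe(node.targets[0])}"
    raise NotImplementedError(type(node).__name__)

def transliterate(source):
    """Return one English sentence per top-level statement."""
    return [describe(stmt) for stmt in ast.parse(source).body]

print(transliterate("total = price * quantity\ntotal = total + 5"))
```

The sketch drops parentheses, so nested expressions read ambiguously; a usable transliterator would need some wording for grouping, which is one way the source language might have to be tweaked.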


Stefan
[I wouldn't disagree, but do remember that COBOL was designed over 50 years ago. -John]

Hans-Peter Diettrich

Oct 28, 2014, 10:24:25 PM
> [Man, there's a programming language I'd like to use.
> Re COBOL, I am fairly sure that the point of making it look like stilted
> English was not that they'd thought it'd make it easier to program, but
> that it'd be possible for non-programmers, e.g. auditors, to look at the
> code and figure out what it did. -John]

Right, and this leads me to the question, *who* is the intended writer
of a program in that English language?

Is it the customer, needing a program? Have a look at (such) program
specifications, and you'll find most of them unusable until rewritten by
an IT expert.

So could it be the IT expert, in an attempt to make the program
specifications readable by the customer? This was the COBOL idea, but I
doubt that anybody but the coder ever had a *closer* look at such code.
More probably BASIC was the language to bridge the gap between newbies
and their home computers, with the famous Rocky Mountain Basic (HP) and
Visual Basic (MS) for more powerful machines and ambitious programs.

I also doubt that it could be some mathematician, who has little use for
English in writing down his formulas; he were better off with a FORTRAN
descendant, enriched by mathematical glyphs (operators...). OTOH
algorithms are frequently given in verbose form, so that many people
would be happy with an according compiler. SQL is another example of a
widely used language close to spoken English.

Last but not least I'm not sure about the many non-native English speakers,
who have problems in finding the right English terms for what they
want to do. Here APL was an attempt to eliminate language barriers, but
with little success - probably due to expensive equipment (keyboards and
screens) in the early age of computing. Nowadays Chinese machines were
much better equipped for a programming language based on
language-independent glyphs.

All in all I'd think that it's a good idea when *everybody* has to learn
a universal (programming...) language, with texts written in that
language being understandable to everybody else, and in the context of
this thread *including computers* :-)

Apart from Esperanto and Volapük the Indonesian language IMO is a good
example for a simple artificial language, that already allows the native
speakers of many hundred different languages to communicate in everyday
life. As I don't speak Indonesian myself, I'm not sure about its
possible use as a programming language, but as a non-native English
speaker I'd assume that it is less ambiguous than other spoken
languages, and easier to spell than English :-]

DoDi
[APL's character set wasn't a big problem -- IBM terminals had APL
type elements, and in the mid 1970s we had APL fonts for our bitmap
terminals. The problem was mostly that it was hard to teach people
how to write and read it (you needed to recognize a lot of idioms), if
you wanted a data structure other than rectangular arrays you were out
of luck, and it never had very good interfaces to external files.
-John]

Gerry Rzeppa

Oct 30, 2014, 3:50:01 PM
Kartik says: How was the executable generated? Does
each new version bootstrap using the previous one?

Reply:

Yes, each new version was (and is) created with the previous version.
Page 9 of the instructions describes the process for re-compiling the whole
shebang in itself.

Gerry Rzeppa

Oct 30, 2014, 3:53:16 PM
Hans-Peter Diettrich asks, *who* is the intended writer
of a program in that English language?

Reply:

(1) Experienced programmers like my son and I who like to program in
English, and who had a lot of fun writing an integrated development
environment -- including an iconoclastic interface, a simplified file
manager, an elegant text editor, a hexadecimal dumper, a
native-code-generating compiler/linker, and a wysiwyg page layout program
for documentation -- in just 25,000 lines of the stuff.

(2) Kids who know nothing about programming. Here, for example, is a typical
first program we have our padawans type in, a little at a time, when we're
teaching them about the subject:

To run:
Start this baby up.
Clear the whole freaking screen.
Draw some circles with the red pen.
Wait for the escape key.
Shut the whole thing down.

To draw some circles with a color:
Pick a spot anywhere on the screen.
Draw a circle on the spot with the color.
Refresh the screen.
Add 1 to a count.
If the count is greater than 12, break.
Repeat.

At intervals, they use the "Run" command -- which is conveniently placed
under the "R" menu -- to see the results. Note that since syntax isn't a
major issue, we can concentrate on concepts (sequence, loops, variables,
parameters, etc) -- while improving their typing skills and reinforcing what
they're learning in their grammar classes!

(3) All kinds of students in between. Remember, our entire system is written
in Plain English. So the enterprising student who begins with a program like
the one above can eventually learn how to write all kinds of programs --
including desktop interfaces, file managers, text editors, dumpers,
compilers, page layout programs, and many more -- without ever leaving the
simple and familiar Plain English environment.

(4) To be determined. We're not sure, at this stage of the game, what other
kinds of people will find programming in their native tongue (with snippets
of formulas and graphics inserted in appropriate spots) both productive and
enjoyable. It depends, we think, primarily on the different kinds of Plain
English libraries we can provide (or get others to provide) to support
various application areas. Note that it's in the libraries -- not the
compiler itself -- that the nouns, verbs, adjectives, and various modes of
expression are defined.
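
As a rough illustration for readers who think in conventional languages, the kids' program in (2) might be transliterated into Python as below. This is only a sketch: the screen is replaced by a plain list, and the names `draw_some_circles`, `width`, and `height` are invented here and are not part of Plain English or its libraries.

```python
import random


def draw_some_circles(color, width=800, height=600):
    """Stand-in for "To draw some circles with a color".

    Returns the list of (spot, color) pairs it would have drawn.
    The screen size defaults are assumptions for this sketch.
    """
    circles = []
    count = 0
    while True:
        # "Pick a spot anywhere on the screen."
        spot = (random.randrange(width), random.randrange(height))
        # "Draw a circle on the spot with the color." (recorded, not drawn)
        circles.append((spot, color))
        # "Add 1 to a count."
        count += 1
        # "If the count is greater than 12, break."
        if count > 12:
            break
        # "Repeat." is the enclosing while loop.
    return circles


def run():
    """Stand-in for "To run": sequence of steps, one call with a parameter."""
    circles = draw_some_circles("red")
    print(f"drew {len(circles)} red circles")
```

Read literally, the loop records a circle before it tests the count, so this version produces 13 circles rather than 12.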

Gerry Rzeppa

Oct 30, 2014, 5:04:56 PM
lesliedellow says, I wonder if the OP has a clue how compilers work?

Reply: We obviously know how compilers work having written a reliable,
efficient, and fully functioning one -- for Plain English, coded entirely in
Plain English.

lesliedellow says, If he has, he might want to try his hand at writing a BNF
grammar for English.

Reply: English is a natural language and, as such, cannot be entirely
reduced to a BNF; it must be compiled via a combination of heuristic
techniques. Our particular approach takes three swipes at the source code
using a mixture of recursive-descent parsing and a small collection of ad
hoc super- and sub-parsers. (Write me directly for a copy of the EBNF we use
for the recursive-descent part; it's less than 100 lines.) Note also that
our compiler is designed in such a way that the bulk of the vocabulary and
grammar -- nouns, verbs, adjectives, modes of expression, etc -- is not
hard-coded in the compiler, but is automatically deduced when the
programmer, in his own source code, defines new types, variables, and
routines; the compiler focuses on that tiny and well-defined portion of
English that serves as glue for the rest: articles, prepositions, and
conjunctions.
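
To give a feel for the idea of letting routine headers supply the vocabulary while the compiler itself knows only the "glue" words, here is a toy sketch in Python. It is not the Osmosian compiler's algorithm, which is not shown in this thread; the word lists, the `skeleton` function, and the `Vocabulary` class are all assumptions made up for illustration.

```python
import re

# The only words this toy "compiler" knows in advance (assumed lists).
ARTICLES = {"a", "an", "the", "some", "another"}
GLUE = {"to", "from", "into", "by", "with", "on", "in", "of", "for", "and", "at"}


def skeleton(sentence):
    """Reduce a sentence to its shape plus the noun phrases found in it.

    Every article starts a noun phrase, which runs until the next glue
    word or article and is replaced by "*" in the shape.
    """
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    shape, phrases = [], []
    i = 0
    while i < len(words):
        if words[i] in ARTICLES:
            i += 1
            start = i
            while i < len(words) and words[i] not in GLUE and words[i] not in ARTICLES:
                i += 1
            phrases.append(" ".join(words[start:i]))
            shape.append("*")
        else:
            shape.append(words[i])
            i += 1
    return tuple(shape), phrases


class Vocabulary:
    """Routines defined by the programmer, indexed by sentence shape."""

    def __init__(self):
        self.routines = {}

    def define(self, header, action):
        # A header is assumed to look like "To divide a number by another number:"
        body = re.sub(r"^\s*to\s+", "", header.strip(), flags=re.I).rstrip(":")
        shape, params = skeleton(body)
        self.routines[shape] = (params, action)

    def call(self, sentence):
        shape, args = skeleton(sentence)
        if shape not in self.routines:
            raise LookupError(f"no routine matches: {sentence!r}")
        _, action = self.routines[shape]
        return action(*args)


# Two headers teach the toy two ways of saying the same thing.
env = {"total": 12, "count": 4}
vocab = Vocabulary()
vocab.define("To divide a number by another number:", lambda a, b: env[a] // env[b])
vocab.define("To divide a number into another number:", lambda a, b: env[b] // env[a])

by_result = vocab.call("Divide the total by the count.")      # 3
into_result = vocab.call("Divide the count into the total.")  # 3
```

The verb "divide" appears nowhere in the matcher itself; it becomes known only when a header containing it is defined, which is the point the sketch tries to make.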

See here for our next proposed steps and links to the current documentation,
source code, and executable prototype:

https://www.indiegogo.com/projects/the-hybrid-programming-language/x/8950932

Gerry Rzeppa

Oct 30, 2014, 9:01:23 PM
Martin Ward says, "Plain English" is even more verbose than COBOL...

Dan Rzeppa replies, Terseness wasn't the goal, writing in English was.

Martin Ward says, This means that multiplication does not take precedence
over addition in "Plain English" (unlike in mathematics).

Gerry Rzeppa replies: Plain English strives to be natural. Ask a few people
on the street what "five plus five times three" is: most (if not all) will
say, "Thirty," not, "twenty" as you suggest. Really. Try it.
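
The two readings of "five plus five times three" can be made concrete with a small Python sketch (Python only for illustration; the function names are hypothetical). `left_to_right` applies each operation as it is heard, while `with_precedence` follows the usual mathematical convention of multiplying first.

```python
import operator

OPS = {"plus": operator.add, "minus": operator.sub, "times": operator.mul}


def left_to_right(phrase):
    """Evaluate 'N op N op N ...' strictly in the order spoken."""
    tokens = phrase.split()
    result = int(tokens[0])
    for word, operand in zip(tokens[1::2], tokens[2::2]):
        result = OPS[word](result, int(operand))
    return result


def with_precedence(phrase):
    """Evaluate the same phrase with 'times' binding tighter than plus/minus."""
    tokens = phrase.split()
    terms = [int(tokens[0])]
    for word, operand in zip(tokens[1::2], tokens[2::2]):
        n = int(operand)
        if word == "times":
            terms[-1] *= n      # fold into the current term
        elif word == "plus":
            terms.append(n)     # start a new term
        else:                   # "minus"
            terms.append(-n)
    return sum(terms)


print(left_to_right("5 plus 5 times 3"))    # 30
print(with_precedence("5 plus 5 times 3"))  # 20
```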

Martin Ward says, "Plain English" only has 8 bit unsigned and 16 and 32 bit
signed binary numbers, 32 bit pointers, strings and records. So "Plain
English" cannot handle any number bigger than 2,147,483,648.

Dan Rzeppa replies: At the time we wrote the language there weren't a lot of
commodity 64-bit machines. Seeing how we've managed to implement 8-bit,
16-bit and 32-bit numbers, I'm pretty sure we could implement 64-bit ones
too.
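
For reference, the limits of the integer widths under discussion can be computed as follows. This Python sketch assumes two's-complement storage for the signed types, which the thread does not state; under that assumption the largest 32-bit signed value is 2,147,483,647, one less than the figure quoted above.

```python
def unsigned_range(bits):
    """Smallest and largest value of an unsigned integer of the given width."""
    return 0, 2 ** bits - 1


def signed_range(bits):
    """Smallest and largest value of a two's-complement signed integer."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


print(unsigned_range(8))   # (0, 255)
print(signed_range(16))    # (-32768, 32767)
print(signed_range(32))    # (-2147483648, 2147483647)
print(signed_range(64))    # (-9223372036854775808, 9223372036854775807)
```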

Martin Ward says, I don't know what you do if you
want to address more than 4GB of RAM...

Dan Rzeppa replies, We'd implement the 64-bit numbers and away we'd go.

Martin Ward says, ...or, for that matter, work with
floating point numbers!

Dan Rzeppa replies, Don't need floating point numbers. The ratios work well
and, unlike floating-point numbers, can accurately represent one-third.

Gerry Rzeppa adds, And monetary amounts. We're not saying there aren't
particular applications that are well-served by floating point numbers.
We're simply saying that for our immediate purposes, they weren't (and
aren't) necessary. If we're trying to get the machine to understand English
at the level of, say, a three- to five-year-old, we hardly have to think
about floating point numbers.
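
The point about ratios can be checked in any language with a rational type. The sketch below uses Python's standard `fractions` module as a stand-in; it says nothing about how Plain English stores its ratios internally.

```python
from fractions import Fraction

# One-third as a ratio is exact: three of them make exactly one.
third = Fraction(1, 3)
print(third * 3 == 1)                      # True

# The binary floating-point value nearest to 1/3 is not one-third.
print(Fraction(1 / 3) == Fraction(1, 3))   # False

# Monetary amounts: ten cents plus twenty cents.
print(0.1 + 0.2 == 0.3)                                        # False
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))    # True
```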

Martin Ward says, Strings appear to be represented using the Windows-1252
character set -- no Unicode for you!

Dan Rzeppa says, We were writing Plain English (not Plain Mandarin) and
didn't need any other characters than the standard ASCII set. Just like we
don't need Unicode to discuss these matters with you.

Gerry Rzeppa adds, And that's hardly the kind of complaint one should raise
regarding a proof-of-concept prototype that is focussed on entirely
different issues!

Martin Ward says, In old-fashioned (pre-2002) COBOL you had to use a whole
byte to implement a flag: or encode several flags into one data item with
some complicated coding to access and update them. In "Plain English" a
single flag takes 32 bits of storage: which seems a step backwards to me!

Gerry Rzeppa says, It appears you're missing the point again, Martin. In a
proof-of-concept prototype, one strives for simplicity -- and especially so
in matters that are tangential to the line of research being pursued.
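
The trade-off in this exchange is between one storage unit per flag (simple) and several flags packed into one unit (compact, but needing the "complicated coding" mentioned above). A minimal Python sketch of the packed form follows; the helper names and the flag positions `DIRTY`, `VISIBLE`, and `SELECTED` are hypothetical and come from neither COBOL nor Plain English.

```python
def set_flag(word, bit):
    """Return word with the given bit switched on."""
    return word | (1 << bit)


def clear_flag(word, bit):
    """Return word with the given bit switched off."""
    return word & ~(1 << bit)


def flag_is_set(word, bit):
    """True if the given bit of word is on."""
    return bool((word >> bit) & 1)


# Hypothetical flag positions sharing one integer.
DIRTY, VISIBLE, SELECTED = 0, 1, 2

status = 0
status = set_flag(status, DIRTY)
status = set_flag(status, SELECTED)
status = clear_flag(status, DIRTY)

print(flag_is_set(status, SELECTED))  # True
print(flag_is_set(status, DIRTY))     # False
print(status)                         # 4
```

With one flag per 32-bit unit, each of these helpers collapses to a plain assignment or comparison, which is the simplicity being argued for.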

Martin Ward says, "Plain English" has numeric operations: plus, minus, times
and divided by (with optional remainder), but no power operation.

Gerry Rzeppa says, Actually, there is a standard power operation in Plain
English. But it didn't get much use within the scope of our investigations,
so it doesn't appear in the documentation.

Martin Ward says, Finally: there are no nested IF statements and no nested
loops.

Dan Rzeppa replies, Right on. Pretty amazing that one can conveniently and
efficiently write an entire development system -- desktop, file browser,
text editor, dumper, compiler, and page-layout facility -- without them,
yes?

Martin Ward asks, How do you process a two-dimensional array without a
nested loop?

Dan Rzeppa replies, Obviously, with two non-nested loops.

Gerry Rzeppa adds, There are many hierarchical structures in the bowels of
our Plain English development system that are several levels deep and are
processed in this way. Our page editor, for example, processes "shapes"
(text blocks, bitmapped-graphic images, vector-graphic drawings, etc) on
"pages" that are part of "reams" (collections of related pages). See below
for further remarks.
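
"Two non-nested loops" can be pictured as one routine per level, each containing a single loop and handing the inner work to the next routine. The Python sketch below is an assumption about the general shape, not code from the Plain English system; the data is passed as parameters rather than shared globally.

```python
def row_total(row):
    """Inner routine: one loop over the cells of a single row."""
    total = 0
    for cell in row:
        total += cell
    return total


def grid_total(grid):
    """Outer routine: one loop over the rows; no loop is nested inside it."""
    total = 0
    for row in grid:
        total += row_total(row)
    return total


grid = [[1, 2, 3],
        [4, 5, 6]]
print(grid_total(grid))  # 21
```

A deeper hierarchy such as reams, pages, and shapes would follow the same pattern with one more routine per level.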

Martin Ward says, Well, since there are no two-dimensional arrays in the
language (or even one-dimensional arrays, for that matter), you don't have
to worry about that!

Gerry Rzeppa replies, Arrays (as a data structure native to Plain English)
are under investigation; they're not common in everyday speech which was our
main area of investigation. Our compiler does, however, employ a tentative
implementation of arrays in the "index" facility that is used by the
compiler for symbol tables, the page layout facility for spell-checking,
etc; but that facility is implemented entirely in a library, and is not
native to the compiler itself.

Martin Ward says, Routines can declare local data: but since each routine
has to be extremely short (due to the lack of nesting of conditionals or
loops), any time you need a nested conditional or loop you have to split the
code into separate routines: which means that any shared data has to be
global variables!

Dan Rzeppa replies, In such cases, shared data is typically passed as
parameters, not as global variables.

Gerry Rzeppa adds, And we think it's desirable to keep routines short. While
it may seem a nuisance to be "forced" to split one's routines up this way,
our consistent experience has been that in many (if not most) cases, the
additional routines have later come in handy and -- since the necessary
variables were passed as parameters -- could be called directly.

Gerry Rzeppa

Oct 30, 2014, 9:03:46 PM
Martin Ward says, Gerry claims that all these questions can be answered in
the affirmative: I would disagree.

Gerry replies, You are, of course, free to disagree. But keep in mind that
our understanding of the matter is first-hand, taken from actual experience.
Yours is merely theoretical -- and, based on your remarks below, appears to
be founded on a number of serious misunderstandings of both our prototype
and our project goals. For now, let me just say that (1) we know that it is
easier to program when we don't have to translate our natural-language
thoughts into an alternate syntax because we've actually done it both ways,
and find ourselves greatly preferring the natural language environment. (2)
We know that natural languages can be parsed in a relatively sloppy manner
and still provide a stable enough environment for productive programming
because, again, we've done it. And (3) we know that low-level programs (like
compilers) can be conveniently and efficiently written in high level
languages (like English) because, yet again, we've done it.

Martin Ward says, But if some of the above *can* be done, the effort is much
greater than with a traditional "Algol-like" programming language with a
limited set of keywords and unambiguous, precisely defined syntax and
semantics.

Gerry replies, That first clause should read, "has been done." And we know
it's easier to do it the Plain English way because we've actually done it
both ways. You're arguing with actual experience when you talk to us,
Martin, and actual experience is very hard to refute.

Martin Ward says, Let's take one of the most basic concepts in imperative
programming: the assignment... MOVE bar TO foo

Gerry replies, I think a little more self-examination is required here.
"MOVE bar TO foo" may be "natural" to you, but that's only because you've
studied programming languages extensively; it's not a natural way of
speaking English. Do you really say things like "Please pass salt" and "Put
car in garage" when you're talking to other humans?

Martin Ward says,

copy bar to foo
set foo to be equal to bar
make foo equal bar
assign bar to foo
assign foo from bar
get bar and put it into foo
put foo into bar (although this suggests that bar is a set)

As far as I can tell from the manual, none of these will work.

Gerry replies, This is one of the places where you seriously misunderstand
our compiler. Plain English is user-extensible; the programmer can easily
teach the compiler to understand his particular dialect of English and his
preferred modes of expression (within limits, of course -- it's only a
prototype) simply by defining new routines.

Martin Ward says, The little words "the", "a", "an" which can often be
omitted in natural English appear to be absolutely required in this
language: "the" indicates a global variable, while "a" or "an" indicates a
local variable.

Gerry replies, Our studies have led us to believe that "the little words"
are what people actually and intuitively focus on when parsing language in
their heads; so yes, articles, prepositions, and conjunctions play a
prominent role in our compiler. Try removing all the articles, prepositions,
and conjunctions from this post and see if it appears English-like to you.
See if, in that form, it is more or less natural, more or less intelligible.

Martin Ward says, This reminds me of the old fashioned text-based adventure
games where you knew what you wanted to do: but getting the parser to accept
your command turned into a game of "guess the verb":

Gerry replies, As above, your ignorance of our prototype and project is
showing. Almost all verbs in Plain English are defined by the programmer,
not our system; it's the articles, prepositions, and conjunctions that we
focus on. And every English-speaker knows what those "little words" are and
where those "little words" go in a sentence.

Martin Ward says, Another example: you can "divide the foo by the bar" but
cannot "divide the bar into the foo".

Gerry replies, Plain English supports both ways of expressing that thought.
And many others.

Martin Ward says, So: you *do* need to translate your natural language
thoughts into an alternative syntax: the minimalist subset of English
accepted by the parser.

Gerry replies, Again, no. The "subset of English accepted by the parser" is
user-defined.

Martin Ward says, Of course, since you have the source code, you can extend
the parser by adding different ways of saying the same thing.

Gerry replies, No, the programmer extends the language by creating new
routines in his own source code, not by changing the compiler.

Martin Ward says, Which leads us to the next problem: ...if you are working
on a team and have to read other people's code then you will need to be able
to recognise *all* the different ways of writing a standard statement or
call: in this case, adding more variations just adds to the amount of
"legalese" you have to remember.

Gerry replies, Case in point: (1) You're wrong again, Martin. (2) Once
again, you're wrong. (3) Wrong, Martin, once again. Did you have to "learn"
anything to understand those three sentences? No! Team members understand
each other's code because the thoughts are expressed in the most natural and
common ways. You don't have to understand Plain English to understand a
routine written in Plain English -- you only have to understand English.

Martin Ward says, The semantics may be subtly different from the semantics
of natural language. For example, in English: "A plus B times C" means, as
any schoolchild will tell you, multiply B and C together and add the result
to A.

Gerry replies, Again, your education and your culture have biased you. Here
in America -- among average English speakers on the street, young and old,
"A plus B times C" means "start with A, add B, then multiply by C". Try it:
ask average people on the street what "five plus five times three" is;
they'll say "Thirty" more often than not. It is only those trained in
mathematics that see it otherwise. And here again your ignorance of our
prototype and our goals is hampering your evaluation. We fully intend to
include standard mathematical notation in the language; the purpose of the
prototype was to see how far we could get without it. For such future plans,
see:

https://www.indiegogo.com/projects/the-hybrid-programming-language/x/8950932

Martin Ward says, The problem is that English is ambiguous and the ambiguity
in the "Plain English" source code is not resolved by the ambiguous English
sentences in the manual!

Gerry replies, This is a very common misconception. English itself is not
ambiguous; it is a very powerful tool that can be used both unambiguously
and ambiguously; it is the user of English, not the language itself, that is
either ambiguous or not.

Martin Ward says, The whole "Plain English" language is actually quite
limited...

Gerry replies, Yes and no. It's a proof-of-concept. It's a prototype. It's a
work in progress. So yes. But it is also user-extensible, so no.

Martin Ward says, ...with "Plain English" we also have to memorise all the
different ways to write the same statement, and all the subtle semantic
differences between it and English.

Gerry replies, I really don't see how you could miss the mark by a wider
margin. The whole point of Plain English is to make memorization
unnecessary! You simply code what you're thinking and it works. And if it
doesn't, you write the routines necessary to make it understand, so the next
time it will understand more. Like all natural languages, Plain English is a
living, growing thing; and, like all natural languages, it develops into
different dialects in the context of different users and user communities --
dialects that are, nevertheless, understandable by all who speak (or write
or code in) the mother tongue.

Martin Ward says, A real test of the utility of "Plain English" would be to
compare...

Gerry replies, The "real test of the utility of Plain English" has been
completed. Two programmers, experienced in a wide variety of languages,
wrote a complete development system -- including interface, file manager,
text editor, hex dumper, native-code-generating compiler/linker, and
page-layout facility for documentation -- conveniently and efficiently, in
Plain English. In just six months. And they found it more natural, more
effective, and simply more fun to do it in Plain English than in any other
way.

Gerry adds, It surprises me, Martin, that a fan of Chesterton wouldn't be
attracted to a project such as ours. Chesterton was the quintessential
iconoclast, and he delighted in helping people realize how blinded they
could become by accepting the status quo without question. So he would "turn
things on their head" and "look at things backwards" to make his point.
Which is exactly what we've done: we've developed a system that challenges
pretty much every preconception about programming a modern practitioner
might have: Are installation programs necessary? Can a high-level language
like English be used to conveniently write low-level programs like
compilers? Can a workable interface be designed without icons, scroll bars,
radio buttons, and a wide variety of other widgets? Can complex programs be
clearly and concisely written without nested ifs and loops? Can a
polymorphic drawing program be effectively programmed without objects?
Should menus be organized like a table of contents, or like an index? Should
documentation be dry and serious, or challenging and fun? Can only
scientists and mathematicians write efficient compilers, or can poets and
comedians do the same?
[The small size of the user base worries me. Over and over, stuff
that seems obvious and natural to a small group of developers turns out
to be strange and baffling to other people. -John]

Gerry Rzeppa

Oct 31, 2014, 4:31:09 PM
[The small size of the user base worries me. Over and over, stuff
that seems obvious and natural to a small group of developers turns out
to be strange and baffling to other people. -John]

Granted, Plain English has not yet found its niche. Frankly, we see our
situation as analogous to the Lisa/Macintosh saga. When Steve Jobs attempted
to sell a simple but different machine to seasoned professionals, he failed.
(Seems it's true that "user friendly is what the user is used to.") But when
he attempted to sell an even simpler machine to "the rest of us," he
succeeded. We're thus thinking that the most promising target market for
Plain English / Hybrid Programming is primary school education -- where the
students have nothing to unlearn. Time will tell.

[I'm reminded of Logo, which was a big hit around 1970 teaching kids to
program, then disappeared. We'll see. Re the early history of the
Macintosh, please argue about it in alt.folklore.computers, not
here. -John]

Gerry Rzeppa

Nov 1, 2014, 1:56:16 PM
[I'm reminded of Logo, which was a big hit around 1970 teaching kids to
program, then disappeared... -John]

Yes, I thought of LOGO myself as I replied to your previous post. But I'm
not sure LOGO is as dead as you might think: many groups are still active
with the thing, from the "Turtle Academy" (http://turtleacademy.com/) to MIT
(http://el.media.mit.edu/logo-foundation/index.html). Google for others.

I think the main problem with Logo is that, like other artificial languages,
it confounds the essential concepts of programming (sequence, conditionals,
loops, types, variables, parameters, etc) with the syntax used to represent
them. And worse -- due to its LISP heritage -- it often encourages the
student to implement certain things in less-than-intuitive ways (making
lists of things that are not naturally lists; using recursion where a simple
loop is the obvious answer, etc). Plain English avoids both of these
shortcomings.
[We're back to arguing about what's intuitive. Having taught a
certain number of undergraduates, I can report that neither loops nor
recursion are obvious and some people have enormous trouble imagining
a variable with contents that are updated, which is the key concept to
make loops work. Anyway, under the semicolon rule, we're done with
this thread. -John]

mac

Nov 4, 2014, 12:21:56 PM
Let's not forget the REALITY[tm] database and its ENGLISH[tm] query
language!
--
mac the naïf
[And with that, this thread is over unless someone has something
to say that's related to compilers. Thanks, all. -John]
