Did you take a look at Seed7?
Greetings Thomas Mertes
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.
That is exactly the area where Ada left the path of Pascal. Pascal
was designed to be easy to implement. Niklaus Wirth had good
reasons to keep the implementation simple. He once said (IIRC):
What can be parsed easily by a compiler can also be
parsed easily by a human and this can be an asset.
He probably did not use exactly these words, but they hopefully
describe his intentions.
Many languages try to make the job of writing a program easier
and at the same time make the job of reading programs
harder. All these wonderful "do what I mean" concepts used by
many languages fail in some cases.
IMHO complex compilation processes are an indication
of hard-to-understand concepts or hard-to-read constructs.
As such a complex compilation process only seemingly
(and not really) makes programming easier.
> That is exactly the area where Ada left the path of Pascal. Pascal
> was designed to be easy to implement. Niklaus Wirth had good
> reasons to keep the implementation simple. He once said (IIRC):
>
> What can be parsed easily by a compiler can also be
> parsed easily by a human and this can be an asset.
>
> He probably did not use exactly these words, but they hopefully
> describe his intentions.
A human who has parsed some part of a program is far away from having
understood the part of the program: Many concepts expressed "easily"
in "simple" languages hide the fact that a complex combination
of simple things needs to be studied (and made "conventional"
or idiomatic) in order to arrive at an understanding of what
is really going on, and what is intended, due to the combination.
Sometimes these lengthy combinations of "simply" expressed things
are equivalent to a simple builtin of less "simple" languages.
> Many languages try to make the job of writing a program easier
> and at the same time make the job of reading programs
> harder. All these wonderful "do what I mean" concepts used by
> many languages fail in some cases.
True, but this does not apply to Ada. Ada was designed with requirements
that explicitly required ease of reading over ease of writing.
> IMHO complex compilation processes are an indication
> of hard-to-understand concepts or hard-to-read constructs.
> As such a complex compilation process only seemingly
> (and not really) makes programming easier.
>
Not at all. Let me take an example to show you what I meant. If you have
a record (in Pascal) or struct (in C), you are not allowed to compare
them directly. Why? because records may contain gaps that shouldn't be
compared, and skipping the gaps was deemed too much work for the
compiler. In Ada, there is no problem writing:
if Rec1 = Rec2 then ....
Would you argue that it is /less/ readable than writing:
if Rec1.F1 = Rec2.F1 and Rec1.F2 = Rec2.F2 and Rec1.F3 = Rec2.F3...
--
---------------------------------------------------------
J-P. Rosen (ro...@adalog.fr)
Visit Adalog's web site at http://www.adalog.fr
The converse, however, is not necessarily true: that which cannot be
parsed easily by a compiler sometimes still can be parsed easily by a human.
This will remain so until strong AI is developed.
I once met somebody, who wrote the front end of an Ada compiler, and
he told me a different story. E.g.: He said that a special function
needs to read ahead just to find out the semantics of a parenthesis.
In Pascal such read ahead is not necessary. Another friend told me
stories about a buggy early Ada compiler where it was necessary to
"code around" compiler bugs.
> > [...]
> > Many languages try to make the job of writing a program easier
> > and at the same time make the job of reading programs
> > harder. All these wonderful "do what I mean" concepts used by
> > many languages fail in some cases.
>
> True, but this does not apply to Ada. Ada was designed with requirements
> that explicitly required ease of reading over ease of writing.
Agree, but not all things designed to ease reading do so.
> > IMHO complex compilation processes are an indication
> > of hard-to-understand concepts or hard-to-read constructs.
> > As such a complex compilation process only seemingly
> > (and not really) makes programming easier.
>
> Not at all. Let me take an example to show you what I meant. If you have
> a record (in Pascal) or struct (in C), you are not allowed to compare
> them directly. Why? because records may contain gaps that shouldn't be
> compared, and skipping the gaps was deemed too much work for the
> compiler. In Ada, there is no problem writing:
> if Rec1 = Rec2 then ....
Neither in Seed7, which has the = and <> operators predefined for
all structs. Btw.: Why should gaps in a struct cause problems? A
struct comparison should call the comparison functions of the elements
anyway.
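To illustrate the element-wise comparison mentioned above, here is a minimal Python sketch (not Seed7 or Ada code; the record type `Rec` and the helper `fieldwise_equal` are made-up names for illustration). Because the comparison goes through the elements one by one, any gaps in the memory layout never enter the picture.

```python
from dataclasses import dataclass, fields


@dataclass
class Rec:
    """Hypothetical record with three components."""
    f1: int
    f2: str
    f3: float


def fieldwise_equal(a, b):
    """Compare two records by calling == on each pair of components."""
    return all(getattr(a, f.name) == getattr(b, f.name) for f in fields(a))


rec1 = Rec(1, "x", 2.5)
rec2 = Rec(1, "x", 2.5)
rec3 = Rec(1, "x", 3.5)

# The hand-written helper and the == that @dataclass generates agree.
print(fieldwise_equal(rec1, rec2), rec1 == rec2)
print(fieldwise_equal(rec1, rec3), rec1 == rec3)
```

The generated `==` is the counterpart of writing `Rec1 = Rec2` directly; the helper spells out what such a predefined operator has to do behind the scenes.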
IMHO such features do not cause problems. I was referring to things
like overloading rules where the result type of a function/operator is
taken into account.
> On 20 Jul., 17:14, Jean-Pierre Rosen <ro...@adalog.fr> wrote:
>>
>> Parsing is not the difficult part of an Ada compiler.
>
> I once met somebody, who wrote the front end of an Ada compiler, and
> he told me a different story. E.g.: He said that a special function
> needs to read ahead just to find out the semantics of a parenthesis.
There is no need to know the semantics of parentheses in order to parse
them. Semantic analysis is another compilation phase.
The only moderately difficult parts of Ada that require short look ahead
are digraphs like "and then" (overloaded with "and"). There can be comments
and new lines between "and" and "then" in the digraph. However it does not
require any roll backs.
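A rough Python sketch of the "and then" case described above, under simplifying assumptions: the function names `tokenize` and `merge_short_circuit` are invented for this illustration, and the lexer only knows identifiers, single-character symbols and Ada-style `--` comments. Since comments and line breaks are dropped before the operators are fused, one token of look ahead is enough and nothing is ever rolled back.

```python
import re

# Order matters: comments first, then whitespace, identifiers, any other character.
TOKEN_RE = re.compile(r"--[^\n]*|\s+|[A-Za-z_]\w*|\S")


def tokenize(text):
    """Split text into tokens, dropping whitespace and '--' comments."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        lexeme = match.group()
        if lexeme.isspace() or lexeme.startswith("--"):
            continue
        tokens.append(lexeme)
    return tokens


def merge_short_circuit(tokens):
    """Fuse 'and then' / 'or else' into one operator using one token of look ahead."""
    pairs = {("and", "then"): "and then", ("or", "else"): "or else"}
    out = []
    i = 0
    while i < len(tokens):
        nxt = tokens[i + 1].lower() if i + 1 < len(tokens) else None
        fused = pairs.get((tokens[i].lower(), nxt))
        if fused is not None:
            out.append(fused)
            i += 2          # consume both words of the operator
        else:
            out.append(tokens[i])
            i += 1
    return out


print(merge_short_circuit(tokenize("a and -- note\n then b and c")))
```

A real Ada front end would do this inside its grammar rather than as a separate pass; the separate pass is used here only to keep the look ahead visible.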
> Another friend told me
> stories about a buggy early Ada compiler where it was necessary to
> "code around" compiler bugs.
This problem still exists. (:-() This is not language specific; Delphi,
Borland C++, MSVC are no less buggy.
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
>I once met somebody, who wrote the front end of an Ada compiler, and
>he told me a different story. E.g.: He said that a special function
>needs to read ahead just to find out the semantics of a parenthesis.
>In Pascal such read ahead is not necessary.
Of course a read ahead is necessary in Pascal (at least in a one-pass
compiler), otherwise the following would give an error:
procedure test (*comment*) (x: integer);
--
In order to e-mail me a reply to this message, you will have
to remove PLEASE.REMOVE from the address shown in the header
or get it from http://home.netsurf.de/wolfgang.ehrhardt
(Free open source Crypto, AES, CRC, Hash for Pascal/Delphi)
I think Pascal doesn't need lookahead if it's dealing with symbols rather
than characters. Even with characters only a single character lookahead is
needed. Possibly Ada may need to read multiple symbols ahead.
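The single character of look ahead mentioned above can be sketched in Python as follows. This is only an illustration for a tiny Pascal-like subset; `lex_pascal` is a made-up name and the token rules are assumptions, not a real Pascal lexer. On seeing `(` the lexer peeks at the next character to decide between a parenthesis and the start of a `(* ... *)` comment.

```python
def lex_pascal(text):
    """Tokenize a tiny Pascal-like subset, peeking one character after '('."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch == "(" and text[i + 1:i + 2] == "*":
            # One character of look ahead says this is a comment, not a parenthesis.
            end = text.find("*)", i + 2)
            if end == -1:
                raise ValueError("unterminated comment")
            i = end + 2
        elif ch.isalpha() or ch == "_":
            start = i
            while i < len(text) and (text[i].isalnum() or text[i] == "_"):
                i += 1
            tokens.append(text[start:i])
        else:
            tokens.append(ch)   # every other character is its own token here
            i += 1
    return tokens


print(lex_pascal("procedure test (*comment*) (x: integer);"))
```

Once the comment is gone at the character level, the parser sees plain symbols and can decide on the current symbol alone.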
The Pascal design is elegant but I've done parsers where potentially
thousands of tokens need to be processed after a parenthesis before it knows
exactly what it's dealing with. It's not a problem.
--
Bart
IIRC "and then" is not that big an issue. The bigger issue is the two
uses of single quotes: they delimit character constants, and they are
used for qualified expressions. Consider an expression such as
Foo'(',',',',',' ... )
Even with this complication, Ada is not a hard language to parse. C++
poses a considerably greater challenge, due to the dual use of < and >
for both comparisons and for bracketing template parameters.
>> Another friend told me
>> stories about a buggy early Ada compiler where it was necessary to
>> "code around" compiler bugs.
>
> This problem still exists. (:-() This is not language specific; Delphi,
> Borland C++, MSVC are no less buggy.
>
> Dmitry A. Kazakov wrote:
>> On Fri, 24 Jul 2009 00:26:11 -0700 (PDT), tm wrote:
>>
>>> On 20 Jul., 17:14, Jean-Pierre Rosen <ro...@adalog.fr> wrote:
>>>> Parsing is not the difficult part of an Ada compiler.
>>> I once met somebody, who wrote the front end of an Ada compiler, and
>>> he told me a different story. E.g.: He said that a special function
>>> needs to read ahead just to find out the semantics of a parenthesis.
>>
>> There is no need to know the semantics of parentheses in order to parse
>> them. Semantic analysis is another compilation phase.
>>
>> The only moderately difficult parts of Ada that require short look ahead
>> are digraphs like "and then" (overloaded with "and"). There can be comments
>> and new lines between "and" and "then" in the digraph. However it does not
>> require any roll backs.
>
> IIRC "and then" is not that big an issue. The bigger issue is the two
> uses of single quotes: they delimit character constants, and they are
> used for qualified expressions. Consider an expression such as
>
> Foo'(',',',',',' ... )
The context where the apostrophe introduces a character literal is where an
expression operand is expected, i.e. *before* an operand. The context where
the apostrophe starts an attribute is always *after* an operand. If the
parser is aware of the context there is no problem at all, because only
the infix operations switch the context and they are all fixed (+, -, *, /
etc). No look ahead is needed, not in my parser of Ada 95.
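A small Python sketch of the context rule described above, for illustration only. It is not the Ada 95 parser mentioned in the post; `lex_ada_ticks` and `OPERATOR_WORDS` are invented names, and the word list is a partial assumption. The lexer remembers whether the previous token ended an operand: if it did, an apostrophe is a tick (attribute or qualified expression), otherwise it opens a character literal.

```python
# Partial, illustrative list of reserved words after which an operand is still expected.
OPERATOR_WORDS = {"and", "or", "xor", "not", "in", "mod", "rem", "abs"}


def lex_ada_ticks(text):
    """Tokenize a tiny Ada-like subset, classifying apostrophes by context."""
    tokens = []
    after_operand = False   # True right after a name, a literal or ')'
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch == "'":
            if after_operand:
                tokens.append("'")              # tick: attribute or qualification
                after_operand = False
                i += 1
            else:
                if text[i + 2:i + 3] != "'":
                    raise ValueError(f"malformed character literal at {i}")
                tokens.append(text[i:i + 3])    # character literal such as ','
                after_operand = True
                i += 3
        elif ch.isalnum() or ch == "_":
            start = i
            while i < len(text) and (text[i].isalnum() or text[i] == "_"):
                i += 1
            word = text[start:i]
            tokens.append(word)
            after_operand = word.lower() not in OPERATOR_WORDS
        else:
            tokens.append(ch)
            after_operand = ch == ")"
            i += 1
    return tokens


print(lex_ada_ticks("Foo'(',',',',',')"))
```

The decision for each apostrophe uses only what has already been read, which is the point being made: no look ahead past the literal itself.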
(Well, I forgot to mention another minor case of short look ahead. When
matching integer literals the base may precede the value: 10#10#. I am too
lazy to verify if _ is legal in the base specification, i.e. whether
1_0#10# is OK. If not then that might be also a negligible problem.)
Agree.
> Possibly Ada may need to read multiple symbols ahead.
>
> The Pascal design is elegant but I've done parsers where potentially
> thousands of tokens need to be processed after a parenthesis before it knows
> exactly what it's dealing with. It's not a problem.
The problem is not that it is not doable.
The problem is that human readers must also process
thousands of tokens to know exactly what is going on.
When one symbol tells you what is going on, reading is easier.
Complicated parsing with lookahead is IMHO an indication
of hard-to-read constructs. I think that Wirth had such things
in mind when he compared human and compiler parsing.
The rules that make the syntax of Seed7 easier to parse
are described here:
http://seed7.sourceforge.net/manual/syntax.htm
Note that the syntax and semantics (types, overloading, ...)
of Seed7 are handled by different parts of the compiler. It is
never necessary to look at semantic information (e.g. the
type of an expression) to decide which syntax is allowed.
This really means it needs to look ahead a complete expression, which can be
of arbitrary complexity, especially if expressions can include statements as
my design did. Examples:
(expr) # ordinary parenthesised expression
(expr | a | b) # if-then-else select
(expr | a,b,c | z) # n-way select
(expr, a,b,c) # list
a[expr] # normal indexing
a[expr..expr] # slicing
etc...
In practice expr will be short and only a few symbols (such as n+1). If the
user wants to put almost a whole program in there, that's up to him. And he
could probably do similar things in Pascal and Seed7.
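To make the look ahead over a complete expression concrete, here is a hedged Python sketch of a recursive-descent parser for the parenthesised forms listed above. It is not the parser from my design; the names `tokenize`, `expect` and `parse_term` and the tuple-shaped trees are assumptions for illustration, and terms are limited to names, numbers and nested parentheses. After `(` the parser reads the first expression and only then lets the following symbol decide what it was parsing.

```python
import re


def tokenize(src):
    """Split into names, numbers and the symbols ( ) | , followed by an end marker."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[()|,]", src) + ["<end>"]


def expect(tokens, pos, symbol):
    """Check for a required symbol and return the position after it."""
    if tokens[pos] != symbol:
        raise SyntaxError(f"expected {symbol!r}, found {tokens[pos]!r}")
    return pos + 1


def parse_term(tokens, pos=0):
    """Parse one term and return (tree, position after it)."""
    tok = tokens[pos]
    if tok in (")", "|", ",", "<end>"):
        raise SyntaxError(f"unexpected {tok!r}")
    if tok != "(":
        return tok, pos + 1                      # plain name or number
    # The construct is still unknown here: parse the first expression,
    # then let the symbol after it decide.
    first, pos = parse_term(tokens, pos + 1)
    if tokens[pos] == ")":
        return ("paren", first), pos + 1
    if tokens[pos] == ",":
        items = [first]
        while tokens[pos] == ",":
            item, pos = parse_term(tokens, pos + 1)
            items.append(item)
        return ("list", *items), expect(tokens, pos, ")")
    if tokens[pos] == "|":
        branches = []
        while True:
            branch, pos = parse_term(tokens, pos + 1)
            branches.append(branch)
            if tokens[pos] != ",":
                break
        pos = expect(tokens, pos, "|")
        default, pos = parse_term(tokens, pos)
        kind = "if" if len(branches) == 1 else "select"
        return (kind, first, *branches, default), expect(tokens, pos, ")")
    raise SyntaxError(f"unexpected {tokens[pos]!r}")


for source in ["(n)", "(c | a | b)", "(i | a,b,c | z)", "(x, a,b,c)"]:
    print(source, "->", parse_term(tokenize(source))[0])
```

The first expression can itself be a nested parenthesised form, so the amount read before the decision is unbounded, yet the code needs no backtracking.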
> When one symbol tells you what is going on, reading is easier.
> Complicated parsing with lookahead is IMHO an indication
> of hard-to-read constructs.
You can't argue the above constructs aren't easy on the eye. On the other
hand:
switch n
when x then a
when y then b
else c
end [i]:=k
is probably ill-advised. It's not until the end that you realise it's an
assignment to an array element (ie. assigning k to one of a[i], b[i] or
c[i]).
So it all depends. Parsing this stuff is trivial. Might as well make it
available, and trust the user to use it sensibly.
--
Bart
It's ill-advised, but because of the ill-definition of that language.
There's no reason why such an expression couldn't be written and read
easily. You just need to design your language with prefix operators
(Polish Notation) so you always know up-front what you're parsing:
(setf (aref (switch n
(x a)
(y b)
(else c)) i) k)
Which is even simpler than what Wirth was about.
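How little a reader for such prefix forms needs can be sketched in a few lines of Python. This is an illustration only, not a Lisp reader: `read_sexpr` is a made-up name, atoms are plain strings, and nothing like quoting or strings with spaces is handled. The first element of every list names the construct, so no look ahead is involved.

```python
def read_sexpr(src):
    """Read one parenthesised prefix form into nested Python lists of strings."""
    tokens = src.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        tok = tokens[pos]
        if tok == ")":
            raise SyntaxError("unexpected ')'")
        if tok != "(":
            return tok, pos + 1                 # atom
        items = []
        pos += 1
        while tokens[pos] != ")":               # an unbalanced form ends in IndexError
            item, pos = read(pos)
            items.append(item)
        return items, pos + 1

    tree, pos = read(0)
    if pos != len(tokens):
        raise SyntaxError("trailing tokens after the form")
    return tree


tree = read_sexpr("(setf (aref (switch n (x a) (y b) (else c)) i) k)")
print(tree[0])   # the head symbol names the construct
print(tree)
```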
> (expr) # ordinary parenthesised expression
expr ; if you take care to always parenthesize
; expressions, you simplify the syntax
; rules and don't need to further parenthesize
; expressions.
> (expr | a | b) # if-then-else select
(if expr a b)
> (expr | a,b,c | z) # n-way select
(select expr a b c z)
> (expr, a,b,c) # list
(list expr a b c)
> a[expr] # normal indexing
(aref a expr)
> a[expr..expr] # slicing
(slice a expr1 expr2)
> etc...
etc...
> So it all depends. Parsing this stuff is trivial. Might as well make it
> available, and trust the user to use it sensibly.
--
__Pascal Bourguignon__
If you're going to use Lisp-like syntax then you always know what to expect,
ie. (, ) or a term.
But ultra simple parsing presents its own problems when trying to read it,
because of its monotony. Richer syntax can make constructions stand out
more.
>
>> (expr) # ordinary parenthesised expression
>
> expr ; if you take care to always parenthesize
> ; expressions, you simplify the syntax
> ; rules and don't need to further parenthesize
> ; expressions.
>
>> (expr | a | b) # if-then-else select
>
> (if expr a b)
>
>
>> (expr | a,b,c | z) # n-way select
>
> (select expr a b c z)
>
>
>> (expr, a,b,c) # list
>
> (list expr a b c)
>
>
>> a[expr] # normal indexing
>
> (aref a expr)
>
>
>> a[expr..expr] # slicing
>
> (slice a expr1 expr2)
> etc...
Sure, you have to do some of the language's work for it by telling it what's
coming up. Those forms look more like an intermediate language than original
source code. It's a bit like looking at plain text representing markup
instructions, then looking at the output.
Clearly many people are happy working directly with s-expressions, but
others aren't.
--
Bart