
Perfect Programming Language


Mark Carter

Apr 5, 2003, 6:46:03 AM
My musings on XS (http://www.markcarter.me.uk/computing/xs.html) led
me to think about programming languages in general - and what "the
world's best programming language" would look like. XS struck me as
potentially very good (OK - not really good), because it is so easy to
write a parser for it. Then it occurred to me that Forth - a language
that I have never written in, nor know much about - might actually fit
the bill. Forth is stack-based, so it ought to be quite easy to write
a parser for it compared to infix-based languages.

Consider the following equation:

syntactic sugar + libraries = programming language

Now, the "library" aspect of a programming language is not really
inherent to the language itself. There's no intrinsic reason why perl
is good at regular expressions, whereas visual basic is not - it's
just that the people who wrote perl decided to include regular
expressions, whilst Microsoft did not. What we find, therefore, is
that language writers implement the same functionality over and over
again. Perl might have some useful library that Python does not, and
vice versa. The chances are, though, that if you want to use some of
that functionality, you have to port over code - which is a waste of
time.

On the "syntactic sugar" front, some languages are definitely better
than others. I like python because it has lists built into the syntax.
So I can write something like:

a = [1,2,3]

which I cannot do in C. I also like python's exception handling - it
is much better than Visual Basic's. C does not have exception handling.
So I would rather write in python than C, because I can be more
productive.

This then leads to the argument: why have a programming language with
a fixed syntax at all? If you came up with a language whose syntax was
extensible, then you would have what I believe would be the world's
best programming language - let's call it PPL. PPL would consist of a
bunch of libraries of two types: "function libraries" (which are the
libraries that I have referred to in my definition of a programming
language above), plus "syntax libraries" (an entirely new concept).
These syntax libraries would be capable of converting an infix program
to a stack-based program (e.g. Forth). The stack-based program would
be specifically designed to be simple to compile.
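
(To illustrate the idea - this is just postfix notation, not taken
from any actual XS or Forth tool - an infix expression like 1 + 2 * 3
would come out of such a syntax library as

    1 2 3 * +

which a stack machine can execute left to right with no further
parsing.)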

So I could then imagine the scenario where someone woke up one day and
said: "hey, wouldn't it be grand if PPL supported
object-orientation?". The programmer would then write a syntax library
that implemented it, and distribute it like a normal library. Anyone
interested in object-orientation could then import the library and
hey-presto - instant object-orientation. It would also be possible to
translate just about any language to any other language, because I
would expect that its syntax could be encapsulated into a so-called
syntax library.

Thoughts and comments welcome!

Marcin 'Qrczak' Kowalczyk

Apr 5, 2003, 7:33:24 AM
On Sat, 05 Apr 2003 03:46:03 -0800, Mark Carter wrote:

> This then leads to the argument: why have a programming language with
> a fixed syntax at all? If you came up with a language whose syntax was
> extensible, then you would have what I believe would be the world's
> best programming language - let's call it PPL.

Those who don't know Lisp are condemned to reinvent it, poorly :-)

--
__("< Marcin Kowalczyk
\__/ qrc...@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/

Mark Carter

Apr 5, 2003, 4:37:25 PM
"Marcin 'Qrczak' Kowalczyk" <qrc...@knm.org.pl> wrote in message news:<pan.2003.04.05....@knm.org.pl>...

> On Sat, 05 Apr 2003 03:46:03 -0800, Mark Carter wrote:
>
> > This then leads to the argument: why have a programming language with
> > a fixed syntax at all? If you came up with a language whose syntax was
> > extensible, then you would have what I believe would be the world's
> > best programming language - let's call it PPL.
>
> Those who don't know Lisp are condemned to reinvent it, poorly :-)

Thanks for the reply. As a follow-up ...

I quote from
http://www.paulgraham.com/icad.html
the following text:

If you look at these languages in order, Java, Perl, Python, you
notice an interesting pattern. At least, you notice this pattern if
you are a Lisp hacker. Each one is progressively more like Lisp.
Python copies even features that many Lisp hackers consider to be
mistakes. You could translate simple Lisp programs into Python line
for line. It's 2002, and programming languages have almost caught up
with 1958.

Starling

Apr 6, 2003, 8:34:51 PM
There are 3 things I look for in a programming language. How easy is
it to document, how much code is wasted setting up for the useful
code, and how hard is it to debug? That said, there are no good
languages, only good programmers. ^.^


Starling

Starling

Apr 6, 2003, 9:33:04 PM
carter...@ukmail.com (Mark Carter) writes:

> "Marcin 'Qrczak' Kowalczyk" <qrc...@knm.org.pl> wrote in message
> news:<pan.2003.04.05....@knm.org.pl>...

> > Those who don't know Lisp are condemned to reinvent it, poorly :-)

And those who do know Lisp are condemned to reinvent Lisp, poorly. :)

> Thanks for the reply. As a follow-up ...
>
> I quote from
> http://www.paulgraham.com/icad.html
> the following text:
>
> If you look at these languages in order, Java, Perl, Python, you
> notice an interesting pattern. At least, you notice this pattern if
> you are a Lisp hacker. Each one is progressively more like Lisp.

Except that Java is the newest one. -.- At any rate, you could also
consider the progression Lisp, Basic, Perl, Java, each one more C-like
in syntax. As a concept, Lisp is perfect. But to actually use it...?


Starling
Who uses Lisp a lot, but doesn't have to like it. :)

Bruce Hoult

Apr 7, 2003, 4:06:12 AM
In article <d3c9c04.03040...@posting.google.com>,
carter...@ukmail.com (Mark Carter) wrote:

> XS struck me as potentially very good (OK - not really good), because
> it is so easy to write a parser for it. Then it occurred to me that
> Forth - a language that I have never written in, nor know much about
> - might actually fit the bill. Forth is stack-based, so it ought to
> be quite easy to write a parser for it compared to infix-based
> languages.

Why is this such an important thing? You only have to write the parser
*once*. After that anyone can use it (if you make it available as a
library, and not just as a monolithic part of the compiler).

This is a mistake made by each of Lisp, Smalltalk and Forth, getting a
syntactically extensible language at the expense of readability.


> This then leads to the argument: why have a programming language with
> a fixed syntax at all? If you came up with a language whose syntax was
> extensible, then you would have what I believe would be the world's
> best programming language - let's call it PPL.

I agree. And there are a number of such languages.


> PPL would consist of a bunch of libraries of two types: "function
> libraries" (which are the libraries that I have referred to in my
> definition of a programming language above), plus "syntax libraries"
> (an entirely new concept).

No, your "syntax libraries" are what languages such as Dylan and Common
Lisp call "macros" and Forth calls "<BUILDS DOES>".
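
For instance, here is a minimal Common Lisp sketch (mine, not from any
particular library) of a control construct added as an ordinary macro:

    (defmacro while (test &body body)
      "Repeat BODY as long as TEST evaluates to true."
      `(loop (unless ,test (return))
             ,@body))

    ;; usage: prints 0, 1, 2
    (let ((i 0))
      (while (< i 3)
        (print i)
        (incf i)))

The new syntax expands into core forms (LOOP and RETURN); the compiler
itself never changes.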


> These syntax libraries would be capable of converting an infix program
> to a stack-based program (e.g. Forth). The stack-based program would
> be specifically designed to be simple to compile.

There's no reason to do that. You just have a core language, and the
macros expand into that.

For example, in Gwydion Dylan (http://www.gwydiondylan.org), just about
*all* of what programmers would regard as the normal built-in
Pascal-like syntax of the language is actually implemented as macros in
the standard library.


> So I could then imagine the scenario where someone woke up one day
> and said: "hey, wouldn't it be grand if PPL supported
> object-orientation?". The programmer would then write a syntax
> library that implemented it, and distribute it like a normal library.

Which is exactly how Common Lisp's o-o facility (CLOS) started out.

Unfortunately that library doesn't integrate the built-in primitive
types into the object system all that well (it can't, really), which is
why Dylan integrates a very similar object system into the base language.


> Thoughts and comments welcome!

It's already been done -- go use it!

-- Bruce

Mark Carter

Apr 7, 2003, 4:37:10 AM
Starling <nos...@hooey.invalid> wrote in message news:<m1znn3w...@localhost.localdomain>...

Flon's Law: There is not now, and never will be, a language in which
it is the least bit difficult to write bad programs.

Joachim Durchholz

Apr 7, 2003, 8:32:09 AM
Mark Carter wrote:
> Flon's Law: There is not now, and never will be, a language in which
> it is the least bit difficult to write bad programs.

So what?
I want a language in which it is easy to write good programs.
I'm still waiting for one...

Regards,
Joachim
--
Currently looking for a new job.

K Hollingworth

Apr 7, 2003, 10:04:18 AM
Joachim Durchholz <joac...@gmx.de> wrote:
> Mark Carter wrote:
>> Flon's Law: There is not now, and never will be, a language in which
>> it is the least bit difficult to write bad programs.

> So what?
> I want a language in which it is easy to write good programs.
> I'm still waiting for one...

I think we need to define what we mean by "good", "bad", "perfect" and so
on.

Presumably "perfect" means "as good as it is possible to be" but...
does good mean fast, reliable, bug-free on first compile/run?
does bad mean memory-hungry, buggy, inefficient?

One reason that there are so many languages is because "good" for a
numerical implementation of a 10,000 parameter model is not the same as
"good" for string-parsing.

I can imagine that the perfect string-parsing or modelling or AI or ...
language might exist. I cannot see any way in which there could be A
perfect language.

--
Kirsty Hollingworth <kh...@york.ac.uk>
Department of Biology, University of York, York, YO10 5YW

Michael Hobbs

Apr 7, 2003, 10:10:50 AM
"Mark Carter" <carter...@ukmail.com> wrote in message
news:d3c9c04.03040...@posting.google.com...

> This then leads to the argument: why have a programming language with
> a fixed syntax at all? If you came up with a language whose syntax was
> extensible, then you would have what I believe would be the world's
> best programming language - let's call it PPL. PPL would consist of a
> bunch of libraries of two types: "function libraries" (which are the
> libraries that I have referred to in my definition of a programming
> language above), plus "syntax libraries" (an entirely new concept).
> These syntax libraries would be capable of converting an infix program
> to a stack-based program (e.g. Forth). The stack-based program would
> be specifically designed to be simple to compile.

To a large extent, this is the exact strategy that Microsoft is trying to
pursue with .NET. That is, there exists a large .NET library, which any
application that is running in the .NET runtime can access. Furthermore, the
.NET runtime is generic enough so that any "modern" language can be compiled
to run in it. So while there may not exist "syntax libraries", there do
exist syntax compilers that basically achieve the same thing.

Granted, I've never actually used .NET, so I can't vouch for how accurate
the above statements are. I'm simply repeating information that I've gleaned
from magazine articles and other forms of Microsoft marketing.


Joachim Durchholz

Apr 7, 2003, 1:44:31 PM
K Hollingworth wrote:

> Joachim Durchholz <joac...@gmx.de> wrote:
>>I want a language in which it is easy to write good programs.
>>I'm still waiting for one...
>
> I think we need to define what we mean by "good", "bad", "perfect" and so
> on.
>
> Presumably "perfect" means "as good as it is possible to be" but...
> does good mean fast, reliable, bug-free on first compile/run?
> does bad mean memory-hungry, buggy, inefficient?

That's part of the problem.
But not a particularly big one - you can get 90% goodness in all these
areas. For the remaining 10% - well, there's always assembly.

BTW you forgot maintainability, and that's a central issue for >90% of
all software written on this planet. Perfection for some specialized
area like AI or parsing fades before this goal.
AND it's an area where open research questions are plentiful.

Tom Breton

Apr 7, 2003, 8:27:06 PM
Bruce Hoult <br...@hoult.org> writes:

> In article <d3c9c04.03040...@posting.google.com>,
> carter...@ukmail.com (Mark Carter) wrote:
>
> > XS struck me as potentially very good (OK - not really good), because
> > it is so easy to write a parser for it. Then it occurred to me that
> > Forth - a language that I have never written in, nor know much about
> > - might actually fit the bill. Forth is stack-based, so it ought to
> > be quite easy to write a parser for it compared to infix-based
> > languages.
>
> Why is this such an important thing? You only have to write the parser
> *once*. After that anyone can use it (if you make it available as a
> library, and not just as a monolithic part of the compiler).

No, and I don't advocate that way of thinking about it.

You need some sort of parser for every tool that needs to understand
the code at all. Lisp's is easy: `r' `e' `a' `d'.
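
For instance, in Common Lisp:

    (with-input-from-string (s "(defun square (x) (* x x))")
      (read s))
    => (DEFUN SQUARE (X) (* X X))

One call, and the tool holds the complete parse tree as an ordinary
list, ready to walk.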

Unfortunately in other languages, there's a vicious circle where the
lack of trivial parsing has led to the mindset that tools are
extra/add-on/outside the realm of the language, which in turn devalues
trivial parsing.

For myself, the ease I have found in building tools to work on code
weighs very heavily, and the alleged difficulty of reading Lisp weighs
not at all.

--
Tom Breton at panix.com, username tehom. http://www.panix.com/~tehom

Joachim Durchholz

Apr 8, 2003, 3:17:39 AM
Tom Breton wrote:
>
> Unfortunately in other languages, there's a vicious circle where the
> lack of trivial parsing has led to the mindset that tools are
> extra/add-on/outside the realm of the language, which in turn devalues
> trivial parsing.

I don't think it's ease of parsing, it's ease of using the resulting
parse tree.
I wouldn't want to write a tree walker for a C++ parse tree, I'd have to
consider too many constructs. In Lisp or Smalltalk, it's easy, because
both have a simple, regular semantics. (Of course, a simple and regular
semantics usually results in a simple, regular syntax.)

K Hollingworth

Apr 8, 2003, 4:19:04 AM
Joachim Durchholz <joac...@gmx.de> wrote:
> K Hollingworth wrote:
>> Joachim Durchholz <joac...@gmx.de> wrote:
>>>I want a language in which it is easy to write good programs.
>>>I'm still waiting for one...
>>
>> I think we need to define what we mean by "good", "bad", "perfect" and so
>> on.
>>
>> Presumably "perfect" means "as good as it is possible to be" but...
>> does good mean fast, reliable, bug-free on first compile/run?
>> does bad mean memory-hungry, buggy, inefficient?

> That's part of the problem.
> But not a particularly big one - you can get 90% goodness in all these
> areas. For the remaining 10% - well, there's always assembly.

But you still haven't said what "good" actually means!

> BTW you forgot maintainability, and that's a central issue for >90% of
> all software written on this planet. Perfection for some specialized
> area like AI or parsing fades before this goal.
> AND it's an area where open research questions are plentiful.

I was just throwing in some suggestions. Yes maintainability, other
possibilities might include developer time, platform availability, platform
independence. You can probably think of more as well!

Joachim Durchholz

Apr 8, 2003, 6:37:55 AM
Bruce Hoult wrote:
> Joachim Durchholz <joac...@gmx.de> wrote:
>>I wouldn't want to write a tree walker for a C++ parse tree, I'd have to
>>consider too many constructs. In Lisp or Smalltalk, it's easy, because
>>both have a simple, regular semantics.
>
> Not once you start using macros they don't.

Oh, right. You'd need to either analyze the macros (not decidable, but
you can probably get by with some pattern matching heuristics), or just
keep them as black boxes. The latter is probably not a good idea if
you're working on a contemporary Lisp - macros do too much important
work there.

Strangely enough, none of this is a problem with Smalltalk. Smalltalks
don't use (or need) macros.
Of course, some design patterns are left unformalized in Smalltalk (such
as MI), and any tree walker will not see that. It doesn't seem to be a
large problem in practice :-)

Joachim Durchholz

Apr 8, 2003, 4:48:37 AM
K Hollingworth wrote:
>
> But you still haven't said what "good" actually means!

Pick your choice :-)
There are many criteria, and different criteria are important for
different people. (I don't think that usefulness for a specific
application area is as important as many people think, though.)

Bruce Hoult

Apr 8, 2003, 5:20:46 AM
In article <m3y92l9...@panix.com>,
Tom Breton <te...@REMOVEpanNOSPAMix.com> wrote:

> Bruce Hoult <br...@hoult.org> writes:
>
> > In article <d3c9c04.03040...@posting.google.com>,
> > carter...@ukmail.com (Mark Carter) wrote:
> >
> > > XS struck me as potentially very good (OK - not really good), because
> > > it is so easy to write a parser for it. Then it occurred to me that
> > > Forth - a language that I have never written in, nor know much about
> > > - might actually fit the bill. Forth is stack-based, so it ought to
> > > be quite easy to write a parser for it compared to infix-based
> > > languages.
> >
> > Why is this such an important thing? You only have to write the parser
> > *once*. After that anyone can use it (if you make it available as a
> > library, and not just as a monolithic part of the compiler).
>
> No, and I don't advocate that way of thinking about it.
>
> You need some sort of parser for every tool that needs to understand
> the code at all. Lisp's is easy: `r' `e' `a' `d'.

And any other language whatsoever can give the user a function called
"read", if it wants to.


> Unfortunately in other languages, there's a vicious circle where the
> lack of trivial parsing has led to the mindset that tools are
> extra/add-on/outside the realm of the language, which in turn devalues
> trivial parsing.

I agree. It should be trivial for the user to use. That doesn't mean
it has to be trivial for the language implementor to provide it.

-- Bruce

Bruce Hoult

Apr 8, 2003, 5:24:07 AM
In article <3E927793...@gmx.de>,
Joachim Durchholz <joac...@gmx.de> wrote:

> Tom Breton wrote:
> >
> > Unfortunately in other languages, there's a vicious circle where the
> > lack of trivial parsing has led to the mindset that tools are
> > extra/add-on/outside the realm of the language, which in turn devalues
> > trivial parsing.
>
> I don't think it's ease of parsing, it's ease of using the resulting
> parse tree.
> I wouldn't want to write a tree walker for a C++ parse tree, I'd have to
> consider too many constructs. In Lisp or Smalltalk, it's easy, because
> both have a simple, regular semantics.

Not once you start using macros they don't.

Consider what your simple Lisp tree-walker will do when it meets "loop"
-- it can see this hunk of *something* which is a list containing a lot
of symbols in no obvious structure.
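
A small example:

    (loop for i from 1 to 10 collect (* i i))

To a generic tree-walker that is just a nine-element list starting
with the symbol LOOP; FOR, I, FROM and friends are bare symbols with
no nesting to reveal the iteration structure.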

Unless you teach your tree-walker to parse the syntax of "loop" it's not
going to get very far.

-- Bruce

Nils Goesche

Apr 8, 2003, 11:44:31 AM
Bruce Hoult <br...@hoult.org> writes:

But this is beside the general point: Usually, you do not want to
really ``code-walk´´ a whole program -- you want to invent a single,
additional syntactic construct, the implementation of which needs some
transforming of code. And this is indeed very easy in Lisp because of
the obvious structure of the parse tree. If the new construct you
define has a general structure like

(my-new-construct (foo bar) <SOME-CODE>)

then it usually doesn't matter what <SOME-CODE> exactly looks like.
It might very well contain 42 LOOP forms, but you don't have to
descend into them, usually.
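
A concrete (if contrived) sketch of such a construct:

    (defmacro my-new-construct ((foo bar) &body some-code)
      "Bind FOO and BAR to fresh values around SOME-CODE."
      `(let ((,foo 1) (,bar 2))
         ,@some-code))

The macro splices <SOME-CODE> into the expansion unexamined; any LOOP
forms inside it are the compiler's problem, not ours.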

Only in rather rare circumstances do you really have to ``walk´´ into the
code. And doing that manually would indeed be hard; but you never do
that, anyway, because there are already general, working code-walkers
available everywhere, and most Lisp implementations ship with one,
anyway. And when using them, you /still/ operate on snippets of
ordinary Lisp code, which is just as easy as before; the walker only
automates expanding macros and querying the environment if some
variable is lexically bound and stuff like that.

Regards,
--
Nils Gösche
"Don't ask for whom the <CTRL-G> tolls."

PGP key ID 0x0655CFA0

Joachim Durchholz

Apr 8, 2003, 12:07:26 PM
Nils Goesche wrote:
>
>>Consider what your simple Lisp tree-walker will do when it meets
>>"loop" -- it can see this hunk of *something* which is a list
>>containing a lot of symbols in no obvious structure.
>
> But this is beside the general point: Usually, you do not want to
> really ``code-walk´´ a whole program -- you want to invent a single,
> additional syntactic construct, the implementation of which needs some
> transforming of code.

This is true if you want to invent new constructs.
It's beside the point if you want to do stuff like code analysis, code
transformation, or documentation generation. Or even just prettyprinting.

Nils Goesche

Apr 8, 2003, 12:18:43 PM
Joachim Durchholz <joac...@gmx.de> writes:

> Nils Goesche wrote:
> >
> >>Consider what your simple Lisp tree-walker will do when it meets
> >>"loop" -- it can see this hunk of *something* which is a list
> >>containing a lot of symbols in no obvious structure.

> > But this is beside the general point: Usually, you do not want to
> > really ``code-walk´´ a whole program -- you want to invent a
> > single, additional syntactic construct, the implementation of
> > which needs some transforming of code.
>
> This is true if you want to invent new constructs. It's beside the
> point if you want to do stuff like code analysis, code
> transformation, or documentation generation.

Well, that depends on how deep you have to descend into the code to
achieve what you want, and how much information you really need about
the symbols you encounter along the way. And again: If you do it
using a code walker (that knows about special forms and environments
and expands macros for you (which you could do yourself, too, of
course, using MACROEXPAND)), you are /still/ operating on ordinary
Lisp lists. If you can't see the advantage of this, you should try
using Camlp4 for a while, where doing this is /much/ harder.

> Or even just prettyprinting.

Pretty printing would be a good example where you do /not/ need a
fully fledged code walker. Just walking manually through the list
structure and outputting symbols and parentheses along the way should
be enough, I think.

Joachim Durchholz

Apr 9, 2003, 3:54:24 AM
Nils Goesche wrote:
>
> Well, that depends on how deep you have to descend into the code to
> achieve what you want, and how much information you really need about
> the symbols you encounter along the way.

Exactly.

> And again: If you do it
> using a code walker (that knows about special forms and environments
> and expands macros for you (which you could do yourself, too, of
> course, using MACROEXPAND)), you are /still/ operating on ordinary
> Lisp lists.

That it's all lists just means that you don't miss anything merely
because you don't know about it.
That's not bad for a start, but it's not enough.
MACROEXPAND loses information that must be regained by inspecting the
expanded code. Depending on the application, this may be impossible,
possible, or even beneficial.
For a practical example, writing a tool that searches for
use-before-definition errors should probably expand macros. To give
meaningful error messages, the tool would somehow have to determine
whether the error is in the macro or in the code that uses the macro -
not easy at all!

My point stands: a simple, regular AST representation is helpful, but a
simple, regular semantics is more important.

Mark Carter

Apr 9, 2003, 8:56:20 AM
> I was just throwing in some suggestions. Yes maintainability, other
> possibilities might include developer time, platform availability, platform
> independence. You can probably think of more as well!

> >>>I want a language in which it is easy to write good programs.
> >>>I'm still waiting for one...
> >>
> >> I think we need to define what we mean by "good", "bad", "perfect" and so
> >> on.
> >>
> >> Presumably "perfect" means "as good as it is possible to be" but...
> >> does good mean fast, reliable, bug-free on first compile/run?
> >> does bad mean memory-hungry, buggy, inefficient?
>
> > That's part of the problem.
> > But not a particularly big one - you can get 90% goodness in all these
> > areas. For the remaining 10% - well, there's always assembly.
>
> But you still haven't said what "good" actually means!

Actually, I almost think that rate of productivity (code lines per
unit of "functionality required") could be the ultimate arbiter of
what makes a language "good". I'm suggesting that "good to write" is
probably highly correlated with "good to maintain". As regards
platform availability - I regard that as a side issue, because after
all, there's no intrinsic reason why, say, Forth will run on a PC but
not on UNIX. Having said that, the argument doesn't quite work with
assembler, or VB.

Actually, I'm rather attracted to the idea of a previous poster
suggesting Lisp as the ultimate programming language, especially the
bit about every sufficiently complex program containing a poor
implementation of Lisp.

Take XML/Latex/html. When you think about it, these are just examples
of Lisp done badly. If people were to mark everything up in valid
Lisp, then nobody would have to write parsers for them. They'd be
ready for processing right from the second you wrote them.
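
For instance, instead of

    <p>Hello <b>world</b></p>

one would write something like

    (p "Hello " (b "world"))

which READ turns into a tree with no further machinery. (That's just
an illustration of the shape - not any particular existing markup
scheme.)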

Mark Carter

Apr 9, 2003, 9:26:32 AM
Bruce Hoult <br...@hoult.org> wrote in message news:<bruce-F673E8....@copper.ipg.tsnz.net>...

> In article <m3y92l9...@panix.com>,
> Tom Breton <te...@REMOVEpanNOSPAMix.com> wrote:
>
> > Bruce Hoult <br...@hoult.org> writes:
> >
> > > In article <d3c9c04.03040...@posting.google.com>,
> > > carter...@ukmail.com (Mark Carter) wrote:
> > >
> > > > XS struck me as potentially very good (OK - not really good), because
> > > > it is so easy to write a parser for it. Then it occurred to me that
> > > > Forth - a language that I have never written in, nor know much about
> > > > - might actually fit the bill. Forth is stack-based, so it ought to
> > > > be quite easy to write a parser for it compared to infix-based
> > > > languages.
> > >
> > > Why is this such an important thing? You only have to write the parser
> > > *once*. After that anyone can use it (if you make it available as a
> > > library, and not just as a monolithic part of the compiler).

No - actually you have to write a parser for each language.

What I had in mind is Forth as a kind of "canonical" intermediate
language. And I mentioned Forth because, being stack-based rather than
infix-based, it seemed a particularly easy language to parse.


> >
> > You need some sort of parser for every tool that needs to understand
> > the code at all. Lisp's is easy: `r' `e' `a' `d'.
>
> And any other language whatsoever can give the user a function called
> "read", if it wants to.

But this is my point about a so-called "perfect programming language".
We're all writing "read" functions and parsers for a multiplicity of
languages - which amounts to a colossal waste of time. And when you
write application code, it will only be available in the language you
wrote it in. It requires considerable porting effort to make it
available to a different language.

Open source software writers are all essentially engaged in a massive
exercise of re-inventing wheels.

Gene Kahn

Apr 9, 2003, 2:42:50 PM
"Marcin 'Qrczak' Kowalczyk" <qrc...@knm.org.pl> wrote in message news:<pan.2003.04.05....@knm.org.pl>...
> On Sat, 05 Apr 2003 03:46:03 -0800, Mark Carter wrote:
>
> > This then leads to the argument: why have a programming language with
> > a fixed syntax at all? If you came up with a language whose syntax was
> > extensible, then you would have what I believe would be the world's
> > best programming language - let's call it PPL.
>
> Those who don't know Lisp are condemned to reinvent it, poorly :-)

And some who do will reinvent it anyway -- Scheme, Arc.
gk

Kaz Kylheku

Apr 9, 2003, 4:46:16 PM
carter...@ukmail.com (Mark Carter) wrote in message news:<d3c9c04.03040...@posting.google.com>...

> My musings on XS (http://www.markcarter.me.uk/computing/xs.html) led
> me to think about programming languages in general - and what "the
> world's best programming language" would look like. XS struck me as
> potentially very good (OK - not really good), because it is so easy to
> write a parser for it. Then it occurred to me that Forth - a language
> that I have never written in, nor know much about - might actually fit
> the bill. Forth is stack-based, so it ought to be quite easy to write
> a parser for it compared to infix-based languages.

Forth parsing is essentially just: extract token, call the
corresponding function. But a Forth program has more structure than
that; the remaining parsing task is rolled into evaluation. Each
function knows how to interpret the stack, and since the stack comes
from reducing the previous tokens, you are essentially handling
grammar reductions as you evaluate. The postfix syntax lets you decide
what reduction to do by the rightmost symbol, which allows you to push
all the left material onto a stack, then decide how to treat it
according to the evaluation semantics of the reduction.
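
A toy sketch of that push-then-reduce evaluation in Lisp (the postfix
principle only, not Forth itself):

    (defun rpn-eval (tokens &optional stack)
      "Evaluate a list of postfix TOKENS: numbers push, functions reduce."
      (dolist (tok tokens (car stack))
        (if (numberp tok)
            (push tok stack)
            (let ((b (pop stack))
                  (a (pop stack)))
              (push (funcall tok a b) stack)))))

    (rpn-eval (list 1 2 3 #'* #'+))  ; => 7, i.e. 1 + 2 * 3

The rightmost symbol decides the reduction; everything to its left is
already sitting on the stack.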

>
> Consider the following equation:
>
> syntactic sugar + libraries = programming language
> Now, the "library" aspect of a programming language is not really
> inherent to the language itself.

Not so in Lisp; there is no clear-cut boundary between ``library'' and
``language''. You don't know which language features come from a macro
library and which are intrinsic.

> a fixed syntax at all? If you came up with a language whose syntax was
> extensible, then you would have what I believe would be the world's
> best programming language - let's call it PPL. PPL would consist of a
> bunch of libraries of two types: "function libraries" (which are the
> libraries that I have referred to in my definition of a programming
> language above), plus "syntax libraries" (an entirely new concept).

Given that Lisp has syntax libraries, and it's one of the oldest
programming languages, probably second only to Fortran, it's hard to
label this concept as ``new''.

> These syntax libraries would be capable of converting an infix program
> to a stack-based program (e.g. Forth). The stack-based program would
> be specifically designed to be simple to compile.

Lisp's macros translate arbitrary syntax to Lisp, which can be
compiled. Many of the ANSI standard language features are in fact
macros.
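
You can see this directly; for example:

    (macroexpand-1 '(dolist (x '(1 2 3)) (print x)))

returns, in a typical implementation, a DO or LET/TAGBODY form built
from more primitive operators (the exact expansion is
implementation-specific).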

> So I could then imagine the scenario where someone woke up one day and
> said: "hey, wouldn't it be grand if PPL supported
> object-orientation?". The programmer would then write a syntax library
> that implemented it, and distribute it like a normal library.

This is exactly what happened with Lisp; the Common Lisp Object System
started out as a macro library. Its constructs like DEFCLASS continue
to be implemented as macros. There are portable implementations of
CLOS like PCL (Portable CommonLOOPS). The CMUCL and GCL
implementations of Common Lisp have object systems based on the PCL
source code.

Because Lisp hackers were able to come up with a solid object system
without having to invent a new language, they were able to beat every
other OO language to the standardization punch. Common Lisp became the
first ANSI standard OO language.

The object system still beats out others in power. Dispatch on every
method argument, auxiliary methods (:before, :after and :around),
programmable method combinations, method parameters specializable to
integer values or identities of specific objects (such as symbols),
etc.
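
A small self-contained sketch of the first two:

    (defclass ship () ())
    (defclass asteroid () ())

    (defgeneric collide (a b)
      ;; dispatch on *both* argument classes
      (:method ((a ship) (b asteroid)) :boom)
      ;; auxiliary method, run before the primary one
      (:method :before ((a ship) b)
        (format t "a ship is involved~%")))

    (collide (make-instance 'ship) (make-instance 'asteroid))
    ;; prints "a ship is involved", returns :BOOM

Try adding that :before behaviour in a single-dispatch language
without editing the primary method.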

Kaz Kylheku

Apr 9, 2003, 4:58:59 PM
Bruce Hoult <br...@hoult.org> wrote in message news:<bruce-FBC037....@copper.ipg.tsnz.net>...

That's why the Lisp language gives you access to the macro-expander
via the MACROEXPAND function. If you call MACROEXPAND from the body of
a macro, you can even pass down the environment, so that this embedded
expansion is done in the same macroexpansion context as the macro that
is calling it.
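
A sketch (the &ENVIRONMENT parameter is standard; the macro itself is
only for illustration):

    (defmacro expand-here (form &environment env)
      "Print FORM's expansion as seen at this call site, then emit FORM."
      (print (macroexpand form env))
      form)

Passing ENV down means the expansion sees the same MACROLET and
SYMBOL-MACROLET definitions that surround the call site.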

The ANSI Lisp standard says that even if LOOP (or any other standard
macro) is implemented as a special form, the implementation must
provide a macro as well, so that the macroexpansion will work.

The real problem is that a useful code walker has to deal with
nonstandard, implementation-specific special forms. These extensions
may be written by the programmer directly, or may appear in the
expansions of macros provided by the implementation.

Anyway, Lisp has features which make it unnecessary to write a
full-blown code walker for most things that you want to do. If you
have a problem that you think requires a code walker, pose your
problem to comp.lang.lisp first! ;)

For example, there is no need to walk code in order to substitute code
into it, because you can instead establish MACROLET or SYMBOL-MACROLET
blocks around that code, and allow occurrences of these macros in the
code to perform the substitution. In other words, you let the
implementation's built-in code walker do the job of hunting down your
special symbols or expressions within the tree and do the
substitution.
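
For instance, the WITH-SLOTS idea can be sketched like this (WITH-X
and the slot name X are made up; the technique is the standard one):

    (defmacro with-x ((var object) &body body)
      "Make VAR read and write slot X of OBJECT throughout BODY."
      `(symbol-macrolet ((,var (slot-value ,object 'x)))
         ,@body))

Every occurrence of VAR inside BODY is rewritten into the SLOT-VALUE
form by the implementation's own walker; we never touch the body
ourselves.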

Supposedly, SYMBOL-MACROLET was added specifically to support the
WITH-SLOTS construct. It boiled down to a decision between
standardizing a code-walker feature (which WITH-SLOTS could use to
substitute the slot accesses in place of the symbolic references), or
standardizing symbol macros. Symbol macros were the simpler way to go.
ANSI Lisp could acquire a portable code walker feature one day. Some
Lisp implementations provide access to theirs.

This is one of those features that are truly hard for programmers to
develop in a portable way, like COMPILE or EVAL!

Tom Breton

Apr 9, 2003, 12:21:21 AM
Joachim Durchholz <joac...@gmx.de> writes:

> Tom Breton wrote:
>
> >
> > Unfortunately in other languages, there's a vicious circle where the
> > lack of trivial parsing has led to the mindset that tools are
> > extra/add-on/outside the realm of the language, which in turn devalues
> > trivial parsing.
>
> I don't think it's ease of parsing, it's ease of using the resulting
> parse tree.

OK, perhaps I should say, ease of transforming source into an AST.

> I wouldn't want to write a tree walker for a C++ parse tree, I'd have to
> consider too many constructs. In Lisp or Smalltalk, it's easy, because
> both have a simple, regular semantics. (Of course, a simple and regular
> semantics usually results in a simple, regular syntax.)

Also Prolog, which is just functor+arguments. To walk it, you just
have to recursively turn it into lists.

Andy Freeman

Apr 10, 2003, 12:05:43 AM
Bruce Hoult <br...@hoult.org> wrote in message news:<bruce-7FDFAA....@copper.ipg.tsnz.net>...

> Why is this such an important thing? You only have to write the parser
> *once*. After that anyone can use it (if you make it available as a
> library, and not just as a monolithic part of the compiler).

The "make a parser available" problem is arguably fixable, even though
said fix isn't generally available and people spend lots of time on related
hacks.

The more serious problem is that humans aren't all that competent at
writing and reading languages with precedence. That's unlikely to change.

Precedence languages are sort of like keyboards with keys in alphabetical
order. Both look like they'd be easy to use (no learning curve), but
the learning curve isn't the important problem.

> This is a mistake made by each of Lisp, Smalltalk and Forth, getting a
> syntactically extensible language at the expense of readability.

I haven't watched people write Smalltalk, but I have watched people
read/write Lisp, Forth, and the usual suspects. The lisp&forth folk
spend far less time finding/fixing syntax problems, even when the
other folks have IDEs that provide far more help than paren-matching.

I don't know what definition of "readability" you're using, but if C
scores higher than Lisp on it, it's measuring the wrong thing.

-andy

Raffael Cavallaro

Apr 10, 2003, 1:01:50 AM
"Michael Hobbs" <ho...@citilink.com> wrote in message news:<v931ole...@corp.supernews.com>...

> Furthermore, the
> .NET runtime is generic enough so that any "modern" language can be compiled
> to run in it. So while there may not exist "syntax libraries", there do
> exist syntax compilers that basically achieve the same thing.

This turns out not to be true. .NET is restrictive enough in its
semantics that certain language features of more powerful languages
cannot be expressed in the .NET runtime, short of writing full blown
interpreters/compilers for them, and/or suffering a huge performance
hit (~10x).

see: <http://rover.cs.northwestern.edu/~surana/blog/archives/000062.html>

>
> Granted, I've never actually used .NET, so I can't vouch for how accurate
> the above statements are. I'm simply repeating information that I've gleaned
> from magazine articles and other forms of Microsoft marketing.

Well MS would have us believe that .NET is the one runtime to rule
them all, but let's hope it ends up in the crack of doom where it
belongs.

Mark Carter

Apr 10, 2003, 5:52:12 AM
> OK, perhaps I should say, ease of transforming source into an AST.

Actually, maybe transforming source into an AST is easy (enough) in
any language. What you need is the BNF grammar for BNF itself - a
one-time investment which appears to be minor. Assuming that your
language has a parser generator, you can then create a parser that
parses grammars. When you have that, you then need the grammar for
whatever language you want to build syntax trees for. The grammar
parser parses the grammar for the target language. What you end up
with is a parser that builds a tree for the language you are
interested in.

Theoretically, the steps above should be relatively painless, because
language grammars are readily available. It's just a question of
compiling code.

In this way, you can make the language you write in (C, or whatever)
easily accommodate language parsing - provided it possesses a parser
generator. This may suit people better than Lisp, because Lisp has a
fixed syntax, whereas my suggestion would let you parse any language
for which you know the grammar.

You could also build cross-translators between languages. Some parts
of it will be easy, some parts very difficult. If-then constructs
ought to be easy to parse, because nearly every language provides
them. Object-orientation would be rather more difficult.

But it would be a boon for people who wanted to extend or prettify
existing languages. They could create a new language which essentially
spits out code of some existing, closely-matching language, and spend
most of their effort concentrating on the differences between the
languages. This of course creates an additional compilation step, but
might be considered worthwhile on account of the effort it saves.

Bruce Hoult

Apr 10, 2003, 7:01:19 AM
In article <8bbd9ac3.03040...@posting.google.com>,
ana...@earthlink.net (Andy Freeman) wrote:

> Bruce Hoult <br...@hoult.org> wrote in message
> news:<bruce-7FDFAA....@copper.ipg.tsnz.net>...
> > Why is this such an important thing? You only have to write the parser
> > *once*. After that anyone can use it (if you make it available as a
> > library, and not just as a monolithic part of the compiler).
>
> The "make a parser available" problem is arguably fixable, even though
> said fix isn't generally available and people spend lots of time on related
> hacks.

I don't know why language implementers don't generally make their
compiler's parser available to users :-(


> The more serious problem is that humans aren't all that competent writing
> and reading languages with precedence. That's unlikely to change.
>
> Precedence languages are sort of like keyboards with keys in alphabetical
> order. Both look like they'd be easy to use (no learning curve), but
> the learning curve isn't the important problem.

I agree that the learning curve isn't the important problem, but surely
it's Lisp that has the lower learning curve for syntax -- there almost
*isn't* any.


> I don't know what definition of "readability" you're using, but if C
> scores higher than Lisp on it, it's measuring the wrong thing.

Oh, I don't think *C* is readable. In fact it's atrocious, especially
when people try to use the preprocessor to make up for the lack of
language power.

-- Bruce

Bruce Hoult

Apr 10, 2003, 7:04:16 AM
In article <3E92A683...@gmx.de>,
Joachim Durchholz <joac...@gmx.de> wrote:

> Bruce Hoult wrote:
> > Joachim Durchholz <joac...@gmx.de> wrote:
> >>I wouldn't want to write a tree walker for a C++ parse tree, I'd have to
> >>consider too many constructs. In Lisp or Smalltalk, it's easy, because
> >>both have a simple, regular semantics.
> >
> > Not once you start using macros they don't.
>
> Oh, right. You'd need to either analyze the macros (not decidable, but
> you can probably get by with some pattern matching heuristics), or just
> keep them as black boxes. The latter is probably not a good idea if
> you're working on a contemporary Lisp - macros do too important things
> there.

Right. Probably one of the few big theoretical *advantages* of
pattern-based hygienic macro systems such as in Dylan and some Schemes.

-- Bruce

Joachim Durchholz

Apr 10, 2003, 9:37:20 AM
Mark Carter wrote:
> Theoretically, [writing a universal parser] should be relatively
> painless, because language grammars are readily available. It's just
> a question of compiling code.

Unfortunately, this is _not_ the case.
Creating a parser from a set of BNF rules is not that easy, at least if
you use yacc or bison: both require that the grammar is LALR(1).
Neither C nor C++ is LALR. Transforming their grammars into LALR
equivalents is major work, and you're never sure whether the
transformed grammar still defines the same language.
Then, the C and C++ grammars are ambiguous. There are constructs that
must be parsed differently, depending on whether the name that starts
the construct is a type name or some other name. (The classic case is
"a * b;", which declares b as a pointer if a names a type, and
multiplies a by b otherwise.)

With the advent of usable GLR parsers, these problems have lessened
somewhat: a GLR parser can handle arbitrary grammars. On the minus side,
it cannot detect whether the grammar is unambiguous, and if it is fed a
program that can be parsed in different ways, it will either fail (not
good) or return all possible parses (not very good either: in the case
of the type name / other name ambiguity, N ambiguities will result in
2^N parse trees).
So GLR frees you from rewriting the grammar for your parser generator,
but it does not free you from the mistakes that the language designers
made.

> You could also build cross-translators between languages. Some parts
> of it will be easy, some parts very difficult. if-then constructs
> ought to be easy to parse, because nearly every language provides
> them. Object-orientation would be rather more difficult.
>
> But it would be a boon for people who wanted to extend or prettify
> existing languages. They could create a new language which essentially
> spits out code of some existing, closely-matching language, and spend
> most of their effort concentrating on the differences between the
> langauges. This of course creates an additional compilation step, but
> might be considered worthwhile on account of the effort it saves.

This has been done and given up.
It turned out that the extensions of different people tended to be
highly incompatible.
Writing libraries was a better way to extend the power of a language.
(There were languages that had no idea of a subroutine, for example.)

Joachim Durchholz

Apr 10, 2003, 9:40:52 AM
Bruce Hoult wrote:
>
> I don't know why language implementers don't generally make their
> compiler's parser available to users :-(

I do.

The parser is usually the oldest and most hacked-up piece of software
that exists for a language. Publishing that code might be embarrassing.

Another reason is that making a parser general enough to be used in
more contexts than the compiler requires work. The first-time
implementer of a language has other priorities, so he'll end up with a
parser that's tailored for compilation work - publishing it isn't even
going to help much.

I once took a short look at gcc, and a somewhat longer look at what's
now called SmartEiffel. I would not *want* to use these parsers.

Andreas Koch

Apr 10, 2003, 3:16:13 PM
Mark Carter wrote:
> Forth is stack-based, so it ought to be quite easy to write
> a parser for it compared to infix-based languages.

Are you searching for a language that is perfect to code
or for a language that is perfect to code IN?

I think you can't have both at the same time.

> syntactic sugar + libraries = programming language
> Now, the "library" aspect of a programming language is not really
> inherent to the language itself.

Hey, I'd say syntax + libraries + editor + debugger = programming
language. Nothing but syntax is inherent to the language itself, but
it's what you (or at least I) will use in practice.

If I have a super great language, for which only one compiler exists
which only uses binary source files created by some edlin-like editor
and has no debugging facilities, I guess I would even prefer Visual
Basic :-)


> The chances are, though, that if you want to use some of
> that functionality, you have to port over code - which is a waste of
> time.

Yep.

> I like python because it has lists built into the syntax.

Well, I don't like python because it seems to use tabs as syntactic
elements - no language has to tell me the visual layout of my code :-)

> This then leads to the argument: why have a programming language with
> a fixed syntax at all? If you came up with a language whose syntax was
> extensible, then you would have what I believe would be the world's
> best programming language

Oh hell no.

I'd really hate to find out that some code I got from the web doesn't
work because that guy defined forloop(0,42) to count from 42 to 0
while I defined it to count from 0 to 42.

Or find code that only uses commands I have never seen, because the
guy invented different ones than I did.

Not even to speak of different versions of syntax libraries.

Custom syntax should be fine as long as ONLY YOU will ever see that
code. All other cases: brrrr


--
Andreas
He screamed: THIS IS SIG!

Tom Breton

Apr 10, 2003, 5:02:55 PM
carter...@ukmail.com (Mark Carter) writes:

> > OK, perhaps I should say, ease of transforming source into an AST.
>
> Actually, maybe transforming source into an AST is easy (enough) in
> any language. What you need is the BNF grammar for BNF itself - a
> one-time investment which appears to be minor. Assuming that your
> language has a parser generator for it, you can then create a parser
> that parses grammars . When you have that, you then need the grammar
> for whatever language it is that you want to build symbol trees. The
> grammar parser parses the grammar for the target language. What you
> end up with is a parser that builds a tree for the language you are
> interested in.

Have you taken a look at a C compiler? gcc is a good example. Look
at cp-parse.y or c-parse.in. There's far more going on in the grammar
than just a BNF skeleton.

And the complexity isn't just in lexing and parsing. There are many
support files for a front end, before it's reduced to trees so the
back end can work (the line between them is blurry, just so I don't
mislead you). This includes (for C) `stor-layout.c', `fold-const.c',
`tree.c', `tree.h', `tree.def', `c-common.c', `c-parse.in',
`c-decl.c', `c-typeck.c', `c-aux-info.c', `c-convert.c', `c-lang.c',
`c-lex.h', and `c-tree.h'.

And don't think they are small files. For instance, c-common.c. 4547
lines (4526 lines not counting the GPL notice). 145312 bytes.

And before you say these files are not required to transform source
into an AST, yes they are, unless you want it to be the walker's job
to manage type coercion and so forth. With much work, you could
probably split all of them into parsing and non-parsing functionality,
but it's far from trivial and you still end up with a great deal of
code.

Now consider Lisp's way of doing all of that: `r' `e' `a' `d'. 5
keystrokes, counting the parentheses, not counting telling it what
file (string, etc) to read from.

True, Lisp has the functionality built in, but before you cry unfair
comparison, just think of how much easier it was to build in, and why
it isn't done for syntax-heavy languages.



> Theoretically, the steps above should be relatively painless, because
> language grammars are readily available. It's just a question of
> compiling code.

Theoretically. In fact nobody's done it because it's far from
painless.

Bruce Hoult

Apr 10, 2003, 10:02:54 PM
In article <b74fjr$m8j$04$1...@news.t-online.com>,
Andreas Koch <ma...@kochandreas.com> wrote:

> > This then leads to the argument: why have a programming language with
> > a fixed syntax at all? If you came up with a language whose syntax was
> > extensible, then you would have what I believe would be the world's
> > best programming language
>
> Oh hell no.
>
> I'd really hate to find out that some code i got from the
> web doesn't work because that guy defined that
> forloop(0,42) will count from 42 to 0 while i defined it will
> count from 0 to 42.
>
> Or find code that only uses commands i have never seen, because
> the guy invented others than i did.
>
> Not even to speak of different versions of syntax libraries.
>
> Custom syntax should be fine as long as ONLY YOU will ever
> see that code. All other cases : brrrr

You must really hate all the languages with functions/procedures then.

-- Bruce

Andy Freeman

Apr 11, 2003, 1:00:54 AM
Bruce Hoult <br...@hoult.org> wrote in message news:<bruce-443730....@copper.ipg.tsnz.net>...

> In article <8bbd9ac3.03040...@posting.google.com>,
> ana...@earthlink.net (Andy Freeman) wrote:
>
> > Bruce Hoult <br...@hoult.org> wrote in message
> > news:<bruce-7FDFAA....@copper.ipg.tsnz.net>...
> > > Why is this such an important thing? You only have to write the parser
> > > *once*. After that anyone can use it (if you make it available as a
> > > library, and not just as a monolithic part of the compiler).
> >
> > The "make a parser available" problem is arguably fixable, even though
> > said fix isn't generally available and people spend lots of time on related
> > hacks.
>
> I don't know why language implementators don't generally make their
> compiler's parser available to users :-(

Since that parser (for non-lisp languages) won't help me parse the vast
majority of the languages that I might care about (lisp or non-lisp), I
can see why they wouldn't bother.

But - I'll bite. How often do you need to parse a language that is
close enough to the (non-lisp) programming language that you're using
that having the parser available makes a difference? (Note that if you're
just compiling said language, you don't need the parser.)

> > The more serious problem is that humans aren't all that competent writing
> > and reading languages with precedence. That's unlikely to change.
> >
> > Precedence languages are sort of like keyboards with keys in alphabetical
> > order. Both look like they'd be easy to use (no learning curve), but
> > the learning curve isn't the important problem.
>
> I agree that the learning curve isn't the important problem, but surely
> it's Lisp that has the lower learning curve for syntax -- there almost
> *isn't* any.

For someone starting from scratch, yes, but the argument for precedence
languages is that they're familiar because similar notations are used
elsewhere, such as math textbooks. (My analogy depends on the fact that
it used to be that most folks knew alphabetical order before they used
their first keyboard.) Lisp notation isn't used elsewhere, so it has
to be learned.

My point is that the seeming headstart doesn't help. Precedence languages
are a mistake whenever humans are involved, and that has nothing to do
with the availability of parsers. People think that they can use precedence
languages without learning anything, but they're wrong. Still, they
compare that "no learning" with the small bit of learning req'd for
lisp. They don't seem to notice that precedence languages require a
lot of learning and comprehension effort. That's why folks spend so
much more time with syntax errors in non-lisp languages.

> > I don't know what definition of "readability" you're using, but if C
> > scores higher than Lisp on it, it's measuring the wrong thing.
>
> Oh, I don't think *C* is readable.

My claim applies to all precedence languages....

-andy

Kenny Tilton

Apr 11, 2003, 1:10:16 AM

Andy Freeman wrote:
> Bruce Hoult <br...@hoult.org> wrote in message news:<bruce-443730....@copper.ipg.tsnz.net>...
>
>>In article <8bbd9ac3.03040...@posting.google.com>,
>> ana...@earthlink.net (Andy Freeman) wrote:
>>
>>
>>>Bruce Hoult <br...@hoult.org> wrote in message
>>>news:<bruce-7FDFAA....@copper.ipg.tsnz.net>...
>>>
>>>>Why is this such an important thing? You only have to write the parser
>>>>*once*. After that anyone can use it (if you make it available as a
>>>>library, and not just as a monolithic part of the compiler).
>>>
>>>The "make a parser available" problem is arguably fixable, even though
>>>said fix isn't generally available and people spend lots of time on related
>>>hacks.
>>
>>I don't know why language implementers don't generally make their
>>compiler's parser available to users :-(
>
>
> Since that parser (for non-lisp languages) won't help me parse the vast
> majority of the languages that I might care about (lisp or non-lisp), I
> can see why they wouldn't bother.
>
> But - I'll bite. How often do you need to parse a language that is
> close enough to the (non-lisp) programming language that you're using
> that having the parser available makes a difference? (Note that if you're
> just compiling said language, you don't need the parser.)

Lispniks like to get at code programmatically.

--

kenny tilton
clinisys, inc
http://www.tilton-technology.com/
---------------------------------------------------------------
"Everything is a cell." -- Alan Kay

Daniel Barlow

Apr 11, 2003, 7:50:04 AM
ana...@earthlink.net (Andy Freeman) writes:

> But - I'll bite. How often do you need to parse a language that is
> close enough to the (non-lisp) programming language that you're using
> that having the parser available makes a difference? (Note that if you're
> just compiling said language, you don't need the parser.)

Whenever I want an editor that knows how to indent, colourize, or
recognize keywords or syntactic blocks in that language. More
generally, writing pretty-printers (e.g. to convert program source to
neatly marked up LaTeX or HTML).

Also, in the specific case of C on unix, it would be very convenient
to be able to parse header files and automatically transform them into
FFI definitions for some non-C language. Anyone using Linux will be
aware that the glibc include maze is full of pits, spikes, traps, and
carefully half-rotted floorboards - you just can't send anything other
than the True Gcc in there unless it has volunteered for the mission
and understands it probably won't make it out alive.

(Yes, I accept that this won't get me an idiomatic library interface
for whatever non-C language I had in mind. But if it gets me out of
having to hand-transcribe the values of O_EXCL and SIGBUS and the
offset and size of stat.st_mtime, it would save an awful lot of time
anyway.)
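
For concreteness, this is the sort of throwaway probe that gets
hand-written today - a sketch only, using exactly the names listed
above:

/* probe.c - print the values that would otherwise be transcribed
 * by hand into an FFI definition.
 * Build and run once per platform: cc probe.c -o probe && ./probe */
#include <stdio.h>
#include <stddef.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>

int main(void)
{
    printf("O_EXCL = %d\n", O_EXCL);
    printf("SIGBUS = %d\n", SIGBUS);
    printf("stat.st_mtime: offset %lu, size %lu\n",
           (unsigned long) offsetof(struct stat, st_mtime),
           (unsigned long) sizeof(((struct stat *) 0)->st_mtime));
    return 0;
}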


-dan

--

http://www.cliki.net/ - Link farm for free CL-on-Unix resources

Alan Shutko

unread,
Apr 11, 2003, 11:07:59 AM4/11/03
to
Daniel Barlow <d...@telent.net> writes:

> Whenever I want an editor that knows how to indent, colourize, or
> recognize keywords or syntactic blocks in that language.

Also, for things like "intellisense" (completing methods allowed on
an object), generating documentation templates from function
prototypes, lookup of variable definitions following correct
scoping, code-assisted refactoring, class browsers... the list goes
on and on.

All this is so useful that Eric Ludlam started a separate
parser-generator project for Emacs, so that one could write a full
parser for these kinds of things. It would be much less work if one
could just hook into the compiler's parser, which is how Visual
Studio does it (from what I've been told).

--
Alan Shutko <a...@acm.org> - I am the rocks.
Looking for a developer in St. Louis? http://web.springies.com/~ats/
Insert New Disk for Drive C: Press ENTER when ready.

Andreas Koch

unread,
Apr 11, 2003, 12:57:53 PM4/11/03
to
Bruce Hoult wrote:

> You must really hate all the language with functions/procedures then.

Well, no. They have a fixed SYNTAX and all those functions tend
to be part of either the project itself or some really common libs.

No one is going to re-define the for-loop in C or Pascal.

Florian Weimer

unread,
Apr 11, 2003, 3:41:34 PM4/11/03
to
Daniel Barlow <d...@telent.net> writes:

> (Yes, I accept that this won't get me an idiomatic library interface
> for whatever non-C language i had in mind. But if it gets me out of
> having to hand-transcribe the values of O_EXCL and SIGBUS and the
> offset and size of stat.st_mtime, it would save an awful lot of time
> anyway)

I've used autoconf successfully to extract constants, struct sizes and
offsets. No real need to reinvent the wheel (and aiming for GCC
compatibility is futile anyway).

Bruce Hoult

unread,
Apr 11, 2003, 9:33:35 PM4/11/03
to
In article <b76rsd$c29$06$2...@news.t-online.com>,
Andreas Koch <ma...@kochandreas.com> wrote:

> Bruce Hoult wrote:
>
> > You must really hate all the language with functions/procedures then.
>
> Well, no. They have a fixed SYNTAX and all those functions tend
> to be part of either the project itself or some really common libs.

And the same goes for new syntax.


> No one is going to re-define the for-loop in C or Pascal.

And only the lunatics are going to do it in Common Lisp or Dylan or
Forth.

There is *no* programming language feature that cannot be misused. Good
taste is always essential.

-- Bruce

Will Hartung

unread,
Apr 11, 2003, 10:25:59 PM4/11/03
to

"Alan Shutko" <a...@acm.org> wrote in message
news:87d6jt2...@wesley.springies.com...

> Daniel Barlow <d...@telent.net> writes:
>
> > Whenever I want an editor that knows how to indent, colourize, or
> > recognize keywords or syntactic blocks in that language.
>
> Also, for things like "intellisense" (completing methods allowed on
> an object), generating documentation templates from function
> prototypes, lookup of variable definitions following correct
> scoping, code-assisted refactoring, class browsers... the list goes
> on and on.

So, the language standard (for said mythical language) needs this
capability, built-in, so that folks can write a snazzy editor faster? Is
that what I'm hearing?

"I'd like to use Ada for my Autonomous RDBMS backed Learning Expert System
Internet Shopping Agent, but it can't pretty print it's source code by
itself very easily".

Yeah, Ok.

I know the IDE is the primary deciding factor behind all of my language
decisions.

Regards,

Will Hartung
(wi...@msoft.com)


Andreas Bogk

unread,
Apr 12, 2003, 9:47:51 AM4/12/03
to
Daniel Barlow <d...@telent.net> writes:

> Also, in the specific case of C on unix, it would be very convenient
> to be able to parse header files and automatically transform them into
> FFI definitions for some non-C language. Anyone using Linux will be
> aware that the glibc include maze is full of pits, spikes, traps, and
> carefully half-rotted floorboards - you just can't send anything other
> than the True Gcc in there unless it has volunteered for the mission
> and understands it probably won't make it out alive.

Aye. I particularly like the fact that Linux (or rather, the glibc)
doesn't even come with a complete set of headers, some are provided by
gcc, and you have to know that you'll find them in

$ gcc --print-file-name=include
/usr/lib/gcc-lib/i386-linux/3.2.3/include

rather than /usr/include, where they are on more decent systems.

In the Gwydion Dylan project, we have a tool called melange that
generates C-FFI from header files. After 5 years of intense training,
it manages to get out alive with most headers. We've been using it to
wrap things like the OpenGL headers, an improved version of the parser
is being used for wrapping GTK.

Andreas

--
"In my eyes it is never a crime to steal knowledge. It is a good
theft. The pirate of knowledge is a good pirate."
(Michel Serres)

Alan Shutko

unread,
Apr 12, 2003, 11:23:12 AM4/12/03
to
"Will Hartung" <wi...@msoft.com> writes:

> So, the language standard (for said mythical language) needs this
> capability, built-in, so that folks can write a snazzy editor faster? Is
> that what I'm hearing?

No.

Bruce Hoult <br...@hoult.org> wrote in message news:<bruce-443730....@copper.ipg.tsnz.net>...

> I don't know why language implementors don't generally make their
> compiler's parser available to users :-(

It would be convenient if compiler vendors were to expose their
parser. Nobody said anything about requiring it in the language
standard. Just talking about one simple thing which would make
several real life problems easier.

--
Alan Shutko <a...@acm.org> - I am the rocks.
Looking for a developer in St. Louis? http://web.springies.com/~ats/

PRIDE OF CHANUR: A great read if you don't mind the pits.

Florian Weimer

unread,
Apr 12, 2003, 11:55:25 AM4/12/03
to
"Will Hartung" <wi...@msoft.com> writes:

> "I'd like to use Ada for my Autonomous RDBMS backed Learning Expert System
> Internet Shopping Agent, but it can't pretty print it's source code by
> itself very easily".

Actually, Ada is kind of a bad example because there *is* a
well-defined introspection facility. 8-)

Joachim Durchholz

unread,
Apr 13, 2003, 5:27:39 PM4/13/03
to
Daniel Barlow wrote:
>
> Also, in the specific case of C on unix, it would be very convenient
> to be able to parse header files and automatically transform them into
> FFI definitions for some non-C language. Anyone using Linux will be
> aware that the glibc include maze is full of pits, spikes, traps, and
> carefully half-rotted floorboards - you just can't send anything other
> than the True Gcc in there unless it has volunteered for the mission
> and understands it probably won't make it out alive.

Then send in GCC.

More specifically: send in GPP, and let it return the definitions.
Usually, you want to wrap some specific part of the whole API, i.e. you
have a file foo.h that says
#define SOME_CONFIGURATION_VALUE
#include <lots-of-cruft>
#define BLAH ...
typedef ... X

Write a small "detector" file that has
#include <foo.h>
blah: BLAH
x: X
send this to gpp, then look for the blah: and x: lines and see what
gpp made of them.
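
For a concrete run of the same idea (fcntl.h standing in for foo.h;
file and label names are made up):

/* detect.c - preprocess only, don't compile:
 *     gcc -E detect.c | grep '^o_'          */
#include <fcntl.h>
o_excl: O_EXCL
o_creat: O_CREAT

On a glibc system the grep comes back with something like
"o_excl: 0200", so the tool reading the output still has to
evaluate C integer literals - it just never has to parse
declarations.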

Doesn't seem to be very difficult.
Or did I overlook something here?

Daniel Barlow

unread,
Apr 13, 2003, 7:12:00 PM4/13/03
to
Florian Weimer <f...@deneb.enyo.de> writes:

> I've used autoconf successfully to extract constants, struct sizes and
> offsets. No real need to reinvent the wheel (and aiming for GCC
> compatibility is futile anyway).

Yes, but (unless autoconf is a lot more featureful than it was last
time I looked at it) you do this by means of writing small programs
and giving them to the C compiler. If the parser were available
separately, that would be notably more efficient (and rather
easier to do in a cross-compile).

Daniel Barlow

unread,
Apr 13, 2003, 7:27:49 PM4/13/03
to
Joachim Durchholz <joac...@gmx.de> writes:

> Write a small "detector" file that has
> #include <foo.h>
> blah: BLAH
> x: X
> send this to gpp, then look for the blah: and x: headers and look what
> gpp made of them.

This is insufficient for sizeof, offsetof, or to get the values of
enums - you need to run the compiler all the way through to pick these
up. I do this already, and frankly it's pretty kludgey.
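
To see why, extend the detector file with a non-macro (the extra
lines are made up for illustration):

/* still fed through "gcc -E" only */
#include <sys/stat.h>
size: sizeof(struct stat)

The preprocessor hands the sizeof expression back verbatim - sizeof
isn't a macro, and enum members never even reach the preprocessor -
so you end up generating, compiling and running a real program just
to print one number.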

Jacek Generowicz

unread,
Apr 14, 2003, 7:20:53 AM4/14/03
to
Joachim Durchholz <joac...@gmx.de> writes:

> Daniel Barlow wrote:
> > Also, in the specific case of C on unix, it would be very convenient
> > to be able to parse header files and automatically transform them into
> > FFI definitions for some non-C language. Anyone using Linux will be
> > aware that the glibc include maze is full of pits, spikes, traps, and
> > carefully half-rotted floorboards - you just can't send anything other
> > than the True Gcc in there unless it has volunteered for the mission
> > and understands it probably won't make it out alive.
>
> Then send in GCC.
>
> More specifically: send in GPP,

I would have thought that this is a job for GCC-XML:

http://www.gccxml.org

... then use the full power of Lisp (or whatever happens to be at
hand) to finish off the job.

Marco van de Voort

unread,
Apr 20, 2003, 9:29:11 AM4/20/03
to
In article <XOKla.490$cn3...@newssvr16.news.prodigy.com>, Will Hartung wrote:
>
> "Alan Shutko" <a...@acm.org> wrote in message
> news:87d6jt2...@wesley.springies.com...
>> Daniel Barlow <d...@telent.net> writes:
>>
>> > Whenever I want an editor that knows how to indent, colourize, or
>> > recognize keywords or syntactic blocks in that language.
>>
>> Also, for things like "intellisense" (completing methods allowed on
>> an object), generating documentation templates from function
>> prototypes, lookup of variable definitions following correct
>> scoping, code-assisted refactoring, class browsers... the list goes
>> on and on.
>
> So, the language standard (for said mythical language) needs this
> capability, built-in, so that folks can write a snazzy editor faster? Is
> that what I'm hearing?

Yes, maybe it would be wise when designing a new language to allow for
incremental parsing / parsing code with errors in it.

I myself use Delphi and JBuilder, and I noticed that JBuilder's
autocomplete often stalls (doesn't list methods) when I make a very
simple error in the code above (like forgetting a semicolon), while
Delphi's doesn't.

Such things could be related to the language's parsability (Pascal is
easily parsable, and though something like Delphi is afaik not really
LL(1) anymore, it's still easy to parse).



> "I'd like to use Ada for my Autonomous RDBMS backed Learning Expert System
> Internet Shopping Agent, but it can't pretty print it's source code by
> itself very easily".

While this is not realistic (it is a tool problem, not a language
problem), I can imagine people who have to do source transforms and
write their own tools avoiding languages that are large and complex
to parse.

Marco van de Voort

unread,
Apr 20, 2003, 9:35:59 AM4/20/03
to
In article <87y92f2...@meo-dipt.andreas.org>, Andreas Bogk wrote:
> Daniel Barlow <d...@telent.net> writes:
>
>> Also, in the specific case of C on unix, it would be very convenient
>> to be able to parse header files and automatically transform them into
>> FFI definitions for some non-C language. Anyone using Linux will be
>> aware that the glibc include maze is full of pits, spikes, traps, and
>> carefully half-rotted floorboards - you just can't send anything other
>> than the True Gcc in there unless it has volunteered for the mission
>> and understands it probably won't make it out alive.
>
> Aye. I particularly like the fact that Linux (or rather, the glibc)
> doesn't even come with a complete set of headers, some are provided by
> gcc, and you have to know that you'll find them in
>
> $ gcc --print-file-name=include
> /usr/lib/gcc-lib/i386-linux/3.2.3/include

Quite annoying, yes. Luckily FreeBSD doesn't do this.



> rather than /usr/include, where they are on more decent systems.
>
> In the Gwydion Dylan project, we have a tool called melange that does
> generate C-FFI from header files. After 5 years if intense training,
> it manages to get out alive with most headers. We've been using it to
> wrap things like the OpenGL headers, an improved version of the parser
> is being used for wrapping GTK.

We have similar problems with *nix headers for our Pascal compiler.
There is a header tool, but it works only on moderately clean headers.
Also, a lot of APIs are quite C-oriented (like e.g. ioctl definitions).

I heard Qt has some Perl tool that generates the C headers from a
master format, and this tool can be adapted to generate headers for
other languages/compilers.

Seems like a nice system in principle (if one can unify the projects to
standardise on one or a few such converter tools).

Teemu Kalvas

unread,
Apr 21, 2003, 3:38:03 AM4/21/03
to
Daniel Barlow <d...@telent.net> writes:

> Florian Weimer <f...@deneb.enyo.de> writes:
>
> > I've used autoconf successfully to extract constants, struct sizes and
> > offsets. No real need to reinvent the wheel (and aiming for GCC
> > compatibility is futile anyway).
>
> Yes, but (unless autoconf is a lot more featureful than it was last
> time I looked at it) you do this by means of writing small programs
> and giving them to the C compiler. If the parser were available
> separately, that would be notably less inefficient (and be rather
> easier to do in a cross-compile)

There's an autoconf macro[1] in the GNU autoconf macro archive[2]
which solves the cross compiling issue. Basically, what it does is
this: you have a constant expression in C, and want to know the value
without running the generated code. Obviously you can do this by
surrounding the value with some magic constant data and then grepping
the generated executable file. This sounds horrible, but in practice
it works quite well.

Of course there's no need to use autoconf for this at all; the
mechanism is the magic, not the tool.
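
A sketch of such a probe (marker name and byte order invented for
the example; this is not the macro's actual code):

/* magic.c - compile only (cc -c magic.c), never link or run.
 * Scan magic.o for the string "MaGiC": the four bytes following
 * it hold the value of O_EXCL, most significant byte first. */
#include <fcntl.h>

const unsigned char probe[] = {
    'M', 'a', 'G', 'i', 'C',
    (O_EXCL >> 24) & 0xff,
    (O_EXCL >> 16) & 0xff,
    (O_EXCL >>  8) & 0xff,
     O_EXCL        & 0xff
};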

[1] compile_value,
http://www.gnu.org/software/ac-archive/htmldoc/compile_value.html
[2] http://www.gnu.org/software/ac-archive/

--
Teemu Kalvas

Kalle Olavi Niemitalo

unread,
Apr 21, 2003, 5:54:28 AM4/21/03
to
Teemu Kalvas <ch...@s2.org> writes:

> Basically, what it does is this: you have a constant expression
> in C, and want to know the value without running the generated
> code. Obviously you can do this by surrounding the value by
> some magic constant data and then grepping the generated
> executable file.

Current versions of Autoconf can also evaluate a constant Boolean
expression by compiling a test program and checking the exit code
of the compiler. This lets them deduce the value of a constant
integer expression with a binary search when cross-compiling.
_AC_COMPUTE_INT_COMPILE is the essential macro.
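
The probe underneath is the familiar negative-array-size trick; one
step of the search looks roughly like this (a sketch, not autoconf's
literal code):

/* step.c - compiles iff O_EXCL <= 128.  The driver halves the
 * interval on every compilation, one compile per step, until the
 * value is pinned down exactly - nothing ever has to execute. */
#include <fcntl.h>
static int probe[(O_EXCL <= 128) ? 1 : -1];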
