Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

Program compression

163 views

Skip to first unread message

Your Name

unread,

Jun 9, 2008, 12:55:02 PM6/9/08

I've learned and used quite a few programming languages, accross just
about all of the "paradigms." I'm sure there are many in this group who
have learned and use more.

I'd like to share what I've found common to all of them. First, there
is some syntax that must be learned. Then there is a period of becoming
familiar with syntax and sementics, as one learns a crucial thing: "how
to get the job done" with this language. Certainly, too many language
learners reach this point and plateau in their learning progress.

But any serious programmer of a given language will undoubtedly take the
next step, which is the one that I find the most interesting. Only
after about 25+ years of programming did this perticular process, common
to (probably) all languages, become clear.

The process, for lack of a better term, is "compression." The first
form of this that most of us programmers encounter is subroutines.

Instead of this:

do_something_1 with X
do_something_2 with X
do_something_3 with X

do_something_1 with Y
do_something_2 with Y
do_something_3 with Y

we compress to this:

do_some_things(argument)
{
do_something_1 with argument
do_something_2 with argument
do_something_3 with argument
}

do_some_things with X
do_some_things with Y

This refinement makes obvious sense in many ways, including
maintainability, readability, code space reduction, etc... But I
realize now that this step is the real essence of programming. In every
useful language I can think of, this compression is really the central
feature.

The whole concept of "object oriented" programming is nothing more than
this. Code and data common to various objects are moved to a "parent"
or "base" class. Objects can either derive from other objects ("X is a
Y"), or contain other objects ("X has a Y").

Even scripting languages like HTML have incorporated this concept. When
this inevitably became too cumbersome:

<font family="Arial" color="#000000" size="3">Hello</font>
<font family="Arial" color="#000000" size="3">World</font>

the common elements were separated:

<h2>Hello</h2>
<h2>World</h2>

I would wager that most programmers, especially in the beginner to
intermediate realm, don't really understand why this type of design is
desireable, but just find that it feels right. Maybe the short term
payoff of simply having to type less is the incentive.

But the reason is deeper. A very simple algorithm for compressing data
is run-length-encoding. The following data, possibly part of a
bitmapped image:

05 05 05 05 05 05 05 02 02 02 17 17 17 17 17

Can be run-length-encoded to:

07 05 03 02 05 17

The reward, at first, is just a smaller file. But at a deeper level,
the second version could be considered "better," in that it is more than
just a mindless sequence of bytes. Some meaning is now attached to the
content. "Seven fives, followed by three twos, followed by five
seventeens" is much less mind numbing than "five, five, five, five,
five, five, five, two, two..."

It has been argued that compression is actually equivalent to
intelligence. This makes sense at a surface level. Instead of solving
a problem with a long sequence of repetitious actions, understanding the
problem allows us to break it into more manageable pieces. The better
our understanding, the more compression we can achieve, and the more
likely our resulting algorithm will be suited to solving similar
problems in the future.

This was quite a revelation for me, and it shed much light on writing
"good" code. It also made clear why I find some languages much more
useful than others. The more power a language gives me to compress my
algorithm -- both code and data, as well as in space and executuion time
-- the more I like it. The true measure of this is not the number of
bytes required by the source code, although this surely has some
correlation.

This has given me a great deal of direction in thinking about creating
languages.

Richard Heathfield

unread,

Jun 9, 2008, 2:46:12 PM6/9/08

Your Name said:

<snip>

> But any serious programmer of a given language will undoubtedly take the
> next step, which is the one that I find the most interesting. Only
> after about 25+ years of programming did this perticular process, common
> to (probably) all languages, become clear.
>
> The process, for lack of a better term, is "compression."

Or abstraction, or functional decomposition...

> The first
> form of this that most of us programmers encounter is subroutines.
>
> Instead of this:
>
>
> do_something_1 with X
> do_something_2 with X
> do_something_3 with X
>
> do_something_1 with Y
> do_something_2 with Y
> do_something_3 with Y
>
>
> we compress to this:
>
>
> do_some_things(argument)
> {
> do_something_1 with argument
> do_something_2 with argument
> do_something_3 with argument
> }
>
> do_some_things with X
> do_some_things with Y

No, we compress to this:

do_some_things(obj, min, max)
{
while(min <= max)
{
do_something with min++, obj
}
}

do_some_things with X, 1, 3
do_some_things with Y, 1, 3

or even:

object = { X, Y }
for foo = each object
{
do_some_things with foo, 1, 3
}

> Maybe the short term
> payoff of simply having to type less is the incentive.

Don't forget elegance. It's impossible to define, but a good programmer
knows it when he/she/it sees it.

> But the reason is deeper. A very simple algorithm for compressing data
> is run-length-encoding. The following data, possibly part of a
> bitmapped image:
>
> 05 05 05 05 05 05 05 02 02 02 17 17 17 17 17
>
> Can be run-length-encoded to:
>
> 07 05 03 02 05 17
>
> The reward, at first, is just a smaller file. But at a deeper level,
> the second version could be considered "better," in that it is more than
> just a mindless sequence of bytes.

There are better algorithms than RLE. :-)

> It has been argued that compression is actually equivalent to
> intelligence. This makes sense at a surface level. Instead of solving
> a problem with a long sequence of repetitious actions, understanding the
> problem allows us to break it into more manageable pieces. The better
> our understanding, the more compression we can achieve, and the more
> likely our resulting algorithm will be suited to solving similar
> problems in the future.

This is sometimes expressed in the form "theories destroy facts". If you
know the equation, you don't need the data!

> This was quite a revelation for me, and it shed much light on writing
> "good" code. It also made clear why I find some languages much more
> useful than others. The more power a language gives me to compress my
> algorithm -- both code and data, as well as in space and executuion time
> -- the more I like it. The true measure of this is not the number of
> bytes required by the source code, although this surely has some
> correlation.

Expressive power matters a lot, and you are right to highlight its
importance.

> This has given me a great deal of direction in thinking about creating
> languages.

Incidentally, it has also given this newsgroup the possibility of entering
into a worthwhile discussion that isn't based on a lame newbie question.
Nice one.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Robert Maas, http://tinyurl.com/uh3t

unread,

Jun 9, 2008, 4:20:47 PM6/9/08

> From: Your Name <n...@none.none>
(when learning a new programming language, your very first, or yet
another different from what you previously knew/used:)

> First, there is some syntax that must be learned.
> Then there is a period of becoming familiar with syntax and
> sementics, as one learns a crucial thing: "how to get the job
> done" with this language.

I agree completely. In fact these are the start of my course
outline for teaching computer programming to absolute beginners:
<http://www.rawbw.com/~rem/HelloPlus/hellos.html#s4outl>
Notice how lesson 1 is just the syntax used to express values, and
then lesson 2 starts the semantics of using some of those values as
programs (i.e. passing them to EVAL). So lesson 1 uses with a
read-describe-print loop, then lesson 2 adds a call to EVAL in the
middle.

(What happens next:)

> The process, for lack of a better term, is "compression."

This is a subset of "refactoring". In fact I strongly advise that
beginners (and even experts much of the time) write single lines of
code and get each single line of code working before moving onto
the next line of code, and only after getting the "business logic"
(the data transformations) working for some function *then*
refactor those lines of code into a self-contained function
definition. I suppose "agile programming" is the most common
buzzword expressing something like this methodology. Instead of
treating refactoring as a pain to be avoided by correct design in
the first place, treat refactoring as a dominant part of the
software development process. You *always* write easily debuggable
code, and then *after* you have it working you *always* refactor it
to be better for long-term use, with the tradeoff that it's now
more difficult to debug, but since debugging at such a low level of
code has been mostly finished this isn't a problem.

Notice how my lesson plan, after the really basic semantics of individual
lines of code, proceeds to teach refactoring in several different ways:
* Lesson 3: Putting micro-programs in sequence to do multi-step D/P
(data processing), and building such sequences into named
functions
* Lesson 4: Refactoring syntax: Getting rid of most uses of GO in
PROG.
* Lesson 5: Refactoring algorithms: Devising data structures that
make D/P much more efficient than brute-force processing of flat
data sequences.
After that point, my beginning lessons don't discuss additional
ways of refactoring, but I think we would agree that further
refactoring is *sometimes* beneficial:
- Using OOP.
- Using macros to allow variations on the usual syntax for calling functions.
- Defining a whole new syntax for specialized problem domains,
including a parser for such syntax.
Almost any software project can benefit from bundling lines of code
into named functions (and sometimes anonymous functions), and also
refactoring syntax and algorithms/dataStructures. But whether a
particular project can benefit from those three additional
refactorings depends on the project. Perhaps my course outline for
an absolute beginner's course in how to write software is
sufficient, and OOP/macros/newSyntaxParsers should be a second
course for people who have gotten at least several months
experience putting the lessons of the first course to practice? If
you like my absolute-beginner's course outline, would you be
willing to work with me to develop a similarily organized outline
for a second course that covers all the topics in your fine essay
and my three bullet points just above?

> The more power a language gives me to compress my algorithm --
> both code and data, as well as in space and executuion time -- the
> more I like it.

Hopefully you accept that Lisp (specifically Common Lisp) is the
best language in this sense? Common Lisp supports, in a
user/applicationProgrammer-friendly way, the complete process of
(agile) programming from immediately writing and testing single
lines of code all the way through all the refactorings needed to
achieve an optimal software project. Java with BeanShell is a
distant second, because the semantics of individual statements
given to BeanShell are significantly different from the semantics
of exactly the (syntactically) same statements when they appear in
proper method definitions which appear within a proper class
definition which has been compiled and then loaded.

> This has given me a great deal of direction in thinking about
> creating languages.

There's no need to create another general-purpose programming
language. Common Lisp already exists and works just fine. All you
might need to do is create domain-specific languages, either as
mere sets of macros within the general s-expression syntax, or as
explicitly parsed new syntaxes feeding into Lisp, or as GUI-based
no-syntax pure-semantics editors generating tables that feed into Lisp.

Nick Keighley

unread,

Jun 10, 2008, 6:22:40 AM6/10/08

On 9 Jun, 17:55, Your Name <n...@none.none> wrote:

<snip>

independent of language.

> The process, for lack of a better term, is "compression." The first
> form of this that most of us programmers encounter is subroutines.

This is the "Refactor Mercilessly" of the Agile crowd.

> This refinement makes obvious sense in many ways, including
> maintainability, readability, code space reduction, etc... But I
> realize now that this step is the real essence of programming. In every
> useful language I can think of, this compression is really the central
> feature.
>
> The whole concept of "object oriented" programming is nothing more than
> this. Code and data common to various objects are moved to a "parent"
> or "base" class. Objects can either derive from other objects ("X is a
> Y"), or contain other objects ("X has a Y").

now here I disagree. Thinking about OO like this tends at best to lead
to really deep inheritance trees and at worst LSP violations. "yes I
know
a Widget isn't really a ClockWorkEngine but I used inheritance to
share
the code".

OO is about identifying abstractions. Look up the Open Closed
Principle
for a better motivator for OO.

Read The Patterns Book.

<snip>

--
Nick Keighley

Programming should never be boring, because anything
mundane and repetitive should be done by the computer.
~Alan Turing

I'd rather write programs to write programs than write programs

Richard Heathfield

unread,

Jun 10, 2008, 6:55:57 AM6/10/08

Nick Keighley said:

<snip>

>
> OO is about identifying abstractions. Look up the Open Closed
> Principle
> for a better motivator for OO.
>
> Read The Patterns Book.

That's unconstitutional in the USA, because it's cruel and unusual
punishment. It remains legal in the UK, though.

Your Name

unread,

Jun 10, 2008, 2:38:20 PM6/10/08

Nick Keighley <nick_keigh...@hotmail.com> wrote:

> now here I disagree. Thinking about OO like this tends at best to lead
> to really deep inheritance trees and at worst LSP violations. "yes I
> know
> a Widget isn't really a ClockWorkEngine but I used inheritance to
> share
> the code".

At the risk of a religious debate, I have to respond that I don't really
find OO very useful. One reason for this is probably exactly what you
just said. I see that tendency as a failure of OO.

Initially, it is nice to be able to say "a square is a shape" and "a
shape has an area." But the more complex the project, the more
inheritance and information hiding become nothing more than a burden, in
my experience.

They may be necessary evils in some environments. The open/closed
principle reduces modifications to the "base" set of code -- whether
this is a set of classes, a library of functions, or something else. It
encourages code to become set in stone. Unfortunately, especially on a
large project, I find that no matter how we try, it is impossible to
foresee the entire scope of what needs to be designed. So the design is
constantly changing, and the idea of some set of code that we write in
the early stages surviving without needing vast rewrites for both
functionality and efficiency is delusional.

Our current project is enormous. It involves half a dozen separate
applications, running on completely different platforms (some x86, some
embedded platforms, some purely virtual). The code ranges from very
high-level jobs, like 3D rendering, all the way down to nuts-and-bolts
tasks like device drivers. It even includes developing an operating
system on the embedded side.

I can honestly say that OO has really added nothing at any stage of that
chain, from the lowest level to the highest. We use C++ for two
reasons:

1) The development tools and libraries are the most mature for C++ and
this is essential.

2) The OS, drivers, and application code in the embedded firmware is all
C and we have no choice in that, unless we want to develop a compiler
for some other language, or write directly in assembly language. Since
C and C++ are largely similar, parts of the C code that we want to share
with the other applications can be used directly.

But if not for these two restrictions, in hindsight, I would not have
chosen C++ or any other OO language.

I've wandered far enough off-topic from the original post.

John W. Krahn

unread,

Jun 10, 2008, 2:57:36 PM6/10/08

Richard Heathfield wrote:
> Nick Keighley said:
>
> <snip>
>> OO is about identifying abstractions. Look up the Open Closed
>> Principle
>> for a better motivator for OO.
>>
>> Read The Patterns Book.
>
> That's unconstitutional in the USA, because it's cruel and unusual
> punishment. It remains legal in the UK, though.

I understand that the current US VP has changed the definition of "cruel
and unusual" so that it's not unconstitutional any more. Fortunately I
live in neither the US nor the UK. :-)

John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall

Steve O'Hara-Smith

unread,

Jun 10, 2008, 4:00:10 PM6/10/08

On Tue, 10 Jun 2008 18:38:20 GMT
Your Name <no...@none.none> wrote:

> They may be necessary evils in some environments. The open/closed
> principle reduces modifications to the "base" set of code -- whether
> this is a set of classes, a library of functions, or something else. It
> encourages code to become set in stone. Unfortunately, especially on a
> large project, I find that no matter how we try, it is impossible to
> foresee the entire scope of what needs to be designed. So the design is
> constantly changing, and the idea of some set of code that we write in
> the early stages surviving without needing vast rewrites for both
> functionality and efficiency is delusional.

I have seen OO work wonderfully well with core classes surviving
seven years of product development with them at the heart unchanged in API
(the internals got a severe optimisation at one point). They were used in
applications never even suspected at the start and the core aspect
mechanism held up under the strain superbly.

I have also seen OO fail horribly yielding nothing but awkward
constructs that impede design and make life difficult exploding into a
myriad of similar but vitally different classes requiring constant tweaking
with resultant adjustments all over the codebase.

The key difference as far as I can tell has to do with a rigorous
attack on assumptions in the core design eliminating everything except the
core concepts the base classes are intended to provide and then layering
with care. That and the craft to design a truly useful set of simple
abstractions expressing the immutable aspects of the problem domain.

The idea is not delusional, I have seen it work on more than one
occasion. It is however a difficult art and almost certainly not suited (or
possible) in all problem domains. When it works it's like putting a magic
wand in the hands of the developers, when it fails it's like casing their
hands in two kilos of half set epoxy resin.

--
C:>WIN | Directable Mirror Arrays
The computer obeys and wins. | A better way to focus the sun
You lose and Bill collects. | licences available see
| http://www.sohara.org/

Your Name

unread,

Jun 10, 2008, 7:49:41 PM6/10/08

Robert Maas wrote:

> Hopefully you accept that Lisp (specifically Common Lisp) is the
> best language in this sense? Common Lisp supports, in a
> user/applicationProgrammer-friendly way, the complete process of
> (agile) programming from immediately writing and testing single
> lines of code all the way through all the refactorings needed to
> achieve an optimal software project.

I have to admit that I haven't used Lisp since college. At the time, I
found it interesting, and I seem to recall that it was well suited for
things like AI development. I also remember it being difficult to write
readable code. Professionally, I'm somewhat handcuffed to C and C++ for
reasons I mentioned earlier. It's also hard to find bright and
motivated employees who are fluent in Lisp. But I will make a point of
revisiting it.

I'd just like a language where this idea of compression, or refactoring,
is the central principle around which the language is built. Any
language that supports subroutines offers a mechanism for this, but I
feel the concept could be taken further. It's all very nebulous at the
moment, but I feel that somehow the answer lies in pointers.

At its heart, a subroutine is nothing more than a pointer, in any
language. A compiler takes the code inside the routine and stuffs it
into memory somewhere. To call that function, you just dereference the
pointer to "jump" to that code.

An object in an OO language is accomplished using a pointer. Each class
has a function table, and each object has a pointer to that function
table. Here, the compression is happening at the compiler level. Since
every object of a given class has the same code, it can all be moved to
a common place, then each object need only store a pointer to it.
Further, code in a base class need only exist there, with each child
class pointing to that table.

But these are hard-coded abstractions built into compilers. C++ or Java
tell you, "This is how inheritance works... This is the order in which
constructors are called..." Somehow, I'd like for the language itself
to make the mechanism directly accessible to the programmer. If his or
her programming style then naturally tends to evolve into an OO sort of
structure, wonderful. If not, then maybe a sort of table-driven type of
"engine" architecture would emerge. That just happens to be what I
generally find most powerful.

I suppose you could just take a huge step backwards and write pure
assembly language. Then all you really have is code, data, and
pointers. You're free to use them however you like. But I strongly
believe there is a way to still offer this freedom, while also offering
a great deal of convenience, readability, and maintainability by way of
a high-level language.

thomas...@gmx.at

unread,

Jun 12, 2008, 6:30:40 AM6/12/08

If you want to invent your own abstraction mechanisms you
might be interested in Seed7. There are several constructs
where the syntax and semantic can be defined in Seed7:
- Statements and operators (while, for, +, rem, mdiv, ... )
- Abstract data types (array, hash, bitset, ... )
- Declaration constructs
The limits of what can be defined by a user are much wider
in Seed7.

Greetings Thomas Mertes

Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.

Jon Harrop

unread,

Jun 17, 2008, 6:52:33 AM6/17/08

Your Name wrote:
> This has given me a great deal of direction in thinking about creating
> languages.

I agree.

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u

Jon Harrop

unread,

Jun 17, 2008, 7:28:00 AM6/17/08

Robert Maas, http://tinyurl.com/uh3t wrote:
>> The more power a language gives me to compress my algorithm --
>> both code and data, as well as in space and executuion time -- the
>> more I like it.
>
> Hopefully you accept that Lisp (specifically Common Lisp) is the
> best language in this sense?

If that were true Lisp would be concise but Lisp is actually extremely
verbose.

>> This has given me a great deal of direction in thinking about
>> creating languages.
>
> There's no need to create another general-purpose programming
> language. Common Lisp already exists and works just fine. All you
> might need to do is create domain-specific languages, either as
> mere sets of macros within the general s-expression syntax, or as
> explicitly parsed new syntaxes feeding into Lisp, or as GUI-based
> no-syntax pure-semantics editors generating tables that feed into Lisp.

Consider the following questions:

1. Why has Lisp always remained so unpopular when, as you say, it is so
extensible?

2. Why have the implementors of successful modern languages that were
originally built upon Lisp gone to great lengths to completely remove Lisp
from their implementations?

I studied Lisp for some time before realising the answers to these
questions:

1. Modern language features (e.g. pattern matching over algebraic data
types) are so difficult to implement that it is a practical impossibility
to expect ordinary programmers to use Lisp's extensibility to make
something decent out of it. Instead, the vast majority of programmers
choose to use less extensible languages that already provide most of what
they need (e.g. a powerful static type system) because that is vastly more
productive. In theory, this problem could be fixed but, in practice, the
few remaining members of the Lisp community lack the talent to build even
the most basic infrastructure (e.g. a concurrent GC).

2. Even though Lisp's forte is as a language laboratory, Lisp has so little
to offer but costs so much even in that niche that the implementors of
successful modern languages soon stripped all traces of Lisp from their
implementations in order to obtain decent performance (compilers written in
Lisp are extremely slow because Lisp is extremely inefficient).

In other words, Lisp is just a toy language because it does not help real
people solve real problems.

Robert Maas, http://tinyurl.com/uh3t

unread,

Jun 22, 2008, 1:04:56 AM6/22/08

> >> The more power a language gives me to compress my algorithm --
> >> both code and data, as well as in space and executuion time -- the
> >> more I like it.
> > Hopefully you accept that Lisp (specifically Common Lisp) is the
> > best language in this sense?

> From: Jon Harrop <j...@ffconsultancy.com>

> If that were true Lisp would be concise but Lisp is actually
> extremely verbose.

Compared to what? What other programming language do you know that
is as expressive as Lisp, allowing not just canned procedures
manually defined by syntax that is compiled, but ease in building
your own new procedures "on the fly" at runtime, but which is less
verbose than Lisp for equivalent value? Both C and Java are more
verbose than Lisp. To add two numbers in Lisp, you simply write (+
n1 n2) where n1 and n2 are the two numbers, and enter that directly
into Lisp's REP (Read Eval Print) loop. To do the same thing in C,
you need to write a function called "main" which includes both the
arithemetic operation itself also an explicit formatted-print
statement. To do the same thing in Java, you need to define a Class
which contains a method called "main", and then the innerds of main
are essentially the same as in C. For either C or Java, you then
need to compile the source, you can't just type it into a REP loop.
And in the case of Java, you can't even directly run the resultant
compiled-Class program, you have to start up the Java Virtual
Machine and have *it* interpret the main function of your compiled
Class.

Here's a more extreme comparison: Suppose you want to hand-code a list
of data to be processed, and then map some function down that list.
In Lisp all you need to do is
(mapcar #'function '(val1 val2 val3 ... val4))
where the vals are the expressions of the data you want processed
and function is whatever function you want applied to each element
of the list. You just enter that into the REP and you're done. Try
to imagine how many lines of code it takes in C to define a STRUCT
for holding a linked-list cell of whatever datatype you want the
function applied to, and a function for allocating a new cell and
linking it to some item of data and to the next cell in the chain,
and then calling that function over and over to add the cells to
the linked list one by one, and then you have to write a function
to map down the list to apply the other function. Or you have to
manually count how many items you want to process, and use an array
instead of a linked list, and manually insert elements into the
array one by one, then allocate another array to hold the results,
and finally you can
do (i=0,i<num,i) res[i]=fun(data[i]);
and then to print out the contents of that array or linked list you
have to write another function. And in Java you have to decide
whether to use vectors or arrays or any of several other collection
classes to hold your data, and then manually call
collection.add(element) over and over for the various elements you
want to add. Then to map the function down the list you need to
create an iterator for that collection and then alternate between
checking whether there any other elements and actually getting the
next element, and create another collection object to hold the
results. Then again just like in C you need to write a function for
reading out the elements in the result collection.

Of course if you're just going to map a function down your list and
immediately discard the internal form of the result, you don't need
a result list/array/collection. I'm assuming in the descriptions
above that you want to actually build a list/array/collection of
results because you want to pass *that* sequence of values to yet
another function later.

Are you complaining about the verbosity of the names of some of the
built-in functions? For example, there's adjust-array alphanumericp
assoc assoc-if char-lessp char-greaterp char-equal char-not-lessp
char-not-equal char-not-greaterp? Would you rather be required to
memorize Unix-style ultra-terse-inscrutable names aa a as ai cl cg
ce cnl cne cng respectively? Do you really think anybody will
understand a program that is written like that?

> 1. Why has Lisp always remained so unpopular when, as you say, it
> is so extensible?

Unlike all other major languages except Java and
HyperCard/HyperTalk and Visual Basic, it doesn't produce native
executables, it produces modules that are loaded into an
"environment". That means people can't just download your compiled
program and run it directly on their machine. They need to first
download the Lisp environment, and *then* your application can be
loaded into it (possibly by a script you provided for them) and
run. Unlike Visual Basic and Java, there's no major company pushing
hard to get people to install the environment. Unlike
HyperCard/HyperTalk, there's no major vendor of operating systems
(MS-Windows, FreeBSD Unix, Linux, Apple/Macintosh) providing Lisp
as part of the delivered operating system. (Linux does "ship" with
GNU Emacs with built-in E-Lisp, but when I say "Lisp" here I mean
Common Lisp. E-Lisp doesn't catch on, despite "shipping" with
Linux, because it simply doesn't have the usefulness of Common Lisp
for a wide variety of tasks *other* than managing a text-editor
with associated utilities such as DIRED and e-mail.) So Lisp has an
uphill battle to compete with MicroSoft and Sun who are pushing
their inferior languages. (HyperTalk/HyperCard died a long time ago
because it wasn't an especially good programming language, was
supplied only on Macintosh computers, and Macintosh lost market
share to MicroSoft's unfair labor practices, resulting in hardly
anybody making major use of it, resulting in Apple no longer
maintaining it to run under newer versions of their operating
system, so that now very few people still are running old versions
of MacOS where HyperCard runs.) Lisp is not yet dead. Common Lisp
is still thriving, even if it hasn't proven its case to the average
customer of MicroSoft Windows sufficiently that said customer would
want to install a Lisp environment so as to be able to run Lisp
applications.

> 2. Why have the implementors of successful modern languages that
> were originally built upon Lisp gone to great lengths to completely
> remove Lisp from their implementations?

-a- Penny wise pound foolish business managers who care only about
next-quarter profit, damn any longterm prospect for software
maintenance.
-b- Not-invented-here syndrome. Companies would rather base their
product line on a new language where they have a monopoly on
implementation rather than an already-existing well
established language where other vendors already provide
adequate implmentation which would need to be licensed for use
with your own commercial product line.
-c- Narrow-minded software vision which sees only today's set of
applications that can be provided by a newly-invented-here
language, blind to the wider range of services already
provided by Common Lisp that would support a greatly extended
future set of applications. Then once the company has invested
so heavily in building their own system to duplicate just some
of the features of Common Lisp, when they realize they really
do need more capability, it's too late to switch everything to
Common Lisp, so they spend endless resources crocking one new
feature after another into an ill-conceived system (compared
to Common Lisp), trying desperately to keep up with the needs
of the new applications they too-late realize they'll want.

> Modern language features (e.g. pattern matching over algebraic
> data types) are so difficult to implement that it is a practical
> impossibility to expect ordinary programmers to use Lisp's
> extensibility to make something decent out of it.

Agreed. That's why *one* (1) person or group needs to define
precisely what API-use-cases are required
(see the question I asked earlier today in the
64-parallel-processor thread, and please answer it ASAP)
and what intentional datatypes are needed for them, then implement
those API-use-cases per those intentional datatypes in a nicely
designed and documented package, then make that available at
reasonable cost (or free). I take it you, who realize the need,
aren't competant enough to implement it yourself, right? Why don't
you specify precisely what is needed (API-use-cases) and then ask
me whether I consider myself competant to implement your specs, and
offer to pay me for my work if I accept the task?

> Instead, the vast majority of programmers choose to use less
> extensible languages that already provide most of what they need
> (e.g. a powerful static type system)

That's rubbish!!! Static type declarations/checking only enforces
*internal* data types, not intentional data types on top of them.
So you're in a dilemma:
- Try to hardwire into the language every nitpicking difference in
intentional datatype as an actual *internal* datatype, with no
flexibility for the application programmer to include a new
datatype you didn't happen to think of.
- Hardwire into the language a meta-language capable of fully
expressing every nitpicking nuance of any intentional datatype,
so that application programmers can define their own *internal*
datatypes to express their intentional datatypes. Expect every
application programmer to actually make use of this facility,
always implementing new intentional datatypes as actual
application-programmer-defined *internal* datatypes.
- Don't bother to implement intentional datatypes at all.
Type-check just the internal datatypes that are built-in, and
expect the programmer to do what he does now of runtime
type-checking for all intentional type information, thereby
removing any good reason to do compile-time static type-checking
or even declaractions in the first place.

Example of internal datatype: 4.7

Example of intentional datatype: 4.7 miles per hour, 4.7 m/s, 4.7
children per average family, 4.7 average percentage failure rate of
latest-technology chips, $4.70 cost of gallon of gasoline, $7.7
million dollars total expenditure for a forest fire, $4.7 billion
dollars crop losses in the MidWest, $4.7 billion dollars per month
for war in Iraq, etc.

Application where you may need to mix the same internal datatype
with multiple intentions, where the intention is carried around
with the data to avoid confusion: Some engineers are working in
"metric" while others are working in English units, while all
original information must be kept as-is for error-control purposes
rather than automatically converted to common units on input, yet
later in the algorithms conversion to common units must be
performed to interface to various functions/methods which do
processing of the data. This is especially important if some
figures are total money for project while other figures are money
per unit time (month or year) and they need to be compared in some
way.

> In theory, this problem could be fixed but, in practice, the few
> remaining members of the Lisp community lack the talent to build
> even the most basic infrastructure (e.g. a concurrent GC).

What does that have to do with static type checking????????????

Please write up a Web page that explains:
- What you precisely mean by "concurrent GC" (or find a Web page
that somebody else wrote, such as on WikiPedia, that says
exactly the same as what *you* mean, and provide the URL plus a
brief summary or excerpt of what that other Web page says).
- List several kind of applications and/or API tools that are
hampered by lack of whatever that means.
- Explain how existing GC implementations don't satisfy your
definition of "concurrent GC" and how they specifically are not
sufficient for those kinds of applications you listed.
Then post the URL of your Web page here and/or in the other thread
where I also asked about the API-use-cases that you lament are
missing from Lisp.

> 2. Even though Lisp's forte is as a language laboratory, Lisp has
> so little to offer but costs so much

Um, some implementations of Common Lisp are **free** to download
and then "use to your heart's content". How does that cost too much???

> ... compilers written in Lisp are extremely slow because Lisp is
> extremely inefficient

That's a fucking lie!!

> In other words, Lisp is just a toy language because it does not
> help real people solve real problems.

That's another fucking lie!! I use Lisp on a regular basis to write
applications of practical importance to me, and also to write Web
demos of preliminary ideas for software I offer to write for
others. For example, there's a demo of my flashcard program on the
Web, including both the overall algorithm for optimal chronological
presentation of drill questions
(to get them into your short-term memory then to develop them
toward your medium-term and long-term memory),
and the specific quiz type where you type the answer to a question
(usually a missing word in a sentence or phrase)
and that short-answer quiz-type software coaches you toward a
correct answer and then reports back to the main drill algorithm
whether you needed help or not to get it correct. My Lisp program
on my Macintosh was used to teach two pre-school children how to
read and spell at near-adult level, and later the conversion of it
to run under CGI/Unix allowed me to learn some Spanish and
Mandarin, hampered only by lack of high-quality data to use to
generate Spanish flashcards and lack of anybody who knows Mandarin
and has the patience to let me practice my Mandarin with them. I'd
like to find somebody with money to pay me to develop my program
for whatever the money-person wants people to learn.

Robert Maas, http://tinyurl.com/uh3t

unread,

Jun 22, 2008, 5:31:50 AM6/22/08

> From: Your Name <n...@none.none>

> I have to admit that I haven't used Lisp since college. At the
> time, I found it interesting, and I seem to recall that it was well
> suited for things like AI development.

That's an incredible understatement, like saying you haven't used
an automobile since college when you found it useful for taking
high-school girls to drive-in movies (but you can't think of any
other use whatsoever for an automobile).

Why don't you admit that Lisp is a general-purpose programming
language that is useful for many different kinds of applications
that require fiexible datastructures to be crafted on the fly at
runtime? Can't you think of any other application area except A.I.
that might make good use of such data structures?

> I also remember it being difficult to write readable code.

Either you had a bad instructor, or you didn't have any inherent
talent, or both. It's a lot easier to write readable code in Lisp
than in other popular languages such as C or even Java. For
example, here's a simple executable expression in Lisp:
(loop for str in
'("Hello" "world." "Always" "love" "Lisp.")
collect (position #\l str :test #'char-equal))
which returns the list:
(2 3 1 0 0)
Now try to convert that to C so that the result is more "readable".
It has to be an expression which returns a value. It's OK if you
need to write a function which returns a value, then give a line of
code that calls that function as an "expression" that returns the
same value. It's *not* OK for you to write a program that
*prints*out* that syntax of open parens digits and spaces and close
parens but doesn't build any actual list of numeric values to
return to the caller. Go ahead and see if you can write anything
even half as clean as the three lines of Lisp code I displayed
above.
(By the way, after completely composing that three-line
expression, I actually started up CMUCL and copied the three
lines from my message-edit buffer and pasted it into CMUCL, and
it ran correctly the first time, and then I copied the result
from CMUCL and pasted it back into the edit buffer. Try writing
your C equivalent program and getting it exactly right the very
first time you try to compile it, nevermind the original question
of making it "readable".)

> Professionally, I'm somewhat handcuffed to C and C++ for reasons
> I mentioned earlier.

I feel sad for your plight. But I presume you're an adult, so
you're responsible for allowing yourself to be abused in that way.

Why don't you organize your fellow workers to stage a strike
against your employer until you are un-handcuffed? Or just file a
complaint with whatever government agecy protects employees from
abuse?

> It's also hard to find bright and motivated employees who are
> fluent in Lisp.

That's completely untrue if you mean *potential* employees, people
your company *could* hire if they had any brains. But it's true if
you mean *present* emplyees of your company. If that's what you're
saying, why don't you convince your company to hire somebody new
instead of recruiting only from their existing employee base?

> I'd just like a language where this idea of compression, or
> refactoring, is the central principle around which the language is
> built.

While I'm totally in favor of constant-refactoring as an
operational principle, I'm not in favor of making the language
itself *require* that way of working. For example, suppose you have
a government contract where every nitpicking detail of the use
cases are specified in the wording of the contract, and you are
legally required to provide *exactly* what the contract requires.
In that case you might be able to design the entire application at
the start and *not* need to do any refactoring during development.
Why should you be *required* by the language design to refactor
without any benefit? IMO it's better to have a language that makes
easy to refactor several times per day without being the central
principle of the language that you just can't escape.

You talk about being handcuffed to C and C++. Being handcuffed to
perpetual refactoring would not be as painful, but still I'd rather
avoid that too. Why do you seem to *want* it?

Common Lisp is the language of choice for enabling frequent
refactoring without absolutely requiring it, without refactoring
being the central principle of the language per se, merely *a*
central principle of the REP which is available whenever you need
it.

> Any language that supports subroutines offers a mechanism for
> this, but I feel the concept could be taken further.

I agree only to a limited degree. Having a generic datatype which
is the default, with runtime dispatching on type specified by the
application programmer, as in Lisp, is a lot better for this
purpose than most languages-with-subroutines which have strict
classification of data types that essentially preclude developing
as if there were a generic data type.

> It's all very nebulous at the moment, but I feel that somehow the
> answer lies in pointers.

That's half true. The other half is automatic garbage collection,
which works only with safe runtime datatypes (as in Lisp, and Java
if you declare variables of type java.lang.Object and use the
'instanceof' operator to dispatch on expected types your code knows
how to process). Unconstrained pointer arithmetic as in C is a
maintainability **disaster**. Look for example at the
buffer-overflow bugs (in code written in C) that allow
viruses/worms to take over people's computers and use them to
replicate themselves and also to disclose confidential information
to criminal organizations in foreign countries.

> At its heart, a subroutine is nothing more than a pointer, in any
> language.

How can I even begin to explain to you how grossly wrong you are???

In Lisp, a subroutine (function) is a first-class-citizen fully
self-contained data object. I'm not sure if Java goes that far, but
clearly in Java a subroutine (method) is likewise some kind of
self-contained data object. In neither case is it just a pointer to
some random spot in RAM as you claim. Only in really stupid cruddy
languages like C is a subroutine just a block of code with a
pointer to (the first byte of) it.

> A compiler takes the code inside the routine and stuffs it into
> memory somewhere. To call that function, you just dereference the
> pointer to "jump" to that code.

Only in the really stupid cruddy languages such as C you're
familiar with. In a decent language, there are stack frames, that
are formal structures, which are carefully controlled during call
of a function and return from a function as well as non-local
return (throw/catch and error restarts).

Even in C and assembly language, you don't JUMP to a subroutine,
you JSR to a subroutine, which automatically puts the return
address within the calling program onto the stack (or into a
machine register on some CPUs).

> An object in an OO language is accomplished using a pointer.
> Each class has a function table, and each object has a pointer to
> that function table. Here, the compression is happening at the
> compiler level. Since every object of a given class has the same
> code, it can all be moved to a common place, then each object need
> only store a pointer to it. Further, code in a base class need only
> exist there, with each child class pointing to that table.

OK, that paragraph is actually mostly correct and nicely explained.

> But these are hard-coded abstractions built into compilers. C++
> or Java tell you, "This is how inheritance works... This is the
> order in which constructors are called..." Somehow, I'd like for
> the language itself to make the mechanism directly accessible to
> the programmer.

That's pretty easy in Lisp. Are you familar with the 'typecase'
macro? You are free not to use the built-in CLOS inheritance at
all, but instead to use CLOS classes only to tag objects for
purpose of dispatching via the 'typecase' macro. You have the
choice whether to have it all nice and automatic via inheritance,
or explicitly under control of the programmer via 'typecase'
dispatching. There are arguments pro and con each way of doing it.
Within the past week or so somebody posted a major complaint about
OOP that with a deeply nested class inheritance hierarchy it's nigh
impossible for anyone to figure out why a particular method defined
in some distantly-related class is being called when a generic
function is passed some particular object of some particular class.
(In Java, read "generic function" as "method name", which is part
of the "method signature". Quoting from Liang's textbook (ISBN
0-13-100225-2) page 119: "The <i>parameter profile</i> refers to
the type, order, and number of parameters of a method. The method
name and the parameter profiles together constitute the <i>method
signature</i>. I like the way that author expresses concepts in Java.)

> If his or her programming style then naturally tends to evolve
> into an OO sort of structure, wonderful. If not, then maybe a
> sort of table-driven type of "engine" architecture would emerge.
> That just happens to be what I generally find most powerful.

Do you agree that Common Lisp offers the best availability of
features to support not just standard OO design (making heavy use
of inheritance) but also other variants that you seem to prefer
sometimes?

> I suppose you could just take a huge step backwards and write
> pure assembly language.

NO NO A THOUSAND TIMES NO!!!
Try Common Lisp, really, try it, for J.Ordinary applications of the
day. Whatever new application you want to write this coming Monday,
try writing it with Common Lisp, and tell me what difficulties (if
any) you experience.

Hey, if you are unwilling to swallow crow, then write a
machine-language emulator in Common Lisp, and then try to write
applications in that emulated machine language. Make sure your
emulator has debugging features a million times better than what
you'd have a bare machine you're trying to program in machine
language. While you're at it, put your emulator up as a CGI service
so that the rest of us can play with it too.

Suggestion: Emulate IBM 1620 machine language.
16 00010 00000
That's your first test program, just a single instruction.

> Then all you really have is code, data, and pointers. You're
> free to use them however you like. But I strongly believe there is
> a way to still offer this freedom, while also offering a great deal
> of convenience, readability, and maintainability by way of a
> high-level language.

If Lisp were Maynard G. Krebs
<http://www.fortunecity.com/meltingpot/lawrence/153/krebs.html>
<http://althouse.blogspot.com/2005/09/maynard-to-god-you-rang.html>
Lisp would jump in right there and say "YOU RANG?"

Jon Harrop

unread,

Jun 22, 2008, 8:13:13 AM6/22/08

Robert Maas, http://tinyurl.com/uh3t wrote:

>> >> The more power a language gives me to compress my algorithm --
>> >> both code and data, as well as in space and executuion time -- the
>> >> more I like it.
>> > Hopefully you accept that Lisp (specifically Common Lisp) is the
>> > best language in this sense?
>> From: Jon Harrop <j...@ffconsultancy.com>
>> If that were true Lisp would be concise but Lisp is actually
>> extremely verbose.
>
> Compared to what? What other programming language do you know that
> is as expressive as Lisp, allowing not just canned procedures
> manually defined by syntax that is compiled, but ease in building
> your own new procedures "on the fly" at runtime, but which is less
> verbose than Lisp for equivalent value?

Haskell, SML, OCaml, Mathematica, F# and Scala all allow real problems to be
solved much more concisely than with Lisp. Indeed, I think it is difficult
to imagine even a single example where Lisp is competitively concise.

> Both C and Java are more
> verbose than Lisp. To add two numbers in Lisp, you simply write (+
> n1 n2) where n1 and n2 are the two numbers, and enter that directly
> into Lisp's REP (Read Eval Print) loop.

Yes. Consider the trivial example of defining a curried "quadratic"
function. In Common Lisp:

(defun quadratic (a) (lambda (b) (lambda (c) (lambda (x)
(+ (* a x x) (* b x) c)))))

In F#:

let inline quadratic a b c x = a*x*x + b*x + c

> To do the same thing in C,
> you need to write a function called "main" which includes both the
> arithemetic operation itself also an explicit formatted-print
> statement. To do the same thing in Java, you need to define a Class
> which contains a method called "main", and then the innerds of main
> are essentially the same as in C. For either C or Java, you then
> need to compile the source, you can't just type it into a REP loop.
> And in the case of Java, you can't even directly run the resultant
> compiled-Class program, you have to start up the Java Virtual
> Machine and have *it* interpret the main function of your compiled
> Class.

Forget C and Java.

> Here's a more extreme comparison: Suppose you want to hand-code a list
> of data to be processed, and then map some function down that list.
> In Lisp all you need to do is
> (mapcar #'function '(val1 val2 val3 ... val4))
> where the vals are the expressions of the data you want processed
> and function is whatever function you want applied to each element
> of the list. You just enter that into the REP and you're done.

Lisp:

(mapcar #'function '(val1 val2 val3 ... val4))

OCaml and F#:

map f [v1; v2; v3; .... vn]

Mathematica:

f /@ {v1, v2, v3, ..., vn}

> Try
> to imagine how many lines of code it takes in C to define a STRUCT
> for holding a linked-list cell of whatever datatype you want the
> function applied to, and a function for allocating a new cell and
> linking it to some item of data and to the next cell in the chain,
> and then calling that function over and over to add the cells to
> the linked list one by one, and then you have to write a function
> to map down the list to apply the other function. Or you have to
> manually count how many items you want to process, and use an array
> instead of a linked list, and manually insert elements into the
> array one by one, then allocate another array to hold the results,
> and finally you can
> do (i=0,i<num,i) res[i]=fun(data[i]);
> and then to print out the contents of that array or linked list you
> have to write another function. And in Java you have to decide
> whether to use vectors or arrays or any of several other collection
> classes to hold your data, and then manually call
> collection.add(element) over and over for the various elements you
> want to add. Then to map the function down the list you need to
> create an iterator for that collection and then alternate between
> checking whether there any other elements and actually getting the
> next element, and create another collection object to hold the
> results. Then again just like in C you need to write a function for
> reading out the elements in the result collection.

Forget C and Java. Compare with modern alternatives.

> Of course if you're just going to map a function down your list and
> immediately discard the internal form of the result, you don't need
> a result list/array/collection. I'm assuming in the descriptions
> above that you want to actually build a list/array/collection of
> results because you want to pass *that* sequence of values to yet
> another function later.
>
> Are you complaining about the verbosity of the names of some of the
> built-in functions? For example, there's adjust-array alphanumericp
> assoc assoc-if char-lessp char-greaterp char-equal char-not-lessp
> char-not-equal char-not-greaterp? Would you rather be required to
> memorize Unix-style ultra-terse-inscrutable names aa a as ai cl cg
> ce cnl cne cng respectively? Do you really think anybody will
> understand a program that is written like that?

Far more programmers use modern functional languages like Haskell, OCaml and
F# than Lisp now. They clearly do not have a problem with the supreme
brevity of these languages.

Look at the intersection routines from my ray tracer benchmark, for example.
In OCaml:

let rec intersect orig dir (lam, _ as hit) (center, radius, scene) =
let lam' = ray_sphere orig dir center radius in
if lam' >= lam then hit else
match scene with
| [] -> lam', unitise(orig +| lam' *| dir -| center)
| scene -> List.fold_left (intersect orig dir) hit scene

and in Lisp:

(defun intersect (orig dir scene)
(labels ((aux (lam normal scene)
(let* ((center (sphere-center scene))
(lamt (ray-sphere orig
dir
center
(sphere-radius scene))))
(if (>= lamt lam)
(values lam normal)
(etypecase scene
(group
(dolist (kid (group-children scene))
(setf (values lam normal)
(aux lam normal kid)))
(values lam normal))
(sphere
(values lamt (unitise
(-v (+v orig (*v lamt dir))
center)))))))))
(aux infinity zero scene)))

The comparative brevity of the OCaml stems almost entirely from pattern
matching.

>> 1. Why has Lisp always remained so unpopular when, as you say, it
>> is so extensible?
>
> Unlike all other major languages except Java and
> HyperCard/HyperTalk and Visual Basic, it doesn't produce native
> executables, it produces modules that are loaded into an
> "environment". That means people can't just download your compiled
> program and run it directly on their machine.

IIRC, some commercial Lisps allow standalone executables to be generated but
they are still not popular.

> They need to first
> download the Lisp environment, and *then* your application can be
> loaded into it (possibly by a script you provided for them) and
> run. Unlike Visual Basic and Java, there's no major company pushing
> hard to get people to install the environment. Unlike
> HyperCard/HyperTalk, there's no major vendor of operating systems
> (MS-Windows, FreeBSD Unix, Linux, Apple/Macintosh) providing Lisp
> as part of the delivered operating system. (Linux does "ship" with
> GNU Emacs with built-in E-Lisp, but when I say "Lisp" here I mean
> Common Lisp. E-Lisp doesn't catch on, despite "shipping" with
> Linux, because it simply doesn't have the usefulness of Common Lisp
> for a wide variety of tasks *other* than managing a text-editor
> with associated utilities such as DIRED and e-mail.) So Lisp has an
> uphill battle to compete with MicroSoft and Sun who are pushing
> their inferior languages. (HyperTalk/HyperCard died a long time ago
> because it wasn't an especially good programming language, was
> supplied only on Macintosh computers, and Macintosh lost market
> share to MicroSoft's unfair labor practices, resulting in hardly
> anybody making major use of it, resulting in Apple no longer
> maintaining it to run under newer versions of their operating
> system, so that now very few people still are running old versions
> of MacOS where HyperCard runs.) Lisp is not yet dead. Common Lisp is still

> thriving...

Lisp is not "thriving" by any stretch of the imagination. According to
Google Trends (which measures the proportion of searches for given search
terms) "Common Lisp" has literally almost fallen off the chart:

http://www.google.com/trends?q=common+lisp

>> 2. Why have the implementors of successful modern languages that
>> were originally built upon Lisp gone to great lengths to completely
>> remove Lisp from their implementations?
>
> -a- Penny wise pound foolish business managers who care only about
> next-quarter profit, damn any longterm prospect for software
> maintenance.
> -b- Not-invented-here syndrome. Companies would rather base their
> product line on a new language where they have a monopoly on
> implementation rather than an already-existing well
> established language where other vendors already provide
> adequate implmentation which would need to be licensed for use
> with your own commercial product line.
> -c- Narrow-minded software vision which sees only today's set of
> applications that can be provided by a newly-invented-here
> language, blind to the wider range of services already
> provided by Common Lisp that would support a greatly extended
> future set of applications. Then once the company has invested
> so heavily in building their own system to duplicate just some
> of the features of Common Lisp, when they realize they really
> do need more capability, it's too late to switch everything to
> Common Lisp, so they spend endless resources crocking one new
> feature after another into an ill-conceived system (compared
> to Common Lisp), trying desperately to keep up with the needs
> of the new applications they too-late realize they'll want.

I have never heard of a single user of a modern FPL regretting not choosing
Lisp. Can you refer me to any such people?

>> Modern language features (e.g. pattern matching over algebraic
>> data types) are so difficult to implement that it is a practical
>> impossibility to expect ordinary programmers to use Lisp's
>> extensibility to make something decent out of it.
>
> Agreed. That's why *one* (1) person or group needs to define
> precisely what API-use-cases are required
> (see the question I asked earlier today in the
> 64-parallel-processor thread, and please answer it ASAP)
> and what intentional datatypes are needed for them, then implement
> those API-use-cases per those intentional datatypes in a nicely
> designed and documented package, then make that available at
> reasonable cost (or free). I take it you, who realize the need,
> aren't competant enough to implement it yourself, right?

You are grossly underestimating the amount of work involved. It would
literally take me decades of full time work to catch up with modern
functional language implementations in terms of features and the result
could never be competitively performant as long as it was built upon Lisp.
Finally, I don't believe I could ever build a commercial market as
successful as F# already is so, even if I did ever start doing this, it
would be as a hobby and not for profit.

Note that this is precisely why the developers of all successful modern
functional language implementations do not build them upon Lisp.

> Why don't
> you specify precisely what is needed (API-use-cases) and then ask
> me whether I consider myself competant to implement your specs, and
> offer to pay me for my work if I accept the task?

Because it would be a complete waste of my time and money because Lisp
offers nothing of benefit whatsoever to me, my company or our customers.
Moreover, Lisp does not even have a commercially viable market for our kind
of software.

Lisp is literally at the opposite end of the spectrum from where we want to
be. We need to combine the performance of Fortran with the expressiveness
of Mathematica (which F# almost does!) but Lisp combines the performance of
Mathematica with the expressiveness of Fortran.

>> In theory, this problem could be fixed but, in practice, the few
>> remaining members of the Lisp community lack the talent to build
>> even the most basic infrastructure (e.g. a concurrent GC).
>
> What does that have to do with static type checking????????????

That has nothing to do with static typing. I was listing some of Lisp's most
practically-important deficiencies. Lack of static typing in the language is
one. Lack of concurrent GC in all Lisp implementations is another. Also
lack of threads, weak references, finalizers, asychronous computations,
memory overflow recovery, tail call optimization, callcc etc. are all
fundamental deficiencies of the language.

> Please write up a Web page that explains:
> - What you precisely mean by "concurrent GC" (or find a Web page
> that somebody else wrote, such as on WikiPedia, that says
> exactly the same as what *you* mean, and provide the URL plus a
> brief summary or excerpt of what that other Web page says).

See Jones and Lins "Garbage Collection: algorithms for automatic dynamic
memory management" chapter 8.

In a serial GC, the program often (e.g. at every backward branch) calls into
the GC to have some collection done. Collections are typically done in
small pieces (incrementally) to facilitate soft-real time applications but
can only use a single core. For example, OCaml has a serial GC so OCaml
programs wishing to use multiple cores fork multiple processes and
communicate between them using message passing which is two orders of
magnitude slower than necessary, largely because it incurs huge amounts of
copying that is not necessary on a shared memory machine:

http://caml.inria.fr/pub/ml-archives/caml-list/2008/05/6ba948d84934b1e61875687961706f61.en.html

In a parallel GC, the program occasionally (e.g. when a minor heap is
exhausted) suspends all program threads and begins a parallel traversal of
the heap using all available cores. This allows programs (even serial
programs) to benefit from multiple cores but it has poor incrementality (so
it is unsuitable for soft real-time applications) and scales badly. For
example, the GHC implementation of Haskell recently acquired a parallel GC
which can improve Haskell's performance on <4 cores but (according to the
authors) can degrade performance with more cores because the cost of
suspending many threads becomes the bottleneck.

With a concurrent GC, the garbage collector's threads run concurrently with
the program threads without globally suspending all program threads during
collection. This is scalable and can be efficient but it is incredibly
difficult to implement correctly. The OCaml team spent a decade trying to
implement a concurrent GC and never managed to get it working, let alone
efficient.

> - List several kind of applications and/or API tools that are
> hampered by lack of whatever that means.

Any software that requires fine-grained parallelism for performance will be
hampered by the lack of a concurrent GC.

>> 2. Even though Lisp's forte is as a language laboratory, Lisp has
>> so little to offer but costs so much
>
> Um, some implementations of Common Lisp are **free** to download
> and then "use to your heart's content". How does that cost too much???

Development costs are astronomical in Lisp compared to modern alternatives
like F#, largely because it lacks a static type system but also because it
lacks language features like pattern matching, decent developer tools like
IDEs and libraries like Windows Presentation Foundation (WPF).

For example, adding the new 3D surface plotting functionality to our F# for
Visualization product took me four days of full time work even though I am
new to WPF:

http://www.ffconsultancy.com/products/fsharp_for_visualization/?clp

The result will run reliably on hundreds of millions of computers (you just
need Windows and .NET 3.0 or better). Developing it into a standalone
Windows application will be trivial, if I choose to do so.

Contrast that with Lisp. There are no decent Lisp implementations for .NET.
Microsoft certainly aren't using or advocating Lisp. So you can immediately
kiss goodbye to easy multicore support and all of Microsoft's latest
libraries and tools. You'll be developing your GUIs without the aid of an
interactive GUI designer and you'll be using a low-level graphics API like
DirectX or OpenGL for visualization. You are looking at several times as
much effort to get something comparable and, even then, it will never be as
reliable (because Microsoft have invested billions in making WPF and .NET
reliable).

Finally, there is no way you'll ever turn a profit because the market for
commercial third-party software for Lisp is too small.

>> In other words, Lisp is just a toy language because it does not
>> help real people solve real problems.
>

> ...I use Lisp on a regular basis to write

> applications of practical importance to me, and also to write Web
> demos of preliminary ideas for software I offer to write for
> others. For example, there's a demo of my flashcard program on the
> Web, including both the overall algorithm for optimal chronological
> presentation of drill questions
> (to get them into your short-term memory then to develop them
> toward your medium-term and long-term memory),
> and the specific quiz type where you type the answer to a question
> (usually a missing word in a sentence or phrase)
> and that short-answer quiz-type software coaches you toward a
> correct answer and then reports back to the main drill algorithm
> whether you needed help or not to get it correct. My Lisp program
> on my Macintosh was used to teach two pre-school children how to
> read and spell at near-adult level, and later the conversion of it
> to run under CGI/Unix allowed me to learn some Spanish and
> Mandarin, hampered only by lack of high-quality data to use to
> generate Spanish flashcards and lack of anybody who knows Mandarin
> and has the patience to let me practice my Mandarin with them. I'd
> like to find somebody with money to pay me to develop my program
> for whatever the money-person wants people to learn.

I think you should aspire to earn money directly from customers rather than
asking people to give you money to develop your software.

Robert Maas, http://tinyurl.com/uh3t

unread,

Jun 22, 2008, 6:05:40 PM6/22/08

> From: thomas.mer...@gmx.at

> If you want to invent your own abstraction mechanisms you might
> be interested in Seed7. There are several constructs where the
> syntax and semantic can be defined in Seed7:
> - Statements and operators (while, for, +, rem, mdiv, ... )

Some of those are such commonly useful constructs that it seems
ill-conceived to have each application programmer re-invent the
wheels. Are there standard packages available to provide these as
"givens" with a well-documented API so that different application
programmers can read each other's code?

Now the ability to define *variants* of those common operators
which do additional/different tasks would be useful. Obviously if
the standard operator can be defined, a variant can be defined,
right?

> - Abstract data types (array, hash, bitset, ... )

Again, these are such basic container types that they really ought
to be provided in a standard package. Are they? Again, the ability
to define variants on the standard implementation would be useful.
But the ability to mix-in the variation within the standard
definition without needing to start from scratch to re-invent the
wheel would be even better. Is that possible? Do the standard
definitions have hooks for adding variant functionality, sort of
the way Common Lisp has hooks for what to do when an error occurs
or an unbound variable is referenced or a garbage-collect happens
etc.? For example, is it possible to use a built-in (standard
package) definition of hash table but change the pre-hash function
(the pseudo-random function from key to large integer) to replace
the standard pre-hash function?

> Seed7 Homepage: http://seed7.sourceforge.net

] It is a higher level language compared to Ada, C/C++ and Java.

Um, C is not exactly a high-level language.
Comparing your language to both C and Java in the same sentence is
almost an oxymoron.

] The compiler compiles Seed7 programs to C programs which are
] subsequently compiled to machine code.

Ugh! So you're using C almost as if it were an assembly language,
which is probably appropriate for it *not* being a high-level
language itself, being in reality a "syntax-sugared assembly
language". OK, you win, I withdraw my complaint, if you stipulate
that your earlier statement really was self-contradictory.

If you don't allow that C is nothing more than a sugar-coated
assembly language, if you insist it's really a high-level language,
then the Seed7 compiler doesn't need to really do anything, just do
a syntax transformation from one high-level language to another, in
which case you might do better to syntax-translate to Common Lisp,
maybe just emulate the Seed7 syntax within CL as if it were a DSL
(Domain-Specific Language).

] Functions with type results and type parameters are more elegant than
] a template or generics concept.

Since a template concept is dumb to begin with, saying you're
better than that is not a really good advertising point, like
saying you as a person have higher moral standards than Adolph
Hitler or Pol Pot or George W. Bush.

I don't know what you mean by a "generics" concept. Is that like
the tagging of data object to identify their respective data types
at run-time that Lisp has? Or something completely different?
Please define what precisely you mean by that (I assume you're the
author of that Web page).

] * Types are first class objects

In a sense that's also true in Common Lisp:
(class-of 5)
=> #<BUILT-IN-CLASS FIXNUM (sealed) {50416FD}>
(class-of (expt 3 99))
=> #<BUILT-IN-CLASS BIGNUM (sealed) {5044695}>
(class-of (make-hash-table))
=> #<STRUCTURE-CLASS HASH-TABLE {500D32D}>
* (defclass foo)
=> #<STANDARD-CLASS FOO {90285FD}>
Is that the kind of first-class type-objects that you are talking about?

] (Templates and generics can be defined easily without special syntax).

What precisely do you mean by "templates"?
What precisely do you mean by "generics"?

Suggestion: On a Web page where you throw around terms like this,
each such mention of a jargon-term should actually be
<a href="urlWhereTheTermIsDefined">theTerm</a>
That's the nice thing about WebPages (and DynaBooks, if they ever
existed), that there can be links from jargon to definitions, not
possible in hardcopy printed books (footnotes for every such term
would be a royal pain by comparison with HREF anchors).

Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.

] * User defined statements and operators.

Why make a distinction between statements and expressions-with-operators??
IMO it's a royal pain to have to deal with.
Lisp does it right, having every expression usable *both* as
statement or operator, or even as both simultaneously
(perform side-effect and also return a value, for example SETF
which stores the right-side into the left-side place but *also*
returns that value to the caller, so that SETFs can be nested to
store the same value in more than one place, and IF and CASE which
select which of several alternative blocks of code to execute and
*also* return the value from the block that was executed).
C does it wrong, requiring different syntax for IF statements and
?: expressions which return a value.

] * Static type checking and no automatic casts.

There's hardly any value to static type checking, compared to
dynamic (runtime) type-checking/dispatching. See another article I
posted late Friday night
<http://groups.google.com/group/comp.programming/msg/5cea7d186eddfd42>
= Message-ID: <rem-2008...@yahoo.com>
(skip down to where I used the word "dilemma", page 10 on VT100
lynx, appx. 5 screens into the article on full-screen browser)
where I explained why static type checking fails to solve the
problem it claims to solve hence is worthless to include in a
programming language.

By "no automatic casts", do you mean that you can't even have a
literal that is generic to cast to short-integer or long-integer in
an assignmet, so everytime you set a variable to a literal value
you must explitly tag the literal with the appropriate word length
(or even worse, explictly cast it from literal type to whatever
type the variable happens to be today)??

] * exception handling

How does your servive compare with what Common Lisp and Java provide?
Do you provide a default break loop for all uncaught exceptions, as
Lisp does, or do you do what Java does of ABEND/BACKTRACE whenever
an uncaught unchecked-exception ocurrs (and *require* compile-time
exception handler for each and every checked-exception).

] * overloading of procedures/functions/operators/statements

So that's basically the same as what C++ and Java do (and Common
Lisp generic functions do even better)?

] * Runs under linux, various unix versions and windows.

Does it run on FreeBSD Unix? (That's what my shell account is on.)
Why doesn't it run on MacOS 6 or 7? (Just curious, since my Mac
doesn't have a decent C compiler. I have Sesame C, but it's only a
crude subset, no structs or even arrays, not even malloc. It *does*
have machine-language hooks, whereby you can embed hexadecimal
codes inline, which I used to implement a crude form of malloc via
system/OS traps to allocate a huge block of RAM and then my own code
to break it into pieces to return to callers of myMalloc!!)

<http://seed7.sourceforge.net/faq.htm#new_language>
] Why a new programming language?
] Because Seed7 has several features which are not found in other
] programming languages:
] * The possibility to declare new statements (syntactical and
] semantically) in the same way as functions are declared

Are you really really sure that's a good idea. Like why is it
even necessary in most cases, compared to Lisp's system of keeping the
bottom-level syntax OpenParens OperatorName Arg1 Arg2 ... Argn
CloseParens but allowing various OperatorNames to cause the
apparent Args to be interepreted any way you like?
Problems with defining new statement-level operators:
- It makes one person's code unreadable by anyone else. Not just
that they don't know what the semantics do, another person can't
even parse the new syntax you've invented.
- It kills any chance of having a smart editor, such as Emacs,
automatically deal with sub-expressions, such as copy/cut/paste
entire sub-expressions, skip forward/backward by sub-expressions,
etc., unless you take the extra pain of reconfiguring Emacs to
know about all the new syntaxes you've invented for your Seed7
sourcecode.
- It's already a royal pain in the first place to need to keep a
copy of the top half of page 49 of K&R posted for constant
reference when deciding whether parentheses are really necessary
to provide the desired sequence of sub-expression combination.
It would be an order of magnitude more pain to need to keep a
listing of operator precedence for every new operator invented by
every programmer in a large software project, and know which page
to refer to when looking at each person's code, and not get all
confused when trying to coorelate two different pieces of code
written by different people which use different operator
definitions.

Please reconsider your decision to use operators in the first place
for anything except arithmetic expressions.
Please consider going back to square one in your syntax design, and
using Lisp notation for everything except arithmetic (with some
sort of syntactic marker to tell when arithmetic mode starts and
ends, i.e. to wrap an arithmetic-syntax expression within an
otherwise s-expression syntax).
Heck, consider scrapping the "new language from scratch, except C
as post-processor to compiler" idea entirely, instead just use a
reader macro within Common Lisp to nest an arithmetic-syntax
expression within an s-expression. Maybe something like this:
(let* ((origHeight (get-height myDoorway))
(aspectRatio (get-aspect-ratio standardDoor))
(scaledWidth #[origVal*aspectRatio])) ;Sub-expression in math syntax
(set-width myDoorway scaledWidth)
(make-new-door :height origHeight :width scaledWidth))
Does your current implementation even provide anything like LET* in
the first place?

Some/most/all of the other features you claim aren't available in
other languages are in fact already available in Common Lisp.

<http://seed7.sourceforge.net/faq.htm#bytecode>

] Can I use something and declare it later?
] No, everything must be declared before it is used. The possibility to
] declare new statements and new operators on one side and the static
] typing requirements with compile time checks of the parameters on the
] other side would make the job of analyzing expressions with undeclared
] functions very complex.

This is a killer for top-down design+debugging that sometimes is
useful. Better to stick to a fixed syntax, where something doesn't
need to be defined before code that calls it can be set up. Then
when an undefined-function exception throws you into the break
package, *then* you can supply the missing fuction definition and
proceed from the break as if nothing were wrong in the first place.

] Forward declarations help, if something needs to be used before it can
] be declared fully.

In practice they are a royal pain, both the necessity of doing them
all before you can even write the code that calls the undefind
functions, and then the maintenance problem if you change the
number of parameters to a function and therefore need to find all
the places in your code where you declared the old parameters and
now need to re-declare all of those and re-do everything that
follows. It's a totally royal pain to have to deal with!!

] With static type checking all type checks are performed during
] compile-time. Typing errors can be caught earlier without the need to
] execute the program. This increases the reliability of the program.

Bullshit. Utter bullshit!!!

See what I wrote (see URL/MessageID of article earlier above) about
the failure of static type checking to deal with intentional types
regardless of whether you try to definie every intentional type an
an explicitly declared static type or not. Then answer my point,
either by admitting you were totally mistaken in your grandoise
claim about static type checkign, or by explaining how Seed7 is
able to completely solve the problem.

] Can functions have variable parameter lists?
] No, because functions with variable parameter lists as the C printf
] function have some problems:
] * Type checking is only possible at run time.

That's untrue.

Variable args of the same type can be all checked by mapping the
checker down the list of formal arguments in the syntax.
Lisp doesn't provide this, but an extension to DEFUN/LAMBDA could do this:
(defun foo (i1(integer) f2(single-float) &nary cxs(complex)) ...)
(foo 5 4.2 #C(2 5) #C(1 4) #C(6 9)) ;OK
(foo 5 4.2 #C(2 5) 3.7 #C(6 9)) ;syntax error, 3.7 not COMPLEX

Keyword args can be checked to make sure the only keywords actually
used are in fact defined as available by the function definition.
Thus: (defun foo (a1 a2 &key k1 k2) ...)
(foo 5 7 :k2 42 :k3 99) ;syntax error, keyword K3 not allowed by FOO
;suggestion: K1 is allowed, maybe you meant that?

] Although functions can return arbitrary complex values (e.g. arrays of
] structures with string elements) the memory allocated for all
] intermediate results is freed automatically without the help of a
] garbage collector.

How?????

Debug-use case: A application is started. A sematic error (file
missing for example) throws user into break package. User fixes the
problem, but saves a pointer to some structure in a global for
later study. User continues from the break package. There are now
two pointers to the structure, the one the compiler provided on the
stack, which goes away when some intermediate-level function (above
the break) returns, and the global one set up by the user from the
break package. How does the return-from-function mechanism know
that the structure should *not* be freed? Later, when the user
changes that global to point somewhere else, and there are no
longer any references to that structure, how does the
assign-new-value-to-global mechanism know that the *old* value of
that global can *now* finally be freed?

Reference counts don't work if you allow circular pointer structures:
(setq foo (list 1 2 3))
=> (1 2 3)
(setf (cdddr foo) (cdr foo))
=> #1=(2 3 . #1#)
foo
=> (1 . #1=(2 3 . #1#))
(setq foo nil)
;It's easy to see that the CONS cell pointing at 1 can be freed,
; because it no longer has any references.
;But the cells pointing to 2 and 3 have CDR pointers to each other,
; so how does your system know that those cells can also be freed
; without a garbage collector to verify no other references except
; those circular references exist anywhere within the runtime environment??

] of all container classes. Abstract data types provide a better and
] type save solution for containers ...
**f* (typo, the first typo I've found so-far, your English is good!)

] What is multiple dispatch?
] Multiple dispatch means that a function or method is connected to more
] than one type. The decision which method is called at runtime is done
] based on more than one of its arguments. The classic object
] orientation is a special case where a method is connected to one class
] and the dispatch decision is done based on the type of the 'self' or
] 'this' parameter. The classic object orientation is a single dispatch
] system.

What you've implemeted sounds the same as generic functions in Common Lisp.

But having *any* runtime dispatching based on actual type of an
object defeats your <coughCough>wonderful</coughCough> static type
checking, since if three sub-types inherit from one parent type,
but only two of them implement a particular method, especially if
there are multiple parameter-type dispatching with multiple options
for each parameter and not *all* combinations of parameter types
are provided, then it's possible for compiler to accept a call
involving parent-type declared parameters which at runtime steps
into one the combinations of subtypes that aren't defined.

Example, in case my English wasn't clear:
Define class table with subtypes endtable dinnertable coffeetable and bedstand.
Define class room with subtypes livingroom bedroom kitchen and bathroom.
Declare generic function arrangeTableInRoom, and define these
specific cases of parameters to it:
endtable,livingroom
dinnertable,kitchen
bedstand,bedroom
endtable,bathroom
Declare variable t1 of class table.
Declare variable r1 of class room.
Assign t1 an object of sub-type bedstand.
Assign r1 an object of sub-type kitchen.
Call arrangeTableInRoom(t1,r1) ;Compiles fine, but causes runtime exception,
; because that specific method is not defined.
;Static type checking fails to detect this type-mismatch undefined-method error.

] As in C++, Java, C# and other hybrid object oriented languages there
] are predefined primitive types in Seed7. These are integer, char,
] boolean, string, float, rational, time and others.

What is the precise meaning of type 'integer'? Is it 16-bit signed
integer, or 32-bit signed integer, or 64-bit signed integer, or
unlimited-size signed integer? If this is defined elsewhere, you
should have <a href="urlWhereDefined">integer</a> here.

What is the precise meaning of type 'char'? Is it US-ASCII 7-bit
character, or Latin-1 8-bit character, or UniCode-subset 16-bit
codepoint, or full UniCode 21-bit (embedded in 24-bit or 32-bit
machine word) codepoint, or what?? Ditto need href.

What is the precise meaning of type 'string', both in terms of
possible number of characters within a string, and what each
character is.

What is the precise meaning of type 'float'? Is it IEEE 754 single
precision, IEEE 754 double precision, IEEE 754 single-extended
precision, IEEE 754 double-extended precision, or some form of IEEE
854-1987, or any of those revised in 2008.Jun (this very month!!),
or something else?

] Variables with object types contain references to object values. This
] means that after
] a := b
] the variable 'a' refers to the same object as variable 'b'. Therefore
] changes on variable 'a' will effect variable 'b' as well (and vice
] versa) because both variables refer to the same object.

That is not worded well. There are two kinds of changes to variable
*a*, one which changes that variable itself to point to a different
object, and one which doesn't change *a* itself at all but instead
performs internal modification (what some other poster referred to
as "surgery", applied in his case to changing CAR or CDR of a CONS
cell, but the term could equally apply to *any* internal
modification of an object) upon whatever object *a* currently
points to.

If it's true that change in variable *a* itself by reassignment is
*not* passed to variable *b*, but "surgery" on the object that both
*a* and *b* point to *does* cause both *a* and *b* to "see" that
same change, you need to make that clear. Example of the distinction:
a := new tableWithLegs(color:purple);
b := a; /* *a* and *b* both point to same object */
tellIfLegs(a); /* Reports legs present, and purple */
tellIfLegs(b); /* Reports legs present, and purple */
paintLegs(a,color:green);
tellIfLegs(a); /* Reports legs present, and green */
tellIfLegs(b); /* Reports legs present, and green */
cutOffLegs(a); /* Legs of that single table are now without legs,
and henceforth the side-effect of that literal
surgery will be seen via both *a* and *b* */
tellIfLegs(a); /* Reports legs missing */
tellIfLegs(b); /* Reports legs missing */
a := new tableWithLegs; /* *a* and *b* now point to different tables */
tellIfLegs(a); /* Reports legs present, and beige (the default) */
tellIfLegs(b); /* Reports legs missing */
fastenNewLegs(b,color:orange);
tellIfLegs(a); /* Reports legs present, and beige */
tellIfLegs(b); /* Reports legs present, and orange */

] For primitive types a different logic is used. Variables with
] primitive types contain the value itself. This means that after
] a := b
] both variables are still distinct and changing one variable has no
] effect on the other.

This is correct, but totally confusing of the semantics I expressed
above are correct. If **assignment** of a new object to a variable
causes simultaneous assignment of all other variables that point to
the same object, a totally perverse semantics for your language,
which I ardently hope is *not* the case, then this all makes
(perverse) sense.

You seriously need to rewrite that whole section one way or another.

Suggestion (if I guessed the semantics correctly despite your
incorrect English): Say that in the case of primitive values, the
actual data, all of it, is right there in the variable itself, so
if you copy that value to somewhere else, and then change one bit
of one of the copies, the other copy won't be affected. But in the
case of Objects, what's in the variable is just a pointer to the
object, so you can't change bits in that pointer without trashing
the whole system (it now points to some random place in memory that
probably isn't an Object), so modifying Object variable isn't
allowed, so the question of what if you modify the actual value
itself doesn't make any sense. What you *can* do in the case of
Object variables is modify the actual Object it points to. Since
two different variables may point to the same object, that
modification (surgery) will be "seen" from both places equally.

Say that the other thing you can do with variables is to simply
re-assign the variable to have a new value, a new self-contained
value in the case of primitive variables, a pointer to a different
Object in the case of Object variables. In neither case is the
reassignment "seen" by any other variable that happened to share a
copy of the primitive value or pointer-to-Object. In the case of
primitive variables that previously contained copies of the exact
same primitive value, the two variables now have different
self-contained values, one of them now containing the
newly-assigned value, the other still containing the same original
self-contained value it had before. In the case of Object variables
that previously pointed to the same Object, they now point to
different Objects, one of them now pointing to the new Object that
was assigned to it, the other still pointing to the same Object it
already pointed to.

] In pure object oriented languages the effect of independent objects
] after the assignment is reached in a different way: Every change to an
] object creates a new object and therefore the time consuming copy
] takes place with every change.

I've never heard of any such language. Perhaps you can name one.
Java and Lisp in particular do *not* copy when objects are
modified. In Lisp you have the option, for some kinds of
intentional objects, such as sequences (linked lists and
one-dimensional arrays), to either optimize speed by destructively
modifying the object (what we're talking about here) or avoid side
effects by copying as much of the structure as needed (something
that *explicitly* says it returns a new object if necessary). (For
the no-side-effect version: For arrays you either make a complete
copy or you don't. For linked lists you usually keep/share the tail
of the original list past the last change, but re-build/copy
everything before that point.) I suspect your point here is strawman.

] ... In Seed7 every
] type has its own logic for the assignment where sometimes a value copy
] and sometimes a reference copy is the right thing to do. Exactly
] speaking there are many forms of assignment since every type can
] define its own assignment.

IMO this is a poor design decision. This means the same kind of
object, such as array, can't exist sometimes on the stack and
sometimes as an object in the heap, because the fact of how it's
allocated and copied is hardwired into the type definition. Better
would be to have a shallow-copy method for every object, and call
the shallow-copy object whenever the object is stored on the stack,
but simply copy the pointer whenever just the pointer is stored on
the stack. Thus it's the variable type (inline-stack-Object vs.
pointer-to-heap-Object), not the Object class, which determines
what copying happens. Then it would be possible to copy an object
from stack to heap or vice versa as needed. With your method, it
would seem that every instance of a given type of object must be on
the stack, or every instance in the heap, never some here and some
there as needed.

As for deep copy, that's a huge can of worms, telling the copy
operation when to stop and go no deeper. Kent Pittman discussed
this in his essay on intention. That's why copy-list copy-alist and
copy-tree all do different things when presented with exactly the
same *internal* datatype of CONS-tree. I think it's best if
assignment avoid this can of worms and let the programmer say
explicitly what kind of deep copy might ever be required in special
circumstances.

Robert Maas, http://tinyurl.com/uh3t

unread,

Jun 22, 2008, 7:54:18 PM6/22/08

> From: Your Name <n...@none.none>
> ... I don't really find OO very useful. ... the more complex the

> project, the more inheritance and information hiding become nothing
> more than a burden, in my experience.

Would you consider the following compromise: During development of
a package of software tools, everything except the widgets of the
GUI is done by ordinary Lisp-style functions, not OO. Macros are
defined only where essential to simplify the syntax to speed up
coding of lots of cases, but only after several cases have been
coded manually by ordinary function call with recursive evaluation
of nested expressions and explicit quoting of data that is not to
be evaluated. All the business logic is done by ordinary D/P
(Data-Processing) functions, possibly aided by macros. At the very
end, before releasing the code to regular users, an OO wrapper is
put around the public-accessible tools, limiting access/view of the
innerds.

> The open/closed principle

Which one??
<http://en.wikipedia.org/wiki/Open/closed_principle>
- Meyer's Open/Closed Principle -- Parent class remains unchanged
forever except for legitimate bug fixes. Derived classes inherit
everything that stays the same, and re-define (mask) anything
that would be changed compared to the parent class. The
interface is free to differ in a derived class compared to the
parent class.
- Polymorphic Open/Closed Principle -- A formal *interface* is set
up once, and then never changed. Various classes implement this
interface in different ways.

GUIs with classes of widgets that are interchangeable via events
triggered by mouse keyboard etc. satisfy the second definition.
Traditional modular programming (doesn't have to be per the current
jargon) ideally satisfies the first definition.

Common Lisp's keyword parameters allow a compromise (with the first
definition) whereby a function can be defined with limited
functionality, then a new keyword can be added to allow additional
functionality without changing the earlier functionality in any
way. In this way the primary value of Meyer's Open/Closed Principle
can be obtained without needing any actual OO.

> It encourages code to become set in stone.

Code for implementations, or code for interfaces?? Or both?

> Unfortunately, especially on a large project, I find that no
> matter how we try, it is impossible to foresee the entire scope
> of what needs to be designed.

Totally true, especially on "cutting edge" R&D which explores new
D/P algorithms and eventually settles on whatever works best (or
gives up if nothing works well enough to put into practice). I find
that most of my software is new-D/P-algorithm R&D where "agile
programming" methodology (without the overhead of purchasing a
commercial Agile (tm) software environment) is the only practical
course of action, and totally precludes the design-first
implement-last paradigm.

Refactoring is a daily experience as I try various algorithms until
I learn what works best. Deciding that I really *also* need to do
something I didn't even envision at the start of the project, in
addition to the fifth totally different algorithm for something I
*did* anticipate at the start, is a common occurance.

For example, in my current project for developing ProxHash to
organize a set of transferrale/soft skills (in regard to seeking
employment), I had no idea that there would be a very large number
of extreme outliers (records nearly orthogonal to every other
record) which would require special handling, until I had already
finished developing all the code leading up to that point where the
extreme outliers could be discovered. (If I had known this problem
at the start, I might have simply written a brute-force
outlier-finding algorithm, which directly compared all n*(n+1)/2
records, not scalable to larger datasets than the 289 records I'm
working with presently, but it sure would have simplified the rest
of the code for *this* dataset by having them out of the way at the
start.) If you're curious:
- Original data, with labels added:
<http://www.rawbw.com/~rem/NewPub/ProxHash/labsatz.txt>
- Identification of extreme outliers, starting with the *most* distant,
working down to semi-outliers:
<http://www.rawbw.com/~rem/NewPub/ProxHash/outliers-2008.6.11.txt>
Note that d=1.4142135=sqrt(2) means orthogonal, where correlation is 0.
The most distant outlier has d=1.39553 C=0.02625 to its nearest "neighbor".

> So the design is constantly changing, and the idea of some set of
> code that we write in the early stages surviving without needing
> vast rewrites for both functionality and efficiency is delusional.

Indeed, the "d" word is quite appropriate here. It's sad how
software programming classes so often teach the dogma that you must
spec out the entire project at the start before writing the first
line of code. Agile programming (including rapid prototyping,
bottom-up tool-building, and incessant refactoring in at three
different ways) is IMO the way to go.
<http://www.rawbw.com/~rem/HelloPlus/hellos.html#s4outl>

* Lesson 4: Refactoring syntax

* Lesson 5: Refactoring algorithms

(One other that came up in another thread, which I've now forgotten)
OK, I went back to my file of backup copies of articles posted, and
here's the relevant quote:

| After that point, my beginning lessons don't discuss additional
| ways of refactoring, but I think we would agree that further
| refactoring is *sometimes* beneficial:
| - Using OOP.
| - Using macros to allow variations on the usual syntax for calling functions.
| - Defining a whole new syntax for specialized problem domains,
| including a parser for such syntax.

Here's a reference for the entire article if you're curious:
<http://groups.google.com/group/comp.programming/msg/bc8967c3b7522c5d>
= Message-ID: <rem-2008...@yahoo.com>

> 1) The development tools and libraries are the most mature for
> C++ and this is essential.

I'm curious: Why did you choose C++ instead of Common Lisp?

> 2) The OS, drivers, and application code in the embedded firmware
> is all C and we have no choice in that, unless we want to develop a
> compiler for some other language, or write directly in assembly
> language.

I agree with that choice. C is fine for writing device drivers.

> Since C and C++ are largely similar, parts of the C code that we
> want to share with the other applications can be used directly.

Why would there be any code shared between device drivers and other
parts of your application???

> But if not for these two restrictions, in hindsight, I would not
> have chosen C++ or any other OO language.

If there was in fact no significant amount of code shared between
device drivers and the rest of the application, then maybe chosing
C++ was a mistake anyway? I'll withhold judgement until I know the
answer the shared-code-dd/app question just above.

> I've wandered far enough off-topic from the original post.

That's not a problem at all, since we're still totally on-topic for
the newsgroup comp.programming, and even for comp.software-eng in
case anybody wants to cross-post parts of this thread there.

Bartc

unread,

Jun 23, 2008, 9:00:51 AM6/23/08

"Robert Maas, http://tinyurl.com/uh3t"
<jaycx2.3....@spamgourmet.com.remove> wrote in message
news:rem-2008...@yahoo.com...

> Only in the really stupid cruddy languages such as C you're
> familiar with.

What would your language of choice be for implementing Lisp?

--
Bartc

Robert Maas, http://tinyurl.com/uh3t

unread,

Jun 23, 2008, 9:39:32 PM6/23/08

> > Only in the really stupid cruddy languages such as C you're
> > familiar with.

> From: "Bartc" <b...@freeuk.com>

> What would your language of choice be for implementing Lisp?

A bootstrapping/crosscompiling process involving extended versions
of SYSLISP (which was available in PSL) and LAP (which was
available in MacLisp and Stanford Lisp 1.6). Revive those
dinosaurs, and enhance them to support generating native
executables for bootstrapping purposes.

When I helped port the PSL kernel from Tenex to VM/CMS, we needed
to write the outer frame of the executable in IBM 360/370 assembly
language, because SYSLISP required the IBM 370/370 registers (base
register and stack mostly) to be already set up before code
generated by SYSLISP would execute propertly. It seems to me
entirely reasonable to enhance SYSLISP or LAP to be able to
generate those first few instructions that reserve memory for the
stack and load the registers needed by the regular code.

So the basic plan would be as follows:
- Read the internal documentation for the first target system, to
learn what the format of executable files is supposed to be.
Write code in an earlier version of Lisp (anything that's
currently available, even XLISP on Macintosh System 7.5.5 might
be "good enough" for this purpose) to generate the minimal
executable file and to have a hook for including additional code
in it. A sequence of (write-byte whatever outputStream)
statements would be good enough to directly generate the minimal
executable file. Or LAP could be fixed to allow directly
generating inline code, not wrapped in any FUNCTION header, and
to call WRITE-BYTE instead of building the body of a FUNCTION.
Or LAP could build a dummy function body and then the body could
be copied to the outputChannel and then the dummy function
discarded. Or use something totally esoteric, instead of Lisp,
to implement something like LAP, such as Pocket Forth, or
HyperCard/HyperTalk, or an assembler. But actually the advantage
of implementing LAP to do the right thing is that the code for
that can then be ported to later stages in the bootstrapping
process to avoid needing to re-do all that work in Lisp later
when it's time to "close the loop". On the other hand, writing a
Forth emulator in Lisp would be easy enough, so if Pocket Forth
is used at this stage none of the Forth code would need to be
discarded, it could be kept as part of the finished product.
But actually using real genuine LAP syntax instead of something
easy for Forth to parse would be best, so I retreat to using
some earlier version of Lisp to implement the revitalized LAP.
In any case, LAP doesn't need to actually know any opcodes. It's
sufficient to know the different formats of machine language
instructions and accept hexadecimal notation for opcode and each
other field within each machine format instruction, or possibly
just accept a list whose first element is a symbol identifying
the rest of the list as hexadecimal bytes and the rest of the
elements handcoded byte of the instruction.
- Hand-code in assembly language, using hexadecimal-LAP syntax, the
absolute minimum application to assemble code, taking input from
a file IN.LAP in LAP syntax and writing to a file OUT.EXE in
target-machine language. Add that code to the minimal executable
from above and pass all that code through the earlier-version-of-Lisp
LAP above, thereby generating an executable that all by itself
assembles LAP. The earlier-LISP can now be despensed with since
we now have an assembler which can assemble itself.
- Hand-code in assembly language, using LAP syntax, enhancements to
the LAP assembler, such as the full set of instruction formats
(still with all hexadecimal fields), opcodes for the target
machine that know what instruction format is used by each,
labels that can be referred to from other places in the code.
After each round of this enhancement is completed, the next
round will be easier to code. After labels are implemented,
*real* assembly-language programming is possible, with lots of
subroutines used for nice structured programming.
- Hand-code in assembly language, using full LAP syntax from above,
the transformations needed to map a minimal subset of SYSLISP to
LAP. Now we have a compiler that takes a mix of SYSLISP-subset
and LAP and produces an executable.
- Code in SYSLISP-subset all the enhancements needed to implement
the full SYSLISP syntax and semantics. Now we have a compiler
that takes a mix of SYSLISP and LAP and produces an executable.
- Code in SYSLISP the bare minimum primitives needed to build a
symbol (in a default package that isn't a fullfledged PACKAGE
object in the usual sense) and access key fields from it, box
and unbox a small integer, build a CONS cell and access CAR and
CDR from it, parse a simple s-expression containing only symbols
and small integers to produce a linked list of CONS cells etc.,
print out such an s-expression from the linked list, enhance
SYSLISP to support wrapping a function body around a block of
code and linking from a symbol to that function body, applying
such a function to a list of parameters, EVALing small integers
to themselves and symbols to their current values, and
recursively EVALing a linked list whose CAR is the symbol of a
defined function and whose CDR is a list of expressions that
evaluate to parameters to that function. Include all that code
in compilation to our executable. We now have a mimimal LISP-subset1
interpreter that automatically builds an executable each time it
is run.
- Rearrange the code so that there's a function called
START-WRITING-EXECUTABLE which opens OUT.EXE for output and
generates the starting data for an executable, a function
called COMPILE-LAP+SYSLISP which takes a linked-list of a mix of
LAP and SYSLISP and compiles them into the already-open
executable-output file, a function called
FINISH-WRITING-EXECUTABLE which writes any necessary finalization
data in the executable and closes the output file. We now have
an interpretor we can just use as we want without necessarily
writing an executable, but any time we want we can code
(START-WRITING-EXECUTABLE)
(COMPILE-LAP+SYSLISP ...) ;More than one such can be done here
(FINISH-WRITING-EXECUTABLE)
.. details of additional levels of bootstrapping not included here ...

Now when we want to port everytning up to some bootstrapping point
to another machine, all we need is some crude way to establish the
very first bootstrap on the new machine, then copy all the various
additional bootstrapping code across from the old machine to the
new machine and manually replace the specific machine instructions
that aren't the same between old and new CPUs and then run the
result through the current level of bootstrapping to build the
executable for the next level of bootstrapping. By the time we
reach the level of having a minimal SYSLISP compiler, virtually
nothing more would need translation to the new CPU because SYSLISP
is mostly portable. During really high-level bootstrapping, only
the parts that need to be written in LAP, such as tag bits on
various kinds of data objects, and system calls to allocate memory
or do filesystem operations, would need to be manually converted to
a new target machine. But once we have a halfway decent Lisp
interpretor working, it's easy to write code to maintain a set of
tables that formally express tag bits and stuff like that, and then
generate LAP expressions dependent on those tables, such that each
machine-level characteristic would need to be coded in the tables
just once then all the LAP related to it could be generated
automatically.

Of course you could *cheat* by using an existing fullfledged
implementation of Common Lisp, which *was* coded in C, to implement
the entire LAP and SYSLISP compiler as functions within that
existing implementation, and then build more and more of the *new*
Lisp implementation by generating the executable directly from that
existing+LAP+SYSLISP environment. That would let you shortcut some
of the early bootstrapping levels. But that's not as much fun, and
really it is cheating to use anything that passed through C for
your cross-compiler, and it really would be *cheating*.

Now to *really* avoid all sense of using C, you can't implement
your first bootstrap using any operating system that was even
partly built using C, especially not Unix. I think early versions
of MacOS (at least through 6.0.7) were built using Pascal rather
than C, is that correct? But for the full spirit of this avoidance
of using some *other* programming language to cross-compile into
Lisp, thereby making Lisp dependent on that *other* cursed
language, we really ought to avoid *any* other language above
assembly language. My MOS-6502 bare machine, bootable using pROM
Octal-DDT, is in storage but was working the last time I tried it.
It could perhaps be used to avoid both C and Pascal as well as any
other high-level language. Does anybody have a PDP-10 that still
works? Its operating system was written in assembly language, the
original DEC systems using Macro, and the Stanford-AI system using
FAIL. What was MIT's ITS written in, DDT?

gremnebulin

unread,

Jun 24, 2008, 6:11:51 AM6/24/08

On 9 Jun, 19:46, Richard Heathfield <r...@see.sig.invalid> wrote:

> > The process, for lack of a better term, is "compression."
>
> Or abstraction, or functional decomposition...

Or chunking, or convergence...

Bartc

unread,

Jun 24, 2008, 6:34:43 AM6/24/08

"Robert Maas, http://tinyurl.com/uh3t"
<jaycx2.3....@spamgourmet.com.remove> wrote in message
news:rem-2008...@yahoo.com...

>> > Only in the really stupid cruddy languages such as C you're
>> > familiar with.
>> From: "Bartc" <b...@freeuk.com>
>> What would your language of choice be for implementing Lisp?
>
> A bootstrapping/crosscompiling process involving extended versions

...

> Of course you could *cheat* by using an existing fullfledged
> implementation of Common Lisp, which *was* coded in C, to implement
> the entire LAP and SYSLISP compiler as functions within that
> existing implementation, and then build more and more of the *new*
> Lisp implementation by generating the executable directly from that
> existing+LAP+SYSLISP environment. That would let you shortcut some
> of the early bootstrapping levels. But that's not as much fun, and
> really it is cheating to use anything that passed through C for
> your cross-compiler, and it really would be *cheating*.

OK, so you're going out of your way to avoid using C, for reasons I don't
fully understand (or perhaps I do :-)

Although I would just have used assembler, if conventional HLLs were out;
Lisp doesn't look that difficult to implement, at first glance...

> Does anybody have a PDP-10 that still
> works?

alt.sys.pdp10

Apparently one just got sold on eBay.

--
bartc

thomas...@gmx.at

unread,

Jun 26, 2008, 5:21:31 PM6/26/08

On 23 Jun., 00:05, jaycx2.3.calrob...@spamgourmet.com.remove (Robert

Maas, http://tinyurl.com/uh3t) wrote:
> > From: thomas.mer...@gmx.at
> > If you want to invent your own abstraction mechanisms you might
> > be interested in Seed7. There are several constructs where the
> > syntax and semantic can be defined in Seed7:
> > - Statements and operators (while, for, +, rem, mdiv, ... )
>
> Some of those are such commonly useful constructs that it seems
> ill-conceived to have each application programmer re-invent the
> wheels.

[snip]
Thank you four your feedback. Since I am still evaluating your mail
I do not respond to all your questions. I tried to improve the FAQ
and the manual to include some answers to your questions. To keep
the answer short I suggest you read some links:

> Are there standard packages available to provide these as
> "givens" with a well-documented API so that different application
> programmers can read each other's code?

It is not the intention of Seed7 that everybody re-invents the
wheel. There is a well-documented API. The predefined statements of
Seed7 are described here:
http://seed7.sourceforge.net/manual/stats.htm

Other types and their functions (methods) are described here:
http://seed7.sourceforge.net/manual/types.htm

> Again, these are such basic container types that they really ought
> to be provided in a standard package. Are they?

A short explanation of Seed7 container types is here:
http://seed7.sourceforge.net/faq.htm#container_classes

Details to some abstract data types cen be found here:
http://seed7.sourceforge.net/manual/types.htm#array
http://seed7.sourceforge.net/manual/types.htm#hash
http://seed7.sourceforge.net/manual/types.htm#set
http://seed7.sourceforge.net/manual/types.htm#struct

> Um, C is not exactly a high-level language.

What computer scientists mean when they speak about "high level
languages" is explained here:
http://en.wikipedia.org/wiki/High_level_language

> Ugh! So you're using C almost as if it were an assembly language,

The first C++ compiler did this also: Cfront generated C output
which was compiled with a C compiler. See:
http://en.wikipedia.org/wiki/Cfront

> What precisely do you mean by "templates"?

What computer scientists mean when they speak about "templates"
is explained here:
http://en.wikipedia.org/wiki/Template_%28programming%29

> What precisely do you mean by "generics"?

What computer scientists mean when they speak about "generics"
is explained here:
http://en.wikipedia.org/wiki/Generic_programming

> Why make a distinction between statements and expressions-with-operators??

Because most people see operators special. Probably because they
think of infix operators with priority and associativity. Languages
like Ada, Java, C++ also see them as destinct. In Seed7 there is no
such destinction.

> There's hardly any value to static type checking, compared to
> dynamic (runtime) type-checking/dispatching.

Just because a feature like mandatory static type checking is not
present in your favorite language (Lisp) does not turn this concept
into something negative. To learn something about type systems I
suggest you read:
http://en.wikipedia.org/wiki/Type_system

Type checking and dispatching are different things. Although Seed7
does static type checking you can choose between compile-time and
run-time dispatching. The OO concept of Seed7 is explained here:
http://seed7.sourceforge.net/manual/objects.htm

I have also improved my FAQ about static type checking. See:
http://seed7.sourceforge.net/faq.htm#static_type_checking

The difference between your "intentional datatype" and your
"internal datatype" lies in the measurement units. A static type
system as Seed7's can also be used to cope with measurement units.

> Does it run on FreeBSD Unix? (That's what my shell account is on.)

If a shell, a make utility and a C compiler with header files and
library are available I see no reason that compilation should not
work. At least I had reports of people who successfully compiled
Seed7 on FreeBSD.

> Why doesn't it run on MacOS 6 or 7?

Where did you get the impression that it does not run on MacOS?
I have not tried it, but that does not mean that is is not possible.
One thing necessary to support Seed7 is a (real) C compiler and
a C library.

> ] * The possibility to declare new statements (syntactical and
> ] semantically) in the same way as functions are declared
>
> Are you really really sure that's a good idea.

Yes

> - It makes one person's code unreadable by anyone else.

I have added an answer to this question to the FAQ. See:
http://seed7.sourceforge.net/faq.htm#everybody_invents_statements

> Please reconsider your decision to use operators...
> Please consider going back to square one...

You obviously have no idea for how long I am already working on the
concepts of Seed7. Seed7 is not Lisp, nor wants it to be. The S7SSD
(the Seed7 structured syntax description) is one of the basic
building blocks of Seed7 (a chapter about it will be added to the
manual in the near future). Seed7 is about the possibility to
define new syntax and semantic. Therefore it is possible to steal
constructs from other languages. Additionally the syntax and the
semantic of the predefined constructs is defined with much more
precision since they are defined in terms of Seed7.

> Some/most/all of the other features you claim aren't available in
> other languages are in fact already available in Common Lisp.

Not exactly. See:
http://seed7.sourceforge.net/faq.htm#lisp_comparison

> ] Although functions can return arbitrary complex values (e.g. arrays of
> ] structures with string elements) the memory allocated for all
> ] intermediate results is freed automatically without the help of a
> ] garbage collector.
>
> How?????

I have improved the FAQ for this. See:
http://seed7.sourceforge.net/faq.htm#garbage_collection

> Debug-use case: A application is started. A sematic error (file
> missing for example) throws user into break package. User fixes the
> problem, but saves a pointer to some structure in a global for
> later study.

You are much too deep in the Lisp way of thinking. If a stack
shrinks the elements popped from the top just don't exist any more.
It will lead to undefined behaviour if you keep pointers to such
nonexisting elements and expect the data to be still there. It is
also a common bug in C/C++ and other similar languages when a
function returns a pointer to some local data (which is at the
stack): It is a severe bug (although it may or may not work
depending on the implementaion). If you want such data to be
available later you need to make a (deep) copy.

This is also the reason why Seed7 functions use a 'return' variable
to refer to the result of a function. The 'return' variable is
handled specially such that no (deep) copy is necessary when the
function is left.

> ] type save solution for containers ...
> **f* (typo, the first typo I've found so-far, your English is good!)

Thank you for pointing this typo out. Since I am not a native
speaker it is hard to get the spelling right.

> But having *any* runtime dispatching based on actual type of an
> object defeats your <coughCough>wonderful</coughCough> static type

> checking.

No. The interface types can be checked statically and it is also
possible to check the implementation type that it implements all
functions granted by the interface. The terms interface and
implementation and the OO concept of Seed7 is explained here:
http://seed7.sourceforge.net/manual/objects.htm

> ;Static type checking fails to detect this type-mismatch undefined-method error.

You obviously did not understand that static type checking is used
to reduce the possibility of bugs, not to make them impossible.
Every bug found at compile-time will not make you trouble at
run-time. The earlier you can eliminate bugs the better.

> What is the precise meaning of type 'integer'?

Integers are at least 32 bit signed integers.
The type 'integer' is explained here:
http://seed7.sourceforge.net/manual/types.htm#integer

There is also the type 'bigInteger' which serves as unlimited-size
signed integer. The type 'bigInteger' is explained here:
http://seed7.sourceforge.net/manual/types.htm#bigInteger

> What is the precise meaning of type 'char'?

The 'char' values use the UTF-32 encoding, see:
http://seed7.sourceforge.net/manual/types.htm#char

> What is the precise meaning of type 'string', both in terms of
> possible number of characters within a string, and what each
> character is.

The characters in the 'string' use the UTF-32 encoding. The Seed7
strings are not \0 terminated and therefore can also contain binary
data. The length of the string is stored in a 32 bit unsigned
integer. The type 'string' is explained here:
http://seed7.sourceforge.net/manual/types.htm#string

> What is the precise meaning of type 'float'?

The type 'float' is IEEE 754 single precision (at least when the C
compiler/library uses it). The type 'float' is described here:
http://seed7.sourceforge.net/manual/types.htm#char

> ] Variables with object types contain references to object values. This
> ] means that after
> ] a := b
> ] the variable 'a' refers to the same object as variable 'b'. Therefore
> ] changes on variable 'a' will effect variable 'b' as well (and vice
> ] versa) because both variables refer to the same object.
>
> That is not worded well.

I have changed it to

Variables with object types contain references to object values.
This means that after
a := b
the variable 'a' refers to the same object as variable 'b'.

Therefore changes of the object value that 'a' refers to, will

effect variable 'b' as well (and vice versa) because both
variables refer to the same object.

> ] In pure object oriented languages the effect of independent objects

> ] after the assignment is reached in a different way: Every change to an
> ] object creates a new object and therefore the time consuming copy
> ] takes place with every change.
>
> I've never heard of any such language.

This concept is called immutable object. It is explained here.
http://en.wikipedia.org/wiki/Immutable_object

This immutable objects concept is not used always and for all
objects. AFAIK Java and C# use immutable strings. If you concatenate
a lot of elements to a string the immutable strings are copied a
lot. This is the reason that Java introduced the mutable strings
classes StringBuffer and StringBuilder.

> ] ... In Seed7 every
> ] type has its own logic for the assignment where sometimes a value copy
> ] and sometimes a reference copy is the right thing to do. Exactly
> ] speaking there are many forms of assignment since every type can
> ] define its own assignment.
>
> IMO this is a poor design decision.

No. If you assign an integer you want a deep copy, but if you
assign a file you want to copy just the reference to it.
If you define a data structure where at some place no deep
copy should take place you can use pointers instead of
a normal type.

> This means the same kind of
> object, such as array, can't exist sometimes on the stack and
> sometimes as an object in the heap, because the fact of how it's
> allocated and copied is hardwired into the type definition.

Allocation of memory for an object, assigning a value to an object
for the first time (with the Seed7 'create' operation) and copying
an object (with the Seed7 := operator) are different things.

Seed7 Arrays can change their size at runtime therefore it would
not be easy to put them on the stack.

Greetings Thomas Mertes

Seed7 Homepage: http://seed7.sourceforge.net

Robert Maas, http://tinyurl.com/uh3t

unread,

Jul 5, 2008, 11:04:47 AM7/5/08

> Date: Sun, 22 Jun 2008 13:13:13 +0100
Why this response is so belated:
<http://groups.google.com/group/misc.misc/msg/cea714440e591dd2>
= <news:rem-2008...@yahoo.com>
> From: Jon Harrop <j...@ffconsultancy.com>

> Haskell, SML, OCaml, Mathematica, F# and Scala all allow real
> problems to be solved much more concisely than with Lisp. Indeed, I
> think it is difficult to imagine even a single example where Lisp
> is competitively concise.

What does "solved much more concisely" mean??

The solution to any problem of significant complexity requires a
lot of code. Some of the code is in the core language, some is in
the libraries that ship with the core language, some is in
third-party libraries or vendor options, and some needs to be
written because that particular aspect of the problem has never
been solved before and there's always a first time for anything
that gets done. When you include *all* that code, there's no way
the totality of all that can be "concise".

Each language makes a design choice which kinds of tools are likely
to be generally useful and hence ought to be included in the core
langauge or supplied libraries, and everything else is left for
later (add-ons from vendor, third-party, and do-it-yourself).

But regardless of how much of the code is supplied by the vendor
and how much is supplied by third parties, in the end when all the
code is written, and it's time to actually run an application, the
command to run it is typically very short: You give the name of the
utility, you give a pointer to the file containing the
configuration parameters, you give a pointer to the file containing
the dataset (or that pointer is contained within the configuration
file), and you press ENTER or click the SUBMIT button to confirm
you want that task to actually execute.

So why don't you say what you really mean, that some particular
application types are better represented in published libraries in
your favorite language than in mine (Lisp)? And people who like
Lisp will respond that some *other* application types are better
represented in published libraries in Lisp language than in
*yours*? Why don't you compile a list of all the major application
types that you feel are needed by the largest number of potential
customers/users, and allow the rest of us to propose additions to
your list? For example, IMO the most commonly useful application
types are:
- Business operations (accounts receivable/payable, inventory, payroll)
- Text editing (both plain text and "word processing" formatted text)
- Web browsing/surfing/searching
- E-mail (both plain text and MIME multi-media) including protection against spam
- Conferencing and other public/group forums
- Security, including protection against computer viruses/worms/trojans/bots
There are probably five or ten more that I could add if I had the
time/energy to think of them. Anyway, after we get a complete list,
based on concensus of everyone reading comp.programming on a
regular basis and seeing this thread, *then* you need to say how
well your favorite language addresses these common application
needs.

Note that algebraic datatypes are not among the major applications I listed.

> Consider the trivial example of defining a curried "quadratic"
> function. In Common Lisp:
> (defun quadratic (a) (lambda (b) (lambda (c) (lambda (x)
> (+ (* a x x) (* b x) c)))))

I've never in my entire life found an application area where such a
tool was needed or even useful. Can you please cite what real-world
problem was aided by currying a quadradic or other function? If
it's a rare problem, then the two lines of code above are entirely
concise enough. I anybody needed to do a *lot* of that apparent
pattern of nested lambdas with one parameter each, it would be
simple to write a macro whereupon the syntax for actual usage would
reduce to:
(defcurry quadratic (a) (b c x) (+ (* a x x) (* b x) c))
Actually the expression for the quadradic is cruddy: Imagine trying
to extend that to a high-degree polynomial:
(defcurry quintic (a) (b c d e x)
(+ (* a x x x x x) (* b x x x x) (* c x x x) (* d x x) (* e x x) f))
Wouldn't it make more sense to chain the multiplications like this:
(defcurry quadratic (a) (b c x)
(+ c (* x (+ b (* x (+ a))))))
(defcurry quintic (a) (b c d e x)
(+ f (* x (+ e (* x (+ d (* x (+ c (* x (+ b (* x (+ a))))))))))))
Or for that matter, if all you're generating are standard
polynomials then you can abstract the pattern further:
(defcurrypoly quadratic (x) (a b c))
(defcurrypoly quintic (x) (a b c d e f))

> In F#:
> let inline quadratic a b c x = a*x*x + b*x + c

That's more verbose than my use of defcurrypoly.
Does F# provide a way to carry out the abstraction of defcurrypoly?
Also, it's absurd that the keyword for generating a curried
function is "inline" instead of "curry". I know that "curry" is
CompSci jargon, but at least it's not grossly misleading as to what
it does the way the "inline" keyword in F# is!! Anybody who doesn't
understand the jargon can look it up in Google or WikiPedia and
learn what the jargon means and not be confused. By comparison,
anybody reading code that mentions "inline" would be very unlikely
to guess that he/she should read that as what "curry" means.
(Of course if you're lying about what the "inline" keyword does in
F# then the fault lies with you, not with F#.)

By the way, sometimes the biggest gripe about mis-named functions
in Lisp is CONS CAR CDR. Like CONS is an abbreviation for
"construct", which begs the question *what* precisely is being
constructed, and CAR/CDR are totally obscure. IMO better names for
the object itself would be Standard Pair, abbreviated SP, and then
the three functions could be named MAKE-SP SP-LEFT SP-RIGHT and be
totally clear sensible to newbies. (The name for the print
expression for Standard Pair could remain "Dotted Pair", and the
connection between the internal Standard Pair and the printed
Dotted Pair would make a lot more sense to beginners.)
Note there are a lot of different kinds of (ordered) pairs available in Lisp:
- Pair of characters in a string, such as that which prints as "az"
- Pair of elements in a linked list, such as that which prints as (1 2)
- Pair of elements in a vector, such as that which prints as #(1 2)
- Key-value pair in a hash table, which has no print form but it's
the result of something like (setf (gethash :SURNAME *ht*) "Robert")
It's just that the kind of ordered pair constructed by CONS is the
*standard* kind that is essential to Lisp itself. Maybe *basic*
instead of *standard* would be a better term, but then the asswipe
who invented BASIC might sue for infringement. Maybe *common* would
be a better term? (Play on first word of "Common Lisp")

> > Suppose you want to hand-code a list
> > of data to be processed, and then map some function down that list.
> > In Lisp all you need to do is
> > (mapcar #'function '(val1 val2 val3 ... val4))
> > where the vals are the expressions of the data you want processed
> > and function is whatever function you want applied to each element
> > of the list. You just enter that into the REP and you're done.
> Lisp:
> (mapcar #'function '(val1 val2 val3 ... val4))
> OCaml and F#:
> map f [v1; v2; v3; .... vn]
> Mathematica:
> f /@ {v1, v2, v3, ..., vn}

OK, for simple lists those three languages equal Lisp in conciseness.
So what if you want to pass a more complicated structure to some function.
For example, how do you pass a nested ASSOC list to a function that
automatically traverses it matching keys against given keys to find
the correct branch at each level? Suppose that the function is
called fks which is an abbreviation for Find Key Sequence (in such
a nested assoc list implied).
(defparameter *nal1*
'(:TELEPHONECODES
(:USA "1" (:OREGON "503" "541" "971") (:NEVADA "702" "775")
(:NORTHDAKOTA "701"))
(:CANADA "1" (:MANATOBA 204))
(:BELGIUM "32" (:ANTWERP "3") (:BRUSSELS "2"))
(:CROATIA "385" (:DUBROVNIK "20") (:ZAGREV "1"))
))
(fks '(:TELEPHONECODES :CROATIA :DUBROVNIK) *nal1*) => ("385" "20")
(fks '(:FOOBAR :USA)) ;signals error, toplevel tag wrong
(fks '(:TELEPHONECODES :MEXICO :CANCUN) ;signals error, inner tag #1 missing
(fks '(:TELEPHONECODES :USA :TEXAS) ;signals error, inner tag #2 missing

Would that be equally simple to directly code in those three
languages you suggest?

> ... modern functional languages like Haskell, OCaml and F# ...
> They [users] clearly do not have a problem with the supreme brevity
> of these languages.

In cases where those particular languages are not already brief as
delivered, how easy is it to define a shorter syntax if you're
going to want a particular operation very terse because it's going
to be done a lot? I.e. how do they compare with Lisp's ability to
define macros?

> ... I was listing some of Lisp's most practically-important

> deficiencies. Lack of static typing in the language is one.

I disagree. I pointed out how static typing doesn't really solve
the problem at all because it's incapable of dealing with
intentional data types which is where the *real* problem lies.
Static type-checking of internal datatypes doesn't hardly begin to
solve the real problem. It's just an advertising gimmick that turns
out to be more cost than value. (The exception is in tight loops
where some machine type happens to work efficiently and where that
particular loop eats up most of the CPU time, which is precisely
where Lisp allows type declarations and consequent CPU-efficient
code as an *option* where it would actually be worth the cost.)

> Lack of concurrent GC in all Lisp implementations is another.

I gotta look up that to see what the term means ... first match is:
<http://xiao-feng.blogspot.com/2007/03/about-mostly-concurrent-gc.html>
That gives me some idea, but it doesn't seem like what you desire, right?

Skipping a few search results that don't look worth studying, I come to:
<http://groups.google.com/group/comp.lang.functional/browse_thread/thread/a288a989c5cd6acb/50b6ca2607173c91>
"So concurrent GC makes communication cheaper at the expense of everything else?"
That was written by none other than Jon D Harrop!
(The point was that if you have a shared mutable object, then you
need concurrent GC, but usually it's better design to have each
thread have its own mutable data and instead marshall data to pass
back and forth as needed, but then passing large objects back and
forth can cost a lot.)
There's a compromise between shared mutable data and passing large
data objects back and forth, which is to have each mutable data
object owned by just one process and everyone else must mutate or
examine it via queries in the OO style. So then only the handle on
the object, the parameters of the query (which may be handles on
other objects), and the return value (some piece of the object, or
a success/fail flag) need be passed between processes. It seems to
me that having an actual shared-mutable-object is almost as bad as
having a global variable that is mutable by multiple processes
(except for semaphores/interlocks/etc. which really *do* need to be
shared at a low level to make multi-processing work in the first place).
Later in that thread, it's pointed out that sharing memory doesn't
scale to really large applications, so you need to marshall data
(even if it's only ints) to pass between distant processors.
So in essence shared memory is a bad idea in large applications
where you need parallization in the first place, so you should
avoid shared memory, and thus there's no need for concurrent GC.
Later in that same thread, Harrop said "I'm hard pressed to find a
practically-important application that will be better off with a
concurrent GC." So if concurrent GC is essentially worthless, why
is Lisp's lack of it a reason for disliking Lisp??

<http://portal.acm.org/citation.cfm?id=1345206.1345238&coll=GUIDE&dl=GUIDE>
Hmm, this paper claims that "concurrent GC can share some of the
mechanisms required for transactional memory" thus as mechanisms
for the latter are made more efficient, same applies to former. But
I don't have time to look up yet another jargon term at bedtime to
know what the author is talking about.

So anyway, whether concurrent GC is worth the trouble or not, is debatable.

> Also lack of threads,

We're in agreement that the ANSI standard didn't include threads,
which is a major deficiency if you look to fully portable software
between different implementations of CL. Allegedly at that time the
Lisp community didn't have enough experience with threads to be
sure what version of them ought to be standardized. IMO the best
strategy is to keep the code specific to thread management totally
disjoint from the business logic so that thread management can be
simply replaced when porting to a different implementation.

> weak references,
[suffix "are all fundamental deficiencies of the language" for
these next several items]

It seems to me that a GC hook could be used to purge transient data
that hasn't been referenced recently. How easy is it to actually
implement this and get it working well with only ANSI CL to work
with?

> finalizers,

You mean like to automatically close a system resource whenever the
Lisp hook for it is GC'd? Normally system resources like that are
lexically bound, whereby UNWIND-PROTECT is sufficient to make sure
they get closed when that block of code is exited. Is there
*really* a good reason to have such resources kept around per data
object instead of per block-entry/exit? Somehow it seems to me a
dangerous design to have system resources that get closed at
"random" times (when memory runs short) instead of at clearly
defined times. But if you can think of a really good use for random
closure, I may reconsider my opinion.

> asychronous computations,

I'm not sure what you mean. Don't compilers already generate
efficient CPU data+instruction parallel-pipelining whenever they
detect that two operations are done in close proximity neither of
which depends on the result of the other?

> memory overflow recovery,

That might be moot on typical modern architectures that have a
small CPU cache and a large RAM and a truly gigantic virtual
address space. Locality of reference to avoid thrashing is a
problem long before total virtual memory or disk space for swapping
is ever exceeded.

> > Please write up a Web page that explains:
> > - What you precisely mean by "concurrent GC" (or find a Web page
> > that somebody else wrote, such as on WikiPedia, that says
> > exactly the same as what *you* mean, and provide the URL plus a
> > brief summary or excerpt of what that other Web page says).
> See Jones and Lins "Garbage Collection: algorithms for automatic dynamic
> memory management" chapter 8.

I don't have access to that. Is there a summary on the Web?

> In a serial GC, the program often (e.g. at every backward branch)
> calls into the GC to have some collection done. Collections are
> typically done in small pieces (incrementally) to facilitate
> soft-real time applications but can only use a single core. For
> example, OCaml has a serial GC so OCaml programs wishing to use
> multiple cores fork multiple processes and communicate between them
> using message passing which is two orders of magnitude slower than
> necessary, largely because it incurs huge amounts of copying that
> is not necessary on a shared memory machine:

On a shared-memory machine, how difficult is it to prevent
different processes from trying to read and write the same object
at the same time, whereupon the writer has only partly overwritten
the object when the reader starts to read it and sees pieces of old
and new object scrambled together?

> With a concurrent GC, the garbage collector's threads run
> concurrently with the program threads without globally suspending
> all program threads during collection. This is scalable and can be
> efficient but it is incredibly difficult to implement correctly.
> The OCaml team spent a decade trying to implement a concurrent GC
> and never managed to get it working, let alone efficient.

Maybe it's just too tricky (to implement correctly) to be worth
doing? Maybe the Lisp implementors were wise not to make that
gamble? Maybe it's "not ready for prime time", i.e. until somebody
finds a way to get it right, it's not worth fretting over the fact
that it hasn't been done, it's better to just live without it?

> > - List several kind of applications and/or API tools that are
> > hampered by lack of whatever that means.
> Any software that requires fine-grained parallelism for
> performance will be hampered by the lack of a concurrent GC.

List five such applications, and convince me that any technique
other than fine-grained parallelism is too inefficient for each
such application. Also convince me in each case that circular
structure is absolutely necessary, because after all if there are
no pointer loops then reference counting works just fine and no GC
is needed at all. Convice me that a hybrid system, using a symbol
table of tops of structures, with no loops within any of the
structures except that symbols in the symbol table may be
mentionned within structures, would not be sufficient for each such
application.

> >> 2. Even though Lisp's forte is as a language laboratory, Lisp has
> >> so little to offer but costs so much
> > Um, some implementations of Common Lisp are **free** to download
> > and then "use to your heart's content". How does that cost too much???
> Development costs are astronomical in Lisp compared to modern
> alternatives like F#, largely because it lacks a static type system

As I pointed out before, static type-checking doesn't solve enough
of the "problem" to be worth the trouble it causes. It doesn't
(can't) handle intentional types. Any language that imposes static
typing is what slows down development.

> but also because it lacks language features like pattern matching,

Here we go again.

> decent developer tools like IDEs

I used to use Macintosh Allegro Common Lisp on my Mac Plus.
Wasn't that a IDE?

> There are no decent Lisp implementations for .NET.

Why does anybody care? What, if anything, is good about .NET?

> Finally, there is no way you'll ever turn a profit because the
> market for commercial third-party software for Lisp is too small.

Consider leased server-side applications (CGI or PHP). How would
the customer know or care whether Lisp or some other language
happens to be used to implement it? Just about everyone on the net
uses the Google search engine. It's advertising based rather than
subscription, but ignore that detail for a moment. Do you know what
language(s) are used to implement all its internals? Do you care?

> I think you should aspire to earn money directly from customers
> rather than asking people to give you money to develop your
> software.

I like that idea very much. Will you find me a potential customer
who wants some serverside application that doesn't already exist?

Robert Maas, http://tinyurl.com/uh3t

unread,

Jul 5, 2008, 11:23:55 PM7/5/08

> Date: Tue, 24 Jun 2008 10:34:43 GMT

Why this response is so belated:
<http://groups.google.com/group/misc.misc/msg/cea714440e591dd2>
= <news:rem-2008...@yahoo.com>

> From: "Bartc" <b...@freeuk.com>
> >> What would your language of choice be for implementing Lisp?
> > A bootstrapping/crosscompiling process involving extended versions
> ...
> > Of course you could *cheat* by using an existing fullfledged
> > implementation of Common Lisp, which *was* coded in C, to implement
> > the entire LAP and SYSLISP compiler as functions within that
> > existing implementation, and then build more and more of the *new*
> > Lisp implementation by generating the executable directly from that
> > existing+LAP+SYSLISP environment. That would let you shortcut some
> > of the early bootstrapping levels. But that's not as much fun, and
> > really it is cheating to use anything that passed through C for
> > your cross-compiler, and it really would be *cheating*.
> OK, so you're going out of your way to avoid using C, for reasons
> I don't fully understand (or perhaps I do :-)

Well for starters, there's thing about how much of the camel's nose
do you allow into the tent. If you allow any use of C whatsoever,
there's a tendancy to get lazy and rely on C for just about
everything. It's sorta hypocritical to claim Lisp is better than C,
but for Lisp itself to rely on C all over the place.

> Although I would just have used assembler, if conventional HLLs
> were out; Lisp doesn't look that difficult to implement, at first
> glance...

One approach is to avoid all HLL (High Level Languages, I presume)
at all stages in bootstrapping, writing the barebones Lisp
interpretor in assembly language. But if the assembler itself was
written in C, we're getting into hypocritical territory again. It's
really difficult to be sure we aren't relying on C or other HLL
somehow unless we start from a bare machine such as an Altair with
front panel or PDP-10 with paper-tape of DDT.

> > Does anybody have a PDP-10 that still works?
> alt.sys.pdp10

Ah, thanks. I browsed the group last night, then posted a query
there just now before posting this.

> Apparently one just got sold on eBay.

How much did somebody pay for it? I saw an ad in the newsgroup
offering to sell one for a price that only a rich collector (for a
technology museum presumably) could afford.

Hmmm, I just thought of a neat hack: Get the old case of a PDP-10
that is no longer working (or fake one), strip out all the
electronics etc., put in a modern micro-computer with a PDP-10
emulator (including emulator for the kind of hard disks PDP-10s
used way back then so the emulated software can include an entire
operating system including device drivers), use the rest of the
space for storage or some other unrelated use to avoid wasting the
space, or use the extra space for a server farm. Pretend like the
hard disk (IBM 3330 or somesuch) is in the "basement", and have a
piece of floor that can be lifted up to "get to the basement" but
actually have a back-lit hologram so it *looks* as if there was a
huge basement full of disk drives.

Robert Maas, http://tinyurl.com/uh3t

unread,

Jul 6, 2008, 3:36:43 AM7/6/08

> Date: Thu, 26 Jun 2008 14:21:31 -0700 (PDT)

Why this response is so belated:
<http://groups.google.com/group/misc.misc/msg/cea714440e591dd2>
= <news:rem-2008...@yahoo.com>

> From: thomas.mer...@gmx.at

> > Are there standard packages available to provide these as
> > "givens" with a well-documented API so that different application
> > programmers can read each other's code?
> It is not the intention of Seed7 that everybody re-invents the
> wheel. There is a well-documented API. The predefined statements of
> Seed7 are described here:
> http://seed7.sourceforge.net/manual/stats.htm

Given that such statements aren't in the *core* of the language,
but are added later when the library containing their definitions
is loaded (possibly when building the executable that is saved on
the disk to avoid the cost of loading the library again each time
the executable is run):
- Does Seed7 include a parser that reads Seed7 source-code syntax
(from an input stream such as from a text file, or from the
contents of a string) and produces a parse tree (as a pointy
structure)?
- If so, does this parser automatically get enhanced whenever a new
statement type is defined in some library that was loaded, so
that statements defined in the library can now be parsed?
- If so, is there also an inverse function that prettyprints from a
parse tree back out to textual source-code syntax?
- If so, does that prettyprinter function also get automatically
enhanced whenever a new statement type is defined, so that
statements of that new type can be printed out meaningfully?
I ask because in Lisp it's almost trivial to write a utility that
reads in source code, analyzes it for some properties such as
undeclared free variables or functions etc., in order to build a
master cross-reference listing for an entire project and to flag
globally undefined functions, and also it's trivial to write code
that writes code and then either executes it with the same
core-image or prettyprints it to a file to be compiled and/or
loaded later. So I wonder if Seed7 provides the primitives needed
to make the same kinds of tasks equally easy for Seed7 sourcecode.

> Other types and their functions (methods) are described here:
> http://seed7.sourceforge.net/manual/types.htm

| boolean conv A Conversion to boolean
| ( Type of argument A: integer,
| boolean conv 0 => FALSE,
| boolean conv 1 => TRUE )
Is the behaviour defined for other values given? Does it throw an
exception, or are compiler writers free to treat other integers any
way they feel, resulting in code that produces different results
under different implementations? The Common Lisp spec is careful to
have undefined behaviour only in cases where the cost of
prescribing the behaviour would have a good chance of greatly
increasing the cost of implementation. Is such the case here?

> > Again, these are such basic container types that they really ought
> > to be provided in a standard package. Are they?
> A short explanation of Seed7 container types is here:
> http://seed7.sourceforge.net/faq.htm#container_classes

| ord(A) Ordinal number
| ( Type of result: integer,
| ord(FALSE) => 0, ord(TRUE) => 1 )

So this is just the inverse of boolean conv?

Why even bother, unless this is a hackish way of conditionally
signalling an exception?

What distribution within that range, uniform or what?
Or implementation dependent? If some implementation wants to just
always return A with probability 1, that's conforming to the spec?
Or some other implementation wants to return A 90% of the time and
B 10% of the time and never any of the possible values strictly
between A and B, that would be conforming to the spec?
What sorts of datatypes are allowed for A and B?
Do they have to be of the same datatype, or can they be unrelated?
rand(3,9.5) rand(4.7SinglePrecision, 9.7DoublePrecision)
rand("FALSE",4.3) rand(FALSE,"TRUE") rand(TRUE,3)
Which if any of those expressions are conforming to the spec?

> Note that the 'and' and 'or' operators do not work correct when
> side effects appear in the right operand.

What is that supposed to mean??? If you use an OR expression to
execute the right side only if the left side is false, what
happens? I would expect what I said to happen, but the spec says
that doesn't work correctly? So what really happens?
(or (integerp x) (error "X isn't an integer")) ;Lisp equivalent

> The result an 'integer' operation is undefined when it overflows.

That's horrible! Does Seed7 provide any way to perform
unlimited-size integer arithmetic, such as would be useful to
perform cryptographic algorithms based on products of large prime
numbers, where the larger the primes are the more secure the
cryptographic system is?

| ! Faktorial

Is that how the word is spelled somewhere in the English-speaking world?

Most CPUs, or software long-division procedures, compute quotient
and remainder simultaneously.
Is there any way for a Seed7 program to get them together?
It's wasteful to throw away the remainder then need to multiply the
quotient by the divisor and subtract to generate a copy of the
remainder that was thrown away a moment earlier.
(multiple-value-setq (q r) (floor a b)) ;Can do it in Lisp

So -1 ** -1 is required by the spec to signal an exception, instead
of giving the mathematically correct result of -1?
While 0 ** 0 which is mathematically undefined is *required* to return 1?

I'm too tired to proofread your spec any further.

> http://seed7.sourceforge.net/manual/types.htm#hash

| flip(A) Deliver a hash with keys and values flipped
| ( Type of result: hash [baseType] array keyType )

How is that possible if the table isn't a 1-1 mapping??

> > What precisely do you mean by "templates"?
> What computer scientists mean when they speak about "templates"
> is explained here:
> http://en.wikipedia.org/wiki/Template_%28programming%29

I.e. exactly what C++ implements, everything else is different and
not the same thing and substancard compared to C++ templates,
right?

> > ] Although functions can return arbitrary complex values (e.g. arrays of
> > ] structures with string elements) the memory allocated for all
> > ] intermediate results is freed automatically without the help of a
> > ] garbage collector.
> > How?????
> I have improved the FAQ for this. See:
> http://seed7.sourceforge.net/faq.htm#garbage_collection

| Memory used by local variables and parameters is automatically freed
| when leaving a function.

That doesn't makes sense to me. Suppose there's a function that has
a local variable pointing to an empty collection-object (set, list,
array, hashtable, etc.) allocated elsewhere. Now the function runs
a loop that adds some additional data to that collection-object, so
the object is now larger than before. Now the function returns. How
much of that collection-object is "memory used by local variables
and parameters" hence "automatically freed when leaving a
function", and how much of that collection-object is *not* such and
hence *not* automatically freed upon leaving the function?

> > Debug-use case: A application is started. A sematic error (file
> > missing for example) throws user into break package. User fixes the
> > problem, but saves a pointer to some structure in a global for
> > later study.
> You are much too deep in the Lisp way of thinking. If a stack
> shrinks the elements popped from the top just don't exist any more.

So it's impossible in Seed7 to have a function create an object and
return it, because anything created by a function doesn't exist any
more after return?

> It is also a common bug in C/C++ and other similar languages when
> a function returns a pointer to some local data (which is at the

> stack) ...

Are you saying that in Seed7 it's impossible to have any data that
is not on the stack, hence it's impossible for any function to
allocate memory for some object and then *return* a pointer to that
object so that the caller can later work with that object?

> If you want such data to be available later you need to make a
> (deep) copy.

That's an utterly horrible design requirement!!! That makes it
impossible to keep around multiple generations of a balanced search
tree which share all structure except log(n) path difference from
each version to the next, and hence it's impossible to insert or
delete etc. in log(n) time because of the need to make a complete
deep copy of the structure before you do the log(n) ins/del
operation on one of the two copies. Consider for example an
application that explores a space in a nested (tree-like) way, with
backtracking when it looks like some branch is not going to be
productive, but with continuations to re-activate a backed-from
branch when it turns out that the alternative is even less likely
to be productive. Each time it descends into a branch, it updates a
state tree in a non-destructive way so that both the current
branch-state (and any continuation of it) can co-exist with the
parent-state. As an example of such an algorithm, each branch is
evaluated as to likelihood of success, and a priority queue,
probably implemented as a HEAP (as in heapsort) is used to keep
track of all the currently pending branches in the search tree,
with each entry in the priority queue being one of the
balanced-binary-search state-trees. So with shared structures, as
is easy in Lisp, it takes log(n) time to pick the next branch to
explore, where n is the number of active branches, and it takes
log(k) time to non-destructively modify a branch-state BST when
moving from one branch to a sub-branch of it. It doesn't sound like
such an algorithm is even possible in Seed7, at least not with
log(k) time, more like k time to make a complete copy of the BST
each time it's modified while needing to keep the original too.

> > ;Static type checking fails to detect this type-mismatch undefined-method error.

> ... Every bug found at compile-time will not make you trouble at

> run-time. The earlier you can eliminate bugs the better.

Why do you feel the need to have two different times, one where
static stuff is checked but you have no idea what's really
happening, and then one where things actually happen? I suppose in
Seed7 it's impossible to have an interactive loop where you can
type in a line of code and it *immediately* does something and you
*immediately* see whether it did what you expected it to do instead
of needing to wait until the whole program is compiled before you
can see what that one line of code did?

> ... There is also the type 'bigInteger' which serves as

> unlimited-size signed integer. The type 'bigInteger' is explained
> here:
> http://seed7.sourceforge.net/manual/types.htm#bigInteger

For bigIntegers, it's especially painful not to have division with
both quotient and remainder directly returned. It takes a lot of
extra time to multiply the quotient by the divisor (then subtract
that from the original value) to get back the remainder after
having thrown away the remainder in the first place.

> > What is the precise meaning of type 'char'?
> The 'char' values use the UTF-32 encoding, see:
> http://seed7.sourceforge.net/manual/types.htm#char

| The type 'char' describes UNICODE characters. The 'char' values use
| the UTF-32 encoding. In the source file a character literal is written
| as UTF-8 UNICODE character enclosed in single quotes. For example:
| 'a' ' ' '\n' '!' '\\' '2' '"' '\"' '\''

That's not written well at all. It doesn't make sense. \n is not a
UTF-8 coding of a single character, it's two different UTF-8
(US-ASCII subset thereof) characters which are a C convention for
newline.
Likewise \\ is two UTF-8 (US-ASCII) characters.
Likewise \" is two UTF-8 (US-ASCII) characters.
Likewise \' is two UTF-8 (US-ASCII) characters.

I don't think you have any idea what UTF-8 really means.

> AFAIK Java and C# use immutable strings. If you concatenate a lot
> of elements to a string the immutable strings are copied a lot.

You don't know what you're talking about. There is no method in
Java that given a String object adds an element to it. There is an
*operator* (effectively a function but using infix syntax) that
concatenates two or more strings to produce a new string, i.e. what
you have is an ordinary expression that takes input values and
returns a new value, not an instance method that modifies an object
or even tries to modify an object.

> This is the reason that Java introduced the mutable strings
> classes StringBuffer and StringBuilder.

I don't see StringBuilder in the JavaDoc for the API online, so
maybe that's something brand-new in the most recent version of
Java. Anyway, yes StringBuffer *does* have methods for adding new
characters to the end of the string it represents.

But all this is irrelevant. Java's class String is a special case,
one of the very few immutable-object classes in Java. Saying "if
you concatenate to a String" is wrong, because you simply cannot
contactenate to a String in the first place. In general, objects
(with these rare exceptions) are mutable in Java and Lisp etc.
If a language has no mutable objects whatsoever, if the slightest
change to *any* object requires performing a deep copy to build a
whole new object, that IMO is not a good language design.

Jon Harrop

unread,

Jul 9, 2008, 5:35:09 PM7/9/08

Robert Maas, http://tinyurl.com/uh3t wrote:

>> Date: Sun, 22 Jun 2008 13:13:13 +0100
> Why this response is so belated:
> <http://groups.google.com/group/misc.misc/msg/cea714440e591dd2>
> = <news:rem-2008...@yahoo.com>
>> From: Jon Harrop <j...@ffconsultancy.com>
>> Haskell, SML, OCaml, Mathematica, F# and Scala all allow real
>> problems to be solved much more concisely than with Lisp. Indeed, I
>> think it is difficult to imagine even a single example where Lisp
>> is competitively concise.
>
> What does "solved much more concisely" mean??

The solutions are shorter in modern FPLs.

> So why don't you say what you really mean, that some particular
> application types are better represented in published libraries in
> your favorite language than in mine (Lisp)?

That is neither what I meant nor what I said.

> And people who like
> Lisp will respond that some *other* application types are better
> represented in published libraries in Lisp language than in
> *yours*? Why don't you compile a list of all the major application
> types that you feel are needed by the largest number of potential
> customers/users, and allow the rest of us to propose additions to
> your list? For example, IMO the most commonly useful application
> types are:
> - Business operations (accounts receivable/payable, inventory, payroll)
> - Text editing (both plain text and "word processing" formatted text)
> - Web browsing/surfing/searching
> - E-mail (both plain text and MIME multi-media) including protection
> against spam - Conferencing and other public/group forums
> - Security, including protection against computer
> viruses/worms/trojans/bots There are probably five or ten more that I
> could add if I had the time/energy to think of them. Anyway, after we get
> a complete list, based on concensus of everyone reading comp.programming
> on a regular basis and seeing this thread, *then* you need to say how
> well your favorite language addresses these common application
> needs.
>
> Note that algebraic datatypes are not among the major applications I
> listed.

Common != useful.

> ...

> (defcurry quadratic (a) (b c x) (+ (* a x x) (* b x) c))

> ...

>> In F#:
>> let inline quadratic a b c x = a*x*x + b*x + c
>
> That's more verbose than my use of defcurrypoly.

No it isn't. Also, you did not define "defcurry".

> Does F# provide a way to carry out the abstraction of defcurrypoly?

The "inline" did precisely that.

> Also, it's absurd that the keyword for generating a curried
> function is "inline" instead of "curry".

That is not what the "inline" was for.

Here is the OCaml/F# including the definition of the searching function:

let nal1 =
[ "USA", ("1", [ "Oregon", ["503"; "541"; "971"];
"Nevada", ["702"; "775"];
"NorthDekota", ["701"] ]);
"Canada", ("1", ["Manatoba", ["204"]]);
"Belgium", ("32", ["Antwerp", ["3"]; "Brussels", ["2"]]);
"Croatia", ("385", ["Dubrovnik", ["20"]; "Zagrev", ["1"]]) ]

let search key map k =
let h, t = List.assoc key map in
h :: k t

search "Croatia" nal1 (List.assoc "Dubrovnik")

The Mathematica is essentially the same as the Lisp because it also lacks
static type checking.

>> ... modern functional languages like Haskell, OCaml and F# ...
>> They [users] clearly do not have a problem with the supreme brevity
>> of these languages.
>
> In cases where those particular languages are not already brief as
> delivered, how easy is it to define a shorter syntax if you're
> going to want a particular operation very terse because it's going
> to be done a lot? I.e. how do they compare with Lisp's ability to
> define macros?

Macros are very rare in modern languages. OCaml has a full macro system but
it is rarely used. F# has no macro system.

>> ... I was listing some of Lisp's most practically-important
>> deficiencies. Lack of static typing in the language is one.
>
> I disagree. I pointed out how static typing doesn't really solve
> the problem at all because it's incapable of dealing with
> intentional data types which is where the *real* problem lies.
> Static type-checking of internal datatypes doesn't hardly begin to
> solve the real problem. It's just an advertising gimmick that turns
> out to be more cost than value.

Why do you think statically typed languages completely dominate general
purpose programming?

> (The exception is in tight loops
> where some machine type happens to work efficiently and where that
> particular loop eats up most of the CPU time, which is precisely
> where Lisp allows type declarations and consequent CPU-efficient
> code as an *option* where it would actually be worth the cost.)

Lisp is unable to regain the performance cost of its dynamic design:

http://www.ffconsultancy.com/languages/ray_tracer/results.html

>> Lack of concurrent GC in all Lisp implementations is another.
>
> I gotta look up that to see what the term means ... first match is:
> <http://xiao-feng.blogspot.com/2007/03/about-mostly-concurrent-gc.html>
> That gives me some idea, but it doesn't seem like what you desire, right?

That is a step in the right direction.

The problem is that my tests were all ideally suited to distributed
parallelism when, in fact, even embarassingly parallel programs often
require substantial scattering and gathering of data. Having a
concurrent GC makes it possible for programs to share mutable data
structures which removes the cost of scatter and gather.

For example, I found that a parallel matrix multiply is up to 100x faster in
F# than OCaml:

http://groups.google.com/group/fa.caml/msg/c3dbf6c5cdb3a898

For programs like this (which are very common), scattering and gathering are
free with a concurrent GC but can be extremely expensive without.
Consequently, F# sees performance improvements from parallelization for a
much wider variety of tasks than OCaml can.

> So anyway, whether concurrent GC is worth the trouble or not, is
> debatable.

If you have >1 core, it is not debateable.

>> weak references,
> [suffix "are all fundamental deficiencies of the language" for
> these next several items]
>
> It seems to me that a GC hook could be used to purge transient data
> that hasn't been referenced recently. How easy is it to actually
> implement this and get it working well with only ANSI CL to work
> with?

AFAIK, it is inherently implementation dependent.

>> finalizers,
>
> You mean like to automatically close a system resource whenever the
> Lisp hook for it is GC'd? Normally system resources like that are
> lexically bound, whereby UNWIND-PROTECT is sufficient to make sure
> they get closed when that block of code is exited. Is there
> *really* a good reason to have such resources kept around per data
> object instead of per block-entry/exit?

Yes. Finalizers are useful for the same reason that GC itself is good (and
Ada is bad): you don't want to have to shoehorn everything you do into
scopes just to ensure that things get freed correctly. Another reason is
that garbage collectors often free sooner than that.

> Somehow it seems to me a
> dangerous design to have system resources that get closed at
> "random" times (when memory runs short) instead of at clearly
> defined times.

Absolutely. Finalizers are not suitable if you require determinism.

> But if you can think of a really good use for random closure, I may
> reconsider my opinion.

I used finalizers in OCaml to collect unused OpenGL display lists and
textures. The results were excellent: the code was greatly simplified at no
cost.

>> asychronous computations,
>
> I'm not sure what you mean. Don't compilers already generate
> efficient CPU data+instruction parallel-pipelining whenever they
> detect that two operations are done in close proximity neither of
> which depends on the result of the other?

I was referring to the use of green threads from a monadic style, like F#'s
asynchronous workflows. You need to rewrite the code to CPS which can be
done in Lisp but is not provided as standard (and would be useful for a
wide variety of concurrent programming tasks).

>> memory overflow recovery,
>
> That might be moot on typical modern architectures that have a
> small CPU cache and a large RAM and a truly gigantic virtual
> address space. Locality of reference to avoid thrashing is a
> problem long before total virtual memory or disk space for swapping
> is ever exceeded.

The vast majority of computers do not have virtual memory because they are
embedded devices.

>> In a serial GC, the program often (e.g. at every backward branch)
>> calls into the GC to have some collection done. Collections are
>> typically done in small pieces (incrementally) to facilitate
>> soft-real time applications but can only use a single core. For
>> example, OCaml has a serial GC so OCaml programs wishing to use
>> multiple cores fork multiple processes and communicate between them
>> using message passing which is two orders of magnitude slower than
>> necessary, largely because it incurs huge amounts of copying that
>> is not necessary on a shared memory machine:
>
> On a shared-memory machine, how difficult is it to prevent
> different processes from trying to read and write the same object
> at the same time, whereupon the writer has only partly overwritten
> the object when the reader starts to read it and sees pieces of old
> and new object scrambled together?

Depends who you ask. The Java community don't seem to have a problem but the
Haskell community will tell you that it is impossible (they often tell me
that what I do is impossible).

>> With a concurrent GC, the garbage collector's threads run
>> concurrently with the program threads without globally suspending
>> all program threads during collection. This is scalable and can be
>> efficient but it is incredibly difficult to implement correctly.
>> The OCaml team spent a decade trying to implement a concurrent GC
>> and never managed to get it working, let alone efficient.
>
> Maybe it's just too tricky (to implement correctly) to be worth
> doing?

If that were true then the JVM and .NET would not be worth having but, in
reality, they are two of the most highly valued pieces of software ever
written.

> Maybe the Lisp implementors were wise not to make that gamble?

The Lisp implementors had no choice: they cannot write a concurrent GC
because it is too hard for them. Same goes for all other open source FPLs as
well.

> Maybe it's "not ready for prime time", i.e. until somebody
> finds a way to get it right, it's not worth fretting over the fact
> that it hasn't been done, it's better to just live without it?

The JVM and .NET are prime time.

>> > - List several kind of applications and/or API tools that are
>> > hampered by lack of whatever that means.
>>
>> Any software that requires fine-grained parallelism for
>> performance will be hampered by the lack of a concurrent GC.
>
> List five such applications, and convince me that any technique
> other than fine-grained parallelism is too inefficient for each
> such application.

Polygon tesselation, Fast Fourier transform, LU decomposition, suffix trees,
the distinct element method...

> Also convince me in each case that circular
> structure is absolutely necessary, because after all if there are
> no pointer loops then reference counting works just fine and no GC
> is needed at all. Convice me that a hybrid system, using a symbol
> table of tops of structures, with no loops within any of the
> structures except that symbols in the symbol table may be
> mentionned within structures, would not be sufficient for each such
> application.

In theory, you can code everything in assembler. In practice, it is not
feasible. The same applies here: high-level languages make all of those
parallel tasks much easier.

>> >> 2. Even though Lisp's forte is as a language laboratory, Lisp has
>> >> so little to offer but costs so much
>> > Um, some implementations of Common Lisp are **free** to download
>> > and then "use to your heart's content". How does that cost too much???
>> Development costs are astronomical in Lisp compared to modern
>> alternatives like F#, largely because it lacks a static type system
>
> As I pointed out before, static type-checking doesn't solve enough
> of the "problem" to be worth the trouble it causes.

I disagree.

> Any language that imposes static typing is what slows down development.

I have found that modern statically typed languages let me develop
complicated applications much faster than any dynamic language.

>> decent developer tools like IDEs
>
> I used to use Macintosh Allegro Common Lisp on my Mac Plus.
> Wasn't that a IDE?

Compare it to Mathematica (which has the world's best IDE, IMHO).

>> There are no decent Lisp implementations for .NET.
>
> Why does anybody care? What, if anything, is good about .NET?

Lots of wealthy .NET users leading to high profits is good.

>> Finally, there is no way you'll ever turn a profit because the
>> market for commercial third-party software for Lisp is too small.
>
> Consider leased server-side applications (CGI or PHP). How would
> the customer know or care whether Lisp or some other language
> happens to be used to implement it?

In theory, they would not care. In practice, I am not familiar with any such
businesses.

> Just about everyone on the net
> uses the Google search engine. It's advertising based rather than
> subscription, but ignore that detail for a moment. Do you know what
> language(s) are used to implement all its internals?

Mostly C/C++.

> Do you care?

Yes but as a programmer rather than a Googler.

>> I think you should aspire to earn money directly from customers
>> rather than asking people to give you money to develop your
>> software.
>
> I like that idea very much. Will you find me a potential customer
> who wants some serverside application that doesn't already exist?

You need to write a compelling sales page and make it indexable by search
engines and make it as easy as possible for users to give you their money
before plastering all of the news websites that you can find with
descriptions of your new service and links to it. Then you need to
repeatedly update it and report new news stories about it for many months
before you will start to see significant website hits. Only hits on sales
pages correlate with sales.

As a guide, our products make around 20p profit per page view.

Alf P. Steinbach

unread,

Jul 9, 2008, 5:44:13 PM7/9/08

* Jon Harrop:

> Robert Maas, http://tinyurl.com/uh3t wrote:
>> So anyway, whether concurrent GC is worth the trouble or not, is
>> debatable.
>
> If you have >1 core, it is not debateable.

This seems to be a misconception. Having GC run on its own core is only "free"
(not quoting you) for a single-threaded program running in a single-tasking
operating system. It seems to be the same kind of misconception that generated
the SPARC register "window" system to avoid going to main memory for stack --
where end result is just more overhead for thread and process context switches.

Not that GC on own core isn't nice.

It's just not free, and not not debatable. ;-)

Cheers,

- Alf

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

thomas...@gmx.at

unread,

Jul 11, 2008, 6:49:28 AM7/11/08

On 6 Jul., 09:36, jaycx2.3.calrob...@spamgourmet.com.remove (Robert

Maas, http://tinyurl.com/uh3t) wrote:
> > Date: Thu, 26 Jun 2008 14:21:31 -0700 (PDT)
>
> Why this response is so belated:
> <http://groups.google.com/group/misc.misc/msg/cea714440e591dd2>
> = <news:rem-2008...@yahoo.com>

Anyway, thank you for the feedback.

> > From: thomas.mer...@gmx.at
> > > Are there standard packages available to provide these as
> > > "givens" with a well-documented API so that different application
> > > programmers can read each other's code?
> > It is not the intention of Seed7 that everybody re-invents the
> > wheel. There is a well-documented API. The predefined statements of
> > Seed7 are described here:
> >http://seed7.sourceforge.net/manual/stats.htm
>
> Given that such statements aren't in the *core* of the language,
> but are added later when the library containing their definitions

> is loaded ...
Actually the statements are added during the parsing process.

> (possibly when building the executable that is saved on
> the disk to avoid the cost of loading the library again each time
> the executable is run):
> - Does Seed7 include a parser that reads Seed7 source-code syntax
> (from an input stream such as from a text file, or from the
> contents of a string) and produces a parse tree (as a pointy
> structure)?

Actually there are the functions 'parseFile' and 'parseStri' which
can be used to parse Seed7 source programs into a values of the
type 'program'. I just added a short description about the type
'program' to the manual. See:
http://seed7.sourceforge.net/manual/types.htm#program

Note that I try to handle the program currently executed and the
program which was parsed into a 'program' variable to be separate.
It is possible to execute 'program' values and to request the code
of a 'program' value in a structured form. Actually the Seed7 to C
compiler uses this feature to generate the C code. Currently the
features of this reflection are designed to make them usable for the
compiler.

It is not my intend to support programs which manipulaten their own
code as it is done with "self modifying code".

> - If so, does this parser automatically get enhanced whenever a new
> statement type is defined in some library that was loaded, so
> that statements defined in the library can now be parsed?

Yes. This is something happening during the parsing process.
Loading a library at runtime as a way to introduce new statements
for the program which is currently running IMHO makes no sense.

> - If so, is there also an inverse function that prettyprints from a
> parse tree back out to textual source-code syntax?

During the parsing some information, such as whitespace and comments
are lost. Some information about the position of an expression
is maintained. Generally I think that such a prettyprinter would be
possible.

> - If so, does that prettyprinter function also get automatically
> enhanced whenever a new statement type is defined, so that
> statements of that new type can be printed out meaningfully?

I have nothing done in this direction, but I think that that should
be possible. It might me necessary to extend the the reflection, if
some functionality necessary for prettyprinting, is missing.

> I ask because in Lisp it's almost trivial to write a utility that
> reads in source code, analyzes it for some properties such as
> undeclared free variables or functions etc., in order to build a
> master cross-reference listing for an entire project and to flag
> globally undefined functions, and also it's trivial to write code
> that writes code and then either executes it with the same
> core-image or prettyprints it to a file to be compiled and/or
> loaded later. So I wonder if Seed7 provides the primitives needed
> to make the same kinds of tasks equally easy for Seed7 sourcecode.

Such things are planned for Seed7. I have started with a program
which generates html documentation including a source file where
every use of a function is linked to its definition. The program
(doc7) works to some degree, but not good enough to release it.

> > Other types and their functions (methods) are described here:
> >http://seed7.sourceforge.net/manual/types.htm
>
> | boolean conv A Conversion to boolean
> | ( Type of argument A: integer,
> | boolean conv 0 => FALSE,
> | boolean conv 1 => TRUE )
> Is the behaviour defined for other values given? Does it throw an
> exception, or are compiler writers free to treat other integers any
> way they feel, resulting in code that produces different results
> under different implementations? The Common Lisp spec is careful to
> have undefined behaviour only in cases where the cost of
> prescribing the behaviour would have a good chance of greatly
> increasing the cost of implementation. Is such the case here?

This function was added short ago to be helpful for the I4 P-Code
interpreter which would be used for the P4 Pascal compiler.
Yes, I transfered this classic Pascal compiler to Seed7...
The I4 interpreter needs cheap functions to transfer all its
basic types like boolean, float, char, set to and from integer.
The Pascal version of the I4 interpreter uses an unchecked
cased record (which is equivalent to a C union). Therefore I
introduced this function. It was a mistake to add this function
to the documentation, since it is currently only experimental.
BTW it works like the odd function.

> > > Again, these are such basic container types that they really ought
> > > to be provided in a standard package. Are they?
> > A short explanation of Seed7 container types is here:
> >http://seed7.sourceforge.net/faq.htm#container_classes
>
> | ord(A) Ordinal number
> | ( Type of result: integer,
> | ord(FALSE) => 0, ord(TRUE) => 1 )
>
> So this is just the inverse of boolean conv?

Yes, but the introduction of 'boolean conv' was just experimental.
The function 'odd(integer)' is the function to be preferred
to convert an integer into a boolean.

Such functions are present to be usable in generic code. That way
the generic code can assume that the 'succ' function is present. The
'incr(A)' function, which is just a shortcut for 'A := succ(A)', is
also present just for this purpose. An example of a template
function which uses 'incr' is here:
http://seed7.sourceforge.net/examples/for_decl.htm

> | rand(A, B) Random value in the range [A, B]
> | ( rand(A, B) returns a random value such that
> | A <= rand(A, B) and rand(A, B) <= B holds.
> | rand(A, A) => A,
> | rand(TRUE, FALSE) => EXCEPTION RANGE_ERROR )
>
> What distribution within that range, uniform or what?

Uniform. I added an explanation to the documentation.

> What sorts of datatypes are allowed for A and B?
> Do they have to be of the same datatype, or can they be unrelated?
> rand(3,9.5) rand(4.7SinglePrecision, 9.7DoublePrecision)
> rand("FALSE",4.3) rand(FALSE,"TRUE") rand(TRUE,3)
> Which if any of those expressions are conforming to the spec?

There is a general rule to keep the descriptions short. This rule
can be found at the beginning of the chapter "PREDEFINED TYPES":
The operators have, when not stated otherwise, the type described
in the subchapter as parameter type and result type.

> > Note that the 'and' and 'or' operators do not work correct when
> > side effects appear in the right operand.
>
> What is that supposed to mean??? If you use an OR expression to
> execute the right side only if the left side is false, what
> happens? I would expect what I said to happen, but the spec says
> that doesn't work correctly? So what really happens?

This is a bug in the spec. I have corrected the sentence to:
Note that this early termination behaviour of the 'and' and 'or'
operators also has an influence when the right operand has side
effects.

> (or (integerp x) (error "X isn't an integer")) ;Lisp equivalent
>
> > The result an 'integer' operation is undefined when it overflows.
>
> That's horrible!

AFAIK many languages such as C, C++ and Java have this behaviour.
I would like to raise exceptions in such a case, but as long
as there is no portable support for that in C, Posix or some
other common standard, it would be hard to support it with
satisfactory performance.

> Does Seed7 provide any way to perform
> unlimited-size integer arithmetic, such as would be useful to
> perform cryptographic algorithms based on products of large prime
> numbers, where the larger the primes are the more secure the
> cryptographic system is?

Yes.

> | ! Faktorial
>
> Is that how the word is spelled somewhere in the English-speaking world?

This is how a word looks like when it is not translated correctly.
Thank you for pointing this out.

> | div Integer division truncated towards zero
> | ( A div B => trunc(flt(A) / flt(B)),
> | A div 0 => EXCEPTION NUMERIC_ERROR )
> | rem Reminder of integer division div
> | ( A rem B => A - (A div B) * B,
> | A rem 0 => EXCEPTION NUMERIC_ERROR )
>
> Most CPUs, or software long-division procedures, compute quotient
> and remainder simultaneously.
> Is there any way for a Seed7 program to get them together?

Currently not, but it is not hard to add such a thing.

> It's wasteful to throw away the remainder then need to multiply the
> quotient by the divisor and subtract to generate a copy of the
> remainder that was thrown away a moment earlier.

I guess that a good optimizing compiler can recognize the situation
when 'a div b' and 'a rem b' are computed close together without
changing a or b in between. Since Seed7 is compiled to C, I think
that I can rely on the C compiler to do this optimisation.

> (multiple-value-setq (q r) (floor a b)) ;Can do it in Lisp
>
> | ** Power
> | ( A ** B is okay for B >= 0,
> | A ** 0 => 1,
> | 1 ** B => 1,
> | A ** -1 => EXCEPTION NUMERIC_ERROR )
>
> So -1 ** -1 is required by the spec to signal an exception, instead
> of giving the mathematically correct result of -1?

In the general case 'a ** -1' does not have an integer result.
AFAIK Ada also does it that way.
BTW: The type float also has exponentiation operators defined.

> While 0 ** 0 which is mathematically undefined is *required* to return 1?

This behaviour is borrowed from FORTRAN, Ada and some other
programming languages which support exponentiation.

> I'm too tired to proofread your spec any further.
>
> >http://seed7.sourceforge.net/manual/types.htm#hash
>
> | flip(A) Deliver a hash with keys and values flipped
> | ( Type of result: hash [baseType] array keyType )
>
> How is that possible if the table isn't a 1-1 mapping??

That's the reason the result is of type:
hash [baseType] array keyType.
The values in the hash tables are arrays with keyType elements.

> > > What precisely do you mean by "templates"?
> > What computer scientists mean when they speak about "templates"
> > is explained here:
> >http://en.wikipedia.org/wiki/Template_%28programming%29
>
> I.e. exactly what C++ implements, everything else is different and
> not the same thing and substancard compared to C++ templates,
> right?

Wrong. I use the word template to describe a function which is
executed at compile time and declares some things while executing
(at compile time). For example: The function 'FOR_DECLS' is used to
declare for loops. FOR_DECLS gets a type as parameter and declares
a for loop for that type. This is explained here:
http://seed7.sourceforge.net/examples/for_decl.htm

As you can see it is necessary to call template functions explicit.
They are not invoked implicit as the C++ template functions.
IMHO this explicit calls of template functions make the program
easier to read. Maybe I should add something to the FAQ.

> > > ] Although functions can return arbitrary complex values (e.g. arrays of
> > > ] structures with string elements) the memory allocated for all
> > > ] intermediate results is freed automatically without the help of a
> > > ] garbage collector.
> > > How?????
> > I have improved the FAQ for this. See:
> >http://seed7.sourceforge.net/faq.htm#garbage_collection
>
> | Memory used by local variables and parameters is automatically freed
> | when leaving a function.
>
> That doesn't makes sense to me. Suppose there's a function that has

> a local variable pointing ...
Pointers are something else. If you are using pointers you are
responsible to manage that they point to reasonable data.

The automatic freeing of local variables has exceptions (sorry
I will add an explanation to the FAQ). The values referred by
pointers and the values refered by interface types are not managed
automatically.

> to an empty collection-object (set, list,
> array, hashtable, etc.) allocated elsewhere. Now the function runs
> a loop that adds some additional data to that collection-object, so
> the object is now larger than before. Now the function returns. How
> much of that collection-object is "memory used by local variables
> and parameters" hence "automatically freed when leaving a
> function", and how much of that collection-object is *not* such and
> hence *not* automatically freed upon leaving the function?

I see. We have a cultural misunderstanding.
Lets say the collection is an array.
What you were suggesting is a collection declared with:

array ptr myData

which means that the collection contains pointers to myData
(some structure). In this case you are right and automatic
management is not possible (at least in the sense I talked about).

What I have in my mind when talking about automatic managed memory
is a collection declared with:

array myData

In this case the collection contains (copies of) the actual data.
When such a collection is freed it can also free its content
since it owns it. And this are the things which done automatically
in a stack oriented manner.

If you use pointer structures for everything you are right that
a GC or a manually managed heap is necessary. In Seed7 many things
can be done with abstract datatypes.

If abstract datatypes are used in an efficient way there is not
so much need to use pointers in Seed7 is not so high as in some
other languages.

> > > Debug-use case: A application is started. A sematic error (file
> > > missing for example) throws user into break package. User fixes the
> > > problem, but saves a pointer to some structure in a global for
> > > later study.
> > You are much too deep in the Lisp way of thinking. If a stack
> > shrinks the elements popped from the top just don't exist any more.
>
> So it's impossible in Seed7 to have a function create an object and
> return it, because anything created by a function doesn't exist any
> more after return?

The return variable is excluded from this mechanism.

>
> > It is also a common bug in C/C++ and other similar languages when
> > a function returns a pointer to some local data (which is at the
> > stack) ...
>
> Are you saying that in Seed7 it's impossible to have any data that
> is not on the stack, hence it's impossible for any function to
> allocate memory for some object and then *return* a pointer to that
> object so that the caller can later work with that object?
>
> > If you want such data to be available later you need to make a
> > (deep) copy.
>

> That's ...
[snip reasons why not all data should be stack oriented]
I agree that some data cannot be managed in a stack oriented way.

> > > ;Static type checking fails to detect this type-mismatch undefined-method error.
> > ... Every bug found at compile-time will not make you trouble at
> > run-time. The earlier you can eliminate bugs the better.
>
> Why do you feel the need to have two different times, one where
> static stuff is checked but you have no idea what's really
> happening, and then one where things actually happen?

I think that compile-time type checking can find bugs which
slip through the fingers when you test your program.
IMHO even a test with 100% code coverage is not sufficient since
the combination of all places where values are generated and
all places where this values are used must be taken into account.
I have aggain improved the FAQ to contain this argumentation:
http://seed7.sourceforge.net/faq.htm#static_type_checking

> I suppose in
> Seed7 it's impossible to have an interactive loop where you can
> type in a line of code and it *immediately* does something and you
> *immediately* see whether it did what you expected it to do instead
> of needing to wait until the whole program is compiled before you
> can see what that one line of code did?

No. There is the type 'program' which can be used for that.

BTW the Seed7 parser usually processes 200000 lines per second.

> > ... There is also the type 'bigInteger' which serves as
> > unlimited-size signed integer. The type 'bigInteger' is explained
> > here:
> >http://seed7.sourceforge.net/manual/types.htm#bigInteger
>
> | div Integer division truncated towards zero
> | ( A div B => trunc(A / B),
> | A div 0_ => EXCEPTION NUMERIC_ERROR )
> | rem Reminder of integer division div
> | ( A rem B => A - (A div B) * B,
> | A rem 0_ => EXCEPTION NUMERIC_ERROR )
>
> For bigIntegers, it's especially painful not to have division with
> both quotient and remainder directly returned. It takes a lot of
> extra time to multiply the quotient by the divisor (then subtract
> that from the original value) to get back the remainder after
> having thrown away the remainder in the first place.

When there is demand, such a function can be added.

> > > What is the precise meaning of type 'char'?
> > The 'char' values use the UTF-32 encoding, see:
> >http://seed7.sourceforge.net/manual/types.htm#char
>
> | The type 'char' describes UNICODE characters. The 'char' values use
> | the UTF-32 encoding. In the source file a character literal is written
> | as UTF-8 UNICODE character enclosed in single quotes. For example:
> | 'a' ' ' '\n' '!' '\\' '2' '"' '\"' '\''
>
> That's not written well at all. It doesn't make sense.

Agree. Historically this where just character literal examples.
Some of them use escape sequences, which were explained below.
Now you get the impression that this are UTF-8 literal
examples which was not the original intend.

> UTF-8 coding of a single character, it's two different UTF-8
> (US-ASCII subset thereof) characters which are a C convention for
> newline.
> Likewise \\ is two UTF-8 (US-ASCII) characters.
> Likewise \" is two UTF-8 (US-ASCII) characters.
> Likewise \' is two UTF-8 (US-ASCII) characters.

This are escape sequences. They are explained in the next paragraph.
I have moved the character literal examples after the explanation
of escape sequences. That way you don't get the impression that
this is an explanation of UTF-8 literals.

> I don't think you have any idea what UTF-8 really means.

As I have implemented UTF-8 support for Seed7, I think I know
something about it.

Robert Maas, http://tinyurl.com/uh3t

unread,

Jul 26, 2008, 4:43:34 AM7/26/08

> >> Haskell, SML, OCaml, Mathematica, F# and Scala all allow real
> >> problems to be solved much more concisely than with Lisp. Indeed, I
> >> think it is difficult to imagine even a single example where Lisp
> >> is competitively concise.
> > What does "solved much more concisely" mean??

> From: Jon Harrop <j...@ffconsultancy.com>

> The solutions are shorter in modern FPLs.

OK, let's say you have a text string which contains the notation
for a nested list starting from the beginning of the string. How
many lines of code, in various modern FPLs, would it take to parse
that nested-list notation to produce an actual nested list
structure, and also report back in another value the position in
the string where the parse left off at the end of the nested-list
notation? In Common Lisp it's just one line of code:
(multiple-value-setq (ls ei) (read-from-string ts))
where ts has the given text string,
ls gets the resultant list structure,
and ei gets the resultant end index.

Now let's say you want to pick up where you left off, because after
that first nested list notation there's another within the same
text string you were given. How many lines of code in your favorite
FPLs? In Common Lisp, it's another one line of code:
(multiple-value-setq (ls2 ei2) (read-from-string ts :start ei))

My point is that different languages are based on different things
being important enough to put into the core language, but with
add-on libraries it's possible to make a lot of things be
essentially the same length in a whole bunch of languages. If you
don't mind Unix-style ultra-short names for commonly use utilities,
and you do a lot of this kind of stuff, if you really want to make
your code as short as possibly, you can use macros to abbreviate
the special operator and function used above, so the two lines of
code will easily fit into just one run-on line of code:
(mvs (ls ei) (rfs ts)) (mvs (ls2 ei2) (rfs ts :start ei))

So I propose when comparing languages that we allow liberal use of
macros (in languages such as Lisp which support them) to tailor the
function names to the same length in all languages, so that when we
look at the code we are comparing the size of the parse tree and
*not* the relative lengths of names of functions. And furthermore
we allow add-on libraries to compress the size of the parse tree in
the application itself down to just the part that's different from
one application to another. So the only real difference is whether
the syntax of the language (with all available help from macros and
libraries) supports the most natural expression of the application
of the tool(s).

So let's do a check-list. Which languages allow natural expression of:
- case expresssions (not case statements!)
- loops with a typical wide variety of ways to map various indices/pointers
across lists/ranges
- anonymous functions which lexically close variables from the
surrounding lexical scope
- a mix of symbols which are in a master table so that they can be
canonicalized upon input and symbols that are *not* in such a
master table so that multiple instances by the same name can be
created at runtime without them interferring with each other, and
where each symbol of either type can have an unlimited number of
tagged properties hanging off it, including the traditional
variable (value) and function-definition
- formal specification of the parse tree of source code as data
that can be manipulated at runtime just the same as ordinary data
can
- choice or mix of two ways to specialize functions and datatypes,
either by having a single function defined in one place which
explicitly dispatches on the type(s) of its parameters, and/or
having multiple definitions of the same-name function distributed
into multiple places where keeping track of all the variants of the
same-name function is automatic and dispatching from the same
generic function per parameter datatypes is set up automatically
- virtually identical semantics between code typed directly into a
read-eval-print loop one line at a time and lines of code embedded
within function definitions which are either entered from the REP
as a unit or loaded from a source file en masse or compiled to an
object file and then that file loaded, with easy intermix of all
four modes of lines-of-codes
Common Lisp can do all of those. How many of your favorite FPLs can?

> Macros are very rare in modern languages.

Why? Because their parse tree isn't well defined and documented, so
it would be impossible for the author of a macro (in the Lisp
sense) to know what to do? Or because the syntax of the source code
is so different from the parse tree that it would be just too
difficult to mentally keep switching back and forth between
thinking about the parse tree (when designing the function of the
macro, what the macro is supposed to do) and writing code
(including the code of the macro-expander itself)? Or because
authors of most modern languages are unaware how valuable macros
can be in improving the "out of the box" syntax to be better fit a
problem domain or to capture the essense of common design patterns
into a single syntax?

Let's think about design patterns more. The whole idea of a design
pattern is that the language itself doesn't support direct
expression of a design, so the programmer must constantly think of
the pattern and write code in that form to put the pieces of a
pattern together into a coherent hole that is self-consistent. But
with a macro, all the programmer need do is call the name of the
design pattern and the parameters to it, and the macro-expader
automatically generates the whole self-consistent design pattern.
So from the view of the programmer, it's no longer a "pattern",
it's a utility.

Here's an example of what was originally a design pattern:
(unwind-protect
(let ((chan (open "foo.txt")))
(process-file chan))
(close chan))
The idea is that if something goes wrong while the file is open,
and PROCESS-FILE throws control out of the scope of its toplevel,
the CLOSE of the channel will nevertheless be executed, so the file
**will** end up closed before control reaches the main level of
program above this code.

So with a macro, it's no longer necessary to go to that pain of
writing all the stuff in just the right relationship. Now it's just:
(with-open-file (chan "foo.txt") (process-file chan))

Now that particular macro wasn't in older versions of Lisp before
Common Lisp, but the design pattern was so commonly useful that the
macro was incorporated into Common Lisp. But back in those earlier
versions of Lisp, programmers didn't have to write the
UNWIND-PROTECT pattern over and over. Instead they just wrote a
macro once and then called it. They didn't have to wait for Common
Lisp to do it. They could realize all by themselves that the design
pattern was common, and that the macro would condense the pattern
into a foolproof utility, so they just did it without waiting for
Common Lisp to provide it built-in. That's the value of
user-definable macros.

Now there are probably a *lot* of design patterns used in Common
Lisp and other languages which are not yet covered by pre-defined
macros or other mechanisms, because at the time Common Lisp or the
other laguages were standardized those patterns weren't recognized
as common enough in use with well-established macro that it seemed
reasonable to incorporate them into the standard language. After
time passes, in some problem domains it becomes common knowledge
that certain design patterns are common enough that it *really*
would be nice if the language supported them as a compact
expression *instead* of requiring the program to repeat the design
pattern over and over and over. Here is where languages such as
Lisp which have parse-tree macros, and other languages that don't
have them, differ: In Lisp, the appliation programmer has the power
to define a macro and thereby convert a design pattern into a
simple call. In languages without macros, the programmer does *not*
have that option, and must forever write out the entire design
pattern over and over and over until and unless the people defining
the standard and implement it decide to make that simple call part
of the language.

> OCaml has a full macro system but it is rarely used.

Is it a parse-tree macro system, like Lisp has, or is it a
string-syntax macro system, like C has? If the former, then why
isn't it used? Perhaps as I guessed above, because the parse tree
is so different from the source syntax that it's confusing to
switch back and forth between the two? If it's a string-syntax
macro system like C, then it's total crap and I can understand why
it's rarely used.

> F# has no macro system.

Then F# is crap compared to Lisp.

> Why do you think statically typed languages completely dominate
> general purpose programming?

Because a lot of people don't know any better and are stuck
with installed code base (legacy code) which must be maintained.

> > (The exception is in tight loops
> > where some machine type happens to work efficiently and where that
> > particular loop eats up most of the CPU time, which is precisely
> > where Lisp allows type declarations and consequent CPU-efficient
> > code as an *option* where it would actually be worth the cost.)
> Lisp is unable to regain the performance cost of its dynamic design:

As long ago as around the late 1970's, MacLisp achieved a
reputation that tight code loops with proper declarations ran
faster than the equivalent loop in Fortran, because the code within
each function was identical but Lisp had more efficient
function-call linkage. Do you dispute that, or do you claim that
Common Lisp has taken a step backward and no longer is as fast in
tight loops with proper declarations as it should be?

By the way, there *is* one apparent problem with Common Lisp:
Getting those declarations correct is a rather tricky skill, and
different implementations of Common Lisp present different levels
of difficulty for optimizing. See the two threads where I had
trouble optimizing a benchmark of random access within large array,
and where I had trouble optimizing a brute-force trial
factorization, in both cases it was doing CONSing when it shouldn't
have. (SBCL was able to avoid CONSing, but with the same source
code the very old version of CMUCL that my ISP has was doing lots
of CONSing.)

> >> finalizers,
> > You mean like to automatically close a system resource whenever the
> > Lisp hook for it is GC'd? Normally system resources like that are
> > lexically bound, whereby UNWIND-PROTECT is sufficient to make sure
> > they get closed when that block of code is exited. Is there
> > *really* a good reason to have such resources kept around per data
> > object instead of per block-entry/exit?
> Yes. Finalizers are useful for the same reason that GC itself is
> good (and Ada is bad): you don't want to have to shoehorn
> everything you do into scopes just to ensure that things get freed
> correctly.

On a philosophical level, I totally agree with you on this point.
I'm convinced you're not a troll!
Although you frequently go overboard on exaggerations in favor of
other languages and against Lisp, you *are* bringing up good issues
and occasionally (as here) actually showing some good insight.
So I'm not sorry I engaged you in debate.

Now let's discuss the practical matter, on a case by case basis:

If you are reading a file, you probably want only one agent reading
it at a time, and you probably want to process the contents of the
file straight through (or bulk copy the whole contents of the file
to a text-string and immediately close the file and process from
the string rather than from the file itself). Accordingly lexical
scope and UNWIND-PROTECT (such as in the WITH-OPEN-FILE macro) is
totally sufficient.

If you have a connection to a remote database, you may have
multiple agents sharing a single connection. But you want to close
the connection whenever the connection goes idle too long, and
automatically re-open it whenever somebody starts to use it again,
rather than keep it open yet idle the whole time anybody has a
handle on it. Accordingly a clock-timeout mechanism makes more
sense than a finalizer tied into the GC.

I'm hard pressed to think of any case where the best mechanism is a
GC-finalizer rather than lexical scope or clock-timeout and
auto-reopen. Please enlighten me with a few examples (no details,
just the essence of the problem to be solved) where GC-finalizer is
the *best* solution to resource management, OK?

> Another reason is that garbage collectors often free sooner than that.

I don't really believe that. Either the pointer to the handle to
the resource is lexically scoped, in which case GC won't reclaim it
until that pointer goes out of scope, in fact won't reclaim it
until sometime *after* the pointer goes out of scope; Or you have
multiple pointers being passed around in a non-lexical way and GC
won't reclaim it until the very last of those pointers is
inaccessible, whereas clock-timeout will close the resource as soon
as those pointers are no longer actively being used, hence again GC
will close the resource *after* the other method. Sure, once in a
while all the multiple pointers will suddenly go away at once, and
GC will by chance happen sooner than the idle-timeout. But other
times GC won't have yet happened when idle-timeout happens. So it's
a race that can go either way with no consistent winner. Please
present a scenerio where GC gets there *first* more often than not.
Or an example where idle-timeout is *not* an option, where the
resource absolutely must be kept active until the very last of the
references to it is gone and GC then can reclaim it, even if none
of those references are actively using it.

> > But if you can think of a really good use for random closure, I may
> > reconsider my opinion.
> I used finalizers in OCaml to collect unused OpenGL display lists and
> textures. The results were excellent: the code was greatly
> simplified at no cost.

Sorry, I don't know what those are, and they don't seem to be a
central issue that is worth my spending a half hour doing a Google
search to find a tutorial to teach me what they are. Would you
please define their important characteristics in regard to this
discussion, i.e. where they are physically located, what reason
they have for being kept around (until all references are gone),
why they can't be easily re-created if idle-timeout causes their
deletion but re-activity forces their re-existance, why they can't
be kept around nearly *forever* (until they have been idle a week
or two and it really would be surely a waste to keep them around
longer), etc.? On Unix/Linux there's a concept of various processes
in a tree, with each process dependent on the next parent process.
When a parent dies, all children automatically die. It would make
sense for a display list or texture to be a property of a
sub-process, which stays around so long as the main process is
around, then when the main process dies the sub-processes are
auto-killed and the display list or texture then disappears. Why
doesn't that model fit what's actually happening with display lists
and textures? The only reason I can think of is when the main
process is some persistent daemon, such as the Apache server. But I
don't think the Apache server or any of the other persistent
daemons are written in OCaml, so there must be some other reason
why that sub-process model wouldn't solve the problem
(hypothetically if OCaml didn't have finalizers, or if you decided
to translate your application to some other language that doesn't
have finalizers). So if you can succinctly enlighten me, please do.

> The vast majority of computers do not have virtual memory because
> they are embedded devices.

Yeah, that's a different environment from anything I've ever
programmed for. I would suspect that in an embedded device you
wouldn't want GC in the first place, you'd do better with mostly
static allocation with perhaps some non-cyclic dynamic structures
where reference count would work just fine. But it's late and maybe
I've lost track of the context.

> >> I think you should aspire to earn money directly from customers
> >> rather than asking people to give you money to develop your
> >> software.
> > I like that idea very much. Will you find me a potential customer
> > who wants some serverside application that doesn't already exist?
> You need to write a compelling sales page and make it indexable
> by search engines and make it as easy as possible for users to give
> you their money before plastering all of the news websites that you
> can find with descriptions of your new service and links to it.

That's a strawman comparison. In fact before doing *either*, I need
to find some customers who desire some new server-side application
that is within my capability, and who are willing to work with me
to define use cases that I might offer to implement, and then once
that's all done *then* we start talking about a fixed-price
contract to implement those use cases, and alpha testing, etc. Then
after I have one set of satisied customers, only then would it be
reasonble to advertise to see if there might be lots of other
customers for an existing customer-pleasing application.

So can you find me a first brainstorm-use-cases customer to get
this whole process started? I've mentionned lots of new software
technology I've implemented to the pointer where I feel they might
possibly be useful for a variety of applications that might be of
value to others. But until I find a potential customer who is
actually interested in what I have done and how I can work it into
a salable product, I don't know if any of the wondrous things I've
programmed really could lead to a salable product.

By the way, Tuesday of last week I gave a partial demo of some of
my computer-assisted instruction software, and the person seeing
the demo seemed to think it would be of interest to two of his
technical acquaintances, one very far away (about 1000 miles) and
one closer (about 40 miles), so he'll be contacting them to try to
get them interested.

Richard Heathfield

unread,

Jul 26, 2008, 5:06:27 AM7/26/08

Robert Maas, http://tinyurl.com/uh3t said:

>> >> Haskell, SML, OCaml, Mathematica, F# and Scala all allow real
>> >> problems to be solved much more concisely than with Lisp. Indeed, I
>> >> think it is difficult to imagine even a single example where Lisp
>> >> is competitively concise.
>> > What does "solved much more concisely" mean??
>> From: Jon Harrop <j...@ffconsultancy.com>
>> The solutions are shorter in modern FPLs.
>
> OK, let's say you have a text string which contains the notation
> for a nested list starting from the beginning of the string. How
> many lines of code, in various modern FPLs, would it take to parse
> that nested-list notation to produce an actual nested list
> structure, and also report back in another value the position in
> the string where the parse left off at the end of the nested-list
> notation? In Common Lisp it's just one line of code:
> (multiple-value-setq (ls ei) (read-from-string ts))
> where ts has the given text string,
> ls gets the resultant list structure,
> and ei gets the resultant end index.

In any (even relatively sensible) language where it takes /more/ than one
line of code, it only takes more than one line the /first/ time - because
the programmer will wrap it into a function (or procedure, or subroutine,
or whatever) and, in future, call the function (or whatever). So this
conciseness argument is a mere canard, as all program tasks eventually
boil down to:

call existing_solution

which is a one-liner in pretty well any language.

Pascal J. Bourguignon

unread,

Jul 26, 2008, 7:06:55 AM7/26/08

Richard Heathfield <r...@see.sig.invalid> writes:

> In any (even relatively sensible) language where it takes /more/ than one
> line of code, it only takes more than one line the /first/ time - because
> the programmer will wrap it into a function (or procedure, or subroutine,
> or whatever) and, in future, call the function (or whatever). So this
> conciseness argument is a mere canard, as all program tasks eventually
> boil down to:
>
> call existing_solution
>
> which is a one-liner in pretty well any language.

Not in languages that don't provide syntactic abstraction.

Let's take a simple example, the Memento pattern:
http://en.wikipedia.org/wiki/Memento_pattern

Everytime you want to apply this pattern to a new class, you have to
add the following methods and class, take care not to miss a member to
save/restore/keep in the memento:

public Object saveToMemento() {
System.out.println("Originator: Saving to Memento.");
return new Memento(state);
}
public void restoreFromMemento(Object m) {
if (m instanceof Memento) {
Memento memento = (Memento)m;
state = memento.getSavedState();
System.out.println("Originator: State after restoring from Memento: "+state);
}
}

private static class Memento {
private String state;

public Memento(String stateToSave) { state = stateToSave; }
public String getSavedState() { return state; }

In lisp, we write the same, but only ONCE:

(defgeneric save-to-memento (<base-class>)
(:documentation
"Returns a new <memento-class> holding all the state of <base-class>"))

(defmacro define-memento (<memento-class> <base-class> (&rest <base-class-slots>))
`(progn
(defclass ,<memento-class> ()
,(loop
:for slot :in <base-class-slots>
:collect `(,slot
:initarg ,(intern (symbol-name slot)
(load-time-value (find-package "KEYWORD"))))))
(defmethod save-to-memento ((self ,<base-class>))
(format *trace-output* "~&~S saving to memento.~%" self)
(make-instance ',<memento-class>
,@(loop
:for slot :in <base-class-slots>
:collect (intern (symbol-name slot) (load-time-value (find-package "KEYWORD")))
:collect `(slot-value self ',slot))))
(defmethod restore-from-memento ((self ,<base-class>) (memento ,<memento-class>))
(format *trace-output* "~&~S restoring from memento.~%" self)
,@(loop
:for slot :in <base-class-slots>
:collect `(setf (slot-value self ',slot) (slot-value memento ',slot)))
self)
',<memento-class>))

And then, everytime we have a new class:

(defclass originator ()
((state :initarg :state :accessor state)))
(defmethod (setf state) :before (new-value (self originator))
(format *trace-output* "~&~S setting new state ~S~%" self new-value))

we only have to write the oneliner call:

(define-memento memento originator (state))

to add a memento to the new class:

(defun example ()
(let ((caretaker '())
(originator (make-instance 'originator :state "Initial")))
(setf (state originator) "State1")
(setf (state originator) "State2")
(push (save-to-memento originator) caretaker)
(setf (state originator) "State3")
(push (save-to-memento originator) caretaker)
(setf (state originator) "State4")
(restore-from-memento originator (elt caretaker 0))
(format *trace-output* "~&~S state is ~S~%" originator (state originator))
(values)))

C/USER[82]> (example)
#<ORIGINATOR #x208D4AB6> setting new state "State1"
#<ORIGINATOR #x208D4AB6> setting new state "State2"
#<ORIGINATOR #x208D4AB6> saving to memento.
#<ORIGINATOR #x208D4AB6> setting new state "State3"
#<ORIGINATOR #x208D4AB6> saving to memento.
#<ORIGINATOR #x208D4AB6> setting new state "State4"
#<ORIGINATOR #x208D4AB6> restoring from memento.
#<ORIGINATOR #x208D4AB6> state is "State3"

(defclass person ()
((name :initarg :name :accessor name)
(address :initarg :address :accessor address)
(birthdate :initarg :birthdate :reader birthdate)))

(define-memento person-state person (name address birthdate)) ; ONE LINE! YAY!

C/USER[87]> (let* ((p (make-instance 'person
:name "Tintin"
:address "Chateau de Moulinsart"
:birthdate '1929/01/10))
(m (save-to-memento p)))
(print (list (name p) (address p) (birthdate p)))
(setf (name p) "Tintinovitch"
(address p) "Kremlin")
(print (list (name p) (address p) (birthdate p)))
(restore-from-memento p m)
(print (list (name p) (address p) (birthdate p)))
(values))
#<PERSON #x207D0636> saving to memento.

("Tintin" "Chateau de Moulinsart" |1929/01/10|)
("Tintinovitch" "Kremlin" |1929/01/10|)
#<PERSON #x207D0636> restoring from memento.

("Tintin" "Chateau de Moulinsart" |1929/01/10|)

This is possible because lisp, in addition to supporting data
abstraction and procedural abstraction, which any other programming
language followed suit, also supports syntactic abstraction and
meta-linguistic abstraction.

--
__Pascal Bourguignon__ http://www.informatimago.com/

PLEASE NOTE: Some quantum physics theories suggest that when the
consumer is not directly observing this product, it may cease to
exist or will exist only in a vague and undetermined state.

thomas...@gmx.at

unread,

Jul 28, 2008, 11:55:58 AM7/28/08

On 26 Jul., 10:43, jaycx2.3.calrob...@spamgourmet.com.remove (Robert

Maas, http://tinyurl.com/uh3t) wrote:
> > >> Haskell, SML, OCaml, Mathematica, F# and Scala all allow real
> > >> problems to be solved much more concisely than with Lisp. Indeed, I
> > >> think it is difficult to imagine even a single example where Lisp
> > >> is competitively concise.
> > > What does "solved much more concisely" mean??
> > From: Jon Harrop <j...@ffconsultancy.com>

...

> > Why do you think statically typed languages completely dominate
> > general purpose programming?
>
> Because a lot of people don't know any better and are stuck
> with installed code base (legacy code) which must be maintained.

I get the impression that you think that only concepts supported
by your favorite language (LISP) are good concepts. If some
concept is not present in LISP you conclude that it is bad and
everybody using it does not know any better.

I know that you hate static type checking, but I think that your
should look at concepts from outside your (LISP) world.
I tried to include answers to your previous arguments aggainst
static typ checking in my FAQ at:
http://seed7.sourceforge.net/faq.htm#static_type_checking
Basing on this arguments, it would be nice to discuss
static type checking.

BTW.: I am still waiting to get an answer for my other mail in
this thread.

Jon Harrop

unread,

Aug 4, 2008, 10:03:58 AM8/4/08

Robert Maas, http://tinyurl.com/uh3t wrote:

> From: Jon Harrop <j...@ffconsultancy.com>
>> The solutions are shorter in modern FPLs.
>
> OK, let's say you have a text string which contains the notation
> for a nested list starting from the beginning of the string.

"The" notation? You mean Lisp's notation?

That's fine but it is of no practical interest because real data is rarely
already in Lisp syntax. For a more useful comparison you must consider
other formats.

> How
> many lines of code, in various modern FPLs, would it take to parse
> that nested-list notation to produce an actual nested list
> structure, and also report back in another value the position in
> the string where the parse left off at the end of the nested-list
> notation? In Common Lisp it's just one line of code:
> (multiple-value-setq (ls ei) (read-from-string ts))
> where ts has the given text string,
> ls gets the resultant list structure,
> and ei gets the resultant end index.

In OCaml with OCaml's notation:

Gram.parse expr Loc.ghost (Stream.of_string ts)

> Now let's say you want to pick up where you left off, because after
> that first nested list notation there's another within the same
> text string you were given. How many lines of code in your favorite
> FPLs? In Common Lisp, it's another one line of code:
> (multiple-value-setq (ls2 ei2) (read-from-string ts :start ei))

That is just reading twice from the same stream.

> My point is that different languages are based on different things
> being important enough to put into the core language, but with
> add-on libraries it's possible to make a lot of things be
> essentially the same length in a whole bunch of languages. If you
> don't mind Unix-style ultra-short names for commonly use utilities,
> and you do a lot of this kind of stuff, if you really want to make
> your code as short as possibly, you can use macros to abbreviate
> the special operator and function used above, so the two lines of
> code will easily fit into just one run-on line of code:
> (mvs (ls ei) (rfs ts)) (mvs (ls2 ei2) (rfs ts :start ei))

How about something of practical relevance: custom grammars.

Consider parsing Mathematica expressions including lists, rules, sums,
products, powers and factorials with associativity and precedence. In
OCaml:

open Camlp4.PreCast;;

let mma = Gram.Entry.mk "mma";;

EXTEND Gram
mma:
[ "rule" LEFTA
[ e1 = mma; "->"; e2 = mma -> `Rule(e1, e2) ]
| "add" LEFTA
[ e1 = mma; "+"; e2 = mma -> `Add(e1, e2) ]
| "mult" LEFTA
[ e1 = mma; e2 = mma -> `Mul(e1, e2) ]
| "pow" RIGHTA
[ e1 = mma; "^"; e2 = mma -> `Pow(e1, e2) ]
| "fact" LEFTA
[ e = mma; "!" -> `Factorial e ]
| "simple" NONA
[ n = INT -> `Int(int_of_string n)
| s = UIDENT -> `Sym s
| "{"; fs = LIST0 mma SEP ","; "}" -> `List fs
| "("; e = mma; ")" -> e ] ];
END;;

For example:

# Gram.parse mma Loc.ghost (Stream.of_string "{A->B,123,C D^E+F}");;
- : _[> `Add of 'a * 'a
| `Factorial of 'a
| `Int of int
| `List of 'a list
| `Mul of 'a * 'a
| `Num of int
| `Pow of 'a * 'a
| `Rule of 'a * 'a
| `Sym of string ]
as 'a
=
`List
[`Rule (`Sym "A", `Sym "B"); `Int 123;
`Add (`Mul (`Sym "C", `Pow (`Sym "D", `Sym "E")), `Sym "F")]

> So I propose when comparing languages that we allow liberal use of
> macros (in languages such as Lisp which support them) to tailor the
> function names to the same length in all languages, so that when we
> look at the code we are comparing the size of the parse tree and
> *not* the relative lengths of names of functions. And furthermore
> we allow add-on libraries to compress the size of the parse tree in
> the application itself down to just the part that's different from
> one application to another. So the only real difference is whether
> the syntax of the language (with all available help from macros and
> libraries) supports the most natural expression of the application
> of the tool(s).

Then you will be discounting the enormous amount of Lisp code that is
devoted to Greenspunning modern language features like pattern matching.

>> Macros are very rare in modern languages.
>
> Why?

Two reasons:

. Macros are used primarily to work around deficiencies in the language's
design, e.g. the absence of pattern matching in Lisp.

. Using macros forks the language and makes maintenance much harder. New
programmers must effectively learn a new language. Tools no longer work
transparently with source code. Typeful programming is made harder (except
perhaps in Haskell but I am unfamiliar with its typed macro system).

> Because their parse tree isn't well defined and documented, so
> it would be impossible for the author of a macro (in the Lisp
> sense) to know what to do?

The parse tree is no longer relevant in modern languages.

> Or because the syntax of the source code
> is so different from the parse tree that it would be just too
> difficult to mentally keep switching back and forth between
> thinking about the parse tree (when designing the function of the
> macro, what the macro is supposed to do) and writing code
> (including the code of the macro-expander itself)?

In all modern languages, the parse tree (the internal representation used by
the compiler and macro system) is irrelevant due to the use of quotations.
Consequently, parse trees do not need to be handled directly by OCaml
macros.

Look at the following equivalent of UNWIND-PROTECT:

# EXTEND Gram
expr: LEVEL "top"
[ [ "try"; f=sequence; "finally"; g=expr ->
<:expr<
(function
| `Val v, g -> g(); v
| `Exn e, g -> (try g() with _ -> ()); raise e)
((try `Val $f$ with e -> `Exn e), (fun () -> $g$))
>> ] ];
END;;

The input source code is represented by a sequence of tokens and the
associativity and precedence of the new grammar rule, so the programmer is
no longer forced to write without syntax as you do in Lisp.

The action is represented by a quotation <:expr< ... >> that contains
ordinary OCaml code so, again, the programmer is no longer forced to write
without syntax as you do in Lisp.

Pattern matching is also instrumental here: you can quote code in patterns
and in expressions so you can rewrite OCaml code without having to deal
with the low-level parse tree at all.

This makes term rewriting much easier in OCaml than in Lisp and is the
reason why my OCaml solution to the Minim challenge (implementing a
compiler for a Lisp-like language) was far shorter than all of the Lisp
solutions.

> Or because
> authors of most modern languages are unaware how valuable macros
> can be in improving the "out of the box" syntax to be better fit a
> problem domain or to capture the essense of common design patterns
> into a single syntax?

Ironically, exactly the opposite is true. Programming language designers
have long since superceded Lisp's capabilities and it is only the few
remaining Lisp programmers who have failed to realise this. We can now
improve syntax far beyond the capabilities provided by Lisp.

You can start reinventing these modern techniques in Lisp by writing
libraries, of course, but then you are doing nothing more than
Greenspunning modern macro systems like Camlp4.

No. That is just another example of Lispers using macros for the sake of it.
You can implement UNWIND-PROTECT using only a higher-order function. There
is no need to use a macro here.

> Now there are probably a *lot* of design patterns used in Common
> Lisp and other languages which are not yet covered by pre-defined
> macros or other mechanisms, because at the time Common Lisp or the
> other laguages were standardized those patterns weren't recognized
> as common enough in use with well-established macro that it seemed
> reasonable to incorporate them into the standard language. After
> time passes, in some problem domains it becomes common knowledge
> that certain design patterns are common enough that it *really*
> would be nice if the language supported them as a compact
> expression *instead* of requiring the program to repeat the design
> pattern over and over and over.

I'd like to see an example of a design pattern that cannot already be
factored out in modern FPLs without the use of macros.

> Here is where languages such as
> Lisp which have parse-tree macros, and other languages that don't
> have them, differ: In Lisp, the appliation programmer has the power
> to define a macro and thereby convert a design pattern into a
> simple call. In languages without macros, the programmer does *not*
> have that option, and must forever write out the entire design
> pattern over and over and over until and unless the people defining
> the standard and implement it decide to make that simple call part
> of the language.

There is overwhelming evidence to the contrary: rich syntax is ubiquitous in
all modern languages (even ones where macros play a key role, like
Mathematica) but Lispers limp on without out it precisely because Lisp is
not capable of providing this hugely beneficial functionality is a usable
way.

That is also why Lisp is so verbose in practice: the burden of parsing is
placed upon the programmer who must write out programs in what should be an
internal representation.

>> OCaml has a full macro system but it is rarely used.
>
> Is it a parse-tree macro system, like Lisp has, or is it a
> string-syntax macro system, like C has?

OCaml's macros act upon parse trees internally but they are not imposed upon
the programmer as they are in Lisp.

> If the former, then why isn't it used?

In Lisp, macros are used primarily to Greenspun pattern matching. OCaml
already provides pattern matching so that primary need is gone and macros
see a lot less use in OCaml as a consequence.

> Perhaps as I guessed above, because the parse tree
> is so different from the source syntax that it's confusing to
> switch back and forth between the two?

Quotations and pattern matching make that a non-issue: you do not deal with
the parse tree in modern macro systems.

> If it's a string-syntax
> macro system like C, then it's total crap and I can understand why
> it's rarely used.

In the time you wasted writing this sprawling strawman argument you could
instead have educated yourself about Camlp4.

>> F# has no macro system.
>
> Then F# is crap compared to Lisp.

That is an uninformed opinion: you are not even aware of the effects of
laziness in this context and are oblivious to typeful programming despite
its ubiquity. Moreover, you have yet to come up with a single example of a
useful macro.

>> Why do you think statically typed languages completely dominate
>> general purpose programming?
>
> Because a lot of people don't know any better and are stuck
> with installed code base (legacy code) which must be maintained.

I think a lot of people do know better.

>> > (The exception is in tight loops
>> > where some machine type happens to work efficiently and where that
>> > particular loop eats up most of the CPU time, which is precisely
>> > where Lisp allows type declarations and consequent CPU-efficient
>> > code as an *option* where it would actually be worth the cost.)
>>
>> Lisp is unable to regain the performance cost of its dynamic design:
>
> As long ago as around the late 1970's, MacLisp achieved a
> reputation that tight code loops with proper declarations ran
> faster than the equivalent loop in Fortran, because the code within
> each function was identical but Lisp had more efficient
> function-call linkage. Do you dispute that, or do you claim that
> Common Lisp has taken a step backward and no longer is as fast in
> tight loops with proper declarations as it should be?

You are not seriously trying to substantiate "Lisp is not cripplingly slow"
by citing performance data that are over 30 years out of date?

Finalizers are good when memory pressure is relevant, i.e. when you are
handling resources that consume system memory. A clock timeout is unaware
of memory pressure so it will not be as good in such situations.

For example, finalizers are used to collect weakly referenced parts of a
cache. This allows the cache to be shrunk by the GC when memory pressure is
high.

>> Another reason is that garbage collectors often free sooner than that.
>
> I don't really believe that. Either the pointer to the handle to
> the resource is lexically scoped, in which case GC won't reclaim it
> until that pointer goes out of scope,

That is wrong. GCs can and do collect before pointers go out of scope.
Indeed, GCs are completely unaware of the notion of scope.

I discussed this on comp.lang.c++ recently with a proponent of reference
counting. He did not believe that real GCs would collect data while it was
still theoretically in scope (he was trying to claim that reference
counting always collect at least as soon as a GC but that is completely
fallacious) so I provided several complete counter examples each of which
proved that real VMs already do this.

>> > But if you can think of a really good use for random closure, I may
>> > reconsider my opinion.
>>
>> I used finalizers in OCaml to collect unused OpenGL display lists and
>> textures. The results were excellent: the code was greatly
>> simplified at no cost.
>
> Sorry, I don't know what those are, and they don't seem to be a
> central issue that is worth my spending a half hour doing a Google
> search to find a tutorial to teach me what they are. Would you
> please define their important characteristics in regard to this
> discussion, i.e. where they are physically located,

In system memory and automatically cached on the graphics card by the OpenGL
driver.

> what reason
> they have for being kept around (until all references are gone),

They represent a piece of visualization code that you will want to redraw
every frame of animation while they might be visible.

> why they can't be easily re-created if idle-timeout causes their
> deletion but re-activity forces their re-existance,

Creating display lists is extremely slow: they are compiled to improve
performance.

> why they can't
> be kept around nearly *forever* (until they have been idle a week
> or two and it really would be surely a waste to keep them around
> longer), etc.?

That would be a memory leak, potentially on the graphics card which has
limited memory.

> On Unix/Linux there's a concept of various processes
> in a tree, with each process dependent on the next parent process.
> When a parent dies, all children automatically die. It would make
> sense for a display list or texture to be a property of a
> sub-process, which stays around so long as the main process is
> around, then when the main process dies the sub-processes are
> auto-killed and the display list or texture then disappears. Why
> doesn't that model fit what's actually happening with display lists
> and textures?

Several reasons:

. That would impose manual memory management upon the user. Using finalizers
completely automates memory management for the user.

. Starting and reaping processes is very slow but display lists must be very
fast.

. OpenGL is single process only so that cannot work anyway.

> The only reason I can think of is when the main
> process is some persistent daemon, such as the Apache server. But I
> don't think the Apache server or any of the other persistent
> daemons are written in OCaml,

Lots of persistent daemons are written in OCaml, including web servers.
XenEnterprise is underpinned by several daemons that are written in OCaml
including a high performance custom database.

> so there must be some other reason
> why that sub-process model wouldn't solve the problem
> (hypothetically if OCaml didn't have finalizers, or if you decided
> to translate your application to some other language that doesn't
> have finalizers). So if you can succinctly enlighten me, please do.

Creating and deleting processes is manual memory management. OCaml and its
finalizers allow you to obtain the same results with less code and lower
development costs whilst also eliminating leaks.

>> The vast majority of computers do not have virtual memory because
>> they are embedded devices.
>
> Yeah, that's a different environment from anything I've ever
> programmed for. I would suspect that in an embedded device you
> wouldn't want GC in the first place, you'd do better with mostly
> static allocation with perhaps some non-cyclic dynamic structures
> where reference count would work just fine. But it's late and maybe
> I've lost track of the context.

Embedded devices used to be PICs but they now include phones and larger
devices that can have substantial resources.

>> >> I think you should aspire to earn money directly from customers
>> >> rather than asking people to give you money to develop your
>> >> software.
>> >
>> > I like that idea very much. Will you find me a potential customer
>> > who wants some serverside application that doesn't already exist?
>>
>> You need to write a compelling sales page and make it indexable
>> by search engines and make it as easy as possible for users to give
>> you their money before plastering all of the news websites that you
>> can find with descriptions of your new service and links to it.
>
> That's a strawman comparison.

No, it is my personal advice based upon years of succeeding in this
industry: growing a company from seed with no investment and increasing
profits exponentially year upon year when almost all others go bankrupt.

My reason for the above advice is that the "pre-customers" you are alluding
to are either unidentifiable or worthless. The vast majority of people who
offer you advice on the basis that they will buy your product end up not
buying anything and the people who do buy your product will be people whom
you have been completely unaware of and who have completely different
requirements and values.

Let me give you some examples:

My book OCaml for Scientists was written by me as a physical scientist and
aimed at other physical scientists. I made the same mistake that you are of
seeking out pre-customers and taking their advice. Atmospheric scientists
told me to strip out the maths and discuss interoperability with obscure
tools but virtually none of them actually paid for the book.
Bioinformaticians are probably our largest market of real paying customers.
I never got any advice from bioinformaticians whilst writing the book and
the fact that much of the content is well-suited to their work is a
complete fluke.

I was less fortunate with our technical presentation software Presenta where
I again made the mistake of listening to "pre-customers" who told me that
compiled OCaml binaries using OpenGL on Linux would work and that Mac OS X
was an important platform. I followed their advice and discovered that such
binaries are unusably unreliable on Linux (whereas the equivalent works
flawlessly on Windows because it has modern hardware-accelerated GUIs that
require this technology to work or Windows itself will not even work) and
Mac OS X has virtually no technical users with money and pitiful
development tools.

More recently, I have been completely ignoring advice from "pre-customers"
and have focussed on getting products to market faster based upon my
intuition of what people need. This is vastly more lucrative.

> In fact before doing *either*, I need
> to find some customers who desire some new server-side application
> that is within my capability, and who are willing to work with me
> to define use cases that I might offer to implement, and then once
> that's all done *then* we start talking about a fixed-price
> contract to implement those use cases, and alpha testing, etc. Then
> after I have one set of satisied customers, only then would it be
> reasonble to advertise to see if there might be lots of other
> customers for an existing customer-pleasing application.

That is the second highest risk approach to building a company in this
industry, the highest risk being operating at a loss from VC funding as a
gamble that your entire company will be bought out soon enough that your
shares haven't been diluted to the point that they are worthless. Unless
you have a lot of capital sloshing around or a steady income, you will not
survive the inevitable contract droughts and null payments and will go
bankrupt very quickly.

I found that selling commodities like OCaml for Scientists to lots of
ordinary programmers is not only more stable than gambling but can also be
just as lucrative. For example, we recently lost a £25k contract with
Microsoft because they went into duck and cover mode for the US recession
but it doesn't matter because we will make more money in the long term by
spending that time diversifying to other platforms and developing more
commodity products that we will sell to hundreds of individuals over the
next three years.

> So can you find me a first brainstorm-use-cases customer to get
> this whole process started? I've mentionned lots of new software
> technology I've implemented to the pointer where I feel they might
> possibly be useful for a variety of applications that might be of
> value to others. But until I find a potential customer who is
> actually interested in what I have done and how I can work it into
> a salable product, I don't know if any of the wondrous things I've
> programmed really could lead to a salable product.

The only way to find out is to put it up for sale.

> By the way, Tuesday of last week I gave a partial demo of some of
> my computer-assisted instruction software, and the person seeing
> the demo seemed to think it would be of interest to two of his
> technical acquaintances, one very far away (about 1000 miles) and
> one closer (about 40 miles), so he'll be contacting them to try to
> get them interested.

Excellent. Best of luck.

Robert Maas, http://tinyurl.com/uh3t

unread,

Aug 12, 2008, 12:36:37 PM8/12/08

> > - Does Seed7 include a parser that reads Seed7 source-code syntax
> > (from an input stream such as from a text file, or from the
> > contents of a string) and produces a parse tree (as a pointy
> > structure)?

> From: thomas.mer...@gmx.at

> Actually there are the functions 'parseFile' and 'parseStri' which
> can be used to parse Seed7 source programs into a values of the
> type 'program'. I just added a short description about the type
> 'program' to the manual. See:
> http://seed7.sourceforge.net/manual/types.htm#program

I don't see any operations that allow looking at parts of the
program. For example, in Lisp, if you read all the s-expressions
within a source file to form a list, then LENGTH tells the number
of such s-expressions in that list, NTH (or ELT with args reversed)
pulls out a particular element from the list, then you can
decompose the structure of a single element by CONSP to see if it
is decomposible and if so then LENGTH or length of element and NTH
(or ELT) for piece within element. What good is an internal parse
tree for your program if you can't look into it to see what's
there?

> It is possible ... request the code of a 'program' value in a
> structured form.

Oh, why didn't you show me *that* in the first place? So a
'program' object isn't actually a parse tree, it's something else,
and you have to do something additional to get it converted to
parse tree (structured form as you call it)? So will you show where
*that* feature is documented?

> It is not my intend [sic] to support programs which manipulaten [sic]

> their own code as it is done with "self modifying code".

Do you take the extreme position that once you "start" a program,
it's too late to add any more code to it? You must load everything
(all code) you will ever need once at load time, and *then* start
executing?

Or do you not really mean to take that extreme position, that after
a program has been started, it can generate additional code as
separate 'program' parts, and then when an object is all finished
it can be linked into the already-running program and thus made
available for use?

> Loading a library at runtime as a way to introduce new statements
> for the program which is currently running IMHO makes no sense.

You have a thing about "statements". I agree that adding new infix
operators or anything else that affects the parsing of program
source doesn't make a lot of sense most of the time. But what's
wrong with simply loading new FUNCTIONS that don't change the
syntax of anything??? What if at start-up time you don't know which
libraries you'll need, you have to first see what data you're
working with, and you have a milllion different modules you *might*
load, and it would take forever to load all of them every time you
re-start your program when you'll only use maybe 50 of them in any
given run of the program.

> > - If so, is there also an inverse function that prettyprints from a
> > parse tree back out to textual source-code syntax?
> During the parsing some information, such as whitespace and comments
> are lost.

That's not a problem. If printing out a program then parsing it back
in results in equivalent program structure, that's good enough.

> > > Note that the 'and' and 'or' operators do not work correct when
> > > side effects appear in the right operand.
> > What is that supposed to mean??? If you use an OR expression to
> > execute the right side only if the left side is false, what
> > happens? I would expect what I said to happen, but the spec says
> > that doesn't work correctly? So what really happens?
> This is a bug in the spec. I have corrected the sentence to:
> Note that this early termination behaviour of the 'and' and 'or'
> operators also has an influence when the right operand has side
> effects.

That is too vague to be useful. Do side-effects of right side
happen if and only if the left side fails to terminate-early
causing the right side to be needed to decide the result? Or
something more inscrutable happens? For example:
foo() and bar()
If 'foo' returns false, then 'and' immediately returns false, and
'bar' is never executed, so side-effects that 'bar' might have
done in fact don't happen.
If 'foo' returns true, then 'bar' is called, and any side-effects
that 'bar' would do really do get done, and the value returned by
'bar' is returned as the value of 'and'?
Is that what you meant to say??

> > (or (integerp x) (error "X isn't an integer")) ;Lisp equivalent
> > > The result an 'integer' operation is undefined when it overflows.
> > That's horrible!
> AFAIK many languages such as C, C++ and Java have this behaviour.
> I would like to raise exceptions in such a case, but as long
> as there is no portable support for that in C, Posix or some
> other common standard, it would be hard to support it with
> satisfactory performance.

So you're completely begging the question as to portability? If C
doesn't make something portable, then you don't make it portable
yourself? So you have just piggy-backed on top of C for porting
instead of porting yourself? So you haven't really written a
programming language. You've just written an add-on to C.

When you really write a programming language, *you* write the code
generators for various CPUs you intend to support. *You* control
how low-level machine functions such as interrupts will be handled,
by directly invoking operating-system and/or CPU facilities for
them, instead of calling something written in C to do all the work
for you.

Oh well, it's nice that you're honest and admitted what you do. So
I guess it's easy enough for an application programmer to restrict
all arithmetic operations to what won't ever overflow, assuming you
have a portable MIN_NEG_INT and MAX_POS_INT documented so that the
application programmer knows what the rules are for avoiding ever
getting overflow.

> > While 0 ** 0 which is mathematically undefined is *required* to return 1?
> This behaviour is borrowed from FORTRAN, Ada and some other
> programming languages which support exponentiation.

Why do you borrow things that are flat-out wrong mathematically??
Can't you show restraint when borrowing features from ancient languages?
If you had a child, and your child was caught using drugs, and your
child's excuse was "I borrowed the drugs from some other kids",
would that be an acceptable excuse?

> ... The values referred by pointers and the values refered by

> interface types are not managed automatically.

That seems to be a step backward from sometime around 1964 when
automatic garbage collection was invented. Has anybody written any
sort of add-on library that manages dynamically allocated memory
accessible by pointers so that each unit of allocation is reclaimed
after it is last used? Even a reference-count system would be
useful for many practical cases.

> I agree that some data cannot be managed in a stack oriented way.

OK, so how is the application programmer supposed to deal with his
own non-stack-oriented data which is strung together with pointers?
Memory is cheap nowadays, so one solution is to just let allocation
units accumulate within a huge amount of available memory, but keep
track of how much is in use as the program runs, and when memory is
getting close to full then find a graceful way to terminate the
program so that later you can pick up where you left off. Is that
your recommendation for users of your language?

Now before we go on further: Is Seed7 capable of being used in a
CGI environment? I.e. can it examine the environment variables that
tell whether the GET or POST method is used, and read either the
environment variable or standard input to get the URL-encoded form
contents? (I assume once it has the URL-encoded form contents, it's
relatively trivial to decode the key=val&key=val&key=val to produce
some sort of associative array, such as a hash table.) Can it
produce proper HTTP header followed by HTML syntax, written to
standard output?

If Seed7 *is* so capable, would you be so kind as to install a
demonstration of its ability on your Web site, similar to the demos
I already have for seven other languages? The demo would have a
form where you fill in some toy information, and a submit button to
call your CGI demo using the contents of that form, whereupon your
demo would do something cute with the data to demonstrate that you
demo program read the form contents and decoded them correctly. In
addition there'd be a way to view the source of your demo and also
the directory listing of all involved files to show protection bits
needed to make the demo accessible to the Web. See here for what I
have already: <http://www.rawbw.com/~rem/HelloPlus/hellos.html>
So-far Flaming Thunder is the only newly-invented programming
language for which a CGI demo exists. Would you like Seed7 to be
the second such language honored?

If you do set up such a demo, next after that would be to
illustrate (again via CGI) some "interesting" kinds of data
processing relevant to a CGI environment. See here:
<http://www.rawbw.com/~rem/HelloPlus/CookBook/h4s.html>
So far I've just barely started that project, with just two tasks
(validate string decimal number in form field and convert to usable
numeric internal form, and set/read cookies). I presume it would be
easy to make the validate+parse decimal demo. What about cookies,
is that easy in Seed7? I did it in PHP because it's so incredibly
easy there. I haven't replicated that in Lisp yet due to being too
busy with other projects, so if you want you can wait until I find
time to do it in Lisp and then just translate my Lisp solution to
Seed7.

One thing I might do *instead* of directly generating HTTP cookie
stuff is to develop a system for DOM (Document Object Model) of
HTTP+HTML output, whereby you build your DOM inside your program,
without outputting anything yet, and then just before your program
exits it PRINTs out the DOM, i.e. converts the data structure to
HTTP+HTML textual representation to standard output. All the old
demos print each piece of HTTP header or HTML syntax one at a time
to standard output. I'm thinking of aborting that way of doing
things, switching to DOM for all future CGI demos. See discussion here:
<http://groups.google.com/group/comp.lang.lisp/browse_thread/thread/d1ce166bed96b873/07908fde8adf2dc3?hl=en&lnk=st&q=#07908fde8adf2dc3>

One nice thing about CGI applications is that you always re-start
the application for each new HTTP transaction, so you really don't
need a garbage collector. A reasonable amount of memory leak isn't
anything to worry about. So long as you don't allocate all the
available memory in your entire virtual address space within a
single HTTP transaction, you're OK.

So Seed7 might find a niche in CGI where nobody can complain about
the fact that you don't have any automatic way to reclaim memory
that was allocated (not on the stack) and then discarded (pointer
re-assigned).

santosh

unread,

Aug 12, 2008, 2:57:08 PM8/12/08

Robert Maas, http://tinyurl.com/uh3t wrote:

<snip>

> That seems to be a step backward from sometime around 1964 when
> automatic garbage collection was invented. Has anybody written any
> sort of add-on library that manages dynamically allocated memory
> accessible by pointers so that each unit of allocation is reclaimed
> after it is last used? Even a reference-count system would be
> useful for many practical cases.

Yes. See the Boehm Conservative Garbage Collector for C.

<snip>

Robert Maas, http://tinyurl.com/uh3t

unread,

Aug 15, 2008, 3:06:59 AM8/15/08

Richard Heathfield <r...@see.sig.invalid> said:
RH> In any (even relatively sensible) language where it takes
RH> /more/ than one line of code, it only takes more than one line
RH> the /first/ time - because the programmer will wrap it into a
RH> function (or procedure, or subroutine, or whatever) and, in
RH> future, call the function (or whatever). So this conciseness
RH> argument is a mere canard, as all program tasks eventually boil
RH> down to: call existing_solution
RH> which is a one-liner in pretty well any language.

Rebuttal: What if you need to do lots of new different things once
each? Then it really helps if the language you start with allows
you to do each new-and-different thing concisely instead of
requiring thirty lines of code for each new task you want.

Now if you have lots of *similar* but different things you want to
do once each, and if you know in advance all the things you will be
wanting to do, and if you have the spare time at the very start of
the project to build tools before your boss needs you to show
results already, then you may be able to abstract out what's
common, write that code once, then write much less code per
individual new-and-different task. But that's a lot of "if"s when
your boss is looking down your shoulder, or when you're in a R&D
environment where you need early results just to see where to go
next and to assess whether the whole project is even worth your
effort.

However I agree that in some cases you will indeed need to write a
big complicated algorithm just once because it's "obvious" how to
factor out the common part, and in fact the common part is most of
the work. The key is not whether the common part is terse or
verbose, but whether it is so natural to write that you can whip it
off in a few minutes or an hour so that it doesn't slow down your
whole project.

Now there are actually two ways to solve the main/common part of a
set of similar tasks once:
-1- By writing a module that you call with various parameters to
specify the parts that vary from task to task;
-2- By solving just one task completely, then using copy&paste with
micro-edit to create variants of the code for each new but similar
task.
Which is the better approach is a matter of debate as well as a
matter of different approaches for different kinds of task. The
advantage of the copy+paste method is that you don't have to spend
time up front planning how to modularize the solution, you just
solve each problem, copying old code whenever you recognize that a
new task can use a similar solution to what you used before. In a
sense, method 2 is using a "design pattern" whereas method 1 is
"modular programming". Sometimes the beat technique is to start
using method 2 until you start to see a pattern that can be
abstracted out to a software module, then refactor your current
task-solvers into a single parameterized method and use that in the
future. But that requires a time in the middle of a project where
you have some spare time and can afford to stop work on new results
just to invest time refactoring. Maybe the best approach overall is
to use method 2 whenever you are under pressure to meet an
immediate deadline, then refactor to method 1 whenever you have
some "time to breathe" before the next time crunch. Note that if
you have only a little bit of "time to breathe", you may have just
enough time to write the common tool and apply it to the last one
or two tasks you previously had by copy+paste, but not enough time
to find all the other copy+paste instances and refactor them too.
But maybe you have little bits of "time to breathe" every so often,
so after you've once had time to write the parameterized tool, each
later time you have some spare time you refactor a few more of the
copy+paste instances to use the parameterizable tool instead.

In summary: I think it's still useful to compare how the unadorned
language allows concise expression of new algorithms, but it's also
useful to compare how the modularization tools (functions, macros,
etc.) allow you to write the semi-concise algorithm just once and
then afterward just write:
call existing_solution(someSimpleAndObviousParameters)
And for each such task converted from raw code to defineMacroThenUse,
there are more than one factor to compare:
- Body of solution expressed in unadorned language: how terse?
- Body of solution expressed in unadorned language: how natural?
- Code to define the macro that condenses language: how natural?
- Code needed to simply call the macro: how terse?
(It's a "given" that what you need to say when calling the macro is natural,
otherwise you wouldn't have written the macro in the first place.)

Willem

unread,

Aug 15, 2008, 3:28:23 AM8/15/08

Robert Maas, http://tinyurl.com/uh3t wrote:

) Now there are actually two ways to solve the main/common part of a
) set of similar tasks once:

Three, sir. ;-)

) -1- By writing a module that you call with various parameters to
) specify the parts that vary from task to task;
) -2- By solving just one task completely, then using copy&paste with
) micro-edit to create variants of the code for each new but similar
) task.

-3- By solving just one task completely, then adding parametrized code
to the existing solution to create the variants as you go along.

(And to support your point:
some languages are more easily parametrizable than others)

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Robert Maas, http://tinyurl.com/uh3t

unread,

Aug 15, 2008, 4:36:04 AM8/15/08

RH> In any (even relatively sensible) language where it takes
RH> /more/ than one line of code, it only takes more than one line
RH> the /first/ time - because the programmer will wrap it into a
RH> function (or procedure, or subroutine, or whatever) and, in
RH> future, call the function (or whatever). So this conciseness
RH> argument is a mere canard, as all program tasks eventually boil
RH> down to:
RH> call existing_solution
RH> which is a one-liner in pretty well any language.

p...@informatimago.com (Pascal J. Bourguignon) responded:
PJB> Not in languages that don't provide syntactic abstraction.
PJB> Let's take a simple example, the Memento pattern:
PJB> http://en.wikipedia.org/wiki/Memento_pattern

OK, I read that, and it seems that it may take some considerable
work within each new class to support the ability to restore an old
state even after destructive modifications have been done. On the
other hand, with classes that use a never-modify state object, such
as self-balancing binary trees (all operations log(n) nodes rebuilt
with the rest shared between old and new), the original state never
was changed so it's trivial to restore the "official" state to be
in fact the original state, and it would be easy to abstract a
generic tool that worked in all cases without any new code per
case. Unfortunately all the examples in the WikiPedia page, and in
your article, were too complicated for me to feel like reading in
full, so I don't know whether it was easy or hard to restore the
old state.

PJB> Everytime you want to apply this pattern to a new class, you have
PJB> to add the following methods and class, ...
PJB> In lisp, we write the same, but only ONCE: ...

OK, without studying the details, I get the basic idea.

PJB> This is possible because lisp, in addition to supporting data
PJB> abstraction and procedural abstraction, which any other
PJB> programming language followed suit, also supports syntactic
PJB> abstraction and meta-linguistic abstraction.

This may be the best way to show how much better Lisp is than those
other languages. The anti-Lisp folks are saying "nobody really
needs to be able to construct new source code at run time", but
this kind of pattern may be a good example of how it's a royal pain
*not* to be able to do that. I'd like to see a simpler example of
this sort of need, where it's not huge that I simply don't have the
energy to study the details. After all, if *I* am turned away by
the immensity of the example, I'm sure the anti-Lisp folks will
turn away too, and the lesson won't be learned.

By the way, in my software module to analyze the whitespace
properties of tabular data (in my case, the output from the
FileList utility on Macintosh computers) to automatically discern
where the various columns of data begin and end, I actually did
take advantage of the ability to define a new function at runtime.
Specifically I wrote functions that given the list of start/end
column indexes generated the body of a function definition for
extracting all those substrings:

;Given starting and ending index of multi-column, and symbol to be used
; to refer to a string to be split into pieces:
;Generate the SUBSEQ form, either (SUBSEQ SYM IX1 IX2) or (SUBSEQ SYM IX1)
; depending on whether IX2 is a number or NIL.
;Return that form.
(defun ix1+ix2+strsym-make-subseq-form (ix1 ix2 strsym) ...)

;Given alternating list of start..end indices for multi-columns,
; and symbol to be used to represent the string:
;Build list of SUBSEQ forms per pairs from the list, allowing for
; odd number of elements where end index of last pair is NIL.
;Return list of such SUBSEQ forms.
(defun ixpairs+strsym-make-forms (ixpairs asym) ...)

Having applied that function to the actual list of column indexes,
I then wrap FUNCTION and LAMBDA and LIST around the list of SUBSEQ
forms, and call EVAL to build a function object. Then to parse the
entire list of lines that had been previously read in (which had
already been used to generate the column whitespace statistics used
to decide where the multi-columns of data were located), all I had
to do was call MAPCRR on that newly-built function object and the
list of lines.

Now there are probably several other ways I could have written the
program to avoid needing to define a new function object at run
time, but doing it this way just seemed so very natural. One reason
I chose this method is that the code to run down the list of
alternating start/stop column indixes and check for odd/even total
hence last pair short or not, not only is defined just once, but is
*executed* just once, at the time the list of SUBSEQ forms are
generated, rather than being executed for each and every record
being parsed as some other algorithms would require. Once the
function object has been built, there's no indexing down the inner
loop at all, just inline machine instructions to call SUBSEQ just
the right number of times with just the right parameters each time.
Now whether it's actually faster to have inline SUBSEQ calls rather
than an inner loop that makes a single SUBSEQ call with varying
runtime parameters, is anybody's guess until and unless a profiler
can take a look at both ways. But it's done this way, and works,
and I'm not gonna spend my time trying the other way.

Abstract summary of what I said there as difference between my
method and the usual nested-loop method:
- Nested loop method:
(loop for line in allLines collect
(loop for pix from ixpairs step cddr let (ixlo ixhi) = pix collect
(if ixhi (subseq line ixlo ixhi)
(subseq line ixlo))))
- Define-parser-function method:
(setq theFunctionIDefined ...CodeToBuildAllSUBSEQsIntoOneFunction..)
(loop for line in allLines collect
(funcall theFunctionIDefined line)) ;Inner loop is inline in the function

In a sense I've inverted the inner and outer loop. The inner loop
is a "constant" in a sense, with exactly the same ixpairs each
time, hence exactly the same sequence of SUBSEQ calls except
insofar as the outer loop supplies a different line each time the
loop runs. So I simply took all the parts of the logic which were
constant, compiled them into theFunctionIDefined, and then the
outer loop mapped down the list of the one thing that changed, the
various lines from the file.

Hmm, this is actually a powerful design pattern: Given an algorithm
that has some parts constant and some parts variable, condense all
the constant parts into a function definition, with all constant
info compiled inline, and then only the variable parts need to be
passed around when that function is called. You don't have to keep
passing the several constant parts again and again and again. You
have just one constant part that needs to be mentionned repeatedly,
namely the pointer to the function object that you've defined. Can
any language except Lisp implement this design pattern, where the
constant parts are known only at runtime, and where you aren't
allowed to write source-code to the disk and compile it to disk and
then load from disk (as Java can do), because that isn't
timeshare-thread-safe unless you parameterize the file names to
contain the process-ID, a royal pain, nevermind requiring a 'chron'
job just to audit the disk to garbage collect entire disk files
after the process that generated them is no longer running?

Richard Heathfield

unread,

Aug 15, 2008, 5:03:57 AM8/15/08

Robert Maas, http://tinyurl.com/uh3t said:

> Richard Heathfield <r...@see.sig.invalid> said:
> RH> In any (even relatively sensible) language where it takes
> RH> /more/ than one line of code, it only takes more than one line
> RH> the /first/ time - because the programmer will wrap it into a
> RH> function (or procedure, or subroutine, or whatever) and, in
> RH> future, call the function (or whatever). So this conciseness
> RH> argument is a mere canard, as all program tasks eventually boil
> RH> down to: call existing_solution
> RH> which is a one-liner in pretty well any language.
>
> Rebuttal: What if you need to do lots of new different things once
> each?

Rerebuttal: It's never happened yet. Most new different things turn out to
be mere rearrangements of old same things. The Preacher had us bang to
rights (Eccles 1:9).

<snip>

Pascal J. Bourguignon

unread,

Aug 15, 2008, 7:14:28 AM8/15/08

The memento pattern applies when you can save and load the state
mainly by just reading or writting the slots of the object. A small
addition is usually done: to call a method to prepare the reading, and
post-process the writting of the state. Of course, in Common Lisp,
there's nothing more to do to add these methods, you just add :before
and :after methods on save-to-memento and restore-from-memento.

In other programming language, you have to prepare the interface
methods willSaveToMemento and didRestoreFromMemento, and add a call to
these methods to the saveToMemento and restoreFromMemento.

> PJB> Everytime you want to apply this pattern to a new class, you have
> PJB> to add the following methods and class, ...
> PJB> In lisp, we write the same, but only ONCE: ...
>
> OK, without studying the details, I get the basic idea.
>
> PJB> This is possible because lisp, in addition to supporting data
> PJB> abstraction and procedural abstraction, which any other
> PJB> programming language followed suit, also supports syntactic
> PJB> abstraction and meta-linguistic abstraction.
>
> This may be the best way to show how much better Lisp is than those
> other languages. The anti-Lisp folks are saying "nobody really
> needs to be able to construct new source code at run time", but
> this kind of pattern may be a good example of how it's a royal pain
> *not* to be able to do that. I'd like to see a simpler example of
> this sort of need, where it's not huge that I simply don't have the
> energy to study the details. After all, if *I* am turned away by
> the immensity of the example, I'm sure the anti-Lisp folks will
> turn away too, and the lesson won't be learned.

And when we keep writting small examples, we're dismissed because they
say it works only on toy problems.

But I can reassure you, professionnal programmers are used to work on
programs between 100 KLoC and 10 MLoC, so they can understand the
point of a technique allowing them to reduce the complexity of the
code and therefore the number of bugs.

--
__Pascal Bourguignon__ http://www.informatimago.com/

I need a new toy.
Tail of black dog keeps good time.
Pounce! Good dog! Good dog!

Robert Maas, http://tinyurl.com/uh3t

unread,

Aug 15, 2008, 7:01:36 AM8/15/08

thomas.mer...@gmx.at wrote:
TM> I get the impression that you think that only concepts
TM> supported by your favorite language (LISP) are good concepts.
TM> If some concept is not present in LISP you conclude that it is
TM> bad and everybody using it does not know any better.

Your impression is slightly wrong. While I immediately appreciate
mechanisms present in Lisp, I'm willing to hear a convincing case
why some additional mechanism would also be useful. But as soon as
you convince me that the additional mechanism is useful, I'll ask
how much trouble would it take to implement as an add-on module
within Lisp, thereby making that "other" language unnecessary. If
there's a very natural way to express the additional mechanism
within Lisp, such that it also generates reasonably efficient
compiled code, then Lisp wins again. But if there were ever a case
where it would take an extreme contortion to accomplish the task in
Lisp, ...

I hope you return the favor. Whenever somebody shows some mechanism
in Lisp that isn't naturally present in *your* favorite language,
you figure out how Lisp's mechanism could be implemented in your
favorite language, and evaluate whether it's impossible or possible
but only by extreme contortions of your language or in fact easy to
express directly in your language.

For example, suppose your favorite language is Java. So I give an
example where in Lisp your code can analyze a file of columnar data
statistically to decide where the start/end of each column of data
lies, producing a list of start/end index pairs. Then your code can
build a list of forms to select those particular sub-strings from a
line of input from the file. Then your code can compile that list
of forms into a function that when applied to a line will produce a
list of sub-strings representing the fields of data.

Then I show how in Java, if you want to accomplish the same task,
you need to generate Java source code and write it to a disk file,
then call the Java compiler to compile that source file to a Class
file, then load the Class file back in. The Class is a sub-class or
satisfies an interface, so as soon as it's loaded you can apply the
generic methods of the parent class or interface (the Class
contains a constructor that converts a string into an instance of
that class, then another method returns the vector of fields within
the string). The problem is that you're writing a disk file which
must have a name that isn't going to step on similar disk files
written by other instances of exactly the same software that by
chance is running at the same time. So you need to include the
process number in the names of the files, and you need to run a
'chron' job to check from time to time to see if any crashed
process has left any of its files around that need to be deleted.
Big royal pain in Java. Is there any better way to do it in Java
that is as "clean" as the way it's done in Lisp by constructing a
LAMBDA expression and compiling it into an anonymous FUNCTION
object?

So is that fair, a design pattern that is nice and straightforward
(and clean, no need to run a 'chron' job to clean up the mess) in
one person's favorite language, and asking how nice and
straightforward and clean it is to do in somebody else's favorite
language?

TM> I know that you hate static type checking

Yeah, most of the time it terribly slows down line-at-a-time REPL
development of new code, and it doesn't seem to provide any
advantage that can't be easily obtained by runtime checking of
parameters going into each function. (Furthermore, runtime checking
can go far beyond static machine-type checking, to include full
intentional-type checking as well as various sub-type checking such
as whether a particular field has been added to a data record yet.
There's just no way to know at compile time that upon entry to this
particular routine somebody else has already filled in the FOO
field within this type of data record.)

TM> but I think that you[r] should look at concepts from outside your
TM> (LISP) world.

You show me a concept, nicely explained so that I understand it, or
maybe confusing at first but you be available to answer my followup
questions, and show why you believe it's of general use, not just
some special dohicky needed for a single problem domain, and I'll
consider it. One example that has been touted recently is Google's
map-reduce paradigm. I don't yet fully appreciate it, but I do
appreciate it enough to accept it as a valuable idea that I might
eventually find a use for in my own work. I won't be using it for
its primary purpose of expressing algorithms for server farms,
because I don't have any money to lease time on any server farm,
and the number of regular systems I can run software on is fewer
than the fingers on one hand. But I might someday implement it in a
non-parallel way, just as a way to cleanly express some algorithms.
My ProxHash algorithms are already working fine without it, and I
don't feel like refactoring them into map-reduce form, so at
present I can't think of any algorithm I would want to express in
map-reduce form. But you never know, maybe someday...

TM> I tried to include answers to your previous arguments ag[g]ainst
TM> static typ[e] checking in my FAQ at:
TM> http://seed7.sourceforge.net/faq.htm#static_type_checking

***START***
TM-STC> Type errors, such as an attempt to divide an integer by a
TM-STC> string, can be caught earlier

That's not earlier at all. With my line-at-a-time development
style, as soon as I write the line of code that tries to do the
illegal operation and submit it to REP, it'll signal an error. This
is long before the point where I build this line of code with other
lines of code into a function where it's possible to declare static
types which can then be checked during compilation of the function
definition.

TM-STC> Since static type checking makes run-time type checks unnecessary, ...

That is the big lie right there! Suppose you have a data record
that has ten optional fields. These ten fields can be filled in any
sequence, and for many applications several of them will remain
unfilled. Each function that uses these fields can have a runtime
check to make sure all the required fields are filled in, ignoring
the other fields not needed for that particular function. It would
be horrendous to set up a hierarchy of classes of objects that were
parameterized according to which fields were always defined per
that class of object, and convert an object from one class to
another (in Java using a constructor that took the old object and
the new field to yield a new object of the new class; in Common
Lisp using the CHANGE-CLASS function) each time one of the fields
got filled in for the first time. There's just no practical way for
compile-time static type-checking to completely replace runtime
checking. Note that it's *intentional* type checking that most
requires runtime checking. For example, you might use exactly the
same static compiletime type for a whole bunch of intentional
types, because setting up a separate static type for each separate
intentional type would be a royal pain.
Static type: Animal &optional gut legs wings fins lungs gills breasts
Intentional type: Mammal = animal+gut+legs(4)+lungs+breasts
Intentional type: Fish = animal+gut+fins+gills
Intentional type: Marsupial = animal+gut+legs(4)+lungs-breasts
Intentional type: Bird = animal+gut+legs(2)+wings(2)+lungs
Intentional type: Worm = animal+gut-legs-fins
Intentional type: Sponge = animal-gut
Intentional type: Crocodile = animal+gut+legs(4)+lungs+fins

Intentional types can be so subtle that no imaginable compile time
declarations would suffice to distinguish them and thereby allow
that compile-time type-checking would protect against all possible
inappropriate function-data applications.

TM-STC> Although static type checking is very helpful in finding
TM-STC> type errors, it cannot replace a careful program design.
TM-STC> Some operations, allowed by the static type system, can
TM-STC> still be wrong because of different measurement units or
TM-STC> other reasons.

If the static type system is sufficiently advanced, such
distinctions *can* be included. Instead of having a type FLOAT, you
have a type FLOAT_FEET or FLOAT_METERS, and funtions that expect a
measurement in feet but get passed a measurement in meters could
generate a compile time type-check error. Are you saying that Seed7
is not capable of having two different FLOAT types, one for feet
and one for meters, and distinguishing them from each other at
compile time? In Java or CL this would be trivial to implement.
(The trick is to make sure the **input** data is of the correct
type, since input data sitting in a disk file or feeding from a
live instrucment is outside the protection of the compile-time
type-checking ability. If the input data explicitly said in
English or other natural language what the units were, that might
help reduce the error rate. For example, an input form for data
entry could pre-load the TextField with "NN feet" with the NN part
selected so that keystrokes will replace it without harming the
"feet" part, so that the user would constantly see "feet" alongside
what he/she was typing and would have to be very very stupid to
enter a reading in meters there and still press the ENTER or
SUBMIT button.)

TM-STC> In the end there are also other possible sources of errors,
TM-STC> such as range violations.

And how will static type checking protect against those, where
runtime intentional-type checking already does? Do you define a
different static type for each possible range of a value, thereby
having thousands of different static types for all the possible
ranges for intermediate results within a lengthy calculation?
float[0..5] x = 4.2;
float[0..25] y = x**2;
float[-12.5..12.5] z = y - 12.5;
float[0..12.5] w = abs(z);

TM> Basing on this arguments, it would be nice to discuss static
TM> type checking.

OK, I'm game, under the rules that if I present a case for runtime
checking of intentional type (for example the weight of a backpack
must be between zero and whatever weight the human is capable of
carrying on his/her back, which varies from person to person, so
effectively for each type of person in regard to weight lifting
capability there's a different type of backpack), you must tell how
the same protection against inappropriate operations can be somehow
accomplished by compile-time type-checking without any runtime
checking whatsoever. So for a starter, how would you propose to
guarantee, by compile-time type-checking *only*, that arithmetic
overflow never happens in a complicated chain of calculations, and
that nobody is ever fitted with a backpack too heavy for that
person.

TM> BTW.: I am still waiting to get an answer for my other mail in
TM> this thread.

This thread is in a newsgroup. No mail is involved here.

I'm going to guess that you mean posted articles, not mail. There
is no practical way for me to check that I haven't missed one of
the articles you posted that you intended me to see. There is no
practical way for you to know whether I missed an article, or I
downloaded it but haven't yet replied because I currently have a
backlog of 38 articles (after this) which I've downloaded but not
yet replied to, except if you ask me. Of those 38, one was written
by you, namely this one:
Message-ID: <cc834369-ea96-4ddd...@56g2000hsm.googlegroups.com>
If any other articles you wrote haven't been answered by me when
you believe I should have answered, please e-mail me the list of
such message-IDs so that I can fetch them via Google Groups and see
what each was and why I didn't respond to it.

Meanwhile, if you haven't done so yet, please create a CGI
hello-world example, and also a CGI hello-world-plus-form-decode
example, using Seed7. If you can make such examples work on
cellphones that have mobile InterNet (WAP) enabled, that would be
even better.

thomas...@gmx.at

unread,

Aug 15, 2008, 3:50:36 PM8/15/08

On 15 Aug., 13:01, jaycx2.3.calrob...@spamgourmet.com.remove (Robert

Maas, http://tinyurl.com/uh3t) wrote:
> thomas.mer...@gmx.at wrote:
>
> TM> I get the impression that you think that only concepts
> TM> supported by your favorite language (LISP) are good concepts.
> TM> If some concept is not present in LISP you conclude that it is
> TM> bad and everybody using it does not know any better.
>
> Your impression is slightly wrong. While I immediately appreciate

> mechanisms present in Lisp, ...
[snip]

> So is that fair, a design pattern that is nice and straightforward

> (and clean, no need to run a 'c[h]ron' job to clean up the mess) in

> one person's favorite language, and asking how nice and
> straightforward and clean it is to do in somebody else's favorite
> language?

Guess what my favorite language is?

Besides this, not all concepts make sense when transfered to other
languages. Languages have a paradigma. All things that fit to a
paradigma can be added. Things that do not fit to a paradigma are
hard to add. I would not try to add overloading or infix operators
with priority and associativity or compile-time static type checking
(in the way this things are used and implemented in Seed7) to LISP.

> TM> I know that you hate static type checking
>
> Yeah, most of the time it terribly slows down line-at-a-time REPL
> development of new code, and it doesn't seem to provide any

> advantage that can't be easily obtained by runtime checking ...

I am not sure you got the idea of static type checking. Please try
to understand the concept. When you don't really understand the
concept any discussion is meaningless.

> TM> but I think that you should look at concepts from outside your

> TM> (LISP) world.
>
> You show me a concept, nicely explained so that I understand it, or
> maybe confusing at first but you be available to answer my followup

> questions, ...

Such as with this answer article.

> TM> I tried to include answers to your previous arguments against
> TM> static type checking in my FAQ at:

> TM>http://seed7.sourceforge.net/faq.htm#static_type_checking
>
> ***START***
> TM-STC> Type errors, such as an attempt to divide an integer by a
> TM-STC> string, can be caught earlier
>
> That's not earlier at all. With my line-at-a-time development
> style, as soon as I write the line of code that tries to do the
> illegal operation and submit it to REP, it'll signal an error.

So when you type the line

(add a b)

(I assume that 'add' can add two numbers and will fail (at runtime)
when one of the parameters is not a number) the 'REP' knows if this
is a legal or illegal operation? Since you probably hate
declarations with types, I assume that a and b are untyped global
variables or parameters. In the general case no assignments to a
and b have been written at the time you write the 'add' function
call. How can 'REP' know what values a and b will have at runtime?

> TM-STC> Since static type checking makes run-time type checks unnecessary, ...
>
> That is the big lie right there!

This is not a lie, and everybody who understands static type
checking will agree with me: If the compiler has checked that some
parameter is of type 'integer' (or 'file' or 'myObject') it will
for sure have a value of this type at runtime (see below). This
value could be NULL (when NULL is a valid value for this type as it
is the case for class values in most OO languages) or could cause
some illegal operation such as a division by zero, but it is
guaranteed to be a value of the static type (as long as other
mechanisms like initialisation and runtime checks take care that
no garbage values get created). Therefore the corresponding type
check can be omitted at runtime. Other checks like a check for NULL
or 0 might still be necessary.

Do not confuse type checking with dynamic dispatch. E.g. When
'myFile' is a parameter of the (interface) type 'file' a call
like

write(myFile, "hello");

can be checked at compile time aggainst the declaration of 'write':

const proc: write (inout file param, in string param) is DYNAMIC;

At runtime 'myFile' can have a value which has an implementation
type like external_file, utf8_file, screen_file, etc. But it is
guaranteed that the implementation type implements (directly or
indirectly) the interface type 'file'. That way there is a 'file'
value which actually has an implementation type which implements
'file'.

Choosing the right method is called dynamic dispatch and has nothing
to do with type checking (although the implementation type of the
value is used to do the dynamic dispatch).

> Suppose you have a data record
> that has ten optional fields. These ten fields can be filled in any
> sequence, and for many applications several of them will remain
> unfilled. Each function that uses these fields can have a runtime
> check to make sure all the required fields are filled in, ignoring
> the other fields not needed for that particular function. It would
> be horrendous to set up a hierarchy of classes of objects that were
> parameterized according to which fields were always defined per

> that class of object, ...

You are mixing two concepts here. Type checking and checking for
NULL values. E.g. Java will automatically initialize all
uninitialized fields with NULL. Even when the fields have different
types the NULL values are not illegal according to type check rules.
Some operations are legal for NULL values (e.g. comparison) while
others (dereferencing, method call) are illegal. Such things are
usually done with NULL checks at runtime. This has nothing to do
with compile-time or run-time type checking.

> There's just no practical way for
> compile-time static type-checking to completely replace runtime
> checking.

Compile-time static type-checking does not replace runtime checking
completely. Static type checking and runtime checking complement
each other. Sometimes a runtime type check can be ommited, because
of a static type check. Other things can only be checked at runtime.

> Note that it's *intentional* type checking that most
> requires runtime checking. For example, you might use exactly the
> same static compiletime type for a whole bunch of intentional
> types, because setting up a separate static type for each separate
> intentional type would be a royal pain.
> Static type: Animal &optional gut legs wings fins lungs gills breasts
> Intentional type: Mammal = animal+gut+legs(4)+lungs+breasts
> Intentional type: Fish = animal+gut+fins+gills
> Intentional type: Marsupial = animal+gut+legs(4)+lungs-breasts
> Intentional type: Bird = animal+gut+legs(2)+wings(2)+lungs
> Intentional type: Worm = animal+gut-legs-fins
> Intentional type: Sponge = animal-gut
> Intentional type: Crocodile = animal+gut+legs(4)+lungs+fins

Your concept of static vs intentional type can be compared to
the concept of interface vs implementation type that Seed7 has.
There is an explanation of this concept here:
http://seed7.sourceforge.net/manual/objects.htm

> Intentional types can be so subtle that no imaginable compile time
> declarations would suffice to distinguish them and thereby allow
> that compile-time type-checking would protect against all possible
> inappropriate function-data applications.

It never was a goal of compile-time type-checking to protect against

all possible inappropriate function-data applications.

The static type checks for an interface type can check some
misuses like an attempt to add two files.

> TM-STC> Although static type checking is very helpful in finding
> TM-STC> type errors, it cannot replace a careful program design.
> TM-STC> Some operations, allowed by the static type system, can
> TM-STC> still be wrong because of different measurement units or
> TM-STC> other reasons.
>
> If the static type system is sufficiently advanced, such
> distinctions *can* be included.

Yes

> Instead of having a type FLOAT, you
> have a type FLOAT_FEET or FLOAT_METERS, and funtions that expect a
> measurement in feet but get passed a measurement in meters could
> generate a compile time type-check error. Are you saying that Seed7
> is not capable of having two different FLOAT types, one for feet
> and one for meters, and distinguishing them from each other at
> compile time?

Seed7 can do such things.
Unless it has the purpose of measurement unit conversions a program
which uses feet and meters would be suspicious.

> In Java or CL this would be trivial to implement.
> (The trick is to make sure the **input** data is of the correct
> type, since input data sitting in a disk file or feeding from a
> live instrucment is outside the protection of the compile-time
> type-checking ability. If the input data explicitly said in
> English or other natural language what the units were, that might
> help reduce the error rate. For example, an input form for data
> entry could pre-load the TextField with "NN feet" with the NN part
> selected so that keystrokes will replace it without harming the
> "feet" part, so that the user would constantly see "feet" alongside
> what he/she was typing and would have to be very very stupid to
> enter a reading in meters there and still press the ENTER or
> SUBMIT button.)
>
> TM-STC> In the end there are also other possible sources of errors,
> TM-STC> such as range violations.
>
> And how will static type checking protect against those, where
> runtime intentional-type checking already does?

Static type checking is not intended to replace runtime checks.
Range errors like accessing an non-existing array element cannot
(at least in the general case) be checked at compile time. Why do
you think that static type checking wants to protect against such
things?

> Do you define a
> different static type for each possible range of a value, thereby
> having thousands of different static types for all the possible
> ranges for intermediate results within a lengthy calculation?
> float[0..5] x = 4.2;
> float[0..25] y = x**2;
> float[-12.5..12.5] z = y - 12.5;
> float[0..12.5] w = abs(z);

Pascal had subranges for integers. An implementation could check at
runtime that some value is still in range. This is a concept that
allows to introduce more runtime checks based on the subranges.
Since arithmetic operations have an influence at the range it is
hard to manage subranges with a simple type system. I consider
subranges as distinct (orthogonal) concept.

> TM> Basing on this arguments, it would be nice to discuss static
> TM> type checking.
>
> OK, I'm game, under the rules that if I present a case for runtime
> checking of intentional type (for example the weight of a backpack
> must be between zero and whatever weight the human is capable of
> carrying on his/her back, which varies from person to person, so
> effectively for each type of person in regard to weight lifting
> capability there's a different type of backpack), you must tell how
> the same protection against inappropriate operations can be somehow
> accomplished by compile-time type-checking without any runtime
> checking whatsoever.

Nobody who arguments for compile-time type-checking talks about
"without any runtime checking whatsoever". As I already said I see
the subrange concept as orthogonal to the type concept. The type
is 'float' (which can be checked statically) and the elements also
have a subrange (which needs to be checked at runtime).

If you don't want to mix this subranges you need to create
incompatible types. Otherwise your subranges are compatible and
can be mixed. From the Seed7 view there could be compatible and
incompatible subranges.

There are several open issues with a subrange concept like the
places where the runtime checks should be done (should every
intermediate result be checked or just values that gets assigned).

> So for a starter, how would you propose to
> guarantee, by compile-time type-checking *only*, that arithmetic
> overflow never happens in a complicated chain of calculations, and
> that nobody is ever fitted with a backpack too heavy for that
> person.

Compile-time type-checking is not capable to end all wars and
it will not cure all diseases. Therefore you conclude that it is
useless. IMHO overflow and subrange checking is NOT the purpose of
compile-time type-checking. Its a different issue.

As you probably know a subrange can be followed (and expanded) over
a complicated chain of calculations. When loops get involved the
subranges can grow very quickly even to infinite. In the optimal
case some guarantees can be deduced that allow the omission of some
overflow or range checks at runtime. I see such things as a subject
of code analyzers and not as part of the static type checking.

> Meanwhile, if you haven't done so yet, please create a CGI

> hello-world example, ...

I wrote some CGI programs in Seed7 (for example an experimental
wiki). Therefore I can assure you that CGIs in Seed7 are possible.
My CGI Programs start with the line:

#!/usr/local/bin/hi -q

As you can probably guess the Seed7 interpreter is installed at
/usr/local/bin/hi on my machine. The option -q (for quiet) instructs
the hi interpreter to omit any output of its own. Compiled CGIs are
also possible. The library "cgi.s7i" contains some support to manage
the CGI parameters.

My intent is to come up with a little bit more than a hello world
CGI.

BTW. Did you download Seed7 and experiment with it?

Pascal J. Bourguignon

unread,

Aug 15, 2008, 8:11:56 PM8/15/08

thomas...@gmx.at writes:

> Besides this, not all concepts make sense when transfered to other
> languages. Languages have a paradigma. All things that fit to a
> paradigma can be added. Things that do not fit to a paradigma are
> hard to add. I would not try to add overloading or infix operators
> with priority and associativity or compile-time static type checking
> (in the way this things are used and implemented in Seed7) to LISP.

You're demonstrating some closed mindeness here, because some lispers
did dare adding all that to Lisp.

>> That's not earlier at all. With my line-at-a-time development
>> style, as soon as I write the line of code that tries to do the
>> illegal operation and submit it to REP, it'll signal an error.
>
> So when you type the line
>
> (add a b)
>
> (I assume that 'add' can add two numbers and will fail (at runtime)
> when one of the parameters is not a number) the 'REP' knows if this
> is a legal or illegal operation?

No it doesn't need to know. It just pass the expression to EVAL, and
care to catch any error. If in the meantime the user or any other
agent replace ADD to deal with the unexpected type of parameters, or
replace the invalid values with good ones, all is well.

C/USER[530]> (defun add (a b) (+ a b))
ADD
C/USER[531]> (add "IV" "II")

*** - +: "IV" is not a number
The following restarts are available:
USE-VALUE :R1 You may input a value to be used instead.
ABORT :R2 ABORT
C/Break 1 USER[532]> :r1
Use instead: 4

*** - +: "II" is not a number
The following restarts are available:
USE-VALUE :R1 You may input a value to be used instead.
ABORT :R2 ABORT
C/Break 1 USER[533]> :r1
Use instead: 2
6

Notice that what the user did here interactively, a software restart
software handler can do it also, automatically. At run-time.

In addition to the substitution of invalid values, it's also possible
to replace a function definition by another debugged one, during
run-time.

> Since you probably hate
> declarations with types, I assume that a and b are untyped global
> variables or parameters. In the general case no assignments to a
> and b have been written at the time you write the 'add' function
> call. How can 'REP' know what values a and b will have at runtime?

The REPL do work at run-time. If the variables are unbound when their
value is required, then an error occurs, and of course, some agent can
correct that error, at run-time:

C/USER[547]> (add a b)

*** - EVAL: variable A has no value
The following restarts are available:
USE-VALUE :R1 You may input a value to be used instead of A.
STORE-VALUE :R2 You may input a new value for A.
ABORT :R3 ABORT
C/Break 1 USER[548]> :r2
New A: 4

*** - EVAL: variable B has no value
The following restarts are available:
USE-VALUE :R1 You may input a value to be used instead of B.
STORE-VALUE :R2 You may input a new value for B.
ABORT :R3 ABORT
C/Break 1 USER[549]> :r2
New B: 2
6
C/USER[550]> (add a b)
6
C/USER[551]>

>> TM-STC> Since static type checking makes run-time type checks unnecessary, ...
>>
>> That is the big lie right there!
>
> This is not a lie, and everybody who understands static type
> checking will agree with me: If the compiler has checked that some
> parameter is of type 'integer' (or 'file' or 'myObject') it will
> for sure have a value of this type at runtime (see below).

This is both irrelevant and wrong.

It is irrelevant, because of course, that variable is of type
'integer', we don't need any nitpicking compiler to know that, and
this is useless, because it doesn't tell anything about the validity
of the integer values this variable has. Perhaps the variable was
meant to hold the capitain's age, and suddenly, it helds his wife's
age instead. The type system won't be able to detect this defect.
You will have to write test cases and check that at run-time ANYWAYS.

And this is very wrong in most programming language, who like C allow
for random pointers and other niceties which allows any bit pattern to
land in any variable, and therefore in practice, in practical
programs, that are always bugged, at run-time the garantees of the
compilers are often violated. Unfortunately, since you relied on
compilation time typing, you lost all typing information at run-time,
and therefore you cannot do anything else than fail miserably.

On the other hand, in checked environment, the types being not
attached to the variable but to the value, if you assign a value of
the wrong type to a variable where you expected another type, this is
something you can detect and you can correct. See the above examples.
No process crashed in the making of that example.

> Compile-time static type-checking does not replace runtime checking
> completely. Static type checking and runtime checking complement
> each other. Sometimes a runtime type check can be ommited, because
> of a static type check. Other things can only be checked at runtime.

We can agree with that. Notice that some lisp compilers do run
compile-time type checks:

C/USER[551]> (compile (defun f (x) (+ x "abc")))
WARNING in F :
Arithmetic operand "abc" must evaluate to a number, not "abc"
F ;
1 ;
1

While such type checking can be useful indeed, there is ABSOLUTELY no
reason to _require_ type declarations from the programmer. In the
above function, the programmer didn't need to add any type declaration
for the compiler to know x was a number and "abc" was of an
incompatible type.

But note that still doesn't prevent it to produce a usable and
debuggable program:

C/USER[552]> (f 3)

*** - +: "abc" is not a number
The following restarts are available:
USE-VALUE :R1 You may input a value to be used instead.
ABORT :R2 ABORT
C/Break 1 USER[553]> :r1
Use instead: 42
45
C/USER[554]>

> Seed7 Homepage: http://seed7.sourceforge.net
> Seed7 - The extensible programming language: User defined statements
> and operators, abstract data types, templates without special
> syntax, OO with interfaces and multiple dispatch, statically typed,
> interpreted or compiled, portable, runs under linux/unix/windows.

Sounds so much better than C++. Good luck :-)

--
__Pascal Bourguignon__ http://www.informatimago.com/

Nobody can fix the economy. Nobody can be trusted with their finger
on the button. Nobody's perfect. VOTE FOR NOBODY.

thomas...@gmx.at

unread,

Aug 16, 2008, 5:16:36 AM8/16/08

On 16 Aug., 02:11, p...@informatimago.com (Pascal J. Bourguignon)
wrote:

> thomas.mer...@gmx.at writes:
> > Besides this, not all concepts make sense when transfered to other
> > languages. Languages have a paradigma. All things that fit to a
> > paradigma can be added. Things that do not fit to a paradigma are
> > hard to add. I would not try to add overloading or infix operators
> > with priority and associativity or compile-time static type checking
> > (in the way this things are used and implemented in Seed7) to LISP.
>
> You're demonstrating some closed mindeness here, because some lispers
> did dare adding all that to Lisp.

I know that everything and the kitchen sink has been added to LISP..
I spoke of the paradigma and I spoke about adding some things in the
way they are used and implemented in Seed7. You can add things
outside a paradigma, but it will disturb the paradigma.

> >> That's not earlier at all. With my line-at-a-time development
> >> style, as soon as I write the line of code that tries to do the
> >> illegal operation and submit it to REP, it'll signal an error.
>
> > So when you type the line
>
> > (add a b)
>
> > (I assume that 'add' can add two numbers and will fail (at runtime)
> > when one of the parameters is not a number) the 'REP' knows if this
> > is a legal or illegal operation?
>
> No it doesn't need to know. It just pass the expression to EVAL, and
> care to catch any error.

If the function arguments are present, an expression can be checked.
The hard thing is: Verify at compile (or edit) time that some
function will always get values of the correct type at runtime.

How does the execution of an expression, while editing the program,
veryfy that it will get a value of the correct type at runtime?

> If in the meantime the user or any other
> agent replace ADD to deal with the unexpected type of parameters, or
> replace the invalid values with good ones, all is well.
>
> C/USER[530]> (defun add (a b) (+ a b))
> ADD
> C/USER[531]> (add "IV" "II")

This is the wrong example: I know that such a call can be checked
with a runtime type check.

The question was how it can be verified at compile time that the
calls of the ADD function will never get a string (or other illegal)
value at runtime in a big program. For Lispers, one that ends

))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))

What I say is:
Without some sort of static type checking it is a huge effort. See:
http://seed7.sourceforge.net/faq.htm#static_type_checking

[snip wrong arguments]

> > Since you probably hate
> > declarations with types, I assume that a and b are untyped global
> > variables or parameters. In the general case no assignments to a
> > and b have been written at the time you write the 'add' function
> > call. How can 'REP' know what values a and b will have at runtime?
>
> The REPL do work at run-time.

The discussion was about finding type errors at compiler time.

[snip wrong arguments]

> >> TM-STC> Since static type checking makes run-time type checks unnecessary, ...
>
> >> That is the big lie right there!
>
> > This is not a lie, and everybody who understands static type
> > checking will agree with me: If the compiler has checked that some
> > parameter is of type 'integer' (or 'file' or 'myObject') it will
> > for sure have a value of this type at runtime (see below).
>
> This is both irrelevant and wrong.

I doubt that.

> It is irrelevant, because of course, that variable is of type
> 'integer', we don't need any nitpicking compiler to know that, and
> this is useless, because it doesn't tell anything about the validity
> of the integer values this variable has.

The validity of a value (E.g.: to be an age with a value less than
150) is not the question for a type check. Such things are managed
with range (or validity) checks. If your program assigns the age of
the captain's wife as age of the captain most compile-time and
runtime checks will fail. Such things are not the topic of a
compile-time type check.

> And this is very wrong in most programming language, who like C allow
> for random pointers and other niceties which allows any bit pattern to
> land in any variable

Have you seen that I wrote: (see below).
Below I wrote:

As long as other mechanisms like initialisation and
runtime checks take care that no garbage values get created.

This rules out examples where a random pointer puts any bit pattern
in a variable.

BTW: Random bit patterns created by random pointers would make
compile-time and runtime type checks useless.

> > Compile-time static type-checking does not replace runtime checking
> > completely. Static type checking and runtime checking complement
> > each other. Sometimes a runtime type check can be ommited, because
> > of a static type check. Other things can only be checked at runtime.
>
> We can agree with that. Notice that some lisp compilers do run
> compile-time type checks:
>
> C/USER[551]> (compile (defun f (x) (+ x "abc")))
> WARNING in F :
> Arithmetic operand "abc" must evaluate to a number, not "abc"
> F ;
> 1 ;
> 1
>
> While such type checking can be useful indeed, there is ABSOLUTELY no
> reason to _require_ type declarations from the programmer.

I know that there are technics like type inferencing that can deduce
the type of an expression. For an expression like "abc" this is
easy, but in the general case (with generic typeless functions) it
may be necessary to look at many program lines to infer a type.
Needless to say: This must be done by the compiler and by every
person who reads the program. I have described this here:
http://seed7.sourceforge.net/faq.htm#type_inference

When a program is written the programmer has something in mind.
E.g.: What values his variables or parameters will hold.
Declarations with type allow the programmer to express his
thoughts. This again eases reading the program.

Try to read programs written in different programming languages.
Asside from being used to a language there are other criteria
which make reading of programs easier or harder.

Some programing languages famed for being easy to write are also
known as write only languages.

Programs are more often read than written.
Have you ever read your own ingenious programs several years
after you wrote them?

When you read a book. Do you care that it was easy for the author
to write it? Or do you prefer books that are easy to read?
Do you like to jump through several chapters just to understand
one sentence?

> > Seed7 Homepage: http://seed7.sourceforge.net
> > Seed7 - The extensible programming language: User defined statements
> > and operators, abstract data types, templates without special
> > syntax, OO with interfaces and multiple dispatch, statically typed,
> > interpreted or compiled, portable, runs under linux/unix/windows.
>
> Sounds so much better than C++. Good luck :-)

Thank you.

Greetings Thomas Mertes

thomas...@gmx.at

unread,

Aug 16, 2008, 1:46:11 PM8/16/08

On 16 Aug., 02:11, p...@informatimago.com (Pascal J. Bourguignon)
wrote:
> thomas.mer...@gmx.at writes:

> And this is very wrong in most programming language, who like C allow
> for random pointers and other niceties which allows any bit pattern to
> land in any variable, and therefore in practice, in practical
> programs, that are always bugged, at run-time the garantees of the
> compilers are often violated. Unfortunately, since you relied on
> compilation time typing, you lost all typing information at run-time,
> and therefore you cannot do anything else than fail miserably.

Good point.
I did not take into account things like uninitialised variables and
pointers which change arbitrary places in memory. This is something
missing in my FAQ argumentation (maybe because Seed7 is missing such
"features").
IMHO it is possible to forbid the violation of the compiler
garantees. Would the following paragraph be ok?

It is possible to make sure that objects of a type hold only legal
values of this type. This goal is reached with mechanisms like
mandatory initialisation, runtime checks and the impossibility to
change arbitrary places in memory. When the generation of garbage
values is avoided, it can be guaranteed that only legal values of
the correct type are used as object values. This way run-time type
checks are unnecessary and the program execution can be more
efficient.

What do you think about it?

> On the other hand, in checked environment, the types being not
> attached to the variable but to the value, if you assign a value of
> the wrong type to a variable where you expected another type, this is
> something you can detect and you can correct. See the above examples.
> No process crashed in the making of that example.

The Seed7 interpreter maintains a category for every object. The
category of an object can be INTOBJECT, STRIOBJECT, FLOATOBJECT,
etc. When the interpreter executes a primitive action like INT_ADD
(which is used to implement the + operator for integers) both
parameters are checked that they have the correct category (in case
of INT_ADD this is INTOBJECT). This way the hi interpreter does also
some sort of runtime type check (using the category instead of a
type). If this check fails you get a fatal error. When the
definitions of the primitive actions are correct this error never
shows up. This is because all type errors are catched by the
compile-time type check.

In Seed7 it is this way: Runtime type checks would never be
triggered, because compile-time type checks stop such bugs long
before they are executed.

Greetings Thomas Mertes

Pascal J. Bourguignon

unread,

Aug 16, 2008, 2:37:16 PM8/16/08

thomas...@gmx.at writes:

Yes, that's better. Notice that is a very strong requirement. For
example, since unix syscalls are implemented in C, you have to prove
that the kernel is bug free, and this proof is not done by your
compiler. It forbids FFI to "lesser" programming languages (as long
as they run on a non-checked virtual machine).

--
__Pascal Bourguignon__ http://www.informatimago.com/

NEW GRAND UNIFIED THEORY DISCLAIMER: The manufacturer may
technically be entitled to claim that this product is
ten-dimensional. However, the consumer is reminded that this
confers no legal rights above and beyond those applicable to
three-dimensional objects, since the seven new dimensions are
"rolled up" into such a small "area" that they cannot be
detected.

thomas...@gmx.at

unread,

Aug 16, 2008, 5:32:03 PM8/16/08

On 16 Aug., 20:37, p...@informatimago.com (Pascal J. Bourguignon)

Yes, this is a strong requirement.
IMHO all languages (also those that only use dynamic type
checking) have the problem that a library or even the kernel
could destroy a value (while keeping the type information
intact). In that case a runtime type check would succeed
although it shouldn't and bad things could happen afterwards.

For that reason the implementation of every programming
language should also include a verification of all used libraries
and the kernel.

Naturally this is not always done up to 100%.

Robert Maas, http://tinyurl.com/uh3t

unread,

Aug 31, 2008, 6:27:39 AM8/31/08

Quick summary of the topic of this thread: "program compresssion"
means using tools (such as parse-tree rewrite rules or syntax
extension rules) to allow collapsing a verbose software pattern
into a very terse expression whereby all the boilerplate of the
software pattern is gone, leaving only the variable parts plus just
enough keyword and/or syntax to trigger expansion back to the full
software pattern.

> >> The solutions are shorter in modern FPLs.
> > OK, let's say you have a text string which contains the notation
> > for a nested list starting from the beginning of the string.
> From: Jon Harrop <j...@ffconsultancy.com>
> "The" notation? You mean Lisp's notation?

Well I suppose "the" implies that. So let me change my wording to
say instead "any reasonable notation for a nested list". By
reasonable I mean that there's some mark to show the start of each
level of list and a matching mark to show the end of that same
level of list. For example, C uses curly braces instead of parens,
and commas instead of white-space, to denote initializer data for
arrays. If that same notation were in a string instead of the right
place in an array declaration in a C source file, could you write
code to parse it into a nested list, or can you usurp the code in
the C compiler to avoid having to write new code from scratch?

Or could you write code to parse SGML or XML instead?

> That's fine but it is of no practical interest because real data
> is rarely already in Lisp syntax. For a more useful comparison you
> must consider other formats.

Are you saying the kind of notation used in C array initializers
and XML SOAP objects is not general enough? What kind of nested
list notation would you propose as an alternative where the data
might be naturally in that form without any human intervention, or
where manual methods for data keeping by humans have had that
notation existing prior to scanning that notation to be entered
into computers, or whatever else you meant? You say what your
favorite notation for nested lists is, and if it looks reasonable
to me then I'll agree it's a valid benchmark for comparing how easy
it is to implement a parser for it (creating a DOM structure as
output) in various programming languages.

On a related question, is there *any* nested-list notation that is
commonly convertible to/rom internal DOM structure by some
available C library? For Lisp the answer is YES, namely those
s-expresssions you seem to hate. So does C even have something
comparable, never mind the perverse syntax it uses??

> How about something of practical relevance: custom grammars.

Does there exist an OCaml library which converts a BNF
specification into a corresponding parser function?

> Consider parsing Mathematica expressions including lists, rules,
> sums, products, powers and factorials with associativity and
> precedence. In OCaml:
> open Camlp4.PreCast;;
> let mma = Gram.Entry.mk "mma";;
> EXTEND Gram
> mma:
> [ "rule" LEFTA
> [ e1 = mma; "->"; e2 = mma -> `Rule(e1, e2) ]

> ...
> END;;

Does there exist an OCaml library which converts a BNF specification
into the above OCaml code, which can then be compiled and executed?

> >> Macros are very rare in modern languages.
> > Why?
> Two reasons:
> . Macros are used primarily to work around deficiencies in the

> language's design, ...

It is impossible for a language, as delivered, as standardized, to
already contain every syntax mechanism any user-programmer will
ever want to employ to make writing code easier. Consequently every
language ought to have Lisp-style macros (parse-tree
transformations) to allow ordinary programmers to extend the set of
allowed just-right parse-tree formulations beyond those which are
supplied by the standard/vendor. It is a gross deficiency that any
language is missing such parse-tree transformation capability.
These other (non-Lisp) languages already allow user-programmers to
define new functions to extend the functions given by the
vendor/standard. The OOP languages already allow user-programmers
to define new classes with associated methods to extend the class
hierarchy given by the vendor/standard. So why don't they also
allow user-programmers to define new transformations from
source-file syntax to what-actually-gets-compiled syntax?

Just one example: The new LOOP special-form is a wonderful addition
to Common Lisp, and was implementable by ordinary user-programmers
before it became part of the standard, merely by defining a LOOP
macro to transform the LOOP special form into ordinary supported
source code. Why don't other languages allow enough user-programmer
power to define an equivalent LOOP macro and thereby allow ordinary
user-programmers to implement a LOOP special form in those other
langauges?

> Using macros forks the language and makes maintenance much harder.

True, but defining functions forks the language in almost exactly
the same way. A program that calls SQRT(5) isn't a valid program
unless the library containing the SQRT function is available.
Therefore applications which depend on the availability of
externally-defined functions, and applications which depend on the
availability of externally-defined macros, suffer almost exactly
the same extra hardness of maintenance. The only difference is that
with externally-defined macros you need to make sure the
appropriate macro library is loaded at compile time, whereas with
externally-defined functions you only need to make sure the
appropriate macro library is included in the load directives (in
languages such as C which require all external symbols to be
resolved at load time) or available at runtime (in languages such
as Java which support loading new classes at runtime). In languages
where the compiler itself is just another application written in
the same language (see the other thread about how to solve the
chicken-egg problem via bootstrapping early
versions of the compiler from some other
language before closing the loop on
self-supporting compiler)
making an external library available at compile time vs. load time
or run time is a trivial distinction. Accordingly the use of macros
does not make maintenance significantly more difficult than merely
using functions or methods.

> New programmers must effectively learn a new language.

The way to do that is not to try to learn the whole thing at one
time. First just learn how to install and run a "Hello World"
program. Next learn how to include most of the essential
primitives, and practice coding simple algorithms using them and
getting these algorithms to work correctly. Finally, start learning
how to use additional functions/methods/macros and special
mechanisms such as multiple return values or keyword parameters
which are useful for whatever type of program that student wants to
tackle first. Note that at this stage every student should be able
to learn new stuff in a different sequence, appropriate for what
kinds of D/P tasks that particular student is trying to program.
The key during this never-ending stage is to have the means to
efficiently explore intentional data types and the libraries etc.
that implement those data types, including information as to
whether those libraries etc. are already part of the delivered
programming environment or need to be downloaded from some Web site
and linked into the programming environment, and if download is
needed than full installation instructions that are easy for
beginners to understand. For example, if you want to tackle the
definition of a custom grammar for some problem domain, you need to
be able to find what libraries deal with custom grammers and how to
pick the most appropriate one (if there are more than one "good"
libraries) and how to make the chosen library available for use.

> Tools no longer work transparently with source code.

If the source code contains declarations to cause an additional
library to be available at compile time to support source-code
parse-tree transformations, but the tool fails to notice these
declarations and hence fails to work correctly when such "macros"
are used in source code, that is a deficiency in the tool. Use a
better tool, don't cramp the language because your stupid tool
doesn't work on the full range of the language syntax.

> The parse tree is no longer relevant in modern languages.

If you are going to allow user-programmers to extend the set of
parse trees that can be compiled, by adding new parse-tree
transformations, then the parse tree is totally relevant!! Just
because your favorite language doesn't support parse-tree
transformations doesn't mean the technique is worthless. Lisp
doesn't support SQL with relational databases natively, it needs to
be added by a library, but I don't consider relational databases to
be worthless because of that. So why do you consider parse-tree
transformations to be worthless? Do you also consider
domain-specific laguages (DSLs) to be worthless??

> In all modern languages, the parse tree (the internal
> representation used by the compiler and macro system) is irrelevant
> due to the use of quotations.

What do you mean by "quotations" in this context?
(It's not going to do any good to do a Google search, because it'll
turn up Shakespeare quotations, and Will Rogers quotations, and
George W. Bush gaffes, etc. etc. etc., so I'm asking you to cite
what you mean.)

> Look at the following equivalent of UNWIND-PROTECT:
> # EXTEND Gram
> expr: LEVEL "top"
> [ [ "try"; f=sequence; "finally"; g=expr ->
> <:expr<
> (function
> | `Val v, g -> g(); v
> | `Exn e, g -> (try g() with _ -> ()); raise e)
> ((try `Val $f$ with e -> `Exn e), (fun () -> $g$))
> >> ] ];
> END;;

I can't make heads or tails of what you wrote there. I see
half-meaningful pieces tangled together in a meaningless (to me)
mess, and I have no idea what that block of "code" is really
saying. Part of my problem seems to be that you're using a non-BNF
form of expression of grammar production rules, which I don't
understand. Perhaps whenever you are presenting such examples, you
should post the URL of a tutorial that explains that particular
grammar-production syntax, what it all means. Also you need to post
the URL of a playpen for testing my understanding of that syntax,
so that I can try my hand at writing my own grammer and checking if
I got it right by parsing what I believe to be sentences of my
grammar and see if the playpen parser agrees with me as to how the
sentence should parse.

By the way, either "whereis ocaml" nor "man ocaml" turns up
anything here. Is that the correct name that would be used for
OCaml on Unix, so I know it's not installed here? Or should I be
looking under some other name? I tried both commands with
capitalized "OCaml" but neither of those turned up anything here
either.

> The input source code is represented by a sequence of tokens and
> the associativity and precedence of the new grammar rule, so the
> programmer is no longer forced to write without syntax as you do in
> Lisp.

So with all this extra power, it should be **easy** for you to write
an extension to OCaml source-code syntax to support something like the
new LOOP special form in ANSI Common Lisp, right? Let's say you
wanted to make a toy demo of this capability by implementing only
something like (LOOP FOR el IN ls COLLECT (sqrt el)) where I've capitalized
the LOOP keywords and left the ordinary source code as lower case to make
the syntax clear. How would you design the corresponding OCaml syntax
extension? I'd guess something like:
LOOP FOR el IN ls COLLECT sqrt(el);
Is that the exact syntax you'd choose, or something different? In
either case, how would you implement that syntax extension in
OCaml, to achieve exactly equivalent semantics to what Common Lisp
provides? For example:
(LOOP FOR el IN '(1 2 3 4 5) COLLECT (sqrt el))
=> (1.0 1.4142135 1.7320508 2.0 2.236068)
(LOOP FOR sr IN * COLLECT (* sr sr))
=> (1.0 1.9999999 3.0 4.0 5.0) ;If I had my way, the default mode
; would be interval arithmetic
; instead of floating-point
; shot-in-dark, but that issue
; isn't relevant to this discussion.
Note your syntax extension must work for *any* local variable
instead of just 'el', *any* given object in place of 'ls', and
*any* valid form involving the local variable, not just (sqrt el)
or (* sr sr).

> The action is represented by a quotation <:expr< ... >> that
> contains ordinary OCaml code so, again, the programmer is no longer
> forced to write without syntax as you do in Lisp.

That remark is a first step toward a tutorial and playpen for this
feature of OCaml, but until the full tutorial+playpen is available
for me to play with, I won't be able to really understand what
you're talking about and what your code fragment means.

> Pattern matching is also instrumental here: you can quote code in
> patterns and in expressions so you can rewrite OCaml code without
> having to deal with the low-level parse tree at all.

I'm not granting your point just yet, but it sounds like this
feature *might* be a valid alternative to source-code parse-tree
transformation. At this point I really hope you do set up a CGI
application that lets me play with this feature, together with a
tutorial that explains what I should be trying, starting with the
simplest case and working toward more complicated cases.

> This makes term rewriting much easier in OCaml than in Lisp ...

It doesn't at all look easier than how it's done in Lisp. Maybe
after you set up your tutorial+playpen and I spend some time
developing my understanding of how to use that facility, I'll
agree, or maybe I'll say it really works but is more complicated
than Lisp's "macro" facility.

> We can now improve syntax far beyond the capabilities provided by Lisp.

That's blatantly untrue. *any* formally defined grammer can be
implemented as a parser using Lisp as the coding language,
whereupon any such grammer can be fed as syntax into such an
operational Lisp program. Witness how way back with MacLisp,
Vaughan Pratt developed CGOL, and about the same time with SL
(Standard Lisp) Tony Hearn developed REDUCE/RLISP, each an
Algol-like syntax for Lisp sourcecode.

> You can implement UNWIND-PROTECT using only a higher-order function.

(and thus avoid needing to use source-code parse-tree transformations)

Please explain what you mean by higher-order function in this
context, such that the syntax for an unwind-protect form can be
implemented without any need to enhance the Lisp kernel to provide
support for it. I don't believe it's possible to define
unwind-protect in terms of McCarthy's original Lisp primitives.

> I'd like to see an example of a design pattern that cannot
> already be factored out in modern FPLs without the use of macros.

Note that "macros", i.e. parse-tree post-read pre-compilation
transformations, are just one way to implement new syntax to
support what was formerly a design pattern in the underlying
language. I don't claim that "macros" are the only way. If and when
you provide a tutorial and playpen for OCaml's facility for
ordinary user-programers to introduce new syntax capabilities into
the OCaml compiler, I'll be able to better estimate whether it's
fully as capable as Lisp's "macros" or more capable or less
capable, as well as whether it's easier to use or harder to use or
about the same difficulty.

> That is also why Lisp is so verbose in practice: the burden of
> parsing is placed upon the programmer who must write out programs
> in what should be an internal representation.

IMO this claim is totally bogus. Whenever a design pattern consists
of a lot of boiler plate and just a little bit of variability, a
Lisp "macro" can support a syntax that consists of a master keyword
to name the macro-expander together with just the barebones
variable parts and just enough parens to properly nest the variable
parts, which expands into the full boilerplate-plus-variability. If
the OCaml feature you have mentionned for introducing new syntax
rules works in the way you seem to imply, almost exactly the same
can be said for OCaml. There's no way that the syntax needed to
invoke an OCaml syntax extension would be significantly less
verbose than the syntax needed to invoke an equivalent Lisp "macro".

> >> OCaml has a full macro system but it is rarely used.
> > Is it a parse-tree macro system, like Lisp has, or is it a
> > string-syntax macro system, like C has?
> OCaml's macros act upon parse trees internally but they are not
> imposed upon the programmer as they are in Lisp.

They are not imposed on LIsp programmers. They are merely available
for anyone who might wish to define one (and supplied macros are
available for use without even needing to know that they are
macros). In fact I hardly ever use "macros" in my work, because the
abstractions achieved by defining new functions are good enough for
my needs virtually all the time, and the need to load macros before
anyting that uses them would make my standard software development
methodology more trouble. Without sufficient extra value to offset
the increased trouble, I just don't bother. I don't use CLOS or
generic functions for exactly the same reason. Ordinary bottom-up
tool development using layers of defined functions are good enough
for 99.99% of my uses here. But still I appreciate that macros are
available for others to implement the LOOP special form and other
special forms that I use all the time such as WITH-OPEN-FILE and
WITH-OUTPUT-TO-STRING, and I sorely miss those macros in
implementations of Common Lisp that are incomplete. In the case of
the PUSH macro, I missed it so sorely in XLISP that I found the
source and copied it to my patch file which I then loaded whenever
I used XLISP. The fact that I could just load it into an
already-running XLISP environment and voila the PUSH special form
was available, is a big win IMO. Can your new-syntax-rule thingy in
OCaml be used like that??

> In Lisp, macros are used primarily to Greenspun pattern matching.

Wrong. "macros" are used primarily to rearrange the variable parts
and supply the boilerplate parts. For example:
(push foo bar) => (setf foo (cons foo bar))
Note that SETF and CONS are boilerplate, and foo has been
rearranged to be in two places instead of just one, and bar has
descended to the second level of nested list. (Actually to handle
cases where foo is a complicated form, the actual expansion
invovles LET of gensyms for sub-forms so that each sub-form of foo
gets evaluated just once instead of twice. The LET and gensym etc.
are a *lot* of boilerplate that gets added, then in trivial cases
where foo is a simple variable the compiler optimizes away most of
that boilerplate. But the macro-expander doesn't have to worry
about special cases, it just generates a giant boilerplate
expression that will *always* work, making the macro very easy to
write.)

I'm curious: Does OCaml provide native support for the equivalent
of SETF and PUSH? Or is it easy using the syntax-extension
mechanism to add such syntax? When adding syntax for PUSH, is it
easy to make sure each sub-form of the place-expression ('foo' in
the above example) gets evaluated just once?

> you have yet to come up with a single example of a useful macro.

SETF is a useful macro, whereby anybody can introduce a new data
type that has getter and setter methods, and then a corresponding
SETF rule can be defined, after which that new data type can be the
'place' parameter to setf, equivalent to the left side of an
assignment in infix-operator languages. Does that mechanism somehow
exist in OCaml?

After SETF or equivalent is available, next PUSH and POP and are
also very useful. They can be defined in Lisp, being careful to
evaluate sub-forms of the getter and setter expression just once.
Is an equivalent mechanism available in OCaml?

> Finalizers are good when memory pressure is relevant, i.e. when
> you are handling resources that consume system memory.

Using a reference-count GC system, where memory is freed
immediately when the last reference to it goes away, this would be
very useful, releasing the system resouce exactly at that point in
time. But with a mark-and-sweep GC system, where memory isn't
released until some random time later, this wouldn't be good at
all, IMO. Maybe somebody needs to implement a CL which uses both
kinds of GC, in two separate heaps. The application would then get
to choose which heap to use when allocating new structure. For most
applications, it wouldn't matter much which heap to use. For
applications that need to build circular structures, the
mark-and-sweep heap would be necessary. For applications that need
to establish system resources and close them the moment the last
reference goes away, the reference-count heap would be necessary to
make things work reliably. If choice of heap were a per-thread
global, I think this would be maximally useful. What do you think?
Do you know of any other languages where dual heap is employed? Or
is this a new great idea I came up with just tonight, which will
win me the Nobel prize eventually?

> For example, finalizers are used to collect weakly referenced
> parts of a cache. This allows the cache to be shrunk by the GC when
> memory pressure is high.

I agree. In this case, either kind of GC would be appropriate,
since the purge would happen exactly when memory runs out with
either reference count or mark and sweep. I agree that weak
references are missing from ANSI Common Lisp and they would be a
good thing for some vendor to add.

> [big context snip] Embedded devices used to be PICs but they now

> include phones and larger devices that can have substantial
> resources.

So indeed these are long-lived applications, with lots of
dynamically allocated memory, hence a need for automatic storage
reclamation (garbage collection). It seems to me that for *all*
these kinds of applications, circular list structure is totally
unnecessary, hence a reference-count system would be fully
adequate. Furthermore, any object has at most just a few
references. For example a WAV soundfile can be referenced from the
sound directory, from the voice-memo directory, from the ringtones
directory, from the currently-under-edit or currently-listened
sound facility, and that's about all. A GIF or JPEG image can be
referenced from the image directory, from the screensaver
directory, from the view-now/modify-image utility, and that's about
all. Hence a fixed-length integer would suffice for the reference
count, perhaps four bits for up to 15 references, or maybe overkill
with 8 bits for up to 255 references if that's more convenient to
program, or any other value at least 4 but less than 8 bits if some
bits of the octet are needed for other use. Do you agree?

What about cellphones that use Java as their OS? I'm pretty sure
they'd never need circular pointers, but might they occasionally
need more than 15 references to a single object?

> ... the "pre-customers" you are alluding to are either

> unidentifiable or worthless. The vast majority of people who offer
> you advice on the basis that they will buy your product end up not
> buying anything and the people who do buy your product will be
> people whom you have been completely unaware of and who have
> completely different requirements and values.

So how do I find alpha brainstormers and beta testers for early
versions of my software so that I can get feedback to direct my
programming efforts towards something that has a good chance of
eventualy getting real paying customers?

Note that tools and applications are very different "animals".
With a software tool, I can probably figure out all by myself how
general to make the tool, to maximize its usefulness, and I don't
need either alpha brainstormers nor beta testers until I have it
all done and just need beta testers to check if I accidently let a
bug slip in.
But with an application, I really need alpha brainstormers from
nearly the very start to help me define what customer base my idea
could help and what features I want to implement first and best and
to help me design a good user interface etc. etc. Have you looked
at my mobile-InterNet-WAP-cellphone WebPage where I list several
new projects for which I'd like to find some alpha brainstormers?
<http://shell.rawbw.com/~rem/WAP/projectIdeas.html>
Do you have any suggestions how to get the help I need at designing
a good application, for *any* of those proposed new applications?

> Let me give you some examples:
> My book OCaml for Scientists was written by me as a physical
> scientist and aimed at other physical scientists. I made the same
> mistake that you are of seeking out pre-customers and taking their
> advice. Atmospheric scientists told me to strip out the maths and
> discuss interoperability with obscure tools but virtually none of
> them actually paid for the book. Bioinformaticians are probably our
> largest market of real paying customers. I never got any advice
> from bioinformaticians whilst writing the book and the fact that
> much of the content is well-suited to their work is a complete
> fluke.

A book is not at all the same thing as an application. But your
experience may be of *some* limited value transferred from book
writing to software development. But I'd prefer your experience in
writing user-level application software, if you have any.

> More recently, I have been completely ignoring advice from
> "pre-customers" and have focussed on getting products to market
> faster based upon my intuition of what people need. This is vastly
> more lucrative.

I can't read people's minds, especially people I've never met or
even heard, my potential future user base. I don't have the
intuition to know what somebody might really want, among the
several things I am hot to develop but can spend my time developing
only one at a time, so I need to choose which *one* to apply heavy
effort toward. Recently I've been splitting my unpaid time among:
- ProxHash tool, and transferrable-skills application thereof.
- Custom products of pseudo-random large primes and public-key cryptography.
- Small utilities and applications for my own personal use, not for market.
- Practicing my skill at developing Web sites for tiny
(one-inch-square) screens of cellphones with mobile InterNet
(WAP) now that I've enabled such service about 5 weeks ago on my
cellphone that I bought in March. Currently my login form here:
<http://shell.rawbw.com/~rem/cgi-bin/LogForm.cgi?f=WAP>
doesn't work on my cellphone, but all my hello-CGI-plus stuff here:
<http://www.rawbw.com/~rem/WAP/HelloPlus/wHellos.html>
works just fine, so I need to spend some time someday soon to
diagnose why one works but the other doesn't, so that I can make
my login form and all the stuff after it work on cellphones too.
By the way, I know how to use the Paint program on MS-Windows to
crop and demagnify images to fit the cellphone screen, which
usually reduces their size to about 3-20k bytes. Here's a sample of
some of my cropping+demagnifying work:
<http://shell.rawbw.com/~rem/WAP/hotWantMeet.html>
The numbers in parens are the k size of the image, where the first
image in each group is the original image before cropping and later
images in the group are cropped. An 'x' means that particular image
doesn't work on my cellphone, shows as broken-image icon instead.
(So far *all* my images cropped and/or reduced to fit cellphone
screen actually do work on my cellphone, which is a great relief.)
But I'd really like to shrink the largest (10-30k) of the cropped
images to less than 10k each without shrinking their visual size,
by specifying lossy JPEG compression. Do you know how to do that on
MS Windows? I can't find that option in Paint.

Oh, if you notice images from #36 onward aren't cropped, that's
because I got only through #35 on Thursday before the computer lab
closed, and the lab was closed Friday because the admin has gone on
a four-day holiday, so I won't be finishing images #36-61 until
Tuesday at the earliest. Also I have two other HotOrNot accounts
and I haven't even begun setting up those images for cellphone.

> ... second highest risk approach to building a company ... ...

> I found that selling commodities like OCaml for Scientists to
> lots of ordinary programmers is not only more stable than
> gambling but can also be just as lucrative.

Gambling isn't at all stable, so being more stable than gambling
isn't saying much at all.

In any case, you seem to be selling software-productivity tools
rather than end-user applications, right?

> > So can you find me a first brainstorm-use-cases customer to get
> > this whole process started? I've mentionned lots of new software
> > technology I've implemented to the pointer where I feel they might
> > possibly be useful for a variety of applications that might be of
> > value to others. But until I find a potential customer who is
> > actually interested in what I have done and how I can work it into
> > a salable product, I don't know if any of the wondrous things I've
> > programmed really could lead to a salable product.
> The only way to find out is to put it up for sale.

I'm totally not comfortable offering for sale some software that I
am the **only** person who has ever seriously tried to use it. For
example, I have yet to find even one person in the local area who
has time to try my flashcard-drill algorithm and tell me what
he/she thinks of it. (It's been online since late 2002.) At present
it has flashcard decks for learning how to choose and spell the 900
most common words in context, where the context is mostly my own
particular choice of lyrics from some of my favorite romantic
ballads. I would want to customize it to include the customer's
personal choice of context before I sell it to anyone. Why would
anybody want **my** choice of song lyrics? When I used an earlier
version of this program (ran as desktop GUI application on
Macintosh Plus using Macintosh Allegro Common Lisp version 1.2.2,
whereas current version runs over CGI on CMUCL on FreeBSD Unix) to
teach my two children how to develop that skill when they were not
yet in first grade, my personal choice of song lyrics, songs I had
played for my children, was just fine. But I'm really not
comfortable trying to sell my application with *that* set of song
lyrics samples to the general public, especially since some of the
lyrics are still under copyright and I have no idea which those are
so I have no practical way to purge them without deleting my entire
corpus. Other flashcard decks with the current program including
pre-reading choosing single letters missing from Barney Dinosaur
lyrics, which worked very well for my 4-yr-old daughter to warm her
up enough for the rest of the program. Also I have several ESL
decks I developed more recently than 2002. I can't see why
*anybody* would want to buy the current program as-is with my
current hodge podge of useful and sample flashcard decks.

> > By the way, Tuesday of last week I gave a partial demo of some of
> > my computer-assisted instruction software, and the person seeing
> > the demo seemed to think it would be of interest to two of his
> > technical acquaintances, one very far away (about 1000 miles) and
> > one closer (about 40 miles), so he'll be contacting them to try to
> > get them interested.
> Excellent. Best of luck.

Not excellent at all. He didn't have time to get a complete demo,
and since then a month has passed and he's been too busy for the
rest of the demo. So far he's seen the 2001.Jan SEGMAT demo by
itself (for guessing answers to riddles), and the
optmimum-flashcard-sequence algorithm by itself (for reading out
loud single letters and memorizing an accompanying usage), but
hasn't yet seen SEGMAT and optimum-sequence together, for all the
flashcard decks I talked about in the previous paragraph.

What part of the world are you in? Is there anybody in this part of
California who would like a full demo?

thomas...@gmx.at

unread,

Sep 3, 2008, 1:50:47 AM9/3/08

On 12 Aug., 18:36, jaycx2.3.calrob...@spamgourmet.com.remove (Robert

Maas, http://tinyurl.com/uh3t) wrote:
> > > - Does Seed7 include a parser that reads Seed7 source-code syntax
> > > (from an input stream such as from a text file, or from the
> > > contents of a string) and produces a parse tree (as a pointy
> > > structure)?
> > From: thomas.mer...@gmx.at
> > Actually there are the functions 'parseFile' and 'parseStri' which
> > can be used to parse Seed7 source programs into a values of the
> > type 'program'. I just added a short description about the type
> > 'program' to the manual. See:
> >http://seed7.sourceforge.net/manual/types.htm#program

> For example, in Lisp, ...
[snip]
I thought that I already sent a response. But now I cannot find it.
Therefore I send it now.

> > It is possible ... request the code of a 'program' value in a
> > structured form.
>
> Oh, why didn't you show me *that* in the first place? So a
> 'program' object isn't actually a parse tree, it's something else,
> and you have to do something additional to get it converted to
> parse tree (structured form as you call it)?

IMHO parse tree is a raw form of a program. In the structured form
type checks, overload resolution, expansion of abstract data types
and templates already has happened.

> So will you show where *that* feature is documented?

As I already said it is explained at:
http://seed7.sourceforge.net/manual/types.htm#program
It is possible to get a list of objects declared in the 'program'
with the function 'declared_objects'. This list has the type
'ref_list'.
The type 'ref_list' is explained here:
http://seed7.sourceforge.net/manual/types.htm#ref_list
A simple 'for' loop can iterate over the elements of a 'ref_list'.
An element of the 'ref_list' is represented with the type 'reference',
which is explained here:
http://seed7.sourceforge.net/manual/types.htm#reference
A 'reference' can refer to any object in the program.
To find out to which category a refered object belongs you can
use the function 'category' which returns a value of the type
'category':
The type 'category' is described here:
http://seed7.sourceforge.net/manual/types.htm#category

>
> > It is not my intend [sic] to support programs which manipulaten [sic]
> > their own code as it is done with "self modifying code".

intent
manipulate

When I write a mail some sentences are rewritten several times...

> Do you take the extreme position that once you "start" a program,
> it's too late to add any more code to it? You must load everything
> (all code) you will ever need once at load time, and *then* start
> executing?

Not exacly. I want to do such things in a structured way.
The concept of variables and constants can be used for this.
A constant is granted to be unchangeable during program
execution. Therefore it should not be possible to change
the code of a constant function during the runtime.
OTOH the code of variable functions could be changed
at runtime. Except for closures the concept of variable
functions is currently not supported (but will be some
day in the future).

> > Loading a library at runtime as a way to introduce new statements
> > for the program which is currently running IMHO makes no sense.
>
> You have a thing about "statements". I agree that adding new infix
> operators or anything else that affects the parsing of program
> source doesn't make a lot of sense most of the time. But what's
> wrong with simply loading new FUNCTIONS that don't change the
> syntax of anything??? What if at start-up time you don't know which
> libraries you'll need, you have to first see what data you're
> working with, and you have a milllion different modules you *might*
> load, and it would take forever to load all of them every time you
> re-start your program when you'll only use maybe 50 of them in any
> given run of the program.

Something like dynamic linking functions of some library.
Currently this is not supported. To fit to the Seed7 concept
such things need to be done in a structured and type safe way.

Yes. I now use the following sentence to describe the situation:

Naturally side effects of the right operand of the 'and' and 'or'
operator only take place when the operand is executed.

> > > (or (integerp x) (error "X isn't an integer")) ;Lisp equivalent
> > > > The result an 'integer' operation is undefined when it overflows.
> > > That's horrible!
> > AFAIK many languages such as C, C++ and Java have this behaviour.
> > I would like to raise exceptions in such a case, but as long
> > as there is no portable support for that in C, Posix or some
> > other common standard, it would be hard to support it with
> > satisfactory performance.
>
> So you're completely begging the question as to portability? If C
> doesn't make something portable, then you don't make it portable
> yourself?

I use C as base to reduce the implementation effort.

> When you really write a programming language, *you* write the code
> generators for various CPUs you intend to support. *You* control
> how low-level machine functions such as interrupts will be handled,
> by directly invoking operating-system and/or CPU facilities for
> them, instead of calling something written in C to do all the work
> for you.

I will not hinder you implementing Seed7 this way.

> Oh well, it's nice that you're honest and admitted what you do. So
> I guess it's easy enough for an application programmer to restrict
> all arithmetic operations to what won't ever overflow, assuming you
> have a portable MIN_NEG_INT and MAX_POS_INT documented so that the
> application programmer knows what the rules are for avoiding ever
> getting overflow.
>
> > > While 0 ** 0 which is mathematically undefined is *required* to return 1?
> > This behaviour is borrowed from FORTRAN, Ada and some other
> > programming languages which support exponentiation.
>
> Why do you borrow things that are flat-out wrong mathematically??

[snip]

There is a detailed discussion about this topic here:
http://en.wikipedia.org/wiki/Exponentiation#Zero_to_the_zero_power
There is a long list of reasons why 0 ** 0 should be treated as 1.
The list why it should be treated as undefined is much shorter.
The programming languages that evaluate 0 ** 0 to 1 include also:
bc, Haskell, J, Java, MATLAB, ML, Perl, Python, R, Ruby, Scheme, and
SQL.

> > ... The values referred by pointers and the values refered by
> > interface types are not managed automatically.
>

> That seems to be a step backward ...

As you probably know there are many possibilitys to manage data:

- Global data which exists as long as the program runs
(E.g.: An array which contains a conversion table or a string to
which lines are appended and which is written to a file before
the program ends). This needs no management but it could be feed
automatically (by processing the list of global declared data).

- Local data which exists as long as a function or block executes
(E.g.: Integer or string variables or value parameters declared
in a function). When the function or block is left it is easily
possible to free local data, even if it is complicated like an
array of structs with strings as elements. It is clear that
carrying pointers into such local data outside of the function is
something illegal, but the compiler could check for such things
by comparing the lifetime of the pointer and the lifetime of the
data.

Note that although this is stack oriented technic, the actual values
can be at the program stack or at the heap and they can also grow in
size (Strings with variable size can be implemented with this
technic).

The technic above assumes that there is a 1:1 relationship between
the variable (or constant or parameter) and its value. But not
everything can be described this way. Sometimes it is necessary
that several variables can refer to the same value. Most programming
languages solve this problem by using pointers (or references). The
variable points to the value which is at the heap.

- When it is possible to differentiate between one owning pointer
and several non-owning pointers aggain a stack oriented
management can be done, using the owning pointers. The non-owning
pointers would not cause any memory management. This concept
of two pointer categorys is natural when there is a hierarchical
data structure which is processed with non-owning pointers.

- When no pointer can be identified as owner, and no pointer cycles
exist (the data structure is a tree), reference counting can be
used. File classes can be implemented this way, as a 'write' to
a file which ends up writing recursively to itself (in a cyclic
data structure) would be very unusual.

More complicated situations must be managed by hand or with a GC.
If a GC concentrates on data which cannot be managed in a simpler
way it is probably much faster. Therefore I see the philosopy to
let the GC manage all data as the wrong way for memory management.
Currently the Seed7 implementation does not have a GC, but it can
be added in the future (to manage just data which cannot be managed
in a simpler way).

In Seed7 several container types are present. Therefore many uses
of pointers (to manage lists, trees, hash tables, etc.) are not
necessary. In Seed7 such containers should be used and that way the
use of pointers is reduced as much as possible.

Generally a program consists of data and code (for the moment we
ignore the fact that some programming languages can handle code as
data). There are interesting parallesisms between data and code
structures.

array of alike objects <---> loop with a statement in the body
struct of different objects <---> block with different statements
union type <---> switch or if statement
type declarations <---> function declarations
use of a defined type <---> call of a function
pointer type <---> goto statement

The unrestricted use of gotos leads to spaghetti code which is hard
to manage. Structured programming tries to reduce the use of gotos
by replacing them with structured statements like 'if', 'switch',
'while', 'repeat' (or 'do') and 'for'. In a similar way pointers
can be replaced by (dynamic) arrays, hash tables and other container
types.

> <http://groups.google.com/group/comp.lang.lisp/browse_thread/thread/d1...>

>
> One nice thing about CGI applications is that you always re-start
> the application for each new HTTP transaction, so you really don't
> need a garbage collector. A reasonable amount of memory leak isn't
> anything to worry about. So long as you don't allocate all the
> available memory in your entire virtual address space within a
> single HTTP transaction, you're OK.
>
> So Seed7 might find a niche in CGI where nobody can complain about
> the fact that you don't have any automatic way to reclaim memory
> that was allocated (not on the stack) and then discarded (pointer
> re-assigned).

Greetings Thomas Mertes

Jon Harrop

unread,

Sep 3, 2008, 11:18:05 AM9/3/08

Robert Maas, http://tinyurl.com/uh3t wrote:

I was discussing Lisp syntax but you had apparently meant nested lists.

>> How about something of practical relevance: custom grammars.
>
> Does there exist an OCaml library which converts a BNF
> specification into a corresponding parser function?

The ocamlyacc compiler compiler serves that purpose.

>> Consider parsing Mathematica expressions including lists, rules,
>> sums, products, powers and factorials with associativity and
>> precedence. In OCaml:
>> open Camlp4.PreCast;;
>> let mma = Gram.Entry.mk "mma";;
>> EXTEND Gram
>> mma:
>> [ "rule" LEFTA
>> [ e1 = mma; "->"; e2 = mma -> `Rule(e1, e2) ]
>> ...
>> END;;
>
> Does there exist an OCaml library which converts a BNF specification
> into the above OCaml code, which can then be compiled and executed?

Yes.

>> >> Macros are very rare in modern languages.
>> > Why?
>> Two reasons:
>> . Macros are used primarily to work around deficiencies in the
>> language's design, ...
>
> It is impossible for a language, as delivered, as standardized, to
> already contain every syntax mechanism any user-programmer will
> ever want to employ to make writing code easier. Consequently every
> language ought to have Lisp-style macros (parse-tree
> transformations) to allow ordinary programmers to extend the set of
> allowed just-right parse-tree formulations beyond those which are
> supplied by the standard/vendor. It is a gross deficiency that any
> language is missing such parse-tree transformation capability.
> These other (non-Lisp) languages already allow user-programmers to
> define new functions to extend the functions given by the
> vendor/standard. The OOP languages already allow user-programmers
> to define new classes with associated methods to extend the class
> hierarchy given by the vendor/standard. So why don't they also
> allow user-programmers to define new transformations from
> source-file syntax to what-actually-gets-compiled syntax?

You are completely neglecting the other side of the argument: macros come at
a grave cost in terms of other features like static type checking.

> Just one example: The new LOOP special-form is a wonderful addition
> to Common Lisp, and was implementable by ordinary user-programmers
> before it became part of the standard, merely by defining a LOOP
> macro to transform the LOOP special form into ordinary supported
> source code. Why don't other languages allow enough user-programmer
> power to define an equivalent LOOP macro and thereby allow ordinary
> user-programmers to implement a LOOP special form in those other
> langauges?

You can do that in OCaml but OCaml programmers choose not to because OCaml
also provides better alternatives.

>> Using macros forks the language and makes maintenance much harder.
>
> True, but defining functions forks the language in almost exactly
> the same way.

No. You can define as many functions as you like without undermining the
static type system as macros do.

>> Tools no longer work transparently with source code.
>
> If the source code contains declarations to cause an additional
> library to be available at compile time to support source-code
> parse-tree transformations, but the tool fails to notice these
> declarations and hence fails to work correctly when such "macros"
> are used in source code, that is a deficiency in the tool. Use a
> better tool, don't cramp the language because your stupid tool
> doesn't work on the full range of the language syntax.

Then you have burdened the complexity of tools that act upon code.

>> The parse tree is no longer relevant in modern languages.
>
> If you are going to allow user-programmers to extend the set of
> parse trees that can be compiled, by adding new parse-tree
> transformations, then the parse tree is totally relevant!!

No, the parse tree is completely abstracted away in modern systems.

> Just
> because your favorite language doesn't support parse-tree
> transformations doesn't mean the technique is worthless. Lisp
> doesn't support SQL with relational databases natively, it needs to
> be added by a library, but I don't consider relational databases to
> be worthless because of that. So why do you consider parse-tree
> transformations to be worthless?

There are much more powerful alternatives to manipulating the parse tree
directly.

> Do you also consider domain-specific laguages (DSLs) to be worthless??

No.

>> In all modern languages, the parse tree (the internal
>> representation used by the compiler and macro system) is irrelevant
>> due to the use of quotations.
>
> What do you mean by "quotations" in this context?

An abstraction of the parse tree that is designed to be more readable.
Quotations are found in many languages including OCaml, Mathematica and F#.

>> Look at the following equivalent of UNWIND-PROTECT:
>> # EXTEND Gram
>> expr: LEVEL "top"
>> [ [ "try"; f=sequence; "finally"; g=expr ->
>> <:expr<
>> (function
>> | `Val v, g -> g(); v
>> | `Exn e, g -> (try g() with _ -> ()); raise e)
>> ((try `Val $f$ with e -> `Exn e), (fun () -> $g$))
>> >> ] ];
>> END;;
>
> I can't make heads or tails of what you wrote there. I see
> half-meaningful pieces tangled together in a meaningless (to me)
> mess, and I have no idea what that block of "code" is really
> saying. Part of my problem seems to be that you're using a non-BNF
> form of expression of grammar production rules, which I don't
> understand.

This is an abstraction of ordinary LL parsing, implemented internally using
recursive descent.

> Perhaps whenever you are presenting such examples, you
> should post the URL of a tutorial that explains that particular
> grammar-production syntax, what it all means. Also you need to post
> the URL of a playpen for testing my understanding of that syntax,
> so that I can try my hand at writing my own grammer and checking if
> I got it right by parsing what I believe to be sentences of my
> grammar and see if the playpen parser agrees with me as to how the
> sentence should parse.

Google for camlp4 tutorials.

> By the way, either "whereis ocaml" nor "man ocaml" turns up
> anything here. Is that the correct name that would be used for
> OCaml on Unix, so I know it's not installed here? Or should I be
> looking under some other name? I tried both commands with
> capitalized "OCaml" but neither of those turned up anything here
> either.

You need to install ocaml, e.g.:

apt-get install ocaml

>> The input source code is represented by a sequence of tokens and
>> the associativity and precedence of the new grammar rule, so the
>> programmer is no longer forced to write without syntax as you do in
>> Lisp.
>
> So with all this extra power, it should be **easy** for you to write
> an extension to OCaml source-code syntax to support something like the
> new LOOP special form in ANSI Common Lisp, right?

If I wanted that, yes.

> Let's say you
> wanted to make a toy demo of this capability by implementing only
> something like (LOOP FOR el IN ls COLLECT (sqrt el)) where I've
> capitalized the LOOP keywords and left the ordinary source code as lower
> case to make the syntax clear. How would you design the corresponding
> OCaml syntax extension? I'd guess something like:
> LOOP FOR el IN ls COLLECT sqrt(el);

If I understand that correctly, I would just write:;;;

map sqrt ls

instead.

> Is that the exact syntax you'd choose, or something different?

My concern is about types rather than syntax.

> In
> either case, how would you implement that syntax extension in
> OCaml, to achieve exactly equivalent semantics to what Common Lisp
> provides?

I would certainly design something better than Lisp, e.g. typeful.

> For example:
> (LOOP FOR el IN '(1 2 3 4 5) COLLECT (sqrt el))
> => (1.0 1.4142135 1.7320508 2.0 2.236068)
> (LOOP FOR sr IN * COLLECT (* sr sr))
> => (1.0 1.9999999 3.0 4.0 5.0) ;If I had my way, the default mode
> ; would be interval arithmetic
> ; instead of floating-point
> ; shot-in-dark, but that issue
> ; isn't relevant to this discussion.
> Note your syntax extension must work for *any* local variable
> instead of just 'el', *any* given object in place of 'ls', and
> *any* valid form involving the local variable, not just (sqrt el)
> or (* sr sr).

The nearest useful equivalent would be a list comprehension like:

[ for el in ls ->
sqrt el ]

F# has these built in and OCaml has macros implementing them.

You basically just inject a new grammar rule for an expression:

| "["; "for"; x=patt; "in"; s=expr; "->"; f=expr; "]" ->

Then you write a quotation in ordinary OCaml syntax that is transparently
translated into the internal parse tree:

<:expr<
List.map (fun $x$ -> $f$) $s$
>>

That would only work over lists. If you want it to operate over any object
that implements a "fold_right" function then do:

<:expr<
$s$#fold_right (fun $x$ t -> $f$ :: t) $s$ []
>>

>> The action is represented by a quotation <:expr< ... >> that
>> contains ordinary OCaml code so, again, the programmer is no longer
>> forced to write without syntax as you do in Lisp.
>
> That remark is a first step toward a tutorial and playpen for this
> feature of OCaml, but until the full tutorial+playpen is available
> for me to play with, I won't be able to really understand what
> you're talking about and what your code fragment means.

This functionality has been widely available and used for many years. There
is already a lot of tutorial information out there and working examples for
you to study.

>> Pattern matching is also instrumental here: you can quote code in
>> patterns and in expressions so you can rewrite OCaml code without
>> having to deal with the low-level parse tree at all.
>
> I'm not granting your point just yet, but it sounds like this
> feature *might* be a valid alternative to source-code parse-tree
> transformation. At this point I really hope you do set up a CGI
> application that lets me play with this feature, together with a
> tutorial that explains what I should be trying, starting with the
> simplest case and working toward more complicated cases.

Everything I have been saying has been specific to OCaml but this
functionality is certainly available outside OCaml as well. Mathematica is
a superb example of how pattern matching can be invaluable when rewriting
terms (parse trees).

>> This makes term rewriting much easier in OCaml than in Lisp ...
>
> It doesn't at all look easier than how it's done in Lisp. Maybe
> after you set up your tutorial+playpen and I spend some time
> developing my understanding of how to use that facility, I'll
> agree, or maybe I'll say it really works but is more complicated
> than Lisp's "macro" facility.

Try writing even these trivial examples in Lisp using an extensible grammar
with precedence and associativity.

>> We can now improve syntax far beyond the capabilities provided by Lisp.
>
> That's blatantly untrue. *any* formally defined grammer can be
> implemented as a parser using Lisp as the coding language,
> whereupon any such grammer can be fed as syntax into such an
> operational Lisp program. Witness how way back with MacLisp,
> Vaughan Pratt developed CGOL, and about the same time with SL
> (Standard Lisp) Tony Hearn developed REDUCE/RLISP, each an
> Algol-like syntax for Lisp sourcecode.

Greenspun. You're just reinventing solutions that are already available and
widely tested in modern languages.

>> You can implement UNWIND-PROTECT using only a higher-order function.
> (and thus avoid needing to use source-code parse-tree transformations)
>
> Please explain what you mean by higher-order function in this
> context, such that the syntax for an unwind-protect form can be
> implemented without any need to enhance the Lisp kernel to provide
> support for it. I don't believe it's possible to define
> unwind-protect in terms of McCarthy's original Lisp primitives.

A higher-order function is just a function that accepts one or more function
arguments.

>> That is also why Lisp is so verbose in practice: the burden of
>> parsing is placed upon the programmer who must write out programs
>> in what should be an internal representation.
>
> IMO this claim is totally bogus. Whenever a design pattern consists
> of a lot of boiler plate and just a little bit of variability, a
> Lisp "macro" can support a syntax that consists of a master keyword
> to name the macro-expander together with just the barebones
> variable parts and just enough parens to properly nest the variable
> parts, which expands into the full boilerplate-plus-variability. If
> the OCaml feature you have mentionned for introducing new syntax
> rules works in the way you seem to imply, almost exactly the same
> can be said for OCaml. There's no way that the syntax needed to
> invoke an OCaml syntax extension would be significantly less
> verbose than the syntax needed to invoke an equivalent Lisp "macro".

Lispers restrict themselves to s-exprs and direct manipulation of the parse
tree without any common abstractions.

>> >> OCaml has a full macro system but it is rarely used.
>> > Is it a parse-tree macro system, like Lisp has, or is it a
>> > string-syntax macro system, like C has?
>> OCaml's macros act upon parse trees internally but they are not
>> imposed upon the programmer as they are in Lisp.
>
> They are not imposed on LIsp programmers.

Yes they are: you need to manipulate the parse tree as a s-expr directly in
a Lisp macro.

> They are merely available
> for anyone who might wish to define one (and supplied macros are
> available for use without even needing to know that they are
> macros). In fact I hardly ever use "macros" in my work, because the
> abstractions achieved by defining new functions are good enough for
> my needs virtually all the time, and the need to load macros before
> anyting that uses them would make my standard software development
> methodology more trouble.

Exactly.

> Without sufficient extra value to offset
> the increased trouble, I just don't bother. I don't use CLOS or
> generic functions for exactly the same reason. Ordinary bottom-up
> tool development using layers of defined functions are good enough
> for 99.99% of my uses here. But still I appreciate that macros are
> available for others to implement the LOOP special form and other
> special forms that I use all the time such as WITH-OPEN-FILE and
> WITH-OUTPUT-TO-STRING, and I sorely miss those macros in
> implementations of Common Lisp that are incomplete. In the case of
> the PUSH macro, I missed it so sorely in XLISP that I found the
> source and copied it to my patch file which I then loaded whenever
> I used XLISP. The fact that I could just load it into an
> already-running XLISP environment and voila the PUSH special form
> was available, is a big win IMO. Can your new-syntax-rule thingy in
> OCaml be used like that??

Interactively? Yes.

>> In Lisp, macros are used primarily to Greenspun pattern matching.
>
> Wrong. "macros" are used primarily to rearrange the variable parts
> and supply the boilerplate parts.

That is the same thing.

> For example:
> (push foo bar) => (setf foo (cons foo bar))

let push foo bar =
foo := bar :: !foo

> Note that SETF and CONS are boilerplate, and foo has been
> rearranged to be in two places instead of just one, and bar has
> descended to the second level of nested list. (Actually to handle
> cases where foo is a complicated form, the actual expansion
> invovles LET of gensyms for sub-forms so that each sub-form of foo
> gets evaluated just once instead of twice. The LET and gensym etc.
> are a *lot* of boilerplate that gets added, then in trivial cases
> where foo is a simple variable the compiler optimizes away most of
> that boilerplate. But the macro-expander doesn't have to worry
> about special cases, it just generates a giant boilerplate
> expression that will *always* work, making the macro very easy to
> write.)
>
> I'm curious: Does OCaml provide native support for the equivalent
> of SETF and PUSH?

SETF is the := operator and PUSH is probably Stack.push.

> Or is it easy using the syntax-extension mechanism to add such syntax?

In this case, user definable operators are sufficient and there is no need
for a macro (but it is equally possible).

> When adding syntax for PUSH, is it
> easy to make sure each sub-form of the place-expression ('foo' in
> the above example) gets evaluated just once?

Yes. If this appears in your macro:

$foo$ := $bar$ :: !$foo$

then you replace it with:

let foo = $foo$ in
foo := $bar$ :: !foo

>> you have yet to come up with a single example of a useful macro.
>
> SETF is a useful macro, whereby anybody can introduce a new data
> type that has getter and setter methods, and then a corresponding
> SETF rule can be defined, after which that new data type can be the
> 'place' parameter to setf, equivalent to the left side of an
> assignment in infix-operator languages. Does that mechanism somehow
> exist in OCaml?

There is no direct equivalent but you can construct one easily enough.

> After SETF or equivalent is available, next PUSH and POP and are
> also very useful. They can be defined in Lisp, being careful to
> evaluate sub-forms of the getter and setter expression just once.
> Is an equivalent mechanism available in OCaml?

I would just use a function, like Stack.push.

>> Finalizers are good when memory pressure is relevant, i.e. when
>> you are handling resources that consume system memory.
>
> Using a reference-count GC system, where memory is freed
> immediately when the last reference to it goes away,

That is actually a common misconception. GCs can and do collect sooner than
the scope-based reference counting you are referring to.

> this would be
> very useful, releasing the system resouce exactly at that point in
> time. But with a mark-and-sweep GC system, where memory isn't
> released until some random time later, this wouldn't be good at
> all, IMO.

Finalizers certainly work well for me (and many other people).

> Maybe somebody needs to implement a CL which uses both
> kinds of GC, in two separate heaps.

Reference counting is devoid of merit, IMHO.

> The application would then get
> to choose which heap to use when allocating new structure. For most
> applications, it wouldn't matter much which heap to use. For
> applications that need to build circular structures, the
> mark-and-sweep heap would be necessary. For applications that need
> to establish system resources and close them the moment the last
> reference goes away, the reference-count heap would be necessary to
> make things work reliably.

You don't need reference counting to do that. Look at "Dispose" in .NET, for
example.

> If choice of heap were a per-thread
> global, I think this would be maximally useful. What do you think?
> Do you know of any other languages where dual heap is employed? Or
> is this a new great idea I came up with just tonight, which will
> win me the Nobel prize eventually?

Some old languages like C++ are limping on with reference counting but none
of the new breed make use of it because modern GCs are so much better.

>> [big context snip] Embedded devices used to be PICs but they now
>> include phones and larger devices that can have substantial
>> resources.
>
> So indeed these are long-lived applications, with lots of
> dynamically allocated memory, hence a need for automatic storage
> reclamation (garbage collection). It seems to me that for *all*
> these kinds of applications, circular list structure is totally
> unnecessary, hence a reference-count system would be fully
> adequate.

Cyclic non-functional data structures are certainly not overly common but
functional data structures are often cyclic because closures refer to each
other and to themselves.

>> ... the "pre-customers" you are alluding to are either
>> unidentifiable or worthless. The vast majority of people who offer
>> you advice on the basis that they will buy your product end up not
>> buying anything and the people who do buy your product will be
>> people whom you have been completely unaware of and who have
>> completely different requirements and values.
>
> So how do I find alpha brainstormers and beta testers for early
> versions of my software so that I can get feedback to direct my
> programming efforts towards something that has a good chance of
> eventualy getting real paying customers?

You don't. Your paying customers are your testers. This sounds harsh but it
is the difference between financial success and failure.

Our software products have been in beta for a year and we are getting very
useful feedback from dozens of paying customers:

http://www.ffconsultancy.com/products/fsharp_for_numerics/?clp
http://www.ffconsultancy.com/products/fsharp_for_visualization/?clp

I believe what they are telling me because they put their money where their
mouth is and they are happy to be getting software to play with.

> I'm totally not comfortable offering for sale some software that I
> am the **only** person who has ever seriously tried to use it.

I'm afraid that is the only way forward to start with (unless you want to
get heavily into debt first).

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u

Robert Maas, http://tinyurl.com/uh3t

unread,

Sep 4, 2008, 6:37:56 AM9/4/08

> From: p...@informatimago.com (Pascal J. Bourguignon)

>>PJB> Everytime you want to apply this pattern to a new class, you have
>>PJB> to add the following methods and class, ...
>>PJB> In lisp, we write the same, but only ONCE: ...

>REM> OK, without studying the details, I get the basic idea.

>>PJB> This is possible because lisp, in addition to supporting data
>>PJB> abstraction and procedural abstraction, which any other
>>PJB> programming language followed suit, also supports syntactic
>>PJB> abstraction and meta-linguistic abstraction.

>REM> This may be the best way to show how much better Lisp is than those
>REM> other languages. The anti-Lisp folks are saying "nobody really
>REM> needs to be able to construct new source code at run time", but
>REM> this kind of pattern may be a good example of how it's a royal pain
>REM> *not* to be able to do that. I'd like to see a simpler example of
>REM> this sort of need, where it's not huge that I simply don't have the
>REM> energy to study the details. After all, if *I* am turned away by
>REM> the immensity of the example, I'm sure the anti-Lisp folks will
>REM> turn away too, and the lesson won't be learned.
PJB> And when we keep writting [sic] small examples, we're dismissed
PJB> because they say it works only on toy problems.

We use a totallly toy example, basically a "hello world" example
for the new technology, just to *start* the process of teaching, to
get the person to understand the basic idea of the new technology
being learned. But after that first-conceptual HelloWorld example,
then we need to follow with elaborations from that example to make
it less and less toy and more and more possibly useful. See for
example my progression of teaching how to install CGI applications
from the basic CGI-hello-world through three steps to full
urlencoded-form-contents decoding:
<http://www.rawbw.com/~rem/HelloPlus/hellos.html>
then I've started to extend that toward tools that help actually
doing something useful, including both a datatype matrix (designed
before I realized that intentional datatypes was better, this needs
to be re-organized per intentional datatypes someday, but right now
it looks like this):
<http://www.rawbw.com/~rem/HelloPlus/CookBook/CookTop.html>
and then specific code examples of useful-for-CGI higher-level
functions (decoding strings that represent integers, and setting
and reading cookies) within standalone CGI-demo applications:
<http://www.rawbw.com/~rem/HelloPlus/CookBook/h4s.html>

By comparison there's an otherwise fine book using your strawman
hello-only-examples that one of my instructors loaned me:
Visual Basic .NET, HOW TO PROGRAM (Second Edition)
by Deitel, ISBN 0-13-029363-6
which is full of "hello world" equivalent programs, each of which
takes several pages of printed text. For example, pages 846-850
contain the listing of a program that does nothing more than read
in an XLM document to form a DOM and display the tree structure
on-screen, and then do *nothing* further with that data. (The
bottom half of the last page is a screen shot, so actually that's
"only" 4.5 pages of code.) After spending so much printed-page
space on trivial examples that show the extreme verbosity needed in
VB.NET just to get the most basic stuff done, the book is already
more than 1500 pages, so of course there's no way they could have
*also* shown, or even described in general, how to extend those toy
examples to anything significant. On the net, with Web-based
tutorials, there's no such restriction. If a Web site with all the
elaborations on each toy example gets to a million
printed-page-equivalent, who cares?? You explore the branches you
need at the moment, and ignore the rest until later if and when you
need any of them.

> I can reassure you, professionnal programmers are used to work on
> programs between 100 KLoC and 10 MLoC, so they can understand the
> point of a technique allowing them to reduce the complexity of the
> code and therefore the number of bugs.

When only the top few levels of an application, plus some utilities
that weren't previously needed, are new code, and all the rest are
just library functions from work already done months or years ago,
who is to say how much of the totality of used code is to be
counted in measuring the size of a "program". With proper re-use of
code already in libraries, I see no reason any single new "program"
should involve a hundred thousand new lines of code that have to
all be simultaneously kept in the mind of a single programmer, such
that it would be completely true to use the language "work on" as
you did above. It should be necessary to "work on" only the new
code, not the old code already in existing libraries, won't you
agree?

I did a check just now as to the sizes of some of the software
modules I've written in recent years. The largest complete and
successful module is 2007-4-spamu.lisp which is about one thousand
lines totalling about 50k bytes. It was the largest of several new
modules I need to write for a medium-sized new application. So I'd
estimate about five thousand new lines of code for all the new
modules that I needed to write for that one application. But I was
doing it for myself, not getting paid, so of course you won't
consider it "professional", because "professional" means getting
paid to do it, right? That entire directory, all the CommonLisp
code I've written since late 2000, for a whole bunch of different
small to medium to large applications, totals 2.7 megabytes, which
reduced by that bytes/lines=50 scale factor gives an estimate of 54
thousand lines of code, only half of your *minimum* LOC estimate
for "professional" software, and that's for *all* the various
applications over nearly 8 years, not just one "program". So
there's a huge discrepancy between the "professional" code you cite
and the **useful** code I have written. I don't know what it means.
Is it that in Lisp you can re-use code a lot better than in other
languages, and the core language gives you much more compact ways
to express algorithms in the first place, resulting in much lower
LOC in code written for Lisp applications than for "programs" in
whatever language you are talking about that was used in
"professional" programs? Is it that people getting paid for their
work tend to do a lot more work than is really needed, in order to
charge more hours they get paid for? Is it that companies hiring
those people to write programs suffer from the "every bell and
whistle we can imagine ought to be in every program we sell"
syndrome? I don't know. Can you reconcile these two figures
somehow? Maybe a combination of factors I guessed at? Or something
else I haven't even come close to guessing?

Robert Maas, http://tinyurl.com/uh3t

unread,

Sep 4, 2008, 6:55:54 AM9/4/08

REM> Now there are actually two ways to solve the main/common part of a
REM> set of similar tasks once:
> From: Willem <wil...@stack.nl>
> Three, sir. ;-)
REM> -1- By writing a module that you call with various parameters to
REM> specify the parts that vary from task to task;
REM> -2- By solving just one task completely, then using copy&paste with
REM> micro-edit to create variants of the code for each new but similar
REM> task.

> -3- By solving just one task completely, then adding parametrized code
> to the existing solution to create the variants as you go along.

Indeed, that's one I had overlooked when posting previously. In
Common Lisp, keywords parameters are a great tool for adding new
variation in the call that wasn't present originally, with code set
up so that the default value of the parameter gives the original
behaviour that existing code depends on continuing to work the same
way, and other values give variant functionality needed by new
callers.

If while getting ready to copy&and&paste you realize keyword
parameters will work just as well, but with better efficiency in
regard to maintainence, then indeed that's the way to go. But if it
takes more work to figure out a good way to parameterize than it's
worth, copy&paste method seems best for the moment. Maybe later it
can be parameterized easily, maybe not. Meanwhile, the job has to
be delivered, and you don't need to have a heart attack over the
stress of feeling you *need* to parameterize it when it's really
too difficult.

Copy&paste is basically using the original code as a template for a
"design pattern", as opposed to either parameterize-from-the-start
or add-keywords-parameters where it's a "tool" instead.

> (And to support your point:
> some languages are more easily parametrizable than others)

Common Lisp with keyword parameters is one if those more easily
parametrizable languages IMO.

By the way, a few years ago I found an essay by the folks who
created Yahoo! Stores telling how incredibly valuable keyword
parameters were for supporting rapid installation of new features
as the service grew. Unfortunately I don't have the URL handy.

Robert Maas, http://tinyurl.com/uh3t

unread,

Sep 4, 2008, 7:18:33 AM9/4/08

> From: Richard Heathfield <r...@see.sig.invalid>

> RH> In any (even relatively sensible) language where it takes
> RH> /more/ than one line of code, it only takes more than one line
> RH> the /first/ time - because the programmer will wrap it into a
> RH> function (or procedure, or subroutine, or whatever) and, in
> RH> future, call the function (or whatever). So this conciseness
> RH> argument is a mere canard, as all program tasks eventually boil
> RH> down to: call existing_solution
> RH> which is a one-liner in pretty well any language.

>REM> Rebuttal: What if you need to do lots of new different things once
>REM> each?
RH> Rerebuttal: It's never happened yet. Most new different things
RH> turn out to be mere rearrangements of old same things.

Indeed that's the true value of copy&paste methodology of writing
semi-new code: Very quickly you can grab an old piece of code that
did something similar (assuming you can find it quickly) and
copy&paste the parts that don't need changing and type in just the
new parts, and it's a lot faster in many cases than trying to
parameterize the old code so that it works for both uses.

Of course if you can see a way to parameterize it easily so that
anticipated future variant uses can just change the parameters
without so much copy&paste as the above implies, then that might be
the way to go, if you're not hitting an immediate deadline.

> The Preacher had us bang to rights (Eccles 1:9).

Nitpick before I go look up the reference: There have been many
preachers, not just one, in all of history. So I guess that's your
pet name you gave your favorite preacher, rather than a proper
English construction.

OK, I looked up the reference, and it's exactly what I guessed.
However I believe that passage in its day was meant to be taken
literally, that there's nothing new at all, not even a variant on
what happened before. The scientific revolution when inventions
come a hundred a day is quite recent. In those days, it might be a
century before the next significant invention happened, so within a
single human lifetime of 25-30 years there was often not a single
new invention during the whole lifetime, and that passage would
apply exactly for that lifetime. A few years ago I read Isaac
Asimov's biographical history of science and technology and it was
interesting to see how widely spaced inventions were until modern
times. If a single person, such as Archimedes, made or became aware
of and started using more than one significant invention during a
lifetime, that was remarkable.

Accordingly that statement seems to be an overstatement per our
topic here. The other famous quote, that all fictional works are
based on seven basic themes, is more like the appropriate quote
here, indicating recurring basis but novel variation each time,
although its use here would be metaphor. I wonder if Shannon ever
wrote about this topic. He would have uttered something that is
both accurate for this topic and not a metaphor from literature. Or
was he too early to witness the "explosion" of software that
happened shortly after the first mass-produced computers?

Richard Heathfield

unread,

Sep 4, 2008, 7:35:25 AM9/4/08

Robert Maas, http://tinyurl.com/uh3t said:

>> From: Richard Heathfield <r...@see.sig.invalid>

<snip>

>> The Preacher had us bang to rights (Eccles 1:9).
>
> Nitpick before I go look up the reference: There have been many
> preachers, not just one, in all of history. So I guess that's your
> pet name you gave your favorite preacher, rather than a proper
> English construction.

You guess wrong. If you do a little more research, you will discover why.

stan

unread,

Sep 4, 2008, 9:26:22 PM9/4/08

Robert Maas, http://tinyurl.com/uh3t wrote:

<snip>

>> I can reassure you, professionnal programmers are used to work on
>> programs between 100 KLoC and 10 MLoC, so they can understand the
>> point of a technique allowing them to reduce the complexity of the
>> code and therefore the number of bugs.
>
> When only the top few levels of an application, plus some utilities
> that weren't previously needed, are new code, and all the rest are
> just library functions from work already done months or years ago,
> who is to say how much of the totality of used code is to be
> counted in measuring the size of a "program". With proper re-use of
> code already in libraries, I see no reason any single new "program"
> should involve a hundred thousand new lines of code that have to
> all be simultaneously kept in the mind of a single programmer, such
> that it would be completely true to use the language "work on" as
> you did above. It should be necessary to "work on" only the new
> code, not the old code already in existing libraries, won't you
> agree?

I would agree that code reuse is a good concept and tool. Software
metrics serve many purposes; not all of them are good. Comparing loc
between different languages is about as useful as comparing chocolate
milk to plywood.

Could it be that "professional" programs mentioned above are orders of
magnitude more complicated than the ones in your directory? Have you
reimplemented Vista, Word, Excel, Access, Apache, Mozilla, or maybe
Oracle?

When I look at the shell scripts I've written in the last 10 years it's
significantly smaller than 54K lines. Does that mean Bash is the
ultimate programming language? Or is it something that I'm not guessing?

Robert Maas, http://tinyurl.com/uh3t

unread,

Sep 7, 2008, 7:46:49 PM9/7/08

> From: thomas.mer...@gmx.at

> Guess what my favorite language is?

I asked Google Groups to show me the last 100 articles you posted,
and it showed me several articles in comp.lang.c, a few articles in
comp.lang.c++, and just one in comp.lang.basic.misc, so I would
guess that C is your favorite language.

> Besides this, not all concepts make sense when transfered to
> other languages.

I agree. However a lot of data/processing tasks are of common types
that *ought* to be do-able by *any* general purpose langauge. So
there are two different ways to compare languages:
- How well do they make it easy to write software for the vast
majority of D/P tasks, i.e. all the common types of tasks.
- What specialized/unusual types of tasks do they *also* support
especially well, by mechanisms that are much better for these
specific types of tasks than the general mechanisms are.
Now it's a matter of personal judgement which tasks should be
considered "common" and which "unusual". IMO at the very least the
following ought to be considered "common":
- Numerical calculations, including linear algebra, trigonometry, ...
- STDIO (Standard Input/Output, i.e. from/to the controlling terminal)
- File I/O, both text and binary
- GUI (Graphical User Interface, i.e. mouse/trackball/whatever and keyboard)
- CGI or alternative (JSP/ASP/PHP/...) for Web server applications
- Text editing, both plain-text and "word processing"
- Interface to relational databases
- Dynamically-allocated pointy structures (data points to other data)
- Ability to effect new intentional data-types by re-interpretation
of existing true data types (this is trivial in most languages)
I would also like some/most/all of the following to be included:
- Ability to define new true data types, thereby converting an
intentional data type into a true data type, thereby allowing
the different data types to be mixed in a container without
confusion (this is trivial in any language that allows
definition of new classes in the sense of OOP)
- Ability to interface to OS API, such as for "directly" controlling devices
- Ability to compile to native-code libraries, and later load&link them
Alternatively: Ability to compile to efficient byte-code libraries, ditto
- Ability to develop (write and test) code on a line-by-line basis
- Ability to dynamically load libraries into an already-running application
- Multi-threading: system-supported true threads and/or "green-threads"
- Ability to receive and process command-line parameters (CLP)
- Ability to invoke and control sub-processes that run other programs with CLP
- Ability to open&use local streams to/from other processes on same machine
- Ability to open&use network sockets for TCP (and optionally UDP)
- Formal mechanisms for distributed applications, such as RPC RMI SOAP etc.
- Compiling/assembling to generate a stand-alone "executable" for a target OS
- Options for managing dynamically-allocated pointy structures:
- Don't bother (for short-lived applications that won't run out of memory)
- Manual freeing (when programmer can "know" precisely when to free memory)
- Reference count (efficient for applications that don't have pointer loops)
- Advanced mark-and-sweep (required to avoid "memory leak" of pointer loops)
- Ability to build new functions/closures/continuations at runtime
- Ability to extend the syntax of the language, either directly at
the top level of processing source, or indirectly by explicitly
calling a parser whose output (parse tree) can then be used to
build executable code and then install it in the already-running
application; thus the ability to effect a DSL (Domain-Specific Language)
fully integrated into the standard language.
- Ability to be called from a standard FFI (Foreign Function Interface)
- Ability to load and call other-language procedures via a standard FFI
Feel free to suggest anything I overlooked that (IYO) ought to be included.

Note that "ability" above means that an ordinary application
programmer can write the code to effect it without needing to
request an addition to the core language.

Note that all these features don't need to be incorporated in the
core langauge. It suffices that they are straightforward to
implement in user-level code which can be incorporated into
applications or provided in libraries.

Putting those two together, it means that an ordinary application
programmer can write clean code to define a library for doing the
particular kind of task and providing a clean API for calling the
functions/methods of that library, then an ordinary application
programmer will find it easy to invoke functions/methods in that
library to accomplish specific instances of that kind of task. Of
course if there's already a provision in the core language or
vendor-supplied libraries to provide an easy-to-use API for that
kind of task, the first half is already done. But it makes no
difference to me whether that first half is already done or is just
straightforward to do, it still counts. But if extreme hackery is
required to do it at all, that doesn't count.

> Languages have a paradigma. All things that fit to a paradigma
> can be added. Things that do not fit to a paradigma are hard to
> add.

IMO most/all of what I listed above *ought* to be doable in a
straightforward way in *any* general-purpose programming language
"worth its salt".

> I would not try to add overloading or infix operators with
> priority and associativity or compile-time static type checking
> (in the way this things are used and implemented in Seed7) to LISP.

In any language, including Lisp, which supports the ability to
incorporate a new parser and use it to process new-syntax
source-code input, that should be possible if desired. For example,
here's how it might be done in Lisp:
- You have in mind a module that you would like to write in the
compile-time static type checking style. You have in mind the
syntax you want to use to represent that style of coding.
- You write a parser for that syntax.
- You have in mind an implementation strategy. For example, you
might cross-compile it to C or assembly language, build it to
library, and then load it in via FFI. Or you might store all the
data for that module within a Lisp array or a system-allocated
block of RAM, and use ordinary Lisp functions to process it.
- You write code to cross-check the type declarations and usages
within the parse tree and to compile it to your target form.
There are some advantages of this method compared to switching to
another language such as C or Seed7:
- You can freely mix ordinary Lisp modules and new-syntax type-checked
modules without needing to run them as separate application processes.
- You can invent your own (domain-specific) language exactly how
you want it instead of needing to start from what Seed7 or other
language specifies, and you aren't restricted to what that other
language allows as syntax extensions.
Of course if in you're in a hurry, and you can "live" with exactly
the syntax that some other language already provides, or some tiny
extension of that language that's trivial to specify, then coding
that part of your application as a separate program which the main
program controls in some way (sub-process with PTY or other stream
or inter-job events connecting them) might be the best way. If
large structured data objects need to be passed between the main
application and the sub-application, and if something like SOAP is
available in each language, then the best way might be to convert
the data to a SOAP object and write it to a temporary disk file or
a shared-memory block, then pass the filename or handle across via
PTY or stream or event etc.

Now let's say you go all-out, writing a parser for your DSL. Now
you can replace the regular REP with a DSL-REP whereupon you can
enter statements of your DSL for direct execution just as if they
were regular Lisp. Likewise you can replace LOAD with something
that parses your DSL from disk file, and thereby load a whole file
of your DSL. Finally, you can replace the compiler function with a
DSL-compiler function, and thereby compile DSL source files to
FASL, which can then be loaded in the usual manner. All of these
uses of your DSL parser are essentially trivial in Lisp once you
have written the DSL parser itself.

REM> ... With my line-at-a-time development
REM> style, as soon as I write the line of code that tries to do the
REM> illegal operation and submit it to REP, it'll signal an error.

> So when you type the line
> (add a b)
> (I assume that 'add' can add two numbers and will fail (at
> runtime) when one of the parameters is not a number) the 'REP'
> knows if this is a legal or illegal operation?

No. All that the REP knows about this is:
- The overall form is a valid list
- Each of the three sub-forms is a valid symbol
- The first of the sub-forms has a global function binding
- The other two sub-forms each has a global value binding
At that point the REP calls the function, passing it the two
values. It's the responsibility of the function being called to
check whether its parameters are appropriate, in which case it
performs the operation, or not, in which case it signals an error
(more commonly called "exception" nowadays).

> Since you probably hate declarations with types, I assume that a
> and b are untyped global variables or parameters.

No, a and b are symbols, which are usable as:
- variables (get or set value)
- functions (call function)
- property lists (get or set or remove property associated with some key)
- printing the name
Each is in the hashtable which is the "package" called something like CL-USER,
which is how the Read part of the REP finds them in the first place.

In this context, a and b are *used* only as variables, and only the
getting ability is used, not the setting ability.

> in the general case no assignments to a and b have been written

> at the time you write the 'add' function call.

That's too general. The only place I'd ever use such a line of code
is when building the internal body of a function definition, so
that each of them is either a parameter to the function
(which I've supplied with a value at the top before I write/test the
first actual line of code within the body)
or a local variable
(which was previously set by earlier lines of code within that same
function body I'm writing).
Now if I've failed to set up a test value for a function parameter,
or I've failed to execute the earlier line of code to assign a
value to the local variable, or if I've mistyped the name of a
variable, then when I try to execute this line of code the REP will
signal an error itself. Which version of Lisp have you used
yourself where you might have the result of such an error signal?

In some cases an earlier line of code may or may not assign a valid
integer value to one of these symbols, but in that case it's part
of the logic of the program and some test has already occured for
that case. For example, a typical pattern is to use POSITION or
POSITION-IF or POSITION-IF-NOT to find the index within a string
where the end of a token occurs, except it returns NIL if the token
runs all the way to the end of the string. In that case, the
following code will make sure the ending-index of the token is
provided as the value of the variable:
(setq ix2 (or (position #\space str :start ix1) (length str)))
Variable IX1 was previously set up with the index at the start of
the token, and now IX2 is guaranteed to be the index just after the
end of the same token, thus (subseq ix1 ix2) will yield the
sub-string of the entire token in all cases.

> how can 'rep' know what values a and b will have at runtime?

Moot since it doesn't know.

> > TM-STC> Since static type checking makes run-time type checks unnecessary, ..

> > That is the big lie right there!
> This is not a lie, and everybody who understands static type
> checking will agree with me: If the compiler has checked that some
> parameter is of type 'integer' (or 'file' or 'myObject') it will
> for sure have a value of this type at runtime (see below).

That is grossly insufficient if the required type is an integer
within certain bounds, such as the index within a string (which
must be in the range of 0 to length of string minus 1 for direct
access to a single character or 0 to length of string for
specifying end of sub-string), or a prime number, or the product of
two prime numbers, or a positive integer that is relatively prime
to the totient of the product of two prime numbers, etc. If the
function you call absolutely requires a value of such a restricted
type, there's no way any compile-time type-check can make sure that
will be true at runtime.

Runtime checks will *still* be necessary upon entry to any module
via a public interface from some untrusted caller, especially when
called from the Web or other net-remote client. There's just no way
that compile checks will make sure that:
- The URLencoded form contents being submitted from the HTTP/CGI
client in fact contains a key of "PQ".
- That key is associated with a value which consists entirely of
decimal digits.
- Those decimal digits represent an integer which in fact is the
product of two odd primes.
- The URLencoded form contents also contain another key of "CERT".
- That key is associated with a value which represents a valid URL.
- That URL in fact points to a Web page of type text/plain which in
fact contains a certification that the PQ from the other field is
actually a product of two odd primes.
- That certification is public-key signed by a PQ-registration authority.
- That authority is telling the truth in this case.

Like I said, it's a BIG LIE to claim that runtime checks are NEVER
necessary if compile-time type-checking is used. So stop lying!!

> This value could be NULL (when NULL is a valid value for this
> type as it is the case for class values in most OO languages) or
> could cause some illegal operation such as a division by zero,

IMO, in any decent high-level language, the CPU would *not* be
passed an unprotected request to divide by zero. Instead, either
the CPU would divide inside an exception-trapper, or higher-level
code would notice the zero and do something *instead* of asking the
CPU to divide by zero. In either case, IMO, the decent high-level
language would signal an exception rather than just crash the
machine or OS or application. Black screen (crashed CPU) or blue
screen (crashed OS) or "application XLISP has unexpectedly quit"
are not acceptable results from a high-level attempt to divide by
zero. Depending on whether the exception is caught at a higher
level or not, the result might be continuation from the exception
with appropriate recovery actions, or entry into an interactive
BREAK loop. Stack backtrace and exit from the application is what
Java does, which IMO is only marginally acceptable.

> but it is guaranteed to be a value of the static type (as long as
> other mechanisms like initialisation and runtime checks take care
> that no garbage values get created).

How do you propose to make sure the user doesn't type in an invalid
value, or that some remote user on the Web doesn't submit an
invalid value in some field in a Web form, without runtime checks?

> Therefore the corresponding type check can be omitted at runtime.

Only for built-in types, such as 32-bit signed integer, not for
intentional types, such as odd prime.

> Other checks like a check for NULL or 0 might still be necessary.

That's a runtime check, right? So you admit you lied when you said
runtime checks are unnecessary?

> > Suppose you have a data record
> > that has ten optional fields. These ten fields can be filled in any
> > sequence, and for many applications several of them will remain
> > unfilled. Each function that uses these fields can have a runtime
> > check to make sure all the required fields are filled in, ignoring
> > the other fields not needed for that particular function. It would
> > be horrendous to set up a hierarchy of classes of objects that were
> > parameterized according to which fields were always defined per
> > that class of object, ...
> You are mixing two concepts here. Type checking and checking for
> NULL values.

To my mind, if a function requires a particular type of value, and
NULL is not allowed, then static type checking that can't tell the
difference between NULL and a valid value is *not* sufficient, that
runtime checks are *also* needed to validate the parameter.

Now in Java, this runtime check is *always* done when trying to
access a field within an object. If the check shows the parameter
is NULL, it throws an exception. If the check shows the parameter
points to a non-NULL valid object, the field is accessed.
(Static type-checking *does* prevent a third kind of case from
occurring, that the pointer is not NULL but in fact it doesn't
point to an object of the appropriate type. I agree that static
type checking is useful for *this* level of protection, but since
runtime checking is necessary anyway to protect against NULL
pointers, as well as to protect against violations of intentional
types such as prime integers, I disagree that runtime checks are
totally unnecessary.)

> > There's just no practical way for
> > compile-time static type-checking to completely replace runtime
> > checking.

> Compile-time static type-checking does not replace runtime
> checking completely.

So now you agree with me! So please stop repeating the BIG LIE that
runtime checking is unnecessary, OK? Please understand that
intentional types (such as prime number) often cannot be checked at
compile time even if they often *can* be checked at run time.

> Sometimes a runtime type check can be ommited, because of a
> static type check.

If you never have to develop new code, if you always have a
supernatural being hand you a platter consisting of complete
programs already written, and all you have to do is feed them to
your compiler then run the resultant application, that's true.

If you can yourself pretend to be such a supernatural being,
writing complete applications yourself then compiling them, that
may also be true. If you really like needing to develop new code by
writing a complete application before you can test even one line of
code, relying on compiletime checks and trace output to find your
mistakes, like I did when programming in FORTRAN cirac 1964-67,
that's fine.

But if you would like to write one line of code and immediately
make sure it's correct before distracting your mind with the next
line of code, that's not at all sufficient. You absolutely need
"runtime" type checking during your line-at-a-time development,
because their ain't no compiler seeing all your lines of code
simultaneously such that it could compare declarations one place
with usages elsewhere. Now hypothetically you could have an IDE
that automatically put a breakpoint at the start of a new function
whose framework you've written (before the body has been started),
and which has a way to supply test data for parameters, and then
each time you write a line of code the entire function up to that
point is re-compiled and executed with that test data with a
breakpoint just after the line you wrote and with that line inside
some sort of exception catcher, etc. etc. etc., really complicated
IDE, and just maybe you can use the compile-time type checks to
avoid runtime checks, if you're willing to call your IDE
JIT-compiler-with-breakpoints "compile time" instead of "run time".
I've never seen any such IDE, so REP with runtime checks are the
only way I've found availble for line-at-a-time writing+testing.
But lacking such an IDE, you need to duplicate all your compiletime
checks with corresponding equivalent runtime checks so that the REP
will work just the same as compiled code will later. But after
you've debugged the code using the runtime checks, the compiletime
checks wouldn't be needed anyway, because the code already works!!
So why do you insist the compiletime checks are needed at all??
Unless you have that super-nifty IDE I described above. Do you?
If so, is there a version availble for Java on FreeBSD Unix, so
that I could try it too? (Note BeanShell isn't like that at all.)

> Other things can only be checked at runtime.

Agreed, one of my four primary points all along.
A second point is what I said in the paragraph above, that in the
absense of a super-nifty IDE as described in that paragraph, you
need runtime checks that duplicate compiletime checks if you intend
to do interactive development of new code, after which the
compiletime checks aren't needed so why bother with them.
A third point is that given that you don't need compiletime checks
in the first place, you need runtime checks during debugging, then
you need *neither* in already working code, the extra hassle of
**requiring** declarations of type before you can use a variable is
simply a hassle of no value, a cost without a benefit.
And fourth is that compiletime checks can't deal with intentional types.

> > Note that it's *intentional* type checking that most
> > requires runtime checking. For example, you might use exactly the
> > same static compiletime type for a whole bunch of intentional
> > types, because setting up a separate static type for each separate
> > intentional type would be a royal pain.

(Note of addendum: Note the word "most" in the first line. Built-in
type checking still requires runtime checking when developing
interactively, unless a super-nifty IDE is available to
automatically re-compile and re-execute-from-the top the
partly-written function whenever a new line of syntactically valid
code has been added to it. But intentional types must be runtime-checked
*even* in fully compiled applications, not just during debugging.)

Side remark preface: One of the projects I've been wanting to do,
if I can find others to work on them with me (brainstorming,
testing my code, giving user-experience feedback, maybe even
writing some of the code), is a no-syntax method of developing
computer software. Search for that "no-syntax" key-phrase for
earlier postings, but the general idea is that instead of writing
sytnax for function/method calls you specify what input data to
pass as parameters and get a menu of functions/methods (by
description, not name) that take such parameters, organized by the
intentional type of parameters and return value, and just select
from the menu. As I envisioned it up to today, one-call-at-a-time
development would be mostly one-shot, whereby you set up test
values for parameters, specify and execute one line of code at a
time, and at some point say "that's the return value I want", and
the IDE automatically builds a function with all lines of code
whose output affects the return value. (SIde branches whose return
values never get used toward the final return value are deleted at
that point.)

Side remark main point: But in the light of what I said about the
super-nifty IDE above, I'm thinking maybe I should write it more
like a spreadsheet, where any time you change an input value or
edit an intermediate call etc. it tracks the data flow and
automatically re-executes anything whose input or definition has
changed. Furthermore, *after* you have gotten all your dataflow
calls (equivalent to lines of code, except this is no-syntax, there
are no "lines" of any "code" here) working, when you are ready to
complete building a complete function, you have the *option* of
introducing static type declarations for any variables where you
feel this may be of benefit, and again the spreadsheet-style IDE
will re-check to make sure your new declarations are consistent
with the operations you had already programmed. Thus you avoid the
burden of needing to declare the type of a variable before you can
assign a value to it, but still you can include static type
declaractions and thereby introduce static type checking if you
really want to. Of course if you *feel* like including some of the
static type declarations *before* you have finished writing all the
steps of your function, that would be allowed too. As a result, my
proposed no-syntax IDE would support both the kind of programming
you like (with static type declarations for *every* variable you
use) and the kind of programming I prefer (no need to ever do that
except for efficiency when doing tight fixed-precision numeric
loops), and all points between, with ease of transformation from
one style to another at any time, with a mix of styles allowed
anywhere.

Side extra remark: Most coding is simply executing D/P steps one
after another. Whether intermediate values output from one step and
fed into a later step are explicitly assigned to temporary local
variables or implicitly passed via form nesting doesn't matter,
it's the same data flow in either case. Thus (setq x (cos theta))
(setq y (+ x pi)) is equivalent to (setq y (+ (cos theta) pi)). It
makes no sense to *require* type-declaration for explict variable x
but *forbid* type-declaration for value passed by nesting. Common
Lisp provides (the <type> <valueExpression>) for declaring the type
of a passed value. Does your favorite language provide an
equivalent? Anyway, with linear chains of dataflow, no loops,
there's no chance of an infinite loop, in fact no loop at all, so
the spreadsheet model works fine without change. Even decisions
(conditional branches) work fine with a slight variation, whereby
one of the two branches simply doesn't get exercised during a given
test run, and the JOIN at the end of the branches uses the value
from just the branch that got executed. But for loops a much more
advanded variation of the spreadsheet model must be used, and
infinite loops are possible. To prevent that, I have in mind two
ideas: Allow only once around a loop per interactive confirmation
to continue; Have a maximum total number of steps to perform before
forcing a breakpoint, which has a default value that is
semi-reasonable, but which the user can change to a more reasonable
value at any time.

(splitting my long reply here as a minor topic-change occurs)

Robert Maas, http://tinyurl.com/uh3t

unread,

Sep 7, 2008, 7:54:39 PM9/7/08

(picking up where I left off)

> Your concept of static vs intentional type can be compared to the
> concept of interface vs implementation type that Seed7 has.
> There is an explanation of this concept here:
> http://seed7.sourceforge.net/manual/objects.htm

... When
you write to a 'file' in C you use the same interface ('fprintf') for
harddisk files, console output and printer output. The implementation
does totally different things for this files. UNIX has used the
"everything is a file" philosopy for ages (even network communication
uses the file' interface (see sockets)).

The person who wrote that (you?) is a bright person who groks that
concept in fullness.

For short: An interface defines which methods are supported while the
implementation describes how this is done. Several types with
different method implementations can share the same interface.

Nicely written! This is more succinct than any equivalent
explanation I recall seeing in online JavaDoc or in books used for
classes that teach Java programming.

Several implementation types can belong to one interface
type (they implement the interface type). E.g.: The types 'null_file',
'external_file' and 'socket' implement the 'file' interface.

In a way this is better than hierarchial Classes in Java which then
require *also* interfaces as a separate issue hacked into the
system as an afterthought to overcome the rigid single-inheritance
organization of the Classes. On the other hand, what if a single
value supports two unrelated interfaces? In Java it's easy to
define a Class for that type of object, inherited directly from
Object if need be, but then to declare that the Class satisfies
both interfaces, thence you require just a single variable to hold
members of that Class and you can call methods of *both* interfaces
at will on that one object. Is it even possible for a single
variable to be declared of two different Interfaces in Seed7?

type_implements_interface(circle, shape);

Hmm, maybe that's the answer to my question. So actually it's the
same (except for syntax) as how Java does it.

In the classic OOP philosopy a message is sent to an object.

Are you talking about SmallTalk there? The first time I heard about
SmallTalk, circa 1973, I immediately recognized passing messages
(e-mail messages, or datagram packets) a really stupid idea, and
what's really happening is that the facade (syntax) of passing a
message is actually a subroutine call. You don't *really* build a
**message** and then transmit it from the caller machine to the
called machine over a serial port. You simply pretend like there's
a structured message, with a primary parameter that is used for
method dispatch and the other parameters which are passed as actual
parameters to the appropriate method. For example, you don't pass
the "ADD 5" mesage to the 42 object which is a member of the
Integer class. You select the Integer.ADD(Integer) method and call
it with the object 42 as primary (self) parameter and the object 5
as regular parameter. (RMI/SOAP really does transmit a message, but
that's a different topic.)

Seed7 uses a different approach: All parameters get a user defined name.

This is a case where you need to be more clear, say whether it's
the formal parameter or the actual parameter that must be
user-named. I think you're talking about the formal parameter which
must have a user-defined name, right? Actual parameters aren't
named (although some of them might get their values directly from
named local/lesical variables in the caller's scope). Right?

If my understanding of your wording is correct, this simply means
that you've re-invented generic functions as they appear in Common
Lisp, where *all* parameters to a method are equivalent in
importance, there isn't one of them treated specially as "self"
with a different syntax for passing it to the method. Thus instead
of
parm1InOtherWordsSelfObject.methodName(parm2,parm3)
you have
methodName(parm1,parm2,parm3)
Is that correct (ignoring details of syntax)?

All decisions which implementation
function should be called can be made at compile time. To please the
OO fans such decisions must be made at runtime.

Moreso: To please the Java or C++ OO fans, dispatch on just the one
'self' must be doable at run time while dispatch on the rest of the
parameters must be done at compile time. Thus the "method
signature" includes just the rest of the parameters while the "base
class" includes just the 'self' parameter.
To please the Common Lisp OO fans, *all* dispatch must be doable at
runtime. Thus there is no method signature in an absolute sense.

... the program must decide at runtime which
implementation of the function should be invoked. This decision is
based on the [] type of the value of an object.

Need to insert word "implementation" where I put brackets, right?

Also, "value of an object" is redudant. The object is *already* a
value, either the value fetched from a variable or slot, or the
value returned from a function/method and passed by nested
expression to the method call under discussion right? Perhaps you
meant to say "value of the parameter" which is short for "value
being passed as a parameter"?

In many situations it makes more sense that a new type has an
element of another type (so called has-a relation) instead of
inheriting from that type (so called is-a relation).

So "has-a" refers to using a wrapper around an existing type of
object? If so, what you wrote is more succinct than I saw before,
so that I finally understand what that jargon means. That's
essentially what I mentionned earlier regarding converting an
intentional (ad hoc, per use) type (not known to compiler nor to
type system) to an internal (implementation) type (known to the
type system etc.). When you use a wrapper around an existing type,
you can mix wrapped and native objects in a container without them
getting confused, and you can dispatch on native and wrapped types
as you map/interate through the container. In Common Lisp with CLOS
you can even wrap around primitive types such as fixnum or
single-float. Inheritance can also support mixing different
sub-types within a container, except you can't inherit from a
primitive type, not in Java and not in Common Lisp, so wrapping is
more flexible. There are subtle tradeoffs involved here and I won't
try to finish enumerating all the tradeoffs here.

The type 'Number' can be extended to support other operators and there
can be also implementations using 'complex', 'bigInteger',
'bigRational', etc.

Question: Do all the relations between the encompassing type
'Number' and the various sub-types complex', 'bigInteger',
'bigRational', etc. need to be in one central place, or can they be
distributed all over in multiple otherwise-unrelated source files?
There's an important trade-off in this decision:
One way, the owner of the encompassing type has control over what
will be allowed, thus can better control/prevent bugs. The other
way, users who don't have write-permission on the file that defines
the encompassing type can neverthelless extend that type to include
more implementation types without needing to have the owner of the
encompassing type recompile his file.

... Further extending can lead to an universal type.
Such an universal type is loved by proponents of dynamic typed
languages, but there are also good reasons to have destinct types for
different purposes.

Ideally you can develop (write-and-immediately-test) one line of
code at a time in the dynamic context of the previously-developed
lines of code (intended as part of the same eventual function
body), without any declarations, thus using a "universal type" for
all variables. Then later you can add declarations to your code if
you feel like it, such as declaring that a particular variable will
*always* be some kind of Number.

(splitting my long reply here as a major topic-change occurs)

Robert Maas, http://tinyurl.com/uh3t

unread,

Sep 7, 2008, 11:18:57 PM9/7/08

(picking up where I left off)

> > Meanwhile, if you haven't done so yet, please create a CGI

> > hello-world example, ...
> I wrote some CGI programs in Seed7 (for example an experimental
> wiki). Therefore I can assure you that CGIs in Seed7 are possible.

Your assurance is nice, but still I'd like to see actual examples
of how to do it, including live demo of it actually being done. If
you can make a wiki, then it should be trivial to make a CGI-hello
and a CGI-hello-plus-3-steps demo. If you want to skip over the two
intervening steps that's fine, but the those two points are
important for comparing programming languages and showing novices
at the new language how to get started using it for CGI
applications. If you can make your two examples similar in style to
what I have already (14 languages at step 0, 9 languages at step 1,
9 languages at step 2, and 7 languages at step 3), where I present
a link to the live demo and another link to a listing of the source
(except the Flaming Thunder demo at step 3 which combines the two),
that would be best.

> My CGI Programs start with the line:
> #!/usr/local/bin/hi -q

That seems to be simlar to Perl where you can include your source
code directly in the toplevel CGI script, rather than like C/C++
where you absolutely need to compile your application and rename
the executable to be foo.CGI, and CMUCL where you need to write a
shell script to interface between CGI and the actual application.

But when you do that, does the entire file of source code (sans the
first line) get compiled to a temporary (file or first-class
function object or whatever) of native code, and then executed the
same as if you had compiled to an executable file then renamed that
executable to foo.CGI, or does it get interpreted line by line as
it passes through a sort of read-eval-print loop, or something
in-between such as a READ-JITCOMPILE-EXECUTE loop? (I'm curious.)
If it doesn't pass through a fullfledged whole-file compiler before
execution, to what degree are the semantics different from if it
were fully compiled?

> As you can probably guess the Seed7 interpreter is installed at
> /usr/local/bin/hi on my machine.

Yes. I know that much about Unix and how it relates to CGI and
executable scripts.

> The option -q (for quiet) instructs the hi interpreter to omit
> any output of its own.

Continuing my question from above, is it **really** an interpretor
in the script sense, either like the original BASIC interepretor
which kept the lines of source (possibly with the primary token
translated to a dispatch index) and re-scanned each line of code as
it was executed again and again, or like the original LISP
interepretor which loaded the parse tree of each function and then
recursively traversed the parse tree each time the code was
re-executed but didn't need to re-scan the raw text. Or is it more
like modern LISP read-eval-print loops which don't interpret at all
but instead call a JIT compiler the first time each expression is
executed and then directly call machine code that time and each
later time that same code is executed?

> Compiled CGIs are also possible.

It would be useful to have both a direct-CGI-script example (as you
indicated above) and a compiled executable (renamed foo.CGI), for
both CGI-hello and CGI-hello+3, but one example of each step would
suffice for minimal presentation.

> The library "cgi.s7i" contains some support to manage the CGI parameters.

Does that library come with the intepretor and compiler in the
installation setup (gzip of tar or whatever), or do users need to
download that separately if they want to use it to do CGI?

> My intent is to come up with a little bit more than a hello world CGI.

My request is to *start* with just CGI-hello and CGI-hello+3, and
then provide additional demos of how to do common operations
relevant to CGI (the examples I have so-far are: converting string
representation of an integer, as it might appear as a value
associated with a key in the associative array that is decoded from
the URLencoded form contents, into a true integer of data; and
setting/recovering cookies), then finally code snippets for regular
data processing *after* the input data has been converted from HTML
form contents to internal data. At present all the code snippets
are in my CookBook/Matrix, not yet incorporated into live CGI
demos. I'd also like to include DOM-style generation of the HTML
output, something I started thinking about in another thread just a
few weeks ago and haven't yet begun to try coding.

> BTW. Did you download Seed7 and experiment with it?

No. I don't have any available disk space on my regular shell
account. There's a Unix shell account a fellow in the UK offered
for me to use (see the From: field in this e-mail for a clue as to
that host), with oodles of free disk space, so I'm thinking that as
soon as you post CGI-hello and CGI-hello+3 examples I might try
downloading Seed7 to that UK machine and trying to run your
examples from that machine.

I did download Flaming Thunder to my regular shell account because
it's *tiny*:
70 -rwx------ 1 rem user 70566 Mar 17 14:04 ft*
That's the *whole*thing*! How large is the total of Seed7 with all
the libraries for CGI and other typical application domains?

> Seed7 - The extensible programming language ...

Actually it's not *the* (one and only) extensible programming
language. There are several others that came before it, including
Standish's PPL and Wegbreit's ECL:
<http://en.wikipedia.org/wiki/Polymorphic_Programming_Language>
<http://en.wikipedia.org/wiki/ECL_programming_language>
One of the principle artitechs of one of them gave me printed
manuals for both, which I browsed, and was impressed that (don't
remember which of them) defined cases of generic functions by
writing a production rule something like this:
RuleName: Given -> Produces

stan

unread,

Sep 8, 2008, 7:32:14 AM9/8/08

Robert Maas, http://tinyurl.com/uh3t wrote:

>> From: thomas.mer...@gmx.at
>> Guess what my favorite language is?
>
> I asked Google Groups to show me the last 100 articles you posted,
> and it showed me several articles in comp.lang.c, a few articles in
> comp.lang.c++, and just one in comp.lang.basic.misc, so I would
> guess that C is your favorite language.
>
>> Besides this, not all concepts make sense when transfered to
>> other languages.
>
> I agree. However a lot of data/processing tasks are of common types
> that *ought* to be do-able by *any* general purpose langauge. So
> there are two different ways to compare languages:
> - How well do they make it easy to write software for the vast
> majority of D/P tasks, i.e. all the common types of tasks.

You speak of general purpose languages and then restrict your domain to
data processing. Then you mention "common" tasks which I take to mean
common data processing tasks. But then you list non data programming
issues like "ability to write (and test) on a line by line basis" You mix
problem domain tasks with tool characteristics to the point of
confusion. A clearer presentation might separate task and
characteristics. You might also separate specific technologies like CGI
from general language issues.

For readers, I'm snipping a lot of a very long message where most of the
details mentioned above are below this point.
<snip>

thomas...@gmx.at

unread,

Sep 9, 2008, 4:24:15 AM9/9/08

On 8 Sep., 01:54, seeWebInst...@teh.intarweb.org (Robert Maas,

http://tinyurl.com/uh3t) wrote:
> (picking up where I left off)
>
> > Your concept of static vs intentional type can be compared to the
> > concept of interface vs implementation type that Seed7 has.
> > There is an explanation of this concept here:
> >http://seed7.sourceforge.net/manual/objects.htm
>
> ... When
> you write to a 'file' in C you use the same interface ('fprintf') for
> harddisk files, console output and printer output. The implementation
> does totally different things for this files. UNIX has used the
> "everything is a file" philosopy for ages (even network communication
> uses the file' interface (see sockets)).
>
> The person who wrote that (you?) is a bright person who groks that
> concept in fullness.

Thank you. This paragraph has several authors. I had an paragraph
with a reference to UNIX and "everything is a file" and I got help
to reformulate it nicely.

> For short: An interface defines which methods are supported while the
> implementation describes how this is done. Several types with
> different method implementations can share the same interface.
>
> Nicely written! This is more succinct than any equivalent
> explanation I recall seeing in online JavaDoc or in books used for
> classes that teach Java programming.
>
> Several implementation types can belong to one interface
> type (they implement the interface type). E.g.: The types 'null_file',
> 'external_file' and 'socket' implement the 'file' interface.
>
> In a way this is better than hierarchial Classes in Java which then
> require *also* interfaces as a separate issue hacked into the
> system as an afterthought to overcome the rigid single-inheritance
> organization of the Classes. On the other hand, what if a single
> value supports two unrelated interfaces?

I improved the whole paragraph with a picture and the following
sentence:

On the other hand: An implementation type can also implement
several interface types.

> So actually it's the
> same (except for syntax) as how Java does it.

Yes. Maybe I add some more elegant syntax in the future.

> In the classic OOP philosopy a message is sent to an object.
>
> Are you talking about SmallTalk there?

Yes, but later OO languages IMHO also use this missleading naming.

> Seed7 uses a different approach: All parameters get a user defined name.
>
> This is a case where you need to be more clear, say whether it's
> the formal parameter or the actual parameter that must be
> user-named. I think you're talking about the formal parameter which
> must have a user-defined name, right?

Yes, I am talking about the formal parameter.

> If my understanding of your wording is correct, this simply means
> that you've re-invented generic functions as they appear in Common
> Lisp, where *all* parameters to a method are equivalent in
> importance, there isn't one of them treated specially as "self"
> with a different syntax for passing it to the method. Thus instead
> of
> parm1InOtherWordsSelfObject.methodName(parm2,parm3)
> you have
> methodName(parm1,parm2,parm3)
> Is that correct (ignoring details of syntax)?

Yes.
I improved the whole paragraph. Please take a look at
http://seed7.sourceforge.net/manual/objects.htm

> All decisions which implementation
> function should be called can be made at compile time. To please the
> OO fans such decisions must be made at runtime.
>
> Moreso: To please the Java or C++ OO fans, dispatch on just the one
> 'self' must be doable at run time while dispatch on the rest of the
> parameters must be done at compile time. Thus the "method
> signature" includes just the rest of the parameters while the "base
> class" includes just the 'self' parameter.
> To please the Common Lisp OO fans, *all* dispatch must be doable at
> runtime. Thus there is no method signature in an absolute sense.

In Seed7 the DYNAMIC function declaration decides about the places
of a dynamic dispatch: All parameters which have an interface type
are dispatched dynamically.

> ... the program must decide at runtime which
> implementation of the function should be invoked. This decision is
> based on the [] type of the value of an object.
>
> Need to insert word "implementation" where I put brackets, right?

I added the word "implementation". The sentence is now:

... program must decide at runtime which implementation of the

function should be invoked. This decision is based on the

implementation type of the value (refered by the interface
object).

> Also, "value of an object" is redudant. The object is *already* a
> value, either the value fetched from a variable or slot, or the
> value returned from a function/method and passed by nested
> expression to the method call under discussion right?

Actually I see two objects involved: The one with the interface type
and the one with the implementation type. When I use the term
'value' I mean the value object to which the interface object
refers. This view is described with the picture at the beginning of
the chapter '7.1 Interface and implementation':
http://seed7.sourceforge.net/manual/objects.htm#interface_and_implementation

> In many situations it makes more sense that a new type has an
> element of another type (so called has-a relation) instead of
> inheriting from that type (so called is-a relation).
>
> So "has-a" refers to using a wrapper around an existing type of
> object?

The term is used when an interface or implementation object (an
object with an interface or implementation type) is element in the
data structure of an implementation type. In the following example
the files 'in_file' and 'out_file' are elements in the data
stucture. The 'echo_file' has the elements 'in_file' and 'out_file'
(has-a relation):

const type: echo_file is sub null_file struct
var file: in_file is STD_NULL;
var file: out_file is STD_NULL;
var integer: in_pos is 0;
end struct;

OTOH the new defined 'echo_file' inherits from 'null_file'. The
'echo_file' is a 'null_file' (is-a relation).

When the OO technic and inheritance was new, people had the
tendency to inherit (is-a relation) too much. Later it turned
out that in many cases it is better to just add elements to the
data structure (has-a relation) instead of inheriting.

> The type 'Number' can be extended to support other operators and there
> can be also implementations using 'complex', 'bigInteger',
> 'bigRational', etc.
>
> Question: Do all the relations between the encompassing type
> 'Number' and the various sub-types complex', 'bigInteger',
> 'bigRational', etc. need to be in one central place, or can they be
> distributed all over in multiple otherwise-unrelated source files?

It does not need to be in one central place.
E.g.: The definition of additional interface and implementation
functions can be done at a different place (even an otherwise-
unrelated
source file). This is different to Java where all method declarations
must be inside the class declaration.

> There's an important trade-off in this decision:
> One way, the owner of the encompassing type has control over what
> will be allowed, thus can better control/prevent bugs. The other
> way, users who don't have write-permission on the file that defines

> the encompassing type can neverthelless extend that type ...

Extending is possible, but the current implementation of Seed7
needs the source files to do it.

Thank you for the feedback.

What do you think about the improvements of that chapter?

thomas...@gmx.at

unread,

Sep 9, 2008, 7:04:05 AM9/9/08

On 8 Sep., 01:46, seeWebInst...@teh.intarweb.org (Robert Maas,

http://tinyurl.com/uh3t) wrote:
> > From: thomas.mer...@gmx.at
> > Guess what my favorite language is?
>
> I asked Google Groups to show me the last 100 articles you posted,
> and it showed me several articles in comp.lang.c, a few articles in
> comp.lang.c++, and just one in comp.lang.basic.misc, so I would
> guess that C is your favorite language.

Guess aggain :-)

> > Besides this, not all concepts make sense when transfered to
> > other languages.
>
> I agree. However a lot of data/processing tasks are of common types
> that *ought* to be do-able by *any* general purpose langauge. So
> there are two different ways to compare languages:
> - How well do they make it easy to write software for the vast
> majority of D/P tasks, i.e. all the common types of tasks.
> - What specialized/unusual types of tasks do they *also* support
> especially well, by mechanisms that are much better for these
> specific types of tasks than the general mechanisms are.
> Now it's a matter of personal judgement which tasks should be
> considered "common" and which "unusual". IMO at the very least the
> following ought to be considered "common":

I will use this list as checklist for Seed7 features.

> - Numerical calculations, including linear algebra, trigonometry, ...

http://seed7.sourceforge.net/manual/types.htm#integer
http://seed7.sourceforge.net/manual/types.htm#bigInteger
http://seed7.sourceforge.net/manual/types.htm#rational
http://seed7.sourceforge.net/manual/types.htm#bigRational
http://seed7.sourceforge.net/manual/types.htm#float
http://seed7.sourceforge.net/manual/types.htm#complex

> - STDIO (Standard Input/Output, i.e. from/to the controlling terminal)

http://seed7.sourceforge.net/manual/file.htm#Simple_read_and_write_statements
http://seed7.sourceforge.net/manual/file.htm#Standard_input_and_output_files
http://seed7.sourceforge.net/manual/file.htm#Access_to_operating_system_files

> - File I/O, both text and binary

http://seed7.sourceforge.net/manual/file.htm#Basic_input_and_output_operations
http://seed7.sourceforge.net/manual/file.htm#Input_and_output_with_conversion
http://seed7.sourceforge.net/manual/file.htm#User_defined_file_types

> - GUI (Graphical User Interface, i.e. mouse/trackball/whatever and keyboard)

A graphics library is there. There is no separate documentation for
the graphics library interface besides comments in the source:
http://seed7.sourceforge.net/prg/draw.htm

A GUI library is missing but there is a (simple) interface library
to the gtkserver (the documentation for that is in the Seed7
package). If somebody could take that task, I would be happy.

> - CGI or alternative (JSP/ASP/PHP/...) for Web server applications

CGIs can be written in Seed7. Documentation and examples are
currently missing.

> - Text editing, both plain-text and "word processing"

Not clear what this means. Seed7 has the type string (see:
http://seed7.sourceforge.net/manual/types.htm#string)
and libraries containing string and file scanner functions:
http://seed7.sourceforge.net/manual/file.htm#Scanning_a_file
http://seed7.sourceforge.net/prg/scanfile.htm
http://seed7.sourceforge.net/prg/scanstri.htm

> - Interface to relational databases

This is currently missing. I plan to do that in an integrated way
(like Linq).

The following points are possible with Seed7:

> - Dynamically-allocated pointy structures (data points to other data)
> - Ability to effect new intentional data-types by re-interpretation
> of existing true data types (this is trivial in most languages)
> I would also like some/most/all of the following to be included:
> - Ability to define new true data types, thereby converting an
> intentional data type into a true data type, thereby allowing
> the different data types to be mixed in a container without
> confusion (this is trivial in any language that allows
> definition of new classes in the sense of OOP)

> - Ability to interface to OS API, such as for "directly" controlling devices

Because of portability this is a difficult issue. With a direct OS
API people would start to write unportable programs. I prefere when
the OS API is provided in an OS independend way. That means: The OS
functions that can be supported under linux/bsd/unix and windows are
available. Sometimes (e.g.: console, keybord, graphic and other
areas) it is necessary to use driver libraries to compensate the OS
differences. I am not very motivated to support access to
linux/bsd/unix or windows only interface functions.

> - Ability to compile to native-code libraries, and later load&link them

The current Seed7 implementation does not support this, but I see no
major roadblock. The Seed7 compiler just needs to create a library
instead of an executable.

> Alternatively: Ability to compile to efficient byte-code libraries, ditto

Seed7 does not have any byte-code.

> - Ability to develop (write and test) code on a line-by-line basis

I see this as IDE feature and not as language feature.

> - Ability to dynamically load libraries into an already-running application

Loading libraries has the same portability issues as an interface to
the OS API. Up to now I did not make plans for such a feature. IMHO
at least the loading interface needs to be provided in an OS
independend way.

> - Multi-threading: system-supported true threads and/or "green-threads"

Seed7 can be compiled which hopefully makes the support for true
threads easier.

> - Ability to receive and process command-line parameters (CLP)

http://seed7.sourceforge.net/examples/echo.htm

> - Ability to invoke and control sub-processes that run other programs with CLP

Such things as the system() call. The Seed7 compiler (comp.sd7) uses
such an interface to invoke the C compiler and the linker. Naturally
such an interface has also portability issues and I make some effort
that the C compiler and the linker can be called in a portable way.

> - Ability to open&use local streams to/from other processes on same machine
> - Ability to open&use network sockets for TCP (and optionally UDP)

Sockets are present in Seed7.

> - Formal mechanisms for distributed applications, such as RPC RMI SOAP etc.

Currently not present.

> - Compiling/assembling to generate a stand-alone "executable" for a target OS

Yes. This is done with the Seed7 compiler (comp.sd7) which compiles
Seed7 to C which is further compiled to machine code.

> - Options for managing dynamically-allocated pointy structures:
> - Don't bother (for short-lived applications that won't run out of memory)
> - Manual freeing (when programmer can "know" precisely when to free memory)
> - Reference count (efficient for applications that don't have pointer loops)
> - Advanced mark-and-sweep (required to avoid "memory leak" of pointer loops)

Seed7 tries to manage memory automatically, except for some cases
where memory needs to be managed manually. In the future some
mechanism will support automatically managed memory also for this
cases.

> - Ability to build new functions/closures/continuations at runtime

Seed7 has some support for this. See:
http://seed7.sourceforge.net/manual/types.htm#category
http://seed7.sourceforge.net/manual/types.htm#reference
http://seed7.sourceforge.net/manual/types.htm#ref_list
http://seed7.sourceforge.net/manual/types.htm#program

> - Ability to extend the syntax of the language, either directly at
> the top level of processing source, or indirectly by explicitly
> calling a parser whose output (parse tree) can then be used to
> build executable code and then install it in the already-running
> application; thus the ability to effect a DSL (Domain-Specific Language)
> fully integrated into the standard language.

Seed7 uses syntax descriptions. See:
http://seed7.sourceforge.net/manual/syntax.htm

> - Ability to be called from a standard FFI (Foreign Function Interface)

When the Seed7 compiler is changed to generate a library instead of
an executable this is possible. It is the job for the other language
who wants to call Seed7 functions to get them from the library.

> - Ability to load and call other-language procedures via a standard FFI

This is a popular requirement. It has also the same portability
issues (unless the library is also portable) as a direct OS API.
I would like to support access to library functions in a portable
way. Therefore I am interested: What libraries / functions you
would see as necessary to be available.

As always: I am happy with every feedback.

Robert Maas, http://tinyurl.com/uh3t

unread,

Sep 17, 2008, 4:51:29 AM9/17/08

> From: thomas.mer...@gmx.at

> I know that everything and the kitchen sink has been added to LISP.

That's one of the wonderful things about Lisp. See below...

> I spoke of the paradigma and I spoke about adding some things in
> the way they are used and implemented in Seed7.

Seed7 seems to be rather limited in what it can easily support.

> You can add things outside a paradigma, but it will disturb the
> paradigma.

Continuing about Lisp: Lisp provides powerful flexible tools to
allow adding lots of new paradigms without upsetting the existing
system. Starting from a simple lambda calculus, i.e. a way to
define new anonymous function objects at any time, with three
list-processing primitives as applied to linked lists (push new
cell (pointing to new element) onto front of list without losing
the original list, skip over cell at front of list to see what's
left (after the first element) without losing the original list,
peek at first element of list (whatever first cell points to)),
Lisp was extended to include a reader for s-expressions to produce
internal linked-lists which are effectively parse trees of the
s-expressions, a corresponding printer, a universal function (EVAL)
that converts a special kind of parse tree (nested list) into the
result of applying its first element to the rest of the elements, a
toplevel read-applyUniversalFunction-print loop for interactive
use, tagged objects distinguishable at all times without requiring
a priori kowledge, mark-and-sweep garbage collector, global
variables that can be assigned, symbols that can have a set of
properties in addition to a value and function binding and print
name, special variables that can mask earlier instances of same,
numbers and arithmetic functions, strings, structures with named
slots, file I/O, a library of mapping functions, parse-tree
transformations as an initial action when applying the universal
function, extension of the lambda calculus to include lexical
closures and hence shared OWN/STATIC variables, multiple value
returns, keyword parameters, multiple packages of symbols,
user-reconfigurable read table, and several forms of OOP. Many of
those add-ons required changes to the kernel to be efficient, but
enough different mechanisms are already installed supporting enough
different software paradigms that ordinary users can now add
virtually any new facility (except threads) by ordinary
application-level (user) programming.

I see from your recent postings that Seed7 supports most (but not
all) of those mechanisms, hence most (but not all) of those
paradigms, and that Seed7 supports an innovative new way to extend
the syntax by ordinary user programming that is quite different
from how it can be done in Lisp. If and when both of these happen:
- Seed7 supports all the paradigms that Lisp already supports, and
- Seed7 demonstrates its ability to do CGI by live demos similar to
what I already have for seven other languages;
I may get interested enough to download a copy of Seed7 to a
computer I have access to (not my regular shell account) and try to
set up some CGI applications/demos myself, starting with copies of
the demos you will have provided as a starting set.

REM> That's not earlier at all. With my line-at-a-time development
REM> style, as soon as I write the line of code that tries to do the
REM> illegal operation and submit it to REP, it'll signal an error.
TM> So when you type the line
TM> (add a b)
TM> (I assume that 'add' can add two numbers and will fail (at runtime)
TM> when one of the parameters is not a number) the 'REP' knows if this
TM> is a legal or illegal operation?
REM> No it doesn't need to know. It just passes the expression to EVAL, and
REM> takes care to catch any error thereby generated.
TM> If the function arguments are present, an expression can be checked.

By whom? The function called already either checks parameters for
type and range etc., anything needed to validate them, and signals
an error if they are not valid (per the <before> requirements of
the function called and doing this at-start parameter-validity
checking), or if not written in a robust manner like that it passes
them to lower-level functions which signal an error themselves if
they can't operate on the parameters as passed to them. It's a lot
easier to find a bug caused by passing bad parameter if the
toplevel function called does a validity check and issues a
meaningful error message, than if some deep inside function errors
out. But it *is* possible to track down the error in either case,
using the stack frames in the debugger to see what functions are
called with what parameters and using divide-and-conquer to find
the point where it's "obvious" that a bad parameter is being passed
through some point. Then either there's a bug in *that* function
making that bad call which somehow wasn't detected earlier by
line-at-a-time debugging, or a bad parameter slipped into that
function without being validated properly.

> The hard thing is: Verify at compile (or edit) time that some
> function will always get values of the correct type at runtime.

With line-at-a-time debugging of new code, edit occurs only moments
before execution, so it really doesn't matter whether a mistake is
noticed during edit itself or a couple seconds later when execution
is tried on test data for the first time. The mistake is promptly
fixed, and execution is tried again with the corrected line of
code, and now it works. Fixing the one line of code takes the same
length of time whether it's spotted at single-line-edit time or at
single-line-first-try-at-execution time. I actually prefer to have
it checked when executed and *not* a few seconds earlier while I'm
in the middle of editing it, because most of the time I compose the
line of code in a nonlinear manner, taking a template from
something similar I did earlier and then changing various parts
until it has all the parts correct, and it would be a pain if it
kept interrupting me to tell me what I already know that it is
*not*yet* correct. That's why I'm editing it, and why I haven't yet
submitted it for first attempt at execution, because I *know* it's
not yet ready for prime time. When I am *finished* making it the
best I can, *then* I attempt to execute it, and *then* I want it
checked for correctness, and whether it does a static check or just
calls EVAL on the form and signals an error if anything bombs out,
really doesn't matter at that point.

Now line-at-a-time testing of new lines of code will only check
whether the code works for the test data I'm giving it. It's my job
to make sure I submit all reasonable variations on valid data as
test data and verify it works in all such cases, and to put code
earlier in the function I'm building to make sure no other kind of
data can ever get to this particular line of code. That's why I'm
needed, to make such human judgements. The program can't write
itself. It needs me to make such judgements of where to put what
specific type-checking upon entry to which functions, and what
additional bug checks to put inside functions to detect possible
deficiencies in my understanding of my own algorithm I've written.
There's a CS concept called an "invariant", sort of in the sense of
a physical law, like constant energy or momentum. At a certain
point in a loop, all the various data involved in the loop are in a
well-defined relationship with each other, and that can often be
explicitly checked to make sure that going once around the loop
they haven't somehow fallen out of that correct relationship. As
one simple example, if you're in a loop processing accounts
receivable, the total of unprocessed input and processed output
ought to stay the same at all times, equal to a check value you
computed upon entry. If you have OOP for both input and processed
data, with a method that tells you the total, then it's trivial to
write code like this (vaguely Java notation here):
ARset processAccountsReceivable(ARset in) {
real tot = in.total();
ARset out = new ARset();
while in.hasMore() {
ARitem oneItem = in.getNext();
...Process oneItem...yielding ARitem processedItem...
out.add(processedItem);
unless (tot == in.total() + out.total()) // Test invariant here!
throw exception ... Bug, we've lost or gained something!!
}
return out;
}
More likely you move input records, after processing, to more than
one output set, such as good records to one set and bad records to
another set and deductions to a third set. Then it's more
compilcated to make sure you don't have a money leak, but simply
checking if the total of everything is the same after each
iteration will detect most money leaks. Anyway, that's just an
example of how an each-time-around-loop consistency/invariant check
can protect against an important type of bugs.

> How does the execution of an expression, while editing the program,

(while building the program *after* a line of code has been
finished getting edited into believed-correct form)

> ver[y|i]fy that it will get a value of the correct type at runtime?

The previous lines of code that produce a particular type of data
to assign to a local variable, or the validation of parameters
going into the function, or a bug check if a call to subroutine and
subsequent assignment to local variable can return two different
types depending on success/failure of the operation whereupon only
one of those two types is valid for the next line of code. As I
showed elsewhere, the OR pseudo-function can be used to define a
default value to handle the failure case, allowing the next line of
code to work correct. Here's that example again:
(setq ix2 (or (position key str :start ix1) (length str)))
(setq token (subseq str ix1 ix2))
If there's no instance of key after ix1, token runs to end of string,
else token runs just to ix2 where the key (closing delimiter) was found.
Note there's an alternate way, which I consider uglier:
(setq ix2 (position key str :start ix1))
(setq token (if ix2 (subseq str ix1 ix2) (subseq str ix1)))
I don't like that way of coding because the function name and first
two arguments are repeated, and if I'm later changing the name of
something there I might change one but forget to change the other.

> What I say is:
> Without some sort of static type checking it is a huge effort. See:

What you say is wrong. Why do you say it?

> http://seed7.sourceforge.net/faq.htm#static_type_checking
With static type checking all type checks are performed during
compile-time. ...

That's a lie! Only internal types can be checked at compile time.
Intentional types can't be checked. With OOP you can often convert
intentional types to internal types by wrapping each
intenationally-typed object with a OOP class type, but that can
become a royal pain in practice, especially when dealing with
polymorphic types. For example, there are several different types
of mathematical expressions: sum, difference, product, quotient,
power, root, and various transcendental functions such as sine and
cosine and exponent and logarithm. *EACH* of those types of
expressions can be differentiated, and *EACH* of them can have any
of the other types as sub-expressions, and in the case of sum or
product any number of any mix of any other types can be immeditate
sub-expressions. There's no way a static type-check at compile time
will make sure that some expression you're going to read in from a
file or you've handcoded as a literal constant or hand-built from
nested calls to constructors is not only differentiable at the top
level but all sub-expressions are also differentiable so that the
recursive algorithm for differentiation won't bomb out due to
invalid data type deep inside some expression. IMO it's better to
not bother with all that polymorphic wrapper stuff and just use
runtime checking with graceful error recovery if the expression
read from a file or from user's terminal isn't differentiable.

Would you like to have a "swordfight" in the sense of I give an
example from my own code where I have a intentional datatype with
runtime checking and you tell how runtime checking can be
completely replaced by compile-time checking, I challenge and you
parry my threat, repeatedly, until either I give up challenging or
you fail to parry?

But never mind. Part of intentional type is range checking, and you
already admitted that compiletime static type checking can't deal
with that correctly. For example, if I have GPS coordinates with
latitude and longitude and altitude, then latitude must be from -90
degrees to +90 degrees, inclusive at both ends, longitude must be
from -180 degrees to +180 degrees, exclusive at one end or other,
and where if latitude is at pole then longitude doesn't matter, and
altitude must be from -10 miles (deepest drillhole where a GPS
receiver might work) to +100 or so miles (highest altitude where
downward-directed antennas on GPS satellites can send the signal).
So how are you going to do trigonometry on such values, converting
from one GPS location to another, without runtime type checking to
make sure the final result is valid?

> The discussion was about finding type errors at compiler time.

My point was that only internal types, not internal types, can be
checked at compile time by standard type checks.

> The validity of a value (E.g.: to be an age with a value less than
> 150) is not the question for a type check.

It may indeed be a question for an intentional-type check.
Not all 32-bit integers (internal type) are valid ages (intentional type).

> When a program is written the programmer has something in mind.
> E.g.: What values his variables or parameters will hold.
> Declarations with type allow the programmer to express his
> thoughts. This again eases reading the program.

Why do you consider it *necessary* to include these declarations
*before* the line of code is written that depends on them? The
programmer may want to test his line of code *immediately* and not
need to spend time formally naming and declaring the appropriate
data type before checking if that line of code works.

Have you thought of my idea of allowing declarations of types of
local variables and nested-expresssion-passed values *later* as an
option, rather then requiring them to be coded *first*?

By the way, in Seed7, how **do** you check the type of a data value
that is returned from a function and *immediately* passed to
another function without first assigning to a local variable?
Do you have something like (THE <type> <expression>) in Lisp?
Or do you forbid nesting of function calls, requiring every
intermediate result to be assigned to a named and typed local
variable?

Robert Maas, http://tinyurl.com/uh3t

unread,

Sep 17, 2008, 5:27:49 AM9/17/08

> From: thomas.mer...@gmx.at

> It is possible to make sure that objects of a type hold only
> legal values of this type. This goal is reached with mechanisms
> like mandatory initialisation,

Java gave up that idea for good reason, and adopted instead the
allowance to semi-initialize a local variable to be the NULL
pointer and *later* set a pointer to an object in place of the NULL
pointer. Unless you have a way to have non-nested blocks of local
declarations, you can't handle the following task:
- Get an expensive network resource, and lock out other users from
it, and return a handle, to be stored in a local variable.
- Query that network resource to get some info which is used to
decide the name of a second network resource.
- Open that second network resource, and store its handle in a
second variable.
- Process the data from the first resource and write to second
resource.
- Close the first resource and discard the variable that held the
pointer to it, but *keep* the more recently second local in scope.
- Do further processing on that second resource.
- Finally, close the second resource and end the scope of that
second variable too.
The Java way, you can do:
{NetResource ptr1 = NULL;
NetResource ptr2 = NULL;
ptr1 = getFirstNetworkResource();
String name2 = processFirstResourceToGetName(ptr1);
ptr2 = getSecondNetworkResource(name2);
doFurtherProcessing(ptr1,ptr2);
closeFirstProcess(ptr1); ptr1=NULL;
doYetMoreProcessing(ptr2);
closeSecondProcess(ptr2); ptr2=NULL;
returFinalResult...
}
Your way with forced initialization at the time a local variable is allocated,
to avoid need for runtime type/NULL checks, you must do it like this:
{NetResource ptr1 = getFirstNetworkResource();
{String name2 = processFirstResourceToGetName(ptr1);
{NetResource ptr2 = getSecondNetworkResource(name2);
doFurtherProcessing(ptr1,ptr2);
closeFirstProcess(ptr1); /* At this point we are stuck.
We can't leave ptr1 pointing at a network resource which is
no longer valid, but we can't assign it NULL either, so all we
can do is exit all blocks until ptr1 is out of scope */
}
}
}
/* but now ptr2 is also out of scope, so we can't do the rest of the processing,
to whit we can't do any of the following code in any legal way:
doYetMoreProcessing(ptr2);
closeSecondProcess(ptr2); */

So does Seed7 provide a way to declare and undeclare local
variables, i.e. take local variables in and out of scope, in a
non-nested way?
Declare ptr1 and initialize it to resource 1.
Compute name of resource 2.
Declare ptr2 and initialize it to resource 2.
Process data that requires both resources.
Close resource 1 and take ptr1 out of scope.
Do processing that requires only resource 2.
Close resource 2 and take ptr2 out of scope.

> it can be guaranteed that only legal values of the correct type
> are used as object values.

Not without non-nested local-variable scopes, or NULL values that
must be tested at runtime.

> This way run-time type checks are unnecessary and the program
> execution can be more efficient.

What way? Does Seed7 allow overlapping non-nested scopes or not?

> The Seed7 interpreter maintains a category for every object. The
> category of an object can be INTOBJECT, STRIOBJECT, FLOATOBJECT,
> etc. When the interpreter executes a primitive action like INT_ADD
> (which is used to implement the + operator for integers) both
> parameters are checked that they have the correct category (in case
> of INT_ADD this is INTOBJECT). This way the hi interpreter does also
> some sort of runtime type check (using the category instead of a
> type). If this check fails you get a fatal error.

That is utterly horrible!!!! The last time I suffered that kind of
crap as the best way possible was when I programmed in FORTRAN on
the IBM 360. ABEND and COREDUMP whenever an error is detected at
runtime. Sometimes it happens due to bug in underlying programming
environment, such as in PowerLisp, and only rarely in CMUCL, but
it's noway the norm. It's common, but avoidable, in Java when using
unchecked exceptions. You say in Seed7 it's unavoidable, so you've
taken a step backwards to 1968 when I last used that horrible IBM
360 system?

Anyway, sometimes you say Seed7 has runtime checks, and sometimes
you say Seed7 doesn't need runtime checks so it doesn't have them.
What's the true story??

Bartc

unread,

Sep 20, 2008, 6:28:40 AM9/20/08

"Robert Maas, http://tinyurl.com/uh3t" <seeWeb...@teh.intarweb.org>
wrote in message news:rem-2008...@yahoo.com...

> But never mind. Part of intentional type is range checking, and you
> already admitted that compiletime static type checking can't deal
> with that correctly. For example, if I have GPS coordinates with
> latitude and longitude and altitude, then latitude must be from -90
> degrees to +90 degrees, inclusive at both ends, longitude must be
> from -180 degrees to +180 degrees, exclusive at one end or other,
> and where if latitude is at pole then longitude doesn't matter, and
> altitude must be from -10 miles (deepest drillhole where a GPS
> receiver might work) to +100 or so miles (highest altitude where
> downward-directed antennas on GPS satellites can send the signal).
> So how are you going to do trigonometry on such values, converting
> from one GPS location to another, without runtime type checking to
> make sure the final result is valid?

What does all this application-specific stuff have to do with the design of
a programming language?

Are you suggesting you should be able to specify a range-check to this level
of detail?

(And even if you could, you could still have a gps point wrongly located
anywhere in the world.)

--
Bartc

Robert Maas, http://tinyurl.com/uh3t

unread,

Sep 30, 2008, 7:05:17 PM9/30/08

> From: thomas.mer...@gmx.at

> IMHO parse tree is a raw form of a program. In the structured
> form type checks, overload resolution, expansion of abstract data
> types and templates already has happened.

OK, I accept that my use of the term "parse tree" was sloppy. But I
don't know a proper generic term for a tree-structured
representation of an algorithm or of a set of (related) algorithms
or a structured data object, or a set of (related) data objects, or
a combination of related algorithms and data objects, etc., which
may be the raw parse before "macro" expansion, or the result after
"macro" expansion, or the result after lexical analysis and
resolution of types of variables etc., or even a constructed entity
which never came from input but which is intended to generate
output by a reverse parse (the "print" function of Lisp for
example). Perhaps you can suggest an appropriate generic term?

> > So will you show where *that* feature is documented?
> As I already said it is explained at:
> http://seed7.sourceforge.net/manual/types.htm#program

Funktions:
You need to translate that word from semi-German to English "Functions".

Amyway, that section doesn't say anything about the structure of a
"program" object in Seed7. Compare that to Lisp where CONS cells
are clearly defined as primitive objects with two pointers, and
lists are defined in terms of CONS cells clearly as having a
particular structure, and program elements are defined in terms of
nested list structures, and so the structure of a program as it
appears internally (before compilation) is clearly described. This
has two consequences for people writing software to explore programs:
- Clearly CAR and CDR can be used to traverse such a structure, and
other functions related to them such as CADR and MAPCAR likewise
can be used.
- The locations of the various parts of a program structure are
clearly defined, so somebody can decide which invocations of CAR
and/or CDR would get from the top level of any program to any
sub-part thereof. For example, given a PROG form, the CADR will
get you to the list of local variables.
This makes it possible to directly program algorithms for
traversing a program structure, rather than needing to guess what
the internal form might be and "hack" to discover it, or be totally
unable to delve into the innerds of a program structure. In this
respect, Seed7 is seriously deficient compared to Lisp.

> It is possible to get a list of objects declared in the 'program'
> with the function 'declared_objects'. This list has the type
> 'ref_list'.

That's just one of hundreds of items of information a
program-traverser might need to know. Other items would be: What
type of toplevel program element is here? (Is this a sequence, or a
decision, or a loop, or a function call, or what??) If this is a
sequence, how many steps are in the sequence, and how do I get each
of them individually? If this is a decision, how do I get the
predicate-form and how do I get the true clause and how do I get
the false clause? If this is a loop, what kind of loop is it (WHILE
FOR etc.)? If this is a FOR loop, what is the local name of the
index variable, what is the starting value, what is the increment,
what is the ending value, and what is the body? Recursively, what
kind of program element is at the top level of the body? In Lisp,
with the raw parse tree, it's trivial to check the CAR of a form to
see what's there, and if it's a symbol to look up whether it's the
name of a special operator or a macro-expander or just an ordinary
function. Then there's a natural course of action to learn the
structure of the special form (by looking in the manual), or what
the macro expands to (by looking in documentation and/or calling
macro-expand), or what the function does (by looking in
documentation or source). Then based o that info, there's a natural
course of action to delve deeper into the parts of the form. None
of the traversal of a "program" object in Seed7 seems possible in
any documented way, unless I missed something there.

Now in Java, I haven't worked with this enough to be sure, but I
get the impression that the reflection class allows such queries to
be asked despite the non-transparency of the interal
representation.

Oh, before I go on, thanks for this article:
<http://groups.google.com/group/comp.programming/msg/338161674975af29?hl=en&dmode=source>
which I haven't directly replied to, but have bookmarked for later
use when I get back to working on my multi-language CookBook/Matrix
re-organized per intentional data types.

> The type 'ref_list' is explained here:
> http://seed7.sourceforge.net/manual/types.htm#ref_list

That section seems to leave it undefined whether such an object is
implemented as an array (fixed time to access but very long time to
change length because the whole array needs to be copied to a new
place in memory) or as a linked list (time to access an element
proportional to how far down the list it's located, but some
change-length operations are very fast), or a self-balancing binary
search tree (log(n) time for nearly any operation, even
concatenation), or what. This makes it impractical to use these
accessor functions in any major way since it is too easy to program
something that takes "forever" to complete due to o(n) behaviour at
key places.

> To find out to which category a refered object belongs you can
> use the function 'category' which returns a value of the type
> 'category':
> The type 'category' is described here:
> http://seed7.sourceforge.net/manual/types.htm#category

I assume PROGOBJECT is the type of a program object? But how can
you learn what *kind* of program object it is, for example whether
it's a sequence or a function call or a decision or a loop etc.?

> > > It is not my intend [sic] to support programs which manipulaten [sic]
> > > their own code as it is done with "self modifying code".
> intent
> manipulate

Correct now.

> When I write a mail some sentences are rewritten several times...

Yeah. I don't make a big fuss over mis-spellings in e-mail or
newsgroup articles, although I sometimes flag them by [sic] when
I'm replying to them. But Web pages can be edited in-place to fix
such errors, so I make a point about any mis-spellings which
persist after you've finished asking your friends and associates to
proofread them and you are now advertising them widely. I suppose
in this case most of your friends and associates are
German-language speakers, not fluent in English, so I'm probably
the first English-speaking person to seriously read your Web pages
(Seed7 documentation) to spot what seem like obvious typos.

> > Do you take the extreme position that once you "start" a program,
> > it's too late to add any more code to it? You must load everything
> > (all code) you will ever need once at load time, and *then* start
> > executing?
> Not exacly. I want to do such things in a structured way.
> The concept of variables and constants can be used for this.
> A constant is granted to be unchangeable during program
> execution. Therefore it should not be possible to change
> the code of a constant function during the runtime.

I agree with your distinction. During development of new code,
which in the domain of the programmer never ends, although in the
domain of regular users might not be apparent, *all* code is
variable, *nothing* is unchangeable. If you discover a bug, you
*should* be able to modify the relevant function to fix it to get
rid of the bug. Obviously you can edit the source of that function
then re-build the entire system. But I personally prefer the
ability to re-define that one function (and optionally re-compile
just that one fuction) and load it right back in place of the buggy
version. In Lisp this is trivial. In Seed7 is it possible at all??

> OTOH the code of variable functions could be changed
> at runtime. Except for closures the concept of variable
> functions is currently not supported

LOSE LOSE LOSE!!!

<hackery>Is it theoretically possible to define *every* function as
a closure? Like instead of writing the source for the function
directly, you write a "factory" for that function, and then invoke
the factory at runtime to make the function as a "closure",
whereupon all you need to do during debugging is edit the variable
part of the factory that specifies the source of the closure, and
then call the factory to make a new closure just like the one
before except it now incorporates the new code with the bug fix?</hackery>

> (but will be some day in the future).

Let me know when that happens.

(Regarding after-first arguments to boolean operators:)

> I now use the following sentence to describe the situation:
> Naturally side effects of the right operand of the 'and' and 'or'
> operator only take place when the operand is executed.

That may be an improvement, but I'd need to look at the full text
of that section to be sure it is clear now.

> I use C as base to reduce the implementation effort.

OK, so Seed7 is dependent on C, not just for bootstrapping, but for
*every* build of a new version of Seed7. Also you use the runtime
library of C, so even *during* a run of a Seed7 application you are
*still* dependent on C. This makes your effort less of course.

> > > > While 0 ** 0 which is mathematically undefined is *required* to return 1?
> > > This behaviour is borrowed from FORTRAN, Ada and some other
> > > programming languages which support exponentiation.
> > Why do you borrow things that are flat-out wrong mathematically??

> There is a detailed discussion about this topic here:
> http://en.wikipedia.org/wiki/Exponentiation#Zero_to_the_zero_power

The evaluation of 0^0 presents a problem, because different
mathematical reasoning leads to different results. The best choice for
its value -- and indeed, whether or not to consider 0^0 indeterminate
(i.e., undefined) -- depends on the context. According to Benson
(1999), "The choice whether to define 0^0 is based on convenience, not
on correctness."^[2] ...

Let me translate: The way Seed7 handles 0^0 is based on
convenience, not on correctness. One consolation, Common Lisp also
defines (EXPT 0 0) => 1, again convenient but not mathematically
correct.

> As you probably know there are many possibilitys to manage data:
> - Global data which exists as long as the program runs
> (E.g.: An array which contains a conversion table or a string to
> which lines are appended and which is written to a file before
> the program ends). This needs no management but it could be feed
> automatically (by processing the list of global declared data).
> - Local data which exists as long as a function or block executes
> (E.g.: Integer or string variables or value parameters declared
> in a function). When the function or block is left it is easily
> possible to free local data, even if it is complicated like an
> array of structs with strings as elements. It is clear that
> carrying pointers into such local data outside of the function is
> something illegal, but the compiler could check for such things
> by comparing the lifetime of the pointer and the lifetime of the
> data.

That's just two possibilities, not many possibilities.

What about this third possibility?
Some data structure is computed at runtime, and stored inside
several other structures, to be later shared among several program
modules that use each of those other structures. At times that are
unpredictable, each of these several program modules is finished
and can discard its particular data structure, but must *not*
discard the portion that is shared among other structures owned by
other program modules still running. Eventually the last of those
program modules is finished, and the shared data can finally be
recycled, but how is the software system supposed to recognize that
the time is *now*??

> ... The technic above [snipped] assumes that there is a 1:1

> relationship between the variable (or constant or parameter) and
> its value. But not everything can be described this way.
> Sometimes it is necessary that several variables can refer to the
> same value. Most programming languages solve this problem by
> using pointers (or references). The variable points to the value

> which is at [sic: "in"] the heap.

Note it's more complicated than that. The pointer to the shared
data might be deep within some heap structure, rather than directly
from a local variable. For example:
+--------------+ +-----------+
LocVar1-->|PrivateStruct1|---->| |
+--------------+ | Shared |
| |
+--------------+ | Structure |
LocVar2-->|PrivateStruct2|---->| |
+--------------+ +-----------+

> - When it is possible to differentiate between one owning pointer
> and several non-owning pointers aggain a stack oriented
> management can be done, using the owning pointers. The non-owning
> pointers would not cause any memory management. This concept
> of two pointer categorys is natural when there is a hierarchical
> data structure which is processed with non-owning pointers.

That doesn't work, to declare either LocVar1, or LocVar2, or some
third party, as the owner of SharedStructure.

> - When no pointer can be identified as owner, and no pointer cycles
> exist (the data structure is a tree), reference counting can be
> used.

Yes, that's one of the four memory-management options that I suggested.
(#1=ignore the problem, just let memory get used,
and quit the program before memory is exhausted;
#2=manual/stack management, sub-cases:
#2a=the programming system clearly keeps track of how memory was
allocated and hence "knows" how to get rid of it when stack
is unwound, which seems to be how Seed7 does it;
#2b=the programmer must explicitly code constructors and
destructors, where the destructors get called whenever the
stack is unwound, but the programmer has the option of
destructing even earlier, which seems to be the way C++ does it;
#3=reference count, as you mention there;
#4=mark-and-sweep GC, as Lisp and Java do it.)

> More complicated situations must be managed by hand or with a GC.
> If a GC concentrates on data which cannot be managed in a simpler
> way it is probably much faster.

Did you see my proposal a few weeks ago for multiple heaps, one
managed by reference count, another managed by mark-and-sweep GC?
Or my proposal for using reference count everywhere, allowing
pointer loops *only* when they pass through a symbol, and using
mark-and-sweep GC to manage symbols only when reference count can't
reclaim enough memory?

> Therefore I see the philosopy to let the GC manage all data as
> the wrong way for memory management.

It has an advantage that it saves a lot of human effort at some
expense in CPU effort, with CPU effort much cheaper than human
labor, so it's a favorable exchange.

> In Seed7 several container types are present. Therefore many uses
> of pointers (to manage lists, trees, hash tables, etc.) are not
> necessary. In Seed7 such containers should be used and that way the
> use of pointers is reduced as much as possible.

This is essentially the way Java does it. This has a disadvantage
that it's impossible for different containers to share structure,
thus log(n) time to modify a structure while keeping the old
structure intact (sharing all side branches that didn't change) is
impossible when using such containers.

Robert Maas, http://tinyurl.com/uh3t

unread,

Sep 30, 2008, 7:54:53 PM9/30/08

> From: thomas.mer...@gmx.at

> What do you think about the improvements of that chapter?

Not sure. I'm sorta exhausted and unable to offer more feedback. I
had one pending question, about an implementation type that happens
to implement more than one interface. So if you create such an
object, how do you call different methods from both interfaces on
the same object? Do you up-cast the implementation type to each of
the two interface types separately and call the appropriate methods
on each interface object?

0 new messages