I'd like to share what I've found common to all of them. First, there
is some syntax that must be learned. Then there is a period of becoming
familiar with syntax and semantics, as one learns a crucial thing: "how
to get the job done" with this language. Certainly, too many language
learners reach this point and plateau in their learning progress.
But any serious programmer of a given language will undoubtedly take the
next step, which is the one that I find the most interesting. Only
after about 25+ years of programming did this particular process, common
to (probably) all languages, become clear.
The process, for lack of a better term, is "compression." The first
form of this that most of us programmers encounter is subroutines.
Instead of this:
do_something_1 with X
do_something_2 with X
do_something_3 with X
do_something_1 with Y
do_something_2 with Y
do_something_3 with Y
we compress to this:
do_some_things(argument)
{
do_something_1 with argument
do_something_2 with argument
do_something_3 with argument
}
do_some_things with X
do_some_things with Y
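In real C, for instance, the compressed form might look like this (a
minimal sketch; the helper functions and the int argument type are
hypothetical stand-ins):

#include <stdio.h>

/* Hypothetical helpers; any three operations on a value would do. */
static void do_something_1(int arg) { printf("step 1 with %d\n", arg); }
static void do_something_2(int arg) { printf("step 2 with %d\n", arg); }
static void do_something_3(int arg) { printf("step 3 with %d\n", arg); }

/* The compressed form: the common sequence lives in exactly one place. */
static void do_some_things(int arg)
{
    do_something_1(arg);
    do_something_2(arg);
    do_something_3(arg);
}

int main(void)
{
    int x = 1, y = 2;
    do_some_things(x);
    do_some_things(y);
    return 0;
}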
This refinement makes obvious sense in many ways, including
maintainability, readability, code space reduction, etc. But I
realize now that this step is the real essence of programming. In every
useful language I can think of, this compression is really the central
feature.
The whole concept of "object oriented" programming is nothing more than
this. Code and data common to various objects are moved to a "parent"
or "base" class. Objects can either derive from other objects ("X is a
Y"), or contain other objects ("X has a Y").
Even markup languages like HTML have incorporated this concept. When
this inevitably became too cumbersome:
<font face="Arial" color="#000000" size="3">Hello</font>
<font face="Arial" color="#000000" size="3">World</font>
the common elements were separated:
<style type="text/css">
h2
{
font-family: Arial;
font-size: 12pt;
color: #000000;
}
</style>
<h2>Hello</h2>
<h2>World</h2>
I would wager that most programmers, especially in the beginner to
intermediate realm, don't really understand why this type of design is
desirable, but just find that it feels right. Maybe the short-term
payoff of simply having to type less is the incentive.
But the reason is deeper. A very simple algorithm for compressing data
is run-length-encoding. The following data, possibly part of a
bitmapped image:
05 05 05 05 05 05 05 02 02 02 17 17 17 17 17
Can be run-length-encoded to:
07 05 03 02 05 17
The reward, at first, is just a smaller file. But at a deeper level,
the second version could be considered "better," in that it is more than
just a mindless sequence of bytes. Some meaning is now attached to the
content. "Seven fives, followed by three twos, followed by five
seventeens" is much less mind numbing than "five, five, five, five,
five, five, five, two, two..."
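To make the encoding step concrete, here is a minimal run-length encoder
in C (a sketch only; it assumes runs never exceed 255 bytes):

#include <stdio.h>

/* Write (count, value) pairs to out; return the number of bytes written. */
static size_t rle_encode(const unsigned char *in, size_t n,
                         unsigned char *out)
{
    size_t w = 0, i = 0;
    while (i < n) {
        size_t run = 1;
        while (i + run < n && in[i + run] == in[i] && run < 255)
            run++;
        out[w++] = (unsigned char)run;  /* count */
        out[w++] = in[i];               /* value */
        i += run;
    }
    return w;
}

int main(void)
{
    unsigned char data[] = { 5,5,5,5,5,5,5, 2,2,2, 17,17,17,17,17 };
    unsigned char enc[2 * sizeof data];  /* worst case: no runs at all */
    size_t n = rle_encode(data, sizeof data, enc);
    for (size_t i = 0; i < n; i++)
        printf("%02d ", enc[i]);         /* prints: 07 05 03 02 05 17 */
    putchar('\n');
    return 0;
}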
It has been argued that compression is actually equivalent to
intelligence. This makes sense at a surface level. Instead of solving
a problem with a long sequence of repetitious actions, understanding the
problem allows us to break it into more manageable pieces. The better
our understanding, the more compression we can achieve, and the more
likely our resulting algorithm will be suited to solving similar
problems in the future.
This was quite a revelation for me, and it shed much light on writing
"good" code. It also made clear why I find some languages much more
useful than others. The more power a language gives me to compress my
algorithm -- both code and data, as well as in space and execution time
-- the more I like it. The true measure of this is not the number of
bytes required by the source code, although this surely has some
correlation.
This has given me a great deal of direction in thinking about creating
languages.
<snip>
> But any serious programmer of a given language will undoubtedly take the
> next step, which is the one that I find the most interesting. Only
> after about 25+ years of programming did this particular process, common
> to (probably) all languages, become clear.
>
> The process, for lack of a better term, is "compression."
Or abstraction, or functional decomposition...
> The first
> form of this that most of us programmers encounter is subroutines.
>
> Instead of this:
>
>
> do_something_1 with X
> do_something_2 with X
> do_something_3 with X
>
> do_something_1 with Y
> do_something_2 with Y
> do_something_3 with Y
>
>
> we compress to this:
>
>
> do_some_things(argument)
> {
> do_something_1 with argument
> do_something_2 with argument
> do_something_3 with argument
> }
>
> do_some_things with X
> do_some_things with Y
No, we compress to this:
do_some_things(obj, min, max)
{
while(min <= max)
{
do_something with min++, obj
}
}
do_some_things with X, 1, 3
do_some_things with Y, 1, 3
or even:
object = { X, Y }
for foo = each object
{
do_some_things with foo, 1, 3
}
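In C, that might look something like this (a rough sketch; do_something
and the integer types are hypothetical):

#include <stdio.h>

static void do_something(int step, int obj)
{
    printf("step %d with object %d\n", step, obj);  /* hypothetical work */
}

static void do_some_things(int obj, int min, int max)
{
    while (min <= max)
        do_something(min++, obj);
}

int main(void)
{
    int objects[] = { 10, 20 };   /* stand-ins for X and Y */
    for (int i = 0; i < 2; i++)
        do_some_things(objects[i], 1, 3);
    return 0;
}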
> Maybe the short term
> payoff of simply having to type less is the incentive.
Don't forget elegance. It's impossible to define, but a good programmer
knows it when he/she/it sees it.
> But the reason is deeper. A very simple algorithm for compressing data
> is run-length-encoding. The following data, possibly part of a
> bitmapped image:
>
> 05 05 05 05 05 05 05 02 02 02 17 17 17 17 17
>
> Can be run-length-encoded to:
>
> 07 05 03 02 05 17
>
> The reward, at first, is just a smaller file. But at a deeper level,
> the second version could be considered "better," in that it is more than
> just a mindless sequence of bytes.
There are better algorithms than RLE. :-)
> It has been argued that compression is actually equivalent to
> intelligence. This makes sense at a surface level. Instead of solving
> a problem with a long sequence of repetitious actions, understanding the
> problem allows us to break it into more manageable pieces. The better
> our understanding, the more compression we can achieve, and the more
> likely our resulting algorithm will be suited to solving similar
> problems in the future.
This is sometimes expressed in the form "theories destroy facts". If you
know the equation, you don't need the data!
> This was quite a revelation for me, and it shed much light on writing
> "good" code. It also made clear why I find some languages much more
> useful than others. The more power a language gives me to compress my
> algorithm -- both code and data, as well as in space and execution time
> -- the more I like it. The true measure of this is not the number of
> bytes required by the source code, although this surely has some
> correlation.
Expressive power matters a lot, and you are right to highlight its
importance.
> This has given me a great deal of direction in thinking about creating
> languages.
Incidentally, it has also given this newsgroup the possibility of entering
into a worthwhile discussion that isn't based on a lame newbie question.
Nice one.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
I agree completely. In fact these are the start of my course
outline for teaching computer programming to absolute beginners:
<http://www.rawbw.com/~rem/HelloPlus/hellos.html#s4outl>
Notice how lesson 1 is just the syntax used to express values, and
then lesson 2 starts the semantics of using some of those values as
programs (i.e. passing them to EVAL). So lesson 1 uses a
read-describe-print loop, then lesson 2 adds a call to EVAL in the
middle.
(What happens next:)
> The process, for lack of a better term, is "compression."
This is a subset of "refactoring". In fact I strongly advise that
beginners (and even experts much of the time) write single lines of
code and get each single line of code working before moving on to
the next line of code, and only after getting the "business logic"
(the data transformations) working for some function *then*
refactor those lines of code into a self-contained function
definition. I suppose "agile programming" is the most common
buzzword expressing something like this methodology. Instead of
treating refactoring as a pain to be avoided by correct design in
the first place, treat refactoring as a dominant part of the
software development process. You *always* write easily debuggable
code, and then *after* you have it working you *always* refactor it
to be better for long-term use, with the tradeoff that it's now
more difficult to debug, but since debugging at such a low level of
code has been mostly finished this isn't a problem.
Notice how my lesson plan, after the really basic semantics of individual
lines of code, proceeds to teach refactoring in several different ways:
* Lesson 3: Putting micro-programs in sequence to do multi-step D/P
(data processing), and building such sequences into named
functions
* Lesson 4: Refactoring syntax: Getting rid of most uses of GO in
PROG.
* Lesson 5: Refactoring algorithms: Devising data structures that
make D/P much more efficient than brute-force processing of flat
data sequences.
After that point, my beginning lessons don't discuss additional
ways of refactoring, but I think we would agree that further
refactoring is *sometimes* beneficial:
- Using OOP.
- Using macros to allow variations on the usual syntax for calling functions.
- Defining a whole new syntax for specialized problem domains,
including a parser for such syntax.
Almost any software project can benefit from bundling lines of code
into named functions (and sometimes anonymous functions), and also
refactoring syntax and algorithms/dataStructures. But whether a
particular project can benefit from those three additional
refactorings depends on the project. Perhaps my course outline for
an absolute beginner's course in how to write software is
sufficient, and OOP/macros/newSyntaxParsers should be a second
course for people who have gotten at least several months
experience putting the lessons of the first course to practice? If
you like my absolute-beginner's course outline, would you be
willing to work with me to develop a similarly organized outline
for a second course that covers all the topics in your fine essay
and my three bullet points just above?
> The more power a language gives me to compress my algorithm --
> both code and data, as well as in space and execution time -- the
> more I like it.
Hopefully you accept that Lisp (specifically Common Lisp) is the
best language in this sense? Common Lisp supports, in a
user/applicationProgrammer-friendly way, the complete process of
(agile) programming from immediately writing and testing single
lines of code all the way through all the refactorings needed to
achieve an optimal software project. Java with BeanShell is a
distant second, because the semantics of individual statements
given to BeanShell are significantly different from the semantics
of exactly the (syntactically) same statements when they appear in
proper method definitions which appear within a proper class
definition which has been compiled and then loaded.
> This has given me a great deal of direction in thinking about
> creating languages.
There's no need to create another general-purpose programming
language. Common Lisp already exists and works just fine. All you
might need to do is create domain-specific languages, either as
mere sets of macros within the general s-expression syntax, or as
explicitly parsed new syntaxes feeding into Lisp, or as GUI-based
no-syntax pure-semantics editors generating tables that feed into Lisp.
<snip>
> But any serious programmer of a given language will undoubtedly take the
> next step, which is the one that I find the most interesting. Only
> after about 25+ years of programming did this particular process, common
> to (probably) all languages, become clear.
independent of language.
> The process, for lack of a better term, is "compression." The first
> form of this that most of us programmers encounter is subroutines.
<snip example of common code>
This is the "Refactor Mercilessly" of the Agile crowd.
> This refinement makes obvious sense in many ways, including
> maintainability, readability, code space reduction, etc. But I
> realize now that this step is the real essence of programming. In every
> useful language I can think of, this compression is really the central
> feature.
>
> The whole concept of "object oriented" programming is nothing more than
> this. Code and data common to various objects are moved to a "parent"
> or "base" class. Objects can either derive from other objects ("X is a
> Y"), or contain other objects ("X has a Y").
now here I disagree. Thinking about OO like this tends at best to lead
to really deep inheritance trees and at worst to LSP violations. "Yes,
I know a Widget isn't really a ClockWorkEngine, but I used inheritance
to share the code."
OO is about identifying abstractions. Look up the Open Closed Principle
for a better motivator for OO.
Read The Patterns Book.
<snip>
--
Nick Keighley
Programming should never be boring, because anything
mundane and repetitive should be done by the computer.
~Alan Turing
I'd rather write programs to write programs than write programs
<snip>
>
> OO is about identifying abstractions. Look up the Open Closed Principle
> for a better motivator for OO.
>
> Read The Patterns Book.
That's unconstitutional in the USA, because it's cruel and unusual
punishment. It remains legal in the UK, though.
> now here I disagree. Thinking about OO like this tends at best to lead
> to really deep inheritance trees and at worst to LSP violations. "Yes,
> I know a Widget isn't really a ClockWorkEngine, but I used inheritance
> to share the code."
At the risk of a religious debate, I have to respond that I don't really
find OO very useful. One reason for this is probably exactly what you
just said. I see that tendency as a failure of OO.
Initially, it is nice to be able to say "a square is a shape" and "a
shape has an area." But the more complex the project, the more
inheritance and information hiding become nothing more than a burden, in
my experience.
They may be necessary evils in some environments. The open/closed
principle reduces modifications to the "base" set of code -- whether
this is a set of classes, a library of functions, or something else. It
encourages code to become set in stone. Unfortunately, especially on a
large project, I find that no matter how we try, it is impossible to
foresee the entire scope of what needs to be designed. So the design is
constantly changing, and the idea of some set of code that we write in
the early stages surviving without needing vast rewrites for both
functionality and efficiency is delusional.
Our current project is enormous. It involves half a dozen separate
applications, running on completely different platforms (some x86, some
embedded platforms, some purely virtual). The code ranges from very
high-level jobs, like 3D rendering, all the way down to nuts-and-bolts
tasks like device drivers. It even includes developing an operating
system on the embedded side.
I can honestly say that OO has really added nothing at any stage of that
chain, from the lowest level to the highest. We use C++ for two
reasons:
1) The development tools and libraries are the most mature for C++ and
this is essential.
2) The OS, drivers, and application code in the embedded firmware are all
C and we have no choice in that, unless we want to develop a compiler
for some other language, or write directly in assembly language. Since
C and C++ are largely similar, parts of the C code that we want to share
with the other applications can be used directly.
But if not for these two restrictions, in hindsight, I would not have
chosen C++ or any other OO language.
I've wandered far enough off-topic from the original post.
I understand that the current US VP has changed the definition of "cruel
and unusual" so that it's not unconstitutional any more. Fortunately I
live in neither the US nor the UK. :-)
John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
> They may be necessary evils in some environments. The open/closed
> principle reduces modifications to the "base" set of code -- whether
> this is a set of classes, a library of functions, or something else. It
> encourages code to become set in stone. Unfortunately, especially on a
> large project, I find that no matter how we try, it is impossible to
> foresee the entire scope of what needs to be designed. So the design is
> constantly changing, and the idea of some set of code that we write in
> the early stages surviving without needing vast rewrites for both
> functionality and efficiency is delusional.
I have seen OO work wonderfully well, with core classes surviving
seven years of product development at the heart of the system, unchanged
in API (the internals got a severe optimisation at one point). They were
used in applications never even suspected at the start, and the core
aspect mechanism held up under the strain superbly.
I have also seen OO fail horribly, yielding nothing but awkward
constructs that impede design and make life difficult, exploding into a
myriad of similar but vitally different classes that require constant
tweaking, with resultant adjustments all over the codebase.
The key difference, as far as I can tell, has to do with a rigorous
attack on assumptions in the core design, eliminating everything except
the core concepts the base classes are intended to provide, and then
layering with care. That, and the craft to design a truly useful set of
simple abstractions expressing the immutable aspects of the problem
domain.
The idea is not delusional; I have seen it work on more than one
occasion. It is, however, a difficult art, and almost certainly not
suited to (or possible in) all problem domains. When it works it's like
putting a magic wand in the hands of the developers; when it fails it's
like casing their hands in two kilos of half-set epoxy resin.
--
C:>WIN | Directable Mirror Arrays
The computer obeys and wins. | A better way to focus the sun
You lose and Bill collects. | licences available see
| http://www.sohara.org/
> Hopefully you accept that Lisp (specifically Common Lisp) is the
> best language in this sense? Common Lisp supports, in a
> user/applicationProgrammer-friendly way, the complete process of
> (agile) programming from immediately writing and testing single
> lines of code all the way through all the refactorings needed to
> achieve an optimal software project.
I have to admit that I haven't used Lisp since college. At the time, I
found it interesting, and I seem to recall that it was well suited for
things like AI development. I also remember it being difficult to write
readable code. Professionally, I'm somewhat handcuffed to C and C++ for
reasons I mentioned earlier. It's also hard to find bright and
motivated employees who are fluent in Lisp. But I will make a point of
revisiting it.
I'd just like a language where this idea of compression, or refactoring,
is the central principle around which the language is built. Any
language that supports subroutines offers a mechanism for this, but I
feel the concept could be taken further. It's all very nebulous at the
moment, but I feel that somehow the answer lies in pointers.
At its heart, a subroutine is nothing more than a pointer, in any
language. A compiler takes the code inside the routine and stuffs it
into memory somewhere. To call that function, you just dereference the
pointer to "jump" to that code.
An object in an OO language is accomplished using a pointer. Each class
has a function table, and each object has a pointer to that function
table. Here, the compression is happening at the compiler level. Since
every object of a given class has the same code, it can all be moved to
a common place, then each object need only store a pointer to it.
Further, code in a base class need only exist there, with each child
class pointing to that table.
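That arrangement can be sketched in plain C (hypothetical names; a real
C++ vtable carries additional details, but the pointer structure is the
same idea):

#include <stdio.h>

struct shape;   /* forward declaration */

/* One function table per "class", shared by all of its objects. */
struct vtable {
    double (*area)(const struct shape *self);
};

/* Each object stores only a pointer to its class's table. */
struct shape {
    const struct vtable *vt;
    double a, b;              /* meaning depends on the "class" */
};

static double rect_area(const struct shape *s) { return s->a * s->b; }
static double tri_area(const struct shape *s)  { return s->a * s->b / 2; }

static const struct vtable rect_vt = { rect_area };
static const struct vtable tri_vt  = { tri_area };

int main(void)
{
    struct shape r = { &rect_vt, 3, 4 };
    struct shape t = { &tri_vt,  3, 4 };
    /* "Virtual dispatch": follow the object's table pointer. */
    printf("%g %g\n", r.vt->area(&r), t.vt->area(&t));  /* 12 6 */
    return 0;
}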
But these are hard-coded abstractions built into compilers. C++ or Java
tell you, "This is how inheritance works... This is the order in which
constructors are called..." Somehow, I'd like for the language itself
to make the mechanism directly accessible to the programmer. If his or
her programming style then naturally tends to evolve into an OO sort of
structure, wonderful. If not, then maybe a sort of table-driven type of
"engine" architecture would emerge. That just happens to be what I
generally find most powerful.
I suppose you could just take a huge step backwards and write pure
assembly language. Then all you really have is code, data, and
pointers. You're free to use them however you like. But I strongly
believe there is a way to still offer this freedom, while also offering
a great deal of convenience, readability, and maintainability by way of
a high-level language.
If you want to invent your own abstraction mechanisms you
might be interested in Seed7. There are several constructs
where the syntax and semantic can be defined in Seed7:
- Statements and operators (while, for, +, rem, mdiv, ... )
- Abstract data types (array, hash, bitset, ... )
- Declaration constructs
The limits of what can be defined by a user are much wider
in Seed7.
Greetings Thomas Mertes
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.
I agree.
--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u
If that were true, Lisp would be concise, but Lisp is actually extremely
verbose.
>> This has given me a great deal of direction in thinking about
>> creating languages.
>
> There's no need to create another general-purpose programming
> language. Common Lisp already exists and works just fine. All you
> might need to do is create domain-specific languages, either as
> mere sets of macros within the general s-expression syntax, or as
> explicitly parsed new syntaxes feeding into Lisp, or as GUI-based
> no-syntax pure-semantics editors generating tables that feed into Lisp.
Consider the following questions:
1. Why has Lisp always remained so unpopular when, as you say, it is so
extensible?
2. Why have the implementors of successful modern languages that were
originally built upon Lisp gone to great lengths to completely remove Lisp
from their implementations?
I studied Lisp for some time before realising the answers to these
questions:
1. Modern language features (e.g. pattern matching over algebraic data
types) are so difficult to implement that it is a practical impossibility
to expect ordinary programmers to use Lisp's extensibility to make
something decent out of it. Instead, the vast majority of programmers
choose to use less extensible languages that already provide most of what
they need (e.g. a powerful static type system) because that is vastly more
productive. In theory, this problem could be fixed but, in practice, the
few remaining members of the Lisp community lack the talent to build even
the most basic infrastructure (e.g. a concurrent GC).
2. Even though Lisp's forte is as a language laboratory, Lisp has so little
to offer but costs so much even in that niche that the implementors of
successful modern languages soon stripped all traces of Lisp from their
implementations in order to obtain decent performance (compilers written in
Lisp are extremely slow because Lisp is extremely inefficient).
In other words, Lisp is just a toy language because it does not help real
people solve real problems.
Compared to what? What other programming language do you know that
is as expressive as Lisp, allowing not just canned procedures
manually defined by syntax that is compiled but also ease in building
your own new procedures "on the fly" at runtime, yet which is less
verbose than Lisp for equivalent value? Both C and Java are more
verbose than Lisp. To add two numbers in Lisp, you simply write (+
n1 n2) where n1 and n2 are the two numbers, and enter that directly
into Lisp's REP (Read Eval Print) loop. To do the same thing in C,
you need to write a function called "main" which includes both the
arithmetic operation itself as well as an explicit formatted-print
statement. To do the same thing in Java, you need to define a Class
which contains a method called "main", and then the innards of main
are essentially the same as in C. For either C or Java, you then
need to compile the source, you can't just type it into a REP loop.
And in the case of Java, you can't even directly run the resultant
compiled-Class program, you have to start up the Java Virtual
Machine and have *it* interpret the main function of your compiled
Class.
Here's a more extreme comparison: Suppose you want to hand-code a list
of data to be processed, and then map some function down that list.
In Lisp all you need to do is
(mapcar #'function '(val1 val2 val3 ... val4))
where the vals are the expressions of the data you want processed
and function is whatever function you want applied to each element
of the list. You just enter that into the REP and you're done. Try
to imagine how many lines of code it takes in C to define a STRUCT
for holding a linked-list cell of whatever datatype you want the
function applied to, and a function for allocating a new cell and
linking it to some item of data and to the next cell in the chain,
and then calling that function over and over to add the cells to
the linked list one by one, and then you have to write a function
to map down the list to apply the other function. Or you have to
manually count how many items you want to process, and use an array
instead of a linked list, and manually insert elements into the
array one by one, then allocate another array to hold the results,
and finally you can write
for (i = 0; i < num; i++) res[i] = fun(data[i]);
and then to print out the contents of that array or linked list you
have to write another function. And in Java you have to decide
whether to use vectors or arrays or any of several other collection
classes to hold your data, and then manually call
collection.add(element) over and over for the various elements you
want to add. Then to map the function down the list you need to
create an iterator for that collection and then alternate between
checking whether there any other elements and actually getting the
next element, and create another collection object to hold the
results. Then again just like in C you need to write a function for
reading out the elements in the result collection.
Of course if you're just going to map a function down your list and
immediately discard the internal form of the result, you don't need
a result list/array/collection. I'm assuming in the descriptions
above that you want to actually build a list/array/collection of
results because you want to pass *that* sequence of values to yet
another function later.
Are you complaining about the verbosity of the names of some of the
built-in functions? For example, there's adjust-array alphanumericp
assoc assoc-if char-lessp char-greaterp char-equal char-not-lessp
char-not-equal char-not-greaterp? Would you rather be required to
memorize Unix-style ultra-terse-inscrutable names aa a as ai cl cg
ce cnl cne cng respectively? Do you really think anybody will
understand a program that is written like that?
> 1. Why has Lisp always remained so unpopular when, as you say, it
> is so extensible?
Unlike all other major languages except Java and
HyperCard/HyperTalk and Visual Basic, it doesn't produce native
executables, it produces modules that are loaded into an
"environment". That means people can't just download your compiled
program and run it directly on their machine. They need to first
download the Lisp environment, and *then* your application can be
loaded into it (possibly by a script you provided for them) and
run. Unlike Visual Basic and Java, there's no major company pushing
hard to get people to install the environment. Unlike
HyperCard/HyperTalk, there's no major vendor of operating systems
(MS-Windows, FreeBSD Unix, Linux, Apple/Macintosh) providing Lisp
as part of the delivered operating system. (Linux does "ship" with
GNU Emacs with built-in E-Lisp, but when I say "Lisp" here I mean
Common Lisp. E-Lisp doesn't catch on, despite "shipping" with
Linux, because it simply doesn't have the usefulness of Common Lisp
for a wide variety of tasks *other* than managing a text-editor
with associated utilities such as DIRED and e-mail.) So Lisp has an
uphill battle to compete with MicroSoft and Sun who are pushing
their inferior languages. (HyperTalk/HyperCard died a long time ago
because it wasn't an especially good programming language, was
supplied only on Macintosh computers, and Macintosh lost market
share to MicroSoft's unfair labor practices, resulting in hardly
anybody making major use of it, resulting in Apple no longer
maintaining it to run under newer versions of their operating
system, so that now very few people still are running old versions
of MacOS where HyperCard runs.) Lisp is not yet dead. Common Lisp
is still thriving, even if it hasn't proven its case to the average
customer of MicroSoft Windows sufficiently that said customer would
want to install a Lisp environment so as to be able to run Lisp
applications.
> 2. Why have the implementors of successful modern languages that
> were originally built upon Lisp gone to great lengths to completely
> remove Lisp from their implementations?
-a- Penny wise pound foolish business managers who care only about
next-quarter profit, damn any longterm prospect for software
maintenance.
-b- Not-invented-here syndrome. Companies would rather base their
product line on a new language where they have a monopoly on
implementation rather than an already-existing well
established language where other vendors already provide
adequate implementation which would need to be licensed for use
with your own commercial product line.
-c- Narrow-minded software vision which sees only today's set of
applications that can be provided by a newly-invented-here
language, blind to the wider range of services already
provided by Common Lisp that would support a greatly extended
future set of applications. Then once the company has invested
so heavily in building their own system to duplicate just some
of the features of Common Lisp, when they realize they really
do need more capability, it's too late to switch everything to
Common Lisp, so they spend endless resources crocking one new
feature after another into an ill-conceived system (compared
to Common Lisp), trying desperately to keep up with the needs
of the new applications they too-late realize they'll want.
> Modern language features (e.g. pattern matching over algebraic
> data types) are so difficult to implement that it is a practical
> impossibility to expect ordinary programmers to use Lisp's
> extensibility to make something decent out of it.
Agreed. That's why *one* (1) person or group needs to define
precisely what API-use-cases are required
(see the question I asked earlier today in the
64-parallel-processor thread, and please answer it ASAP)
and what intentional datatypes are needed for them, then implement
those API-use-cases per those intentional datatypes in a nicely
designed and documented package, then make that available at
reasonable cost (or free). I take it you, who realize the need,
aren't competent enough to implement it yourself, right? Why don't
you specify precisely what is needed (API-use-cases) and then ask
me whether I consider myself competent to implement your specs, and
offer to pay me for my work if I accept the task?
> Instead, the vast majority of programmers choose to use less
> extensible languages that already provide most of what they need
> (e.g. a powerful static type system)
That's rubbish!!! Static type declarations/checking only enforces
*internal* data types, not intentional data types on top of them.
So you're in a dilemma:
- Try to hardwire into the language every nitpicking difference in
intentional datatype as an actual *internal* datatype, with no
flexibility for the application programmer to include a new
datatype you didn't happen to think of.
- Hardwire into the language a meta-language capable of fully
expressing every nitpicking nuance of any intentional datatype,
so that application programmers can define their own *internal*
datatypes to express their intentional datatypes. Expect every
application programmer to actually make use of this facility,
always implementing new intentional datatypes as actual
application-programmer-defined *internal* datatypes.
- Don't bother to implement intentional datatypes at all.
Type-check just the internal datatypes that are built-in, and
expect the programmer to do what he does now, namely runtime
type-checking for all intentional type information, thereby
removing any good reason to do compile-time static type-checking
or even declarations in the first place.
Example of internal datatype: 4.7
Example of intentional datatype: 4.7 miles per hour, 4.7 m/s, 4.7
children per average family, 4.7 average percentage failure rate of
latest-technology chips, $4.70 cost of a gallon of gasoline, $4.7
million dollars total expenditure for a forest fire, $4.7 billion
dollars crop losses in the MidWest, $4.7 billion dollars per month
for war in Iraq, etc.
Application where you may need to mix the same internal datatype
with multiple intentions, where the intention is carried around
with the data to avoid confusion: Some engineers are working in
"metric" while others are working in English units, while all
original information must be kept as-is for error-control purposes
rather than automatically converted to common units on input, yet
later in the algorithms conversion to common units must be
performed to interface to various functions/methods which do
processing of the data. This is especially important if some
figures are total money for project while other figures are money
per unit time (month or year) and they need to be compared in some
way.
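For instance, a sketch in C of carrying the intention along with the
value (the names and the conversion factor are illustrative only):

#include <stdio.h>

/* The unit tag travels with the number instead of living only in the
   programmer's head: an intentional type on top of the internal double. */
typedef enum { MPH, MPS } speed_unit;

typedef struct {
    double     value;
    speed_unit unit;   /* original unit preserved for error control */
} speed;

/* Convert at the interface to code that needs common units. */
static double as_mps(speed s)
{
    return s.unit == MPS ? s.value : s.value * 0.44704;
}

int main(void)
{
    speed a = { 4.7, MPH };
    speed b = { 4.7, MPS };
    printf("%.3f m/s vs %.3f m/s\n", as_mps(a), as_mps(b));
    return 0;
}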
> In theory, this problem could be fixed but, in practice, the few
> remaining members of the Lisp community lack the talent to build
> even the most basic infrastructure (e.g. a concurrent GC).
What does that have to do with static type checking????????????
Please write up a Web page that explains:
- What you precisely mean by "concurrent GC" (or find a Web page
that somebody else wrote, such as on WikiPedia, that says
exactly the same as what *you* mean, and provide the URL plus a
brief summary or excerpt of what that other Web page says).
- List several kind of applications and/or API tools that are
hampered by lack of whatever that means.
- Explain how existing GC implementations don't satisfy your
definition of "concurrent GC" and how they specifically are not
sufficient for those kinds of applications you listed.
Then post the URL of your Web page here and/or in the other thread
where I also asked about the API-use-cases that you lament are
missing from Lisp.
> 2. Even though Lisp's forte is as a language laboratory, Lisp has
> so little to offer but costs so much
Um, some implementations of Common Lisp are **free** to download
and then "use to your heart's content". How does that cost too much???
> ... compilers written in Lisp are extremely slow because Lisp is
> extremely inefficient
That's a fucking lie!!
> In other words, Lisp is just a toy language because it does not
> help real people solve real problems.
That's another fucking lie!! I use Lisp on a regular basis to write
applications of practical importance to me, and also to write Web
demos of preliminary ideas for software I offer to write for
others. For example, there's a demo of my flashcard program on the
Web, including both the overall algorithm for optimal chronological
presentation of drill questions
(to get them into your short-term memory then to develop them
toward your medium-term and long-term memory),
and the specific quiz type where you type the answer to a question
(usually a missing word in a sentence or phrase)
and that short-answer quiz-type software coaches you toward a
correct answer and then reports back to the main drill algorithm
whether you needed help or not to get it correct. My Lisp program
on my Macintosh was used to teach two pre-school children how to
read and spell at near-adult level, and later the conversion of it
to run under CGI/Unix allowed me to learn some Spanish and
Mandarin, hampered only by lack of high-quality data to use to
generate Spanish flashcards and lack of anybody who knows Mandarin
and has the patience to let me practice my Mandarin with them. I'd
like to find somebody with money to pay me to develop my program
for whatever the money-person wants people to learn.
That's an incredible understatement, like saying you haven't used
an automobile since college when you found it useful for taking
high-school girls to drive-in movies (but you can't think of any
other use whatsoever for an automobile).
Why don't you admit that Lisp is a general-purpose programming
language that is useful for many different kinds of applications
that require flexible data structures to be crafted on the fly at
runtime? Can't you think of any other application area except A.I.
that might make good use of such data structures?
> I also remember it being difficult to write readable code.
Either you had a bad instructor, or you didn't have any inherent
talent, or both. It's a lot easier to write readable code in Lisp
than in other popular languages such as C or even Java. For
example, here's a simple executable expression in Lisp:
(loop for str in
'("Hello" "world." "Always" "love" "Lisp.")
collect (position #\l str :test #'char-equal))
which returns the list:
(2 3 1 0 0)
Now try to convert that to C so that the result is more "readable".
It has to be an expression which returns a value. It's OK if you
need to write a function which returns a value, then give a line of
code that calls that function as an "expression" that returns the
same value. It's *not* OK for you to write a program that
*prints*out* that syntax of open parens digits and spaces and close
parens but doesn't build any actual list of numeric values to
return to the caller. Go ahead and see if you can write anything
even half as clean as the three lines of Lisp code I displayed
above.
(By the way, after completely composing that three-line
expression, I actually started up CMUCL and copied the three
lines from my message-edit buffer and pasted it into CMUCL, and
it ran correctly the first time, and then I copied the result
from CMUCL and pasted it back into the edit buffer. Try writing
your C equivalent program and getting it exactly right the very
first time you try to compile it, nevermind the original question
of making it "readable".)
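For comparison, here is roughly what the C side of that challenge looks
like (a sketch; it needs a helper function and fills a result array
rather than evaluating to a fresh list, which is much of the point):

#include <stdio.h>
#include <ctype.h>

/* 0-based index of the first case-insensitive occurrence of c in s,
   or -1 if absent (standing in for Lisp's NIL). */
static int position_ci(char c, const char *s)
{
    for (int i = 0; s[i] != '\0'; i++)
        if (tolower((unsigned char)s[i]) == tolower((unsigned char)c))
            return i;
    return -1;
}

int main(void)
{
    const char *strs[] = { "Hello", "world.", "Always", "love", "Lisp." };
    int results[5];
    for (int i = 0; i < 5; i++)
        results[i] = position_ci('l', strs[i]);
    for (int i = 0; i < 5; i++)
        printf("%d ", results[i]);   /* 2 3 1 0 0 */
    putchar('\n');
    return 0;
}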
> Professionally, I'm somewhat handcuffed to C and C++ for reasons
> I mentioned earlier.
I feel sad for your plight. But I presume you're an adult, so
you're responsible for allowing yourself to be abused in that way.
Why don't you organize your fellow workers to stage a strike
against your employer until you are un-handcuffed? Or just file a
complaint with whatever government agency protects employees from
abuse?
> It's also hard to find bright and motivated employees who are
> fluent in Lisp.
That's completely untrue if you mean *potential* employees, people
your company *could* hire if they had any brains. But it's true if
you mean *present* employees of your company. If that's what you're
saying, why don't you convince your company to hire somebody new
instead of recruiting only from their existing employee base?
> I'd just like a language where this idea of compression, or
> refactoring, is the central principle around which the language is
> built.
While I'm totally in favor of constant-refactoring as an
operational principle, I'm not in favor of making the language
itself *require* that way of working. For example, suppose you have
a government contract where every nitpicking detail of the use
cases are specified in the wording of the contract, and you are
legally required to provide *exactly* what the contract requires.
In that case you might be able to design the entire application at
the start and *not* need to do any refactoring during development.
Why should you be *required* by the language design to refactor
without any benefit? IMO it's better to have a language that makes
easy to refactor several times per day without being the central
principle of the language that you just can't escape.
You talk about being handcuffed to C and C++. Being handcuffed to
perpetual refactoring would not be as painful, but still I'd rather
avoid that too. Why do you seem to *want* it?
Common Lisp is the language of choice for enabling frequent
refactoring without absolutely requiring it, without refactoring
being the central principle of the language per se, merely *a*
central principle of the REP which is available whenever you need
it.
> Any language that supports subroutines offers a mechanism for
> this, but I feel the concept could be taken further.
I agree only to a limited degree. Having a generic datatype which
is the default, with runtime dispatching on type specified by the
application programmer, as in Lisp, is a lot better for this
purpose than most languages-with-subroutines which have strict
classification of data types that essentially preclude developing
as if there were a generic data type.
> It's all very nebulous at the moment, but I feel that somehow the
> answer lies in pointers.
That's half true. The other half is automatic garbage collection,
which works only with safe runtime datatypes (as in Lisp, and Java
if you declare variables of type java.lang.Object and use the
'instanceof' operator to dispatch on expected types your code knows
how to process). Unconstrained pointer arithmetic as in C is a
maintainability **disaster**. Look for example at the
buffer-overflow bugs (in code written in C) that allow
viruses/worms to take over people's computers and use them to
replicate themselves and also to disclose confidential information
to criminal organizations in foreign countries.
> At its heart, a subroutine is nothing more than a pointer, in any
> language.
How can I even begin to explain to you how grossly wrong you are???
In Lisp, a subroutine (function) is a first-class-citizen fully
self-contained data object. I'm not sure if Java goes that far, but
clearly in Java a subroutine (method) is likewise some kind of
self-contained data object. In neither case is it just a pointer to
some random spot in RAM as you claim. Only in really stupid cruddy
languages like C is a subroutine just a block of code with a
pointer to (the first byte of) it.
> A compiler takes the code inside the routine and stuffs it into
> memory somewhere. To call that function, you just dereference the
> pointer to "jump" to that code.
Only in the really stupid cruddy languages such as C you're
familiar with. In a decent language, there are stack frames, that
are formal structures, which are carefully controlled during call
of a function and return from a function as well as non-local
return (throw/catch and error restarts).
Even in C and assembly language, you don't JUMP to a subroutine,
you JSR to a subroutine, which automatically puts the return
address within the calling program onto the stack (or into a
machine register on some CPUs).
> An object in an OO language is accomplished using a pointer.
> Each class has a function table, and each object has a pointer to
> that function table. Here, the compression is happening at the
> compiler level. Since every object of a given class has the same
> code, it can all be moved to a common place, then each object need
> only store a pointer to it. Further, code in a base class need only
> exist there, with each child class pointing to that table.
OK, that paragraph is actually mostly correct and nicely explained.
> But these are hard-coded abstractions built into compilers. C++
> or Java tell you, "This is how inheritance works... This is the
> order in which constructors are called..." Somehow, I'd like for
> the language itself to make the mechanism directly accessible to
> the programmer.
That's pretty easy in Lisp. Are you familiar with the 'typecase'
macro? You are free not to use the built-in CLOS inheritance at
all, but instead to use CLOS classes only to tag objects for
purpose of dispatching via the 'typecase' macro. You have the
choice whether to have it all nice and automatic via inheritance,
or explicitly under control of the programmer via 'typecase'
dispatching. There are arguments pro and con each way of doing it.
Within the past week or so somebody posted a major complaint about
OOP that with a deeply nested class inheritance hierarchy it's nigh
impossible for anyone to figure out why a particular method defined
in some distantly-related class is being called when a generic
function is passed some particular object of some particular class.
(In Java, read "generic function" as "method name", which is part
of the "method signature". Quoting from Liang's textbook (ISBN
0-13-100225-2) page 119: "The <i>parameter profile</i> refers to
the type, order, and number of parameters of a method. The method
name and the parameter profiles together constitute the <i>method
signature</i>." I like the way that author expresses concepts in Java.)
> If his or her programming style then naturally tends to evolve
> into an OO sort of structure, wonderful. If not, then maybe a
> sort of table-driven type of "engine" architecture would emerge.
> That just happens to be what I generally find most powerful.
Do you agree that Common Lisp offers the best availability of
features to support not just standard OO design (making heavy use
of inheritance) but also other variants that you seem to prefer
sometimes?
> I suppose you could just take a huge step backwards and write
> pure assembly language.
NO NO A THOUSAND TIMES NO!!!
Try Common Lisp, really, try it, for J.Ordinary applications of the
day. Whatever new application you want to write this coming Monday,
try writing it with Common Lisp, and tell me what difficulties (if
any) you experience.
Hey, if you are unwilling to swallow crow, then write a
machine-language emulator in Common Lisp, and then try to write
applications in that emulated machine language. Make sure your
emulator has debugging features a million times better than what
you'd have a bare machine you're trying to program in machine
language. While you're at it, put your emulator up as a CGI service
so that the rest of us can play with it too.
Suggestion: Emulate IBM 1620 machine language.
16 00010 00000
That's your first test program, just a single instruction.
> Then all you really have is code, data, and pointers. You're
> free to use them however you like. But I strongly believe there is
> a way to still offer this freedom, while also offering a great deal
> of convenience, readability, and maintainability by way of a
> high-level language.
If Lisp were Maynard G. Krebs
<http://www.fortunecity.com/meltingpot/lawrence/153/krebs.html>
<http://althouse.blogspot.com/2005/09/maynard-to-god-you-rang.html>
Lisp would jump in right there and say "YOU RANG?"
Haskell, SML, OCaml, Mathematica, F# and Scala all allow real problems to be
solved much more concisely than with Lisp. Indeed, I think it is difficult
to imagine even a single example where Lisp is competitively concise.
> Both C and Java are more
> verbose than Lisp. To add two numbers in Lisp, you simply write (+
> n1 n2) where n1 and n2 are the two numbers, and enter that directly
> into Lisp's REP (Read Eval Print) loop.
Yes. Consider the trivial example of defining a curried "quadratic"
function. In Common Lisp:
(defun quadratic (a) (lambda (b) (lambda (c) (lambda (x)
(+ (* a x x) (* b x) c)))))
In F#:
let inline quadratic a b c x = a*x*x + b*x + c
> To do the same thing in C,
> you need to write a function called "main" which includes both the
> arithmetic operation itself as well as an explicit formatted-print
> statement. To do the same thing in Java, you need to define a Class
> which contains a method called "main", and then the innards of main
> are essentially the same as in C. For either C or Java, you then
> need to compile the source, you can't just type it into a REP loop.
> And in the case of Java, you can't even directly run the resultant
> compiled-Class program, you have to start up the Java Virtual
> Machine and have *it* interpret the main function of your compiled
> Class.
Forget C and Java.
> Here's a more extreme comparison: Suppose you want to hand-code a list
> of data to be processed, and then map some function down that list.
> In Lisp all you need to do is
> (mapcar #'function '(val1 val2 val3 ... val4))
> where the vals are the expressions of the data you want processed
> and function is whatever function you want applied to each element
> of the list. You just enter that into the REP and you're done.
Lisp:
(mapcar #'function '(val1 val2 val3 ... val4))
OCaml and F#:
map f [v1; v2; v3; ...; vn]
Mathematica:
f /@ {v1, v2, v3, ..., vn}
> Try
> to imagine how many lines of code it takes in C to define a STRUCT
> for holding a linked-list cell of whatever datatype you want the
> function applied to, and a function for allocating a new cell and
> linking it to some item of data and to the next cell in the chain,
> and then calling that function over and over to add the cells to
> the linked list one by one, and then you have to write a function
> to map down the list to apply the other function. Or you have to
> manually count how many items you want to process, and use an array
> instead of a linked list, and manually insert elements into the
> array one by one, then allocate another array to hold the results,
> and finally you can write
> for (i = 0; i < num; i++) res[i] = fun(data[i]);
> and then to print out the contents of that array or linked list you
> have to write another function. And in Java you have to decide
> whether to use vectors or arrays or any of several other collection
> classes to hold your data, and then manually call
> collection.add(element) over and over for the various elements you
> want to add. Then to map the function down the list you need to
> create an iterator for that collection and then alternate between
> checking whether there any other elements and actually getting the
> next element, and create another collection object to hold the
> results. Then again just like in C you need to write a function for
> reading out the elements in the result collection.
Forget C and Java. Compare with modern alternatives.
> Of course if you're just going to map a function down your list and
> immediately discard the internal form of the result, you don't need
> a result list/array/collection. I'm assuming in the descriptions
> above that you want to actually build a list/array/collection of
> results because you want to pass *that* sequence of values to yet
> another function later.
>
> Are you complaining about the verbosity of the names of some of the
> built-in functions? For example, there's adjust-array alphanumericp
> assoc assoc-if char-lessp char-greaterp char-equal char-not-lessp
> char-not-equal char-not-greaterp? Would you rather be required to
> memorize Unix-style ultra-terse-inscrutable names aa a as ai cl cg
> ce cnl cne cng respectively? Do you really think anybody will
> understand a program that is written like that?
Far more programmers use modern functional languages like Haskell, OCaml and
F# than Lisp now. They clearly do not have a problem with the supreme
brevity of these languages.
Look at the intersection routines from my ray tracer benchmark, for example.
In OCaml:
let rec intersect orig dir (lam, _ as hit) (center, radius, scene) =
let lam' = ray_sphere orig dir center radius in
if lam' >= lam then hit else
match scene with
| [] -> lam', unitise(orig +| lam' *| dir -| center)
| scene -> List.fold_left (intersect orig dir) hit scene
and in Lisp:
(defun intersect (orig dir scene)
(labels ((aux (lam normal scene)
(let* ((center (sphere-center scene))
(lamt (ray-sphere orig
dir
center
(sphere-radius scene))))
(if (>= lamt lam)
(values lam normal)
(etypecase scene
(group
(dolist (kid (group-children scene))
(setf (values lam normal)
(aux lam normal kid)))
(values lam normal))
(sphere
(values lamt (unitise
(-v (+v orig (*v lamt dir))
center)))))))))
(aux infinity zero scene)))
The comparative brevity of the OCaml stems almost entirely from pattern
matching.
>> 1. Why has Lisp always remained so unpopular when, as you say, it
>> is so extensible?
>
> Unlike all other major languages except Java and
> HyperCard/HyperTalk and Visual Basic, it doesn't produce native
> executables, it produces modules that are loaded into an
> "environment". That means people can't just download your compiled
> program and run it directly on their machine.
IIRC, some commercial Lisps allow standalone executables to be generated but
they are still not popular.
> They need to first
> download the Lisp environment, and *then* your application can be
> loaded into it (possibly by a script you provided for them) and
> run. Unlike Visual Basic and Java, there's no major company pushing
> hard to get people to install the environment. Unlike
> HyperCard/HyperTalk, there's no major vendor of operating systems
> (MS-Windows, FreeBSD Unix, Linux, Apple/Macintosh) providing Lisp
> as part of the delivered operating system. (Linux does "ship" with
> GNU Emacs with built-in E-Lisp, but when I say "Lisp" here I mean
> Common Lisp. E-Lisp doesn't catch on, despite "shipping" with
> Linux, because it simply doesn't have the usefulness of Common Lisp
> for a wide variety of tasks *other* than managing a text-editor
> with associated utilities such as DIRED and e-mail.) So Lisp has an
> uphill battle to compete with MicroSoft and Sun who are pushing
> their inferior languages. (HyperTalk/HyperCard died a long time ago
> because it wasn't an especially good programming language, was
> supplied only on Macintosh computers, and Macintosh lost market
> share to MicroSoft's unfair labor practices, resulting in hardly
> anybody making major use of it, resulting in Apple no longer
> maintaining it to run under newer versions of their operating
> system, so that now very few people still are running old versions
> of MacOS where HyperCard runs.) Lisp is not yet dead. Common Lisp is still
> thriving...
Lisp is not "thriving" by any stretch of the imagination. According to
Google Trends (which measures the proportion of searches for given search
terms) "Common Lisp" has literally almost fallen off the chart:
http://www.google.com/trends?q=common+lisp
>> 2. Why have the implementors of successful modern languages that
>> were originally built upon Lisp gone to great lengths to completely
>> remove Lisp from their implementations?
>
> -a- Penny wise pound foolish business managers who care only about
> next-quarter profit, damn any longterm prospect for software
> maintenance.
> -b- Not-invented-here syndrome. Companies would rather base their
> product line on a new language where they have a monopoly on
> implementation rather than an already-existing well
> established language where other vendors already provide
> adequate implementation which would need to be licensed for use
> with your own commercial product line.
> -c- Narrow-minded software vision which sees only today's set of
> applications that can be provided by a newly-invented-here
> language, blind to the wider range of services already
> provided by Common Lisp that would support a greatly extended
> future set of applications. Then once the company has invested
> so heavily in building their own system to duplicate just some
> of the features of Common Lisp, when they realize they really
> do need more capability, it's too late to switch everything to
> Common Lisp, so they spend endless resources crocking one new
> feature after another into an ill-conceived system (compared
> to Common Lisp), trying desperately to keep up with the needs
> of the new applications they too-late realize they'll want.
I have never heard of a single user of a modern FPL regretting not choosing
Lisp. Can you refer me to any such people?
>> Modern language features (e.g. pattern matching over algebraic
>> data types) are so difficult to implement that it is a practical
>> impossibility to expect ordinary programmers to use Lisp's
>> extensibility to make something decent out of it.
>
> Agreed. That's why *one* (1) person or group needs to define
> precisely what API-use-cases are required
> (see the question I asked earlier today in the
> 64-parallel-processor thread, and please answer it ASAP)
> and what intentional datatypes are needed for them, then implement
> those API-use-cases per those intentional datatypes in a nicely
> designed and documented package, then make that available at
> reasonable cost (or free). I take it you, who realize the need,
> aren't competent enough to implement it yourself, right?
You are grossly underestimating the amount of work involved. It would
literally take me decades of full time work to catch up with modern
functional language implementations in terms of features and the result
could never be competitively performant as long as it was built upon Lisp.
Finally, I don't believe I could ever build a commercial market as
successful as F# already is so, even if I did ever start doing this, it
would be as a hobby and not for profit.
Note that this is precisely why the developers of all successful modern
functional language implementations do not build them upon Lisp.
> Why don't
> you specify precisely what is needed (API-use-cases) and then ask
> me whether I consider myself competent to implement your specs, and
> offer to pay me for my work if I accept the task?
Because it would be a complete waste of my time and money because Lisp
offers nothing of benefit whatsoever to me, my company or our customers.
Moreover, Lisp does not even have a commercially viable market for our kind
of software.
Lisp is literally at the opposite end of the spectrum from where we want to
be. We need to combine the performance of Fortran with the expressiveness
of Mathematica (which F# almost does!) but Lisp combines the performance of
Mathematica with the expressiveness of Fortran.
>> In theory, this problem could be fixed but, in practice, the few
>> remaining members of the Lisp community lack the talent to build
>> even the most basic infrastructure (e.g. a concurrent GC).
>
> What does that have to do with static type checking????????????
That has nothing to do with static typing. I was listing some of Lisp's most
practically-important deficiencies. Lack of static typing in the language is
one. Lack of a concurrent GC in all Lisp implementations is another. The
lack of threads, weak references, finalizers, asynchronous computations,
memory-overflow recovery, tail-call optimization, call/cc, etc. are all
fundamental deficiencies of the language.
> Please write up a Web page that explains:
> - What you precisely mean by "concurrent GC" (or find a Web page
> that somebody else wrote, such as on WikiPedia, that says
> exactly the same as what *you* mean, and provide the URL plus a
> brief summary or excerpt of what that other Web page says).
See Jones and Lins "Garbage Collection: algorithms for automatic dynamic
memory management" chapter 8.
In a serial GC, the program often (e.g. at every backward branch) calls into
the GC to have some collection done. Collections are typically done in
small pieces (incrementally) to facilitate soft-real time applications but
can only use a single core. For example, OCaml has a serial GC so OCaml
programs wishing to use multiple cores fork multiple processes and
communicate between them using message passing which is two orders of
magnitude slower than necessary, largely because it incurs huge amounts of
copying that is not necessary on a shared memory machine:
http://caml.inria.fr/pub/ml-archives/caml-list/2008/05/6ba948d84934b1e61875687961706f61.en.html
In a parallel GC, the program occasionally (e.g. when a minor heap is
exhausted) suspends all program threads and begins a parallel traversal of
the heap using all available cores. This allows programs (even serial
programs) to benefit from multiple cores but it has poor incrementality (so
it is unsuitable for soft real-time applications) and scales badly. For
example, the GHC implementation of Haskell recently acquired a parallel GC
which can improve Haskell's performance on <4 cores but (according to the
authors) can degrade performance with more cores because the cost of
suspending many threads becomes the bottleneck.
With a concurrent GC, the garbage collector's threads run concurrently with
the program threads without globally suspending all program threads during
collection. This is scalable and can be efficient but it is incredibly
difficult to implement correctly. The OCaml team spent a decade trying to
implement a concurrent GC and never managed to get it working, let alone
efficient.
> - List several kind of applications and/or API tools that are
> hampered by lack of whatever that means.
Any software that requires fine-grained parallelism for performance will be
hampered by the lack of a concurrent GC.
>> 2. Even though Lisp's forte is as a language laboratory, Lisp has
>> so little to offer but costs so much
>
> Um, some implementations of Common Lisp are **free** to download
> and then "use to your heart's content". How does that cost too much???
Development costs are astronomical in Lisp compared to modern alternatives
like F#, largely because it lacks a static type system but also because it
lacks language features like pattern matching, decent developer tools like
IDEs and libraries like Windows Presentation Foundation (WPF).
For example, adding the new 3D surface plotting functionality to our F# for
Visualization product took me four days of full time work even though I am
new to WPF:
http://www.ffconsultancy.com/products/fsharp_for_visualization/?clp
The result will run reliably on hundreds of millions of computers (you just
need Windows and .NET 3.0 or better). Developing it into a standalone
Windows application will be trivial, if I choose to do so.
Contrast that with Lisp. There are no decent Lisp implementations for .NET.
Microsoft certainly aren't using or advocating Lisp. So you can immediately
kiss goodbye to easy multicore support and all of Microsoft's latest
libraries and tools. You'll be developing your GUIs without the aid of an
interactive GUI designer and you'll be using a low-level graphics API like
DirectX or OpenGL for visualization. You are looking at several times as
much effort to get something comparable and, even then, it will never be as
reliable (because Microsoft have invested billions in making WPF and .NET
reliable).
Finally, there is no way you'll ever turn a profit because the market for
commercial third-party software for Lisp is too small.
>> In other words, Lisp is just a toy language because it does not
>> help real people solve real problems.
>
> ...I use Lisp on a regular basis to write
> applications of practical importance to me, and also to write Web
> demos of preliminary ideas for software I offer to write for
> others. For example, there's a demo of my flashcard program on the
> Web, including both the overall algorithm for optimal chronological
> presentation of drill questions
> (to get them into your short-term memory then to develop them
> toward your medium-term and long-term memory),
> and the specific quiz type where you type the answer to a question
> (usually a missing word in a sentence or phrase)
> and that short-answer quiz-type software coaches you toward a
> correct answer and then reports back to the main drill algorithm
> whether you needed help or not to get it correct. My Lisp program
> on my Macintosh was used to teach two pre-school children how to
> read and spell at near-adult level, and later the conversion of it
> to run under CGI/Unix allowed me to learn some Spanish and
> Mandarin, hampered only by lack of high-quality data to use to
> generate Spanish flashcards and lack of anybody who knows Mandarin
> and has the patience to let me practice my Mandarin with them. I'd
> like to find somebody with money to pay me to develop my program
> for whatever the money-person wants people to learn.
I think you should aspire to earn money directly from customers rather than
asking people to give you money to develop your software.
Some of those are such commonly useful constructs that it seems
ill-conceived to have each application programmer re-invent the
wheels. Are there standard packages available to provide these as
"givens" with a well-documented API so that different application
programmers can read each other's code?
Now the ability to define *variants* of those common operators
which do additional/different tasks would be useful. Obviously if
the standard operator can be defined, a variant can be defined,
right?
> - Abstract data types (array, hash, bitset, ... )
Again, these are such basic container types that they really ought
to be provided in a standard package. Are they? Again, the ability
to define variants on the standard implementation would be useful.
But the ability to mix-in the variation within the standard
definition without needing to start from scratch to re-invent the
wheel would be even better. Is that possible? Do the standard
definitions have hooks for adding variant functionality, sort of
the way Common Lisp has hooks for what to do when an error occurs
or an unbound variable is referenced or a garbage-collect happens
etc.? For example, is it possible to use a built-in (standard
package) definition of hash table but change the pre-hash function
(the pseudo-random function from key to large integer) to replace
the standard pre-hash function?
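(In standard Common Lisp the answer is: not portably. MAKE-HASH-TABLE
accepts only the standardized :TEST functions, though some
implementations do provide the hook directly; SBCL's
SB-EXT:DEFINE-HASH-TABLE-TEST, for instance, pairs an equality
predicate with a user-supplied hash function. A portable sketch of the
idea, with all names hypothetical, is to interpose your own pre-hash
in front of a standard table:

(defstruct pre-hashed-table
  (pre-hash #'sxhash)             ; key -> large integer, user-replaceable
  (test     #'equal)              ; equality among colliding keys
  (table    (make-hash-table)))   ; buckets keyed by pre-hash value

(defun pht-get (pht key)
  "Look KEY up via the user-supplied pre-hash function."
  (cdr (assoc key
              (gethash (funcall (pre-hashed-table-pre-hash pht) key)
                       (pre-hashed-table-table pht))
              :test (pre-hashed-table-test pht))))

(defun pht-put (pht key value)
  "Add KEY->VALUE. A sketch: does not replace an existing entry."
  (push (cons key value)
        (gethash (funcall (pre-hashed-table-pre-hash pht) key)
                 (pre-hashed-table-table pht)))
  value)

(let ((pht (make-pre-hashed-table
            :pre-hash (lambda (k) (mod (sxhash k) 1000)))))
  (pht-put pht "color" :blue)
  (pht-get pht "color"))   ; => :BLUE
)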
> Seed7 Homepage: http://seed7.sourceforge.net
] It is a higher level language compared to Ada, C/C++ and Java.
Um, C is not exactly a high-level language.
Comparing your language to both C and Java in the same sentence is
almost an oxymoron.
] The compiler compiles Seed7 programs to C programs which are
] subsequently compiled to machine code.
Ugh! So you're using C almost as if it were an assembly language,
which is probably appropriate for it *not* being a high-level
language itself, being in reality a "syntax-sugared assembly
language". OK, you win, I withdraw my complaint, if you stipulate
that your earlier statement really was self-contradictory.
If you don't allow that C is nothing more than a sugar-coated
assembly language, if you insist it's really a high-level language,
then the Seed7 compiler doesn't need to really do anything, just do
a syntax transformation from one high-level language to another, in
which case you might do better to syntax-translate to Common Lisp,
maybe just emulate the Seed7 syntax within CL as if it were a DSL
(Domain-Specific Language).
] Functions with type results and type parameters are more elegant than
] a template or generics concept.
Since a template concept is dumb to begin with, saying you're
better than that is not a really good advertising point, like
saying you as a person have higher moral standards than Adolf
Hitler or Pol Pot or George W. Bush.
I don't know what you mean by a "generics" concept. Is that like
the tagging of data object to identify their respective data types
at run-time that Lisp has? Or something completely different?
Please define what precisely you mean by that (I assume you're the
author of that Web page).
] * Types are first class objects
In a sense that's also true in Common Lisp:
(class-of 5)
=> #<BUILT-IN-CLASS FIXNUM (sealed) {50416FD}>
(class-of (expt 3 99))
=> #<BUILT-IN-CLASS BIGNUM (sealed) {5044695}>
(class-of (make-hash-table))
=> #<STRUCTURE-CLASS HASH-TABLE {500D32D}>
(defclass foo () ())
=> #<STANDARD-CLASS FOO {90285FD}>
Is that the kind of first-class type-objects that you are talking about?
] (Templates and generics can be defined easily without special syntax).
What precisely do you mean by "templates"?
What precisely do you mean by "generics"?
Suggestion: On a Web page where you throw around terms like this,
each such mention of a jargon-term should actually be
<a href="urlWhereTheTermIsDefined">theTerm</a>
That's the nice thing about WebPages (and DynaBooks, if they ever
existed), that there can be links from jargon to definitions, not
possible in hardcopy printed books (footnotes for every such term
would be a royal pain by comparison with HREF anchors).
] * User defined statements and operators.
Why make a distinction between statements and expressions-with-operators??
IMO it's a royal pain to have to deal with.
Lisp does it right, having every expression usable *both* as
statement or operator, or even as both simultaneously
(perform side-effect and also return a value, for example SETF
which stores the right-side into the left-side place but *also*
returns that value to the caller, so that SETFs can be nested to
store the same value in more than one place, and IF and CASE which
select which of several alternative blocks of code to execute and
*also* return the value from the block that was executed).
C does it wrong, requiring different syntax for IF statements and
?: expressions which return a value.
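A two-line sketch of the difference:

(defvar a) (defvar b) (defvar n -3) (defvar sign)
(setf a (setf b 42))        ; SETF returns the stored value, so it
                            ; nests: 42 ends up in both A and B
(setf sign (if (minusp n)   ; IF is an expression; the value of the
               -1           ; chosen branch is returned and stored
               +1))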
] * Static type checking and no automatic casts.
There's hardly any value to static type checking, compared to
dynamic (runtime) type-checking/dispatching. See another article I
posted late Friday night
<http://groups.google.com/group/comp.programming/msg/5cea7d186eddfd42>
= Message-ID: <rem-2008...@yahoo.com>
(skip down to where I used the word "dilemma", page 10 on VT100
lynx, appx. 5 screens into the article on full-screen browser)
where I explained why static type checking fails to solve the
problem it claims to solve hence is worthless to include in a
programming language.
By "no automatic casts", do you mean that you can't even have a
literal that is generic to cast to short-integer or long-integer in
an assignmet, so everytime you set a variable to a literal value
you must explitly tag the literal with the appropriate word length
(or even worse, explictly cast it from literal type to whatever
type the variable happens to be today)??
] * exception handling
How does your service compare with what Common Lisp and Java provide?
Do you provide a default break loop for all uncaught exceptions, as
Lisp does, or do you do what Java does, ABEND/BACKTRACE whenever
an uncaught unchecked-exception occurs (and *require* a compile-time
exception handler for each and every checked-exception)?
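For comparison, the Common Lisp side of that question looks roughly
like this (a minimal sketch):

(defun risky () (error "something went wrong"))
;; (risky) with no handler => interactive break loop with restarts,
;; from which you can fix the problem and continue.

;; Handlers are dynamic and optional, never demanded at compile time:
(handler-case (risky)
  (error (c) (format t "caught: ~A~%" c)))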
] * overloading of procedures/functions/operators/statements
So that's basically the same as what C++ and Java do (and Common
Lisp generic functions do even better)?
] * Runs under linux, various unix versions and windows.
Does it run on FreeBSD Unix? (That's what my shell account is on.)
Why doesn't it run on MacOS 6 or 7? (Just curious, since my Mac
doesn't have a decent C compiler. I have Sesame C, but it's only a
crude subset, no structs or even arrays, not even malloc. It *does*
have machine-language hooks, whereby you can embed hexadecimal
codes inline, which I used to implement a crude form of malloc via
system/OS traps to allocate a huge block of RAM and then my own code
to break it into pieces to return to callers of myMalloc!!)
<http://seed7.sourceforge.net/faq.htm#new_language>
] Why a new programming language?
] Because Seed7 has several features which are not found in other
] programming languages:
] * The possibility to declare new statements (syntactical and
] semantically) in the same way as functions are declared
Are you really, really sure that's a good idea? Why is it
even necessary in most cases, compared to Lisp's system of keeping the
bottom-level syntax OpenParens OperatorName Arg1 Arg2 ... Argn
CloseParens but allowing various OperatorNames to cause the
apparent Args to be interpreted any way you like?
Problems with defining new statement-level operators:
- It makes one person's code unreadable by anyone else. Not just
that they don't know what the semantics do, another person can't
even parse the new syntax you've invented.
- It kills any chance of having a smart editor, such as Emacs,
automatically deal with sub-expressions, such as copy/cut/paste
entire sub-expressions, skip forward/backward by sub-expressions,
etc., unless you take the extra pain of reconfiguring Emacs to
know about all the new syntaxes you've invented for your Seed7
sourcecode.
- It's already a royal pain in the first place to need to keep a
copy of the top half of page 49 of K&R posted for constant
reference when deciding whether parentheses are really necessary
to provide the desired sequence of sub-expression combination.
It would be an order of magnitude more pain to need to keep a
listing of operator precedence for every new operator invented by
every programmer in a large software project, and know which page
to refer to when looking at each person's code, and not get all
confused when trying to correlate two different pieces of code
written by different people which use different operator
definitions.
Please reconsider your decision to use operators in the first place
for anything except arithmetic expressions.
Please consider going back to square one in your syntax design, and
using Lisp notation for everything except arithmetic (with some
sort of syntactic marker to tell when arithmetic mode starts and
ends, i.e. to wrap an arithmetic-syntax expression within an
otherwise s-expression syntax).
Heck, consider scrapping the "new language from scratch, except C
as post-processor to compiler" idea entirely, instead just use a
reader macro within Common Lisp to nest an arithmetic-syntax
expression within an s-expression. Maybe something like this:
(let* ((origHeight (get-height myDoorway))
(aspectRatio (get-aspect-ratio standardDoor))
(scaledWidth #[origHeight*aspectRatio])) ;Sub-expression in math syntax
(set-width myDoorway scaledWidth)
(make-new-door :height origHeight :width scaledWidth))
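That #[...] syntax doesn't exist in standard Common Lisp, but
installing it is routine. A minimal sketch, handling just one binary
operator and no precedence (a real version would want a
precedence-climbing infix parser):

(defun read-infix (stream subchar arg)
  (declare (ignore subchar arg))
  (let* ((text (coerce (loop for ch = (read-char stream t nil t)
                             until (char= ch #\])
                             collect ch)
                       'string))
         (pos (position-if (lambda (c) (find c "+-*/")) text)))
    (unless pos
      (error "No infix operator in #[~A]" text))
    (list (intern (string (char text pos)))
          (read-from-string (subseq text 0 pos))
          (read-from-string (subseq text (1+ pos))))))

(set-dispatch-macro-character #\# #\[ #'read-infix)
;; Now #[origHeight*aspectRatio] reads as (* ORIGHEIGHT ASPECTRATIO).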
Does your current implementation even provide anything like LET* in
the first place?
Some/most/all of the other features you claim aren't available in
other languages are in fact already available in Common Lisp.
<http://seed7.sourceforge.net/faq.htm#bytecode>
] Can I use something and declare it later?
] No, everything must be declared before it is used. The possibility to
] declare new statements and new operators on one side and the static
] typing requirements with compile time checks of the parameters on the
] other side would make the job of analyzing expressions with undeclared
] functions very complex.
This is a killer for top-down design+debugging that sometimes is
useful. Better to stick to a fixed syntax, where something doesn't
need to be defined before code that calls it can be set up. Then
when an undefined-function exception throws you into the break
package, *then* you can supply the missing function definition and
proceed from the break as if nothing were wrong in the first place.
] Forward declarations help, if something needs to be used before it can
] be declared fully.
In practice they are a royal pain, both the necessity of doing them
all before you can even write the code that calls the undefined
functions, and then the maintenance problem if you change the
number of parameters to a function and therefore need to find all
the places in your code where you declared the old parameters and
now need to re-declare all of those and re-do everything that
follows. It's a totally royal pain to have to deal with!!
] With static type checking all type checks are performed during
] compile-time. Typing errors can be caught earlier without the need to
] execute the program. This increases the reliability of the program.
Bullshit. Utter bullshit!!!
See what I wrote (see URL/MessageID of article earlier above) about
the failure of static type checking to deal with intentional types
regardless of whether you try to define every intentional type as
an explicitly declared static type or not. Then answer my point,
either by admitting you were totally mistaken in your grandiose
claim about static type checking, or by explaining how Seed7 is
able to completely solve the problem.
] Can functions have variable parameter lists?
] No, because functions with variable parameter lists as the C printf
] function have some problems:
] * Type checking is only possible at run time.
That's untrue.
Variable args of the same type can be all checked by mapping the
checker down the list of formal arguments in the syntax.
Lisp doesn't provide this, but an extension to DEFUN/LAMBDA could do this:
(defun foo (i1(integer) f2(single-float) &nary cxs(complex)) ...)
(foo 5 4.2 #C(2 5) #C(1 4) #C(6 9)) ;OK
(foo 5 4.2 #C(2 5) 3.7 #C(6 9)) ;syntax error, 3.7 not COMPLEX
Keyword args can be checked to make sure the only keywords actually
used are in fact defined as available by the function definition.
Thus: (defun foo (a1 a2 &key k1 k2) ...)
(foo 5 7 :k2 42 :k3 99) ;syntax error, keyword K3 not allowed by FOO
;suggestion: K1 is allowed, maybe you meant that?
] Although functions can return arbitrary complex values (e.g. arrays of
] structures with string elements) the memory allocated for all
] intermediate results is freed automatically without the help of a
] garbage collector.
How?????
Debug-use case: An application is started. A semantic error (file
missing for example) throws user into break package. User fixes the
problem, but saves a pointer to some structure in a global for
later study. User continues from the break package. There are now
two pointers to the structure, the one the compiler provided on the
stack, which goes away when some intermediate-level function (above
the break) returns, and the global one set up by the user from the
break package. How does the return-from-function mechanism know
that the structure should *not* be freed? Later, when the user
changes that global to point somewhere else, and there are no
longer any references to that structure, how does the
assign-new-value-to-global mechanism know that the *old* value of
that global can *now* finally be freed?
Reference counts don't work if you allow circular pointer structures:
(setq foo (list 1 2 3))
=> (1 2 3)
(setf (cdddr foo) (cdr foo))
=> #1=(2 3 . #1#)
foo
=> (1 . #1=(2 3 . #1#))
(setq foo nil)
;It's easy to see that the CONS cell pointing at 1 can be freed,
; because it no longer has any references.
;But the cells pointing to 2 and 3 have CDR pointers to each other,
; so how does your system know that those cells can also be freed
; without a garbage collector to verify no other references except
; those circular references exist anywhere within the runtime environment??
] of all container classes. Abstract data types provide a better and
] type save solution for containers ...
"save" should be "safe" (a typo, the first typo I've found so far; your English is good!)
] What is multiple dispatch?
] Multiple dispatch means that a function or method is connected to more
] than one type. The decision which method is called at runtime is done
] based on more than one of its arguments. The classic object
] orientation is a special case where a method is connected to one class
] and the dispatch decision is done based on the type of the 'self' or
] 'this' parameter. The classic object orientation is a single dispatch
] system.
What you've implemented sounds the same as generic functions in Common Lisp.
But having *any* runtime dispatching based on actual type of an
object defeats your <coughCough>wonderful</coughCough> static type
checking, since if three sub-types inherit from one parent type,
but only two of them implement a particular method, especially if
there are multiple parameter-type dispatching with multiple options
for each parameter and not *all* combinations of parameter types
are provided, then it's possible for the compiler to accept a call
involving parent-type declared parameters which at runtime steps
into one of the combinations of subtypes that isn't defined.
Example, in case my English wasn't clear:
Define class table with subtypes endtable dinnertable coffeetable and bedstand.
Define class room with subtypes livingroom bedroom kitchen and bathroom.
Declare generic function arrangeTableInRoom, and define these
specific cases of parameters to it:
endtable,livingroom
dinnertable,kitchen
bedstand,bedroom
endtable,bathroom
Declare variable t1 of class table.
Declare variable r1 of class room.
Assign t1 an object of sub-type bedstand.
Assign r1 an object of sub-type kitchen.
Call arrangeTableInRoom(t1,r1) ;Compiles fine, but causes runtime exception,
; because that specific method is not defined.
;Static type checking fails to detect this type-mismatch undefined-method error.
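Transcribed into Common Lisp generic functions, the point is that
nothing flags the missing combination until run time:

(defclass table () ())  (defclass endtable (table) ())
(defclass bedstand (table) ())
(defclass room () ())   (defclass kitchen (room) ())
(defclass bedroom (room) ())

(defgeneric arrange-table-in-room (tbl rm))
(defmethod arrange-table-in-room ((tbl bedstand) (rm bedroom))
  (format t "bedstand goes beside the bed~%"))
;; ... methods for the other three defined combinations ...

;; Compiles without complaint, signals NO-APPLICABLE-METHOD at run time:
(arrange-table-in-room (make-instance 'bedstand)
                       (make-instance 'kitchen))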
] As in C++, Java, C# and other hybrid object oriented languages there
] are predefined primitive types in Seed7. These are integer, char,
] boolean, string, float, rational, time and others.
What is the precise meaning of type 'integer'? Is it 16-bit signed
integer, or 32-bit signed integer, or 64-bit signed integer, or
unlimited-size signed integer? If this is defined elsewhere, you
should have <a href="urlWhereDefined">integer</a> here.
What is the precise meaning of type 'char'? Is it US-ASCII 7-bit
character, or Latin-1 8-bit character, or UniCode-subset 16-bit
codepoint, or full UniCode 21-bit (embedded in 24-bit or 32-bit
machine word) codepoint, or what?? Ditto need href.
What is the precise meaning of type 'string', both in terms of
possible number of characters within a string, and what each
character is.
What is the precise meaning of type 'float'? Is it IEEE 754 single
precision, IEEE 754 double precision, IEEE 754 single-extended
precision, IEEE 754 double-extended precision, or some form of IEEE
854-1987, or any of those revised in 2008.Jun (this very month!!),
or something else?
] Variables with object types contain references to object values. This
] means that after
] a := b
] the variable 'a' refers to the same object as variable 'b'. Therefore
] changes on variable 'a' will effect variable 'b' as well (and vice
] versa) because both variables refer to the same object.
That is not worded well. There are two kinds of changes to variable
*a*, one which changes that variable itself to point to a different
object, and one which doesn't change *a* itself at all but instead
performs internal modification (what some other poster referred to
as "surgery", applied in his case to changing CAR or CDR of a CONS
cell, but the term could equally apply to *any* internal
modification of an object) upon whatever object *a* currently
points to.
If it's true that change in variable *a* itself by reassignment is
*not* passed to variable *b*, but "surgery" on the object that both
*a* and *b* point to *does* cause both *a* and *b* to "see" that
same change, you need to make that clear. Example of the distinction:
a := new tableWithLegs(color:purple);
b := a; /* *a* and *b* both point to same object */
tellIfLegs(a); /* Reports legs present, and purple */
tellIfLegs(b); /* Reports legs present, and purple */
paintLegs(a,color:green);
tellIfLegs(a); /* Reports legs present, and green */
tellIfLegs(b); /* Reports legs present, and green */
cutOffLegs(a); /* That single table is now without legs,
and henceforth the side-effect of that literal
surgery will be seen via both *a* and *b* */
tellIfLegs(a); /* Reports legs missing */
tellIfLegs(b); /* Reports legs missing */
a := new tableWithLegs; /* *a* and *b* now point to different tables */
tellIfLegs(a); /* Reports legs present, and beige (the default) */
tellIfLegs(b); /* Reports legs missing */
fastenNewLegs(b,color:orange);
tellIfLegs(a); /* Reports legs present, and beige */
tellIfLegs(b); /* Reports legs present, and orange */
] For primitive types a different logic is used. Variables with
] primitive types contain the value itself. This means that after
] a := b
] both variables are still distinct and changing one variable has no
] effect on the other.
This is correct, but totally confusing if the semantics I expressed
above are correct. If **assignment** of a new object to a variable
causes simultaneous assignment of all other variables that point to
the same object, a totally perverse semantics for your language,
which I ardently hope is *not* the case, then this all makes
(perverse) sense.
You seriously need to rewrite that whole section one way or another.
Suggestion (if I guessed the semantics correctly despite your
incorrect English): Say that in the case of primitive values, the
actual data, all of it, is right there in the variable itself, so
if you copy that value to somewhere else, and then change one bit
of one of the copies, the other copy won't be affected. But in the
case of Objects, what's in the variable is just a pointer to the
object, so you can't change bits in that pointer without trashing
the whole system (it now points to some random place in memory that
probably isn't an Object), so modifying an Object variable's bits isn't
allowed, and the question of what happens if you modify the actual value
itself doesn't make any sense. What you *can* do in the case of
Object variables is modify the actual Object it points to. Since
two different variables may point to the same object, that
modification (surgery) will be "seen" from both places equally.
Say that the other thing you can do with variables is to simply
re-assign the variable to have a new value, a new self-contained
value in the case of primitive variables, a pointer to a different
Object in the case of Object variables. In neither case is the
reassignment "seen" by any other variable that happened to share a
copy of the primitive value or pointer-to-Object. In the case of
primitive variables that previously contained copies of the exact
same primitive value, the two variables now have different
self-contained values, one of them now containing the
newly-assigned value, the other still containing the same original
self-contained value it had before. In the case of Object variables
that previously pointed to the same Object, they now point to
different Objects, one of them now pointing to the new Object that
was assigned to it, the other still pointing to the same Object it
already pointed to.
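A compact Common Lisp rendering of exactly those suggested semantics
(the structure name TBL is hypothetical):

(defstruct tbl color)                ; TBL objects live in the heap

(let* ((a (make-tbl :color 'purple))
       (b a))                        ; B now points to the SAME object
  (setf (tbl-color a) 'green)        ; surgery: visible through B too
  (print (tbl-color b))              ; => GREEN
  (setf a (make-tbl :color 'beige))  ; reassignment: B is unaffected
  (print (tbl-color b)))             ; => still GREEN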
] In pure object oriented languages the effect of independent objects
] after the assignment is reached in a different way: Every change to an
] object creates a new object and therefore the time consuming copy
] takes place with every change.
I've never heard of any such language. Perhaps you can name one.
Java and Lisp in particular do *not* copy when objects are
modified. In Lisp you have the option, for some kinds of
intentional objects, such as sequences (linked lists and
one-dimensional arrays), to either optimize speed by destructively
modifying the object (what we're talking about here) or avoid side
effects by copying as much of the structure as needed (something
that *explicitly* says it returns a new object if necessary). (For
the no-side-effect version: For arrays you either make a complete
copy or you don't. For linked lists you usually keep/share the tail
of the original list past the last change, but re-build/copy
everything before that point.) I suspect your point here is a strawman.
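Common Lisp's standard library makes the choice explicit by providing
such operations in pairs, for example:

(defvar *xs* (list 1 2 3))
(remove 2 *xs*)   ; => (1 3); copies as needed, *XS* left intact
(delete 2 *xs*)   ; => (1 3); may destructively reuse *XS*'s own cells
;; Likewise REVERSE/NREVERSE, SUBSTITUTE/NSUBSTITUTE, APPEND/NCONC, ...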
] ... In Seed7 every
] type has its own logic for the assignment where sometimes a value copy
] and sometimes a reference copy is the right thing to do. Exactly
] speaking there are many forms of assignment since every type can
] define its own assignment.
IMO this is a poor design decision. This means the same kind of
object, such as array, can't exist sometimes on the stack and
sometimes as an object in the heap, because the fact of how it's
allocated and copied is hardwired into the type definition. Better
would be to have a shallow-copy method for every object, and call
the shallow-copy object whenever the object is stored on the stack,
but simply copy the pointer whenever just the pointer is stored on
the stack. Thus it's the variable type (inline-stack-Object vs.
pointer-to-heap-Object), not the Object class, which determines
what copying happens. Then it would be possible to copy an object
from stack to heap or vice versa as needed. With your method, it
would seem that every instance of a given type of object must be on
the stack, or every instance in the heap, never some here and some
there as needed.
As for deep copy, that's a huge can of worms, telling the copy
operation when to stop and go no deeper. Kent Pitman discussed
this in his essay on intention. That's why copy-list copy-alist and
copy-tree all do different things when presented with exactly the
same *internal* datatype of CONS-tree. I think it's best if
assignment avoid this can of worms and let the programmer say
explicitly what kind of deep copy might ever be required in special
circumstances.
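The distinction is easy to demonstrate on the same CONS-tree:

(defvar *al* (list (cons :a 1) (cons :b 2)))

(let ((shallow (copy-list *al*)))   ; copies only the list spine
  (setf (cdr (first shallow)) 99)
  *al*)                             ; => ((:A . 99) (:B . 2)) -- pair shared!

(let ((deeper (copy-alist *al*)))   ; also copies each (key . value) cons
  (setf (cdr (first deeper)) 42)
  *al*)                             ; => ((:A . 99) (:B . 2)) -- untouched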
Would you consider the following compromise: During development of
a package of software tools, everything except the widgets of the
GUI is done by ordinary Lisp-style functions, not OO. Macros are
defined only where essential to simplify the syntax to speed up
coding of lots of cases, but only after several cases have been
coded manually by ordinary function call with recursive evaluation
of nested expressions and explicit quoting of data that is not to
be evaluated. All the business logic is done by ordinary D/P
(Data-Processing) functions, possibly aided by macros. At the very
end, before releasing the code to regular users, an OO wrapper is
put around the public-accessible tools, limiting access/view of the
innards.
> The open/closed principle
Which one??
<http://en.wikipedia.org/wiki/Open/closed_principle>
- Meyer's Open/Closed Principle -- Parent class remains unchanged
forever except for legitimate bug fixes. Derived classes inherit
everything that stays the same, and re-define (mask) anything
that would be changed compared to the parent class. The
interface is free to differ in a derived class compared to the
parent class.
- Polymorphic Open/Closed Principle -- A formal *interface* is set
up once, and then never changed. Various classes implement this
interface in different ways.
GUIs with classes of widgets that are interchangeable via events
triggered by mouse keyboard etc. satisfy the second definition.
Traditional modular programming (doesn't have to be per the current
jargon) ideally satisfies the first definition.
Common Lisp's keyword parameters allow a compromise (with the first
definition) whereby a function can be defined with limited
functionality, then a new keyword can be added to allow additional
functionality without changing the earlier functionality in any
way. In this way the primary value of Meyer's Open/Closed Principle
can be obtained without needing any actual OO.
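A sketch of that compromise (function and keyword names hypothetical):

;; Version 1, shipped and called from many places:
(defun draw-label (text &key (color :black))
  (format t "~A [~A]~%" text color))

;; Version 2 adds capability via a new &KEY parameter; every existing
;; call site keeps working unchanged:
(defun draw-label (text &key (color :black) (size 12))
  (format t "~A [~A, ~Apt]~%" text color size))

(draw-label "Hello")           ; old call, same behavior as before
(draw-label "Hello" :size 18)  ; new functionality, strictly opt-in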
> It encourages code to become set in stone.
Code for implementations, or code for interfaces?? Or both?
> Unfortunately, especially on a large project, I find that no
> matter how we try, it is impossible to foresee the entire scope
> of what needs to be designed.
Totally true, especially on "cutting edge" R&D which explores new
D/P algorithms and eventually settles on whatever works best (or
gives up if nothing works well enough to put into practice). I find
that most of my software is new-D/P-algorithm R&D where "agile
programming" methodology (without the overhead of purchasing a
commercial Agile (tm) software environment) is the only practical
course of action, and totally precludes the design-first
implement-last paradigm.
Refactoring is a daily experience as I try various algorithms until
I learn what works best. Deciding that I really *also* need to do
something I didn't even envision at the start of the project, in
addition to the fifth totally different algorithm for something I
*did* anticipate at the start, is a common occurrence.
For example, in my current project for developing ProxHash to
organize a set of transferable/soft skills (in regard to seeking
employment), I had no idea that there would be a very large number
of extreme outliers (records nearly orthogonal to every other
record) which would require special handling, until I had already
finished developing all the code leading up to that point where the
extreme outliers could be discovered. (If I had known this problem
at the start, I might have simply written a brute-force
outlier-finding algorithm, which directly compared all n*(n-1)/2
pairs of records, not scalable to larger datasets than the 289 records I'm
working with presently, but it sure would have simplified the rest
of the code for *this* dataset by having them out of the way at the
start.) If you're curious:
- Original data, with labels added:
<http://www.rawbw.com/~rem/NewPub/ProxHash/labsatz.txt>
- Identification of extreme outliers, starting with the *most* distant,
working down to semi-outliers:
<http://www.rawbw.com/~rem/NewPub/ProxHash/outliers-2008.6.11.txt>
Note that d=1.4142135=sqrt(2) means orthogonal, where correlation is 0.
The most distant outlier has d=1.39553 C=0.02625 to its nearest "neighbor".
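Those two numbers are mutually consistent: assuming the records are
normalized to unit length, distance and correlation are tied together
by d = sqrt(2 - 2C), which is easy to check at the REPL:

(sqrt (- 2 (* 2 0.02625)))  ; => 1.39553 (approx.), the top outlier
(sqrt (- 2 (* 2 0.0)))      ; => 1.4142135, C = 0 means orthogonal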
> So the design is constantly changing, and the idea of some set of
> code that we write in the early stages surviving without needing
> vast rewrites for both functionality and efficiency is delusional.
Indeed, the "d" word is quite appropriate here. It's sad how
software programming classes so often teach the dogma that you must
spec out the entire project at the start before writing the first
line of code. Agile programming (including rapid prototyping,
bottom-up tool-building, and incessant refactoring in at least three
different ways) is IMO the way to go.
<http://www.rawbw.com/~rem/HelloPlus/hellos.html#s4outl>
* Lesson 4: Refactoring syntax
* Lesson 5: Refactoring algorithms
(One other that came up in another thread, which I've now forgotten)
OK, I went back to my file of backup copies of articles posted, and
here's the relevant quote:
| After that point, my beginning lessons don't discuss additional
| ways of refactoring, but I think we would agree that further
| refactoring is *sometimes* beneficial:
| - Using OOP.
| - Using macros to allow variations on the usual syntax for calling functions.
| - Defining a whole new syntax for specialized problem domains,
| including a parser for such syntax.
Here's a reference for the entire article if you're curious:
<http://groups.google.com/group/comp.programming/msg/bc8967c3b7522c5d>
= Message-ID: <rem-2008...@yahoo.com>
> 1) The development tools and libraries are the most mature for
> C++ and this is essential.
I'm curious: Why did you choose C++ instead of Common Lisp?
> 2) The OS, drivers, and application code in the embedded firmware
> is all C and we have no choice in that, unless we want to develop a
> compiler for some other language, or write directly in assembly
> language.
I agree with that choice. C is fine for writing device drivers.
> Since C and C++ are largely similar, parts of the C code that we
> want to share with the other applications can be used directly.
Why would there be any code shared between device drivers and other
parts of your application???
> But if not for these two restrictions, in hindsight, I would not
> have chosen C++ or any other OO language.
If there was in fact no significant amount of code shared between
device drivers and the rest of the application, then maybe chosing
C++ was a mistake anyway? I'll withhold judgement until I know the
answer to the shared-code-dd/app question just above.
> I've wandered far enough off-topic from the original post.
That's not a problem at all, since we're still totally on-topic for
the newsgroup comp.programming, and even for comp.software-eng in
case anybody wants to cross-post parts of this thread there.
<snip pro-Lisp anti-everything else stuff>
> Only in the really stupid cruddy languages such as C you're
> familiar with.
What would your language of choice be for implementing Lisp?
--
Bartc
A bootstrapping/crosscompiling process involving extended versions
of SYSLISP (which was available in PSL) and LAP (which was
available in MacLisp and Stanford Lisp 1.6). Revive those
dinosaurs, and enhance them to support generating native
executables for bootstrapping purposes.
When I helped port the PSL kernel from Tenex to VM/CMS, we needed
to write the outer frame of the executable in IBM 360/370 assembly
language, because SYSLISP required the IBM 360/370 registers (base
register and stack mostly) to be already set up before code
generated by SYSLISP would execute properly. It seems to me
entirely reasonable to enhance SYSLISP or LAP to be able to
generate those first few instructions that reserve memory for the
stack and load the registers needed by the regular code.
So the basic plan would be as follows:
- Read the internal documentation for the first target system, to
learn what the format of executable files is supposed to be.
Write code in an earlier version of Lisp (anything that's
currently available, even XLISP on Macintosh System 7.5.5 might
be "good enough" for this purpose) to generate the minimal
executable file and to have a hook for including additional code
in it. A sequence of (write-byte whatever outputStream)
statements would be good enough to directly generate the minimal
executable file. Or LAP could be fixed to allow directly
generating inline code, not wrapped in any FUNCTION header, and
to call WRITE-BYTE instead of building the body of a FUNCTION.
Or LAP could build a dummy function body and then the body could
be copied to the outputChannel and then the dummy function
discarded. Or use something totally esoteric, instead of Lisp,
to implement something like LAP, such as Pocket Forth, or
HyperCard/HyperTalk, or an assembler. But actually the advantage
of implementing LAP to do the right thing is that the code for
that can then be ported to later stages in the bootstrapping
process to avoid needing to re-do all that work in Lisp later
when it's time to "close the loop". On the other hand, writing a
Forth emulator in Lisp would be easy enough, so if Pocket Forth
is used at this stage none of the Forth code would need to be
discarded, it could be kept as part of the finished product.
But actually using real genuine LAP syntax instead of something
easy for Forth to parse would be best, so I retreat to using
some earlier version of Lisp to implement the revitalized LAP.
In any case, LAP doesn't need to actually know any opcodes. It's
sufficient to know the different formats of machine language
instructions and accept hexadecimal notation for opcode and each
other field within each machine format instruction, or possibly
just accept a list whose first element is a symbol identifying
the rest of the list as hexadecimal bytes, with the remaining
elements being the hand-coded bytes of the instruction (see the
sketch just after this list).
- Hand-code in assembly language, using hexadecimal-LAP syntax, the
absolute minimum application to assemble code, taking input from
a file IN.LAP in LAP syntax and writing to a file OUT.EXE in
target-machine language. Add that code to the minimal executable
from above and pass all that code through the earlier-version-of-Lisp
LAP above, thereby generating an executable that all by itself
assembles LAP. The earlier-LISP can now be dispensed with since
we now have an assembler which can assemble itself.
- Hand-code in assembly language, using LAP syntax, enhancements to
the LAP assembler, such as the full set of instruction formats
(still with all hexadecimal fields), opcodes for the target
machine that know what instruction format is used by each,
labels that can be referred to from other places in the code.
After each round of this enhancement is completed, the next
round will be easier to code. After labels are implemented,
*real* assembly-language programming is possible, with lots of
subroutines used for nice structured programming.
- Hand-code in assembly language, using full LAP syntax from above,
the transformations needed to map a minimal subset of SYSLISP to
LAP. Now we have a compiler that takes a mix of SYSLISP-subset
and LAP and produces an executable.
- Code in SYSLISP-subset all the enhancements needed to implement
the full SYSLISP syntax and semantics. Now we have a compiler
that takes a mix of SYSLISP and LAP and produces an executable.
- Code in SYSLISP the bare minimum primitives needed to build a
symbol (in a default package that isn't a fullfledged PACKAGE
object in the usual sense) and access key fields from it, box
and unbox a small integer, build a CONS cell and access CAR and
CDR from it, parse a simple s-expression containing only symbols
and small integers to produce a linked list of CONS cells etc.,
print out such an s-expression from the linked list, enhance
SYSLISP to support wrapping a function body around a block of
code and linking from a symbol to that function body, applying
such a function to a list of parameters, EVALing small integers
to themselves and symbols to their current values, and
recursively EVALing a linked list whose CAR is the symbol of a
defined function and whose CDR is a list of expressions that
evaluate to parameters to that function. Include all that code
in compilation to our executable. We now have a minimal LISP-subset1
interpreter that automatically builds an executable each time it
is run.
- Rearrange the code so that there's a function called
START-WRITING-EXECUTABLE which opens OUT.EXE for output and
generates the starting data for an executable, a function
called COMPILE-LAP+SYSLISP which takes a linked-list of a mix of
LAP and SYSLISP and compiles them into the already-open
executable-output file, a function called
FINISH-WRITING-EXECUTABLE which writes any necessary finalization
data in the executable and closes the output file. We now have
an interpreter we can just use as we want without necessarily
writing an executable, but any time we want we can code
(START-WRITING-EXECUTABLE)
(COMPILE-LAP+SYSLISP ...) ;More than one such can be done here
(FINISH-WRITING-EXECUTABLE)
.. details of additional levels of bootstrapping not included here ...
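As a concreteness check, the pass-through case of the hexadecimal-LAP
escape sketched in the first step is only a few lines (the :HEX list
format is hypothetical):

;; (:HEX byte byte ...) forms are copied through to the executable
;; byte stream unchanged; everything else is a later bootstrap level.
(defun emit-lap-form (form out)
  (if (and (consp form) (eq (first form) :hex))
      (dolist (byte (rest form))
        (write-byte byte out))
      (error "Only :HEX forms are handled in this sketch: ~S" form)))

;; e.g. (with-open-file (out "OUT.EXE" :direction :output
;;                           :element-type '(unsigned-byte 8)
;;                           :if-exists :supersede)
;;        (emit-lap-form '(:hex #xC3) out))  ; lone x86 RET instruction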
Now when we want to port everything up to some bootstrapping point
to another machine, all we need is some crude way to establish the
very first bootstrap on the new machine, then copy all the various
additional bootstrapping code across from the old machine to the
new machine and manually replace the specific machine instructions
that aren't the same between old and new CPUs and then run the
result through the current level of bootstrapping to build the
executable for the next level of bootstrapping. By the time we
reach the level of having a minimal SYSLISP compiler, virtually
nothing more would need translation to the new CPU because SYSLISP
is mostly portable. During really high-level bootstrapping, only
the parts that need to be written in LAP, such as tag bits on
various kinds of data objects, and system calls to allocate memory
or do filesystem operations, would need to be manually converted to
a new target machine. But once we have a halfway decent Lisp
interpreter working, it's easy to write code to maintain a set of
tables that formally express tag bits and stuff like that, and then
generate LAP expressions dependent on those tables, such that each
machine-level characteristic would need to be coded in the tables
just once then all the LAP related to it could be generated
automatically.
Of course you could *cheat* by using an existing fullfledged
implementation of Common Lisp, which *was* coded in C, to implement
the entire LAP and SYSLISP compiler as functions within that
existing implementation, and then build more and more of the *new*
Lisp implementation by generating the executable directly from that
existing+LAP+SYSLISP environment. That would let you shortcut some
of the early bootstrapping levels. But that's not as much fun, and
using anything that passed through C for your cross-compiler really
would be *cheating*.
Now to *really* avoid all sense of using C, you can't implement
your first bootstrap using any operating system that was even
partly built using C, especially not Unix. I think early versions
of MacOS (at least through 6.0.7) were built using Pascal rather
than C, is that correct? But for the full spirit of this avoidance
of using some *other* programming language to cross-compile into
Lisp, thereby making Lisp dependent on that *other* cursed
language, we really ought to avoid *any* other language above
assembly language. My MOS-6502 bare machine, bootable using pROM
Octal-DDT, is in storage but was working the last time I tried it.
It could perhaps be used to avoid both C and Pascal as well as any
other high-level language. Does anybody have a PDP-10 that still
works? Its operating system was written in assembly language, the
original DEC systems using Macro, and the Stanford-AI system using
FAIL. What was MIT's ITS written in, DDT?
> > The process, for lack of a better term, is "compression."
>
> Or abstraction, or functional decomposition...
Or chunking, or convergence...