core features and functions required in modern programming languages

Neil Morris

Jul 24, 2008, 3:58:12 PM

Dear All

Having read/studied a few programming languages I would be interested to
know the core features functions etc that would be most desired? I have
listed below some features I would like to see.

1. ease of scripting languages with the near speed of compiled ones
2. linked to a core library of functions ie glibc
3. mozilla's XUL
4. SDL graphics opengl library
5. mozilla's XUL linked to 3. and 4.
6. message sending and receiving features such as SOAP WDSL RDF WDDX and
RPC
7. bytecode VM that can be translated in the native or targeted host
system via a JIT compiler
8. designed mainly for dymanic languages
9. reverse polish notation ie forth and postscript as core language
design or used as lower level conversation
10. MACROS like features
11. conditional compilation

Any suggestions welcome!

Neil Morris

jimmaure...@worldnet.att.net

Jul 24, 2008, 4:02:46 PM

I suspect we can each come up with our own list of features.
For instance, I would not include any of the items in the above list.

Jim Rogers

S James S Stapleton

Jul 24, 2008, 4:04:05 PM

"Neil Morris" <neil.m...@virgin.net> wrote in message
news:n95ik.26877$WT.2...@newsfe29.ams2...

- Trivial C interaction, both in accessing libraries in the language from C,
and Writing C modules (libraries) for the language.

C is the most common programming language I've seen other languages able to
interact with, so it seems C is a logical choice to make the language able
to interact with as many languages as possible.

James Harris

Jul 24, 2008, 4:22:02 PM

There is a newsgroup specifically for language design, comp.lang.misc,
so am including that....

FWIW I don't like your list much. It seems like an ad-hoc feature list
related to some technologies that are currently in vogue. A general
purpose language should be more cohesive (i.e. have few "features" and
instead concentrate on unifying principles). Other things can be
provided by standard libraries but, IMHO, even these should be
minimalist. (This is NOT the path taken by Java which has extensive
libraries.)

Check out Principles of Programming Languages by Bruce MacLennan

http://www.cs.utk.edu/~mclennan/books.html

for a clear set of guidelines to language design.

Rod Pemberton

Jul 24, 2008, 6:10:10 PM

"James Harris" <james.h...@googlemail.com> wrote in message
news:f4fd39e2-7206-4cc7...@j33g2000pri.googlegroups.com...

> On 24 Jul, 20:58, Neil Morris <neil.morr...@virgin.net> wrote:
> >
> > Having read/studied a few programming languages I would be interested to
> > know the core features functions etc that would be most desired? I have
> > listed below some features I would like to see.
> >
> > 1. ease of scripting languages with the near speed of compiled ones

Scripting languages, batch languages, aren't needed... IMO. I miss the more
rapid development on interpreters. But, if it can't produce an executable
binary, it has little value to me.

> > 2. linked to a core library of functions ie glibc

Libraries aren't needed, if the core language allows dynamic memory
allocation, and file I/O.

> > 3. mozilla's XUL

? Never heard of it.

> > 4. SDL graphics opengl library

Let's break that into two:

> > 4. SDL

SDL's (Simple DirectMedia Layer) best feature is that it allows emulation of
a source code OS (operating system), in SDL, on another OS. An SDL library
would be useful for an OS, or powerful language.

> > 4. opengl library

I think the graphics library should be left up to the host OS since the
functions required to get best results from graphics cards must go through
and/or bypass the OS's privilege and control mechanisms and are usually
highly custom to the graphics card chipset.

> > 5. mozilla's XUL linked to 3. and 4.

What's the purpose of XUL and/or XML?

> > 6. message sending and receiving features such as SOAP WDSL RDF WDDX and
> > RPC

Almost never heard of them. Why would you want message passing in a general
purpose programming language? Most language designs avoid this because of
the large overhead needed to implement them.

> > 7. bytecode VM that can be translated in the native or targeted host
> > system via a JIT compiler

VMs with bytecode are nice for portability. The question is do you really
need portability? x86 CPUs control 97% of the PC market and a large
percentage of higher processing markets. ARM controls the embedded market.
At most, you only need a compiler that supports two platforms. I.e., what's
the purpose, other than to slow down the host system while the JIT compiles
the code?

> > 8. designed mainly for dymanic languages

Are you referring to a dynamically typed language? I'm not sure if I want to
see a dy-manic language... ;-0

Anyway, this seems to be related to your preference for, uh, the "ultimate"
interpreted "connect to everything" language...

> > 9. reverse polish notation ie forth and postscript as core language
> > design or used as lower level conversation

IMO, this is a step backwards. There is nothing wrong with using these in a
compiler. But, the reason FORTH (and probably PS, it's been a while since I
did any PS programming...) don't get adopted as primary languages is that 1)
they lack enough syntax that it's difficult for most people to follow what's
going on. And, 2) they require the user to keep track of variables, which
is a monumental task on larger applications.

> > 10. MACROS like features

Is this another language, or did you mean macros, like those in an
assembler?

> > 11. conditional compilation

That's called a preprocessor. Public Domain (der Mouse, 3 variations of
DECUS, SCPP) and FOSS (had links to three - all dead now...) C preprocessors
exist. The C preprocessor will work with any language without a syntax
conflict (rare).

>
(nothing for James...) :-)

This is my list:

1) language has both low level and higher level programming features (like
C)
2) language should have variables, unsigned integers, arrays, and strings
(floating point not necessary)
3) the low level features should closely correspond to the cpu's assembly
abilities
4) the higher level features should provide:
4a) flow control, variable allocation, abilities to manipulate
integers, arrays, strings
4b) dynamic memory allocation, functions for file I/O
5) the language, even if interpreted, should be able to produce binaries

(and a bunch of other useful stuff I'm probably forgetting...)

Rod Pemberton

Gene

Jul 24, 2008, 6:44:15 PM

> any suggestions welcome!
>
> Neil Morris

I recommend you get a few good textbooks on the topic of programming
languages. Check the web pages of some high quality comp sci undergrad
language courses for titles. Study hard. From the new vantage point
you've gained, look again at a big collection of languages. Make sure
to include functional and logic programming, concurrent and
distributed programming as well. You'll have to write some programs
in each of a representative smattering of the entire range. Then
build a new list and ask your question on comp.lang.misc.

cr88192

Jul 24, 2008, 11:36:34 PM

"Rod Pemberton" <do_no...@nohavenot.cmm> wrote in message
news:g6aumk$62a$1...@aioe.org...

> "James Harris" <james.h...@googlemail.com> wrote in message
> news:f4fd39e2-7206-4cc7...@j33g2000pri.googlegroups.com...
>> On 24 Jul, 20:58, Neil Morris <neil.morr...@virgin.net> wrote:
>> >
>> > Having read/studied a few programming languages I would be interested to
>> > know the core features functions etc that would be most desired? I have
>> > listed below some features I would like to see.
>> >
>> > 1. ease of scripting languages with the near speed of compiled ones
>
> Scripting languages, batch languages, aren't needed... IMO. I miss the more
> rapid development on interpreters. But, if it can't produce an executable
> binary, it has little value to me.
>

yes, which is partly why I still stick mostly to C (I have forays into C++
and Assembler, however, these are lesser used languages in my case).

and, having a scripting engine I can actually tolerate, as it so happens, it
compiles C code into native machine code. of course, it would be nicer if
some more of the technical issues/bugs were worked out, and the thing
compiled faster (a compiler design based heavily around XML DOM trees is not
the fastest thing around it seems, or at least when faced with big masses of
system headers...).

several seconds per-module can get a little more tiring than it would at
first seem...

an eventual option would likely be to make use of caching:
the compiler keeps around a big cache of ELF or COFF files, and re-uses
previously compiled modules if the file is unmodified and the dependencies
hold.
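
A minimal sketch of the caching idea described above, written in Python for brevity rather than as part of the poster's compiler. It assumes "unmodified" is detected by content hash and that "the dependencies hold" means the hashes of the included files are unchanged; `ObjectCache`, `cache_key` and `compile_fn` are hypothetical names for illustration only.

```python
import hashlib


def digest(data):
    """Content hash used to detect whether a file has changed."""
    return hashlib.sha256(data).hexdigest()


def cache_key(source, dependencies):
    """Combine the module's own hash with the hashes of its dependencies."""
    h = hashlib.sha256()
    h.update(digest(source).encode())
    for dep in dependencies:
        h.update(digest(dep).encode())
    return h.hexdigest()


class ObjectCache:
    """Maps cache keys to previously compiled object code (hypothetical store)."""

    def __init__(self):
        self._objects = {}
        self.compiles = 0  # counts real compiler invocations

    def get_or_compile(self, source, dependencies, compile_fn):
        key = cache_key(source, dependencies)
        if key not in self._objects:
            # Cache miss: the source or one of its dependencies changed.
            self._objects[key] = compile_fn(source)
            self.compiles += 1
        return self._objects[key]
```

A real implementation would store ELF or COFF files on disk instead of keeping bytes in a dictionary, and would probably check timestamps before hashing.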

another issue is that the compiler keeps trying to multiply-link XML nodes
for some reason (ie, where it tries to include the same node in multiple
parent nodes), which results in cloning the nodes in question (as well as
issuing a warning message).

>> > 2. linked to a core library of functions ie glibc
>
> Libraries aren't needed, if the core language allows dynamic memory
> allocation, and file I/O.
>

more likely:
can make use of the libraries that already exist, if it is a general-purpose
language.

it is the case for many special purpose languages that they need nothing
really beyond the core stuff needed to complete their task (for example, the
language may not even allow recursion).

>> > 3. mozilla's XUL
>
> ? Never heard of it.
>

it is an XML-based representation for describing GUIs.

a slightly alternate option (something I had used before), was to make an
XML syntax mostly derived from GTK and HTML forms (some GTK constructions
replacing what would have been done via formatting features in HTML).

>> > 4. SDL graphics opengl library
> Let's break that into two:
>
>> > 4. SDL
>
> SDL's (Simple DirectMedia Layer) best feature is that it allows emulation of
> a source code OS (operating system), in SDL, on another OS. An SDL library
> would be useful for an OS, or powerful language.
>

in my case, it would just be nice if some more things were standardized
between OS's, and that people would stop making a mess of things by trying
to endlessly re-invent everything.

but, then again, I am guilty to some extent of re-inventing things as well,
but mostly in areas where agreement is not generally reached between
OS's/frameworks (GC API, threads, sockets, ...).

but, at the same time, little prevents one from using the Win32 API or
pthreads or whatever in my case either...

ok, I don't generally use SDL in my case (I had personally found it more
painful than it was worth, since the native APIs tend to exist on the
systems in question, but SDL often has to be built and installed).

>> > 4. opengl library
>
> I think the graphics library should be left upto the host OS since the
> functions required to get best results from graphics cards must go through
> an/or bypass the OS's privilege and control mechanisms and are usually
> highly custom to the graphics card chipset.
>

however, at least in the case of OpenGL, it is present on a wide variety of
OS's (Windows, Linux, some game consoles, many embedded systems, ...).

of course, many Windows (and XBox) developers would probably insist on
DirectX, but oh well...

>> > 5. mozilla's XUL linked to 3. and 4.
>
> What's the purpose of XUL and/or XML?
>

at least in my case, they are a fairly useful way of representing abstract
data and trees (albeit S-Expressions, or other more 'native' structures,
tend to be much more efficient).

>> > 6. message sending and receiving features such as SOAP WDSL RDF WDDX and
>> > RPC
>
> Almost never heard of them. Why would you want message passing in a general
> purpose programming language? Most language designs avoid this because of
> the large overhead needed to implement them.
>

they could exist as library features, but IMO should be avoided as core
functionality (IMO, it is better to leave features out of the language
compiler/syntax proper if they can be more or similarly effectively pulled
off with library code).

even within a single process, certain kinds of message passing can be rather
useful (for example, a messaging system between threads, between the GUI
framework and app, ...).
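
As an illustration of in-process message passing done purely with library code, here is a small Python sketch using the standard `queue` and `threading` modules. The message format and the `worker` and `run_demo` names are assumptions made up for this example, not anything from the post.

```python
import queue
import threading


def worker(inbox, outbox):
    """Receive messages until a None sentinel arrives, replying to each one."""
    while True:
        msg = inbox.get()
        if msg is None:
            break
        outbox.put(("done", msg * 2))


def run_demo(values):
    """Send each value to a worker thread and collect the replies in order."""
    inbox, outbox = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(inbox, outbox))
    t.start()
    for v in values:
        inbox.put(v)
    inbox.put(None)  # sentinel: tells the worker to stop
    t.join()
    return [outbox.get() for _ in values]
```

Nothing here needs support in the language syntax; the queues are ordinary library objects.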

>> > 7. bytecode VM that can be translated in the native or targeted host
>> > system via a JIT compiler
>
> VM's with bytecode is nice for portability. The question is do you really
> need portability? x86 cpu's control 97% of the PC market and a large
> percentage of higher processing markets. ARM controls the embedded market.
> At most, you only need a compiler that supports two platforms. I.e., what's
> the purpose, other than slow down the host system while the JIT compiles the
> code?
>

similar reasoning in my case, albeit x86-64 is rapidly growing, and may be
non-trivial to target (as I have found, due in large part to the complex
calling convention used on Linux and friends).

I don't really understand it anyways, noting that the register/memory
shuffling one has to go through may well cost more than is gained from the
minor speed difference between registers and the CPU's local cache (where
the stack frames are almost invariably located).

IMO, the Win64 convention is less absurdly designed, but oh well...

>> > 8. designed mainly for dymanic languages
>
> Are you referring to a dynamicly typed language? I'm not sure if I want to
> see a dy-manic language... ;-0
>
> Anyway, this seems to be related to your preference for, uh, the "ultimate"
> interpreted "connect to everything" language...
>

I use some dynamic features in C, and they are useful.

in all though, it is fairly similar to RTTI in C++ (only that I tend to test
using predicate functions rather than 'dynamic_cast' and similar...).

>> > 9. reverse polish notation ie forth and postscript as core language
>> > design or used as lower level conversation
>
> IMO, this is a step backwards. There is nothing wrong using these in a
> compiler. But, the reason FORTH (and probably PS, it's been a while since I
> did and PS programming...) don't get adopted as primary languages is that 1)
> they lack enough syntax that it's difficult for most people to follow what's
> going on. And, 2) they require the user to keep track of variables, which
> is a monumental task on larger applications.
>

yes, even then, one discovers that it only really makes sense as a
mid-level, if one is compiling to native code (where the much bigger part of
the compiler actually turns out being efficiently compiling this RPN-style
notation into native machine code...).

while I was still working on it, my newer compiler core had ended up going
from RPN to SSA, and then (incomplete), converting the SSA into commands for
a lower-level code generator (operated with yet more XML manipulation...).
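
The RPN-to-SSA step mentioned above can be sketched for the simplest case, a straight-line expression with binary operators. This Python version is an illustration only, not the poster's code: it replaces the operand stack with named temporaries that are each assigned exactly once, and it ignores control flow (and therefore phi nodes). The `t0`, `t1` naming scheme is an assumption.

```python
BINARY_OPS = {"+", "-", "*", "/"}


def rpn_to_ssa(tokens):
    """Turn RPN tokens into single-assignment three-address instructions.

    Returns (instructions, name_of_result).
    """
    stack = []
    code = []
    for tok in tokens:
        if tok in BINARY_OPS:
            if len(stack) < 2:
                raise ValueError("stack underflow at %r" % tok)
            rhs = stack.pop()
            lhs = stack.pop()
            temp = "t%d" % len(code)  # each temporary is assigned once
            code.append("%s = %s %s %s" % (temp, lhs, tok, rhs))
            stack.append(temp)
        else:
            stack.append(tok)  # variable or constant: just push its name
    if len(stack) != 1:
        raise ValueError("expression leaves %d values on the stack" % len(stack))
    return code, stack[0]
```

For the tokens `a a * b +` this yields `t0 = a * a` followed by `t1 = t0 + b`, with `t1` as the result.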

if I resume this effort, I may likely replace this lower XML-level with an
S-Expression based form (as I had discovered in the RPN->SSA process, and in
the SSA manipulation code, S-Exps seem to work a lot better for this type of
thing, as well as being the higher-performance option).

basically, the awkward logic and munging of the parser and upper compiler
and similar, work better with XML, but the much more rigid transforms of the
lower compiler stages, and the linear relay of information and commands,
seem better suited to s-expressions.

>> > 10. MACROS like features
>
> Is this another language, or did you mean macro's, like those in an
> assembler?
>

maybe he meant MACROSS...
yes, giant aliens really can be turned into allies with the power of
music...

>> > 11. conditional compilation
>
> That's called a preprocessor. Public Domain (der Mouse, 3 variations of
> DECUS, SCPP) and FOSS (had links to three - all dead now...) C preprocessors
> exist. The C preprocessor will work with any language without a syntax
> conflict (rare).
>

I can imagine a few cases.
I guess it depends mostly on the type of syntax and tokenizing behavior.

for example, the C preprocessor would be a really bad fit for languages such
as LISP, Python, Smalltalk, ...

but then again, I have seen such syntactic horrors as, IMO, Objective-C.
actually, IMO, Brook+ is ugly, but at least this is likely to be far more
contained.

if I were to do similarly, it would likely be far more subtle:
a few special keywords (and likely also following the standard rules for
extensions, even if not the prettiest, macros can remedy this).

#define vector _Vector
...

vector float a[10][10], b[10][10], c[10][10];

c=a*a+b;

going a little further, one could add some minor tweaks to allow tensor
operations.
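
To make the intended meaning of `c=a*a+b` concrete, here is a Python sketch of what such a vector extension might expand to. It assumes `*` and `+` act cell by cell (the post does not say whether `*` would be element-wise or a matrix product), and `elementwise` and `mul_add` are hypothetical helper names.

```python
def elementwise(op, x, y):
    """Apply a binary operation to matching cells of two equally sized 2-D lists."""
    return [[op(p, q) for p, q in zip(row_x, row_y)]
            for row_x, row_y in zip(x, y)]


def mul_add(a, b):
    """Compute c = a*a + b cell by cell (element-wise reading is an assumption)."""
    squared = elementwise(lambda p, q: p * q, a, a)
    return elementwise(lambda p, q: p + q, squared, b)
```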

of course, Brook+ and Obj-C both have justifications for their horrid
syntax:
both were designed effectively as preprocessors which extract and rewrite
the code in question (in the case of Brook+, forming both C-based output,
and other code intended to be compiled and run on the GPU).

likely though, anything I added to my compiler would not likely run on the
GPU, since this is difficult to do generally and portably (thus far I mostly
target the GPU by going through OpenGL...).

most of what I have then, in my compiler, is utility functions for
convenient vector operations, which I can then optimize (eventually, GPU
targeting could be done via more conventional optimizations being applied
to already existing code).

for example, whenever one writes a loop performing certain kinds of
operations (such as loops linearly processing large arrays of numbers or
vectors), the compiler may be inclined to try to pass it off to the GPU...

>>
> (nothing for James...) :-)
>
>
> This is my list:
>
> 1) language has both low level and higher level programming features (like
> C)
> 2) language should have variables, unsigned integers, arrays, and strings
> (floating point not necessary)
> 3) the low level features should closely correspond to the cpu's assembly
> abilities
> 4) the higher level features should provide:
> 4a) flow control, variable allocation, abilities to manipulate
> integers, arrays, strings
> 4b) dynamic memory allocation, functions for file I/O
> 5) the language, even if interpreted, should be able to produce binaries
>
> (and a bunch of other useful stuff I'm probably forgetting...)
>

yes, and luckily C already has these features...

C could use more reflection ability and more dynamic features (such as
dynamic loading and evaluation), then again, this is largely why I wrote a
compiler...

>
> Rod Pemberton
>

scholz...@gmail.com

Jul 27, 2008, 12:58:12 PM

> Libraries aren't needed, if the core language allows dynamic memory
> allocation, and file I/O.

ROTFL.

> > > 3. mozilla's XUL
>
> ? Never heard of it.

Normally this would disqualify you from any further discussion.
You don't need to know it but you should have heard about it.

> VM's with bytecode is nice for portability. The question is do you really
> need portability? x86 cpu's control 97% of the PC market and a large
> percentage of higher processing markets.

And here you finally write too much bullshit.
On desktops it's not so much about architectures but about OS. We have
Windows, Linux, FreeBSD, Solaris in both i386 and amd64.
Add NetBSD, HP-UX, OpenBSD as server systems. With servers we get
Sparc, Power PC and Itanium as other architectures.

In embedded you can also add a hundred smaller ones (but yes, there you
don't need a VM because apps are deployed much more strictly and you can
live with "compile everywhere" instead of "compile once").

gremnebulin

Jul 27, 2008, 2:43:34 PM

On 24 Jul, 20:58, Neil Morris <neil.morr...@virgin.net> wrote:

> Dear All
>
> Having read/studied a few programming languages I would be interested to
> know the core features functions etc that would be most desired? I have
> listed below some features I would like to see.
>
> 1. ease of scripting languages with the near speed of compiled ones
> 2. linked to a core library of functions ie glibc
> 3. mozilla's XUL
> 4. SDL graphics opengl library
> 5. mozilla's XUL linked to 3. and 4.
> 6. message sending and receiving features such as SOAP WDSL RDF WDDX and
> RPC

Can always be bolted on with libraries, as it usually is. Or do you
mean support in the basic syntax/keywords?

> 7. bytecode VM that can be translated in the native or targeted host
> system via a JIT compiler
> 8. designed mainly for dymanic languages
> 9. reverse polish notation ie forth and postscript as core language

Huh? Do you mean "intermediate" language? But why would you need a
forth-type language AND bytecode as intermediates?

> design or used as lower level conversation
> 10. MACROS like features

Can always be bolted on by textual pre-processing, although that is
not always the best answer.

> 11. conditional compilation

Ditto.

Rod Pemberton

Jul 27, 2008, 5:12:37 PM

<scholz...@gmail.com> wrote in message
news:cf380ef3-ddf5-40d3...@p25g2000hsf.googlegroups.com...

> > Libraries aren't needed, if the core language allows dynamic memory
> > allocation, and file I/O.
>
> ROTFL.

Really? Why? I don't see what's so laughable...

(From my perspective, that was an accurate statement from programming for
about 27 years with experience in about 14 languages or so...)

> > > > 3. mozilla's XUL
> >
> > ? Never heard of it.
>
> Normally this would disqualify you from any further discussion.

Why? There are lots of dead, useless, low usage and specific use languages.
Am I supposed to know all of them? I started to list some of them, but then
realized it would take me a few hours...

(I.e., if you were to tell me electrolysis of water is the only electrolysis
method that produces hydrogen, am I to disqualify you from further
discussion for not knowing about the chloralkali process?)

> You don't need to know it but you should have heared about it.

Why? Unless one specifically searches for XUL, there's no mention of it...
So, how am I to meet your expectations of my exposure to XUL?

(I.e., this tends to imply to me that XUL's primary use is in a dedicated
situation, e.g., for a specific single application, or in some highly
limited environment. Is it?)

> > VM's with bytecode is nice for portability. The question is do you really
> > need portability? x86 cpu's control 97% of the PC market and a large
> > percentage of higher processing markets.
>
> And here you finally write to much bullshit.

I didn't. But, with a claim like that, you need to prove I'm wrong. Feel
free to post some respectable links with industry numbers that you think
contradict these:

http://www.arm.com/news/19720.html
http://www.xbitlabs.com/news/cpu/display/20070802231958.html
http://www.eetimes.com/rss/showArticle.jhtml?articleID=189602065

Rod Pemberton

gremnebulin

Jul 27, 2008, 5:27:01 PM

On 27 Jul, 22:12, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:

> I didn't. But, with a claim like that, you need to prove I'm wrong. Feel
> free to post some respectable links with industry numbers that you think
> contradict these:
>
> http://www.arm.com/news/19720.html
> http://www.xbitlabs.com/news/cpu/display/20070802231958.html
> http://www.eetimes.com/rss/showArticle.jhtml?articleID=189602065
>
> Rod Pemberton

Some languages have a lifespan in decades. Even if 100% of the market
is x86, is it still a good idea to tie a language
to an architecture that might die out?

Casey Hawthorne

Jul 27, 2008, 6:08:14 PM

The language features wanted would depend on problem domain.

As somebody else has pointed out, a FFI (Foreign Function Interface)
to C, is always a good idea, since C is like a high level assembly
language.
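
As one concrete example of an FFI to C, Python's standard `ctypes` module can call a C library function directly. This is a sketch under assumptions: it presumes a POSIX-like system where the C library can be found by name, and `c_strlen` is a hypothetical wrapper name.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX-like system. find_library may return None, in which
# case CDLL(None) falls back to the symbols already loaded into the current
# process, which on POSIX include the C library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data):
    """Call C's strlen on a bytes object."""
    return libc.strlen(data)
```

The host language only needs to know the C calling convention and the function's signature; no wrapper code has to be written in C.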

--
Regards,
Casey

gremnebulin

Jul 28, 2008, 5:10:02 AM

On 27 Jul, 23:08, Casey Hawthorne <caseyhHAMMER_T...@istar.ca> wrote:
> The language features wanted would depend on problem domain.

Or can you have enough extensibility to cope with any domain? How
about extensibility of syntax, or user-defined levels of type checking,
two features that are rarely implemented in current languages?

Marco van de Voort

Jul 28, 2008, 8:41:23 AM

On 2008-07-27, scholz...@gmail.com <scholz...@gmail.com> wrote:
>> VM's with bytecode is nice for portability. The question is do you really
>> need portability? x86 cpu's control 97% of the PC market and a large
>> percentage of higher processing markets.
>
> And here you finally write to much bullshit. On Desktops it's not so much
> about architectures but about OS. We have Windows,Linux,FreeBSD,Solaris in
> both i386 and amd64. Add NetBSD,HP-UX,OpenBSD as server systems. With
> servers we get Sparc, Power PC and Itanium as other architectures.

> In the embedded you can also add hundert smaller (but yes there you don't
> need a VM because apps are much stricter deployed and you can live with a
> "compile everywhere" instead of "compile once".

However the reality is still pretty much: 4 programs, 4 different VM/JITs.
Multiple choice (Java, .NET and the default scripting languages in their
default deployments) and versioning (e.g. .NET 1.1 vs 2.0) are the main
reasons.

Nick Keighley

Jul 28, 2008, 10:14:09 AM

On 24 Jul, 23:10, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message

> This is my list:
>
> 1) language has both low level and higher level programming features (like
> C)
> 2) language should have variables, unsigned integers, arrays, and strings
> (floating point not necessary)

signed ints? records (structs)

> 3) the low level features should closely correspond to the cpu's assembly
> abilities
> 4) the higher level features should provide:
> 4a) flow control, variable allocation, abilities to manipulate
> integers, arrays, strings
> 4b) dynamic memory allocation, functions for file I/O
> 5) the language, even if interpreted, should be able to produce binaries
>
> (and a bunch of other useful stuff I'm probably forgetting...)

subprograms (procedures, functions): ability to define and call.
If it can't do this it isn't a programming language IMHO.

--
Nick Keighley

Robert Maas, http://tinyurl.com/uh3t

Jul 31, 2008, 11:29:49 AM

> From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>

> I miss the more rapid development on interpreters.

Why do you miss it? I have it every day, so I don't miss it. You
only miss something when you don't have it. So are you saying you
don't have access to an interpreter (actually a read-eval-print
loop, never mind whether it really interprets or does JIT
compilation to native code)? That's so sad. I feel so sorry for you.

> But, if it can't produce an executable binary, it has little
> value to me.

There's no such thing as an executable in a vacuum. An executable
requires some operating system to run it on. For example:
- An ELF executable requires some form of Unix or the like that
knows how to parse the header which tells the number and
location of the various code and data segments;
- A Macintosh (System 6 or 7) executable requires an underlying
MacOS which knows how to find the data segment (if any) and how
to set up an associative array of all the resources indexed by
name and number so that they can be invoked as requested, and how
to make sure CODE resource number 0 is the first code actually
executed.
- A Lisp Machine executable is probably just a FASL file, right?

So why don't you install an operating system that supports both
Class files and FASL files as native executable formats, then both
Common Lisp and Java would produce executables for your particular OS?

> > > 6. message sending and receiving features such as SOAP WDSL RDF WDDX and RPC
> Almost never heard of them. Why would you want message passing
> in a general purpose programming language? Most language designs
> avoid this because of the large overhead needed to implement them.

Perhaps the OP meant that SOAP etc. would be useful for distributed
applications across multiple hosts on the InterNet, and it would be
nice if the language made support of such facilities as easy as
possible. IMO the network overhead is so great that there's no need
to build support into the language, so long as TCP streams are
built in to the OS and it's easy to write user code to wrap SOAP on
top of the interface to TCP streams.

> > > 9. reverse polish notation ie forth and postscript as core language
> > > design or used as lower level conversation
> IMO, this is a step backwards. There is nothing wrong using
> these in a compiler. But, the reason FORTH (and probably PS, it's
> been a while since I did and PS programming...) don't get adopted
> as primary languages is that 1) they lack enough syntax that it's
> difficult for most people to follow what's going on.

That's a big issue, that you have no way to know you left out
something and now have the stack off by one, hence it's virtually
impossible to have any debugging aids that tell you your program
syntax is nonsense. But a bigger reason is that lack of parens
means each function must either take exactly the same number of
arguments, precluding optional or keyword arguments, or you must
greenspun such in a very painful way.
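
The off-by-one problem can be seen in a tiny RPN evaluator. This Python sketch assumes a made-up word table in which every word has a fixed arity, as the paragraph above describes; `WORDS` and `eval_rpn` are hypothetical names. An operator with too few operands is caught at run time, but a forgotten operator just leaves an extra value on the stack without any complaint.

```python
import operator

# Each word has a fixed arity, as in a Forth-like language.
WORDS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "neg": (1, operator.neg),
}


def eval_rpn(source):
    """Evaluate a whitespace-separated RPN expression over integers.

    Returns the whole stack so leftover values are visible to the caller.
    """
    stack = []
    for tok in source.split():
        if tok in WORDS:
            arity, fn = WORDS[tok]
            if len(stack) < arity:
                raise ValueError("%r needs %d operands, stack has %d"
                                 % (tok, arity, len(stack)))
            # Pop the operands and restore left-to-right order.
            args = [stack.pop() for _ in range(arity)][::-1]
            stack.append(fn(*args))
        else:
            stack.append(int(tok))
    return stack
```

`eval_rpn("2 3 + 4 *")` leaves one value, 20. `eval_rpn("2 3 4 +")`, where a second operator was forgotten, quietly returns a two-element stack instead of reporting a syntax error.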

> And, 2) they require the user to keep track of variables, which
> is a monumental task on larger applications.

If there was a stack frame, such that every local variable kept a
fixed offset, you could make handwritten notes or memorize the
offsets and it wouldn't be horrible. But in a stack language, the
offset keeps changing every time you call a function, making it
impossible for a human mind to calculate where on the stack a given
parameter is at the moment.

Imagine if there were no absolute street addresses, only relative
addresses, and every time you met somebody new and wanted to tell
them where you live, so that they could come visit you, you had to
calculate the number of houses to count from here back to there.

> This is my list:
> 1) language has both low level and higher level programming features (like C)

Yuk. C doesn't have any higher level programming features.

> 2) language should have variables, unsigned integers, arrays, and strings

A string is nothing more than a sequence (array) of characters. It
just has a special read/print syntax to make it easier to specify
literal constants in a program and to make it easier to read debug
output. So delete strings from your wish-list and use characters
instead.

> (floating point not necessary)

Agreed, in fact IEEE floating point and everything before it is a
crock the way people use them. IEEE floating point supports
interval arithmetic, but hardly anybody takes full advantage of
that capability. And IEEE floating point doesn't support arbitrary
precision, merely two levels of fixed precision. It's a crock!!

> 3) the low level features should closely correspond to the cpu's
> assembly abilities

On modern multi-instruction-pipeline RISC CPUs that may be easier
said than done, or may be totally useless to an application
programmer.

> 4) the higher level features should provide:
> 4a) flow control, variable allocation, abilities to manipulate
> integers, arrays, strings
> 4b) dynamic memory allocation, functions for file I/O

You've begged the question with your sound-bite of "dynamic memory
allocation". Without a full-fledged garbage collector, you're up a
creek trying to get any large application clean of memory leaks.

Now it may or may not be practical to use a reference count system
instead of the more usual sweep-collect system, using elaborate
forms of self-balancing binary search tree instead of pointer loops.
But either way, requiring the application programmer to know the
exact moment when the last reference to an object goes away is
absurd, especially if you want different toplevel objects to share
inner structure to avoid needing to copy an entire structure just
to change one tiny part of it. (With self-balancing binary search
trees, you can produce a modified copy by re-building log(n) path
down to the change and sharing all the rest of the tree between old
and new trees.)
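
To make the path-copying idea concrete, here is a minimal sketch in C.
It assumes a plain, unbalanced binary search tree to keep things short
(a self-balancing tree shares structure the same way), and the names
Node, node_new, tree_insert and tree_contains are made up for
illustration. Note that nothing is ever freed: deciding when a shared
node can go away is exactly the job for a garbage collector.

```c
#include <stdlib.h>

/* Immutable tree node: once built, a node is never modified. */
typedef struct Node {
    int key;
    const struct Node *left;
    const struct Node *right;
} Node;

/* Hypothetical helper: allocate one node (never freed in this sketch). */
static const Node *node_new(int key, const Node *left, const Node *right)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();            /* out of memory: give up in this sketch */
    n->key = key;
    n->left = left;
    n->right = right;
    return n;
}

/* Returns the root of a NEW tree that contains key. Only the nodes on
 * the path from the root down to the change are rebuilt; every subtree
 * off that path is shared between the old tree and the new one. */
const Node *tree_insert(const Node *t, int key)
{
    if (t == NULL)
        return node_new(key, NULL, NULL);
    if (key < t->key)
        return node_new(t->key, tree_insert(t->left, key), t->right);
    if (key > t->key)
        return node_new(t->key, t->left, tree_insert(t->right, key));
    return t;               /* key already present: share this subtree */
}

/* Returns 1 if key is in the tree, 0 otherwise. */
int tree_contains(const Node *t, int key)
{
    while (t != NULL) {
        if (key == t->key)
            return 1;
        t = (key < t->key) ? t->left : t->right;
    }
    return 0;
}
```

The old root stays valid after an insert, so both versions of the tree
can be used side by side.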

> 5) the language, even if interpreted, should be able to produce binaries

Lisp can do that, if a FASL file works as a binary on the underlying OS.

Heck, machines are so fast you don't need the underlying OS to
support FASL files. You can have your .login script start up Lisp
and then use Lisp as your shell, an extra layer of overhead between
what you type and the OS, but who really cares when the machine is
spending 99.9% of its time waiting for you to compose a new
toplevel command and type it in, or waiting for you to select from
a menu of things you want it to do next.

Jacko

Jul 31, 2008, 12:06:34 PM

Here's my experience:

Assembler - mostly bad.
Blitz Basic - quite good.
BBC Basic and basic in general - good but slow, no dynamic typing.
Tcl/Tk - good, very good stack trace on error, faster than expected.
J - too complex to remember, but looks good if ya a math head, maybe I
do some later.
Forth - good low level, bit of a mind warp, compact and fast.
C - good, standard libraries, no encapsulation.
C++ - bit of a bum hunt.
Java - adaptable, makes you handle exceptions, works on phones.
bash et al - quick to knock up but tcl preferred.
php - nice and nasty.
sql - not very good at graphics and need VOID as well as NULL.
Mathematica - like J but has some wickedly good functions. (try
spacetime mathematics if on a budget)

languages I have read.
algol 68 - not bad except ref ref ref ...
fortran - don't go there, I hate it
cobol - very long winded typing, but good precision.
PL/1 - why bother use pascal
Ada - like pascal with directional ports and parallelism.
OCCAM - looked good, wonder how it's getting along.
Modula-2 - like pascal but better.

plus others.

Now in my experience soft typing with hard under types (Tcl) is a good
one; hard typing can be a real pain sometimes. It is good for catching
spelling mistakes only, although detecting uninitialized variables could
solve this better.

I like a good stack trace. Blah, blah is not an object is very
objectionable.

at the moment i am writing forth in java for mobile devices
http://mid4th.googlecode.com but on my pocketPC i have eTcl, J and
spacetime mathematics installed. I do some C occasionally, and some
php.

I use java almost exclusively on desktop, but have Tcl and C. I would
use java on Pocket PC if it had an IDE. I like java mostly, but it can
be a bit of a pain when trying to do vectored decisions, as creating
many inner classes and assigning them to array elements to get one
abstract method to execute is a real pain. (this is my missing
feature.) It should be as simple as defining an arrayed method

void meth[] () {

},{

} //etc

or making switch be faster making case start at 0 and use case: as the
next incremented case.
or do case labels where the case sets the sequential value (like a
local var)?

{meth(),meth2(),...}[idx];

for that on gosub feel (the only thing i miss from basic).
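
For comparison, the "arrayed method" idea is roughly what a table of
function pointers gives you in C. A minimal sketch, with made-up handler
names (op_double, op_negate, op_square) and a bounds check that a bare
on-gosub would not have:

```c
#include <stddef.h>

/* Every entry in the table has the same signature. */
typedef int (*handler_fn)(int);

static int op_double(int x) { return 2 * x; }
static int op_negate(int x) { return -x; }
static int op_square(int x) { return x * x; }

/* The table itself: index 0, 1, 2 select the handler. */
static const handler_fn handlers[] = { op_double, op_negate, op_square };

/* Calls handlers[idx](arg) and stores the result in *out.
 * Returns 0 on success, -1 if idx is outside the table. */
int dispatch(size_t idx, int arg, int *out)
{
    if (idx >= sizeof handlers / sizeof handlers[0])
        return -1;
    *out = handlers[idx](arg);
    return 0;
}
```

Adding a case is one more function and one more table entry; the
indexed call costs a load and an indirect jump.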

cheers
jacko

Rod Pemberton

Jul 31, 2008, 9:09:05 PM

"Robert Maas, http://tinyurl.com/uh3t"
<jaycx2.3....@spamgourmet.com.remove> wrote in message
news:rem-2008...@yahoo.com...

> > From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
> > I miss the more rapid development on interpreters.
>
> Why do you miss it?

I just stated why... (above m/r/d...). Languages that can do real work
don't usually have both interpreters and compilers available for them. They
usually have one or the other.

> > But, if it can't produce an executable binary, it has little
> > value to me.
>
> There's no such thing as an executable in a vacuum.

True. "There's no such thing as an" *interpreter* "in a vacuum"
either... Uh, just where did this vacuum you decided to fill [snipped] come
from? ;)

> So why don't you install an operating system

Have that.

> that supports both

That too... probably.

> Class files and FASL files as native executable formats, then both

...
> as native executable formats

Ah, but, that is the entire problem with your statements. These aren't
"native executable formats", i.e., binaries. They're interpreted code.
They can be compiled, just like C can be compiled. But, they aren't usually
compiled into binaries. And, they aren't distributed as source either.

> Common Lisp and Java would produce executables for your particular OS?

Do they produce permanent binaries from the "native executable formats"?
No... not usually. They might temporarily produce binaries, or binary code
snippets, if an architecture uses just-in-time compilation instead of a
virtual machine to execute the code.

Wait... wait.... Was part of your suggestion to install _two_ languages to
do the work that should be done by _one_ language?

> > This is my list:
> > 1) language has both low level and higher level programming features (like C)
>
> Yuk. C doesn't have any higher level programming features.

Huh...? I keep getting this response about C from people across various
NGs... (Why me?) And, they either:
1) are unable to express what they mean.
or 2) never state what they mean.

So, I ask, but I never hear what "higher level programming features" C
doesn't have... ever. Are you referring to object-oriented features? Most
of these are implementable in C - just not as easily as C++, remember that
C++ was implemented _entirely_ in C at first...

(I've had experience at one point in time or another in about 14
languages... So, to me, the common statement that C has no higher level
features is a real mystery given my background.)

> > 2) language should have variables, unsigned integers, arrays, and strings
>
> A string is nothing more than a sequence (array) of characters.

For C, true. But, not so for other languages (BASIC, PL/1) which implement
the type natively. I.e., while strings may actually be implemented as an
array of characters in memory, most languages support strings as a language
construct which is separate from arrays.

> > 3) the low level features should closely correspond to the cpu's
> > assembly abilities
>
> On modern multi-instruction-pipeline RISC CPUs that may be easier
> said than done,

True. I don't expect CISC functionality in C etc...

> or may be totally useless to an application
> programmer.

True. It might be useless to a pure application programmer (working on a
full featured OS). But, OS developers, game programmers, etc. program also,
and frequently need the ability to interface to things in assembly without
using assembly.

> > 4) the higher level features should provide:
> > 4a) flow control, variable allocation, abilities to manipulate
> > integers, arrays, strings
> > 4b) dynamic memory allocation, functions for file I/O
>
> You've begged the question with your sound-bite of "dynamic memory
> allocation". Without a full-fledged Garbage Collector, you're up a
> creek trying to get any large application clean of memory leaks.

Maybe...

It still doesn't change the idea that programmers need to do dynamic memory
allocation, as part of the native language. If you take C, how does one do
memory allocation without "stdlib.h"? They have to 1) preallocate more
memory than they need, e.g., as part of an array, or 2) emulate dynamic
memory using files, since files in C effectively have built in "memory"
allocation that is done behind the scenes.
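
Option 1 can be sketched as a tiny "bump" allocator that hands out
pieces of a preallocated static array. This is only an illustration
under assumptions: the names arena_alloc and arena_reset and the
4096-byte size are invented here, individual blocks cannot be freed,
and only <stddef.h> is used (for size_t and NULL).

```c
#include <stddef.h>

#define ARENA_SIZE 4096

/* Used only to pick a conservative alignment for returned blocks. */
typedef union {
    long double ld;
    long long ll;
    void *p;
} align_unit;

/* The union makes the byte array suitably aligned for common types. */
static union {
    unsigned char bytes[ARENA_SIZE];
    align_unit force_alignment;
} arena;

static size_t arena_used = 0;

/* Returns n bytes from the static arena, or NULL if n is 0 or the
 * arena is exhausted. There is no per-block free. */
void *arena_alloc(size_t n)
{
    size_t align = sizeof(align_unit);
    /* Round the current offset up to the next aligned position. */
    size_t start = (arena_used + align - 1) / align * align;

    if (n == 0 || start > ARENA_SIZE || n > ARENA_SIZE - start)
        return NULL;
    arena_used = start + n;
    return arena.bytes + start;
}

/* Releases everything at once. */
void arena_reset(void)
{
    arena_used = 0;
}
```

The cost is the one named above: the array has to be sized for the
worst case up front, whether or not the program ever needs it.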

> Now it may or may not be practical to use a reference count system
> instead of the more usual sweep-collect system, using elaborate
> forms of self-balancing binary search tree instead of pointer loops.
> But either way, requiring the application programmer to know the
> exact moment when the last reference to an object goes away is
> absurd,

In a large application, that's been worked on by numerous programmers over
twenty years, it's probably an absurd expectation... although they probably
solved the issue early on for their application. But, in a well-written
(i.e., using structured programming techniques), smaller application, this
shouldn't be a problem.

> especially if you want different toplevel objects to share
> inner structure to avoid needing to copy an entire structure just
> to change one tiny part of it.

"Inheritance" of data... between objects... That would make "releasing" the
data a problem for the programmer (since he/she's no longer in control...)
and so would become a compiler issue. (Was this a C++ example?) It also
begs the question: Why would one design a language that takes such control
over the data away from the programmer? Doesn't that defeat the purpose of
having a programmer program?

> (With self-balancing binary search
> trees, you can produce a modified copy by re-building log(n) path
> down to the change and sharing all the rest of the tree between old
> and new trees.)

Is this worth the overhead?... (memory, time, speedup, etc...) It seems
like much work to me - just to not make a copy of some data...

> Heck, machines are so fast you don't need the underlying OS to
> support FASL files. You can have your .login script start up Lisp
> and then use Lisp as your shell, an extra layer of overhead between
> what you type and the OS, but who really cares when the machine is
> spending 99.9% of its time waiting for you to compose a new
> toplevel command and type it in, or waiting for you to select from
> a menu of things you want it to do next.

You care when the application you're running is consuming a large percent of
the machine time.

Rod Pemberton

Pascal J. Bourguignon

Aug 1, 2008, 2:39:12 AM

"Rod Pemberton" <do_no...@nohavenot.cmm> writes:

> "Robert Maas, http://tinyurl.com/uh3t"
> <jaycx2.3....@spamgourmet.com.remove> wrote in message
>

>> > This is my list:
>> > 1) language has both low level and higher level programming features (like C)
>>
>> Yuk. C doesn't have any higher level programming features.
>
> Huh...? I keep getting this response about C from people across various
> NGs... (Why me?) And, they either:
> 1) are unable to express what they mean.
> or 2) never state what they mean.

C is not a high level language.
C is a portable assembler, designed to implement the unix kernel.

Are you implementing a system kernel? If no, then you don't need C.

> So, I ask, but I never hear what "higher level programming features" C
> doesn't have... ever.

- first class functions,
- automatic memory management (garbage collector),
- macros (no, C has no macro. For real macros you need lisp),
- exceptions,
- packages / modules,
- bignums,
- bounds checked arrays, multidimensional arrays,
- strings and characters (no, C has no string and no character),
- strong type safety (ie. not allowing ("iv"+1)),
- object oriented programming,
- etc.

> Are you referring to object-oriented features? Most
> of these are implementable in C - just not as easily as C++, remember that
> C++ was implemented _entirely_ in C at first...

Indeed it's possible, and happily we could, implement a high level
programming language in assembler. That doesn't prove that assembler
is a high level programming language.

> (I've had experience at one point in time or another in about 14
> languages... So, to me, the common statement that C has no higher level
> features is a real mystery given my background.)

Perhaps you didn't learn the right programming language?

>> (With self-balancing binary search
>> trees, you can produce a modified copy by re-building log(n) path
>> down to the change and sharing all the rest of the tree between old
>> and new trees.)
>
> Is this worth the overhead?... (memory, time, speedup, etc...) It seems
> like much work to me - just to not make a copy of some data...

Yes it is worth the overhead. The point is to keep the existing
structures immutable, so they can be shared. This way, you don't need
to copy the structure everywhere, just to be sure, as it is done
usually in C and C++ programs.

--
__Pascal Bourguignon__ http://www.informatimago.com/

This is a signature virus. Add me to your signature and help me to live.

gremnebulin

Aug 1, 2008, 5:20:11 AM

On 1 Aug, 02:09, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:
> "Robert Maas,http://tinyurl.com/uh3t"<jaycx2.3.calrob...@spamgourmet.com.remove> wrote in message

ke C)
>
> > Yuk. C doesn't have any higher level programming features.
>
> Huh...? I keep getting this response about C from people across various
> NGs... (Why me?) And, they either:
> 1) are unable to express what they mean.
> or 2) never state what they mean.

It doesn't have *some*

* No assignment of arrays or strings (copying can be done via
standard functions; assignment of objects having struct or union type
is supported)
* No automatic garbage collection
* No requirement for bounds checking of arrays
* No operations on whole arrays
* No syntax for ranges, such as the A..B notation used in several
languages
* No separate Boolean type: zero/nonzero is used instead[10]
* No nested function definitions
* No formal closures or functions as parameters (only function and
variable pointers)
* No generators or coroutines; intra-thread control flow consists
of nested function calls, except for the use of the longjmp or
setcontext library functions
* No exception handling; standard library functions signify error
conditions with the global errno variable and/or special return values
* Only rudimentary support for modular programming
* No compile-time polymorphism in the form of function or operator
overloading
* Only rudimentary support for generic programming
* Very limited support for object-oriented programming with regard
to polymorphism and inheritance
* Limited support for encapsulation
* No native support for multithreading and networking
* No standard libraries for computer graphics and several other
application programming needs

(WP)

Rod Pemberton

Aug 1, 2008, 5:50:42 AM

"Pascal J. Bourguignon" <p...@informatimago.com> wrote in message
news:87y73h8...@hubble.informatimago.com...

> > Huh...? I keep getting this response about C from people across various
> > NGs... (Why me?) And, they either:
> > 1) are unable to express what they mean.
> > or 2) never state what they mean.
>
> C is not a high level language.

What? That claim seems highly revisionist to me. C, FORTRAN, Pascal, etc.
are all HLLs. Please (re)define "high level language"...

> C is a portable assembler,

C has low-level features, but it's not an assembler. Due to the era of its
primary development, it captures much of load-store and early RISC
functionality, IMO.

FYI, the quote to which you are referring, was by Larry Rosler (of X3J11):
"I taught the first course on C at Bell Labs, using a draft of K&R, which
helped vet the exercises. The students were hardware engineers who were
being induced to learn programming. They found C (which is 'portable
assembly language') much to their liking. Essentials such as pointers are
very clear if you have a machine model in mind."

> designed to implement the unix kernel.

Untrue myth. That was never a goal for C. Ritchie's papers are online
which confirm this.

> Are you implementing a system kernel? If no, then you don't need C.

C is a general purpose programming language. Are you saying you don't need
a general purpose programming language?

> > So, I ask, but I never hear what "higher level programming features" C
> > doesn't have... ever.
>
> - first class functions,

C has, AFAIK. It depends on how you define 'class' here... I.e.,
procedures (and derivatives of, such as functions) are part of the language,
i.e., first class, but if you're referring to C++ classes, then no...

> - automatic memory management (garbage collector),

C's not interpreted. Garbage collectors are normally implemented for
interpreted languages. Why do you want "automatic memory management" with
C? However, a garbage collector has been implemented for at least one C
(Jacob Navia's variant of LCC: LCC-Win32). I'm not sure how he managed
that, since pointer use would need to be restricted or eliminated...

> - macros (no, C has no macro.

True. The C preprocessor implements macros...

> For real macros you need lisp),

Please explain.

> - exceptions,

What type of exceptions? C has exceptions for numerical results and also
has software "exceptions" called signals.

> - packages / modules,

What do you mean by "packages" and "modules"? C has libraries... Are these
similar in concept? Linux "modules" are just compiled C code, like
libraries...

> - bignums,

Available as a library for C...

> - bounds checked arrays,

Not available as part of the language.

> multidimensional arrays,

Part of the language.

> - strings

Implemented as "arrays" of characters. Strings aren't a fundamental type in
C though. Neither are arrays. Although C has array declarations, the C
language doesn't actually have arrays, but contiguous sequences of objects
and the offset operator which visually simulate arrays for the novice.

> and characters (no, C has no string and no character),

Part of the C language. C supports byte character sets such as EBCDIC, or
ASCII, as well as larger character sets such as Unicode with "wide
characters".

> - strong type safety

C has moderate type safety. *Many* times this limited type checking has to
be overridden with casts to implement basic numeric conversions or for
accessing assembly. If a strongly typed language has solutions for this so
casts or other conversion methods are unneeded, then great!

> (ie. not allowing ("iv"+1)),

"one-past-the-end" is required for any language that supports pointers which
can use the pointer to access an object via smaller individual elements,
e.g., as an object accessed as an array of characters by pointers in C -
required. I.e., a pointer-free or restricted pointer language could
implement this requirement. But, having programmed in early Pascal without
pointers, and programmed in other high use pointer languages, e.g., PL/1,
this is _not_ something you want as a programmer. It is something you want
to prevent if you'd like to reduce coding errors.

> - object oriented programming,

Not available. Although, some of the look of object-oriented code can be
simulated in C.

> >> (With self-balancing binary search
> >> trees, you can produce a modified copy by re-building log(n) path
> >> down to the change and sharing all the rest of the tree between old
> >> and new trees.)
> >
> > Is this worth the overhead?... (memory, time, speedup, etc...) It seems
> > like much work to me - just to not make a copy of some data...
>
> Yes it is worth the overhead.

Why? (e.g., structures need to be shared to implement ____ or this
effectively allows various forms of memory compaction, etc.)

> The point is to keep the existing
> structures immutable, so they can be shared.

I think he said that...

> This way, you don't need
> to copy the structure everywhere, just to be sure,

That's useful if you have to share data, but how frequently does that
occur? And, does it occur much outside OS design? (which you said wasn't
needed for a HLL...)

> as it is done
> usually in C and C++ programs.

It's not that common in C from my experiences... perhaps frequent in C++?

Rod Pemberton

Bartc

Aug 1, 2008, 6:06:03 AM

"Pascal J. Bourguignon" <p...@informatimago.com> wrote in message
news:87y73h8...@hubble.informatimago.com...

> "Rod Pemberton" <do_no...@nohavenot.cmm> writes:
>
>> "Robert Maas, http://tinyurl.com/uh3t"

>>> Yuk. C doesn't have any higher level programming features.

>>
>> Huh...? I keep getting this response about C from people across
>> various
>> NGs... (Why me?) And, they either:
>> 1) are unable to express what they mean.
>> or 2) never state what they mean.
>
> C is not a high level language.
> C is a portable assembler, designed to implement the unix kernel.
>
> Are you implementing a system kernel? If no, then you don't need C.

C has a few other uses besides kernels. For example, in building some of
those higher level languages.

That's what I use it for (or my version of it anyway); for these purposes,
you don't really need any of those fancy features in your list, or that of
gremnebulin, because many of those will be in the new language.

> - macros (no, C has no macro. For real macros you need lisp),

C has macros out of necessity, but otherwise they render code (ie proper
code, not Lisp-like) unreadable. Is this really a desirable feature? It
seems a low-level feature to me.

--
Bartc

Rod Pemberton

Aug 1, 2008, 6:54:05 AM

"gremnebulin" <peter...@yahoo.com> wrote in message
news:97e5a2f8-8e35-4034...@j22g2000hsf.googlegroups.com...

> On 1 Aug, 02:09, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:
> > "Robert Maas,http://tinyurl.com/uh3t"<jaycx2.3.calrob...@spamgourmet.com.remove>
> > wrote in message
> ke C)
> >
> > > Yuk. C doesn't have any higher level programming features.
> >
> > Huh...? I keep getting this response about C from people across various
> > NGs... (Why me?) And, they either:
> > 1) are unable to express what they mean.
> > or 2) never state what they mean.
>
>
> It doesn't have *some*
>

I'm not sure where you pulled this list from, but it's not correct.

> * No assignment of arrays or strings (copying can be done via
> standard functions; assignment of objects having struct or union type
> is supported)

(True.) I'm not sure. I was thinking C99 supported array assignment, but
... probably not since string assignment would then be implemented too. So,
probably: True.

> * No automatic garbage collection

True. Garbage collection is not usually implemented in compiled languages.
(see LCC-Win32 for counter-example)

> * No requirement for bounds checking of arrays

True.

> * No operations on whole arrays

True, AFAIK.

> * No syntax for ranges, such as the A..B notation used in several
> languages

True. Not really needed, IMO.

> * No separate Boolean type: zero/nonzero is used instead[10]

True and False. False for C99, see _Bool. True for C89. Feature is
trivial and unneeded. Zero and non-zero is common for most HLLs anyway
(except for one language which used a value other than zero for false,
perhaps was ADA?).

> * No nested function definitions

True. Unnecessary, like goto, if you understand structured programming
concepts. That's why C doesn't have nested functions. This is mentioned
somewhere in Ritchie's articles.

> * No formal closures

Not sure what these are...

> * No ... functions as parameters (only function and variable pointers)

Not sure what you mean here. Function pointers can be passed as parameters.
There is no reason to "pass" the actual function (which is binary code) as a
"parameter" in C. (i.e., code and data are separate... and there is no need
to pass code in C.) Is this an attempt to apply object-oriented features to
C? e.g., encapsulation?

> * No generators or coroutines; intra-thread control flow consists
> of nested function calls, except for the use of the longjmp or
> setcontext library functions

False. (And, I've seen no real C examples where these are needed, only
made up ones...)
http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html
http://groups.google.com/group/net.lang.c/msg/66008138e07aa94c?hl=en
http://groups.google.com/group/comp.lang.c/msg/bb78298175c42411?hl=en
http://www.sics.se/~adam/pt/
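
A minimal sketch in the spirit of the switch-based trick described at
the first link, written from scratch rather than copied from it. The
names counter_state and counter_next are hypothetical. The function
remembers where it left off in s->line and jumps back into the middle
of the loop on the next call:

```c
/* State kept between calls; anything that must survive a "yield"
 * lives here instead of in ordinary local variables. */
typedef struct {
    int line;   /* resume point: 0 means "start from the top" */
    int i;      /* loop counter preserved across calls */
} counter_state;

/* Yields 0, 1, ..., limit-1 on successive calls, then -1 forever. */
int counter_next(counter_state *s, int limit)
{
    switch (s->line) {
    case 0:
        for (s->i = 0; s->i < limit; s->i++) {
            s->line = 1;        /* remember to resume inside the loop */
            return s->i;        /* "yield" the current value */
    case 1:;                    /* execution resumes here next call */
        }
    }
    s->line = 2;                /* finished: no case label matches 2 */
    return -1;
}
```

Start with a zeroed state (counter_state s = {0, 0};) and call
counter_next(&s, 3) repeatedly. The case label inside the for body is
legal C, though some compilers may warn about it.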

> * No exception handling; standard library functions signify error
> conditions with the global errno variable and/or special return values

False. There are numeric exceptions and software exception handling via
signal() etc.
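
A small sketch of what signal() gives you, using only <signal.h> from
the standard library. The names on_sigint and install_and_raise are
invented for the example. Note that the handler only sets a flag;
unlike a try/catch block it does not unwind the stack or carry any
information about what went wrong:

```c
#include <signal.h>

/* sig_atomic_t is the only object type a handler may safely write. */
static volatile sig_atomic_t got_signal = 0;

/* Handler: record that the signal arrived and return. */
static void on_sigint(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises SIGINT in this process, and returns
 * 1 if the handler ran, 0 if it did not, -1 if installation failed. */
int install_and_raise(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_sigint) == SIG_ERR)
        return -1;
    raise(SIGINT);      /* the handler runs before raise() returns */
    return got_signal;
}
```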

> * Only rudimentary support for modular programming

(False.) Okay, let me go look up what you mean by "modular programming"...
According to Wikipedia, this doesn't need explicit language support and
classifies languages which use libraries and linkers (like C) as "modular
programming". So, False...

> * No compile-time polymorphism in the form of function or operator
> overloading

This is an object-oriented language feature. C is not object-oriented.

> * Only rudimentary support for generic programming

Okay, let me go look up what you mean by "generic programming"... This
seems to be similar in nature to C's casts, but more advanced and related to
object-oriented classes. This is an object-oriented language feature. C is
not object-oriented.

> * Very limited support for object-oriented programming with regard
> to polymorphism and inheritance

These are an object-oriented language feature. C is not object-oriented.

> * Limited support for encapsulation

This is an object-oriented language feature. C is not object-oriented.

> * No native support for multithreading

Unnecessary in C. In addition to being hardware specific on certain
platforms, this has more to do with C compiler implementation, than with the
language itself.

> * No native support for networking

True. OS dependent.

> * No standard libraries for computer graphics and several other
> application programming needs

True. OS dependent.

--
FWIW, most of the issues seem to be about lack of object-orientedness, or
are trivial and unnecessary, or are OS implementation specific.

Rod Pemberton

Marco van de Voort

Aug 1, 2008, 7:55:51 AM

On 2008-07-31, Robert Maas, http://tinyurl.com/uh3t <jaycx2.3....@spamgourmet.com.remove> wrote:
> executed.
> - A Lisp Machine executable is probably just a FASL file, right?
>
> So why don't you install an operating system that supports both
> Class files and FASL files as native executable formats, then both
> Common Lisp and Java would produce executables for your particular OS?

You must be a teacher. Your answers are perfectly correct, and totally
impractical.

gremnebulin

Aug 1, 2008, 8:34:08 AM

On 1 Aug, 11:54, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:
> "gremnebulin" <peterdjo...@yahoo.com> wrote in message

> news:97e5a2f8-8e35-4034...@j22g2000hsf.googlegroups.com...
> > On 1 Aug, 02:09, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:
> > > "Robert Maas,http://tinyurl.com/uh3t"<jaycx2.3.calrob...@spamgourmet.com.remove>
> > > wrote in message
>
> > ke C)
>
> > > > Yuk. C doesn't have any higher level programming features.
>
> > > Huh...? I keep getting this response about C from people across various
> > > NGs... (Why me?) And, they either:
> > > 1) are unable to express what they mean.
> > > or 2) never state what they mean.
>
> > It doesn't have *some*
>
> I'm not sure where you pulled this list from, but it's not correct.

It's wikipedia so you can correct it. But notice the context.

It's an attempt to apply LISP-like behaviour --lambda and so on. It's
missing, for all that it goes completely against the grain of the
language.

> made up ones...)
> http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html
> http://groups.google.com/group/net.lang.c/msg/66008138e07aa94c?hl=en
> http://groups.google.com/group/comp.lang.c/msg/bb78298175c42411?hl=en
> http://www.sics.se/~adam/pt/

>
> > * No exception handling; standard library functions signify error
> > conditions with the global errno variable and/or special return values
>
> False. There are numeric exceptions and software exception handling via
> signal() etc.

That's libraries, not the language per se.

> > * Only rudimentary support for modular programming
>
> (False.) Okay, let me go look up what you mean by "modular programming"...
> According to Wikipedia, this doesn't need explicit language support and
> classifies languages which use libraries and linkers (like C) as "modular
> programming". So, False...

There is no syntax for modules, it is inferred from how things are
placed into files.

> > * No compile-time polymorphism in the form of function or operator
> > overloading
>
> This is an object-oriented language feature. C is not object-oriented.
>
> > * Only rudimentary support for generic programming
>
> Okay, let me go look up what you mean by "generic programming"... This
> seems to be similar in nature to C's casts, but more advanced and related to
> object-oriented classes. This is an object-oriented language feature. C is
> not object-oriented

Generic programming basically means being able to write algorithms
with data types abstracted away or parameterised. Alexander Stepanov,
author of the STL, argues that generic programming is different to and
more useful than OOP.
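
In C the nearest equivalent is the qsort() style: the element type is
erased to void * and the caller passes the element size and a
comparison function. A sketch under that assumption, with hypothetical
names generic_max, compare_int and compare_double. Unlike C++
templates, nothing here is checked by the compiler; passing the wrong
size or comparator compiles fine.

```c
#include <stddef.h>

/* Same contract as a qsort() comparator: <0, 0 or >0. */
typedef int (*compare_fn)(const void *, const void *);

/* Returns a pointer to the largest of count elements of size bytes
 * each, starting at base, or NULL if the array is empty. */
const void *generic_max(const void *base, size_t count, size_t size,
                        compare_fn cmp)
{
    const unsigned char *p = base;
    const void *best = NULL;
    size_t i;

    for (i = 0; i < count; i++) {
        const void *elem = p + i * size;
        if (best == NULL || cmp(elem, best) > 0)
            best = elem;
    }
    return best;
}

int compare_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

int compare_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}
```

The algorithm is written once; only the comparator knows the real
element type.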

> > * Very limited support for object-oriented programming with regard
> > to polymorphism and inheritance
>
> These are an object-oriented language feature. C is not object-oriented.
>
> > * Limited support for encapsulation
>
> This is an object-oriented language feature. C is not object-oriented.

> > * No native support for multithreading
>
> Unnecessary in C. In addition to being hardware specific on certain
> platforms, this has more to do with C compiler implementation, than with the
> language itself.

It's still not there.

> > * No native support for networking
>
> True. OS dependent.
>
> > * No standard libraries for computer graphics and several other
> > application programming needs
>
> True. OS dependent.

No, e.g. OpenGL.

> --
> FWIW, most of the issues seem to be about lack of object-orientedness, or
> are trivial and unnecessary, or are OS implementation specific.

The points about strings and arrays stand up, as most HLLs have them
and they are useful features.

Pascal J. Bourguignon

Aug 1, 2008, 4:19:13 PM

"Rod Pemberton" <do_no...@nohavenot.cmm> writes:

> "Pascal J. Bourguignon" <p...@informatimago.com> wrote in message
> news:87y73h8...@hubble.informatimago.com...
>> > Huh...? I keep getting this response about C from people across various
>> > NGs... (Why me?) And, they either:
>> > 1) are unable to express what they mean.
>> > or 2) never state what they mean.
>>
>> C is not a high level language.
>
> What? That claim seems highly revisionist to me. C, FORTRAN, Pascal, etc.
> are all HLLs. Please (re)define "high level language"...

"high-level language" refers to the higher level of abstraction from machine language.
http://en.wikipedia.org/wiki/High-level_programming_language

The least we can say is that C doesn't try to abstract anything from
the machine language. It only tries to generalize machine language to
let unix be ported from one processor to the other.

>> C is a portable assembler,
>
> C has low-level features, but it's not an assembler. Due to the era of its
> primary development, it captures much of load-store and early RISC
> functionality, IMO.
>
> FYI, the quote to which you are referring, was by Larry Rosler (of X3J11):
> "I taught the first course on C at Bell Labs, using a draft of K&R, which
> helped vet the exercises. The students were hardware engineers who were
> being induced to learn programming. They found C (which is 'portable
> assembly language') much to their liking. Essentials such as pointers are
> very clear if you have a machine model in mind."
>
>> designed to implement the unix kernel.
>
> Untrue myth. That was never a goal for C. Ritchie's papers are online
> which confirm this.

The first sentence of the abstract of
http://cm.bell-labs.com/cm/cs/who/dmr/chist.html
by Dennis M. Ritchie is:

"The C programming language was devised in the early 1970s as a
system implementation language for the nascent Unix operating
system."

>> Are you implementing a system kernel? If no, then you don't need C.
>
> C is a general purpose programming language. Are you saying you don't need
> a general purpose programming language?

If your purpose is to introduce bugs in all your applications, then
yes, C is a general purpose programming language.

>> > So, I ask, but I never hear what "higher level programming features" C
>> > doesn't have... ever.
>>
>> - first class functions,
>
> C has, AFAIK. It depends on how you define 'class' here... I.e.,
> procedures (and derivatives of, such as functions) are part of the language,
> i.e., first class, but if you're referring to C++ classes, then no...

http://en.wikipedia.org/wiki/First-class_function

Integers are first class objects in C.

You can write literal integers: 0, 1, 2, etc. (compilation time)
you can define integer constants: const int pi=3;
you can store integers into variables: int i=0;
you can create new integers: i+1 (run-time)
you can pass integers to functions: f(int x); f(42);
you can return integers from functions: int f(int x); i=f(42);

But you cannot do all of that with functions:

typedef int (*fi)(int);
you can return functions from functions: fi g(int x); g(42)(2);
you can pass functions to functions: h(fi g){g(42);} h(f);
you CANNOT create new functions: (run-time)
you can store functions into variables: fi f=g(42);
you can define function constants: int g(int x){return(x+1);}
You CANNOT write literal functions: (compilation time)

In lisp and any other high level programming language,
(defun adder (x) (lambda (y) (+ x y)))
you can return functions from functions: (adder 42) --> #<FUNCTION (Y) (+ 42 Y)>
you can pass functions to functions: (mapcar #'adder '(1 2 3))
--> (#<FUNCTION (Y) (+ 1 Y)>
#<FUNCTION (Y) (+ 2 Y)>
#<FUNCTION (Y) (+ 3 Y)>)
you can create new functions: see adder above. (run-time)
you can store functions into variables: (setf g (adder 42))
you can define function constants: (defun h (z) (* z 2))
You can write literal functions: ((lambda (x) (* x x)) 3) (compilation time)
(setf g (lambda (x) (* x x))) (funcall g 3)

>> - automatic memory management (garbage collector),
>
> C's not interpreted.

C is interpreted.
CINT - http://root.cern.ch/root/Cint.html
EiC - http://eic.sourceforge.net/
Ch - http://www.softintegration.com

> Garbage collectors are normally implemented for
> interpreted languages.

Money is normally lost in casinos. That's not a reason to go to a
casino and lose all your money.

But the point is that there is absolutely no correlation between
language and implementation (i.e. compiler, interpreter, byte-compiler
with a virtual machine interpreter, byte-compiler with a virtual
machine interpreter doing just-in-time compilation, processor,
hardwired processor, microprogrammed processors, etc). And there is
no correlation between a style of implementation and the presence or
absence of a garbage collector.

But there is a correlation between a language that wants you to think
in terms of bytes and pointers and wants you to implement a memory
management system (things you would of course want to do if you were
implementing a unix kernel), and a high level language that wants you
to think about your problem in terms of the problem domain, and that
will manage the low-level details such as processor and memory for
you.

> Why do you want "automatic memory management" with C?

My point. If you are implementing a unix kernel, indeed you don't
really need automatic memory management. But if you are implementing
anything else like, say, MIS applications or web services, then you
don't care about memory, you care about bank accounts, or salaries, or
stock items and sales, etc.

> However, a garbage collector has been implemented for at least one C
> (Jacob Navia's variant of LCC: LCC-Win32). I'm not sure how he managed
> that, since pointer use would need to be restricted or eliminated...

There's also BoehmGC.

>> - macros (no, C has no macro.
>
> True. The C preprocessor implements macro's...

cpp doesn't implement macros, it implements some basic textual
substitution that has nothing to do with macros. Macro-assemblers
have more powerful macro systems than cpp. (in that respect, we could
say that C is even less than a good macro assembler).

>> For real macros you need lisp),
>
> Please explain.

Have a look at:
http://www.lisperati.com/casting.html
http://gigamonkeys.com/book/macros-defining-your-own.html
http://psg.com/~dlamkins/sl/chapter20.html

Basically, in lisp, macros are compiler hooks, normal lisp
functions. By the properties of lisp, macros receive as arguments
parse trees (symbolic expressions), not text chunks, and they return a
symbolic expression (a parse tree), that substitutes the macro call.

(There are also reader macros that read characters and return an
object, and compiler macros that can substitute specific function
calls with another expression (for example to inline a faster version
when the arguments have certain forms)).

>> - exceptions,
>
> What type of exceptions? C has exceptions for numerical results and also
> has software "exceptions" called signals.

This is not provided by C, but by the POSIX and OS layer. Signals are
sent to any process, whatever the programming language used to build
it.

Exceptions are exceptional conditions, or error cases. They are
handled out of band. I'm really surprised you don't know them, since
they're present in every high level programming language. Even BASIC,
which can hardly be considered a high level language, had an ONERROR
statement to handle predefined errors.

>> - packages / modules,
>
> What do you mean by "packages" and "modules"? C has libraries... Are these
> similar in concept? Linux "modules" are just compiled C code, like
> libraries...

In pascal they're called UNIT.
In Modula-2 they're called MODULE.
In Ada and lisp they're called PACKAGE.
In C++ they're called namespace.

This is a notion that is important when you write big programs. In C,
there's no such notion. Libraries must be careful not to use a name
already used in another library, for example, using systematically
hopefully unique prefixes.

>> - bignums,
>
> Available as a library for C...

Like for any assembler yes. What we call high level programming
here is the ability to do:

(print (factorial 40))

and get:

815915283247897734345611269596115894272000000000

Yes, we can also write:

move.l #40,-(sp)
jsr bignum_from_int
move.l d0,d7
move.l d0,-(sp)
jsr bignum_factorial
move.l d0,d6
move.l d0,-(sp)
jsr bignum_print
move.l d6,-(sp)
jsr bignum_free
move.l d7,-(sp)
jsr bignum_free

or if you want it more portably:

bignum_t fourty=bignum_from_int(40);
bignum_t fourty_factorial=bignum_factorial(fourty);
bignum_print(fourty_factorial);
bignum_free(fourty_factorial);
bignum_free(fourty);

>> - bounds checked arrays,
>
> Not available as part of the language.

Yes. It's a characteristic feature of any assembler.

>> multidimensional arrays,
>
> Part of the language.

Idem.

>> - strings
>
> Implemented as "arrays" of characters. Strings aren't a fundamental type in
> C though. Neither are arrays. Although C has array declarations, the C
> language doesn't actually have arrays, but contiguous sequences of objects
> and the offset operator which visually simulate arrays for the novice.

My point again.

>> and characters (no, C has no string and no character),
>
> Part of the C language. C supports byte character sets such as EBCDIC, or
> ASCII, as well as larger character sets such as Unicode with "wide
> characters".
>
>> - strong type safety
>
> C has moderate type safety. *Many* times this limited type checking has to
> be overridden with casts to implement basic numeric conversions or for
> accessing assembly. If a strongly typed language has solutions for this so
> casts or other conversion methods are uneeded, then great!
>
>> (ie. not allowing ("iv"+1)),
>
> "one-past-the-end" is required for any language that supports pointers which

I don't mean that. That's the problem with low-level languages: the
meaning of the program is not obvious to a human reader.

"iv" is not a string. It's a pointer to a byte (misnamed char in C),
namely the first byte of a block of memory containing the three bytes
105, 118, 0. "iv"+1 is a pointer to the second byte, 118.

In a sane programming language (not even asking for a high level
programming language here), we should get an error.
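
To make the point concrete, here is a small C sketch. The function name second_byte_of_iv is made up for illustration. It shows that the addition is accepted without any error and that the result is simply a pointer to the second byte ('v', which is 118 assuming an ASCII character set).

```c
#include <stddef.h>

/* Hypothetical helper: returns the pointer C computes for "iv" + 1. */
const char *second_byte_of_iv(void)
{
    /* The literal decays to a pointer to its first byte. */
    const char *s = "iv";

    /* No error and no roman-numeral arithmetic: + 1 just advances
       the pointer by one byte, so it now points at 'v'. */
    return s + 1;
}
```

Reading the result as a string gives "v", not 5 and not a type error.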

> can use the pointer to access an object via smaller individual elements,
> e.g., as an object accessed as an array of characters by pointers in C -
> required. I.e., a pointer-free or restricted pointer language could
> implement this requirement. But, having programmed in early Pascal without
> pointers, and programmed in other high use pointer languages, e.g., PL/1,
> this is _not_ something you want as a programmer. It is something you want
> to prevent if you'd like to reduce coding errors.
>
>> - object oriented programming,
>
> Not available. Although, some of the look of object-oriented code can be
> simulated in C.

(Yes, you can simulate anything with a Turing machine, or with a
brainfuck virtual machine http://en.wikipedia.org/wiki/Brainfuck )

But that's another thing that sets lisp apart (lisp is even higher
level than the other high level programming languages). If a
paradigm is not available in a normal programming language, all you
can do is to simulate it. Not in lisp. Thanks to the features of
lisp, you are not limited to the data abstraction and the procedural
abstraction, you also have the syntactic abstraction (with lisp
macros), and the metalinguistic abstraction (one step beyond macro).

In fact, in lisp, there is no object oriented programming. Well,
there WAS no object oriented programming. But merely by defining
macros and library functions, the lisp programming language can be
programmed (by lisp users, not by high language designers and compiler
priests) to be extended with any programming paradigm you want. It's
been done for object oriented programming (the first implementations
of CLOS, like PCL, are just add-ons to the base Common Lisp, somewhat
like Objective-C and C++ were pre-processors to C). It's also been
done for other paradigms, like logic programming (prolog), and others.

But the difference is that to do that, you don't need to be a compiler
specialist, to master lex+yacc, and to re-implement a whole language.
With these little compiler hooks called macros, and as a plain
programmer, you can add operators to the lisp language smoothly
integrated to the existing language.

Have a look at:
Structure and Interpretation of Computer Programs
http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-4.html
http://swiss.csail.mit.edu/classes/6.001/abelson-sussman-lectures/

>> >> (With self-balancing binary serach
>> >> trees, you can produce a modified copy by re-building log(n) path
>> >> down to the change and sharing all the rest of the tree between old
>> >> and new trees.)
>> >
>> > Is this worth the overhead?... (memory, time, speedup, etc...) It seems
>> > like much work to me - just to not make a copy of some data...
>>
>> Yes it is worth the overhead.
>
> Why? (e.g., structures need to be shared to implement ____ or this
> effectively allows various forms of memory compaction, etc.)
>
>> The point is to keep the existing
>> structures immutable, so they can be shared.
>
> I think he said that...
>
>> This way, you don't need
>> to copy the structure everywhere, just to be sure,
>
> That's useful if you have to shared data, but how frequently does that
> occur? And, does it occur much outside OS design? (which you said wasn't
> needed for a HLL...)

Data structure sharing is natural when you have a garbage collector
and use a functional programming style. When you have these tools,
it occurs a lot. (Which saves a lot of processor cycles not wasted on
memory management: since we just refer to the existing data structure,
we don't need to allocate a copy again).
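
As a rough illustration of the sharing described above, here is a sketch in C of insertion into an immutable binary search tree. The names struct node, make_node, tree_insert and tree_contains are invented for this example, and the tree is not self-balancing. The sketch also never frees anything: deciding when a shared node may be released is exactly the job a garbage collector would do.

```c
#include <stdlib.h>

/* Immutable tree node: never modified once it has been created. */
struct node {
    int key;
    const struct node *left;
    const struct node *right;
};

/* Allocate a node; returns NULL if malloc fails. */
static const struct node *make_node(int key, const struct node *left,
                                    const struct node *right)
{
    struct node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->key = key;
        n->left = left;
        n->right = right;
    }
    return n;
}

/* Return a tree containing key, leaving the tree rooted at t untouched.
   Only the nodes on the path from the root to the insertion point are
   copied; every other subtree is shared between the old and new trees. */
const struct node *tree_insert(const struct node *t, int key)
{
    if (t == NULL)
        return make_node(key, NULL, NULL);
    if (key < t->key) {
        const struct node *l = tree_insert(t->left, key);
        return l ? make_node(t->key, l, t->right) : NULL;
    }
    if (key > t->key) {
        const struct node *r = tree_insert(t->right, key);
        return r ? make_node(t->key, t->left, r) : NULL;
    }
    return t; /* key already present: share the whole tree */
}

/* Ordinary lookup, used to show that both versions stay valid. */
int tree_contains(const struct node *t, int key)
{
    while (t != NULL && t->key != key)
        t = key < t->key ? t->left : t->right;
    return t != NULL;
}
```

After b = tree_insert(a, k), both a and b remain usable, and the subtrees off the insertion path are the very same nodes in both.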

>> as it is done
>> usually in C and C++ programs.
>
> It's not that common in C from my experiences...

Yes, it's inversely proportional to the number of bugs. ;-)

> perhaps frequent in C++?

Yes, they like to exercise their copy constructors :-)

Be sure to read this nice introduction to lisp for the practicing
programmer: http://gigamonkeys.com/book/

--
__Pascal Bourguignon__ http://www.informatimago.com/

In deep sleep hear sound,
Cat vomit hairball somewhere.
Will find in morning.

Pascal J. Bourguignon

Aug 1, 2008, 4:25:21 PM

"Bartc" <b...@freeuk.com> writes:

> "Pascal J. Bourguignon" <p...@informatimago.com> wrote in message
> news:87y73h8...@hubble.informatimago.com...
>> "Rod Pemberton" <do_no...@nohavenot.cmm> writes:
>>
>>> "Robert Maas, http://tinyurl.com/uh3t"
>
>>>> Yuk. C doesn't have any higher level programming features.
>>>
>>> Huh...? I keep getting this response about C from people across
>>> various
>>> NGs... (Why me?) And, they either:
>>> 1) are unable to express what they mean.
>>> or 2) never state what they mean.
>>
>> C is not a high level language.
>> C is a portable assembler, designed to implement the unix kernel.
>>
>> Are you implementing a system kernel? If no, then you don't need C.
>
> C has a few other uses besides kernels. For example, in building some
> of those higher level languages.
>
> That's what I use it for (or my version of it anyway); for these
> purposes, you don't really need any of those fancy features in your
> list, or that of gremnebulin, because many of those will be in the new
> language.

Well, this is a question of compiler bootstrap. When you are
implementing a language worth its bits, in general you will do it in
that language itself. Of course, when it's the first implementation
ever of that language, you have to bootstrap it from another
programming language, and _historically_, most often the only
programming language available was C (or Fortran, but Fortran is not
worth its bits).

But nowadays, there are far fewer different architectures, and there
is much more free software much more easily accessible thanks to the
Internet, so it is often very easy to download an implementation of a
higher-level programming language than C. Today, there is no excuse for
writing a language implementation in C. You can bootstrap your new
language in Lisp, in Haskell, in Modula-2, whatever.

>> - macros (no, C has no macro. For real macros you need lisp),
>
> C has macros out of necessity, but otherwise they render code (ie
> proper code, not Lisp-like) unreadable. Is this really a desirable
> feature? It seems a low-level feature to me.

Yes, cpp macros are a barbaric low-level feature. It's actually
unfortunate that the same word is used both for those and for lisp
macros.

--
__Pascal Bourguignon__ http://www.informatimago.com/

"Indentation! -- I will show you how to indent when I indent your skull!"

Pascal J. Bourguignon

Aug 1, 2008, 5:15:01 PM

"Rod Pemberton" <do_no...@nohavenot.cmm> writes:

>> * No nested function definitions
>
> True. Unecessary, like goto, if you understand structured programming
> concepts. That's why C doesn't have nested functions. This is mentioned
> somewhere in Ritchie's articles.

The real reason is because either the C designers were too weak, or,
let's assume by respect for our elders, they had not enough space in
their computers to implement that feature. But almost 40 years have
passed since the first C compiler, the language could have evolved and
integrated closures and local functions for a long time.

>> * No formal closures
>
> Not sure what these are...

That's the problem. Most people don't know what they miss.

(let ((x 42))
  (lambda (y) (+ x y)))

returns a closure: the local variable x is enclosed inside the closure
along with the function built by lambda.

(defun g (f) (funcall f 12))

(g (let ((x 42))
     (lambda (y) (+ x y))))

returns 54.
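
For contrast, here is a sketch of what it takes to get the same effect in standard C. The language has no closures, so the environment has to be built and passed around by hand. The names struct closure, add_captured and make_adder are invented for this example; only g mirrors the Lisp definition above.

```c
#include <stdlib.h>

/* A hand-made closure: a code pointer plus the environment it needs. */
struct closure {
    int (*code)(const struct closure *self, int y);
    int x; /* the captured variable */
};

/* The body of the lambda: it reaches x through the explicit environment. */
static int add_captured(const struct closure *self, int y)
{
    return self->x + y;
}

/* Rough counterpart of (let ((x ...)) (lambda (y) (+ x y))).
   Returns NULL if malloc fails; the caller must free() the result. */
struct closure *make_adder(int x)
{
    struct closure *c = malloc(sizeof *c);
    if (c != NULL) {
        c->code = add_captured;
        c->x = x;
    }
    return c;
}

/* Counterpart of (defun g (f) (funcall f 12)). */
int g(const struct closure *f)
{
    return f->code(f, 12);
}
```

Calling g(make_adder(42)) yields 54 as in the Lisp version, but the caller now has to know about the struct, call through it explicitly, and free it afterwards.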

>> * No ... functions as parameters (only function and variable pointers)
>
> Not sure what you mean here. Function pointers can be passed as parameters.
> There is no reason to "pass" the actual function (which is binary code) as a
> "parameter" in C. (i.e., code and data are separate... and there is no need
> to pass code in C.) Is this an attempt to apply object-oriented features to
> C? e.g., encapsulation?
>
>> * No generators or coroutines; intra-thread control flow consists
>> of nested function calls, except for the use of the longjmp or
>> setcontext library functions
>
> False. (And, I've seen no real C examples where these are needed, only
> made up ones...)
> http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html
> http://groups.google.com/group/net.lang.c/msg/66008138e07aa94c?hl=en
> http://groups.google.com/group/comp.lang.c/msg/bb78298175c42411?hl=en
> http://www.sics.se/~adam/pt/
>
>> * No exception handling; standard library functions signify error
>> conditions with the global errno variable and/or special return values
>
> False. There are numeric exceptions and software exception handling via
> signal() etc.

This is not handled by the language, but by the OS. And if signals
are called signals instead of exceptions, it's because they're not the
same. We're not speaking of signals, but of exceptions.

(defun f (x)
  (if (< x 10)
      (/ 1 x)
      (error "Parameter is too big: ~A ; expected <10" x)))

(loop :named :again :do
  (handler-case
      (progn (format t "Please enter a number: ")
             (let ((x (read)))
               (if (numberp x)
                   (princ (f x))
                   (return-from :again))))
    (division-by-zero () (princ "Sorry, you entered 0."))
    (error (err) (princ err)))
  (terpri))

Please enter a number: 4
1/4
Please enter a number: 3
1/3
Please enter a number: 0
Sorry, you entered 0.
Please enter a number: 42
Parameter is too big: 42 ; expected <10
Please enter a number: quit
NIL
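
The nearest thing in standard C is the setjmp/longjmp pair mentioned in the quoted list. The following is only a sketch under stated assumptions: the error codes, the single global handler and the name try_f are invented for illustration. C has no rationals, so f computes 100 / x instead of 1/x, and since integer division by zero is undefined in C, the zero case has to be detected by hand before dividing.

```c
#include <setjmp.h>

enum { ERR_NONE = 0, ERR_DIV_BY_ZERO = 1, ERR_TOO_BIG = 2 };

/* Where to resume when an error is raised. A single global handler
   keeps the sketch short; real code would need a stack of them. */
static jmp_buf handler;

/* Counterpart of the Lisp f: computes 100 / x for x < 10. */
static int f(int x)
{
    if (x >= 10)
        longjmp(handler, ERR_TOO_BIG);
    if (x == 0)
        longjmp(handler, ERR_DIV_BY_ZERO);
    return 100 / x;
}

/* Calls f(x). On success stores the value in *result and returns
   ERR_NONE; otherwise returns the code that was passed to longjmp. */
int try_f(int x, int *result)
{
    switch (setjmp(handler)) {
    case 0: /* direct return from setjmp: run the protected code */
        *result = f(x);
        return ERR_NONE;
    case ERR_DIV_BY_ZERO: /* arrived here via longjmp */
        return ERR_DIV_BY_ZERO;
    default:
        return ERR_TOO_BIG;
    }
}
```

Nothing is unwound or cleaned up on the way out, and the "conditions" are bare integers, which is a long way from handler-case.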

>> * Only rudimentary support for modular programming
>
> (False.) Okay, let me go look up what you mean by "modular programming"...
> According to Wikipedia, this doesn't need explicit language support and
> classifies languages which use libraries and linkers as (like C) as "modular
> programming". So, False...

C doesn't help in any way. For example, look at the difficulty of
implementing an #import directive. It's not possible; it's been retracted
from Objective-C and C++.

But the main difficulty is when you try to link two libraries
exporting the same symbol. Have fun!

>> * No compile-time polymorphism in the form of function or operator
>> overloading
>
> This is an object-oriented language feature. C is not object-oriented.

No, polymorphism is orthogonal to object-orientation. It can be
required independently of objects.

>> * Only rudimentary support for generic programming
>
> Okay, let me go look up what you mean by "generic programming"... This
> seems to be similar in nature to C's casts, but more advanced and related to
> object-oriented classes. This is an object-oriented language feature. C is
> not object-oriented

Generic programming is even less related to object-oriented
programming. I'm very surprised. Why do you think we keep using
DIFFERENT WORDS if it wasn't for DIFFERENT CONCEPTS?

Generic programming is the possibility to have _types_ be parameters
of your program.

So instead of programming ten different sort functions, for ten
different types, integers, floating points, complex, strings, thingies
and marchamalies, you program one sort function, taking as parameter,
in addition to the sequence of whatever type, the whatever type
itself.
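
For comparison, the closest thing standard C offers is qsort from <stdlib.h>: one sort routine for any element type, with the type passed by hand as an element size plus a comparison callback working on void pointers. The sketch below uses invented wrapper names (sort_ints, sort_strings, compare_ints, compare_strings). Note that nothing checks that the callback matches the element type.

```c
#include <stdlib.h>
#include <string.h>

/* qsort only sees untyped bytes, so each callback must cast the
   void pointers back to the real element type itself. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y); /* avoids the overflow of x - y */
}

static int compare_strings(const void *a, const void *b)
{
    /* The elements are char pointers, so a and b point to pointers. */
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort n ints in ascending order. */
void sort_ints(int *v, size_t n)
{
    qsort(v, n, sizeof v[0], compare_ints);
}

/* Sort n C strings in strcmp order. */
void sort_strings(const char **v, size_t n)
{
    qsort(v, n, sizeof v[0], compare_strings);
}
```

Passing compare_ints for an array of strings would compile without complaint, which is the difference between this and a language where the type itself is a checked parameter.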

(In the case of dynamically typed programming languages like lisp, any
function is a generic function, as long as it uses generic operators).

>> * Very limited support for object-oriented programming with regard
>> to polymorphism and inheritance
>
> These are an object-oriented language feature. C is not object-oriented.

This is why we consider it a low-level programming language.

>> * Limited support for encapsulation
>
> This is an object-oriented language feature. C is not object-oriented.

This is why we consider it a low-level programming language.

>> * No native support for multithreading
>
> Unecessary in C. In addition to being hardware specific on certain
> platforms, this has more to do with C compiler implementation, than with the
> language itself.

But some languages, like Ada, and IIRC Modula-3 do include
multithreading primitives. Note that multithreading can be
implemented even without OS or hardware support (it's called "green
threads" then).

>> * No native support for networking
>
> True. OS dependent.
>
>> * No standard libraries for computer graphics and several other
>> application programming needs
>
> True. OS dependent.
>
> --
> FWIW, most of the issues seem to be about lack of object-orientedness, or
> are trivial and unecessary, or are OS implementation specific.

We are not discussing whether they're trivial or necessary, but
whether their presence or absence classifies the language as high
or low level.

--
__Pascal Bourguignon__ http://www.informatimago.com/

Nobody can fix the economy. Nobody can be trusted with their finger
on the button. Nobody's perfect. VOTE FOR NOBODY.

CBFalconer

Aug 1, 2008, 7:22:48 PM

"Pascal J. Bourguignon" wrote:
> "Rod Pemberton" <do_no...@nohavenot.cmm> writes:
>
>>> * No nested function definitions
>>
>> True. Unecessary, like goto, if you understand structured
>> programming concepts. That's why C doesn't have nested functions.
>> This is mentioned somewhere in Ritchie's articles.
>
> The real reason is because either the C designers were too weak,
> or, let's assume by respect for our elders, they had not enough
> space in their computers to implement that feature. But almost
> 40 years have passed since the first C compiler, the language
> could have evolved and integrated closures and local functions
> for a long time.

In which case all 'old' code would have been abandoned. Tut-tut.
Naughty.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

** Posted from http://www.teranews.com **

Pascal J. Bourguignon

Aug 2, 2008, 4:04:47 AM

CBFalconer <cbfal...@yahoo.com> writes:

> "Pascal J. Bourguignon" wrote:
>> "Rod Pemberton" <do_no...@nohavenot.cmm> writes:
>>
>>>> * No nested function definitions
>>>
>>> True. Unecessary, like goto, if you understand structured
>>> programming concepts. That's why C doesn't have nested functions.
>>> This is mentioned somewhere in Ritchie's articles.
>>
>> The real reason is because either the C designers were too weak,
>> or, let's assume by respect for our elders, they had not enough
>> space in their computers to implement that feature. But almost
>> 40 years have passed since the first C compiler, the language
>> could have evolved and integrated closures and local functions
>> for a long time.
>
> In which case all 'old' code would have been abandoned. Tut-tut.
> Naughty.

Who says that?

A lot of languages are able to evolve smoothly, even C. When you
program in Objective-C or C++, you can keep your old C code.

--
__Pascal Bourguignon__ http://www.informatimago.com/

"Logiciels libres : nourris au code source sans farine animale."

Richard Heathfield

Aug 2, 2008, 7:25:53 AM

Pascal J. Bourguignon said:

> CBFalconer <cbfal...@yahoo.com> writes:
>
>> "Pascal J. Bourguignon" wrote:

<snip>

>>> But almost
>>> 40 years have passed since the first C compiler, the language
>>> could have evolved and integrated closures and local functions
>>> for a long time.
>>
>> In which case all 'old' code would have been abandoned. Tut-tut.
>> Naughty.
>
> Who says that?
>
> A lot of languages are able to evolve smoothly, even C. When you
> program in Objective-C or C++, you can keep your old C code.

The very first C program I tried to compile as if it were C++ code was
rejected by the C++ compiler because it was illegal C++. And no, this
wasn't a carefully constructed example - it was code that I'd written
before I'd even heard of C++ (the week before, I think it was), and the
boss said "well, let's take a look at this C++ thing, it says it's
backwardly compatible with C, so why not re-compile that foo project
you're working on as a sort of test?"... so I did.

Yes, I could keep the C code as you claim - provided that I kept a C
compiler around too; otherwise what is the point of keeping the code?

C++ was *not* a smooth evolution of C that accepted all C programs as being
legal. Language evolution can and does break existing code.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Rod Pemberton

Aug 2, 2008, 8:10:28 AM

"Pascal J. Bourguignon" <p...@informatimago.com> wrote in message

news:87tze48...@hubble.informatimago.com...

> "Rod Pemberton" <do_no...@nohavenot.cmm> writes:
> > "Pascal J. Bourguignon" <p...@informatimago.com> wrote in message
> > news:87y73h8...@hubble.informatimago.com...
> >> > Huh...? I keep getting this response about C from people across
> >> > various NGs... (Why me?) And, they either:
> >> > 1) are unable to express what they mean.
> >> > or 2) never state what they mean.
> >>
> >> C is not a high level language.
> >
> > What? That claim seems highly revisionist to me. C, FORTRAN, Pascal, etc.
> > are all HLLs. Please (re)define "high level language"...
>
> "high-level language" refers to the higher level of abstraction from machine language.
> http://en.wikipedia.org/wiki/High-level_programming_language
>

Unfortunately, that is mostly revisionist also. Basically any language not
explicitly called "assembler" and which can't directly code cpu "assembly
instructions" is a HLL...

> The least we can say is that C doesn't try to abstract anything from
> the machine language. It only tries to generalize machine language to
> let unix be ported from one processor to the other.

First, you mean "assembly language" not "machine language". Assembly
language uses character based text. Machine language is coded via
electrical signals, such as by switches, latches, etc.

Second, assembly has certain features which are very different from C:
1) assembly uses cpu assembly instructions, but C has no assembly
instructions as part of the language (no VM)
2) assembly directly accesses registers, stack, memory, etc., but C uses
variables as an abstraction
3) assembly usually supports generic integers, but C uses a type system to
implement integer types and pointers
4) assembly uses multiple assembly instructions for flow control, but C uses
abstract logic representations

many etc...

> >> C is a portable assembler,
> >
> > C has low-level features, but it's not an assembler. Due to the era of it's
> > primary development, it captures much of load-store and early RISC
> > functionality, IMO.
> >
> > FYI, the quote to which you are referring, was by Larry Rosler (of X3J11):
> > "I taught the first course on C at Bell Labs, using a draft of K&R, which
> > helped vet the exercises. The students were hardware engineers who were
> > being induced to learn programming. They found C (which is 'portable
> > assembly language') much to their liking. Essentials such as pointers are
> > very clear if you have a machine model in mind."
> >
> >> designed to implement the unix kernel.
> >
> > Untrue myth. That was never a goal for C. Ritchie's papers are online
> > which confirm this.
>
> The first sentence of the abstract of
> http://cm.bell-labs.com/cm/cs/who/dmr/chist.html
> by Dennis M. Ritchie is:
>
> "The C programming language was devised in the early 1970s as a
> system implementation language for the nascent Unix operating
> system."
>

Unfortunately, the first sentence you quoted doesn't say anything about C
being designed to implement the Unix kernel. C was _used_ to reimplement
the Unix kernel, but it wasn't _designed_ to implement the Unix kernel. It
was developed as a general purpose language independent of the Unix kernel
which preexisted it.

And shortly below that quote is,
"C came into being in the years 1969-1973, in parallel with the early
development of the Unix operating system; the most creative period occurred
during 1972."
...
"While wanting to use a higher-level language, he [Thompson] wrote the
original Unix system in PDP-7 assembler."
...
"By early 1973, the essentials of modern C were complete. The language and
compiler were strong enough to permit us to rewrite the Unix kernel for the
PDP-11 in C during the summer of that year."

(also see "Portability of C Programs and the UNIX System" SC Johnson and DM
Ritchie, section IV which lists other Unix independent design goals of C.)

> >> Are you implementing a system kernel? If no, then you don't need C.
> >
> > C is a general purpose programming language. Are you saying you don't need
> > a general purpose programming language?
>
> If your purpose is to introduce bugs in all your applications, then
> yes, C is a general purpose programming language.

IMO, you have a really distorted and incorrectly biased view of C. I don't
think I can help you solve this.

> >> > So, I ask, but I never hear what "higher level programming features" C
> >> > doesn't have... ever.
> >>
> >> - first class functions,
> >
> > C has, AFAIK. It depends on how you define 'class' here... I.e.,
> > procedures (and derivatives of, such as functions) are part of the language,
> > i.e., first class, but if you're referring to C++ classes, then no...
>
> http://en.wikipedia.org/wiki/First-class_function
>

C's functions don't fit the Wikipedia definition on the "First-class
function" page, but C's functions do fit the Wikipedia definition on the
"First-class object" page, which is used by Wikipedia's "First-class
function" page to define a "First-class function": (I'll let you sort that
contradiction out...)
http://en.wikipedia.org/wiki/First-class_object

> Integers are first class objects in C.
>

[snip]

> But you cannot do all of that with functions:

[snip]
> In lisp [examples]
...

> and any other high level programming language

No, those are (mostly) not available in other HLLs like BASIC, FORTRAN,
Pascal, PL/1, etc...

If we ignore the fact that some of what you said can't be done in C can be -
just differently, why do you need to pass a function to a function in C?
You can call the function explicitly. Or, you can pass a pointer to the
function and use the function pointer to call the function. You can't
create a new function at runtime (usual feature of interpreters), but you
can create a pointer to a nonexistent function and then set the pointer to
point to a function outside of your application at runtime, e.g., to an OS
routine, and call it. So, I don't see where passing a C function to another
function is of any value without some other language feature such as
encapsulation. I.e., it doesn't fit the paradigm.

> >> - automatic memory management (garbage collector),
> >
> > C's not interpreted.
>
> C is interpreted.
> CINT - http://root.cern.ch/root/Cint.html
> EiC - http://eic.sourceforge.net/
> Ch - http://www.softintegration.com

C is only partially interpretable. The ones you cite are non-compliant with
the C specs. Either they aren't complete or they radically modify the
language to be interpretable.

> > Garbage collectors are normally implemented for
> > interpreted languages.
>
> Money is normally lost in casinos.

Money is normally won in casinos... (Think about it. There are at least
two reasons that this is true...)

> That's not a reason to go to a
> casino and lose all your money.

If you own the casino, it doesn't really matter does it?

> But the point is that there is absolutely no correlation between
> language and implementation (ie. compiler, interpreter, byte-compiler
> with a virtual machine interpreter, byte-compiler with a virtual
> machine interpreter doing just in time compilation, processor,
> hardwired processor, microprogrammed processors, etc).

There doesn't have to be a correlation for some languages, but most have a
correlation because they are easier to implement one way versus the other.

> And there is
> no correlation between a style of implementation and the presence or
> absence of a garbage collector.

Pointers to object, dynamic memory allocation, etc. interfere with garbage
collector implementations...

> But there is a correlation between a language that wants you to think
> in terms of bytes and pointers and wants you to implement a memory
> management system (things you would of course want to do if you were
> implementing a unix kernel), and a high level language that wants you
> to think about your problem in terms of the problem domain,

C supports both...

> and that
> will manage the low-level details such as processor and memory for
> you.

C does most, but not all, of that.

> > Why do you want "automatic memory management" with C?
>
> My point. If you are implementing a unix kernel, indeed you don't
> really need automatic memory management. But if you are implementing
> anything else like, say, MIS applications or web services, then you
> don't care about memory, you care about bank accounts, or salaries, or
> stock items and sales, etc.

What you're describing is well within the bounds of the auto variables that C
supports... There is nothing sufficiently dynamic in nature in your
examples to require "automatic memory management". I.e., there will only be
small changes to the items in those databases, and new records can consume
extra pre-allocated space. Also, these are completely implementable as
large files (not needing memory management in C at all)...

> > However, a garbage collector has been implemented for at least one C
> > (Jacob Navia's variant of LCC: LCC-Win32). I'm not sure how he managed
> > that, since pointer use would need to be restricted or eliminated...
>
> There's also BoehmGC.

(What do you think he uses?... I don't recall for sure, but I think it was
Boehm... You can search his NG if you need to know.)

> >> - macros (no, C has no macro.
> >
> > True. The C preprocessor implements macro's...
>
> cpp doesn't implement macros, it implements some basic textual
> substitution that has nothing to do with macros. Macro-assemblers
> have more powerful macro systems than cpp. (in that respect, we could
> say that C is even less than a good macro assembler).

IMO, most don't. NASM does, but its macros were derived from experience
with some C compiler...

> >> - exceptions,
> >
> > What type of exceptions? C has exceptions for numerical results and also
> > has software "exceptions" called signals.
>
> This is not provided by C, but by the POSIX and OS layer. Signals are
> sent to any process, whatever the programming language used to build
> it.
>
> Exceptions are exceptional conditions, or error cases. They are
> handled out of band.

As I stated, C has numerical exceptions.

> I'm really surprised you don't know them, since
> they're present in every high level programming language.

Including C...

> Even BASIC
> which hardly can be considered a high level language, had an ONERROR
> statement to handle predefined errors.

Some BASICs, yes, but this is not any different from C's "errno" which is
far more useful, which you deride...
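
For readers who have not used it, errno-style error reporting looks roughly like the sketch below. It is not from the original posts; parse_long is a hypothetical helper built on the standard strtol. The point to notice is that nothing forces the caller to look at the result, which is the main difference from an out-of-band exception.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal long. Returns 0 and stores the
   value in *out on success, or -1 on any failure. */
int parse_long(const char *text, long *out)
{
    assert(text != NULL && out != NULL);

    char *end;
    errno = 0;                      /* strtol only sets errno on failure */
    long value = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return -1;                  /* no digits, or trailing junk */
    if (errno == ERANGE)
        return -1;                  /* value does not fit in a long */

    *out = value;
    return 0;
}
```

A caller that forgets to test the return value simply carries on with whatever was in *out before the call.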

> >> - packages / modules,
> >
> > What do you mean by "packages" and "modules"? C has libraries... Are
> > these similar in concept? Linux "modules" are just compiled C code, like
> > libraries...
>
> In pascal they're called UNIT.
> In Modula-2 they're called MODULE.
> In ADA and lisp they're called PACKAGE.
> In C++ they're called namespace.

C has different namespaces for certain things. I'll take it that this is
different from C++ terminology.

> This is a notion that is important when you write big programs. In C,
> there's no such notion. Libraries must be careful not to use a name
> already used in another library, for example, using systematically
> hopefully unique prefixes.

Perhaps not different at all... Which names are usable for different
things in each namespace is well defined. You're talking about the ability
to allow use of multiple sets of names without conflict, something FORTH
calls dictionaries. C has some limitations here, but they are usually trivial.

> >> - bignums,
> >
> > Available as a library for C...
>
> Like for any assembler yes. What we call high level programming
> here is the ability to do:
>
> (print (factorial 40))
>
> and get:
>
> 815915283247897734345611269596115894272000000000

So, you'd call FORTH a high level language - since it can do this rather
easily, but not C?
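
To make the comparison concrete, here is a minimal sketch of what "available as a library" amounts to when written by hand in C. It is not from the original posts; factorial_decimal is a hypothetical name, and the schoolbook digit-array multiplication stands in for whatever a real bignum library would do.

```c
#include <assert.h>
#include <stddef.h>

/* Write n! as a decimal string into buf (capacity cap, including the
   terminating NUL). Returns the digit count, or 0 if buf is too small. */
size_t factorial_decimal(unsigned n, char *buf, size_t cap)
{
    assert(buf != NULL);
    if (cap < 2)
        return 0;

    /* Digits are kept least-significant first while multiplying. */
    size_t len = 1;
    buf[0] = 1;

    for (unsigned k = 2; k <= n; k++) {
        unsigned long carry = 0;
        for (size_t i = 0; i < len; i++) {
            unsigned long cur = (unsigned long)buf[i] * k + carry;
            buf[i] = (char)(cur % 10);
            carry = cur / 10;
        }
        while (carry > 0) {
            if (len + 1 >= cap)
                return 0;           /* no room for another digit plus NUL */
            buf[len++] = (char)(carry % 10);
            carry /= 10;
        }
    }

    /* Reverse into most-significant-first order, then convert to text. */
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = buf[i];
        buf[i] = buf[j];
        buf[j] = tmp;
    }
    for (size_t i = 0; i < len; i++)
        buf[i] = (char)(buf[i] + '0');
    buf[len] = '\0';
    return len;
}
```

Called with n = 40 and a 64-byte buffer, this produces the same 48 digits as the Lisp one-liner quoted above, at the cost of the caller managing the buffer.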

> >> - bounds checked arrays,
> >
> > Not available as part of the language.
>
> Yes. It's a characteristic feature of any assembler.

Not true.

> >> multidimensional arrays,
> >
> > Part of the language.
>
> Idem.

I'm not aware of any assembler or assembly language that supports
multidimensional arrays natively...

> > C has moderate type safety. *Many* times this limited type checking has
> > to be overridden with casts to implement basic numeric conversions or for
> > accessing assembly. If a strongly typed language has solutions for this
> > so casts or other conversion methods are unneeded, then great!
> >
> >> (ie. not allowing ("iv"+1)),
> >
> > "one-past-the-end" is required for any language that supports pointers
> > which
>
> I don't mean that. That's the problem with lowlevel languages, the
> meaning of the program is not obvious to a human reader.
>
> "iv" is not a string.

Actually, since you quoted it, "iv" is a string... (It can't be another
type because of the quotes.)

> It's a pointer

Yes, "iv" is _also_ a pointer - implemented behind the scenes - since C
doesn't support string types natively.

> It's a pointer to a byte

No, "iv"+1 is a pointer to the second char _or_ a pointer to the string
starting at the second char since there is a nul... but, only if it wasn't
implicitly cast or converted to another type in code not shown... ;-)
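
A minimal sketch of the pointer arithmetic under discussion, not from the original posts; tail is a hypothetical helper. It shows that adding 1 to a string's address yields the string starting at its second char, with no copy and no diagnostic.

```c
#include <assert.h>
#include <stddef.h>

/* Return a pointer to the rest of s after skipping `skip` chars.
   The caller must ensure skip does not exceed the string's length. */
const char *tail(const char *s, size_t skip)
{
    assert(s != NULL);
    return s + skip;    /* plain pointer arithmetic; nothing is copied */
}
```

With s pointing at "iv", tail(s, 1) is the one-char string "v" and tail(s, 2) is the empty string at the terminating nul.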

> byte (misnamed char in C),

The C specs define the relationship. Historically, a byte was standardized
on 8-bits at the beginning of the microprocessor era. The C standard
(re)defines a "byte" as the minimal addressable unit of bits large enough to
represent all the characters in the character set.
A char is a C byte, and must be at least 8-bits. (Approximately... IIRC,
their definitions could create slight conflicts.)
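
The relationship can be checked from within the language. The sketch below is not from the original posts; bits_in is a hypothetical helper, and static_assert assumes a C11 or later compiler.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* CHAR_BIT is the number of bits in a C byte; the standard requires
   at least 8 but permits more. */
static_assert(CHAR_BIT >= 8, "a C byte has at least 8 bits");

/* sizeof counts in units of char, so sizeof(char) is 1 by definition
   and an object's width in bits is its size times CHAR_BIT. */
size_t bits_in(size_t size_in_bytes)
{
    return size_in_bytes * CHAR_BIT;
}
```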

> namely the first byte of a block of memory containing the three bytes
> 105, 118, 0. "iv"+1 is a pointer to the second byte, 118.

(above)

> In a sane programming language (not even asking for a high level
> programming language here), we should get an error.

What you're saying is that because C doesn't support strings or arrays
natively, but implements them as pointers and contiguous sequences of bytes,
that the type system and checking is insufficient for strings and arrays...
I.e., C is "confusing" strings and pointers and so should generate an
error... I think we've covered this. That's the way it's done. I think
it's type system is adequate.

> But that's another thing that makes lisp apart (lisp is even higher
> level than the other high level programming languages).

You mustn't have read my post about a friend who loved lisp, for the first
six months. Then, he kept complaining about being "lost in stupid
parenthesis"...

> Data structure sharing is natural,

Why? I've hardly ever seen this done. And, there usually isn't much
copying of data structures. Normally, each structure has different data.

> when you have a garbage collector,
> and use a functional programming style. When you have these tools,
> it occurs a lot.

...I'll take your word for it...

> (Which saves a lot of processor cycles not wasted on
> memory management, since we just refer to existing data structures, we
> don't need to allocate a copy again).

I'm not sure how you came to that conclusion. Many cycles are spent in the
most common types of GC's scanning memory for unused blocks, far more than
what is required to copy data. Even you demonstrated in one of your earlier
posts you were aware of GC scanning...

Rod Pemberton

Pascal J. Bourguignon

Aug 2, 2008, 9:47:44 AM8/2/08

"Rod Pemberton" <do_no...@nohavenot.cmm> writes:

>> >> C is a portable assembler,

>> >> designed to implement the unix kernel.
>> >
>> > Untrue myth. That was never a goal for C. Ritchie's papers are online
>> > which confirm this.
>>
>> The first sentence of the abstract of
>> http://cm.bell-labs.com/cm/cs/who/dmr/chist.html
>> by Dennis M. Ritchie is:
>>
>> "The C programming language was devised in the early 1970s as a
>> system implementation language for the nascent Unix operating
>> system."
>>
>
> Unfortunately, the first sentence you quoted doesn't say anything about C
> being designed to implement the Unix kernel. C was _used_ to reimplement
> the Unix kernel, but it wasn't _designed_ to implement the Unix kernel.

Ok, it was designed to REimplement the Unix kernel.

My point exactly. If unix didn't exist, or had been implemented in
PL/1 instead of BCPL, C would never have been invented.

> It
> was developed as a general purpose language independent of the Unix kernel
> which preexisted it.

It would have been really independent of the Unix kernel if it had been
invented before the unix kernel, and if other projects had justified
its existence before the Unix kernel. But this is not the case.

> And shortly below that quote is,
> "C came into being in the years 1969-1973, in parallel with the early
> development of the Unix operating system; the most creative period occurred
> during 1972."
> ...
> "While wanting to use a higher-level language, he [Thompson] wrote the
> original Unix system in PDP-7 assembler."

Yes, that's because C didn't exist, that Thompson was forced to use
assembler, and that's why he asked Ritchie to design C, to have a
portable programming language specifically designed to implement the
unix kernel. No unix kernel ==> no C. Unix kernel ==> C.

> ...
> "By early 1973, the essentials of modern C were complete. The language and
> compiler were strong enough to permit us to rewrite the Unix kernel for the
> PDP-11 in C during the summer of that year."
>
> (also see "Portability of C Programs and the UNIX System" SC Johnson and DM
> Ritchie, section IV which lists other Unix independent design goals of C.)

You're burying yourself deeper and deeper.
http://cm.bell-labs.com/who/dmr/portpap.html
Section III says:

C was developed for the PDP-11 on the UNIX system in
1972. Portability was not an explicit goal in its design, even
though limitations in the underlying machine model assumed by the
predecessors of C made us well aware that not all machines were
the same [2]. Less than a year later, C was also running on the
Honeywell 6000 system at Murray Hill. Shortly thereafter, it was
made available on the IBM 370 series machines as well. The
compiler for the Honeywell was a new product [8], but the IBM
compiler was adapted from the PDP-11 version, as were compilers
for several other machines.

As soon as C compilers were available on other machines, a number
of programs, some of them quite substantial, were moved from UNIX
to the new environments. [...]

> No, those are (mostly) not available in other HLLs like BASIC, FORTRAN,
> Pascal, PL/1, etc...

Indeed, that's why they are not high level programming
language. They're only slightly higher level than assembler, but not
high enough. And it's a shame for them because they all but FORTRAN
came after LISP.

> If we ignore the fact that some of what you said can't be done in C can be -
> just differently, why do you need to pass a function to a function in C?
> You can call the function explicitly. Or, you can pass a pointer to the
> function and use the function pointer to call the function. You can't
> create a new function at runtime (usual feature of interpreters), but you
> can create a pointer to a nonexistent function and then set the pointer to
> point to a function outside of your application at runtime, e.g., to an OS
> routine, and call it. So, I don't see where passing a C function to another
> function is of any value without some other language feature such as
> encapsulation. I.e., it doesn't fit the paradigm.

Yes, and you can also do that on a Turing Machine.

>> >> multidimensional arrays,
>> >
>> > Part of the language.
>>
>> Idem.
>
> I'm not aware of any assembler or assembly language that supports
> multidimensional arrays natively...

My point.

Assemblers don't support multidimensional arrays natively.
C doesn't support multidimensional arrays natively.
----------------------------------------------------------
C is a kind of assembler.

> What you're saying is that because C doesn't support strings or arrays
> natively, but implements them as pointers and contiguous sequences of bytes,
> that the type system and checking is insufficient for strings and arrays...
> I.e., C is "confusing" strings and pointers and so should generate an
> error... I think we've covered this. That's the way it's done. I think
> it's type system is adequate.

What I'm saying is that:

Assemblers don't support strings (beside the ds.s directive
allowing them to fill a block of memory with the bytes of a
string in the source).
C doesn't support strings (beside the ds.s directive allowing them
to fill a block of memory with the bytes of a string in the
source).
----------------------------------------------------------
C is a kind of assembler.

--
__Pascal Bourguignon__ http://www.informatimago.com/

"You cannot really appreciate Dilbert unless you read it in the
original Klingon"

Ben Bacarisse

Aug 2, 2008, 12:09:50 PM8/2/08

"Rod Pemberton" <do_no...@nohavenot.cmm> writes:
<snip>

> So, I ask, but I never hear what "higher level programming features" C
> doesn't have... ever.

You are having a right old ding-dong about this elsewhere but that has
turned into one of those point-by-point refutation and
counter-refutation arguments that often lose focus.

Let me, as a fan of C, explain what higher level programming features
it lacks. It is worth pointing out, before the details, that C lacks
these for a reason. Almost every single one of them was around when C
was designed, and they were all around during C's two big
standardisation rounds, but they all incur cost that is not in the
spirit of the language -- it would not fit its niche if it had them.
The fact that C is often outside the area for which it is ideal is a
cultural phenomenon to do with familiarity breeding contentment and
the need to do things quickly without any learning overhead.

Rather than extol the virtues of fancy language feature X in fancy
language Y, I'll try to illustrate how you hit some of these missing
features in C, sometimes every day.

(1) C has rather limited types. A simple example: I doubt there are
many C programmers who have not wished for a string type at some time
or another -- if only for throw-away strings:

fp = open_config_file(get_home_dir() + "/" + program_name);

A richer type system would also allow one to express, directly, one's intent
more often. For example, a function often has to return a value only
sometimes and modern type systems can express this. In many cases, C
has a reasonable get-out: you return NULL or some other "out-of-range"
value like getc's EOF return, but a higher-level language would let
you say that getc returns a char only sometimes.
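
The nearest C can come to saying "a char only sometimes" is a hand-rolled pair like the sketch below. It is not from the original post; struct maybe_char and read_char are hypothetical names wrapped around the standard getc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical "maybe a char": value is meaningful only if has_value. */
struct maybe_char {
    bool has_value;
    char value;
};

/* Wrap getc so that end-of-file or an error is reported separately
   from the character instead of as an out-of-range int. */
struct maybe_char read_char(FILE *stream)
{
    assert(stream != NULL);
    int c = getc(stream);
    struct maybe_char result = { c != EOF, c != EOF ? (char)c : '\0' };
    return result;
}
```

The type documents the intent, but the compiler still lets a caller read value without testing has_value, which is the gap a richer type system would close.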

(2) It is hard to write generic, re-usable code in C because you can't
abstract over the type of data being operated on. The closest you can
come is to build structures that point to their data using void *, or
which simply reserve untyped space. Take, for example, a list. By
using void * as the data part you can make a list that is largely
unconstrained in what it can hold (modulo careful memory allocation),
but how do you express actions on it? Most list-processing code has a
lot of

for (struct node *p = head; p; p = p->next) {
    /* do or calculate something using *p */
}

in it because you can't effectively abstract out this pattern.
Counting the number of elements, summing them, finding the maximum are
all the same code but with a slightly different operation at the heart
of the loop but you can't express that in a C function unless you are
prepared to have the result be a void * also. This links to...
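
A minimal sketch of what that looks like when you do accept void * everywhere. It is not from the original post; list_fold, count_step and sum_step are hypothetical names, and the payloads are assumed to be int.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    void *data;             /* untyped payload */
    struct node *next;
};

/* Step function: combine the accumulator with one element's data. */
typedef void *(*fold_fn)(void *acc, void *data);

/* Walk the list once, threading an untyped accumulator through step. */
void *list_fold(const struct node *head, fold_fn step, void *acc)
{
    assert(step != NULL);
    for (const struct node *p = head; p; p = p->next)
        acc = step(acc, p->data);
    return acc;
}

/* Two of the "slightly different operations at the heart of the loop".
   Both assume acc and data point to int; the compiler cannot check. */
void *count_step(void *acc, void *data)
{
    (void)data;
    ++*(int *)acc;
    return acc;
}

void *sum_step(void *acc, void *data)
{
    *(int *)acc += *(int *)data;
    return acc;
}
```

The loop is written once, but every cast is a promise the programmer makes by hand; passing sum_step a list of strings would compile without complaint.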

(3) C's function pointers are a shadow of what is possible in a
higher-level language. As an example, consider a simple search form. The
user gets to say what criteria to search for: by name, by date, by
both, by either -- whatever. In effect there is an expression made up
of NOT, AND, and OR from simpler basic matches:
case-insensitive-string-match, case-sensitive-pattern-match,
numerical-range, and so on.

In C, you have to invent a data structure to encode all the basic
options (some of these can be simple function pointers) and build a
tree that combines these with the operators. When searching, the
match function is, in effect, an interpreter for a small expression
language. Not hard, but with higher-order functions you can express
this directly in your code with no need for any new (explicit) data
structures. You can write a function that returns a new one that
matches only if both of its argument functions match, or one that
matches if its argument doesn't, and so on.
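
The C side of this can be sketched as follows. None of it is from the original post: struct record, struct matcher and year_in_range are hypothetical, and only one basic match is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct record { const char *name; int year; };   /* hypothetical record */

enum match_kind { MATCH_LEAF, MATCH_NOT, MATCH_AND, MATCH_OR };

struct matcher {
    enum match_kind kind;
    /* MATCH_LEAF: a basic test plus its argument */
    bool (*test)(const struct record *rec, const void *arg);
    const void *arg;
    /* MATCH_NOT uses left only; MATCH_AND and MATCH_OR use both */
    const struct matcher *left, *right;
};

/* The "interpreter for a small expression language". */
bool matches(const struct matcher *m, const struct record *rec)
{
    assert(m != NULL && rec != NULL);
    switch (m->kind) {
    case MATCH_LEAF: return m->test(rec, m->arg);
    case MATCH_NOT:  return !matches(m->left, rec);
    case MATCH_AND:  return matches(m->left, rec) && matches(m->right, rec);
    case MATCH_OR:   return matches(m->left, rec) || matches(m->right, rec);
    }
    return false;   /* not reached for valid kinds */
}

/* One basic match: year within an inclusive range given as int[2]. */
bool year_in_range(const struct record *rec, const void *arg)
{
    const int *range = arg;
    return rec->year >= range[0] && rec->year <= range[1];
}
```

Every combinator needs its own enum value and its own case in matches; with first-class functions the AND, OR and NOT nodes would themselves be ordinary functions returning functions.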

Put higher-order functions together with generic data structures and
you get a very expressive language.

Of course, none of these is a show-stopper. Every working C
programmer I know has a library of string parsing and gluing
functions, and a tool kit of quasi-generic linked list and tree code
so these things are not hard to write (once you have all those bits).
Being high-level means that you can express algorithms more directly and
with less extraneous boilerplate than at a low-level -- not that you
can write programs that were not writable before.

--
Ben.

Mike Sieweke

Aug 2, 2008, 1:28:57 PM8/2/08

"Rod Pemberton" <do_no...@nohavenot.cmm> wrote:

> "gremnebulin" <peter...@yahoo.com> wrote:
> > * No separate Boolean type: zero/nonzero is used instead[10]
>
> True and False. False for C99, see _Bool. True for C89. Feature is
> trivial and unneeded. Zero and non-zero is common for most HLLs anyway
> (except for one language which used a value other than zero for false,
> perhaps was ADA?).

C99 doesn't have a true boolean type. "_Bool" is just a convenient
alias for "int". In a language with a boolean type, expressions
like "4<5" or "x == y" would return a boolean value, and not an
integer. The "if" statement would take a boolean argument instead
of an integer. Each of these statements would be illegal:

int i = true; // value incompatible with type
bool b = 4; // value incompatible with type
if (i) // "if" expects boolean expression
printf( "true" );
if (i<4<5) // boolean expr "i<4" incompatible with "<" operator
printf( "???" );

All of Wirth's languages, and most languages derived from
them have special meaning for "boolean", "true", and "false". You
can't use 0 for false or non-zero for true. Some examples are:
Pascal and Object Pascal
Modula-2 and 3
Oberon
Component Pascal (closer to Oberon than Pascal)
Ada
Delphi

> > * No nested function definitions
>
> True. Unnecessary, like goto, if you understand structured programming
> concepts. That's why C doesn't have nested functions. This is mentioned
> somewhere in Ritchie's articles.

I think it's more likely that nested functions were left out because they
complicate the compiler. They also complicate the run-time environment,
and they _can_ have a performance impact. Nice to have, but not
required.

> > * No exception handling; standard library functions signify error
> > conditions with the global errno variable and/or special return values
>
> False. There are numeric exceptions and software exception handling via
> signal() etc.

The "signal" function comes from the OS, not the C language.

You misinterpret the term "exception". In a language with exceptions,
the programmer can define his own exceptions. Then he can throw and
catch these exceptions using a syntax something like this (from
Wikipedia):

try {
    line = console.readLine();
    if (line.length() == 0) {
        throw new EmptyLineException("The line read from console was empty!");
    }
    console.printLine("Hello %s!" % line);
    console.printLine("The program ran successfully");
} catch (EmptyLineException e) {
    console.printLine("Hello!");
} catch (Exception e) {
    console.printLine("Error: " + e.message());
} finally {
    console.printLine("The program terminates now");
}

You confuse "separate compilation" with "modules". In a modular
language, the compiler would catch the error below. In C the error
won't be caught by the compiler, linker, or run-time. You'll just
see some strange output. That's why he said "rudimentary".

C++ will catch this in the linker, but the error message will
be confusing.

File a.c:
------------
int fa( int i )
{
return i + 4;
}

File b.c:
------------
extern float fa( float f ); // Doesn't match the function in a.c.
void fb( void )
{
printf( "%f\n", fa( 2.0 )); // Called with wrong parameter type.
}

* Enumerated types are missing in C. The "enum" keyword is just a way
to define integer constants.

In C there is no difference between
enum { x, y, z };
and
#define x 0
#define y 1
#define z 2

In a language with true enumerated types (for example C++), this would
be illegal:

typedef enum {
x, y, z
} anEnum;
typedef enum {
a, b, c
} anotherEnum;
void f( anEnum e);
void t( void )
{
f( 4 ); // "invalid conversion from 'int' to 'anEnum'"
f( a ); // "cannot convert 'anotherEnum' to 'anEnum'"
}
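
For contrast, a minimal sketch of the same situation in C, where both calls are accepted (some compilers may warn about the second). It is not from the original post; the enum names and is_valid_color are hypothetical, and static_assert assumes C11 or later.

```c
#include <assert.h>

enum color { RED, GREEN, BLUE };
enum shape { CIRCLE, SQUARE, TRIANGLE };

/* Enumerators from unrelated enums are plain int constants. */
static_assert(GREEN == 1 && SQUARE == 1, "both are just the integer 1");

/* Returns 1 if c is one of the declared colors, 0 otherwise.
   The check is needed precisely because the language will not enforce it. */
int is_valid_color(enum color c)
{
    return c == RED || c == GREEN || c == BLUE;
}
```

Here is_valid_color(4) compiles and returns 0, and is_valid_color(SQUARE) compiles and returns 1 because SQUARE and GREEN have the same value.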

--
Mike Sieweke
Duluth, GA

Rod Pemberton

Aug 3, 2008, 4:20:25 AM8/3/08

"Pascal J. Bourguignon" <p...@informatimago.com> wrote in message

news:87vdyjt...@hubble.informatimago.com...

> "Rod Pemberton" <do_no...@nohavenot.cmm> writes:
> > Unfortunately, the first sentence you quoted doesn't say anything about
> > C being designed to implement the Unix kernel. C was _used_ to
> > reimplement the Unix kernel, but it wasn't _designed_ to implement
> > the Unix kernel.
>
> Ok, it was designed to REimplement the Unix kernel.
>
> My point exactly. If unix didn't exist, or had been implemented in
> PL/1 instead of BCPL, C would never have been invented.

Don't you think C was a logical outcome from B, BCPL, machine architecture,
standardized circuitry, etc., even if Unix was written in another HLL such
as PL/1? I do. C or something very similar would've still been developed.
I don't think it would've been as popular or widespread as it is if it
wasn't for Unix being rewritten in C.

> > And shortly below that quote is,
> > "C came into being in the years 1969-1973, in parallel with the early
> > development of the Unix operating system; the most creative period
> > occurred during 1972."
> > ...
> > "While wanting to use a higher-level language, he [Thompson] wrote the
> > original Unix system in PDP-7 assembler."
>
> Yes, that's because C didn't exist,

But, other non-assembler languages existed, perhaps not on the PDP-7 though.

> that Thompson was forced to use
> assembler,

He chose assembler. It was probably the easy choice, i.e., nothing else was
as easily available on the PDP-7. But, he could've ported some other
language, if none was available for that platform.

> > "By early 1973, the essentials of modern C were complete. The language
> > and compiler were strong enough to permit us to rewrite the Unix kernel
> > for the PDP-11 in C during the summer of that year."
> >
> > (also see "Portability of C Programs and the UNIX System" SC Johnson and
> > DM Ritchie, section IV which lists other Unix independent design
> > goals of C.)
>
> You're burying yourself deeper and deeper.

How? That (below) doesn't say anything (implicitly or explicitly) about
either of two of your claims. (C developed to implement Unix, and C is a
form of assembly)

> http://cm.bell-labs.com/who/dmr/portpap.html
> Section III says:
>
> C was developed for the PDP-11 on the UNIX system in
> 1972. Portability was not an explicit goal in its design, even
> though limitations in the underlying machine model assumed by the
> predecessors of C made us well aware that not all machines were
> the same [2]. Less than a year later, C was also running on the
> Honeywell 6000 system at Murray Hill. Shortly thereafter, it was
> made available on the IBM 370 series machines as well. The
> compiler for the Honeywell was a new product [8], but the IBM
> compiler was adapted from the PDP-11 version, as were compilers
> for several other machines.
>
> As soon as C compilers were available on other machines, a number
> of programs, some of them quite substantial, were moved from UNIX
> to the new environments. [...]
>
...

> > No, those are (mostly) not available in other HLLs like BASIC,
> > FORTRAN, Pascal, PL/1, etc...
>
> Indeed,

...

> that's why they are not high level programming
> language.

That's a value based judgement on your part. They are the definition of
high level programming languages.

> They're only slightly higher level than assembler, but not
> high enough.

Ditto... Doesn't change the definition.

> > If we ignore the fact that some of what you said can't be done in C can
> > be - just differently, why do you need to pass a function to a function
> > in C? You can call the function explicitly. Or, you can pass a pointer
> > to the function and use the function pointer to call the function. You
> > can't create a new function at runtime (usual feature of interpreters), but
> > you can create a pointer to a nonexistent function and then set the
> > pointer to point to a function outside of your application at runtime,
> > e.g., to an OS routine, and call it. So, I don't see where passing a
> > C function to another
> > function is of any value without some other language feature such as
> > encapsulation. I.e., it doesn't fit the paradigm.
>
> Yes, and you can also do that on a Turing Machine.

So? Second (or third?) avoidance of "... why do you need to pass a function
to a function in C?" I.e., can it be a missing feature of C if the language
has no ability to use the feature? Of course not...
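
For reference, the function-pointer mechanism described in the quoted text is the one the standard library's qsort relies on. The sketch below is not from the original posts; ascending, descending and sort_ints are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callbacks in the form qsort expects. */
int ascending(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

int descending(const void *a, const void *b)
{
    return ascending(b, a);
}

/* Sort an int array; the ordering is whatever function the caller passes. */
void sort_ints(int *items, size_t count,
               int (*order)(const void *, const void *))
{
    assert(items != NULL && order != NULL);
    qsort(items, count, sizeof items[0], order);
}
```

The same sort_ints call sorts in either direction depending only on which function is handed to it.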

> >> >> multidimensional arrays,
> >> >
> >> > Part of the language.
> >>
> >> Idem.
> >
> > I'm not aware of any assembler or assembly language that supports
> > multidimensional arrays natively...
>
> My point.
>
> Assemblers don't support multidimensional arrays natively.

True.

> C doesn't support multidimensional arrays natively.

But, assemblers don't have any native method to simulate multidimensional
arrays either, unlike C...
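
A minimal sketch of what C provides here, not from the original posts; ROWS, COLS and grid_sum are hypothetical. An array of arrays is laid out row after row, and the compiler does the index arithmetic.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 2, COLS = 3 };

/* Sum a ROWS x COLS grid using two subscripts; the compiler turns
   grid[r][c] into an offset of r * COLS + c elements from the start. */
int grid_sum(int grid[ROWS][COLS])
{
    assert(grid != NULL);
    int total = 0;
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            total += grid[r][c];
    return total;
}
```

In an assembler the programmer would typically write that multiply and add by hand at each access.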

> > What you're saying is that because C doesn't support strings or arrays
> > natively, but implements them as pointers and contiguous sequences of
> > bytes, that the type system and checking is insufficient for strings and
> > arrays... I.e., C is "confusing" strings and pointers and so should
> > generate an error... I think we've covered this. That's the way it's
> > done. I think it's type system is adequate.
>
> What I'm saying is that:
>
> Assemblers don't support strings (beside the ds.s directive
> allowing them to fill a block of memory with the bytes of a
> string in the source).

Not true. Some assemblers do support strings natively - even without cpu
instruction support. E.g., take a 6502 assembler circa 1983, it supported
different types of strings and multiple character sets: PETSCII, ASCII,
ASCIN, ASCIZ... Also, some cpu's (x86) have cpu string instruction
support, so assemblers for x86 support strings natively.

> C doesn't support strings (beside the ds.s directive allowing them

> to fill a block of memory with the bytes of a string in the
> source).

What is the "ds.s directive"? It's not part of C... That appears to be an
erroneous cut-and-paste. :-)

I understand that you're trying to make C seem similar to assembly, but most
of your claims seem to be very weakly supported (IMO) as compared to the
contrary position (especially, the points in my prior post...).

Rod Pemberton

Aug 3, 2008, 5:48:46 AM8/3/08

"Mike Sieweke" <msie...@ix.netcom.com> wrote in message
news:msieweke-681DFD...@bignews.bellsouth.net...

> "Rod Pemberton" <do_no...@nohavenot.cmm> wrote:
> > "gremnebulin" <peter...@yahoo.com> wrote:
> > > * No separate Boolean type: zero/nonzero is used instead[10]
> >
> > True and False. False for C99, see _Bool. True for C89. Feature is
> > trivial and unneeded. Zero and non-zero is common for most HLLs anyway
> > (except for one language which used a value other than zero for false,
> > perhaps was ADA?).
>
> C99 doesn't have a true boolean type. "_Bool" is just a convenient
> alias for "int".

It might be an "int", but it doesn't have to be:

"An object declared as type _Bool is large enough to store the values 0 and
1." n1256 draft 6.2.5 sub 2
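
One observable difference from int is the conversion rule: any nonzero scalar converts to 1. The sketch below is not from the original post; as_bool and as_int are hypothetical names, and static_assert assumes C11 or later.

```c
#include <assert.h>
#include <stdbool.h>

/* Conversion to _Bool normalizes: nonzero becomes 1. */
static_assert((bool)4 == 1, "any nonzero value converts to 1");

/* Conversion to _Bool compares against zero. */
bool as_bool(double x)
{
    return x;
}

/* Conversion to int truncates toward zero instead. */
int as_int(double x)
{
    return (int)x;
}
```

So as_bool(0.5) is 1 while as_int(0.5) is 0, which an alias for int could not do.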

> In a language with a boolean type, expressions
> like "4<5" or "x == y" would return a boolean value, and not an
> integer.

"would return a boolean value" vs. "would return a boolean type"...

I think it should be "boolean type" there. C already returns a boolean
value for logical comparisons: 0 or 1. It doesn't return a boolean type.

> The "if" statement would take a boolean argument instead
> of an integer.

The "if" statement in C currently takes a boolean value. The result of an
expression comparison with zero:
1) false is zero
2) true is non-false

(Specifically, the value for 2) should be one, because of the comparison
with zero, but that isn't explicitly required... IMO, due to branches in
assembly language.)

"In both forms, the first substatement is executed if the expression
compares unequal to 0. In the else form, the second substatement is
executed if the expression compares equal to 0." n1256 draft 6.8.4.1 sub 2

> Each of these statements would be illegal:
>
> int i = true; // value incompatible with type
> bool b = 4; // value incompatible with type
> if (i) // "if" expects boolean expression
> printf( "true" );
> if (i<4<5) // boolean expr "i<4" incompatible with "<" operator
> printf( "???" );

What is gained if C has an additional boolean type? More type checking?
More casts? More comparisons?

Let's review those:

> int i = true; // value incompatible with type
> bool b = 4; // value incompatible with type

Type-checking can be good.

> if (i) // "if" expects boolean expression
> printf( "true" );

How do you execute the true body of the if when i is non-zero?

if(i!=0)
printf("true");

What's the advantage to this?

> if (i<4<5) // boolean expr "i<4" incompatible with "<" operator
> printf( "???" );

"incompatible with '<' operator"...

Are you referring to the first or second "<" operator? If first, then you
have a serious problem: there is no way to generate a boolean true or false
for greater-than or less-than from an integer comparison. If second, then
you have one of two problems: 1) requires a cast to convert boolean result
i<4 to integer or 2) the integer promotion rules or implicit casts need
correcting. The results are the same unless you want to intentionally flag
this as an error.
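
For the record, this is what current C does with the chained form. The sketch is not from the original posts; chained and between are hypothetical names, and between is only a guess at what such an expression is usually meant to say.

```c
#include <assert.h>

/* What i<4<5 computes in C: (i < 4) is 0 or 1, and both are less than 5,
   so the result is 1 for every i. Compilers may warn about this. */
int chained(int i)
{
    return (i < 4) < 5;
}

/* A range test written out explicitly: lo < i < hi in the math sense. */
int between(int lo, int i, int hi)
{
    assert(lo <= hi);
    return lo < i && i < hi;
}
```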

> > > * No exception handling; standard library functions signify error
> > > conditions with the global errno variable and/or special return values
> >
> > False. There are numeric exceptions and software exception handling via
> > signal() etc.
>
> The "signal" function comes from the OS, not the C language.

raise() is a C function for generating user or application signals.
signal() is a C function for handling signals.

> You misinterpret the term "exception".

Do I?

> In a language with exceptions,
> the programmer can define his own exceptions.

This can be done in C with the raise(), signal(), and/or assert()
functions... almost identically to your posted example.
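
A minimal sketch of the raise()/signal() approach, for comparison with the try/catch example. It is not from the original posts; on_signal and raise_and_check are hypothetical names, and SIGTERM is used only because it is one of the signals the C standard itself defines.

```c
#include <assert.h>
#include <signal.h>

/* Set by the handler; sig_atomic_t is the type the standard says can be
   written safely from a signal handler. */
static volatile sig_atomic_t last_signal = 0;

static void on_signal(int signum)
{
    last_signal = signum;
}

/* Install a handler, raise SIGTERM at ourselves, and report whether the
   handler ran. Control resumes right after raise(); nothing is unwound. */
int raise_and_check(void)
{
    void (*previous)(int) = signal(SIGTERM, on_signal);
    assert(previous != SIG_ERR);

    last_signal = 0;
    raise(SIGTERM);

    signal(SIGTERM, previous);      /* restore the earlier disposition */
    return last_signal == SIGTERM;
}
```

The handler receives only the signal number, and execution continues after the raise() call, so carrying a message or skipping the rest of a block has to be arranged by other means.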

[Snipped C++ example of exceptions... using 'catch' and 'throw'.]

> > > * Only rudimentary support for modular programming
> >
> > (False.) Okay, let me go look up what you mean by "modular
> > programming"... According to Wikipedia, this doesn't need explicit
> > language support and classifies languages which use libraries and
> > linkers (like C) as "modular programming". So, False...
>
> You confuse "separate compilation" with "modules".

No I didn't. I have no idea what "modular programming" is... I reiterated
Wikipedia which _still_ says this as I post:

"Modular programming can often be performed even where the language lacks
explicit syntax or semantics to support modules. The use of libraries hooked
together by a linker are a common mechanism for separating parts of the
software into distinct modules."

> In a modular
> language, the compiler would catch the error below. In C the error
> won't be caught by the compiler, linker, or run-time. You'll just
> see some strange output. That's why he said "rudimentary".

Okay.

> C++ will catch this in the linker, but the error message will
> be confusing.
>
> File a.c:
> ------------
> int fa( int i )
> {
> return i + 4;
> }
>
> File b.c:
> ------------
> extern float fa( float f ); // Doesn't match the function in a.c.
> void fb( void )
> {
> printf( "%f\n", fa( 2.0 )); // Called with wrong parameter type.
> }

Useful.

> * Enumerated types are missing in C. The "enum" keyword is just a way
> to define integer constants.
>

True.

> In C there is no difference between
> enum { x, y, z };
> and
> #define x 0
> #define y 1
> #define z 2

...
> In C there is no difference between [enum's] and [#define's]

False. The defines have no type. But, yes, I understand what you meant.

> In a language with true enumerated types (for example C++), this would
> be illegal:
>
> typedef enum {
> x, y, z
> } anEnum;
> typedef enum {
> a, b, c
> } anotherEnum;
> void f( anEnum e);
> void t( void )
> {
> f( 4 ); // "invalid conversion from 'int' to 'anEnum'"
> f( a ); // "cannot convert 'anotherEnum' to 'anEnum'"
> }
>

Who uses enumerated types? I.e., why is this useful if very few people use
it?

Rod Pemberton

Marco van de Voort

Aug 3, 2008, 6:29:52 AM8/3/08

On 2008-08-01, Rod Pemberton <do_no...@nohavenot.cmm> wrote:
>> On 1 Aug, 02:09, "Rod Pemberton" <do_not_h...@nohavenot.cmm> wrote:
>> > "Robert Maas, http://tinyurl.com/uh3t" <jaycx2.3.calrob...@spamgourmet.com.remove>
>> > wrote in message
>> ke C)
>> >
>> > > Yuk. C doesn't have any higher level programming features.
>> >
>> > Huh...? I keep getting this response about C from people across various
>> > NGs... (Why me?) And, they either:
>> > 1) are unable to express what they mean.
>> > or 2) never state what they mean.

(Note that my personal stand is that spartan doesn't mean "not high level".
Maybe that's my age, and the fact that back then "high level" meant
abstraction from the machine-level instruction set. For any other sense of
"high level" I never saw a decent definition, especially for languages that
claim to be general purpose. Oh, and GC and OOP are DEFINITELY not a
requirement.)

>> * No separate Boolean type: zero/nonzero is used instead[10]
>
> True and False. False for C99, see _Bool. True for C89. Feature is
> trivial and unneeded. Zero and non-zero is common for most HLLs anyway
> (except for one language which used a value other than zero for false,
> perhaps was ADA?).

For most languages this is different. Most languages only define two values,
and the rest is implementation defined (e.g. most Wirth languages). The
non-zero rule is a C convention that crept into some of the newer languages,
probably because it fits the lenient way scripting languages are defined,
which seems to be the trend nowadays.

Delphi/modern Pascals have a handful of boolean types to this day: one true
boolean, and a bunch of C booleans with different sizes, often called
"CBOOLEAN" or "WINBOOL", since the latter is connected to their most common
purpose, API communication.

If you have a URL, I'd like that. It's one of the things I miss dearly when
I go from basic Pascal to C. (I still program our microcontrollers in C.)

Especially since modern Pascal IDEs have an "extract to local function"
ability, which makes it very easy to move repeated code to a local function.

>> * No generators or coroutines; intra-thread control flow consists
>> of nested function calls, except for the use of the longjmp or
>> setcontext library functions
>
> False. (And, I've seen no real C examples where these are needed, only
> made up ones...)
> http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html
> http://groups.google.com/group/net.lang.c/msg/66008138e07aa94c?hl=en
> http://groups.google.com/group/comp.lang.c/msg/bb78298175c42411?hl=en
> http://www.sics.se/~adam/pt/

Did C99 define threadvars ?

Coroutines (cooperative multitasking) have some limited use where you want
to do finely grained access to a datastructure while keeping the GUI
responsive and up to date. No locks needed, context switches are cheap, and
the transfer points are well defined.

The lockless aspect means it is getting some attention again nowadays, though
its use is a bit limited.
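
As a rough illustration of how a generator can be faked in plain C, here is a sketch of the well-known switch-based trick (static state plus a case label inside the loop). The names next_value and demo_generator are made up for this example, and the approach is not reentrant because the state lives in static variables.

```c
#include <assert.h>

/* Returns 0, 1, 2 on successive calls, then -1, then starts over.
   The static variables keep the "coroutine" state between calls. */
int next_value(void)
{
    static int state = 0;   /* where to resume: 0 = start, 1 = in loop */
    static int i;           /* loop counter must survive the return    */

    switch (state) {
    case 0:
        for (i = 0; i < 3; i++) {
            state = 1;
            return i;       /* "yield" i to the caller                 */
    case 1:;                /* the next call resumes here              */
        }
    }
    state = 0;              /* sequence finished; reset for reuse      */
    return -1;
}

/* Usage: drain one full sequence. */
void demo_generator(void)
{
    assert(next_value() == 0);
    assert(next_value() == 1);
    assert(next_value() == 2);
    assert(next_value() == -1);
}
```

Each call is an ordinary function call, so the "context switch" costs no more than that, which matches the point about cheap, well-defined transfer points.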

>> * No exception handling; standard library functions signify error
>> conditions with the global errno variable and/or special return values
>
> False. There are numeric exceptions and software exception handling via
> signal() etc.

When numeric is 16 (afaik the lowest guaranteed size of sigset_t?), then you
are right. I've to side with the others on this one.

>> * Only rudimentary support for modular programming
>
> (False.) Okay, let me go look up what you mean by "modular programming"...
> According to Wikipedia, this doesn't need explicit language support and
> classifies languages which use libraries and linkers as (like C) as "modular
> programming". So, False...

True. If we take the most minimal modular support, it is basically separate
compilation, datahiding and encapsulation.

While the separate compilation and datahiding part is honoured, the
encapsulation is not. You can't be guaranteed to use the same function name
in two modules, because of the single linker namespace. The linker namespace
is not modularized, at least not in older Cs.

And I doubt that newer Cs are _guaranteed_ to fix this.
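
The usual C workaround, sketched below under the assumption of one translation unit per "module": give private functions internal linkage with static, and prefix exported names by hand. The modA_ prefix and both function names are hypothetical.

```c
#include <assert.h>

/* "Module" A (imagine this is a.c). The helper has internal linkage,
   so another file may define its own helper() without a clash. */
static int helper(int x)
{
    return x * 2;
}

/* Exported name: the modA_ prefix is only a naming convention, since
   all external symbols share one linker namespace. */
int modA_compute(int x)
{
    return helper(x) + 1;
}

void demo_linkage(void)
{
    assert(modA_compute(3) == 7);
}
```

This hides helper() from the linker, but nothing stops two libraries from both exporting modA_compute, which is the gap being described.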

>> * No compile-time polymorphism in the form of function or operator
>> overloading
>
> This is an object-oriented language feature. C is not object-oriented.

It is not. Function overloading can perfectly well be procedural.

No it is not. It is more like parameterizing code with a type, in a defined
and controlled way. Not having OO limits it a bit, and that might (and is,
unfortunately) get simulated with macros.

>> * Limited support for encapsulation
>
> This is an object-oriented language feature. C is not object-oriented.

It is not. Modular programming features this too.

>> * No native support for multithreading
>
> Unnecessary in C. In addition to being hardware specific on certain
> platforms, this has more to do with C compiler implementation, than with the
> language itself.

(See above. Is some form of threadvars part of C99? I can vaguely remember
somebody saying that, but I don't have much C99 experience myself.)

>> * No standard libraries for computer graphics and several other
>> application programming needs
>
> True. OS dependent.

So is filehandling, but C chose to abstract that.

Ben Bacarisse

Aug 3, 2008, 7:44:18 AM8/3/08

"Rod Pemberton" <do_no...@nohavenot.cmm> writes:

> "Mike Sieweke" <msie...@ix.netcom.com> wrote in message
> news:msieweke-681DFD...@bignews.bellsouth.net...

<snip>

>> C99 doesn't have a true boolean type. "_Bool" is just a convenient
>> alias for "int".
>
> It might be an "int", but it doesn't have to be:

It is a small point but _Bool can't be an int and is certainly never
an alias for int. _Bool is one of C99's unsigned integral types (with
rank lower than that of all the others) and it can only hold the
values 0 and 1. You can stuff other bits into it, say using memcpy,
but when you look at the value you will get 0 or 1.
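
A small sketch of that conversion rule, assuming a C99 compiler with <stdbool.h>; to_bool and to_int are names invented for the comparison. Converting to _Bool tests against zero, whereas converting to int truncates.

```c
#include <assert.h>
#include <stdbool.h>

/* Converting any scalar to _Bool gives 0 if it compares equal to 0,
   and 1 otherwise. Converting to int truncates instead. */
int to_bool(double v)
{
    bool b = v;
    return b;
}

int to_int(double v)
{
    return (int)v;
}

void demo_bool(void)
{
    assert(to_bool(0.5) == 1);   /* non-zero, so it becomes 1      */
    assert(to_int(0.5) == 0);    /* truncated toward zero          */
    assert(to_bool(4.0) == 1);   /* a _Bool never holds 4          */
    assert(to_bool(0.0) == 0);
}
```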

--
Ben.

Mike Sieweke

Aug 3, 2008, 1:45:51 PM8/3/08

"Rod Pemberton" <do_no...@nohavenot.cmm> wrote:

> "Mike Sieweke" <msie...@ix.netcom.com> wrote in message

> > "Rod Pemberton" <do_no...@nohavenot.cmm> wrote:
> > > "gremnebulin" <peter...@yahoo.com> wrote:
> > > > * No separate Boolean type: zero/nonzero is used instead[10]
> > >
> > > True and False. False for C99, see _Bool. True for C89. Feature is
> > > trivial and unneeded. Zero and non-zero is common for most HLLs anyway
> > > (except for one language which used a value other than zero for false,
> > > perhaps was ADA?).
> >
> > C99 doesn't have a true boolean type. "_Bool" is just a convenient
> > alias for "int".
>
> It might be an "int", but it doesn't have to be:
>
> "An object declared as type _Bool is large enough to store the values 0 and
> 1." n1256 draft 6.2.5 sub 2

You're right, as Ben Bacarisse also pointed out. _Bool is not an alias
for int, and a _Bool variable won't behave like an int.

> > In a language with a boolean type, expressions
> > like "4<5" or "x == y" would return a boolean value, and not an
> > integer.
>
> "would return a boolean value" vs. "would return a boolean type"...
>
> I think it should be "boolean type" there. C already returns a boolean
> value for logical comparisons: 0 or 1. It doesn't return a boolean type.

No. "0" and "1" are integers. And you can't return a boolean type any
more than you can return an integer type. You can return an integer
value. Although there is a boolean type in C, there are no boolean
values. Even in C99 there is no special meaning to "true" or "false".
They come from stdbool.h, defined with "#define".
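
To make the point about result types concrete, here is a tiny sketch; the function names are illustrative only. In C a relational expression has type int, so its size matches sizeof(int) and its value is exactly 0 or 1.

```c
#include <assert.h>
#include <stddef.h>

/* sizeof does not evaluate its operand; it only looks at the type,
   and the type of a comparison in C is int. */
size_t comparison_size(void)
{
    return sizeof(4 < 5);
}

int comparison_value(int a, int b)
{
    return a < b;           /* always exactly 0 or 1 */
}

void demo_comparison(void)
{
    assert(comparison_size() == sizeof(int));
    assert(comparison_value(4, 5) == 1);
    assert(comparison_value(5, 4) == 0);
}
```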

> > The "if" statement would take a boolean argument instead
> > of an integer.
>
> The "if" statement in C currently takes a boolean value. The result of an
> expression comparison with zero:
> 1) false is zero
> 2) true is non-false

The "if" statement doesn't take a boolean value, because it was defined
when the language had no boolean type. It takes an integer value. If
you pass it something other than an integer, it will be cast to an
integer. Behind the scenes it does a comparison with 0. The "if (x)"
statement is implemented as "if (x != 0)"

> (Specifically, the value for 2) should be one, because of the comparison
> with zero, but that isn't explicitly required... IMO, due to branches in
> assembly language.)
>
> "In both forms, the first substatement is executed if the expression
> compares unequal to 0. In the else form, the second substatement is
> executed if the expression compares equal to 0." n1256 draft 6.8.4.1 sub 2

That's what I said.
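
A sketch of the "compares unequal to 0" rule from the quoted draft, using made-up helper names. Note that the rule applies to any scalar, so 0.5 and a non-null pointer both select the first substatement, even though converting 0.5 to int would give 0.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if "if (d)" takes the first branch, 0 otherwise. */
int taken_double(double d)
{
    if (d)
        return 1;
    return 0;
}

/* The same test for a pointer: null compares equal to 0. */
int taken_pointer(const void *p)
{
    if (p)
        return 1;
    return 0;
}

void demo_if(void)
{
    assert(taken_double(0.5) == 1);   /* 0.5 != 0 is true           */
    assert((int)0.5 == 0);            /* yet truncation gives 0     */
    assert(taken_double(0.0) == 0);
    assert(taken_pointer(NULL) == 0);
    assert(taken_pointer("x") == 1);
}
```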

> > Each of these statements would be illegal:
> >
> > int i = true; // value incompatible with type
> > bool b = 4; // value incompatible with type
> > if (i) // "if" expects boolean expression
> > printf( "true" );
> > if (i<4<5) // boolean expr "i<4" incompatible with "<" operator
> > printf( "???" );
>
> What is gained if C has an additional boolean type? More type checking?
> More casts? More comparisons?

Yes, more type checking. The compiler could tell you this isn't correct:
x = (y < 5) + 2;
if you meant to type
x = (y << 5) + 2;
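
For illustration, both spellings compile cleanly in C and give very different results; the function names below are hypothetical.

```c
#include <assert.h>

/* Comparison: (y < 5) is 0 or 1, so the result is 2 or 3. */
int with_compare(int y)
{
    return (y < 5) + 2;
}

/* Shift: y << 5 is y * 32 for small non-negative y. */
int with_shift(int y)
{
    return (y << 5) + 2;
}

void demo_typo(void)
{
    assert(with_compare(1) == 3);
    assert(with_compare(9) == 2);
    assert(with_shift(1) == 34);
}
```

A language that kept boolean and integer results apart would reject the first form unless the conversion were written out.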

The compiler could detect an error typing "if (x=y)" when you meant to
type "if (x==y)" (unless x and y are _Bool, but that's a different
topic).

Would you need more casts? If you're programming in a certain style,
then perhaps yes:
x += y<4;
You'd have to write this as
x += (int)(y<4);

To the maintenance programmer who comes after you, the second form
is preferable, because it says you really meant to do that.

> Let's review those:
>
> > int i = true; // value incompatible with type
> > bool b = 4; // value incompatible with type
>
> Type-checking can be good.
>
> > if (i) // "if" expects boolean expression
> > printf( "true" );
>
> How do you execute the true body of the if when i is non-zero?
>
> if(i!=0)
> printf("true");
>
> What's the advantage to this?

The main advantage is that the code states its intent. When you need
to show your code to your boss (who might not be an expert in C) it's
better to be explicit. The maintenance programmer who comes after
you will appreciate it, too. But the real advantage comes from all
the other types of errors that can be caught if this verbose form is
required.

> > if (i<4<5) // boolean expr "i<4" incompatible with "<" operator
> > printf( "???" );
>
> "incompatible with '<' operator"...
>
> Are you referring to the first or second "<" operator? If first, then you
> have a serious problem: there is no way to generate a boolean true or false
> for greater-than or less-than from an integer comparison. If second, then
> you have one of two problems: 1) requires a cast to convert boolean result
> i<4 to integer or 2) the integer promotion rules or implicit casts need
> correcting. The results are the same unless you want to intentionally flag
> this as an error.

In C the expression "i<4<5" is parsed as "(i<4)<5". So "i<4" returns
a boolean value, which is incompatible with comparison with an integer.
Sorry, I didn't state the assumption that boolean values can't be
compared to or assigned to integers. The point is that the expression
"i<4<5" doesn't make sense, but it's legal C code. If comparison
operators produced boolean values which were incompatible with integers,
then the compiler could catch this error.
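
A short sketch of why the chained form is a trap: since (i<4) is 0 or 1, comparing it with 5 is always true. The in_range helper is a hypothetical example of what a programmer might have meant instead.

```c
#include <assert.h>

/* What "i<4<5" means in C, with the implicit grouping made explicit. */
int chained(int i)
{
    return (i < 4) < 5;     /* 0 < 5 or 1 < 5: always 1 */
}

/* One thing the author might have intended (an assumption). */
int in_range(int lo, int i, int hi)
{
    return lo < i && i < hi;
}

void demo_chained(void)
{
    assert(chained(0) == 1);
    assert(chained(100) == 1);          /* true for every i */
    assert(in_range(0, 3, 5) == 1);
    assert(in_range(0, 100, 5) == 0);
}
```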

> > > > * No exception handling; standard library functions signify error
> > > > conditions with the global errno variable and/or special return values
> > >
> > > False. There are numeric exceptions and software exception handling via
> > > signal() etc.
> >
> > The "signal" function comes from the OS, not the C language.
>
> raise() is a C function for generating user or application signals.
> signal() is a C function for handling signals.
>
> > You misinterpret the term "exception".
>
> Do I?

Yes, but I don't care enough about exceptions to correct your
misconceptions.

> > > > * Only rudimentary support for modular programming
> > >
> > > (False.) Okay, let me go look up what you mean by "modular programming"...
> > > According to Wikipedia, this doesn't need explicit language support and
> > > classifies languages which use libraries and linkers (like C) as "modular
> > > programming". So, False...
> >
> > You confuse "separate compilation" with "modules".
>
> No I didn't. I have no idea what "modular programming" is... I reiterated
> Wikipedia which _still_ says this as I post:
>
> "Modular programming can often be performed even where the language lacks
> explicit syntax or semantics to support modules. The use of libraries hooked
> together by a linker are a common mechanism for separating parts of the
> software into distinct modules."

I'll try to explain it as I understand it. Separate compilation means
you can compile module A and module B separately and then link them
together later. You can also produce libraries, and the linker will
handle all that's necessary to let your program use the libraries.
C and FORTRAN both have separate compilation.

Separate compilation is immensely useful, but it leaves open some big
holes:
- The compiler/linker can't verify parameter types or numbers if the
header files are wrong or header files are not used.
- Names must be unique across your whole program, plus every library
you'll ever link to.
- Neither the linker nor the compiler ensures that constants are
defined the same everywhere.

A fully modular language would solve all these problems in the compiler.
The compiler must maintain a separate symbol table for the interface
to each module. Each module (can) define(s) its own namespace.
Constants must be defined inside a module, so there's no confusion
about multiple definitions.

For example (using features from several languages):

interface b -- The interface is somewhat like a header file, if the
            -- header file is done correctly.
    function f : integer
    const a = 4
    enum et = ( e1, e2, e3 )
    function g( e : et )
end b

module b -- All the executable code goes here.
         -- The compiler makes sure the module matches its interface.
    function f : integer
        return 5
    end f
    function g( e : et )
        if e = e1 then
            writeln( "e is e1" );
        end if
    end g
end b

module a
    import b -- Says to use module b's interface.

    function f : integer
        b.g( b.e1 ) -- Call function g in module b
        return b.f -- Call function f in module b
    end f
end a

Note that it's illegal to define function b.f in module a. It
must be defined in b's interface.

Some languages don't use a separate interface definition - Oberon,
for example. Oberon has the programmer highlight exported names
with an asterisk.

The nearest thing to an interface definition in C is the ".h" file.
Header files are a poor substitute, because they are implemented with
simple text inclusion. The compiler and linker make no requirements
on what goes where, and there's no protection from doing it wrong.
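
When the convention is followed, it looks roughly like the sketch below, shown as one listing with comments marking three hypothetical files (fa.h, a.c, b.c). Because both source files would include the same prototype, a definition or call that disagrees with it becomes a compile-time error, but only as long as everyone actually includes the header.

```c
#include <assert.h>

/* ---- fa.h (hypothetical shared header) ---- */
int fa(int i);

/* ---- a.c: would #include "fa.h", so a definition that does not
        match the prototype fails to compile ---- */
int fa(int i)
{
    return i + 4;
}

/* ---- b.c: would #include "fa.h" instead of writing its own
        extern declaration ---- */
int fb(void)
{
    return fa(2);
}

void demo_header(void)
{
    assert(fb() == 6);
}
```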

This is all handled differently in (for example) C++. Name spaces
are optional, and function names are mangled in the object files
so the linker can catch the error if you call a function with the
wrong number or types of arguments. But header files are still done
with simple text inclusion, with very little protection from doing
it wrong. Constants can be defined differently in different modules.

User-defined enumerated types are very widely used, so I don't know why
you think they aren't. I don't know any programmers who don't use them
daily.

I'd even go so far as to say that _everyone_ uses enumerated types,
including you. Characters are an enumerated type defined by the
language. Would you like to program like this?

char HW[] = {72,101,108,108,111,32,87,111,114,108,100,33,0};
void HelloWorld( void )
{
printf( "%s\n", HW );
}

In many languages the boolean type is an enumerated type. For example
in Pascal it's defined somewhat like this:
type boolean = ( false, true );
This is built into the compiler, so it can return a boolean value for
a comparison, and require boolean values in "if" statements.

Enumerated types are used everywhere programmers want to give symbolic
names to values, when the values are mostly irrelevant. Most compilers
use them for token types:
typedef enum
{ kIdentifier, kLeftParen, kRightParen, kIf, kElse, etc.
} TokenEnum;
TokenEnum GetToken( void );

They're used in symbol tables for ID class:
enum { kFunction, kLabel, kType, kParameter, kVariable, etc. }

Some languages don't have user-defined enumerated types, so they emulate
them with constants. Enumerated types are more useful if the compiler
supports them and enforces their use with strong typing.

Here's an example I did last week:

typedef enum {
kDivideBy1,
kDivideBy2,
kDivideBy4,
kDivideBy8
} DividerEnum;

void SetCpuClockDivider( DividerEnum d );

It's used like this:
SetCpuClockDivider( kDivideBy4 );

With strong typing the compiler can detect this as an error:
SetCpuClockDivider( 8 );

Rod Pemberton

Aug 4, 2008, 5:25:21 AM8/4/08

"Mike Sieweke" <msie...@ix.netcom.com> wrote in message

news:msieweke-78C1A1...@bignews.bellsouth.net...

> "Rod Pemberton" <do_no...@nohavenot.cmm> wrote:
> > "Mike Sieweke" <msie...@ix.netcom.com> wrote in message

> > > In a language with a boolean type, expressions
> > > like "4<5" or "x == y" would return a boolean value, and not an
> > > integer.
> >
> > "would return a boolean value" vs. "would return a boolean type"...
> >
> > I think it should be "boolean type" there. C already returns a boolean
> > > value for logical comparisons: 0 or 1. It doesn't return a boolean type.
>
> No. "0" and "1" are integers.

Ok, we're hung up on the terminology of "boolean value". 0 and 1 are
boolean values irrespective of type. I.e., there are only two possible
resultant values. Boolean values can be stored in any type, not just a
boolean type, that allows two values to be stored: integer, boolean, etc.

> > > The "if" statement would take a boolean argument instead
> > > of an integer.
> >
> > The "if" statement in C currently takes a boolean value. The result of an
> > expression comparison with zero:
> > 1) false is zero
> > 2) true is non-false
>
> The "if" statement doesn't take a boolean value,

False. It accepts two "values": zero, and non-zero. This is a boolean
value representation irrespective of type.

> it ["if" statement] was defined

> when the language had no boolean type.

True.

> It takes an integer value.

False. It takes an integer type. The values are boolean - just two states.

> If
> you pass it something other than an integer, it will be cast to an
> integer.

If you pass it something other than an integer *type*, it will be cast to an
integer *type*. This has nothing to do with integer *value*.

> > (Specifically, the value for 2) should be one, because of the comparison
> > with zero, but that isn't explicitly required... IMO, due to branches in
> > assembly language.)
> >
> > "In both forms, the first substatement is executed if the expression
> > compares unequal to 0. In the else form, the second substatement is
> > executed if the expression compares equal to 0." n1256 draft 6.8.4.1 sub 2
>
> That's what I said.

IMO, you're confusing the values with the type. I.e., -5 is an integer
value, but it can be stored in a signed integer type, or a float type. Just
like 0 and 1 are boolean values, but can be stored in integers, booleans,
float, etc.

> > > Each of these statements would be illegal:
> > >
> > > int i = true; // value incompatible with type
> > > bool b = 4; // value incompatible with type
> > > if (i) // "if" expects boolean expression
> > > printf( "true" );
> > > if (i<4<5) // boolean expr "i<4" incompatible with "<" operator
> > > printf( "???" );
> >
> > What is gained if C has an additional boolean type? More type checking?
> > More casts? More comparisons?
>
> Yes, more type checking. The compiler could tell you this isn't correct:
> x = (y < 5) + 2;

If I coded that, it's correct. How does the compiler know this isn't
correct since it can't know what I intended to code?

> if you meant to type
> x = (y << 5) + 2;

Who says I meant to type that? The compiler doesn't have the right to
decide which of these two is correct. The code, which I wrote, is what
decides.

If the language has insufficient syntax, then there is a problem with the
syntax that needs to be addressed, not the types or type system. E.g.,
require additional parens around boolean result, etc. instead of requiring a
boolean type. E.g., (from your comments immediately below) change the
assignment operator to something safer, or use a #define to do so.

> The compiler could detect an error typing "if (x=y)" when you meant to
> type "if (x==y)" (unless x and y are _Bool, but that's a different
> topic).

This is a problem with C. Assignment and comparison should've had uniquely
searchable operators. I.e., it's hard to distinguish "=" from "==" by text
search... E.g., could've used ":=" for assignment.

> > if(i!=0)
> > printf("true");
> >
> > What's the advantage to this?
>
> The main advantage is that the code states its intent.

As does if(i)... to those who program. Entire generations of languages used
zero as false. The way that if() was taught for many of them:

if (true) <==> if (not false) <==> if (not zero)

This "idiom" has worked in every language I've programmed in that supports
an "if" statement. IIRC, there is/was one language that used a different
value for false.

> But the real advantage comes from all
> the other types of errors that can be caught if this verbose form is
> required.

if(i) and if(i!=0) are currently equivalent because if() only supports an
integer type. Currently, this can't catch any additional errors. If you're
suggesting that if() support multiple types, e.g., both integer and boolean,
I don't see any additional errors being caught. The boolean would just be
promoted to an integer. If you're suggesting that if() be converted to
boolean type only, this greatly complicates most condition checks with more
casts and conversions. I don't see this as an improvement.

> If comparison
> operators produced boolean values which were incompatible with integers,
> then the compiler could catch this error.

Why do you want this? Do you intend to have if() support multiple types, or
rework all comparisons, etc., to be boolean only?

> > Who uses enumerated types? I.e., why is this useful if very few people use
> > it?
>
> User-defined enumerated types are very widely used, so I don't know why
> you think they aren't. I don't know any programmers who don't use them
> daily.

What's wrong with #define's and integers? You have to remember that many C
compilers didn't properly support enum's, struct's, bitfields, etc. well
into the '90's...

> I'd even go so far as to say that _everyone_ uses enumerated types,
> including you. Characters are an enumerated type defined by the
> language.

...

> ... everywhere programmers want to give symbolic

> names to values, when the values are mostly irrelevant.

Typically, that's done with preprocessor #define's in C, as you already
know.

> Most compilers
> use them for token types:
> typedef enum
> { kIdentifier, kLeftParen, kRightParen, kIf, kElse, etc.
> } TokenEnum;
> TokenEnum GetToken( void );
>
> They're used in symbol tables for ID class:
> enum { kFunction, kLabel, kType, kParameter, kVariable, etc. }

Everywhere you'd see a C programmer use #define's and integers.

> Some languages don't have user-defined enumerated types, so they emulate
> them with constants. Enumerated types are more useful if the compiler
> supports them and enforces their use with strong typing.
>
> Here's an example I did last week:
>
> typedef enum {
> kDivideBy1,
> kDivideBy2,
> kDivideBy4,
> kDivideBy8
> } DividerEnum;
>
> void SetCpuClockDivider( DividerEnum d );
>
> It's used like this:
> SetCpuClockDivider( kDivideBy4 );
>
> With strong typing the compiler can detect this as an error:
> SetCpuClockDivider( 8 );
>

I understand why you want to detect that error, but not why you would want
to detect it in that manner. You have full use in C of any integer values
you want to use.

1) You could add kEnd to DividerEnum, and compare the passed in value with
kEnd in SetCpuClockDivider, i.e., range check.
2) You could've automatically rounded up, down, for in-between values if
any, while setting a default value for any bad value in SetCpuClockDivider,
e.g., by using a switch().
3) You could've placed logic around the call to SetCpuClockDivider so that
when SetCpuClockDivider returns with an error, the code adjusts and recalls.
4) Won't forcing the maintenance programmer to track down the definition of
DividerEnum, which is likely to be moved to a centralized header sometime in
the future, just to find a correct kDivideByN make his/her life more
difficult? (This was one of the problems experienced on 5Mloc of PL/1...)
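
Option 1 above could look something like the following sketch. The int return value, the current_divider variable standing in for the hardware register, and GetCpuClockDivider are all assumptions added for illustration; the original function returned void.

```c
#include <assert.h>

typedef enum {
    kDivideBy1,
    kDivideBy2,
    kDivideBy4,
    kDivideBy8,
    kEnd            /* sentinel: one past the last valid value */
} DividerEnum;

/* Stand-in for the real hardware register (hypothetical). */
static DividerEnum current_divider = kDivideBy1;

/* Returns 0 on success, -1 if d is outside the enumerated range. */
int SetCpuClockDivider(DividerEnum d)
{
    if ((int)d < 0 || (int)d >= (int)kEnd)
        return -1;
    current_divider = d;
    return 0;
}

DividerEnum GetCpuClockDivider(void)
{
    return current_divider;
}

void demo_divider(void)
{
    assert(SetCpuClockDivider(kDivideBy4) == 0);
    assert(SetCpuClockDivider(8) == -1);    /* C accepts the bare 8 */
    assert(GetCpuClockDivider() == kDivideBy4);
}
```

The trade-off is that the bad call is only found at run time, on the path that happens to execute it, rather than when the code is compiled.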

Rod Pemberton

Richard Harter

Aug 4, 2008, 12:48:44 PM8/4/08

On Mon, 4 Aug 2008 05:25:21 -0400, "Rod Pemberton"
<do_no...@nohavenot.cmm> wrote:

>"Mike Sieweke" <msie...@ix.netcom.com> wrote in message
>news:msieweke-78C1A1...@bignews.bellsouth.net...
>> "Rod Pemberton" <do_no...@nohavenot.cmm> wrote:
>> > "Mike Sieweke" <msie...@ix.netcom.com> wrote in message
>> > > In a language with a boolean type, expressions
>> > > like "4<5" or "x == y" would return a boolean value, and not an
>> > > integer.
>> >
>> > "would return a boolean value" vs. "would return a boolean type"...
>> >
>> > I think it should be "boolean type" there. C already returns a boolean
>> > value for logical comparisons: 0 or 1. It doesn't return a boolean type.
>>
>> No. "0" and "1" are integers.
>
>Ok, we're hung up on the terminology of "boolean value". 0 and 1 are
>boolean values irrespective of type. I.e., there are only two possible
>resultant values. Boolean values can be stored in any type, not just a
>boolean type, that allows two values to be stored: integer, boolean, etc.

This isn't quite right. The boolean values are the two possible
truth values, true and false. It is convenient to use 0 and 1 to
represent them in boolean algebra. You are right that they can
be stored in anything that allows two values to be stored.
However you need more than just storing them; you need for the
storage type to have the same algebraic properties as boolean
algebra, and this is where the C conventions are problematic.

Firstly, C does not use {0,1} for {false,true}, it uses
{zero,non-zero}. Any type that permits the zero/non-zero
distinction, e.g. pointers, doubles, bit field values, whatever,
represents boolean values.

Secondly, although anything can be a boolean value, the
comparison operators deliver integer values restricted to {0,1}
with the consequence that we can intermix arithmetic operations
and boolean operations, e.g., ((x<y)+(y<z))!=1. (Hack test for
order transitivity; do not do this at home.)

The upshot is that the C usage is littered with little gotchas,
traps waiting to be sprung. The experienced C programmer doesn't
see them because he/she has long ago formed the habit of stepping
around them.

For example there is no equality comparison operator for boolean
expressions. Thus, x==y, will test true if x and y are of the
same type (modulo various promotion hacks) and have the same
value; we can't use == to test for equivalent truth value, even
though equality is a legitimate operator in boolean algebra. Of
course you can use (!x)==(!y) instead, or (!!x)==(!!y).
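
A minimal sketch of the difference, with same_truth as a made-up name: == compares the raw values, while normalizing both sides with ! first compares only their truth values.

```c
#include <assert.h>

/* 1 if x and y are both zero or both non-zero, else 0. */
int same_truth(int x, int y)
{
    return (!x) == (!y);
}

void demo_truth(void)
{
    assert((2 == 5) == 0);          /* different values...        */
    assert(same_truth(2, 5) == 1);  /* ...but both count as true  */
    assert(same_truth(0, 5) == 0);
    assert(same_truth(0, 0) == 1);
}
```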

Frex, one of the little traps is that it is problematic to define
#define FALSE 0
#define TRUE 1

because the test
if (x == TRUE)

doesn't work right.
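
Spelled out as a sketch (the two function names are invented here): any non-zero value other than 1 counts as true in an if, yet fails the comparison with TRUE. Testing against FALSE avoids the problem.

```c
#include <assert.h>

#define FALSE 0
#define TRUE  1

/* The trap: only the value 1 passes this test. */
int is_true_naive(int x)
{
    return x == TRUE;
}

/* Safer: anything that is not FALSE is treated as true. */
int is_true(int x)
{
    return x != FALSE;
}

void demo_true(void)
{
    assert(is_true_naive(1) == 1);
    assert(is_true_naive(2) == 0);  /* 2 is "true" to if(), but != TRUE */
    assert(is_true(2) == 1);
    assert(is_true(0) == 0);
}
```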

Richard Harter, c...@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Save the Earth now!!
It's the only planet with chocolate.

mike

Aug 4, 2008, 7:24:20 PM8/4/08

In article <g76i0e$6n0$1...@aioe.org>, do_no...@nohavenot.cmm says...

>
> IMO, you're confusing the values with the type. I.e., -5 is an integer
> value, but it can be stored in a signed integer type, or a float type. Just
> like 0 and 1 are boolean values, but can be stored in integers, booleans,
> float, etc.
>

As a matter of interest, is it guaranteed (by the c standards) that, if
an integer value is converted into a float, that the float value is
numerically exactly equal to the integer value. i.e. is it possible for
5 to convert to 5.0000000000001 or 4.9999999999999 or something similar?

Mike

Sjouke Burry

Aug 4, 2008, 8:05:16 PM8/4/08

Yep. Floating values guarantee to be almost equal.
And doubles even more so. :)

robert...@yahoo.com

Aug 4, 2008, 8:31:17 PM8/4/08

On Aug 4, 6:24 pm, mike <m....@irl.cri.replacethiswithnz> wrote:
> As a matter of interest, is it guaranteed (by the c standards) that, if
> an integer value is converted into a float, that the float value is
> numerically exactly equal to the integer value. i.e. is it possible for
> 5 to convert to 5.0000000000001 or 4.9999999999999 or something similar?

No. Consider the valid (32 bit) integer number 2,000,000,001. It has
no exact representation in an IEEE 32 bit float. In fact, rounding
will likely get you the same float for all integer values from
1,999,999,936 to 2,000,000,064.

While you'll be OK for all 32-bit integer values in an IEEE double, the
minimum requirements for a double in the C standard guarantee a unique
representation for all those values, but not necessarily an exact one, if
I read the standard correctly. If your integer type is larger than 32
bits, it may not fit in a double either.
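
This is easy to check with a round trip. The sketch below assumes float is an IEEE 754 32-bit type with a 24-bit significand, which the C standard does not require; exact_in_float is a hypothetical helper.

```c
#include <assert.h>

/* Returns 1 if n survives a round trip through float unchanged.
   The volatile forces the value to be stored at float precision. */
int exact_in_float(long n)
{
    volatile float f = (float)n;
    return (long)f == n;
}

void demo_float(void)
{
    assert(exact_in_float(5L));             /* small values are exact  */
    assert(exact_in_float(16777216L));      /* 2^24 still fits         */
    assert(!exact_in_float(16777217L));     /* 2^24 + 1 does not       */
    assert(!exact_in_float(2000000001L));   /* the example given above */
}
```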

Joe Wright

Aug 4, 2008, 8:55:53 PM8/4/08

What do you mean 'Yep'? The conversion of the integer value 5 to a float
or a double will be exactly 5. There is no case for an approximation.

Here at my house, float has mantissa 24 bits wide and double has
mantissa 53 bits wide. Conversions of integer types char (8 bits) and
short (16 bits) to float will be exact. My int and long are 32 bits and
will fit very nicely in a double exactly.

Not all floating point values are approximate.

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

Joe Wright

Aug 4, 2008, 9:41:28 PM8/4/08

Hm.. The IEEE 32-bit float has a 24-bit mantissa. It will hold exactly a
value up to 2^24 or 16,777,216 or so, all char or short values.

The IEEE 64-bit double has a 53-bit mantissa. It will hold exactly any
32-bit int or long value.

Here at my house, GCC 96-bit long double has a 64-bit mantissa and will
accommodate the integral 64-bit long long exactly.

In general, I feel that floating point gets a bad rap for being
approximate rather than exact. FP can be as exact as you want it to be.

robert...@yahoo.com

Aug 4, 2008, 11:29:49 PM

On Aug 4, 8:41 pm, Joe Wright <joewwri...@comcast.net> wrote:
> In general, I feel that floating point gets a bad rap for being
> approximate rather than exact. FP can be as exact as you want it to be.

You should try doing an acceptable job dealing with currency in
(binary) float. Trust me - it's hard.

Mike Sieweke

Aug 5, 2008, 12:09:57 AM

In article <g76i0e$6n0$1...@aioe.org>,
"Rod Pemberton" <do_no...@nohavenot.cmm> wrote:

> "Mike Sieweke" <msie...@ix.netcom.com> wrote in message
> news:msieweke-78C1A1...@bignews.bellsouth.net...
> > "Rod Pemberton" <do_no...@nohavenot.cmm> wrote:
> > > "Mike Sieweke" <msie...@ix.netcom.com> wrote in message
> > > > In a language with a boolean type, expressions
> > > > like "4<5" or "x == y" would return a boolean value, and not an
> > > > integer.
> > >
> > > "would return a boolean value" vs. "would return a boolean type"...
> > >
> > > I think it should be "boolean type" there. C already returns a boolean
> > > value for logical comparisons: 0 or 1. It doesn't return a boolean
> > > type.
> >
> > No. "0" and "1" are integers.
>
> Ok, we're hung up on the terminology of "boolean value". 0 and 1 are
> boolean values irrespective of type. I.e., there are only two possible
> resultant values. Boolean values can be stored in any type, not just a
> boolean type, that allows two values to be stored: integer, boolean, etc.

Still no. There are only two boolean values: true and false. They
are generally represented as 1 and 0 in the underlying machine, but
that's not required. There was a major disagreement on the
representation of "true" when the C99 standard was created. A lot
of people thought "true" should be represented as -1. The programmer
shouldn't need to know how these values are represented in the machine.
That's one difference between high-level and low-level languages.

Only boolean values may be stored in boolean variables. In C if you
store a boolean value into a variable of another type, the boolean
value is automatically cast to the other type before assignment. C
has the odd characteristic that there are no symbolic names for
boolean values defined in the language. When you assign the integer
"0" to a boolean variable, it casts the integer as a boolean, and
stores it as 0 or 1 in binary. That's very strange.

It probably sounds like I'm just arguing semantics, but I'm not.
There are very real differences in the meaning of "true" as a
boolean value and "0" as an integer. If you want to design a
programming language or argue about language design features, you
need to understand the difference.

> > > > The "if" statement would take a boolean argument instead
> > > > of an integer.
> > >
> > > The "if" statement in C currently takes a boolean value. The result of
> > > an expression comparison with zero:
> > > 1) false is zero
> > > 2) true is non-false
> >
> > The "if" statement doesn't take a boolean value,
>
> False. It accepts two "values": zero, and non-zero. This is a boolean
> value representation irrespective of type.

"Non-zero" is not a value. It is a comparison (x!=0).

Here's what the C99 standard has to say about this:

<quote>

6.8.4.1 The if statement
----------------
* Constraints
1 The controlling expression of an if statement shall have scalar type.
------
* Semantics
2 In both forms, the first substatement is executed if the expression
compares unequal to 0.

In the else form, the second substatement is executed if the expression
compares equal to 0. If the first substatement is reached via a label,
the second substatement is not executed.

[In another section]
Arithmetic types and pointer types are collectively called scalar
types. Array and structure types are collectively called aggregate
types.

[In another section]
Integer and floating types are collectively called arithmetic types.

</quote>

So the standard says that the if statement doesn't take a boolean
value. The value may even be a float.

> > it ["if" statement] was defined
> > when the language had no boolean type.
>
> True.
>
> > It takes an integer value.
>
> False. It takes an integer type. The values are boolean - just two states.

No. You must learn the difference between a type and a value. Please
read the Wikipedia entry for "type (computer science)". "A type is an
attribute of a datum." You don't pass an "integer type" to a function;
you pass a value of an integer type. At least this is true in C.

> > If
> > you pass it something other than an integer, it will be cast to an
> > integer.
>
> If you pass it something other than an integer *type*, it will be cast to an
> integer *type*. This has nothing to do with integer *value*.
>
> > > (Specifically, the value for 2) should be one, because of the comparison
> > > with zero, but that isn't explicitly required... IMO, due to branches
> > > in assembly language.)
> > >
> > > "In both forms, the first substatement is executed if the expression
> > > compares unequal to 0. In the else form, the second substatement is
> > > executed if the expression compares equal to 0." n1256 draft 6.8.4.1 sub 2
> >
> > That's what I said.
>
> IMO, you're confusing the values with the type. I.e., -5 is an integer
> value, but it can be stored in a signed integer type, or a float type. Just
> like 0 and 1 are boolean values, but can be stored in integers, booleans,
> float, etc.

Still no. Both 0 and 1 are integer values (scalars in the C99
standard). If you store 0 into an integer, you're not storing
"false" into that integer.

From the C99 standard:

<quote>
6.3.1.2 Boolean type
------------
1 When any scalar value is converted to _Bool, the result is 0 if the
value compares equal to 0; otherwise, the result is 1.
</quote>

According to the C99 standard, "0" and "1" are scalar values. So
boolean assignment is defined as a conversion from a scalar value
to an underlying machine representation, like this:
    Compare value with 0
    If equal then the _Bool variable is assigned the machine
        representation of 0
    Otherwise, the _Bool variable is assigned the machine
        representation of 1
In a real compiler this will be optimized (a lot), but that's what
is really happening.

> > > > Each of these statements would be illegal:
> > > >
> > > > int i = true; // value incompatible with type
> > > > bool b = 4; // value incompatible with type
> > > > if (i) // "if" expects boolean expression
> > > >     printf( "true" );
> > > > if (i<4<5) // boolean expr "i<4" incompatible with "<" operator
> > > >     printf( "???" );
> > >
> > > What is gained if C has an additional boolean type? More type checking?
> > > More casts? More comparisons?
> >
> > Yes, more type checking. The compiler could tell you this isn't correct:
> > x = (y < 5) + 2;
>
> If I coded that, it's correct. How does the compiler know this isn't
> correct since it can't know what I intended to code?

It's much more likely that this was a typographical error than an obscure
idiom.

> > if you meant to type
> > x = (y << 5) + 2;
>
> Who says I meant to type that? The compiler doesn't have the right to
> decide which of these two is correct. The code, which I wrote, is what
> decides.

The compiler doesn't know which is correct, but it knows which is legal.
The wise language designer makes common errors illegal, so the compiler
can help the programmer.

> If the language has insufficient syntax, then there is a problem with the
> syntax that needs to be addressed, not the types or type system. E.g.,
> require additional parens around boolean result, etc. instead of requiring a
> boolean type. E.g., (from your comments immediately below) change the
> assignment operator to something safer, or use a #define to do so.

These features already exist in languages you don't know. I assumed
these features were common knowledge. My advice is to learn another
language that doesn't follow the "C" tradition. Something in the Lisp
family and something in the Pascal family. Maybe even C#. Not Java.
Learn them well enough that you can use their unique idioms instead of
just transliterating C.

> > The compiler could detect an error typing "if (x=y)" when you meant to
> > type "if (x==y)" (unless x and y are _Bool, but that's a different
> > topic).
>
> This is a problem with C. Assignment and comparison should've had uniquely
> searchable operators. I.e., hard to distinguish "=" from "==" by text
> search... E.g., could've used ":=" for assignment.

The problem is not in the assignment syntax. The problem is in
letting the assignment statement return a value. The problem is
in the language feature that makes this legal:
x = y = z = 4;

> > > if(i!=0)
> > > printf("true");
> > >
> > > What's the advantage to this?
> >
> > The main advantage is that the code states its intent.
>
> As does if(i)... to those who program. Entire generations of languages used
> zero as false. The way that if() was taught for many of them:
>
> if (true) <==> if (not false) <==> if (not zero)
>
> This "idiom" has worked in every language I've programmed in that supports
> an "if" statement. IIRC, there is/was one language that used a different
> value for false.

As I stated in an earlier message, there is an entire family of languages
(the Pascal family) that have distinct values for "true" and "false",
which are incompatible with integer types. In all these languages the
"if" statement takes a boolean value.

There are other languages with distinct values for "true" and "false",
some of which I wasn't aware of:
http://en.wikipedia.org/wiki/Boolean_datatype

I have no experience with C#, but it appears that its "if" statement
expects a boolean value.

> > But the real advantage comes from all
> > the other types of errors that can be caught if this verbose form is
> > required.
>
> if(i) and if(i!=0) are currently equivalent because if() only supports an
> integer type. Currently, this can't catch any additional errors. If you're
> suggesting that if() support multiple types, e.g., both integer and boolean,
> I don't see any additional errors being caught. The boolean would just be
> promoted to an integer. If you're suggesting that if() be converted to
> boolean type only, this greatly complicates most condition checks with more
> casts and conversions. I don't see this as an improvement.

If the assignment statement doesn't return a value, and if the "if"
statement only accepts a boolean value, and if comparison operators
don't return integers, then this common error will always be caught
by the compiler:
if ( x = 4 ) // meant to type "=="

The cost is a few extra comparisons: "if (i!=0)" instead of "if (i)".
The benefit is worth it in my opinion.

There are a host of other errors that could also be caught by the
compiler with changes to the C language. But then it wouldn't be C.

> > If comparison
> > operators produced boolean values which were incompatible with integers,
> > then the compiler could catch this error.
>
> Why do you want this? Do you intend to have if() support multiple types, or
> rework all comparisons, etc., to be boolean only?

The latter. Comparisons should return boolean values, not integers.

> > > Who uses enumerated types? I.e., why is this useful if very few people
> > > use it?
> >
> > User-defined enumerated types are very widely used, so I don't know why
> > you think they aren't. I don't know any programmers who don't use them
> > daily.
>
> What's wrong with #define's and integers? You have to remember that many C
> compilers didn't properly support enum's, struct's, bitfields, etc. well
> into the '90's...

And yet they went to the trouble to add them to C99. Those idiots :o}.
Somehow they thought this feature was valuable enough to complicate the
language and add a new reserved word. And when they did C++ they added
strong typing, too. Hmm... What could they have been thinking? When
you can fully appreciate the answer to this question, then I'll continue
this discussion.

> > I'd even go so far as to say that _everyone_ uses enumerated types,
> > including you. Characters are an enumerated type defined by the
> > language.
> ...
>
> > ... everywhere programmers want to give symbolic
> > names to values, when the values are mostly irrelevant.
>
> Typically, that's done with preprocessor #define's in C, as you already
> know.

Then you have a maintenance problem. It's up to the programmer to
make sure no two names share the same value. The compiler won't
help. If you use an enum, the compiler will make each value unique*.

* Unless you think this is clever:

typedef enum {
    kDivideBy1,
    kDivideBy2,
    kDivideBy4 = 0,
    kDivideBy8
} DividerEnum;

Don't be clever.

> > Most compilers
> > use them for token types:
> > typedef enum
> > { kIdentifier, kLeftParen, kRightParen, kIf, kElse, etc.
> > } TokenEnum;
> > TokenEnum GetToken( void );
> >
> > They're used in symbol tables for ID class:
> > enum { kFunction, kLabel, kType, kParameter, kVariable, etc. }
>
> Everywhere you'd see a C programmer use #define's and integers.

Even C programmers use enum's. They're in the language for a
reason. I work on a team of 20 programmers that use C and C++.
We all use enum's every day. I'm not exaggerating. Every day.

(1-3) Why would you go to all that trouble when the compiler can
handle that for you? It doesn't matter that you can do this in
a different way. Let the compiler help you.

With strong typing the compiler will even guarantee that no one can
pass a bad parameter to my function *. I think there's a lot of
value in that. (*This isn't strictly true if a programmer casts
a bad value to a DividerEnum. Fire him.)

(4) Are you seriously suggesting that the maintenance programmer would
use my "SetCpuClockDivider()" function without checking the header
file first? Fire him. Now. I'm not joking.

Are you suggesting that there won't be a header file for every C
source file (except for the main module)? Inconceivable.

One purpose of a header file is to document each function's purpose,
so the programmer doesn't need to look at the source. If your header
files aren't usable for this purpose, fire the programmer. I'm not
kidding. Maybe if he's your brother-in-law, try to train him.
If he can't be trained, fire him.

thomas...@gmx.at

Aug 5, 2008, 6:35:05 AM

On 5 Aug., 05:29, "robertwess...@yahoo.com" <robertwess...@yahoo.com>
wrote:

I have seen quite often the use of float to represent monetary
amounts. The people draw the wrong conclusion that numbers
with decimal point must be represented as float. Later when
inaccuracies show up they start to use all sorts of rounding in
between and still do not succeed to get the correct results.
Such inaccuracies happen even with simple operations like
summing up a large list of values.

I have removed such inaccuracies by deciding for a unit of 1c or
0.01c and using an int or long to represent the amount.
I don't say that this solution works always but I was able to remove
all inaccuracies once and for all.

Greetings Thomas Mertes

Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.

Richard Heathfield

Aug 5, 2008, 6:51:50 AM

thomas...@gmx.at said:

> On 5 Aug., 05:29, "robertwess...@yahoo.com" <robertwess...@yahoo.com>
> wrote:

<snip>

>>
>> You should try doing an acceptable job dealing with currency in
>> (binary) float. Trust me - it's hard.
>
> I have seen quite often the use of float to represent monetary
> amounts. The people draw the wrong conclusion that numbers
> with decimal point must be represented as float.

Sometimes it's unavoidable.

> Later when
> inaccuracies show up they start to use all sorts of rounding in
> between and still do not succeed to get the correct results.
> Such inaccuracies happen even with simple operations like
> summing up a large list of values.

They certainly happen when /multiplying/ a list of values. Consider, for
example, a loan of 270000 (local currency units) over 25 years at 6.25%
calculated daily and added annually, repayments to be made monthly, and
all repayments to be the same amount except for the last, which is
permitted to be slightly but not drastically lower to make everything add
up.

If you can do this calculation accurately in 32-bit integers (and that also
means no bignums, by the way), you deserve a pat on the back.

> I have removed such inaccuracies by deciding for a unit of 1c or
> 0.01c and using an int or long to represent the amount.
> I don't say that this solution works always but I was able to remove
> all inaccuracies once and for all.

It doesn't work for rollup calcs in the general case.

thomas...@gmx.at

Aug 5, 2008, 9:07:16 AM

On 5 Aug., 12:51, Richard Heathfield <r...@see.sig.invalid> wrote:

> thomas.mer...@gmx.at said:
>
> > On 5 Aug., 05:29, "robertwess...@yahoo.com" <robertwess...@yahoo.com>
> > wrote:
> <snip>
>
> >> You should try doing an acceptable job dealing with currency in
> >> (binary) float. Trust me - it's hard.
>
> > I have seen quite often the use of float to represent monetary
> > amounts. The people draw the wrong conclusion that numbers
> > with decimal point must be represented as float.
>
> Sometimes it's unavoidable.

I had a case where it was necessary to create a float just to enter
a value in a database table.

> > Later when
> > inaccuracies show up they start to use all sorts of rounding in
> > between and still do not succeed to get the correct results.
> > Such inaccuracies happen even with simple operations like
> > summing up a large list of values.
>
> They certainly happen when /multiplying/ a list of values. Consider, for
> example, a loan of 270000 (local currency units) over 25 years at 6.25%
> calculated daily and added annually, repayments to be made monthly, and
> all repayments to be the same amount except for the last, which is
> permitted to be slightly but not drastically lower to make everything add
> up.

AFAIK there are exact rules (by law) how this calculation should be
done and at which places rounding should take place to some given
precision (E.g.: The account is never allowed to carry any sub cent
amounts). I cannot see any advantage that floats could give me here,
but I have seen the disadvantages of floats and monetary amounts
in the past.

> If you can do this calculation accurately in 32-bit integers (and that also
> means no bignums, by the way), you deserve a pat on the back.

So use 64-bit integers.
With integers you don't have the problem that you carry some minimal
values around that show up at a totally unrelated place. You can
print all your calculations and the people can verify all the steps
with manual calculation and nobody will find a place where one cent
(or 0.01c or whatever) suddenly shows up without any explanation.
The formal rules for such calculations were defined with manual
calculations (on paper) in mind and not with IEEE 754 floating
points as concept.

BTW.: Isn't something as simple as 0.1 not representable exactly as
IEEE 754 floating point?

With the following Seed7 program I found out that 8071 summations
of 0.1 with single precision floating points (32 bits wide) cause
an error of 0.1 (when rounded to 0.1):

$ include "seed7_05.s7i";
  include "float.s7i";

const proc: main is func
  local
    var float: accumulator is 0.0;
    var integer: count is 0;
    var integer: number is 0;
    var string: stri is "";
  begin
    for number range 10 to 8080 do
      accumulator := 0.0;
      for count range 1 to number do
        accumulator +:= 0.1;
      end for;
      stri := str(number);
      stri := stri[.. pred(length(stri))] & "." &
              stri[length(stri) len 1];
      if stri <> (accumulator digits 1) then
        writeln(number <& " " <& accumulator <& " " <& stri);
      end if;
    end for;
  end func;

I verified this with the following C program:

#include <stdio.h>

int main (void)
{
    float accumulator = 0.0;
    int max_count = 8071;
    int count;

    for (count = 1; count <= max_count; count++) {
        accumulator += 0.1;
    } /* for */
    printf("%d * 0.1 = %0.14f rounded = %0.1f should be %d.%d\n",
           max_count, accumulator, accumulator,
           max_count / 10, max_count % 10);
    return 0;
}

Accountants sometimes spend days with searching for some tiny amount
in some big calculation (such as a 1€ difference in calculations
with a total amount of 1234567€).

> > I have removed such inaccuracies by deciding for a unit of 1c or
> > 0.01c and using an int or long to represent the amount.
> > I don't say that this solution works always but I was able to remove
> > all inaccuracies once and for all.
>
> It doesn't work for rollup calcs in the general case.

Why?

Richard Heathfield

Aug 5, 2008, 10:15:16 AM

thomas...@gmx.at said:

<snip>

> AFAIK there are exact rules (by law) how this calculation should be
> done and at which places rounding should take place to some given
> precision

Right.

> (E.g.: The account is never allowed to carry any sub cent
> amounts).

Less right. :-) I mean yes, there could be situations where that's true,
but banking and insurance aren't two of them.

> I cannot see any advantage that floats could give me here,
> but I have seen the disadvantages of floats and monetary amounts
> in the past.

In the UK all intermediate financial results (in banking and insurance, at
least) have to be accurate to within plus or minus a millionth of a penny.
If you take this as your base unit, then an unsigned 32-bit integer can't
even store a 43-pound (or dollar) balance.

>> If you can do this calculation accurately in 32-bit integers (and that
>> also means no bignums, by the way), you deserve a pat on the back.
> So use 64-bit integers.

In practice, in the real financial world, doubles are routinely used. (So,
at the other extreme, is binary coded decimal!)

> BTW.: Isn't something as simple as 0.1 not representable exactly as
> IEEE 754 floating point?

It's certainly not representable in pure binary. The closest you can get in
16 bits, for example, is 6553/65536, which is 0.0999908447265625 - not too
shabby, really. On my system, a double appears to respond to a request to
store 0.1 by storing this instead:

0.1000000000000000055511151231257827021181583404541015625

which is certainly adequate for financial work.

> With the following Seed7 program I found out that 8071 summations
> of 0.1 with single precision floating points (32 bits wide) cause
> an error of 0.1 (when rounded to 0.1):

Yes. Single-precision floats don't cut it.

> Accountants sometimes spend days with searching for some tiny amount
> in some big calculation (such as a 1€ difference in calculations
> with a total amount of 1234567€).
>
>> > I have removed such inaccuracies by deciding for a unit of 1c or
>> > 0.01c and using an int or long to represent the amount.
>> > I don't say that this solution works always but I was able to remove
>> > all inaccuracies once and for all.
>>
>> It doesn't work for rollup calcs in the general case.
>
> Why?

Let's imagine a bank account that pays 5% pa, calculated daily (so that's
0.01336806171134% per day on the close-of-day balance, applied 365 days a
year or 366 days in leap years) on accounts with 500 or more coming in
each month, but the interest rate is significantly higher for balances of
300,000 or more. A customer opened an account on 5/1/2001 with the minimum
balance of 10, but later that same day he paid in his wages of 500,
and every 30th day thereafter he has paid in 500. (On the 900th day that
the account was open, the customer received an inheritance of 125537.20
which he paid into the account. This is the only break in the savings
pattern, though.) Assuming he keeps up this pattern, on what day will the
new rate cut in for this customer?

If you do this calculation using 32-bit integers, I would be very surprised
indeed if you manage to get the correct date.

Bartc

Aug 5, 2008, 10:24:08 AM

"Richard Heathfield" <r...@see.sig.invalid> wrote in message
news:fOOdndxzcNDHwgXV...@bt.com...

> Let's imagine a bank account that pays 5% pa, calculated daily (so that's
> 0.01336806171134% per day on the close-of-day balance, applied 365 days a
> year or 366 days in leap years) on accounts with 500 or more coming in
> each month, but the interest rate is significantly higher for balances of
> 300,000 or more. A customer opened an account on 5/1/2001 with the minimum
> balance of 10, but later that same day he paid in his wages of 500,
> and every 30th day thereafter he has paid in 500. (On the 900th day that
> the account was open, the customer received an inheritance of 125537.20
> which he paid into the account. This is the only break in the savings
> pattern, though.) Assuming he keeps up this pattern, on what day will the
> new rate cut in for this customer?

On the day the balance reaches or passes 300,000. If the actual day that
happens is wrong by a day or two (and I make it a couple of decades in the
future), who's ever going to know?

--
Bartc

Richard Heathfield

Aug 5, 2008, 10:34:30 AM

Bartc said:

You are out by a number of *years*.

> who's ever going to know?

The internal auditors, who can slap your wrist if you get it wrong. The
external auditors, who might well get you fired if you get it wrong. The
customer (if he's sufficiently clued-up, and he might well be - bankers
and actuaries have bank accounts too!), who can take you to court if you
get it wrong. Or the Financial Services Authority, who can actually close
you down if you get it wrong.

thomas...@gmx.at

Aug 5, 2008, 11:30:27 AM

On 5 Aug., 16:15, Richard Heathfield <r...@see.sig.invalid> wrote:

> thomas.mer...@gmx.at said:
>
> <snip>
>
> > AFAIK there are exact rules (by law) how this calculation should be
> > done and at which places rounding should take place to some given
> > precision
>
> Right.
>
> > (E.g.: The account is never allowed to carry any sub cent
> > amounts).
>
> Less right. :-) I mean yes, there could be situations where that's true,
> but banking and insurance aren't two of them.
>
> > I cannot see any advantage that floats could give me here,
> > but I have seen the disadvantages of floats and monetary amounts
> > in the past.
>
> In the UK all intermediate financial results (in banking and insurance, at
> least) have to be accurate to within plus or minus a millionth of a penny.
> If you take this as your base unit, then an unsigned 32-bit integer can't
> even store a 43-pound (or dollar) balance.

An IEEE 754 double-precision 64 bit has a mantissa of 52 bits. It
would not be precise to a millionth of a penny at approx 45035996.
For an account this is much, but for an insurance company it may be
necessary to deal with such an amount.

If you get a bill you don't expect the amounts to be printed with an
accuracy of a millionth of a cent/penny. I guess that the smallest
amount allowed on bills (by law) in the final sums will be a cent
(or penny). For other lines there will be let's say 0.01c precision.
The bill can contain many lines with subtotals and totals as net
and gross (incl. tax) amounts. You can always take your calculator
(or a sheet of paper) and recalculate the total sum according to
some rules (e.g. sum up the net amounts and calculate the tax from
the sum). If you cannot verify the sum up to the cent the company has
troubles. In my country there are really people who check bills this
way.

I was involved in creating such bills for a mobile phone company.
The bills for big customers contain lots and lots of lines. As long
as doubles were used, some tiny (cent) differences showed up from
time to time. The moment I switched to long integers these
differences disappeared and they never came back.

> >> If you can do this calculation accurately in 32-bit integers (and that
> >> also means no bignums, by the way), you deserve a pat on the back.
> > So use 64-bit integers.
>
> In practice, in the real financial world, doubles are routinely used. (So,
> at the other extreme, is binary coded decimal!)
>
> > BTW.: Isn't something as simple as 0.1 not representable exactly as
> > IEEE 754 floating point?
>
> It's certainly not representable in pure binary. The closest you can get in
> 16 bits, for example, is 6553/65536, which is 0.0999908447265625 - not too
> shabby, really. On my system, a double appears to respond to a request to
> store 0.1 by storing this instead:
>
> 0.1000000000000000055511151231257827021181583404541015625

For single precision 0.1 is stored as approx

0.10000000149 ...

From that most people would guess that summing up this value a
million times would not cause a difference of 0.1 . But as my test
showed 8071 summations are enough to create such a difference.

Therefore I am not so sure that you are on the safe side with
doubles.

> which is certainly adequate for financial work.
>
> > With the following Seed7 program I found out that 8071 summations
> > of 0.1 with single precision floating points (32 bits wide) cause
> > an error of 0.1 (when rounded to 0.1):
>
> Yes. Single-precision floats don't cut it.

I am too lazy to test it, but I guess that double precision floats
don't cut it either.

Bartc

Aug 5, 2008, 11:30:35 AM

"Richard Heathfield" <r...@see.sig.invalid> wrote in message

news:Cdadnd0Me99A_gXV...@bt.com...

> Bartc said:
>
>> "Richard Heathfield" <r...@see.sig.invalid> wrote in message
>> news:fOOdndxzcNDHwgXV...@bt.com...
>>
>>
>>> Let's imagine a bank account that pays 5% pa, calculated daily (so
>>> that's 0.01336806171134% per day on the close-of-day balance, applied
>>> 365 days a year or 366 days in leap years) on accounts with 500 or more
>>> coming in each month, but the interest rate is significantly higher for
>>> balances of 300,000 or more. A customer opened an account on 5/1/2001
>>> with the minimum balance of 10, but later that same day he paid in
>>> his wages of 500, and every 30th day thereafter he has paid in 500. (On
>>> the 900th day that the account was open, the customer received an
>>> inheritance of 125537.20 which he paid into the account. This is the
>>> only break in the savings pattern, though.) Assuming he keeps up this
>>> pattern, on what day will the new rate cut in for this customer?
>>
>> On the day the balance reaches or passes 300,000. If the actual day that
>> happens is wrong by a day or two (and I make it a couple of decades in
>> the future),
>
> You are out by a number of *years*.

That was just a guess, and I forgot the interest is much higher from day
900.

My code calculates it as 4384th day of opening (5 Jan 2013), without
applying rounding. Rounding each day to 0.01 units delayed it by 1 day.
Calculations using 64-bit 'doubles'.

What was the real answer?

--
Bartc

Richard Heathfield

Aug 5, 2008, 12:01:53 PM

thomas...@gmx.at said:

> On 5 Aug., 16:15, Richard Heathfield <r...@see.sig.invalid> wrote:
>> thomas.mer...@gmx.at said:
>>
>> <snip>
>>
>> > AFAIK there are exact rules (by law) how this calculation should be
>> > done and at which places rounding should take place to some given
>> > precision
>>
>> Right.
>>
>> > (E.g.: The account is never allowed to carry any sub cent
>> > amounts).
>>
>> Less right. :-) I mean yes, there could be situations where that's
>> true, but banking and insurance aren't two of them.
>>
>> > I cannot see any advantage that floats could give me here,
>> > but I have seen the disadvantages of floats and monetary amounts
>> > in the past.
>>
>> In the UK all intermediate financial results (in banking and insurance,
>> at least) have to be accurate to within plus or minus a millionth of a
>> penny. If you take this as your base unit, then an unsigned 32-bit
>> integer can't even store a 43-pound (or dollar) balance.
>
> An IEEE 754 double-precision 64 bit has a mantissa of 52 bits. It
> would not be precise to a millionth of a penny at approx 45035996.

Right. I'm not saying 64-bit doubles are always good enough. They aren't.
What I'm saying is that there are situations where 32-bit integers won't
cut it.

<snip>

> If you get a bill you don't expect the amounts to be printed with an
> accuracy of a millionth of a cent/penny.

Right - if you're just dealing with ordinary transactions where interest is
not an issue, and where the values aren't too colossal, integers work just
fine.

<snip>

>> > BTW.: Isn't something as simple as 0.1 not representable exactly as
>> > IEEE 754 floating point?
>>
>> It's certainly not representable in pure binary. The closest you can get
>> from below in 16 bits, for example, is 6553/65536, which is
>> 0.0999908447265625 - not too shabby, really. On my system, a double
>> appears to respond to a request to store 0.1 by storing this instead:
>>
>> 0.1000000000000000055511151231257827021181583404541015625
>
> For single precision 0.1 is stored as approx
>
> 0.10000000149 ...

Yes, but you don't use single precision if you care about the accuracy of a
long rollup calculation!

> From that most people would guess that summing up this value a
> million times would not cause a difference of 0.1 . But as my test
> showed 8071 summations are enough to create such a difference.

This demonstrates why most people should leave financial programming well
alone. :-)

> Therefore I am not so sure that you are on the safe side with
> doubles.

It is wise to doubt when you don't know. But I *am* sure - because I've
been there, done that, had the fights with the actuarial people, been
through the audit process, the lot.

<snip>

Richard Heathfield

Aug 5, 2008, 12:06:58 PM

Bartc said:

>
> "Richard Heathfield" <r...@see.sig.invalid> wrote in message
> news:Cdadnd0Me99A_gXV...@bt.com...
>> Bartc said:
>>
>>> "Richard Heathfield" <r...@see.sig.invalid> wrote in message
>>> news:fOOdndxzcNDHwgXV...@bt.com...
>>>
>>>
>>>> Let's imagine a bank account that pays 5% pa, calculated daily (so
>>>> that's 0.01336806171134% per day on the close-of-day balance, applied
>>>> 365 days a year or 366 days in leap years) on accounts with 500 or
>>>> more coming in each month, but the interest rate is significantly
>>>> higher for balances of 300,000 or more. A customer opened an account
>>>> on 5/1/2001 with the minimum balance of 10, but later that same day
>>>> he paid in his wages of 500, and every 30th day thereafter he has
>>>> paid in 500. (On the 900th day that the account was open, the customer
>>>> received an inheritance of 125537.20 which he paid into the account.
>>>> This is the only break in the savings pattern, though.) Assuming he
>>>> keeps up this pattern, on what day will the new rate cut in for this
>>>> customer?
>>>
>>> On the day the balance reaches or passes 300,000. If the actual day
>>> that happens is wrong by a day or two (and I make it a couple of
>>> decades in the future),
>>
>> You are out by a number of *years*.
>
> That was just a guess,

Ah! :-)

> and I forgot the interest is much higher from day 900.

Aye.

>
> My code calculates it as 4384th day of opening (5 Jan 2013), without
> applying rounding. Rounding each day to 0.01 units delayed it by 1 day.
> Calculations using 64-bit 'doubles'.

With 64-bit doubles, I'd expect you to get the right answer, and indeed you
did.

> What was the real answer?

As you calculated. 4384 days is precisely 12 years (including 3 leap
years).

I would expect integer-based calcs to slip the date by at least one day and
possibly more.

James Harris

Aug 5, 2008, 12:19:34 PM

Whoa! Alarm bells should be ringing loudly here. "Who's ever going to
know?"??? I guess most people would not want their banks to get the
figures almost right. It's part of their duty of trust to calculate
correctly or at least to within defined parameters.

FWIW in my experience of writing software in the UK banking industry
we used decimal arithmetic for financial amounts. It was so long ago
that I cannot recall whether we calculated to pence or to some
specific decimal fraction of a penny, but I'm sure details of the
calculations were prescribed.

Imagine the bad publicity if a bank was found to have miscalculated
even slightly in their favour for a number of years. It's not so much
the financial losses to customers if they are negligible but the
appearance of incompetence of the bank. If banks can't calculate
finances correctly, ....

BTW, have you seen HawkEye used in tennis tournaments to rule on
whether a ball was in or out? I can't help feeling I'd rather see
three such machines from different manufacturers each give a verdict.
Then a majority decision could be taken. And the closeness of the
other two should give a measure of confidence or lack thereof in the
trajectories shown. At the moment full trust is placed in one computer
program. Not ideal, though it is the same for both players. OK I know
this is a little OT.

Rod Pemberton

Aug 5, 2008, 3:59:09 PM

"Mike Sieweke" <msie...@ix.netcom.com> wrote in message

news:msieweke-5AC5BC...@bignews.bellsouth.net...

> A lot
> of people thought "true" should be represented as -1.

I agree absolutely 100%. I know why. Do you understand why?
(It's outside the scope of the C language...)

> When you assign the integer
> "0" to a boolean variable, it casts the integer as a boolean, and
> stores it as 0 or 1 in binary. That's very strange.

Why? I don't understand why this is strange. This is exactly the
representation of boolean I've been using since the late '70's... Just as I
don't understand why you chose to limit boolean values to "true" and
"false"....

These are straight from Wikipedia:

1) "...operations on the set {0,1}..."
2) "Boolean algebra is the algebra of two values. These are usually taken to
be 0 and 1, as we shall do here, although F and T, false and true, etc. are
also in common use."

> > > > > The "if" statement would take a boolean argument instead
> > > > > of an integer.
> > > >
> > > > The "if" statement in C currently takes a boolean value. The result
> > > > of an expression comparison with zero:
> > > > 1) false is zero
> > > > 2) true is non-false
> > >
> > > The "if" statement doesn't take a boolean value,
> >
> > False. It accepts two "values": zero, and non-zero. This is a boolean
> > value representation irrespective of type.
>
> "Non-zero" is not a value. It is a comparison (x!=0).
>

The result of a "!=" (not equal) comparison in C is 0 or 1. The reason the
if() tests for non-zero rather than for 1, as I stated previously elsewhere,
is assembly language.
Specifically, it may only need a single state to effect a branch or jump.
Therefore, the other state can be optimized as non-zero instead of one.
Requiring the use of boolean will produce larger assembly code.

> Here's what the C99 standard has to say about this:

I already posted this previously, in support of my argument. ;)

> So the standard says that the if statement doesn't take a boolean
> value. The value may even be a float.

It must still do the comparison, which results in boolean values, prior to
effecting the conditional. But, since the if() doesn't need both, just one,
the compiler can optimize away the additional boolean logic.

> > > it ["if" statement] was defined
> > > when the language had no boolean type.
> >
> > True.
> >
> > > It takes an integer value.
> >
> > False. It takes an integer type. The values are boolean - just two
> > states.
>
> No. You must learn the difference between a type and a value. Please
> read the Wikipedia entry for "type (computer science)". "A type is an
> attribute of a datum."

A type is a container for data. A type is an artificial construct used to
separate the plurality of values that a cpu integer can represent: signed
integer, unsigned integer, differently sized integers (8-bits, 16-bits,
...), split registers, pointers, etc.

> You don't pass an "integer type" to a function;
> you pass a value of an integer type. At least this is true in C.

True. You should wonder why I agree. I.e., you seem confused to me about
what I said.

IMO, you're confusing "true" and "false" with boolean. Boolean is just two
states. You can see the quotes above that came from Wikipedia. 0 and 1 are
the most common representation of boolean. (Think logic circuitry...)

> From the C99 standard:
>
> <quote>
> 6.3.1.2 Boolean type
> ------------
> 1 When any scalar value is converted to _Bool, the result is 0 if the
> value compares equal to 0; otherwise, the result is 1.
> </quote>
>
> According to the C99 standard, "0" and "1" are scalar values. So
> boolean assignment is defined as a conversion from a scalar value
> to an underlying machine representation

No. There is no _conversion_ of a scalar value to a C type, or its
underlying machine representation. The C type, with its underlying machine
representation, can already represent the scalar values. It would be
boolean to boolean. I strongly doubt anyone would allow a size reduction
via implicit cast or an integer demotion rule. The usual purpose of
implicit casts or integer promotion is to increase the type's size to
something realistic for the language and cpu.

> > > Yes, more type checking. The compiler could tell you this isn't
> > > correct:
> > > x = (y < 5) + 2;
> >
> > If I coded that, it's correct. How does the compiler know this isn't
> > correct since it can't know what I intended to code?
>
> It's much more likely that this was a typographical error than an obscure
> idiom.

Irrelevant. It's what was coded.

> > > if you meant to type
> > > x = (y << 5) + 2;
> >
> > Who says I meant to type that? The compiler doesn't have the right to
> > decide which of these two is correct. The code, which I wrote, is what
> > decides.
>
> The compiler doesn't know which is correct, but it knows which is legal.

Both are.

> so the compiler
> can help the programmer.

A worthy goal, but impractical. Kind of like chasing windmills...

> The wise language designer makes common errors illegal,

The language designer can't fully expect to know the outcomes, no matter how
wise.

> > If the language has insufficient syntax, then there is a problem with the
> > syntax that needs to be addressed, not the types or type system. E.g.,
> > require additional parens around boolean result, etc. instead of
> > requiring a boolean type. E.g., (from your comments immediately below)
> > change the assignment operator to something safer, or use a #define to
> > do so.
>
> These features already exist in languages you don't know.

They exist in languages I do know. That's why I mentioned that the syntax
can be fixed instead of introducing a new type.

> I assumed
> these features were common knowledge. My advice is to learn another
> language that doesn't follow the "C" tradition.

Were you not following this thread? I have experience, at some point or
other, in about fourteen languages.

> Something in the Lisp
> family and

That I don't.

> something in the Pascal family.

That I do (some 20+ years ago...). And, the "if(not zero)" idiom worked
properly then... which I learned prior to Pascal.

> Maybe even C#. Not Java.
> Learn them well enough that you can use their unique idioms instead of
> just transliterating C.

C is one of the better ones. FORTH and Postscript require too much work
keeping track of the stack state. And, can be confusing due to lack of
syntax. FORTRAN as I learned it had almost no string ability and required
code in specific columns... Horrible language. BASIC as I learned it was
very powerful with strings, but limited in many other areas. Although, its
line oriented-ness was useful in a manufacturing setting, just for its flow
control. Pascal as I learned it had no real pointers, although it had a
variable type that could be used like one in a very limited sense. It had
no ability to access memory or the operating system directly. It was good
for learning structured programming and that's about it. It was weak even
compared to BASIC. PL/1 as I learned it, a variant unlike IBM's PL/1, which
looked more like Pascal, was very powerful. It had pointers which could
directly access memory or objects like C. It was pass-by-reference by
default. C should've been designed that way. Superb feature. (Of course,
pointers are presenting a problem to optimizing compilers today...)

> > > The compiler could detect an error typing "if (x=y)" when you meant to
> > > type "if (x==y)" (unless x and y are _Bool, but that's a different
> > > topic).
> >
> > This is a problem with C. Assignment and comparison should've had
> > uniquely searchable operators. I.e., hard to distinguish "=" from "=="
> > by text search... E.g., could've used ":=" for assignment.
>
> The problem is not in the assignment syntax.

Fixing the *syntax* so that assignment is searchable, also fixes *your*
problem. That was my point.

> The problem is in
> letting the assignment statement return a value.

This is solved by using colon equal ":=", or any two character symbol other
than "==".

> The problem is
> in the language feature that makes this legal:
> x = y = z = 4;

Is it? You just stated the problem was distinguishing a likely erroneous
assignment "if(x=y)" from a probably correct comparison "if(x==y)". So, why
are you changing the problem?

> As I stated in an earlier message, there is an entire family of languages
> (the Pascal family) that have distinct values for "true" and "false",
> which are incompatible with integer types. In all these languages the
> "if" statement takes a boolean value.

Well, it's been twenty some years. I'll take your word for it... for now.

> If the assignment statement doesn't return a value, and if the "if"
> statement only accepts a boolean value, and if comparison operators
> don't return integers, then this common error will always be caught
> by the compiler:
> if ( x = 4 ) // meant to type "=="

Why go to all that trouble? If the assignment operator is colon-equal,
":=", this error will be caught:

if(x:=4) /* this can't be mistaken for "if(x==4)" */

> The cost is a few extra comparisons: "if (i!=0)" instead of "if (i)".

To implement it natively, the cost is a reimplementation of the C compiler.
Reworking of all language logic. Reworking of the type system. Breaking of
all existing C code. etc.

> The benefit is worth it in my opinion.

#define := =

Require everyone to use :=

> > What's wrong with #define's and integers? You have to remember that
> > many C compilers didn't properly support enum's, struct's, bitfields,
> > etc. well into the '90's...
>
> And yet they went to the trouble to add them to C99.

C89...?

struct's are K&R
enum's are ANSI C89/ISO C90

> Somehow they thought this feature was valuable enough to complicate the
> language and add a new reserved word.

The point was that although present in the language, they didn't work
properly in many compilers.

> And when they did C++ they added
> strong typing, too.

All this does is prevent programmers from programming. It prevents the
programmer from converting values from the type they have into the type they
need. If implicit conversions aren't available, then the programmer has to
learn how to convert or cast properly. From what I've seen, most don't do
well with casts. Then they spend much time trying to figure out worthless
compiler error messages.

> Hmm... What could they have been thinking?

Who knows... A few members of the original X3J11 team have publicly stated
they made quite a few mistakes.

> When
> you can fully appreciate the answer to this question, then I'll continue
> this discussion.

You're acting like you're the one teaching me, but since you're the one
making all the mistakes...

> > Typically, that's done with preprocessor #define's in C, as you already
> > know.
>
> Then you have a maintenance problem. It's up to the programmer to
> make sure no two names share the same value. The compiler won't
> help.

I'll borrow your line: if a programmer can't write down sequentially
increasing numbers, fire them... I.e., if they can't do that, how can you
expect them to know the difference between "true" and "false"? How to clear
a bit, set a bit, invert a bit? Make a decision? Etc...

> > Everywhere you'd see a C programmer use #define's and integers.
>
> Even C programmers use enum's. They're in the language for a
> reason.

Much stuff is in the C language. Much of it is worthless. Much of it was
poorly chosen by committee. The C obfuscation competitions are proof...

So, if it fails to catch your coding error, it's not your problem, but the
compiler's problem. I.e., job security for you... It's not really about
weeding out errors is it?

> With strong typing the compiler will even guarantee that no one can
> pass a bad parameter to my function *. I think there's a lot of
> value in that.

There's a lot of value in learning how not to code poorly because you're
relying too much on the compiler.

> (4) Are you seriously suggesting that the maintenance programmer would
> use my "SetCpuClockDivider()" function without checking the header
> file first? Fire him. Now. I'm not joking.

Given historical C coding, a maintenance programmer could legitimately
assume that the value of kDivideBy8 is 8 from a #define and not 3 from an
enum. In which case, you have a problem. One created by your coding style
being different...

But, the point was that in large applications, the code becomes disjointed.
In a really large application, a few cut-n-pastes, twenty plus years of
coding and numerous coders with widely varying skills and intellect, and a
wrong name or two, and now your maintenance programmer is looking through
untold millions of lines trying to find out which header defined the
variable. If only they'd left it in the same file..., you now wish.

> Are you suggesting that there won't be a header file for every C
> source file (except for the main module)? Inconceivable.

That depends. I don't use header files for my personal code. I want the
declarations near to their use. In the very large PL/1 application I worked
on a number of years ago, the main code was well separated, but the
"headers" weren't organized at all and had little or no real correlation to
the file(s) that used them. Some of them even had random-ish names because
they became so large.

> One purpose of a header file is to document each function's purpose,

Should be at the functions... so the programmer doesn't have to locate the
header file too.

> so the programmer doesn't need to look at the source.

If he's not looking at the source, and you have an integer variable, how
does he determine the range or set of acceptable values? I.e., the "black
box" of code is your perspective, which drives your use of enum's...

Rod Pemberton

Aug 5, 2008, 3:59:40 PM

"Richard Harter" <c...@tiac.net> wrote in message
news:48971d0c...@news.sbtc.net...

> On Mon, 4 Aug 2008 05:25:21 -0400, "Rod Pemberton"
> <do_no...@nohavenot.cmm> wrote:
> >"Mike Sieweke" <msie...@ix.netcom.com> wrote in message
> >news:msieweke-78C1A1...@bignews.bellsouth.net...
> >> "Rod Pemberton" <do_no...@nohavenot.cmm> wrote:
> >> > "Mike Sieweke" <msie...@ix.netcom.com> wrote in message
> >> > > In a language with a boolean type, expressions
> >> > > like "4<5" or "x == y" would return a boolean value, and not an
> >> > > integer.
> >> >
> >> > "would return a boolean value" vs. "would return a boolean type"...
> >> >
> >> > I think it should be "boolean type" there. C already returns a
> >> > boolean value for logical comparisons: 0 or 1. It doesn't return a
> >> > boolean type.
> >>
> >> No. "0" and "1" are integers.
> >
> >Ok, we're hung up on the terminology of "boolean value". 0 and 1 are
> >boolean values irrespective of type. I.e., there are only two possible
> >resultant values. Boolean values can be stored in any type, not just a
> >boolean type, that allows two values to be stored: integer, boolean, etc.
>
> This isn't quite right. The boolean values are the two possible
> truth values, true and false. It is convenient to use 0 and 1 to
> represent them in boolean algebra.

No... Boolean values are two values. Here, these are straight from
Wikipedia:

> You are right that they can
> be stored in anything that allows two values to be stored.
> However you need more than just storing them; you need for the
> storage type to have the same algebraic properties as boolean
> algebra,

As integers...

> and this is where the C conventions are problematic.
>
> Firstly, C does not use {0,1} for {false,true}, it uses
> {zero,non-zero}.

C does use 0 and 1 for results of comparison and other operators. It just
doesn't use them for if() (and most likely other flow control...).

> Secondly, although anything can be a boolean value, the
> comparison operators deliver integer values restricted to {0,1}

Okay, you got around to it... Do you see how this is correct and contrary
to what you stated above?

> with the consequence that we can intermix arithmetic operations
> and boolean operations, e.g., ((x<y)+(y<z))!=1. (Hack test for
> order transitivity; do not do this at home.)

Are you saying that you aren't supposed to be able to "intermix" arithmetic
and boolean operations? They disagree with you. In fact, they use
arithmetic of booleans as a primary example:
http://en.wikipedia.org/wiki/Boolean_algebra_%28logic%29

> For example there is no equality comparison operator for boolean
> expressions. Thus, x==y, will test true if x and y are of the
> same type (modulo various promotion hacks) and have the same
> value; we can't use == to test for equivalent truth value, even
> though equality is a legitimate operator in boolean algebra.

If you're setting x and y to 0 or 1 only, then the equality still holds
true.

> Of
> course you can use x&&y instead or (!!x)==(!!y).

If there is some chance that x and y might be set to values other than zero
and one, that's safe...

> Frex, one of the little traps is that it is problematic to define
> #define FALSE 0
> #define TRUE 1
>
> because the test
> if (x == TRUE)
>
> doesn't work right.

Under what conditions? If x is restricted to TRUE and FALSE, ...

Rod Pemberton

Bartc

Aug 5, 2008, 5:51:04 PM

"James Harris" <james.h...@googlemail.com> wrote in message
news:8b05232a-bc64-4088...@a70g2000hsh.googlegroups.com...

> On 5 Aug, 15:24, "Bartc" <b...@freeuk.com> wrote:
>> "Richard Heathfield" <r...@see.sig.invalid> wrote in message
>>
>> news:fOOdndxzcNDHwgXV...@bt.com...
>>
>> > Let's imagine a bank account that pays 5% pa, calculated daily (so
>> > that's 0.01336806171134% per day on the close-of-day balance, applied
>> > 365 days a
>> > ...on what day will the new rate cut in for this customer?
>>
>> On the day the balance reaches or passes 300,000. If the actual day that
>> happens is wrong by a day or two (and I make it a couple of decades in
>> the future), who's ever going to know?
>
> Whoa! Alarm bells should be ringing loudly here. "Who's ever going to
> know?"??? I guess most people would not want their banks to get the
> figures almost right. It's part of their duty of trust to calculate
> correctly or at least to within defined parameters.

This was more to do with this specific example, with an error of a day or so
after 12 years. I'm sure I've never been able to verify my own banking
figures that accurately.

> FWIW in my experience of writing software in the UK banking industry
> we used decimal arithmetic for financial amounts. It was so long ago

Any simple invoice showing VAT [17.5% sales tax] per item is likely to show
differences if you try and verify the workings with a calculator, because of
having to round intermediate values to the nearest penny. I doubt decimal
arithmetic would have helped here.

This is always going to be a problem: in a shop displaying prices in both
local and say US$, one of the two prices is likely to be up to 0.5 cents wrong
due to rounding.

> Imagine the bad publicity if a bank was found to have miscalculated
> even slightly in their favour for a number of years. It's not so much
> the financial losses to customers if they are negligible but the
> appearance of incompetence of the bank. If banks can't calculate
> finances correctly, ....

They only have to explain all the workings in the small print of the account
T&Cs. I'm more concerned with the subterfuges used by credit card companies
for example which have cost me real money, rather than minor discrepancies
due to rounding methods, which I think the public understand.

--
Bartc

Mike Sieweke

Aug 5, 2008, 9:02:22 PM

In article <g7abgj$oic$1...@aioe.org>,
"Rod Pemberton" <do_no...@nohavenot.cmm> wrote:

> "Mike Sieweke" <msie...@ix.netcom.com> wrote in message
> news:msieweke-5AC5BC...@bignews.bellsouth.net...

I thought for a while that you were just having fun playing the
devil's advocate. If that's not true, the alternative is more
disturbing.

Richard Harter

Aug 7, 2008, 10:47:36 PM

On Tue, 5 Aug 2008 15:59:40 -0400, "Rod Pemberton"
<do_no...@nohavenot.cmm> wrote:

You didn't read the very first paragraph. I quote:

Boolean algebra (or Boolean logic) is a logical calculus of truth
values, developed by George Boole in the late 1830s. It resembles
the algebra of real numbers as taught in high school, but with
the numeric operations of multiplication xy, addition x + y, and
negation -x replaced by the respective logical operations of
conjunction x∧y, disjunction x∨y, and complement ¬x. The Boolean
operations are these and all other operations that can be built
from these, such as x∧(y∨z)

Notice the first sentence - a logical calculus of truth values.

>
>> You are right that they can
>> be stored in anything that allows two values to be stored.
>> However you need more than just storing them; you need for the
>> storage type to have the same algebraic properties as boolean
>> algebra,
>
>As integers...

In a word, no. Boolean algebra has a domain of two values with
the logical operations defined on the domain. The integers have
a domain of an infinite set of values (the integers) with the
arithmetic operations defined on the domain. P-adic algebras
have a finite domain but they still are algebras with arithmetic
operations.

>
>> and this is where the C conventions are problematic.
>>
>> Firstly, C does not use {0,1} for {false,true}, it uses
>> {zero,non-zero}.
>
>C does use 0 and 1 for results of comparison and other operators. It just
>doesn't use them for if() (and most likely other flow control...).

You are confused; the argument for "if" is a logical value.
That's what "if" means. The C spec is fairly clear about this;
in contexts where a truth value is required, 0 is interpreted as
false and non-zero is interpreted as true. Incidentally, the
"0" need not be an integer 0. A null pointer will be interpreted
as 0 but the actual representation may be quite different.

>
>> Secondly, although anything can be a boolean value, the
>> comparison operators deliver integer values restricted to {0,1}
>
>Okay, you got around to it... Do you see how this is correct and contrary
>to what you stated above?

I see that what I wrote was correct and that it is not at all
contrary to what I wrote above.

>
>> with the consequence that we can intermix arithmetic operations
>> and boolean operations, e.g., ((x<y)+(y<z))!=1. (Hack test for
>> order transitivity; do not do this at home.)
>
>Are you saying that you aren't supposed to be able to "intermix" arithmetic
>and boolean operations? They disagree with you. In fact, they use
>arithmetic of booleans as a primary example:
>http://en.wikipedia.org/wiki/Boolean_algebra_%28logic%29

I read the entire article; I didn't see any such confusion.
Perhaps you could quote a passage that you think justifies your
remark.

>
>> For example there is no equality comparison operator for boolean
>> expressions. Thus, x==y, will test true if x and y are of the
>> same type (modulo various promotion hacks) and have the same
>> value; we can't use == to test for equivalent truth value, even
>> though equality is a legitimate operator in boolean algebra.
>
>If you're setting x and y to 0 or 1 only, then the equality still holds
>true.

Granted.

>
>> Of
>> course you can use x&&y instead or (!!x)==(!!y).
>
>If there is some chance that x and y might be set to values other than zero
>and one, that's safe...

Just so. If x and y were true boolean variables there would be
no such chance.

>
>> Frex, one of the little traps is that it is problematic to define
>> #define FALSE 0
>> #define TRUE 1
>>
>> because the test
>> if (x == TRUE)
>>
>> doesn't work right.
>
>Under what conditions? If x is restricted to TRUE and FALSE, ...

But you see, in C, x is not restricted to TRUE and FALSE. In
general the test

if (x == TRUE)

is not equivalent to

if (x)

In a language with a boolean type the two forms are equivalent
because you cannot compare x with TRUE unless x is a boolean
variable. (Of course, in such languages you can't #define TRUE
and FALSE.)

The sum and essence of it is that C fakes boolean variables using
type punning. It works and is convenient, but it's a hack.

Rod Pemberton

Aug 9, 2008, 8:06:54 PM

"Richard Harter" <c...@tiac.net> wrote in message

news:489bac56....@news.sbtc.net...

What? How can you make such a claim?

Don't you realize the "1)" above *CAME FROM* the first paragraph...
Perhaps, _you_ didn't read the first paragraph!

And, I had to read the first to get to the other part I quoted... Didn't I?
Sigh... (I.e., that's logical to you isn't it?)

> I quote:

You incompletely quote:

> Boolean algebra (or Boolean logic) is a logical calculus of truth
> values, developed by George Boole in the late 1830s. It resembles
> the algebra of real numbers as taught in high school, but with
> the numeric operations of multiplication xy, addition x + y, and
> negation -x replaced by the respective logical operations of
> conjunction x∧y, disjunction x∨y, and complement ¬x. The Boolean
> operations are these and all other operations that can be built
> from these, such as x∧(y∨z)
>

This is missing:

"These turn out to coincide with the set of all operations on the set {0,1}
that take only finitely many arguments; there are 2^(2^n) such operations when
there are n arguments."

> Notice the first sentence - a logical calculus of truth values.

How do you think or what makes you feel that the first sentence negates or
modifies anything that was said so far?

A definition of calculus:

"1. Mathematics. a method of calculation, esp. one of several highly
systematic methods of treating problems by a special system of algebraic
notations, as differential or integral calculus."

> Notice the first sentence - a logical calculus of truth values.

Notice this sentence - "whose interpretations"

"Boolean domain is a set consisting of exactly two elements whose
interpretations include false and true."
http://en.wikipedia.org/wiki/Boolean_domain

I.e., the only requirement is "two elements", not "true" and "false", not
unique from integers, etc.

Or, this one in the Boolean algebra page:

"For the purpose of understanding Boolean algebra any Boolean domain of two
values will do."

> >> You are right that they can
> >> be stored in anything that allows two values to be stored.
> >> However you need more than just storing them; you need for the
> >> storage type to have the same algebraic properties as boolean
> >> algebra,
> >
> >As integers...
>
> In a word, no.

In a word, yes. Integers (restricted to a set of two values as stated by
Wikipedia... i.e., Boolean domain) completely satisfy your claim of
algebraic properties directly above... They go on to demonstrate such on
the Boolean algebra page.

> Boolean algebra has a domain of two values with
> the logical operations defined on the domain.

What have we been talking about if not this?

> The integers have
> a domain of an infinite set of values (the integers) with the
> arithmetic operations defined on the domain.

The size of the domain is irrelevant, we are using only two values as our
domain...

> >> and this is where the C conventions are problematic.
> >>
> >> Firstly, C does not use {0,1} for {false,true}, it uses
> >> {zero,non-zero}.
> >
> >C does use 0 and 1 for results of comparison and other operators. It just
> >doesn't use them for if() (and most likely other flow control...).
>

...
> You are confused;

No, I'm not, but that's neither here nor there. Is it?

> the argument for "if" is a logical value.

False. The argument for if() is an expression:

"if ( expression ) statement"
"if ( expression ) statement else statement"

But, the if() accepts two "values": zero, and non-zero. This is a boolean
value representation irrespective of type and the result of an expression
comparison with zero as required per the spec. This could be implemented by
a compiler in any number of ways: a bit flag, a zero and one, a zero and all
bits one (-1), a zero and non-zero, etc.

> The C spec is fairly clear about this;

Yes, it is. It's also exactly as I've previously stated and quoted... to
you and Sieweke.

> in contexts where a truth value is required, 0 is interpreted as
> false and non-zero is interpreted as true.

That depends on the part of C you're referring to. Some C truth values are
zero and non-zero, others are zero and one. (More in depth further below.)

> Incidentally, the
> "0" need not be an integer 0.

"0" need not represent an integer... Yes. True. A "0" can represent
non-numerical syntax or have a special representation.

> A null pointer will be interpreted
> as 0 but the actual representation may be quite different.

You got _half_ of that correct: "the actual representation may be quite
different". The other half you've got backwards: "0" used in a null pointer
context represents the NULL pointer. A NULL pointer won't be interpreted as
zero but the NULL pointer. Although, it is commonly implemented as zero,
either to access memory there or as an invalid pointer check, etc. However,
the NULL pointer just needs a value distinct from the other C objects - so
it won't be recognized as one of them mistakenly - and can be non-zero.

> >> Secondly, although anything can be a boolean value, the
> >> comparison operators deliver integer values restricted to {0,1}
> >
> >Okay, you got around to it... Do you see how this is correct and contrary
> >to what you stated above?
>
> I see that what I wrote was correct and that it is not at all
> contrary to what I wrote above.
>

Sigh...

This is correct:

"Secondly, although anything can be a boolean value, the comparison
operators deliver integer values restricted to {0,1}"

This is incorrect:

"Firstly, C does not use {0,1} for {false,true}, it uses {zero,non-zero}."

The C spec. has this language, usually in regards to equality and relational
operators:
"...shall yield 1 if the specified relation is true and 0 if it is false."

Therefore, C does use {0,1} for {false,true}.
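
A small sketch of both conventions side by side, in case it helps: the operators produce exactly 0 or 1, while if() only asks whether its expression compares unequal to zero. The function names branch_taken and check_truth_values are made up for this example; the asserted facts are what the quoted spec wording requires.

```c
#include <assert.h>

/* Returns 1 when if() takes the true branch for v, else 0. */
int branch_taken(int v)
{
    if (v)
        return 1;
    return 0;
}

/* Every assertion here should hold on a conforming C implementation. */
void check_truth_values(void)
{
    assert((3 < 5) == 1);       /* relational operators yield exactly 1 ... */
    assert((5 < 3) == 0);       /* ... or exactly 0                         */
    assert((4 == 4) == 1);      /* equality operators do the same           */
    assert(!7 == 0);            /* logical negation also yields 0 or 1      */
    assert(!0 == 1);
    assert(branch_taken(1));
    assert(branch_taken(-3));   /* if() accepts any non-zero value as true  */
    assert(!branch_taken(0));
}
```

Calling check_truth_values() from main() should return silently.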

> >> For example there is no equality comparison operator for boolean
> >> expressions. Thus, x==y, will test true if x and y are of the
> >> same type (modulo various promotion hacks) and have the same
> >> value; we can't use == to test for equivalent truth value, even
> >> though equality is a legitimate operator in boolean algebra.
> >
> >If you're setting x and y to 0 or 1 only, then the equality still holds
> >true.
>
> Granted.

Oh, now you agree...

> >> Of
> >> course you can use x&&y instead or (!!x)==(!!y).
> >
> >If there is some chance that x and y might be set to values other than zero
> >and one, that's safe...
>
> Just so. If x and y were true boolean variables there would be
> no such chance.

C99 has boolean variables. And, according to other statements from you, the
logic above is still required for C99, i.e., you're contradicting yourself
here.
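
For reference, a minimal sketch of the (!!x)==(!!y) form discussed above. The name same_truth is an assumption for this example. Note that x&&y on its own yields 0 when both operands are false, so only the double-negation form is shown as an equivalence test.

```c
#include <assert.h>

/* Non-zero when x and y have the same truth value under C's rules. */
int same_truth(int x, int y)
{
    return (!!x) == (!!y);
}

/* Contrast plain == with the normalised comparison. */
void check_same_truth(void)
{
    assert((2 == 5) == 0);          /* both "true", yet not equal as integers */
    assert(same_truth(2, 5) == 1);  /* normalised to 1 == 1                   */
    assert(same_truth(0, 0) == 1);
    assert(same_truth(0, 3) == 0);
    assert((0 && 0) == 0);          /* && is not an equivalence test          */
}
```

If x and y only ever hold 0 or 1, same_truth(x, y) and x == y agree, which is the point conceded above.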

> >> Frex, one of the little traps is that it is problematic to define
> >> #define FALSE 0
> >> #define TRUE 1
> >>
> >> because the test
> >> if (x == TRUE)
> >>
> >> doesn't work right.
> >
> >Under what conditions? If x is restricted to TRUE and FALSE, ...
>
> But you see, in C, x is not restricted to TRUE and FALSE.

What you're attempting to state is that because C doesn't provide a
numerical domain distinct from and non-overlappable with the integer domain
to represent booleans, then booleans can't be used properly in C. This is
false. They can be used properly. But, they can also be misused,
intentionally or accidentally. Proper usage of a type is not a requirement
necessary to properly implement a type... All that's required is a proper
type, and proper programming.

So... False. If you're using a C99 compiler and x is declared as _Bool,
it's restricted to 0 and 1, which are perfect Boolean algebra and logic
representations of FALSE and TRUE, both numerically and domain-wise...
I.e., it's restricted to TRUE and FALSE, respectively.

"An object declared as type _Bool is large enough to store the values 0 and
1." n1256 draft 6.2.5 sub 2

> In
> general the test
> if (x == TRUE)
>
> is not equivalent to
>
> if (x)

This depends...

If x is a _Bool, the result depends on how you defined "TRUE"... I.e., did
you define "TRUE" to be "1", and "FALSE" to be "0"?

If x is an integer, that depends on how you defined "TRUE" and "FALSE", and
whether you properly assigned x to TRUE or FALSE only.
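
A sketch of the two cases, assuming a C99 compiler and the TRUE/FALSE definitions quoted earlier. The function names are invented for illustration. Passing 2 to a _Bool parameter converts it to 1 before the comparison happens, while the int version compares the raw value.

```c
#include <assert.h>

#define FALSE 0
#define TRUE  1

int int_equals_true(int x)    { return x == TRUE; }
int bool_equals_true(_Bool x) { return x == TRUE; }
int plain_test(int x)         { return x ? 1 : 0; }

void check_true_comparisons(void)
{
    assert(int_equals_true(1) == 1);
    assert(int_equals_true(2) == 0);    /* 2 passes if (x), yet 2 != TRUE  */
    assert(plain_test(2) == 1);
    assert(bool_equals_true(2) == 1);   /* converted to 1 on the way in    */
    assert(bool_equals_true(FALSE) == 0);
}
```

So with an int, if (x == TRUE) matches if (x) only as long as x is kept to 0 and 1; with _Bool the conversion enforces that restriction.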

> In a language with a boolean type

You're referring to languages other than C, but C99 has a boolean type...

> In a language with a boolean type
> the two forms are equivalent

The two forms are equivalent in C. This is independent of whether you have
a boolean type or integer. It depends on how TRUE and FALSE were defined
and used.

> In a language with a boolean type
> the two forms are equivalent
> because you cannot compare x with TRUE unless x is a boolean
> variable.

False. C99 is case in point.

> (Of course, in such languages you can't #define TRUE
> and FALSE.)

And, how does C99's _Bool fit into your broken assumptions?

> The sum and essence of it is that C fakes boolean variables using type
> punning.

What you're attempting to state is that C doesn't support native use of
booleans via its flow control, conditional operators, etc., and therefore,
while C has a true boolean type, the C language doesn't properly or fully
support the use of booleans. But, that's not even close to what you actually
said...

So... False. C99's _Bool is a true boolean type as defined both by
Wikipedia and _you_:

1) It has two distinct values: 0 and 1.
2) Its domain is restricted to only those two values.
3) The two values have "interpretations" that are TRUE and FALSE.

Rod Pemberton

Richard Harter

Aug 10, 2008, 1:40:57 AM8/10/08

On Sat, 9 Aug 2008 20:06:54 -0400, "Rod Pemberton"
<do_no...@nohavenot.cmm> wrote:

[snip reiterated misunderstandings of computer science]

Thank you for your comments. I see no point in further
discussion.

Robert Maas, http://tinyurl.com/uh3t

Aug 30, 2008, 3:11:48 AM8/30/08

> From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
> Languages that can do real work don't usually have both
> interpreters and compilers available for them. They usually have
> one or the other.

I think we're talking past each other due to some misleading
jargon. The word "interpreter" has come to mean the read-eval-print
loop, which is itself no longer exactly that in most
implementations, more like a read-compile-execute-print loop. Still
there are two modes of compilation, block compilation used to
compile entire files, and JIT-compilation used to compile each
read-in expression during interactive work. If you can think of a
new term to refer to the REP loop and its inner workings, that
might help our communication.

Common Lisp definitely can do real work, and definitely has a
compiler, and IMO whether it also has an interpreter or something
like an interpreter or something called "interpreter" as a
misnomer, is not really important so long as it has a
read-somehowExecute-print loop for interactive work, where the
semantics of interactive work are nearly identical with the
semantics of block-file-compiled code.

> > > But, if it can't produce an executable binary, it has little
> > > value to me.
> > There's no such thing as an executable in a vacuum.
> True. "There's no such thing as an" *interpreter* "in a
> vacuum" either... Uh, just where did this vacuum you decided to
> fill [snipped] come from? ;)

I was simply commenting on your misleading use of the word
"executable" as if it had some absolute meaning independent of
context. Common Lisp does in fact produce executable code most of
the time anything is read-somehowExecute-printed or compiled, if
you understand that the meaning of "executable" is always relative,
never absolute.

> > Class files and FASL files as native executable formats, then both
> ...
> > as native executable formats
> Ah, but, that is the entire problem with your statements. These
> aren't "native executable formats", i.e., binaries. They're
> interpreted code.

Only if you're running them on a machine (CPU) that doesn't support
JVM code as part of its instruction set, i.e. most stock hardware
(CPUs). For example, it should be possible to change the micro-code
in a stock CPU so that it executed JVM instead of Intel x86 as its
"machine language". Since the x86 instruction set is micro-coded in
the first place, and *that* is considered native code when running
on such a micro-coded machine (while it is considered non-native
code when emulated on some other CPU type), then JVM code should
just as naturally qualify as native code when running on a
JVM-micro-coded CPU and non-native code when emulated on other
CPUs. The same goes for FASL or other compiled Lisp code, although
in that case the usual problem is the executable header format (ELF
vs. FASL) rather than the body of code itself. But considered in a
vacuum, compiled Lisp or compiled Java is equally well qualified to
be called "executable" as compiled C or compiled Flaming Thunder.

Now in a sense, compiled Lisp is closer to being an "executable" on
stock hardware than compiled Java is. All it would take would be a
tweak to the operating system to make compiled Lisp directly
execute on the CPU via invocation directly from the shell, whereas
it would also take a rewrite of the micro-code to make compiled
Java directly execute like that. So if you want to back off from
making compiled Java directly executable, but still go ahead and
make compiled Lisp directly executable, as your mission not
impossible, that will be fine with me.

> > Common Lisp and Java would produce executables for your particular OS?
> Do they produce permanent binaries from the "native executable
> formats"? No... not usually. They might temporarily produce
> binaries, or binary code snippets, if an architecture uses
> just-in-time compilation instead of a virtual machine to execute
> the code.

Anywhere you have tight loops that execute through many times after
being compiled once, then as a practical measure JIT compilation
seems just fine, effectively very close to equivalent to permanent
executables in regard to efficiency. Anywhere you have code that is
loaded just once and executed just once, such as toplevel scripts,
CPU load probably isn't the bottleneck, load time probably is. So
it's really not important whether the code is JIT compiled or not.
The one possible exception would be Web server applications such as
CGI, where the same code is re-executed thousands of times between
software upgrades. For that usage, it seems best to somehow arrange
that a single "core image" is shared among many transactions for
many clients, so that loading and JIT compilation, or file-compile
and loading, happens just once for a large sequence of network
uses. I seem to recall something like mod_lisp being available (not
here on my Unix shell account, but on other ISPs) to support that
kind of *efficient* operation. An intermediate kind of efficiency
already seems to happen with my CGI applications, where the CMUCL
environment and my toplevel "trampoline" and all the files that my
CGI application loads get read from the hard disk and cached in
fast memory the first time somebody uses my application in a long
time, then successive repeat calls to the same application load
everything from the cache which still has the overhead of running
the code to load FASL files and/or load-and-JIT-compile interpreted
files but there's no hard-disk latency so response time is much
better after the first time in a short time span.

> Wait... wait.... Was part of your suggestion to install _two_
> languages to do the work that should be done by _one_ language?

What is a practicality now, and what can theoretically be done
eventually, are two different matters. Right now, we have at least
three incompatible OS-support methodologies: C (ELF-x86), Common
Lisp (FASL-x86), and Java (Class-JVM). It's going to be a long time
before the differences between the Java virtual machine and similar
features of Common Lisp will be reconciled so that a single
data-management support system, including automatic garbage
collection as part of the OS, would work equally for both without
losing any important features of either language. Until then, I see
no problem with some OS providing both Java and Lisp memory
management as separate entities which can be co-mingled within a
single application. At first, conversion from one language's data
to another would involve either wholesale data conversion
(call/return by value) or emulation of access from one language to
another (similar to SOAP/RMI). Successive versions of the OS could
then gradually merge similar data structure types into a single
common representation, whereby data conversion would become a no-op
and access-emulation would become unnecessary for such a data type.

Extending this further, some other languages such as OCaml, have
additional built-in data types not present in Java or Common Lisp.
So long as we'd be providing system support for all the data types
of Java and Common Lisp, we might also choose to provide system
support for the additional datatypes of those less common
languages. Eventually the essence of each data type would be
abstracted to the point where *all* the languages that have
self-contained runtime-tagged data objects and automatic garbage
collection would use the same data model and the same system
support, thus data could be passed freely among a wide variety of
such languages with neither data conversion nor access-emulation
needed. Maybe that day would never come. I can't predict that at
present. But a partial/mostly unification of data types across
multiple languages, greatly reducing the need for data conversion
or access-emulation among shared data types, should make it much
more efficient to build applications that are comprised of a mix of
modules expressed in different programming languages. A great
advantage of such a system is that just because the "shop" has
started building applications in one language (such as C++) doesn't
mean the "shop" is forever locked into that single language due to
problems with "legacy code" and the all-or-nothing dilemma or the
pain of dealing with foreign-function-interface.

So in the short term, yes, how about fixing the OS to directly
support both Java and Common Lisp data management right there in
the OS so that compiled Java (using Class-JIT->native) and CL
(using a mix of Read-JIT->native and Compile->FASL->load->native)
applications can run right there without first needing to load a
copy of their own storage management system? So you load your Class
files or FASL&sexpr files directly into the OS rather than first
loading a JVM or CL into the OS then loading the Class/FASL/sexpr
stuff into the JVM/CL. Worry about unifying similar datatypes later.

> I ask, but I never hear what "higher level programming features"
> C doesn't have...

It doesn't have true symbols (nevermind whether interned or not).

It doesn't have an explicit representation of objects with explicit
handles on other objects, such that you can tell what constitutes
an object and what is outside that object.

It doesn't have inherent identification of different kinds of
objects such that when you have a pointer to a place in memory you
can just look to see what kind of object you are looking at. As a
result, there's no natural way to mix different kinds of objects
within a uniform container such as an array or linked list.
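
To make the point concrete, here is roughly what a C programmer has to write by hand to get a mixed container: the tag that Lisp attaches automatically becomes an explicit enum field, and every consumer must switch on it. This is only a sketch; the names kind, value, make_int, make_text and describe are mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum kind { KIND_INT, KIND_TEXT };

/* A hand-rolled tagged object: the language itself knows nothing of the tag. */
struct value {
    enum kind kind;
    union {
        long        number;
        const char *text;
    } as;
};

struct value make_int(long number)
{
    struct value v;
    v.kind = KIND_INT;
    v.as.number = number;
    return v;
}

struct value make_text(const char *text)
{
    struct value v;
    v.kind = KIND_TEXT;
    v.as.text = text;
    return v;
}

/* Print v into buf according to its tag; returns what snprintf returns. */
int describe(const struct value *v, char *buf, size_t size)
{
    assert(v != NULL && buf != NULL);
    switch (v->kind) {
    case KIND_INT:
        return snprintf(buf, size, "%ld", v->as.number);
    case KIND_TEXT:
        return snprintf(buf, size, "\"%s\"", v->as.text);
    }
    return -1;  /* unknown tag */
}
```

An array of struct value can then hold integers and strings side by side, but nothing stops code from reading as.text out of a KIND_INT value, which is the lack of inherent identification described above.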

It doesn't have a reader and printer which are inverses of each
other, whereby somebody could type in source code and have it
immediately converted to internal representation and then have
something type out automatically converted from internal to
external representation. Part of the reason why this is
*impossible* is because objects aren't well-defined and aren't
identified, so if you wanted to print out something you see at the
end of some pointer you have no idea what kind of structure is
supposed to be there hence no idea how much of memory to print out
and how to format the printing.

There's no way to create an anonymous function, no notation to
express one.

There's no way to create a lexical closure (an anonymous function
that encloses lexical bindings within itself thereby establishing
OWN/STATIC variables known privately).

Given a function with N parameters, there's no way to "curry", i.e.
fix one of those parameters with a given value and thereby create a
function with N-1 parameters.
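
The usual C workaround, sketched here under my own names (adder, curry_add, apply_adder), is to carry the captured value around in a struct next to a function pointer. It works, but the "closure" is something the programmer assembles and passes explicitly rather than something the language creates.

```c
#include <assert.h>
#include <stddef.h>

/* A hand-built stand-in for a closure: code pointer plus captured value. */
struct adder {
    int (*call)(const struct adder *self, int y);
    int captured_x;
};

/* The ordinary two-parameter function being "curried". */
static int add(int x, int y)
{
    return x + y;
}

static int adder_call(const struct adder *self, int y)
{
    return add(self->captured_x, y);
}

/* Fix the first parameter of add to x, giving a one-parameter callable. */
struct adder curry_add(int x)
{
    struct adder a;
    a.call = adder_call;
    a.captured_x = x;
    return a;
}

int apply_adder(const struct adder *a, int y)
{
    assert(a != NULL && a->call != NULL);
    return a->call(a, y);
}
```

For example, struct adder add5 = curry_add(5); followed by apply_adder(&add5, 3) gives 8. The caller still has to know it is holding a struct adder and not a plain int (*)(int).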

There's no documented "parse tree" whereby the essence of software
algorithms could be manipulated in a hierarchical manner, either for
copy-and-paste in a structural way to build or edit algorithms, or
for emulation such as for debugging by "instrumenting" a piece of
code, or for translation to/from another programming language, or
for statistics gathering such as a cross-reference of call patterns
across several software modules, or for studying "correctness" of
algorithms such as deciding by logic whether a given algorithm
*might* step outside an array and damage other memory thereby
creating a security problem which can be exploited by a computer
virus, or for meta-programming whereby an application directly
writes a parse tree for a new algorithm invented by that first
application either for purpose of cross-compilation or possibly
genetic algorithms for evolving new algorithms, or for automatic
conversion from a domain-specific "language" (syntax) into the
standard language i.e. what in Lisp are called "macros", etc. There
are just so very many uses for a well-documented parse tree that
it's totally losing not to have one.

Then of course there are generic functions and the like not in C.

> Are you referring to object-oriented features?

If you mean what is commonly meant by OOP nowadays, not except that
very last item above. All the rest are pre-OOP. It's not necessary
to have user-defined classes with inheritance of methods from
parent class to child class unless overridden by child definition.
Just old-fashioned well-defined *objects* in the original Lisp
sense are beyond the capabilities of C and mark the distinction
between something only slightly higher than assembly language and a
fully high-level language.

> Most of these are implementable in C

Only at such extreme pain that nobody ever does it except when
replacing C by an entirely new language such as Java or CL or Perl
or OCaml etc. If you're building a whole new language, then it's
worth the effort to put all those features in it. But if you're
just writing a C application, nobody can afford the effort. Some
stuff is Greenspun in some horrid way, that's the best that can be
expected. By comparison, if you start with some truly high-level
language (Lisp, Java), you can Greenspun/emulate any feature of any
other language in a relatively straightforward way, and it's
actually worth doing if you have a need for that additional feature
not present in your existing high-level language.

> I've had experience at one point in time or another in about 14 languages.

Looking at my resume, I see experience in:
- Java (on 3 platforms)
- C (on 4 platforms)
- C++ (on 2 platforms)
- Visual Basic
- Fortran (on 5 platforms)
- Algol
- Lisp (on 5 platforms)
- MacSyma
- PHP, Perl
- COBOL
- Assembly/machine language on 7 different CPUs
Counting Fortran 2d and Fortran 4 as 2 different languages, and
counting Lisp 1.5/1.6 SL/PSL MACL/CMUCL/PowerLisp as 3 different
languages, and counting PHP as different from Perl, and counting
each different CPU for assembly/machine as different, that's a
total of 21 for me, not to brag.

> So, to me, the common statement that C has no higher level
> features is a real mystery given my background.

Its highest level feature is the ability to define a template (not
in the C++ sense) whereby symbolic names are related to offsets
within a block of memory, and that template can then be
rubber-stamped as many times as needed in static and/or dynamic
memory, whereby a notation of objectName.fieldName (which can be
hierarchical if templates are nested) can be used to refer to the
memory location at a particular place within such a templated block
of memory. This feature, called STRUCT, is only slightly more
flexible than what COBOL had in 1964, and in a way it's actually
*less* high-level than what COBOL provided, because COBOL provided
a read/print facility for such fixed-layout data records whereas C
doesn't. What C implements can be Greenspunned rather easily in most
any assembly language with only a small (or zero) memory overhead:

CONSORIG:                 ;Defines address at start, same as address of CAR
CAR:      DS 1W           ;Allocates one word of memory
CDR:      DS 1W           ;Allocates one word of memory
CONSEND:  DS 0            ;Defines first unused address after end, without allocating
          ;The symbol-arithmetic expression CONSEND-CONSORIG thus
          ; expresses the number of bytes for a structure like this.
          ORG CONSORIG    ;Resets origin for later memory allocation back at
          ; the top of that structure, thereby avoiding memory overhead.

SAMPCODE: LOAD A0,CONSEND-CONSORIG ;Parameter to MALLOC
          CALL MALLOC     ;Stores result in A0
          LOADR X1,A0     ;Move result from MALLOC to index register
          LOAD A0,myElement
          STORE A0,CAR-CONSORIG(X1)
          LOAD A0,restOfList
          STORE A0,CDR-CONSORIG(X1)
          STORE X1,newList
;Note: If restOfList and newList are same memory location, this is CL's PUSH

So really STRUCT and CALLOC are nothing more than shorthand for
assembly-language idioms.
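
For comparison, here is roughly the same thing written in C, as a sketch with names of my own choosing (cons, push, check_layout). sizeof plays the role of the end-minus-origin expression, and each member access compiles down to a store at a fixed offset from the pointer, much like CAR-CONSORIG(X1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct cons {
    void        *car;
    struct cons *cdr;
};

/* Allocate a cell holding element and link it in front of rest.
   Returns NULL if malloc fails. */
struct cons *push(void *element, struct cons *rest)
{
    struct cons *cell = malloc(sizeof *cell);   /* size of the template  */
    if (cell == NULL)
        return NULL;
    cell->car = element;                        /* store at offset of car */
    cell->cdr = rest;                           /* store at offset of cdr */
    return cell;
}

/* The member names are nothing more than symbolic offsets. */
void check_layout(void)
{
    assert(offsetof(struct cons, car) == 0);
    assert(offsetof(struct cons, cdr) >= sizeof(void *));
    assert(sizeof(struct cons) >= offsetof(struct cons, cdr) + sizeof(struct cons *));
}
```

Writing list = push(&item, list); corresponds to the case in the note above where restOfList and newList are the same location.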

All the rest of C is no more high-level than FORTRAN or COBOL circa
1964, except pointers which are no more high-level than shorthand
notation for assembly language idioms even more primitive than STRUCT:

a = *p;

LOAD X1,p
LOAD A1,0(X1)
STORE A1,a

The distinction between pointers to data of various lengths,
thereby requiring alternative use of LOADB (byte=8) or LOADH
(halfword=16) or LOAD (word=32) or LOADD (doubleword=64) isn't enough
to claim this is a "high level" programming facility.

The static type checking of different data types already existed in
FORTRAN and COBOL circa 1964.

Really C has nothing higher than FORTRAN, and isn't even as high as
COBOL. So if you want to call FORTRAN "high level", then C
qualifies equally, but that's *nothing* compared to Lisp 1.6 circa
1969 or modern high-level languages such as Common Lisp or Java.

> ... OS developers, game programmers, etc. program also, and
> frequently need the ability to interface to things in assembly
> without using assembly.

> > > 4) the higher level features should provide:
> > > 4a) flow control, variable allocation, abilities to manipulate
> > > integers, arrays, strings
> > > 4b) dynamic memory allocation, functions for file I/O
> > You've begged the question with your sound-bite of "dynamic memory
> > allocation". Without a full-fledged Garbage Collector, you're up a
> > creek trying to get any large application clean of memory leaks.
> Maybe...
> It still doesn't change the idea that programmers need to do
> dynamic memory allocation, as part of the native language.

Where do you draw the line between "native language" and "available
libraries"? It seems to me that if some feature is readily
available at no additional cost to just about anyone using a
particular language, then for this discussion it's just as good as
if it were in the native language, whatever that's supposed to mean.
In a sense all of an operating system or a programming environment,
except the boot loader, is add-on library. For Java, for example,
there's a machine executable containing the bootstrap loader for
the JVM and for the regular class loader, and a shell script that
invokes that bootstrap loader and passes it information about where
to find the JVM and the regular class loader. I'd prefer to include
all the usually available libraries as if they were part of the
core language. So all of ANSI CL, all of J2SE 1.3 1.4 1.5 and 5.0,
all of J2EE (same versions), would be considered part of the
corresponding core language. Likewise C has a bunch of standard
libraries which should be included, and C++ has the STL and a bunch
of other C++ specific libraries plus the C-compatibility libraries.

Now I think it would be really neat if somebody made it **easy**
(and I mean *trivial* really) to make liberal use of ASDF or other
beyond-the-standard libraries, where there's one net-boot library
you must obtain and install yourself, but then any time you need
some library from one of these network archives you invoke an
"import" function, and it first checks whether it's already
installed in the system area on your machine, and if not then it
automatically downloads it and installs it in your private account.

And of course it would be really neat if somebody volunteered to
work with me to build a search engine that allowed potential users
of net-loadable libraries to browse the intentional datatypes
handled by the various available libraries and find the best
library for the desired usage and tell the potential user what
command to use to cause the net-boot loader to install that
particular library if not already installed.

> > especially if you want different toplevel objects to share
> > inner structure to avoid needing to copy an entire structure just
> > to change one tiny part of it.
> "Inheritance" of data... between objects...

Hmm, that's an interesting way of phrasing the concept. So if you
have a parent object, and you derive a variant object where only
some small part is different, and it's done non-destructively so
that the original object remains unchanged, then in a sense the new
object is inherited from the original object. But even though the
pedigree shows one original object and one modified object, after
the modified object has been created it's impossible to tell by
looking at the two objects which came first. Accordingly I think
"inheritance" is misleading jargon, because in true inheritance
it's trivial to tell which object is the parent and which is
derived, because there's a link from the child object back to the
parent in order to provide default values for anything in the
parent not overridden by the child, but there's no similar pointer
in the reverse direction. Accordingly I'm not comfortable with your
newly-invented jargon.

Also, consider the case where a single parent self-balancing binary
search tree (SBBST) is created, then two separate threads (either
actual threads, or merely exploration of different branches of a
search within a single actual thread, doesn't matter) each derive a
modified daughter tree from it, and there is no longer any toplevel
pointer to the original (parent) SBBST. So long as some structure
from the original tree is still present in both of the daughter
trees, the two daughters will share structure, but neither inherits
from any parent tree because there *is*no* parent tree in existence
at the present time, and it's not clear which of the two daughters
you would claim is inheriting from the other daughter. So I think in
practice your term is unworkable. So I'm beyond "not comfortable", to
outright rejecting your suggestion as useful jargon for understanding
the data structures being used.

> That would make "releasing" the data a problem for the programmer
> (since he/she's no longer in control...)

Agreed, especially when there's no longer any toplevel parent
pointer, only pointers to various daughters (and grand-daughters
etc.). Automatic garbage collection is mandatory (except where
applications are so short-lived that no freeing of memory is in
fact needed in the first place, such as many non-shared CGI
applications).

> and so would become a compiler issue.

Um, it's really the runtime system we're talking about. The only
thing the compiler does is compile calls to the constructors for
various types of data. It's the job of the runtime system to start
a GC thread if it's needed for a long-living application. Of course
the constructors themselves must build objects that the GC thread
is capable of accounting for and reclaiming when no longer
referenced. In Lisp the primary constructor CONS stores tagged CAR
and CDR cells and returns a tagged handle that can be incorporated
into CONS cells or arrays etc. later, and the GC thread uses those
tags to know which references are self-contained values (FIXNUMs
and characters) and which are pointers to heap objects of various
kinds. Other constructors for BIGNUMs and arrays likewise return
tagged handles. READ makes liberal use of these constructors as it
converts s-expression syntax into internal trees of constructed
objects. As I see it, the compiler itself has nothing directly to
do with the garbage collector. It merely converts a parse tree into
compiled code that when executed will call various runtime library
functions, and it's these library functions which either are
constructors themselves or eventually call constructors as needed.

> Why would one design a language that takes such control over the
> data away from the programmer?

Very simple answer, one of those "Doh!" answers: To hide details of
the machine (CPU etc.) configuration from the programmers, and
thereby automate the interface between application-level concepts
and machine-level concepts, so that the programmer can concentrate
on building meaningful data structures rather than needing to
constantly deal with the machine representation of data. This is
analogous to the job of a programming language to hide the CPU
instruction set from the programmer most of the time, allowing the
programmer to think in terms of abstract functions instead of CPU
opcodes.

Example: I need to make a sequence of five characters that form a
word. I don't want to have to bother with the way characters are
encoded in the machine (ASCII or EBCDIC or Latin-1 or 16-bit
Unicode or 32-bit Unicode) and how those characters are embedded in
a sequence (a 32-bit value containing the number of characters
followed by the characters themselves, or all the bytes not
allowing zero followed by a zero byte to terminate, or blocks of up
to 255 bytes with a link from each such block to the next whenever
the overall string is more than 255 bytes total, etc.). I just want
to tell the programming language (compiler or REP loop doesn't
matter) that I want the five characters H E L L O in a string
object and be done with it. I just need some reasonable syntax for
expressing that literal constant string within my program. "HELLO"
is reasonable. 'HELLO' is also reasonable. A few other variations on
that idea are also reasonable. The following syntax is NOT
reasonable when I'm writing a high-level program:
S1: DC 5
DC 72
DC 69
DC 76
DC 76
DC 79
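
Even C gets this much right. A sketch, assuming an ASCII-compatible character set and C's NUL-terminated layout instead of the length-prefixed one above (the names spelled_out, literal and literal_matches_codes are mine): the literal and the hand-written codes produce the same bytes, but only one of them is readable.

```c
#include <assert.h>
#include <string.h>

/* The same five characters, once as raw codes and once as a literal. */
static const char spelled_out[] = { 72, 69, 76, 76, 79, 0 };
static const char literal[]     = "HELLO";

/* Returns 1 when both arrays hold identical bytes (true on ASCII systems). */
int literal_matches_codes(void)
{
    assert(sizeof literal == 6);    /* five characters plus the NUL */
    assert(strlen(literal) == 5);
    return sizeof spelled_out == sizeof literal
        && memcmp(spelled_out, literal, sizeof literal) == 0;
}
```

On an EBCDIC machine the literal would still read "HELLO" while the numeric version would spell something else, which is exactly why the encoding should stay hidden.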

> Doesn't that defeat the purpose of having a programmer program?

It really depends on the level of program being written:
To write a bootstrap loader for BIOS, you use assembly language,
and you need to know the CPU you're running on. That's not our
current topic.
To write the kernel of an OS, you may use assembly language or C or
my proposed BootLAP. (You could even use my proposed BootLAP for
the bootstrap loader.) Again, that's not our current topic.
To write the rest of an OS, or a compiler, or a regular
application, you would like a high-level language that hides the
internals of the machine configuration most or all of the time,
making OS or kernel calls to deal with machine details at a more
abstract level than the machine details themselves.

> > (With self-balancing binary search
> > trees, you can produce a modified copy by re-building log(n) path
> > down to the change and sharing all the rest of the tree between old
> > and new trees.)
> Is this worth the overhead?... (memory, time, speedup, etc...)
> It seems like much work to me - just to not make a copy of some
> data...

If you have a very large amount of data, such as an entire
relational database (emulated), and you want to spawn multiple
versions of that database, perhaps millions of simultaneous
versions each only slightly different from its parent, you cannot
afford to copy the entire database every time you want to spawn a
slightly-modified clone. Self-balancing binary search trees with
all non-changed branches shared is just about the only practical
way to deal with this kind of methodology.
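
A minimal sketch of that path-copying idea, in Python for brevity.
It assumes a plain (unbalanced) binary search tree, so the log(n)
bound only holds while the tree happens to stay balanced; a real
SBBST would add rotations on top of this. The names Node and insert
are illustrative and not taken from any library.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """Immutable tree node: once built, it is never modified."""
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key, value):
    """Return a new tree containing key -> value; the old tree is untouched.

    Only the nodes on the path from the root to the change are rebuilt;
    every subtree off that path is shared between old and new trees.
    """
    if root is None:
        return Node(key, value)
    if key < root.key:
        return root._replace(left=insert(root.left, key, value))
    if key > root.key:
        return root._replace(right=insert(root.right, key, value))
    return root._replace(value=value)


old = None
for k in (50, 30, 70, 20, 40, 60, 80):
    old = insert(old, k, "v%d" % k)

# Spawn a slightly modified clone: three nodes are rebuilt (50, 30, 20),
# while the subtrees rooted at 70 and 40 are shared.
new = insert(old, 20, "changed")
```

Both versions remain usable afterwards: old still maps 20 to its
original value, and new differs from it only along the rebuilt path.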

Now if all your data objects that you want to clone-with-modifications
are sufficiently small, simply making a complete copy and then
destructively changing the copy may be a viable alternative. At the
application programmer level, the APIs are identical, so the only
real difference is CPU and memory usage, which requires gathering
runtime statistics to determine which is more efficient for a given
application.

> > Heck, machines are so fast you don't need the underlying OS to
> > support FASL files. You can have your .login script start up Lisp
> > and then use Lisp as your shell, an extra layer of overhead between
> > what you type and the OS, but who really cares when the machine is
> > spending 99.9% of its time waiting for you to compose a new
> > toplevel command and type it in, or waiting for you to select from
> > a menu of things you want it to do next.
> You care when the application you're running is consuming a large
> percent of the machine time.

100% of the CPU time is *always* used, the only question is what
it's used for. If there's nothing useful for a CPU to do, it runs
an idle loop. On any decent OS, you have a way of establishing
priorities for various processes, so that you can have background
tasks running whenever no higher-priority process needs attention,
so that 100% of CPU time (minus overhead of switching tasks) is
consumed doing something useful, and at any moment what's getting
done is the most urgent task at that moment. When your application
is spending 99% of its time waiting for you to compose a new
toplevel command and type it in, it's truly WAITing at the OS
level, suspended while some other process runs, and then *your*
application gets re-activated as soon as you've finished composing
your new toplevel command. Meanwhile every keystroke or mouse
action generates an event which is briefly handled to move the
cursor display or select text or modify the edit buffer of that
command you are in the middle of composing. If 99% of the time
*other* applications are keeping the CPU usefully busy, and then
for that 1% needed to execute your typed-in command you have the
CPU almost totally to yourself, I think there's nothing wrong at
all. Why does that bother you? Or did you misunderstand what I
meant by "WAIT"?

Robert Maas, http://tinyurl.com/uh3t

Aug 30, 2008, 4:31:27 AM

> > So, I ask, but I never hear what "higher level programming features" C
> > doesn't have... ever.
> From: p...@informatimago.com (Pascal J. Bourguignon)
> - first class functions,

More to the point: First class *anonymous* functions. If you have
to declare the name of every function at compile time, it sorta
defeats the idea of first-class functions. If you don't need to do
that, but the runtime system needs to invent a "name" for each new
function and make sure it doesn't duplicate the name of some other
function previously invented in any thread whatsoever in the entire
application, that would be an utterly horrid way to implement
first-class functions, although the API for using it might "look"
as if the functions were anonymous.

But there's something more fundamental that is necessary before you
can have first class functions: You need first class *objects*!!
Not just small integers, but heap objects of arbitrary complexity
must be available. You can do this by the Lisp approach of having a
CONS cell be the *object*, which daisy-chains via handles to other
objects to form a tree, whereby intentional objects can be
arbitrarily many CONS cells linked together, and multiple intentional
objects can share some of their CONS cells. Or you can do it the
Java way of having formal OBJECTs which contain arbitrary amounts
of structure all within a single formal OBJECT, such as a member of
class java.util.TreeMap which each contains a complete
self-balancing binary-search tree, or javax.swing.JTable which each
contains all the structure to represent an entire GUI widget, and
where there is *no* sharing of structure between different objects
of the same class, and where intentional type exactly matches OO
type. Either way, you need formal objects expressed somehow as
heap-allocated data with a toplevel handle for each object which
can then be used to point from one object to another object.
Otherwise this requirement and several others below are impossible.

> - automatic memory management (garbage collector),

Agreed, except for short-lived applications where freeing memory
isn't really necessary. Non-shared CGI applications sound like a
good market for a no-GC programming environment. GC requires
first-class objects with well-defined runtime type identification
to allow the GC to find all the parts of an object and all links to
other objects. It doesn't matter whether GC is mark-and-sweep or
reference-count, you need well-defined first-class objects for it
to work reliably at all. Without well-defined first-class objects,
the best you can do is a weak GC that treats *every* integer as if it
were a pointer to that place in memory, but even then it can't reasonably
know how *much* memory is to be held around that address except if
MALLOC or the like clearly defines blocks of memory to be
auto-freed as a unit.

> - macros (no, C has no macro. For real macros you need lisp),

This requires that the parse tree be fully accessible as a
first-class object which can either be mutated destructively or
non-destructively copied-and-pasted into newly-being-built
structures (result of macro expansion). Once the parse tree is
accessible, macros are in effect trivial to implement, except for
the magic of not needing a function wrapper. For example, if every
time you wanted to call a macro you had to wrap it in an explicit
macro-expander, that would be a pain, but marginally acceptable.
For example if there were a built-in special operator called MACRO
which took the CDR of the form not-evaluated as the form to expand,
that wouldn't be too horrible:
(MACRO defun foo (a b) (+ a b))
(setq arr (make-array '(5) :initial-element 42))
(MACRO setf (aref arr 2) 666)
You'd have to remember which commonly used procedures were macros
and which weren't, and for anything new you'd look in the manual
where you'd see that it was a macro and include the MACRO part when
you write that code the first time.

If there was no special operator like that, so you needed to call
the MACROEXPAND function explicitly, with the argument explicitly
quoted, and an EVAL-WHEN wrapper around the whole thing to force it
to be expanded at compile time, that would be a much bigger pain.

> - exceptions,
> - packages / modules,
> - bignums,

Agreed. Note BIGNUMs require first-class objects.

> - bounds checked arrays, multidimensional arrays,

Former agreed. Latter not really necessary. Nested arrays to effect
potentially-ragged arrays are actually good enough for practical
use. If you want syntax that looks like multi-dimensional arrays,
an extra layer of function or macro is good enough. Note this
probably requires first-class objects.

> - strings and characters (no, C has no string and no character),

Now we're treading into intentional data types vs. builtin data
types. If the built-in data type provides FIXNUMs of sufficient
range to include all the Unicode characters you'll ever need for
your application, and an interface routine treats these integers as
if they were Unicode characters when printing them out, that should be
sufficient for characters. Likewise if sequences (either arrays or
linked lists or self-balancing binary search trees etc.) exist,
then an interface over sequences of integers, or a sequence of
character-interfaces over integers, would suffice. It's **nice** if
the READer and PRINTer and various other functions know about the
intentional type of these characters and strings, but these can all
be implemented by after-the-fact libraries, and hence don't really
*need* to be in the core language.
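
A minimal sketch of such an after-the-fact interface, in Python. The
function name render_string is hypothetical; the only built-in it
leans on is chr, which maps an integer code point to a character.

```python
def render_string(code_points):
    """Treat a sequence of plain integers as characters when printing."""
    return "".join(chr(cp) for cp in code_points)


# The "string" itself is nothing more than a sequence of integers.
hello = [72, 69, 76, 76, 79]
print(render_string(hello))
```

The sequence could equally be a linked list or a tree; only the
interface routine needs to know that the integers stand for characters.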

> - strong type safety (ie. not allowing ("iv"+1)),

Again this doesn't need to be in the core language. It can be in
the library function for doing addition.

> - object oriented programming,

Not absolutely necessary. Oldstyle Lisp objects with closures are
good enough. OOP can be implemented in after-the-fact libraries if
there's a good need for it. It doesn't need to be in the core
language.

> >> (With self-balancing binary search
> >> trees, you can produce a modified copy by re-building log(n) path
> >> down to the change and sharing all the rest of the tree between old
> >> and new trees.)
> > Is this worth the overhead?... (memory, time, speedup, etc...) It seems
> > like much work to me - just to not make a copy of some data...

> Yes it is worth the overhead. The point is to keep the existing
> structures immutable, so they can be shared. This way, you don't
> need to copy the structure everywhere, just to be sure, as it is
> done usually in C and C++ programs.

Thank you for the vote of confidence for SBBST (self-balancing
binary search trees) as I've been calling them recently to be more
precise than what I called them in former years.

I've decided a really good example where this data sharing is
crucial for efficiency is with genetic algorithms. Imagine a
simulated population of millions of simulated genomes, where each
genome has millions of genetic units (base pairs or whatever the
minimal unit of heredity is), all evolving from a single starting
genome. Imagine the **cost** of having to make a total copy of a
genome every time a mutation happens, compared to the cost of
rebuilding one log(n) path in an otherwise shared SBBST. Note that
natural selection tends to create highly clustered clades, where
nearly all the genes are shared between the genomes of nearly all
the members of any tight clade, so that with millions of highly
diverged genomes there are in effect only a rather small number of
high-level clades that don't share much between them, so the amount
of data occupied by shared SBBSTs is much much less than the amount
of data occupied by totally-copied genomes.

No, I haven't yet actually done my own genetic-algorithm simulation
like this, so I don't have statistics about the total amount of
memory typically consumed by a million highly-evolved genomes
relative to the amount of memory consumed by any individual genome.
If I were forced to make a prediction, I'd *guess* a million SBBST
genomes would cost only 10 to 100 times the memory of a single
SBBST genome. Even a neutral-drift simulation I'd *guess* wouldn't
be much worse, maybe 100 to 1000 times. If anybody has done either
kind of simulation and published the resultant evolutionary trees
linking only the final survivors and their common ancestors,
showing number of mutations along each internal path, please point
me to the dataset. From that cladogram it'd be easy to generate the
SBBST-memory-usage figures assuming no rebalancing needed, and the
memory usage with occasional rebalancing needed would be only
slightly larger.

By the way, for a pure neutral-drift simulation, it's not necessary
to actually have genomes during the simulation. Just randomly pick
which node will split (reproduce) and which node will die, and
throw in a mutation counter as an ad hoc add-on that has nothing to
do with the neutral-drift itself but is used later to generate the
SBBST memory usage statistics parameterized per the presumed size
of a genome. The only other thing to do in such a simulation is to
condense a two-step path whenever the side branch from the midpoint
goes extinct, adding the two mutation counters of the old steps to
yield the total mutations of the single combined link, and whenever
one of the two branches from the root goes extinct then discard all
the mutations along the path from that old root to the new LUCA. It
occurs to me that eventually a neutral-drift simulation with a
fixed number of survivors should converge to a statistical fixed
point in terms of total number of mutations from current LUCA to
current survivors, and that fixed number of total mutations should
be a function of the constant number of survivors. Has anybody done
enough simulations of neutral drift with varying fixed number of
survivors and plotted the LUCA-to-survivors total-mutations as
function of number-of-survivors and worked out the mathematical
form of the function or at least gotten enough data for a smooth ad
hoc graphed function? For neutral drift, that single function to
compute total path length in terms of generations, multiplied by the
presumed mutation rate per generation, is all I need as input to my
estimates of SBBST memory usage (ignoring the log(n)
path-rebuilding that happens with each rebalance).
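
A rough sketch of that genome-free neutral-drift bookkeeping, in
Python. Everything here is an assumption made for illustration: the
names Node, split, kill, total_mutations and simulate are invented,
and each birth is taken to put exactly one mutation on each daughter
branch, which stands in for whatever mutation rate is presumed.

```python
import random


class Node:
    """Genealogy node; `mutations` counts changes on the edge from its parent."""

    def __init__(self, parent=None, mutations=0):
        self.parent = parent
        self.mutations = mutations
        self.children = []


def split(leaf, survivors):
    """`leaf` reproduces: it becomes an internal node with two daughter leaves."""
    a = Node(leaf, 1)  # assumption: one mutation per daughter branch
    b = Node(leaf, 1)
    leaf.children = [a, b]
    survivors[survivors.index(leaf)] = a
    survivors.append(b)


def kill(leaf, survivors, root):
    """Remove `leaf`, condense the two-step path left behind, return the root."""
    survivors.remove(leaf)
    parent = leaf.parent
    parent.children.remove(leaf)
    (sibling,) = parent.children
    grand = parent.parent
    sibling.parent = grand
    if grand is None:
        # A branch from the root went extinct: sibling is the new LUCA,
        # and the mutations on the path above it are discarded.
        sibling.mutations = 0
        return sibling
    # Merge the two steps into one link, adding their mutation counters.
    sibling.mutations += parent.mutations
    grand.children[grand.children.index(parent)] = sibling
    return root


def total_mutations(root):
    """Sum the mutation counters over every link below the current LUCA."""
    total, stack = 0, [root]
    while stack:
        node = stack.pop()
        total += node.mutations
        stack.extend(node.children)
    return total


def simulate(n_survivors, steps, seed=0):
    rng = random.Random(seed)
    root = Node()
    survivors = [root]
    while len(survivors) < n_survivors:  # grow to the fixed population size
        split(rng.choice(survivors), survivors)
    for _ in range(steps):  # then one birth and one death per step
        split(rng.choice(survivors), survivors)
        root = kill(rng.choice(survivors), survivors, root)
    return root, survivors


root, survivors = simulate(n_survivors=50, steps=2000, seed=1)
print(len(survivors), total_mutations(root))
```

Running simulate over a range of n_survivors values and recording
total_mutations for each would give the kind of raw data asked about
above; no particular result is claimed here.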

Robert Maas, http://tinyurl.com/uh3t

Aug 30, 2008, 5:48:04 AM

> > C is not a high level language.
> From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
> What? That claim seems highly revisionist to me. C, FORTRAN,
> Pascal, etc. are all HLLs. Please (re)define "high level
> language"...

I think that at a bare minimum, today's concept of HLL includes
first class objects with some tagging or memory-management system
to distinguish (at runtime) various types of object from each
other, so that containers such as arrays and self-balancing binary
search trees can contain a mix of different kinds of objects that
cause dispatching as needed to process them. C doesn't provide this.

Our standards for many things (civil rights, cruel and unusual
punishment, trans-national portability, and high-level programming
languages) have changed since 1960. What was considered acceptable
segregation of negros and caucasians in 1960 is no longer
acceptable, in fact equal rights for public schooling and public
transit and housing and employment are now considered mandatory.
What was considered acceptable trans-national portability, namely
that everyone in the world learn to accept USASCII as the language
of discourse, has been replaced by Unicode. And what was considered
a high-level programming language in 1960 (FORTRAN and COBOL) is no
longer acceptable. Our standards in all these areas are higher now
than they were back then. C was "high level" per the 1960 standard.
It's not "high level" per the 2008 standard.

The only type of object in C that is sorta first class is a block
of bytes given by MALLOC, which is freed as a unit by FREE, or a
block of STRUCTs given by CALLOC, likewise freed as a unit by FREE.
That's not really good enough.

But per my new outlook of intentional data types, I can see that
it's better than what's directly provided by the operating system.
Or is it? Macintosh OS allows arbitrary-sized blocks of memory to
be allocated directly via system calls, and you have your choice
whether they are relocatable (you get back a handle to a tiny-block
which contains the actual pointer which is automatically updated
whenever the block moves) or non-relocatable (you get back a direct
pointer). So C provides no more than what MacOS already provides,
in fact it provides *less* than what MacOS (System 6) provides. So
C is no more high-level than assembly language, in fact it's
slightly lower level because you can't request relocatable blocks
from MALLOC or CALLOC. You really want to go that road?

> "... Essentials such as pointers are very clear if you have a
> machine model in mind."

Other than a general idea of what's efficient and what's not
efficient, like following through a pointer or directly indexing an
array to find the nth element is fast O(1), but searching a large
random-sequence array or searching a linked list to find a matching
pointer or other value is very slow O(n), and binary search or
SBBST search are of intermediate O(log(n)) time-consumption,
allowing you to choose data structures and algorithms that make
tasks reasonably efficient, is there *any* good reason to keep the
machine model in mind when writing a high-level application
(anything higher than a bootstrap loader or the dispatch vector for
a device driver)?
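
The three cost classes mentioned above can be seen without any
machine model in mind. A small Python sketch follows; linear_find and
binary_find are made-up names, while bisect_left is the standard
library's binary search.

```python
from bisect import bisect_left

data = list(range(0, 2000, 2))  # a sorted sequence of even numbers


def linear_find(seq, target):
    """O(n): examine elements one at a time until a match turns up."""
    for i, x in enumerate(seq):
        if x == target:
            return i
    return -1


def binary_find(seq, target):
    """O(log n): halve the search range each step; needs sorted input."""
    i = bisect_left(seq, target)
    return i if i < len(seq) and seq[i] == target else -1


nth = data[500]                    # O(1): direct indexing
slow = linear_find(data, 1000)     # O(n)
fast = binary_find(data, 1000)     # O(log n)
```

Both searches return the same index; only the number of elements they
touch on the way differs.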

> C is a general purpose programming language. Are you saying you
> don't need a general purpose programming language?

So is assembly language for a given CPU. The difference is that C
is sorta portable and includes syntactic sugar for indirect
indexing and load/fetch.

> > > So, I ask, but I never hear what "higher level programming features" C
> > > doesn't have... ever.

> > - first class functions,
> C has, AFAIK. It depends on how you define 'class' here...

It means you have an object allocated on the heap which is an
anonymous function which you can apply to arguments. C doesn't have
that. For copies of functions you defined in your source code, you
could copy the entire body of such a function to an array of bytes
that you got by MALLOC, if only you could learn the size of that
function, the number of bytes from beginning to end, which you
can't in any portable way. But even if you could, it'd be of no
use. First class functions are of no value unless you can load them
and/or create them at runtime. C as given by K&R doesn't provide
any portable way to load new functions from files at runtime to
occupy MALLOC blocks and be callable from other code, and surely
doesn't provide any way to *define* a new function and compile it
at runtime. Perhaps you will prove me wrong by designing a portable
way to represent C-functions in disk files and then write a library
for loading them in on demand at runtime. Perhaps you can use ELF
format relocatable libraries so all you need to write is the code
to load single functions. You need an API something like this:
elfOpen("filePathNameString") -> elfHandle
elfFindName(elfHandle, "functionNameString") -> elfFn=struct[startOffset, size]
malloc(elfFn.size) -> ptr
elfLoadBlock(elfHandle, elfFn, ptr)
Do you know anyone who has written that dynamic-function-load
utility in/for C? Note that while the function has a *name* in the
library, that's just used to do the lookup to find where the
body of the function is located. In memory, the copy of the library
function is anonymous, accessed only via the ptr to the MALLOC
block where it was loaded.

> I.e., procedures (and derivatives of, such as functions) are part
> of the language, i.e., first class

That's not what "first class" (object) means.
You need to learn the jargon before debating this kind of point.

> > - automatic memory management (garbage collector),

> C's not interpreted. Garbage collectors are normally implemented
> for interpreted languages.

You are so utterly wrong I don't know where to begin in unwedging
your mind from your extreme mistake. Garbage collectors have
nothing to do with interpretation of sourcecode. They have
everything to do with dynamic allocation of memory, such as by
MALLOC. Until you understand this, I don't want to waste my time
explaining such essential basics to you.

> The C preprocessor implements macro's...

They are mere string-processing macros, which are extremely crude
and totally disrespectful of C syntax. That's not at all what we're
talking about. The previous poster should have clarified
"parse-tree macros", i.e. transformations from the parse tree as
specified by the source syntax to a different parse tree which is
what actually gets compiled. So on *this* point I don't blame you
for totally misunderstanding the topic of conversation when you
posted that, but from now on I *expect* you to know what we're
talking about, which Lisp provides, but C and Java don't provide,
and C++ provides only in an extremely limited sense with its
"templates".

> > For real macros you need lisp),

> Please explain.

See my explanation above. Does that suffice to enlighten you?

> > - strings
> Implemented as "arrays" of characters.

Except that C doesn't really have arrays of anything, it just has a
pointer to a place in memory and then it's up to the caller to
guess how far to read past that pointer to process all the data.
For "strings", there's a convention that a zero byte terminates the
data, which is respected by some library functions but not others.

> > - object oriented programming,
> Not available. Although, some of the look of object-oriented
> code can be simulated in C.

At what cost in pain? Would it be so painful that nobody has ever
done it except as a toy for some short-lived project? Or has
somebody produced a well written well tested reliable OOP library
that lots of people are currently using? (The third possibility,
which I'm sure is *not* the case, is that OOP is so trivial to
implement in C that novice programmers do it all the day, and
there's no need for anybody to write a library to make it any
easier. In Lisp that third possibility is true for a lot of
intentional data types. In C there's almost nothing that's really
easy to do. For example, suppose you are using non-negative
integers for two purposes, for expressing Unicode code points, and for
expressing font numbers. You want to make a sequence of characters,
with font-change commands every so often. You need to tag the two
types of integers so that as you are stepping through the sequence
you can know whether to emit a character in the current font or
change to a new font. In Lisp it's trivial. In C it's really
difficult. In Java it's a big pain but not really difficult.)
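
For concreteness, a sketch of that tagging scheme in a dynamically
typed language, here Python. The names Char, Font and render are
hypothetical, and the default font number 0 is an assumption.

```python
from typing import NamedTuple


class Char(NamedTuple):
    """An integer tagged as a character code point."""
    code_point: int


class Font(NamedTuple):
    """An integer tagged as a font number."""
    number: int


def render(items):
    """Step through the sequence, returning (font, character) pairs."""
    font = 0  # assumed default font
    out = []
    for item in items:
        if isinstance(item, Font):
            font = item.number  # font-change command
        elif isinstance(item, Char):
            out.append((font, chr(item.code_point)))
        else:
            raise TypeError("untagged item: %r" % (item,))
    return out


text = [Char(72), Char(105), Font(3), Char(33)]
print(render(text))
```

The runtime type tag is what lets one sequence hold both kinds of
integer and still dispatch correctly on each element.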

> Why? (e.g., structures need to be shared to implement ____ or
> this effectively allows various forms of memory compaction, etc.)

It makes both CPU and memory usage orders of magnitude more
efficient for some applications, and in some cases makes the
difference between doable and impossible. See my other article in
this thread where I propose genetic algorithms as a good case that
really needs this way to manage memory. See my mention of millions
of genomes simultaneously present, each containing millions of
genetic units, all evolved from a single starting genome, by
natural selection or neutral drift. Just imagine the difference of
memory usage with and without sharing of all the parts that are the
same between different members of a clade.

> That's useful if you have to shared data, but how frequently does
> that occur?

It depends on your application. You want an API that is the same
either way, then you can try your application both ways and compare
memory+CPU usage to see which way is more efficient.

Pascal J. Bourguignon

Aug 30, 2008, 7:25:16 AM

jaycx2.3....@spamgourmet.com.remove (Robert Maas, http://tinyurl.com/uh3t) writes:

> 100% of the CPU time is *always* used, the only question is what
> it's used for. If there's nothing useful for a CPU to do, it runs
> an idle loop.

This is not true anymore. On modern processors, the CPU can pause,
entering in a state where it consumes much less energy. This is of
vital importance for laptop computers, but it is also a good marketing
point for desktops (and even servers, they're not all busy 100% of the
time).
http://softwareblogs.intel.com/2007/01/10/all-about-system-power-states-s0-s5/

--
__Pascal Bourguignon__ http://www.informatimago.com/

"You question the worthiness of my code? I should kill you where you
stand!"

Robert Maas, http://tinyurl.com/uh3t

Aug 30, 2008, 7:30:43 AM

> From: p...@informatimago.com (Pascal J. Bourguignon)

> Integers are first class objects in C.
> You can write literal integers: 0, 1, 2, etc. (compilation time)
> you can define integer constants: const int pi=3;
> you can store integers into variables: int i=0;
> you can create new integers: i+1 (run-time)
> you can pass integers to functions: f(int x); f(42);
> you can return integers from functions: int f(int x); i=f(42);

> But you cannot do all of that [in C] with functions:
> typedef int (*fi)(int);
> you can return functions from functions: fi g(int x); g(42)(2);
> you can pass functions to functions: h(fi g){g(42);} h(f);
> you CANNOT create new functions: (run-time)
> you can store functions into variables: fi f=g(42);
> you can define function constants: int g(int x){return(x+1);}
> You CANNOT write literal functions: (compilation time)

That's great! Now in theory you *can*, by extreme hackery, make new
functions [in C] at runtime. Here's how:
- Make up a new filename that isn't the same as any other filename
currently in use or that any other process at this very moment
might try to make. (Suggestion: Use your own process-ID as part
of the file name.)
- Write the syntax of the function definition (with some arbitrary
name) to a disk file by that name.
- Invoke an OS interface library function to spawn a new process to
call the C compiler to compile that disk file to create a
relocatable ELF library under yet another unique file name.
- Call a library function to parse the structure of the ELF library
to find the block of bytes that represents that compiled
function, using the arbitrary name from above to find it in the
library-function table-of-contents.
- Call MALLOC to allocate enough memory for that function body.
- Copy that many bytes from the appropriate place within the ELF
library file to the block that MALLOC got.
- Delete both the source file and the ELF library file.
- Return a pointer to that MALLOC block, cast to the appropriate
type of pointer to function.
But has anybody ever written a library module to make that hackery
easy enough that anybody can use that library to define new
functions at runtime? I don't think so.

Note the key thing about *any* first-class object, more primitive
than what you listed below, is that there are two kinds of object,
those which will fit nicely within a single machine register (such
as integers of some fixed, machine-dependent length), and those which
require more memory hence must be allocated out in main memory. For
the latter, the object must be well defined, whereby a single
pointer or handle to the first byte (or other "entry point") of the
object implicitly and effectively automatically refers to *exactly*
that object, no more, no less. C has the first type of object,
namely signed and unsigned bytes and integers of various fixed
lengths, but doesn't have the latter.

Also, implied in what you said above, any first-class object must
have some sort of constructor. Register-sized objects, C has them,
but for other kinds of pseudo-objects C doesn't. The lack of a
constructor for functions is why you can't create new C functions
at runtime. But by hackery, such as I proposed above, or what I
said elsewhere in the thread (loading single functions from
existing libraries into MALLOC memory at runtime) which is very
similar, a constructor for such new functions *could* be hacked
together in C. But that ain't good enough to qualify.

> In lisp and any other high level programming language,
> (defun adder (x) (lambda (y) (+ x y)))
> you can return functions from functions: (adder 42) --> #<FUNCTION (Y) (+ 42 Y)>
> you can pass functions to functions: (mapcar #'adder '(1 2 3))
> --> (#<FUNCTION (Y) (+ 1 Y)>
> #<FUNCTION (Y) (+ 2 Y)>
> #<FUNCTION (Y) (+ 3 Y)>)
> you can create new functions: see adder above. (run-time)
> you can store functions into variables: (setf g (adder 42))
> you can define function constants: (defun h (z) (* z 2))
> You can write literal functions: ((lambda (x) (* x x)) 3) (compilation time)
> (setf g (lambda (x) (* x x))) (funcall g 3)

Very nicely said. Between the two of us, looking at different
aspects of this point, we might eventually enlighten somebody.
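
The same checklist carries over to other languages with first-class
functions. A short Python rendition of the quoted adder example, as a
sketch only (the variable names are illustrative):

```python
def adder(x):
    # A new function object is created on every call, at run time.
    return lambda y: x + y


add42 = adder(42)                     # return a function and store it
adders = list(map(adder, [1, 2, 3]))  # pass a function to a function
square = lambda x: x * x              # a literal, anonymous function

print(add42(1), [f(10) for f in adders], square(3))
```

None of these functions needed a name declared ahead of time, and
each closure keeps its own captured value of x.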

> ... there is no correlation between a style of implementation and
> the presence or absence of a garbage collector.

Correct. I think I said it better, by saying that dynamic
allocation of memory, not the question of
interpret/compile/JIT/whatever, is what makes memory management
including garbage collecting relevant.

> But there is a correlation between a language that wants you to
> think in terms of bytes and pointers and wants you to implement
> a memory management system (things you would of course want to
> do if you were implementing a unix kernel), and a high level
> language that wants you to think about your problem in terms of
> the problem domain, and that will manage the low-level details
> such as processor and memory for you.

That had the potential to be very well said, but I think you
flubbed your English there. Let me copy&paste to make something
more logically correct:

There is a correlation between whether a language wants you to:
-a- think in terms of bytes and pointers and implement your own
memory management system,
-b- think about your problem in terms of the problem domain, and
just take memory management for granted;
and whether the intended type of task to be programmed is:
-a- implement a unix kernel or bootstrap loader or device driver or
memory management system for a high-level language, or any
other [almost] bare-machine utility,
-b- just about any other kind of data-processing application you
can possibly imagine, including even compilers.

> If you are implementing a unix kernel, indeed you don't really
> need automatic memory management.

And if you're *implementing* your own memory management system for
some good purpose, such as implementing the runtime for a
high-level programming language, then indeed you don't have
automatic memory management before you implement it, and you can't
define it in terms of itself.

> But if you are implementing anything else like, say, MIS
> applications or web services, then you don't care about memory,
> you care about bank accounts, or salaries, or stock items and
> sales, etc.

Well said. You already have enough "on your plate" with the
application itself. You don't want to need to worry about low-level
issues. You want all of those low-level issues taken care of
already before you start.

> Basically, in lisp, macros are compiler hooks, normal lisp
> functions. By the properties of lisp, macros receive as
> arguments parse trees (symbolic expressions), not text chunks,
> and they return a symbolic expression (a parse tree), that
> substitutes the macro call.

Very well said, except there was no need to confuse the issue
by mentioning "symbolic expression". Here's a better statement:
? Basically, in lisp, macros are compiler hooks, normal lisp
? functions. By the properties of lisp, a macro receives as
? argument a parse tree, not a text chunk, and it returns a new
? (replacement) parse tree, which then gets used in lieu of the
? original parse tree.
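
To make "parse tree in, parse tree out" concrete, here is a toy
expander in Python, with nested lists standing in for the parse tree.
It is a sketch under loud assumptions: expand, incf and MACROS are
invented names, and quoting, special forms and compile-time
evaluation are all ignored.

```python
def expand(form, macros):
    """Return a copy of a nested-list parse tree with macro calls replaced."""
    if not isinstance(form, list) or not form:
        return form  # atoms and empty lists are left alone
    head = form[0]
    if isinstance(head, str) and head in macros:
        # The macro is an ordinary function: it receives sub-trees as
        # arguments and returns a replacement tree, which is expanded again.
        return expand(macros[head](*form[1:]), macros)
    return [expand(part, macros) for part in form]


def incf(place, delta=1):
    """Macro body: (incf x) becomes (setq x (+ x 1))."""
    return ["setq", place, ["+", place, delta]]


MACROS = {"incf": incf}

tree = ["progn", ["incf", "x"], ["print", "x"]]
print(expand(tree, MACROS))
```

The original tree is not mutated; the expander builds the replacement
non-destructively and hands the new tree on to whatever compiles it.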

> > What do you mean by "packages" and "modules"?
> In pascal they're called UNIT.
> In Modula-2 they're called MODULE.
> In ADA and lisp they're called PACKAGE.
> In C++ they're called namespace.

IMO the C++ jargon is most clear about the essential feature being
provided. Also, IMO Java does namespaces the best way,
hierarchically from the very start.

> This is a notion that is important when you write big programs.
> In C, there's no such notion. Libraries must be careful not to
> use a name already used in another library, for example, using
> systematically hopefully unique prefixes.

Note that in Lisp, names can be as long as needed to include all
the parts you might want to include, so packages aren't absolutely
necessary. Just hyphenate-prefix every name with the pseudo-package
name. But having true packages (namespaces) is cleaner, especially
since packages are first-class objects rather than just a naming
convention, allowing an application to check whether a particular
package is present or absent for example and do something different
per that information.

> Thanks to the features of lisp, you are not limited to the data
> abstraction and the procedural abstraction, you also have the
> syntactic abstraction (with lisp macros), and the
> metalinguistic abstraction (one step beyond macro).

That's such a good point, mentioned earlier in this thread also,
that IMO it ought to be written up on a permanent Web page, with
peer review for English presentation, so that we can refer novices
to it again and again. Would you have time to do that?

<http://en.wikipedia.org/wiki/Metalinguistic_abstraction>
doesn't really say what needs to be said. Perhaps you can flesh out
that page, or use it as a starter for your own page/essay? Perhaps
a toy problem domain, with an example of how to deal with it using
each of the listed stages of abstraction, would help clear up the
distinction between those various stages.

On a related topic, a month or two ago I mentioned the various
stages of implementing a new intentional data type for a new
problem domain in a different way, something like:
- Writing new functions (constructors, mutators, accessors, processors);
- Writing macros (parse-tree transformations) to create in effect
new kinds of special forms;
- Writing reader macros to support new kinds of literal data
objects, thereby extending the s-expression notational system;
- Writing a whole new parser for a non-Lispy syntax.
You can write a new parser in just about any language, but with
Lisp all four stages are nicely integrated and can be freely
intermixed, allowing graceful migration/upgrading of software from
each stage to the next piecemeal, avoiding any "flag day".
(Reference to that jargon term: <http://tools.ietf.org/html/rfc4192>)

Searching in vain for another reference to "flag day", I found this gem:
<http://www.oreillynet.com/onlamp/blog/2003/05/why_elevators_matter.html>
[Will elementary school students have "Clever Algorithm
Appreciation" class along with "Art Appreciation" and "Music
Appreciation"? I'm looking forward to it.]

Ah, found another reference to "flag day" in jargon:
<http://support.internetconnection.net/DEFINITIONS/Definition_of_Great_Renaming.html>

Ah, finally, the definitive definition, although poorly formatted
(double spaced on the Web site, changed to single spacing here):
<http://www.ifla.org.sg/documents/internet/jargon.htm>
:flag day: n. A software change that is neither forward- nor
backward-compatible, and which is costly to make and costly to
reverse. "Can we install that without causing a flag day for all
users?" This term has nothing to do with the use of the word
{flag} to mean a variable that has two values. It came into use
when a massive change was made to the {{Multics}} timesharing
system to convert from the old ASCII code to the new one; this was
scheduled for Flag Day (a U.S. holiday), June 14, 1966. See also
{backward combatability}.

Pascal J. Bourguignon

Aug 30, 2008, 8:32:10 AM8/30/08

jaycx2.3....@spamgourmet.com.remove (Robert Maas, http://tinyurl.com/uh3t) writes:

>> - strings and characters (no, C has no string and no character),
>
> Now we're treading into intentional data types vs. builtin data
> types. If the built-in data type provides FIXNUMs of sufficient
> range to include all the Unicode characters you'll ever need for
> your application, and an interface routine treats these integers as
> if Unicode characters when printing them out, that should be
> sufficient for characters. Likewise if sequences (either arrays or
> linked lists or self-balancing binary search trees etc.) exist,
> then an interface over sequences of integers, or a sequence of
> character-interfaces over integers, would suffice. It's **nice** if
> the READer and PRINTer and various other functions know about the
> intentional type of these characters and strings, but these can all
> be implemented by after-the-fact libraries, and hence don't really
> *need* to be in the core language.

Strings need to be in the language, if you want SYMBOL-NAME and INTERN.
For "string literals" you need reader macros, or have them in the language.

(Lisp only has symbols, integers, rationals and floating-point numbers
built-in in its scanner; all the rest is user modifiable reader
macros).

--
__Pascal Bourguignon__ http://www.informatimago.com/

ATTENTION: Despite any other listing of product contents found
herein, the consumer is advised that, in actuality, this product
consists of 99.9999999999% empty space.

Pascal J. Bourguignon

Aug 30, 2008, 8:32:26 AM8/30/08

jaycx2.3....@spamgourmet.com.remove (Robert Maas, http://tinyurl.com/uh3t) writes:

Don't you think C was already slow enough compared to Lisp without all
this?

An alternative would be run-time code generation, there are several
libraries to do that, including GNU lightning.

> But has anybody ever written a library module to make that hackery
> easy enough that anybody can use that library to define new
> functions at runtime? I don't think so.

I heard Apple has it well integrated in Xcode, in a debugging context.
This is helped by Objective-C which is more dynamic a language than C
or C++, and has even the notion of categories of methods, that can
conveniently be dynamically loaded, or of classes posing as other
classes.

>> Thanks to the features of lisp, you are not limited to the data
>> abstraction and the procedural abstraction, you also have the
>> syntactic abstraction (with lisp macros), and the
>> metalinguistic abstraction (one step beyond macro).
>
> That's such a good point, mentioned earlier in this thread also,
> that IMO it ought to be written up on a permanent Web page, with
> peer review for English presentation, so that we can refer novices
> to it again and again. Would you have time to do that?
>
> <http://en.wikipedia.org/wiki/Metalinguistic_abstraction>
> doesn't really say what needs to be said. Perhaps you can flesh out
> that page, or use it as a starter for your own page/essay? Perhaps
> a toy problem domain, with an example of how to deal with it using
> each of the listed stages of abstraction, would help clear up the
> distinction between those various stages.

Well this is covered in depth in SICP:

Structure and Interpretation of Computer Programs
http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-4.html
http://swiss.csail.mit.edu/classes/6.001/abelson-sussman-lectures/
http://www.codepoetics.com/wiki/index.php?title=Topics:SICP_in_other_languages
http://eli.thegreenplace.net/category/programming/lisp/sicp/

Robert Maas, http://tinyurl.com/uh3t

Aug 30, 2008, 8:07:30 AM8/30/08

> From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
> Basically any language not explicitly called "assembler" and
> which can't directly code cpu "assembly instructions" is a HLL...

Wrong. The assembler for the IBM 1620 was called "S.P.S." which
stood for "Symbolic Programming System". The name "assembly
language" either was invented later or somehow bypassed the public
relations for the IBM 1620. I seem to remember about the same time
the IBM 7090 and 7094 used BAL as the name for their assembly
language. In fact I don't know of many assembly languages that
are actually called "assembly language". PDP-10 had MACRO and FAIL,
for example. MOS 6502 and Intel 8080 had CROSS, which ran on a
PDP-10 and cross-assembled for the 6502 or 8080.

> why do you need to pass a function to a function in C? You can
> call the function explicitly.

Only if you know the name of the function at the time you're
writing the source code for the function that is going to call it.
What if you want to write a generic function that can call some
other function for each element of an array, where *any* function
that takes the appropriate parameter type can be called, not just
one that is known at the time the generic function is compiled, and
then process all the return values in some particular way (which
you *do* know at the time you write the generic function)?
For example, in Lisp you might write:
  (defun caller-adder (fn list)
    (loop for el in list sum (funcall fn el)))
which adds all the return values together.
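
For comparison, a rough C counterpart of CALLER-ADDER can be sketched with a function pointer. This is only an illustration: the names caller_adder and square are made up here, and the sketch covers only functions from long to long that exist at compile time.

```c
#include <stddef.h>

/* Hypothetical C counterpart of CALLER-ADDER: applies fn to each
   element of list and returns the sum of the results. */
static long caller_adder(long (*fn)(long), const long *list, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += fn(list[i]);
    return sum;
}

/* Example function to pass in; any long -> long function would do. */
static long square(long x)
{
    return x * x;
}
```

With `long l[] = {2, 7, 5};`, the call `caller_adder(square, l, 3)` evaluates to 4 + 49 + 25 = 78.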

> Or, you can pass a pointer to the function and use the function
> pointer to call the function.

Only if the function you want to call was defined in the source
code and the place where you want to call it knows at source-code
time the names of all such functions you might want to pass.
But at least that covers some of the cases of importance.

But suppose the function you want to pass is just a variant of a
defined function, not the exact function itself. For example,
suppose you have a function of two parameters which computes the
first parameter to the power of the second parameter, like EXPT in
Lisp. Now suppose what you want to pass to CALLER-ADDER is that
function with the first parameter a constant 5 and the second
parameter remaining a parameter to take value from the list you're
traversing? In Lisp you can say:
  (setq l (list 2 7 5))
  (caller-adder #'(lambda (parm) (expt 5 parm)) l)
Note how easy it is in Lisp to "curry" a function by converting one
of its parameters to a constant. In C you just can't do such a
simple thing.
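
To make the contrast concrete: standard C has no lambda or closure syntax, and the workaround usually seen is to pass an explicit context pointer alongside the function pointer, so that the constant 5 travels as data. A minimal sketch, with every name (closure, caller_adder_ctx, power_of_base) invented for the example:

```c
#include <stddef.h>

/* A hand-rolled "closure": a function plus the data it closes over. */
struct closure {
    long (*fn)(void *env, long arg);
    void *env;
};

/* Sums the results of calling c on each element of list. */
static long caller_adder_ctx(struct closure c, const long *list, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += c.fn(c.env, list[i]);
    return sum;
}

/* Computes base^exponent by repeated multiplication (exponent >= 0).
   The base is read from env, playing the role of the fixed first
   parameter of EXPT. */
static long power_of_base(void *env, long exponent)
{
    long base = *(long *)env;
    long result = 1;
    while (exponent-- > 0)
        result *= base;
    return result;
}
```

Given `long five = 5; struct closure c = { power_of_base, &five };` and the list 2 7 5, `caller_adder_ctx(c, l, 3)` gives 25 + 78125 + 3125 = 81275. It works, but it takes a struct, a void pointer and a separately named function to say what the one-line lambda says.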

> I don't see where passing a C function to another function is of
> any value without some other language feature such as
> encapsulation.

OK, I'll win the argument very simply, by saying:
** C doesn't provide encapsulation **

> > ... What we call high level programming here is the ability to do:
> > (print (factorial 40))
> > and get:
> > 815915283247897734345611269596115894272000000000

> So, you'd call FORTH a high level language - since it can do this
> rather easily, but not C?

Show me the FORTH program that can do that correctly.
I have Pocket Forth on my Macintosh here, so I can check if you're bluffing.
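
For reference, the reason (print (factorial 40)) is not a one-liner in C is that no built-in integer type is wide enough for a 48-digit result, so the arithmetic has to be written by hand or taken from a library. A sketch using schoolbook multiplication on decimal digits; the function name factorial_decimal and the 256-digit working limit are assumptions of this example:

```c
#include <stddef.h>

/* Writes n! in decimal into out (NUL-terminated). Returns 0 on success,
   -1 if the result does not fit in the working buffer or in out
   (capacity cap bytes). Digits are kept least-significant first while
   multiplying, then written out in reverse. */
static int factorial_decimal(unsigned n, char *out, size_t cap)
{
    unsigned char digits[256] = {1};   /* starts as the number 1 */
    size_t len = 1;

    for (unsigned k = 2; k <= n; k++) {
        unsigned carry = 0;
        for (size_t i = 0; i < len; i++) {
            unsigned v = digits[i] * k + carry;
            digits[i] = (unsigned char)(v % 10);
            carry = v / 10;
        }
        while (carry > 0) {
            if (len == sizeof digits)
                return -1;
            digits[len++] = (unsigned char)(carry % 10);
            carry /= 10;
        }
    }
    if (len + 1 > cap)
        return -1;
    for (size_t i = 0; i < len; i++)
        out[i] = (char)('0' + digits[len - 1 - i]);
    out[len] = '\0';
    return 0;
}
```

Called as `factorial_decimal(40, buf, sizeof buf)` with a 64-byte buffer, it fills buf with the digits quoted above.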

Robert Maas, http://tinyurl.com/uh3t

Aug 30, 2008, 8:26:31 AM8/30/08

> From: p...@informatimago.com (Pascal J. Bourguignon)

> Yes, cpp macros are a barbaric low-level feature. It's actually
> unfortunate that the same word is used both for those and for lisp
> macros.

So stop using a single word to refer to them!! Lisp implements
parse-tree transformations. Cpp implements text-chunk
transformations. The former are much more structured than the
latter, hence much more useful for transforming source code prior
to compilation or other use. When you say C isn't as good as Lisp,
say it that way, instead of using the word "macro" for both.
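
A small illustration of what "text-chunk transformation" means in practice. The macro names are invented for the example:

```c
/* Cpp pastes the argument text in place; it knows nothing about
   expression structure. */
#define DOUBLE_NAIVE(x) x + x

/* The usual defence is to parenthesise everything by hand. */
#define DOUBLE_SAFE(x) ((x) + (x))
```

Here `3 * DOUBLE_NAIVE(2)` expands to the text `3 * 2 + 2`, which is 8, while `3 * DOUBLE_SAFE(2)` expands to `3 * ((2) + (2))`, which is 12. A parse-tree transformation cannot go wrong this way, because the argument arrives as a subtree rather than as characters to be re-parsed.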

As a test case for the C-lover here, try to implement Common Lisp's
newstyle (ANSI) LOOP special form using Cpp text-chunk macros. Feel
free to look at the Lisp source for CL's LOOP, and you still won't
be able to accomplish the task by means of Cpp text-chunk macros no
matter how many Cpp experts you hire and how much money you throw
at them. Your only real chance is to Greenspun a function that
reads source syntax and returns a parse tree, and then Greenspun
Lisp-style parse-tree transformations, and then re-package the C
compiler to work from a externally-supplied parse tree rather than
from a source file, and then write a new toplevel "compiler" for C
which watches for parse-tree macro definitions and usages at the
parse-tree point and calls the appropriate stuff to transform the
parse tree before execution. But then you're Greenspunning the Lisp
way, not using C style text-chunk "macros" at all, thereby proving
we were right after all, that C is crap, in fact that's what the
letter "C" stands for.

Marco van de Voort

Aug 30, 2008, 10:36:29 AM8/30/08

On 2008-08-30, Robert Maas, http://tinyurl.com/uh3t <jaycx2.3....@spamgourmet.com.remove> wrote:

>> > Class files and FASL files as native executable formats, then both
>> ...
>> > as native executable formats
>> Ah, but, that is the entire problem with your statements. These
>> aren't "native executable formats", i.e., binaries. They're
>> interpreted code.
>
> Only if you're running them on a machine (CPU) that doesn't support
> JVM code as part of its instruction set, i.e. most stock hardware
> (CPUs). For example, it should be possible to change the micro-code
> in a stock CPU so that it executed JVM instead of Intel x86 as its
> "machine language".
> Since the x86 instruction set is micro-coded in
> the first place, and *that* is considered native code when running
> on such a micro-coded machine (while it is considered non-native
> code when emulated on some other CPU type), then JVM code should
> just as naturally qualify as native code when running on a
> JVM-micro-coded CPU and non-native code when emulated on other
> CPUs.

Well, that is a bridge too far for me. True, x86 is microcoded, but it is a
thin layer over the "real" instruction set, with pretty much the same
concepts.

Java bytecode with its stack approach totally doesn't fit that regime, and I
have some doubts if the more recent Java bytecodes are still interpretable
efficiently at all, since they seem to have added a heap of features that
require compiling with a lot of context.

Richard Harter

Aug 30, 2008, 12:21:24 PM8/30/08

On Sat, 30 Aug 2008 01:31:27 -0700,
jaycx2.3....@spamgourmet.com.remove (Robert Maas,
http://tinyurl.com/uh3t) wrote:

>> > So, I ask, but I never hear what "higher level programming features" C
>> > doesn't have... ever.
>> From: p...@informatimago.com (Pascal J. Bourguignon)
>> - first class functions,
>
>More to the point: First class *anonymous* functions. If you have
>to declare the name of every function at compile time, it sorta
>defeats the idea of first-class functions. If you don't need to do
>that, but the runtime system needs to invent a "name" for each new
>function and make sure it doesn't duplicate the name of some other
>function previously invented in any thread whatsoever in the entire
>application, that would be utterly horrid a way to implement
>first-class functions, although the API for using it might "look"
>as if the functions were anonymous.

This isn't quite right. If you are going this way (the system
invents a name) and you have decent lexical scoping the name
will be local to the scope in which it is defined. Anything
outside the scope is a reference, and references are what is
passed.

The only time you need true anonymous functions is, AFAIK, if you
are doing lambda expressions on the fly, i.e., it is invoked
directly.

Richard Harter, c...@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Save the Earth now!!
It's the only planet with chocolate.

Robert Maas, http://tinyurl.com/uh3t

Aug 30, 2008, 7:33:27 PM8/30/08

> From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
> Garbage collection is not usually implemented in compiled languages.

I already answered this same incorrect statement elsewhere in this
thread, so I'll be brief here: Whether GC is useful, and whether
compilation is usually done, are unrelated issues. GC is useful
whenever you dynamically allocate memory and have an application
run long enough to exhaust available virtual-memory if it were not
for GC reclaiming some/most of that vm to thereby rescue the
application from aborting due to insufficient memory.

> I've seen no real C examples where these are needed

Regardless of the specific topic, this is an evasion. The point is
not whether C programmers create a contradiction by doing in C what
is impossible in C thereby creating a C example of something not
possible in C. The point is whether some useful algorithm
methodology is easily done in some other language but is not
possible in any decent way in C thereby proving C is deficient in
that respect. A non-C example is the only example possible here,
and requiring a C example is deliberate refusal to face reality.

> > * No compile-time polymorphism in the form of function or operator
> > overloading
> This is an object-oriented language feature. C is not object-oriented.

I see three different ways that overloading/polymorphism can be done:
- Compile-time overloading:
    define foo(int x) ...
    define foo(string x) ...
    string s = "Hello! World"
    foo(s) ;Compiles linkage to the second function defined above
    int i = 42
    foo(i) ;Compiles linkage to the first function defined above
- Runtime dispatch overloading:
    class animal inherits from class Object
      virtual define foo(x)
    class cat inherits from class animal
      define foo(x) ...
    class dog inherits from class animal
      define foo(x) ...
    animal y;
    y = new cat(...)
    foo(y) ;Compiles linkage to generic method foo in class animal
           ; but at runtime it's noticed that the actual class is cat,
           ; so the method cat.foo is actually called
    y = new dog(...)
    foo(y) ;Compiles linkage to generic method foo in class animal
           ; but at runtime it's noticed that the actual class is dog,
           ; so the method dog.foo is actually called
- Ad hoc dispatching within a single runtime function:
    define foo(x)
      case x
        int: ...
        string: ...
        else: error "unknown type of argument to foo, neither int nor string"

Because C doesn't carry type information at run time, neither of
the two runtime dispatching methods is even possible (except by
hackery such as defining a STRUCT which has a member explicitly
enumerating the possible intentional types and another member which
is overlaid differently for each possible type, then it's possible
to do the ad hoc kind of dispatching).
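
The struct hackery just described is usually called a tagged union. A minimal sketch of it, assuming a made-up struct value and a foo that returns the int itself or the length of the string:

```c
#include <string.h>

/* Tag naming the intentional type stored in the union. */
enum value_kind { VALUE_INT, VALUE_STRING };

struct value {
    enum value_kind kind;
    union {
        int i;
        const char *s;
    } as;
};

/* Ad hoc dispatch inside one function: returns the int itself, or the
   length of the string, or -1 for an unknown tag. */
static long foo(struct value v)
{
    switch (v.kind) {
    case VALUE_INT:
        return v.as.i;
    case VALUE_STRING:
        return (long)strlen(v.as.s);
    }
    return -1; /* unknown type of argument */
}
```

Nothing stops the programmer from setting the tag to VALUE_INT and then reading the string member; keeping the tag and the union in step is entirely manual.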

AFAIK C doesn't support the first way of dispatching, at compile
time, either.
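
That matches the C of the time (C89/C99), which had no user-level overloading. The later C11 standard added generic selection (`_Generic`), which picks an expression from the static type of its argument at compile time. A sketch assuming a C11 compiler; describe, describe_int and describe_string are names made up for this example:

```c
/* Two ordinary functions, one per argument type. */
static const char *describe_int(int x)
{
    (void)x;
    return "int";
}

static const char *describe_string(const char *s)
{
    (void)s;
    return "string";
}

/* C11 generic selection: the branch is chosen from the static type
   of x during compilation, not at run time. */
#define describe(x) _Generic((x),          \
        int: describe_int,                 \
        char *: describe_string,           \
        const char *: describe_string)(x)
```

With `int i = 42;` and `const char *s = "Hello! World";`, `describe(i)` calls describe_int and `describe(s)` calls describe_string. The set of types is fixed inside the macro, so this is closer to a compile-time case statement than to open-ended overloading.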

Lisp supports all three types of dispatching for all types of data.
Java supports all three types of dispatching only for subclasses of
class Object, not for primitive data types where only compile-time
dispatching is possible.

> C is not object-oriented.

And even worse, provides no way to easily implement OOP within C.
As soon as Lisp had lexical closures, it was possible to implement
OOP in Lisp in a natural way, long before CLOS was included as a
part of the standard language.

> > * No native support for networking
> True. OS dependent.

I agree this is and should be entirely the work of libraries rather
than the core language. So I won't fault C for this "lack",
providing that you (yes, I mean *you*) please show us where to find
decent libraries for doing networking directly from C, which must
be written in a natural style that is easy to understand, not using
extreme hackery to thread a camel through the eye of a needle by
stripping the camel to needle-eye-sized onion-thin layers then
re-assembling them on the other side. I already have found a
library for decoding URL-encoded form contents for purpose of CGI
applications.
<http://www.rawbw.com/~rem/HelloPlus/hellos.html#c3>
but I haven't really studied the source code to see how "nice" it
is. Maybe someday I'll look at it. Meanwhile, your job is to find
all the *other* useful C-network modules, such as SOAP, RMI/RPC,
etc.

> > * No standard libraries for computer graphics and several other
> > application programming needs
> True. OS dependent.

Now *that* is *not* a valid *excuse*!! Why hasn't somebody
abstracted the usual characteristics of various graphical display
devices, parameterized the differences, and created a set of
drivers for each common device that act through a standard API or
"interface", together with a toplevel service routine which
dispatches to the appropriate driver?

Can somebody familiar with CLIM please advise me how well Common
Lisp with CLIM satisfies my specification/desiderata above? I'd
like to be able to say that it's easy enough in Lisp that somebody
has done it nicely, hence CLIM, but it's so awfully difficult to do
in C that nobody has been able to afford the cost so it hasn't ever
been done.

Robert Maas, http://tinyurl.com/uh3t

Aug 30, 2008, 9:13:10 PM8/30/08

> From: gremnebulin <peterdjo...@yahoo.com>
> It's an attemp[t] to apply LISP-like behaviour --lambda and so on.

Lambda is a very specific notational convention that happens to be
used in Lisp because Church invented the Lambda calculus and
McCarthy liked it enough to propose using it as the basis for a
function-defining form within a list-processing language. But the
essence of having **some** syntax for expressing anonymous
functions doesn't in any way depend on the keyword lambda nor on
the structure involved in lambda forms. There's nothing Lisp-like
about the generic task of expressing anonymous functions and using
that expression to actually *create* an anonymous function. I'm
sure if you work at it you can think of five or ten different
non-Lisp-like notational ways to express an anonymous function,
any one of which *could* have been implemented as a way to actually
*create* anonymous functions.

Defining anonymous functions is very useful, so useful that Java
has a way of doing it shoehorned into nested/inner class
definitions. The syntax used by Java is quite ugly compared to
lambda expressions, but it proves that lambda expressions aren't
the *only* way to do it, so your excuse that Lambda is Lisp-like is
not a valid excuse for C not to provide *some* way to define
anonymous functions. For example, here's a C-like way to do it:
*function (int a,b) foo = { (int x,y) {if x<y return x; else return y; }}
which declares foo as a function-pointer (int a,b) variable whose
initial value is the anonymous function that returns the minimum of
two integers. The value of foo can then be passed to any other
variable declared of type *function (int a,b), and passed as an
argument to any function with a parameter of type *function (int
a,b), and returned as the result of any function whose value is
declared of type *function (int a,b).

Now that syntax is compile-time anonymous-function definition. But
if there's a way to define a named function at runtime, such as the
debugging tool somebody mentioned elsewhere in this thread, then
something like the above syntax could be used to extend that debug
tool to allow defining anonymous functions at runtime using that
debug tool.

> It's missing for all that it goes completely against the grain of
> the language.

I disagree. C allows defining named functions, and passing pointers
to such named function at runtime as if they were unnamed
functions, so why is it "against the grain" to define anonymous
functions in the first place and pass pointers to *them* around?
The code to pass a function pointer doesn't know or care what the
name of the function is nor whether it even has/had a name.
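
What standard C does allow today can be sketched as follows: each function must be named and defined at file scope, but after that its pointer can be stored in a variable, passed as an argument, and returned as a result. The names binary_int_fn, min_int, max_int and pick are invented for this example:

```c
/* Type of a pointer to a function taking two ints and returning int. */
typedef int (*binary_int_fn)(int a, int b);

static int min_int(int x, int y) { return x < y ? x : y; }
static int max_int(int x, int y) { return x > y ? x : y; }

/* Returns a function pointer chosen at run time. */
static binary_int_fn pick(int want_min)
{
    return want_min ? min_int : max_int;
}
```

After `binary_int_fn foo = pick(1);`, the call `foo(3, 9)` yields 3, and nothing at the call site depends on the name min_int.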

Accordingly C is deficient insofar as it does not provide a way to
define anonymous functions.

> Generic programming basically means being able to write algorithms
> with data types abstracted away or parameterised. Alexander Stepanov,
> author of the STL, argues that generic programming is different to and
> more useful than OOP.

The kind of generic programming you seem to be talking about would
seem to first require that any generic function require that each
of its parameters satisfy some particular "interface" (in the Java
sense) whereby certain methods are guaranteed to "work" for any
parameter that might get passed. For example, if you define a
"maximum-of-collection" generic, then it requires that the
collection itself provide some way of enumerating its members, and
the collection provide some comparator function such that any two
members of the collection can be compared. Then you can write the
generic maximum-of-collection method like this:
  Object collMax(ComparableEnumerableCollection c) {
    Boolean function(Object o1,o2) f = c.comparator();
    enumeration e = c.enumerator();
    Object mx = e.next();
    while e.hasMore() {
      Object el = e.next();
      if funcall(f,mx,el) then mx = el;
    }
    return mx;
  }
That's standard OOP, mostly Java syntax with a couple extensions.
That can be called by:
  MyDogCollection c1 = new MyDogCollection();
  c1.add(new MyDog(...));
  c1.add(new MyDog(...));
  c1.add(new MyDog(...));
  Object bestWhatever = collMax(c1);
  MyDog bestDog = null;
  if (bestWhatever instanceof MyDog) bestDog=(MyDog)bestWhatever;
  else throw(exception...Houston we've got a problem...);
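
The same shape can be sketched in C without classes, in the style of the standard library's qsort: elements are passed as untyped memory and the caller supplies the comparator. The names coll_max and cmp_int are assumptions of this example:

```c
#include <stddef.h>

/* Returns a pointer to the largest of n elements of the given size,
   or NULL if n is 0. cmp follows the qsort convention: negative,
   zero or positive when the first argument is less than, equal to
   or greater than the second. */
static const void *coll_max(const void *base, size_t n, size_t size,
                            int (*cmp)(const void *, const void *))
{
    const unsigned char *p = base;
    const void *mx = NULL;
    for (size_t i = 0; i < n; i++) {
        const void *el = p + i * size;
        if (mx == NULL || cmp(mx, el) < 0)
            mx = el;
    }
    return mx;
}

/* Comparator for int elements. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}
```

For `int v[] = {3, 17, 5};`, `*(const int *)coll_max(v, 3, sizeof v[0], cmp_int)` is 17. As with the Object return value above, the caller has to cast the result back to the real type, and the compiler cannot check that the cast is right.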

Now if you want instead to compile a specialized function for cases
where the type of collection (hence enumeration algorithm) and type
of element (hence a known canonical comparator for such elements)
are fixed at compile time, you change the syntax a little bit.
  Template ElementClass collMax(ComparableEnumerableCollection c) {
    Boolean function(Object o1,o2) f = c.comparator();
    enumeration e = c.enumerator();
    ElementClass mx = e.next();
    while e.hasMore() {
      ElementClass el = e.next();
      if funcall(f,mx,el) then mx = el;
    }
    return mx;
  }
  MyDog function(MySpecializedCollectionOfDogs) maxF1 =
    instantiateTemplate(collMax,MySpecializedCollectionOfDogs,MyDog);
  MyCat function(MySpecializedCollectionOfCats) maxF2 =
    instantiateTemplate(collMax,MySpecializedCollectionOfCats,MyCat);
  MyDogCollection c1 = new MyDogCollection();
  c1.add(new MyDog(...));
  c1.add(new MyDog(...));
  c1.add(new MyDog(...));
  MyDog bestDog = funcall(maxF1,c1);
  MyCatCollection c2 = new MyCatCollection();
  c2.add(new MyCat(...));
  c2.add(new MyCat(...));
  c2.add(new MyCat(...));
  MyCat bestCat = funcall(maxF2,c2);
Note because the values of the variables maxF1 and maxF2 are
specialized functions, the return values from funcalling them are
compile-time known to be of the desired type, so we don't need to
collect the return value in a more generic type of variable then
use instanceof to test whether it's safe to cast to the type we
really want.
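
In C, the nearest thing to instantiating a template is to generate one function per element type with the preprocessor. A sketch under that assumption; DEFINE_COLL_MAX, NUM_LESS and the generated function names are all invented here, and plain arrays stand in for the collection classes:

```c
#include <stddef.h>

/* Stamps out a type-specific maximum function. TYPE is the element
   type, NAME the function to define, LESS(a, b) an expression that
   is true when a orders before b. The array must be non-empty. */
#define DEFINE_COLL_MAX(TYPE, NAME, LESS)              \
    static TYPE NAME(const TYPE *items, size_t n)      \
    {                                                  \
        TYPE mx = items[0];                            \
        for (size_t i = 1; i < n; i++)                 \
            if (LESS(mx, items[i]))                    \
                mx = items[i];                         \
        return mx;                                     \
    }

#define NUM_LESS(a, b) ((a) < (b))

/* Two "instantiations", each with a statically known return type. */
DEFINE_COLL_MAX(int, max_int_of, NUM_LESS)
DEFINE_COLL_MAX(double, max_double_of, NUM_LESS)
```

Here `max_int_of` returns int and `max_double_of` returns double, so no cast or runtime type test is needed at the call site. The price is that the "template" is text substitution, with the usual macro hazards.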

Note that instantiateTemplate takes the following parameters:
- The template to be instantiated.
- The specific type of each parameter to the template, or NULL for
any template parameter that is to remain generic (i.e. currying
just some but not all of the template parameters)
- The specific type that the return value will be declared as,
replacing the template token ElementClass (or what other symbol
appeared in that position within the template definition).
Hmm, if this were to be shoehorned into Java, I guess we'd pass a
vector of the specific types of parameters instead of each type
separately, and in any case we'd probably swap the second and third
parameters.

Is the latter (ignoring the fact that I'm using Java-style OOP
classes to parameterize the data types) basically what you had in
mind? If so, can you think of a similarly simple/toy example that
doesn't require OOP classes? If my guess as to your meaning is
totally off base, can you enlighten me as to what you meant, with a
simple/toy example?

stan

Aug 31, 2008, 3:01:53 PM8/31/08

Robert Maas, http://tinyurl.com/uh3t wrote:
<snip>

> parse tree before execution. But then you're Greenspunning the Lisp
> way, not using C style text-chunk "macros" at all, thereby proving
> we were right after all, that C is crap, in fact that's what the
> letter "C" stands for.

How objective, unbiased, and likely to sway someone is the above? You've
made your opinion very clear. It's unfortunate that your biased
hyperbole prevents any real thought about comparative language issues.
Basically you keep claiming that C isn't Lisp and for that reason it
sucks. They are very different languages and they work very well, but
neither is without limitations or application domains where it is
inappropriate in practical ways. Real progress in programming won't come
from tweaking existing technology; it will have to address fundamental
issues. Buggy code and failed projects are common if not the norm, and
real progress will have to find better solutions than the existing ones.

I will ask a question about programming languages. I don't know if you
believe the numbers regarding programming tasks moving overseas or not;
reasonable people seem to disagree on the magnitude at least. Let me for
the moment speculate that "some" jobs are moving overseas. Do you see
Lisp having a commercial renaissance in the current environment? The
thread is about "modern" languages and I'm curious about the perceived
future of programming jobs and the skillsets that might be relevant. I
acknowledge that any answer must of course be speculation and my intent
is not to attempt to poke holes in guesses about the future; I'm
actually curious about what part lisp might play.

Marco van de Voort

Sep 1, 2008, 4:53:13 AM9/1/08

On 2008-08-30, Robert Maas, http://tinyurl.com/uh3t <jaycx2.3....@spamgourmet.com.remove> wrote:
>

> Our standards for many things (civil rights, cruel and unusual
> punishment, trans-national portability, and high-level programming
> languages) have changed since 1960. What was considered acceptable
> segregation of negros and caucasians in 1960 is no longer
> acceptable, in fact equal rights for public schooling and public
> transit and housing and employment are now considered mandatory.

True, but those things were changed in a hard struggle involving icons such as
Dr King and Nelson Mandela. They were redefined in a newsgroup post starting
with "I think".

So it will be interesting to see what kind of evidence you bring to the
table for these statements, other than your own perception.

Marco van de Voort

Sep 1, 2008, 6:14:47 AM9/1/08

On 2008-08-30, Robert Maas, http://tinyurl.com/uh3t <jaycx2.3....@spamgourmet.com.remove> wrote:

>> From: "Rod Pemberton" <do_not_h...@nohavenot.cmm>
>> Garbage collection is not usually implemented in compiled languages.
>
> I already answered this same incorrect statement elsewhere in this
> thread, so I'll be brief here: Whether GC is useful, and whether
> compilation is usually done, are unrelated issues. GC is useful
> whenever you dynamically allocate memory and have an application
> run long enough to exhaust available virtual-memory if it were not
> for GC reclaiming some/most of that vm to thereby rescue the
> application from aborting due to insufficient memory.
>
>> I've seen no real C examples where these are needed
>
> Regardless of the specific topic, this is an evasion.

So do you. You make a horribly brief case for why it is "useful", and then
proceed to declare it a universal requirement for a HLL.

A native string type is also useful, yet many HLLs don't have one.

Robert Maas, http://tinyurl.com/uh3t

Sep 29, 2008, 3:52:06 PM9/29/08

> >> Thanks to the features of lisp, you are not limited to the data
> >> abstraction and the procedural abstraction, you also have the
> >> syntactic abstraction (with lisp macros), and the
> >> metalinguistic abstraction (one step beyond macro).

REM> That's such a good point, mentioned earlier in this thread also,
REM> that IMO it ought to be written up on a permanent Web page, with
REM> peer review for English presentation, so that we can refer novices
REM> to it again and again. Would you have time to do that?

REM> <http://en.wikipedia.org/wiki/Metalinguistic_abstraction>
REM> doesn't really say what needs to be said. Perhaps you can flesh out
REM> that page, or use it as a starter for your own page/essay? Perhaps
REM> a toy problem domain, with an example of how to deal with it using
REM> each of the listed stages of abstraction, would help clear up the
REM> distinction between those various stages.

> From: p...@informatimago.com (Pascal J. Bourguignon)
> Well this is covered in depth in SICP:
> Structure and Interpretation of Computer Programs
> http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-4.html

The various kinds of expressions (each with its
associated evaluation rule) constitute the syntax of the programming
language.

That seems to use "syntax" in a nonstandard/incorrect way. To my
understanding of the word, the syntax of a programming language is
how the symbols are arranged in the source file (or directly
entered command), while the semantics are what is actually done
when the computer sees this syntax coming in. In Lisp this is most
clear, where *all* the special forms, as well as regular
recursively-evaluated forms, have exactly the same syntax, either
an atom, or (operator ...) where ... are the operands or tags all
of which have the same syntax as any other form. Thus in regard to
syntax, there's only one type of non-atomic form. It's only in
regard to semantics where they differ. In most other languages this
issue is confused because each different type of special form uses
its own unique syntax.
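
The point about uniform syntax can be sketched with a very small reader. This is only an illustration in Python, assuming a toy input language with parentheses and whitespace-separated atoms; the names `tokenize` and `read` are made up here and are not taken from any Lisp implementation. A special form and an ordinary call come out of the reader with exactly the same shape.

```python
def tokenize(text):
    """Split source text into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens):
    """Consume one form from the token list and return it."""
    token = tokens.pop(0)
    if token == "(":
        form = []
        while tokens[0] != ")":
            form.append(read(tokens))
        tokens.pop(0)  # drop the closing ")"
        return form
    return token  # an atom

# A special form and an ordinary call: same syntax, (operator operand ...).
special = read(tokenize("(if (< x 1) 1 x)"))
ordinary = read(tokenize("(max (- x 1) 1 x)"))
print(special)
print(ordinary)
```

The reader never needs to know that `if` is special; that distinction only shows up later, when something decides what to do with the form.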

We have identified in Lisp some of the elements that must appear in
any powerful programming language:
* Numbers and arithmetic operations are primitive data and
procedures.

Whether that's true or not depends very much on what you mean by
"number". To a mathematician, a number is *any* member of *any*
arithmetic system, which is a set with some operations which
satisfy some axioms for that type of arithmetic system.
Interpreting that in a software context, a lazy-evaluation
unlimited-precision interval-arithmetic value would be considered a
"number", but such a repeatedly-mutating object is most definitely
not primitive. (And in Smalltalk, numbers are full-fledged objects
that accept messages. And even in Lisp and Java, large integers are
not quite primitive.)
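
As a rough illustration of a "number" that is not primitive, here is a sketch of an interval-arithmetic value in Python. It is much simpler than the lazy, unlimited-precision object described above (it is eager, uses ordinary floats, and ignores rounding direction), and the class name `Interval` is an assumption made for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """A closed interval [lo, hi] that supports + and * like a number."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # The extremes of a product lie among the four endpoint products.
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

x = Interval(1.0, 2.0)
y = Interval(-1.0, 3.0)
print(x + y)   # Interval(lo=0.0, hi=5.0)
print(x * y)   # Interval(lo=-2.0, hi=6.0)
```

Arithmetic on such values satisfies the usual interface, yet each value is a compound object built from other numbers.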

* Nesting of combinations provides a means of combining operations.

Note that assigning a variable and later retrieving that value for
further use *also* provides a means of combining operations. (For
developing new code one line at a time, as well as for learning how
to program in the first place, variable assignment/retrieval is the
primary way to combine operations.)

(define (square x) (* x x))
[book-Z-G-D-16.gif] [book-Z-G-D-16.gif]
[book-Z-G-D-16.gif] [book-Z-G-D-16.gif]
[book-Z-G-D-16.gif] [book-Z-G-D-16.gif]
To square something, multiply it by itself.

I'm having trouble reading that.

* To apply a compound procedure to arguments, evaluate the body of
the procedure with each formal parameter replaced by the
corresponding argument.

That's not actually correct in the context where the author has
been talking mostly about syntax and what happens when that syntax
is evaluated. If you replace the not-yet-evaluated parameters
within the body by the result of evaluation, you don't get the
correct form for finishing the evaluation. Here's a counterexample:
(define (revappend x y) (append y x))
Now use the author's description of how to evaluate this form:
(revappend (list 1 5) (list 3 7))
First evaluate the sub-forms, yielding:
(list 1 5) => (1 5)
(list 3 7) => (3 7)
Now identify formal parameters matching those values:
x => (1 5)
y => (3 7)
Now replace each formal parameter by the corresponding argument:
(revappend (1 5) (3 7))
Do you see the problem here?

If only numbers are possible values, his description actually
works, but only because evaluating a number results in itself. It
doesn't hurt to feed numeric values back into the body of the
function definition to replace the formal parameters, because it
doesn't hurt to evaluate them an extra time.
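
The difference between numbers and lists under this reading can be reproduced with a toy evaluator. The following is a sketch in Python under stated assumptions: forms are Python lists, symbols are strings, and `evaluate`, `substitute` and `PRIMITIVES` are names invented for the example. It deliberately implements the reading criticized above (substitute already-evaluated values into the body, then evaluate the body again).

```python
def evaluate(form):
    """Numbers evaluate to themselves; a list is a call (operator operand ...)."""
    if isinstance(form, (int, float)):
        return form
    operator, *operands = form
    if not isinstance(operator, str) or operator not in PRIMITIVES:
        raise TypeError(f"not a function: {operator!r}")
    return PRIMITIVES[operator](*[evaluate(o) for o in operands])

PRIMITIVES = {
    "list": lambda *xs: list(xs),
    "append": lambda a, b: a + b,
    "*": lambda a, b: a * b,
}

def substitute(body, bindings):
    """Replace each parameter symbol in body by its already-evaluated value."""
    if isinstance(body, str):
        return bindings.get(body, body)
    if isinstance(body, list):
        return [substitute(part, bindings) for part in body]
    return body

# With numbers the extra evaluation is harmless.
print(evaluate(substitute(["*", "x", "x"], {"x": 7})))   # 49

# With list values, the substituted body is no longer a valid form.
x = evaluate(["list", 1, 5])
y = evaluate(["list", 3, 7])
try:
    evaluate(substitute(["append", "y", "x"], {"x": x, "y": y}))
except TypeError as err:
    print("re-evaluation failed:", err)
```

The second call fails because the substituted body contains the value (3 7) in a position where it gets evaluated as if it were a call with operator 3.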

Some other English is needed to express the *correct* thing to do.
Some books talk about evaluating the body of the function
definition with each parameter bound to the result of evaluating
the corresponding sub-expression. But that's confusing to
beginners. I would prefer a more low-level description, such as
calling the function while passing the results of evaluation to
that function, but some beginners don't see the difference between
that and what is said in this book. Maybe a diagrammatic
representation using lazy evaluation to resolve parts of it would
work. First you do a top-down decomposition of the expression to
yield a tree with an extra EVAL operator immediately above each
regular node. Then as sub-expressions actually do get evaluated,
the EVAL of sub-expression is replaced by the returned value
without any EVAL present. Actually the top down decomposition
should probably explicitly replace the EVAL of whole expression by
APPLY of operator to EVALs of each sub-form. Here's an example of
regular nested expression without any call to defined function:
Typed in: (cons (list 3 5) (list 1 4))
Parse tree with EVAL, using compact easy-to-edit dendrogram format:
EVAL--+--CONS
      +--+--LIST
      |  +--3
      |  `--5
      `--+--LIST
         +--1
         `--4
Resolving the EVAL yields:
APPLY--Primitive--CONS
+--EVAL--+--LIST
|        +--3
|        `--5
`--EVAL--+--LIST
         +--1
         `--4
Resolving the first inner EVAL yields:
APPLY--Primitive--CONS
+--APPLY--Primitive--LIST
|  +--EVAL--3
|  `--EVAL--5
`--EVAL--+--LIST
         +--1
         `--4
Resolving the next deep-inner EVAL yields:
APPLY--Primitive--CONS
+--APPLY--Primitive--LIST
|  +--3
|  `--EVAL--5
`--EVAL--+--LIST
         +--1
         `--4
Resolving the next deep-inner EVAL yields:
APPLY--Primitive--CONS
+--APPLY--Primitive--LIST
|  +--3
|  `--5
`--EVAL--+--LIST
         +--1
         `--4
Resolving the deep APPLY, by calling LIST, yields:
APPLY--Primitive--CONS
+--+--3
|  `--5
`--EVAL--+--LIST
         +--1
         `--4
Resolving the remaining inner EVAL yields:
APPLY--Primitive--CONS
+--+--3
|  `--5
`--APPLY--Primitive--LIST
   +--EVAL--1
   `--EVAL--4
Resolving the first inner EVAL yields:
APPLY--Primitive--CONS
+--+--3
|  `--5
`--APPLY--Primitive--LIST
   +--1
   `--EVAL--4
Resolving the next deep-inner EVAL yields:
APPLY--Primitive--CONS
+--+--3
|  `--5
`--APPLY--Primitive--LIST
   +--1
   `--4
Resolving the deep APPLY, by calling LIST, yields:
APPLY--Primitive--CONS
+--+--3
|  `--5
`--+--1
   `--4
Resolving the toplevel APPLY, by calling CONS, yields:
+--+--3
|  `--5
+--1
`--4
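
The EVAL/APPLY split traced above can also be written down as two small functions. This is a minimal sketch in Python, assuming forms are nested Python lists and that only the two primitives used in the trace exist; `evaluate`, `apply_primitive` and `PRIMITIVES` are illustrative names, not anybody's actual implementation.

```python
PRIMITIVES = {
    "list": lambda *items: list(items),
    "cons": lambda head, tail: [head] + tail,
}

def apply_primitive(name, values):
    """APPLY: call an already-resolved primitive on already-evaluated values."""
    return PRIMITIVES[name](*values)

def evaluate(form):
    """EVAL: numbers evaluate to themselves; for (op a b ...) each operand
    is evaluated left to right and the results are handed to APPLY."""
    if isinstance(form, (int, float)):
        return form
    operator, *operands = form
    values = [evaluate(operand) for operand in operands]
    return apply_primitive(operator, values)

result = evaluate(["cons", ["list", 3, 5], ["list", 1, 4]])
print(result)   # [[3, 5], 1, 4]
```

The values passed to `apply_primitive` are never evaluated again, which is the property the substitution wording loses.
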
So now I need to work out a similar way to deal with evaluating the
body of a function definition in a dynamic context in which the
formal parameters are treated as if they were the result of
evaluating the sub-forms within the calling expression. I'm
thinking a SUBST node would be exactly appropriate here, except
that's not actually correct because SUBST replaces *all*
occurrences of a symbol whereas we want *only* instances of such a
symbol where it's being evaluated. So really we need EVAL to take a
second parameter, which is the context, exactly as McCarthy wrote
it up originally it seems. I'll show just a few steps of the
overall process, in this example:
Typed in: (define (square x) (* x x))
Side-effect: GLOBALBINDINGS--`--FUNCTION--SQUARE
                                |  `--x
                                +--[*]
                                +--x
                                `--x
(In case you haven't figured it out yet, in dendrogram notation CONS
cells are denoted by "+", with CAR to right and CDR downward,
except if CDR is NIL whereupon "`" is used instead.)
(Note that a function definition has three primitive parts, the
function name (to the right), the parameter list (diagonally
down-right, using "`" to denote that direction of link), and the
body (directly down). This is an exceptional use of "`".)
(Note I put brackets around non-alphabetic symbol names, to
distinguish them from the notation I used for branch nodes in
dendrograms. This isn't too important for "*" above, but is crucial
for "+" below.)
Typed in: (square (+ 2 5))
Parse tree with EVAL, using compact easy-to-edit dendrogram format:
BIND-EVAL--+--SQUARE
|          `--+--[+]
|             +--2
|             `--5
`--FUNCTION--SQUARE
   |  `--x
   +--[*]
   +--x
   `--x
Resolving the toplevel BIND-EVAL yields:
BIND-APPLY--SQUARE
|    `--EVAL--+--[+]
|             +--2
|             `--5
`--FUNCTION--SQUARE
   |  `--x
   +--[*]
   +--x
   `--x
... Later after parameter to SQUARE has been evaluated fully:
BIND-APPLY--SQUARE
|    `--7
`--FUNCTION--SQUARE
   |  `--x
   +--[*]
   +--x
   `--x
Next the first key new step, fetching the function definition,
which no longer needs a name, so the name is no longer shown.
BIND-APPLY--FUNCTION
|    |      |  `--x
|    |      +--[*]
|    |      +--x
|    |      `--x
|    `--7
`--FUNCTION--SQUARE
   |  `--x
   +--[*]
   +--x
   `--x
Next the second key new step, pushing the binding onto the context:
BIND-APPLY-ANONYMOUSFUNCTION
|          |  `--x
|          +--[*]
|          +--x
|          `--x
+--VARIABLE--x
|  `--7
`--FUNCTION--SQUARE
   |  `--x
   +--[*]
   +--x
   `--x
Finally we can resolve the ..APPLY.. using the binding context:
BIND-EVAL--+--[*]
|          +--x
|          `--x
+--VARIABLE--x
|  `--7
`--FUNCTION--SQUARE
   |  `--x
   +--[*]
   +--x
   `--x
A little bit later we have:
APPLY--Primitive--[*]
+--BIND-EVAL--x
|  +--VARIABLE--x
|  |  `--7
|  `--FUNCTION--SQUARE
|     |  `--x
|     +--[*]
|     +--x
|     `--x
`--BIND-EVAL--x
   +--VARIABLE--x
   |  `--7
   `--FUNCTION--SQUARE
      |  `--x
      +--[*]
      +--x
      `--x
(Note there's only one copy of that large binding-context
structure, but two pointers to it. Perhaps I should adapt
print-circular notation here:)
APPLY--Primitive--[*]
+--BIND-EVAL--x
|  [#1]=+--VARIABLE--x
|       |  `--7
|       `--FUNCTION--SQUARE
|          |  `--x
|          +--[*]
|          +--x
|          `--x
`--BIND-EVAL--x
   [#1]
Now the key deepest step occurs here, resolution of BIND-EVAL of
a symbol, i.e. evaluation of a variable within a binding
context, which merely looks up the associated value within that
context structure. I'll do both of them at the same time here:
APPLY--Primitive--[*]
+--7
`--7
Y'all know the final step of course.
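
The same process, with EVAL taking the binding context as a second parameter, can be sketched in a few lines of Python. This is an illustration under assumptions: the context is a plain dictionary, a call extends the caller's context (as in the trace above, which is closer to dynamic than to lexical binding), and `Function`, `evaluate` and `apply_function` are names made up for the example.

```python
import operator

GLOBAL_ENV = {"+": operator.add, "*": operator.mul}

class Function:
    """A user-defined function: parameter names plus an unevaluated body."""
    def __init__(self, params, body):
        self.params, self.body = params, body

def evaluate(form, env):
    if isinstance(form, (int, float)):
        return form
    if isinstance(form, str):            # a variable: look it up in the context
        return env[form]
    if form[0] == "define":              # (define (name params...) body)
        (name, *params), body = form[1], form[2]
        env[name] = Function(params, body)
        return name
    function = evaluate(form[0], env)
    arguments = [evaluate(operand, env) for operand in form[1:]]
    return apply_function(function, arguments, env)

def apply_function(function, arguments, env):
    if callable(function):               # primitive such as + or *
        return function(*arguments)
    # Push the new bindings in front of the existing context.
    local_env = {**env, **dict(zip(function.params, arguments))}
    return evaluate(function.body, local_env)

evaluate(["define", ["square", "x"], ["*", "x", "x"]], GLOBAL_ENV)
print(evaluate(["square", ["+", 2, 5]], GLOBAL_ENV))   # 49
```

Nothing is ever substituted into the body; the symbol x is simply looked up in the context each time it is evaluated.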

... when we address
in chapter 3 the use of procedures with ``mutable data,'' we will
see that the substitution model breaks down and must be replaced
by a more complicated model of procedure application.^15

I suppose that covers their ass but really "mutable" is moot with
respect to the problem I described. Even if some immutable list
structure could be generated, still substituting that value
directly for the formal parameter would not yield the correct
result, for the same reason. EVAL, which takes a form and
recursively evaluates all sub-expressions of it, and APPLY, which
takes an already-resolved function name and already-evaluated
parameter values, must be distinguished. It's OK to start with a
simplified model, but I don't like starting with a flat-out wrong
model.

Description of COND:
... If none of the <p>'s is found to be true, the
value of the cond is undefined.
Nitpick: In every version of Lisp I've used (Stanford 1.5, MacLisp,
SL, PSL, MACL, CMUCL), COND returned NIL in such a case.
Does Scheme have a COND that does something else in that case?

Hey, see what happens when you tell me the URL for an entire book,
which somewhere deep inside has the answer to my question, instead
of showing me the direct answer to my question? I start browsing
the entire book, and find places to discuss before I ever find what
you wanted to show me. This could take weeks before I get to the
point of this thread... Oh well, this isn't a news conference where
reporters expect the famous person to answer questions directly,
it's a discussion forum, and side discussions are OK here, right?

Metalinguistic abstraction -- establishing new languages -- plays an
important role in all branches of engineering design.

Is that all you mean by the jargon? Most new programming languages
are first established using C, hence C is best at metalinguistic
abstraction? I really don't agree!!

It is
particularly important to computer programming, because in programming
not only can we formulate new languages but we can also implement
these languages by constructing evaluators. An evaluator (or
interpreter) for a programming language is a procedure that, when
applied to an expression of the language, performs the actions
required to evaluate that expression.
It is no exaggeration to regard this as the most fundamental idea in
programming:
The evaluator, which determines the meaning of expressions in a
programming language, is just another program.

Note the term "evaluator" is used loosely here. For most
programming languages, the term really means an aspect of the whole
compile-link-load-run process. Thus you write an entire program,
which contains many expressions, each in some context within the
program, and the compiled code which eventually gets run implements
the effective evaluation of each such expression in the appropriate
context. Thus the expression "x = foo(y,z)" might occur twice in a
program, in very different contexts (for example, in one place x is
a global variable, in the other place x is a local/lexical variable
which shadows the global variable by the same name), hence
resulting in actions which are different. In OOP such as Java, the
expression "obj1.add(obj2)" might have totally different actions
depending on what class the value of obj1 belongs to. But the point
is that there's nothing that individually takes an expression and
evaluates it, all by itself, except in Lisp and a few other
languages that have true read-eval-print loops where EVAL does the
interpretation of individual expressions.
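
A small sketch of the point that one source expression can denote different actions depending on context. It is written in Python rather than Java, and the classes `Money` and `Playlist` are invented purely for illustration.

```python
class Money:
    def __init__(self, cents):
        self.cents = cents

    def add(self, other):
        return Money(self.cents + other.cents)

class Playlist:
    def __init__(self, tracks):
        self.tracks = list(tracks)

    def add(self, other):
        return Playlist(self.tracks + other.tracks)

def combine(obj1, obj2):
    # One source expression; the action taken depends on obj1's class.
    return obj1.add(obj2)

total = combine(Money(150), Money(275))
merged = combine(Playlist(["a"]), Playlist(["b", "c"]))
print(total.cents)     # 425
print(merged.tracks)   # ['a', 'b', 'c']
```

The expression `obj1.add(obj2)` appears once in the source, yet it performs integer addition in one call and list concatenation in the other.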

However the general point the author made is very good. Note that
each CPU has a "machine language" which is only a language in the
internal sense, that certain bytes or words of memory when
processed by the CPU are interpreted in a CPU-specific way. In a
sense Lisp is like that, where it's what's inside the computer, the
pointy structures, not any print syntax, that determines what
actions are taken. Thus there is a subtle distinction between the
kind of language he seems to be talking about, something you type
into the computer to effect some actions via an interpreter, and
something already inside a computer per which the interpreter
actually takes actions.

In section 4.4 we implement a logic-programming language in which
knowledge is expressed in terms of relations, rather than in terms of
computations with inputs and outputs.

Is this what somebody else in one of these newsgroups meant by that
term? Is this the same as a relational database with queries (such
as via SQL), or is something else meant by "relations"?

Going to that section now, reading discussion of unification and
pattern matching etc.:
Contemporary logic programming languages (including the one we
implement here) have substantial deficiencies, in that their general
``how to'' methods can lead them into spurious infinite loops or other
undesirable behavior. Logic programming is an active field of research
in computer science.^61

Hmm, it sounds like logic programming is "not ready for prime
time", i.e. it's not a ready-made solution to a lot of problems,
and I shouldn't be switching over to using it for D/P algorithms,
and I can pretty much ignore a certain person in these newsgroups
who is constantly bragging about how his favorite language
implements "pattern matching" and "unification" and "logic
programming" hence is better than Lisp.

Now we will apply these ideas to discuss
an interpreter for a logic programming language. We call this language
the query language, because it is very useful for retrieving
information from data bases by formulating queries, or questions,
expressed in the language.

Hmm, maybe I guessed correctly that logic programming is like RDBMS+SQL.
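
To make the comparison with database queries concrete, here is a sketch of the simplest part of such a query language: matching a pattern containing variables against stored facts. It is Python, it does one-way pattern matching only (no rules and no full unification), and the facts and the names `match` and `query` are made up for the example rather than taken from the book.

```python
FACTS = [
    ("job", "alice", "programmer"),
    ("job", "bob", "technician"),
    ("supervisor", "alice", "carol"),
]

def match(pattern, fact, bindings):
    """Return extended bindings if fact fits pattern, else None.
    Pattern items starting with '?' are variables."""
    if len(pattern) != len(fact):
        return None
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # A variable must keep the same value everywhere in the pattern.
            if result.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return result

def query(pattern):
    """Collect the bindings for every fact that matches the pattern."""
    return [b for fact in FACTS if (b := match(pattern, fact, {})) is not None]

print(query(("job", "?who", "programmer")))   # [{'?who': 'alice'}]
```

This much does resemble a SELECT with a WHERE clause; what a logic language adds on top is rules that derive new facts from existing ones.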

> http://swiss.csail.mit.edu/classes/6.001/abelson-sussman-lectures/

This consists of downloads of immensely large files, each single
file significantly larger than my *total* allocation on my Unix
shell account, hence totally unusable by me. Never mind that I'm
already at my maximum disk allocation and have essentially
no free space to download anything at all here.

> http://www.codepoetics.com/wiki/index.php?title=Topics:SICP_in_other_languages

Hmm, the primitives and fundamental mechanisms of lots of different
programming languages as needed for the examples in the SICP book
as translated to those other programming languages. This may be
useful for my plan to organize software of multiple programming
languages per intentional data type. In particular Forth has no
declared data type whatsoever, *all* datatypes are intentional with
the exception that 8-bit 16-bit and 32-bit values are distinguished
from each other. (Some Forths are alleged to also support a
separate stack of explicitly floating-point values, which may be
considered yet another internal datatype. But because this is just
*some* implementations, not part of any standard, I'll choose to
ignore this possibility.) Thus Forth might be the best case for my
thesis that intentional data types are paramount. If I have some
time/energy/interest, I may experiment with Pocket Forth on my Mac
to see how well it agrees with the Forth standard described in the
Forth section of that Web site. (But since Forth isn't available on
my ISP, I won't be able to include Forth in my Hello-CGI tutorial.)
Meanwhile the sections for other languages I have in my Hello-CGI
tutorial may be useful for finishing my multi-language
CookBook/Matrix, reorganized per intentional datatypes, again
*if/when* I find time/energy/interest in getting on with that
project.
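
What "intentional" typing means for a raw cell can be shown without Forth. The sketch below uses Python's `struct` module and assumes little-endian layout; the same four bytes are read as one 32-bit value, as two 16-bit values, or as characters, and nothing in the bytes themselves says which reading is intended.

```python
import struct

cell = bytes([0x48, 0x65, 0x6C, 0x6F])   # the same four bytes throughout

as_32bit = struct.unpack("<I", cell)[0]   # one unsigned 32-bit value
as_16bit = struct.unpack("<HH", cell)     # two unsigned 16-bit values
as_text = cell.decode("ascii")            # four character codes

print(hex(as_32bit))                 # 0x6f6c6548
print([hex(v) for v in as_16bit])    # ['0x6548', '0x6f6c']
print(as_text)                       # Helo
```

Only the width of the access is fixed by the operation used; the meaning of the bits is entirely in the programmer's intention.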

> http://eli.thegreenplace.net/category/programming/lisp/sicp/

That's not a Web site/page, it's a downloadable application/x-none,
which I have no use for, in fact with such a cryptic application
type I don't see how *anyone* could have any use for it.

Pascal J. Bourguignon

Sep 29, 2008, 6:34:13 PM

seeWeb...@teh.intarweb.org (Robert Maas, http://tinyurl.com/uh3t) writes:

> (define (square x) (* x x))
> [book-Z-G-D-16.gif] [book-Z-G-D-16.gif]
> [book-Z-G-D-16.gif] [book-Z-G-D-16.gif]
> [book-Z-G-D-16.gif] [book-Z-G-D-16.gif]
> To square something, multiply it by itself.
>
> I'm having trouble reading that.

It would work better with a graphical browser.

> * To apply a compound procedure to arguments, evaluate the body of
> the procedure with each formal parameter replaced by the
> corresponding argument.

> That's not actually correct in the context where the author has
> been talking mostly about syntax and what happens when that syntax
> is evaluated. If you replace the not-yet-evaluated parameters
> within the body by the result of evaluation, you don't get the
> correct form for finishing the evaluation. Here's a counterexample:
> (define (revappend x y) (append y x))
> Now use the author's description of how to evaluate this form:
> (revappend (list 1 5) (list 3 7))
> First evaluate the sub-forms, yielding:
> (list 1 5) => (1 5)
> (list 3 7) => (3 7)

No. This is not what they said.

This is in the context of The Substitution Model for Procedure Application
http://mitpress.mit.edu/sicp/full-text/sicp/book/node10.html

What the above quotation means is:

(defstruct procedure
  parameters
  body)

(defun eval-compound-procedure (proc arguments)
  (further-evaluate (subst-each arguments
                                (procedure-parameters proc)
                                (procedure-body proc))))

(defun further-evaluate (form) form)

;; Replace each formal parameter in FORM by the corresponding
;; (unevaluated) argument form. Note that SUBST takes (new old tree).
(defun subst-each (args params form)
  (if (null args)
      form
      (subst-each (rest args) (rest params)
                  (subst (first args) (first params) form))))

(defparameter *definitions* (make-hash-table))

(defmacro define (left right)
  (cond
    ((listp left)
     (destructuring-bind (name &rest parameters) left
       `(progn
          (setf (gethash ',name *definitions*)
                (make-procedure :parameters ',parameters
                                :body ',right))
          (define-symbol-macro ,name (gethash ',name *definitions*)))))
    ((and (listp right) (eq 'lambda (first right)))
     `(progn
        (setf (gethash ',left *definitions*)
              (make-procedure :parameters ',(second right)
                              :body ',(third right)))
        (define-symbol-macro ,left (gethash ',left *definitions*))))
    (t `(progn
          (setf (gethash ',left *definitions*) ',right)
          (define-symbol-macro ,left (gethash ',left *definitions*))))))

(define (revappend x y) (append y x))

(eval-compound-procedure revappend '(x y))
--> (APPEND Y X)

(define (fact x) (if (< x 1) 1 (* x (fact (- x 1)))))

(eval-compound-procedure fact '((+ 40 2)))
--> (IF (< (+ 40 2) 1) 1 (* (+ 40 2) (FACT (- (+ 40 2) 1))))

> Now identify formal parameters matching those values:
> x => (1 5)
> y => (3 7)
> Now replace each formal parameter by the corresponding argument:
> (revappend (1 5) (3 7))
> Do you see the problem here?

You are not applying the substitution model.

Don't be in a hurry, take your time to study that book!

> Description of COND:
> ... If none of the <p>'s is found to be true, the
> value of the cond is undefined.
> Nitpick: In every version of Lisp I've used (Stanford 1.5, MacLisp,
> SL, PSL, MACL, CMUCL), COND returned NIL in such a case.
> Does Scheme have a COND that does something else in that case?

Yes.
Some will return #f, some will return #<undefined>, or something else.

>> http://swiss.csail.mit.edu/classes/6.001/abelson-sussman-lectures/
>
> This consists of downloads of immensely large files, each single
> file significantly larger than my *total* allocation on my Unix
> shell account, hence totally unusable by me. Never mind that I'm
> already at my maximum disk allocation and have essentially
> no free space to download anything at all here.

These are videos of the lectures. You could watch them at a cybercafe
or at a friend's. Any $200 new computer is able to display them. In
total, that's more than 20 hours of lectures.

>> http://eli.thegreenplace.net/category/programming/lisp/sicp/
>
> That's not a Web site/page, it's a downloadable application/x-none,
> which I have no use for, in fact with such a cryptic application
> type I don't see how *anyone* could have any use for it.

You really need to come join us in the 21st century!

[pjb@hubble ~]$ wget --save-headers http://eli.thegreenplace.net/category/programming/lisp/sicp/
--2008-09-30 00:02:57-- http://eli.thegreenplace.net/category/programming/lisp/sicp/
Resolving eli.thegreenplace.net... 69.89.22.107
Connecting to eli.thegreenplace.net|69.89.22.107|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `index.html'

0K ... 60.4K=0.5s

2008-09-30 00:03:00 (60.4 KB/s) - `index.html' saved [32624]

Converting index.html... 1-0
Converted 1 files in 0.02 seconds.
[pjb@hubble ~]$ head -20 index.html
HTTP/1.1 200 OK
Date: Mon, 29 Sep 2008 22:02:56 GMT
Server: Apache/2.2.9 (Unix) mod_ssl/2.2.9 OpenSSL/0.9.8i DAV/2 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
X-Powered-By: PHP/4.4.9
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
X-Pingback: http://eli.thegreenplace.net/xmlrpc.php
Content-Encoding: none
Set-Cookie: PHPSESSID=774a86cd9a5389de2dac2a769fc44c9c; path=/
Connection: close
Content-Type: text/html; charset=UTF-8

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

<title>Eli Bendersky’s website » SICP</title>

--
__Pascal Bourguignon__ http://www.informatimago.com/

In a World without Walls and Fences,
who needs Windows and Gates?

Robert Maas, http://tinyurl.com/uh3t

Sep 30, 2008, 2:47:07 AM

> From: p...@informatimago.com (Pascal J. Bourguignon)

> Strings need to be in the language, if you want SYMBOL-NAME and INTERN.

Only if you require SYMBOL-NAME to return an actual string object,
instead of a list or array of integers representing Unicode (or
subset thereof such as USASCII) encoding of characters. MacLisp
didn't have strings, but it had pnames, and it was possible to read
out a pname as a list of US-ASCII character-code integers.

Only if you require INTERN to take an actual string object as
parameter, instead of a list or array of integers representing
Unicode (or subset thereof such as USASCII) encoding of characters.
MacLisp didn't have strings, but it had a way to INTERN something
to make a symbol by that name.
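
A sketch of symbols whose print names are sequences of character codes rather than string objects. This is Python and entirely hypothetical: `Symbol`, `intern`, `symbol_name` and the `_OBLIST` table are names invented here, and the code does not claim to mirror how MacLisp stored pnames.

```python
class Symbol:
    def __init__(self, pname):
        self.pname = pname          # tuple of character codes, not a string

_OBLIST = {}                        # hypothetical global symbol table

def intern(codes):
    """Return the unique symbol whose print name is this code sequence."""
    key = tuple(codes)
    if key not in _OBLIST:
        _OBLIST[key] = Symbol(key)
    return _OBLIST[key]

def symbol_name(symbol):
    """Read the print name back out as a list of character codes."""
    return list(symbol.pname)

car1 = intern([67, 65, 82])         # the codes for C, A, R
car2 = intern([67, 65, 82])
print(car1 is car2)                 # True
print(symbol_name(car1))            # [67, 65, 82]
```

Both operations work without a string type anywhere in their interface, which is all the argument above needs.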

> For "string literals" you need reader macros, or have them in the language.

Yes. But string literals aren't essential. They are just super
convenient, compared to list-of-USASCII-character-codes.

> Lisp only has symbol, integers, rationals and floating-point
> numbers built-in in its scanner; all the rest is user modifiable
> reader macros).

I think that statement proves my point that strings aren't
*essential* in the deep guts of a programming language. They can be
added later if anybody thinks they'll be useful enough. Now Common
Lisp has strings as an internal data type, so the
user-modifiable reader macro for string-literal syntax can simply
call (MAKE-STRING <length>) and then copy all the character values
into that string one by one. (Or can use MAP to convert a list of
characters or vector of integers into a string. Lots of ways to do
the same task in Common Lisp, and I don't know what CMUCL or any
other particular CL implementation actually does, and that info
isn't essential for this discussion.) But in a version of Lisp that
doesn't have an internal string datatype, a reader macro could just
as well generate some other internal datatype which can be treated
as an "intentional" data type, or if OOP is available in that
dialect of Lisp then a String class of object could be defined and
used for this purpose.

Note that C doesn't have an internal String datatype at all, and it
only halfway has a character datatype. Then C has a pointer to type
char datatype which can be used as if it were a pointer to a string
in some contexts such as array of type char. Thus C has strings as
an *intentional* data type that doesn't have all the semantics that
a first class string object would have. In particular two different
strings can share tails, very much like the intentional linked-list
data type in Lisp, based on use of the internal CONS datatype, can.
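
Tail sharing of NUL-terminated strings can be imitated outside C. The sketch below is Python, using a `bytearray` as the memory and `memoryview` slices in the role of `char *` pointers; the helper `c_string` is a made-up name, and the analogy is only approximate since a memoryview carries a length where a C pointer does not.

```python
buffer = bytearray(b"hello world\0")
whole = memoryview(buffer)          # plays the role of char *p = buffer
tail = whole[6:]                    # plays the role of p + 6, same storage

def c_string(view):
    """Read bytes up to (not including) the terminating NUL."""
    data = bytes(view)
    return data[:data.index(0)].decode("ascii")

print(c_string(whole))   # hello world
print(c_string(tail))    # world

buffer[6] = ord("W")     # one write, visible through both "pointers"
print(c_string(tail))    # World
print(c_string(whole))   # hello World
```

Two "strings" here are just two starting positions into the same bytes, which is why neither has the identity a first-class string object would have.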

Robert Maas, http://tinyurl.com/uh3t

Sep 30, 2008, 3:30:09 AM

> > 100% of the CPU time is *always* used, the only question is what
> > it's used for. If there's nothing useful for a CPU to do, it runs
> > an idle loop.

> From: p...@informatimago.com (Pascal J. Bourguignon)

> This is not true anymore. On modern processors, the CPU can
> pause, entering in a state where it consumes much less energy.
> This is of vital importance for laptop computers, but it is also a
> good marketing point for desktops (and even servers, they're not
> all busy 100% of the time).

Ah, that slipped my mind because I've never owned one of those. I
stand corrected. For such a CPU, the cost situation changes. If the
CPU directly drives the video for the GUI screen, then presumably
this sleep mode is allowed only when the user isn't wanting to look
at the screen (or when the system mistakenly believes such and the
user must jiggle the mouse or somesuch to wake up the CPU to drive
the screen again). Thus during *any* active interactive session
what I said originally would apply, that the CPU must sit there in
an idle loop most of the time, taking interrupts to drive the screen
every once in a while (I'm guessing a few hundred or thousand times
a second, maybe five or ten or twenty instructions per interrupt,
on a CPU that's capable of executing hundreds of millions of
instructions per second, hence much less than 1% of the CPU time
driving the screen, the rest in the idle loop). On the other hand,
if the screen doesn't require frequent CPU service, then the CPU
might actually be able to go to sleep even while the screen is
visible to the user, so then your correction might apply. Of course
when the user has gone away from the computer long enough to black
the screen, then your correction definitely applies.

So if the CPU must remain awake, in an idle loop most of the time
during any interactive session, and if the user typically is at the
computer actively viewing the screen several hours per day, then at
least during those hours my original analysis applies: the CPU
might as well be doing something useful instead of that idle loop,
if it isn't a major hassle setting up useful tasks to be done.
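
The difference between spinning in a loop and actually yielding the processor can be observed at the level of a single process. This Python sketch is only an analogy (it measures CPU time charged to one process, not hardware sleep states), and `cpu_used` and `busy_wait` are names made up for it.

```python
import time

def cpu_used(wait, seconds=0.2):
    """Return the CPU time this process consumes while waiting."""
    start = time.process_time()
    wait(seconds)
    return time.process_time() - start

def busy_wait(seconds):
    """Idle-loop style: keep checking the clock until the time has passed."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass

busy = cpu_used(busy_wait)      # spins, so it is charged CPU time
idle = cpu_used(time.sleep)     # blocks, so it is charged almost none
print(f"busy loop: {busy:.3f}s CPU, sleep: {idle:.3f}s CPU")
```

The exact figures depend on the machine and its load, so none are quoted here; the busy loop should be charged roughly the full waiting time and the sleep close to nothing.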

> http://softwareblogs.intel.com/2007/01/10/all-about-system-power-states-s0-s5/

Aha, several sleep states plus a hibernate state that differs from
shut-down state only in that the old state is resumed rather than
re-booting, because live-system state is written to disk to be used
later for re-awakening. Thanks for the URL.

Unanswered question: What decides which of the several sleep states
will be entered at what times? (I assume hibernate state is invoked
by explicit user action, whereas various sleep states are invoked
by system noticing some sort of inactive condition caused by lack
of user interaction as well as lack of any running application
except network listeners? What about 'cron' and other timed events?
Is there a special hardware clock set to wake up the CPU when the
next timed event occurs?)

Robert Maas, http://tinyurl.com/uh3t

Sep 30, 2008, 3:38:31 AM

> > For example, it should be possible to change the micro-code
> > in a stock CPU so that it executed JVM instead of Intel x86 as its
> > "machine language".
> > Since the x86 instruction set is micro-coded in
> > the first place, and *that* is considered native code when running
> > on such a micro-coded machine (while it is considered non-native
> > code when emulated on some other CPU type), then JVM code should
> > just as naturally qualify as native code when running on a
> > JVM-micro-coded CPU and non-native code when emulated on other
> > CPUs.

> From: Marco van de Voort <mar...@stack.nl>
> Well, that is a bridge too far for me. True, x86 is microcoded, but it is a
> thin layer over the "real" instruction set, with pretty much the same
> concepts.

Oh, OK. Is there enough micro-code memory available that in theory
a really serious level of microcode could be written, no longer a
"thin layer" with "pretty much the same concepts", or is the amount
of microcode memory available so tiny that only a "thin layer" is
really possible?

> Java bytecode with its stack approach totally doesn't fit that
> regime, and I have some doubts if the more recent Java bytecodes
> are still interpretable efficiently at all, since they seem to
> have added a heap of features that require compiling with a lot
> of context.

OK, I accept that my harebrained idea probably isn't feasible on
the current Intel CPU chip.
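
For readers unfamiliar with the "stack approach" mentioned in the quoted text, here is a toy stack-machine interpreter in Python. The instruction names and tuple encoding are invented for this sketch and are not the real JVM instruction set or class-file format; it only shows operands being pushed and popped instead of living in named registers.

```python
def run(bytecode):
    """Interpret a list of (op, *args) tuples on an operand stack."""
    stack = []
    for op, *args in bytecode:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack.pop()

# (2 + 5) * 3 with no named registers, only pushes and pops.
program = [("push", 2), ("push", 5), ("add",), ("push", 3), ("mul",)]
print(run(program))   # 21
```

Every arithmetic instruction implicitly addresses the top of the stack, which is the mismatch with a register-oriented instruction set that the quoted objection is about.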

Robert Maas, http://tinyurl.com/uh3t

Sep 30, 2008, 4:59:13 AM

> From: c...@tiac.net (Richard Harter)

> The only time you need true anonymous functions is, AFAIK, if you
> are doing lambda expressions on the fly, i.e., it is invoked
> directly.

You seem to be confusing two different issues:
- Whether there are anonymous functions created on the fly.
- Whether lambda, or some other notation, is used to represent them.
I suspect you are merely being sloppy in your language, rather than
actually getting the two concepts mixed up in your mind. But you
need to be more clear what you intend to say if you want me to
understand. In Lisp it's extremely common to create anonymous
functions on the fly in an interactive environment, and to create
those same functions in source code. It's not quite so common, but
still useful, to create brand-new anonymous functions on the fly in
running compiled object code.

Note that while it's not absolutely necessary to use anonymous
functions in the usual cases, which are typically functions passed
to MAPCAR and the like for the purpose of acting as an interface
between the parameters passed by MAPCAR and the parameters needed
by the library function called, it would be quite a nuisance to
need to specify a name for each of them, as well as a possible
source of bugs if you aren't careful and accidentally give two of
them the same name. And having the system automatically make up
random names for each of them could cause another class of bugs.
(mapcar #'(lambda (x) (gethash x ht)) values)
would you really like if you instead needed to say
(defun dskgfaok (x) (gethash x ht))
(mapcar #'dskgfaok values)
or if the system secretly did something like that?
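
For comparison, the same contrast in Python, as a sketch only: `ht` and `values` stand in for the hash table and list of keys above, and `dict.get` is used because, like GETHASH, it returns an empty value rather than failing on a missing key.

```python
ht = {"a": 1, "b": 2, "c": 3}
values = ["c", "a"]

# Anonymous adapter, written at the point where it is used.
print(list(map(lambda x: ht.get(x), values)))   # [3, 1]

# The named alternative: same behaviour, but a name has to be invented
# and kept from clashing with every other such helper.
def dskgfaok(x):
    return ht.get(x)

print(list(map(dskgfaok, values)))              # [3, 1]
```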

Now consider a "factory" that makes functions.
(defun ht-make-lookup-function (ht)
  #'(lambda (x) (gethash x ht)))
So in an application that builds tens of thousands of hash tables,
which reside inside a huge data structure, you can build a separate
lookup function for each table, and store those lookup functions in
another huge data structure, and do all sorts of applications of
such hash functions to appropriate data values. Would you really
like a programming environment that required your "factory" to make
up a different name for each of them? I wouldn't, although I could
hack it if I really needed to.
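
For readers who don't read Lisp, the same factory idea can be sketched
in Python. This is only an illustration: `make_lookup` and the sample
tables are made-up names, not part of any real application. Each call
returns a fresh anonymous function closed over its own table, and none
of them ever needs a name of its own.

```python
def make_lookup(table):
    """Return an anonymous lookup function bound to one particular table."""
    # Much as GETHASH yields NIL for a missing key, dict.get yields None.
    return lambda key: table.get(key)

# Build many tables and one lookup function per table; no names are invented.
tables = [{"id": n, "square": n * n} for n in range(10000)]
lookups = [make_lookup(t) for t in tables]

print(lookups[7]("square"))    # 49
print(lookups[7]("missing"))   # None
```

The list `lookups` plays the role of the "other huge data structure":
ten thousand distinct functions, all anonymous.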

> Save the Earth now!!
> It's the only planet with chocolate.

Chocolate isn't a well-defined type of organization of matter,
where there's a precise definition of what is and what is not chocolate.
Depending on your definition, planet 3548176 in sector 345 of
galaxy NGC 1232 might or might not have chocolate.
There might be thousands of uncharted galaxies, three of which are
in the Hubble Deep Field but too faint to see there, the rest
elsewhere, which have sorta-chocolate where our Earthly definition
is stretched. (And even on Earth, some companies that produce and
sell "chocolate" are not regarded as producing "true" chocolate by
other companies. But that's irrelevant to your point, because
at least *some* of those Earthly companies do indeed produce
something we can all agree is chocolate. Nestles is one of them.
I have a craving for their semi-sweet morsels/chips at the moment.)

Nick Keighley

Sep 30, 2008, 5:02:20 AM

On 30 Sep, 08:38, seeWebInst...@teh.intarweb.org (Robert Maas,
http://tinyurl.com/uh3t) wrote:

> > > For example, it should be possible to change the micro-code
> > > in a stock CPU so that it executed JVM instead of Intel x86 as its
> > > "machine language".
> > > Since the x86 instruction set is micro-coded in
> > > the first place, and *that* is considered native code when running
> > > on such a micro-coded machine (while it is considered non-native
> > > code when emulated on some other CPU type), then JVM code should
> > > just as naturally qualify as native code when running on a
> > > JVM-micro-coded CPU and non-native code when emulated on other
> > > CPUs.

someone top-posted

> > From: Marco van de Voort <mar...@stack.nl>

> > Well, that is a bridge too far for me. True, x86 is microcoded, but it is a
> > thin layer over the "real" instruction set, with pretty much the same
> > concepts.
>
> Oh, OK. Is there enough micro-code memory available that in theory
> a really serious level of microcode could be written, no longer a
> "thin layer" with "pretty much the same concepts", or is the amount
> of microcode memory available so tiny that only a "thin layer" is
> really possible?

I believe there is a RISC core inside a pentium and this is
micro-coded so the processor appears to be a pentium. I also
believe the RISC core was specifically designed to be good at this
(which would make sense) hence there is minimal micro-code
and not much room for radical redesign of the processor's "look".

I think it's telling that no one has tried to implement a JVM
in pentium micro-code. Well, I know only Intel has access
to this stuff, but you'd think they had a few people
they allow to play with things.

<snip>

--
Nick Keighley

"Astrology is based on scientific fact: there's one born every minute"
-- Patrick Moore

Marco van de Voort

Sep 30, 2008, 6:17:27 AM

On 2008-09-30, Nick Keighley <nick_keigh...@hotmail.com> wrote:

>> > Well, that is a bridge too far for me. True, x86 is microcoded, but it is a
>> > thin layer over the "real" instruction set, with pretty much the same
>> > concepts.
>>
>> Oh, OK. Is there enough micro-code memory available that in theory
>> a really serious level of microcode could be written, no longer a
>> "thin layer" with "pretty much the same concepts", or is the amount
>> of microcode memory available so tiny that only a "thin layer" is
>> really possible?
>
> I believe there is a RISC core inside a pentium

(Pentium Pro and higher. The Pentium-I is pure CISC.)

> and this is micro-coded so the processor appears to be a pentium.

First, it is not a "RISC" core; rather, the execution units are based on RISC
principles, most notably few and simple opcodes and not too many addressing
modes.

IOW it is not a duct-tape job with a perfectly good MIPS or PPC in there.
(note: I'm not saying you did that, but I just want that to be clear)

> I also believe the RISC core was specifically designed to be good at this
> (which would make sense) hence there is minimal micro-code and not much
> room for radical redesign of the processor's "look".

The RISC principle was adopted because it is easier (thus more efficient) to
schedule simple instructions over multiple execution units. The decoding and
scheduling of the instructions is mostly hardwired, so you can't do much
there.

Microcode can just "tweak" certain parts of the CPU startup. OSes use it
in their earliest initialization: they take an array of microcode and send
it to the CPU in a special mode, and the CPU is then configured. There is
not really a general-purpose usable "RISC"/micro-op mode there, not even
an interface to real memory, to my knowledge. And afaik that uop memory is
limited to hundreds of instructions, not enough to do much.

Marco van de Voort

Sep 30, 2008, 6:34:14 AM

On 2008-09-30, Robert Maas, http://tinyurl.com/uh3t <seeWeb...@teh.intarweb.org> wrote:
>> From: Marco van de Voort <mar...@stack.nl>
>> Well, that is a bridge too far for me. True, x86 is microcoded, but it is a
>> thin layer over the "real" instruction set, with pretty much the same
>> concepts.
>
> Oh, OK. Is there enough micro-code memory available that in theory
> a really serious level of microcode could be written, no longer a
> "thin layer" with "pretty much the same concepts", or is the amount
> of microcode memory available so tiny that only a "thin layer" is
> really possible?

The layer is hardcoded. The instruction translation is not done in software.
The microcode is merely a maintenance interface to the CPU where you can set
certain delays and other parameters. Afaik the memory holds from the high tens
to the low hundreds of instructions.

>> Java bytecode with its stack approach totally doesn't fit that
>> regime, and I have some doubts if the more recent Java bytecodes
>> are still interpretable efficiently at all, since they seem to
>> have added a heap of features that require compiling with a lot
>> of context.
>
> OK, I accept that my harebrained idea probably isn't feasible on
> the current Intel CPU chip.

Or any other general purpose CPU. Having this logic in software is simply too
inefficient. There are special Java CPUs, but they are way slower.

robert...@yahoo.com

Oct 1, 2008, 10:31:36 PM

On Sep 30, 2:30 am, seeWebInst...@teh.intarweb.org (Robert Maas,

That's not really true. Most modern OSes can, and do, issue a "halt"
instruction in the idle loop. On many CPUs this invokes the first
basic power save state. And the halt state is exited by the next
interrupt, so there's no need to be running an active idle loop even
with an interactive application running (and of course I mean running
in the sense of active but waiting for the user, not currently
executing instructions).

Nor is this any sort of a new capability. Very many processors have a
halt state, and in many, although not all, it involves some power
savings to be in the halt state. Back when machines were rented, and
not owned, the customers would sometimes get charged by how much
CPU they used - and there was literally a meter (like an old-fashioned
mechanical odometer) on the CPU someplace, and it would run whenever
the CPU wasn't in the halt state. And IBM would then send out a meter-
reader every month (actually there wasn’t a separate meter-reader -
your hardware maintenance guy would be visiting you regularly anyway,
and he’d read it while he was there). Although in that case entering
the halt state was about saving cash, not energy.

> Unanswered question: What decides which of the several sleep states
> will be entered at what times? (I assume hibernate state is invoked
> by explicit user action, whereas various sleep states are invoked
> by system noticing some sort of inactive condition caused by lack
> of user interaction as well as lack of any running application
> except network listeners? What about 'cron' and other timed events?
> Is there a special hardware clock set to wake up the CPU when the
> next timed event occurs?)

Usually the OS or the BIOS. x86 CPUs include a "System Management
Mode" which allows a BIOS to install code that runs underneath the OS,
and so you can implement various power management policies there, even
if the OS does not. That is, however, inferior to having the OS do that
stuff. And the OS usually does the job based on a few criteria. For
example, if the CPU load is low, it can reduce the clock speed. If
it's running on battery (as opposed to line), or the battery is low,
again it can reduce the clock speed. Lack of user interaction or
other idleness factor into determining how deep a sleep state to
invoke. In Intel's scheme of things, CPU sleep states down to S3 can
typically be exited in well under a second - more like microseconds to
milliseconds in the case of S0-S2. It's just a prediction of how long you
can delay a response to the next interrupt before excessively
impacting the user's experience.

Jon Harrop

Oct 2, 2008, 10:18:29 PM

Rod Pemberton wrote:
> Languages that can do real work don't usually have both interpreters and
> compilers available for them. They usually have one or the other.

Java (gij)? JavaScript (V8)? Python (IronPython, PyPy)? OCaml
(ocamlopt/ocamlc)?

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u

Jon Harrop

Oct 2, 2008, 10:31:14 PM

Rod Pemberton wrote:
> True. Garbage collection is not usually implemented in compiled
> languages. (see LCC-Win32 for counter-example)

Java, C#, OCaml, Lisp, Scheme, SML, Haskell and ATS are all
counterexamples.

>> * No ... functions as parameters (only function and variable
>> pointers)
>
> Not sure what you mean here. Function pointers can be passed as
> parameters. There is no reason to "pass" the actual function (which is
> binary code) as a "parameter" in C. (i.e., code and data are separate...
> and there is no need to pass code in C.) Is this an attempt to apply
> object-oriented features to C? e.g., encapsulation?

He is saying that functions are second-class in C: they have unnecessary
restrictions in C that are not present in other languages.
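
To make "first-class" concrete, here is a small sketch in Python (chosen
only for illustration; `adder` and `compose` are made-up names).
Functions are built at run time, capture local variables, and are
stored and passed like any other value. A C function pointer can be
passed around too, but it cannot carry captured state unless you thread
a context argument through by hand.

```python
def adder(n):
    """Build a new function at run time that remembers n."""
    return lambda x: x + n

def compose(f, g):
    """Return the function x -> f(g(x)), itself an ordinary value."""
    return lambda x: f(g(x))

# Functions live in data structures and are passed as arguments like any value.
steps = [adder(1), adder(10), lambda x: x * 2]

pipeline = lambda x: x
for step in steps:
    pipeline = compose(step, pipeline)

print(pipeline(5))  # ((5 + 1) + 10) * 2 = 32
```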

>> * No compile-time polymorphism in the form of function or operator
>> overloading
>
> This is an object-oriented language feature.

No. SML and Haskell are counterexamples.

>> * Only rudimentary support for generic programming
>
> Okay, let me go look up what you mean by "generic programming"... This
> seems to be similar in nature to C's casts, but more advanced and related
> to object-oriented classes. This is an object-oriented language feature.

No. SML and Haskell are, again, counterexamples.

> FWIW, most of the issues seem to be about lack of object-orientedness, or
> are trivial and unnecessary, or are OS implementation specific.

Turing argument.
