
Null references and language design


James Harris

Nov 17, 2010, 3:40:07 AM
Tony Hoare famously criticises himself for introducing null references
in the mid 1960s calling it his billion-dollar mistake.

I have some ideas but I'm not sure I fully understand what he means by
null references or null pointers (apart from the obvious that they do
not point to a valid object, such as location zero, which triggers an
error on use). Nor am I sure what he believes is so bad about them -
or what makes them much worse than pointers in general.

Anyone care to discuss the issue from the point of view of language
design? Should languages allow null references? Why or why not? (And
is it practical to remove them?)

(Reposting to include comp.programming. I tried to delete the prior
post, which went just to comp.lang.misc. Not sure how effective that
will be....)

James

R Kym Horsell

Nov 17, 2010, 4:05:19 AM
In comp.programming James Harris <james.h...@googlemail.com> wrote:
> Tony Hoare famously criticises himself for introducing null references
> in the mid 1960s calling it his billion-dollar mistake.
> I have some ideas but I'm not sure I fully understand what he means by
> null references or null pointers (apart from the obvious that they do
> not point to a valid object, such as location zero, which triggers an
> error on use). Nor am I sure what he believes is so bad about them -
> or what makes them much worse than pointers in general.
[...]

I think we've all seen programs that try to access something via
a pointer without first checking whether that pointer references a valid object.
Hoare's original inclination was to disallow this, with some kind of
error raised should any pointer not reference a valid object.

"I call it my billion-dollar mistake. It was the invention of the null
reference in 1965. At that time, I was designing the first comprehensive
type system for references in an object oriented language (ALGOL W). My
goal was to ensure that all use of references should be absolutely safe,
with checking performed automatically by the compiler. But I couldn't resist
the temptation to put in a null reference, simply because it was so easy
to implement. This has led to innumerable errors, vulnerabilities, and system
crashes, which have probably caused a billion dollars of pain and damage
in the last forty years. In recent years, a number of program analysers like
PREfix and PREfast in Microsoft have been used to check references, and
give warnings if there is a risk they may be non-null. More recent programming
languages like Spec# have introduced declarations for non-null references.
This is the solution, which I rejected in 1965. "

--
If your ideas are any good you'll have to ram them down people's
throats.
-- Howard Aiken

Robbert Haarman

Nov 17, 2010, 4:22:23 AM
Hi James,

On Wed, Nov 17, 2010 at 12:36:49AM -0800, James Harris wrote:
> Tony Hoare famously criticises himself for introducing null references
> in the mid 1960s calling it his billion-dollar mistake.
>
> I have some ideas but I'm not sure I fully understand what he means by
> null references or null pointers (apart from the obvious that they do
> not point to a valid object, such as location zero, which triggers an
> error on use). Nor am I sure what he believes is so bad about them -
> or what makes them much worse than pointers in general.
>

> Anyone care to discuss the issue from the point of view of language
> design? Should languages allow null references? Why or why not? (And
> is it practical to remove them?)

As you state, null references are special in that they do not point to
valid objects. This means that Bad Things will happen if you try to use
a null reference.

What exactly happens when a null reference is used varies. Sometimes,
it is treated like any old address, and the bytes at the address it points
to will be used as if they denoted an object of the expected type. This has
been the cause of several exploits in operating system kernels. Sometimes,
using a reference in this way will cause an attempt to access memory a
process does not have access to, and result in a segmentation fault -
which will typically kill the process. Sometimes, null references are
handled more gracefully - for example by throwing an exception, which
can be caught, after which the program can proceed normally. There may
or may not be a run-time overhead for checking that a reference is not null.
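
To make that concrete, here is a minimal C sketch (purely illustrative, not
taken from any real program) that dereferences a null pointer. On a typical
hosted system the process dies with a segmentation fault, although the
language itself only calls the behavior undefined:

#include <stdio.h>

int main(void)
{
    int *p = NULL;          /* no object behind this pointer */
    printf("%d\n", *p);     /* undefined behavior; typically a segfault */
    return 0;
}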

It is certainly possible to design a language without null references,
and I would argue that this is a very good idea. This way, it can be
proven at compile time that a reference to a value of type T will always
be valid, eliminating both the potential for run-time overhead and that
for run-time errors. Haskell and OCaml are two languages where null
references are not possible and references to values of type T are
always valid.

There is one feature of allowing null references that is lost when null
references are not allowed. Without null references, it is not possible
to create a variable that will hold a reference to a value of type T,
without having a T to refer to. In some cases, you actually want to do
this. Haskell and OCaml let you express this explicitly through tagged
unions, specifically the Maybe type in Haskell and the option type in OCaml.
Taking Haskell as an example:

-- Example using a string
-- Note: a null reference is not possible here
printName :: String -> IO ()
printName name = putStrLn name

-- Example using a Maybe string
printMaybeName :: Maybe String -> IO ()
printMaybeName maybeName =
  case maybeName of
    Nothing   -> putStrLn "<unnamed>"
    Just name -> putStrLn name

main :: IO ()
main = do
  printName "Bob"
  printMaybeName Nothing
  printMaybeName (Just "Bob")


As you can see, there is some extra syntax in the Maybe case. In my
experience, it's more than worth it: you write most of your code using
regular types, and it is guaranteed to actually receive values of
those types. In those few cases where you actually may not have a
value, this is clearly indicated. Considering that the number one
error I see in Java programs is the NullPointerException, Tony Hoare
seems to be right about null references being a billion dollar mistake.
It's possible to completely eliminate this source of errors, and
I think we would be wise to do so.

Cheers,

Bob

--
Wise men talk because they have something to say; fools, because they
have to say something.

-- Plato


Dmitry A. Kazakov

Nov 17, 2010, 6:11:18 AM
On Wed, 17 Nov 2010 00:36:49 -0800 (PST), James Harris wrote:

> Anyone care to discuss the issue from the point of view of language
> design? Should languages allow null references? Why or why not? (And
> is it practical to remove them?)

Technically it is quite difficult not to allow them. The problem is that if
you don't allow them, but allow user-managed objects of referential types,
each such object needs to be initialized - not only when you declare a
variable of a referential type, but also when the object is a component of
another type, such as an array or record. You need a very elaborate system
of object construction/initialization in order to ensure that no reference
can ever be seen uninitialized (invalid). As an example, consider
construction of a record with references in it, with an exception
propagating somewhere in the middle of the initialization.

There are interesting cases like doubly-linked lists where having null
references vs. not having them means considerable design differences. E.g.
without null references the list can never be empty, so you need a dedicated
list head element, and you will very likely end up downcasting list elements.
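
A minimal C sketch of that dedicated-head-element design (illustrative only;
the names are made up): a circular doubly-linked list with a sentinel node
never stores a null link, at the cost of carrying one extra element per list.

/* Circular doubly-linked list with a sentinel (dedicated head element).
   No link is ever null: an empty list is the sentinel pointing at itself. */
struct node {
    struct node *prev, *next;
    int value;                        /* unused in the sentinel */
};

static void list_init(struct node *head)
{
    head->prev = head->next = head;   /* empty list, no null links */
}

static void insert_after(struct node *pos, struct node *n)
{
    n->prev = pos;
    n->next = pos->next;
    pos->next->prev = n;
    pos->next = n;
}

static int list_is_empty(const struct node *head)
{
    return head->next == head;
}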

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Rod Pemberton

Nov 17, 2010, 11:47:15 AM
"James Harris" <james.h...@googlemail.com> wrote in message
news:8fe5b4e9-8a25-4699...@v23g2000vbi.googlegroups.com...

>
> Tony Hoare famously criticises himself for introducing null references
> in the mid 1960s calling it his billion-dollar mistake.
>

R. Horsell quoted. Thank you.

He was implementing a type system.

> I have some ideas but I'm not sure I fully understand what he means by
> null references or null pointers (apart from the obvious that they do
> not point to a valid object, such as location zero, which triggers an
> error on use).
>

null pointer - a pointer whose value is null. Null is a special value
that's used to detect declared but uninitialized pointers.

null reference - a reference through a pointer holding the null value; the
data at null's location is not what you want.

First, as I see it, none of this applies for assembly, since assembly uses
addresses without needing data at said addresses. Null pointers are only an
issue with HLL's (high-level languages). HLL's are typically designed to
handle objects or data instead of using addresses directly. Addresses, as
pointers, are hidden, from the programmer, for the most part. I.e., HLL's
use a type system to control access to data or memory.

As I see it, if the language forces pointers to be initialized at
declaration, or only allows pointers to point to a valid object, and
prevents indexing or internal access or sub-access of an object, then there
is no need for a null value. In the languages that I've used or still
remember, you're allowed to declare pointers without initialization and
you're allowed to point pointers at any memory region, including those
without declared objects. Requiring that a pointer only point to certain
objects is easy. Some objects are accessible through only one pointer
value. However, other objects, e.g., "arrays", need to be accessible at
points within the object, so a pointer may point into their interior. Also,
the pointer may point to allocated but uninitialized storage within that
object, i.e., a valid object but bad data. These situations are much harder
to detect. If a pointer can point to any object, any memory address, or a
wide range of memory addresses, you also have the problem of detecting
pointers which refer to undeclared objects. The ability to access
non-declared objects is needed for hardware programming. (more on this
later)

Null for C is commonly defined by the compiler implementor to be value zero
which then references memory location zero, but it can be any value that
doesn't correspond to an address of a C object or an address within a C
object or one unit past the end of certain objects. I.e., your application
can never access that address by accessing a declared object. You could set
a pointer to point there. If you do, then it's detected as a null
reference, even if you needed to access that location. E.g., if null is
non-zero, say 0xDEADBEEF, and you're doing memory probing, i.e., accessing
non-declared and non-allocated memory via a pointer, and you increment your
memory pointer from 0xDEADBEEE to 0xDEADBEEF, you'll get a null reference at
0xDEADBEEF since that's your null's value. Of course, even if null is value
zero, access to address zero is needed on some systems. This forces
workarounds if null-pointer protection is enabled.

> Nor am I sure what he believes is so bad about them -
> or what makes them much worse than pointers in general.
>

Some believe that HLL's, like C, should only access objects (or data or
memory regions) that they've declared, allocated, and initialized. They
typically call this "type safety". I.e., you cannot access anything you
"shouldn't" have access to. This is perfectly acceptable for application
programming. Of course, this makes programming hardware or an operating
system very difficult. I.e., if C prohibited you from accessing an
unallocated object, you couldn't program memory-mapped devices or I/O. Why?
Because, the memory-mapped device cannot be any object that you've declared
or allocated within the C context. It's outside C's context. I.e., without
unrestricted pointers, you're restricted to application-only space. This
was how Pascal worked many decades ago. You couldn't access anything that
was "non-Pascal". This is severely restrictive. How do you program a
device? How do you call the operating system? You couldn't. That was how
Java was supposed to work: C without pointers. Although, from Wikipedia, it
seems that Java had to develop some methods to allow some pointer usage.
The C language did introduce a keyword, "restrict" (in C99), to constrain
what a pointer may alias. However, this wasn't because of programming
errors, but for compiler code optimization.

> Anyone care to discuss the issue from the point of view of language
> design? Should languages allow null references? Why or why not? (And
> is it practical to remove them?)
>

IMO, it depends on what you need to program. If you only need to code
applications, there is no reason to access non-declared or non-declarable
objects (memory-mapped devices, I/O etc.) or non-initialized data. If you
need to program memory mapped hardware, I/O ports, or anything that you
cannot declare as an object in the language, then you need unrestricted
pointers.


Rod Pemberton

Patricia Shanahan

Nov 17, 2010, 12:30:04 PM
On 11/17/2010 8:47 AM, Rod Pemberton wrote:
...

> IMO, it depends on what you need to program. If you only need to code
> applications, there is no reason to access non-declared or non-declarable
> objects (memory-mapped devices, I/O etc.) or non-initialized data. If you
> need to program memory mapped hardware, I/O ports, or anything that you
> cannot declare as an object in the language, they you need unrestricted
> pointers.
...

I strongly feel that one of the worst mistakes in C was allowing very
casual, informal access to constructed addresses.

There are a few places in an operating system that need to be able to
convert between a calculated integer and a pointer. It would not
be any real hardship to have to use some special syntax in those few places.

The gain would be much less time wasted finding bugs that manifest
themselves in code entirely unrelated to the code with the bug, because
the bug caused corruption of an arbitrary memory area.

Patricia

Dmitry A. Kazakov

Nov 17, 2010, 12:40:57 PM
On Wed, 17 Nov 2010 11:47:15 -0500, Rod Pemberton wrote:

> Some believe that HLL's, like C, should only access objects (or data or
> memory regions) that they've declared, allocated, and initialized. They
> typically call this "type safety". I.e., you cannot access anything you
> "shouldn't" have access to. This is perfectly acceptable for application
> programming. Of course, this make programming hardware or an operating
> system very difficult. I.e., if C prohibited you from accessing an
> unallocated object, you couldn't program memory-mapped devices or I/O.

Accessing objects you don't own is not the same as having pointers to them.
E.g. linking to entry points of a library does not require explicit
pointers to subprograms. Even in C you could declare some object extern and
let the linker map it to where needed.

BGB

Nov 17, 2010, 12:50:53 PM

well, NULL values are useful.
making them go away or be awkward would be IMO undesirable WRT actually
getting code written (much like variable assignment, yes, maybe it *is*
"evil" in some ways, but the usefulness of the feature is much greater
than its cost).

linked lists and many other data structures would be very problematic
without NULL.

but, there do exist cases where requiring a non-null value could be
useful, as well as it could eliminate some explicit checks.

example:
void foo(MyObj obj)
{
    if(!obj) throw new BarfException();
    ...
    do something...
}


one option though is something like a "nonnull" keyword (or probably
"_NonNull" if it were being retrofitted using C-style rules) which
could detect and trap attempts to use NULL references (from
unsafe sources), but otherwise behave like a normal object reference or
pointer.


for example:
void foo( _NonNull MyObj obj)
{
    ...
}

...
void bar()
{
    MyObj obj;
    obj=null;  // or obj=NULL;
    foo(obj);  //barf...
    //say for example, an exception, such as NullPointerException
    //is thrown, or whatever makes sense here
}

admittedly, people may be less likely to use the feature if the
syntax is too ugly (and a _NonNull keyword is not very pretty,
but the other main option is __nonnull, which is possibly worse...).

with C-like declaration syntax/rules, this would also work:
void foo(MyObj _NonNull *obj)
{
...
}

now, this keyword could either be handled at the caller (whenever
calling a function with an object without the keyword set), or at the
callee (where an implicit check-and-throw, or whatever else, is silently
inserted by the compiler).

some obvious cases, such as the above, could likely also be caught at
compile time as well (it is a common simple optimization to propagate
variable assignments like this, and if a variable shows up as a NULL,
one can generate an error).

potentially, all small addresses could be trapped, allowing for trapping
some NULL-like magic values as well (such as UNDEFINED, ...), which if
supported by the language/... could be interpreted as a way to
circumvent this ("well, I can't pass NULL, but what if I shove an
'UNDEFINED', or a 'true' or 'false' there?...").

on many systems, a general purpose "is memory address valid" check could
be performed (say, via a dereference and trap), but semantically these are
not equivalent (consider for example, where one demands non-NULL but
does allow certain value types which may be represented via otherwise
invalid pointers), and sadly on many systems such a check could be much
more expensive (vs a NULL check).

"_Valid" or "__valid" could be the validity check, which would be more
general and require that a pointer point to valid memory.


so, possible:

NULL check which only detects NULL:
mov eax, [esp+8]
and eax, eax
jnz .oknull
; handler...
; xmeta "ThrowName", "NullPointerException"
; above: ASM pseudo-op, BGBASM specific
.oknull:

NULL check which detects NULL-page:
mov eax, [esp+8]
cmp eax, 4096
jae .oknull
;handler...
.oknull:

validity check which uses a dereference (OS-specific behavior):
mov eax, [esp+8]
movzx eax, byte [eax]
;if it got here, it is safe
;sadly, this is not likely a usable option

for most non-C languages, on Win32, SEH handling is likely needed here,
which would be expensive. Linux could mean messing around with signals or
similar (Win64 SEH would be a little cheaper though).

the other option is to actually validate the pointer in some OS-specific
way...

hence, the NULL-check option is better in general.

or such...

pete

Nov 17, 2010, 1:41:56 PM
Patricia Shanahan wrote:

> I strongly feel that one of the worst mistakes in C was allowing very
> casual, informal access to constructed addresses.
>
> There are a few places in an operating system that need to be able to
> convert between a calculated integer and a pointer. It would not
> be any real hardship to have to use some special syntax
> in those few places.

You do have to use some special syntax in C to do that:
you have to use a cast operator.

> The gain would be much less time wasted finding bugs that manifest
> themselves in code entirely unrelated to the code
> with the bug, because
> the bug caused corruption of an arbitrary memory area.

--
pete

Andy Walker

Nov 17, 2010, 1:51:13 PM
On 17/11/10 08:40, James Harris wrote:
> I have some ideas but I'm not sure I fully understand what he means by
> null references or null pointers (apart from the obvious that they do
> not point to a valid object,

AFAIAA, there is no "apart from". That's it! Well, apart from
[sic!] the necessary/obvious corollary that some test is available to
detect whether or not your pointer is null.

> such as location zero, which triggers an
> error on use).

Note that the error is not "use" of a null pointer, but trying
to use the non-existent object that it points at -- ie, not guarding
access to pointed-at objects with an explicit or implicit test.

> Nor am I sure what he believes is so bad about them -
> or what makes them much worse than pointers in general.

With a certain amount of trepidation, I think TH is wrong in
this matter. The only relevant error is dereferencing a null pointer.
Programs that do so thereby contain bugs. Certainly you can produce
languages in which all pointers are required [and checked] to point
to real, existing objects of the correct type, and in which all
dereferencing therefore succeeds. But the resulting code, other
things being equal, still has exactly the same bugs. If you forget
to test against "nil", then in the same place in code with no "nil"s
you will forget to test against the default value or the end marker
or whatever other device your language has to enforce "legitimacy".
The difference is that your program may not fail visibly at this
point. Personally, I think programs *should* fail visibly in such
circumstances [whether this takes the form of a "crash and burn" or
a catchable exception]; it is always* better if bugs are manifested
at the place where they occur than at some random unrelated place
much further down the code.

> Anyone care to discuss the issue from the point of view of language
> design? Should languages allow null references? Why or why not? (And
> is it practical to remove them?)

It's certainly possible to remove them, just as it's possible
to remove pointers altogether or to disallow pointers-to-pointer. But
what's the point? You end up with programmers inventing the concepts
in other ways, using indexes into arrays instead, and exactly the same
bugs occur in exactly the same places. It's just making programming
harder, and it's hard enough already ....

* For some value of "always". You perhaps don't want your nuclear
power station or your Mars lander to just stop working in an
uncontrolled way, so arguably it's better for the program to keep
going with the wrong values [and *perhaps* crash later] than to
crash instantly. But (a) that is a good reason to have exceptions
or some other way to keep going anyway [eg, a "safe" mode], and (b)
that has to be weighed against the increased probability that a
manifest bug can be caught during testing.

--
Andy Walker,
Nottingham.

Rod Pemberton

Nov 17, 2010, 3:03:17 PM
"Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> wrote in message
news:tdtdvo1nz179$.k69wbvrfp4yd$.dlg@40tude.net...

> On Wed, 17 Nov 2010 11:47:15 -0500, Rod Pemberton wrote:
>
> > Some believe that HLL's, like C, should only access objects (or data or
> > memory regions) that they've declared, allocated, and initialized. They
> > typically call this "type safety". I.e., you cannot access anything you
> > "shouldn't" have access to. This is perfectly acceptable for
application
> > programming. Of course, this make programming hardware or an operating
> > system very difficult. I.e., if C prohibited you from accessing an
> > unallocated object, you couldn't program memory-mapped devices or I/O.
>
> Accessing objects you don't own is not same as to have pointers to them.

In C, the object must either be declared to access it, or you must have a
pointer to it to access it. How do you "access objects you don't own" in C
without "having pointers to them"?

> E.g. linking to entry points of a library does not require explicit
> pointers to subprograms. Even in C you could declare some object extern
and
> let the linker map it to where needed.
>

You "own" extern objects. The object is declared and allocated, but
externally via other code. The user doesn't get to control the value of
extern pointers. They originate from the code being linked.

Memory mapped hardware is neither allocated nor declared within the C
context, internally or externally. If you must write to a memory mapped
device, say a text screen at 0xB8000, how do you do so? You must use a
pointer which was assigned 0xB8000 via an integer to pointer conversion.
You can't declare a text screen as an object of 80x25 characters and assign
it to be located at address 0xB8000. You can allocate 80x25 characters, but
they won't be allocated at 0xB8000. You can set a pointer to 0xB8000, but
there is no C object allocated there. A memory mapped device is not
declarable and not allocatable in C, unless the compiler implementor did so
for you.
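
For illustration, a minimal freestanding C sketch of the kind of access
described here (the address and layout are the classic PC text screen; this
assumes an environment where that region is actually mapped, which is not
true for an ordinary user-mode program):

#include <stdint.h>

#define VGA_TEXT_BASE 0xB8000u    /* not a C object, just a known address */

static void put_char_at(int row, int col, char ch, uint8_t attr)
{
    /* The only way to reach the device is an integer-to-pointer cast. */
    volatile uint16_t *vram = (volatile uint16_t *)VGA_TEXT_BASE;
    vram[row * 80 + col] = (uint16_t)((uint16_t)attr << 8 | (uint8_t)ch);
}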


Rod Pemberton


Rod Pemberton

Nov 17, 2010, 3:05:07 PM
"Patricia Shanahan" <pa...@acm.org> wrote in message
news:OO6dnTOunba8jHnR...@earthlink.com...

> On 11/17/2010 8:47 AM, Rod Pemberton wrote:
> ...
> > IMO, it depends on what you need to program. If you only need to code
> > applications, there is no reason to access non-declared or
non-declarable
> > objects (memory-mapped devices, I/O etc.) or non-initialized data. If
you
> > need to program memory mapped hardware, I/O ports, or anything that you
> > cannot declare as an object in the language, they you need unrestricted
> > pointers.
> ...
>
> I strongly feel that one of the worst mistakes in C was allowing very
> casual, informal access to constructed addresses.
>

Constructed addresses to addresses of or within C objects? Or, to non-C
addresses?

The reason I ask is it changes the nature of the discussion. I.e., are you
saying that the null pointer is insufficient to detect errant pointer
accesses? And, are you saying void * is insufficient also? Are you saying
that you shouldn't be able to construct addresses to or within allocated,
declared, and initialized C objects? Or, one shouldn't be able to construct
addresses to just non-C objects? You need to be able to access C objects,
so constructed addresses to them should be allowed, yes?

You understand that the subscript operator [] does just that: "allows
casual, informal access to constructed addresses". Yes? It's used for
indexing from the address of an object, i.e., pointer. I.e., it's used to
"construct addresses" informally. Typically, it's used to simulate "arrays"
in C which doesn't have them. C has array declarations, but no arrays, at
least according to the C syntax. Some abuse the C spec.'s to argue
otherwise, i.e., that arrays are a fundamental type in C.

So, by "constructed addresses", you were talking about assigning an integer
address value to a pointer? E.g.,

unsigned long *ptr;
ptr=(unsigned long*) 0x1234;

Yes?

> There are a few places in an operating system that need to be able to
> convert between a calculated integer and a pointer.

"calculated"? There's not many *calculated* integer to pointer conversions.
There are integer to pointer conversions.

"few places"? Yeah, there's only a handful of those for an x86 OS: IVT, GDT,
BDA, EBDA, BIOS, BIOS-32 for PNP and PCI, VGA BIOS, LAPIC, IOAPIC, ... ;-)

> It would not
> be any real hardship to have to use some special syntax in those few
places.
>

As Pete pointed out, casts are needed for integer to pointer conversion,
well, except for 0, 0L, and null. In C, 0 and 0L syntactically represent
null and its value, whether null's value is zero or 0xDEADBEEF or something
else.

> The gain would be much less time wasted finding bugs that manifest
> themselves in code entirely unrelated to the code with the bug, because
> the bug caused corruption of an arbitrary memory area.
>

Is it "arbitrary" if it caused corruption? I'd say no... It's apparently
"in-use" memory by something, or a bug...


Rod Pemberton


Niklas Holsti

Nov 17, 2010, 3:49:14 PM
BGB wrote:
> On 11/17/2010 1:40 AM, James Harris wrote:
>> Tony Hoare famously criticises himself for introducing null references
>> in the mid 1960s calling it his billion-dollar mistake.
>>
>> I have some ideas but I'm not sure I fully understand what he means by
>> null references or null pointers (apart from the obvious that they do
>> not point to a valid object, such as location zero, which triggers an
>> error on use). Nor am I sure what he believes is so bad about them -
>> or what makes them much worse than pointers in general.
>>
>> Anyone care to discuss the issue from the point of view of language
>> design? Should languages allow null references? Why or why not? (And
>> is it practical to remove them?)

>

> well, NULL values are useful.
> making them go away or be awkward would be IMO undesirable WRT actually
> getting code written (much like variable assignment, yes, maybe it *is*
> "evil" in some ways, but the usefulness of the feature is much greater
> than its cost).
>
> linked lists and many other data structures would be very problematic
> without NULL.
>
> but, there do exist cases where requiring a non-null value could be
> useful, as well as it could eliminate some explicit checks.

...
> one option though is something like a "nonnull" keyword (or probably
> "_NonNull" if it were being retrofitted on using C-style rules) which
> could probably detect and trap attempts to use NULL references (from
> unsafe sources), but otherwise behave like a normal object reference or
> pointer.

The Ada language uses the keyword pair "not null" to constrain pointers
in this way. You can define pointer (sub)types that are "not null", or
place this constraint on any pointer parameter, for example:

procedure Foo (Ptr : not null Object_Pointer) is begin ... end Foo;

At any call of Foo, the Ada compiler generates code (if necessary) to
check that the actual parameter for Ptr is not null, and to trigger the
Constraint_Error exception if it is null.

Within the Foo procedure itself, the programmer (and the compiler) can
assume that Ptr is not null, so it can be used without null checks.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
. @ .

Dmitry A. Kazakov

Nov 17, 2010, 3:56:49 PM
On Wed, 17 Nov 2010 15:03:17 -0500, Rod Pemberton wrote:

> "Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> wrote in message
> news:tdtdvo1nz179$.k69wbvrfp4yd$.dlg@40tude.net...
>> On Wed, 17 Nov 2010 11:47:15 -0500, Rod Pemberton wrote:
>>
>>> Some believe that HLL's, like C, should only access objects (or data or
>>> memory regions) that they've declared, allocated, and initialized. They
>>> typically call this "type safety". I.e., you cannot access anything you
>>> "shouldn't" have access to. This is perfectly acceptable for application
>>> programming. Of course, this make programming hardware or an operating
>>> system very difficult. I.e., if C prohibited you from accessing an
>>> unallocated object, you couldn't program memory-mapped devices or I/O.
>>
>> Accessing objects you don't own is not same as to have pointers to them.
>
> In C, the object must either be declared to access it, or you must have a
> pointer to it to access it. How do you "access objects you don't own" in C
> without "having pointers to them"?

Using the object's name. Note the type of the object is not a referential
type. E.g.

extern int x; // The type of x is int, it is neither int& nor int*

>> E.g. linking to entry points of a library does not require explicit
>> pointers to subprograms. Even in C you could declare some object extern and
>> let the linker map it to where needed.
>
> You "own" extern objects. The object is declared and allocated, but
> externally via other code.

*Any* object is declared, allocated, and initialized, and later finalized
and deallocated when it leaves the declaration scope. An object you do not
own is not initialized and finalized by your code; more precisely, its
declaration scope is not within your code. For example, in Ada:

type Device_Register is record
   Status : Octet := 0;  -- Note this is an initialized field!
   Data   : Octet;
end record;

Port : Device_Register;
for Port'Address use 16#FFFF_001A#;  -- Specifies the address
pragma Import (Ada, Port);           -- Instructs not to initialize/finalize

The last pragma tells the compiler that the object is not "owned." The
compiler will not attempt any initialization / finalization of the object
that it would otherwise have to perform. Within the declaration scope of
Port, the corresponding object is not initialized / finalized.

The pragma is used to map objects "out of nowhere." E.g. if you've got
somehow an address of an object you can declare a typed view of the object:

procedure Deal_With_Port (Where : Address) is
   Port : Device_Register;
   for Port'Address use Where;   -- Specifies the address
   pragma Import (Ada, Port);    -- Instructs not to initialize/finalize
begin
   ... -- doing things with Port at Address

> The user doesn't get to control the value of
> extern pointers. They originate from the code being linked.
>
> Memory mapped hardware is neither allocated nor declared within the C
> context, internally or externally. If you must write to a memory mapped
> device, say a text screen at 0xB8000, how do you do so? You must use a
> pointer which was assigned 0xB8000 via an integer to pointer conversion.

See the Ada example above.

In C you could instruct the linker to place the variable into a
non-initialized section and map the section at the specific address. Of
course people usually do it via pointers, but it does not mean that there
is no other way.
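
A rough C sketch of that pointer-free alternative (illustrative; the symbol
name is invented and the linker-script line uses GNU ld syntax, so the
details vary by toolchain):

#include <stdint.h>

struct device_register {
    volatile uint8_t status;
    volatile uint8_t data;
};

/* Not defined anywhere in C; a linker script pins the symbol to the
   device address, e.g.:   uart0 = 0xFFFF001A;                       */
extern struct device_register uart0;

static void reset_device(void)
{
    uart0.status = 0;   /* plain member access, no pointer cast in sight */
}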

Dmitry A. Kazakov

Nov 17, 2010, 4:07:10 PM
On Wed, 17 Nov 2010 18:51:13 +0000, Andy Walker wrote:

> With a certain amount of trepidation, I think TH is wrong in
> this matter. The only relevant error is dereferencing a null pointer.
> Programs that do so thereby contain bugs.

Yes, and more than that. Dereferencing a null pointer is not a bug. It is a
legal use of the pointer with well-defined behavior: an exception
propagation. The program has a bug only when this exception is not handled
although it could have been.

> Personally, I think programs *should* fail visibly in such
> circumstances [whether this takes the form of a "crash and burn" or
> a catchable exception]; it is always* better if bugs are manifested
> at the place where they occur than at some random unrelated place
> much further down the code.

Because exception propagation is well-defined (the stack is unwound, local
objects are finalized etc), while the effect of accessing a dangling
pointer is absolutely undefined. Surely any defined behavior is better than
an undefined one.

Patricia Shanahan

Nov 17, 2010, 4:28:15 PM
pete wrote:
> Patricia Shanahan wrote:
>
>> I strongly feel that one of the worst mistakes in C was allowing very
>> casual, informal access to constructed addresses.
>>
>> There are a few places in an operating system that need to be able to
>> convert between a calculated integer and a pointer. It would not
>> be any real hardship to have to use some special syntax
>> in those few places.
>
> You do have to use some special syntax in C to do that.
> You have to use a cast operator to do that in C.

Casts need to be used so frequently, for much less dangerous operations,
that it is impossible to apply the level of care to every cast that
should be applied to converting a calculated integer into a pointer.

Yes, casts are "special syntax", but nowhere near special enough.

Patricia

James Harris

Nov 17, 2010, 4:44:20 PM
On Nov 17, 9:22 am, Robbert Haarman <comp.lang.m...@inglorion.net>
wrote:

- Copying to comp.programming -

Patricia Shanahan

Nov 17, 2010, 4:44:22 PM
Rod Pemberton wrote:
...

> You understand that the subscript operator [] does just that: "allows
> casual, informal access to constructed addresses". Yes? It's used for
> indexing from the address of an object, i.e., pointer. I.e., it's used to
> "construct addresses" informally. Typically, it's used to simulate "arrays"
> in C which doesn't have them. C has array declarations, but no arrays, at
> least according to the C syntax. Some abuse the C spec.'s to argue
> otherwise, i.e., that arrays are a fundamental type in C.
...

I do understand the interpretation of [] in C, but I would not expect
designers of future languages to seriously consider using the C array
model. Who wants to deal with buffer overflow if they don't have to?

There are a lot of decisions in C that made sense if you think of the
state of the art in the early 1970's. That was arguably one of them,
though it might have been better to go with more Fortran-like array
handling.

Given 21st-century compiler technology, the overhead for checked array
access is low enough that I would expect any language being designed now
that has arrays to support bounds checking.
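
For a sense of what such a check amounts to, here is a hand-written C sketch
(illustrative only; a language with real array types would have the compiler
emit the check, and often hoist or eliminate it):

#include <assert.h>
#include <stddef.h>

struct int_array {
    size_t len;
    int   *data;
};

static int at(const struct int_array *a, size_t i)
{
    assert(i < a->len);    /* the bounds check; a real language would raise
                              an exception or abort with a clear message */
    return a->data[i];
}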

Patricia

James Harris

Nov 17, 2010, 4:44:45 PM
On Nov 17, 11:11 am, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
wrote:

- Copying to comp.programming -

BGB

Nov 17, 2010, 4:55:05 PM

I was not claiming originality here, and actually I was just borrowing
the basic idea from another language anyways...


now, supporting it in a C family language (C, C++, C#, Java, ...) would
be a little more novel. granted, one would have to implement compiler
support for it, and hopefully gain enough adoption for it to actually be
used ("general" support could be added to C and C++ by using the
preprocessor to deal with compilers where it doesn't exist).

#ifndef nonnull
#ifdef FOOCC
#define nonnull _NonNull
#else
#define nonnull
#endif
#endif

granted, this is not an option for Java...


or such...

tm

Nov 18, 2010, 1:41:58 AM
On 17 Nov., 09:40, James Harris <james.harri...@googlemail.com> wrote:
> Anyone care to discuss the issue from the point of view of language
> design? Should languages allow null references? Why or why not? (And
> is it practical to remove them?)

Pointers are the gotos of data. A goto is a POINTER to code where
the execution should continue. A pointer says that you should GOTO
to a specific place in memory to find more data.

Structured programming hides gotos with structured statements.
Structured data structures should also try to hide pointers.

Many uses of pointers go away when several types of containers are
available.

The problem with NULL pointers is that they are overused.
There are pointers which should always refer to legal data.
Such pointers should not be allowed to be NULL.


Greetings Thomas Mertes

--
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.

Torben Ægidius Mogensen

Nov 18, 2010, 4:55:39 AM
James Harris <james.h...@googlemail.com> writes:

> Tony Hoare famously criticises himself for introducing null references
> in the mid 1960s calling it his billion-dollar mistake.
>

> Anyone care to discuss the issue from the point of view of language
> design? Should languages allow null references? Why or why not? (And
> is it practical to remove them?)

Null pointers/references (i.e., pointers/references that do not point to
any valid object) are not in themselves a problem. The problems occur
with too free use of these:

1. If pointer dereferences are not checked, following a null pointer
will give invalid values. Even worse is if you allow storing values
at pointed-to locations, as null pointers may not point to a valid
address or may point into space used by the OS.

2. If all reference/pointer types include null references, you are
forced to include a value in your types that you don't need and
don't want, and you will need to check for this value whenever you
work with values of this type (if you want robust code). How would
you feel if every type you declare automatically includes 42 as a
possible value in addition to the values you specify in the type?

3. Null references are sometimes used to signal error conditions. Many C
functions, for example, return a null pointer if an error occurred during
execution of the function: malloc() returns a null pointer if it is unable
to allocate memory. You are not required to check whether the result is
null, so you can continue working with the null pointer as if it were a
valid pointer. This can lead to later errors that are much harder to
track down. Error conditions should rather be signaled using exceptions
or by returning a tagged union type, so you are forced to check the
pointer for validity before accessing it (or storing it in a pointer
variable). Something like ML's option type or Haskell's Maybe type
is fine. Most ML compilers implement the option type so that the NONE
element is represented as a null reference and SOME x as a reference
to x, so there is no overhead compared to the C implementation (apart
from a forced check when accessing x). Since null references are used
both to implement "real" values, such as empty lists, and to signal
errors, you cannot use null references to signal errors in functions
that return values where a null reference may have a non-error meaning.
You therefore need another way to signal errors anyway, and it would
make sense to use that consistently rather than sometimes using null
references and sometimes something else.
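
Point 3 in one small C example (illustrative only): nothing forces the
caller of malloc() to look at the null return, so the failure, if any,
surfaces later and somewhere else entirely.

#include <stdlib.h>
#include <string.h>

char *duplicate(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    strcpy(copy, s);    /* if malloc failed, this writes through a null
                           pointer, far from the allocation that failed */
    return copy;
}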

Many languages do not allow null references. ML and Haskell have been
mentioned, and they work fine regardless. Some other posters observed
that not having null references requires you to initialise all pointer
variables to point to valid values when they are declared and complained
that this would lead to elaborate initialisation syntax. ML and Haskell
do require you to initialise all variables, but I would not call the
syntax elaborate nor consider it a problem. In Haskell, you can not
modify a variable after it is created, so uninitialised variables don't
make sense at all. ML (SML, O'Caml, F# and other variants), allow
updatable reference variables. If you don't want a "real"
initialisation, you can always use the option type and initialise to
NONE. As mentioned above, this is (normally) implemented as a null
reference, so apart from forced checks, there is no overhead in doing
so.

I wouldn't mind null pointers so much if they weren't by default
included in every pointer type. If the default is that a pointer type
can not be null, and the type system enforces this to be true, then
having types that explicitly include null pointers is O.K., as long as
the type system enforces checks before dereferencing or when casting to
null-free pointer types. This is, basically, what languages like Spec#
and Cyclone do. In Spec#, these tests must be explicit in the code,
which is a good idea, as the programmer is forced to consider what
action should be taken if a pointer is null. In Java or similar
languages (that do check for null pointers when dereferencing), you
either get an exception or the program aborts.

Torben

BartC

Nov 18, 2010, 6:49:08 AM

"tm" <thomas...@gmx.at> wrote in message
news:b8809c80-8e62-4b60...@t35g2000yqj.googlegroups.com...


> On 17 Nov., 09:40, James Harris <james.harri...@googlemail.com> wrote:
>> Anyone care to discuss the issue from the point of view of language
>> design? Should languages allow null references? Why or why not? (And
>> is it practical to remove them?)
>
> Pointers are the gotos of data. A goto is a POINTER to code where
> the execution should continue. A pointer says that you should GOTO
> to a specific place in memory to find more data.
>
> Structured programming hides gotos with structured statements.
> Structured data structures should also try to hide pointers.

How do you implement explicit linked lists, for example, without pointers?
(And without emulating pointers through other mechanisms, as they would have
the same problems.)

> Many uses of pointers go away when several types of containers are
> available.

Because they are used behind the scenes instead? Does this mean having to
use a different language (one with-pointers rather than one
without-pointers) to implement them? This would just be brushing the
problems under the carpet.

> The problem with NULL pointers is that they are overused.
> There are pointers which should always refer to legal data.
> Such pointers should not be allowed to be NULL.

(I think the thread has used the terms pointer (which can be nil or NULL),
and reference (which always points to something valid) interchangeably. My
comments above are about the former.)

--
Bartc

Andy Walker

Nov 18, 2010, 6:57:25 AM
On 18/11/10 06:41, tm wrote:
> Pointers are the gotos of data. A goto is a POINTER to code where
> the execution should continue. A pointer says that you should GOTO
> to a specific place in memory to find more data.

Whoa! A "goto" is more than "this is where you can find some
code" -- it actually transfers execution to that place. A pointer is
no more than "this is where you can find some data" -- there is no
implication that you should go and find it. You have two occurrences
of "should", of which the first means "does" and the second means "can".

--
Andy Walker,
Nottingham.

Dmitry A. Kazakov

Nov 18, 2010, 7:35:24 AM

You don't need to execute a "goto," but you could. A goto is never used
without some condition attached, for obvious reasons. What is harmful in the
tandem of condition/barrier check and goto? Certainly neither the check nor
the raw goto. What is harmful is the combination of "can" and undefined
consequences, rather than a certain "does".

Pointers (referential semantics, in general) are as harmful as gotos from
the software design perspective. Clearly there exist correct programs full
of pointers or gotos and totally broken ones without either.

Rod Pemberton

Nov 18, 2010, 8:58:39 AM
"Torben Ægidius Mogensen" <tor...@diku.dk> wrote in message
news:7zhbffy...@ask.diku.dk...

>
> In Haskell, you can not
> modify a variable after it is created,
>

I'm not familiar with Haskell. But, I suspect that isn't what you meant to
say. The point of a variable is to be able to modify it after creation.
How would you change its value from 3 to 5?


Rod Pemberton


Robbert Haarman

Nov 18, 2010, 9:51:17 AM
Hi Rod,

The terminology is somewhat confusing. Torben is right, you cannot
modify a Haskell "variable". If you say x = 3, you cannot later change x
to 5. I would say that this means that x is not a variable, but, for
some reason, it's still called a variable in Haskell. See also
https://www.haskell.org/haskellwiki/Variable

Regards,

Bob

--
Success is getting what you want; happiness is wanting what you get.

Richard Harter

Nov 18, 2010, 10:20:30 AM

It's what he meant to say. Not being able to modify the value of
a variable is typical in functional languages. In the jargon,
variables are immutable.


BartC

Nov 18, 2010, 11:42:39 AM

"Richard Harter" <c...@tiac.net> wrote in message
news:4ce53f22....@text.giganews.com...

Assignment isn't the same as modification. In Python, string types are
immutable, so that in:

a = "ABC"

you can't change the 3rd character of a to "D" for example. But you can do:

a = "ABD"

by just replacing the whole thing.

> In the jargon, variables are immutable.

So Haskell doesn't even allow assignment?

Perhaps 'variables' aren't the best name for them then.

--
Bartc

lawrenc...@siemens.com

Nov 18, 2010, 11:08:05 AM
Patricia Shanahan <pa...@acm.org> wrote:
>
> Casts need to be used so frequently, for much less dangerous operations,
> that it is impossible to apply the level of care to every cast that
> should be applied to converting a calculated integer into a pointer.

Casts are not used frequently in well written code.
--
Larry Jones

Even though we're both talking english, we're not speaking the same language.
-- Calvin

BGB

Nov 18, 2010, 2:11:19 PM
On 11/18/2010 7:51 AM, Robbert Haarman wrote:
> Hi Rod,
>
> On Thu, Nov 18, 2010 at 08:58:39AM -0500, Rod Pemberton wrote:
>> "Torben �gidius Mogensen"<tor...@diku.dk> wrote in message

>> news:7zhbffy...@ask.diku.dk...
>>>
>>> In Haskell, you can not
>>> modify a variable after it is created,
>>>
>>
>> I'm not familiar with Haskell. But, I suspect that isn't what you meant to
>> say. The point of a variable is to be able to modify it after creation.
>> How would you change it's value from 3 to 5?
>
> The terminology is somewhat confusing. Torben is right, you cannot
> modify a Haskell "variable". If you say x = 3, you cannot later change x
> to 5. I would say that this means that x is not a variable, but, for
> some reason, it's still called a variable in Haskell. See also
> https://www.haskell.org/haskellwiki/Variable
>

it is because many functional languages seem to have an Algebra
obsession or something (at least WRT semantics, common FPL syntax is
decidedly not algebraic), except that AFAICT Algebra and Calculus use
dynamic binding or similar (like, this being why equations can be used
in all sorts of different contexts with variables changing all over the
place and still lacking assignment).

(actually, this whole binding issue, among other things, is part of what
makes math classes so confusing...).


semantically, it doesn't work as well with lexical binding, as one would
have to use functions (or wrap the equation in a function) to do
similar, and people in math-land are typically not so much about making
function calls (more about making constructions which can't possibly
actually exist?...).

but, yes, with neither dynamic binding nor assignment, it is no longer
really 'variable' anymore, is it?...


so, yeah, for getting stuff done my votes are more with more traditional
languages.

and, yes, closures and tail-call optimization, ..., are useful features
for non-FPL languages as well, and when one has them, what merit is
there in the (IMO silly) no variable assignment restrictions?... (or, in
Haskell's case, typesystem mechanics which defy understanding...).

Joshua Maurice

Nov 18, 2010, 2:41:58 PM
On Nov 17, 1:44 pm, Patricia Shanahan <p...@acm.org> wrote:
> I do understand the interpretation of [] in C, but I would not expect
> designers of future languages to seriously consider using the C array
> model. Who wants to deal with buffer overflow if they don't have to?
>
> There are a lot of decisions in C that made sense if you think of the
> state of the art in the early 1970's. That was arguably one of them,
> though it might have been better to go with more Fortran-like array
> handling.
>
> Given 21st. century compiler technology, the overhead for checked array
> access is low enough that I would expect any language being designed now
> that has arrays to support bounds checking.

An important design goal of C has been and will continue to be "A
portable programming language which produces binaries which are
comparable to hand written assembly in executable size, runtime speed,
and runtime space." I am not claiming that C should be the default
programming language choice, but I am arguing that it has a niche
which will continue to exist as long as we have processors as we know
them today.

Dmitry A. Kazakov

Nov 18, 2010, 3:26:04 PM
On Thu, 18 Nov 2010 11:41:58 -0800 (PST), Joshua Maurice wrote:

> On Nov 17, 1:44 pm, Patricia Shanahan <p...@acm.org> wrote:
>> I do understand the interpretation of [] in C, but I would not expect
>> designers of future languages to seriously consider using the C array
>> model. Who wants to deal with buffer overflow if they don't have to?
>>
>> There are a lot of decisions in C that made sense if you think of the
>> state of the art in the early 1970's. That was arguably one of them,
>> though it might have been better to go with more Fortran-like array
>> handling.
>>
>> Given 21st. century compiler technology, the overhead for checked array
>> access is low enough that I would expect any language being designed now
>> that has arrays to support bounds checking.
>
> An important design goal of C has been and will continue to be "A
> portable programming language which produces binaries which are
> comparable to hand written assembly in executable size, runtime speed,
> and runtime space."

1. Well, is there an example of a non-portable programming language? I am
unaware of a language that cannot be ported.

2. C "arrays" clearly do not serve the goal of performance. This is one
reason why C never managed to supersede FORTRAN in numeric computations
where performance really counts.

3. Massive use of pointers typical to C makes many optimizations at all
levels from the compiler down to the CPU difficult or impossible.

> I am not claiming that C should be the default
> programming language choice, but I am arguing that it has a niche
> which will continue to exist as long as we have processors as we know
> them today.

This is probably true.

Pascal J. Bourguignon

Nov 18, 2010, 3:40:39 PM
"BartC" <b...@freeuk.com> writes:

"BartC" <b...@freeuk.com> writes:

> How do you implement explicit linked lists, for example, without
> pointers? (And without emulating pointers through other mechanisms, as
> they would have the same problems.)

For example in lisp:


(let ((list (cons 1 (cons 2 (cons 3 nil)))))
  (first (rest list)))
--> 2

>> Many uses of pointers go away when several types of containers are
>> available.
>
> Because they are used behind the scenes instead? Does this mean having
> to use a different language (one with-pointers rather than one
> without-pointers) to implement them? This would just be brushing the
> problems under the carpet.

No. This is called abstraction.
Meta-linguistic abstraction in this case.

Or more simply: high level programming language.

>> Structured programming hides gotos with structured statements.
>> Structured data structures should also try to hide pointers.
>
> How do you implement explicit linked lists, for example, without
> pointers? (And without emulating pointers through other mechanisms, as
> they would have the same problems.)


To implement linked lists in Common Lisp, you could write:


(shadow '(cons list first rest)) ; we'll reuse the standard names.

(defstruct (cons (:constructor cons (car cdr)))
  car cdr)

(defconstant end-of-list 'end-of-list)
(defun first (cell) (cons-car cell))
(defun rest (cell) (cons-cdr cell))
(defun list (&rest arguments)
  (if (null arguments)
      end-of-list
      (cons (car arguments) (apply (function list) (cdr arguments)))))

CL-USER> (list 1 2 3 4)
#<CONS :CAR 1
:CDR #<CONS :CAR 2 :CDR #<CONS :CAR 3 :CDR #<CONS :CAR 4 :CDR END-OF-LIST>>>>
CL-USER> (first (list 1 2 3 4))
1
CL-USER> (rest (list 1 2 3 4))
#<CONS :CAR 2 :CDR #<CONS :CAR 3 :CDR #<CONS :CAR 4 :CDR END-OF-LIST>>>
CL-USER> (first (rest (list 1 2 3 4)))
2
CL-USER>

There's no pointer, and no null pointer. The end of the list is
represented by a random symbol (here, the symbol end-of-list; in Lisp
it's the symbol NIL, but NIL is not a null pointer, it's a symbol).

>> Many uses of pointers go away when several types of containers are
>> available.
>
> Because they are used behind the scenes instead? Does this mean having
> to use a different language (one with-pointers rather than one
> without-pointers) to implement them? This would just be brushing the
> problems under the carpet.

1- it's turtles all the way down.

2- if you had a hardware implementing another construct, you wouldn't
need any pointer to implement these structures. Are there pointers
in your brain?

3- When you write a while loop, you don't complain that you're just
using gotos behind the scene. So why are you complaining about high
level data structures?


>> The problem with NULL pointers is that they are overused.
>> There are pointers which should always refer to legal data.
>> Such pointers should not be allowed to be NULL.
>
> (I think the thread has used the terms pointer (which can be nil or
> NULL), and reference (which always points to something valid)
> interchangeably. My comments above are about the former.)

--
__Pascal Bourguignon__ http://www.informatimago.com/
A bad day in () is better than a good day in {}.

Pascal J. Bourguignon

Nov 18, 2010, 3:47:39 PM
"Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> writes:

> 1. Well, is there an example of a non-portable programming language? I am
> unaware of a language that cannot be ported.

Turing Equivalence.

Jacko

Nov 18, 2010, 3:54:27 PM
It really comes down to either the system handling traversal or the 'insane'
user doing it. Beyond traversal, there is no need for pointers, including
null references. So it comes down to adding traversals (of a kind) to types,
along with the implicit function set this adds to a type, which can be
named by the traversal.

type dlinkedlist
int value;
trav left;
trav right;
end type

(new dlinkedlist(...)).left.make(object); //??

Then it becomes a question of the semantics of traversals being to the
same or a differing type, so unions or subtyping?

Joshua Maurice

unread,
Nov 18, 2010, 5:33:41 PM11/18/10
to
On Nov 18, 12:26 pm, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
wrote:

Sorry. I didn't put the right emphasis on it. The emphasis should have
been: "C's primary design goal is to be a portable /assembly-like/
programming language. It's meant just as a thin veneer on top of
assembly." In that regard, you can write a good FORTRAN compiler in C
without any FORTRAN or assembly (for the most part), but you cannot
write a good C compiler and C standard library in FORTRAN without some
of it written in C or assembly. That's the niche of C which will never
go away.

Richard Harter

unread,
Nov 18, 2010, 5:35:04 PM11/18/10
to
On Thu, 18 Nov 2010 16:42:39 -0000, "BartC" <b...@freeuk.com>
wrote:

>
>
>"Richard Harter" <c...@tiac.net> wrote in message
>news:4ce53f22....@text.giganews.com...
>> On Thu, 18 Nov 2010 08:58:39 -0500, "Rod Pemberton"
>> <do_no...@notreplytome.cmm> wrote:
>>

>>>"Torben Ćgidius Mogensen" <tor...@diku.dk> wrote in message


>>>news:7zhbffy...@ask.diku.dk...
>>>>
>>>> In Haskell, you can not
>>>> modify a variable after it is created,
>>>>
>>>
>>>I'm not familiar with Haskell. But, I suspect that isn't what you meant
>>>to
>>>say. The point of a variable is to be able to modify it after creation.
>>>How would you change it's value from 3 to 5?
>>
>> It's what he meant to say. Not being able to modify the value of
>> a variable is typical in functional languages.
>
>Assignment isn't the same as modification. In Python, string types are
>immutable, so that in:
>
>a = "ABC"
>
>you can't change the 3rd character of a to "D" for example. But you can do:
>
>a = "ABD"
>
>by just replacing the whole thing.

When you do, you change the value that a is bound to; that is
modifying the value of a. IOW a is not immutable.

>
>> In the jargon, variables are immutable.
>
>So Haskell doesn't even allow assignment?

Only as initialization, e.g.
pi = 3.14159265358


>
>Perhaps 'variables' aren't the best name for them then.

Some people feel that way. However the use of the term,
variable, in mathematics and logic is much closer to the usage in
functional languages than it is to imperative languages, so one
could just as well say that 'variables' isn't the right name for
imperative language 'variables'.

Speaking for myself, I really feel that if one is thinking about
creating or modifying languages one should have some
understanding of the major different kinds of programming
languages.

Pascal J. Bourguignon

unread,
Nov 18, 2010, 5:54:06 PM11/18/10
to
Joshua Maurice <joshua...@gmail.com> writes:
> Sorry. I didn't put the right emphasis on it. The emphasis should have
> been: "C's primary design goal is to be a portable /assembly-like/
> programming language. It's meant just as a thin veneer on top of
> assembly." In that regard, you can write a good FORTRAN compiler in C
> without any FORTRAN or assembly (for the most part), but you cannot
> write a good C compiler and C standard library in FORTRAN without some
> of it written in C or assembly. That's the niche of C which will never
> go away.

That would depend on what you mean by "good". There have been C systems
written and running on high level machines, such as Lisp Machines
(Zeta-C), quite different from usual processors. There are C compilers
targetting the JVM. And indeed, on these machines, there are some
complexities with the implementation of C pointers, that, according to
some criteria, could be deemed 'not good'. But for most purposes, they
are perfectly good implementations of C.

BartC

unread,
Nov 18, 2010, 6:06:58 PM11/18/10
to
"Pascal J. Bourguignon" <p...@informatimago.com> wrote in message
news:87tyjei...@kuiper.lan.informatimago.com...

> "BartC" <b...@freeuk.com> writes:
>
> "BartC" <b...@freeuk.com> writes:
>
>> How do you implement explicit linked lists, for example, without
>> pointers? (And without emulating pointers through other mechanisms, as
>> they would have the same problems.)
>
> For example in lisp:

> (let ((list (cons 1 (cons 2 (cons 3 nil)))))
> (first (rest list)))
> --> 2

Well, yes, I can also write (1,(2,(3,(4,())))), but I did say explicit
linked lists; this is just an ordinary list, which (let me guess), Lisp
implements as a linked list.

But my linked lists tend to have multiple horizontal linkage (nodes exist in
several linked lists simultaneously) and often up and down linkage too.
Achieving that without explicit pointers is difficult.
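
Concretely, a node of the kind I have in mind looks something like this
C-style sketch (the field names are made up for the example):

/* One record that participates in several lists at once:
   two horizontal linkages plus up and down linkage.
   Any of these links may legitimately be nil/NULL. */
struct node {
    int          value;
    struct node *next_in_list_a;   /* horizontal linkage 1 */
    struct node *next_in_list_b;   /* horizontal linkage 2 */
    struct node *parent;           /* up   */
    struct node *first_child;      /* down */
};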

(And the languages I use -- for various reasons -- like to handle data by
value rather than by pointers which results in data being shared. Then
explicit pointers, complete with their nil values, are invaluable.)
...


> (defun list (&rest arguments)
> (if (null arguments)

(Ah, a null pointer test ...)

> 2- if you had a hardware implementing another construct, you wouldn't
> need any pointer to implement these structures. Are there pointers
> in your brain?

I don't think so. But have you ever written an address on an envelope?

> 3- When you write a while loop, you don't complain that you're just
> using gotos behind the scene.

(Ordinary loops aren't enough for me, I need lots of extra controls, and
even then I might use the odd 'goto'.)

> So why are you complaining about high
> level data structures?

In my languages, 90% or more of explicit use of pointers has been eliminated
by using higher level data. That still leaves some uses for which pointers
are a good choice, which is what I'm arguing for. I would argue for 'goto'
too.

(And, because I like to implement my languages in themselves, this would be
much more difficult with pointers 100% removed.)

--
Bartc

Joshua Maurice

unread,
Nov 18, 2010, 6:28:17 PM11/18/10
to
On Nov 18, 2:54 pm, p...@informatimago.com (Pascal J. Bourguignon)
wrote:

> Joshua Maurice <joshuamaur...@gmail.com> writes:
> > Sorry. I didn't put the right emphasis on it. The emphasis should have
> > been: "C's primary design goal is to be a portable /assembly-like/
> > programming language. It's meant just as a thin veneer on top of
> > assembly." In that regard, you can write a good FORTRAN compiler in C
> > without any FORTRAN or assembly (for the most part), but you cannot
> > write a good C compiler and C standard library in FORTRAN without some
> > of it written in C or assembly. That's the niche of C which will never
> > go away.
>
> That would depend on what you mean by "good".  There have been C systems
> written and running on high level machines, such as Lisp Machines
> (Zeta-C), quite different from usual processors.  There are C compilers
> targetting the JVM.  And indeed, on these machines, there are some
> complexities with the implementation of C pointers, that, according to
> some criteria, could be deemed 'not good'.  But for most purposes, they
> are perfectly good implementations of C.

Let me get this out of the way: One could argue that the hardware is
being designed with C in mind, and vice versa. My (somewhat ignorant)
knowledge is it's mostly the case that C happens to be a very good
abstraction of fast hardware, and not the other way around. And by
"happens" I mean purposefully designed with that goal in mind.

You speak of those "higher level" machines. It happens to be the case
that nearly all of them are virtual machines, and you can't have
turtles all the way down. Eventually it needs to end in hardware
assembly, and that is where C excels. C is an exceptionally well suited
alternative to writing assembly in terms of executable size, execution
memory size, and execution time because of C's close isomorphism to
most modern assembly.

On a modern desktop, you can write a FORTRAN compiler and runtime in
C. You cannot write a C compiler and C runtime, specifically the C
runtime, in FORTRAN only. Somewhere along the line, you would need a
large portion of code itself written in C or assembly, or a bootstrap
approach which terminates with some C or assembly.

C is the "barest bones metal" portable programming language. There
will always be some need for C as long as it remains an excellent
abstraction of hardware assembly and as long as it lacks competitors
in this niche, both of which seem to be likely for the foreseeable
future.

BartC

unread,
Nov 18, 2010, 6:59:39 PM11/18/10
to
"Joshua Maurice" <joshua...@gmail.com> wrote in message
news:227d2205-1a80-4635...@fj16g2000vbb.googlegroups.com...

> On a modern desktop, you can write a FORTRAN compiler and runtime in
> C. You cannot write a C compiler and C runtime, specifically the C
> runtime, in FORTRAN only. Somewhere along the line, you would need a
> large portion of code itself written in C or assembly, or a bootstrap
> approach which terminates with some C or assembly.

I would dispute that. A C compiler is mostly just a mapping from one
language to another. I'm sure Fortran, even without pointers (and perhaps
without dynamic data, if it's the same version I used long ago), would be
capable enough.

The C runtime: well you would need an implementation of it written mostly in
C; then your C compiler written in Fortran can compile this in the same way
it compiles any other C sources.

There might be a few problem areas, but even there C would need some help.
You just use Fortran + assembler in the same way you'd use C + assembler.

> C is the "barest bones metal" portable programming language.

(I've used (even implemented once) a language called Babbage (from 1970s;
not the more recent joke language), which was even cruder and lower-level
than C. That really *was* a high-level assembler. For example, it didn't
allow parentheses in expressions, which were just evaluated from left to
right, and could refer to registers directly by name.

C is quite a bit removed from assembler, and in fact it can be
exasperatingly difficult at times just to declare variables which match a
machine data size exactly.)

--
Bartc

Pascal J. Bourguignon

unread,
Nov 18, 2010, 7:37:49 PM11/18/10
to
"BartC" <b...@freeuk.com> writes:

> "Pascal J. Bourguignon" <p...@informatimago.com> wrote in message
> news:87tyjei...@kuiper.lan.informatimago.com...
>> "BartC" <b...@freeuk.com> writes:
>>
>> "BartC" <b...@freeuk.com> writes:
>>
>>> How do you implement explicit linked lists, for example, without
>>> pointers? (And without emulating pointers through other mechanisms, as
>>> they would have the same problems.)
>>
>> For example in lisp:
>
>> (let ((list (cons 1 (cons 2 (cons 3 nil)))))
>> (first (rest list)))
>> --> 2
>
> Well, yes, I can also write (1,(2,(3,(4,())))), but I did say explicit
> linked lists; this is just an ordinary list, which (let me guess), Lisp
> implements as a linked list.

No. I didn't write (1 2 3), which I could, to demonstrate the explicit
consing (allocation) of the nodes, and their linking.

I could also have written it in a C-like style:

(let (node
      list)
  (setf node (cons nil nil))
  (setf (car node) 3)
  (setf list node)
  (setf node (cons nil nil))
  (setf (car node) 2)
  (setf (cdr node) list)
  (setf list node)
  (setf node (cons nil nil))
  (setf (car node) 1)
  (setf (cdr node) list)
  (setf list node)

  (first (rest list)))
--> 2

but as you can see, cons takes as arguments the slots of the cell (the
'car' and the 'cdr'), so you can initialize it when allocating it, not
after, which leads to the simpler form of:


(cons 1 (cons 2 (cons 3 nil)))

And I repeat, nil is not a null pointer, it is a constant whose value is
the symbol nil itself. Not a pointer, a symbol!


> But my linked lists tend to have multiple horizontal linkage (nodes exist in
> several linked lists simultaneously)

Here too:

(let ((a "a")
(b '(b a b y))
(c 42))

(let ((l1 (cons a (cons b (cons c nil))))
(l2 (cons a (cons b (cons c nil)))))

(values l1 l2)))

--> (#1="a" #2=(B A B Y) 42)
(#1# #2# 42)

There's even a syntax to denote the 'structure sharing', that is the
fact that the same object is present in several places, in different
lists or structures.

> and often up and down linkage too.
> Achieving that without explicit pointers is difficult.

There's no difficulty at all.

(defstruct double-node
  label
  up
  down)

(defun add-double-node (new old)
  (setf (double-node-down new) old
        (double-node-up old) new)
  new)

(let ((a "a")
      (b '(b a b y))
      (c 42))

  (add-double-node (make-double-node :label a)
                   (add-double-node (make-double-node :label b)
                                    (make-double-node :label c))))

--> #1=#S(DOUBLE-NODE :LABEL "a" :UP NIL
          :DOWN #2=#S(DOUBLE-NODE :LABEL (B A B Y) :UP #1#
                      :DOWN #S(DOUBLE-NODE :LABEL 42 :UP #2# :DOWN NIL)))


Again, notice the use of the #= / ## syntax to show where an object is
used in several places.

> (And the languages I use -- for various reasons -- like to handle data
> by value rather than by pointers which results in data being
> shared. Then explicit pointers, complete with their nil values, are
> invaluable.)

You don't need pointers to do that. Lisp handles all the data by value,
and has no pointer, and you can still share data.

>> (defun list (&rest arguments)
>> (if (null arguments)
>
> (Ah, a null pointer test ...)

Again, there is no null pointer, and null in lisp doesn't test for a
null pointer, but for a specific symbol, the NIL symbol:

(defun null (object) (eq object 'NIL))

>> 2- if you had a hardware implementing another construct, you wouldn't
>> need any pointer to implement these structures. Are there pointers
>> in your brain?
>
> I don't think so. But have you ever written an address on an envelope?

Postal addresses have properties quite different than pointers.


>> 3- When you write a while loop, you don't complain that you're just
>> using gotos behind the scene.
>
> (Ordinary loops aren't enough for me, I need lots of extra controls, and
> even then I might use the odd 'goto'.)

In a high level language, we tend to encapsulate those odd gotos inside
high level control structures. Of course, in high level programming
languages, we don't need to butcher the compiler to add control
structures to the language...

>> So why are you complaining about high
>> level data structures?
>
> In my languages, 90% or more of explicit use of pointers has been eliminated
> by using higher level data. That still leaves some uses for which pointers
> are a good choice, which is what I'm arguing for. I would argue for
> goto' too.

In lisp, there's no pointer, and no need for them.


> (And, because I like to implement my languages in themselves, this would be
> much more difficult with pointers 100% removed.)

And lisp implementations are written in lisp too. (Some of them are
written in C or other languages, but most of them are written in lisp).

Pascal J. Bourguignon

unread,
Nov 18, 2010, 7:47:37 PM11/18/10
to
Joshua Maurice <joshua...@gmail.com> writes:

Machines are machines. The way they're implemented is somewhat
irrelevant. The adjective 'virtual' only denotes a mode of
implementation. But even what you'd call a 'physical' machine can be
implemented as software running on the microchip (micro-code). Then
your processor is actually a virtual machine implemented in micro-code,
instead of a virtual machine implemented in code running on the virtual
machine implemented in micro-code.

But my point is that you could design the hardware to provide other
primitives than those on which C rely.


> On a modern desktop, you can write a FORTRAN compiler and runtime in
> C. You cannot write a C compiler and C runtime, specifically the C
> runtime, in FORTRAN only.

You cannot write it in standard C either. If you allow extensions to
the standard C language to be able to write it in C, then you must also
allow extensions to the standard Fortran language to let you write it in
Fortran too.


> Somewhere along the line, you would need a
> large portion of code itself written in C or assembly, or a bootstrap
> approach which terminates with some C or assembly.

This, again, is only because you are targetting a low level machine
matching the C language. If you targeted a Lisp Machine, you would have
a harder time writing the run-time in C than in Lisp. And if you target
a JVM, you'd have a harder time writing the run-time in C than in Java.


> C is the "barest bones metal" portable programming language.

No. This is only relative to a given machine architecture.


> There will always be some need for C as long as it remains an
> excellent abstraction of hardware assembly and as long as it lacks
> competitors in this niche, both of which seem to be likely for the
> foreseeable future.

There are better languages than C to do the low level stuff that
proponents claim C is good at. For example, Modula-2. (There are
pointers in Modula-2 too, so it is not the best possible replacement
for C, but it's much better than C).

Andy Walker

unread,
Nov 18, 2010, 7:47:10 PM11/18/10
to
On 18/11/10 20:26, Dmitry A. Kazakov wrote:
> 1. Well, is there an example of a non-portable programming language? I am
> unaware of a language that cannot be ported.

Assuming that you mean "cannot in practice" rather than as a
breach of Turing equivalence, and that there is an implied "high level",
then many of the older languages were non-portable. There was commonly
an escape to machine code [ie, with a defined syntax but semantics that
could be sensibly determined only by reading the machine spec]. There
were also commonly hardware-related instructions [eg, to read the front
panel switches or to rewind the mag tape] with no adequate equivalent on
other computers.

Somewhat related, in [eg] Atlas Autocode [a high-level language
somewhere between Fortran and Algol 60], the usual input medium was paper
tape prepared on a flexowriter which had keys for alpha, beta, pi, half
and squared, plus the ability to overstrike characters; all of these
were used within the language, meaning that I have hard-copy of old
programs that I cannot even type in to a modern computer except by using
a word-processing package as intermediary. This is different from, and
less portable than, the related problems in Algol [60 and 68], where it
was recognised that the symbols used in the defining documents were not
universally available, so that each implementation had to document how
symbols were in fact provided [eg, by stropping in various ways].

If all modern high-level languages are portable, it owes not a
little to those experiences, and thus to a convergence in both hardware
and software.

--
Andy Walker,
Nottingham.


Rod Pemberton

unread,
Nov 19, 2010, 3:06:29 AM11/19/10
to
"Pascal J. Bourguignon" <p...@informatimago.com> wrote in message
news:87d3q2i...@kuiper.lan.informatimago.com...

> Joshua Maurice <joshua...@gmail.com> writes:
>
> > C is the "barest bones metal" portable programming language.
>
> No. This is only relative to a given machine architecture.
>

The two most portable languages are C and Forth. They've been implemented
on many more architectures than any other language. AFAICT, other languages
don't even come close. Of the two, C fits on more. C also fits onto
non-standardized architectures, although this means that the specifications
define many standard C behaviors for most machines as undefined. So, while
it's true that some architectures are not well suited to C, e.g., 16-bit
word size etc., I'd still say it's the most portable language by far.
Except for a carry flag, C captures the essence of the standardized 8-bit or
16-bit microprocessor, e.g., contiguous memory or flat address space, 8-bit
bytes and byte addressability for ASCII and EBCDIC, addresses the same size
as the largest supported integer, bitwise logic instructions, integer math,
stack, etc. Does C support modern microprocessor features like vector math?
No.
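
As a small illustration of the carry-flag point, this is roughly what
portable C has to do to recover a carry that the hardware computes for
free (just a sketch, not taken from any particular code base):

#include <stdint.h>

/* Add two 32-bit words and report the carry out. A single ADD/ADC
   pair does this in assembly; portable C has to re-derive the carry
   from the wrapped result. */
uint32_t add_with_carry(uint32_t a, uint32_t b, unsigned *carry_out)
{
    uint32_t sum = a + b;      /* unsigned arithmetic wraps mod 2^32 */
    *carry_out = (sum < a);    /* it wrapped around, so a carry occurred */
    return sum;
}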


Rod Pemberton


Dmitry A. Kazakov

unread,
Nov 19, 2010, 3:44:05 AM11/19/10
to
On Thu, 18 Nov 2010 14:33:41 -0800 (PST), Joshua Maurice wrote:

> Sorry. I didn't put the right emphasis on it. The emphasis should have
> been: "C's primary design goal is to be a portable /assembly-like/
> programming language. It's meant just as a thin veneer on top of
> assembly." In that regard, you can write a good FORTRAN compiler in C
> without any FORTRAN or assembly (for the most part), but you cannot
> write a good C compiler and C standard library in FORTRAN without some
> of it written in C or assembly.

No, even in FORTRAN-IV you could write a compiler. It had LOGICAL*1 data
type (which served as a substitute for byte/character) and arrays of
LOGICAL*1 [*]. This is enough to produce a compiler. In fact I even
designed an interpreter of some language in FORTRAN-IV. BTW, at that time I
seriously considered C as an alternative for the job. But, surprise, C was
too heavy for the machine I used. It had only 64K, C took minutes to
compile (in 5 passes) anything more complex than "hello world". The code it
produced was catastrophic. So much for the urban legend of a
lightweight-close-to-assembler C.

------------
* Modern FORTRANs have strings, records etc. However, I don't use either
FORTRAN or C for creating compilers.

Torben Ægidius Mogensen

unread,
Nov 19, 2010, 5:22:02 AM11/19/10
to
c...@tiac.net (Richard Harter) writes:

> On Thu, 18 Nov 2010 16:42:39 -0000, "BartC" <b...@freeuk.com>
> wrote:

>>So Haskell doesn't even allow assignment?

>>Perhaps 'variables' aren't the best name for them then.
>
> Some people feel that way. However the use of the term,
> variable, in mathematics and logic is much closer to the usage in
> functional languages than it is to imperative languages, so one
> could just as well say that 'variables' isn't the right name for
> imperative language 'variables'.

I agree. The term "variable" was not invented with FORTRAN, it had a
long use in mathematics before that.

> Speaking for myself, I really feel that if one is thinking about


> creating or modifying languages one should have some understanding of
> the major different kinds of programming languages.

I agree here too. Far too many atrocities have been "designed" by
people who knew only one or two languages and wanted to "improve" them.

A budding language designer should know at least the following languages
well enough to code in the "natural" style for each language: Scheme,
Standard ML, Haskell, Prolog, Pascal. This means (at least) using
higher-order functions and continuations in Scheme, data types and
modules in ML, type classes and monads in Haskell, backtracking and
logical variables in Prolog and nested function declarations, function
parameters, subrange types and value-passed arrays in Pascal.

And a language designer should also have a fair knowledge of compiler
design and interpreter design (including type inference and memory
management). If not, you are likely to make bad design decisions
because you are limited to what you know how to implement. Some of the
weirder scope rules of scripting languages appear to be caused by the
designers/implementors not knowing how to implement lexical closures
(and if you don't know what that means, you are not qualified to design
a programming language).
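
(For readers who just want the short version: a lexical closure is a
function together with the variable bindings that were in scope where
the function was written. A compiler can represent one as a function
plus an explicit environment record, roughly as in the C sketch below;
the names are invented for the example, and this is only an illustration
of the idea, not a prescription.)

#include <stdio.h>
#include <stdlib.h>

/* The captured environment: here, a single variable. */
struct adder { int increment; };

/* The "code" part of the closure: it reads the captured binding. */
int adder_call(struct adder *env, int x)
{
    return x + env->increment;
}

/* make_adder(3) builds a closure-like object: code plus environment. */
struct adder *make_adder(int increment)
{
    struct adder *env = malloc(sizeof *env);
    env->increment = increment;
    return env;
}

int main(void)
{
    struct adder *add3 = make_adder(3);
    printf("%d\n", adder_call(add3, 10));   /* prints 13 */
    free(add3);
    return 0;
}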

This might sound a bit arrogant and "ivory tower"ish, but would you like
the bridge you drive your car over daily to be designed by someone who
just has seen and driven over a couple of bridges and thinks he can do
better? Or would you rather have a trained engineer do it? Why would
it be andy different with programming languages? These, too, can break
down at the worst possible time so people get killed or lose property.

That said, you will probably learn a lot by trying to design and
implement a language even without having the prerequisite knowledge.
But you should do this for the experience and not for the result.

Torben

Rod Pemberton

unread,
Nov 19, 2010, 5:59:28 AM11/19/10
to
"Torben �gidius Mogensen" <tor...@diku.dk> wrote in message
news:7zeiahm...@ask.diku.dk...

>
> A budding language designer should know at least the following languages
> well enough to code in the "natural" style for each language: Scheme,
> Standard ML, Haskell, Prolog, Pascal. This means (at least) using
> higher-order functions and continuations in Scheme, data types and
> modules in ML, type classes and monads in Haskell, backtracking and
> logical variables in Prolog and nested function declarations, function
> parameters, subrange types and value-passed arrays in Pascal.
>

I agree that a language designer should know some languages, but those
obscure, hardly used, and dead languages? If there was a Lisp or Ada
programmer here, they'd tout those languages as the best for various odd
reasons too. It's possible programmers for Perl or Ruby or Java or C++
would mention those too. Why is it that unsuccessful languages and dead
languages, e.g., Fortran and Pascal, are always touted as the best languages
to learn from? C and Forth are far more successful. Shouldn't you learn
from success? Shouldn't you ignore the failures? Learning from failures
only ensures that bad ideas continue to exist. Of the languages I've
experienced, PL/I is the only other language as powerful as C. Yet, I've
never seen anyone praise PL/I for anything...

> (and if you don't know what that means, you are not qualified to design
> a programming language).
>
> This might sound a bit arrogant and "ivory tower"ish, but would you like
> the bridge you drive your car over daily to be designed by someone who
> just has seen and driven over a couple of bridges and thinks he can do
> better? Or would you rather have a trained engineer do it? Why would
> it be any different with programming languages?
>

It's not. But, you failed to mention the most important language: assembly.
It's the foundation. Without solid knowledge of the microprocessor's
specific assembly language, you cannot implement any of the other languages
mentioned. Even though each microprocessor has its own assembly language,
microprocessors standardized on the same basic architecture around 1974.
So, there are many similar concepts and instructions among them.


Rod Pemberton

Rod Pemberton

unread,
Nov 19, 2010, 6:10:33 AM11/19/10
to
"Torben Ægidius Mogensen" <tor...@diku.dk> wrote in message
news:7zeiahm...@ask.diku.dk...

>
> This might sound a bit arrogant and "ivory tower"ish, but would you like
> the bridge you drive your car over daily to be designed by someone who
> just has seen and driven over a couple of bridges and thinks he can do
> better? Or would you rather have a trained engineer do it? Why would
> it be any different with programming languages?
>

In that same vein, who would you rather have as a programmer? A programmer
who once learned structured programming with Pascal or Modula-2? Or, a
programmer who once learned structured programming with an unstructured,
line-numbered BASIC?

I'd prefer the programmer who learned with BASIC. Why? Because you cannot
tell if the Pascal or Modula-2 programmer learned the concepts of structured
programming when programming in Pascal or Modula-2, or was just doing what
the language allowed or forced them to do. With an unstructured language
like BASIC, it's immediately obvious if the programmer programming in BASIC
learned structured programming, or didn't. They cannot be forced into it by
the language's design.


Rod Pemberton


Robbert Haarman

unread,
Nov 19, 2010, 6:55:19 AM11/19/10
to
We've strayed quite far from the original topic of this thread, but
I would still like to add my 2 cents, so here goes:

On Fri, Nov 19, 2010 at 05:59:28AM -0500, Rod Pemberton wrote:
> "Torben Ægidius Mogensen" <tor...@diku.dk> wrote in message


> news:7zeiahm...@ask.diku.dk...
> >
> > A budding language designer should know at least the following languages
> > well enough to code in the "natural" style for each language: Scheme,
> > Standard ML, Haskell, Prolog, Pascal. This means (at least) using
> > higher-order functions and continuations in Scheme, data types and
> > modules in ML, type classes and monads in Haskell, backtracking and
> > logical variables in Prolog and nested function declarations, function
> > parameters, subrange types and value-passed arrays in Pascal.
> >
>
> I agree that a language designer should know some languages, but those
> obscure, hardly used, and dead languages? If there was a Lisp or Ada
> programmer here, they'd tout those languages as the best for various odd
> reasons too. It's possible programmers for Perl or Ruby or Java or C++
> would mention those too. Why is it that unsuccessful languages and dead
> languages, e.g., Fortran and Pascal, are always touted as the best languages
> to learn from? C and Forth are far more successful. Shouldn't you learn
> from success? Shouldn't you ignore the failures?

While I agree that it is good to learn from success, I don't agree that
failed languages should be ignored (for a given definition of success and
failure). You seem to have defined success as "currently widely used" and
failure as "not currently widely used", or something close to those
definitions, which I think is a fine definition.

As far as language design is concerned, on the other hand, I think the
important thing isn't whether a language is widely used today, but
rather the features it provides. The decision to include or not include
certain features can certainly be based on whether or not you think that
this feature will increase whatever you define as success for your language,
but I'd rather that someone who designed the next programming language
actually know the features that have already been invented. If they
still end up creating a language mostly like what's already out there,
then it will be the result of considering various options, rather than
simply not knowing any better.

In that regard, the list provided by Torben is a good one. The idea is to
give programming language designers a good background in what programming
paradigms and language features have already been invented - which serves
both to promote good ideas and to avoid repeating mistakes. I think that
makes a lot of sense.

As it happens, languages that are currently widely used occupy a relatively
small section of the realm of possibilities. C, C#, C++, Java, Perl, PHP,
Python, and VB.NET programs are invariably imperative. You won't find a
lot of functional or constraint-logic programming there. Most of these
languages have similar syntax. There is no strong culture of metaprogramming
there, except perhaps in C++, where it is mostly used to get something
similar to parametric polymorphism. Dependent types are nowhere near in
sight.

If I look at what languages have inspired me in my own quest for a better
programming language, the successful languages above are certainly among
them. But I've also been inspired by the Unix shell for its low barrier
between interactive use and writing scripts, by Scheme for its orthogonality
(provide the minimum set of features to be able to build anything, then
build the rest on top of those), by Common Lisp for its many powerful
methods of abstraction (defmacro, dynamic scoping and generic functions
being the more unique features), and Haskell for its powerful static
type system.

Every once in a while, features that were formerly only present in
"failed" languages make it into mainstream languages, and the software
industry benefits as a result. I'm glad people continue to work on these
"failed" languages and people continue to copy features from them.

Regards,

Bob

--
Furious activity is no substitute for understanding.

-- H.H. Williams

BartC

unread,
Nov 19, 2010, 7:01:58 AM11/19/10
to

"Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> wrote in message
news:1ig2ggfxl5dky.972x1y4t65yu$.dlg@40tude.net...


> On Thu, 18 Nov 2010 14:33:41 -0800 (PST), Joshua Maurice wrote:

>> assembly." In that regard, you can write a good FORTRAN compiler in C
>> without any FORTRAN or assembly (for the most part), but you cannot
>> write a good C compiler and C standard library in FORTRAN without some
>> of it written in C or assembly.

> No, even in FORTRAN-IV you could write a compiler. It had LOGICAL*1 data
> type (which served as a substitute for byte/character) and arrays of
> LOGICAL*1 [*].

(Wasn't there CHARACTER*1 too? And weren't these available only for
byte-addressed hardware?)

> This is enough to produce a compiler. In fact I even
> designed an interpreter of some language in FORTRAN-IV. BTW, at that time
> I
> seriously considered C as an alternative for the job. But, surprise, C was
> too heavy for the machine I used. It had only 64K, C took minutes to
> compile (in 5 passes) anything more complex than "hello world". The code
> it
> produced was catastrophic. So much for the urban legend of a
> lightweight-close-to-assembler C.

When I was building microprocessor systems, I'd heard stories of such
multi-pass compilers taking many minutes of floppy disk drives grinding away
for even the smallest program (c. 1982)

I needed turnaround times of a few seconds. So I just created my own
in-memory compiler, for a language that did the same sorts of things as C,
that was so fast (even on 8-bits) that compilation speed was never an issue.
And for years, its code even outperformed C. (I still use a descendent of
that language now, although C is usually a little faster these days.)

(And as this was pre-internet, I also saved some money not buying a
compiler...)

--
Bartc

Robbert Haarman

unread,
Nov 19, 2010, 7:54:22 AM11/19/10
to
On Fri, Nov 19, 2010 at 12:01:58PM -0000, BartC wrote:
>
> I needed turnaround times of a few seconds. So I just created my own
> in-memory compiler, for a language that did the same sorts of things
> as C, that was so fast (even on 8-bits) that compilation speed was
> never an issue. And for years, its code even outperformed C. (I
> still use a descendent of that language now, although C is usually a
> little faster these days.)

Interesting. Can I have a copy?

-- Bob


Robbert Haarman

unread,
Nov 19, 2010, 8:00:45 AM11/19/10
to
On Thu, Nov 18, 2010 at 11:08:05AM -0500, lawrenc...@siemens.com wrote:
>
> Casts are not used frequently in well written code.

Correct. And some languages do not allow many programs to be written
well, requiring casts to get around overly limited type systems. This
applies to C, and also to versions of Java without generics.
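
A small C sketch of what I mean (purely illustrative, with made-up
names): a container that holds "anything" has to traffic in void
pointers, and every retrieval needs a cast the compiler cannot check.

#include <stdio.h>

/* A "generic" pair in C: the elements are untyped void pointers. */
struct pair { void *first; void *second; };

int main(void)
{
    int    x = 1;
    double y = 2.5;
    struct pair p = { &x, &y };

    /* The casts are unchecked: swap them and the compiler is still
       happy, but the behaviour becomes undefined. */
    int    *px = (int *)p.first;
    double *py = (double *)p.second;
    printf("%d %f\n", *px, *py);
    return 0;
}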

Regards,

Bob

BartC

unread,
Nov 19, 2010, 8:09:29 AM11/19/10
to
"Torben Ægidius Mogensen" <tor...@diku.dk> wrote in message
news:7zeiahm...@ask.diku.dk...
> c...@tiac.net (Richard Harter) writes:

> I agree here too. Far too many atrocities have been "designed" by
> people who knew only one or two languages and wanted to "improve" them.
>
> A budding language designer should know at least the following languages
> well enough to code in the "natural" style for each language: Scheme,
> Standard ML, Haskell, Prolog, Pascal. This means (at least) using
> higher-order functions and continuations in Scheme, data types and
> modules in ML, type classes and monads in Haskell, backtracking and
> logical variables in Prolog and nested function declarations, function
> parameters, subrange types and value-passed arrays in Pascal.

Anyone clever enough to program in Scheme (ie. Lisp), ML and Haskell, would
likely create a language not much different.

So it would require a programmer also clever enough to get their head around
Scheme, ML and Haskell.

What about creating a language that anyone can program in? Or that anyone
can look at and have a fairly good idea of what it does and how to replicate
the code in another if necessary. (When I have to rewrite code for some
benchmark or other, I usually look to see if there's a Lua version first.
Lisp or Haskell would be the last I would bother with.)

(Your suggestion is like saying only authors who read, understand and
appreciate Shakespeare and Dostoevsky should be allowed to write popular
novels.)

> And a language designer should also have a fair knowledge of compiler
> design and interpreter design (including type inference and memory
> management). If not, you are likely to make bad design decisions
> because you are limited to what you know how to implement. Some of the
> weirder scope rules of scripting languages appear to be caused by the
> designers/implementors not knowing how to implement lexical closures
> (and if you don't know what that means, you are not qualified to design
> a programming language).

Well, I've got absolutely no idea what lexical closures are. That hasn't
stopped me creating my own languages, programming in them for 30 years, and
earning my living for a big chunk of that time too. The languages worked.
They were productive. And they were fast.

> This might sound a bit arrogant and "ivory tower"ish, but would you like
> the bridge you drive your car over daily to be designed by someone who
> just has seen and driven over a couple of bridges and thinks he can do
> better? Or would you rather have a trained engineer do it? Why would
> it be any different with programming languages? These, too, can break
> down at the worst possible time so people get killed or lose property.

But should I have to pay the salary of such a highly skilled engineer, or
spend years in college so that I have the skills myself, just to concrete
over the front drive of my house?

There is room for a variety of skill levels; not all programmers need to be
computer scientists with PhDs; and software can interact so that my lowly,
clunky script language can make use of a library function expertly written
in one of those fancy functional languages ... or would do if the designers
of the language had even considered such a possibility.

> That said, you will probably learn a lot by trying to design and
> implement a language even without having the prerequisite knowledge.
> But you should do this for the experience and not for the result.

From a language I'm using now:

println "Hello, World!"

Output:

Hello, World!

What are the design deficiencies here?

--
Bartc

Dmitry A. Kazakov

unread,
Nov 19, 2010, 8:30:31 AM11/19/10
to
On Fri, 19 Nov 2010 12:01:58 -0000, BartC wrote:

> "Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> wrote in message
> news:1ig2ggfxl5dky.972x1y4t65yu$.dlg@40tude.net...
>> On Thu, 18 Nov 2010 14:33:41 -0800 (PST), Joshua Maurice wrote:
>
>>> assembly." In that regard, you can write a good FORTRAN compiler in C
>>> without any FORTRAN or assembly (for the most part), but you cannot
>>> write a good C compiler and C standard library in FORTRAN without some
>>> of it written in C or assembly.
>
>> No, even in FORTRAN-IV you could write a compiler. It had LOGICAL*1 data
>> type (which served as a substitute for byte/character) and arrays of
>> LOGICAL*1 [*].
>
> (Wasn't there CHARACTER*1 too?

No characters. Who needs them anyway? (:-))

> And weren't these available only for byte-addressed hardware?)

Likely. The only machines where I enjoyed FORTRAN-IV were IBM and PDP-11.
When PC's came I was able to switch to a better language and MS Fortran was
as unusable as anything from them.

>> This is enough to produce a compiler. In fact I even
>> designed an interpreter of some language in FORTRAN-IV. BTW, at that time
>> I
>> seriously considered C as an alternative for the job. But, surprise, C was
>> too heavy for the machine I used. It had only 64K, C took minutes to
>> compile (in 5 passes) anything more complex than "hello world". The code
>> it
>> produced was catastrophic. So much for the urban legend of a
>> lightweight-close-to-assembler C.
>
> When I was building microprocessor systems, I'd heard stories of such
> multi-pass compilers taking many minutes of floppy disk drives grinding away
> for even the smallest program (c. 1982)

I guess it was a port from Unix Sys V. It is amazing how retarded Unix
and C were compared to decent OSes and languages at that time. They still are,
which defeats the argument that the best languages (OSes, technologies)
survive. On the contrary, the whole history of computing illustrates
negative selection. C is the worst language ever (I don't count joke
languages like Whitespace). Unix and Windows are the worst OSes. Internet
protocols like HTML/XML are a sheer disaster. Yet they were destined to win.

Robbert Haarman

unread,
Nov 19, 2010, 9:12:20 AM11/19/10
to
On Fri, Nov 19, 2010 at 01:09:29PM -0000, BartC wrote:
>
> Anyone clever enough to program in Scheme (ie. Lisp), ML and
> Haskell, would likely create a language not much different.

Possibly.

> So it would require a programmer also clever enough to get their
> head around Scheme, ML and Haskell.

Also possible.

It's not a given, though. Ruby was inspired by (among others) Smalltalk,
Scheme, and Perl, but you don't need to grasp those languages to be
able to program in Ruby. Python's creator knew and drew inspiration from
Modula 2 and 3, Algol 68 and Icon, but I would hazard a guess that
most people who program in Python don't know these other languages.

> What about creating a language that anyone can program in?

I think this has been tried and met with various degrees of success.
Languages that have been designed with this idea in mind include,
for example, BASIC and COBOL. IIRC, COBOL originally did not allow
user-defined functions, because being able to extend the vocabulary
would require people to learn more than a few hundred words in
order to learn the language. The versions of BASIC I started on
certainly didn't offer a way to add my own magic words to the
language. But sure, they were easy to learn.

Personally, I feel that programming is difficult to learn anyway. It
takes a certain mindset to be able to understand what things will
work and what things won't. Why does '10 PRINT "Hello, world!"' work
and not '10 PRINT THE REPORT'? Wrapping your head around that is
actually the hardest part and the reason that many people will never
be programmers at all. Once you have the right mindset, everything
can be learned: language features like subroutines, functions,
anonymous functions, closures, classes and instances, macros, namespaces,
types, etc. etc. But it's not just language features, it's also
concepts such as concurrency, object identity, version control,
networking, differences between your environment and that where
the software will be run, and many more.

> ([the suggestion that would-be language designers should know a variety
> of programming languages] is like saying only authors who read, understand


> and appreciate Shakespeare and Dostoevsky should be allowed to write
> popular novels.)

I didn't see anything in there about being allowed to design languages,
but I would submit that those who know different languages and paradigms
tend to design better languages, just like those who have read various
books tend to write better books.

And better languages does not necessarily mean that it takes more effort
to learn to use those languages, just as better books do not necessarily
require readers to understand and appreciate Shakespeare and Dostoevsky.

In fact, I would argue that the whole point of creating new programming
languages is to make things _easier_. That could be "easier to get started
with programming", but also "easier to build bug-free applications in the
Real World", "easier to build high-performance software",
"easier to build software to perform some specific task", or any number
of other things. Just like not all books are written for beginning
readers with small vocabularies, so don't all programming languages
have to be designed for people with a limited knowledge of programming
concepts. It's good that some language cater to that niche, but it's
absolutely not the only valid design goal.

> >And a language designer should also have a fair knowledge of compiler
> >design and interpreter design (including type inference and memory
> >management). If not, you are likely to make bad design decisions
> >because you are limited to what you know how to implement. Some of the
> >weirder scope rules of scripting languages appear to be caused by the
> >designers/implementors not knowing how to implement lexical closures
> >(and if you don't know what that means, you are not qualified to design
> >a programming language).
>
> Well, I've got absolutely no idea what lexical closures are. That
> hasn't stopped me creating my own languages, programming in them for
> 30 years, and earning my living for a big chunk of that time too.
> The languages worked. They were productive. And they were fast.

That's good, and I'm happy for you, and for those you have made happy
with the fruits of your labor. Keep up the good work!

On the other hand, having made a living isn't quite the same thing as
having done the best job possible. To me, creating new programming
languages (and tools in general) is all about making something _better_
than what is currently out there. Sure, I could be content being a
Java developer, and it will pay the bills, but I don't think Java is
the best language out there, let alone the best language for all tasks,
so I like to look around and think about how I can make things better.
That doesn't currently pay the bills for me, but I like to think that
it makes the world a better place, and that provides motivation, too.

Regards,

Bob

--
"The only 'intuitive' interface is the nipple. After that, it's all learned."
-- Bruce Ediger


Mark Wooding

unread,
Nov 19, 2010, 10:07:45 AM11/19/10
to
"BartC" <b...@freeuk.com> writes:

> Anyone clever enough to program in Scheme (ie. Lisp), ML and Haskell,
> would likely create a language not much different.

I pretty much gave up my ambitions in language design when I discovered
Common Lisp, because I didn't think I had anything especially useful to
add.

> What about creating a language that anyone can program in?

That's been tried before, many times, with different degrees of success.
I think a programming language designer with this objective would be
well served by /also/ studying other `beginner-oriented' languages (for
want of a better term); the ones that spring to my mind are BASIC, REXX,
and Smalltalk. (Smalltalk was originally intended for teaching
programming to children; even so, it has fairly sophisticated features,
including `blocks' which are effectively closures with an implicit
downward continuation, and a custom control constructs, which borrow
heavily from Lisp. Logo is also a Lisp dialect with slightly odd
syntax.)

> (Your suggestion is like saying only authors who read, understand and
> appreciate Shakespeare and Dostoevsky should be allowed to write popular
> novels.)

It would certainly save us from Dan Brown.

-- [mdw]

Mark Wooding

unread,
Nov 19, 2010, 10:58:20 AM11/19/10
to
"Rod Pemberton" <do_no...@notreplytome.cmm> writes:

> I agree that a language designer should know some languages, but those
> obscure, hardly used, and dead languages?

I think the important thing is to have a knowledge of the available
design space. Many of these `failed' languages had some valuable idea
which is worth knowing.

> If there was a Lisp or Ada programmer here, they'd tout those
> languages as the best for various odd reasons too.

It's not about being `best'; it's about exposure to interesting ideas.
I am a Lisp programmer, and I like Lisp a lot. But Lisp doesn't have
much of any great interest to say about static type systems: Haskell is
much more useful to study in this regard. Common Lisp's macro system is
very different from Scheme's, and Dylan's is different again; I think
all of these are worth studying for their own reasons. If a language
designer has decided not to provide syntactic abstraction mechanisms in
his latest language, I'll be more likely to be interested in his reasons
if I know he's studied Lisp, Scheme and Dylan than if he's obviously
ignorant of these powerful macro systems.

> It's possible programmers for Perl or Ruby or Java or C++ would
> mention those too.

Possibly. They're all quite similar: imperative languages with
single-dispatch object systems. Java is probably the least interesting
of the bunch. Perl has an `interesting' approach to objects; and its
use of `pronouns' has had a small influence elsewhere. I've not seen
any obvious influences of Ruby elsewhere, though it's obviously the
bastard lovechild of Perl and Smalltalk, which makes it fairly
interesting itself. C++ has a more-or-less unique approach to
compile-time metaprogramming and an object system whose main
contribution would seem to be in demonstrating by omission the benefits
of runtime metaobjects and a coherent strategy for dealing with multiple
inheritance.

A language designer interested in object oriented languages would be
well served by studying C++, Smalltalk, CLOS, and Dylan.

> Why is it that unsuccessful languages and dead languages, e.g.,
> Fortran and Pascal, are always touted as the best languages to learn
> from?

Pascal at least is simple. Fortran is worth studying simply because of
its longevity -- it has evolved a lot over the years, and the resulting
historical perspective is valuable in itself. I think studying Lisp is
similarly useful.

> C and Forth are far more successful. Shouldn't you learn from
> success?

It depends on what you mean by `success', really. I don't think
language design is a popularity contest. If I were to measure C's
success in terms of security vulnerabilities, I'd say it was a dismal
failure.

> Shouldn't you ignore the failures?

Certainly not! Mistakes teach us at least as much as successes. Even
the abject failure of a language design teaches us that particular ideas
aren't fruitful; but I don't know of any abject failures of language
design.

> Learning from failures only ensures that bad ideas continue to exist.

Is it better to put the bad ideas on display with a sign over them
saying `bad ideas here', or to hide them away so that people reinvent
them?

> Of the languages I've experienced, PL/I is the only other language as
> powerful as C. Yet, I've never seen anyone praise PL/I for
> anything...

Oh, I have; and rightly so. If only I had an implementation to hand,
I'd like to learn PL/I properly.

> It's not. But, you failed to mention the most important language:
> assembly. It's the foundation. Without solid knowledge of the
> microprocessor's specific assembly language, you cannot implement any
> of the other languages mentioned.

Writing interpreters in Scheme, for example, is almost trivial, and
requires no understanding of processor architecture at all. But, yes,
if you want to design a language which aims for efficient implementation
then an understanding of processor architectures and instruction sets
will serve you well; and actually implementing a compiler which
generates native code directly (rather than via C, for example)
obviously requires strong knowledge of the target system(s).

-- [mdw]

BartC

unread,
Nov 19, 2010, 11:45:08 AM11/19/10
to
"Robbert Haarman" <comp.la...@inglorion.net> wrote in message
news:2010111912...@yoda.inglorion.net...

Of the compiler? Or a spec of the language?

Most of my compilers are pretty crap. My main job was always something else,
and the language and compiler were just tools to achieve that, not
products in themselves.

The original compiler has long since disappeared, and the latest version was
created in the early 90s I think, and upgraded a few years ago to 32-bits
(so not exactly cutting edge...).

On the other hand, it is still my first choice (until my latest design is
ready, anytime now...) when needing a C-class compiled language, because
it's so familiar and comfortable to use, and it uses syntax I like. And the
performance is not bad for something that does no optimisation whatsoever.

--
Bartc


Waldek Hebisch

unread,
Nov 19, 2010, 2:40:16 PM11/19/10
to
In comp.lang.misc Joshua Maurice <joshua...@gmail.com> wrote:
>
> Sorry. I didn't put the right emphasis on it. The emphasis should have
> been: "C's primary design goal is to be a portable /assembly-like/
> programming language. It's meant just as a thin veneer on top of
> assembly." In that regard, you can write a good FORTRAN compiler in C
> without any FORTRAN or assembly (for the most part), but you cannot
> write a good C compiler and C standard library in FORTRAN without some
> of it written in C or assembly. That's the niche of C which will never
> go away.

I think you wanted to say a different thing: "you can write an operating
system or a C library in C, but cannot in other high-level languages".
Note that a compiler converts a stream of characters (the source program)
into a string of octets (the executable). Most reasonable languages
can be used to write the needed transformation. In fact, quite a lot
of compilers are written in languages other than C.

However, even if we change the claim from compilers to the C library and
operating systems it is still not true: you cannot write
"system-like code" in standard C; you need extra compiler-specific
extensions. If you add the needed extensions to a compiler for some
other language you can use it as well as C. Historically, Algol,
PL/1, Pascal, Modula 2, Ada, Lisp (and certainly many others)
were used to write operating systems. Fortran 77 is not well
suited to writing compilers or operating systems due to poor
support for data structures (but was nevertheless used at least
to write compilers). Java, Lisp and in general languages having
garbage collectors are tricky to use in system programming because
the garbage collector may introduce unpredictable delays, but
for critical routines one can typically use a subset which
is guaranteed not to trigger garbage collection.

So, why does C currently dominate system level programming? One
aspect is the wide availability of optimizing compilers with the needed
extensions. Another factor is inertia: there is a large
existing code base and it is hard to justify a rewrite (even fear
of buffer overflows does not change much). Also, there is the
question of interoperability: to make your code usable you
need to use C headers provided by others and give them your
header files.

Historically (in the eighties) C had an advantage: one could get
reasonably efficient object code using quite a dumb C compiler.
To get efficient code from Fortran, Pascal or Ada one needs an
optimizing compiler. Also, the first C compilers included
support for system programming while, for example, Wirth, the author
of the first Pascal compiler, rejected any such extension
(but added them later to Modula). Then, there are company
choices: the Bell Labs team invented C to write Unix, and later
Microsoft decided to use C as its main development language.
Both Unix and Microsoft products turned out to be quite
popular.


--
Waldek Hebisch
heb...@math.uni.wroc.pl

Pascal J. Bourguignon

unread,
Nov 19, 2010, 3:25:08 PM11/19/10
to
Waldek Hebisch <heb...@math.uni.wroc.pl> writes:

This is not a thick trick however.

Notice that the ANSI standard of Common Lisp doesn't specify any form of
memory management. No garbage collector is specified. You can have a
conforming Common Lisp implementation without a garbage collector. It's
an implementation choice. So you can design a Common Lisp
implementation (eg. a compiler) that generates code without a run-time
garbage collector for kernel development, if that was needed.


Then you can easily use lisp in its meta-programming role, and instead
of writing a kernel program you write a program that writes a kernel
program. This kernel program can be generated as a file of bytes, which
can then be copied to the boot sectors.

> So, why does C currently dominate system level programming? One
> aspect is the wide availability of optimizing compilers with the needed
> extensions. Another factor is inertia: there is a large
> existing code base and it is hard to justify a rewrite (even fear
> of buffer overflows does not change much). Also, there is the
> question of interoperability: to make your code usable you
> need to use C headers provided by others and give them your
> header files.

So, it's basically inertia and lack of tools.

> Historically (in the eighties) C had an advantage: one could get
> reasonably efficient object code using quite a dumb C compiler.
> To get efficient code from Fortran, Pascal or Ada one needs an
> optimizing compiler. Also, the first C compilers included
> support for system programming while, for example, Wirth, the author
> of the first Pascal compiler, rejected any such extension
> (but added them later to Modula). Then, there are company
> choices: the Bell Labs team invented C to write Unix, and later
> Microsoft decided to use C as its main development language.
> Both Unix and Microsoft products turned out to be quite
> popular.

--

BartC

unread,
Nov 19, 2010, 4:20:32 PM11/19/10
to

"Stefan Ram" <r...@zedat.fu-berlin.de> wrote in message
news:deficiences-2...@ram.dialup.fu-berlin.de...


> "BartC" <b...@freeuk.com> writes:
>>From a language I'm using now:

>>println "Hello, World!" (....)


>>What are the design deficiencies here?
>

> There is no specification of the langauge, so one cannot
> tell, whether the program is syntactically correct.
>
> There is no requirements specification, so one cannot
> tell, whether the program is semantically correct.

Well, my point was, it is possible to design simple languages without
knowing about lexical closures, continuations, and so on (these are terms I
always have to look up, then instantly forget again).

A real, useful and practical language is a little more involved than shown
in that example, but not much (even so, it does demonstrate a few useful
attributes that a language might have, such as lack of clutter, minimal
overheads, and just plain clarity).

> A possible deficiency might be (depending on the
> requirements): The program assumes an environment with a
> text console, so it will not work as a library part in a
> larger program with another UI, like a GUI or a deamon.

Yes, but these are just practical concerns; you can probably say the same
about:

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World");
    }
}

You don't really need to know about higher-order functions here.

Certainly, it is useful to at least look at a few different languages (and
steal a few ideas), but it is not necessary to be an expert programmer in
them.

--
Bartc

James Harris

unread,
Nov 19, 2010, 5:48:31 PM11/19/10
to
On Nov 19, 9:20 pm, "BartC" <b...@freeuk.com> wrote:

...

> Well, my point was, it is possible to design simple languages without
> knowing about lexical closures, continuations, and so on (these are terms I
> always have to look up, then instantly forget again).
>
> A real, useful and practical language is a little more involved than shown
> in that example but not much (even so, it does demonstrate a few useful
> attributes that a language might have, such as lack of clutter, minimal
> overheads, and just plain clarity)

Very much so. A language can be extremely useful even if it does *not*
have a lot of features. The diversity and amount of code written in
good, readable, clear C - a fairly basic language - has eminently
demonstrated that. IMHO C strikes a very good balance between being
high and low level.

For sure, there *are* some useful features over and above those
present in C that could be included in a new language but only if they
fit with the rest of the language rather than being bolted on, and
only if they genuinely add something useful. Adding a feature because
some other languages have it is obviously not a good recipe. So the
study of other languages can mislead a language designer if self-
control is not exercised.

...

> Certainly, it is useful to at least look at a few different languages (and
> steal a few ideas), but it is not necessary to be an expert programmer in
> them.

I've probably mentioned this before but ISTM very much easier to
modify an existing language design - making up for its weaknesses -
rather than to design a new language from scratch. Examples: B, C,
Python. I sometimes think I should go back to C or something like
Basic and rework my designs. An evolutionary design, rather than a
revolutionary one, is also often easier for other programmers to
accept.

James

James Harris

unread,
Nov 19, 2010, 6:08:04 PM11/19/10
to
On Nov 17, 9:05 am, R Kym Horsell <k...@duo2.kym.com> wrote:
> In comp.programming James Harris <james.harri...@googlemail.com> wrote:> Tony Hoare famously criticises himself for introducing null references
> > in the mid 1960s calling it his billion-dollar mistake.
> > I have some ideas but I'm not sure I fully understand what he means by
> > null references or null pointers (apart from the obvious that they do
> > not point to a valid object, such as location zero, which triggers an
> > error on use). Nor am I sure what he believes is so bad about them -
> > or what makes them much worse than pointers in general.
>
> [...]
>
> I think we've all seen programs that try to access something via
> a pointer but do not first check whether that pointer ref's a valid object.
> Hoare's original feeling was to dis-allow this, with some kind of
> error should any pointer not ref a valid object.
>
> "I call it my billion-dollar mistake. It was the invention of the null
> reference in 1965. At that time, I was designing the first comprehensive
> type system for references in an object oriented language (ALGOL W). My
> goal was to ensure that all use of references should be absolutely safe,
> with checking performed automatically by the compiler. But I couldn't resist
> the temptation to put in a null reference, simply because it was so easy
> to implement. This has led to innumerable errors, vulnerabilities, and system
> crashes, which have probably caused a billion dollars of pain and damage
> in the last forty years. In recent years, a number of program analysers like
> PREfix and PREfast in Microsoft have been used to check references, and
> give warnings if there is a risk they may be non-null. More recent programming
> languages like Spec# have introduced declarations for non-null references.
> This is the solution, which I rejected in 1965. "

As Rod said, thanks for posting this. That said the quote seems more
of a sound bite for media consumption. It doesn't get to the heart of
the issue that Hoare was commenting on.

If anyone's interested there is a video of Tony Hoare discussing the
issue at

http://www.infoq.com/author/Tony-Hoare

I watched the video before making the initial post but I think that
before replying I need to go over it again in the light of comments
people have made on this thread.

(BTW, if anyone is following the thread in only comp.programming you
might like to know that there are some extra comments in
comp.lang.misc that haven't been crossposted.)

James

James Harris

unread,
Nov 19, 2010, 6:31:23 PM11/19/10
to
On Nov 18, 6:41 am, tm <thomas.mer...@gmx.at> wrote:
> On 17 Nov., 09:40, James Harris <james.harri...@googlemail.com> wrote:
>
> > Anyone care to discuss the issue from the point of view of language
> > design? Should languages allow null references? Why or why not? (And
> > is it practical to remove them?)
>
> Pointers are the gotos of data. A goto is a POINTER to code where
> the execution should continue. A pointer says that you should GOTO
> to a specific place in memory to find more data.
>
> Structured programming hides gotos with structured statements.
> Structured data structures should also try to hide pointers.

I agree with this, at least in principle, and Tony Hoare makes the
same point in the video I linked to a few minutes ago at

http://www.infoq.com/author/Tony-Hoare

Gotos are incredibly useful if used properly. It is quite possible to
write perfectly structured code using gotos if certain restrictions
are kept to. However, if they are present in a piece of code someone
reading that code cannot readily tell whether the restrictions have
been adhered to or not. So we generally prefer iterations, selections
and exceptions. When only high-level control flow constructs can be
used we have assurance about program structure so the code is easier
to read and modify.

Similarly, pointers can be incredibly useful. (Rather than pointers my
current designs include what I call "references" as they are more
general. A reference could be an address in memory or it could be a
two-part reference such as object:offset. Or it could be simply an
alphanumeric key to be looked up in an array-like object etc.)

While control flow has been analysed and resolved to primitives that
(almost?) obviate the need for gotos I'm not sure the same can be said
for pointers or references.

> Many uses of pointers go away when several types of containers are
> available.

But how do those containers get defined? Rather than having to drop
out of the HLL to write them in foreign code or relying on limited pre-
supplied libraries ISTM better that they can be written in the HLL
itself.

Maybe better still is to provide the essential data structure
components and allow them to be combined as needed in the HLL. It's
desirable that someone reading a piece of code can understand it
without looking up lots of unfamiliar pre-built libraries.

James

James Harris

unread,
Nov 19, 2010, 6:58:21 PM11/19/10
to
On Nov 19, 10:48 pm, James Harris <james.harri...@googlemail.com>
wrote:

> On Nov 19, 9:20 pm, "BartC" <b...@freeuk.com> wrote:
>
> ...
>
> > Well, my point was, it is possible to design simple languages without
> > knowing about lexical closures, continuations, and so on (these are terms I
> > always have to look up, then instantly forget again).
>
> > A real, useful and practical language is a little more involved than shown
> > in that example but not much (even so, it does demonstrate a few useful
> > attributes that a language might have, such as lack of clutter, minimal
> > overheads, and just plain clarity)
>
> Very much so. A language can be extremely useful even if it does *not*
> have a lot of features. The diversity and amount of code written in
> good, readable, clear C - a fairly basic language - has eminently
> demonstrated that. IMHO C strikes a very good balance between being
> high and low level.
>
> For sure, there *are* some useful features over and above those
> present in C that could be included in a new language but only if they
> fit with the rest of the language rather than being bolted on, and
> only if they genuinely add something useful. Adding a feature because
> some other languages have it is obviously not a good recipe. So the
> study of other languages can mislead a language designer if self-
> control is not exercised.

On this point....

Not sure who said it first but there is a maxim on the idea of
achieving perfection: it has been achieved not when there is no longer
anything to add but when there is no longer anything to take away.

IIRC the subject of this thread, Hoare, makes a broadly similar point:
rather than being satisfied when so many features have been added that
something has no obvious deficiencies, stop when there are so few
features that the object has obviously no deficiencies.

Another, attributed to William of Ockham/Occam: entities [shall we say
features in this case] should not be multiplied beyond necessity. I
stress the word *necessity* in that.

Of course there is a balance to be struck. Without that a design
will naturally be unbalanced. Hence, one more allusion. Einstein
probably had it best: things should be as simple as possible but no
simpler.

James

BGB

unread,
Nov 19, 2010, 10:57:49 PM11/19/10
to

but having more features is easier to take to market than fewer features...

something can be elegantly simple, and people will be like "yep, obvious
enough" and promptly forget that it ever existed.

or, something can be largely unusable but endlessly complicated and with
long bullet-lists of features (with acronym names and version numbers)
and numerous tomes of documentation (nearly all of which will be
severely outdated 3-6 months later), and the marketing hype will be
massive...


hence, one can get by fairly well by making the internals surprisingly
simple, and inventing some big elaborate front-end interface, much like
OO facilities would be only maybe a few kloc internally except for the
1500 API calls most of which do little more than re-call to the other
API calls or maybe cast argument or return types or similar...

it is incredible in this way...

why have 30 functions which address nearly all use cases when you can
have 30 classes which address a few of the common use cases?...

(a few of the other use-cases to be addressed by subsequent features and
more acronyms with version numbers).


(ok, admitted, there is slight satire here, but sadly even my own hobby
projects are far from being immune to excess internal complexity
sometimes...).


or such...

Rod Pemberton

unread,
Nov 20, 2010, 3:42:25 AM11/20/10
to
"BGB" <cr8...@hotmail.com> wrote in message
news:ic7h03$7hk$1...@news.albasani.net...

>
> hence, one can get by fairly well by making the internals surprisingly
> simple, and inventing some big elaborate front-end interface, much like
> OO facilities would be only maybe a few kloc internally except for the
> 1500 API calls most of which do little more than re-call to the other
> API calls or maybe cast argument or return types or similar...
>
> it is incredible in this way...
>
> why have 30 functions which address nearly all use cases when you can
> have 30 classes which address a few of the common use cases?...
>
> (a few of the other use-cases to be addressed by subsequent features and
> more acronyms with version numbers).
>
>
> (ok, admitted, there is slight satire here,

Slight satire? Really? That's exactly how things are done... A small set
of functionality is used to build a larger one.

Many Forths build up from 35-45 primitives, some as few as 9. Some Lisps
build up from as few as 9 or 14 operations. C libraries can be implemented
with only 18 to 21 OS functions, and early pre-ANSI C libraries with only
11. See here below the .sig:
http://groups.google.com/group/comp.lang.forth/msg/10872cb68edcb526
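
As a side note, a minimal sketch of that idea - one C library routine
sitting directly on a single OS primitive - could look like this
(assuming a POSIX-style write() call; the routine name is hypothetical):

#include <unistd.h>

/* Hypothetical putchar-style routine built on one OS function. */
int my_putchar(int c)
{
    unsigned char ch = (unsigned char)c;
    if (write(1, &ch, 1) != 1)   /* fd 1 = standard output */
        return -1;               /* signal failure, EOF-style */
    return ch;
}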


Rod Pemberton


Rod Pemberton

unread,
Nov 20, 2010, 3:43:49 AM11/20/10
to
"Waldek Hebisch" <heb...@math.uni.wroc.pl> wrote in message
news:ic6jr0$133$1...@z-news.wcss.wroc.pl...

>
> However, even if we change claim from compilers to C library and
> operating systems its is still not true: you can not write
> "system like code" in standard C, you need extra compiler specific
> extensions.
>

I think you only need one extension for some assembly on some platforms and
that can be handled by the preprocessor and not C itself...

I can easily imagine not needing any extensions for a simple 8-bit or 16-bit
microprocessor. The small in-progress OS I'm working on for 32-bit x86 is
almost entirely in ANSI C. It uses some extensions, e.g., to inline code,
but the only extension that is required is for assembly to use
microprocessor instructions not supported by C.
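
As a rough illustration, one common form of such an extension is
GCC-style inline assembly (shown here only as a sketch, not standard C),
used where C has no equivalent instruction:

/* Write a byte to an x86 I/O port - an operation plain C cannot express. */
static inline void outb(unsigned short port, unsigned char value)
{
    __asm__ __volatile__ ("outb %0, %1" : : "a"(value), "Nd"(port));
}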

> If you add needed extensions to compiler for some
> other language you can use it as well as C. Historically, Algol,
> PL/1, Pascal, Modula 2, Ada, Lisp (and certainly many others)
> were used to write operating systems. Fortran 77 is not well
> suited to writing compilers or operating systems due to poor
> support for data structures (but nevertheless was used at least
> to write compilers). Java, Lisp and in general languages having
> garbage collectors are tricky to use in system programming because
> garbage collector may introduce unpredictable delays, but
> for critical routines one can typically use a subset which
> is guaranteed not to trigger garbage collection.
>

And, C++ is supposedly difficult because of its memory allocator, e.g., new
and delete. So, ISTM, that if the language is too old or too new, it has
complications unsuitable for low-level programming. It needs to be of a
certain age, e.g., C or Forth, to be just right...


Rod Pemberton


Rod Pemberton

unread,
Nov 20, 2010, 3:44:42 AM11/20/10
to
"James Harris" <james.h...@googlemail.com> wrote in message
news:384bbb7e-0945-41c4...@o2g2000vbh.googlegroups.com...

> On Nov 18, 6:41 am, tm <thomas.mer...@gmx.at> wrote:
> > On 17 Nov., 09:40, James Harris <james.harri...@googlemail.com> wrote:
>
> > > Anyone care to discuss the issue from the point of view of language
> > > design? Should languages allow null references? Why or why not? (And
> > > is it practical to remove them?)
>
> > Pointers are the gotos of data. A goto is a POINTER to code where
> > the execution should continue. A pointer says that you should GOTO
> > to a specific place in memory to find more data.
>
> > Structured programming hides gotos with structured statements.
> > Structured data structures should also try to hide pointers.
>
> I agree with this, at least in principle, and Tony Hoare makes the
> same point in the video I linked to a few minutes ago at
>
> http://www.infoq.com/author/Tony-Hoare
>
> Gotos are incredibly useful if used properly. It is quite possibly to
> write perfectly structured code using gotos if certain restrictions
> are kept to.
>

Yes.

http://en.wikipedia.org/wiki/Control_flow#Minimal_structured_control_flow
http://en.wikipedia.org/wiki/Structured_program_theorem
http://en.wikipedia.org/wiki/Go_To_Statement_Considered_Harmful
http://en.wikipedia.org/wiki/Spaghetti_code

All loops reduce to gotos...
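
A trivial C sketch of that reduction - the structured loop and the goto
form express exactly the same control flow:

int sum_structured(int n)
{
    int total = 0;
    while (n > 0) {     /* structured form */
        total += n;
        n--;
    }
    return total;
}

int sum_goto(int n)
{
    int total = 0;
top:
    if (!(n > 0)) goto done;    /* loop test */
    total += n;                 /* loop body */
    n--;
    goto top;                   /* back edge */
done:
    return total;
}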

> However, if they are present in a piece of code someone
> reading that code cannot readily tell whether the restrictions have
> been adhered to or not.

Yes.

That's the problem with an unstructured BASIC. The original author
programmed and tested the program with an understanding of what was being
done and where the GOTOs were supposed to go. But as soon as someone who
isn't familiar with the code modifies it, or the original author returns
at some later date having forgotten what was done, the GOTOs would result
in code that was in error. It also happens when the program becomes too
large for the programmer to keep track of all the nesting, i.e., spaghetti
code.

> However, if they are present in a piece of code someone
> reading that code cannot readily tell whether the restrictions have
> been adhered to or not. So we generally prefer iterations, selections
> and exceptions. When only high-level control flow constructs can be
> used we have assurance about program structure so the code is easier
> to read and modify.
>

Yes.

> Similarly, pointers can be incredibly useful. (Rather than pointers my
> current designs include what I call "references" as they are more
> general. A reference could be an address in memory or it could be a
> two-part reference such as object:offset. Or it could be simply an
> alphanumeric key to be looked up in an array-like object etc.)
>

Ok.

> While control flow has been analysed and resolved to primitives that
> (almost?) obviate the need for gotos

Yes, see Böhm and Jacopini or Edsger Dijkstra.

> While control flow has been analysed and resolved to primitives that
> (almost?) obviate the need for gotos I'm not sure the same can be said
> for pointers or references.

I doubt it... You have to have some method(s) to access the object, i.e., an
address or pointer... You also need to have some method(s) to access
intermediate portions of an object, i.e., indexing, named sub-elements, etc.
Many of them can be hidden from the programmer, but I'm unsure about all of
them.


Rod Pemberton


Rod Pemberton

unread,
Nov 20, 2010, 3:47:25 AM11/20/10
to
"Robbert Haarman" <comp.la...@inglorion.net> wrote in message
news:2010111911...@yoda.inglorion.net...

>
> As far as language design is concerned, on the other hand, I think the
> important thing isn't whether a language is widely used today, but
> rather the features it provides.
>

Warning, I'm going to state the obvious.

Shouldn't ONE language have all the best features of those other languages?
Is there such a language? No. Why not? (Please think about that before
continuing.)

> The decision to include or not include
> certain features can certainly be based on whether or not you think that
> this feature will increase whatever you define of success for your
> language,
> but I'd rather that someone who designed the next programming language
> actually know the features that have already been invented. If they
> still end up creating a language mostly like what's already out there,
> then it will be the result of considering various options, rather than
> simply not knowing any better.
>

No. That just biases them to include a good feature of one language as a
bad feature in another language. The entire language needs to work well
together. I think that fundamental idea is something everyone here is
missing. Picking and choosing good features from various languages won't
allow you to create a good language overall that has those features.

> In that regard, the list provided by Torben is a good one. The idea is to
> give programming language designers a good background in what programming
> paradigms and language features have already been invented - which serves
> both to promote good ideas and to avoid repeating mistakes. I think that
> makes a lot of sense.
>

The good ideas are being taken out of context: their original language.
Outside that context they aren't good ideas.


Rod Pemberton

Rod Pemberton

unread,
Nov 20, 2010, 3:47:53 AM11/20/10
to
"Robbert Haarman" <comp.la...@inglorion.net> wrote in message
news:2010111914...@yoda.inglorion.net...

>
> In fact, I would argue that the whole point of creating new programming
> languages is to make things _easier_. That could be "easier to get started
> with programming", but also "easier to build bug-free applications in the
> Real World", "easier to build high-performance software",
> "easier to build software to perform some specific task", or any number
> of other things.
>

But, they obviously conflict. It's unlikely you can have more than one of
them without bugs... Isn't it?

How do you design a language which is "easier to implement" that doesn't
create higher level, programmer created bugs?

How do you design a language which is "easier to program applications" or
allows high level abstractions that doesn't introduce low-level, unforeseen
bugs?


Rod Pemberton

Rod Pemberton

unread,
Nov 20, 2010, 3:49:13 AM11/20/10
to
"James Harris" <james.h...@googlemail.com> wrote in message
news:bdb6eb85-8ca1-4a39...@j21g2000vba.googlegroups.com...

>
> Not sure who said it first but there is a maxim on the idea of
> achieving perfection: it has been achieved not when there is no longer
> anything to add but when there is no longer anything to take away.
>

It's similar to the concept of "factoring" in Forth. It's used to reduce
"words", i.e., subroutines, to the smallest quantity of usable
functionality.

"The term 'factoring' has been used in the Forth community since at least
the early 1980s. Chapter Six of Leo Brodie's book Thinking Forth (1984) is
dedicated to the subject."
http://en.wikipedia.org/wiki/Code_refactoring

Also, Jochen Liedtke's "minimality principle" is similarly stated. He is a
key developer of micro-kernels:
http://en.wikipedia.org/wiki/Microkernel#Essential_components_and_minimality

> IIRC the subject of this thread, Hoare, makes a broadly similar point:
> rather than being satisfied when so many features have been added that
> something has no obvious deficiencies, stop when there are so few
> features that the object has obviously no deficiencies.

...


Rod Pemberton


Rod Pemberton

unread,
Nov 20, 2010, 3:49:41 AM11/20/10
to
"James Harris" <james.h...@googlemail.com> wrote in message
news:8ba96bee-86d5-457d...@j9g2000vbr.googlegroups.com...

>
> I sometimes think I should go back to C
>

There have been various extensions to C and derivations of C:

C with Classes, i.e., C++

Java

Objective-C
http://en.wikipedia.org/wiki/Objective-C

C Blocks extension
http://en.wikipedia.org/wiki/Blocks_(C_language_extension)

ISO embeddable C TR18037

GLSL
http://en.wikipedia.org/wiki/GLSL

Cg
http://en.wikipedia.org/wiki/Cg_(programming_language)

CUDA
http://en.wikipedia.org/wiki/CUDA


Rod Pemberton


Rod Pemberton

unread,
Nov 20, 2010, 3:59:46 AM11/20/10
to
"Stefan Ram" <r...@zedat.fu-berlin.de> wrote in message
news:BASIC-2010...@ram.dialup.fu-berlin.de...
> "Rod Pemberton" <do_no...@notreplytome.cmm> writes:
>
> >I'd prefer the programmer who learned with BASIC. Why?
>
> Did you learn programming with BASIC yourself?
>

Yes and No. I read about programming in BASIC and learned a bit about it.
But, I had no access to a computer. LOGO was my first language learned on a
computer, taught in a night class that was intended for adults. I usually
forget about those experiences... Then, BASIC was my second language
learned on a computer, self-taught in the early '80's, and using structured
programming concepts too. I then learned 6502 assembly, also self-taught.
After that, I learned Pascal and Fortran in school, followed by numerous
other programming and scripting languages, mostly self-taught since I wasn't
a CS major, such as C in the early '90's, and later PL/I in the late '90's
while working for a brokerage. C is by far the best, IMO. Of course, as
someone recently pointed out, I have no experience in "functional"
languages, or more recent languages. But, ISTM, they are mostly reinventing
the wheel, since I've not seen anything that cannot be done in C, yet. A
few have commented on not reinventing the wheel in this thread, but that
seems to be what is being done by post C languages...


Rod Pemberton


Robbert Haarman

unread,
Nov 20, 2010, 4:30:55 AM11/20/10
to
On Sat, Nov 20, 2010 at 03:47:53AM -0500, Rod Pemberton wrote:
> "Robbert Haarman" <comp.la...@inglorion.net> wrote in message
> news:2010111914...@yoda.inglorion.net...
> >
> > In fact, I would argue that the whole point of creating new programming
> > languages is to make things _easier_. That could be "easier to get started
> > with programming", but also "easier to build bug-free applications in the
> > Real World", "easier to build high-performance software",
> > "easier to build software to perform some specific task", or any number
> > of other things.
> >
>
> But, they obviously conflict. It's unlikely you can have more than one of
> them without bugs... Isn't it?

Yes, these goals can be conflicting. This is why I think there is room
for multiple programming languages. No single language is going to be best
for every scenario, and so we have more than one.

Regards,

Bob

Robbert Haarman

unread,
Nov 20, 2010, 4:51:14 AM11/20/10
to
Hi Rod,

On Sat, Nov 20, 2010 at 03:47:25AM -0500, Rod Pemberton wrote:
> "Robbert Haarman" <comp.la...@inglorion.net> wrote in message
> news:2010111911...@yoda.inglorion.net...
> >
> > As far as language design is concerned, on the other hand, I think the
> > important thing isn't whether a language is widely used today, but
> > rather the features it provides.
> >
>
> Warning, I'm going to state the obvious.
>
> Shouldn't ONE language have all the best features of those other languages?
> Is there such a language? No. Why not? (Please think about that before
> continuing.)

I would very much like to know the answer to that.

I think part of the reason is that people disagree about what things are
good ideas. There are things that I think are obviously good ideas,
which other people vehemently argue against, and vice versa.

Another part of the reason is that getting a great programming language out
there is very, very hard. The first hurdle is actually coming up with a
great programming language, and a lot of people fall down right there. I
know that I have come up with several ideas that, in retrospect, weren't
the best - and I now know better, because I have studied languages
that do things better than I had originally thought it up.

Once you have a great programming language, you still need a good
implementation and good documentation. You also may want people other
than yourself to actually start using the language and the implementation.
All the while, you are competing with giants like Microsoft, Oracle,
and Google, who are pushing their own languages that probably have 90%
of what makes your language great, but have huge resources to throw at
their projects, large propaganda machines, and a lot of existing mind
share.

As if that wasn't bad enough, you will also have hordes of people who
will flame you because your implementation isn't fast enough, your
documentation isn't good enough, or they already have their favorite
language and they refuse to learn anything else, but want to justify
this to themselves by convincing themselves that that other language
really isn't any better.

Getting a new programming language out there makes an uphill battle
seem easy.

> > In that regard, the list provided by Torben is a good one. The idea is to
> > give programming language designers a good background in what programming
> > paradigms and language features have already been invented - which serves
> > both to promote good ideas and to avoid repeating mistakes. I think that
> > makes a lot of sense.
> >
>
> The good ideas are being taken out of context: their original language.
> Outside that context they aren't good ideas.

I disagree. Many good ideas are simply good ideas. Even if they do
depend on a particular context, you're designing a language here, so you
can bring in as much context as you need. Even if you don't want to do
that, just knowing about the idea doesn't mean you are somehow forced to
bolt it onto your own language. By learning about what has already been
invented and tried, you have a lot to gain and little to lose.

Regards,

Bob

--
"The first casualty of war is truth."


Robbert Haarman

unread,
Nov 20, 2010, 6:41:26 AM11/20/10
to
Hi James,

On Fri, Nov 19, 2010 at 03:31:23PM -0800, James Harris wrote:
>
> While control flow has been analysed and resolved to primitives that
> (almost?) obviate the need for gotos I'm not sure the same can be said
> for pointers or references.

What do you mean by obviating the need for pointers or references? What
are the issues with pointers or references that you would want to
avoid by getting rid of pointers and references?

Regards,

Bob

Robbert Haarman

unread,
Nov 20, 2010, 6:52:12 AM11/20/10
to
On Fri, Nov 19, 2010 at 04:45:08PM -0000, BartC wrote:
> "Robbert Haarman" <comp.la...@inglorion.net> wrote in message
> news:2010111912...@yoda.inglorion.net...
> >On Fri, Nov 19, 2010 at 12:01:58PM -0000, BartC wrote:
> >>
> >>I needed turnaround times of a few seconds. So I just created my own
> >>in-memory compiler, for a language that did the same sorts of things
> >>as C, that was so fast (even on 8-bits) that compilation speed was
> >>never an issue. And for years, it's code even outperformed C. (I
> >>still use a descendent of that language now, although C is usually a
> >>little faster these days.)
> >
> >Interesting. Can I have a copy?
>
> Of the compiler? Or a spec of the language?

Both, if you would be so kind. I'm just curious. :-) You have done
something there which I find very interesting. It also sounds a bit
similar to what I am trying to do with the Voodoo programming language,
although that isn't so much about compilation speed as it is about
providing a simple, portable language that can be used as a target
to support multiple platforms without sacrificing too much performance.

Regards,

Bob


BartC

unread,
Nov 20, 2010, 7:24:24 AM11/20/10
to
"Robbert Haarman" <comp.la...@inglorion.net> wrote in message
news:2010112009...@yoda.inglorion.net...

> Once you have a great programming language, you still need a good
> implementation and good documentation. You also may want people other
> than yourself to actually start using the language and the implementation.
> All the while, you are competing with giants like Microsoft, Oracle,
> and Google, who are pushing their own languages that probably have 90%
> of what makes your language great, but have huge resources to throw at
> their projects, large propaganda machines, and a lot of existing mind
> share.

Some people just don't like dealing with big, bloated, 'corporate' products.

For example, Basic is a simple language, yes? Yet I think the last version
of VB6 came in a 3.5GB distribution. 3500 Megabytes. (And that wasn't even
VB.NET...). Didn't Basic use to be available in a 2KB ROM?

Then there are the technologies: COM, DDE, OLE (these are all about 20 years
old, there'll be a whole bunch of new ones now.). Is this really what
programming is going to be about?

(Taking just one example from Win32 API that I had to use: to achieve just
one minor task, say setting a 'tooltip' for a 'button', in a highly
restrictive manner, requires learning a dozen new functions, twenty new
struct layouts, a bunch of new macros, and dozens of new message codes. Plus
two days to investigate why it doesn't work...

When I make something like this available in one of my APIs, it might look
like: Tooltip(Btn,Text), and it would work immediately...)

Even forgetting the corporate stuff, take a language like C, which on the
face of it seems straightforward, and everyone has a good word for it.

Then you look closer, at its diabolical type declarations that you have to
learn, at its macro language (which can quickly render any program
unreadable). And when you have to compile modules, you have compilers like
gcc with 500 pages of docs just to explain its 1200 command-line options.
And to build an application, you have yet another language in 'make-files'.

It's not surprising some people think they can do better: achieve the same
sorts of things, but in a simpler way!

> As if that wasn't bad enough, you will also have hordes of people who
> will flame you because your implementation isn't fast enough, your
> documentation isn't good enough, or they already have their favorite
> language and they refuse to learn anything else, but want to justify
> this to themselves by convincing themselves that that other language
> really isn't any better.
>
> Getting a new programming language out there makes an uphill battle
> seem easy.

With my language designs, I wouldn't even bother trying to make them public.
(And they wouldn't be considered 'sexy' compared with many that are
available.)

Besides, prying other people away from their favourite languages seems
pretty much impossible anyway.

And, a lot of the advantage of your own language is that it is your own
language...

--
Bartc

BartC

unread,
Nov 20, 2010, 7:46:40 AM11/20/10
to

"Robbert Haarman" <comp.la...@inglorion.net> wrote in message
news:2010112011...@yoda.inglorion.net...

> On Fri, Nov 19, 2010 at 04:45:08PM -0000, BartC wrote:
>> "Robbert Haarman" <comp.la...@inglorion.net> wrote in message
>> news:2010111912...@yoda.inglorion.net...
>> >On Fri, Nov 19, 2010 at 12:01:58PM -0000, BartC wrote:
>> >>
>> >>I needed turnaround times of a few seconds. So I just created my own
>> >>in-memory compiler, for a language that did the same sorts of things
>> >>as C, that was so fast (even on 8-bits) that compilation speed was
>> >>never an issue. And for years, it's code even outperformed C. (I
>> >>still use a descendent of that language now, although C is usually a
>> >>little faster these days.)
>> >
>> >Interesting. Can I have a copy?
>>
>> Of the compiler? Or a spec of the language?
>
> Both, if you would be so kind. I'm just curious. :-)

OK, is that your real email at the top of this message?

> You have done
> something there which I find very interesting. It also sounds a bit
> similar to what I am trying to do with the Voodoo programming language,

I doubt if it's anything similar to that. My compiler just turns:

a:=b+c

into text similar to:

mov eax,[b]
add eax,[c]
mov [a],eax

(since it became 32-bit, it generates Nasm-format .asm files, and requires
normal assembling and linking. Previously it didn't use a linker, generating
.exe files itself.)
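
Purely as a hypothetical sketch (not the compiler described above), a
naive statement-at-a-time translator needs very little machinery to
produce that kind of output:

#include <stdio.h>

/* Emit x86 assembly for "dest := src1 + src2", one statement at a time,
   with no optimization. Names and output format are illustrative only. */
static void emit_add(const char *dest, const char *src1, const char *src2)
{
    printf("    mov eax,[%s]\n", src1);
    printf("    add eax,[%s]\n", src2);
    printf("    mov [%s],eax\n", dest);
}

int main(void)
{
    emit_add("a", "b", "c");   /* prints the code for a := b + c */
    return 0;
}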

> although that isn't so much about compilation speed as it is about
> providing a simple, portable language that can be used as a target
> to support multiple platforms without sacrificing too much performance.

Again, I only target x86.

(About this language/compiler: virtually all my programming now is done in an
interpreted/dynamic language of mine. The compiled language is only used
these days when I have to: maintaining the interpreter for the dynamic
language; maintaining itself (as it's written in itself, obviously); and
working on a Loader/Runtime for a new language)

--
Bartc

Robbert Haarman

unread,
Nov 20, 2010, 8:17:32 AM11/20/10
to
On Sat, Nov 20, 2010 at 12:46:40PM -0000, BartC wrote:
>
> "Robbert Haarman" <comp.la...@inglorion.net> wrote in message
> news:2010112011...@yoda.inglorion.net...
> >On Fri, Nov 19, 2010 at 04:45:08PM -0000, BartC wrote:
> >>"Robbert Haarman" <comp.la...@inglorion.net> wrote in message
> >>news:2010111912...@yoda.inglorion.net...
> >>>
> >>>Interesting. Can I have a copy?
> >>
> >>Of the compiler? Or a spec of the language?
> >
> >Both, if you would be so kind. I'm just curious. :-)
>
> OK, is that your real email at the top of this message?

It will work. You could also substitute inglorion for the part before
the at.

Thanks!

Bob


Message has been deleted

BartC

unread,
Nov 20, 2010, 11:34:26 AM11/20/10
to

"Mark Wooding" <m...@distorted.org.uk> wrote in message
news:87lj4p2vy...@metalzone.distorted.org.uk...
> "BartC" <b...@freeuk.com> writes:
>
>> Anyone clever enough to program in Scheme (ie. Lisp), ML and Haskell,
>> would likely create a language not much different.
>
> I pretty much gave up my ambitions in language design when I discovered
> Common Lisp, because I didn't think I had anything especially useful to
> add.

If you still feel a need to help humanity, you might investigate what it is
about Lisp that puts a lot of people off it, and see if it can be made more
palatable.

>> (Your suggestion is like saying only authors who read, understand and
>> appreciate Shakespeare and Dostoevsky should be allowed to write popular
>> novels.)
>
> It would certainly save us from Dan Brown.

(He's not that bad; I managed to get through two-thirds of one of his books
once.)

--
Bartc

Message has been deleted

Pascal J. Bourguignon

unread,
Nov 20, 2010, 1:03:19 PM11/20/10
to
Robbert Haarman <comp.la...@inglorion.net> writes:

In this thread, the problem with pointers that we wanted to avoid is
that of the null pointer. Namely, the fact that a null pointer cannot
be meaningfully dereferenced, and that in uncontrolled environments,
will produce random behavior or crashes.

Additionally, a language that allows for uninitialized variables would
not benefit from the suppression of null pointers.

So if we consider a language where there are no null pointers, and there
are no uninitialised variables, we cannot have a function such as
malloc, which returns a block of memory that is untyped and uninitialized
(or initialized to clear bits, which could be interpreted as a null
pointer!).


So, if instead of having to write:

typedef int object;
typedef struct node { object element; struct node* next; } node;

node* f(){
    node* c=malloc(sizeof(node));
    if(c!=0){
        c->element=42;
        c->next=NULL;
    }else{
        error(out_of_memory);
    }
    return(c);
}

you can write:

(defun f ()
  (let ((c (cons 42 nil)))
    c))

you avoid the risks of writing:


node* f(){
    node* c=malloc(sizeof(node));
    c->element=42;
    c->next=NULL;
    return(c);
}

which could try to write 42 at the address of the NULL pointer which
would crash the program in sane systems, or:

printf("%d\n",f()->next->element);

which would try read from the address of the NULL pointer which would
crash the program in sane systems.


On the other hand, without a NULL pointer, uninitialized variables and malloc,

(defun f ()
  (let ((c (cons 42 nil)))
    c))

always returns a valid value. (cdr (f)) is always a valid object (a
symbol named "NIL"), and (car (cdr (f))) or (cdr (cdr (f))) is always a
valid object [ still the symbol named "NIL", since
(car nil) = nil = (cdr nil) ].


Now this doesn't avoid the need for the programmer to test for the guard
object:

(let ((list (list 1 2 3)))
  (loop until (eql nil list)
        do (print (car list)) (setf list (cdr list))))

But an erroneous program such as:

(let ((list (list 1 2 3)))
  (loop do (print (car list)) (setf list (cdr list))))

would print an infinite number of nil instead of crashing or worse,
having an undefined behavior.

James Harris

unread,
Nov 20, 2010, 7:20:34 PM11/20/10
to
On Nov 20, 11:41 am, Robbert Haarman <comp.lang.m...@inglorion.net>
wrote:

Well, I don't fully know yet. :-( I've read your and others' comments
but not yet gone back over Hoare's lecture. I'll post some proper
replies once I've done that and had time to assimilate.

In the context above I don't mean to remove references themselves but
to manage their manipulation so as to maintain safety contracts, much
as control structures do for jumps.

I can't help feeling there's an issue of exclusive vs shared referents
that's not been addressed in general but on this too I need to check
some more before coming back to you. Thanks for all the responses so
far. It's good to see c.l.m is as alive as before and not home just to
the spam, the wind and the tumbleweed.

James

Robbert Haarman

unread,
Nov 20, 2010, 7:57:28 PM11/20/10
to
Hi James,

On Sat, Nov 20, 2010 at 04:20:34PM -0800, James Harris wrote:
>
> In the context above I don't mean to remove references themselves but
> to manage their manipuation so as to maintain safety contracts, much
> as control structures do for jumps.

I have been thinking about references for safety reasons as well. As
far as I can tell, the way to be safe is to only allow programs to
hold valid references. That is, when the reference is used, it must
point to a live object, and the object must be of the type expected
by the code that uses it. You don't want wild pointers or dangling
pointers.

It seems to me that the best way to get safe references is to
allow a program to obtain references only by creating the corresponding
objects, and to only destroy objects after the last reference to them
has been lost. In other words, no null pointers, no way for programs
to manufacture their own pointers, and no way to destroy objects you
still have references to. Also, types must either be propagated and
checked at compile time, or there need to be run-time type checks.
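
In C terms, only as a rough sketch of that discipline (C itself cannot
enforce it, and the names here are hypothetical): expose an opaque type
whose values can come only from a constructor.

/* widget.h - clients see only an opaque type and a constructor,
   so they have no way to manufacture a reference themselves. */
struct widget;
struct widget *widget_create(int v);
int widget_value(const struct widget *w);

/* widget.c */
#include <stdlib.h>
struct widget { int value; };

struct widget *widget_create(int v)
{
    struct widget *w = malloc(sizeof *w);
    if (w) w->value = v;
    return w;   /* note: still NULL on allocation failure - one of the
                   gaps a safer language would have to close */
}

int widget_value(const struct widget *w)
{
    return w->value;
}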

There certainly are alternatives. You could allow null pointers and
still be safe. Java does this, for example. Similarly, you could
allow objects to be destroyed while there are still references to them
and still be safe. It's a matter of allowing invalid access to be
detected, and specifying the behavior that should occur in such cases.
However, the way I see it, we can prove the absence of such
invalid accesses at compile time, which makes our programs more
reliable. Languages like OCaml and Haskell show that this is a
perfectly viable approach.

> I can't help feeling there's an issue of exclusive vs shared referents
> that's not been addressed in general but on this too I need to check
> some more before coming back to you.

There are certainly issues with exclusive vs. shared referents. In the
context of safety, there are dangling references and double frees to be
worried about when explicit destruction of objects is allowed. Another
issue which I feel receives far too little attention in general is
that of object identity. Many languages offer an operator to test
for equality, but don't strictly define what equality means. Sometimes,
you want to know if two strings have the same characters in them.
Sometimes, you want to know if two references refer to the same string.
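
In C, for instance, the two questions are answered by different
operations, and confusing them is a classic bug; a small illustration:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char a[] = "hello";
    char b[] = "hello";
    char *p = a;

    printf("%d\n", strcmp(a, b) == 0);        /* 1: same characters */
    printf("%d\n", (void *)a == (void *)b);   /* 0: different objects */
    printf("%d\n", (void *)a == (void *)p);   /* 1: same object */
    return 0;
}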

Regards,

Bob

--
There are 10 kinds of people in the world; those who understand binary,
and those who don't.


BartC

unread,
Nov 20, 2010, 8:38:35 PM11/20/10
to
"Robbert Haarman" <comp.la...@inglorion.net> wrote in message
news:2010112100...@yoda.inglorion.net...

> It seems to me that the best way to get safe references is to
> allow a program to obtain references only by creating the corresponding
> objects, and to only destroy objects after the last reference to them
> has been lost. In other words, no null pointers, no way for programs
> to manufacture their own pointers, and no way to destroy objects you
> still have references to. Also, types must either be propagated and
> checked at compile time, or there need to be run-time type checks.
>
> There certainly are alternatives. You could allow null pointers and
> still be safe. Java does this, for example.

I use mainly a 'variant' datatype in a dynamic language. The format is
roughly like this:

(tag, pointer, length)

when it contains arrays, strings and such data. The pointer is managed
behind-the-scenes, and is not easily accessible by the programmer.

But, when length=0, the pointer will also be null (0). No problem, because
the internal code knows the pointer is invalid when the array or whatever is
empty.
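
A rough C sketch of that kind of (tag, pointer, length) representation -
not the actual implementation being described here - might be:

#include <stddef.h>

enum vtag { V_INT, V_STRING, V_ARRAY };   /* illustrative tags only */

struct variant {
    enum vtag tag;
    void     *ptr;     /* managed internally; null when length == 0 */
    size_t    length;
};

/* Internal accessor: checks tag and length before touching ptr,
   so an empty value's null pointer is never dereferenced. */
static const char *variant_char_at(const struct variant *v, size_t i)
{
    if (v->tag != V_STRING || i >= v->length)
        return NULL;
    return (const char *)v->ptr + i;
}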

On the other hand, I also allow explicit pointer types, and explicit
dereferencing. Now the user could try and dereference a pointer which is (1)
null, or (2) which points to garbage (because what it pointed to no longer
exists), or (3) might not be a pointer at all.

Cases (1) and (3) are easy enough to check for ((3) has always to be
checked; (1) might be optional).

But there is no easy, efficient way of checking (2). (And in fact, case (2)
can also apply to internal handling of pointers, in certain circumstances).

So I guess that makes the language unsafe in some ways. But, I'm not too
bothered because C can also suffer from the same sorts of problems (and I'm
attempting to compete with it.)

The problem then is not so much a null pointer, but an invalid one.
(Possibly, some form of GC might help out here, but it's not a priority at
the moment)

--
Bartc

BGB

unread,
Nov 22, 2010, 2:25:44 AM11/22/10
to

well, the particular 30 to 30 mapping I was thinking of was actually
comparing C style and Java-style file IO.

with 30 C API functions, one can implement pretty much an entire FS API
(complete with stat and readdir and sockets and so on...).
granted, yes, this may all easily boil down into a single call, usually
some magic "syscall()" or similar function which ultimately accepts all
other system calls (at this point often represented as magic numbers and
memory buffers).

Java has 30 classes, most of which implement endless variations on
"well, I want to read-from/write-to the file as some sort of stream".
random IO is its own special edge case, with another class.

when one looks around at the Java API designs, and tries implementing
them, in some ways it all seems fairly silly...

it is like they came up with all of these assorted "use cases" and
tried writing code for all the use cases, meanwhile still not providing
a simple and direct analogue of the POSIX-style file IO APIs.
(in fact, it is not until JDK 1.4 or so that they re-add things like
variable-argument functions and formatted output, with a design
apparently largely inspired by C's API...).

so, on one hand, one can implement something like the JVM, but upon
understanding parts of it, one can end up resenting how they put the
things together, like, to design the API, they got together a few people
who knew something about OOA/OOD (like what they teach in classes) but
little about how to sensibly implement a piece of VM technology.
meanwhile, the class library quickly turns into an ugly mess of tangled
spaghetti classes, and with no real option to avoid it unless one wants
simply to design their own API.

well, along with meta-circular stuff which seemingly does little more
than annoy/burden a VM implementer and one can wonder if it really
serves any purpose other than try to give the illusion that Java is
implemented in itself.


as a similar example, I ended up implementing OO facilities mostly
following JNI's lead, and in retrospect this was probably a bad idea, as
the front-end API leads to much added bulk.

it is like, on one hand, one has POSIX-like levels of abstraction and
minimalism (well, nevermind "fork()" and "exec()"...).
and, on the other extreme is the Java class-library, which is an
exercise in verbosity.


MS is more middle-ground for the most part: not minimal, but still
much less verbose than the JVM's design.

however, MS is well known for making up new APIs with carefully chosen
acronyms and version numbers, and then creating hype, but really they
are just their old technology half-assedly wrapped and re-packaged as
some "new" technology.

but, at least MS's engineering (overall) seems to make a lot more sense
to me personally (compared with even GNU strategies, MS is a lot better
at doing things effectively and solidly and not getting their head stuck
up their own ass with designing things that seem to be either
over-engineered or composed of toothpicks and sticky tape...).

like, is there any better way to get the machine-type name than to make
a small shell-script to print the environment variable? (I discovered:
firstly, this environment variable doesn't actually exist (not passed to
called apps), secondly, there appears to be no other way to get at it
directly, thirdly, this name can apparently differ from what GCC has
used, which is a pain if one needs it for GCC interop...).

say, GCC is using "x86_64-redhat-linux" and bash gives
"x86_64-gnu-linux" or similar (realized now: well, there is "gcc
-dumpmachine" and "gcc -dumpversion", which may have been better than
asking bash).


recent observation was when having to battle with GCC (on Linux) and
some of the crap I have to pass as command-line arguments to get
sensible behavior from GCC (mostly in the realm of shared-object
handling, ...), and I am left to wonder "WTF is going on here?... MS by
default does something sane here by default...". (most are simple
things, but a big pile of annoyances pop up when one tries to port a
>1Mloc codebase to Linux over approx 2 days, one loses a little sanity
here, but hell the code was ported so proof of point...).

now, for example, maybe there is some reason not to want the CWD of the
running app to be part of the library path, but FWIW this should be the
default option, or failing this, at least have a more convenient
compiler option.


also vs WinDbg and even the VS debugger, one realizes that GDB is really
fairly poor in a few ways. or, that in a case that popped up earlier, an
X11/GLX app breaking in the right way in the debugger can effectively
leave the user unable to regain control over their system (the only
option being to CTRL-ALT-F2 over to a console to reboot the system,
although killing and restarting the X server would probably also have
worked). I forget if I tried CTRL-ALT-Backspace, in the past this being
my usual way of dealing with X-breaking events (like, say, control has
been lost with the window manager somehow, and being unable to get focus
to the debugger, one is unable to kill the process at which point
presumably the WM would unstick or whatever...).

in modern Windows, ALT-TAB and CTRL-ALT-DEL pretty much always work to
allow manual focus transfer (except when Windows itself starts breaking...).

I am also left to realize just how convenient it is for files to all
have proper file extensions, and the system behaving according to the
extension (I ended up using file extensions just so that my ported
Makefiles would be able to do their thing as before).

was also wishing for a "don't blow up in error if no source files exist"
option for cp (like a "can't copy==exit silently and do nothing" option,
say for conditional file copying in Makefiles, absent having to wrap cp
or similar...).


well, modern Linux is better than ever before, but Windows is still
better in a few ways... (except at this point WRT overall performance
and disk-usage, where Linux seems to have a lot better stuff-per-MB
ratio than Windows at this point, and seems to pull off generally better
responsiveness and core-system performance, although comparably poorer
3D performance...).

and, after all, with a new install both WiFi and 3D accel managed to
work absent having to jerk them off (amazing...), and Flash was made
able to work within a "reasonable" amount of jerkoff (download and
trying to figure out where the hell to copy the '.so' to, but that done,
Flash works...).


well, what is to say?... everything is better than ever before but, in
some ways, everything is still a raging pile of crap sometimes...


or such...

Torben Ægidius Mogensen

unread,
Nov 22, 2010, 7:30:55 AM11/22/10
to
Robbert Haarman <comp.la...@inglorion.net> writes:


> Every once in a while, features that were formerly only present in
> "failed" languages make it into mainstream languages, and the software
> industry benefits as a result. I'm glad people continue to work on these
> "failed" languages and people continue to copy features from them.

Indeed. Garbage collection is a good example. Thirty years ago it was
not used in any mainstream languages apart from LISP (which you can
argue wasn't even that), but almost all languages designed within the
last ten years feature garbage collection.

Polymorphic/generic types, similarly, originated in a language that
never made it into the mainstream and were absent from mainstream languages
twenty years ago but most mainstream languages with static types now
include a form of polymorphism/genericity.

Ditto type inference, reflection, closures, dynamic types and even
objects.

Torben

Jacko

unread,
Nov 22, 2010, 4:04:16 PM11/22/10
to
On Nov 20, 8:44 am, "Rod Pemberton" <do_not_h...@notreplytome.cmm>
wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message

>
> news:384bbb7e-0945-41c4...@o2g2000vbh.googlegroups.com...
>
>
>
>
>
> > On Nov 18, 6:41 am, tm <thomas.mer...@gmx.at> wrote:
> > > On 17 Nov., 09:40, James Harris <james.harri...@googlemail.com> wrote:
>
> > > > Anyone care to discuss the issue from the point of view of language
> > > > design? Should languages allow null references? Why or why not? (And
> > > > is it practical to remove them?)
>
> > > Pointers are the gotos of data. A goto is a POINTER to code where
> > > the execution should continue. A pointer says that you should GOTO
> > > to a specific place in memory to find more data.
>
> > > Structured programming hides gotos with structured statements.
> > > Structured data structures should also try to hide pointers.
>
> > I agree with this, at least in principle, and Tony Hoare makes the
> > same point in the video I linked to a few minutes ago at
>
> >  http://www.infoq.com/author/Tony-Hoare
>
> > Gotos are incredibly useful if used properly. It is quite possibly to
> > write perfectly structured code using gotos if certain restrictions
> > are kept to.
>
> Yes.
>
> http://en.wikipedia.org/wiki/Control_flow#Minimal_structured_control_flow
> http://en.wikipedia.org/wiki/Structured_program_theorem
> http://en.wikipedia.org/wiki/Go_To_Statement_Considered_Harmful
> http://en.wikipedia.org/wiki/Spaghetti_code

>
> All loops reduce to goto's...
>
> > However, if they are present in a piece of code someone
> > reading that code cannot readily tell whether the restrictions have
> > been adhered to or not.
>
> Yes.
>
> That's the problem with an unstructured BASIC.  The original author
> programmed and tested the program with understanding of what was being done
> and where the GOTO's were supposed to go.  But, as soon as someone who isn't
> familiar with the code, or the original author at some later date whereby
> he/she has forgotten what was done, use of GOTO's would result in code that
> was in error.  It also happens when the program becomes too large for the
> programmer to keep track of all the nesting, i.e., spaghetti code.

This made GOSUB so much more useful when every line GOSUBed to was a
REM just under a line whose entire content was RETURN. Maybe RETURN
should be combined with REM as 'RETREM var', forcing var to contain the
line number or address of the next line.

> > However, if they are present in a piece of code someone
> > reading that code cannot readily tell whether the restrictions have
> > been adhered to or not. So we generally prefer iterations, selections
> > and exceptions. When only high-level control flow constructs can be
> > used we have assurance about program structure so the code is easier
> > to read and modify.
>
> Yes.
>
> > Similarly, pointers can be incredibly useful. (Rather than pointers my
> > current designs include what I call "references" as they are more
> > general. A reference could be an address in memory or it could be a
> > two-part reference such as object:offset. Or it could be simply an
> > alphanumeric key to be looked up in an array-like object etc.)
>
> Ok.

Yes, I think the idea of a pointer as having to be declared of a type
is so ass-backward when the inclusion of the pointer type in an object
is indicative of the type of object which is to be pointed to. This is
especially true of pass by reference and reference enclosure of
objects in objects.

> > While control flow has been analysed and resolved to primitives that
> > (almost?) obviate the need for gotos
>
> Yes, see Böhm and Jacopini or Edsger Dijkstra.
>
> > While control flow has been analysed and resolved to primitives that
> > (almost?) obviate the need for gotos I'm not sure the same can be said
> > for pointers or references.
>
> I doubt it...  You have to have some method(s) to access the object, i.e, an
> address or pointer...  You also need to have some method(s) to access
> intermediate portions of an object, i.e., indexing, named sub-elements, etc.
> Many of them can be hidden from the programmer, but I'm unsure about all of
> them.

Null errors due to pointers only occur if the user is responsible for
assigning, chaining and following pointers. The errors become
impossible if ALL traversal is done by a keyword such as EACH.
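
To illustrate, a minimal sketch in C++ (the range-for here is only a
stand-in for a hypothetical EACH keyword): the construct does the
traversal, so user code never assigns, chains or follows a pointer
itself.

  #include <forward_list>
  #include <iostream>

  int main() {
      std::forward_list<int> values{1, 2, 3};

      // The error-prone form would be explicit pointer chasing:
      //   for (Node *p = head; p != nullptr; p = p->next) ...
      // The range-for hides that traversal inside the library, so a
      // null or dangling pointer can never be followed by user code.
      for (int v : values)
          std::cout << v << '\n';
  }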

Jacko

unread,
Nov 22, 2010, 4:08:12 PM11/22/10
to
On Nov 20, 11:41 am, Robbert Haarman <comp.lang.m...@inglorion.net>
wrote:

Null pointer exception... or pointer arithmetic into out-of-bounds
memory.

James Harris

unread,
Nov 23, 2010, 5:32:49 AM11/23/10
to
On Nov 19, 11:08 pm, James Harris <james.harri...@googlemail.com>
wrote:

...

> If anyone's interested there is a video of Tony Hoare discussing the
> issue at
>
>  http://www.infoq.com/author/Tony-Hoare
>
> I watched the video before making the initial post but I think that
> before replying I need to go over it again in the light of comments
> people have made on this thread.

OK, some points from the above video.

Hoare spent a while contrasting programmers' attitudes to subscript
checking, essentially (ISTM) favouring the increased safety of
carrying out subscript checks at the expense of increased run time. To
a certain extent this is a separate issue. I guess it set the focus on
language safety for the subsequent pointer discussion. He later
referred to a book, Unsafe at Any Speed, which was influential in
improving safety in motor vehicles.

While working on a successor to Algol 60 he suggested a record type,
with access to a record being via a pointer. To avoid random pointer
values pointing to, for example, a piece of code, he suggested that
pointer referents be typed. The types can then be checked at compile
time and incur no run-time overhead.
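
For example, a rough sketch in C++ rather than in Hoare's notation:
because the referent type is part of the pointer's type, the mismatched
assignments below are rejected by the compiler at no run-time cost.

  #include <cstdio>

  struct Record { int field; };

  void some_code() {}

  int main() {
      Record r{42};
      Record *p = &r;              // fine: referent type matches
      // Record *q = &some_code;   // rejected at compile time
      // int    *s = p;            // rejected: wrong referent type
      std::printf("%d\n", p->field);
  }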

Better yet, unlike array indices a language can ensure that a bad
pointer can never be created. [What's needed? As a first guess:
initialisation at declaration, managed record creation, no
manipulation of pointer values, controlled destruction, and validation
whenever a pointer is picked up from an untrusted source such as a
parameter from untrusted code or, possibly, an array element.]
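
One way to picture those rules, sketched in C++ (the wrapper name is
invented for illustration): a reference type that must be initialised
from an existing record and exposes no way to fabricate or adjust the
underlying address.

  #include <iostream>

  struct Record { int field; };

  // Hypothetical non-null reference: initialised at declaration, never
  // re-pointed to an arbitrary value, and offering no pointer arithmetic.
  class RecordRef {
      Record *p;                       // invariant: never null
  public:
      explicit RecordRef(Record &r) : p(&r) {}
      Record &get() const { return *p; }
  };

  int main() {
      Record r{7};
      RecordRef ref(r);                // cannot be created from nothing
      std::cout << ref.get().field << '\n';
  }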

He likened pointers which could be null to discriminated unions. With
them, every access may be wrapped in a discrimination clause. So every
access requires a run-time check. For example, is this a car or a bus?
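
The analogy can be sketched in C++ with a discriminated union type:
every access has to discriminate at run time, just as every use of a
nullable pointer has to be checked for null first.

  #include <iostream>
  #include <variant>

  struct Car { int doors; };
  struct Bus { int seats; };

  void describe(const std::variant<Car, Bus> &v) {
      if (std::holds_alternative<Car>(v))          // run-time check
          std::cout << "car, " << std::get<Car>(v).doors << " doors\n";
      else
          std::cout << "bus, " << std::get<Bus>(v).seats << " seats\n";
  }

  int main() {
      describe(Car{4});
      describe(Bus{50});
  }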

An issue with the above is what to assign to a pointer at
initialisation time. It may be possible to construct a tree from the
leaves up, since every node that's created has something to point to,
but that is not possible if a cyclic reference is needed, unless there
is some special syntax or mechanism. Hence the need for null pointers.
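
A small C++ sketch of that initialisation problem: when the first node
of a doubly-linked structure is created its neighbours do not yet
exist, so there is nothing valid to point at - hence the null links.

  #include <iostream>

  struct Node {
      int value;
      Node *prev = nullptr;   // nothing to refer to at creation time
      Node *next = nullptr;
  };

  int main() {
      Node a{1}, b{2};
      a.next = &b;            // the links can only be filled in later
      b.prev = &a;
      std::cout << a.next->value << ' ' << b.prev->value << '\n';
  }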

To avoid making this post too long I'll comment separately.

James

James Harris

unread,
Nov 23, 2010, 6:20:15 AM11/23/10
to
On Nov 19, 11:08 pm, James Harris <james.harri...@googlemail.com>
wrote:

...

> If anyone's interested there is a video of Tony Hoare discussing the
> issue at
>
>  http://www.infoq.com/author/Tony-Hoare
>
> I watched the video before making the initial post but I think that
> before replying I need to go over it again in the light of comments
> people have made on this thread.

Despite watching the video, fundamentally I'm not sure what is claimed
to be wrong - a billion dollar mistake - with null pointers.

As we know, null can be assigned to a pointer to indicate a missing or
unknown value and can generate an exception on use. What's so wrong
with that? It is far better than *not* generating an exception and
either picking up garbage or writing over something else.

Is it because programs crash on reference to a null pointer? I suspect
this is the main perceived issue. However, a *language* could insert a
very fast check for zero before dereferencing or could arrange with
the OS to trap a bad address reference that does occur. This, then, is
about exception handling, not null pointers.
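
Roughly what such an inserted check might amount to, sketched in C++
(the helper function is made up; a compiler would generate the test
itself):

  #include <iostream>
  #include <stdexcept>

  // Test for null before the dereference and raise an exception rather
  // than letting the program pick up garbage or fault unpredictably.
  int checked_read(const int *p) {
      if (p == nullptr)
          throw std::runtime_error("null dereference");
      return *p;
  }

  int main() {
      int x = 5;
      std::cout << checked_read(&x) << '\n';
      try {
          checked_read(nullptr);
      } catch (const std::exception &e) {
          std::cout << "caught: " << e.what() << '\n';
      }
  }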

A programmer could insert his own pointer-checking code at points of
his choosing, much like the access to a discriminated union. In the
presence of programmer checks a clever compiler could omit some of its
own checks down certain paths in the code.
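
For instance (a C++ sketch): inside the guarded branch the pointer is
known to be non-null, so neither the reader nor a clever compiler needs
any further check on that path.

  #include <iostream>

  void bump(int *p) {
      if (p != nullptr) {
          *p += 1;        // dominated by the explicit check above
          *p *= 2;        // no additional check needed on this path
      }
  }

  int main() {
      int x = 3;
      bump(&x);
      std::cout << x << '\n';    // prints 8
  }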

So is there anything fundamentally wrong with the concept of null
pointers?

Here are two things I don't like about them. The second is something I
would call fundamental but I didn't notice it being mentioned in
Hoare's comments.

1. As a lesser issue, programmer-written code to check pointers can
bulk up the code and obscure program logic.

2. More importantly the concept potentially separates the point (in
code and in time) where an error happens from the point where the
error is detected or manifests itself. This is fundamental to language
design and perhaps best dealt with by supporting

(a). input parameter validation, and
(b). exceptions.

The first, (a), can catch bad inputs effectively by preconditions. The
second, (b), has a number of benefits: it makes reporting the
exception at the point it occurs the default action, allows the
exception to be caught at whatever nesting level the programmer
requires, and keeps other code focussed on the application logic.
Neither of these fully protects a routine. It could still have
unchecked inputs or invalid values introduced in the routine itself.
So nulls may still raise an exception.
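
A short C++ sketch of (a) and (b) working together (the function is
made up for illustration): the precondition rejects a bad input where
it enters the routine, and the exception can be caught at whatever
nesting level the caller chooses.

  #include <iostream>
  #include <stdexcept>

  int length_of(const char *s) {
      if (s == nullptr)                          // (a) input validation
          throw std::invalid_argument("s must not be null");
      int n = 0;
      while (s[n] != '\0') ++n;
      return n;
  }

  int main() {
      try {
          std::cout << length_of("hello") << '\n';
          std::cout << length_of(nullptr) << '\n';
      } catch (const std::exception &e) {        // (b) caught where chosen
          std::cout << "caught: " << e.what() << '\n';
      }
  }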

James
