FP is an excellent match for modern CPUs


ol...@pobox.com

Apr 6, 2001, 4:07:11 PM

This is a brief description of, and a pointer to, an article on rather
low-level aspects of CPU architecture which appeared in the latest
issue of IEEE Computer. The article specifically makes the point that
writing code in a Functional Programming style can translate directly
into significant gains in the performance and efficiency of executing
such code on modern and upcoming high-performance CPUs. Imperative
programming, on the other hand, makes it harder for a CPU to find and
exploit instruction-level parallelism (ILP); it hinders speculation as
well. Such points are very surprising in an article about low-level
processor architecture, and in a magazine such as IEEE Computer
(which is otherwise oblivious to FP).

The article in question is
"Speculative Multithreaded Processors"
Gurindar S. Sohi, Amir Roth
U. of Wisconsin-Madison
IEEE Computer, April 2001, pp. 66-73 (Cover Feature)

I wish I could quote most of the article; alas it is not possible due
to legal and technical considerations. Instead I try to summarize
their most important points.

Modern CPUs are mostly super-scalar, with increasingly deep pipelines
and increasingly complex algorithms to detect ILP, predict branches,
etc. The law of diminishing returns sets in; in addition, complex
processors are difficult to design and verify. The authors propose an
alternative -- a speculative multi-threaded processor. It has several
execution units. One of the units executes the instruction stream as
it is; the other units can _speculatively_ execute various pieces of
the same instruction stream at the same time. The pieces to execute
speculatively don't have to be soundly parallelizable and free of
conflicts -- as long as we can easily detect the conflict and recover
from it. Thus speculative execution can exploit more ILP because that
ILP doesn't have to be 100% sound.

As was apparent in the age of mainframes, the key to increasing
performance is to overlap computations with i/o (read: memory
access). A modern super-scalar CPU analyzes the instruction stream
within a certain window to see which instructions need memory access
and whether that access can be safely overlapped with the execution of
other instructions in the window. Alas, "imperative programming conventions
make consistently high ILP rare," the article states. "Programmers
tend to group dependent statements together into static structures --
helping them reason about their programs but limiting the independent
work available in any given dynamic instruction window." A larger
instruction analysis window could help; however, owing to the decreasing
accuracy of branch prediction, instructions near the end of such a
window are unlikely to be useful.

For imperative programs, the most natural division of a program into
independent threads is along control-flow boundaries. Proper
sequencing of instructions is paramount in imperative
programming. Therefore, control flow is explicit (given by the
programmer/compiler) while data flow is implicit. The challenge for
control-flow multi-threading is to identify threads in such a way
as to minimize their data dependencies. Doing this in a 100% sound way
is very hard, as evidenced by the limited success of parallelizing
compilers. Speculative thread execution can help -- provided there
is hardware support to recognize memory-access dependency violations
and an ability to back out changes to a thread's state (and to
memory).

Data-driven multi-threading (in its purest form, at the level of a
single instruction) is naturally free from data-dependency
conflicts. It offers almost ideal performance and efficiency. In a
data-driven execution, an instruction is fetched only when one of its
input operands becomes available. 100% sound data-driven
multi-threading is, however, very hard for imperative programs.
"The automatic conversion of imperative code to dataflow-explicit form
has been the subject of some research, but data-driven program
representations can generally be constructed only for code written in
functional, data-driven languages." Speculative data multi-threading
is especially useful for functional programs: "the main-thread concept
also means that dividing a program into disjoint threads is not
strictly necessary and that speculative data-driven threads can
concentrate on performance-enhancing tasks without correctness
obligations. It is likely that data-driven threads will play the role
of helper threads that run ahead of a main thread and absorb
micro-architectural latencies on its behalf."
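
By way of illustration only -- this is not from the article -- here is a
rough software analogy in GHC Haskell, using par from the parallel
package: a "helper" computation is sparked speculatively and carries no
correctness obligation, since discarding an unneeded spark is always
safe.

import Control.Parallel (par)

fib :: Integer -> Integer
fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)

main :: IO ()
main =
    let helper = fib 30      -- speculative work, sparked off
    in  helper `par`         -- runs ahead, like a helper thread
        print (helper + 1)   -- the main thread blocks only if it
                             -- actually demands the sparked value

(Build with ghc -threaded and run with +RTS -N2 to see the overlap.)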


The authors of the article have been supported by donations from Intel
and Sun Microsystems. That doesn't mean that Intel or Sun endorse FP
-- but that does mean that they are aware of the results of the
paper. I'd be surprised if Intel or Sun didn't see an increased flow of
grant applications from functional programmers, quoting from the
research these companies sponsored. IEEE Computer has published e-mail
addresses of the authors: sohi at cs wisc edu and amir at the same
host.

P.S. The sidebar to the article talks about a CRAY MTA with 256 custom
multi-threaded processors -- and no cache! Each processor achieves a
sustainable flow of two instructions per cycle -- and can tolerate
memory latency of up to 1024 cycles. This is a uniform-memory processor
indeed: the cost of fetching a datum is the same no matter where it is
located in memory -- even with commodity RAM (>100 ns memory cycle) and
a clock of up to 10 GHz.


Paul Morrison

Apr 7, 2001, 10:51:30 AM

ol...@pobox.com wrote:

>
> The article in question is
> "Speculative Multithreaded Processors"
> Gurindar S. Sohi, Amir Roth
> U. of Wisconsin-Madison
> IEEE Computer, April 2001, pp. 66-73 (Cover Feature)
>

I talk about very similar ideas in my 1994 book, "Flow-Based Programming",
and I have put the relevant chapter up on my web site at
http://www.jpaulmorrison/fbp/cognates.htm. Very briefly, FBP is a
technique of building applications out of asynchronous processes
communicating via streams of data chunks. The specification of the
connections between processes is external to the process code, so such an
application exhibits what Nate Edwards has called "configurable
modularity", which he says is characteristic of any system that may be
said to be engineered. I feel this approach is an excellent match with
business application design, and much more natural than trying to
decompose a problem to discover what parallelism it may have. It also
matches many recent directions in advanced hardware design, and I discuss
a number of these (that I was aware of in 1994) towards the end of the
above-mentioned chapter (do a "find" on Advanced Hardware Designs). I
realized that hardware is increasingly parallel, and so are business
applications at the job step level - it is only the programming layer in
between that resolutely persists in remaining synchronous! Maybe it's
time to break down that last restriction!

Paul M.

George Russell

Apr 9, 2001, 5:47:32 AM
I don't want to be too much of a wet blanket, but it doesn't seem to me much
use spotting parallelism if there isn't much there to be spotted. FP code
these days typically involves a lot of walking through lists, so whatever you
do you have to wait N times while the pointer-to-next-element is indirected.
This is disastrous for modern processors, and I suspect the situation will
get worse if current trends continue.

There are approaches to dealing with this. For example someone (was it Appel?)
wrote a paper about storing lists by clumping the elements 5-at-a-time (or
something like that). However these approaches don't yet seem to have made it
into current compilers, possibly because they look very tricky to get right
in practice. (If you change just one of 5 elements you may need to copy the
whole block.)

Ralph Becket

Apr 9, 2001, 6:16:36 AM
In article <3AD18534...@tzi.de>, George Russell <g...@tzi.de> wrote:
>I don't want to be too much of a wet blanket, but it doesn't seem to me much
>use spotting parallelism if there isn't much there to be spotted. FP code
>these days typically involves a lot of walking through lists, so whatever you
>do you have to wait N times while the pointer-to-next-element is indirected.
>This is disastrous for modern processors, and I suspect the situation will
>get worse if current trends continue.

Loop unrolling can help here: with n-iteration unrolling, the overhead
is k dereferences and EOL tests for item k within each n-element window
(assuming that the processing for each element is independent).

Even if the result for item k+1 depends functionally on the result for
item k, unrolling in this environment will still allow overlapping,
concurrent processing of the independent parts of the
calculation (e.g. g in f(X_{k+1}) = g(X_{k+1}) + h(f(X_k))).
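
A rough Haskell sketch of the idea (my illustration, not Ralph's code):
a 2-way unrolled map exposes two independent applications of f per
iteration and halves the loop-control overhead.

map2 :: (a -> b) -> [a] -> [b]
map2 f (x1 : x2 : rest) = f x1 : f x2 : map2 f rest  -- two independent calls
map2 f [x]              = [f x]                      -- ragged tail
map2 _ []               = []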

- Ralph
--
Ralph....@cl.cam.ac.uk http://www.cl.cam.ac.uk/users/rwab1

Torben AEgidius Mogensen

Apr 9, 2001, 8:35:23 AM
George Russell <g...@tzi.de> writes:

>I don't want to be too much of a wet blanket, but it doesn't seem to me much
>use spotting parallelism if there isn't much there to be spotted. FP code
>these days typically involves a lot of walking through lists, so whatever you
>do you have to wait N times while the pointer-to-next-element is indirected.
>This is disastrous for modern processors, and I suspect the situation will
>get worse if current trends continue.

You are, I suspect, thinking too locally when you look for parallelism
(which was one of the points in the article). If you have a pure
functional language, you know the only way a function call can
influence its continuation is through the function result. Hence, you
can start evaluating the continuation in parallel with the call. If
the continuation wants to access the function result, it will find a
locked register if the result isn't available yet. If, for example,
you have the calls

map g (map f l)

the outer and inner calls can work as a pipeline on the elements of l,
using the return-value registers as pipeline registers. Note that
register renaming will allow the two functions to use separate
physical registers even though they are numbered identically. Some
computer architecture people (it may even be the people at UW@M) have
actually suggested specific architectural support for
call+continuation parallelism.

>There are approaches to dealing with this. For example someone (was
>it Appel?)
>wrote a paper about storing lists by clumping the elements 5-at-a-time (or
>something like that). However these approaches don't yet seem to have made it
>into current compilers, possibly because they look very tricky to get right
>in practice. (If you change just one of 5 elements you may need to copy the
>whole block.)

This idea has come up several times. The main problem with this
approach is modularity: if you call a function in another module, you
will have to generate a new version of it to handle the packed
elements. This makes separate compilation harder. Alternatively, you
can use a tagless approach, where a function that uses lists doesn't
assume anything about the memory layout but instead jumps to a
method-like code pointer when it wants to take the list apart. This
adds some overhead for the indirect jump, though.
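
As a rough sketch of such a packed representation (illustrative Haskell,
not from any of the compilers discussed):

data Chunked a
    = Nil
    | Chunk a a a a (Chunked a)   -- four elements per pointer chase
    | Tail [a]                    -- ragged final elements

cmap :: (a -> b) -> Chunked a -> Chunked b
cmap f (Chunk w x y z rest) = Chunk (f w) (f x) (f y) (f z) (cmap f rest)
cmap f (Tail xs)            = Tail (map f xs)
cmap _ Nil                  = Nil

Any separately compiled consumer of plain lists would need a second
version for Chunked -- exactly the modularity problem described above.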

A third possibility is to use run-time code generation: if, at runtime,
you apply a list-processing function to a compacted list, you generate
code that is specialized to the compacted format. Subsequent
calls will use the specialized version. The two methods (taglessness
and RTCG) can even be combined.

Torben Mogensen (tor...@diku.dk)

George Russell

Apr 9, 2001, 9:03:44 AM
Torben AEgidius Mogensen wrote:
[snip]

> If you have a pure
> functional language, you know the only way a function call can
> influence its continuation is through the function result. Hence, you
> can start evaluating the continuation in parallel with the call. If
> the continuation wants to access the function result, it will find a
> locked register if the result isn't available yet. If, for example,
> you have the calls
>
> map g (map f l)
>
> the outer and inner calls can work as a pipeline on the elements of l,
> using the return-value registers as pipeline registers. Note that
> register renaming will allow the two functions to use separate
> physical registers even though they are numbered identically. Some
> computer architecture people (it may even be the people at UW@M) have
> actually suggested specific architectural support for
> call+continuation parallelism.
[snip]
Yes, but the "specific architectural support" you need is not built in
at the moment. So we are not talking about "modern CPUs", as the subject
line of this thread suggests, but some different sort of CPUs to be
developed in the future. In any case I don't see that this new sort of
CPU differs very much from the graph reduction machines it was fashionable
to talk about in functional programming circles a generation ago.

Daniel C. Wang

Apr 9, 2001, 11:09:14 AM

tor...@diku.dk (Torben AEgidius Mogensen) writes:
{stuff deleted}

> This adds some overhead for the indirect jump, though.

You'd think that in this C++/Java-crazy world the cost of an indirect
jump wouldn't be any worse than that of a direct jump. I'm told it is,
depending on how smart the branch prediction hardware on your processor is.

Ralph Becket

Apr 10, 2001, 8:11:16 AM
In article <r8t7l0u...@chinstrap.CS.Princeton.EDU>,

The point is not the relative costs of various types of branch instruction,
but that dynamic method invocation precludes inlining (where the branch
target is not statically decidable) and therefore necessitates a procedure
call with all the overheads that entails. In other words, dynamic method
dispatch has a cost of the order of tens of instructions before you
actually get to do any work.

Torben AEgidius Mogensen

Apr 10, 2001, 9:31:29 AM
rw...@shep.cl.cam.ac.uk (Ralph Becket) writes:

Not necessarily. If a tagless list is uniformly implemented, the
branch target will be the same in all cases, and a simple BTB (branch
target buffer) combined with speculative execution will be able to make
the branch virtually
costless. You will do a mispredict and have to squash speculative work
when you reach the end of the list, though. If the processor
speculates over several likely branch targets (which has been
suggested), there will be no delay even at the end, but you will have
to squash mispredicted speculation at all steps through the list. This
will cost you mostly in power (heat), though. The main idea with
massive speculation is, after all, to spend power on unnecessary work
to shorten the critical path.

I am myself more in favour of a kind of non-speculative parallelism
where several pipelines share functional units. If one pipeline stalls
at a load or branch, the other pipelines can exploit the functional
units. I don't envision a full thread change on pipeline stalls; all
pipelines should work in parallel and queue for functional units. This
will not shorten the critical path, so it won't speed up inherently
sequential programs (which speculation may), but it will do well with
programs with implicit parallelism (such as the call+continuation
parallelism in pure functional programs).

Torben Mogensen (tor...@diku.dk)

George Russell

Apr 10, 2001, 10:07:25 AM
If we're on CPU design, I would be interested to know whether those designs
being suggested would help with the common situation when a computation
can normally be done very fast, but sometimes takes a long time. For example,
addition (surely one of the most common operations) can, if implemented
naively, take quite a number of cycles because of carry propagation,
which means current processors need a lot more circuitry (to add n bit numbers
with maximum delay O(log n) you need O(n log n) gates I think). Would these
systems where one pipeline can stall while the others get on with it allow
such things to be simplified?

Brian Rogoff

Apr 10, 2001, 1:02:14 PM
On Mon, 9 Apr 2001, George Russell wrote:
> I don't want to be too much of a wet blanket, but it doesn't seem to me much
> use spotting parallelism if there isn't much there to be spotted. FP code
> these days typically involves a lot of walking through lists, so whatever you
> do you have to wait N times while the pointer-to-next-element is indirected.
> This is disastrous for modern processors, and I suspect the situation will
> get worse if current trends continue.

My hope, and it has somewhat different motivations than the multithreaded
CPU thing, is that future FPLs become much better at dealing with arrays.
The work on SAC, Clean, and the Bigarray module of OCaml offers pleasant
indications that the future looks bright. The obvious connection with this
thread is that with heavier use of arrays the performance of FPLs on
current and future processors could get even better.

I agree with what you suggest later when you dismiss the idea of special
purpose hardware for FPLs. No point in fighting Moore's law and economies
of scale. Digressing a bit, I wonder if the analogy between asynchronous
circuits and lazy evaluation is meaningful and/or useful? Would lazy
languages be good behavioral HDLs for asynchronous design? Hmmmm...

-- Brian


Tore Lund

Apr 10, 2001, 2:29:59 PM
George Russell wrote:
>
> If we're on CPU design, I would be interested to know whether those designs
> being suggested would help with the common situation when a computation
> can normally be done very fast, but sometimes takes a long time. For example,
> addition (surely one of the most common operations) can, if implemented
> naively, take quite a number of cycles because of carry propagation,

Maybe I am naive, but this is how I would add 64-bit numbers on a 16-bit
machine:

mov ax,source       ; low word
add target,ax       ; plain add sets the carry flag
mov ax,source[2]
adc target[2],ax    ; add-with-carry propagates the carry
mov ax,source[4]
adc target[4],ax
mov ax,source[6]
adc target[6],ax    ; high word

The carry propagates all right, but I don't see how that in itself leads
to extra cycles.
--
Tore

Laurent Guerby

Apr 10, 2001, 6:07:54 PM
If you're looking for functional languages with very efficient
arrays, SISAL might be for you; start from

<http://www.physics.nmt.edu/raymond/sisal/sisal.html>

(A good tutorial used to be on-line, but it looks
like it has disappeared.)

--
Laurent Guerby <gue...@acm.org>

Brian Rogoff

Apr 10, 2001, 6:30:12 PM
On 11 Apr 2001, Laurent Guerby wrote:
> If you're looking for functional languages with very efficient
> arrays, SISAL might be for you, start from

SISAL appears quite moribund. I'd say it's a zombie language, in that
it doesn't know it's really dead yet. However, it might be nice for the
living FPLs to tear any good ideas off of the SISAL carcass.

Sorry for the disgusting imagery, but does anyone really use SISAL
anymore? I think it's better to try and enhance existing FPLs than
to try to resurrect has-beens like SISAL. The Clean team in particular
seem to be stressing array processing, though unfortunately their
neglect of Unix renders Clean unusable in a large portion of the
scientific programming community, which is the community most likely to
appreciate their work. Quite unbelievable, considering that they have far
more Unix users than Mac users :-(.

-- Brian


Niall Dalton

Apr 10, 2001, 7:37:12 PM
Brian Rogoff wrote:
> SISAL appears quite moribund.
Well, recently there has been a new development effort:
http://sourceforge.net/projects/sisal/


> Sorry for the disgusting imagery, but does anyone really use SISAL
> anymore?
I'm not sure of the active number of users, but it is surely
a shame that there are not more. Parallel programming is hard
enough without the self-flagellation of some of the extant
imperative programming methods.

> I think it's better to try and enhance existing FPLs than
> to try to resurrect has-beens like SISAL.

Well, Sisal has some advantages for high-performance
implementations that other functional languages do not.
Unfortunately, the language restrictions mean that the
language falls between the cracks: too `wacky'
for the Fortran programmers, not advanced enough for
the hardcore functional people.

pH (Parallel/Eager Haskell) is probably more along the
lines that you are thinking of though. Very interesting
stuff, and tough to implement efficiently. AFAIK there is
a book due out this summer on the language.
For more details:
http://www.csg.lcs.mit.edu/ph/index.html

Niall

--
------------------------------------------------------------
Niall Dalton http://qetesh.ics.uci.edu
Information and Computer Science nda...@ics.uci.edu
University of California, Irvine

Brian Rogoff

Apr 10, 2001, 10:20:02 PM
On Tue, 10 Apr 2001, Niall Dalton wrote:
> Brian Rogoff wrote:
> > SISAL appears quite moribund.
> Well, recently there has been a new development effort:
> http://sourceforge.net/projects/sisal/
>
> > Sorry for the disgusting imagery, but does anyone really use SISAL
> > anymore?
> I'm not sure of the active number of users, but it is surely
> a shame that there are not more. Parallel programming is hard
> enough without the self-flagellation of some of the extant
> imperative programming methods.

I agree. However, I am in industry so I have to say that my interest in
orphaned research projects with tiny user communities is pretty
low. That's why focusing on OCaml, Clean, Erlang, Mercury and the like,
and trying to fix their deficiencies with respect to array operations
seems like a better use of limited programmer resources.

> > I think it's better to try and enhance existing FPLs than
> > to try to resurrect has-beens like SISAL.
> Well Sisal has some advantages for high performance
> implementations that other functional languages do not.
> Unfortunately, the language restrictions meant that the
> language seems to fall between the cracks, too `wacky'
> for the Fortran programmers, not advanced enough for
> the hardcore functional people.

That's my impression too. Slap a decent module system on it, provide
parametric polymorphism, and I might be more interested. I'll have you
know, at one time I was a Fortran (and MATLAB) programmer.

> pH (Parallel/Eager Haskell) is probably more along the
> lines that you are thinking of though. Very interesting
> stuff, and tough to implement efficiently. AFAIK there is
> a book due out this summer on the language.

Yes, very interesting. Where is the implementation? Seems like a
real "paper tiger" if you ask me; all papers and no code.

-- Brian

Niall Dalton

Apr 10, 2001, 10:59:56 PM
Brian Rogoff wrote:
> I agree. However, I am in industry so I have to say that my interest in
> orphaned research projects with tiny user communities is pretty
> low.
Fair enough. As a researcher I do have the luxury to consider
many possible solutions, even if they have been, at times, orphaned
and under-utilized.

> That's why focusing on OCaml, Clean, Erlang, Mercury and the like,
> and trying to fix their deficiencies with respect to array operations
> seems like a better use of limited programmer resources.

It depends on whether your aim is a short term fix to allow
use of these languages in your favourite application domain,
or a more general solution to a larger problem. pH for instance,
has the pleasantness of a modern non-strict functional language,
non-determinism when it is needed, and even arbitrary manipulation
of state when that is appropriate. The aim being a general
purpose language which "makes it effortless to express huge
amounts of parallelism".

> That's my impression too. Slap a decent module system on it, provide
> parametric polymorphism, and I might be more interested. I'll have you
> know, at one time I was a Fortran (and MATLAB) programmer.

Then you should appreciate the need to have fast execution
if you are to attract Fortran programmers.
To quote Patrick Miller, who was/is heavily involved in Sisal:
"ranged from about 80 to 110% of C/Fortran efficiency (we complied to
C, but sometimes our extremely aggressive loop
fusion would discover cascading optimizations
that FORTRAN overlooked)."
Slapping on the features you want may well drive that
performance into the ground if not done carefully.

> > pH (Parallel/Eager Haskell)

> Yes, very interesting. Where is the implementation? Seems like a
> real "paper tiger" if you ask me; all papers and no code.

I've seen and played with a running implementation, based
on the full compiler that Alex Caro reported on in his PhD
dissertation. I believe there is a new compiler in the works,
based on recent research by Jan-Willem Maessen.

cbbr...@hex.net

Apr 10, 2001, 11:19:22 PM
Niall Dalton <nda...@ics.uci.edu> writes:
> Brian Rogoff wrote:
> > SISAL appears quite moribund.
> Well, recently there has been a new development effort:
> http://sourceforge.net/projects/sisal/

>> Sorry for the disgusting imagery, but does anyone really use SISAL
>> anymore?

> I'm not sure of the active number of users, but it is surely a shame
> that there are not more. Parallel programming is hard enough without
> the self-flagellation of some of the extant imperative programming
> methods.

The problem is that the people doing heavily parallel array work have
probably figured out how to get FORTRAN to do their work; in contrast,
the FP community tends to be oriented much more towards
"logic-oriented" programming.

>> I think it's better to try and enhance existing FPLs than to try to
>> resurrect has-beens like SISAL.

> Well Sisal has some advantages for high performance implementations
> that other functional languages do not. Unfortunately, the language
> restrictions meant that the language seems to fall between the
> cracks, too `wacky' for the Fortran programmers, not advanced enough
> for the hardcore functional people.

That seems a pretty good characterization.
--
(reverse (concatenate 'string "gro.mca@" "enworbbc"))
http://vip.hex.net/~cbbrowne/resume.html
"Besides a mathematical inclination, an exceptionally good mastery of
one's native tongue is the most vital asset of a competent
programmer." -- Edsger W.Dijkstra

Niall Dalton

Apr 10, 2001, 11:48:26 PM
cbbr...@hex.net wrote:
> The problem is that the people doing heavily parallel array work have
> probably figured out how to get FORTRAN to do their work;
I'm not so sure of that, but I'm open to further evidence.

For some problems, Fortran is fine.
Highly regular numerical problems for instance. Vectorizing
Fortran compilers have managed to train their users into
giving them the right kind of code to get high performance.
They are still fairly easily defeated by code that is difficult
to analyse, though. Still, one can compile multiple versions
of the code and use simple tests to pick out the
highest-performance version once full knowledge
is available at runtime.

Or to make things easier for the compiler, and for some
problems, the programmer, one could use data parallel extensions
to the language. Essentially you have some canned ways to
manipulate large collections of data, typically arrays.
Again, for regular computations, this is a nice way to go.
What about irregular parallelism or applications with something
other than simple loop-nest computations over matrices?
I don't see that data-parallel languages are the right way to
code such applications; they are usually done in a message-passing
style, AFAIK. Correct handling of data distribution
is a rather nasty problem for the programmer as well.

Looking beyond matrix crunching, Fortran does not look
too appealing to me, in contrast to higher level languages.

Brian Rogoff

Apr 11, 2001, 12:06:31 AM
On Tue, 10 Apr 2001, Niall Dalton wrote:
> Brian Rogoff wrote:
> > I agree. However, I am in industry so I have to say that my interest in
> > orphaned research projects with tiny user communities is pretty
> > low.
> Fair enough. As a researcher I do have the luxury to consider
> many possible solutions, even if they have been, at times, orphaned
> and under-utilized.

Lucky you! :-)

> > That's why focusing on OCaml, Clean, Erlang, Mercury and the like,
> > and trying to fix their deficiencies with respect to array operations
> > seems like a better use of limited programmer resources.
> It depends on whether your aim is a short term fix to allow
> use of these languages in your favourite application domain,
> or a more general solution to a larger problem. pH for instance,
> has the pleasantness of a modern non-strict functional language,

Whoa, did I misread that?

    The pH language is a parallel, eagerly-evaluated variant of Haskell
    with syntactic provisions for loops, barriers, and I- and M-structure
    storage.

I guess my reading is that eager evaluation doesn't allow non-strict
functions to be defined, so something is amiss between your description
and the web page.

> non-determinism when it is needed, and even arbitrary manipulation
> of state when that is appropriate. The aim being a general
> purpose language which "makes it effortless to express huge
> amounts of parallelism".
>
> > That's my impression too. Slap a decent module system on it, provide
> > parametric polymorphism, and I might be more interested. I'll have you
> > know, at one time I was a Fortran (and MATLAB) programmer.
> Then you should appreciate the need to have fast execution
> if you are to attract Fortran programmers.

Of course. If you've used or heard of Stalin, the Scheme "whole program"
optimizing compiler, you'll know that it frequently beat similar looking
C code for numerics. I bet similar techniques would work fine for ML. I
agree that current implementations will lose in a head to head match
against a good Fortran compiler.

> To quote Patrick Miller, who was/is heavily involved in Sisal:
> "ranged from about 80 to 110% of C/Fortran efficiency (we complied to
> C, but sometimes our extremely aggressive loop
> fusion would discover cascading optimizations
> that FORTRAN overlooked)."

Which C and Fortran compilers were being used in the benchmarks?

> Slapping on the features you want may well drive that
> performance into the ground if not done carefully.

Well, I don't think modules are a problem, since I imagine a smart
implementation can compile away modules. So I take it you mean
parametric polymorphism. I suspect that the same techniques Stalin applies
can work there, or even the C++ approach of monomorphizing everything and
living with the code bloat could do the trick. But you're right when you
suggest that the common implementation techniques which work well for
symbolic computing probably cause a hit for fast numerics. Currently that
isn't too big an issue for me, but I'm very sympathetic to people who do
numerical linear algebra.

> > > pH (Parallel/Eager Haskell)
> > Yes, very interesting. Where is the implementation? Seems like a
> > real "paper tiger" if you ask me; all papers and no code.
> I've seen and played with a running implementation, based
> on the full compiler that Alex Caro reported on in his PhD
> dissertation. I believe there is a new compiler in the works,
> based on recent research by Jan-Willem Maessen.

Well, hopefully it won't fall in the shadow of my other rant, about
languages which run on some small set of machines with few users. Any
implementor hoping for wide acceptance should target at the very least
Windows and Linux.

-- Brian

Niall Dalton

Apr 11, 2001, 12:46:21 AM
Brian Rogoff wrote:
> Whoa, did I misread that?
>
>     The pH language is a parallel, eagerly-evaluated variant of Haskell
>     with syntactic provisions for loops, barriers, and I- and M-structure
>     storage.
>
> I guess my reading is that eager evaluation doesn't allow non-strict
> functions to be defined, so something is amiss between your description
> and the web page.
That's not the case. Non-strict does not mean lazy. A function
is non-strict in an argument if it can return a result even when
the value of that argument is unknown. pH is a non-strict language,
so we can have the result of a function even if we don't know
the value of some or all of the arguments. That can be reconciled
with eager evaluation by evaluating both the function and the
arguments at the same time. I believe this is also called 'lenient'
evaluation. Non-strictness is often implemented via lazy evaluation,
but that need not be the case.
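
A rough Haskell illustration of the distinction (my own sketch, using
divergence to make strictness observable):

loop :: Int
loop = loop                -- diverges if ever forced

pair :: (Int, Int)
pair = (42, loop)

main :: IO ()
main = print (fst pair)
-- Lazy:    prints 42; loop is never demanded.
-- Lenient: evaluation of loop is started eagerly, in parallel, yet
--          fst pair can still return 42 -- non-strict without laziness.
-- Strict:  loop would be evaluated before the pair is built: no answer.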

> Of course. If you've used or heard of Stalin, the Scheme "whole program"
> optimizing compiler, you'll know that it frequently beat similar looking
> C code for numerics. I bet similar techniques would work fine for ML. I
> agree that current implementations will lose in a head to head match
> against a good Fortran compiler.

Actually I believe that the Sisal compiler does something similar.
It constructs a single graph representing the entire program to
be compiled.

> Which C and Fortran compilers were being used in the benchmarks?

That I don't know, I'll try to find out.

> Well, I don't think modules are a problem, since I imagine a smart
> implementation can compile away modules.

Assuming whole program compilation, yes, I guess so.

> symbolic computing probably cause a hit for fast numerics. Currently that
> isn't too big an issue for me, but I'm very sympathetic to people who do
> numerical linear algebra.

I have sympathy with anyone needing a lot of speed from
their code but still wanting to code in a nice language:-)

> Any implementor hoping for wide acceptance should target at the very least
> Windows and Linux.

I think that they are aiming for Linux and Solaris, but I could
well be mistaken.

Ralph Becket

Apr 11, 2001, 5:39:44 AM
In article <Pine.BSF.4.21.010410...@shell5.ba.best.com>,

Brian Rogoff <b...@shell5.ba.best.com> wrote:
>
>I agree. However, I am in industry so I have to say that my interest in
>orphaned research projects with tiny user communities is pretty
>low. That's why focusing on OCaml, Clean, Erlang, Mercury and the like,
>and trying to fix their deficiencies with respect to array operations
>seems like a better use of limited programmer resources.

What are the problems you perceive with array handling in these languages?
Three common approaches are adopted:

(1) [ML, OCaml] Arrays are imperative data structures and not part of the
functional canon.

(2) [Haskell] Efficient array handling while preserving referential
transparency via monads (requires laziness or whatever you want to call it).

(3) [Mercury, Clean] Efficient array handling while preserving referential
transparency via uniqueness (has slightly cumbersome syntax).

Note that (modulo costs for laziness in option (2)) none of these options
is any more efficient than the others, and all of them are as efficient as
array update in unclean imperative languages.

There are constraints on how you can pass arrays around in (2) and (3),
but that's the price you pay for referential transparency. I've been
coding in Mercury for years now and regularly use arrays and encounter a
minimum of pain.
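
For concreteness, a minimal sketch of approach (2), assuming the
Data.Array.ST interface of present-day GHC (the module and function
names are my assumption, not part of the post above):

import Data.Array (elems)
import Data.Array.ST (runSTUArray, newArray, writeArray)
import Control.Monad (forM_)

-- Destructive updates happen inside runSTUArray; the caller only ever
-- sees a pure value, so referential transparency is preserved.
squares :: Int -> [Int]
squares n = elems $ runSTUArray $ do
    arr <- newArray (0, n - 1) 0
    forM_ [0 .. n - 1] $ \i -> writeArray arr i (i * i)
    return arr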

-- Ralph


--
Ralph....@cl.cam.ac.uk http://www.cl.cam.ac.uk/users/rwab1

George Russell

Apr 11, 2001, 6:27:17 AM
Tore Lund wrote:
>
> George Russell wrote:
> >
> > If we're on CPU design, I would be interested to know whether those designs
> > being suggested would help with the common situation when a computation
> > can normally be done very fast, but sometimes takes a long time. For example,
> > addition (surely one of the most common operations) can, if implemented
> > naively, take quite a number of cycles because of carry propagation,
>
> Maybe I am naive, but this is how I would add 64-bit numbers on a 16-bit
> machine:
[code snipped]

> The carry propagates all right, but I don't see how that in itself leads
> to extra cycles.
The problem arises when you are building an adder from scratch (so don't
have any 16-bit addition to begin with). Consider the simple case of
y = x+1, with carries represented by c. Then you might build this as

y0 = not x0
c0 = x0

y1 = x1 xor c0
c1 = x1 and c0

(and the same for y2,c2,y3,c3,...)

So c(n+1) depends on c(n); for 64-bit numbers you therefore have a
circuit depth of 64, hence a long wait for the answer.
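
The same chain, written as an illustrative Haskell increment over a bit
list, least significant bit first (my sketch):

inc :: [Bool] -> [Bool]
inc []       = []                               -- carry off the top is dropped
inc (b : bs) = not b : if b then inc bs else bs

Each step must inspect the previous bit before the next can settle; the
recursion mirrors the depth-64 carry chain above.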

George Russell

Apr 11, 2001, 6:38:28 AM
Ralph Becket wrote:
[snip]

> (1) [ML, OCaml] Arrays are imperative data structures and not part of the
> functional canon.
Arrays have to be accessed explicitly via sub and update operators. There
are extra monomorphic versions which can be faster.

>
> (2) [Haskell] Efficient array handling while preserving referential
> transparency via monads (requires laziness or whatever you want to call it).
Standard Haskell does _not_ provide mutable arrays. GHC and Hugs do, but
the old interface is somewhat yucky and is being changed even as I speak.

>
> (3) [Mercury, Clean] Efficient array handling while preserving referential
> transparency via uniqueness (has slightly cumbersome syntax).
>
> Note that (modulo costs for laziness in option (2)) none of these options
> is any more efficient than the other and all of them are as efficient as
> array update in unclean imperative languages.
The last statement is not true, since imperative languages usually don't
bother to check array bounds, while Haskell and SML do. Neither GHC nor
SML/NJ bothers to optimise this, as far as I know. SML/NJ provides an
unsafe operator that omits the checking...
[snip]

George Russell

Apr 11, 2001, 6:41:54 AM
Niall Dalton wrote:
[snip]

> Then you should appreciate the need to have fast execution
> if you are to attract Fortran programmers.
> To quote Patrick Miller, who was/is heavily involved in Sisal:
> "ranged from about 80 to 110% of C/Fortran efficiency (we complied to
> C, but sometimes our extremely aggressive loop
> fusion would discover cascading optimizations
> that FORTRAN overlooked)."
[snip]
I liked the look of SISAL a lot, and think it's a shame the US
Government appears to have pulled the plug. However as SISAL is
explicitly intended for parallelising systems, I'd have thought
High Performance FORTRAN or something similar would be a fairer
comparison. HPF has a number of operators allowing you to
operate-on-all-these-elements-of-an-array-very-quickly-and-I-
don't-care-what-order.

Bruce Hoult

Apr 11, 2001, 7:34:06 AM
In article <3AD43424...@tzi.de>, George Russell <g...@tzi.de>
wrote:

> The last statement is not true, since imperative languages usually don't
> bother to check array bounds, while Haskell and SML do.

That's simply not true!

Pascal and Modula-2 and Ada check array bounds. Java checks array
bounds. Basic checks array bounds.

About the only common imperative language which *doesn't* check array
bounds is C/C++.

-- Bruce

Ralph Becket

Apr 11, 2001, 8:04:32 AM
In article <3AD43185...@tzi.de>, George Russell <g...@tzi.de> wrote:
>The problem arises when you are building an adder from scratch (so don't
>have any 16-bit addition to begin with). Consider the simple case of
>y = x+1, with carries represented by c. Then you might build this as
>
>y0 = not x0
>c0 = x0
>
>y1 = x1 xor c0
>c1 = x1 and c0
>
>(and the same for y2,c2,y3,c3,...)
>
>So c(n+1) depends on c(n); for 64-bit numbers you therefore have a
>circuit depth of 64, hence a long wait for the answer.

You're talking about gate level implementation of integer addition?

If so, there are well known fast carry propagation mechanisms with
O(log n) carry delays for n-bit quantities. You can find them
described in any decent introductory text on gate-level hardware
design.

Either way, addition of register-sized quantities is so fundamental that
it's one of the limiting factors of clock speed for a CPU. It's certainly
not the sort of thing that has an impact on processor design.

--
Ralph....@cl.cam.ac.uk http://www.cl.cam.ac.uk/users/rwab1

Ralph Becket

Apr 11, 2001, 8:10:42 AM
In article <bruce-03A677....@news.nzl.ihugultra.co.nz>,

More to the point, I've done some high-performance code tweaking
with array-based stuff and found that turning off array bounds checking
rarely improves runtime by more than 1% once all other sensible
optimizations have been applied.
pretty good at moving unnecessary bounds checks out of loops. These
days I no longer bother turning it off: there is invariably a better
way to speed up your code than by taking off the safeties.


--
Ralph....@cl.cam.ac.uk http://www.cl.cam.ac.uk/users/rwab1

Torben AEgidius Mogensen

Apr 11, 2001, 8:15:51 AM
George Russell <g...@tzi.de> writes:

Asynchronous processors can adjust dynamically to varying completion
times. See http://www.cs.man.ac.uk/amulet/.

Torben Mogensen (tor...@diku.dk)

Robert Ennals

Apr 11, 2001, 8:11:42 AM
On Wed, 11 Apr 2001, Ralph Becket wrote:

[snip]

>You're talking about gate level implementation of integer addition?
>
>If so, there are well known fast carry propagation mechanisms with
>O(log n) carry delays for n-bit quantities. You can find them
>described in any decent introductory text on gate-level hardware
>design.
>
>Either way, addition of register-sized quantities is so fundamental that
>it's one of the limiting factors of clock speed for a CPU. It's certainly
>not the sort of thing that has an impact on processor design.

I don't think that is the point that George was trying to make.

I think the issue is that operations such as addition are usually extremely
simple. Most additions are adding small numbers to small numbers, and could be
computed very quickly using very simple logic.

Things are held up by the fact that, in a clocked cpu, every addition has to
take the same amount of time, so if we want small additions to be fast, we need
big ones to be fast as well.

Amongst other things, self timed logic has the potential to be better at this
kind of thing, as it removes the requirement for a stage to always take the
same amount of time.

I'm going to be terribly cliquey, and be a CL person, replying to a post from a
CL person, posting a link at CL:

http://www.cl.cam.ac.uk/Research/Rainbow/projects/selftimed.html

(I know you know this stuff Ralph - but you didn't say it)

DISCLAIMER: I am not a hardware person.

>--
>Ralph....@cl.cam.ac.uk http://www.cl.cam.ac.uk/users/rwab1

BTW your home page no longer exists on the CL server. Probably time to start
giving a URL at MSR?


-Rob

George Russell

Apr 11, 2001, 8:43:14 AM
Er FORTRAN? And I don't think Pascal always checks array bounds either,
though I don't have the standard with me.

George Russell

Apr 11, 2001, 8:47:24 AM
Ralph Becket wrote:
[snip]

> More to the point, I've done some high-performance code tweaking
> with array-based stuff and found that turning off array bounds checking
> rarely improves runtime by more than 1% once all other sensible
> optimizations have been applied.
> pretty good at moving unnecessary bounds checks out of loops. These
> days I no longer bother turning it off: there is invariably a better
> way to speed up your code than by taking off the safeties.
[snip]
No, you are generalising to all cases what is only true in some cases.
I can certainly show you code which uses a lot of array accesses without
ever going out of bounds, for which naive bounds checking would be
expensive, and for which the compiler is extremely unlikely to be able
to deduce that the code never goes out of bounds.

George Russell

Apr 11, 2001, 8:49:55 AM
Ralph Becket wrote:
[snip]

> You're talking about gate level implementation of integer addition?
>
> If so, there are well known fast carry propagation mechanisms with
> O(log n) carry delays for n-bit quantities. You can find them
> described in any decent introductory text on gate-level hardware
> design.
Yes there are (I mentioned them in a previous reply) but they use a lot
of gates (O(n log n)). Look in one of your textbooks for a picture of
a 64-bit adder . . .

>
> Either way, addition of register-sized quantities is so fundamental that
> it's one of the limiting factors of clock speed for a CPU. It's certainly
> not the sort of thing that has an impact on processor design.
[snip]
The point I was trying to make is that addition is one of those operations
that can (naively) be done very simply and very fast, but where there are
exceptional cases (in this case, long chains of carry propagation) where
it cannot. Floating point operations provide another example. Since
such things are quite common, I'm surprised asynchronous processors,
which could also benefit functional programmers, are not more widespread.

Ralph Becket

Apr 11, 2001, 9:02:58 AM
In article <986991758....@nntp-serv.cam.ac.uk>,

Robert Ennals <rj...@cam.ac.uk> wrote:
>
>I don't think that is the point that George was trying to make.
>
>I think the issue is that operations such as addition are usually extremely
>simple. Most additions are adding small numbers to small numbers, and could be
>computed very quickly using very simple logic.
>
>Things are held up by the fact that, in a clocked cpu, every addition has to
>take the same amount of time, so if we want small additions to be fast, we need
>big ones to be fast as well.
>
>Amongst other things, self timed logic has the potential to be better at this
>kind of thing, as it removes the requirement for a stage to always take the
>same amount of time.

(Simon Moore will kill me for this...) I've yet to hear of a self-timed
CPU that ran anywhere near the speed of contemporary clocked logic. My
personal take is that, as far as speed is concerned, self-timed logic
is on the same ground as lazy programming languages in that while the
idea is terribly attractive in theory, the bookkeeping really shafts you
in practice.

[There, I've probably upset half the newsgroup now.]

>(I know you know this stuff Ralph - but you didn't say it)

I'm afraid I really wasn't that clear on the original topic. While we're
on the subject, is it the case that addition under self-timed logic is
linear wrt the index of the most significant set bit?

Ralph Becket

Apr 11, 2001, 9:08:46 AM
In article <3AD452F3...@tzi.de>, George Russell <g...@tzi.de> wrote:
>Yes there are (I mentioned them in a previous reply) but they use a lot
>of gates (O(n log n)). Look in one of your textbooks for a picture of
>a 64-bit adder . . .

Well, you can trade off carry propagation time against extra gates, right
up to implementing addition using lookup tables. In general, the number
of places where you need an adder is small enough that it's worth paying
the extra space cost to get the speed.

>The point I was trying to make is that addition is one of those operations
>that can (naively) be done very simply and very fast, but where there are
>exceptional cases (in this case, long chains of carry propagation) where
>it cannot. Floating point operations provide another example. Since
>such things are quite common, I'm surprised asynchronous processors,
>which could also benefit functional programmers, are not more widespread.

(See my other post) It isn't at all clear that asynchronous (aka self-
timed) logic is good for speed. It *is* good for low power, but to make
the stuff robust you have to spend way more gates on stuff like Muller
C-elements to get the handshaking right than you need for the basic
synchronous equivalent circuit.

Ralph Becket

Apr 11, 2001, 9:14:54 AM
In article <3AD4525C...@tzi.de>, George Russell <g...@tzi.de> wrote:
>No, you are generalising to all cases what is only true in some cases.
>I can certainly show you code which uses a lot of array accesses without
>ever going out of bounds, for which naive bounds checking would be
>expensive, and for which the compiler is extremely unlikely to be able
>to deduce that the code never goes out of bounds.

My intuition is that most code that walks arrays is either doing so
in a linear fashion (and is therefore likely to have bounds checking
moved outside the loop) or is jumping around in the array, either
doing something like a binary search (for which the bounds checking
can still be moved by a reasonably smart compiler) or because it's
indexing through some other structure - in which case, for large arrays,
you might expect poor cache performance to dominate, although I admit that
here the compiler is unlikely to have a chance.

But still, you do want bounds checking: it's a Good Thing. The Mercury
compiler gives you the option of turning it off, but people rarely do.

Torben AEgidius Mogensen

Apr 11, 2001, 9:18:21 AM
rw...@shep.cl.cam.ac.uk (Ralph Becket) writes:

>(Simon Moore will kill me for this...) I've yet to hear of a self-timed
>CPU that ran anywhere near the speed of contemporary clocked logic. My
>personal take is that, as far as speed is concerned, self-timed logic
>is on the same ground as lazy programming languages in that while the
>idea is terribly attractive in theory, the bookkeeping really shafts you
>in practice.

From what I heard, AMULET 3 is approximately the same speed as
traditional CPU's produced with the same technology.

>I'm afraid I really wasn't that clear on the original topic. While we're
>on the subject, is it the case that addition under self-timed logic is
>linear wrt the index of the most significant set bit?

No. You can make asynchronous carry-lookahead adders that terminate
in O(log(N)) time worst-case, but O(log(log(N))) average case (i.e.,
the log of the average longest carry propagation, which itself is
O(log(N))).

Torben Mogensen (tor...@diku.dk)

James Hague

Apr 11, 2001, 10:05:46 AM
Ralph Becket wrote:
>
> What are the problems you perceive with array handling in these languages?
> Three common approaches are adopted:
>
> (1) [ML, OCaml] Arrays are imperative data structures and not part of the
> functional canon.
>
> (2) [Haskell] Efficient array handling while preserving referential
> transparency via monads (requires laziness or whatever you want to
> call it).
>
> (3) [Mercury, Clean] Efficient array handling while preserving referential
> transparency via uniqueness (has slightly cumbersome syntax).

You missed the obvious: array-oriented functional languages like APL, J, and
K. APL is the direct ancestor of Backus's FP, the language that touched off
modern functional programming research.

James



Ralph Becket

Apr 11, 2001, 11:18:02 AM
In article <3ad462f5$1...@corp.newsfeeds.com>,

James Hague <james...@volition-inc.com> wrote:
>
>You missed the obvious: array-oriented functional languages like APL, J, and
>K. APL is the direct ancestor of Backus's FP, the language that touched off
>modern functional programming research.

While arrays are tightly integrated into APL & J, if as I believe
they are pure functional languages, they must have some mechanism for
preserving referential transparency across array operations. AFAIK,
there are two main ways of doing this: you either ensure that the old
version of the array is never accessed again (in which case you can
carry out destructive update on it - this is what the monadic and
unique object approaches do) or you have to create a new array each time
you can't prove that the old value has gone out of scope, which can be
rather expensive and is certainly not competitive with destructive
update.
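
A sketch of that expensive fallback in standard Haskell (Data.Array's
// operator, which builds a fresh array on every update):

import Data.Array (Array, (//))

-- Pure update without monads or uniqueness: referential transparency
-- is preserved by copying, at O(n) cost per update.
setAt :: Array Int a -> Int -> a -> Array Int a
setAt arr i x = arr // [(i, x)]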

Do APL and/or J implementations have an alternative means of accommodating
safe destructive update?

--
Ralph....@cl.cam.ac.uk http://www.cl.cam.ac.uk/users/rwab1

Laurent Guerby

Apr 11, 2001, 1:25:18 PM
Brian Rogoff <b...@shell5.ba.best.com> writes:
> SISAL appears quite moribund. I'd say it's a zombie language, in that
> it doesn't know it's really dead yet. However, it might be nice for the
> living FPLs to tear any good ideas off of the SISAL carcass.

Sorry, I forgot to mention that I was referring to SISAL in the context
of language designs to look at, not for industrial use. I never
programmed in SISAL and I know no one using it; I just read the stuff
about it on the web.

--
Laurent Guerby <gue...@acm.org>

Brian Rogoff

Apr 11, 2001, 2:20:11 PM
On 11 Apr 2001, Ralph Becket wrote:
> In article <Pine.BSF.4.21.010410...@shell5.ba.best.com>,
> Brian Rogoff <b...@shell5.ba.best.com> wrote:
> >
> >I agree. However, I am in industry so I have to say that my interest in
> >orphaned research projects with tiny user communities is pretty
> >low. That's why focusing on OCaml, Clean, Erlang, Mercury and the like,
> >and trying to fix their deficiencies with respect to array operations
> >seems like a better use of limited programmer resources.
>
> What are the problems you perceive with array handling in these languages?

First, and least interesting for the researcher, is raw efficiency. I think
Niall Dalton hit the nail on the head when he said that to interest Fortran
programmers you'll need to be within a few percent of the speed of a
*good* Fortran compiler for some typical Fortran programs.

Second, I'd like to have some rather high level array operations. Why
would anyone switch from HPF to a functional language, given the very rich
set of primitives available in HPF?

I think the SAC guys are (and the SISAL guys were) doing interesting work,
and I'd like to see the more popular FPLs get those features.

> Three common approaches are adopted:
>
> (1) [ML, OCaml] Arrays are imperative data structures and not part of the
> functional canon.

I've been pretty happy that the OCaml team has added Bigarrays and is
gradually extending their repertoire.

On the down side, MLs don't have overloading, or type classes, so lots of
numerics folks will ignore it.

> (2) [Haskell] Efficient array handling while preserving referential
> transparency via monads (requires laziness or whatever you want to call it).

When I used Haskell semi-seriously a few years ago "efficient" was not the
impression I had of arrays. And since then, they've dropped array
comprehensions, right? I guess I should grab the latest GHC and check
again.


> (3) [Mercury, Clean] Efficient array handling while preserving referential
> transparency via uniqueness (has slightly cumbersome syntax).

Actually, I think this is the most promising approach. Clean looks really
good, and they're developing numerical linear algebra libraries in it, but
anyone who uses a Unix as their main OS will be disappointed in Clean.
That's the developer's choice, and they can do what they want, but Clean
won't be a serious option for me until that changes.

I haven't tried Mercury for any serious applications, but that will
change soon. I get the impression that its developers are open to
improving the language, which is very good. Since no doubt Fergus is
reading this, any chance that something like those Haskell functional
dependencies will make it into Mercury?

BTW Ralph, thanks for the tutorial, that's one of those places where
Mercury is seriously lacking, that is, documentation for the non-Prolog
expert.

> Note that (modulo costs for laziness in option (2)) none of these options
> is any more efficient than the others, and all of them are as efficient as
> array update in unclean imperative languages.

In theory, yes. In practice, as Niall mentioned, there is a great deal of
engineering effort needed to get these languages as fast as Fortran. Having
unboxed array elements helps, but that means polymorphic arrays are harder
to implement, right?

> There are constraints on how you can pass arrays around in (2) and (3),
> but that's the price you pay for referential transparency. I've been
> coding in Mercury for years now and regularly use arrays and encounter a
> minimum of pain.

Could you update your tutorial with some array processing examples? Gaussian
elimination with partial pivoting would be nice. I'm one of those people
who learns better by reading examples than by studying the manual.

-- Brian


Brian Rogoff

Apr 11, 2001, 2:53:03 PM
On 11 Apr 2001, Ralph Becket wrote:
> (See my other post) It isn't at all clear that asynchronous (aka self-
> timed) logic is good for speed. It *is* good for low power, but to make
> the stuff robust you have to spend way more gates on stuff like Muller
> C-elements to get the handshaking right than you need for the basic
> synchronous equivalent circuit.

There was another technology, a long time ago, which wasn't as good for
speed as its competitor, and required more transistors to implement
logic, but it had much lower power consumption. It's called CMOS :-).

Anyways, there are lots of advantages besides low power for self-timed
logic. Noise immunity is another, as clocked circuits obviously have big
noise spikes at harmonics of the clock frequency. Modularity is one more,
in that it should be easy (in theory :) to reuse asynchronous blocks.

OTOH, it is really hard to design asynchronous circuits, and it would
require a reeducation of a vast number of designers, and a thorough
retooling in the EDA world. I wouldn't bet my money against synchronous
design. We'll see what happens as all of these nasty analog issues become
more important again.

-- Brian


Niall Dalton

unread,
Apr 11, 2001, 3:12:03 PM4/11/01
to
George Russell wrote:
> I liked the look of SISAL a lot, and think it's a shame the US
> Government appears to have pulled the plug. However as SISAL is
> explicitly intended for parallelising systems, I'd have thought
> High Performance FORTRAN or something similar would be a fairer
> comparison. HPF has a number of operators allowing you to
> operate-on-all-these-elements-of-an-array-very-quickly-and-I-
> don't-care-what-order.

Yes, if the benchmarks were being run today, one should compete
against HPF and Fortran 95, and also against OpenMP for shared
memory systems.

chris.danx

unread,
Apr 11, 2001, 4:36:32 PM4/11/01
to
> My intuition is that most code that walks arrays is either doing so
> in a linear fashion (and is therefore likely to have bounds checking
> moved outside the loop) or is jumping around in the array, either
> doing something like a binary search (for which the bounds checking
> can still be moved by a reasonably smart compiler) or because it's
> indexing through some other structure - in which case, for large arrays,
> you might expect poor cache performance to dominate, although I admit that
> here the compiler is unlikely to have a chance.
>
> But still, you do want bounds checking: it's a Good Thing. The Mercury
> compiler gives you the option of turning it off, but people rarely do.

Does this mean that once you're confident that your code is correct and it's
been tested, you can turn bounds checking off if you want more speed?

Or would it still be best to leave it on?

Chris

Zoltan Somogyi

unread,
Apr 12, 2001, 2:31:47 AM4/12/01
to
Brian Rogoff <b...@shell5.ba.best.com> writes:
>Since no doubt Fergus is
>reading this, any chance that something like those Haskell functional
>dependencies will make it into Mercury?

It has been on our todo list for a while, together with constructor classes.
Unfortunately, at the moment no one is actively working on it. On the up side,
there is a good chance that someone will start working on it late this year.

Zoltan Somogyi <z...@cs.mu.OZ.AU> http://www.cs.mu.oz.au/~zs/
Department of Computer Science and Software Engineering, Univ. of Melbourne

Zoltan Somogyi

unread,
Apr 12, 2001, 2:29:22 AM4/12/01
to
George Russell <g...@tzi.de> writes:
>The point I was trying to make is, that addition is one of those operations
>that can (naively) be done very simply and very fast, but where there are
>exceptional cases (in this case, long chains of carry propagation) where
>it does not. Floating point operations provide another example. Since
>such things are quite common, I'm surprised asynchronous processors,
>which could also benefit functional programmers, are not more widespread.

You shouldn't be: faster addition of small numbers would not speed up
current processors by anywhere close to the amount required to compensate
for huge disruption that would be caused by the wholesale adoption of
asynchronous design methods. Even the use of asynchronous design only in
the integer ALU is unlikely to be worthwhile.

Many modern CPUs have pipelines with ten to twenty stages. Integer addition
typically takes only a single pipeline stage, even with worst-case carry
propagation, due to the use of carry circuits with logarithmic delays.
Amdahl's law therefore dictates that the speedup available from faster
additions is quite small. In practice, the fact that all pipeline stages
have to be the same length limits the available gain even further.
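
For concreteness, here is a back-of-the-envelope version of that argument
(the 5% figure below is purely illustrative, not a measured number):

(* Amdahl's law: overall speedup when a fraction of the work is
   accelerated by a given factor. *)
let amdahl ~fraction ~speedup =
  1.0 /. ((1.0 -. fraction) +. fraction /. speedup)

let () =
  (* Suppose integer addition were 5% of execution time and an
     asynchronous adder made it infinitely fast: the whole chip
     still gets barely 5% faster. *)
  Printf.printf "overall speedup: %.3fx\n"
    (amdahl ~fraction:0.05 ~speedup:infinity)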

Cache access, out-of-order logic, simple movement of data from one part of
a chip to another, and (in x86 CPUs) instruction decoding all take more time
than integer addition, and are far more rewarding targets for optimization.
Unfortunately, it is hard to see how asynchronous logic would help speed them
up, since the first and third are dominated by propagation delay and the
second and fourth by the number of logic levels required.

Ralph Becket

unread,
Apr 12, 2001, 6:15:19 AM4/12/01
to
In article <_e3B6.4546$Ow3.9...@news2-win.server.ntlworld.com>,

chris.danx <chris...@ntlworld.com> wrote:
>
>Does this mean that once your confident that your code is correct and it's
>been tested, you can turn bounds checking off if you want more speed?
>
>Or would it still be best to leave it on?

As I say, in my experience turning bounds checking off nets only a very
small performance improvement. Unless I really, really have to squeeze
every last clock cycle out of the thing (almost never), I leave bounds
checking turned on. And the rest of the library is flooded with runtime
sanity checks that you can't turn off at all, and they're invaluable for
detecting when something goes wrong.

Ralph Becket

unread,
Apr 12, 2001, 6:29:35 AM4/12/01
to
In article <Pine.BSF.4.21.01041...@shell5.ba.best.com>,

Brian Rogoff <b...@shell5.ba.best.com> wrote:
>
>First, and least interesting for the researcher, is raw efficiency. I think
>Niall Dalton hit the nail on the head when he said that to interest Fortran
>programmers you'll need to be within a few percent of the speed of a
>*good* Fortran compiler for some typical Fortran programs.

If you mean writing back ends for vector processors then you're probably
right, given that numerical processing is a fairly specialized area and
that there are other things with higher priority (for the compiler writers)
on the wish list.

>Second, I'd like to have some rather high level array operations. Why
>would anyone switch from HPF to a functional language, given the very rich
>set of primitives available in HPF?
>
>I think the SAC guys are (and the SISAL guys were) doing interesting work,
>and I'd like to see the more popular FPLs get those features.

I'm not very familiar with these things, but adding appropriate array
manipulations to the library should be straightforward. Parallelizing
stuff may prove more problematic, but there's plenty of interest in that
general area.

>I haven't tried Mercury for any serious applications, but that will
>change soon. I get the impression that its developers are open to
>improving the language, which is very good. Since no doubt Fergus is
>reading this, any chance that something like those Haskell functional
>dependencies will make it into Mercury?
>
>BTW Ralph, thanks for the tutorial, that's one of those places where
>Mercury is seriously lacking, that is, documentation for the non-Prolog
>expert.

The intention is definitely there to extend the typeclass system. For
my money, constructor classes and functional dependencies are necessary
for typeclasses to become genuinely useful.

We also want to work on two or three books including a decent tutorial
and some higher powered stuff.

>> Note that (modulo costs for laziness in option (2)) none of these options
>> is any more efficient than the other and all of them are as efficient as
>> array update in unclean imperative languages.
>
>In theory, yes. In practice, as Niall mentioned, there is a great deal of
>engineering effort needed to get these languages as fast as Fortran. Having
>unboxed array elements helps, but that means polymorphic arrays are harder
>to implement, right?

I believe that this depends on which Mercury back end you're using. The
standard low-level back end will typically box anything that doesn't fit
into a machine register.

>Could you update your tutorial with some array processing examples? Gaussian
>elimination with partial pivoting would be nice. I'm one of those people
>who learns better by reading examples than by studying the manual.

I'm afraid I'm a bit busy right now, but I definitely do intend to devote
time to writing more tutorial material. With luck I'll be joining the team
which will help...

If you want I can send you some code employing a bunch of array sorting
algorithms.

Franck Arnaud

unread,
Apr 12, 2001, 7:02:06 AM4/12/01
to

> I agree with what you suggest later when you dismiss the idea of special
> purpose hardware for FPLs. No point in fighting Moore's law and economies
> of scale.

What about compiling functional languages directly to gates then? FPGAs
seem likely candidates to benefit from economies of scale.

FPGAs were traditionally used as a development tool or as a replacement
for fixed-function hardware, but since the gate layout is really just
memory, why not use them as general computation engines, reprogramming
them for the task at hand as often (e.g. every "context switch"?) as
needed?

There seem to be some efforts to compile imperative languages to
hardware, e.g. Handel-C. See
http://www.eecs.wsu.edu/~reconfig/links.html for some links.


Marcin 'Qrczak' Kowalczyk

unread,
Apr 12, 2001, 6:28:43 PM4/12/01
to
Wed, 11 Apr 2001 12:38:28 +0200, George Russell <g...@tzi.de> writes:

> Standard Haskell does _not_ provide mutable arrays. GHC and Hugs do, but
> the old interface is somewhat yucky and seems to be being changed as I speak.

Indeed.

But the recommended ghc's interface (IArray and MArray) is not bad.

I am currently redesigning these modules. The interface of using these
arrays will stay the same; the method of making your own instances
of array classes will change.

> The last statement is not true, since imperative languages usually don't
> bother to check array bounds, while Haskell and SML do. Neither GHC nor
> SML/NJ bother to optimise this, as far as I know.

The main reason for the changes is to remove unnecessary bounds checking
(and the translation of indices from a chosen type with chosen bounds to
0-based Ints).

Arrays will remain safe: indices provided by a programmer will be
checked; but bounds checking inside bulk operations like listArray,
elems, accumArray, (==), amap, freeze etc. will be removed.

This implies changing some methods to functions and introducing
different methods instead. The current overloaded array interface is
defined in terms of operations which do bounds checking. So to
implement bulk operations without bounds checking, currently one
would have to use magic (rewrite rules) and duplicate some code.

In effect using functions like elems will be faster than explicit loops.
Unfortunately not everything can be coded using them. Perhaps the set
of provided operations should be extended to include e.g. a zipWith
analogue for arrays.
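
To make that concrete, the principle is "validate at the boundary, run
unchecked inside". A rough sketch of the same idea, transliterated into
OCaml (the real GHC internals are Haskell, of course; this only shows
the shape of it):

(* A bulk map over an array: the only bounds information consulted is
   Array.length, checked once up front, so every element access inside
   the loop can use the unchecked primitives. *)
let amap f a =
  let n = Array.length a in
  if n = 0 then [||]
  else begin
    let b = Array.make n (f (Array.unsafe_get a 0)) in
    for i = 1 to n - 1 do
      Array.unsafe_set b i (f (Array.unsafe_get a i))
    done;
    b
  end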

Immutable arrays are going to be slightly faster than mutable arrays
for some operations. One reason is that functions like elems or assocs
can be lazy, which allows the actual construction of the intermediate
list to be avoided in many cases. Obtaining such laziness for mutable
arrays requires changing the style to use explicit indexing, which
does bounds checking.

In ghc-5.00 there is also a type called DiffArray, an instance
of IArray, which uses physical updates in place to implement
fast functional array update operator '//', while changing the
representation of old versions to a backward diff wrt. the "current"
version. Performance of these arrays has a poor constant factor
(there is some bookkeeping which often happens to be unnecessary),
but updating has good asymptotic cost and a functional interface
(thread-safe). I haven't measured how bad it is wrt. explicitly
mutable arrays.
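
For anyone who hasn't seen the trick, here is a minimal sketch of the
diff-array idea in OCaml (the classic reverse-diff functional array;
GHC's actual DiffArray code will differ in detail):

(* The current version holds a flat array; older versions degrade into
   chains of (index, old value) diffs pointing towards it. *)
type 'a t = 'a version ref
and 'a version =
  | Flat of 'a array
  | Diff of int * 'a * 'a t

let make n x : 'a t = ref (Flat (Array.make n x))

(* Re-root so that t holds the flat array, reversing diffs on the way. *)
let rec reroot (t : 'a t) : 'a array =
  match !t with
  | Flat a -> a
  | Diff (i, v, t') ->
      let a = reroot t' in
      t' := Diff (i, a.(i), t);
      a.(i) <- v;
      t := Flat a;
      a

let get t i = (reroot t).(i)

(* Functional update: physically update in place, and leave the old
   handle pointing at a backward diff wrt the new version. *)
let set t i v : 'a t =
  let a = reroot t in
  let t' = ref (Flat a) in
  t := Diff (i, a.(i), t');
  a.(i) <- v;
  t'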

--
__("< Marcin Kowalczyk * qrc...@knm.org.pl http://qrczak.ids.net.pl/
\__/
^^ SUBSTITUTE SIGNATURE
QRCZAK

Brian Rogoff

unread,
Apr 12, 2001, 7:14:28 PM4/12/01
to
On Thu, 12 Apr 2001, Franck Arnaud wrote:
> > I agree with what you suggest later when you dismiss the idea of special
> > purpose hardware for FPLs. No point in fighting Moore's law and economies
> > of scale.
>
> What about compiling functional languages directly to gates then?

Why? I mean, using FPLs as HDLs (hardware description languages) seems like
a good idea, but what would be the advantage of devoting effort to
developing a really good FPL -> (some FPGA architecture) back end, which
is what you seem to be suggesting?

> FPGAs seem likely candidates to benefit from economies of scale.

Yes. People are using FPGAs for more and more lately. I think of them as the
scripting languages of the HW world. They also always lag in speed
compared to similar ASIC technology, which always lags in speed compared
to similar custom methodologies. TANSTAAFL.

> There seem to be some efforts to compile imperative languages to
> hardware,
> eg Handel-C. See http://www.eecs.wsu.edu/~reconfig/links.html for some
> links.

The goal there is to use Handel-C to design hardware, not to generate
architectures for running a "C machine". What you describe (am I wrong?)
seems more like designing hardware (by creating an FPGA configuring back
end) to run FPLs. You still won't win the speed battle. Better to spin an
ASIC.

-- Brian


Brian Rogoff

unread,
Apr 12, 2001, 11:47:40 PM4/12/01
to
On Wed, 11 Apr 2001, George Russell wrote:
> No, you are generalising to all cases what is only true in some cases.
> I can certainly show you code which uses a lot of array accesses without
> ever going out of bounds, for which naive bounds checking would be
> expensive, and for which the compiler is extremely unlikely to be able
> to deduce that the code never goes out of bounds.

The Ada solution seems perfectly reasonable. Bounds checking is mandated
by the standard, compilers have a way to turn it off, and there is a
pragma called Suppress which allows you to turn it off in a region
of code, like this

declare
   pragma Suppress (Index_Check);
   pragma Suppress (Overflow_Check);
   -- etc.
begin
   -- your unchecked code goes here
end;

or even pragma Suppress(All_Checks) if you're craz^H^H^H^Hconfident that
the checks are not necessary.

This could easily be adapted to ML and other languages.

-- Brian


Daniel C. Wang

unread,
Apr 13, 2001, 3:08:30 AM4/13/01
to

Brian Rogoff <b...@shell5.ba.best.com> writes:
> The Ada solution seems perfectly reasonable. Bounds checking is mandated
> by the standard, compilers have a way to turn it off, and there is a
> pragma called Suppress which allows you to turn it off in a region
> of code, like this
>
{stuff deleted}

> or even pragma Suppress(All_Checks) if you're craz^H^H^H^Hconfident that
> the checks are not necessary.
>
> This could easily be adapted to ML and other languages.
>

There is something better than "trust me I know what I'm doing"
semantics...

http://www.ececs.uc.edu/~hwxi/DML/DML.html
and
http://www.ececs.uc.edu/~hwxi/Xanadu/Xanadu.html

Martin Berger

unread,
Apr 13, 2001, 8:34:22 AM4/13/01
to
> There is something better than "trust me I know what I'm doing"
> semantics...
>
> http://www.ececs.uc.edu/~hwxi/DML/DML.html
> and
> http://www.ececs.uc.edu/~hwxi/Xanadu/Xanadu.html

if you don't mind the increase in typechecking complexity ...

martin

Brian Rogoff

unread,
Apr 13, 2001, 11:37:37 AM4/13/01
to
On 13 Apr 2001, Daniel C. Wang wrote:
> There is something better than "trust me I know what I'm doing"
> semantics...
>
> http://www.ececs.uc.edu/~hwxi/DML/DML.html
> and
> http://www.ececs.uc.edu/~hwxi/Xanadu/Xanadu.html

Interesting attitude.

I would call that "promising research", and wait a few years before I'd
pronounce it better than the simple engineering solution. I'd say the same
thing about C. Barry Jay's FiSH, and other research languages. In the
meantime, there is a very simple solution...

-- Brian


Daniel C. Wang

unread,
Apr 13, 2001, 12:42:56 PM4/13/01
to

Martin Berger <martinb@--remove--me--dcs.qmw.ac.uk> writes:

ML type inference is already super exponential in the worst case. Adding a
simplex constraint solving phase, which is only exponential in the worst
case, does not change the computational complexity of type checking the
language at all in theory. In practice the simplex phase is pretty quick.

As for the "user interface complexity issues", by default you can use bounds
checked arrays all over the place. It's only when you want to get rid of the
bounds checking that you have to write down richer types.

Martin Berger

unread,
Apr 13, 2001, 12:57:03 PM4/13/01
to

> > if you don't mind the increase in typechecking complexity ...
>
> ML type inference is already super exponential in the worst case.

if i remember my undergraduate FP course correctly, it is hyperexponential
in the nesting depth of local definitions. but in practice one never has
a nesting depth of more than four or five, hence the complexity is actually
constant in this parameter (please correct me if i'm wrong).

in any case, my quip was directed at your suggestion of using dependent
types, which are generally undecidable

martin

Daniel C. Wang

unread,
Apr 13, 2001, 12:50:34 PM4/13/01
to

Brian Rogoff <b...@shell5.ba.best.com> writes:

I don't understand this attitude. If you consider OCaml "promising
research", I can perhaps understand it. The system above is a conservative
extension of the OCaml language. From a theoretical standpoint the system
is done. The only major "research" is in tweaking the user interface issues,
and seeing how to make it more palatable to programmers. But all the important
algorithmic issues have been worked out.

In any case, right now it is better than the simple engineering
solution. Because the simple engineering solution (trust the programmer)
doesn't work.

Daniel C. Wang

unread,
Apr 13, 2001, 1:16:06 PM4/13/01
to

Martin Berger <martinb@--remove--me--dcs.qmw.ac.uk> writes:

But DML was carefully designed so that the dependent type checking is
decidable -- that was the whole point!

Brian Rogoff

unread,
Apr 13, 2001, 3:15:44 PM4/13/01
to
On 13 Apr 2001, Daniel C. Wang wrote:
> Brian Rogoff <b...@shell5.ba.best.com> writes:
> > Interesting attitude.
> >
> > I would call that "promising research", and wait a few years before I'd
> > pronounce it better than the simple engineering solution. I'd say the same
> > thing about C. Barry Jay's FiSH, and other reasearch languages. In the
> > meantime, there is a very simple solution...
>
> I don't understand this attitude.

Let me try again then. Really it's the "wait a few years after the tires
have been kicked enough" attitude.

> If you consider OCaml "promising research", I can perhaps understand it.

Nope, I'd say that OCaml, while no doubt still very researchy, is well
into the realm of practical. I guess others even more conservative than
me would dispute that. BTW, I'd put SML in there with OCaml too, though
none of the SML implementations I tried measured up to OCaml, IMO.

Certainly parts of OCaml are still being haggled over (no, no, no, I'm
not going to say anything about that here :-) and the methodology being
used involves integrating the researchy stuff into the language, letting
lots of people try it for a while, and then refining it. DML is not a part
of any SML that lots of people use. DeCaml is based on Caml Light, not
OCaml. At this rate, it will never be widely used. Now, Xi is off hacking
Xanadu, which, honestly, I couldn't care less about. SISAL city, orphaned
research, dead language in the making, etc., etc.

> The system above is a conservative extensions to the OCaml language.

Yes, I understand that. From what I gathered (and I saw this stuff before
when Hongwei Xi posted to the caml-list arguing that the Java approach to
null pointers and uninitialized arrays was better than the ML one) DML and
DeCaml use a very restricted form of dependent types in which the
programmer annotates the program with things like type sizes which allow
the compiler to prove various properties at compile time. (Note to Joachim
Durcholz: since you are looking for techniques like Eiffel style
assertions you may find DML interesting).

> From a theoretical standpoint the system is done. The only major
> "research" is in tweaking the user interface issues, and seeing how to
> make it more palatable to programmers. But all the important algorithmic
> issues have been worked out.

George Russell mentioned that he had programs where he could eliminate the
bounds check but that compilers which he'd used couldn't. It's things like
that which prompt me to take a conservative attitude to approaches which
look great on paper but haven't been used for a few years. Yes, I'm being
a "stick in the mud", just like those embedded guys who reject GC. If or
when it stands the test of time I'll promote it out of "promising
research". Given Xi's direction, abandoning the D{eCA}ML implementations
for Xanadu, I bet it dies.

> In any case, right now it is better than the simple engineering
> solution. Because the simple engineering solution (trust the programmer)
> doesn't work.

Well, I didn't read Xi's thesis, but as I said I'd be happier waiting a
few years after such a system has been in use before pronouncing it
better. In particular, if there are useful cases that still require a
"trust the programmer" approach with Xi's dependent types, I might think
that other approaches should be investigated. "Trust the programmer"
is not guaranteed safe, but it does work (I take it the combination of
the first and second clauses bothers you :), and there is no implementation
burden for compiler writers. One way to do it in OCaml is to provide
unsafe_get and unsafe_set for arrays, as is already done.
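
For instance (a minimal sketch; Array.unsafe_get really is in the
standard library, and the safety argument here rests entirely on the
loop bounds coming from Array.length):

(* Sum an array with the per-element bounds checks elided by hand. *)
let sum (a : float array) =
  let total = ref 0.0 in
  for i = 0 to Array.length a - 1 do
    total := !total +. Array.unsafe_get a i
  done;
  !total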

BTW, what makes you think Xi's dependent types are a better approach for
this particular problem than Jay's static shape analysis, besides the
fact that DML is upward compatible with {S|OCA}ML?

-- Brian


Daniel C. Wang

unread,
Apr 13, 2001, 4:39:03 PM4/13/01
to

Brian Rogoff <b...@shell5.ba.best.com> writes:

{stuff deleted}


> Well, I didn't read Xi's thesis, but as I said I'd be happier waiting a
> few years after such a system has been in use before pronouncing it
> better. In particular, if there are useful cases that still requre a
> "trust the prorammer" approach with Xi's dependent types, I might think
> that other approaches should be investigated.

One very nice thing about the DML approach is that there is no theoretical
reason there should be any useful case which requires a "trust the
programmer" approach. DML uses the simplex algorithm to solve an arbitrary
system of linear equations on i`nteger constraints.

For there to be a useful case that DML can't solve, there would need to be
some integer constraint problem that simplex can't solve in a reasonable
amount of time. The simplex algorithm has been around for ages and people
just don't have any problems with its worst-case exponential complexity.

Even if you found such a problem, if the programmer actually knows the check
is not needed because they have a solution to the constraint problem that
simplex can't find fast enough, they can over-constrain the system to make
the simplex algorithm run faster.

Xi has shown several examples of being able to verify that it is safe to
remove the bounds checks for both binary search and in-place quicksort. I
can't really imagine anything getting any more complicated than that in 99%
of the programs I've ever seen.
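
To see why binary search is within reach, here is the loop invariant
spelled out in ordinary OCaml comments (I won't try to reproduce actual
DML syntax here):

(* Invariant: 0 <= lo <= hi <= Array.length a holds on every call, so
   lo <= mid < hi and the access a.(mid) is provably in bounds -- exactly
   the fact a DML-style checker discharges with its linear constraint
   solver, letting it erase the runtime check. *)
let binary_search (a : int array) (key : int) : int option =
  let rec go lo hi =
    if lo >= hi then None
    else
      let mid = lo + (hi - lo) / 2 in
      let x = a.(mid) in
      if x = key then Some mid
      else if x < key then go (mid + 1) hi
      else go lo mid
  in
  go 0 (Array.length a)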

> "Trust the programmer"
> is not guaranteed safe, but it does work (I take it the combination of
> the first and second clauses bothers you :), and there is no implementation
> burden for compiler writers. One way to do it in OCaml is to provide
> unsafe_get and unsafe_set for arrays, as is already done.

Regardless of the merits of DML, I find the "trust the programmer" attitude
completely unacceptable. Safe systems, should require programmers to provide
machine checkable evidence of their claims. If programmers cannot provide
evidence, their claims are simply wishful thinking, and there is no reason
to let them compromise the integrity of the system.

In general it's not clear how to force programmers to provide evidence for
arbitrary claims. However, for claims about range checking I don't see any
reason why the *DML approach* is not more than acceptable today.

Let me be clear: I'm advocating the underlying *design* of DML which Xi has
implemented in DeCaml and Xanadu. One can argue about the quality of the
embedding in a given language. In any case, it is simply not acceptable to
turn off range checking in unsafe ways. If you accept the "trust the
programmer" philosophy you might as well just program in C.

Brian Rogoff

unread,
Apr 13, 2001, 6:52:00 PM4/13/01
to
On 13 Apr 2001, Daniel C. Wang wrote:
[...snip...]

> Let me be clear: I'm advocating the underlying *design* of DML which Xi has
> implemented in DeCaml and Xanadu. One can argue about the quality of the
> embedding in a given language. In any case, it is simply not acceptable to
> turn off range checking in unsafe ways. If you accept the "trust the
> programmer" philosophy you might as well just program in C.

OK, I'm convinced that the approach is better in theory, even though in
practice there are no acceptable implementations. Modify the OCaml
compiler to support it and I'll even use it.

I'm not convinced by your statements about C though. I suppose if you had
a choice between Modula-3 and C for a safety critical system, you'd choose
C, because, since M3 has unsafe features in the language, "you might as
well just program in C"? You don't really believe that. There are degrees
of safety, and many techniques applied in the production of safety
critical software (even object code inspection!); claiming that the
ability to turn off bounds checking in a localized section of code is
tantamount to writing in C with all compiler checks off is extreme.

-- Brian

Daniel C. Wang

unread,
Apr 13, 2001, 8:55:35 PM4/13/01
to

Brian Rogoff <b...@shell5.ba.best.com> writes:

From the standpoint of "what properties do I have guarantees for" there is
no difference. From the standpoint of "how hard is it to find the bug after
my system fails", M3 and the like, where there is a clearly defined unsafe
subset, have an advantage over C. In any case, M3 wins mainly because it has
garbage collection. If you started programming in M3 using unsafe allocation
all over the place, it really would be like programming in C.

Brian Rogoff

unread,
Apr 13, 2001, 10:07:47 PM4/13/01
to
On 13 Apr 2001, Daniel C. Wang wrote:
> Brian Rogoff <b...@shell5.ba.best.com> writes:
> > I'm not convinced by your statements about C though. I suppose if you had
> > a choice between Modula-3 and C for a safety critical system, you'd choose
> > C, because, since M3 has unsafe features in the language, "you might as
> > well just program in C"? You don't really believe that. There are degrees
> > of safety, and many techniques applied in the production of safety
> > critical software (even object code inspection!); claiming that the
> > ability to turn off bounds checking in a localized section of code is
> > tantamount to writing in C with all compiler checks off is extreme.
>
> From the standpoint of "what properties do I have guarantees for" there is
> no difference. From the standpoint of "how hard is it to find the bug after
> my system fails", M3 and the like, where there is a clearly defined unsafe
> subset, have an advantage over C. In any case, M3 wins mainly because it has
> garbage collection. If you started programming in M3 using unsafe allocation
> all over the place, it really would be like programming in C.

Well, this is just bullshit. I've used Ada and C quite a bit (as has
Laurent Guerby who is no doubt reading this) and I know quite well that
programming in Ada is not at all like programming in C from the safety
point of view. There's a lot more to safety than just garbage collection.

-- Brian


Daniel C. Wang

unread,
Apr 13, 2001, 10:18:25 PM4/13/01
to

Brian Rogoff <b...@shell5.ba.best.com> writes:

> On 13 Apr 2001, Daniel C. Wang wrote:
> [...snip...]
> > Let me be clear: I'm advocating the underlying *design* of DML which Xi has
> > implemented in DeCaml and Xanadu. One can argue about the quality of the
> > embedding in a given language. In any case, it is simply not acceptable to
> > turn off range checking in unsafe ways. If you accept the "trust the
> > programmer" philosophy you might as well just program in C.
>
> OK, I'm convinced that the approach is better in theory, even though in
> practice there are no acceptable implementations. Modify the OCaml
> compiler to support it and I'll even use it.
>

I also should be a little less broad in my claim. OCaml, with respect to
polymorphic recursion and polymorphic variants... does not have a complete
type inference/checking algorithm. There are provably type-safe programs
which cannot be proven safe in OCaml. This bothers some people. The lack of
polymorphic recursion, however, is no excuse for using "Unsafe.cast" in your
OCaml program to "solve" the problem of polymorphic recursion.

You should complain to the OCaml folk to fix it by letting the programmer
provide hints that increase the power of the inference algorithm. Or
threaten to move to another language that supports the feature. This is the
attitude which I think people should have wrt incompleteness of any
type system.

The attitude "the system is incomplete" so let's just make it unsound. Seems
silly with respect to polymorphic recursion. It should be equally silly with
reasoning about integer constraints. Especially when it's the case that the
system is incomplete in only a very few cases in practice.

In an ideal world, the only "escape hatch" a safe programming language
should provide to the user is an interface to a tactical theorem prover, so
the user can prove those facts which the type system cannot verify by
itself. Unsafe.cast is not an acceptable option for those domains for which
there exist well-understood, powerful constraint solvers and checkers. The
integer domain is one such area. Just like polymorphic type inference.

Daniel C. Wang

unread,
Apr 13, 2001, 10:32:16 PM4/13/01
to

Brian Rogoff <b...@shell5.ba.best.com> writes:

Note, I'm comparing M3 and C. In Ada, where you probably don't heap allocate
much of anything, GC might not buy you much. I haven't programmed in Ada
myself, so you can ignore this claim.

But really, remove unsafe casts from C.

Add garbage collection.
Replace void pointers with generics/polymorphism.
Require array bounds checking.
Throw out broken C libraries.

The language you have left is not all that bad. It looks like M3. Okay,
there are concurrency issues which I'm sure both M3 and Ada handle better
than C, but these languages are more alike than not.

Brian Rogoff

unread,
Apr 13, 2001, 10:58:19 PM4/13/01
to
On 13 Apr 2001, Daniel C. Wang wrote:
> Brian Rogoff <b...@shell5.ba.best.com> writes:
> > OK, I'm convinced that the approach is better in theory, even though in
> > practice there are no acceptable implementations. Modify the OCaml
> > compiler to support it and I'll even use it.
> >
>
> I also should be a little less broad in my claim. OCaml, with respect to
> polymorphic recursion and polymorphic variants... does not have a complete
> type inference/checking algorithm. There are provably type-safe programs
> which cannot be proven safe in OCaml. This bothers some people. The lack of
> polymorphic recursion, however, is no excuse for using "Unsafe.cast" in your
> OCaml program to "solve" the problem of polymorphic recursion.

Well, if I'm not mistaken (I'm not an expert on type theory) there is
always some slack in any inference system so you'll need to either have
explicit types (a la Haskell) or settle for an undecidable type system and
set iteration limits (a la Mercury and C++) if you want all that power.

For now, I just don't use polymorphic recursion, and I don't try to fake
it by using Obj.magic. BTW, yes I much prefer the name Unsafe.cast to
Obj.magic :-).

> You should complain to the OCaml folk to fix it by letting the programmer
> provide hints that increase the power of the inference algorithm.

I have. The response has been positive. The OCaml folk are not hasty in
adding things to the language and would like to do things right. I
appreciate that approach, and I'd think that you would too.

In the meantime, I'm patient, since there aren't (IMO of course) better
alternatives than OCaml, and I can usually find some way to get things
done in the language.

> Or threaten to move to another language that supports the
> feature. This is the attitude which I think people should have wrt
> incompleteness of any type system.

Where do you work? I'm a commercial programmer. I've used a fair number of
languages, and despite some of the problems you've, ummm, politely
described wrt OCaml, I find it better than anything else. There are a few
type system enhancements I'd love to have, and I'm well aware that there
are a number of rough edges (yes, the order of evaluation, the top level
module name thing, etc). But at some point, I just suck it up and get on
with my real work, and live without polymorphic recursion, overloading,
and recursive modules (I'm repeating that list in case Xavier et al are
reading :-) even though they'd make my life easier.

I suspect that you don't have such constraints and are able to choose
whatever language you want at any time. That's great, but it doesn't work
that way where I work. We'll be using OCaml for quite a while; we have
enough legacy code and accumulated knowledge that we're not going to
dump it over the lack of polymorphic recursion. That would be crazy.

-- Brian


Daniel C. Wang

unread,
Apr 13, 2001, 11:28:03 PM4/13/01
to

Brian Rogoff <b...@shell5.ba.best.com> writes:
{stuff deleted}
> Where do you work? I'm a commercial programmer. I've used a fair number of

Work? :) I fashion myself a language designer... I don't "work". I just
pontificate. As a person with a deadline who can't wait for the language
designers to fix things, I'll cry uncle and say that Obj.magic is an
acceptable stopgap. But it is not an acceptable thing to have in a language
definition. (This whole thing started when you suggested Ada's unchecked
pragmas were an acceptable thing to have in the language spec.)

> languages, and despite some of the problems you've, ummm, politely
> described wrt OCaml, I find it better than anything else. There are a few
> type system enhancements I'd love to have, and I'm well aware that there
> are a number of rough edges (yes, the order of evaluation, the top level
> module name thing, etc). But at some point,I just suck it in and get on
> with my real work, and live without polymorphic recursion, overloading,
> and recursive modules (I'm repeating that list in case Xavier et al are
> reading :-) even though they'd make my life easier.

This is a reasonable attitude. I'm only suggesting that you should have the
same attitude wrt array bounds checking. Just add dependent types to the
list of "must have extensions". You can get a lot more than just array
bounds checking if you extend the domain of the constraints to other
useful constraint domains.

Markus Mottl

unread,
Apr 14, 2001, 5:21:32 AM4/14/01
to
Daniel C. Wang <danwan...@cs.princeton.edu> wrote:
> I'll cry uncle and say that Obj.magic is an acceptable stopgap. But
> it is not an acceptable thing to have in a language definition. (This
> whole thing started when you suggested Ada's unchecked pragmas were
> an acceptable thing to have in the language spec.)

Note that Obj.magic is *not* part of the language definition. It is an
implementation feature that has intentionally been left undocumented
(to prevent beginners from using it). Its intended use is for very
low-level things like, e.g., interfacing to foreign languages, where
you'll arguably have to break type safety.

I once used it for something different (automatically resizing arrays),
where I exploited the way the garbage collector works to set unused
array parts to some kind of "null pointer" to free memory (the GC can
cope with this). The interface to this library is type safe, of course.
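
Roughly like this (a simplified sketch, not the actual library code; note
the trick is only sound for boxed element types, since OCaml stores float
arrays unboxed):

(* A resizable stack backed by an oversized array.  The unsound bit is
   confined to pop, which overwrites the dead slot with a dummy so the
   GC no longer sees the old element. *)
type 'a t = { mutable data : 'a array; mutable len : int }

let pop (r : 'a t) : 'a =
  if r.len = 0 then invalid_arg "pop";
  let x = r.data.(r.len - 1) in
  r.len <- r.len - 1;
  r.data.(r.len) <- Obj.magic 0;  (* "null pointer"; fine for boxed types *)
  x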

As for the naming: I prefer "Obj.magic" to "Unsafe.cast", because
the latter looks like an invitation to do something unsound, whereas
the former (especially if left undocumented) looks dangerously magical.

Regards,
Markus Mottl

--
Markus Mottl, mo...@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl

Martin Berger

unread,
Apr 14, 2001, 10:13:52 AM4/14/01
to

> But DML was carefully designed so that the dependent type checking is
> decidable -- that was the whole point!

indeed, but one runs rather easily into situations where the restrictions
become a problem

in any case, i don't want to knock xi's work or dependent types. in fact
i'm rather fond of both and would like to see much more progress in this
direction, 'cause i think it is the right direction for type systems research

martin

Daniel C. Wang

unread,
Apr 14, 2001, 1:03:07 PM4/14/01
to

Martin Berger <martinb@--remove--me--dcs.qmw.ac.uk> writes:

Maybe I'm the only one who thinks this detail is worth pointing out, but Xi's
method is not limited to the simplex method. The whole cute thing is that
type checking is split into two phases: one that looks like an H-M style
type inference algorithm and collects a set of constraints. These
constraints have to be linear constraints or else the constraint solver
(simplex) won't be able to find a solution. It would be "easy" to plug in
another constraint solver, or give the programmer the option of providing an
explicit solution to the constraints that simplex can't solve.

Fergus Henderson

unread,
Apr 14, 2001, 4:49:36 PM4/14/01
to

There is a fundamental difference in design philosophy between C and
M3/Ada. The designers of M3 and Ada valued safety highly. The
designers of C favoured other properties, such as convenience, over
safety. The designers and standardizers of M3 and Ada tried very hard to
minimize the areas of undefined behaviour. The designers and
standardizers of C didn't.

In addition to unsafe casts, void pointers, and unchecked array
indices, there are MANY, *MANY* other issues in C that can
result in undefined behaviour.

In C, even simple things like string literals and integer division can
be undefined behaviour waiting to happen.

For a slightly more obscure example, consider this one:

| -- The same identifier has both internal and external
| linkage in the same translation unit (6.2.2).

How many C programmers even understand what it means for an
identifier to have both internal and external linkage in the
same translation unit?

There are many more well-known problems, though, including classics
such as violating the sequence point requirements. Indeed whole books
have been written about the problems, and how to avoid them, e.g.

C Traps and Pitfalls
Andrew Koenig
Addison Wesley, 1989

Safer C
Les Hatton
McGraw Hill

And that's not to mention the many ways in which you can screw things
up in C even without invoking undefined behaviour, for example by
forgetting the "break;" at the end of a switch case.

I enclose below a *partial* list of the causes of undefined behaviour,
from the C standard. Make sure you remember them all, otherwise you
will probably get bitten by one that you've forgotten or were unaware
of -- it has certainly happened to me.

| J.2 Undefined behavior
|
| [#1] The behavior is undefined in the following
| circumstances:
|
| -- A ``shall'' or ``shall not'' requirement that appears
| outside of a constraint is violated (clause 4).
[This probably covers dozens or maybe hundreds of different
causes of undefined behaviour not listed below.]
|
| -- A nonempty source file does not end in a new-line
| character which is not immediately preceded by a
| backslash character or ends in a partial preprocessing
| token or comment (5.1.1.2).
|
| -- Token concatenation produces a character sequence
| matching the syntax of a universal character name
| (5.1.1.2).
|
| -- A program in a hosted environment does not define a
| function named main using one of the specified forms
| (5.1.2.2.1).
|
| -- A character not in the basic source character set is
| encountered in a source file, except in an identifier,
| a character constant, a string literal, a header name,
| a comment, or a preprocessing token that is never
| converted to a token (5.2.1).
|
| -- An identifier, comment, string literal, character
| constant, or header name contains an invalid multibyte
| character or does not begin and end in the initial
| shift state (5.2.1.2).
|
| -- The same identifier has both internal and external
| linkage in the same translation unit (6.2.2).
|
| -- An object is referred to outside of its lifetime
| (6.2.4).
|
| -- The value of a pointer to an object whose lifetime has
| ended is used (6.2.4).
|
| -- The value of an object with automatic storage duration
| is used while it is indeterminate (6.2.4, 6.7.8, 6.8).
|
| -- A trap representation is read by an lvalue expression
| that does not have character type (6.2.6.1).
|
| -- A trap representation is produced by a side effect that
| modifies any part of the object using an lvalue
| expression that does not have character type (6.2.6.1).
|
| -- The arguments to certain operators are such that could
| produce a negative zero result, but the implementation
| does not support negative zeros (6.2.6.2).
|
| -- Two declarations of the same object or function specify
| types that are not compatible (6.2.7).
|
| -- Conversion to or from an integer type produces a value
| outside the range that can be represented (6.3.1.4).
|
| -- Demotion of one real floating type to another produces
| a value outside the range that can be represented
| (6.3.1.5).
|
| -- An lvalue does not designate an object when evaluated
| (6.3.2.1).
|
| -- A non-array lvalue with an incomplete type is used in a
| context that requires the value of the designated
| object (6.3.2.1).
|
| -- An lvalue having array type is converted to a pointer
| to the initial element of the array, and the array
| object has register storage class (6.3.2.1).
|
| -- An attempt is made to use the value of a void
| expression, or an implicit or explicit conversion
| (except to void) is applied to a void expression
| (6.3.2.2).
|
| -- Conversion of a pointer to an integer type produces a
| value outside the range that can be represented
| (6.3.2.3).
|
| -- Conversion between two pointer types produces a result
| that is incorrectly aligned (6.3.2.3).
|
| -- A pointer is used to call a function whose type is not
| compatible with the pointed-to type (6.3.2.3).
|
| -- An unmatched ' or " character is encountered on a
| logical source line during tokenization (6.4).
|
| -- A reserved keyword token is used in translation phase 7
| or 8 for some purpose other than as a keyword (6.4.1).
|
| -- A universal character name in an identifier does not
| designate a character whose encoding falls into one of
| the specified ranges (6.4.2.1).
|
| -- The initial character of an identifier is a universal
| character name designating a digit (6.4.2.1).
|
| -- Two identifiers differ only in nonsignificant
| characters (6.4.2.1).
|
| -- The identifier __func__ is explicitly declared
| (6.4.2.2).
|
| -- The program attempts to modify a string literal
| (6.4.5).
|
| -- The characters ', \, ", //, or /* occur in the sequence
| between the < and > delimiters, or the characters ', \,
| //, or /* occur in the sequence between the "
| delimiters, in a header name preprocessing token
| (6.4.7).
|
| -- Between two sequence points, an object is modified more
| than once, or is modified and the prior value is read
| other than to determine the value to be stored (6.5).
|
| -- An exceptional condition occurs during the evaluation
| of an expression (6.5).
|
| -- An object has its stored value accessed other than by
| an lvalue of an allowable type (6.5).
|
| -- An attempt is made to modify the result of a function
| call, a conditional operator, an assignment operator,
| or a comma operator, or to access it after the next
| sequence point (6.5.2.2, 6.5.15, 6.5.16, 6.5.17).
|
| -- For a call to a function without a function prototype
| in scope, the number of arguments does not equal the
| number of parameters (6.5.2.2).
|
| -- For call to a function without a function prototype in
| scope where the function is defined with a function
| prototype, either the prototype ends with an ellipsis
| or the types of the arguments after promotion are not
| compatible with the types of the parameters (6.5.2.2).
|
| -- For a call to a function without a function prototype
| in scope where the function is not defined with a
| function prototype, the types of the arguments after
| promotion are not compatible with those of the
| parameters after promotion (with certain exceptions)
| (6.5.2.2).
|
| -- A function is defined with a type that is not
| compatible with the type (of the expression) pointed to
| by the expression that denotes the called function
| (6.5.2.2).
|
| -- The operand of the unary * operator has an invalid
| value (6.5.3.2).
|
| -- A pointer is converted to other than an integer or
| pointer type (6.5.4).
|
| -- The value of the second operand of the / or % operator
| is zero (6.5.5).
|
| -- Addition or subtraction of a pointer into, or just
| beyond, an array object and an integer type produces a
| result that does not point into, or just beyond, the
| same array object (6.5.6).
|
| -- Addition or subtraction of a pointer into, or just
| beyond, an array object and an integer type produces a
| result that points just beyond the array object and is
| used as the operand of a unary * operator that is
| evaluated (6.5.6).
|
| -- Pointers that do not point into, or just beyond, the
| same array object are subtracted (6.5.6).
|
| -- An array subscript is out of range, even if an object
| is apparently accessible with the given subscript (as
| in the lvalue expression a[1][7] given the declaration
| int a[4][5]) (6.5.6).
|
| -- The result of subtracting two pointers is not
| representable in an object of type ptrdiff_t (6.5.6).
|
| -- An expression is shifted by a negative number or by an
| amount greater than or equal to the width of the
| promoted expression (6.5.7).
|
| -- An expression having signed promoted type is left-
| shifted and either the value of the expression is
|          negative or the result of shifting would not be
| representable in the promoted type (6.5.7).
|
| -- Pointers that do not point to the same aggregate or
| union (nor just beyond the same array object) are
| compared using relational operators (6.5.8).
|
| -- An object is assigned to an inexactly overlapping
| object or to an exactly overlapping object with
| incompatible type (6.5.16.1).
|
| -- An expression that is required to be an integer
| constant expression does not have an integer type; has
| operands that are not integer constants, enumeration
| constants, character constants, sizeof expressions
| whose results are integer constants, or immediately-
| cast floating constants; or contains casts (outside
| operands to sizeof operators) other than conversions of
| arithmetic types to integer types (6.6).
|
| -- A constant expression in an initializer is not, or does
| not evaluate to, one of the following: an arithmetic
| constant expression, a null pointer constant, an
| address constant, or an address constant for an object
| type plus or minus an integer constant expression
| (6.6).
|
| -- An arithmetic constant expression does not have
| arithmetic type; has operands that are not integer
| constants, floating constants, enumeration constants,
| character constants, or sizeof expressions; or contains
| casts (outside operands to sizeof operators) other than
| conversions of arithmetic types to arithmetic types
| (6.6).
|
| -- The value of an object is accessed by an array-
| subscript [], member-access . or ->, address &, or
| indirection * operator or a pointer cast in creating an
| address constant (6.6).
|
| -- An identifier for an object is declared with no linkage
| and the type of the object is incomplete after its
| declarator, or after its init-declarator if it has an
| initializer (6.7).
|
| -- A function is declared at block scope with an explicit
| storage-class specifier other than extern (6.7.1).
|
| -- A structure or union is defined as containing no named
| members (6.7.2.1).
|
| -- An attempt is made to access, or generate a pointer to
| just past, a flexible array member of a structure when
| the referenced object provides no elements for that
| array (6.7.2.1).
|
| -- When the complete type is needed, an incomplete
| structure or union type is not completed in the same
| scope by another declaration of the tag that defines
| the content (6.7.2.3).
|
| -- An attempt is made to modify an object defined with a
| const-qualified type through use of an lvalue with non-
| const-qualified type (6.7.3).
|
| -- An attempt is made to refer to an object defined with a
| volatile-qualified type through use of an lvalue with
| non-volatile-qualified type (6.7.3).
|
| -- The specification of a function type includes any type
| qualifiers (6.7.3).
|
| -- Two qualified types that are required to be compatible
| do not have the identically qualified version of a
| compatible type (6.7.3).
|
| -- An object which has been modified is accessed through a
| restrict-qualified pointer to a const-qualified type,
| or through a restrict-qualified pointer and another
| pointer that are not both based on the same object
| (6.7.3.1).
|
| -- A restrict-qualified pointer is assigned a value based
| on another restricted pointer whose associated block
| neither began execution before the block associated
| with this pointer, nor ended before the assignment
| (6.7.3.1).
|
| -- A function with external linkage is declared with an
| inline function specifier, but is not also defined in
| the same translation unit (6.7.4).
|
| -- Two pointer types that are required to be compatible
| are not identically qualified, or are not pointers to
| compatible types (6.7.5.1).
|
| -- The size expression in an array declaration is not a
| constant expression and evaluates at program execution
| time to a nonpositive value (6.7.5.2).
|
| -- In a context requiring two array types to be
| compatible, they do not have compatible element types,
| or their size specifiers evaluate to unequal values
| (6.7.5.2).
|
| -- A declaration of an array parameter includes the
| keyword static within the [ and ] and the corresponding
| argument does not provide access to the first element
| of an array with at least the specified number of
| elements (6.7.5.3).
|
| -- A storage-class specifier or type qualifier modifies
| the keyword void as a function parameter type list
| (6.7.5.3).
|
| -- In a context requiring two function types to be
| compatible, they do not have compatible return types,
| or their parameters disagree in use of the ellipsis
| terminator or the number and type of parameters (after
| default argument promotion, when there is no parameter
| type list or when one type is specified by a function
| definition with an identifier list) (6.7.5.3).
|
| -- The value of an unnamed member of a structure or union
| is used (6.7.8).
|
| -- The initializer for a scalar is neither a single
| expression nor a single expression enclosed in braces
| (6.7.8).
|
| -- The initializer for a structure or union object that
| has automatic storage duration is neither an
| initializer list nor a single expression that has
| compatible structure or union type (6.7.8).
|
| -- The initializer for an aggregate or union, other than
| an array initialized by a string literal, is not a
| brace-enclosed list of initializers for its elements or
| members (6.7.8).
|
| -- An identifier with external linkage is used, but in the
| program there does not exist exactly one external
| definition for the identifier, or the identifier is not
| used and there exist multiple external definitions for
| the identifier (6.9).
|
| -- A function definition includes an identifier list, but
| the types of the parameters are not declared in a
| following declaration list (6.9.1).
|
| -- An adjusted parameter type in a function definition is
| not an object type (6.9.1).
|
| -- A function that accepts a variable number of arguments
| is defined without a parameter type list that ends with
| the ellipsis notation (6.9.1).
|
| -- The } that terminates a function is reached, and the
| value of the function call is used by the caller
| (6.9.1).
|
| -- An identifier for an object with internal linkage and
| an incomplete type is declared with a tentative
| definition (6.9.2).
|
| -- The token defined is generated during the expansion of
| a #if or #elif preprocessing directive, or the use of
| the defined unary operator does not match one of the
| two specified forms prior to macro replacement
| (6.10.1).
|
| -- The #include preprocessing directive that results after
| expansion does not match one of the two header name
| forms (6.10.2).
|
| -- The character sequence in an #include preprocessing
| directive does not start with a letter (6.10.2).
|
| -- There are sequences of preprocessing tokens within the
| list of macro arguments that would otherwise act as
| preprocessing directives (6.10.3).
|
| -- The result of the preprocessing operator # is not a
| valid character string literal (6.10.3.2).
|
| -- The result of the preprocessing operator ## is not a
| valid preprocessing token (6.10.3.3).
|
| -- The #line preprocessing directive that results after
| expansion does not match one of the two well-defined
| forms, or its digit sequence specifies zero or a number
| greater than 2147483647 (6.10.4).
|
| -- A non-STDC #pragma preprocessing directive that is
| documented as causing translation failure or some other
| form of undefined behavior is encountered (6.10.6).
|
| -- A #pragma STDC preprocessing directive does not match
| one of the well-defined forms (6.10.6).
|
| -- The name of a predefined macro, or the identifier
| defined, is the subject of a #define or #undef
| preprocessing directive (6.10.8).
|
| -- An attempt is made to copy an object to an overlapping
| object by use of a library function, other than as
| explicitly allowed (e.g., memmove) (clause 7).
|
| -- A file with the same name as one of the standard
| headers, not provided as part of the implementation, is
| placed in any of the standard places that are searched
| for included source files (7.1.2).
|
| -- A header is included within an external declaration or
| definition (7.1.2).
|
| -- A function, object, type, or macro that is specified as
| being declared or defined by some standard header is
| used before any header that declares or defines it is
| included (7.1.2).
|
| -- A standard header is included while a macro is defined
| with the same name as a keyword (7.1.2).
|
| -- The program attempts to declare a library function
| itself, rather than via a standard header, but the
| declaration does not have external linkage (7.1.2).
|
| -- The program declares or defines a reserved identifier,
| other than as allowed by 7.1.4 (7.1.3).
|
| -- The program removes the definition of a macro whose
| name begins with an underscore and either an uppercase
| letter or another underscore (7.1.3).
|
| -- An argument to a library function has an invalid value
| or a type not expected by a function with variable
| number of arguments (7.1.4).
|
| -- The pointer passed to a library function array
| parameter does not have a value such that all address
| computations and object accesses are valid (7.1.4).
|
| -- The macro definition of assert is suppressed in order
| to access an actual function (7.2).
|
| -- The argument to the assert macro does not have a scalar
| type (7.2).
|
| -- The CX_LIMITED_RANGE, FENV_ACCESS, or FP_CONTRACT
| pragma is used in any context other than outside all
| external declarations or preceding all explicit
| declarations and statements inside a compound statement
| (7.3.4, 7.6.1, 7.12.2).
|
| -- The value of an argument to a character handling
| function is neither equal to the value of EOF nor
| representable as an unsigned char (7.4).
|
| -- A macro definition of errno is suppressed in order to
| access an actual object, or the program defines an
| identifier with the name errno (7.5).
|
| -- Part of the program tests floating-point status flags,
| sets floating-point control modes, or runs under non-
| default mode settings, but was translated with the
| state for the FENV_ACCESS pragma ``off'' (7.6.1).
|
| -- The exception-mask argument for one of the functions
| that provide access to the floating-point status flags
| has a nonzero value not obtained by bitwise OR of the
| floating-point exception macros (7.6.2).
|
| -- The fesetexceptflag function is used to set floating-
| point status flags that were not specified in the call
| to the fegetexceptflag function that provided the value
| of the corresponding fexcept_t object (7.6.2.4).
|
| -- The argument to fesetenv or feupdateenv is neither an
| object set by a call to fegetenv or feholdexcept, nor
| is it an environment macro (7.6.4.3, 7.6.4.4).
|
| -- The value of the result of an integer arithmetic or
| conversion function cannot be represented (7.8.2.1,
| 7.8.2.2, 7.8.2.3, 7.8.2.4, 7.20.6.1, 7.20.6.2, 7.20.1).
|
| -- The program modifies the string pointed to by the value
| returned by the setlocale function (7.11.1.1).
|
| -- The program modifies the structure pointed to by the
| value returned by the localeconv function (7.11.2.1).
|
| -- A macro definition of math_errhandling is suppressed or
| the program defines an identifier with the name
| math_errhandling (7.12).
|
| -- An argument to a floating-point classification or
| comparison macro is not of real floating type (7.12.3,
| 7.12.14).
|
| -- A macro definition of setjmp is suppressed in order to
| access an actual function, or the program defines an
| external identifier with the name setjmp (7.13).
|
| -- An invocation of the setjmp macro occurs other than in
| an allowed context (7.13.2.1).
|
| -- The longjmp function is invoked to restore a
| nonexistent environment (7.13.2.1).
|
| -- After a longjmp, there is an attempt to access the
| value of an object of automatic storage class with non-
| volatile-qualified type, local to the function
| containing the invocation of the corresponding setjmp
| macro, that was changed between the setjmp invocation
| and longjmp call (7.13.2.1).
|
| -- The program specifies an invalid pointer to a signal
| handler function (7.14.1.1).
|
| -- A signal handler returns when the signal corresponded
| to a computational exception (7.14.1.1).
|
| -- A signal occurs as the result of calling the abort or
| raise function, and the signal handler calls the raise
| function (7.14.1.1).
|
| -- A signal occurs other than as the result of calling the
| abort or raise function, and the signal handler refers
| to an object with static storage duration other than by
| assigning a value to an object declared as volatile
| sig_atomic_t, or calls any function in the standard
| library other than the abort function, the _Exit
| function, or the signal function (for the same signal
| number) (7.14.1.1).
|
| -- The value of errno is referred to after a signal
| occurred other than as the result of calling the abort
| or raise function and the corresponding signal handler
| obtained a SIG_ERR return from a call to the signal
| function (7.14.1.1).
|
| -- A signal is generated by an asynchronous signal handler
| (7.14.1.1).
|
| -- A function with a variable number of arguments attempts
| to access its varying arguments other than through a
| properly declared and initialized va_list object, or
| before the va_start macro is invoked (7.15, 7.15.1.1,
| 7.15.1.4).
|
| -- The macro va_arg is invoked using the parameter ap that
| was passed to a function that invoked the macro va_arg
| with the same parameter (7.15).
|
| -- A macro definition of va_start, va_arg, va_copy, or
| va_end is suppressed in order to access an actual
| function, or the program defines an external identifier
| with the name va_copy or va_end (7.15.1).
|
| -- The va_start or va_copy macro is invoked without a
| corresponding invocation of the va_end macro in the
| same function, or vice versa (7.15.1, 7.15.1.2,
| 7.15.1.3, 7.15.1.4).
|
| -- The type parameter to the va_arg macro is not such that
| a pointer to an object of that type can be obtained
| simply by postfixing a * (7.15.1.1).
|
| -- The va_arg macro is invoked when there is no actual
| next argument, or with a specified type that is not
| compatible with the promoted type of the actual next
| argument, with certain exceptions (7.15.1.1).
|
| -- The va_copy or va_start macro is called to initialize a
| va_list that was previously initialized by either macro
| without an intervening invocation of the va_end macro
| for the same va_list (7.15.1.2, 7.15.1.4).
|
| -- The parameter parmN of a va_start macro is declared
| with the register storage class, with a function or
| array type, or with a type that is not compatible with
| the type that results after application of the default
| argument promotions (7.15.1.4).
|
| -- The member designator parameter of an offsetof macro is
| an invalid right operand of the . operator for the type
| parameter, or designates a bit-field (7.17).
|
| -- The argument in an instance of one of the integer-
| constant macros is not a decimal, octal, or hexadecimal
| constant, or it has a value that exceeds the limits for
| the corresponding type (7.18.4).
|
| -- A byte input/output function is applied to a wide-
| oriented stream, or a wide character input/output
| function is applied to a byte-oriented stream (7.19.2).
|
| -- Use is made of any portion of a file beyond the most
| recent wide character written to a wide-oriented stream
| (7.19.2).
|
| -- The value of a pointer to a FILE object is used after
| the associated file is closed (7.19.3).
|
| -- The stream for the fflush function points to an input
| stream or to an update stream in which the most recent
| operation was input (7.19.5.2).
|
| -- The string pointed to by the mode argument in a call to
| the fopen function does not exactly match one of the
| specified character sequences (7.19.5.3).
|
| -- An output operation on an update stream is followed by
| an input operation without an intervening call to the
| fflush function or a file positioning function, or an
| input operation on an update stream is followed by an
| output operation with an intervening call to a file
| positioning function (7.19.5.3).
|
| -- An attempt is made to use the contents of the array
| that was supplied in a call to the setvbuf function
| (7.19.5.6).
|
| -- There are insufficient arguments for the format in a
| call to one of the formatted input/output functions, or
| an argument does not have an appropriate type
| (7.19.6.1, 7.19.6.2, 7.24.2.1, 7.24.2.2).
|
| -- The format in a call to one of the formatted
| input/output functions or to the strftime or wcsftime
| function is not a valid multibyte character sequence
| that begins and ends in its initial shift state
| (7.19.6.1, 7.19.6.2, 7.23.3.5, 7.24.2.1, 7.24.2.2,
| 7.24.5.1).
|
| -- In a call to one of the formatted output functions, a
| precision appears with a conversion specifier other
| than those described (7.19.6.1, 7.24.2.1).
|
| -- A conversion specification for a formatted output
| function uses an asterisk to denote an argument-
| supplied field width or precision, but the
| corresponding argument is not provided (7.19.6.1,
| 7.24.2.1).
|
| -- A conversion specification for a formatted output
| function uses a # or 0 flag with a conversion specifier
| other than those described (7.19.6.1, 7.24.2.1).
|
| -- A conversion specification for one of the formatted
| input/output functions uses a length modifier with a
| conversion specifier other than those described
| (7.19.6.1, 7.19.6.2, 7.24.2.1, 7.24.2.2).
|
| -- An s conversion specifier is encountered by one of the
| formatted output functions, and the argument is missing
| the null terminator (unless a precision is specified
| that does not require null termination) (7.19.6.1,
| 7.24.2.1).
|
| -- An n conversion specification for one of the formatted
| input/output functions includes any flags, an
| assignment-suppressing character, a field width, or a
| precision (7.19.6.1, 7.19.6.2, 7.24.2.1, 7.24.2.2).
|
| -- A % conversion specifier is encountered by one of the
| formatted input/output functions, but the complete
| conversion specification is not exactly %% (7.19.6.1,
| 7.19.6.2, 7.24.2.1, 7.24.2.2).
|
| -- An invalid conversion specification is found in the
| format for one of the formatted input/output functions,
| or the strftime or wcsftime function (7.19.6.1,
| 7.19.6.2, 7.23.3.5, 7.24.2.1, 7.24.2.2, 7.24.5.1).
|
| -- The number of characters transmitted by a formatted
| output function is greater than INT_MAX (7.19.6.1,
| 7.19.6.3, 7.19.6.8, 7.19.6.10).
|
| -- The result of a conversion by one of the formatted
| input functions cannot be represented in the
| corresponding object, or the receiving object does not
| have an appropriate type (7.19.6.2, 7.24.2.2).
|
| -- A c, s, or [ conversion specifier is encountered by one
| of the formatted input functions, and the array pointed
| to by the corresponding argument is not large enough to
| accept the input sequence (and a null terminator if the
| conversion specifier is s or [) (7.19.6.2, 7.24.2.2).
|
| -- A c, s, or [ conversion specifier with an l qualifier
| is encountered by one of the formatted input functions,
| but the input is not a valid multibyte character
| sequence that begins in the initial shift state
| (7.19.6.2, 7.24.2.2).
|
| -- The input item for a %p conversion by one of the
| formatted input functions is not a value converted
| earlier during the same program execution (7.19.6.2,
| 7.24.2.2).
|
| -- The vfprintf, vfscanf, vprintf, vscanf, vsnprintf,
| vsprintf, vsscanf, vfwprintf, vfwscanf, vswprintf,
| vswscanf, vwprintf, or vwscanf function is called with
| an improperly initialized va_list argument, or the
| argument is used (other than in an invocation of
| va_end) after the function returns (7.19.6.8, 7.19.6.9,
| 7.19.6.10, 7.19.6.11, 7.19.6.12, 7.19.6.13, 7.19.6.14,
| 7.24.2.5, 7.24.2.6, 7.24.2.7, 7.24.2.8, 7.24.2.9,
| 7.24.2.10).
|
| -- The contents of the array supplied in a call to the
| fgets, gets, or fgetws function are used after a read
| error occurred (7.19.7.2, 7.19.7.7, 7.24.3.2).
|
| -- The file position indicator for a binary stream is used
| after a call to the ungetc function where its value was
| zero before the call (7.19.7.11).
|
| -- The file position indicator for a stream is used after
| an error occurred during a call to the fread or fwrite
| function (7.19.8.1, 7.19.8.2).
|
| -- A partial element read by a call to the fread function
| is used (7.19.8.1).
|
| -- The fseek function is called for a text stream with a
| nonzero offset and either the offset was not returned
| by a previous successful call to the ftell function on
| a stream associated with the same file or whence is not
| SEEK_SET (7.19.9.2).
|
| -- The fsetpos function is called to set a position that
| was not returned by a previous successful call to the
| fgetpos function on a stream associated with the same
| file (7.19.9.3).
|
| -- A non-null pointer returned by a call to the calloc,
| malloc, or realloc function with a zero requested size
| is used to access an object (7.20.3).
|
| -- The value of a pointer that refers to space deallocated
| by a call to the free or realloc function is used
| (7.20.3).
|
| -- The pointer argument to the free or realloc function
| does not match a pointer earlier returned by calloc,
| malloc, or realloc, or the space has been deallocated
| by a call to free or realloc (7.20.3.2, 7.20.3.4).
|
| -- The value of the object allocated by the malloc
| function is used (7.20.3.3).
|
| -- The value of any bytes in a new object allocated by the
| realloc function beyond the size of the old object are
| used (7.20.3.4).
|
| -- The program executes more than one call to the exit
| function (7.20.4.3).
|
| -- During the call to a function registered with the
| atexit function, a call is made to the longjmp function
| that would terminate the call to the registered
| function (7.20.4.3).
|
| -- The string set up by the getenv or strerror function is
| modified by the program (7.20.4.5, 7.21.6.2).
|
| -- A command is executed through the system function in a
| way that is documented as causing termination or some
| other form of undefined behavior (7.20.4.6).
|
| -- A searching or sorting utility function is called with
| an invalid pointer argument, even if the number of
| elements is zero (7.20.5).
|
| -- The comparison function called by a searching or
| sorting utility function alters the contents of the
| array being searched or sorted, or returns ordering
| values inconsistently (7.20.5).
|
| -- The array being searched by the bsearch function does
| not have its elements in proper order (7.20.5.1).
|
| -- The current conversion state is used by a
| multibyte/wide character conversion function after
| changing the LC_CTYPE category (7.20.7).
|
| -- A string or wide string utility function is instructed
| to access an array beyond the end of an object (7.21.1,
| 7.24.4).
|
| -- A string or wide string utility function is called with
| an invalid pointer argument, even if the length is zero
| (7.21.1, 7.24.4).
|
| -- The contents of the destination array are used after a
| call to the strxfrm, strftime, wcsxfrm, or wcsftime
| function in which the specified length was too small to
| hold the entire null-terminated result (7.21.4.5,
| 7.23.3.5, 7.24.4.4.4, 7.24.5.1).
|
| -- The first argument in the very first call to the strtok
| or wcstok is a null pointer (7.21.5.8, 7.24.4.5.7).
|
| -- The type of an argument to a type-generic macro is not
| compatible with the type of the corresponding parameter
| of the selected function (7.22).
|
| -- A complex argument is supplied for a generic parameter
| of a type-generic macro that has no corresponding
| complex function (7.22).
|
| -- The argument corresponding to an s specifier without an
| l qualifier in a call to the fwprintf function does not
| point to a valid multibyte character sequence that
| begins in the initial shift state (7.24.2.11).
|
| -- In a call to the wcstok function, the object pointed to
| by ptr does not have the value stored by the previous
| call for the same wide string (7.24.4.5.7).
|
| -- An mbstate_t object is used inappropriately (7.24.6).
|
| -- The value of an argument of type wint_t to a wide
| character classification or case mapping function is
| neither equal to the value of WEOF nor representable as
| a wchar_t (7.25.1).
|
| -- The iswctype function is called using a different
| LC_CTYPE category from the one in effect for the call
| to the wctype function that returned the description
| (7.25.2.2.1).
|
| -- The towctrans function is called using a different
| LC_CTYPE category from the one in effect for the call
| to the wctrans function that returned the description
| (7.25.3.2.1).
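The contrast with a memory-safe language is stark: there, the analogous
operations have defined failure modes. A minimal OCaml sketch, illustrative
only and not part of the standard text above:

    (* bounds.ml -- an out-of-bounds access is defined behavior in OCaml:
       it raises Invalid_argument rather than invoking undefined behavior *)
    let () =
      let a = [| 1; 2; 3 |] in
      try ignore a.(10)
      with Invalid_argument _ -> print_endline "caught: index out of bounds"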

--
Fergus Henderson <f...@cs.mu.oz.au> | "I have always known that the pursuit
| of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh> | -- the last words of T. S. Garp.

Daniel C. Wang

unread,
Apr 14, 2001, 5:50:58 PM4/14/01
to

f...@cs.mu.oz.au (Fergus Henderson) writes:

{stuff deleted}


> There is a fundamental difference in design philosophy between C and
> M3/Ada. The designers of M3 and Ada valued safety highly. The
> designers of C favoured other properties, such as convenience, over
> safety. The designers and standardizers of M3 and Ada tried very hard to
> minimize the areas of undefined behaviour. The designers and
> standardizers of C didn't.

oh come on, be fair. C was invented to write Unix. It wasn't intended to be
the language it is today. It retained cruft to be backward compatible. Ada
and M3 are both "clean slate" designs. Let's also be clear that Ada is
relegated to a pretty niche market because writing a conforming Ada compiler
is a fair amount more work than a C compiler. Ada is an overspecified
language if there ever was one. I'm more sorry that M3 didn't take off...
but even in M3 you have to resort to tricks in a compiler-dependent way.

In any case, after reading the list of undefined C behaviors, I stand by my
claim that it's not that hard to come up with a "safe" C by tweaking the
underlying language design. Throwing away the cruft is not that hard. The
fact that there are crazy corner cases that most C programmers don't know
about just means these are crazy corner cases that aren't really a problem.

People program in languages I hate. The fact that Perl scripts run more of
the web than we'd like to admit just says that "clean language" design is
more of an aesthetic pursuit that matters less than we'd all like.

Let's just not look down on C. It's not that horrid of a language. I'll bet
more than 90% of the software used in order to propagate my random usenet
posts is written in C. Not Ada and not M3. If you tallied up the amount of
economic activity that software written in a particular language
generates, it'll be a fight between COBOL and C for the number one spot.

Brian Rogoff

unread,
Apr 14, 2001, 9:27:01 PM4/14/01
to
On 14 Apr 2001, Daniel C. Wang wrote:
> f...@cs.mu.oz.au (Fergus Henderson) writes:
>
> {stuff deleted}
> > There is a fundamental difference in design philosophy between C and
> > M3/Ada. The designers of M3 and Ada valued safety highly. The
> > designers of C favoured other properties, such as convenience, over
> > safety. The designers and standardizers of M3 and Ada tried very hard to
> > minimize the areas of undefined behaviour. The designers and
> > standardizers of C didn't.
>
> oh come on, be fair. C was invented to write Unix. It wasn't intended to be the
> language it is today. It retained cruft to be backward compatible.

Any language which becomes successful over time will have a strong
pressure to maintain backwards compatibility. Most FPLs have user
communities which are research oriented, and hence somewhat more tolerant
of experimentation than standard languages like Ada, C, and Common Lisp.

> M3 are both "clean slate" designs. Let's also be clear that Ada is relegated
> to a pretty niche market because writing a conforming Ada compiler is a fair
> amount more work than a C compiler. Ada is an overspecified language if
> there ever was one.

Actually, the Ada spec is a lot smaller than the C++ spec (not that *that*
says a lot) and deals with lots of issues at the bitfiddling level that
languages like ML don't deal with at all.

> I'm more sorry that M3 didn't take off... but even in
> M3 you have to resort to tricks in a compiler-dependent way.

Different strokes I guess. I never really liked M3 that much at all. No
compelling reason to switch from Ada 95 to Modula-3 IMO. As soon as you
add garbage collection, I see no reason not to go full ahead and adopt a
real functional language. I feel the same way about Java BTW.

> In any case, after reading the list of undefined C behaviors, I stand by my
> claim that it's not that hard to come up with a "safe" C by tweaking the
> underlying language design. Throwing away the cruft is not that hard. The
> fact that there are crazy corner cases that most C programmers don't know
> about just means these are crazy corner cases that aren't really a problem.

Indeed, and I think it is even possible to come up with a higher level C
like language that's more capable at bitfiddling systems stuff and is
"almost functional".

> People program in languages I hate. The fact that Perl scripts run more of
> the web than we'd like to admit just says that "clean language" design is
> more of an aesthetic pursuit that matters less than we'd all like.

That's true. But some of these more aesthetically appealing tools can make
a useful difference in some parts of the industrial world. It takes a lot
of engineering work though; little prototypes like DML that are abandoned
will have no effect, and the chances for whole cloth new languages like
Xanadu are slim.

> Let's just not look down on C. It's not that horrid of a language. I'll bet
> more than 90% of the software used in order to propagate my random usenet
> posts is written in C. Not Ada and not M3.

The interesting and scary question is not about your newsreader or web
browser, but about the software controlling an airplane, nuclear missile,
or medical machinery.

Anyways, neither C, nor Fortran, nor COBOL, nor Ada, are horrible when
judged in historical context. It takes years for research to make it into
languages. It's also true that, say, COBOL probably has a lot more in its
domain of application than some FPers may give it credit for.

-- Brian


Brian Rogoff

unread,
Apr 15, 2001, 3:07:07 PM4/15/01
to
On Sat, 14 Apr 2001, Markus Mottl wrote:
> Daniel C. Wang <danwan...@cs.princeton.edu> wrote:
> > I'll cry uncle and say that Obj.magic is an acceptable stop gap. But
> > it is not an acceptable thing to have in a language definition. (This
> > whole thing started when you suggested Ada's unchecked pragmas were
> > an acceptable thing to have in the language spec.)
[...snip...]

> As concerns the naming: I prefer "Obj.magic" to "Unsafe.cast", because
> the latter looks like an invitation to do something unsound, whereas
> the former (especially if left undocumented) looks dangerously magical.

I'd like a descendant of OCaml to be a standardized language one day,
supporting many implementations. There is probably going to be a need for
a dangerous subset of any "real" language (interfaces to C for instance)
so instead of burying our heads in the sand, I think documenting, categorizing
and containing the unsafe ops (like Modula-3 and to a lesser degree Ada) is
worthwhile.

I also think choosing good names is important, difficult, and underappreciated.
Another for instance: in OCaml "length" is used in the sequential collection
modules (List, Array, String, ...) as the function to compute the size of the
collection. It doesn't sound right to me to say that a set or map has a length,
so I would have chosen "size" or "num_elems". "Size" breaks down for graphs so
that suggests num_elems, and then num_nodes/num_edges for graphs. (PS: Yeah, I
know that sounds petty, "a foolish consistency is the hobgoblin of little
minds", but remember "I am a bear of very little brain", or Dijkstra's more
prosaic

Summarizing: as a slow-witted human being I have a very small head and
I had better learn to live with it and to respect my limitations and
give them full credit, rather than to try to ignore them, for the
latter vain effort will be punished by failure.
)

Having a really consistently thought out notation and set of names really
helps.
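
For concreteness, the naming scheme sketched above might look like this as
OCaml signatures (hypothetical module types, invented here for illustration,
not taken from any existing library):

    (* illustrative only -- num_elems for sequential collections and maps,
       num_nodes/num_edges for graphs *)
    module type SEQUENCE = sig
      type 'a t
      val num_elems : 'a t -> int
    end

    module type GRAPH = sig
      type t
      val num_nodes : t -> int
      val num_edges : t -> int
    end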

-- Brian


Albert Y. C. Lai

unread,
Apr 16, 2001, 1:03:35 AM4/16/01
to
"Daniel C. Wang" <danwan...@cs.princeton.edu> writes:

> Let's just not look down on C. It's not that horrid of a language. I'll bet
> more than 90% of the software used in order to propagate my random usenet
> posts is written in C. Not Ada and not M3. If you tallied up the amount of
> economic activity that software written in a particular language
> generates, it'll be a fight between COBOL and C for the number one spot.

Let's just not look down on bugs, buggy software, and the programmers
and the tools involved in making them. I'll bet more than 90% of all
software I use is buggy, not bug free. If you tallied up the amounts
of economic activities that different stages of the software process
generate, it'll be a fight between bug introduction and bug finding
for the number one spot. Let's not even discourage such important
economic activities.

Matthias Blume

unread,
Apr 16, 2001, 10:37:34 AM4/16/01
to
Markus Mottl wrote:
>
> Daniel C. Wang <danwan...@cs.princeton.edu> wrote:
> > I'll cry uncle and say that Obj.magic is an acceptable stop gap. But
> > it is not an acceptable thing to have in a language definition. (This
> > whole thing started when you suggested Ada's unchecked pragmas were
> > an acceptable thing to have in the language spec.)
>
> Note that Obj.magic is *not* part of the language definition.

What is the OCaml language definition? I always thought it to be identical
to the compiler... :)

> Its intended use is for very low-level things like, e.g., interfacing to
> foreign languages, where you'll arguably have to break type safety.

Perhaps. But there is a difference between drilling a tiny hole into a wall
(perhaps with the purpose of running a new water pipe into a building) and using
a wrecking ball that brings down the entire wall (or even the entire building)...

Matthias

Jacques Garrigue

unread,
Apr 16, 2001, 9:57:55 PM4/16/01
to
Matthias Blume <bl...@research.bell-labs.com> writes:

> Markus Mottl wrote:
> >
> > Note that Obj.magic is *not* part of the language definition.
>
> What is the OCaml language definition? I always thought it to be identical
> to the compiler... :)

Huh, not exactly, but you're not that far away :)
* a rather good approximation of it is in the reference manual;
this is not the formal semantics you find for SML, but at least it
tells you what a program is supposed to do, which seems current
practice for most other languages. An important property of ocaml is
that evaluation does not depend on types, and this keeps this part
simple.
* for details on the type system, you have to look into papers,
but that can be hard because it is not so clear which paper applies
to which version of ocaml.
* for the most recent experimental features, this may well be only
in the implementor's mind, and your alternatives are either asking him
or checking what the compiler does. Yet, if there is a
contradiction between the two, then this is a bug in the compiler,
not in the implementor's mind. In that sense, the compiler is
_not_ the definition.

That's what you get for using a language which contains some
experimental features :-)
By the way, weren't there some in SML/NJ also ?

> > Its intended use is for very low-level things like, e.g., interfacing to
> > foreign languages, where you'll arguably have to break type safety.
>
> Perhaps. But there is a difference between drilling a tiny hole
> into a wall (perhaps with the purpose of running a new water pipe
> into a building) and using a wrecking ball that brings down the
> entire wall (or even the entire building)...

If you are answering about Obj.magic, then I think you misunderstood
what Markus is saying. As soon as you have a foreign function
interface, you have access to an unsafe world (since most other
programming languages are unsafe). Obj.magic is mostly used in
conjunction with FFIs to move some part of the dirty work onto the ML
side. You could also do it on the other side (in fact, you are
encouraged to do so), but it might be as unsafe, and sometimes you get
clearer code by using it.

In fact something equivalent to Obj.magic is in the language
definition, but what it does is left completely unspecified
(implementation dependent):

external my_own_magic : 'a -> 'b = "%identity"
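
To see why this is left unspecified: such a cast typechecks at any pair of
types, so a misuse is caught neither at compile time nor at run time. A
hedged sketch (the printed value is machine-dependent garbage):

    (* magic.ml -- reinterpreting a value at the wrong type goes unnoticed *)
    let () =
      let s : string = "hello" in
      let n : int = Obj.magic s in   (* a heap pointer read back as an int *)
      Printf.printf "%S viewed as an int: %d\n" s n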

Cheers,

---------------------------------------------------------------------------
Jacques Garrigue Kyoto University garrigue at kurims.kyoto-u.ac.jp
<A HREF=http://wwwfun.kurims.kyoto-u.ac.jp/~garrigue/>JG</A>

George Russell

unread,
Apr 17, 2001, 4:59:59 AM4/17/01
to
"Daniel C. Wang" wrote:
[snip]

> There is something better than "trust me I know what I'm doing"
> semantics...
>
> http://www.ececs.uc.edu/~hwxi/DML/DML.html
> and
> http://www.ececs.uc.edu/~hwxi/Xanadu/Xanadu.html
I don't want to delve into these papers, so can you tell me quickly
if these can cope with the case of an N*N array a such that
for all j<=i, a[i][j]<=i? I have come across something like this
in some sparse array code.

Matthias Blume

unread,
Apr 17, 2001, 11:25:13 AM4/17/01
to
Jacques Garrigue wrote:
>
> Matthias Blume <bl...@research.bell-labs.com> writes:
> > Markus Mottl wrote:
> > >
> > > Note that Obj.magic is *not* part of the language definition.
> >
> > What is the OCaml language definition? I always thought it to be identical
> > to the compiler... :)
>
> Huh, not exactly, but you're not that far away :)
> * a rather good approximation of it is in the reference manual;
> this is not the formal semantics you find for SML, but at least it
> tells you what a program is supposed to do, which seems current
> practice for most other languages. An important property of ocaml is
> that evaluation does not depend on types, and this keeps this part
> simple.
> * for details on the type system, you have to look into papers,
> but that can be hard because it is not so clear which paper applies
> to which version of ocaml.
> * for the most recent experimental features, this may well be only
> in the implementor's mind, and your alternatives are either asking him
> or checking what the compiler does. Yet, if there is a
> contradiction between the two, then this is a bug in the compiler,
> not in the implementor's mind. In that sense, the compiler is
> _not_ the definition.

Well, yes to all of the above. Notice the smiley in my original post.

> That's what you get for using a language which contains some
> experimental features :-)
> By the way, weren't there some in SML/NJ also ?

Yes, definitely. And it sure seems to annoy the hell out of some of our users...
(We even have Unsafe.cast. Dan's point, however, was that Unsafe.cast ought
not to be in the language definition. Nowadays CM can (could) track all sources
that explicitly or implicitly depend on unsafe features so that their impact
on safety can be contained.)

> > > Its intended use is for very low-level things like, e.g., interfacing to
> > > foreign languages, where you'll arguably have to break type safety.
> >
> > Perhaps. But there is a difference between drilling a tiny hole
> > into a wall (perhaps with the purpose of running a new water pipe
> > into a building) and using a wrecking ball that brings down the
> > entire wall (or even the entire building)...
>
> If you are answering about Obj.magic, then I think you misunderstood
> what Markus is saying. As soon as you have a foreign function
> interface, you have access to an unsafe world (since most other
> programming languages are unsafe).

No, I don't think I misunderstood. I do know that access to most foreign
programming languages is unsafe. What I was trying to say is that it is
perhaps not _that_ unsafe. Yes, you have to get through the solid barriers
of safety (hence "drilling the hole"), but you don't want to tear the
whole building down while doing so.

This might sound awfully vague right now. I hope I can tell you much more
about it Real Soon (tm). :)

> In fact something equivalent to Obj.magic is in the language
> definition, but what it does is left completely unspecified
> (implementation dependent):
>
> external my_own_magic : 'a -> 'b = "%identity"

Well, the language definition is precisely where something like the above (which
is just like SML/NJ's Unsafe.cast) does not belong. At least in Dan's view,
if I understand him right.

Greetings,
Matthias

Markus Mottl

unread,
Apr 17, 2001, 11:58:00 AM4/17/01
to
Matthias Blume <bl...@research.bell-labs.com> wrote:
>> In fact something equivalent to Obj.magic is in the language
>> definition, but what it does is left completely unspecified
>> (implementation dependent):
>>
>> external my_own_magic : 'a -> 'b = "%identity"

> Well, the language definition is precisely where something like the
> above (which is just like SML/NJ's Unsafe.cast) does not belong.
> At least in Dan's view, if I understand him right.

I agree that this shouldn't be part of a language definition as such but
of the compiler documentation, i.e. how to integrate new basic functions,
types, etc. It's too representation/implementation dependent to be useful
in a language definition.

Daniel C. Wang

unread,
Apr 17, 2001, 1:49:00 PM4/17/01
to

George Russell <g...@tzi.de> writes:
{stuff deleted}

> I don't want to delve into these papers, so can you tell me quickly
> if these can cope with the case of an N*N array a such that
> for all j<=i, a[i][j]<=i? I have come across something like this
> in some sparse array code.


Create a new abstract type

type {n:nat} {m:nat} my_matrix(n)(m)

and constrain the update and subscript functions with the invariant you want
to maintain.

val new : {n:nat}{m:nat} nat(n) -> nat(m) -> my_matrix(n)(m)

val update : {n:nat}{m:nat}{i:nat}{j:nat}{k:nat | k <= i}
my_matrix(n)(m) -> nat(i) -> nat(j) -> nat(k) -> unit

val sub : {n:nat}{m:nat}{i:nat}{j:nat}{k:nat | k <= i}
my_matrix(n)(m) -> nat(i) -> nat(j) -> nat(k)


The information will get checked and propagated by the type system.

Daniel C. Wang

unread,
Apr 17, 2001, 2:17:56 PM4/17/01
to

Matthias Blume <bl...@research.bell-labs.com> writes:
{stuff deleted}

> At least in Dan's view if I understand him right.

My view is that no safe language designed today should include an option to
turn off array bounds checking. Arguments of the form "we need to give the
programmer an escape hatch", for the particular case of array bounds
checking are IMNHO starting to sound unjustifiable, because of things like
DML.

Anyone who has programmed in a language with H-M polymorphic type inference
just cringes at the notion of (void*) in C and Java's Object type and
associated casts. C++ template gurus probably hate this sort of thing
too. Things like (void*) and Java's downcast are there to "give the
programmers an escape hatch". It's clear that H-M style type inference makes
these arguments look silly.
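
(For readers who haven't seen it: in OCaml the compiler infers the most
general type with no annotations and no casts --

    (* inferred as val length : 'a list -> int, usable at every element
       type; the C analogue would traffic in void* and unchecked casts *)
    let rec length = function
      | [] -> 0
      | _ :: tl -> 1 + length tl

-- which delivers exactly the generality (void*) pretends to provide, minus
the unsoundness.)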

In my extreme world view, there is also no reason to have an unsafe FFI. We
just throw Proof-Carrying Code at the problem. Of course, it's not clear that
this technique is going to work in practice; there's lots of engineering
to do. A link to a summary of bleeding-edge PCC work:
(http://www.cs.princeton.edu/~appel/papers/fpcc.pdf)

To sum up: in my extreme world view, the only "escape hatch" a programmer
should ever be given is access to a theorem prover. I think this extremist
viewpoint is the right research direction to go in. I also think good
programmers would demand this too, but they just haven't been provided the
right set of tools that make this extremist viewpoint practical today.

"Unsafe.cast" is probably a bad name... "IPrayThisIsNotAnUnsafe.cast" is a
better name. :)


Daniel C. Wang

unread,
Apr 17, 2001, 2:25:19 PM4/17/01
to

"Daniel C. Wang" <danwan...@cs.princeton.edu> writes:

> George Russell <g...@tzi.de> writes:
> {stuff deleted}
>
> > I don't want to delve into these papers, so can you tell me quickly
> > if these can cope with the case of an N*N array a such that
> > for all j<=i, a[i][j]<=i? I have come across something like this
> > in some sparse array code.
>
>
> Create a new abstract type
>
> type {n:nat} {m:nat} my_matrix(n)(m)
>
> and constrain the update and subscript functions with the invariant you want
> to maintain.

Whoops, bug in the spec (I left out the j <= i constraint):

val new : {n:nat}{m:nat} nat(n) -> nat(m) -> my_matrix(n)(m)

val update : {n:nat}{m:nat}{i:nat}{j:nat | j <= i}{k:nat | k <= i}
my_matrix(n)(m) -> nat(i) -> nat(j) -> nat(k) -> unit

val sub : {n:nat}{m:nat}{i:nat}{j:nat | j <= i}{k:nat | k <= i}
my_matrix(n)(m) -> nat(i) -> nat(j) -> nat(k)
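
For readers without a DML checker at hand, here is a plain-OCaml
approximation of the same interface, with the invariants enforced by runtime
checks behind an abstract type instead of discharged statically (a sketch
only; the names are made up):

    (* matrix.ml -- runtime-checked analogue of the DML spec above;
       DML would instead reject violating call sites at compile time *)
    module Matrix : sig
      type t
      val create : int -> int -> t
      val update : t -> int -> int -> int -> unit  (* update m i j k *)
      val sub : t -> int -> int -> int
    end = struct
      type t = int array array
      let create n m = Array.make_matrix n m 0
      let update t i j k =
        if j > i || k > i then invalid_arg "Matrix.update: needs j <= i, k <= i";
        t.(i).(j) <- k               (* bounds checked by OCaml itself *)
      let sub t i j =
        if j > i then invalid_arg "Matrix.sub: needs j <= i";
        t.(i).(j)                    (* stored value satisfies k <= i *)
    end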

George Russell

unread,
Apr 17, 2001, 3:21:47 PM4/17/01
to
"Daniel C. Wang" wrote:
[snip]
> val new : {n:nat}{m:nat} nat(n) -> nat(m) -> my_matrix(n)(m)
>
> val update : {n:nat}{m:nat}{i:nat}{j:nat | j <= i}{k:nat | k <= i}
> my_matrix(n)(m) -> nat(i) -> nat(j) -> nat(k) -> unit
>
> val sub : {n:nat}{m:nat}{i:nat}{j:nat | j <= i}{k:nat | k <= i}
> my_matrix(n)(m) -> nat(i) -> nat(j) -> nat(k)
>
>
OK, I admit it, I'm impressed. Perhaps it would even be possible to
get the whole sparse matrix code I saw through DML. I presume that
general boolean combinations of comparisons between variables are
allowed, which might mean you can indeed cope with the style of
optimised FORTRAN matrix array code (which does things like
changing the use of parts of an array as you go through the
main loop, to avoid using extra space . . .)

Martin Berger

unread,
Apr 18, 2001, 8:59:37 AM4/18/01
to

> To sum up: in my extreme world view, the only "escape hatch" a programmer
> should ever be given is access to a theorem prover. I think this extremist
> viewpoint is the right research direction to go in. I also think good
> programmers would demand this too, but they just haven't been provided the
> right set of tools that make this extremist viewpoint practical today.

finally someone who shares my viewpoint. i have been saying just that
for quite a while, usually to much amusement. how long do you reckon it'll
take before debugging-oriented programming will be superseded by programming
with theorem proving? i think there's at least 20 years of research and
development ahead -- martin

James Hague

unread,
Apr 18, 2001, 10:11:20 AM4/18/01
to
Martin Berger wrote:
>
> finally someone who shares my viewpoint. i have been saying just that
> for quite a while, usually to much amusement. how long do you reckon it'll
> take before debugging-oriented programming will be superseded by
> programming
> with theorem proving? i think there's at least 20 years of research and
> development ahead

You'll find the same viewpoint in any book written by Dijkstra in the 1970s
(e.g. _A Discipline of Programming_). While it has much appeal, it is
difficult to just sit on the side and wait for another 20 years. I own a
book by Cohen, _Programming in the 1990s_ (published in 1990), which takes
the stance that the 90s will be dominated by a theorem-proving style of
programming. Oops :)

James



Martin Berger

unread,
Apr 18, 2001, 10:33:58 AM4/18/01
to

> You'll find the same viewpoint in any book written by Dijkstra in the 1970s
> (e.g. _A Discipline of Programming_). While it has much appeal, it is
> difficult to just sit on the side and wait for another 20 years. I own a
> book by Cohen, _Programming in the 1990s_ (published in 1990), which takes
> the stance that the 90s will be dominated by a theorem-proving style of
> programming. Oops :)


"it is difficult to make predictions, especially about the future"

computing is full of broken promises. i don't actually expect programs
to ever be *fully* verified, except in exceptional circumstances.

i think it is crucial for advanced programming languages to accommodate
degrees of verification. during prototyping you may just want HM-style
typechecking for quick turnaround. later you might want to add complex
dependent types + proof hints for certain parts of the program. etc
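
one way to picture an intermediate degree in OCaml today: the type is
inferred for free, and a runtime assertion documents an invariant the type
system can't yet express (a toy sketch, not a proposal):

    (* inferred as val sub_checked : 'a array -> int -> 'a *)
    let sub_checked a i =
      assert (0 <= i && i < Array.length a);  (* runtime-checked invariant *)
      Array.unsafe_get a i                    (* safe given the assertion *)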

martin

Brian Rogoff

unread,
Apr 18, 2001, 12:17:47 PM4/18/01
to
On Wed, 18 Apr 2001, James Hague wrote:
> Martin Berger wrote:
> > finally someone who shares my viewpoint. i have been saying just that
> > for quite a while, usually to much amusement. how long do you reckon it'll
> > take before debugging-oriented programming will be superseded by
> > programming
> > with theorem proving? i think there's at least 20 years of research and
> > development ahead

I think a large part of the problem is marketing. Look at DML and DeCaml.
These are being held up as a solution to the issues surrounding the
elimination of array bounds checking. OK, there is interest. Now where
can I play around with this system? Oh, the DML and DeCaml stuff is just a
prototype for a thesis, and I have to use Xanadu instead... Why not fold
this stuff into SML/NJ or one of the other SMLs, or OCaml, or GHC? Or, if
the author wants to use them in imperative languages, why not add them to
(Generic :) Java, or GNU Fortran, or GNAT?

To be fair, some of the other work Dan Wang pointed to is being done on
MLTON and SMLNJ. That has a bit better chance of being widely used, though
I don't have any idea how big those user communities are. I should also
point out (since I provoked Dan by suggesting Ada's pragma Suppress for
ML) that OCaml has a static analysis tool for finding potentially uncaught
exceptions, so even in real world languages you can find such tools :-).

> You'll find the same viewpoint in any book written by Dijkstra in the 1970s
> (e.g. _A Discipline of Programming_). While it has much appeal, it is
> difficult to just sit on the side and wait for another 20 years. I own a
> book by Cohen, _Programming in the 1990s_ (published in 1990), which takes
> the stance that the 90s will be dominated by a theorem-proving style of
> programming. Oops :)

Hey, I have that book too! Oops is right. Still, it's a nice book, and I
think Dijkstra will be viewed favorably by history. He's very much ahead
of the state of practice.

-- Brian


Daniel C. Wang

unread,
Apr 18, 2001, 12:28:11 PM4/18/01
to

"James Hague" <james...@volition-inc.com> writes:

> Martin Berger wrote:
> >
> > finally someone who shares my viewpoint. i have been saying just that
> > for quite a while, usually to much amusement. how long do you reckon it'll
> > take before debugging-oriented programming will be superseded by
> > programming
> > with theorem proving? i think there's at least 20 years of research and
> > development ahead
>
> You'll find the same viewpoint in any book written by Dijkstra in the 1970s
> (e.g. _A Discipline of Programming_). While it has much appeal, it is
> difficult to just sit on the side and wait for another 20 years. I own a
> book by Cohen, _Programming in the 1990s_ (published in 1990), which takes
> the stance that the 90s will be dominated by a theorem-proving style of
> programming. Oops :)
>

Well, I don't expect programmers to use the theorem prover for day-to-day
hacking. Just when they want to escape out of the type system or prove a
non-trivial property.

George Russell

unread,
Apr 18, 2001, 12:56:04 PM4/18/01
to
Martin Berger wrote:
[snip]

> finally someone who shares my viewpoint. i have been saying just that
> for quite a while, usually to much amusement. how long do you reckon it'll
> take before debugging-oriented programming will be superseded by programming
> with theorem proving? i think there's at least 20 years of research and
> development ahead -- martin
Which probably means much longer. Whenever computer scientists talk about
"10 or 20 years", for example machines that pass the Turing Test, computer vision,
computer chess players at grandmaster level (this one has at least been done,
only about 20 years late), you can be sure that the promised development is going
to take at least 30 years and probably a great deal more.

If it weren't that the university library here probably doesn't go back that
far, I would wager that someone wrote about the future widespread use of automatic
verification of computer programs by 1955.

Daniel C. Wang

unread,
Apr 18, 2001, 1:06:45 PM4/18/01
to

Brian Rogoff <b...@shell5.ba.best.com> writes:

> On Wed, 18 Apr 2001, James Hague wrote:
> > Martin Berger wrote:
> > > finally someone who shares my viewpoint. i have been saying just that
> > > for quite a while, usually to much amusement. how long do you reckon it'll
> > > take before debugging-oriented programming will be superseded by
> > > programming
> > > with theorem proving? i think there's at least 20 years of research and
> > > development ahead
>
> I think a large part of the problem is marketing.

Umm, it's funding. If I could get as much VC money for pushing out advanced
programming languages as I could for selling dog biscuits over the
Internet, we'd see a lot more programming languages with lots of new
features.

I mean, who can really make money by designing better programming languages?
Sun's made money on Java indirectly, but the language itself is probably a
money loser. I guess Tcl is the only really good example, but even there
no one made money on the language itself, only on related consulting.

I think language design is akin to the design and careful study of
concrete. I know a few civil engineers who can get really excited about
concrete. I'm sure there is a small dedicated band of people developing newer,
faster-drying, and cheaper mixes. The foundation of your house might be
built from one of these mixes. The world has been revolutionized by the
invention of concrete, but hey, it's still just concrete to most
people. Marketing it isn't going to help. :)

Daniel C. Wang

unread,
Apr 18, 2001, 1:46:35 PM4/18/01
to

George Russell <g...@tzi.de> writes:
{stuff deleted}
> If it weren't that the university library here probably doesn't go back that
> far, I would wager that someone wrote about the future widespread use of automatic
> verification of computer programs by 1955.

What do you call a static type checker? H-M type inference is clearly doing
something akin to theorem proving too. Not to mention all the work in model
checking that is used for both hardware and software systems.

George Russell

unread,
Apr 18, 2001, 1:57:39 PM4/18/01
to
H-M type inference is certainly a good thing, but it is still pretty primitive
compared with the sort of automatic invariant checking that has been suggested
in the literature for donkey's years. And how widespread is H-M type inference
anyway? Although it must now be more than 20 years old, it is surely still
mainly used by researchers rather than programmers.

Brian Rogoff

unread,
Apr 18, 2001, 2:14:30 PM4/18/01
to
On 18 Apr 2001, Daniel C. Wang wrote:
> Brian Rogoff <b...@shell5.ba.best.com> writes:
>
> > On Wed, 18 Apr 2001, James Hague wrote:
> > > Martin Berger wrote:
> > > > finally someone who shares my viewpoint. i have been saying just that
> > > > for quite a while, usually to much amusement. how long do you reckon it'll
> > > > take before debugging-oriented programming will be superseded by
> > > > programming
> > > > with theorem proving? i think there's at least 20 years of research and
> > > > development ahead
> >
> > I think a large part of the problem is marketing.
>
> Umm, it's funding. If I could get as much VC money for pushing out advanced
> programming languages as I could for selling dog biscuits over the
> Internet, we'd see a lot more programming languages with lots of new
> features.

No, things don't work that way. Perl, Python, and C++ caught on without
big budgets behind them (C++ is arguable, but I have no reason to think
Stroustrup lied on this topic). Ruby is somewhat popular in Japan.

By marketing, I mean persuading programmers to use the tools. Don't get
hung up on VC money, because you aren't going to get it, and language
designers don't get rich. You should consider yourself successful if your
ideas change the way programmers work, right?

Making a brand new language like Xanadu is a low probability of success
effort IMO. It's hard enough for a new language to get popular, so why not
just dovetail on to decent languages with at least a modicum of popularity?

> I mean, who can really make money by designing better programming languages?

No one. You should be thinking of yourself as an evangelist, not as
someone who is going to make a lot of money. Or, at best, as someone who
can sell lecture time once FPs have taken over.

> Sun's made money on Java indirectly, but the language itself is probably a
> money loser. I guess Tcl is the only really good example, but even there
> no one made money on the language itself, only on related consulting.

> I think language design is akin to the design and careful study of
> concrete.

I think the analogy is very weak. Civil engineers often test their
theories in the real world. A lot of programming language design has a
decidedly academic flavor, with researchers pronouncing problems solved
after a few papers have been published and no long term testing going on.
This is what some of the Erlang guys (maybe just Uffe?) were complaining
about before.

> I know a few civil engineers who can get really excited about
> concrete. I'm sure there is a small dedicated band of people developing newer,
> faster-drying, and cheaper mixes. The foundation of your house might be
> built from one of these mixes. The world has been revolutionized by the
> invention of concrete, but hey, it's still just concrete to most
> people. Marketing it isn't going to help. :)

See that Dijkstra quote about analogies. It seems you SML guys have been
watching too much "home improvement" TV :-)

-- Brian


Daniel C. Wang

unread,
Apr 18, 2001, 2:27:29 PM4/18/01
to

George Russell <g...@tzi.de> writes:
{stuff deleted}
> H-M type inference is certainly a good thing, but it is still pretty primitive
> compared with the sort of automatic invariant checking that has been suggested
> in the literature for donkey's years. And how widespread is H-M type inference
> anyway? Although it must now be more than 20 years old, it is surely still
> mainly used by researchers rather than programmers.

If you look at DML (especially Xanadu), it looks a lot like Hoare logic. The
fact that H-M type inference is more than 20 years old yet not widely
adopted reflects the slow rate at which any language technology is adopted.
It's not because the technology is "not ready for prime time". Why hasn't
Sun adopted GJ or any other support for generics in Java?
(http://www.cs.bell-labs.com/who/wadler/pizza/gj/)

From a technical standpoint, the major hurdle left is an effective way to
deal with state. Things like Clean's uniqueness typing, monads, and linear
type systems will probably present usable solutions in a few years.
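
For the curious: the monadic approach mentioned above amounts to threading
the state through the types explicitly. A minimal OCaml sketch of the idea
(a toy, not any particular library):

    (* a stateful computation maps a state to a result and a new state *)
    type ('s, 'a) st = 's -> 'a * 's

    let return a : ('s, 'a) st = fun s -> (a, s)
    let bind (m : ('s, 'a) st) (f : 'a -> ('s, 'b) st) : ('s, 'b) st =
      fun s -> let (a, s') = m s in f a s'
    let get : ('s, 's) st = fun s -> (s, s)
    let put s' : ('s, unit) st = fun _ -> ((), s')

    (* example: increment an integer state *)
    let incr = bind get (fun n -> put (n + 1))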

Anyway, I think it would be fair to say that it's not that the researchers
overpromised; it's that the industry has been slow to adopt, mainly
because of business and software engineering issues. It is pretty easy to
build a system where you can verify many invariants automatically. It's not
so easy to convince people that using such a system would be a good thing.

George Russell

unread,
Apr 18, 2001, 2:48:16 PM4/18/01
to
"Daniel C. Wang" wrote:
[snip]
> From a technical standpoint, the major hurdle left is an effective way to
> deal with state. Things like Clean's uniqueness typing, monads, and linear
> type systems will probably present usable solutions in a few years.
[snip]
Well we'll see, that's all. Personally I doubt it, and you can quote me
in 2010 when I'm proved wrong. It just seems to me that the types you
will have to declare for any non-trivial state manipulation are likely to
be far too complicated for normal use.

Personally I'm waiting for quantum computing, since then maybe we'll
be able to effectively run 10^(large number) parallel copies of Isabelle
to produce all the intermediate proof obligations. But I'm not holding
my breath . . .

Daniel C. Wang

unread,
Apr 18, 2001, 2:59:07 PM4/18/01
to

George Russell <g...@tzi.de> writes:

> "Daniel C. Wang" wrote:
> [snip]
> > From a technical standpoint, the major hurdle left is an effective way to
> > deal with state. Things like Clean's uniqueness typing, monads, and linear
> > type systems will probably present usable solutions in a few years.
> [snip]
> Well we'll see, that's all. Personally I doubt it, and you can quote me
> in 2010 when I'm proved wrong. It just seems to me that the types you
> will have to declare for any non-trivial state manipulation are likely to
> be far too complicated for normal use.

hopefully for 99% of the cases you don't care. It's only when you want to
use "Unsafe.cast" that you'll need to write things down....
Like, you said we can just wait and see.

Daniel C. Wang

unread,
Apr 18, 2001, 2:55:09 PM4/18/01
to

Brian Rogoff <b...@shell5.ba.best.com> writes:
{stuff deleted}

> Making a brand new language like Xanadu is a low probability of success
> effort IMO. It's hard enough for a new language to get popular, so why not
> just dovetail on to decent languages with at least a modicum of popularity?

In less than 5 years everyone will be programming in C#. That's at least
what Microsoft hopes. Sun has achieved a lot by throwing money at
Java. Do you think Ada would have existed if not for the DOD, and
multi-million dollar defense contracts that required the use of Ada?

Marketing and tools require money. Academics usually can't get the
manpower or funding from traditional granting agencies to do things like
write documentation or build debuggers. It just isn't "research". If you
want research to escape into the wild there needs to be money behind it.

The economics of language design don't encourage long-term direct
investment, or any investment at all. You can get lucky and get "pro bono"
resources by hitting the right niche (like Perl and many other scripting
languages). But without a change in the economics you just aren't going to
see high-quality research ideas turned into high-quality programming
environments.

High-quality programming environments based on technically advanced
programming concepts do not exist, not because the concepts are "not ready
for prime time". It's just that there has been no one willing to invest the
energy to do so, because the economics are bad. It's easier to get money to
sell dog biscuits on the Internet than to build tools to make programming a
more reliable task. I'm complaining about the fact that most PL researchers
have to literally beg, borrow, and steal resources to bring ideas to
market. This is not the case in other areas. Marketing is not cheap, and
there simply isn't any money for it.

{stuff deleted}

Brian Rogoff

unread,
Apr 18, 2001, 4:08:21 PM4/18/01
to
On 18 Apr 2001, Daniel C. Wang wrote:
> Brian Rogoff <b...@shell5.ba.best.com> writes:
> {stuff deleted}
> > Making a brand new language like Xanadu is a low probability of success
> > effort IMO. It's hard enough for a new language to get popular, so why not
> > just dovetail on to decent languages with at least a modicum of popularity?
>
> In less than 5 years everyone will be programming in C#. That's at least
> what Microsoft hopes. Sun has achieved a lot by throwing money at
> Java. Do you think Ada would have existed if not for the DOD, and
> multi-million dollar defense contracts that required the use of Ada?

No, of course not. But given that C#, Java, C++, Ada, Fortran, etc., exist
already, and have user communities, why not integrate provers for
array bounds checking into those languages, rather than start afresh with
a brand new language (Xanadu)? There are even companies which build
advanced static analyzers into Ada subsets which are used in safety
critical domains, see http://www.praxis-cs.co.uk/. Maybe they could be
convinced to license that technology?

Or even better (this is c.l.f. :) integrate that technology into one of
the widely used FPLs. You know which one I'd suggest, but even SML/NJ
would be a reasonable choice. In any case, that is a perfect example of
why promising technology doesn't make it from research to practice.



> The economics of language design don't encourage long-term direct
> investment, or any investment at all. You can get lucky and get "pro bono"
> resources by hitting the right niche (like Perl and many other scripting
> languages).

Once a language is successful in a niche, its supporters will try to use
it for everything. Perl got popular before the web took off. In fact Perl
has some very valuable lessons, since arguably better languages, like
Icon, were utter failures in industry compared to Perl not because they
were uninteresting languages (I find Icon far more interesting and
powerful for string processing) but because Perl had a much richer system
library: for example until recently you couldn't write a directory
traversal program in Icon without doing an exec of whatever the OS call
is.

Once ML (inclusive ;-) is in a few niches, it will also be pushed elsewhere.
I see OCaml as having a bit of momentum, very tiny now to be sure, but the
user community is growing steadily.

> But without a change in the economics you just aren't going to
> see high-quality research ideas turned into high-quality programming
> environments.

I've seen some really nice stuff from INRIA and the former Rice PLT group.
Sure, it could use more work, but so could all of the commercial systems I
see.

> High-quality programming environments based on technically advanced
> programming concepts do not exist, not because the concepts are "not ready
> for prime time".

The concepts are not ready for prime-time. Few commercial programmers are
comfortable with them, so it's hard to find people who can use these tools
effectively. That's part of the definition of being "ready for prime
time".

> It's just that there has been no one willing to invest the energy to do so,
> because the economics are bad.

You (language designers) have to build better tools, and make them
accessible to a wider range of programmers. You complain about VC and
economics. My complaint is that academia has a reward system which is
not tuned to produce working tools but just simple prototypes. There is
blame to go around.

> It's easier to get money to sell dog biscuits on the Internet than to build
> tools to make programming a more reliable task.

Where have you been hiding? That bubble has burst. Sand Hill Road isn't as
crowded as it used to be, in fact my commute is becoming much more
bearable. :-)

> I'm complaining about the fact that most PL researchers have
> to literally beg, borrow, and steal resources to bring ideas to
> market.

Once you have your foot in the door, companies are willing to write
checks. We need to be convinced that the software you produce will
be around for a while. So, while in theory, you convinced me that DML is
a good approach, a quick reading of Xi's pages convinced me that until
someone else picks it up it ain't gonna happen.

> This is not the case in other areas. Marketing is not cheap, and
> there simply isn't any money for it.

Marketing is extremely cheap at this stage. Torvalds just released his
stuff on the net as a Minix replacement without a whole lot of fanfare.

Anyways, I'm not paid by INRIA, and I do a fair bit of OCaml marketing.
At some stage in a language's acceptance, it will get advocates who are
not "original users" and I believe we're seeing that now in the OCaml
world.

-- Brian

