
Glasgow haskell vs. Lispworks


Rene de Visser

unread,
Aug 10, 2007, 3:09:01 PM8/10/07
to
Hello,

I have used Common Lisp for about 4 years, and was persuaded that it was the
best available programming language (well at least the one I liked the most;
somehow I never caught on to Smalltalk).

I read a posting (maybe even in this newsgroup) from someone who had used
Common Lisp for a large number of years, and had learned Haskell. His
description of the good features of Haskell persuaded me that it was worth
learning, at least to improve my Common Lisp skills.

After about 3 months of struggling to learn Haskell (I found Common Lisp
easy to learn), I finally started to get the hang of it.

After around 2 years of (Glasgow) Haskell I now prefer it as a language to
Common Lisp. Glasgow Haskell is still improving.

However, Lispworks is (IMHO) about 10 years ahead of GHC (the Glasgow Haskell
Compiler) in functionality (e.g. the next release of GHC will for the first
time include a debugger), so I still sometimes use Lispworks. Also there are
some very nice libraries in Common Lisp, one of which alone might take
several man-years to port to Haskell.

-- What I prefer about Haskell.

Maintaining large programs is easier. You can obtain the type signature of
any function, and the type signature of a Haskell function pretty well
describes what a Haskell function does. From this I can also understand
better how a function works, and how to change it. When I make changes, the
type checker shows me where I need to make further changes, and where I have
made a mistake. On the other hand a unit test only tells me that something
is wrong, but not where or why.
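
As a small, made-up sketch of what I mean (lookupUser, UserId and UserDB are
hypothetical names, and Data.Map is just a convenient stand-in): the
signature alone already says the lookup can fail and cannot do IO.

import qualified Data.Map as Map

type UserId = Int
type UserDB = Map.Map UserId String

-- The signature says: may fail (Maybe), no IO, no mutation.
lookupUser :: UserId -> UserDB -> Maybe String
lookupUser = Map.lookup

main :: IO ()
main = do
  let db = Map.fromList [(1, "Rene"), (2, "Rainer")]
  print (lookupUser 1 db)  -- Just "Rene"
  print (lookupUser 3 db)  -- Nothing

If UserDB later changes its representation, every caller that no longer
type-checks is reported by the compiler rather than by a failing test.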

Libraries can often be understood from their interface.

Different libraries can be used together in combination. Too often Common
Lisp libraries have their own version of defun or other standard functions
that are not compatible with each other (and very hard to integrate).

No ugly macro composition problems, or finding that I need to pass a macro
to a HOF as a parameter.

I find Haskell code prettier and easier to read (I find Common Lisp easy to
read as well, however not as good as Haskell).
I find large Haskell programs of similar complexity (IMHO) easier to
understand than Common Lisp ones.

Haskell does "automatic state management" (an abuse of the term, but I find
that immutably, and functional purity does make it easier for me to reason
about programs). The next step after garbage collection, but as an C
programmer will tell you: garbage collection is evil, it restricts what the
programmer can do with manual memory management.

Equals works correctly in Haskell. No need to decide between eq, eql,
equal, equalp and other data-structure-specific versions.
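
For example (only a sketch): a single overloaded (==), resolved per type.

main :: IO ()
main = do
  print ("foo" == "foo")              -- strings compared by contents: True
  print ([1,2,3] == [1,2,3 :: Int])   -- lists compared element-wise: True
  print ((1, 'a') == (1 :: Int, 'b')) -- pairs compared component-wise: False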

I have less choices in Haskell. For me this simplifies programming. Code
sometimes follows directly from the types.
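
A standard illustration of that (nothing from my own code): with a fully
polymorphic signature there is essentially only one sensible implementation.

-- The type (b -> c) -> (a -> b) -> a -> c more or less forces this body.
compose :: (b -> c) -> (a -> b) -> a -> c
compose f g x = f (g x)

main :: IO ()
main = print (compose (+ 1) (* 2) 10 :: Int)  -- (10 * 2) + 1 = 21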

Haskell's type system gives me the option of designing a program in terms of
its types. I find this a useful option.

-- What I prefer about Lispworks

Completely incremental development. No waiting for a file to compile.
Rebuild only once a week to make sure rebuilding from scratch still works.

Great integrated development environment. Lispworks is well stress tested
over the years. Problems are often known in advance. And you know others
have found solutions to them.

Good range of libraries available for Common Lisp.

Macros are easier in Common Lisp (I don't find Template Haskell as
friendly).

Embedding DSLs in Common Lisp is easier and often blends in better
(visually/syntactically).

Switching data structures in a running application is much easier in
Lispworks (note that this is also possible in a GHC application; the libraries
are, however, not really there yet: generate marshalling functions from the
old to the new structures, and then dynamically import the new modules and
perform the marshalling).

-- Why I can't add the features of Haskell to Common Lisp.

I don't have time to create the complete GHC system as a DSL in Common Lisp.
This is probably around 50 man years work.

All the Common Lisp libraries would still be in normal Common Lisp, so this
really wouldn't help anyway.

Summary.

Haskell and Common Lisp are both good languages.

Rene


Rainer Joswig

unread,
Aug 10, 2007, 8:10:00 PM8/10/07
to
In article <f9id4j$9ut$1...@online.de>,

"Rene de Visser" <Rene_de...@hotmail.com> wrote:

> -- What I prefer about Haskell.
>
> Maintaining large programs is easier. You can obtain the type signature of
> any function, and the type signature of a Haskell function pretty well
> describes what a Haskell function does. From this I can also understand
> better how a function works, and how to change it. When I make changes, the
> type checker shows me where I need to make further changes, and where I have
> made a mistake. On the other hand a unit test only tells me that something
> is wrong, but not where or why.

Even if you have static type checking, you will need unit tests.
There is no way around it.

A 'unit test' tests the smallest testable unit of code. So
if a unit test fails, the error is right there in the unit.
By definition.

Sure, a unit test can't tell you why you added two numbers
instead of multiplying them. The type checker will also
not tell you why you wrote a + b instead of a * b.
But the unit test can detect it.
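
A minimal sketch of that point, in Haskell for the sake of the comparison
(the 'area' function is made up): both bodies type-check, only the test
tells them apart.

area :: Int -> Int -> Int
area w h = w * h   -- writing w + h here would type-check just as well

main :: IO ()
main =
  if area 3 4 == 12
    then putStrLn "ok"
    else error "area 3 4: expected 12"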

...

> Different libraries can be used together in combination. Too often Common
> Lisp libraries have their own version of defun or other standard functions
> that are not compatible with each other (and very hard to integrate).

I don't understand the issue here.



> No ugly macro composition problems, or finding that I need to pass a macro
> to a HOF as a parameter.

Since a macro is not a function, you cannot pass it. By definition.

If you have a macro foo and you want to pass some code, you
write (lambda (bar) (foo bar)) and pass that.

> I find Haskell code prettier and easier to read (I find Common Lisp easy to
> read as well, however not as good as Haskell).
> I find large Haskell programs of similar complexity (IMHO) easier to
> understand than Common Lisp ones.

I have had the opposite experience. Also, reading other people's Haskell
source code, I see coding that is so much behind Lisp coding style that it
is not even funny. Just yesterday I compared the coding style for
emacs-style editors in Haskell, OCaml, Emacs Lisp and Common Lisp. Common
Lisp and Emacs Lisp had much better code. OCaml was kind of readable. The
Haskell code was obfuscated by design. In Lisp you can get very declarative
code. The Haskell and OCaml code, though it was functional, was more
low-level.

> Haskell does "automatic state management" (an abuse of the term, but I find
> that immutably, and functional purity does make it easier for me to reason
> about programs). The next step after garbage collection, but as an C
> programmer will tell you: garbage collection is evil, it restricts what the
> programmer can do with manual memory management.

No, C programmers don't like garbage collectors, because they
come with a cost they are not willing to pay.

> Equals works correctly in Haskell. No need to decide between eq, eql,
> equal, equalp and other data-structure-specific versions.

Equal can't work 'correctly'. You might want to read Kent Pitman's
article about it.

http://www.nhplace.com/kent/PS/EQUAL.html

> I have less choices in Haskell. For me this simplifies programming. Code
> sometimes follows directly from the types.

Yes, this can be good.



> Haskell's type system gives me the option of designing a program in terms of
> its types. I find this a useful option.
>
> -- What I prefer about Lispworks
>
> Completely incremental development. No waiting for a file to compile.
> Rebuild only once a week to make sure rebuilding from scratch still works.
>
> Great integrated development environment. Lispworks is well stress tested
> over the years. Problems are often known in advance. And you know others
> have found solutions to them.
>
> Good range of libraries available for common lisp.
>
> Macros are easier in common lisp (I don't find template haskell as
> friendly).
>
> Embedding DSL's in common lisp is easier and often blends in better
> (visually/syntactically).
>
> Switching data structures in a running application is much easier in
> Lispworks (note that this is also possible in a GHC application; the libraries
> are, however, not really there yet: generate marshalling functions from the
> old to the new structures, and then dynamically import the new modules and
> perform the marshalling).
>
> -- Why I can't add the features of Haskell to Common Lisp.
>
> I don't have time to create the complete GHC system as a DSL in Common Lisp.
> This is probably around 50 man years work.

It would also not make much sense. But there are/were Haskell-like
languages on top of Lisp. For example there was Yale Haskell
implemented in CL. No longer maintained.



> All the Common Lisp libraries would still be in normal Common Lisp, so this
> really wouldn't help anyway.
>
> Summary.
>
> Haskell and Common Lisp are both good languages.

No doubt.

> Rene

David Golden

unread,
Aug 10, 2007, 9:29:27 PM8/10/07
to
Rene de Visser wrote:

> -- Why I can't add the features of Haskell to Common Lisp.
>
> I don't have time to create the complete GHC system as a DSL in Common
> Lisp. This is probably around 50 man years work.


FWIW, one of the early Haskell implementations, Yale Haskell,
was _implemented in_ Common Lisp. Okay, it is pre-Haskell'98,
and not developed anymore, but still, it exists [1]. And it still
loads under CMUCL 19d after minor adjustments [2]. It worked under
Lispworks too, apparently, and probably still would with similarly minor
adjustments. It can merrily call out to Lisp functions as set
out in the included documentation.

Building with cmucl on linux/x86
Download src_205.tgz tarball from [1].
Download, review and apply patch at [2].
In the root directory of the patched distribution, in a csh
session, set $HASKELL to point to the root directory of the
distribution, adjust the haskell-setup and haskell-development
csh scripts appropriately for your locations, and source
haskell-development to set up for the build.

See com/cmu/README

Basically do:
> com/cmu/compile
> com/cmu/build-prelude
> com/cmu/savesys

If all went well, you'll now have a cmucl image file that
has a haskell toplevel:

> cmucl -core bin/new-cmu-haskell.core

Yale Haskell Y2.0.5 CMU Common Lisp version 19d (19D) on X86
Type :? for help.
Main> :run progs/demo/primes.hs
Compiling unit "progs/demo/primes.hs".
Optimizers: ()
Evaluating main.
How many primes? 10
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
Main> :q

[1] Yale Haskell 2.0.5:
http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/code/syntax/haskell/

[2] Quick kludge patch for compiling Yale Haskell 2.0.5 under
CMUCL 19d on linux/x86 (N.B. I didn't bother with the X11 or emacs
support):

http://oldr.net/yale-haskell-2.0.5-with-cmucl-19d.patch


Jon Harrop

unread,
Aug 10, 2007, 10:15:20 PM8/10/07
to
Rainer Joswig wrote:
> Even if you have static type checking, you will need unit tests.
> There is no way around it.
>
> A 'unit test' tests the smallest testable unit of code. So
> if a unit test fails, the error is right there in the unit.
> By definition.

In our experience, static typing renders such small scale unit tests
unnecessary.

--
Dr Jon D Harrop, Flying Frog Consultancy
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?usenet

Stuart Cook

unread,
Aug 11, 2007, 8:09:49 AM8/11/07
to
On Aug 11, 10:10 am, Rainer Joswig <jos...@lisp.de> wrote:
> In article <f9id4j$9u...@online.de>,

> "Rene de Visser" <Rene_de_Vis...@hotmail.com> wrote:
>
> > Equals works correctly in Haskell. No need to decide between eq, eql,
> > equal, equalp and other data-structure-specific versions.
>
> Equal can't work 'correctly'. You might want to read Kent Pitman's
> article about it.
>
> http://www.nhplace.com/kent/PS/EQUAL.html

[Disclaimer: I'm a Haskell fan, and not a Lisper, so take my words
with an appropriate grain of salt.]

That article describes why generic equality can't work *in (Common)
Lisp*.

The fundamental problem for CL is that given a cons cell, it's
impossible to reliably determine whether it should be interpreted as a
cons-pair, a list, a tree, or something else entirely. This is because
CL gives you the flexibility to use conses in many different ways
without having to explicitly declare your intention.

Haskell denies you this particular flexibility, but in exchange you
gain the ability to have the compiler discover your intentions using the
types of your terms. Pairs, lists and trees all have distinct types,
and Haskell can use the appropriate comparison function automatically.
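
A small sketch of that (the Tree type is only an example): each type brings
its own, possibly derived, notion of equality, and (==) picks the right one.

data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Eq, Show)

main :: IO ()
main = do
  print ((1, 2) == (1, 2 :: Int))                   -- pair equality: True
  print ([1, 2] == [2, 1 :: Int])                   -- list equality: False
  print (Node Leaf 'x' Leaf == Node Leaf 'x' Leaf)  -- derived Eq on a tree: True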

So this is a nice example of a feature that CL can't[1] have -- not
because CL somehow isn't powerful enough, but because in this case it
has *too much* unrestricted power. "With great power comes great
responsibility", after all.

At a broad level of generalization, I suppose you could say that Lisp
compilers tend to assume you know what you're doing, as long as it
looks like it might make sense. A Haskell compiler tries to keep you
"honest", and in exchange does its best to assist you using the
information you provide.


(There are some other nice properties of Haskell arising from
typeclasses, purity etc., but I won't go into these right now.)


> > I have less choices in Haskell. For me this simplifies programming. Code
> > sometimes follows directly from the types.
>
> Yes, this can be good.

Come to think of it, this is my entire post in a nutshell.


Stuart Cook

[1] Well, you might be able to come close by avoiding raw conses
entirely, but would you be able to call the result Lisp?

Rainer Joswig

unread,
Aug 11, 2007, 8:22:05 AM8/11/07
to
In article <1186834189....@z24g2000prh.googlegroups.com>,
Stuart Cook <sco...@gmail.com> wrote:

> On Aug 11, 10:10 am, Rainer Joswig <jos...@lisp.de> wrote:
> > In article <f9id4j$9u...@online.de>,
> > "Rene de Visser" <Rene_de_Vis...@hotmail.com> wrote:
> >
> > > Equals works correctly in Haskell. No need to decide between eq, eql,
> > > equal, equalp and other data-structure-specific versions.
> >
> > Equal can't work 'correctly'. You might want to read Kent Pitman's
> > article about it.
> >
> > http://www.nhplace.com/kent/PS/EQUAL.html
>
> [Disclaimer: I'm a Haskell fan, and not a Lisper, so take my words
> with an appropriate grain of salt.]
>
> That article describes why generic equality can't work *in (Common)
> Lisp*.

It says that there are different equalities and you have to
say which one you want:

* identity (is this really the same thing?)
* value (does it have the same value?)
* structural (does it have the same structure?)
* fuzzy (is it the same up to some limit?)

and so on.


> The fundamental problem for CL is that given a cons cell, it's
> impossible to reliably determine whether it should be interpreted as a
> cons-pair, a list, a tree, or something else entirely. This is because
> CL gives you the flexibility to use conses in many different ways
> without having to explicitly declare your intention.
>
> Haskell denies you this particular flexibility, but in exchange you
> gain the ability to have the compiler discover your intentions using the
> types of your terms. Pairs, lists and trees all have distinct types,
> and Haskell can use the appropriate comparison function automatically.

The compiler can't discover my intentions. That would be AI.

>
> So this is a nice example of a feature that CL can't[1] have -- not
> because CL somehow isn't powerful enough, but because in this case it
> has *too much* unrestricted power. "With great power comes great
> responsibility", after all.
>
> At a broad level of generalization, I suppose you could say that Lisp
> compilers tend to assume you know what you're doing, as long as it
> looks like it might make sense. A Haskell compiler tries to keep you
> "honest", and in exchange does its best to assist you using the
> information you provide.
>
>
> (There are some other nice properties of Haskell arising from
> typeclasses, purity etc., but I won't go into these right now.)
>
>
> > I have less choices in Haskell. For me this simplifies programming. Code
> > sometimes follows directly from the types.
> >
> > Yes, this can be good.
>
> Come to think of it, this is my entire post in a nutshell.
>
>
> Stuart Cook
>
> [1] Well, you might be able to come close by avoiding raw conses
> entirely, but would you be able to call the result Lisp?

Sure, why not? In Lisp you are using all kinds of data structures which
are not built out of conses.

--
http://lispm.dyndns.org

ross...@ps.uni-sb.de

unread,
Aug 11, 2007, 4:41:15 PM8/11/07
to
On 11 Aug., 14:22, Rainer Joswig <jos...@lisp.de> wrote:
>
> * identity (is this really the same thing?)
> * value (does it have the same value?)
> * structural (does it have the same structure?)
> * fuzzy (is it the same up to some limit?)
>
> and so on.

None of these does The Right Thing in general, e.g. when you have
data structures that are largely structural but contain a few mutable
nodes somewhere inside (say, a tree carrying some mutable info). The
right thing would be choosing different notions of equality for
different nodes. SML and Haskell do that automatically, thanks to
typing. E.g. in SML,

val r = ref 1

[r, r] = [r, r] (* true *)
[r, r] = [r, ref 1] (* false *)

Note that the lists are compared structurally, while the contained
references are compared by identity - which is the only definition of
equality properly mirroring observational equivalence.
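
The analogous behaviour in Haskell, as a sketch (relying on IORef's standard
Eq instance, which is reference identity, inside ordinary list equality):

import Data.IORef

main :: IO ()
main = do
  r  <- newIORef (1 :: Int)
  r' <- newIORef (1 :: Int)
  print ([r, r] == [r, r])   -- True: lists compared structurally
  print ([r, r] == [r, r'])  -- False: r' is a distinct reference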

> > Haskell denies you this particular flexibility, but in exchange you
> > gain the ability to have the compiler discover your intentions using the
> > types of your terms. Pairs, lists and trees all have distinct types,
> > and Haskell can use the appropriate comparison function automatically.
>
> The compiler can't discover my intentions. That would be AI.

I'm confused. Is that a proof that SML and Haskell do not exist?

Rainer Joswig

unread,
Aug 11, 2007, 5:02:11 PM8/11/07
to
In article <1186864875.0...@b79g2000hse.googlegroups.com>,
ross...@ps.uni-sb.de wrote:

> On 11 Aug., 14:22, Rainer Joswig <jos...@lisp.de> wrote:
> >
> > * identity (is this really the same thing?)
> > * value (does it have the same value?)
> > * structural (does it have the same structure?)
> > * fuzzy (is it the same up to some limit?)
> >
> > and so on.
>
> None of these does The Right Thing in general, e.g. when you have
> data structures that are largely structural but contain a few mutable
> nodes somewhere inside (say, a tree carrying some mutable info). The
> right thing would be choosing different notions of equality for
> different nodes. SML and Haskell do that automatically, thanks to
> typing. E.g. in SML,
>
> val r = ref 1
>
> [r, r] = [r, r] (* true *)
> [r, r] = [r, ref 1] (* false *)
>
> Note that the lists are compared structurally, while the contained
> references are compared by identity - which is the only definition of
> equality properly mirroring observational equivalence.

strings

* "foo" and "foo", same strings ?

are these really the same objects?

* "foo" and "foo", same contents ?

It could be different string types (fixed length, growing strings,
unicode strings, ascii strings, ...)

* "foo" and "Foo", same word?

* "Rainer Joswig" and "Ranier Joswig", same name (minus spelling)?

>
> > > Haskell denies you this particular flexibility, but in exchange you
> > > gain the ability to have the compiler discover your intentions using the
> > > types of your terms. Pairs, lists and trees all have distinct types,
> > > and Haskell can use the appropriate comparison function automatically.
> >
> > The compiler can't discover my intentions. That would be AI.
>
> I'm confused. Is that a proof that SML and Haskell do not exist?

Choosing the appropriate comparison function does not depend on
the types of the arguments, but on the intentions of what
features of two objects I want to compare how.

dbe...@eecs.wsu.edu

unread,
Aug 11, 2007, 6:09:29 PM8/11/07
to
> In our experience, static typing renders such small scale unit tests
> unnecessary.
>
> --
> Dr Jon D Harrop, Flying Frog Consultancy
> OCaml for Scientists
> http://www.ffconsultancy.com/products/ocaml_for_scientists/?usenet

This is also my experience using SML/NJ and CML in building a 50K+ LOC
app.


jsnX

unread,
Aug 11, 2007, 7:51:49 PM8/11/07
to
On Aug 11, 2:02 pm, Rainer Joswig <jos...@lisp.de> wrote:
> are these really the same objects?

In a language without assignments, who cares? In Haskell they are
equal.

> * "foo" and "foo", same contents ?

If they are of different types, they *may* be equal. If that
functionality is desired, Haskell allows it.

> Choosing the appropriate comparison function does not depend on
> the types of the arguments, but on the intentions of what
> features of two objects I want to compare how.

I think in general, you are right about that. However, I think it
misses the OP's point; what he's saying is, Haskell makes it easy to
tell when one object can be substituted for another with no change in the
result of the program.

CL is not unique in the annoyance of 'multiple equality' -- Ruby has
the same problem. Haskell's simple notion of equality is due to its
absolute referential transparency, something even O'Caml doesn't share
with it.


Matthew Danish

unread,
Aug 11, 2007, 8:24:30 PM8/11/07
to
On Aug 11, 5:02 pm, Rainer Joswig <jos...@lisp.de> wrote:
> In article <1186864875.060177.110...@b79g2000hse.googlegroups.com>,

>
> * "foo" and "foo", same strings ?
>
> are these really the same objects?

Hi Rainer,

Object identity is a notion specific to object-oriented languages like
Common Lisp. In (pure) functional programming languages, it doesn't
make any sense to ask for a comparison of object identity.

Identity comes into play with mutability. Then it is possible to
distinguish between different "objects." For example, if you modify
the first character of "foo", but the other "foo" does not change,
then there are clearly two different objects.

> Choosing the appropriate comparison function does not depend on
> the types of the arguments, but on the intentions of what
> features of two objects I want to compare how.

1) In Haskell, the intentions of the programmer are reflected in the
types, so this is less of a mystery.

2) You can still argue that, for example,

[1,2,3] == [3,2,1]

should hold under a reasonable definition of equality. The lists contain the same
elements, and the same count of each element. The conventional notion
of list equality includes ordering too, however. Informally, I would
say that the best notion of equality to use is the intersection of all
reasonable definitions; and that this is used by standard Haskell
instances of Eq.

[1,2,3] == [1,2,3]

is true whether or not you consider ordering, elements, or count to be
relevant.
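
As a sketch (multisetEq is a made-up name): the order-insensitive reading has
to be asked for explicitly, while the standard instance respects ordering.

import Data.List (sort)

-- Equality up to reordering ("same elements, same counts").
multisetEq :: Ord a => [a] -> [a] -> Bool
multisetEq xs ys = sort xs == sort ys

main :: IO ()
main = do
  print ([1,2,3] == [3,2,1 :: Int])        -- False: standard Eq respects order
  print (multisetEq [1,2,3] [3,2,1::Int])  -- True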

Rainer Joswig

unread,
Aug 11, 2007, 8:24:48 PM8/11/07
to
In article <1186876309.6...@i38g2000prf.googlegroups.com>,
jsnX <jason...@gmail.com> wrote:

> On Aug 11, 2:02 pm, Rainer Joswig <jos...@lisp.de> wrote:
> > are these really the same objects?
>
> In a language without assignments, who cares?

In a language which can pass data and can create compound
data structures one might care.


> In Haskell they are
> equal.
>
> > * "foo" and "foo", same contents ?
>
> If they are of different types, they *may* be equal. If that
> functionality is desired, Haskell allows it.
>
> > Choosing the appropriate comparison function does not depend on
> > the types of the arguments, but on the intentions of what
> > features of two objects I want to compare how.
>
> I think in general, you are right about that. However, I think it
> misses the OP's point; what he's saying is, Haskell makes it easy to
> tell when one object can be substituted for another with no change in the
> result of the program.

'Haskell makes it easy to tell when one object can be substituted for another
with no change in the result of the program.'

For some objects that are equal based on the Haskell definition
of equality. Some other objects might not be equal based
on that definition, yet the result of the program
is the same using either one.

> CL is not unique in the annoyance of 'multiple equality' -- Ruby has
> the same problem. Haskell's simple notion of equality is due to its
> absolute referential transparency, something even O'Caml doesn't share
> with it.

I don't see it as an annoyance. It is just a consequence of the fact that
there are multiple useful definitions of equality in the presence
of observable identity, the value of objects and the structure
of objects and their sub-objects.

Rainer Joswig

unread,
Aug 11, 2007, 8:46:38 PM8/11/07
to
In article <1186878270.5...@o61g2000hsh.googlegroups.com>,
Matthew Danish <m...@cs.cmu.edu> wrote:

> On Aug 11, 5:02 pm, Rainer Joswig <jos...@lisp.de> wrote:
> > In article <1186864875.060177.110...@b79g2000hse.googlegroups.com>,
> >
> > * "foo" and "foo", same strings ?
> >
> > are these really the same objects?
>
> Hi Rainer,
>
> Object identity is a notion specific to object-oriented languages like
> Common Lisp. In (pure) functional programming languages, it doesn't
> make any sense to ask for a comparison of object identity.
>
> Identity comes into play with mutability. Then it is possible to
> distinguish between different "objects." For example, if you modify
> the first character of "foo", but the other "foo" does not change,
> then there are clearly two different objects.

Identity comes into play if you can create and pass data.
For example, comparing two things by identity is (usually) much faster
than comparing by structure.

The ability to identify an object based on identity
also allows you to save space. For example, in
a large compound data structure you could replace
all structurally equal objects with the same object.
If you can't modify data in your language, you might
input such a compound data structure and return
a more space-efficient version from a function.

> > Choosing the appropriate comparison function does not depend on
> > the types of the arguments, but on the intentions of what
> > features of two objects I want to compare how.
>
> 1) In Haskell, the intentions of the programmer are reflected in the
> types, so this is less of a mystery.
>
> 2) You can still argue that, for example,
>
> [1,2,3] == [3,2,1]
>
> under a reasonable definition of equality. The lists contain the same
> elements, and the same count of each element. The conventional notion
> of list equality includes ordering too, however. Informally, I would
> say that the best notion of equality to use is the intersection of all
> reasonable definitions; and that this is used by standard Haskell
> instances of Eq.
>
> [1,2,3] == [1,2,3]
>
> is true whether or not you consider ordering, elements, or count to be
> relevant.

Pitman:
'There is no uniquely determined equality function for complex structures
--there are only arbitrary ones.'

Matthew Danish

unread,
Aug 11, 2007, 9:04:43 PM8/11/07
to
On Aug 11, 8:46 pm, Rainer Joswig <jos...@lisp.de> wrote:
> Identity comes into play if you can create and pass data.
> For example comparing two things by identity is (usually) much faster
> than comparing by structure.

Which is a matter for the compiler to worry about. The programmer
cannot determine any such difference, because it doesn't exist in the
semantics of the programming language.

Rainer Joswig

unread,
Aug 12, 2007, 4:35:21 AM8/12/07
to
In article <1186880683....@k79g2000hse.googlegroups.com>,
Matthew Danish <m...@cs.cmu.edu> wrote:

In the case of Haskell, yes. In Common Lisp the developer
can use that knowledge in the code. I'm also not sure if a
compiler automatically does the 'right' thing. Probably the
SSC (the Sufficiently Smart Compiler) will do it. ;-)

Alan Perlis: "A Lisp programmer knows the value of everything,
but the cost of nothing."
Maybe that's also true for Haskell programmers?

--
http://lispm.dyndns.org

ross...@ps.uni-sb.de

unread,
Aug 12, 2007, 11:36:49 AM8/12/07
to
On 11 Aug., 23:02, Rainer Joswig <jos...@lisp.de> wrote:
>
> * "foo" and "foo", same strings ?
>
> are these really the same objects?

Who cares? Assuming they are immutable, they are equal semantically.
The runtime may or may not share them (and may make a different choice
at different points in time).

> * "foo" and "Foo", same word?
>
> * "Rainer Joswig" and "Ranier Joswig", same name (minus spelling)?

No, because they are observably different.

> Choosing the appropriate comparison function does not depend on
> the types of the arguments, but on the intentions of what
> features of two objects I want to compare how.

That's why you want to be able to introduce your own abstract types
and define a suitable equality instance for them that is then used
automatically by the compiler (as in Haskell).
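
For instance (CIString is just an illustrative name): a user-defined type can
carry its own notion of equality, and (==) then uses it wherever values of
that type are compared.

import Data.Char (toLower)

newtype CIString = CIString String

-- Case-insensitive equality, for this type only.
instance Eq CIString where
  CIString a == CIString b = map toLower a == map toLower b

main :: IO ()
main = do
  print (CIString "foo" == CIString "Foo")  -- True
  print (CIString "foo" == CIString "bar")  -- False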

Markus E.L. 2

unread,
Aug 13, 2007, 4:11:13 AM8/13/07
to

Rainer Joswig wrote:

> In article <1186876309.6...@i38g2000prf.googlegroups.com>,
> jsnX <jason...@gmail.com> wrote:
>
>> On Aug 11, 2:02 pm, Rainer Joswig <jos...@lisp.de> wrote:
>> > are these really the same objects?
>>
>> In a language without assignments, who cares?
>
> In a language which can pass data and can create compound
> data structures one might care.

In Haskell everything is a value (an element) from a given set (which
corresponds to the type given). Like in mathematics, two elements are
either the same or they aren't. There is no middle way. I think the
idea has a certain beauty and Haskell drives it to its ultimate
consequence.

Regards -- Markus

Markus E.L. 2

unread,
Aug 13, 2007, 4:15:11 AM8/13/07
to

Rainer Joswig wrote:

> In article <1186880683....@k79g2000hse.googlegroups.com>,
> Matthew Danish <m...@cs.cmu.edu> wrote:
>
>> On Aug 11, 8:46 pm, Rainer Joswig <jos...@lisp.de> wrote:
>> > Identity comes into play if you can create and pass data.
>> > For example comparing two things by identity is (usually) much faster
>> > than comparing by structure.
>>
>> Which is a matter for the compiler to worry about. The programmer
>> cannot determine any such difference, because it doesn't exist in the
>> semantics of the programming language.
>
> In the case of Haskell. In Common Lisp the developer
> can use the knowledge in the code. I'm also not sure if a

That is the difference between a pure and an impure language (no
offence intended: I mostly use impure languages).

> compiler automatically does the 'right' thing.

In Haskell there is only the right thing to do. It also implies that the
compiler has the freedom to decide when "data structures" are copied
if this might be more efficient.

> Alan Perlis: "A Lisp programmer knows the value of everything,
> but the cost of nothing."
> Maybe that's also true for Haskell programmers?

Almost certainly. Haskell programmers only know about values (AFAICS).

Regards -- Markus

Joachim Durchholz

unread,
Aug 14, 2007, 3:24:58 AM8/14/07
to
Rainer Joswig schrieb:

> * "foo" and "foo", same strings ?
>
> are these really the same objects?

That depends on whether you want equality or identity.

Observationally, A is identical to B if both are equal and remain equal
under any change to A. If I can't change A, equality implies identity;
since identity implies equality by definition, identity and equality are
equivalent under immutability.
So in a pure language, identity is the same as equality.

> * "foo" and "foo", same contents ?
>
> It could be different string types (fixed length, growing strings,
> unicode strings, ascii strings, ...)

They couldn't (in Haskell).

> * "foo" and "Foo", same word?
>
> * "Rainer Joswig" and "Ranier Joswig", same name (minus spelling)?

You work with equivalence classes in such a case.

I.e. either you write a function that maps "same modulo whatever" to
equal values, and compare the results.
Or you write a "fuzzy" equality comparison function.

The language semantics remains unaffected.
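
A sketch of the first approach in Haskell (sameWord is a made-up helper):
map to a canonical form, then use ordinary equality.

import Data.Char (toLower)
import Data.Function (on)

-- "Same modulo case": compare the lower-cased forms.
sameWord :: String -> String -> Bool
sameWord = (==) `on` map toLower

main :: IO ()
main = do
  print (sameWord "foo" "Foo")  -- True
  print (sameWord "foo" "bar")  -- False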

Regards,
Jo

Joachim Durchholz

unread,
Aug 14, 2007, 3:34:06 AM8/14/07
to
Rainer Joswig schrieb:

> Identity comes into play if you can create and pass data.
> For example comparing two things by identity is (usually) much faster
> than comparing by structure.

In a pure language, the majority of shared data structures are equal
anyway. If you want a copy of a data structure, you simply copy it (and
the run-time system will execute the copy by creating another pointer).

> The ability to identify an object based on identity
> allows you also to save space. For example in
> a large compound data structure you could replace
> all structural equal objects with the same object.

You don't replace with immutable data structures :-)

> If you can't modify in your language, you might
> input such a compound data structure and return
> a more space efficient version from a function.

I haven't seen this kind of thing done ever.

Even when unmarshalling data from an external source, any sharing is
already indicated in the external data, not constructed by detecting
equal substructures. That would be too inefficient for the taste of
most, I'd think.

> Pitman:
> 'There is no uniquely determined equality function for complex structures
> --there are only arbitrary ones.'

Only if you have mutable data structures.

Regards,
Jo

Rainer Joswig

unread,
Aug 14, 2007, 4:41:17 AM8/14/07
to
In article <f9rlsd$6kc$1...@online.de>,
Joachim Durchholz <j...@durchholz.org> wrote:

> Rainer Joswig schrieb:
> > Identity comes into play if you can create and pass data.
> > For example comparing two things by identity is (usually) much faster
> > than comparing by structure.
>
> In a pure language, the majority of shared data structures is equal
> anyway. If you want a copy of a data structure, you simply copy it (and
> the run-time system will execute the copy by creating another pointer).
>
> > The ability to identify an object based on identity
> > allows you also to save space. For example in
> > a large compound data structure you could replace
> > all structural equal objects with the same object.
>
> You don't replace with immutable data structures :-)

That's why I wrote that next paragraph. :-)



> > If you can't modify in your language, you might
> > input such a compound data structure and return
> > a more space efficient version from a function.
>
> I haven't seen this kind of thing done ever.

And?

> Even when unmarshalling data from an external source, any sharing is
> already indicated in the external data, not constructed by detecting
> equal substructures. That would be too inefficient for the taste of
> most, I'd think.
>
> > Pitman:
> > 'There is no uniquely determined equality function for complex structures
> > --there are only arbitrary ones.'
>
> Only if you have mutable data structures.

No, equality is not magically falling flat into one case,
just because the data structures are not mutable.


> Regards,
>

Ole Nielsby

unread,
Aug 14, 2007, 5:04:21 AM8/14/07
to
Joachim Durchholz <j...@durchholz.org> wrote:

> Rainer Joswig schrieb:
>> * "foo" and "foo", same strings ?
>>
>> are these really the same objects?
>
> That depends on whether you want equality or identity.
>
> Observationally, A is identical to B if both are equal and remain equal
> under any change to A. If I can't change A, equality implies identity;
> since identity implies equality by definition, identity and equality are
> equivalent under immutability.

Not quite. I don't know if this applies to other languages but in my
own PILS language, there is a distinction based on the need to locate
an expression in the source text.

A function call is represented as a node:

:call function-name: arguments

A module might call the same function with the same arguments in
several places. I'd say the nodes representing these calls are equal.

But if one of the calls should fail, the PILS system is able to pinpoint
the failing call in the source text. This would be quite difficult to
implement if the call nodes were identical. (I've done it to some extent
in earlier versions of PILS, and it's definitely a thing I don't want to
do again.)

Constants - data (simple or composite) that are passed unchanged
by the eval function and can't fail - generally don't need to be located
in the source code, and they are uniquely represented.

All objects are still immutable

(unless they happen to be wrappers for wxWidgets or COM, of course)


Rene de Visser

unread,
Aug 14, 2007, 5:30:37 AM8/14/07
to
"Rainer Joswig" <jos...@lisp.de> wrote in message
news:joswig-CBB8C9.10411714082007@news->

> No, equality is not magically falling flat into one case,
> just because the data structures are not mutable.
>
>> Regards,
>>
Hello Rainer,

You seem to be searching for ways in which it is not possible to have a default
uniform equality. And naturally you are succeeding quite well (there are many
possibilities).

You are correct that equality does not magically fall flat into one case. It
takes effort from the language designer (and programmer) to be able to
create a situation where it is possible to choose a uniform default equivalence
class (one that is generic across different types) that is useful.

I think the more interesting question/puzzle is:

What restrictions and axioms are necessary in order to be able to choose a
default equality and equivalence class that is consistent and useful?

i.e. the opposite of your goal above: how can it work, rather than how can
it not work. Or, to put it differently:

A good solution should
i) Work across all types under HM.
ii) Be other than the trivial equality.
iii) Support the addition of other equivalence classes through the addition
of types.

Hint:

In each algebraic structure (e.g. a group) there is a natural equality. What
do these equalities have in common across different algebraic structures?
(Note that Z, Z(7) and Z*7 are treated as different groups, not as the same
one, and naturally each one has its own equality that only applies to
members within the same group.)

Rene.


Markus E.L. 2

unread,
Aug 14, 2007, 6:20:29 AM8/14/07
to

Rainer Joswig wrote:

>> > Pitman:
>> > 'There is no uniquely determined equality function for complex structures
>> > --there are only arbitrary ones.'
>>
>> Only if you have mutable data structures.
>
> No, equality is not magically falling flat into one case,
> just because the data structures are not mutable.

But it is. Mathematics has no different notions of equality: either
two elements are the same or they aren't. Same with the initial
example of 'is "foo" the same as "foo"?'. Of course they are: in pure
functional programming there is not even the language to ask whether
they are different: they denote the same element in the set of all
strings, so they are the same. The same argument applies to 'Is "foo"
the same as ("fo" ^ "o")?' ('^' being the string concatenation
operator here).

Regards -- Markus

Ville Oikarinen

unread,
Aug 14, 2007, 7:27:16 AM8/14/07
to

On Tue, 14 Aug 2007, Joachim Durchholz wrote:

> > Pitman:
> > 'There is no uniquely determined equality function for complex structures
> > --there are only arbitrary ones.'
>
> Only if you have mutable data structures.

I think it depends on the context. I reason it this way:

In Java, you can have several Comparator implementations for the same
class. Likewise, it should be possible to have several equality
definitions. To me it would be more natural if in every context
"a equals b" <=> "a compare b == 0".

There can be different conditions for equality in different situations.

For example two Person objects can be considered equal, if their person id
is equal. Other attributes may differ, if the entities are different
versions of the same data, which can happen if there have been database
writes between two reads.

This definition of equality is handy when you want to replace the old
Person instance with the new one: collections use equality to determine
which instance to replace.

Of course, in some other context you may need to compare all attributes,
in which case another equality implementation is needed.
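
For what it's worth, a sketch of that first kind of equality in Haskell terms
(the Person type here is made up): the instance is keyed on the id only.

data Person = Person { personId :: Int, name :: String }

-- Two versions of the same record compare equal even if
-- other attributes differ.
instance Eq Person where
  a == b = personId a == personId b

main :: IO ()
main = do
  let old = Person 42 "R. Joswig"
      new = Person 42 "Rainer Joswig"
  print (old == new)  -- True: same person id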

- Ville Oikarinen

Joachim Durchholz

unread,
Aug 14, 2007, 9:10:00 AM8/14/07
to
Rainer Joswig schrieb:

> Joachim Durchholz <j...@durchholz.org> wrote:
>
>> Rainer Joswig schrieb:
>>
>>> If you can't modify in your language, you might
>>> input such a compound data structure and return
>>> a more space efficient version from a function.
>>
>> I haven't seen this kind of thing done ever.
>
> And?

The situation you describe (trying to unify two equal but
stored-in-different-places data structures) doesn't arise in practice,
so there's no disadvantage if a language cannot do that.

>>> Pitman:
>>> 'There is no uniquely determined equality function for complex structures
>>> --there are only arbitrary ones.'
>> Only if you have mutable data structures.
>
> No, equality is not magically falling flat into one case,
> just because the data structures are not mutable.

What kinds of equality relationships remain that cannot be handled using
the techniques I detailed?
Come on, be a bit more specific.

Regards,
Jo

Joachim Durchholz

unread,
Aug 14, 2007, 9:17:11 AM8/14/07
to
Ole Nielsby schrieb:

> Not quite. I don't know if this applies to other languages but in my
> own PILS language, there is a distinction based on the need to locate
> an expression in the source text.

Ah, now we're in the realm of debugging. At that level, some
distinctions do indeed make sense: the one that you noted and a few others.

Personally, I think there should be two levels in a language: the
application logic (no side effects, referentially transparent, equality
is identity, equational reasoning), and a supervision logic (logging,
aborting calls that don't meet their deadline, throwing and catching
exceptions that arise from failed assertions, inspecting the state of
the computation for debugging or curiosity, source code references).
The application logic should not have access to the supervision logic
since that would immediately destroy all the nice properties of the
application logic.

Just my 2c.

Regards,
Jo

Jon Harrop

unread,
Aug 15, 2007, 1:49:42 AM8/15/07
to
Joachim Durchholz wrote:
> Rainer Joswig schrieb:
>> * "foo" and "foo", same strings ?
>>
>> are these really the same objects?
>
> That depends on whether you want equality or identity.
>
> Observationally, A is identical to B if both are equal and remain equal
> under any change to A. If I can't change A, equality implies identity...

Surely two different expressions can evaluate to equal values that are
distinct in memory (not identical)?

Paul Rubin

unread,
Aug 15, 2007, 2:14:12 AM8/15/07
to
Jon Harrop <j...@ffconsultancy.com> writes:
> > Observationally, A is identical to B if both are equal and remain equal
> > under any change to A. If I can't change A, equality implies identity...
>
> Surely two different expressions can evaluate to equal values that are
> distinct in memory (not identical)?

Well, things like identity and equality are only meaningful within the
language semantics. In a pure FPL, two expressions have the same
identity if they denote the same value. Memory locations may be part
of the implementation but aren't in the language semantics. Otherwise
constant integers could change their identity every time the GC moved
them.

Jon Harrop

unread,
Aug 15, 2007, 11:56:33 AM8/15/07
to
Paul Rubin wrote:
> Well, things like identity and equality are only meaningful within the
> language semantics. In a pure FPL, two expressions have the same
> identity if they denote the same value. Memory locations may be part
> of the implementation but aren't in the language semantics.

Does that not have a grave cost in terms of memory consumption?

> Otherwise constant integers could change their identity every time the GC
> moved them.

OCaml has that behaviour:

# 3l == 3l;;
- : bool = false

Joachim Durchholz

unread,
Aug 15, 2007, 1:47:16 PM8/15/07
to
Jon Harrop schrieb:

> Paul Rubin wrote:
>> Well, things like identity and equality are only meaningful within the
>> language semantics. In a pure FPL, two expressions have the same
>> identity if they denote the same value. Memory locations may be part
>> of the implementation but aren't in the language semantics.
>
> Does that not have a grave cost in terms of memory consumption?

Why should it?

Regards,
Jo

Rainer Joswig

unread,
Aug 15, 2007, 3:38:58 PM8/15/07
to
In article <7xir7hg...@ruckus.brouhaha.com>,

Paul Rubin <http://phr...@NOSPAM.invalid> wrote:

> Jon Harrop <j...@ffconsultancy.com> writes:
> > > Observationally, A is identical to B if both are equal and remain equal
> > > under any change to A. If I can't change A, equality implies identity...
> >
> > Surely two different expressions can evaluate to equal values that are
> > distinct in memory (not identical)?
>
> Well, these like identity and equality are only meaningful within the
> language semantics.

Even 'language semantics' needs to be mapped to real machines
with finite memory. There you get a new class of effects
to consider (space and time for example).

Markus E.L. 2

unread,
Aug 15, 2007, 12:39:43 PM8/15/07
to

Jon Harrop wrote:

> Paul Rubin wrote:
>> Well, things like identity and equality are only meaningful within the
>> language semantics. In a pure FPL, two expressions have the same
>> identity if they denote the same value. Memory locations may be part
>> of the implementation but aren't in the language semantics.
>
> Does that not have a grave cost in terms of memory consumption?

Why do you suspect so? Because one cannot mutate structures? Or is
there any other reason? Personally I see it as a further step down the
same path that started with abandoning malloc/free style memory
management and introducing garbage collection. It leaves all decisions
on actual copying to the compiler or run time environment and actually
buys them a lot of freedom as far as caching is concerned (just think of
SMP and what it implies if there are no mutable memory locations).

>> Otherwise constant integers could change their identity every time the GC
>> moved them.
>
> OCaml has that behaviour:
>
> # 3l == 3l;;
> - : bool = false

Yes. And as far as I understood the manual,

# let x = 3l;;
val x : int32 = 3l
# let y = x;;
val y : int32 = 3l
# x == x;;
- : bool = true
#

is the usual behaviour, but is by no means guaranteed: the compiler
might decide to create a "local" copy of a value in certain
circumstances, and then the comparison (==) with the original fails.

Regards -- Markus


dbe...@eecs.wsu.edu

unread,
Aug 15, 2007, 7:19:17 PM8/15/07
to
On Aug 15, 8:56 am, Jon Harrop <j...@ffconsultancy.com> wrote:
> ...
> OCaml has that behavior

> # 3l == 3l;;
> - : bool = false

Oh dear!

Equality is a special case of an equivalence relation: reflexive,
symmetric and transitive. Other than the (proper) determination of
equality, often it is the case that a module (structure in SML terms)
is needed to compute a desired equivalence relation. For example,
ignoring the distinction between upper and lower case.

A very good introduction is found in

Daniel J. Velleman
How To Prove It: a structured approach

now, I think, in a second or third edition.

Markus E.L. 2

unread,
Aug 15, 2007, 8:51:30 PM8/15/07
to

'dbenson AT eecs DOT wsu DOT edu' wrote:

> On Aug 15, 8:56 am, Jon Harrop <j...@ffconsultancy.com> wrote:
>> ...
>> OCaml has that behavior
>> # 3l == 3l;;
>> - : bool = false
>
> Oh dear!

Not quite so :-).

Jon has simply been using object identity (==) instead of value
equality (=). If one uses (=), one gets the proper result:

# 3l == 3l;;
- : bool = false

# 3l = 3l;;
- : bool = true

To illustrate the background a bit more:

# type foo = Int of int | Float of float;;
type foo = Int of int | Float of float
# let x = Int 5;;
val x : foo = Int 5
# let y = Int 5;;
val y : foo = Int 5
# x == y;;
- : bool = false
# x = y;;
- : bool = true

So everything is quite as it should be (in a language that has mutable
data structures, that is).

Regards -- Markus

Jon Harrop

unread,
Aug 16, 2007, 12:26:35 AM8/16/07
to
dbe...@eecs.wsu.edu wrote:
> On Aug 15, 8:56 am, Jon Harrop <j...@ffconsultancy.com> wrote:
>> ...
>> OCaml has that behavior
>> # 3l == 3l;;
>> - : bool = false
>
> Oh dear!
>
> Equality...

Note that this is about identity and not equality:

# 3l = 3l;;
- : bool = true

They are not the same thing.

Jon Harrop

unread,
Aug 16, 2007, 12:34:16 AM8/16/07
to
Markus E.L. 2 wrote:
> Jon Harrop wrote:
>> Paul Rubin wrote:
>>> Well, things like identity and equality are only meaningful within the
>>> language semantics. In a pure FPL, two expressions have the same
>>> identity if they denote the same value. Memory locations may be part
>>> of the implementation but aren't in the language semantics.
>>
>> Does that not have a grave cost in terms of memory consumption?
>
> Why do you suspect so? Because one cannot mutate structures?

No, I use physical equality (identity) even in purely functional settings.

> Or is
> there any other reason? Personally I see it as a further step down the
> same path that started with abandoning malloc/free style memory
> management and introducing garbage collection. It leaves all decisions
> on actual copying to the compiler or run time environment and actually
> buys them a lot of freedom as far as caching is concerned (only think
> SMP and what it implies if there are no mutable memory locations).

I use physical equality in OCaml as an optimization, to test for unaltered
results.

For example, an 'a -> 'a map that avoids all allocation if no rewrites are
done:

exception Start of int  (* implied by the raise/with below *)

let id_map f a =
  if a = [||] then a else
  let b = ref a in
  try
    for i = 0 to Array.length a - 1 do
      let e = f a.(i) in
      if e != a.(i) then begin
        (* first changed element: copy the array once, remember where we were *)
        b := Array.copy a;
        (!b).(i) <- e;
        raise (Start (i+1))
      end
    done;
    a                     (* nothing changed: return the input array itself *)
  with Start start ->
    let b = !b in
    for i = start to Array.length a - 1 do
      let e = f a.(i) in
      if e != a.(i) then b.(i) <- e
    done;
    b

or a sort that can return the input list without copying it:

# let rec sort = function
    | [] -> []
    | h1::t as list -> match sort t with
      | h2::t when h1>h2 -> h2::sort(h1::t)
      | t' -> if t==t' then list else h1::t';;
  val sort : 'a list -> 'a list = <fun>

When handling symbols, I hash cons strings and compare them for physical
equality knowing that it represents semantic equality.

I appreciate that this can all be done without using physical equality, but
can that be done efficiently (particularly without allocation)?

ross...@ps.uni-sb.de

unread,
Aug 16, 2007, 2:26:38 AM8/16/07
to
On 16 Aug., 06:34, Jon Harrop <j...@ffconsultancy.com> wrote:
>
> No, I use physical equality (identity) even in purely functional settings.

Actually, you don't: identity is an inherently impure concept - it
clearly violates referential transparency. So once you're using it you
are no longer doing pure functional programming. ;-)

- Andreas

Rainer Joswig

unread,
Aug 16, 2007, 2:57:11 AM8/16/07
to
In article <1187245598.4...@b79g2000hse.googlegroups.com>,
ross...@ps.uni-sb.de wrote:

Running code on a physical machine with finite memory
and finite speed already breaks it. You can observe
the differences with a stop watch and by seeing
which functions will run in x MB memory and which not.

Jon Harrop

unread,
Aug 16, 2007, 3:04:05 AM8/16/07
to

I mean: my pure code calls my internally-impure id_map. You can use the
ordinary map instead but it is several times slower in this case.

Paul Rubin

unread,
Aug 16, 2007, 3:34:53 AM8/16/07
to
Jon Harrop <j...@ffconsultancy.com> writes:
> I mean: my pure code calls my internally-impure id_map.

Then it's not pure either ;).

> You can use the ordinary map instead but it is several times slower
> in this case.

Is == even guaranteed to work the way you hope it does? I.e. if it
just compares pointer identity, what happens after a GC?

ross...@ps.uni-sb.de

unread,
Aug 16, 2007, 5:30:05 AM8/16/07
to
On 16 Aug., 08:57, Rainer Joswig <jos...@lisp.de> wrote:
> In article <1187245598.456031.180...@b79g2000hse.googlegroups.com>,

That does not break referential transparency and thus is unrelated to
my remark.

- Andreas

Philippa Cowderoy

unread,
Aug 16, 2007, 6:26:22 AM8/16/07
to
On Thu, 16 Aug 2007, Rainer Joswig wrote:

> In article <1187245598.4...@b79g2000hse.googlegroups.com>,
> ross...@ps.uni-sb.de wrote:
>
> > On 16 Aug., 06:34, Jon Harrop <j...@ffconsultancy.com> wrote:
> > >
> > > No, I use physical equality (identity) even in purely functional settings.
> >
> > Actually, you don't: identity is an inherently impure concept - it
> > clearly violates referential transparency. So once you're using it you
> > are no longer doing pure functional programming. ;-)
> >
> > - Andreas
>
> Running code on a physical machine with finite memory
> and finite speed already breaks it.

It breaks the purity of our final implementation, not of the code itself.
Where a run doesn't run out of memory etc etc, it's still an accurate
model of the code - it just has additional detail.

--
fli...@flippac.org

"My religion says so" explains your beliefs. But it doesn't explain
why I should hold them as well, let alone be restricted by them.

Joachim Durchholz

unread,
Aug 16, 2007, 12:33:08 PM8/16/07
to
Paul Rubin schrieb:

> Jon Harrop <j...@ffconsultancy.com> writes:
>> I mean: my pure code calls my internally-impure id_map.
>
> Then it's not pure either ;).

Not necessarily. You can have impure code with a pure interface.
(Memoization is one example of this.)

Regards,
Jo

Jon Harrop

unread,
Aug 16, 2007, 1:04:46 PM8/16/07
to
Joachim Durchholz wrote:
> Not necessarily. You can have impure code with a pure interface.
> (Memoization is one example of this.)

Or the id_map that I just posted.

Markus E.L. 2

unread,
Aug 16, 2007, 2:37:45 AM8/16/07
to

Jon Harrop wrote:

> Markus E.L. 2 wrote:
>> Jon Harrop wrote:
>>> Paul Rubin wrote:
>>>> Well, things like identity and equality are only meaningful within the
>>>> language semantics. In a pure FPL, two expressions have the same
>>>> identity if they denote the same value. Memory locations may be part
>>>> of the implementation but aren't in the language semantics.
>>>
>>> Does that not have a grave cost in terms of memory consumption?
>>
>> Why do you suspect so? Because one cannot mutate structures?
>
> No, I use physical equality (identity) even in purely functional settings.

I do not understand your reply. The grave cost in the pure case is
probably not memory consumption, but that any comparison actually has
to be recursive and can't take the shortcut of physical equality. But
I think that perhaps the compiler is smart enough to know when it can
take the shortcut.

>> Or is there any other reason? Personally I see it as a further step
>> down the same path that started with abandoning malloc/free style
>> memory management and introducing garbage collection. It leaves all
>> decisions on actual copying to the compiler or run time environment
>> and actually buys them a lot of freedom as far as caching is
>> concerned (only think SMP and what it implies if there are no
>> mutable memory locations).


> I use physical equality in OCaml as an optimization, to test for unaltered
> results.

Ah, I understand. This is a very special case, though: it won't work
with mutable structures, would it?

> When handling symbols, I hash cons strings and compare them for physical
> equality knowing that it represents semantic equality.

This is indeed a good idea.

> I appreciate that this can all be done without using physical equality, but
> can that be done efficiently (particularly without allocation)?

The only hope one can entertain here is that the compiler is smart
enough. But probably you're right: This is one of the effects that
make Haskell slower.

Regards -- Markus


Jon Harrop

unread,
Aug 17, 2007, 8:08:40 AM8/17/07
to
Markus E.L. 2 wrote:
> I do not understand you reply. The grave cost in the pure case is
> probably not memory consumption, but that any comparison has actually
> to be recursive and can't take the shortcut of physical equality. But
> I think that perhaps the compiler is smart enough to know when it can
> take the shortcut.

I assume the built-in equality will always try physical equality before
structural equality at every step.

>> I use physical equality in OCaml as an optimization, to test for
>> unaltered results.
>
> Ah, I understand. This is a very special case, though: It won't work
> with mutable structures, would it?

Yes, that is exactly why I said I use it in "purely functional" settings.

>> When handling symbols, I hash cons strings and compare them for physical
>> equality knowing that it represents semantic equality.
>
> This is indeed a good idea.

It is rather error prone and could benefit from some language improvements.

>> I appreciate that this can all be done without using physical equality,
>> but can that be done efficiently (particularly without allocation)?
>
> The only hope one can entertain here is that the compiler is smart
> enough. But probably you're right: This is one of the effects that
> make Haskell slower.

Yes. I'd like to know why exactly Haskell is so slow. I had assumed that it
was just laziness but it is very slow even on eager code. Perhaps there is
more to it, such as an inability to optimize away type class dictionaries.

Laurent Deniau

unread,
Aug 17, 2007, 9:55:03 AM8/17/07
to
Jon Harrop wrote:
> Yes. I'd like to know why exactly Haskell is so slow. I had assumed that it
> was just laziness but it is very slow even on eager code. Perhaps there is
> more to it, such as an inability to optimize away type class dictionaries.

Dictionaries are not the only reason. The Haskell program below is
about 3-4 times slower than its C version and I do not see why
(problem with unboxing the Int?).

c99 -O3 sum.c -o sum-c
time ./sum-c 100000000
987459712

real 0m0.141s
user 0m0.140s
sys 0m0.000s

ghc -O2 sum.hs -optc -O3 -o sum-hs
time ./sum-hs 100000000
987459712

real 0m0.494s
user 0m0.492s
sys 0m0.000s

ld.

-----

import System

f :: Int -> Int -> Int
f s n = if n > 0 then f (s+n) (n-1) else s

main = do
  [n] <- getArgs
  putStrLn $ show $ f 0 (read n)

-----

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int f(int s, int n)
{
  return n > 0 ? f(s+n, n-1) : s;
}

int main(int argc, char *argv[])
{
  assert(argc == 2);
  printf("%d\n", f(0, strtol(argv[1],0,0)));
}

Phil Armstrong

unread,
Aug 17, 2007, 11:44:56 AM8/17/07
to
Laurent Deniau <Laurent...@gmail.com> wrote:
> Jon Harrop wrote:
>> Yes. I'd like to know why exactly Haskell is so slow. I had assumed that it
>> was just laziness but it is very slow even on eager code. Perhaps there is
>> more to it, such as an inability to optimize away type class dictionaries.
>
> Dictionaries are not the only reason. The Haskell program below is
> about 3-4 times slower than its C version and I do not see why
> (problem with unboxing the Int?).

About twice as slow on my hardware. ghc -ddump-simpl says that the
arguments are unboxed primitive types, so it's getting that much
right.

Phil

--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt

Phil Armstrong

unread,
Aug 17, 2007, 12:32:56 PM8/17/07
to
Phil Armstrong <phil...@kantaka.co.uk> wrote:
> Laurent Deniau <Laurent...@gmail.com> wrote:
>> Jon Harrop wrote:
>>> Yes. I'd like to know why exactly Haskell is so slow. I had assumed that it
>>> was just laziness but it is very slow even on eager code. Perhaps there is
>>> more to it, such as an inability to optimize away type class dictionaries.
>>
>> Dictionaries are not the only reason. The Haskell program below is
>> about 3-4 times slower than its C version and I do not see why
>> (problem with unboxing the Int?).
>
> About twice as slow on my hardware. ghc -ddump-simpl says that the
> arguments are unboxed primitive types, so it's getting that much
> right.

Followup. Oops: gcc -O2 is faster than -O3 on this code by a third or
so. So the C version does turn out to be three times quicker...

A short perusal of the generated assembler suggests that the C
compiler is doing classic tail-call elimination
(replacing the call to f with a jump into the body of f after dumping
the arguments into the appropriate registers). The Haskell version
appears to be pushing the arguments on and off the stack every time (I
think -- my reading of assembler output is a little rusty to say the
least, and it was never that good in the first place).

No thunks or anything in the central loop, just a missed optimisation
that the C compiler is making & ghc isn't.

Phil Armstrong

unread,
Aug 17, 2007, 1:35:24 PM8/17/07
to
Phil Armstrong <phil...@kantaka.co.uk> wrote:
> Followup. Oops: gcc -O2 is faster than -O3 on this code by a third or
> so. So the C version does turn out to be three times quicker...

Following up to myself again; this seems to be a gcc-4.0 weirdness (I
was using the Apple compiler, but the Linux gcc-4.0 does the same
thing). Gcc 4.1 & 4.2 have the expected behaviour: gcc-4.2 -O3 is
about 4 times faster than the Haskell/ghc version of sum, and 2.75
times faster for -O2. Looking at the generated asm, -O2 is doing tail
call elimination, whilst -O3 is unrolling the loop as well.

dbe...@eecs.wsu.edu

unread,
Aug 17, 2007, 5:58:42 PM8/17/07
to
On Aug 15, 5:51 pm, development-2006-8ecbb5cc8aREMOVET...@ANDTHATm-e-
leypold.de (Markus E.L. 2) wrote:
...

> Not quite so :-).
>
> Jon simply has been using the object identity instead of value
> equality (=).

Thanks. Let's see if I've the right conception: there is a set of
objects. The (mathematical) equality on objects is said to be object
identity, written as ==. Objects have, or better, denote, values.
The value equality, written as =, is an equivalence relation on
objects such that two objects are equivalent just in case they denote
the same value.

Jon Harrop

unread,
Aug 18, 2007, 4:58:35 AM8/18/07
to

Except the "mathematical" equality is the structural equality "=", rather
than the physical equality "==":

# (2, 3) = (2, 3);; (* The contents are the same *)
- : bool = true
# (2, 3) == (2, 3);; (* These are two different tuple allocations *)
- : bool = false

In OCaml, "==" is O(1) whereas "=" traverses the data structure (including
mutable parts). Also, "==" corresponds to pointer or word-value equality
and OCaml has a copying collector, so there is no physical total order
function whereas the global "compare" function provides a total order over
general values by traversing them.
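
A toplevel sketch of the difference, for illustration:

# (2, 3) = (2, 4);;               (* structural, O(n) *)
- : bool = false
# compare (2, 3) (2, 4) < 0;;     (* total order, also by traversal *)
- : bool = true
# let t = (2, 3) in t == t;;      (* physical, O(1) *)
- : bool = true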

SML is slightly different because it offers a single equality that starts
with a physical equality and recursively considers data structures but
terminates when it gets to mutable data. SML also adds equality types, to
statically determine when incomparable types are compared and reject the
program with an error.

F# is similar to OCaml but alters "=" to call a member function (the
underlying .NET platform is OO), so it effectively has dictionaries like
Haskell that allow equality to be overridden on a per-type basis.

Haskell's type classes provide the same functionality as F# in the context
of equality, allowing you to override equality for different types but
adding the ability to have overloads dynamically resolved.

There are various trade-offs involved. The OCaml and SML approaches are slim
and fast but, without equality types, OCaml is prone to run-time errors
from comparison functions. Haskell and F# are more robust but more
complicated and slower. In particular, type errors are more obfuscated.

In comparison, Lisp has a venerable zoo of equality functions that serve
many different special cases.

Joachim Durchholz

unread,
Aug 18, 2007, 9:06:54 AM8/18/07
to
dbe...@eecs.wsu.edu schrieb:

> On Aug 15, 5:51 pm, development-2006-8ecbb5cc8aREMOVET...@ANDTHATm-e-
> leypold.de (Markus E.L. 2) wrote:
> ...
>> Not quite so :-).
>>
>> Jon simply has been using the object identity instead of value
>> equality (=).
>
> Thanks. Let's see if I've the right conception: there is a set of
> objects. The (mathematical) equality on objects is said to be object
> identity, written as ==.

Since mathematics in general does not deal with mutable entities, there
is no such thing as "mathematical equality on objects".
(There is most certainly a branch of mathematics that deals with mutable
objects, but that's not what people assume when you're talking about
"mathematical equality".)

The real question is how to apply mathematical equality to mutable
objects. The answer is: it's not possible, you end up with lots of
different equality definitions, none of them really better than the other.
The real problem with transferring the idea of mathematical equality to
mutable data is that sometimes you're interested in an equality that
stays constant over the entire program's running time (identity), and
sometimes you want an equality that operates just on the current
snapshot (equality).
Then there's the problem of representation vs. external views. You can
have different binary representations denoting the same abstract value,
such as UTF-8 vs. UTF-16 - should they be equal or not?
Similar problems happen to identity. Say, you have two proxies to the
same object, then the objects manipulated through the proxies should be
considered identical - but it's difficult to establish that identity
technically, e.g. if one proxy goes through an IP route via Timbuktu and
the other is a simple localhost access.

> Objects have, or better, denote, values.

They *have* them.
"Denoting" is a term that relates syntax to values.

> The value equality, written as =, is an equivalence relation on
> objects such that two objects are equivalent just in case they denote
> the same value.

So equality is the same as equivalence is the same as sameness.

Note that this kind of tail chasing is very typical if you try to define
equality in an axiomatic way. I'm not even sure there's a good way to do it.

Regards,
Jo

ross...@ps.uni-sb.de

unread,
Aug 18, 2007, 11:43:20 AM8/18/07
to
On 18 Aug., 10:58, Jon Harrop <j...@ffconsultancy.com> wrote:
>
> > Thanks. Let's see if I've the right conception: there is a set of
> > objects. The (mathematical) equality on objects is said to be object
> > identity, written as ==. Objects have, or better, denote, values.
> > The value equality, written as =, is an equivalence relation on
> > objects such that two objects are equivalent just in case they denote
> > the same value.
>
> Except the "mathematical" equality is the structural equality "=", rather
> than the physical equality "==":

Neither is a "mathematical" equality in any useful sense. A proper
built-in "mathematical" equality would have to follow Leibnitz'
notion. For PL objects that means, equate exactly those objects that
are semantically equivalent, i.e. behave the same under all
circumstances observable from within (the rest of) the language.

SML's equality does exactly that. Neither of OCaml's operators
achieves it, nor any of the multitude of Lisp's equalities. In the
case of Haskell and F# equality is user-extensible, so you don't have
any semantic guarantees anyway. At least for Haskell, all pre-defined
instances adhere to the above principle, though (don't know about F#).

- Andreas

dbe...@eecs.wsu.edu

unread,
Aug 18, 2007, 3:41:36 PM8/18/07
to
On Aug 18, 1:58 am, Jon Harrop <j...@ffconsultancy.com> wrote:
> ...
> Except the "mathematical" equality is the structural equality "=", rather
> than the physical equality "==" ...

It depends on what one means by "mathematical". Consider an abstract
set. It comes equipped with a (mathematical) notion of equality.

Now consider a set of objects. It has the same notion of equality,
called object identity, written as ==. This is a perfectly fine
notion of equality, it just has to be implemented properly.

Viewing objects this way, the value equality, written =, is an
equivalence relation on objects. It has to be implemented properly as
well.

Lisp, common or otherwise, has a whole zoo of functions called
"equality", but if some residents of the zoo are not equivalence
relations, they are mislabeled.

dbe...@eecs.wsu.edu

unread,
Aug 18, 2007, 5:35:56 PM8/18/07
to
On Aug 18, 1:58 am, Jon Harrop <j...@ffconsultancy.com> wrote:
...
> SML is slightly different because it offers a single equality that starts
> with a physical equality and recursively considers data structures but
> terminates when it gets to mutable data. ...

I think it clearer to start with what Larry Paulson writes about
references (containing mutable data) in his book, <<ML for the Working
Programmer>>: "Two references of the same type are equal precisely
when they denote the same address."

Otherwise, equality is denotational, but not defined on reals, and
recursive over data structures (which may contain references, but not
reals).

Markus E.L. 2

unread,
Aug 18, 2007, 1:56:28 PM8/18/07
to

'rossberg AT ps DOT uni-sb DOT de' wrote:

> SML's equality does exactly that. Neither of OCaml's operators
> achieves it, nor any of the multitude of Lisp's equalities. In the

Out of curiosity: What is the problem with (=) in OCaml?

Regards -- Markus

Neelakantan Krishnaswami

unread,
Aug 19, 2007, 3:36:50 AM8/19/07
to
In article <<yo3ayg2...@hod.lan.m-e-leypold.de>>,
Markus E.L. 2 <development-2006-8...@ANDTHATm-e-leypold.de>
wrote:

>
> Out of curiosity: What is the problem with (=) in OCaml?

The big problem is that it breaks type abstraction. If you have an
abstract type, (=) will happily compare values of that abstract
type. This allows you to distinguish semantically equal values.
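
For concreteness, a small invented OCaml example of how (=) leaks through
an abstraction boundary:

(* An "abstract" set of ints, represented as an unordered list.  The
   signature hides the representation, but polymorphic (=) still sees it. *)
module IntSet : sig
  type t
  val of_list : int list -> t
  val mem : int -> t -> bool
end = struct
  type t = int list
  let of_list l = l
  let mem = List.mem
end

let a = IntSet.of_list [1; 2]
let b = IntSet.of_list [2; 1]      (* the same set, built differently *)
let () =
  Printf.printf "%b\n" (a = b)     (* false: (=) compares the list
                                      representation, not the set *)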


--
Neel R. Krishnaswami
ne...@cs.cmu.edu

Neelakantan Krishnaswami

unread,
Aug 19, 2007, 3:29:15 PM8/19/07
to
In article <<3d7inwf...@hod.lan.m-e-leypold.de>>,

Markus E.L. 2 <development-2006-8...@ANDTHATm-e-leypold.de>
wrote:
>
> The only hope one can entertain here is that the compiler is smart
> enough. But probably you're right: This is one of the effects that
> make Haskell slower.

It also makes it faster. Removing physical equality from the language
means that pointer equality is no longer an observable effect. This
means that various optimizations become allowable:

let p1 = f(e) in
let p2 = f(e) in
whatever

is safely transformable to

let p1 = f(e) in
let p2 = p1 in
whatever

only if there's no observable difference between different calls to
f. If you can test whether the returned value is in the same or
different memory, then you can tell, and so this is no longer a safe
transformation.
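
A toplevel sketch (OCaml, invented example) of why the sharing
transformation becomes observable:

# let f x = Some x;;
val f : 'a -> 'a option = <fun>
# let p1 = f 1 in let p2 = f 1 in p1 == p2;;  (* two separate allocations *)
- : bool = false

If the compiler shared the two calls (let p2 = p1), this would print true
instead, so the rewrite changes the behaviour of programs that use (==).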

Rainer Joswig

unread,
Aug 19, 2007, 3:37:11 PM8/19/07
to
In article <slrnfch6gb...@gs3106.sp.cs.cmu.edu>,
Neelakantan Krishnaswami <ne...@cs.cmu.edu> wrote:

To me physical equality is more useful than the optimization
of your example.

--
http://lispm.dyndns.org

dbe...@eecs.wsu.edu

unread,
Aug 19, 2007, 4:00:57 PM8/19/07
to
On Aug 18, 8:43 am, rossb...@ps.uni-sb.de wrote:
> ... Neither of OCaml's operators
> achieves it, ...

Andreas, could you kindly elaborate? I'm slowly getting the hang of
this (and in the process discovering my decision to stick with SML was
a sound one...)


Jon Harrop

unread,
Aug 19, 2007, 4:19:48 PM8/19/07
to

OCaml provides two equality operators and SML provides one.

OCaml's physical equality tests the identity of two values (typically a
pointer comparison, but it also works on unboxed types) and is O(1). OCaml's
structural equality traverses structures including mutable substructures
and is O(n).

In contrast, SML's only equality traverses only up to references and is
O(n).

Joachim Durchholz

unread,
Aug 19, 2007, 8:27:22 PM8/19/07
to
Rainer Joswig schrieb:

> To me physical equality is more useful than the optimization
> of your example.

You don't need physical equality if you're programming in a pure language.
Even in an impure language, if you program in a mostly pure fashion, you
don't need physical equality.

Regards,
Jo

Rainer Joswig

unread,
Aug 19, 2007, 8:41:45 PM8/19/07
to
In article <faan42$n6f$1...@online.de>,
Joachim Durchholz <j...@durchholz.org> wrote:

Only if you don't run your programs.

Joachim Durchholz

unread,
Aug 19, 2007, 9:09:56 PM8/19/07
to
Rainer Joswig schrieb:

> Joachim Durchholz <j...@durchholz.org> wrote:
>> Rainer Joswig schrieb:
>>> To me physical equality is more useful than the optimization
>>> of your example.
>> You don't need physical equality if you're programming in a pure language.
>> Even in an impure language, if you program in a mostly pure fashion, you
>> don't need physical equality.
>
> Only if you don't run your programs.

Oh come on, you should really know better than that.

Regards,
Jo

Rainer Joswig

unread,
Aug 19, 2007, 9:19:13 PM8/19/07
to
In article <faapjs$rf9$1...@online.de>,
Joachim Durchholz <j...@durchholz.org> wrote:

'You should really know better than that'? What kind of argument is that?

I want to use this feature. I want to use an impure language
that lets me directly use 'physical equality'.

ross...@ps.uni-sb.de

unread,
Aug 19, 2007, 11:44:59 PM8/19/07
to
On Aug 18, 7:56 pm, development-2006-8ecbb5cc8aREMOVET...@ANDTHATm-e-

leypold.de (Markus E.L. 2) wrote:
>
> > SML's equality does exactly that. Neither of OCaml's operators
> > achieves it, nor any of the multitude of Lisp's equalities. In the
>
> Out of curiosity: What is the problem with (=) in OCaml?

The "problem" - if you want to call it that - is that it does not
adhere to observational equivalence in the case of mutable values.

let r1 = ref 1
let r2 = ref 1
r1 = r2 (* true, although they are not equivalent *)

In OCaml, you have to explicitly switch to (==) for mutables - but
what when you compare more complex values, e.g. two int ref lists?
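
For concreteness, a toplevel sketch of the int ref list case:

# let l1 = [ref 1] and l2 = [ref 1];;
val l1 : int ref list = [{contents = 1}]
val l2 : int ref list = [{contents = 1}]
# l1 = l2;;     (* structurally equal, yet updating a cell of l1 leaves l2 alone *)
- : bool = true
# l1 == l2;;    (* identity says no, but no longer looks inside the list *)
- : bool = false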

Jon Harrop

unread,
Aug 20, 2007, 1:56:25 AM8/20/07
to
ross...@ps.uni-sb.de wrote:
> The "problem" - if you want to call it that - is that it does not
> adhere to observational equivalence in the case of mutable values.
>
> let r1 = ref 1
> let r2 = ref 1
> r1 = r2 (* true, although they are not equivalent *)
>
> In OCaml, you have to explicitly switch to (==) for mutables - but
> what when you compare more complex values, e.g. two int ref lists?

In practice, I have never switched to == for mutables so I don't really
understand why this is an issue. I assume it is for purely theoretical
reasons?

Conversely, I find the practical problem of being unable to compare two
values for equality in O(1) very disturbing...

Joachim Durchholz

unread,
Aug 20, 2007, 2:17:07 AM8/20/07
to
Rainer Joswig schrieb:

> In article <faapjs$rf9$1...@online.de>,
> Joachim Durchholz <j...@durchholz.org> wrote:
>
>> Rainer Joswig schrieb:
>>> Joachim Durchholz <j...@durchholz.org> wrote:
>>>> Rainer Joswig schrieb:
>>>>> To me physical equality is more useful than the optimization
>>>>> of your example.
>>>> You don't need physical equality if you're programming in a pure language.
>>>> Even in an impure language, if you program in a mostly pure fashion, you
>>>> don't need physical equality.
>>> Only if you don't run your programs.
>> Oh come on, you should really know better than that.
>>
>> Regards,
>> Jo
>
> 'You should really know better than that'? What kind of argument is that?

A silly one.
Matching that "only if you don't run your programs" tone of yours: give
a concrete argument and we can talk.

> I want to use this feature. I want to use an impure language
> that lets me directly use 'physical equality'.

What for?

Regards,
Jo

Rainer Joswig

unread,
Aug 20, 2007, 2:40:31 AM8/20/07
to
In article <fabbjq$jkc$1...@online.de>,
Joachim Durchholz <j...@durchholz.org> wrote:

> Rainer Joswig schrieb:
> > In article <faapjs$rf9$1...@online.de>,
> > Joachim Durchholz <j...@durchholz.org> wrote:
> >
> >> Rainer Joswig schrieb:
> >>> Joachim Durchholz <j...@durchholz.org> wrote:
> >>>> Rainer Joswig schrieb:
> >>>>> To me physical equality is more useful than the optimization
> >>>>> of your example.
> >>>> You don't need physical equality if you're programming in a pure language.
> >>>> Even in an impure language, if you program in a mostly pure fashion, you
> >>>> don't need physical equality.
> >>> Only if you don't run your programs.
> >> Oh come on, you should really know better than that.
> >>
> >> Regards,
> >> Jo
> >
> > 'You should really know better than that'? What kind of argument is that?
>
> A silly one.
> Matching that "only if you don't run your programs" tone of yours: give
> a concrete argument and we can talk.

The statement expanded is: on a real computer you likely will be able to
observe whether an implementation of an algorithm uses pointer identity
to compare 'things' or not. You will see differences in runtime
and/or memory usage. No, the language may semantically make no
difference, but semantics is not running on a machine (usually). The
implementation of a given language is running on a computer
with some 'speed', some given memory and some given instruction
units. Some people may say, they don't care about that - I do.
I, as a programmer, want to have the ability to use the
concept of object identity explicitly in my software.

> > > I want to use this feature. I want to use an impure language
> > that lets me directly use 'physical equality'.
>
> What for?

I want to be able to check if two 'objects' are the same.
For me (!) this is one of the basics of object-oriented programming.
Mutable objects with identity that can be manipulated by
functions.

>
> Regards,
> Jo

Jon Harrop

unread,
Aug 20, 2007, 2:43:45 AM8/20/07
to
Rainer Joswig wrote:
> I want to be able to check if two 'objects' are the same.

You still can, of course, by emulating physical equality via tagging with
unique integers on all allocations. Even though OCaml bundles physical
equality, there is no physical total order (pointer comparison) because the
GC moves pointers under your feet.
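
A minimal OCaml sketch of that emulation, with invented names:

(* Tag every allocation with a fresh integer: the tag gives a stable
   identity test and even a total order, independent of where the GC
   happens to move the block. *)
type 'a tagged = { id : int; value : 'a }

let fresh =
  let counter = ref 0 in
  fun value -> incr counter; { id = !counter; value = value }

let same a b = a.id = b.id
let order a b = compare a.id b.id

let () =
  let x = fresh 42 and y = fresh 42 in
  assert (same x x && not (same x y))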

Paul Rubin

unread,
Aug 20, 2007, 2:55:52 AM8/20/07
to
Jon Harrop <j...@ffconsultancy.com> writes:
> Conversely, I find the practical problem of being unable to compare two
> values for equality in O(1) very disturbing...

I would like to know whether the formal semantics of ocaml guarantee
that a==a is always true.

Jon Harrop

unread,
Aug 20, 2007, 3:10:33 AM8/20/07
to
Paul Rubin wrote:
> I would like to know whether the formal semantics of ocaml guarantee
> that a==a is always true.

I believe so, yes.

Note that a=a is definitely not always true in OCaml.

Neelakantan Krishnaswami

unread,
Aug 20, 2007, 3:59:24 AM8/20/07
to
In article <<joswig-068F16....@news-europe.giganews.com>>,

That's because it's the simplest possible example which illustrates
the issue -- and even it can turn an exponential time algorithm into a
linear one.

Physical equality makes allocation into a visible side-effect, which
means that any optimization that changes the order of evaluation of
subexpressions stops being semantics-preserving.

More interesting optimizations that rely on similar semantic
equalities are list fusion optimizations, which eliminate intermediate
data structures (eg, map f (map g list) = map (f o g) list).

Another useful optimization that goes out the window with physical
equality is optimizing functions by compiling specialized versions for
particular types, because two different specializations of the same
function will have different physical locations.

Jon Harrop

unread,
Aug 20, 2007, 6:19:31 AM8/20/07
to
Neelakantan Krishnaswami wrote:
> That's because it's the simplest possible example which illustrates
> the issue -- and even it can turn an exponential time algorithm into a
> linear one.

If you move from one Haskell compiler to another, your "optimization" can
even turn a linear time algorithm into an exponential one when the new
compiler doesn't happen to implement that optimization.

I can only assume that people would not depend upon the existence of such
optimizations and that coding that way is considered bad style. Is that
correct?

> Physical equality makes allocation into a visible side-effect, which
> means that any optimization that changes the order of evaluation of
> subexpressions stops being semantics-preserving.

Only if you assume the semantics were defined to begin with, which they were
not.

> Another useful optimization that goes out the window with physical
> equality is optimizing functions by compiling specialized versions for
> particular types, because two different specializations of the same
> function will have different physical locations.

I appreciate this theoretical viewpoint but I think it is not constructive
for two main reasons:

1. Haskell is only fast in theory. In practice, it remains 3x slower.

2. The semantics-breaking optimizations that you say are "out the window",
are in fact done.

For example, constants are hoisted in some cases:

# let f() = Some 3;;
val f : unit -> int option = <fun>
# f() == f();;
- : bool = true

but not others:

# let f x y = Some x;;
val f : 'a -> 'b -> 'a option = <fun>
# let f = f 3 in f 4 == f 5;;
- : bool = false

As you say, this is an observable optimization but physical equality only
returns "definitely equal" or "probably unequal" and there are practical
situations where that is useful, just as there are practical situations
where a "probably prime" function is useful. Surely you would not advocate
getting rid of probably prime functions?

ross...@ps.uni-sb.de

unread,
Aug 20, 2007, 7:59:26 AM8/20/07
to
On Aug 20, 7:56 am, Jon Harrop <j...@ffconsultancy.com> wrote:
>
> > In OCaml, you have to explicitly switch to (==) for mutables - but
> > what when you compare more complex values, e.g. two int ref lists?
>
> In practice, I have never switched to == for mutables so I don't really
> understand why this is an issue. I assume it is for purely theoretical
> reasons?

Well, it makes the semantics more consistent, which generally tends to
avoid nasty surprises and subtle bugs.

> Conversely, I find the practical problem of being unable to compare two
> values for equality in O(1) very disturbing...

I don't understand what you are talking about. Comparison of non-
atomic values can never be O(1), short of requiring the implementation
to do (prohibitively expensive) hash-consing on everything.

- Andreas

ross...@ps.uni-sb.de

unread,
Aug 20, 2007, 9:08:51 AM8/20/07
to
On Aug 20, 3:19 am, Rainer Joswig <jos...@lisp.de> wrote:
> In article <faapjs$rf...@online.de>,

> Joachim Durchholz <j...@durchholz.org> wrote:
>
> > Rainer Joswig schrieb:
> > > Joachim Durchholz <j...@durchholz.org> wrote:
> > >> Rainer Joswig schrieb:
> > >>> To me physical equality is more useful than the optimization
> > >>> of your example.
> > >> You don't need physical equality if you're programming in a pure language.
> > >> Even in an impure language, if you program in a mostly pure fashion, you
> > >> don't need physical equality.
>
> > > Only if you don't run your programs.
>
> > Oh come on, you should really know better than that.
>
> 'You should really know better than that'? What kind of argument is that?

Your statement was pretty much implying that one cannot write actual
programs in a pure language - which I consider to be on the verge of
trolldom. Joachim's reply was accordingly: you know people are writing
real programs in Haskell or Clean, so you know better.

> I want to use this feature. I want to use an impure language
> that lets me directly use 'physical equality'.

That's fine, but doesn't justify derogative remarks on different
approaches.

- Andreas

Joachim Durchholz

unread,
Aug 20, 2007, 10:17:20 AM8/20/07
to
Rainer Joswig schrieb:

> The statement expanded is: on a real computer you likely will be able to
> observe whether an implementation of an algorithm uses pointer identity
> to compare 'things' or not. You will see differences in runtime
> and/or memory usage. No, the language may semantically make no
> difference, but semantics is not running on a machine (usually).

So, basically, you're arguing that because there are cases where the
underlying implementation can undermine an abstraction, the abstraction
is useless in general.

Oh come on. If that were really the case, we'd hand-allocate every byte
in the computer because memory allocation can fail.
Actually, this stance does have its justification in extremely
constrained scenarios.
But it is entirely inapplicable when you're programming in Lisp anyway. Or
in any other language with automatic GC, for that matter.

> The
> implementation of a given language is running on a computer
> with some 'speed', some given memory and some given instruction
> units. Some people may say, they don't care about that - I do.
> I, as a programmer, want to have the ability to use the
> concept of object identity explicitly in my software.

Non sequitur. Having the ability to control memory usage and CPU
instruction mix is largely unrelated to the ability of using the concept
of identity.

Or, to put it another way and get back from this overly abstract mode of
discussion to concrete details: You do have the performance and
memory implications of object identity in every single equality of pure
code. In a pure language, the vast majority of data structures is
shared; there *is* an overhead, but it is orders of magnitude smaller
than one might think from the outside. (Also, shared mutable data
structures don't scale well, the programmer time needed to make sure
that you've got sharing where possible and copying where needed becomes
prohibitive after a point, and when linking to code with unknown
qualities, you don't even know whether you can share so you have to copy
anyway, where a pure interface would allow you to share even across
abstraction boundaries. Also, consider Erlang which does not use sharing
for its message passing, and still is fast enough to use in high-end
Telco switches.)

>>>> I want to use this feature. I want to use an impure language
>>> that lets me directly use 'physical equality'.
>> What for?
>
> I want to be able to check if two 'objects' are the same.
> For me (!) this is one of the basics of object-oriented programming.
> Mutable objects with identity that can be manipulated by
> functions.

Of course, if you *want* mutable objects, you need identity.
But you have failed to demonstrate why you want mutability.

Regards,
Jo

Paul Rubin

unread,
Aug 20, 2007, 10:29:00 AM8/20/07
to
Joachim Durchholz <j...@durchholz.org> writes:
> across abstraction boundaries. Also, consider Erlang which does not
> use sharing for its message passing, and still is fast enough to use
> in high-end Telco switches.)

1. I don't think telco switching is a CPU-intensive application.

2. I get the impression that those Erlang messages tend to be rather
small, e.g. a message saying some phone call has just ended. They use
Mnesia for when large structures are involved, though I guess the
queries are still individually small.

ross...@ps.uni-sb.de

unread,
Aug 20, 2007, 10:42:28 AM8/20/07
to
On Aug 20, 8:40 am, Rainer Joswig <jos...@lisp.de> wrote:
>
> The statement expanded is: on a real computer you likely will be able to
> observe whether an implementation of an algorithm uses pointer identity
> to compare 'things' or not. You will see differences in runtime
> and/or memory usage.

On a real computer, I will also likely be able to observe whether an
algorithm was implemented in hand-optimised assembly code or in Lisp.
Do you necessarily conclude that it is preferable to do the former?

(Besides the fact that the runtime of a language with "semantic"
equality is by no means prevented from using pointer equality as an
optimization - it will certainly do.)

- Andreas

Joachim Durchholz

unread,
Aug 20, 2007, 10:56:58 AM8/20/07
to
Jon Harrop schrieb:

> Neelakantan Krishnaswami wrote:
>> That's because it's the simplest possible example which illustrates
>> the issue -- and even it can turn an exponential time algorithm into a
>> linear one.
>
> If you move from one Haskell compiler to another, your "optimization" can
> even turn a linear time algorithm into an exponential one when the new
> compiler doesn't happen to implement that optimization.
>
> I can only assume that people would not depend upon the existence of such
> optimizations and that coding that way is considered bad style. Is that
> correct?

Well, sort of. There are no clear answers in that direction.

For example, all modern FPLs assume tail recursion optimization.
Expressing loops as recursion becomes too impractical without that
optimization. (Aside note: in practice, you get tail *call* optimization
which is just as simple to implement and optimizes a few other
cases, but TRO would be sufficient.)

There's some active research about further optimization techniques. Once
a technique is researched well enough, it quickly becomes mainstream in
all compilers, so programmers can assume it in their code and start to
write idioms that would have been prohibitively expensive without it.

This isn't too different from C, actually. There are compilers out there
that do array bounds checking (even for arrays that are passed as
pointers over the stack: every array gets something like its own address
space on malloca(), so the run-time system can do the checking).
Of course, if you know your compilers do this kind of checking, this has
an impact on what kind of code one should write.
The main difference is that the best that you can do with C is micro
optimizations, where in an FPL, you can do real code transformations
that can change the big-Oh complexity of the code.

(Personally, I think the kind of big-Oh-impacting optimizations
available should be made part of the language or otherwise transparent,
but that's just my personal stance.)

>> Physical equality makes allocation into a visible side-effect, which
>> means that any optimization that changes the order of evaluation of
>> subexpressions stops being semantics-preserving.
>
> Only if you assume the semantics were defined to begin with, which they were
> not.

???

Haskell even has a formal semantics (not a complete one, but enough to
be useful).

>> Another useful optimization that goes out the window with physical
>> equality is optimizing functions by compiling specialized versions for
>> particular types, because two different specializations of the same
>> function will have different physical locations.
>
> I appreciate this theoretical viewpoint but I think it is not constructive
> for two main reasons:
>
> 1. Haskell is only fast in theory. In practice, it remains 3x slower.

Well, that's certainly interesting for some applications, but it's not
necessarily interesting for others.

Also, that speed difference isn't as clear as you make it out to be. The
numbers certainly come out differently if you ask about the performance
achievable with a given, limited programmer time budget.
It's the old assembly argument: if you want real speed, you can always
drop down to assembly, but most of the time, the advantages in
programmer speed allow you to do algorithmic optimizations that the
lower-level language cannot afford because getting the code right is
difficult enough.

> 2. The semantics-breaking optimizations that you say are "out the window",
> are in fact done.
>
> For example, constants are hoisted in some cases:
>
> # let f() = Some 3;;
> val f : unit -> int option = <fun>
> # f() == f();;
> - : bool = true
>
> but not others:
>
> # let f x y = Some x;;
> val f : 'a -> 'b -> 'a option = <fun>
> # let f = f 3 in f 4 == f 5;;
> - : bool = false

Equality for functions is undecidable, that's why you don't get it.

> As you say, this is an observable optimization but physical equality only
> returns "definitely equal" or "probably unequal" and there are practical
> situations where that is useful, just as there are practical situations
> where a "probably prime" function is useful. Surely you would not advocate
> getting rid of probably prime functions?

I've always been feeling uneasy around these.
I can see their advantages, but I'd really be happy if somebody found
algorithms that gave something better than probabilities. (I *know* that
the probabilities are far better than the reliability of the hardware
the algorithms are running on. It's just that stacking probabilities is
admissible only if they are independent, and I consider the probability
that there's an overlooked dependency far higher than the fault
probability of the hardware.)

Regards,
Jo

ross...@ps.uni-sb.de

unread,
Aug 20, 2007, 11:31:18 AM8/20/07
to
On Aug 20, 4:56 pm, Joachim Durchholz <j...@durchholz.org> wrote:
>
> (Aside note: in practice, you get tail *call* optimization
> which is just as simple to implement and optimizes a few other
> cases, but TRO would be sufficient.)

I don't think so. TRO (which usually means self-recursion) is
insufficient as soon as you are dealing with mutual recursion. And you
often do when you have mutually recursive datatypes, for example.
Also, certain combinators would probably be less useful if they didn't
tail-call their arguments.
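
For concreteness, a small OCaml illustration of the mutual-recursion case
(contrived, of course):

(* Neither call is self-recursive, so optimising only self tail calls
   would still grow the stack; general tail-call optimisation runs
   this in constant space. *)
let rec even n = if n = 0 then true else odd (n - 1)
and odd n = if n = 0 then false else even (n - 1)

let () = assert (even 10_000_000)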

- Andreas

Rainer Joswig

unread,
Aug 20, 2007, 11:43:21 AM8/20/07
to
In article <1187615331.6...@q4g2000prc.googlegroups.com>,
ross...@ps.uni-sb.de wrote:

> On Aug 20, 3:19 am, Rainer Joswig <jos...@lisp.de> wrote:
> > In article <faapjs$rf...@online.de>,
> > Joachim Durchholz <j...@durchholz.org> wrote:
> >
> > > Rainer Joswig schrieb:
> > > > Joachim Durchholz <j...@durchholz.org> wrote:
> > > >> Rainer Joswig schrieb:
> > > >>> To me physical equality is more useful than the optimization
> > > >>> of your example.
> > > >> You don't need physical equality if you're programming in a pure language.
> > > >> Even in an impure language, if you program in a mostly pure fashion, you
> > > >> don't need physical equality.
> >
> > > > Only if you don't run your programs.
> >
> > > Oh come on, you should really know better than that.
> >
> > 'You should really know better than that'? What kind of argument is that?
>
> Your statement was pretty much implying that one cannot write actual
> programs in a pure language - which I consider being on the verge of
> trolldom. Joachim's reply was accordingly: you know people are writing
> real programs in Haskell or Clean, so you know better.

Calm down. I never said that.

>
> > I want to use this feature. I want to use an impure language
> > that lets me directly use 'physical equality'.
>
> That's fine, but doesn't justify derogative remarks on different
> approaches.

Look, "if you program in a mostly pure fashion, you don't need physical equality".

That's plain wrong in the general case. Sure you can write
programs and don't need it. But that's not true for all programs
and all problems.

>
> - Andreas

--
http://lispm.dyndns.org

Joachim Durchholz

unread,
Aug 20, 2007, 12:02:48 PM8/20/07
to
Paul Rubin schrieb:

> Joachim Durchholz <j...@durchholz.org> writes:
>> across abstraction boundaries. Also, consider Erlang which does not
>> use sharing for their message passing, and still is fast enough to use
>> in high-end Telco switches.)
>
> 1. I don't think telco switching is a CPU-intensive application.

You're wrong. Telco switches are very much like IP switches: shoveling
data through at maximum speed, plus doing all sorts of bookkeeping in
the background.
I'd expect Telco switches to be more complicated than IP switches
because they have to manage IP *and* Telco protocols.

(IIRC Ulf Wiger has reported that in an AXP switch, the Erlang side of
things are 50% of the LoC and 80% of the functionality. The rest is C
plus a few other languages.)

> 2. I get the impression that those Erlang messages tend to be rather
> small, e.g. a message saying some phone call has just ended.

That's my impression, too.
Of course you don't send multi-megabyte messages if you know they're
going to be copied! You structure the application differently.
I'd also think that the actual data that goes over the wire isn't much
processed inside the Erlang code; it would be far more advantageous to
keep it on the coprocessor boards that are attached to the wire, and
just instruct the boards what to do with the data.

> They use
> Mnesia for when large structures are involved, though I guess the
> queries are still individually small.

AFAIK Mnesia is used for handling configuration data.
I don't think it's being used for achieving persistence, because these
switches don't have any real downtime (six-sigma means that the switch
mail fail for roughly half a minute per year, including maintenance
periods).

BTW message sending is an impure operation in Erlang. (As are most
Mnesia operations, of course.)

Regards,
Jo

Joachim Durchholz

unread,
Aug 20, 2007, 12:11:55 PM8/20/07
to
Rainer Joswig schrieb:

> Look, "if you program in a mostly pure fashion, you don't need physical equality".
>
> That's plain wrong in the general case. Sure you can write
> programs and don't need it. But that's not true for all programs
> and all problems.

Nope. In a pure (side-effect-free) program, equality and identity are
the same concept.
E.g. integers are immutable (there's no way to assign 5 to 7). Integers
are either equal (then they are also identical), or they are unequal -
this is exactly because they are immutable.

Of course, you could have two immutable data structures that represent
the same value. You may be tempted to say that the two are equal but not
identical - however, this will gain you little (since, for all intents
and purposes, the two data structures are equal after all!), but it will
break equational reasoning since equals may not behave equally anymore
(some code that you submit the data structures to may distinguish which
of the two it got and give different results - heck, you *certainly*
have such code because otherwise you wouldn't want an identity operation).

Regards,
Jo

Joachim Durchholz

unread,
Aug 20, 2007, 12:14:16 PM8/20/07
to
ross...@ps.uni-sb.de schrieb:

> On Aug 20, 4:56 pm, Joachim Durchholz <j...@durchholz.org> wrote:
>> (Aside note: in practice, you get tail *call* optimization
>> which is just as simple to implement and optimizes a few other
>> cases, but TRO would be sufficient.)
>
> I don't think so. TRO (which usually means self-recursion) is
> insufficient as soon as you are dealing with mutual recursion.

OK. I'd have subsumed mutual recursion under TRO.
(Of course, to handle that case, you need to construct a call graph to
find the recursions, turning TRO into a global optimization while TCO
can be done locally. OTOH TCO is more difficult to implement on some
architectures because you need to manipulate the stack - e.g. the JVM
doesn't allow this.)

Regards,
Jo

ross...@ps.uni-sb.de

unread,
Aug 20, 2007, 12:20:32 PM8/20/07
to
On Aug 20, 5:43 pm, Rainer Joswig <jos...@lisp.de> wrote:
>
> Look, "if you program in a mostly pure fashion, you don't need physical equality".
>
> That's plain wrong in the general case. Sure you can write
> programs and don't need it. But that's not true for all programs
> and all problems.

What general case? "You don't need pointer arithmetics" is equally
wrong in the "general case". Still, Lisp does pretty well without it.
So what is your point?

You seem to be generalising from a very specific point of view of what
your own problem domains and/or personal preferences are.

In a pure language, you simply cannot have physical equality. Yet
people other than you manage to write decent stuff in something like
Haskell. Thus, Joachim's statement cannot be "plain wrong".

- Andreas

Jon Harrop

unread,
Aug 20, 2007, 12:27:53 PM8/20/07
to
ross...@ps.uni-sb.de wrote:
> On Aug 20, 7:56 am, Jon Harrop <j...@ffconsultancy.com> wrote:
>>
>> > In OCaml, you have to explicitly switch to (==) for mutables - but
>> > what when you compare more complex values, e.g. two int ref lists?
>>
>> In practice, I have never switched to == for mutables so I don't really
>> understand why this is an issue. I assume it is for purely theoretical
>> reasons?
>
> Well, it makes the semantics more consistent, which generally tends to
> avoid nasty surprises and subtle bugs.

Can you give an example of such a bug?

>> Conversely, I find the practical problem of being unable to compare two
>> values for equality in O(1) very disturbing...
>
> I don't understand what you are talking about. Comparison of non-
> atomic values can never be O(1), short of requiring the implementation
> to do (prohibitively expensive) hash-consing on everything.

Physical equality is essentially a fuzzy equal/unknown test that takes O(1),
allowing you to short-circuit certain operations as an optimization only
when physical equality demonstrates that two values are equal because they
are physically identical. If you don't have physical equality then the
alternatives are somewhat ugly.

Can you translate my sort into SML, retaining the optimization that any
already-sorted tail lists incur no allocations?

# let rec sort = function
    | [] -> []
    | h1::t as list -> match sort t with
        | h2::t when h1>h2 -> h2::sort(h1::t)
        | t' -> if t==t' then list else h1::t';;
val sort : 'a list -> 'a list = <fun>

Rainer Joswig

unread,
Aug 20, 2007, 1:31:27 PM8/20/07
to
In article <1187626832.0...@i38g2000prf.googlegroups.com>,
ross...@ps.uni-sb.de wrote:

> On Aug 20, 5:43 pm, Rainer Joswig <jos...@lisp.de> wrote:
> >
> > Look, "if you program in a mostly pure fashion, you don't need physical equality".
> >
> > That's plain wrong in the general case. Sure you can write
> > programs and don't need it. But that's not true for all programs
> > and all problems.
>
> What general case?

I want to be able to compare objects by identity. To optimize
a program for space and time. Me. Not the compiler.

> "You don't need pointer arithmetics" is equally
> wrong in the "general case". Still, Lisp does pretty well without it.
> So what is your point?

How are both related? What is your point?

> You seem to be generalising from a very specific point of view of what
> your own problem domains and/or personal preferences are.

How is that different from your view? Do you have a better view?
Is it more valid?

> In a pure language, you simply cannot have physical equality. Yet
> people other than you manage to write decent stuff in something like
> Haskell. Thus, Joachim's statement cannot be "plain wrong".

You can write decent stuff in assembler. What is your point?

--
http://lispm.dyndns.org

Jon Harrop

unread,
Aug 20, 2007, 1:38:41 PM8/20/07
to
Joachim Durchholz wrote:
> The main difference is that the best that you can do with C is micro
> optimizations, where in an FPL, you can do real code transformations
> that can change the big-Oh complexity of the code.

That is just repeating the previous statement.

> (Personally, I think the kind of big-Oh-impacting optimizations
> available should be made part of the language or otherwise transparent,
> but that's just my personal stance.)

Exactly. I see no merit in being pedantic about formal definitions only to
leave anything practically-important undefined.

>>> Physical equality makes allocation into a visible side-effect, which
>>> means that any optimization that changes the order of evaluation of
>>> subexpressions stops being semantics-preserving.
>>
>> Only if you assume the semantics were defined to begin with, which they
>> were not.
>
> ???
>
> Haskell even has a formal semantics (not a complete one, but enough to
> be useful).

OCaml does not. Hence an argument that an optimization is
not "semantics-preserving" is a non-starter in the context of OCaml.

>> 1. Haskell is only fast in theory. In practice, it remains 3x slower.
>
> Well, that's certainly interesting for some applications, but it's not
> necessarily interesting for others.
>
> Also, that speed difference isn't as clear as you make it out to be.

Haskell is >3x slower than the fastest languages on all of these benchmarks:

http://www.ffconsultancy.com/languages/ray_tracer/results.html
http://shootout.alioth.debian.org/sandbox/benchmark.php?test=fannkuch&lang=all
http://shootout.alioth.debian.org/sandbox/benchmark.php?test=fasta&lang=all
http://shootout.alioth.debian.org/sandbox/benchmark.php?test=knucleotide&lang=all
http://shootout.alioth.debian.org/sandbox/benchmark.php?test=regexdna&lang=all
http://shootout.alioth.debian.org/sandbox/benchmark.php?test=spectralnorm&lang=all

> The
> numbers certainly come out differently if you ask about the performance
> achievable with a given, limited programmer time budget.

I don't buy that for a second. Look at these results for the regex-dna
benchmark running on my machine:

Python: 27LOC 1.67s
OCaml: 41LOC 3.770s
C++: 226LOC 3.941s
Haskell: 274LOC 6.738s

If it isn't bad enough that Haskell is 4x slower than Python, it is even
substantially more verbose than C++. If that doesn't convince you that
Haskell is far from God's gift to development time maybe this 500kb web
page discussing Haskell implementations of that trivial benchmark will:

http://www.haskell.org/haskellwiki/Shootout/Regex_DNA

> It's the old assembly argument: if you want real speed, you can always
> drop down to assembly, but most of the time, the advantages in
> programmer speed allow you to do algorithmic optimizations that the
> lower-level language cannot afford because getting the code right is
> difficult enough.

That is just another reason why Haskell is fast in theory.

>> For example, constants are hoisted in some cases:
>>
>> # let f() = Some 3;;
>> val f : unit -> int option = <fun>
>> # f() == f();;
>> - : bool = true
>>
>> but not others:
>>
>> # let f x y = Some x;;
>> val f : 'a -> 'b -> 'a option = <fun>
>> # let f = f 3 in f 4 == f 5;;
>> - : bool = false
>
> Equality for functions is undecidable, that's why you don't get it.

Those were option values being compared (e.g. f 3 4 -> Some 3), not
functions.

>> As you say, this is an observable optimization but physical equality only
>> returns "definitely equal" or "probably unequal" and there are practical
>> situations where that is useful, just as there are practical situations
>> where a "probably prime" function is useful. Surely you would not
>> advocate getting rid of probably prime functions?
>
> I've always been feeling uneasy around these.
> I can see their advantages, but I'd really be happy if somebody found
> algorithms that gave something better than probabilities. (I *know* that
> the probabilities are far better than the reliability of the hardware
> the algorithms are running on. It's just that stacking probabilities is
> admissible only if they are independent, and I consider the probability
> that there's an overlooked dependency far higher than the fault
> probability of the hardware.)

I certainly agree that deterministic proof is preferable to probabilistic
likelihood. But when proofs run dry and probabilities can yield practical
benefits, I like to be able to use them. Speaking from a practical point of
view, SML and Haskell both have features that I desire, but abandoning
physical equality is not among them.

Joachim Durchholz

unread,
Aug 20, 2007, 2:31:42 PM8/20/07
to
Jon Harrop schrieb:

> Joachim Durchholz wrote:
>> Haskell even has a formal semantics (not a complete one, but enough to
>> be useful).
>
> OCaml does not. Hence an argument that an optimization is
> not "semantics-preserving" is a non-starter in the context of OCaml.

Ah, I understand the context now.

Still, arguing about semantics-preserving transformations does make
sense. Such arguments are far more difficult to resolve for a language
without a formal semantics, of course, but that doesn't mean it is
irrelevant. (After all, the conversion of source code to machine code is
a semantics-preserving transformation, too, otherwise nobody would
program in the language.)

>>> 1. Haskell is only fast in theory. In practice, it remains 3x slower.
>> Well, that's certainly interesting for some applications, but it's not
>> necessarily interesting for others.
>>
>> Also, that speed difference isn't as clear as you make it out to be.
>
> Haskell is >3x slower than the fastest languages on all of these benchmarks:
>
> http://www.ffconsultancy.com/languages/ray_tracer/results.html
> http://shootout.alioth.debian.org/sandbox/benchmark.php?test=fannkuch&lang=all
> http://shootout.alioth.debian.org/sandbox/benchmark.php?test=fasta&lang=all
> http://shootout.alioth.debian.org/sandbox/benchmark.php?test=knucleotide&lang=all
> http://shootout.alioth.debian.org/sandbox/benchmark.php?test=regexdna&lang=all
> http://shootout.alioth.debian.org/sandbox/benchmark.php?test=spectralnorm&lang=all

Yes, but none of these benchmarks take programmer time into account (nor
are they geared towards this kind of benchmarking).

>> The
>> numbers certainly come out differently if you ask about the performance
>> achievable with a given, limited programmer time budget.
>
> I don't buy that for a second.

*shrug* look at the outcome of ICFP contests.
There are other references. I dimly remember a test run by DARPA where
they asked different programmers to program a small task in their
respective standard languages. Haskell came out with impressive results,
including timing (and that's years ago, when optimizations weren't as
sophisticated as they are today). Of course, those guys didn't care
about a factor of 3, they cared about big-Oh differences.

> Look at these results for the regex-dna
> benchmark running on my machine:
>
> Python: 27LOC 1.67s
> OCaml: 41LOC 3.770s
> C++: 226LOC 3.941s
> Haskell: 274LOC 6.738s
>
> If it isn't bad enough that Haskell is 4x slower than Python, it is even
> substantially more verbose than C++.

*shrug* proves very little. There's always the odd problem that's
difficult to encode in any given language. More often, it's just a C++
programmer writing C++ code in another language.

> If that doesn't convince you that
> Haskell is far from God's gift to development time maybe this 500kb web
> page discussing Haskell implementations of that trivial benchmark will:
>
> http://www.haskell.org/haskellwiki/Shootout/Regex_DNA

*shrug* I have seen enough concise Haskell code that no C++ could match.

>> It's the old assembly argument: if you want real speed, you can always
>> drop down to assembly, but most of the time, the advantages in
>> programmer speed allow you to do algorithmic optimizations that the
>> lower-level language cannot afford because getting the code right is
>> difficult enough.
>
> That is just another reason why Haskell is fast in theory.

Just compare the team notes from the various ICFP contests. Halfway
through the contest, C++ teams usually report about nasty pointer bugs
that took them half a day to chase down, while the Haskellistas usually
report "almost done" and start to think about algorithmic optimizations.

One of the contests did raytracing. IIRC the first three places were
taken by Haskell and OCaml teams.
That's a general trend. OCaml tends to be faster, but the difference
isn't so large that there's a clear-cut advantage. For those tasks where
correctness was important (raytracing, finding a way through a
labyrinth), Haskell tended to be more successful, otherwise, OCaml could
play out its speed advantages.

In general, I don't think that speed is half as important as you make it
out to be. Given enough time to apply all optimizations, OCaml could
well be faster than Haskell; on the other hand, since programmer time is
limited, it may still be better to work in Haskell because you get the
web interface, the mail library, and the database access layer done and
optimized while the hapless C++ programmers just got around the last
pointer bugs and are just starting to think about optimizing the
intersection finding algorithm. (I'm exaggerating, of course, and you're
more interested in OCaml than in C++ anyway, but I do think if getting
the algorithms done is more important than getting every algorithm fast,
then Haskell is probably the more productive language.)

Regards,
Jo

Joachim Durchholz

unread,
Aug 20, 2007, 2:38:46 PM8/20/07
to
Jon Harrop schrieb:

> Physical equality is essentially a fuzzy equal/unknown test that takes O(1),
> allowing you to short-circuit certain operations as an optimization only
> when physical equality demonstrates that two values are equal because they
> are physically identical. If you don't have physical equality then the
> alternatives are somewhat ugly.

The run-time system will use address equality to speed up the
comparison, of course.
In an immutable language, you typically organize the computations in a
way that equal values derive from the same computation anyway, so they
are shared. (Determining that two independently create values are equal
should take the same amount of time, whether you're in a mutable or an
immutable language.)

Regards,
Jo

Markus E.L. 2

unread,
Aug 19, 2007, 11:50:06 AM8/19/07
to

Neelakantan Krishnaswami wrote:

> In article <<yo3ayg2...@hod.lan.m-e-leypold.de>>,
> Markus E.L. 2 <development-2006-8...@ANDTHATm-e-leypold.de>
> wrote:
>>
>> Out of curiosity: What is the problem with (=) in OCaml?
>
> The big problem is that it breaks type abstraction. If you have an
> abstract type, (=) will happily compare values of that abstract
> type. This allows you to distinguish semantically equal values.

Whoops. You're right. That is not so good. On the other hand, the
implementer of an abstract type is not barred from providing a
"proper" compare function. Some posts ago somebody said that SML does
it right. How would I define/overload an abstract type comparison in
SML?

Regards -- Markus

Markus E L

unread,
Aug 19, 2007, 9:54:32 PM8/19/07
to

Rainer Joswig wrote:

> In article <faapjs$rf9$1...@online.de>,
> Joachim Durchholz <j...@durchholz.org> wrote:
>
>> Rainer Joswig schrieb:
>> > Joachim Durchholz <j...@durchholz.org> wrote:
>> >> Rainer Joswig schrieb:
>> >>> To me physical equality is more useful than the optimization
>> >>> of your example.
>> >> You don't need physical equality if you're programming in a pure language.
>> >> Even in an impure language, if you program in a mostly pure fashion, you
>> >> don't need physical equality.
>> >
>> > Only if you don't run your programs.
>>
>> Oh come on, you should really know better than that.
>>
>> Regards,
>> Jo
>
> 'You should really know better than that'? What kind of argument is that?

Come on, what argument is "To me physical equality is more useful than
the optimization of your example"? And that in
response to a contribution that in no regard tried to portray
optimization as the _only_ application of physical equality?

> I want to use this feature. I want to use an impure language
> that lets me directly use 'physical equality'.

Looking for new trouble, aren't you? Bah!

-- Markus

Markus E L

unread,
Aug 20, 2007, 9:50:09 AM8/20/07
to

Jon Harrop wrote:

> If you move from one Haskell compiler to another, your "optimization" can
> even turn a linear time algorithm into an exponential one when the new
> compiler doesn't happen to implement that optimization.
>
> I can only assume that people would not depend upon the existence of such
> optimizations and that coding that way is considered bad style. Is that
> correct?

Yes, basically. Coding a compiler which does not have the aforesaid
optimizations is considered bad style. ;-)


> I appreciate this theoretical viewpoint but I think it is not constructive
> for two main reasons:
>
> 1. Haskell is only fast in theory. In practice, it remains 3x slower.

Which is still fast for what it does.

> 2. The semantics-breaking optimizations that you say are "out the window",
> are in fact done.


> For example, constants are hoisted in some cases:
>
> # let f() = Some 3;;
> val f : unit -> int option = <fun>
> # f() == f();;
> - : bool = true

That's the other direction: Folding "different" objects to the
same.
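
A toplevel-style sketch of the two directions (OCaml; the exact results
are compiler-dependent, since for immutable values the language only
guarantees that e1 == e2 implies structural equality, never the reverse):

(* Splitting: two separately constructed but structurally equal values
   normally live in different blocks. *)
let a = Some 3
let b = Some 3
let split = a == b          (* typically false, but unspecified *)

(* Folding: the compiler may share one block for repeated constants,
   which is what the quoted  f () == f ()  session shows. *)
let f () = Some 3
let folded = f () == f ()   (* true when the constant is hoisted *)

(* The only portable reading of (==) on immutable data:
   if x == y then x = y; nothing follows when x == y is false. *)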

Regards -- Markus

Markus E L

unread,
Aug 19, 2007, 6:54:13 PM8/19/07
to

Rainer Joswig wrote:

> In article <slrnfch6gb...@gs3106.sp.cs.cmu.edu>,
> Neelakantan Krishnaswami <ne...@cs.cmu.edu> wrote:
>
>> In article <<3d7inwf...@hod.lan.m-e-leypold.de>>,
>> Markus E.L. 2 <development-2006-8...@ANDTHATm-e-leypold.de>
>> wrote:
>> >
>> > The only hope one can entertain here is that the compiler is smart
>> > enough. But probably you're right: This is one of the effects that
>> > make Haskell slower.
>>
>> It also makes it faster. Removing physical equality from the language
>> means that pointer equality is no longer an observable effect. This
>> means that various optimizations become allowable:
>>
>> let p1 = f(e) in
>> let p2 = f(e) in
>> whatever
>>
>> is safely transformable to
>>
>> let p1 = f(e) in
>> let p2 = p1 in
>> whatever
>>
>> only if there's no observable difference between different calls to
>> f. If you can test whether the returned value is in the same or
>> different memory, then you can tell, and so this is no longer a safe
>> transformation.
>
> To me physical equality is more useful than the optimization
> of your example.
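
To make Neelakantan's quoted transformation above concrete, a small OCaml
sketch (the function fresh is invented for the example; refs stand in for
any observable allocation): once (==) can see where a value was allocated,
merging the two calls changes the program's answer.

(* A pure-looking function that nevertheless allocates a new cell. *)
let fresh x = ref x

let p1 = fresh 0
let p2 = fresh 0

let same_value    = (p1 = p2)    (* true: the contents are equal     *)
let same_identity = (p1 == p2)   (* false: two distinct allocations  *)

(* If the compiler rewrote  let p2 = fresh 0  into  let p2 = p1
   (the transformation quoted above), same_identity would flip to true,
   so the optimization is observable and therefore not allowed. *)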


Not to contest that statement, but it would be rather informative
if you elaborated on the ways in which it is more useful to you. The only
thing I learned from your response was: it's more useful to Rainer
Joswig. Could it be more useful to other people as well? What are the
circumstances?

(And no: if that would only start another "discussion" like what is now
being called the "Harrop wars", then kindly leave it.)

Regards -- Markus

Markus E L

unread,
Aug 20, 2007, 12:18:52 PM8/20/07
to

Rainer Joswig wrote:
>>
>> > I want to use this feature. I want to use an impure language
>> > that lets me directly use 'physical equality'.
>>
>> That's fine, but doesn't justify derogative remarks on different
>> approaches.
>
> Look, "if you program in a mostly pure fashion, you don't need physical equality".
>
> That's plain wrong in the general case. Sure you can write
> programs and don't need it. But that's not true for all programs
> and all problems.

Looks like you overlooked "if you program in a mostly pure
fashion". This is rather different from "all programs". (Though the
jury is still out on whether you can tackle all problems in a "pure
fashion"; considering the example of the Haskell IO monad, I'm
tempted to say you can.)

- M


Paul Rubin

unread,
Aug 20, 2007, 3:48:50 PM8/20/07
to
Joachim Durchholz <j...@durchholz.org> writes:
> > 1. I don't think telco switching is a CPU-intensive application.
>
> You're wrong. Telco switches are very much like IP switches: shoveling
> data through at maximum speed, plus doing all sorts of bookkeeping in
> the background.

Sure, but the CPU never sees that high-speed data, at least in the ATM
switch I worked on. The data shovelling was all done by FPGAs. The
CPU (a fairly wimpy 68020-series part in that specific example) only
handled the bookkeeping stuff, which was very low-speed by comparison.

> Telco switches are probably more complicated because they have to
> manage IP *and* Telco protocols.

Right, but again, the application code isn't especially speed
intensive. All that low level stuff (including routing) is done by
the hardware and the CPU never sees it.

> (IIRC Ulf Wiger has reported that in an AXP switch, the Erlang side of
> things are 50% of the LoC and 80% of the functionality. The rest is C
> plus a few other languages.)

I wonder how much of the C code is in C for speed rather than because
it has to poke device registers or something like that.

> AFAIK Mnesia is used for handling configuration data.
> I don't think it's being used for achieving persistence, because these
> switches don't have any real downtime (six-sigma means that the switch
> may fail for roughly half a minute per year, including maintenance
> periods).

What about stuff like call accounting?

Chris Smith

unread,
Aug 20, 2007, 5:13:40 PM8/20/07
to
Rainer Joswig <jos...@lisp.de> wrote:
> > > That's plain wrong in the general case. Sure you can write
> > > programs and don't need it. But that's not true for all programs
> > > and all problems.
> >
> > What general case?
>
> I want to be able to compare objects by identity. To optimize
> a program for space and time. Me. Not the compiler.

I suppose most people here are looking for a more sensible position
underneath what you're saying.

Taken at face value, the answer is simple: if it just happens to give
you pure pleasure to compare objects for identity, then go use a
language that lets you do it. Just don't make any claims, on that
basis, regarding what languages ought to do. Your goal, to compare
objects for identity yourself, is not a rational one in its own right.

If your desire were to write programs with decent performance, then one
could discuss whether certain language design choices supported that
desire; but apparently you are now precluding that discussion.

> > "You don't need pointer arithmetics" is equally
> > wrong in the "general case". Still, Lisp does pretty well without it.
> > So what is your point?
>
> How are both related? What is your point?

How are they related? They are both low-level programming techniques
that can be used to improve performance. They are both completely
unnecessary from the standpoint of language semantics. They are both
techniques that good compilers can apply behind the scenes in some
cases, but where a human can probably, with some effort, still beat
the compiler.

Actually, it's a remarkably good analogy. It's hard to believe you
don't see the similarity. If you don't think it fits for some
particular reason, I suppose you'd better say why.

--
Chris Smith
