[Caml-list] OCaml troll on Slashdot

Karl Zilles

unread,

Mar 14, 2005, 8:34:59 PM3/14/05

to Caml Mailing List

http://developers.slashdot.org/article.pl?sid=05/03/14/2258219

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Oliver Bandel

unread,

Mar 15, 2005, 3:45:07 AM3/15/05

to caml...@inria.fr

On Mon, Mar 14, 2005 at 05:29:42PM -0800, Karl Zilles wrote:
> http://developers.slashdot.org/article.pl?sid=05/03/14/2258219
[...]

Well, I used the code that is provided there.
And it runs on my computer not 16 minutes...

=====================
(...)
4 by 10
10 empty cells:
XXOX
OXXX
XXXO
XOXX
XXXO
OXXX
XXOX
OXXX
XXXO
XOXX

real 0m10.677s
user 0m9.440s
sys 0m0.040s
=====================
(with ocamlc)

=====================
real 0m7.583s
user 0m6.390s
sys 0m0.020s
(with ocamlopt)
=====================

So the whole discussion is stiupid...

Ciao,
Oliver

Michael Vanier

unread,

Mar 15, 2005, 3:55:03 AM3/15/05

to oli...@first.in-berlin.de, caml...@inria.fr

Maybe he hasn't discovered ocamlopt yet.

Mike

> Date: Tue, 15 Mar 2005 09:32:43 +0100
> From: Oliver Bandel <oli...@first.in-berlin.de>

Jon Harrop

unread,

Mar 15, 2005, 4:09:45 AM3/15/05

to caml...@yquem.inria.fr

On Tuesday 15 March 2005 08:45, Michael Vanier wrote:
> Maybe he hasn't discovered ocamlopt yet.

No, the OCaml code (compiled with ocamlopt) is much, much slower than the C++.
As we all know, this can mean only one thing...

Also, his C++ is actually shorter than the OCaml, and the OCaml defines a lot
of familiar looking functions (map, length, "prepend" etc.).

I'll take a more detailed look in a minute...

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

Richard Jones

unread,

Mar 15, 2005, 4:35:09 AM3/15/05

to Caml Mailing List

On Mon, Mar 14, 2005 at 05:29:42PM -0800, Karl Zilles wrote:
> http://developers.slashdot.org/article.pl?sid=05/03/14/2258219

The problem was with his stupid implementation of a hash table:

http://developers.slashdot.org/comments.pl?sid=142494&cid=11939530

(That comment is +5 informative)

Rich.

--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com

YANG Shouxun

unread,

Mar 15, 2005, 5:17:08 AM3/15/05

to caml...@yquem.inria.fr

On Tuesday 15 March 2005 17:25, Richard Jones wrote:
> On Mon, Mar 14, 2005 at 05:29:42PM -0800, Karl Zilles wrote:
> > http://developers.slashdot.org/article.pl?sid=05/03/14/2258219
>
> The problem was with his stupid implementation of a hash table:
>
> http://developers.slashdot.org/comments.pl?sid=142494&cid=11939530
>
> (That comment is +5 informative)

No. My experiments show that the OCaml implementation performs far better than
the C++ implementation when the column and row get larger:

C++ is compiled with -O3, not sure what is the better optimization level,
while OCaml version (actually I used Sys.argv and references to the two
parameters) is compiled with ocamlopt

4x4 c++
real 0m0.003s
user 0m0.002s
sys 0m0.002s

4x4 ocaml
real 0m0.177s
user 0m0.137s
sys 0m0.001s

8x8 c++
real 0m8.703s
user 0m7.000s
sys 0m0.018s

8x8 ocaml
real 0m0.747s
user 0m0.485s
sys 0m0.002s

12x12 c++
the process was killed by the OS

12x12 ocaml
real 0m1.210s
user 0m0.886s
sys 0m0.001s

padi...@irisa.fr

unread,

Mar 15, 2005, 5:45:39 AM3/15/05

to Richard Jones, Caml Mailing List

> On Mon, Mar 14, 2005 at 05:29:42PM -0800, Karl Zilles wrote:
>> http://developers.slashdot.org/article.pl?sid=05/03/14/2258219
>
> The problem was with his stupid implementation of a hash table:

Perhaps his program is not good.
But he made a point. His experience say something
and I agree with many stuff he said.

In the ICFP 2000 contest, the mercury team advocated that
their program was very fast and so implicitly claimed
that logic programming was very fast, but if you look at
their code they mostly used a functionnal style,
not a logic style.

I think it is a bad idea to call "troll" someone who is
against ocaml, especially when he says something true.
Perhaps the webpage of ocaml should put a warning
"ocaml can compete against C, but it is not magical,
a beginner will certainly make very inefficient program".

PS: I am a big fan of functionnal programming, and I try as much
as possible to make functionnal code, but it is true that sometimes
I have to renounce to this and go back to imperative style because
it would be inefficient.

Diego Olivier Fernandez Pons

unread,

Mar 15, 2005, 6:06:02 AM3/15/05

to caml...@inria.fr

Bonjour,

> Perhaps the webpage of ocaml should put a warning
> "ocaml can compete against C, but it is not magical,
> a beginner will certainly make very inefficient program".

Well, my Caml code may be inefficent but it has an important advantage
over my C++ code :
- it exists
- it works

Maybe should we also warn beginners that when they will try to port
their caml code in C++ to see how much they are loosing because of
Caml, they will spend a very very bad time.

Diego Olivier

Eijiro Sumii

unread,

Mar 15, 2005, 9:51:40 AM3/15/05

to caml...@inria.fr, su...@saul.cis.upenn.edu

> Perhaps his program is not good.
> But he made a point. His experience say something
> and I agree with many stuff he said.

I don't think he says _anything_ correct except trivial facts (like
his OCaml program is very slow:-) - but this is because it is written
_extremely_ poorly).

As we can see from the careful wording at caml.inria.fr (both the new
one and the old one), OCaml is *not* defined as a functional language.
In fact, it is good even at imperative/OO programming thanks to
garbage collection, parametric polymorphism, data types, pattern
matching, etc.!

--
Eijiro Sumii (http://www.cis.upenn.edu/~sumii/)
Department of Computer and Information Science, University of Pennsylvania

Christophe TROESTLER

unread,

Mar 15, 2005, 10:35:17 AM3/15/05

to O'Caml Mailing List

On Tue, 15 Mar 2005, Eijiro Sumii <eijiro...@anet.ne.jp> wrote:
>
> > Perhaps his program is not good. But he made a point. His
> > experience say something and I agree with many stuff he said.
>

> I don't think he says _anything_ correct except trivial facts [...]

I agree with that. The ./ post is a typical
I-don't-know-what-I-am-talking-about-so-I-say-it-loud ! If he made a
point I believe it is that it's easier to hit the "send" key than to
try to be a good programmer...

> As we can see from the careful wording at caml.inria.fr (both the
> new one and the old one), OCaml is *not* defined as a functional
> language. In fact, it is good even at imperative/OO programming
> thanks to garbage collection, parametric polymorphism, data types,
> pattern matching, etc.!

In that vein, I like Doug Bagley koans. Here is the one about HOF:

A disciple who was a recent convert from another sect felt troubled
by the teachings of his former master, who taught the dogma that
only referentially transparent languages have Functional Nature. So
he asks his new master, Pierre Weis: "Does OCaml have Functional
Nature?", to which wise master replied: "HOF!" On hearing the
mystical incantation, the new convert was enlightened.

My 2¢,
ChriS

Sylvain Kerjean

unread,

Mar 15, 2005, 11:51:27 AM3/15/05

to

>
> In that vein, I like Doug Bagley koans. Here is the one about HOF:
>
> A disciple who was a recent convert from another sect felt troubled
> by the teachings of his former master, who taught the dogma that
> only referentially transparent languages have Functional Nature. So
> he asks his new master, Pierre Weis: "Does OCaml have Functional
> Nature?", to which wise master replied: "HOF!" On hearing the
> mystical incantation, the new convert was enlightened.
>
> My 2¢,
> ChriS
>
>

Sorry but i don't understand your "HOF" acronym so i can't laugh with
the joke. Would be keen of you to explain it :)

Thomas Fischbacher

unread,

Mar 15, 2005, 1:14:57 PM3/15/05

to Christophe TROESTLER, O'Caml Mailing List

On Tue, 15 Mar 2005, Christophe TROESTLER wrote:

> I agree with that. The ./ post is a typical
> I-don't-know-what-I-am-talking-about-so-I-say-it-loud ! If he made a
> point I believe it is that it's easier to hit the "send" key than to
> try to be a good programmer...

Isn't it sad that those people generate so much more noise than those who
know how to program and really do cool stuff with ocaml?

--
regards, t...@cip.physik.uni-muenchen.de (o_
Thomas Fischbacher - http://www.cip.physik.uni-muenchen.de/~tf //\
(lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y) V_/_
(if (= x 0) y (g g (- x 1) (* x y)))) n 1)) (Debian GNU)

Kip Macy

unread,

Mar 15, 2005, 1:41:47 PM3/15/05

to Thomas Fischbacher, Christophe TROESTLER, O'Caml Mailing List

Most people who do things well are too busy doing them to have time to
talk about them.

-Kip

Christopher A. Watford

unread,

Mar 15, 2005, 2:04:31 PM3/15/05

to Christophe TROESTLER, O'Caml Mailing List

On Tue, 15 Mar 2005 16:25:01 +0100 (CET), Christophe TROESTLER
<debi...@tiscali.be> wrote:
> I agree with that. The ./ post is a typical
> I-don't-know-what-I-am-talking-about-so-I-say-it-loud ! If he made a
> point I believe it is that it's easier to hit the "send" key than to
> try to be a good programmer...
>

><SNIP>
>
> My 2¢,
> ChriS

You're right, it looked something like this to me:

Hi! I'm new to OCaml, and my first attempt and making something like I
would in C++ failed miserably. I didn't used standard library
functions, because I didn't know they existed. I also went ahead and
compiled things in a rediculously unoptimized fashion. Finally, I
didn't bother comparing how my noobie code scaled against the C++.
LIKE OMG OCAML SUX0R!

People post similar things for C#, PHP, any language really. Besides,
it is VERY VERY simple to troll slashdot.

--
Christopher A. Watford
christoph...@gmail.com
http://dorm.tunkeymicket.com

Jon Harrop

unread,

Mar 15, 2005, 3:04:40 PM3/15/05

to caml...@yquem.inria.fr

On Tuesday 15 March 2005 18:55, Christopher A. Watford wrote:
> On Tue, 15 Mar 2005 16:25:01 +0100 (CET), Christophe TROESTLER

> You're right, it looked something like this to me:
>
> Hi! I'm new to OCaml, and my first attempt and making something like I
> would in C++ failed miserably.

Perhaps this isn't the best forum to be saying this, but that guy's C++ code
sucked as well. It could have been a lot more concise and efficient if he'd
actually used C++...

Maybe the task will get on the shootout and we can do it properly in OCaml.

On Tuesday 15 March 2005 18:26, Kip Macy wrote:
> Most people who do things well are too busy doing them to have time to
> talk about them.

Yes, I have to resort to putting it in my sig. ;-)

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

_______________________________________________

Yoann Padioleau

unread,

Mar 15, 2005, 3:14:38 PM3/15/05

to Oliver Bandel, caml...@inria.fr

Oliver Bandel <oli...@first.in-berlin.de> writes:

> On Mon, Mar 14, 2005 at 05:29:42PM -0800, Karl Zilles wrote:
> > http://developers.slashdot.org/article.pl?sid=05/03/14/2258219
> [...]
>
>
> Well, I used the code that is provided there.
> And it runs on my computer not 16 minutes...

try 7 by 7 and you will see that it effectively more than 16 minutes.

--
Yoann Padioleau, INSA de Rennes, France www.irisa.fr/prive/padiolea/
Opinions expressed here are only mine. Je n'écris qu'à titre personne=
l.
**____ Get Free. Be Smart. Simply use Linux and Free Software. ____*=
*

Yoann Padioleau

unread,

Mar 15, 2005, 3:14:39 PM3/15/05

to YANG Shouxun, caml...@yquem.inria.fr

YANG Shouxun <yang...@fltrp.com> writes:

> No. My experiments show that the OCaml implementation performs far bett=

er than
> the C++ implementation when the column and row get larger:

Perhaps because you are a liar.

>
> C++ is compiled with -O3, not sure what is the better optimization leve=
l,
> while OCaml version (actually I used Sys.argv and references to the two=

> parameters) is compiled with ocamlopt
>
> 4x4 c++
> real 0m0.003s
> user 0m0.002s
> sys 0m0.002s
>
> 4x4 ocaml
> real 0m0.177s
> user 0m0.137s
> sys 0m0.001s
>
> 8x8 c++
> real 0m8.703s
> user 0m7.000s
> sys 0m0.018s
>
> 8x8 ocaml
> real 0m0.747s
> user 0m0.485s
> sys 0m0.002s

I dont know where you get those numbers ?
I tried the code of the "troll" and the ocaml version performs far _worse=
_ than
the c++ implementation when the column and row get larger.

>
> 12x12 c++
> the process was killed by the OS
>
> 12x12 ocaml
> real 0m1.210s
> user 0m0.886s
> sys 0m0.001s
>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

--

Yoann Padioleau, INSA de Rennes, France www.irisa.fr/prive/padiolea/
Opinions expressed here are only mine. Je n'écris qu'à titre personne=
l.
**____ Get Free. Be Smart. Simply use Linux and Free Software. ____*=
*

_______________________________________________

Yoann Padioleau

unread,

Mar 15, 2005, 3:24:35 PM3/15/05

to Jon Harrop, caml...@yquem.inria.fr

Jon Harrop <j...@ffconsultancy.com> writes:

> On Tuesday 15 March 2005 08:45, Michael Vanier wrote:
> > Maybe he hasn't discovered ocamlopt yet.
>

> No, the OCaml code (compiled with ocamlopt) is much, much slower than t=

he C++.
> As we all know, this can mean only one thing...
>

> Also, his C++ is actually shorter than the OCaml, and the OCaml defines=

a lot
> of familiar looking functions (map, length, "prepend" etc.).
>
> I'll take a more detailed look in a minute...

I have made the obvious optimization which is to replace the assoc list b=
y
a Map (just changing 4 lines in the "troll" code),
the ocaml version is then far far faster.

But computing 7 by 7 with c++ take 1.5 sec and with
ocaml 50.0 sec (but it is better than "more than 16 minutes").

--
Yoann Padioleau, INSA de Rennes, France www.irisa.fr/prive/padiolea/
Opinions expressed here are only mine. Je n'écris qu'à titre personne=
l.
**____ Get Free. Be Smart. Simply use Linux and Free Software. ____*=
*

_______________________________________________

Jon Harrop

unread,

Mar 15, 2005, 3:44:48 PM3/15/05

to caml...@inria.fr

On Tuesday 15 March 2005 20:17, Yoann Padioleau wrote:
> I have made the obvious optimization which is to replace the assoc list by

> a Map (just changing 4 lines in the "troll" code),
> the ocaml version is then far far faster.
>
> But computing 7 by 7 with c++ take 1.5 sec and with
> ocaml 50.0 sec (but it is better than "more than 16 minutes").

I spent a little time on it but some Anonymous Coward has already done a much
better job and posted the resulting code on Jacques Garrigue's website. ;-)

I also posted the resulting timings on slashdot. IIRC, OCaml was less than
half as much code and ran in 1.66 times the time (if you unroll the most time
consuming function a little). That's pretty much what I'd expect given that
the problem is too simple and array-based to take much advantage of OCaml and
functional programming.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

_______________________________________________

padi...@irisa.fr

unread,

Mar 15, 2005, 4:15:18 PM3/15/05

to Jon Harrop, caml...@inria.fr

> On Tuesday 15 March 2005 20:17, Yoann Padioleau wrote:
>> I have made the obvious optimization which is to replace the assoc list
>> by
>> a Map (just changing 4 lines in the "troll" code),
>> the ocaml version is then far far faster.
>>
>> But computing 7 by 7 with c++ take 1.5 sec and with
>> ocaml 50.0 sec (but it is better than "more than 16 minutes").
>
> I spent a little time on it but some Anonymous Coward has already done a
> much
> better job and posted the resulting code on Jacques Garrigue's website.
> ;-)

where ?

>
> I also posted the resulting timings on slashdot. IIRC, OCaml was less than
> half as much code and ran in 1.66 times the time (if you unroll the most
> time
> consuming function a little). That's pretty much what I'd expect given
> that
> the problem is too simple and array-based to take much advantage of OCaml
> and
> functional programming.

Yes but this ocaml code use array ?
In that case it supports what the "troll" said, that is
the resulting code is no more "functionnal".
I agree with eijro sumii that ocaml is not just about functionnal
programming but in the mind of many people advocating ocaml is advocating
functionnal programming.

I think the way to answer to those trolls is to teach them the way
to do, in that case to use Map instead of associative list, and
not to be pretentious and to tell that he is just a newbie.
He is not a newbie, this garden optimization problem is not that simple.

Perhaps the python/perl/ruby community are successful because they
answer to those "trolls" who often then become the first
advocater of the langage who in turn answer to other "trolls" which
make the community bigger and bigger.

I am an egoist so I dont really care that this guy use ocaml,
but the more people use ocaml, the more I will have a chance to
get a job where I am paid to code in caml :)

Message has been deleted

Yoann Padioleau

unread,

Mar 15, 2005, 5:24:38 PM3/15/05

to William D.Neumann, padi...@irisa.fr, Jon Harrop, caml...@inria.fr

"William D.Neumann" <wneu...@cs.unm.edu> writes:

> On Mar 15, 2005, at 2:03 PM, padi...@irisa.fr wrote:
>
> > where ?
>
> http://wwwfun.kurims.kyoto-u.ac.jp/~garrigue/garden2.ml

>
> > Yes but this ocaml code use array ?
>

> Lists and arrays. Each where it's appropriate.

I agree.

>
> > In that case it supports what the "troll" said, that is
> > the resulting code is no more "functionnal".
>

> To which, I'd assume the majority response would be, "So?"

So some of his arguments are right. You make object "So?" but
we could continue a long moment that way.

> And to
> which I might add Benjamin Pierce's comment, "Most arguments about
> 'What is the essence of...?' do more to reveal the prejudices of the
> participants than to uncover any objective truth about the topic of
> discussion. Attempts to define the term 'object-oriented' are no
> exception." Sure, he wrote it in regards to object oriented
> programming, but it fits functional programming very well

Well not trying to define stuff is better ?

> (a polymorphic comment, indeed).

:)

>
> > He is not a newbie, this garden optimization problem is not that
> > simple.
>

> Well, a five minute glance at the code seems to indicate a very
> shallow understanding of OCaml. He might be a wizard in other aspects
> of programming/CS, but dollars to donuts

"dollars to donuts" ?
I am an american newbie so I have no idea of what it means :)

> he's very much an OCaml newbie.

He is far better than most of my students.
It all depends on what you call a newbie.
We are all the newbie of someone else (except perhaps xavier leroy, eijir=
o sumii and company).

>
> > Perhaps the python/perl/ruby community are successful because they
> > answer to those "trolls" who often then become the first
> > advocater of the langage who in turn answer to other "trolls" which
> > make the community bigger and bigger.
>

> I dunno. A couple of days ago there was a /. story about the new
> Randal Schwartz book, and at least one person was complaining about
> the perl newsgroups/mailing lists being full of unfriendly, unhelpful
> folk.

yes but there is also full of friendly helpful folks.

>
> William D. Neumann
>
> "You've got Rita Marlowe in the palm of your hand."
> "Palm of my hand? You haven't seen Rita Marlowe..."
>
> -- Will Success Spoil Rock Hunter?

>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

--

Yoann Padioleau, INSA de Rennes, France www.irisa.fr/prive/padiolea/
Opinions expressed here are only mine. Je n'écris qu'à titre personne=
l.
**____ Get Free. Be Smart. Simply use Linux and Free Software. ____*=
*

_______________________________________________

Richard Jones

unread,

Mar 15, 2005, 5:44:32 PM3/15/05

to caml...@inria.fr

Keep it civilised please people :-)

Rich.

--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com

_______________________________________________

William D.Neumann

unread,

Mar 15, 2005, 6:14:43 PM3/15/05

to Yoann Padioleau, caml-list

On Mar 15, 2005, at 3:12 PM, Yoann Padioleau wrote:

>> To which, I'd assume the majority response would be, "So?"
>
> So some of his arguments are right. You make object "So?" but
> we could continue a long moment that way.

Perhaps, perhaps not. His point seems to be that programming in a
"functional style"[1] is inherently slower than an imperative style
because a list or a map have different performance characteristics than
do arrays. To which the only response is along the lines of "True.
They are different data structures, and they behave differently --
sometimes worse, sometimes better." But the point is that needlessly
restricting yourself to such a style seems like such an odd thing to do
that I have a hard time caring about the truth of the assertion. The
truth of a statement is orthogonal to its silliness.

> Well not trying to define stuff is better ?

That's not really the intent of the quote (or at least I don't think it
is). I read it as saying that those who insist that e.g. "functional
programming" *cannot* include the notion of mutable data structures, or
that it *cannot* be OO if it doesn't offer encapsulation or classes,
aren't really bringing anything useful to the table. You can argue
'till you're blue in the face whether or not mutable arrays or strings
have any place in a "functional" language, but when you're done, have
you really accomplished anything?

> "dollars to donuts" ?
> I am an american newbie so I have no idea of what it means :)

Sorry. It's shorthand for "I'll wager my X dollars against your X
donuts that I am correct," and is a way of expressing confidence in
your position. It used to mean a lot more when you could get a dozen
donuts for a dollar...

[1] Where functional style is restricted to, among other things, no
mutable data structures.

Jon Harrop

unread,

Mar 15, 2005, 6:45:45 PM3/15/05

to caml...@yquem.inria.fr

On Tuesday 15 March 2005 23:07, William D.Neumann wrote:
> His point seems to be that programming in a
> "functional style"[1] is inherently slower than an imperative style
> because a list or a map have different performance characteristics than
> do arrays.

True, but don't forget that using arrays does not imply imperative programming
in general. For example, I partook in a thread on c.l.functional recently,
comparing the performance of the sieve of Erasthenes (I know, a
microbenchmark) in different languages. With a purely functional
implementation of arrays, the Haskell implementation beat C++ (with
vector<bool>) and was even competing with OCaml for a short while!

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

_______________________________________________

Thomas Fischbacher

unread,

Mar 15, 2005, 7:04:33 PM3/15/05

to Jon Harrop, caml...@yquem.inria.fr

On Tue, 15 Mar 2005, Jon Harrop wrote:

> With a purely functional implementation of arrays,

Rather, with a purely functional implementation of the description of
array operations.

Christopher Dutchyn

unread,

Mar 15, 2005, 7:14:42 PM3/15/05

to William D.Neumann, Yoann Padioleau, caml-list

On Tue, 15 Mar 2005, William D.Neumann wrote:

> His point seems to be that programming in a "functional style"[1] is
> inherently slower than an imperative style because a list or a map have
> different performance characteristics than do arrays.

Nick Pippinger gave the first crisp result comparing performance of
functional and imperative languages in "Pure vs Impure Lisp" [POPL 1996].
Researchers like Oege de Moor are working on characterizing more of these
differences.

--
Christopher Dutchyn
UBC Computer Science

Oliver Bandel

unread,

Mar 15, 2005, 7:24:31 PM3/15/05

to caml...@inria.fr

On Tue, Mar 15, 2005 at 04:07:37PM -0700, William D.Neumann wrote:
> On Mar 15, 2005, at 3:12 PM, Yoann Padioleau wrote:
>
> >>To which, I'd assume the majority response would be, "So?"
> >
> >So some of his arguments are right. You make object "So?" but
> >we could continue a long moment that way.
>
> Perhaps, perhaps not. His point seems to be that programming in a
> "functional style"[1] is inherently slower than an imperative style
> because a list or a map have different performance characteristics than
> do arrays. To which the only response is along the lines of "True.

Well, I tested the program with the original values for
columnSize and numRows as well as differet values.
I did not used the values that are mentioned in the posting
(7x3), because I had not too much time.

But with

let columnSize = 5;;
and with
let numRows = 7;;

(look at the ";;" it seems he had used it in the toplevel... :))

and ocamlprof I got stuff like that:

================================================
(* checks if each cell in center colum is covered by an empty cell *)
let rec isCovered c1 c2 c3 =
(* 2650588 *) let
memoized_isCovered = memoize3 isCovered_table isCovered
in
match (c1, c2, c3) with
(* if center runs out of cells first, we are covered *)
| (_, [], _) -> (* 501290 *) true
(* if others run out first, we are not covered *)
| ([], _, _) -> (* 0 *) false
| (_, _, []) -> (* 0 *) false
(* Empty followed by Planted in center colum
skip this cell and next cell *)
| ( (_::(_::rest1)), (Empty::(Planted::rest2)), (_::(_::rest3)) ) ->
(* 553730 *) true && ( isCovered rest1 rest2 rest3 )

| ( (Empty::rest1), (_::rest2), (_::rest3) ) ->
(* 837698 *) true && ( isCovered rest1 rest2 rest3 )

| ( (_::rest1), (_::rest2), (Empty::rest3) ) ->
(* 291720 *) true && ( isCovered rest1 rest2 rest3 )

(* Empty followed by Empty in center
This cell is covered, but don't skip next cell *)
| ( (_::rest1), (Empty::rest2), (_::rest3) ) ->
(* 297530 *) true && ( isCovered rest1 rest2 rest3 )
(* this cell is covered by next cell *)
| ( (_::rest1), (Planted::(Empty::rest2)), (_::rest3) ) ->
(* 99282 *) true && ( isCovered rest1 (Empty::rest2) rest3 )

(* default: we reach this if our current cell is not covered *)
| (_, _, _) -> (* 69338 *) false;;
================================================

which does not really looks tail recursive.

Called more than 2 * 10^6 times...

And many other examples...

e.g. this one:

(* applies a list of functions to an argument *)
let rec applyFunctionList functions argument =
(* 267386 *) match functions with
| [] -> (* 9782 *) []
| f::rest -> (* 257604 *) (f argument)::(applyFunctionList rest argument);;

"only" called 267386 times, but when looking how the arguments
are used: also applyFunctionList is not tail recursive...
...and even if called less than 10^6 times... it's a function that
creates a list in a non-tailrec way.

IMHO this is the reason, why the program performs so badly!

Ths stuff of tail recursion - even if it took a while
until I got it - was one of the first things on this list,
that people told me that it is necessary....

...but as a *real* C++ programmer it seems it is not necessary to learn...
...and better use the energy to tell all people how badly OCaml
performs!

Well... he performs badly in code-writing. :->

If he had read this mailing list, he wouls have seen that HE
(better: the code he wrote) is/was the problem, not OCaml itself. :)

Ciao,
Oliver

Oliver Bandel

unread,

Mar 15, 2005, 7:44:42 PM3/15/05

to caml...@inria.fr

On Tue, Mar 15, 2005 at 10:26:06AM -0800, Kip Macy wrote:
> Most people who do things well are too busy doing them to have time to
> talk about them.

Yes, that's the reason, why the OCaml-core developers so seldom
talk to us here. The do not have the nervs to talk to us trolls. ;-)

I for myself again send my thanks to the OCaml developers...

...gasho!

Ciao,
Oliver

Oliver Bandel

unread,

Mar 15, 2005, 7:44:42 PM3/15/05

to caml...@inria.fr

On Tue, Mar 15, 2005 at 01:55:07PM -0500, Christopher A. Watford wrote:
> On Tue, 15 Mar 2005 16:25:01 +0100 (CET), Christophe TROESTLER
> <debi...@tiscali.be> wrote:
> > I agree with that. The ./ post is a typical
> > I-don't-know-what-I-am-talking-about-so-I-say-it-loud ! If he made a
> > point I believe it is that it's easier to hit the "send" key than to
> > try to be a good programmer...
> >
> ><SNIP>
> >
> > My 2¢,
> > ChriS
>
> You're right, it looked something like this to me:
>
> Hi! I'm new to OCaml, and my first attempt and making something like I
> would in C++ failed miserably. I didn't used standard library
> functions, because I didn't know they existed. I also went ahead and
> compiled things in a rediculously unoptimized fashion. Finally, I
> didn't bother comparing how my noobie code scaled against the C++.
> LIKE OMG OCAML SUX0R!
>
> People post similar things for C#, PHP, any language really. Besides,
> it is VERY VERY simple to troll slashdot.

...but on the other side..... we can thank such posts,
because we had a nice evening and a theme to talk about... ;-)

So, we are not better... ? ;-)

Ciao,
Oliver

Oliver Bandel

unread,

Mar 15, 2005, 7:44:47 PM3/15/05

to caml...@inria.fr

On Tue, Mar 15, 2005 at 07:56:12PM +0000, Jon Harrop wrote:
> On Tuesday 15 March 2005 18:55, Christopher A. Watford wrote:
> > On Tue, 15 Mar 2005 16:25:01 +0100 (CET), Christophe TROESTLER
> > You're right, it looked something like this to me:
> >
> > Hi! I'm new to OCaml, and my first attempt and making something like I
> > would in C++ failed miserably.
>
> Perhaps this isn't the best forum to be saying this, but that guy's C++ code
> sucked as well. It could have been a lot more concise and efficient if he'd
> actually used C++...
>
> Maybe the task will get on the shootout and we can do it properly in OCaml.

[...]

good idea. :)

Ciao,
oliver

Yoann Padioleau

unread,

Mar 15, 2005, 8:17:32 PM3/15/05

to Oliver Bandel, caml...@inria.fr

Oliver Bandel <oli...@first.in-berlin.de> writes:

> But with
>
> let columnSize = 5;;
> and with
> let numRows = 7;;
>
> (look at the ";;" it seems he had used it in the toplevel... :))

And ?
First, I dont think it is a good idea to make fun of people who try to us=
e caml.
Second, many people I know still put ";;" cos they were taught that way
(at the very beginning with caml-light I think you were forced to put th=
ose ";;", it
became optional later, and habits are hard to change for many people :)=
)
Third, doing stuff at the toplevel is a good idea.

> which does not really looks tail recursive.
>
> Called more than 2 * 10^6 times...
>
> And many other examples...
>
>
> e.g. this one:
>
> (* applies a list of functions to an argument *)
> let rec applyFunctionList functions argument =
> (* 267386 *) match functions with
> | [] -> (* 9782 *) []

> | f::rest -> (* 257604 *) (f argument)::(applyFunctionList rest=

argument);;
>
> "only" called 267386 times, but when looking how the arguments
> are used: also applyFunctionList is not tail recursive...
> ...and even if called less than 10^6 times... it's a function that
> creates a list in a non-tailrec way.
>
> IMHO this is the reason, why the program performs so badly!

IMHO the reason it was slow is because it used associative list (instead =
of Map)
for associative access, and list of list (instead of array) for storing =
the grid.
I am not sure that making the function tail-recursive would have been the=
big
hit in this example.
I often transform my functions to make them tail-recursive because of sta=
ck overflow pb, not
that much because of speed pbs
(and many functions in the standard library are not tail-recursive, such =
as map)

>
> Ths stuff of tail recursion - even if it took a while
> until I got it - was one of the first things on this list,
> that people told me that it is necessary....

I am not sure it is the first optimisation trick to give to a fresh ocaml=
programmer.

This tail-recursion stuff is one of the thing I hate the most with fp bec=
ause it forces
you to change your code to adapt to the machine whereas it should be the
inverse.
I don't understand why the compiler don't do himself those transformation=
s.
Why is it so hard to take a non-tail-recursive-function and make it a tai=
l-recursive-one ?

>
> ...but as a *real* C++ programmer it seems it is not necessary to learn=

...
> ...and better use the energy to tell all people how badly OCaml
> performs!
>
> Well... he performs badly in code-writing. :->

We all :)
Each time I look at the code of someone else I find it awful,
and each time a guy look at my code he has the same reaction.

>
> If he had read this mailing list, he wouls have seen that HE
> (better: the code he wrote) is/was the problem, not OCaml itself. :)

If he had read this mailing list he would surely have stopped ocaml forev=
er,
and this is not a compliment for the ocaml community.

Nevertheless, he has a little bit of a troll :)
He should have post his experience to the caml-list, to get a chance
to improve his code, instead of going directly to slashdot.

>
>
> Ciao,
> Oliver
>
>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

--

Yoann Padioleau, INSA de Rennes, France www.irisa.fr/prive/padiolea/
Opinions expressed here are only mine. Je n'écris qu'à titre personne=
l.
**____ Get Free. Be Smart. Simply use Linux and Free Software. ____*=
*

_______________________________________________

YANG Shouxun

unread,

Mar 15, 2005, 8:35:14 PM3/15/05

to caml...@yquem.inria.fr

On Wednesday 16 March 2005 04:02, Yoann Padioleau wrote:
> YANG Shouxun <yang...@fltrp.com> writes:

> > No. My experiments show that the OCaml implementation performs far better

> > than the C++ implementation when the column and row get larger:
>
> Perhaps because you are a liar.

I'm 100% sure I'm not a liar. I don't know what's the benefit of being a liar,
nor the benefits of calling others a liar.

I checked and get the same results, though the figures are of course not
exactly the same.

> > C++ is compiled with -O3, not sure what is the better optimization level,

> > while OCaml version (actually I used Sys.argv and references to the two

> > parameters) is compiled with ocamlopt
> >
> > 4x4 c++
> > real 0m0.003s
> > user 0m0.002s
> > sys 0m0.002s
> >
> > 4x4 ocaml
> > real 0m0.177s
> > user 0m0.137s
> > sys 0m0.001s
> >
> > 8x8 c++
> > real 0m8.703s
> > user 0m7.000s
> > sys 0m0.018s
> >
> > 8x8 ocaml
> > real 0m0.747s
> > user 0m0.485s
> > sys 0m0.002s
>
> I dont know where you get those numbers ?

As I said, the C++ version was compiled with "-O3" and the OCaml version with
ocamlopt. The compiled programs were timed using "time garden_XXX x y" where
XXX is either cpp for C++ version and ml for OCaml version.

I'm using Debian GNU/Linux, with g++ version 3.3.5 and ocamlopt version
3.08.2. The computer is an Intel P4 2.40GHz with 256MB memory and 512KB cache
and 506MB swap.

Below is what I got from the site of C++ version:
---8<---

#include <math.h>
#include <stdio.h>

enum cell{ Empty=0, Planted=1 };

int columnSize = 5;

int numStates;
cell **allStates;

int *allCosts;

// array indicating when a center column of cells is covered by
// the columns on either side of it.
// indexed as isCovered[c1][c2][c3] true if c1 and c3 cover c2,
// or false otherwise
char ***isCoveredArray;

char isCovered( int inC1, int inC2, int inC3 ) {
cell *c1 = allStates[ inC1 ];
cell *c2 = allStates[ inC2 ];
cell *c3 = allStates[ inC3 ];

for( int i=0; i<columnSize; i++ ) {
// test for a cell that is not covered and return false
if( c2[i] == Planted &&
c1[i] == Planted &&
c3[i] == Planted ) {

if( i <=0 || c2[i-1] == Planted ) {
if( i >=columnSize-1 || c2[i+1] == Planted ) {

return false;
}
}
}
}

// else all covered
return true;
}

void printState( int inStateIndex ) {
cell *state = allStates[ inStateIndex ];

for( int i=0; i<columnSize; i++ ) {
if( state[i] == Empty ) {
printf( "O" );
}
else {
printf( "X" );
}
}
printf( "\n" );
}

struct costStatePair {
int cost;
// the index in the allStates array
int stateIndex;
};

// table for memoizing optLayout
struct costStatePair ***optLayoutTable;

// place where optLayout results are returned
struct costStatePair optLayoutResult;

/**
* Computes the optimum layout for a given number of columns given
* states for the first two columns.
*
* @param inNumColumns the number of columns. Must be at least 3.
* @param inFirstColumnStateIndex the index of the state (in allStates)
* of the first column.
* @param inSecondColumnStateIndex the index of the state (in allStates)
* of the second column.
*
* @return result is set in the optLayoutResult structure and returned.
* Thus, this call is NOT threadsafe.
*/
struct costStatePair optLayout( int inNumColumns, int inFirstColumnStateIndex,
int inSecondColumnStateIndex ) {

int c = inNumColumns-1;
if( optLayoutTable[c]
[inFirstColumnStateIndex]
[inSecondColumnStateIndex].cost != -1 ) {
// already have this result
return optLayoutTable[c]
[inFirstColumnStateIndex]
[inSecondColumnStateIndex];
}

if( inNumColumns < 3 ) {
optLayoutResult.cost =
allCosts[ inFirstColumnStateIndex ] +
allCosts[ inSecondColumnStateIndex ];
optLayoutResult.stateIndex = -1;

// save in table
optLayoutTable[c]
[inFirstColumnStateIndex]
[inSecondColumnStateIndex].cost = optLayoutResult.cost;
optLayoutTable[c]
[inFirstColumnStateIndex]
[inSecondColumnStateIndex].stateIndex =
optLayoutResult.stateIndex;

return optLayoutResult;
}

int bestCost = inNumColumns * columnSize + 1;
int bestStateIndex = -1;

// examin all possible states for the third column
for( int i=0; i<numStates; i++ ) {

// if i does its part to cover our second column
if( isCovered( inFirstColumnStateIndex,
inSecondColumnStateIndex, i ) ) {

char iCovered = false;

int costOfRest = 0;
if( inNumColumns > 3 ) {
struct costStatePair result =
optLayout( inNumColumns - 1,
inSecondColumnStateIndex,
i );
costOfRest = result.cost -
allCosts[ inSecondColumnStateIndex ];

// optLayout will only pick a 3rd column that helps to
// cover i
iCovered = true;
}
else {
// just the cost of i by itself
costOfRest = allCosts[i];

// make sure that i is covered by inSecondColumnStateIndex
// put the all-planted column (0) on the other side of i
iCovered = isCovered( inSecondColumnStateIndex, i, 0 );
}

if( iCovered && costOfRest < bestCost ) {
bestCost = costOfRest;
bestStateIndex = i;
}
}
}

optLayoutResult.cost = bestCost +
allCosts[ inFirstColumnStateIndex ] +
allCosts[ inSecondColumnStateIndex ];

optLayoutResult.stateIndex = bestStateIndex;

// save in table
optLayoutTable[c]
[inFirstColumnStateIndex]
[inSecondColumnStateIndex].cost = optLayoutResult.cost;
optLayoutTable[c]
[inFirstColumnStateIndex]
[inSecondColumnStateIndex].stateIndex = optLayoutResult.stateIndex;

return optLayoutResult;
}

void printOptLayout( int inNumColumns ) {
// find best starting state for 10 columns by
// running optLayout on 11 columns
int bestFirstColumn = -1;
int bestSecondColumn = -1;
int bestCost = (inNumColumns+1) * columnSize + 1;

for( int i=0; i<numStates; i++ ) {
// state 0 is the all-planted state
struct costStatePair result = optLayout( inNumColumns+1, 0, i );

if( result.cost < bestCost ) {
bestFirstColumn = i;
bestSecondColumn = result.stateIndex;
bestCost = result.cost;
}
//printf( "best cost = %d\n", bestCost );
}

printf( "%d empty cells:\n", bestCost );
printState( bestFirstColumn );
printState( bestSecondColumn );

for( int c=inNumColumns; c>=3; c-- ) {

struct costStatePair restult =
optLayout( c, bestFirstColumn, bestSecondColumn );

printState( restult.stateIndex );

bestFirstColumn = bestSecondColumn;
bestSecondColumn = restult.stateIndex;
}
}

int main( int inNumArgs, char **inArgs ) {

int numColumns = 10;

if( inNumArgs > 1 ) {
sscanf( inArgs[1], "%d", &columnSize );
}
if( inNumArgs > 2 ) {
sscanf( inArgs[2], "%d", &numColumns );
}

numStates = (int)( pow( 2, columnSize ) );
allStates = new cell*[ numStates ];
allCosts = new int[ numStates ];

int i, j, k;
for( i=0; i<numStates; i++ ) {

allCosts[i] = 0;

allStates[i] = new cell[ columnSize ];
for( int b=0; b<columnSize; b++ ) {

if( ( ( i >> b ) & 0x1 ) == 1 ) {
allStates[i][b] = Empty;
allCosts[i] ++;
}
else {
allStates[i][b] = Planted;
}
}
}

int numEntries = 0;
optLayoutTable = new struct costStatePair **[numColumns +1];
int c;
for( c=0; c<numColumns+1; c++ ) {
optLayoutTable[c] = new struct costStatePair *[numStates];
for( int i=0; i<numStates; i++ ) {
optLayoutTable[c][i] = new struct costStatePair[ numStates ];

for( int j=0; j<numStates; j++ ) {
optLayoutTable[c][i][j].cost = -1;
optLayoutTable[c][i][j].stateIndex = -1;

numEntries ++;
}
}
}
printf( "Done constructing memoization table with %d entries\n",
numEntries );

/*
isCoveredArray = new char**[ numStates ];
for( i=0; i<numStates; i++ ) {
isCoveredArray[i] = new char*[numStates];
for( j=0; j<numStates; j++ ) {
isCoveredArray[i][j] = new char[numStates];
for( k=0; k<numStates; k++ ) {
isCoveredArray[i][j][k] = isCovered( i, j, k );
}
}
}
*/

/*
for( i=0; i<numStates; i++ ) {
for( j=0; j<numStates; j++ ) {
delete [] isCoveredArray[i][j];
}
delete [] isCoveredArray[i];
}
delete [] isCoveredArray;
*/

for( c=2; c<=numColumns; c++ ) {
printf( "\n%d by %d\n", columnSize, c );

printOptLayout( c );
}

for( c=0; c<numColumns+1; c++ ) {

for( int i=0; i<numStates; i++ ) {
delete [] optLayoutTable[c][i];
}
delete [] optLayoutTable[c];
}
delete [] optLayoutTable;

for( i=0; i<numStates; i++ ) {
delete [] allStates[i];
}
delete [] allStates;
delete [] allCosts;

return 0;
}

---8<---

The original OCaml version:

---8<---
(* Computes optimal garden layouts using dynamic programming*)

(* the width of a state in our dynamic programming table *)
(* the algorithm will run in O( 2^columnSize ) time *)
let columnSize = 4;;

(* How many rows to compute solutions for *)
let numRows = 10;;

(* generate all possible states for columns of height h *)
type cell = Empty
| Planted;;

let printCell c =
match c with
| Empty -> "O"
| Planted -> "X";;

let rec printCells cs =
match cs with
| [] -> ""
| (c::rest) -> (printCell c) ^ (printCells rest);;

let rec printStates states =
match states with
| [] -> ""
| (s::[]) -> printCells s
| (s::rest) -> (printCells s) ^ " " ^ (printStates rest);;

(* apply f to every element of list *)
let rec map f list =
match list with
| [] -> []
| x::rest -> (f x) :: (map f rest);;

let rec generateStates numStates =
match numStates with
| 0 -> [[]]
| h ->
let
shorterStates = generateStates (h-1) and
prepend c s = c::s
in
( map ( prepend Empty ) shorterStates ) @
( map ( prepend Planted ) shorterStates );;

let allStates = generateStates columnSize;;

let rec genIndexList currentIndex maxIndex =
if( currentIndex <= maxIndex )
then currentIndex :: ( genIndexList (currentIndex+1) maxIndex )
else [];;

let rec length list =
match list with
| [] -> 0
| x::rest -> 1 + length rest;;

let stateIndices = genIndexList 0 ((length allStates) - 1);;

let rec findIndexRec s states startIndex =
match states with
| [] -> -1
| (x::xs) ->
if( s = x ) then startIndex
else findIndexRec s xs (startIndex+1);;

let findIndex s = findIndexRec s allStates 0;;

let rec generateSolidState length cellValue =
match length with
| 0 -> []
| x -> cellValue::(generateSolidState (x-1) cellValue );;

let allPlantedState = generateSolidState columnSize Planted;;
let allEmptyState = generateSolidState columnSize Empty;;

(* computes the min cost pair in a list of cost-state pairs *)
let rec minCost pairs =
match pairs with
| [] -> ( 100000, allEmptyState ) (* error *)
| (c,s)::[] -> (c,s)
| (c,s)::rest ->
let
( minCostRest, minStateRest ) = minCost rest
in
if( c < minCostRest ) then (c,s)
else ( minCostRest, minStateRest );;

(* computes the min cost pair in a list of cost-state-state triples *)
let rec minCost3 triples =
match triples with
| [] -> ( 100000, allEmptyState, allEmptyState ) (* error *)
| (c,s1,s2)::[] -> (c,s1,s2)
| (c,s1,s2)::rest ->
let
( minCostRest, minState1Rest, minState2Rest ) = minCost3 rest
in
if( c < minCostRest ) then (c,s1,s2)
else ( minCostRest, minState1Rest, minState2Rest );;

(* adds c to the cost of the minimum-cost pair in a list *)
let addToMin c pairList =
let (minC, minS) = minCost pairList in
(c + minC, minS );;

(* computes the cost of a state*)
let rec cost state =
match state with
| [] -> 0
| Empty::rest -> 1 + cost rest
| Planted::rest -> cost rest;;

let costList = map cost allStates;;

(* Set up an associative list for memoization *)
let lookup key table = List.assoc key !table;;
let insert key value table = table := (key, value) :: !table;;

(* memoize any 3-parameter function *)
let memoize3 table f x y z =
try
lookup (x,y,z) table
with
| Not_found ->
let result = f x y z in
insert (x, y, z) result table;
result;;

(* table for memoizing optLayout *)
let isCovered_table = ref [];;

(* checks if each cell in center colum is covered by an empty cell *)
let rec isCovered c1 c2 c3 =

let
memoized_isCovered = memoize3 isCovered_table isCovered
in
match (c1, c2, c3) with
(* if center runs out of cells first, we are covered *)

| (_, [], _) -> true

(* if others run out first, we are not covered *)

| ([], _, _) -> false
| (_, _, []) -> false

(* Empty followed by Planted in center colum
skip this cell and next cell *)
| ( (_::(_::rest1)), (Empty::(Planted::rest2)), (_::(_::rest3)) ) ->

true && ( isCovered rest1 rest2 rest3 )

| ( (Empty::rest1), (_::rest2), (_::rest3) ) ->

true && ( isCovered rest1 rest2 rest3 )

| ( (_::rest1), (_::rest2), (Empty::rest3) ) ->

true && ( isCovered rest1 rest2 rest3 )

(* Empty followed by Empty in center
This cell is covered, but don't skip next cell *)
| ( (_::rest1), (Empty::rest2), (_::rest3) ) ->

true && ( isCovered rest1 rest2 rest3 )
(* this cell is covered by next cell *)
| ( (_::rest1), (Planted::(Empty::rest2)), (_::rest3) ) ->

true && ( isCovered rest1 (Empty::rest2) rest3 )

(* default: we reach this if our current cell is not covered *)

| (_, _, _) -> false;;

let memoized_isCovered = memoize3 isCovered_table isCovered;;

(*

-- this is a lazy array of arrays of arrays.
-- first index is number of columns (-1... so 1 colum result is
-- at index 0
-- Second index is the state index (an index into allStates). This
-- is the state of the first column
-- Third index is the state index (an index into allStates). This
-- is the state of the column before the first column
-- So, first dimension is infinite, while second and third dimensions are
-- finite.
optSolutionArray = [ list2 |
f <- (map optLayout [1..]),
list <- [ map f stateIndices ],
list2 <- [
[ map f2 stateIndices | f2 <- list ]
]
]

-- lazy array memoizing isCovered
isCoveredArray = [ list2 |
f <- (map isCovered allStates),
list <- [ map f allStates ],
list2 <- [
[ map f2 allStates | f2 <- list ]
]
]
*)

(* applies a list of functions to an argument *)
let rec applyFunctionList functions argument =

match functions with
| [] -> []
| f::rest -> (f argument)::(applyFunctionList rest argument);;

exception Length_mismatch__makeTriples;;

let rec makeTriples listOfPairs listOfSingles =
match (listOfPairs, listOfSingles) with
| ([],[]) -> []
| ((p1,p2)::pRest, s::sRest) -> (p1,p2,s)::(makeTriples pRest sRest)
| _ -> raise Length_mismatch__makeTriples;;

(* discard middle element of each triple *)
let rec discardMiddle listOfTriples =
match listOfTriples with
| [] -> []
| (t1,_,t3)::tRest -> (t1,t3)::(discardMiddle tRest);;

let rec filterTriples f list =
match list with
| [] -> []
| (x,y,z)::rest ->
if( f x y z ) then (x,y,z)::( filterTriples f rest )
else filterTriples f rest;;

let rec filter f list =
match list with
| [] -> []
| x::rest ->
if( f x ) then x::( filter f rest )
else filter f rest;;

(* Simple memo fib *)
(*
let rec fib = function
| 0 -> 0
| 1 -> 1
| n -> memo_fib (n-1) + memo_fib (n-2)
and memo_fib n = memoize fib n;;
*)

(* table for memoizing optLayout *)
let optLayout_table = ref [];;

(*
-- computes the optimum layout of i columns given that the first column
-- has state s and the column before the first column has state s1
-- Ensures that first colum (s) is covered by s1 and the optimal solution
-- to the remaining colums. Does not ensure that s1 is covered (since
-- s is fixed, this function has no control over the coverage of s1). *)
let rec optLayout numColumns s s1 =
(*print_string "Working on optLayout ";
print_int numColumns;
print_string (" s=" ^ (printCells s));
print_string (" s1=" ^ (printCells s1));
print_newline ();
*)
match numColumns with
| 1 -> ( cost s, allPlantedState )
| i ->
let
memoized_optLayout = memoize3 optLayout_table optLayout
in
(* so much cleaner with list comprehensions
addToMin ( cost s )
[ (c,s2) |
s2 <- allStates,
(c,s3) <- [ optLayout (i-1) s2 s ],
isCovered s s2 s3,
isCovered s1 s s2 ]
*)
let
removeUncoveredS2 s2 = (isCovered s1 s s2)
in
let
allGoodStates = List.filter removeUncoveredS2 allStates
in
let
(* (opt i-1) applied to all states, a list of
functions waiting to be applied to s *)
optAppliedToAllStates =
List.map (memoized_optLayout (i-1)) allGoodStates
in
let
(* these are all optimal pairs for all choices of s2 *)
optAppliedToAllS2 = applyFunctionList optAppliedToAllStates s
in
let
(* triples of (c,s3,s2) *)
optC_S3_S2 = makeTriples optAppliedToAllS2 allGoodStates
in
let
(* a filter function to remove uncovered triples from our list *)
removeUncovered (c, s3, s2) = (isCovered s s2 s3) (*&&
(isCovered s1 s s2)*)
in
let
covered_optC_S3_S2 = List.filter removeUncovered optC_S3_S2
in

let
(minC, minS3, minS2) = minCost3 covered_optC_S3_S2
in
( (cost s) + minC, minS2 )

(*
-- look up result in optSolutionArray instead of computing
-- This memoization dramatically speeds up computation
-- And hey... it was very elegant to express this in Haskell
memoizedOptLayout 1 s s1 = optLayout 1 s s1
memoizedOptLayout i s s1 = optSolutionArray !! (i-1) !! s !! s1
*)

let memoized_optLayout = memoize3 optLayout_table optLayout;;

(* computes the first column of the optimum x-column layout *)
let computeOpt x =
(* force an extra planted column before the first column
force an extra empty colum before that *)
optLayout (x + 1) allPlantedState allEmptyState;;

let rec printOptGivenState numColumns s s1 =
match numColumns with
| 1 -> printCells s
| x ->
let
(c, optNextState) = optLayout x s s1
in
( printCells s ) ^ "\n" ^
printOptGivenState (x - 1) optNextState s;;

(* hey... it actually works!
this is the main function *)
let printOpt x =
let
(c, optStartingState) = computeOpt x
in
print_int c;
print_string
(" empty cells:\n" ^
printOptGivenState x optStartingState allPlantedState );;

let main () =
(* first, fill up the memo table *)
(*
for i=1 to 10 do
print_string( "Filling table row " );
print_int( i );
print_newline();
List.map
(applyFunctionList ( List.map (memoized_optLayout i) allStates ))
allStates;
print_string( " done with table row " );
print_int( i );
print_newline();
done;
*)
for i = 2 to numRows do
print_int( columnSize );
print_string( " by " );
print_int( i );
print_newline();
printOpt i;
print_newline();
print_newline();
done;;

main ();;

---8<---

Since I don't want to change the source code and recompile each time I want to
pass a different argument list, I made the following changes, without
handling some possible exceptions:

---8<---
--- Garden.ml 2005/03/15 09:38:16 1.1
+++ Garden.ml 2005/03/16 01:22:08
@@ -2,10 +2,10 @@

(* the width of a state in our dynamic programming table *)
(* the algorithm will run in O( 2^columnSize ) time *)
-let columnSize = 4;;
+let columnSize = ref 4;;

(* How many rows to compute solutions for *)
-let numRows = 10;;
+let numRows = ref 10;;

(* generate all possible states for columns of height h *)
@@ -54,7 +54,7 @@

-let allStates = generateStates columnSize;;
+let allStates = generateStates !columnSize;;

let rec genIndexList currentIndex maxIndex =
if( currentIndex <= maxIndex )
@@ -88,8 +88,8 @@
| 0 -> []
| x -> cellValue::(generateSolidState (x-1) cellValue );;

-let allPlantedState = generateSolidState columnSize Planted;;
-let allEmptyState = generateSolidState columnSize Empty;;
+let allPlantedState = generateSolidState !columnSize Planted;;
+let allEmptyState = generateSolidState !columnSize Empty;;

(* computes the min cost pair in a list of cost-state pairs *)
@@ -372,7 +372,7 @@

let main () =
(* first, fill up the memo table *)
-(*
+ (*
for i=1 to 10 do
print_string( "Filling table row " );
print_int( i );
@@ -385,8 +385,11 @@
print_newline();
done;
*)
- for i = 2 to numRows do
- print_int( columnSize );
+ let len = Array.length Sys.argv in
+ if len > 1 then columnSize := int_of_string (Sys.argv.(1));
+ if len > 2 then numRows := int_of_string (Sys.argv.(2));
+ for i = 2 to !numRows do
+ print_int( !columnSize );
print_string( " by " );
print_int( i );
print_newline();
---8<---

> I tried the code of the "troll" and the ocaml version performs far _worse_

> than the c++ implementation when the column and row get larger.
>
> > 12x12 c++
> > the process was killed by the OS
> >
> > 12x12 ocaml
> > real 0m1.210s
> > user 0m0.886s
> > sys 0m0.001s

I admit I didn't look into the C++ or OCaml code yet, but that's not what I'm
trying to say here.

Everybody is welcome to point out where I might be mistaken in producing the
incredible results (at least to Yoann Padioleau) and confirm whether the
experiments can be repeated or not.

Jacques Garrigue

unread,

Mar 15, 2005, 8:44:50 PM3/15/05

to padi...@irisa.fr, caml...@inria.fr

Well, since it seems difficult to hide that I'm an anonymous coward,
I add a few comments for the list.

From: padi...@irisa.fr
> Yes but this ocaml code use array ?
> In that case it supports what the "troll" said, that is
> the resulting code is no more "functionnal".
> I agree with eijro sumii that ocaml is not just about functionnal
> programming but in the mind of many people advocating ocaml is advocating
> functionnal programming.
>
> I think the way to answer to those trolls is to teach them the way
> to do, in that case to use Map instead of associative list, and
> not to be pretentious and to tell that he is just a newbie.
> He is not a newbie, this garden optimization problem is not that simple.

In this particular case, I believe the trouble is rather that the
problem is really geared toward a specific solution. Efficient
memoization requires efficient access to memoized results, which in
this case can be obtained by mapping states to integers. And there
happens to be a trivial mapping. Then any solution will have to
iterate on the states.

If you look at my translation, I do not mutate arrays (except for
memoization), and use no references. That is, the mapping from states
to integers is actually produced in a purely functional way, and
arrays are only there to provide O(1) access. Moreover state
representation uses lists and natural sharing, so that it is
reasonably space efficient.
I also use lists, pattern matching, and recursion, so I believe this
fulfills his requirement about being written in functional style.

Out of curiosity, I also wrote a purely functional version, where
memoizing is done through an array of lazy values.
http://wwwfun.kurims.kyoto-u.ac.jp/~garrigue/garden3.ml
The code is actually slightly shorter. Performance is rather close.
On a Pentium M 1.8, for a 8x8 garden, I obtain (including memory usage)
Garden.cpp 5.8s 6.1MB (g++ -O)
Garden.cpp 4.5s 6.2MB (g++ -O3)
garden2.ml 5.9s 8.3MB (ocamlopt)
garden3.ml 10s 27.4MB (ocmalopt)
So you can see that garden2, while being almost purely functional,
is really equivalent in performance to his C++ code (which is a bit
dumb, but garden2 doesn't try to be more clever), while garden3 uses
more memory (as expected).

The only thing this example shows is that writing in a functional
language doesn't dispense you of doing a complexity analysis. In
particular the use of structural equality and association lists may
cost a lot in practice.
Some people may have a mystical belief that the compiler will
automatically improve your code through program transformation, but at
least in ocaml the situation is simple: the compiler does no
transformation whatsoever, so you get what you write.
And of course, SICP is a good reading before starting to write
in a functional programming language; I suppose all the structures I
used are explained there in detail.

Jacques Garrigue

Jon Harrop

unread,

Mar 15, 2005, 10:05:28 PM3/15/05

to caml...@yquem.inria.fr

Just for the record, I'd like to dispell a couple of myths:

On Wednesday 16 March 2005 01:05, Yoann Padioleau wrote:
> IMHO the reason it was slow is because it used associative list (instead of
> Map) for associative access,

Although Map is asymptotically faster than List.assoc for lookup (O(ln n) vs
O(n)), OCaml's Hashtbl and array-based equivalents are typically several
times faster than Map.

Also, I think that many people would consider the use of an imperative data
structure, such as a hash table, for memoizing to be the remit of functional
programming.

On Wednesday 16 March 2005 00:18, Oliver Bandel wrote:
> which does not really looks tail recursive.
> Called more than 2 * 10^6 times...
> And many other examples...

In OCaml, non-tail-recursive functions are often faster than their tail
recursive equivalents for small inputs (e.g. short lists). I expect that the
functions you have identified fall into this category, so converting them to
tail-recursive form is likely to slow the program down rather than speed it
up.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

_______________________________________________

Oliver Bandel

unread,

Mar 16, 2005, 5:37:19 AM3/16/05

to caml...@inria.fr

On Wed, Mar 16, 2005 at 02:05:43AM +0100, Yoann Padioleau wrote:
> Oliver Bandel <oli...@first.in-berlin.de> writes:
>
> > But with
> >
> > let columnSize = 5;;
> > and with
> > let numRows = 7;;
> >
> > (look at the ";;" it seems he had used it in the toplevel... :))
>
> And ?

Only nice to see.
Nothing more. ;-)

It's only that I think about it, because he
talks with seemingly bad intentions about OCaml.
(a kind of C++ dogmatism)

> First, I dont think it is a good idea to make fun of people who try to use caml.

I don't do that.

But I can't see that there is really an interest in exploring OCaml.
If he only would be cynic about "well I tried it, but it has proven
it's a crap language... but I give you (experienced OCaml programmers)
the chance to show me that it isn't bad (and *I* have o learn the language)"
then this woul be ok.

But it seems that he's saying: "well, OCaml never can compete with C++".

Maybe I'm wrong here and he's a good guy. But IMHO it seems he's looking
for proved Ocaml-failure (and not for "well, I tried it, it seems OCaml sucks,
but please tell me it does not!", which I could accept).

> Second, many people I know still put ";;" cos they were taught that way

Hey, that was the way I started too! :)

That comment from me only shows: Well, it seems he's an absolute beginner.

To be a beginner is not wrong. It's good!
To be a beginner every day and every time means: to be open for new
things every time!
That's good! :)

But as stated above: Would be nice if the OCaml-bashing would contain at least
a bit of "maybe I'm wrong, and I only use cynism, because I thaught it
performed better (or It's easier to learn).

Maybe I've overlooked that positive-thinking approach behind
his harsh words?!

> (at the very beginning with caml-light I think you were forced to put those ";;", it
> became optional later, and habits are hard to change for many people :) )

> Third, doing stuff at the toplevel is a good idea.

Yes.

So, what are we talking about?

Do you think *I'm the bad guy*?!

"mea culpa" ?!!!????

>
> > which does not really looks tail recursive.
> >
> > Called more than 2 * 10^6 times...
> >
> > And many other examples...
> >
> >
> > e.g. this one:
> >
> > (* applies a list of functions to an argument *)
> > let rec applyFunctionList functions argument =
> > (* 267386 *) match functions with
> > | [] -> (* 9782 *) []

> > | f::rest -> (* 257604 *) (f argument)::(applyFunctionList rest argument);;

> >
> > "only" called 267386 times, but when looking how the arguments
> > are used: also applyFunctionList is not tail recursive...
> > ...and even if called less than 10^6 times... it's a function that
> > creates a list in a non-tailrec way.
> >
> > IMHO this is the reason, why the program performs so badly!
>

> IMHO the reason it was slow is because it used associative list (instead of Map)
> for associative access,

Maybe it is.
I don't say it is *not*.

I have not loked into the code in so much detail
that I could see that the associative map is the problem
or is not.
If the list is short, an associative list may not perform so badly.
(Correct me, if I'm wrong.)

But IMHO creating the environment for a function that is non tail-rec
needs more ressources than using assoc-lists ionstead of maps
in that application (Correct me, if I'm wrong).

(And maybe - if it is always the case - that all depends on
the amount of data/how often functions are called and so on.

> and list of list (instead of array) for storing the grid.

Yes, that also seems to be a problem.

> I am not sure that making the function tail-recursive would have been the big
> hit in this example.

But as long as nobody tried it / analzed it, your assumption is
only an assumption, as my assumption is only an assumotion too!

... IMHO it makes sense in a field where THE solution is not
already known, to come out with ideas that are *possibly*
solutions to a problem.

So, nothing else is "use map instead of..." and "use arrays instead of...".

The real freaks doesn't post to that thread, as it maybe is trivial to them,
to talk about that stuff...?!

> I often transform my functions to make them tail-recursive because of stack overflow pb, not

> that much because of speed pbs

> (and many functions in the standard library are not tail-recursive, such as map)

Well, maybe I have overestimated the tailrec-point,
but as long as there is not a proven counter-example,
the opposite of what I stated seems to be only an assumption too.

Maybe such freaks like the OCaml core-team, or M.Mottl will
see in milliseconds what is going on... but the whole thread
on slashdot (and here) seems to be guessing from many people.

If I'm wrong, please show me.

!!! To see different results it would be nice to have different implementations
!!! to compare them all!

>
> >
> > Ths stuff of tail recursion - even if it took a while
> > until I got it - was one of the first things on this list,
> > that people told me that it is necessary....
>

> I am not sure it is the first optimisation trick to give to a fresh ocaml programmer.

That is a thing what special studied computer science people
gave me as one of the first hints on that list.
(And wehen looking in SICP, it seems, that it's very necessary to think
abot it in more detail.)

Would be interesting to have different variations of the
code, using different ways of coding some special tasks
in different ways.... and maybe oe implementation,
that uses *all* suggestions.

To do it in the language shootout, as on Jon stated it, seems
to be a veryg good idea.

>
> This tail-recursion stuff is one of the thing I hate the most with fp because it forces

> you to change your code to adapt to the machine whereas it should be the
> inverse.

But only because you hate it does not mean that changing the code will not result
in better performance.

At least for that list-creating stuff I have (e.g. reading in a file
and write it to a list of lines) I have seen the advantages (in speed!)
when using tailrec versions or imperative versions) instead of
simple recursion.

And some of that troll-code's functions are really doing that!
Build lists by building lists non-tailrec.

> I don't understand why the compiler don't do himself those transformations.

Well I understand, why a compile does not do that job: because it's
too complicate for a programmer of a compiler to implement that feature. ;-)

But on the other side I think the same: Why not implementing such stuff directly
into the compiler?! Would be nice...

(... or not, because we never would think about what we are doing then...)

> Why is it so hard to take a non-tail-recursive-function and make it a tail-recursive-one ?

It's not hard, it's only "practise, practise, practise" or
"exercise, exercise, exercise..." ;-)

One of the people on this list once shwed a way of how to change a loop
into tailrevc-stuff in general (shown for two or three args of a func).

(Remi Vacant?!).

Maybe I have somewhere a printed page of it and can look for it / find it...

Ciao,
Oliver

P.S.: Well... too much beer this night... :(

David Fox

unread,

Mar 16, 2005, 6:37:12 AM3/16/05

to Kip Macy, Thomas Fischbacher, Christophe TROESTLER, O'Caml Mailing List

That is sad too.

--

This message contains information which may be confidential and privileged. Unless you are the 
addressee (or authorized to receive for the addressee), you may not use, copy or disclose to anyone 
the message or any information contained in the message. If you have received the message in error, 
please advise the sender and delete the message.  Thank you.

Thomas Fischbacher

unread,

Mar 16, 2005, 6:35:38 AM3/16/05

to Oliver Bandel, caml...@inria.fr

On Wed, 16 Mar 2005, Oliver Bandel wrote:

> > Second, many people I know still put ";;" cos they were taught that way
>
> Hey, that was the way I started too! :)

I should say, I do it *on purpose*, even knowing that it is not necessary.

Why? Putting ";;" or not does not make a difference for the programmer,
but not using certain "syntax quirks" makes it easier to operate on the
source code with tools, quite in general.

--
regards, t...@cip.physik.uni-muenchen.de (o_
Thomas Fischbacher - http://www.cip.physik.uni-muenchen.de/~tf //\
(lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y) V_/_
(if (= x 0) y (g g (- x 1) (* x y)))) n 1)) (Debian GNU)

_______________________________________________

Yoann Padioleau

unread,

Mar 16, 2005, 8:15:26 AM3/16/05

to Jon Harrop, caml...@yquem.inria.fr

Jon Harrop <j...@ffconsultancy.com> writes:

> Just for the record, I'd like to dispell a couple of myths:
>
> On Wednesday 16 March 2005 01:05, Yoann Padioleau wrote:

> > IMHO the reason it was slow is because it used associative list (inst=

ead of
> > Map) for associative access,
>

> Although Map is asymptotically faster than List.assoc for lookup (O(ln =
n) vs
> O(n)), OCaml's Hashtbl and array-based equivalents are typically severa=
l
> times faster than Map.

I agree, I beleived that too but
I switched from Map to Hashtbl in the "troll" code and Hashtbl sux.
I don't know why.

> Also, I think that many people would consider the use of an imperative =
data
> structure, such as a hash table, for memoizing to be the remit of funct=
ional
> programming.

I do. As much as possible I try to have "persistent" (persistent in the o=
kasaki sense)
data-structures.

>
> On Wednesday 16 March 2005 00:18, Oliver Bandel wrote:
> > which does not really looks tail recursive.
> > Called more than 2 * 10^6 times...
> > And many other examples...
>

> In OCaml, non-tail-recursive functions are often faster than their tail=

> recursive equivalents for small inputs (e.g. short lists).

really ? why ?

> I expect that the
> functions you have identified fall into this category, so converting th=
em to
> tail-recursive form is likely to slow the program down rather than spee=
d it
> up.

--
Yoann Padioleau, INSA de Rennes, France www.irisa.fr/prive/padiolea/
Opinions expressed here are only mine. Je n'écris qu'à titre personne=
l.
**____ Get Free. Be Smart. Simply use Linux and Free Software. ____*=
*

_______________________________________________

Jacques Garrigue

unread,

Mar 16, 2005, 8:46:53 AM3/16/05

to padi...@irisa.fr, caml...@yquem.inria.fr

From: Yoann Padioleau <padi...@irisa.fr>
> Jon Harrop <j...@ffconsultancy.com> writes:
> > Although Map is asymptotically faster than List.assoc for lookup (O(ln n) vs
> > O(n)), OCaml's Hashtbl and array-based equivalents are typically several

> > times faster than Map.
>
> I agree, I beleived that too but
> I switched from Map to Hashtbl in the "troll" code and Hashtbl sux.
> I don't know why.

Because the default hash function doesn't work well on complex
data-structures, where it has lots of collisions, and results in
putting lots of values in the same bucket. It's a bad idea to directly
use complex data structures as key anyway, but particularly bad with
hash tables.

> > In OCaml, non-tail-recursive functions are often faster than their tail

> > recursive equivalents for small inputs (e.g. short lists).
>
> really ? why ?

Because tail-recursive versions do some extra work to ensure
tail-recursiveness. For instance building a list in reverse order, and
converting it back with List.rev at the end. This only pays off for
huge lists.

Jacques Garrigue

Yoann Padioleau

unread,

Mar 16, 2005, 8:46:25 AM3/16/05

to Oliver Bandel, caml...@inria.fr

Oliver Bandel <oli...@first.in-berlin.de> writes:

> If he only would be cynic about "well I tried it, but it has proven
> it's a crap language... but I give you (experienced OCaml programmers)

> the chance to show me that it isn't bad (and *I* have o learn the langu=

age)"
> then this woul be ok.

I agree.

> Maybe I'm wrong here and he's a good guy. But IMHO it seems he's lookin=
g
> for proved Ocaml-failure (and not for "well, I tried it, it seems OCaml=

sucks,
> but please tell me it does not!", which I could accept).

I think his intention were good (but he is surely a bad guy too).

> Do you think *I'm the bad guy*?!

no :) of course not :)

> > I am not sure that making the function tail-recursive would have been=

the big
> > hit in this example.
>
> But as long as nobody tried it / analzed it, your assumption is
> only an assumption, as my assumption is only an assumotion too!

Well at least my assumption about "use Map instead of assoc list is a big=
hit"
has been proven. The running time from the program go from "more than 16 =
minutes" to just
50 seconds (this is what I call a big hit).
It would be interesting to make his function tail-rec (and keeping the as=
soc list)
and see if it is a big hit but I am too lazy for this and I hate
those tail-rec transformation and I think it would
not be a big hit cos using Map and Array is the big hit.
It's your turn oliver bandel to do the job :)

> > I often transform my functions to make them tail-recursive because of=

stack overflow pb, not
> > that much because of speed pbs

> > (and many functions in the standard library are not tail-recursive, s=

uch as map)
>
>
> Well, maybe I have overestimated the tailrec-point,
> but as long as there is not a proven counter-example,
> the opposite of what I stated seems to be only an assumption too.

True.

>
> !!! To see different results it would be nice to have different impleme=

ntations
> !!! to compare them all!

I agree.

> Would be interesting to have different variations of the
> code, using different ways of coding some special tasks
> in different ways.... and maybe oe implementation,
> that uses *all* suggestions.
>
> To do it in the language shootout, as on Jon stated it, seems
> to be a veryg good idea.

I agree.
I must confess that I have very few intuition about what is more importan=
t
in optimization, but perhaps because it depends on the program.
Sometimes X is a better optimization than Y on this program Z.
Sometimes Y is a better optimization than X on this program Z2.

> > This tail-recursion stuff is one of the thing I hate the most with fp=
because it forces
> > you to change your code to adapt to the machine whereas it should be =
the
> > inverse.
>
> But only because you hate it does not mean that changing the code will =

not result
> in better performance.

true :)

--
Yoann Padioleau, INSA de Rennes, France www.irisa.fr/prive/padiolea/
Opinions expressed here are only mine. Je n'écris qu'à titre personne=
l.
**____ Get Free. Be Smart. Simply use Linux and Free Software. ____*=
*

_______________________________________________

Yoann Padioleau

unread,

Mar 16, 2005, 9:25:01 AM3/16/05

to Jacques Garrigue, padi...@irisa.fr, caml...@yquem.inria.fr

Jacques Garrigue <garr...@math.nagoya-u.ac.jp> writes:

> From: Yoann Padioleau <padi...@irisa.fr>
> > Jon Harrop <j...@ffconsultancy.com> writes:
> > > Although Map is asymptotically faster than List.assoc for lookup (O=
(ln n) vs
> > > O(n)), OCaml's Hashtbl and array-based equivalents are typically se=
veral
> > > times faster than Map.
> >
> > I agree, I beleived that too but
> > I switched from Map to Hashtbl in the "troll" code and Hashtbl sux.
> > I don't know why.
>
> Because the default hash function doesn't work well on complex
> data-structures, where it has lots of collisions, and results in
> putting lots of values in the same bucket. It's a bad idea to directly
> use complex data structures as key anyway, but particularly bad with
> hash tables.
>

> > > In OCaml, non-tail-recursive functions are often faster than their =

tail
> > > recursive equivalents for small inputs (e.g. short lists).
> >
> > really ? why ?
>
> Because tail-recursive versions do some extra work to ensure
> tail-recursiveness. For instance building a list in reverse order, and
> converting it back with List.rev at the end. This only pays off for
> huge lists.

I am happy. I have learned (or re-learned) a few stuff from this thread
so in a way from this "troll" :)
Perhaps it is not always a waste of time to post in the news :)

It would be cool if some of those insights on optimization would be put
somewhere via a wiki.
http://haskell.org/hawiki/ThingsToAvoid is a good stard but it is for h=
askell
(but many stuff said still apply to ocaml).
I am sure that I am not the only one to ask for a wiki
(and not the only one to do nothing about it :) )

It seems that this page is no more accessible from the new website
http://caml.inria.fr/pub/old_caml_site/FAQ/FAQ_EXPERT-eng.html#efficacit=
e

>
> Jacques Garrigue
>

--
Yoann Padioleau, INSA de Rennes, France www.irisa.fr/prive/padiolea/
Opinions expressed here are only mine. Je n'écris qu'à titre personne=
l.
**____ Get Free. Be Smart. Simply use Linux and Free Software. ____*=
*

_______________________________________________

brogoff

unread,

Mar 16, 2005, 12:55:16 PM3/16/05

to caml...@yquem.inria.fr

On Wed, 16 Mar 2005, Jacques Garrigue wrote:
> From: Yoann Padioleau <padi...@irisa.fr>
> > Jon Harrop <j...@ffconsultancy.com> writes:
> > > In OCaml, non-tail-recursive functions are often faster than their tail
> > > recursive equivalents for small inputs (e.g. short lists).
> >
> > really ? why ?
>
> Because tail-recursive versions do some extra work to ensure
> tail-recursiveness. For instance building a list in reverse order, and
> converting it back with List.rev at the end. This only pays off for
> huge lists.

No doubt the implementors will want me guillotined for bringing this up again,
but using the (still functional!) set_cdr! tail recursive functions, which do
*not* reverse the list, are always faster than the non tail recursive
list functions, even for small lists, at least in my experience. So I suspect
that your "for instance" is in fact the only reason for the disparity. I'd
welcome a counterexample.

Those Obj based List functions are what ExtLib provides, I think, and there are
even ways to get this optimization neatly into ML style languages. Maybe in ML
20XY this will be addressed.

-- Brian

Jon Harrop

unread,

Mar 16, 2005, 2:57:06 PM3/16/05

to caml...@yquem.inria.fr

On Wednesday 16 March 2005 17:43, brogoff wrote:
> On Wed, 16 Mar 2005, Jacques Garrigue wrote:
> > Because tail-recursive versions do some extra work to ensure
> > tail-recursiveness. For instance building a list in reverse order, and
> > converting it back with List.rev at the end. This only pays off for
> > huge lists.
>
> No doubt the implementors will want me guillotined for bringing this up
> again, but using the (still functional!) set_cdr! tail recursive functions,
> which do *not* reverse the list, are always faster than the non tail
> recursive list functions, even for small lists, at least in my experience.
> So I suspect that your "for instance" is in fact the only reason for the
> disparity. I'd welcome a counterexample.

Here is one of the counterexamples given in my book, two implementations of a
fold_right function over an implicit semi-inclusive range of integers [l..u):

# let rec fold_right1 f accu l u =
if l < u then f (fold_right1 f accu (l + 1) u) l else accu;;
val fold_right1 : ('a -> int -> 'a) -> 'a -> int -> int -> 'a = <fun>
# let rec fold_right2 f accu l u =
if l < u then let u = u - 1 in fold_right2 f (f accu u) l u else accu;;
val fold_right2 : ('a -> int -> 'a) -> 'a -> int -> int -> unit = <fun>

(A program for timing is at the end of this e-mail).

Here, the non-tail-recursive fold_left function is significantly faster on
smaller lists and the tail-recursive fold_right functions is much faster in
larger lists.

I believe there are many other counterexamples. Indeed, I would even say that
most functions are counterexamples. Perhaps the reason is that non-tail
recursion is used when it is natural for a given task, and transforming into
tail-recursive form is then necessarily more complicating the function.

> Those Obj based List functions are what ExtLib provides, I think, and there
> are even ways to get this optimization neatly into ML style languages.
> Maybe in ML 20XY this will be addressed.

I don't know what the performance of such transformed code would be. Perhaps
the transformation would slow code down. Consequently, it may be early days
to call it an "optimisation".

Here's the timing program:

let rec fold_right1 f accu l u =
if l < u then f (fold_right1 f accu (l + 1) u) l else accu
let rec fold_right2 f accu l u =
if l < u then let u = u - 1 in fold_right2 f (f accu u) l u else accu

let rec test f n = if n>0 then (f (); test f (n-1))

let _ =
let t = Unix.gettimeofday () in
test (fun () -> ignore (fold_right1 ( + ) 0 1 5000)) 10000;
Printf.printf "Non-tail-recursive took: %fs\n"
(Unix.gettimeofday () -. t);
let t = Unix.gettimeofday () in
test (fun () -> ignore (fold_right2 ( + ) 0 1 5000)) 10000;
Printf.printf "Tail-recursive took: %fs\n\n"
(Unix.gettimeofday () -. t);
let t = Unix.gettimeofday () in
test (fun () -> ignore (fold_right1 ( + ) 0 1 50000)) 1000;
Printf.printf "Non-tail-recursive took: %fs\n"
(Unix.gettimeofday () -. t);
let t = Unix.gettimeofday () in
test (fun () -> ignore (fold_right2 ( + ) 0 1 50000)) 1000;
Printf.printf "Tail-recursive took: %fs\n"
(Unix.gettimeofday () -. t)

$ ./test
Non-tail-recursive took: 0.513307s
Tail-recursive took: 0.582472s

Non-tail-recursive took: 1.950229s
Tail-recursive took: 0.590756s

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

_______________________________________________

Oliver Bandel

unread,

Mar 16, 2005, 6:46:15 PM3/16/05

to caml...@inria.fr

On Wed, Mar 16, 2005 at 12:23:02PM +0100, Thomas Fischbacher wrote:
>
> On Wed, 16 Mar 2005, Oliver Bandel wrote:
>
> > > Second, many people I know still put ";;" cos they were taught that way
> >
> > Hey, that was the way I started too! :)
>
> I should say, I do it *on purpose*, even knowing that it is not necessary.
>
> Why? Putting ";;" or not does not make a difference for the programmer,
> but not using certain "syntax quirks" makes it easier to operate on the
> source code with tools, quite in general.

Oh, good idea.

But it makes a difference, when using the toplevel, because it
immediately answers the type it infers, which is a nice feature.

Ciao,
Oliver

Oliver Bandel

unread,

Mar 16, 2005, 7:05:37 PM3/16/05

to caml...@inria.fr

On Wed, Mar 16, 2005 at 02:33:10PM +0100, Yoann Padioleau wrote:
> Oliver Bandel <oli...@first.in-berlin.de> writes:
>
> > If he only would be cynic about "well I tried it, but it has proven
> > it's a crap language... but I give you (experienced OCaml programmers)

> > the chance to show me that it isn't bad (and *I* have o learn the language)"

> > then this woul be ok.
>
> I agree.
>

> > Maybe I'm wrong here and he's a good guy. But IMHO it seems he's looking
> > for proved Ocaml-failure (and not for "well, I tried it, it seems OCaml sucks,

> > but please tell me it does not!", which I could accept).
>
> I think his intention were good (but he is surely a bad guy too).
>
> > Do you think *I'm the bad guy*?!
>
> no :) of course not :)
>

> > > I am not sure that making the function tail-recursive would have been the big

> > > hit in this example.
> >
> > But as long as nobody tried it / analzed it, your assumption is
> > only an assumption, as my assumption is only an assumotion too!
>

> Well at least my assumption about "use Map instead of assoc list is a big hit"
> has been proven. The running time from the program go from "more than 16 minutes" to just

> 50 seconds (this is what I call a big hit).

> It would be interesting to make his function tail-rec (and keeping the assoc list)

> and see if it is a big hit but I am too lazy for this and I hate
> those tail-rec transformation and I think it would
> not be a big hit cos using Map and Array is the big hit.
> It's your turn oliver bandel to do the job :)

Well, I'm too lazy for that job too. :)

But going from > 16 minutes to about 50 seconds is a good
performance boost. :)

But if c++ is beyond 1 second, that is the direction to go...

Well, I'm shure that it is not really a problem to write OCaml-code
that can compete with the C++ code. :)

But especially THAT is the reason, why there is not really
a motivation to do that job: We not have to save the church
of OCaml in a sense of a faith...

...in the sense of
"Faith occurs when a person believes that something is true
even though he suspects it is false".

...so it's not necessary to do the holy war...

... if you every day see, how fast OCaml programs can be...
...and if you know how much the programmer's blindness
yields to slow programs (as I often had to experience),
then no stress is coming up to do that job...

... the map-stuff has shown how far one can go with Ocaml
and I'm shure there is a lot more optimization possible.

So, there is not enough motivation to do that optimization,
because I know that it is possible. ;-)

It's up to the troll to show that tailrec functions
will *not* gain to a speedup and that 50 seconds
is the best what is possible with OCaml... ;-)

The 16 min => 50 seconds speedup has shown enough potential
and disarmed the troll I think.

The rest can be done on the language shootout, as Jon suggested.

Maybe *then* there is a motivation (maybe something like
a micro-programming contest... ;-))

Ciao,
Oliver

Jacques Garrigue

unread,

Mar 16, 2005, 8:19:05 PM3/16/05

to bro...@speakeasy.net, caml...@yquem.inria.fr

From: brogoff <bro...@speakeasy.net>

> On Wed, 16 Mar 2005, Jacques Garrigue wrote:
> > From: Yoann Padioleau <padi...@irisa.fr>
> > > Jon Harrop <j...@ffconsultancy.com> writes:
> > > > In OCaml, non-tail-recursive functions are often faster than their tail
> > > > recursive equivalents for small inputs (e.g. short lists).
> > >
> > > really ? why ?
> >
> > Because tail-recursive versions do some extra work to ensure
> > tail-recursiveness. For instance building a list in reverse order, and
> > converting it back with List.rev at the end. This only pays off for
> > huge lists.
>
> No doubt the implementors will want me guillotined for bringing this
> up again, but using the (still functional!) set_cdr! tail recursive
> functions, which do *not* reverse the list, are always faster than
> the non tail recursive list functions, even for small lists, at
> least in my experience. So I suspect that your "for instance" is in
> fact the only reason for the disparity. I'd welcome a
> counterexample.
>
> Those Obj based List functions are what ExtLib provides, I think,
> and there are even ways to get this optimization neatly into ML
> style languages. Maybe in ML 20XY this will be addressed.

I beg to differ on performance.
Here are the results of a micro-benchmark comparing List.map and
ExtLib.List.map.

With List.map
l10*10000: 0.117188s
l100*1000: 0.132812s
l1000*100: 0.195312s
l10000*10: 0.859375s
l100000*1: 3.328125s

With ExtLib.List.map
l10'*10000: 0.187500s
l100'*1000: 0.203125s
l1000'*100: 0.304688s
l10000'*10: 1.382812s
l100000'*1: 1.937500s

So you can see that the tail-recursive version only gets faster
somewhere between 10000 and 100000 elements. Hardly typical use of
lists. On the other hand, there are cases where the non tail-recursive
version will blow-up your stack, so you have no choice but to go with
the possibly slower tail-recursive one.

Jacques Garrigue

Code used:

let rec genlist n = if n > 0 then n :: genlist (n-1) else []

let l10 = genlist 10
let l100 = genlist 100
let l1000 = genlist 1000
let l10000 = genlist 10000
let l100000 = genlist 100000

let time f n =
let t0 = (Unix.times ()).Unix.tms_utime in
f n;
(Unix.times()).Unix.tms_utime -. t0

let call_map l n =
for i = 1 to n do ignore (List.map succ l) done

let call_extmap l n =
for i = 1 to n do ignore (ExtList.List.map succ l) done

let process l =
List.iter
(fun (name, f, n) ->
Gc.full_major ();
let t = time f (n*100) in
Printf.printf "%s*%d: %fs\n" name n t;
flush stdout)
l

let () =
process
[ "l10", call_map l10, 10000;
"l100", call_map l100, 1000;
"l1000", call_map l1000, 100;
"l10000", call_map l10000, 10;
"l100000", call_map l100000, 1;
"l10'", call_extmap l10, 10000;
"l100'", call_extmap l100, 1000;
"l1000'", call_extmap l1000, 100;
"l10000'", call_extmap l10000, 10;
"l100000'", call_extmap l100000, 1; ]

brogoff

unread,

Mar 16, 2005, 10:50:23 PM3/16/05

to caml...@yquem.inria.fr

I just ran your counterexample and the tail recursive code was faster
for each. I used native code compilation.

Byte code compilation gives similar results to yours, except that as I ran the
test on "ye olde SPARC machine", it took a hell of a lot longer.

I'll try on a few different architectures later.

Anyways, long lists do occur, and Stack_overflow at the customer site sucketh
bigtime.

-- Brian

Yaron Minsky

unread,

Mar 16, 2005, 11:00:22 PM3/16/05

to brogoff, caml...@yquem.inria.fr

On Wed, 16 Mar 2005 19:35:17 -0800 (PST), brogoff
<bro...@speakeasy.net> wrote:
> Anyways, long lists do occur, and Stack_overflow at the customer site sucketh
> bigtime.

I completely agree. As someone who makes extensive use of OCaml in a
business setting, the fact that the standard libraries throw
exceptions on large data structures is a real pain. It's a violation
of the principle of least surprise, and I've run into it in practice
quite a few times. Our solution was to just rewrite the list module
with tail-recursive versions. We're happy to make the performance
tradeoff. If we really want things to be fast, List.map is no good
anyway, due to the lack of inlining of the function parameter.

I do think that it would be a great thing if the tail-recursive list
functions were moved into the standard library.

Yaron

Christian Szegedy

unread,

Mar 17, 2005, 4:56:28 AM3/17/05

to caml...@yquem.inria.fr

brogoff wrote:

>I just ran your counterexample and the tail recursive code was faster
>for each. I used native code compilation.
>
>Byte code compilation gives similar results to yours, except that as I ran the
>test on "ye olde SPARC machine", it took a hell of a lot longer.
>
>

As far as I know, the bytecode compiler does not eliminate
tail calls.

Jon Harrop

unread,

Mar 17, 2005, 5:27:06 AM3/17/05

to caml...@yquem.inria.fr

On Thursday 17 March 2005 03:48, Yaron Minsky wrote:
> On Wed, 16 Mar 2005 19:35:17 -0800 (PST), brogoff
> <bro...@speakeasy.net> wrote:
> > Anyways, long lists do occur, and Stack_overflow at the customer site
> > sucketh bigtime.
>
> I completely agree. As someone who makes extensive use of OCaml in a
> business setting, the fact that the standard libraries throw
> exceptions on large data structures is a real pain.

I believe the Stack_overflow exception is not necessarily thrown - the process
may simply die. IIRC, I found this on x86 Debian Linux when swap was turned
off.

> It's a violation
> of the principle of least surprise, and I've run into it in practice
> quite a few times.

I understand your (and Brian's) concern but I think the solution is to be more
careful when programming, rather than to replace all of the library
functions. That seems like a sledgehammer to crack a nut to me...

> Our solution was to just rewrite the list module
> with tail-recursive versions.

I would recommend using and making documentation instead, as
non-tail-recursive functions can be very useful. In non-trivial code I would
recommend putting the asymptotic complexity of stack use in the docs.

> We're happy to make the performance
> tradeoff. If we really want things to be fast, List.map is no good
> anyway, due to the lack of inlining of the function parameter.

Yes, if your program uses non-tail-recursive functions with very deep
recursion then it will be much slower anyway. Consequently, you are likely to
fix this whilst optimising anyway.

As Brian has said before, users can throw you a sideball by giving input which
requires deep recursion by a non-tail-recursive function which had not been
expected by the programmer. Provided you are thorough, this shouldn't happen
though.

I must say that I've had fewer stack problems with my OCaml code than with
code I've used in other languages...

> I do think that it would be a great thing if the tail-recursive list
> functions were moved into the standard library.

I would not like to see trivial tail-recursive alternatives put into the
library (e.g. let map f l = rev_map f (rev l)). I think it would bloat the
library unnecessarily and they are often not the best solution, particularly
in combination. For example, writing these functions yourself makes a rev rev
and the following deforestation obvious. Given that these are likely to be
performance-critical functions, this is obviously useful.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

_______________________________________________

Oliver Bandel

unread,

Mar 17, 2005, 5:27:16 AM3/17/05

to caml...@yquem.inria.fr

On Wed, Mar 16, 2005 at 09:43:42AM -0800, brogoff wrote:
> On Wed, 16 Mar 2005, Jacques Garrigue wrote:
> > From: Yoann Padioleau <padi...@irisa.fr>
> > > Jon Harrop <j...@ffconsultancy.com> writes:
> > > > In OCaml, non-tail-recursive functions are often faster than their tail
> > > > recursive equivalents for small inputs (e.g. short lists).
> > >
> > > really ? why ?
> >
> > Because tail-recursive versions do some extra work to ensure
> > tail-recursiveness. For instance building a list in reverse order, and
> > converting it back with List.rev at the end. This only pays off for
> > huge lists.
>
> No doubt the implementors will want me guillotined for bringing this up again,
> but using the (still functional!) set_cdr! tail recursive functions, which do
> *not* reverse the list, are always faster than the non tail recursive
> list functions, even for small lists, at least in my experience.

And I experienced, that adding a "_rev" at the end of a function
often makes more sense than adding a List.rev inside.

Often some code produces a list and gives it to a function that
again creates a list, based on that data.
Then "_rev"^2 means _orig and all is going well.

If there are an odd number of "_rev"-functions,
a List.rev can be added *THEN* (instead of always feel
the necessity to provide "_orig"-functions instead of
"_rev"-functions.

Ciao,
Oliver

Oliver Bandel

unread,

Mar 17, 2005, 5:27:28 AM3/17/05

to caml...@yquem.inria.fr

On Wed, Mar 16, 2005 at 10:41:08PM +0900, Jacques Garrigue wrote:
> From: Yoann Padioleau <padi...@irisa.fr>
> > Jon Harrop <j...@ffconsultancy.com> writes:
> > > Although Map is asymptotically faster than List.assoc for lookup (O(ln n) vs
> > > O(n)), OCaml's Hashtbl and array-based equivalents are typically several
> > > times faster than Map.
> >
> > I agree, I beleived that too but
> > I switched from Map to Hashtbl in the "troll" code and Hashtbl sux.
> > I don't know why.
>
> Because the default hash function doesn't work well on complex
> data-structures, where it has lots of collisions, and results in
> putting lots of values in the same bucket. It's a bad idea to directly
> use complex data structures as key anyway, but particularly bad with
> hash tables.

Can you please provide more details here?!
When to use Hashtbl and when not?

Are there (freely available) papers on this theme/topic?

>
> > > In OCaml, non-tail-recursive functions are often faster than their tail
> > > recursive equivalents for small inputs (e.g. short lists).
> >
> > really ? why ?
>
> Because tail-recursive versions do some extra work to ensure
> tail-recursiveness. For instance building a list in reverse order, and
> converting it back with List.rev at the end. This only pays off for
> huge lists.

Where/when is the break-even point?

When to decide for the one or the other solution?

Trying?

Or counting the number cycles the resulting machine code will
need to do it in the one or the other way?

Or are there abstract ways of finding the break even point?

Or experience based rules of thumb?

Ciao,
Oliver

Oliver Bandel

unread,

Mar 17, 2005, 5:27:26 AM3/17/05

to caml...@yquem.inria.fr

On Wed, Mar 16, 2005 at 03:14:11PM +0100, Yoann Padioleau wrote:
[...]

> I am happy. I have learned (or re-learned) a few stuff from this thread
> so in a way from this "troll" :)
> Perhaps it is not always a waste of time to post in the news :)

"Who knows if it's good or bad?!"

>
> It would be cool if some of those insights on optimization would be put
> somewhere via a wiki.

> http://haskell.org/hawiki/ThingsToAvoid is a good stard but it is for haskell

> (but many stuff said still apply to ocaml).

[...]

String.copy ?
Functional Objects?

Ciao,
Oliver

Jon Harrop

unread,

Mar 17, 2005, 5:35:11 AM3/17/05

to caml...@yquem.inria.fr

On Thursday 17 March 2005 03:35, brogoff wrote:
> I just ran your counterexample and the tail recursive code was faster
> for each. I used native code compilation.

That's odd. My previous results were for a 1.2GHz Athlon t-bird, ocamlopt
3.08. Tail recursion is also significantly slower on an 866MHz P3 (x86
native-code) with ocamlopt 3.07:

Non-tail-recursive took: 0.873906s
Tail-recursive took: 1.005320s

Non-tail-recursive took: 4.288313s
Tail-recursive took: 0.986330s

And on an Athlon MP 2600+ with ocamlopt 3.06:

Non-tail-recursive took: 0.289890s
Tail-recursive took: 0.332338s

Non-tail-recursive took: 1.981812s
Tail-recursive took: 0.332071s

This may be a cache effect as these CPUs all have 256kb cache. Perhaps if you
try a smaller/larger problem depending on your cache size...

Oliver Bandel

unread,

Mar 17, 2005, 5:54:38 AM3/17/05

to caml...@yquem.inria.fr

On Thu, Mar 17, 2005 at 10:16:17AM +0000, Jon Harrop wrote:
> On Thursday 17 March 2005 03:48, Yaron Minsky wrote:
> > On Wed, 16 Mar 2005 19:35:17 -0800 (PST), brogoff
> > <bro...@speakeasy.net> wrote:
> > > Anyways, long lists do occur, and Stack_overflow at the customer site
> > > sucketh bigtime.
> >
> > I completely agree. As someone who makes extensive use of OCaml in a
> > business setting, the fact that the standard libraries throw
> > exceptions on large data structures is a real pain.
>
> I believe the Stack_overflow exception is not necessarily thrown - the process
> may simply die. IIRC, I found this on x86 Debian Linux when swap was turned
> off.

Then it is the OS what kills your process, not OCaml.
If your process grows too much, the OS can/must
kill it, so that other processes has enough room to live.

If you have swap-space, the OS can use it.
If not, there is no other possibility than killing the
prcoess.

So, use swpa space, or buy some more memeory for your computer.

It's not hype to have swap-space because people often say
"I have enough memory in RAM - I don't need a awap."
But as you see this is a misleading assumption.

If swap-space would be of no use, I'm shure the Linux-/Debian
developers/maintainers had thrown it out of the kernel.

When the OS kills your process, then OCaml has no chance to
stop this...

Ciao,
Oliver

Ville-Pertti Keinonen

unread,

Mar 17, 2005, 6:15:10 AM3/17/05

to Jon Harrop, caml...@yquem.inria.fr

On Thu, 2005-03-17 at 10:31 +0000, Jon Harrop wrote:
> On Thursday 17 March 2005 03:35, brogoff wrote:
> > I just ran your counterexample and the tail recursive code was faster
> > for each. I used native code compilation.
>
> That's odd. My previous results were for a 1.2GHz Athlon t-bird, ocamlopt
> 3.08. Tail recursion is also significantly slower on an 866MHz P3 (x86

It could be specific to i386 code generation.

Tail-recursive is faster for both of your tests on both of the amd64
machines (512/1024k L2 cache sizes) I tried on when using 64-bit
binaries but slightly slower for the first case when using 32-bit
binaries.

sebasti...@philips.com

unread,

Mar 17, 2005, 7:47:46 AM3/17/05

to caml...@yquem.inria.fr

Just to chime in on this...

Did anybody every consider the following simple solution for the 'map' problem?

let unbreakable_map f xs =
let rec
map limit f xs = (* recursion depth limited to limit *)
match xs with
[] -> []
| x::xs when limit > 0 -> let f_x = f x in f_x::(map (limit-1) f xs)
| _ -> List.rev_append (List.rev_map f xs) []
in map 512 f xs;;

The function is not tail-recursive for lists of length up to 512---at which point it
switches to a tail-recursive algorithm and uses the heap instead of the stack
to keep track of structural recursion.

The overhead introduced is counting down and comparing an int-counter.

Clearly, this solution is applicable to most other structural operations on lists
and it gets rid of many stack overflows nicely.

Since the "magic number" 512 necessarily represents a trade-off, I would like
to see it being chosen at the time the Ocaml compiler is constructed for a specific
target architecture, i.e. at the time somebody more or less knows the stack size.

Cheers,

Sebastian.

----
Dr. Sebastian Egner
Senior Scientist Channel Coding & Modulation
Philips Research Laboratories
Prof. Holstlaan 4 (WDC 1-051, 1st floor, room 51)
5656 AA Eindhoven
The Netherlands
tel: +31 40 27-43166 *** SINCE 10-Feb-2005 ***
fax: +31 40 27-44004
email: sebasti...@philips.com

Jon Harrop <j...@ffconsultancy.com>

Sent by:
caml-li...@yquem.inria.fr

16-03-2005 20:51

To: caml...@yquem.inria.fr
cc: (bcc: Sebastian Egner/EHV/RESEARCH/PHILIPS)
Subject: Re: [Caml-list] OCaml troll on Slashdot

Classification:

Jason Hickey

unread,

Mar 17, 2005, 12:34:33 PM3/17/05

to brogoff, caml...@yquem.inria.fr

brogoff wrote:
> No doubt the implementors will want me guillotined for bringing this up again,
> but using the (still functional!) set_cdr! tail recursive functions, which do
> *not* reverse the list, are always faster than the non tail recursive
> list functions, even for small lists, at least in my experience. So I suspect
> that your "for instance" is in fact the only reason for the disparity. I'd
> welcome a counterexample.

This optimization is probably reasonable. Note that assignment can have
a somewhat unexpected cost when the runtime uses a generational garbage
collector (as OCaml does).

Speaking generally, the collector would like the invariant "all pointers
in a generation refer to younger generations", where the "younger"
relation is reflexive. This is true for purely functional languages,
but not true when assignment is added.

In general, the implementation needs to add a check during assignment:
"if the value being modified belongs to a strictly older generation than
the value being stored, then mark it."

In the implementation of List.map using set_cdr! it should be possible
to eliminate the check. The cons-cell was just allocated, so it should
belong to the minor heap.

Jason

--
Jason Hickey http://www.cs.caltech.edu/~jyh
Caltech Computer Science Tel: 626-395-6568 FAX: 626-792-4257

brogoff

unread,

Mar 17, 2005, 1:25:50 PM3/17/05

to caml...@yquem.inria.fr

On Thu, 17 Mar 2005, Jon Harrop wrote:
> On Thursday 17 March 2005 03:48, Yaron Minsky wrote:
> > On Wed, 16 Mar 2005 19:35:17 -0800 (PST), brogoff
> > <bro...@speakeasy.net> wrote:
> > > Anyways, long lists do occur, and Stack_overflow at the customer site
> > > sucketh bigtime.
> >
> > I completely agree. As someone who makes extensive use of OCaml in a
> > business setting, the fact that the standard libraries throw
> > exceptions on large data structures is a real pain.
>
> I believe the Stack_overflow exception is not necessarily thrown - the process
> may simply die. IIRC, I found this on x86 Debian Linux when swap was turned
> off.
>
> > It's a violation
> > of the principle of least surprise, and I've run into it in practice
> > quite a few times.
>
> I understand your (and Brian's) concern but I think the solution is to be more
> careful when programming, rather than to replace all of the library
> functions. That seems like a sledgehammer to crack a nut to me...

There are a few problems with that suggestion. The cases where I've been bitten
didn't have lists of length 100_000 (or whatever the rather arbitrary number
above which we're not supposed to use lists) but rather had quite a few lists
that were close enough to that number, and the maps (appends, etc.) were being
called inside other mapping functions, on other lists.

Also, as I state above, the number is arbitrary, and having OCaml choke at some
particular size rather than letting me use large lists violates that least
surprise principle. I had an offline discussion recently with another caml-list
person in which he told me that he wished OCaml used Bignums instead of int's
by default. I disagree, but I don't think it's a dumb idea. The behavior of
the standard List functions is worse IMO. Maybe the standard Lisp.map should be
named List.unsafe_map (1/2 :-))?

I realize that this problem can be coded around, sometimes with better data
structures, or by the double reversing approach (which is what I used to use)
but my own sense of programming language aesthetics is that this is a flaw, or
at least a hole in the language that should be filled one day.

> > We're happy to make the performance
> > tradeoff. If we really want things to be fast, List.map is no good
> > anyway, due to the lack of inlining of the function parameter.

That's quite true. However, Yaron, I bet you usually write the program with map
first, and then optimize later iff it isn't fast enough, right? Manual inlining
makes code uglier.

> Yes, if your program uses non-tail-recursive functions with very deep
> recursion then it will be much slower anyway. Consequently, you are likely to
> fix this whilst optimising anyway.
>
> As Brian has said before, users can throw you a sideball by giving input which
> requires deep recursion by a non-tail-recursive function which had not been
> expected by the programmer. Provided you are thorough, this shouldn't happen
> though.

As I said above, you can have "not so deep recursions" nested inside each other
many times, leading to a deep recursion. Besides, "deep" is arbitrary.

> I must say that I've had fewer stack problems with my OCaml code than with
> code I've used in other languages...

Absolutely correct, and my whining should in no way be construed as saying
that OCaml sucks. However, when the language is so good, it's users begin to
desire perfection. When a language is like, say C++ or Perl, then Ada and
Python look really good.

>
> > I do think that it would be a great thing if the tail-recursive list
> > functions were moved into the standard library.
>
> I would not like to see trivial tail-recursive alternatives put into the
> library (e.g. let map f l = rev_map f (rev l)).

I agree with you here, and I'm only beating this dead dromedary to keep up the
pressure for a better solution. It's known in theory how to better than
the double rev solution, but as Jacques pointed out (and I accept his
counterexample for now!) there may be implications on GC which still give a
slight edge to the non tail recursive versions for smaller lists.

-- Brian

Marcin 'Qrczak' Kowalczyk

unread,

Mar 17, 2005, 2:14:51 PM3/17/05

to caml...@inria.fr

Jason Hickey <j...@cs.caltech.edu> writes:

> In general, the implementation needs to add a check during assignment:
> "if the value being modified belongs to a strictly older generation
> than the value being stored, then mark it."
>
> In the implementation of List.map using set_cdr! it should be possible
> to eliminate the check. The cons-cell was just allocated, so it
> should belong to the minor heap.

No, the next cell was allocated afterwards, and it may have triggered GC
in the meantime.

When you allocate a sufficiently large number of cons cells in ascending
order, a GC during that operation will cause some link to be old to young.

--
__("< Marcin Kowalczyk
\__/ qrc...@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/

Marcin 'Qrczak' Kowalczyk

unread,

Mar 17, 2005, 2:29:50 PM3/17/05

to caml...@inria.fr

brogoff <bro...@speakeasy.net> writes:

> Also, as I state above, the number is arbitrary, and having OCaml
> choke at some particular size rather than letting me use large lists
> violates that least surprise principle. I had an offline discussion
> recently with another caml-list person in which he told me that he
> wished OCaml used Bignums instead of int's by default. I disagree,
> but I don't think it's a dumb idea. The behavior of the standard
> List functions is worse IMO. Maybe the standard Lisp.map should be
> named List.unsafe_map (1/2 :-))?

It's not the fault of the mapping function but of the stack being
non-extensible. A user-written recursion can blow it too. Functional
programming is supposed to encourage recursion, and a non-tail-recursive
'map' is much more readable than alternatives.

My implementation of my language Kogut has extensible stack.
And transparent bignums when appropriate. Yes, it's slower,
but correctness is more important.

--
__("< Marcin Kowalczyk
\__/ qrc...@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/

_______________________________________________

Oliver Bandel

unread,

Mar 17, 2005, 6:34:29 PM3/17/05

to caml...@yquem.inria.fr

On Thu, Mar 17, 2005 at 10:06:45AM -0800, brogoff wrote:
[...]

> Also, as I state above, the number is arbitrary, and having OCaml choke at some
> particular size rather than letting me use large lists violates that least
> surprise principle. I had an offline discussion recently with another caml-list
> person in which he told me that he wished OCaml used Bignums instead of int's
> by default. I disagree, but I don't think it's a dumb idea. The behavior of
> the standard List functions is worse IMO. Maybe the standard Lisp.map should be
> named List.unsafe_map (1/2 :-))?

Well, or call
ocaml ocaml_unsafe,
ocamlc ocamlc_unsafe
ocamlopt ocamlopt_unsafe
as long as they are not able to convert simple-recursive
programs into tail-recursive programs (per default or
per cli-option or at all).

>
> I realize that this problem can be coded around, sometimes with better data
> structures, or by the double reversing approach (which is what I used to use)
> but my own sense of programming language aesthetics is that this is a flaw, or
> at least a hole in the language that should be filled one day.

M aybe one day, the cli-switch "-tailrec-all" or "-force-tailrec"
will automatically convert all functions that aren't tailrec
into tailrec ones.
Or maybe a pragma (something like in _inline_)
will be used to convert certain functions
into it's tailrec counterpart internally (maybe only
when a switch "-tailrec-by-pragma" is set).

Ciao,
Oliver

Oliver Bandel

unread,

Mar 17, 2005, 6:54:20 PM3/17/05

to caml...@yquem.inria.fr

On Thu, Mar 17, 2005 at 12:31:57PM +0100, sebasti...@philips.com wrote:
> Just to chime in on this...
>
> Did anybody every consider the following simple solution for the 'map'
> problem?
>
> let unbreakable_map f xs =
> let rec
> map limit f xs = (* recursion depth limited to limit *)
> match xs with
> [] -> []
> | x::xs when limit > 0 -> let f_x = f x in f_x::(map (limit-1) f
> xs)
> | _ -> List.rev_append (List.rev_map f xs) []
> in map 512 f xs;;
>
> The function is not tail-recursive for lists of length up to 512---at
> which point it
> switches to a tail-recursive algorithm and uses the heap instead of the
> stack
> to keep track of structural recursion.

IMHO this does not makes sense.
Better checking listlength with List.length and then calling the
needed function directly.
Why using an integer counter... this introduces overhead.

Better use the match on limit and then call one of the
functions (which both are without a limit-incrementor).

BTW: As stated out by other people, the needed limit can vary.
so it should be possible to give it as argument (at least
as an optional value).

Ciao,
Oliver

David Brown

unread,

Mar 17, 2005, 7:14:28 PM3/17/05

to Oliver Bandel, caml...@yquem.inria.fr

On Thu, Mar 17, 2005 at 10:41:13PM +0100, Oliver Bandel wrote:

> IMHO this does not makes sense.
> Better checking listlength with List.length and then calling the
> needed function directly.
> Why using an integer counter... this introduces overhead.

List.length is O(n) and will be at least as much overhead as a counter kept
for the beginning portion of the list.

Dave

Karl Zilles

unread,

Mar 17, 2005, 7:16:14 PM3/17/05

to Oliver Bandel, caml...@yquem.inria.fr

Oliver Bandel wrote:
> On Thu, Mar 17, 2005 at 12:31:57PM +0100, sebasti...@philips.com wrote:
>>let unbreakable_map f xs =
>> let rec
>> map limit f xs = (* recursion depth limited to limit *)
>> match xs with
>> [] -> []
>> | x::xs when limit > 0 -> let f_x = f x in f_x::(map (limit-1) f
>>xs)
>> | _ -> List.rev_append (List.rev_map f xs) []
>> in map 512 f xs;;
>>
>>The function is not tail-recursive for lists of length up to 512---at
>>which point it
>>switches to a tail-recursive algorithm and uses the heap instead of the
>>stack
>>to keep track of structural recursion.
>
> IMHO this does not makes sense.
> Better checking listlength with List.length and then calling the
> needed function directly.
> Why using an integer counter... this introduces overhead.

Well, if you use List.length, then you're running an integer counter
over the entire length of the list. His way, you only count the first
512 items.

However, given the weird benchmark timings, God knows which would be
faster :)

Jacques Garrigue

unread,

Mar 17, 2005, 8:27:48 PM3/17/05

to sebasti...@philips.com, caml...@yquem.inria.fr

From: sebasti...@philips.com

> Did anybody every consider the following simple solution for the 'map'
> problem?
>
> let unbreakable_map f xs =
> let rec
> map limit f xs = (* recursion depth limited to limit *)
> match xs with
> [] -> []
> | x::xs when limit > 0 -> let f_x = f x in f_x::(map (limit-1) f
> xs)
> | _ -> List.rev_append (List.rev_map f xs) []
> in map 512 f xs;;
>
> The function is not tail-recursive for lists of length up to
> 512---at which point it switches to a tail-recursive algorithm and
> uses the heap instead of the stack to keep track of structural
> recursion.

I just tried to do it, and it seems to work well. The
overhead of the counter is really small.
Note that in the long list case I call the extlib map function rather
than rev o rev_map, as it performs better on such long lists.

Results: (ocamlopt, limit=1000)
using List.map

l10*10000: 0.117188s
l100*1000: 0.132812s
l1000*100: 0.195312s

l10000*10: 0.843750s
l100000*1: 3.328125
using your approach (reverting to ExtLib.List.map)
l10'*10000: 0.140625s
l100'*1000: 0.140625s
l1000'*100: 0.203125s
l10000'*10: 1.265625s
l100000'*1: 1.945312s
using ExtLib.List.map

l10'*10000: 0.187500s
l100'*1000: 0.203125s
l1000'*100: 0.304688s
l10000'*10: 1.382812s
l100000'*1: 1.937500s

The performance is even better with append, and with bytecode also.
So this seems that at least extlib could use that.

One criticism is that it is not really tail recursive, as it consumes
some large amount of stack, albeit limited. Brian seemed to imply that he
has programs for which even this would be bad, but I'm really curious
to such case. Basically it would mean using map recursively rather
deep, meaning exponential runtime anyway, so I'm not sure this is a
problem.

> Since the "magic number" 512 necessarily represents a trade-off, I
> would like to see it being chosen at the time the Ocaml compiler is
> constructed for a specific target architecture, i.e. at the time
> somebody more or less knows the stack size.

Honestly, this part is going to be difficult.
And I'm not sure it is that relevant anyway.
A fixed value already gives safety, good performance on small lists,
and reasonable overall performance.

Jacques Garrigue

Harrison, John R

unread,

Mar 18, 2005, 1:14:40 AM3/18/05

to Marcin 'Qrczak' Kowalczyk, caml...@inria.fr

| It's not the fault of the mapping function but of the stack being
| non-extensible. A user-written recursion can blow it too. Functional
| programming is supposed to encourage recursion, and a
non-tail-recursive
| 'map' is much more readable than alternatives.

Yes, I agree absolutely. I'm sure most users would be happier with the
occasional speed bump while the stack is extended than with uncaught
exceptions. And they'd be relieved of the need to recode trivial
functions with uglifying and time-wasting hacks to make them robust.

| My implementation of my language Kogut has extensible stack.
| And transparent bignums when appropriate. Yes, it's slower,
| but correctness is more important.

It's starting to sound like my dream language.

John.

Bauer, Christoph

unread,

Mar 18, 2005, 3:25:22 AM3/18/05

to brogoff, caml...@yquem.inria.fr

Hi,

> I realize that this problem can be coded around, sometimes
> with better data
> structures, or by the double reversing approach (which is
> what I used to use)
> but my own sense of programming language aesthetics is that
> this is a flaw, or
> at least a hole in the language that should be filled one day.

There is an implementation of VLists for OCaml. VLists are - like
single linked lists - persistent, but have better performance,
require less memory and are GC friendlier. Here is
the announcement and a link to the paper:

http://sourceforge.net/mailarchive/forum.php?thread_id=6740097&forum_id=29880

Stack overflow problems should be solved. Unfortunatly you loose
the ability of pattern matching and List contruction with the
bracket notation.

The paper is quite nice.

Christoph Bauer

Alain Frisch

unread,

Mar 18, 2005, 3:44:49 AM3/18/05

to Bauer, Christoph, caml...@yquem.inria.fr

Bauer, Christoph wrote:
> There is an implementation of VLists for OCaml. VLists are - like
> single linked lists - persistent, but have better performance,
> require less memory and are GC friendlier.

This depends on the kind of operation you need: the VList.tl operation
is more complex and always incurs memory allocation (this can be avoided
in theory, but it seem difficult to do it in OCaml). In particular, if
you implement the map function using only VList.tl and VList.hd, the
result will probably be much slower than with lists (of course, there is
a more efficient VList.map, but it cannot be written using only the
VList API).

-- Alain

brogoff

unread,

Mar 18, 2005, 12:54:50 PM3/18/05

to caml...@inria.fr

On Thu, 17 Mar 2005, Marcin 'Qrczak' Kowalczyk wrote:
> It's not the fault of the mapping function but of the stack being
> non-extensible. A user-written recursion can blow it too. Functional
> programming is supposed to encourage recursion, and a non-tail-recursive
> 'map' is much more readable than alternatives.

Interesting approach. Do you have any information as to how big the performance
hit is?

I've never used SML/NJ except for a few toy programs, but I recall that it
puts activation records on the Gc'ed heap (correct me if I'm wrong
someone) so that call/cc is more efficient, so stack overflows shouldn't
be a problem there either. Could you comment on why you chose extensible stacks
rather than the SMLNJ approach for Kogut.

> My implementation of my language Kogut has extensible stack.
> And transparent bignums when appropriate. Yes, it's slower,
> but correctness is more important.

Hard to disagree when you put it that way, but there you seem to be posing
a false dichotomy. With enough work, C code can be made safe.

What I think you intend is that you'd rather it be easy to write safe code
than that it be asy to write fast code, in the language. I wouldn't mind
that, as long as it isn't ridiculously hard or impossible to do the latter in
the language.

-- Brian

Marcin 'Qrczak' Kowalczyk

unread,

Mar 18, 2005, 1:54:42 PM3/18/05

to caml...@inria.fr

brogoff <bro...@speakeasy.net> writes:

>> It's not the fault of the mapping function but of the stack
>> being non-extensible. A user-written recursion can blow it too.
>> Functional programming is supposed to encourage recursion, and a
>> non-tail-recursive 'map' is much more readable than alternatives.
>
> Interesting approach. Do you have any information as to how big the
> performance hit is?

No.

I recall there was some paper which claimed that heap allocation is
significantly more expensive than stack allocation even with the most
clever GCs. I couldn't find it now. My stack frame allocation is very
similar to heap allocation.

What matters is not only the overhead of stack overflow checking
itself, but the combined effect of several interdependent design
choices. Even if I measured the cost of stack overflow checking in
isolation, it would not give a meaningful answer because it requires
and provides other things and thus the cost would be different if
applied to an implementation with different properties.

I implement a custom stack instead of using the system stack. Effects:
- a C global variable is used for the stack pointer instead of a
machine register, thus stack frame allocation and access to stack
contents is slower (even if it was not checked for overflow)
- checking for stack overflow adds overhead
+ tail call optimization is easier
+ scanning the stack by the GC is easier
+ portable green threads are easier
+ stack overflow is not fatal
+ triggering a context switch is done by faking the stack overflow
condition, so it doesn't need an *additional* overhead (except that
the stack limit is a volatile variable and allocation of a stack
frame is one machine instruction longer because of that)

> I've never used SML/NJ except for a few toy programs, but I recall
> that it puts activation records on the Gc'ed heap (correct me if I'm
> wrong someone) so that call/cc is more efficient, so stack overflows
> shouldn't be a problem there either. Could you comment on why you
> chose extensible stacks rather than the SMLNJ approach for Kogut.

It should be slightly more efficient because allocating stack frames
doesn't increase the frequency of GCs, because stack frames don't need
to be copied by a copying GC, and because they don't need a link to
the previous frame (they do have a "descriptor" pointer though, which
stores their size and is also used to generate stack trace on uncaught
exceptions).

I haven't measured how large the efficiency difference is. It could
even be negative, because I must deallocate a stack frame explicitly,
but I doubt that.

> What I think you intend is that you'd rather it be easy to write
> safe code than that it be asy to write fast code, in the language.
> I wouldn't mind that, as long as it isn't ridiculously hard or
> impossible to do the latter in the language.

It's hard to allow everything...

The combined effect of dynamic typing, passing function arguments in
C global variables, using a custom stack and stack overflow checking
caused the Ackermann function to be 5 times slower than in C.

--
__("< Marcin Kowalczyk
\__/ qrc...@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/

_______________________________________________

Paul Argentoff

unread,

Mar 31, 2005, 6:46:08 AM3/31/05

to Jon Harrop, caml...@yquem.inria.fr

Dear Jon Harrop,

Let JH = "Jon Harrop" in
written_by JH =>

JH> No, the OCaml code (compiled with ocamlopt) is much, much slower than
JH> the C++.

Sorry, what?

--
Yours truly, WBR, Paul Argentoff.
Jabber: pa...@jabber.rtelekom.ru
RIPE: PA1291-RIPE

Paul Argentoff

unread,

Mar 31, 2005, 6:55:19 AM3/31/05

to Yoann Padioleau, Jon Harrop, caml...@yquem.inria.fr

Dear Yoann Padioleau,

Let YP = "Yoann Padioleau" in
written_by YP =>

YP> I have made the obvious optimization which is to replace the assoc
YP> list by a Map (just changing 4 lines in the "troll" code), the ocaml
YP> version is then far far faster.

A programmer is always the clue ;)