[Caml-list] The Implicit Accumulator: a design pattern using optional arguments

Jon Harrop

unread,

Jun 27, 2007, 8:22:11 AM6/27/07

to caml...@yquem.inria.fr

I can't find the thread where we were talking about design patterns recently
but I'd like to note a design pattern that works nicely in OCaml. I'll call
it "The Implicit Accumulator".

ML programmers often use nested auxiliary functions or separate functions to
handle base cases. For example, writing rev in terms of rev_append:

# let rec rev_append l1 l2 = match l1 with
| [] -> l2
| a :: l -> rev_append l (a :: l2);;
val rev_append : 'a list -> 'a list -> 'a list = <fun>
# let rev l = rev_append l [];;
val rev : 'a list -> 'a list = <fun>

Provided performance is unimportant, you can make the accumulator implicit in
OCaml by specifying the default value in an optional argument instead of
having a separate function:

# let rec rev ?(back=[]) = function
| [] -> back
| h::t -> rev ~back:(h::back) t;;
val rev : ?back:'a list -> 'a list -> 'a list = <fun>

When you don't want the auxiliary (rev_append) function, I think this style
results in shorter and clearer code. I used it in the "search" function of my
Sudoku solver, for example:

let rec search ?(x=0) ?(y=0) f accu = match x, y with
9, y -> search ~x:0 ~y:(y+1) f accu (* Next row *)
| 0, 9 -> f accu (* Found a solution *)
| x, y ->
if m.(y).[x] <> '0' then search ~x:(x+1) ~y f accu else
fold (fun accu n ->
let n = Char.chr (n + 48) in
if invalid x y n then accu else
(m.(y).[x] <- n;
let accu = search ~x:(x+1) ~y f accu in
m.(y).[x] <- '0';
accu)) accu 1 10

and it crops up quite a lot in addition to all of the "conventional" uses of
optional arguments.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Quôc Peyrot

unread,

Jun 27, 2007, 9:22:01 AM6/27/07

to caml...@yquem.inria.fr

On Jun 27, 2007, at 2:14 PM, Jon Harrop wrote:

>
> I can't find the thread where we were talking about design patterns
> recently
> but I'd like to note a design pattern that works nicely in OCaml.
> I'll call
> it "The Implicit Accumulator".
>
> ML programmers often use nested auxiliary functions or separate
> functions to
> handle base cases. For example, writing rev in terms of rev_append:
>
> # let rec rev_append l1 l2 = match l1 with
> | [] -> l2
> | a :: l -> rev_append l (a :: l2);;
> val rev_append : 'a list -> 'a list -> 'a list = <fun>
> # let rev l = rev_append l [];;
> val rev : 'a list -> 'a list = <fun>
>
> Provided performance is unimportant, you can make the accumulator
> implicit in
> OCaml by specifying the default value in an optional argument
> instead of
> having a separate function:
>
> # let rec rev ?(back=[]) = function
> | [] -> back
> | h::t -> rev ~back:(h::back) t;;
> val rev : ?back:'a list -> 'a list -> 'a list = <fun>

Could you be more specifics about the performance hit?

> When you don't want the auxiliary (rev_append) function, I think
> this style
> results in shorter and clearer code. I used it in the "search"
> function of my
> Sudoku solver, for example:

It's funny that you speak about this, because I recently (few days
ago) used
a pattern similar to yours, but to actually improve performances.
I had something like that (which is quite different than my actual
code, but
the idea is the same):

let encrypt str =
let len = String.length str in
let encrypted = String.create len in
(* ... *)
encrypted

(*...*)
for i = 0 to 10000000 do
let encrypted = encrypt str in
(* do something on the result *)
done

Which is slow due to the string allocation happening each time we
call "encrypt"

So I rewrote it like that:

let encrypt ?encrypted str =
let len = String.length str in
let result = match encrypted with
| None -> String.create len
| Some s -> s
in
(* ... *)
result

(* ... *)
let encrypted = String.create (String.length str) in
for i = 0 to 1000000000 do
let encrypted = encrypt ~encrypted str in
(* ... *)
done

Which gave me more than a 2x speedup while still being able to call a
simple:
let encrypted = encrypt str
during normal usage

I was quite happy with this solution, but maybe there is something
more elegant to do?
(I'm still in the process of learning good optimization patterns in
ocaml which preserve readability)

--
Best Regards,
Quôc

Thomas Fischbacher

unread,

Jun 27, 2007, 9:58:07 AM6/27/07

to Quôc Peyrot, caml...@yquem.inria.fr

Quôc Peyrot wrote:

> It's funny that you speak about this, because I recently (few days ago)
> used
> a pattern similar to yours, but to actually improve performances.
> I had something like that (which is quite different than my actual
> code, but
> the idea is the same):
>
> let encrypt str =
> let len = String.length str in
> let encrypted = String.create len in
> (* ... *)
> encrypted

vs.

> let encrypt ?encrypted str =
> let len = String.length str in
> let result = match encrypted with
> | None -> String.create len
> | Some s -> s
> in
> (* ... *)
> result

> Which gave me more than a 2x speedup while still being able to call a

> simple:
> let encrypted = encrypt str
> during normal usage

I use this strategy a lot and found that it eventually pays to use
uniform conventions for that: all my functions that can benefit from
having space pre-allocated where to write a result to use ?target
as their very first named optional argument (and also return that
target buffer afterwards).

However, unless I am mistaken, I fear that this also does introduce
some intrinsic/unavoidable inefficiency, as providing a ?target
argument will (presuambly?) require dynamic consing of a <thingy>
option cell - so not a good idea for a very small function that is
called very very often.

There are many many way more advanced tricks one would want to play
with the idea of "allocating buffers at the appropriate time". For
example, if this were LISP, one could often use dynamically scoped (in
the sense of (declare (dynamic-extent buffer-stack))) contextual
variables for great benefit (and these gory details often can also be
hidden quite conveniently under a few (maybe even in-place macrolet)
macros...), but unfortunately, OCaml does not support anything like
that. Of course, re-entrantness of your code may always become an
issue if you move buffers to higher scopes.

One thing OCaml can do better than, say, CMU CL, is to define globally
visible functions that depend on some otherwise inaccessible context,
as in the following example:

let float_factorial =
let _known_factorials = ref [|1.0;1.0;2.0;6.0;24.0;120.0;720.0|] in
(fun n ->
let known_factorials = !_known_factorials in
let nr_known = Array.length known_factorials in
if n < nr_known
then
known_factorials.(n)
else
let new_known_factorials = Array.make (n+1) 0.0 in
begin
for i=0 to nr_known-1 do
new_known_factorials.(i) <- known_factorials.(i)
done;
(let rec fill f_pos pos =
if pos > n then ()
else
let () = new_known_factorials.(pos) <- f_pos in
fill (f_pos *. (float_of_int (pos+1))) (pos+1)
in
fill (known_factorials.(nr_known-1)*.(float_of_int nr_known)) nr_known);
_known_factorials := new_known_factorials;
new_known_factorials.(n)
end)
;;

A corresponding

(let ((buffer (make-array ...)))
(defun float-factorial (n)
...))

just plainly does not work with CMU CL/SBCL. :-(

Other advanced optimization techniques that can be used for benefit
in very specialized situations include (explicit) continuation coding:
rather than returning a value (e.g. a tuple), you take as an argument
a function to which you then pass on your return value(s). This is quite
useful whenever "execution flow branches out into multiple paths that
have to be taken", and may sometimes (though rarely) also be used for
good as a poor man's VALUES/MULTIPLE-VALUE-BIND substitute.

--
best regards,
Thomas Fischbacher
t...@functionality.de

Jon Harrop

unread,

Jun 27, 2007, 10:00:34 AM6/27/07

to caml...@yquem.inria.fr

On Wednesday 27 June 2007 14:18:07 Quôc Peyrot wrote:
> On Jun 27, 2007, at 2:14 PM, Jon Harrop wrote:

> > Provided performance is unimportant...

>
> Could you be more specifics about the performance hit?

Actually, no. I have no quantitative results but I remember that it is
significant.

> I was quite happy with this solution, but maybe there is something
> more elegant to do?

Nice. :-)

> (I'm still in the process of learning good optimization patterns in
> ocaml which preserve readability)

I was going to mention symbol tables to Raj B for his Python JIT compiler but
he hasn't gotten back to me yet.

Basically, you memoize strings:

# let symbol =
let m = Hashtbl.create 1 in
fun string ->
try Hashtbl.find m string with Not_found ->
Hashtbl.add m string string;
string;;
val symbol : '_a -> '_a = <fun>

This is another trick I learned whilst writing my Mathematica interpreter (so
many tricks, so little time). This function looks totally pointless, like a
no-op, but if you pipe your identifiers through it (e.g. when building the
AST during parsing) then all structurally-equal strings are the same physical
string. If you're careful, this lets you use physical equality for string
comparison and that is a lot faster.

You might do something similar by memoizing temporary strings of different
lengths.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Thomas Fischbacher

unread,

Jun 27, 2007, 10:20:37 AM6/27/07

to Jon Harrop, caml...@yquem.inria.fr

Jon Harrop wrote:

> Basically, you memoize strings:
>
> # let symbol =
> let m = Hashtbl.create 1 in
> fun string ->
> try Hashtbl.find m string with Not_found ->
> Hashtbl.add m string string;
> string;;
> val symbol : '_a -> '_a = <fun>

..which is, of course, just a consequence of the OCaml drawback that
there is no proper "symbol" data type (which actually would be useful,
in particular in conjunction with proper EQ hash tables)...

By the way, the Perl interpreter does the same thing with
script-constant strings, also using this to provide
"effective almost-symbols".

--
best regards,
Thomas Fischbacher
t...@functionality.de

_______________________________________________

Quôc Peyrot

unread,

Jun 27, 2007, 11:09:05 AM6/27/07

to caml...@yquem.inria.fr

On Jun 27, 2007, at 3:55 PM, Thomas Fischbacher wrote:

> Quôc Peyrot wrote:

[...]

> > let encrypt ?encrypted str =
> > let len = String.length str in
> > let result = match encrypted with
> > | None -> String.create len
> > | Some s -> s
> > in
> > (* ... *)
> > result
>
>> Which gave me more than a 2x speedup while still being able to
>> call a simple:
>> let encrypted = encrypt str
>> during normal usage
>
> I use this strategy a lot and found that it eventually pays to use
> uniform conventions for that: all my functions that can benefit from
> having space pre-allocated where to write a result to use ?target
> as their very first named optional argument (and also return that
> target buffer afterwards).

Ah, thanks, I was actually trying to find a common name too, but
didn't really like "result". "target" is nice :p

> However, unless I am mistaken, I fear that this also does introduce
> some intrinsic/unavoidable inefficiency, as providing a ?target
> argument will (presuambly?) require dynamic consing of a <thingy>
> option cell - so not a good idea for a very small function that is
> called very very often.
>
> There are many many way more advanced tricks one would want to play
> with the idea of "allocating buffers at the appropriate time". For
> example, if this were LISP, one could often use dynamically scoped (in
> the sense of (declare (dynamic-extent buffer-stack))) contextual
> variables for great benefit (and these gory details often can also be
> hidden quite conveniently under a few (maybe even in-place macrolet)
> macros...), but unfortunately, OCaml does not support anything like
> that. Of course, re-entrantness of your code may always become an
> issue if you move buffers to higher scopes.

I didn't get that part, but I'm not familiar with Lisp.

> One thing OCaml can do better than, say, CMU CL, is to define globally
> visible functions that depend on some otherwise inaccessible context,
> as in the following example:
>
> let float_factorial =
> let _known_factorials = ref [|1.0;1.0;2.0;6.0;24.0;120.0;720.0|] in
> (fun n ->

I encountered this pattern today while reading extlib's OptParse.Opt
code:

let value_option metavar default coerce errfmt =
let data = ref default in
{
option_metavars = [metavar];
option_defhelp = None;
option_get = (fun _ -> !data);
option_set_value = (fun x -> data := Some x);
(*...*)

I was a little bit surprised at first that we could do that
(let ...ref... in) but it's really nice.
To me it seems that the common feature which enables us to do all
these tricks is the fact that we have a garbage collector (correct me
if I am wrong). It's really powerful, and I find it fascinating.

I mean, for someone like me, with quite some experience in the asm/c/c
++ world (i.e. a garbage collector-less world) but not much in other
languages, it's easy to naively think of a garbage collector as a
fancy feature to prevent from having to call "free/delete". But I'm
starting to realize there is a whole new set of powerful design
patterns which come along. It has been said multiple times on this
mailing list, but I think we really miss a book about these design
patterns and optimization tricks often specific to a given (or a set
of) feature (functional, lazy computations, garbage collector...).

I find it ironical that high-level languages (such as ocaml) are
intended (of course that's my interpretation of it) to hide low-level
details and give you more expressiveness in your code, which should
naively make you more productive, and make it easier to program
something. But requires therefore tons of new knowledges and deep
understanding of advanced concepts to be able to actually code
efficient (runtime and memory-wise) code.

I mean, in asm/c/c++ there isn't much feature to learn, you pretty
much do everything yourself. It's therefore quite easy (comparing to
OCaml) to actually see what is efficient and what is not. OCaml is so
high-level, and is doing so much for you, that you really need to
learn a lot more about compilation theory to be able to actually feel
at ease when you are looking for efficiency without giving up too
much code elegance. But don't get me wrong, I love it, it's
fascinating, but still ironical from my point of view.

> Other advanced optimization techniques that can be used for benefit
> in very specialized situations include (explicit) continuation coding:
> rather than returning a value (e.g. a tuple), you take as an argument
> a function to which you then pass on your return value(s). This is
> quite
> useful whenever "execution flow branches out into multiple paths that
> have to be taken", and may sometimes (though rarely) also be used for
> good as a poor man's VALUES/MULTIPLE-VALUE-BIND substitute.

I didn't get that part at all. I think I would need an example to
understand
why it is interesting to pass the function instead of just returning
the tuple
and processing it.

--
Best Regards,
Quôc

Quôc Peyrot

unread,

Jun 27, 2007, 11:10:41 AM6/27/07

to caml...@yquem.inria.fr

On Jun 27, 2007, at 3:53 PM, Jon Harrop wrote:

[...]

> I was going to mention symbol tables to Raj B for his Python JIT
> compiler but
> he hasn't gotten back to me yet.
>
> Basically, you memoize strings:
>
> # let symbol =
> let m = Hashtbl.create 1 in
> fun string ->
> try Hashtbl.find m string with Not_found ->
> Hashtbl.add m string string;
> string;;
> val symbol : '_a -> '_a = <fun>
>
> This is another trick I learned whilst writing my Mathematica
> interpreter (so
> many tricks, so little time). This function looks totally
> pointless, like a
> no-op, but if you pipe your identifiers through it (e.g. when
> building the
> AST during parsing) then all structurally-equal strings are the
> same physical
> string. If you're careful, this lets you use physical equality for
> string
> comparison and that is a lot faster.

It's a really nice trick indeed.

--
Best Regards,
Quôc

Jon Harrop

unread,

Jun 27, 2007, 11:27:00 AM6/27/07

to caml...@yquem.inria.fr

On Wednesday 27 June 2007 14:55:53 Thomas Fischbacher wrote:
> For
> example, if this were LISP, one could often use dynamically scoped (in
> the sense of (declare (dynamic-extent buffer-stack))) contextual
> variables for great benefit (and these gory details often can also be
> hidden quite conveniently under a few (maybe even in-place macrolet)
> macros...), but unfortunately, OCaml does not support anything like
> that.

This seems to be something that Lisp uses to allocate data structures on the
stack rather than the heap. Why would you want that?

> let float_factorial =
> let _known_factorials = ref [|1.0;1.0;2.0;6.0;24.0;120.0;720.0|] in
> (fun n ->
> let known_factorials = !_known_factorials in
> let nr_known = Array.length known_factorials in
> if n < nr_known
> then
> known_factorials.(n)
> else
> let new_known_factorials = Array.make (n+1) 0.0 in
> begin
> for i=0 to nr_known-1 do
> new_known_factorials.(i) <- known_factorials.(i)
> done;
> (let rec fill f_pos pos =
> if pos > n then ()
> else
> let () = new_known_factorials.(pos) <- f_pos in
> fill (f_pos *. (float_of_int (pos+1))) (pos+1)
> in
> fill (known_factorials.(nr_known-1)*.(float_of_int nr_known)) nr_known);
> _known_factorials := new_known_factorials;
> new_known_factorials.(n)
> end)

I can't quite follow that. Is it doing something cleverer than this:

let float_factorial =
let m = ref [||] in
fun n ->
try (!m).(n) with _ ->
let m' = Array.make (n + 1) 1. in
for i=1 to n do
m'.(i) <- float i *. m'.(i - 1)
done;
m := m';
m'.(n);;

> Other advanced optimization techniques that can be used for benefit
> in very specialized situations include (explicit) continuation coding:
> rather than returning a value (e.g. a tuple), you take as an argument
> a function to which you then pass on your return value(s). This is quite
> useful whenever "execution flow branches out into multiple paths that
> have to be taken",

Are you referring to CPS?

> and may sometimes (though rarely) also be used for
> good as a poor man's VALUES/MULTIPLE-VALUE-BIND substitute.

Weren't values and multiple-value-bind completely superceded by pattern
matching?

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Mattias Engdegård

unread,

Jun 27, 2007, 11:30:29 AM6/27/07

to j...@ffconsultancy.com, caml...@yquem.inria.fr

># let symbol =
> let m = Hashtbl.create 1 in
> fun string ->
> try Hashtbl.find m string with Not_found ->
> Hashtbl.add m string string;
> string;;
>val symbol : '_a -> '_a = <fun>

Nice, but perhaps it should return a different type to avoid accidentally
comparing an interned string with a non-interned one?

We would need an accessor to extract the string from an abstract type but
I suppose that would be inlined.

Robert Fischer

unread,

Jun 27, 2007, 11:41:08 AM6/27/07

to caml...@yquem.inria.fr

I don't have a top-level with me right now, so I'm shooting from the
hip, but I don't think it is a compile-time error to attempt to use the
referential equality operator on two different types. If I'm right,
then you will want a separate is-equals function that wraps the
referential equality check and enforces type safety.

Robert Fischer
IT Firefighter
Smokejumper Consulting

Mattias Engdegård

unread,

Jun 27, 2007, 11:50:38 AM6/27/07

to robert....@smokejumperit.com, caml...@yquem.inria.fr

>I don't have a top-level with me right now, so I'm shooting from the
>hip, but I don't think it is a compile-time error to attempt to use the
>referential equality operator on two different types.

No, (==) : 'a -> 'a , so this would work:

module Sym : sig
type t
val symbol : string -> t
val str : t -> string
end = struct
type t = string

let symbol =
let m = Hashtbl.create 1 in

fun s ->
try Hashtbl.find m s
with Not_found -> (Hashtbl.add m s s; s)
let str s = s
end;;

# "toto" == "alpha" ;;
- : bool = false
# Sym.symbol "alpha" == Sym.symbol "alpha" ;;
- : bool = true
# Sym.symbol "alpha" == "alpha" ;;
Characters 22-29:
Sym.symbol "alpha" == "alpha" ;;
^^^^^^^
This expression has type string but is here used with type Sym.t

The cost is a slightly clumsier use of symbols as strings (Symbol.str),
but my feeling is that the extra safety is worth it.

Jon Harrop

unread,

Jun 27, 2007, 12:01:14 PM6/27/07

to caml...@yquem.inria.fr

On Wednesday 27 June 2007 16:06:51 Quôc Peyrot wrote:
> I mean, for someone like me, with quite some experience in the asm/c/c
> ++ world (i.e. a garbage collector-less world) but not much in other
> languages, it's easy to naively think of a garbage collector as a
> fancy feature to prevent from having to call "free/delete". But I'm
> starting to realize there is a whole new set of powerful design
> patterns which come along. It has been said multiple times on this
> mailing list, but I think we really miss a book about these design
> patterns and optimization tricks often specific to a given (or a set
> of) feature (functional, lazy computations, garbage collector...).

This is an excellent idea. I'll write an OCaml Journal article on design
patterns! :-)

> > Other advanced optimization techniques that can be used for benefit
> > in very specialized situations include (explicit) continuation coding:
> > rather than returning a value (e.g. a tuple), you take as an argument
> > a function to which you then pass on your return value(s). This is
> > quite
> > useful whenever "execution flow branches out into multiple paths that
> > have to be taken", and may sometimes (though rarely) also be used for
> > good as a poor man's VALUES/MULTIPLE-VALUE-BIND substitute.
>
> I didn't get that part at all. I think I would need an example to
> understand
> why it is interesting to pass the function instead of just returning
> the tuple
> and processing it.

I think Thomas is referring to continuation passing style (CPS). That isn't an
optimization though (it slows things down) but it does let you abstract away
mutation. However, it is not entirely safe in the absence of linear types.

For example, the immutable Map and mutable Hashtbl both map keys to values. If
you wrap them with an API written in CPS then you can switch between Maps and
Hashtbls transparently:

module type MAP = sig
type t
val create : unit -> t
val add : string -> string -> t -> (t -> 'a) -> 'a
val remove : string -> t -> (t -> 'a) -> 'a
val copy : t -> (t * t -> 'a) -> 'a
end;;

module Pure : MAP = struct
module Map = Map.Make(String)

type t = string Map.t

let create() = Map.empty

let add k v m f = f(Map.add k v m)
let remove k m f = f(Map.remove k m)
let copy m f = f(m, m)
end;;

module Impure : MAP = struct
type t = (string, string) Hashtbl.t

let create() = Hashtbl.create 1

let add k v m f =
Hashtbl.replace m k v;
let f_m = f m in
Hashtbl.remove m k;
f_m

let remove k m f =
let v = Hashtbl.find m k in
Hashtbl.remove m k;
let f_m = f m in
Hashtbl.add m k v;
f_m

let copy m f = f(m, Hashtbl.copy m)
end;;

However, this is not completely safe because you might erroneously return a
map or hash table from the continuation "f" passed to these functions.
Enforcing this statically requires linear types.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Robert Fischer

unread,

Jun 27, 2007, 12:02:58 PM6/27/07

to caml...@yquem.inria.fr

I'm happier with Ocaml every day. :-D

Robert Fischer
IT Firefighter
Smokejumper Consulting

Mattias Engdegård

unread,

Jun 27, 2007, 12:05:33 PM6/27/07

to mat...@virtutech.se, robert....@smokejumperit.com, caml...@yquem.inria.fr

># "toto" == "alpha" ;;
>- : bool = false

Ahem. The comparison fails for "alpha" == "alpha" as well (which I
meant to write). Sorry.

Thomas Fischbacher

unread,

Jun 27, 2007, 12:40:36 PM6/27/07

to Quôc Peyrot, caml...@yquem.inria.fr

Quôc Peyrot wrote:

>> There are many many way more advanced tricks one would want to play
>> with the idea of "allocating buffers at the appropriate time". For
>> example, if this were LISP, one could often use dynamically scoped (in
>> the sense of (declare (dynamic-extent buffer-stack))) contextual
>> variables for great benefit (and these gory details often can also be
>> hidden quite conveniently under a few (maybe even in-place macrolet)
>> macros...), but unfortunately, OCaml does not support anything like
>> that. Of course, re-entrantness of your code may always become an
>> issue if you move buffers to higher scopes.
>
> I didn't get that part, but I'm not familiar with Lisp.

One example: what you can do quite easily in LISP is to introduce
a global lookup thingy MEMOIZATIONS (say, a list of hash tables
or something like that) and define macros WITH-LOCAL-MEMORY and
MEMOIZING where WITH-LOCAL-MEMORY defines a new dynamic
memoization scope, and MEMOIZING wraps up a few functions in such
a way that they use memoization. Result: we can have both
memoization on functions as well as defined behaviour with respect
to when memoized values are being forgotten again (namely, when
we are finished with the evaluation of the form

(WITH-LOCAL-MEMORY ...)).

Note that functions dynamically called from functions lexically scoped
inside this construct will ALSO use the same local memoization table!
In other words, when control flow exits the WITH-LOCAL-MEMORY block
(in whatever way it does so), our memoizing information is returned to
precisely the state it was in before we entered that block. That is
the magic of dynamic scoping.

In code:

(defvar *MEMOIZATIONS* '())

(defun _LOOKUP-MEMO (sym-f args)
(let ((f-args (cons sym-f args)))
(labels
((walk (rest-memo)
(if (null rest-memo)
(values nil nil)
(let ((h (car rest-memo)))
(multiple-value-bind (entry was-present)
(gethash f-args h)
(if was-present
(values entry t)
(walk (cdr rest-memo))))))))
(walk *MEMOIZATIONS*))))

(defmacro WITH-LOCAL-MEMORY (&body body)
`(let ((*MEMOIZATIONS*
(cons (make-hash-table :test 'equal)
*MEMOIZATIONS*)))
(declare (dynamic-extent *MEMOIZATIONS*))
. ,body))

(defmacro MEMOIZING (funs &body body)
(let* (($args (gensym "args-"))
($key (gensym "key-"))
($val (gensym "val-"))
($have (gensym "have-"))
(wrap-funcall
(lambda (sym)
`(,sym (&rest ,$args)
(let ((,$key (cons ',sym ,$args)))
(multiple-value-bind (,$val ,$have)
(_LOOKUP-MEMO ',sym ,$args)
(if ,$have ,$val
(let ((,$val (apply #',sym ,$args)))
(setf (gethash ,$key (car *MEMOIZATIONS*)) ,$val)
,$val))))))))
`(flet ,(mapcar wrap-funcall funs)
. ,body)))

;; Example:

(labels
((foo (x) (+ 1 (* 3 x)))
(bar (x) (/ x 2)))
(memoizing (foo bar)
(labels
((check-3x+1 (n nr-steps)
(cond
((= n 1) nr-steps)
((evenp n) (check-3x+1 (bar n) (+ 1 nr-steps)))
(t (check-3x+1 (foo n) (+ 1 nr-steps))))))
(do ((j 1 (+ 1 j))) ((= j 100))
(with-local-memory
(print (cons j (check-3x+1 j 0))))))))

> I mean, for someone like me, with quite some experience in the asm/c/c
> ++ world (i.e. a garbage collector-less world) but not much in other
> languages, it's easy to naively think of a garbage collector as a fancy
> feature to prevent from having to call "free/delete". But I'm starting
> to realize there is a whole new set of powerful design patterns which
> come along. It has been said multiple times on this mailing list, but I
> think we really miss a book about these design patterns and
> optimization tricks often specific to a given (or a set of) feature
> (functional, lazy computations, garbage collector...).

Two comments about this: First, one should not think along the lines of
"design patterns" here, as if this were Something Universally Good(TM).
Rather, a "design pattern" very often is the equivalent of a clever way
to open a tin with a pair of scissors: an "industry best practice"
workaround that deals with a problem created by the language that should
not be there in the first place! (Paul Graham wrote a nice article on
this.)

Whenever you discover a "design pattern" in your work, it pays to think
about it like this: why does such a pattern occur? Is it because I try
to work around a problem such as wanting to tell the machine about X but
not being able to express it the way I like to think about it myself?

If so, it is often a good idea to consider introducing a language
extension (quite simple if your language provides you with some
meta-linguistic capabilities, ideally: dirty macros plus a code-walker)
to deal with this evidently linguistic limitation.

This brings me to my second comment: it does take a lot of experience
to advance to the level of a language-shaping wizard: there are many
pitfalls and things that at first look as if they may work, but have
subtle undesired implications. One has to develop a strong sense for
important invariants under code transformations to get that bit right.

With this, I suppose, a proper book on both "functional optimization
strategies" and "ideas that help you to overcome mental barriers with
respect to what's possible when one can shape the language" would be
useful, not so much to "teach specific patterns", but to teach people
how to overcome their mental blockades and learn how to use their
phantasy to do marvelous things by shaping language. In the
Permaculture community, there is this proverb that "yield is
limited only by imagination". I think this holds just as much for
functional and in particular metalinguistic programming.

(I have been planning for years to eventually write up some
lengthier introductory text on metalinguistic techniques, but so
far only managed to write a few articles and give some short courses
on the subject...)

> I find it ironical that high-level languages (such as ocaml) are
> intended (of course that's my interpretation of it) to hide low-level
> details and give you more expressiveness in your code, which should
> naively make you more productive, and make it easier to program
> something. But requires therefore tons of new knowledges and deep
> understanding of advanced concepts to be able to actually code
> efficient (runtime and memory-wise) code.

Languages such as OCaml are not "intended to hide low-level details".

Rather, there are (at least) two very different notions of "programming"
around:

(1) Putting information into a mechanically behaving system in order
to get some desired behaviour. (This is what asm/C/C++ is about,
but actually, this even is a much broader notion that also includes
e.g. this: http://amasci.com/amateur/mirror.html)

(2) Formalizing some "mental process" in such a way that one can
then use stringent reasoning to analyze its properties. (This is
what, in essence, functional programming is about.)

Evidently, the more advanced you get, the more important the second
point of view becomes. But, with people being hedonistic and out for
quick results, we will keep on re-inventing simple instruction-based
programming systems over and over again, redoing all the historic
mistakes of unstructured goto programming, inappropriate checks for
exceptional conditions (such as overflows), not paying attention to
dynamical resource management at the level of the framework,
etc. etc. in novel designs till the dusk of the computing epoch.

>> Other advanced optimization techniques that can be used for benefit
>> in very specialized situations include (explicit) continuation coding:
>> rather than returning a value (e.g. a tuple), you take as an argument
>> a function to which you then pass on your return value(s). This is quite
>> useful whenever "execution flow branches out into multiple paths that
>> have to be taken", and may sometimes (though rarely) also be used for
>> good as a poor man's VALUES/MULTIPLE-VALUE-BIND substitute.
>
> I didn't get that part at all. I think I would need an example to
> understand
> why it is interesting to pass the function instead of just returning
> the tuple
> and processing it.

Exercise: Write a program that takes as an argument an integer N and
spits out a string that is a piece of C code which looks as follows
for N=2:

void sort(int x1, int x2, int *buffer)
{
if(x1>x2)
{
buffer[0]=x1;
buffer[1]=x2;
}
else
{
buffer[0]=x2;
buffer[1]=x1;
}
}

This should be generalized to higher N, where the constraints are:

* The generated piece of code must only contain ifs, </> comparisons,
and assignments to the buffer.

* In the end, buffer must hold the input variables in sorted order.

* The code must use the minimal number of comparisons.

If you do this exercise, you will discover that the idea of continuation
coding can be very, very helpful.

--
best regards,
Thomas Fischbacher
t...@functionality.de

_______________________________________________

Thomas Fischbacher

unread,

Jun 27, 2007, 12:45:25 PM6/27/07

to Jon Harrop, caml...@yquem.inria.fr

Jon Harrop wrote:

> This seems to be something that Lisp uses to allocate data structures on the
> stack rather than the heap. Why would you want that?

In order to avoid dynamic memory management and get dynamically scoped
pre-allocated "implicit context" buffers to which I can refer as if they
were ordinary variables.

Well, it avoids some of the computations in your example, which re-does
all the array whenever it has to be extended.

>>Other advanced optimization techniques that can be used for benefit
>>in very specialized situations include (explicit) continuation coding:
>>rather than returning a value (e.g. a tuple), you take as an argument
>>a function to which you then pass on your return value(s). This is quite
>>useful whenever "execution flow branches out into multiple paths that
>>have to be taken",
>
> Are you referring to CPS?

Yes, but not the call/cc implicit CPS, but explicitly passing around
continuations.

>>and may sometimes (though rarely) also be used for
>>good as a poor man's VALUES/MULTIPLE-VALUE-BIND substitute.
>
> Weren't values and multiple-value-bind completely superceded by pattern
> matching?

No. :-) Pattern matching requires constructors, which cons. I am talking
about dynamical memory management avoidance techniques. There are a lot.

--
best regards,
Thomas Fischbacher
t...@functionality.de

_______________________________________________

Daniel Bünzli

unread,

Jun 27, 2007, 12:57:14 PM6/27/07

to caml...@yquem.inria.fr

Le 27 juin 07 ą 15:53, Jon Harrop a écrit :

> Basically, you memoize strings:
>
> # let symbol =
> let m = Hashtbl.create 1 in
> fun string ->
> try Hashtbl.find m string with Not_found ->
> Hashtbl.add m string string;
> string;;
> val symbol : '_a -> '_a = <fun>

It should be pointed out that this trick, known as hash-consing, is
not limited to strings. Basically for a given type you create a
single value representing all values that are structurally
equivalent. It allows to compare values structurally by using
physical equality. This paper [1] shows how to abstract the design
pattern.

Daniel

[1]

Jean-Christophe Filliātre, Sylvain Conchon, Type-safe modular hash-
consing, Proceedings of the 2006 workshop on ML.
http://www.lri.fr/~filliatr/ftp/publis/hash-consing2.ps.gz

Gabriel Kerneis

unread,

Jun 27, 2007, 1:18:28 PM6/27/07

to caml...@yquem.inria.fr

Le Wed, 27 Jun 2007 17:06:51 +0200, Quôc Peyrot <cho...@lrde.epita.fr>
a écrit :

> It has been said multiple times on this
> mailing list, but I think we really miss a book about these design
> patterns and optimization tricks often specific to a given (or a set
> of) feature (functional, lazy computations, garbage collector...).

_Purely functional data structures_ by Chris Osaki might interest you.
It's a very good book, covering lazy evaluation and persistent
amortized data structures (among other things). Moreover, it does
insist on optimizations (often left as exercises to the reader, with
enough hints to be easy to figure out).

Regards,
--
Gabriel

signature.asc

Jon Harrop

unread,

Jun 27, 2007, 1:56:10 PM6/27/07

to caml...@yquem.inria.fr

On Wednesday 27 June 2007 18:16:33 Gabriel Kerneis wrote:
> _Purely functional data structures_ by Chris Osaki might interest you.
> It's a very good book, covering lazy evaluation and persistent
> amortized data structures (among other things). Moreover, it does
> insist on optimizations (often left as exercises to the reader, with
> enough hints to be easy to figure out).

That is certainly an excellent book and is recommended reading for any
OCamler, but it isn't specific to the OCaml language and doesn't cover many
of the things that I would consider to be OCaml design patterns (like the one
I just posted).

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Jon Harrop

unread,

Jun 27, 2007, 2:13:40 PM6/27/07

to caml...@yquem.inria.fr

On Wednesday 27 June 2007 16:48:44 Mattias Engdegård wrote:
> The cost is a slightly clumsier use of symbols as strings (Symbol.str),
> but my feeling is that the extra safety is worth it.

That is exactly what I did. Also, you cannot pattern match over the Sym.t but
I believe the OCaml compiler doesn't optimize pattern matches over strings
anyway.

Incidentally, can we add this to the list of wanted optimizations: O(log n)
matching of strings, arrays and polymorphic variants.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Jon Harrop

unread,

Jun 27, 2007, 2:25:55 PM6/27/07

to caml...@yquem.inria.fr

On Wednesday 27 June 2007 17:44:04 you wrote:
> Jon Harrop wrote:
> > This seems to be something that Lisp uses to allocate data structures on
> > the stack rather than the heap. Why would you want that?
>
> In order to avoid dynamic memory management and get dynamically scoped
> pre-allocated "implicit context" buffers to which I can refer as if they
> were ordinary variables.

Do you mean something like this:

let dt() =
let start = ref (time()) in
fun () ->
let time' = time() in
let dt = time' -. !start in
start := time';
dt

Call dt() to get a new delta timer, call the delta timer to get the time since
it was last called:

# let dt1 = dt();;
val dt1 : unit -> float = <fun>
# let dt2 = dt();;
val dt2 : unit -> float = <fun>
# dt1();;
- : float = 4.66352200508117676
# dt2();;
- : float = 4.48727107048034668
# dt1();;
- : float = 3.36179709434509277
# dt2();;
- : float = 2.09420299530029297

You could call this a factory pattern.

> > let float_factorial =
> > let m = ref [||] in
> > fun n ->
> > try (!m).(n) with _ ->
> > let m' = Array.make (n + 1) 1. in
> > for i=1 to n do
> > m'.(i) <- float i *. m'.(i - 1)
> > done;
> > m := m';
> > m'.(n);;
>
> Well, it avoids some of the computations in your example, which re-does
> all the array whenever it has to be extended.

On "float_factorial 1000000", my original implementation was >2x faster. If
you call for slowly increasing arguments then you can do much better still by
doubling the length of the array to amortize allocation:

let float_factorial3 =
let n = ref 0 in

let m = ref [||] in

fun j ->
if j <= !n then (!m).(j) else
if j < Array.length !m then begin
let m' = !m in
for i = !n to j do

m'.(i) <- float i *. m'.(i - 1)
done;
m := m';

m'.(j)
end else begin
n := j;
let m' = Array.make (2 * j + 1) 1. in
for i=1 to j do

m'.(i) <- float i *. m'.(i - 1)
done;
m := m';

m'.(j)
end

This is ~7x faster for 1 .. 20000.

> >>Other advanced optimization techniques that can be used for benefit
> >>in very specialized situations include (explicit) continuation coding:
> >>rather than returning a value (e.g. a tuple), you take as an argument
> >>a function to which you then pass on your return value(s). This is quite
> >>useful whenever "execution flow branches out into multiple paths that
> >>have to be taken",
> >
> > Are you referring to CPS?
>
> Yes, but not the call/cc implicit CPS, but explicitly passing around
> continuations.

Yes, that's very useful in OCaml.

> >>and may sometimes (though rarely) also be used for
> >>good as a poor man's VALUES/MULTIPLE-VALUE-BIND substitute.
> >
> > Weren't values and multiple-value-bind completely superceded by pattern
> > matching?
>
> No. :-) Pattern matching requires constructors, which cons.

Here is a pattern match without constructors:

let x = 3

Here is a pattern match that doesn't cons:

let f(x, y) = x + y in
f 3 4

Here is a pattern match with constructors that doesn't cons:

type t = A | B

let f = function
| A -> 0
| B -> 1

What exactly are you having trouble implementing in OCaml? It sounds as if
you're still trying to work around the inefficiencies of Lisp and the beauty
of OCaml is that you don't have to. :-)

Incidentally, the ray tracer is a good demonstration of this. The performance
of the Lisp implementations is crippled by very slow allocation and
deallocation. Juho Snellman tried to circumvent this problem using
multiple-value-bind in a macro:

(defmacro def ((name params &body body)
(mname &rest mparams)
(wname &rest wparams))
`(progn
(declaim (inline ,name ,wname))
(defun ,name ,params
(declare (type double-float ,@params))
,@body)
(defmacro ,mname ,(mapcar #'car mparams)
,(loop with inner = (list name)
with body = ``,',inner
with all-names = nil
for (form count) in (reverse mparams)
for names = (loop repeat count collect (gensym))
do
(setf all-names (append all-names names))
(setf body ``(multiple-value-bind ,',(reverse names)
,,form ,,body))
finally
(setf (cdr inner) (reverse all-names))
(return body)))
(defun ,wname ,(mapcar #'car wparams)
(,mname ,@(mapcar #'cadr wparams)))))

While this greatly improves the performance of the Lisp, it remains far slower
than most other languages.

The equivalent optimization in OCaml is to pass multiple arguments in curried
form, exploiting ocamlopt's big step semantics without losing the
expressiveness of a functional style.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Brian Hurt

unread,

Jun 27, 2007, 2:34:16 PM6/27/07

to Jon Harrop, caml...@yquem.inria.fr

Jon Harrop wrote:

>On Wednesday 27 June 2007 16:48:44 Mattias Engdegård wrote:
>
>
>>The cost is a slightly clumsier use of symbols as strings (Symbol.str),
>>but my feeling is that the extra safety is worth it.
>>
>>
>
>That is exactly what I did. Also, you cannot pattern match over the Sym.t but
>I believe the OCaml compiler doesn't optimize pattern matches over strings
>anyway.
>
>Incidentally, can we add this to the list of wanted optimizations: O(log n)
>matching of strings, arrays and polymorphic variants.
>
>
>

Actually, what I'd like is a more powerful regular expression engine-
one where I can give multiple different patterns with constant values,
and create a single regular expression that if the first pattern is
matched, the first constant value is matched, etc. Something with a
signature like:

type 'a regex_t
val compile : (string * 'a) -> default:'a -> 'a regex_t
val re_match : 'a regex_t -> string -> 'a

Which would allow me to do stuff like:

let re = compile [ ("foo", 1); ("bar", 2); ("baz", 3) ] ~default:(-1);;

let f str =
match re_match re str with
| 1 -> (* it's a foo *)
| 2 -> (* it's a bar *)
| 3 -> (* it's a baz *)
| -1 -> (* it didn't match *)
;;

In other words, something like ocamllex, except dynamic.

Brian

Quôc Peyrot

unread,

Jun 27, 2007, 3:29:42 PM6/27/07

to caml...@yquem.inria.fr

I'm not sure I understand why we can't do it in ocaml, but I
would probably need to try to implement it to see why.
I mean if we implement something along the lines:
let memo f =
let hash = ...

then can't we have the same feature using "let in" ?

let memoized_f = memo f in
let memoized_f = memo f in
(* when we leave this scope, we should get back the first table *)

>> I mean, for someone like me, with quite some experience in the asm/
>> c/c ++ world (i.e. a garbage collector-less world) but not much in
>> other languages, it's easy to naively think of a garbage
>> collector as a fancy feature to prevent from having to call "free/
>> delete". But I'm starting to realize there is a whole new set of
>> powerful design patterns which come along. It has been said
>> multiple times on this mailing list, but I think we really miss a
>> book about these design patterns and optimization tricks often
>> specific to a given (or a set of) feature (functional, lazy
>> computations, garbage collector...).

[...]

> With this, I suppose, a proper book on both "functional optimization
> strategies" and "ideas that help you to overcome mental barriers with
> respect to what's possible when one can shape the language" would be
> useful, not so much to "teach specific patterns", but to teach people
> how to overcome their mental blockades and learn how to use their
> phantasy to do marvelous things by shaping language. In the
> Permaculture community, there is this proverb that "yield is
> limited only by imagination". I think this holds just as much for
> functional and in particular metalinguistic programming.

I wanted to prevent from reducing my comment to the "functional" part.
The design patterns discussed in this thread has more to do with ocaml
itself (or the garbage collector for my example). I've seen plenty of
tutorials
and articles about functional languages but they often expose the same
things and lack (from my point of view again, please keep in mind
that runtime/memory efficiency is important to me) on the efficiency
side
(especially the memory one, often ignoring as well the big runtime
hit you
can have due to the allocations).

Writing concise and elegant code is very important to me, really. I hate
dirty ugly code, but I'm not willing to sacrifice the efficiency.
That's why
I really like OCaml, it is very powerful but allows us to efficient
code...
as long as you actually do understand how the compiler is going to
process your code. And that's the book I wish I could have. We can call
that design pattern or industry best practices. I just want to be
able to
write real-life code (and not necessarily scientific-oriented code,
sorry
Jon Harrop ;) ) without having to re-discover all these best practices
(which might be very natural for someone with a deeper understanding
of all the different features a specific language provides). I don't
think
the only subtle part of OCaml is the fact that it is a functional
language.
There is really more to it, IMHO.

> (I have been planning for years to eventually write up some
> lengthier introductory text on metalinguistic techniques, but so
> far only managed to write a few articles and give some short courses
> on the subject...)

Keep us posted ;)

>> I find it ironical that high-level languages (such as ocaml) are
>> intended (of course that's my interpretation of it) to hide low-
>> level details and give you more expressiveness in your code,
>> which should naively make you more productive, and make it easier
>> to program something. But requires therefore tons of new
>> knowledges and deep understanding of advanced concepts to be able
>> to actually code efficient (runtime and memory-wise) code.
>
> Languages such as OCaml are not "intended to hide low-level details".

Sorry, I'm not really good with words, I indeed wanted to talk about (2)

>
> Rather, there are (at least) two very different notions of
> "programming"
> around:
>
> (1) Putting information into a mechanically behaving system in order
> to get some desired behaviour. (This is what asm/C/C++ is about,
> but actually, this even is a much broader notion that also
> includes
> e.g. this: http://amasci.com/amateur/mirror.html)
>
> (2) Formalizing some "mental process" in such a way that one can
> then use stringent reasoning to analyze its properties. (This is
> what, in essence, functional programming is about.)
>
>
> Evidently, the more advanced you get, the more important the second
> point of view becomes.

(2) is very important to me, but as said earlier, I often found while
reading
articles (functional programming) that highly category-2 oriented people
tend not to care at all (or barely) about real-life runtime of the
program.
Writing correct code is not good enough, I need to be able to execute it
within my life-time (I know, I'm a bit provocative here ;) )

We often oppose category-2 oriented people with category-1 oriented
people.
I think we can meet somewhere in a middle, and the book we talked about
would certainly help.

But yes, I agree that, sometimes, the design pattern is here to solve
a problem
the language itself should solve, but that's not always the case.

--
Best Regards,
Quôc

Quôc Peyrot

unread,

Jun 27, 2007, 3:33:13 PM6/27/07

to caml...@yquem.inria.fr

I have this book in my TOREAD list (for a long time now, my bad)
I must admit I don't use very often pure functional datastructures in
OCaml.
My main concern with functional programing has always been the
runtime hit
you get due to the extra memory allocations (which can be significant).

But yes, I should definitely read this book, thanks for reminding me :p

Quôc Peyrot

unread,

Jun 27, 2007, 3:35:54 PM6/27/07

to caml...@yquem.inria.fr

On Jun 27, 2007, at 7:48 PM, Jon Harrop wrote:

> On Wednesday 27 June 2007 18:16:33 Gabriel Kerneis wrote:
>> _Purely functional data structures_ by Chris Osaki might interest
>> you.
>> It's a very good book, covering lazy evaluation and persistent
>> amortized data structures (among other things). Moreover, it does
>> insist on optimizations (often left as exercises to the reader, with
>> enough hints to be easy to figure out).
>
> That is certainly an excellent book and is recommended reading for any
> OCamler, but it isn't specific to the OCaml language and doesn't
> cover many
> of the things that I would consider to be OCaml design patterns
> (like the one
> I just posted).

I totally agree with that. We really need this missing book.

by the way, any news about "Ocaml for experienced programmer"?
I don't even know its table of content :(

--
Best Regards,
Quôc

Brian Hurt

unread,

Jun 27, 2007, 3:50:35 PM6/27/07

to Quôc Peyrot, caml...@yquem.inria.fr

Quôc Peyrot wrote:

>
> On Jun 27, 2007, at 7:16 PM, Gabriel Kerneis wrote:
>
>> Le Wed, 27 Jun 2007 17:06:51 +0200, Quôc Peyrot <cho...@lrde.epita.fr>
>> a écrit :
>>
>>> It has been said multiple times on this
>>> mailing list, but I think we really miss a book about these design
>>> patterns and optimization tricks often specific to a given (or a set
>>> of) feature (functional, lazy computations, garbage collector...).
>>
>>
>> _Purely functional data structures_ by Chris Osaki might interest you.
>> It's a very good book, covering lazy evaluation and persistent
>> amortized data structures (among other things). Moreover, it does
>> insist on optimizations (often left as exercises to the reader, with
>> enough hints to be easy to figure out).
>
>
> I have this book in my TOREAD list (for a long time now, my bad)
> I must admit I don't use very often pure functional datastructures in
> OCaml.
> My main concern with functional programing has always been the
> runtime hit
> you get due to the extra memory allocations (which can be significant).

In Ocaml, allocations are relatively cheap- a cost similiar to that of
allocating on the stack. Which is why when you tell long-time Ocaml
programmers that you want to avoid an allocation cost by allocating on
the stack, they tend to go "um, why?"

Mutable data structures have their cost as well. When you assign a
pointer into an object old enough to be in the major heap, Ocaml kicks
off a minor collection. For small N, this can often make O(log N)
purely functional structures faster than their O(1) imperative counterparts.

No to mention the correctness advantages, plus other advantages.

Brian

skaller

unread,

Jun 27, 2007, 3:58:59 PM6/27/07

to Brian Hurt, caml...@yquem.inria.fr

On Wed, 2007-06-27 at 14:31 -0400, Brian Hurt wrote:

> Actually, what I'd like is a more powerful regular expression engine-
> one where I can give multiple different patterns with constant values,
> and create a single regular expression that if the first pattern is
> matched, the first constant value is matched, etc. Something with a
> signature like:

> let re = compile [ ("foo", 1); ("bar", 2); ("baz", 3) ]
> ~default:(-1);;
>
> let f str =
> match re_match re str with
> | 1 -> (* it's a foo *)
> | 2 -> (* it's a bar *)
> | 3 -> (* it's a baz *)
> | -1 -> (* it didn't match *)
> ;;
>
> In other words, something like ocamllex, except dynamic.

I think you can do this now? it's just messy: use

"(foo)|(bar)|(baz)"

and check which group 1, 2 or 3 matches.

--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

Quôc Peyrot

unread,

Jun 27, 2007, 4:09:48 PM6/27/07

to caml...@yquem.inria.fr

On Jun 27, 2007, at 9:48 PM, Brian Hurt wrote:

> In Ocaml, allocations are relatively cheap- a cost similiar to that
> of allocating on the stack. Which is why when you tell long-time
> Ocaml programmers that you want to avoid an allocation cost by
> allocating on the stack, they tend to go "um, why?"
>
> Mutable data structures have their cost as well. When you assign a
> pointer into an object old enough to be in the major heap, Ocaml
> kicks off a minor collection. For small N, this can often make O
> (log N) purely functional structures faster than their O(1)
> imperative counterparts.
>
> No to mention the correctness advantages, plus other advantages.

If I have a tree/map datastructure and I add an element to it, my
understanding it that, when building the new tree, all the node up to
the root are going to be replaced. Is my understanding correct?

Now let's say I want to build a tree with millions elements, and I'm
only interested in the final result, i.e. I don't need to be able to
rollback to a previous state or fancy stuff like that (I therefore
never keep a reference to the root of the old tree, each time I add a
new element).
Is there a way of writing the building code (with a fold-type of
construct or something like that) such that the compiler will
understand it, and will generate a code equivalent to the imperative
code (i.e. we don't allocate new nodes up to the root each time we
insert an element)?

--
Best Regards,
Quôc

Jonathan Bryant

unread,

Jun 27, 2007, 4:20:58 PM6/27/07

to Brian Hurt, caml...@yquem.inria.fr

On Jun 27, 2007, at 2:31 PM, Brian Hurt wrote:

> Actually, what I'd like is a more powerful regular expression engine-

This is off the top of my head, but I've always thought it would be
nice to have built-in regexps using a Scanf style. Instead of using
$1, $2 (or \1, \2, depending on your language), have each
parenthesized argument be passed to a function, and the whole thing
is a function of a string parameter. If a scanf conversion is used,
that is the type passed to the function, otherwise it's simply passed
as a string. Something like:

# let re = /(foo|bar|baz)/ (fun x -> match x with | "foo" -> ...
etc ... );;
re : string -> 'a = <fun>

# let simple_phone_number = /(%d)-(%d)-(%d)/ (fun x y z -> (x,y,z) );;
simple_phone_number : string -> (int * int * int) = <fun>

I don't know how hard this would be to implement. It would require
some scanf style compiler magic, but it overcomes scanf's shortcoming
that it doesn't allow character classes. It might even be possible
for use in pattern matching:

Any idea how hard this would be?

--Jonathan

Brian Hurt

unread,

Jun 27, 2007, 4:38:02 PM6/27/07

to Quôc Peyrot, caml...@yquem.inria.fr

Quôc Peyrot wrote:

>
> On Jun 27, 2007, at 9:48 PM, Brian Hurt wrote:
>
>> In Ocaml, allocations are relatively cheap- a cost similiar to that
>> of allocating on the stack. Which is why when you tell long-time
>> Ocaml programmers that you want to avoid an allocation cost by
>> allocating on the stack, they tend to go "um, why?"
>>
>> Mutable data structures have their cost as well. When you assign a
>> pointer into an object old enough to be in the major heap, Ocaml
>> kicks off a minor collection. For small N, this can often make O
>> (log N) purely functional structures faster than their O(1)
>> imperative counterparts.
>>
>> No to mention the correctness advantages, plus other advantages.
>
>
> If I have a tree/map datastructure and I add an element to it, my
> understanding it that, when building the new tree, all the node up to
> the root are going to be replaced. Is my understanding correct?

No. Only those elements that change need to be reallocated- the rest
can be shared between the old and new tree. So, assuming the tree is a
balanced tree, only the log N or so nodes between the root and the
changed node will need to be reallocated, of the N nodes in the tree- so
you're sharing N - log N nodes between the two trees.

>
> Now let's say I want to build a tree with millions elements, and I'm
> only interested in the final result, i.e. I don't need to be able to
> rollback to a previous state or fancy stuff like that (I therefore
> never keep a reference to the root of the old tree, each time I add a
> new element).
> Is there a way of writing the building code (with a fold-type of
> construct or something like that) such that the compiler will
> understand it, and will generate a code equivalent to the imperative
> code (i.e. we don't allocate new nodes up to the root each time we
> insert an element)?
>

With a million element tree, you're only reallocating 20-30 nodes
(assuming it's balanced), and sharing the other 999,970-999,980 nodes
between the two trees. Allocating each new node is only going to be a
small number of clock cycles, like 5-10 clocks. So the total allocation
cost is only 100-200 clocks or so.

What kills performance, and makes hash tables higher performance at this
point, is the balancing logic and the cache misses you're taking walking
down the tree (the younger heap stays in cache). But even there, we're
not talking about huge performance differences- for all except the most
time-critical code, tree based maps are generally fast enough.

Brian

Quôc Peyrot

unread,

Jun 27, 2007, 5:00:22 PM6/27/07

to Brian Hurt, caml...@yquem.inria.fr

On Jun 27, 2007, at 10:35 PM, Brian Hurt wrote:

> Quôc Peyrot wrote:
>
>>
>> On Jun 27, 2007, at 9:48 PM, Brian Hurt wrote:
>>
>>> In Ocaml, allocations are relatively cheap- a cost similiar to
>>> that of allocating on the stack. Which is why when you tell
>>> long-time Ocaml programmers that you want to avoid an allocation
>>> cost by allocating on the stack, they tend to go "um, why?"
>>>
>>> Mutable data structures have their cost as well. When you assign
>>> a pointer into an object old enough to be in the major heap,
>>> Ocaml kicks off a minor collection. For small N, this can often
>>> make O (log N) purely functional structures faster than their O
>>> (1) imperative counterparts.
>>>
>>> No to mention the correctness advantages, plus other advantages.
>>
>>
>> If I have a tree/map datastructure and I add an element to it, my
>> understanding it that, when building the new tree, all the node up
>> to the root are going to be replaced. Is my understanding correct?
>
> No. Only those elements that change need to be reallocated- the
> rest can be shared between the old and new tree. So, assuming the
> tree is a balanced tree, only the log N or so nodes between the
> root and the changed node will need to be reallocated, of the N
> nodes in the tree- so you're sharing N - log N nodes between the
> two trees.

That's exactly what I meant with "all the nodes up to the root" (you
need to change a leaf, for that you need to change its parent, but
for that you ... and so on up to the root).

>> Now let's say I want to build a tree with millions elements, and
>> I'm only interested in the final result, i.e. I don't need to be
>> able to rollback to a previous state or fancy stuff like that (I
>> therefore never keep a reference to the root of the old tree,
>> each time I add a new element).
>> Is there a way of writing the building code (with a fold-type of
>> construct or something like that) such that the compiler will
>> understand it, and will generate a code equivalent to the
>> imperative code (i.e. we don't allocate new nodes up to the root
>> each time we insert an element)?
>>
> With a million element tree, you're only reallocating 20-30 nodes
> (assuming it's balanced), and sharing the other 999,970-999,980
> nodes between the two trees. Allocating each new node is only
> going to be a small number of clock cycles, like 5-10 clocks. So
> the total allocation cost is only 100-200 clocks or so.

I don't understand that part. Each time you add a node, log2(n) node
needs to be re-allocated. Hence the total number of reallocations is:
Sum(ceil(log2(i)), i = 1..N) with N = 1000000, which is significant.
Where is my mistake in these maths?

--
Best Regards,
Quôc

Pal-Kristian Engstad

unread,

Jun 27, 2007, 5:08:35 PM6/27/07

to Quôc Peyrot, caml...@yquem.inria.fr

Quôc Peyrot wrote:
> I don't understand that part. Each time you add a node, log2(n) node
> needs to be re-allocated. Hence the total number of reallocations is:
> Sum(ceil(log2(i)), i = 1..N) with N = 1000000, which is significant.

So you are saying that inserting N elements into a tree takes O(n*log2(n)).

> Where is my mistake in these maths?

No mistake. After all, after insertion you essentially have a sorted
tree, since iterating the tree takes O(n). Sorting take O(n*log n)...

Thanks,

PKE.

--
Pål-Kristian Engstad (eng...@naughtydog.com), Lead Graphics & Engine Programmer,
"Uncharted"-team, Naughty Dog, Inc., 1601 Cloverfield Blvd, 6000 North,
Santa Monica, CA 90404, USA. Ph.: (310) 633-9112.

"Most of us would do well to remember that there is a reason Carmack
is Carmack, and we are not Carmack.",
Jonathan Blow, 2/1/2006, GD Algo Mailing List

Quôc Peyrot

unread,

Jun 27, 2007, 5:20:47 PM6/27/07

to Pal-Kristian Engstad, caml...@yquem.inria.fr

On Jun 27, 2007, at 10:58 PM, Pal-Kristian Engstad wrote:

> Quôc Peyrot wrote:
>> I don't understand that part. Each time you add a node, log2(n)
>> node needs to be re-allocated. Hence the total number of
>> reallocations is:
>> Sum(ceil(log2(i)), i = 1..N) with N = 1000000, which is significant.

Oops, my mistake I meant "floor" not ceil.

> So you are saying that inserting N elements into a tree takes O
> (n*log2(n)).

Not exactly, log2(N!) is a closer estimate (estimate because of the
"floor" introducing a bias in the sum)

--
Best Regards,
Quôc

Pal-Kristian Engstad

unread,

Jun 27, 2007, 5:28:36 PM6/27/07

to Quôc Peyrot, caml...@yquem.inria.fr

Quôc Peyrot wrote:
>
> On Jun 27, 2007, at 10:58 PM, Pal-Kristian Engstad wrote:
>
>> Quôc Peyrot wrote:
>>> I don't understand that part. Each time you add a node, log2(n) node
>>> needs to be re-allocated. Hence the total number of reallocations is:
>>> Sum(ceil(log2(i)), i = 1..N) with N = 1000000, which is significant.
>
> Oops, my mistake I meant "floor" not ceil.
>
>> So you are saying that inserting N elements into a tree takes
>> O(n*log2(n)).
>
> Not exactly, log2(N!) is a closer estimate (estimate because of the
> "floor" introducing a bias in the sum)

No, no. Let's assume it always touches log2(n) elements, where n goes
from 1 .. N. Then we have:

time = C * log2(1) + C * log2(2) + C * log2(3) + .... + C * log N =
C * (log2(1) + ... + log2(N) <= C * N * log2(N),

since

log(n) <= log(N).

Thanks,

PKE.

--
Pål-Kristian Engstad (eng...@naughtydog.com), Lead Graphics & Engine Programmer,
"Uncharted"-team, Naughty Dog, Inc., 1601 Cloverfield Blvd, 6000 North,
Santa Monica, CA 90404, USA. Ph.: (310) 633-9112.

"Most of us would do well to remember that there is a reason Carmack
is Carmack, and we are not Carmack.",
Jonathan Blow, 2/1/2006, GD Algo Mailing List

_______________________________________________

Quôc Peyrot

unread,

Jun 27, 2007, 5:38:36 PM6/27/07

to Pal-Kristian Engstad, caml...@yquem.inria.fr

On Jun 27, 2007, at 11:18 PM, Pal-Kristian Engstad wrote:

> Quôc Peyrot wrote:
>>
>> On Jun 27, 2007, at 10:58 PM, Pal-Kristian Engstad wrote:
>>
>>> Quôc Peyrot wrote:
>>>> I don't understand that part. Each time you add a node, log2(n)
>>>> node needs to be re-allocated. Hence the total number of
>>>> reallocations is:
>>>> Sum(ceil(log2(i)), i = 1..N) with N = 1000000, which is
>>>> significant.
>>
>> Oops, my mistake I meant "floor" not ceil.
>>
>>> So you are saying that inserting N elements into a tree takes O
>>> (n*log2(n)).
>>
>> Not exactly, log2(N!) is a closer estimate (estimate because of
>> the "floor" introducing a bias in the sum)
> No, no. Let's assume it always touches log2(n) elements, where n
> goes from 1 .. N. Then we have:
>
> time = C * log2(1) + C * log2(2) + C * log2(3) + .... + C * log
> N = C * (log2(1) + ... + log2(N) <= C * N * log2(N),
>
> since
>
> log(n) <= log(N).

I think we agree, I'm just saying that C*N*log2(N) is a over-
estimated upper bound for large N.

C*(log2(1) + ... + log2(N)) = C * log2(1*2*....*N) = C * log2(N!)
hence

time <= C*log2(N!) <= C * N * log2(N)

as an example, for N = 100

sum(floor(log2(i)), i = 1..100) = 480
log2(100!) = 524 (overestimated by 10%)
100 * log2(100) = 664 (overestimated by 40%)

for large N, the difference is going to be significant.

But anyway, the details doesn't matter, it just looks to me that
inserting N elements in a functional map, is really significant.
If my maths are correct, for only 1000 elements, we are going to have
an extra 8500 allocations (the difference is so huge I actually can't
believe my maths...)

--
Best Regards,
Quôc

Pal-Kristian Engstad

unread,

Jun 27, 2007, 6:24:11 PM6/27/07

to Quôc Peyrot, caml...@yquem.inria.fr

Quôc Peyrot wrote:
> I think we agree, I'm just saying that C*N*log2(N) is a over-estimated
> upper bound for large N.
>
> C*(log2(1) + ... + log2(N)) = C * log2(1*2*....*N) = C * log2(N!)
> hence
>
> time <= C*log2(N!) <= C * N * log2(N)
>
> as an example, for N = 100
>
> sum(floor(log2(i)), i = 1..100) = 480
> log2(100!) = 524 (overestimated by 10%)
> 100 * log2(100) = 664 (overestimated by 40%)
>
> for large N, the difference is going to be significant.

Ok, I see what you mean.

> But anyway, the details doesn't matter, it just looks to me that
> inserting N elements in a functional map, is really significant.
> If my maths are correct, for only 1000 elements, we are going to have
> an extra 8500 allocations (the difference is so huge I actually can't
> believe my maths...)

But you are forgetting that you won't have to reallocate every log2(n)
elements, for every n, there is a chance that the actual number is much
less than log2(n). log2(n) really is just the worst case, where you have
to change every red-black tree (say) all the way to the top. This is
fairly unlikely, thus the number of allocations is in practice going to
be far less.

I tested OCaml's Map adding a counter for each allocation. Adding
100,000 random key, value pairs resulted in 151,587 allocations, and
1,000,000 adds resulted in 1,516,551 allocations. Adding sorted numbers
gave: 299,964 and 2,999,958 allocations. So it looks like it is near
linear (with a factor of 1.5 to 3.0) in practice.

PKE.

--
Pål-Kristian Engstad (eng...@naughtydog.com), Lead Graphics & Engine Programmer,
"Uncharted"-team, Naughty Dog, Inc., 1601 Cloverfield Blvd, 6000 North,
Santa Monica, CA 90404, USA. Ph.: (310) 633-9112.

"Most of us would do well to remember that there is a reason Carmack
is Carmack, and we are not Carmack.",
Jonathan Blow, 2/1/2006, GD Algo Mailing List

_______________________________________________

Jon Harrop

unread,

Jun 27, 2007, 7:06:15 PM6/27/07

to caml...@yquem.inria.fr

On Wednesday 27 June 2007 19:31:34 you wrote:
> In other words, something like ocamllex, except dynamic.

Am I right in remembering that Str uses a regexp->bytecode compiler that
invokes an interpreter written in C?

If so, assuming you can represent it in the current bytecode maybe you can
implement this with the current tool stack?

Thomas Fischbacher

unread,

Jun 28, 2007, 7:03:37 AM6/28/07

to Jon Harrop, caml...@yquem.inria.fr

Jon Harrop wrote:

> I think Thomas is referring to continuation passing style (CPS). That isn't an
> optimization though (it slows things down) but it does let you abstract away
> mutation. However, it is not entirely safe in the absence of linear types.

Which one do you prefer?

let sum_nums n =
let rec work sum todo =
if todo=0 then sum
else work (sum+todo) (todo-1)
in work 0 n
;;

let sum_nums2 n =
let rec work (sum,todo) =
if todo=0 then sum
else work ((sum+todo),(todo-1))
in work (0,n)
;;

Certainly the first one, right? On the one hand, it is simpler,
and on the other hand, also faster, because it avoids consing the
pair cells that are passed around in the second example.

It is important to note that the recursive call to work can not only
be seen as a tail call, but in particular also as a continuation call.
Viewed in such a way, this is a strategy to provide more than one
"return value" to a continuation. The call to "work" could just as
much be a call to any other continuation that takes two arguments,
so this can be a nice way to reduce unnecessary consing - provided you
can make sure the continuation closures are not heap-allocated over
and over again.

Having said that, it is just as easy to get carried away by all these
nice little oprimization ideas in a functional language as it is in an
imperative language. While for some parts of the code, and in particular
in library functions, optimization is an issue, this more often than not
is not a good thing...

--
best regards,
Thomas Fischbacher
t...@functionality.de

_______________________________________________

Thomas Fischbacher

unread,

Jun 28, 2007, 7:20:10 AM6/28/07

to Jon Harrop, caml...@yquem.inria.fr

Jon Harrop wrote:

>>In order to avoid dynamic memory management and get dynamically scoped
>>pre-allocated "implicit context" buffers to which I can refer as if they
>>were ordinary variables.
>
>
> Do you mean something like this:
>
> let dt() =
> let start = ref (time()) in
> fun () ->
> let time' = time() in
> let dt = time' -. !start in
> start := time';
> dt
>
> Call dt() to get a new delta timer, call the delta timer to get the time since
> it was last called:

Here, you are just packing state into a closure, so effectively you
build an "object". I am talking about attaching "contextual state"
to the dynamical call stack, which is a completely unrelated issue.

>>>Weren't values and multiple-value-bind completely superceded by pattern
>>>matching?
>>
>>No. :-) Pattern matching requires constructors, which cons.
>
>
> Here is a pattern match without constructors:
>
> let x = 3
>
> Here is a pattern match that doesn't cons:
>
> let f(x, y) = x + y in
> f 3 4

And, incidentally, also does not work.

> Here is a pattern match with constructors that doesn't cons:
>
> type t = A | B
>
> let f = function
> | A -> 0
> | B -> 1

You are evading the question. How do you return two arguments from a
function without constructing a 2-tuple (which is a consing operation).

A continuation call to a higher order function is one way to get
something similar to MULTIPLE-VALUE-*. But often, this is a hack.

> What exactly are you having trouble implementing in OCaml? It sounds as if
> you're still trying to work around the inefficiencies of Lisp and the beauty
> of OCaml is that you don't have to. :-)

> Incidentally, the ray tracer is a good demonstration of this. The performance
> of the Lisp implementations is crippled by very slow allocation and
> deallocation. Juho Snellman tried to circumvent this problem using
> multiple-value-bind in a macro:

Jon, you still don't get it. OCaml is just another language among a
whole zoo which also contains many other interesting systems. I know
you would like to believe otherwise, but it certainly is not the
hottest invention since fire and the wheel. I often use it myself
professionally and non-professionally, for a variety of purposes and
projects, but then only because some thought on the structure of the
problem at hand has shown that among all the tools that suck (all
programming languages do), OCaml turned out to be the one that sucked
least. To me, sometimes, it's Java that sucks least, sometimes it is
Emacs (Lisp), sometimes Common LISP, and sometimes Perl.

> While this greatly improves the performance of the Lisp, it remains far slower
> than most other languages.

According to your usually-screwed-up metrics, I suppose OCaml must then
even be faster than Michael Schuhmacher. :-)

--
Thomas Fischbacher
t...@functionality.de

Jon Harrop

unread,

Jun 28, 2007, 7:39:52 AM6/28/07

to caml...@yquem.inria.fr

On Thursday 28 June 2007 12:01:34 Thomas Fischbacher wrote:
> Jon Harrop wrote:
> > I think Thomas is referring to continuation passing style (CPS). That
> > isn't an optimization though (it slows things down) but it does let you
> > abstract away mutation. However, it is not entirely safe in the absence
> > of linear types.
>
> Which one do you prefer?
>
> let sum_nums n =
> let rec work sum todo =
> if todo=0 then sum
> else work (sum+todo) (todo-1)
> in work 0 n
> ;;
>
> let sum_nums2 n =
> let rec work (sum,todo) =
> if todo=0 then sum
> else work ((sum+todo),(todo-1))
> in work (0,n)
> ;;
>
> Certainly the first one, right?

I would write:

let rec work sum = function
| 0 -> sum
| todo -> work (sum + todo) (todo - 1)

let sum_nums n = work 0 n

because it is shorter, clearer, 65% faster and it is idiomatic ML rather than
idiomatic Lisp.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Thomas Fischbacher

unread,

Jun 28, 2007, 7:43:06 AM6/28/07

to Quôc Peyrot, caml...@yquem.inria.fr

Quôc Peyrot wrote:

> let memoized_f = memo f in
> let memoized_f = memo f in
> (* when we leave this scope, we should get back the first table *)

Yes, but this only concerns the LEXICAL scope (i.e. as long as execution
flow stays within that piece of code that is written dow as the "in"
body of this "let in". I am talking about DYNAMICAL scope, i.e.
something visible also to functions which I call from within there,
which can be defined in a completely different place in the code.

Let's take some animated 3d application. You may want to introduce a
dynamically scoped notion of, say, "pre-allocated stack of buffers for
position vectors which behaves like a C stack with respect to functions
temporarily using this for computations", so that all the functions that
call one another during rendering of a frame may use that very same
stack in an orderly fashion (with scoping behaviour just as the C
stack). Still, the existence and visibility of this stack is confined
to some "render-animation" function.

> I wanted to prevent from reducing my comment to the "functional" part.
> The design patterns discussed in this thread has more to do with ocaml
> itself (or the garbage collector for my example). I've seen plenty of
> tutorials
> and articles about functional languages but they often expose the same
> things and lack (from my point of view again, please keep in mind
> that runtime/memory efficiency is important to me) on the efficiency side
> (especially the memory one, often ignoring as well the big runtime hit you
> can have due to the allocations).

The issue with functional programming is that using dynamic memory
management (which is very complicated under the hood) is very easy
for the programmer, and hence quite seductive - even in situations
where for performance reasons, one should not do so. With C/C++,
using dynamic memory management is considerable effort (all these
calls to malloc()/free()), so people are naturally more inclined to
avoid it wherever it is unnecessary. But also here, it still is
very complicated things going on under the hood. (Concerning the
"but I at least get guarantees on execution times": how does one
know how long a free() is going to take, and in what sense is this
then better than temporarily disabling GC until we reach a point
where we can spend some time on it?)

>> Rather, there are (at least) two very different notions of "programming"
>> around:
>>
>> (1) Putting information into a mechanically behaving system in order
>> to get some desired behaviour. (This is what asm/C/C++ is about,
>> but actually, this even is a much broader notion that also includes
>> e.g. this: http://amasci.com/amateur/mirror.html)
>>
>> (2) Formalizing some "mental process" in such a way that one can
>> then use stringent reasoning to analyze its properties. (This is
>> what, in essence, functional programming is about.)
>>
>>
>> Evidently, the more advanced you get, the more important the second
>> point of view becomes.

> We often oppose category-2 oriented people with category-1 oriented

> people.
> I think we can meet somewhere in a middle, and the book we talked about
> would certainly help.

Well, as these are really basically unrelated notions of "programming",
there is little that speaks against approaches that include ideas and
heed aspects of programming(1) as well as programming(2). It is easy
to get into the pitfall os supposing that one could only have either the
one or the other.

--
best regards,
Thomas Fischbacher
t...@functionality.de

_______________________________________________

Joel Reymont

unread,

Jun 28, 2007, 7:47:16 AM6/28/07

to Jon Harrop, caml...@yquem.inria.fr

Where does the 65% speed-up come from?

Just from using match?

On Jun 28, 2007, at 12:32 PM, Jon Harrop wrote:

> I would write:
>
> let rec work sum = function
> | 0 -> sum
> | todo -> work (sum + todo) (todo - 1)
>
> let sum_nums n = work 0 n
>
> because it is shorter, clearer, 65% faster and it is idiomatic ML
> rather than
> idiomatic Lisp.

--
http://topdog.cc - EasyLanguage to C# compiler
http://wagerlabs.com - Blog

Jon Harrop

unread,

Jun 28, 2007, 8:15:51 AM6/28/07

to caml...@yquem.inria.fr

On Thursday 28 June 2007 12:42:57 Joel Reymont wrote:
> Where does the 65% speed-up come from?

Good question.

> Just from using match?

Yes, or you can reorder the branches of the "if" statement, putting the common
branch first.

My code gives:

camlTest__work_58:
L101:
cmpq $1, %rbx
je .L100
movq %rbx, %rdi
addq $-2, %rdi
leaq -1(%rax, %rbx), %rax
movq %rdi, %rbx
jmp .L101
.align 4
L100:
ret
.text
.align 16
.globl camlTest__sum_nums3_61
camlTest__sum_nums3_61:
L102:
movq %rax, %rbx
movq $1, %rax
jmp camlTest__work_58
.text
.align 16
.globl camlTest__entry

So it branches out of the loop when todo=0 and does one branch per loop.

Both of Thomas' implementations give:

camlTest__work_60:
L101:
cmpq $1, %rbx
jne .L100
ret
.align 4
L100:
movq %rbx, %rdi
addq $-2, %rdi
leaq -1(%rax, %rbx), %rax
movq %rdi, %rbx
jmp .L101
.text
.align 16
.globl camlTest__sum_nums_58
camlTest__sum_nums_58:
L102:
movq %rax, %rbx
leaq camlTest__2(%rip), %rax
movq $1, %rax
jmp camlTest__work_60
.text
.align 16
.globl camlTest__entry

which branches within the loop if todo<>0 and then back to the start of the
loop. So this branches twice per loop.

PS: This has nothing to do with consing or continuations.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Thomas Fischbacher

unread,

Jun 28, 2007, 9:01:18 AM6/28/07

to Jon Harrop, caml...@yquem.inria.fr

Jon Harrop wrote:

> I would write:
>
> let rec work sum = function
> | 0 -> sum
> | todo -> work (sum + todo) (todo - 1)
>
> let sum_nums n = work 0 n
>
> because it is shorter, clearer, 65% faster and it is idiomatic ML rather than
> idiomatic Lisp.

You are still evading the issue: you nevertheless pass multiple
arguments to a continuation, rather than consing a return value.

But apart from this: I find it somehow strange that the OCaml compiler
produces noticeable differences in the speed of compiled code when one
does some transcriptions that actually should not matter, such as when
replacing for with Array.iter...

--
best regards,
Thomas Fischbacher
t...@functionality.de

_______________________________________________

Quôc Peyrot

unread,

Jun 28, 2007, 9:12:48 AM6/28/07

to Jon Harrop, caml...@yquem.inria.fr

On Jun 28, 2007, at 2:08 PM, Jon Harrop wrote:

> On Thursday 28 June 2007 12:42:57 Joel Reymont wrote:
>> Where does the 65% speed-up come from?
>
> Good question.
>
>> Just from using match?
>
> Yes, or you can reorder the branches of the "if" statement, putting
> the common
> branch first.
>
> My code gives:

[...]

> So it branches out of the loop when todo=0 and does one branch per
> loop.
>
> Both of Thomas' implementations give:

[...]

> \which branches within the loop if todo<>0 and then back to the

> start of the
> loop. So this branches twice per loop.

That is so strange! #@$!?
Can someone enlighten us?

--
Best Regards,
Quôc

Jon Harrop

unread,

Jun 28, 2007, 9:13:55 AM6/28/07

to caml...@yquem.inria.fr

On Thursday 28 June 2007 13:59:51 Thomas Fischbacher wrote:
> Jon Harrop wrote:
> > I would write:
> >
> > let rec work sum = function
> >
> > | 0 -> sum
> > | todo -> work (sum + todo) (todo - 1)
> >
> > let sum_nums n = work 0 n
> >
> > because it is shorter, clearer, 65% faster and it is idiomatic ML rather
> > than idiomatic Lisp.
>
> You are still evading the issue: you nevertheless pass multiple arguments to
> a continuation, rather than consing a return value.

Continuations and consing have nothing to do with this.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Thomas Fischbacher

unread,

Jun 28, 2007, 9:35:18 AM6/28/07

to Jon Harrop, caml...@yquem.inria.fr

Jon Harrop wrote:

>>You are still evading the issue: you nevertheless pass multiple arguments to
>>a continuation, rather than consing a return value.
>
>
> Continuations and consing have nothing to do with this.

Continuations and consing was what the discussion was about
before you changed the subject.

--
best regards,
Thomas Fischbacher
t...@functionality.de

_______________________________________________

Thomas Fischbacher

unread,

Jun 28, 2007, 9:37:06 AM6/28/07

to Quôc Peyrot, caml...@yquem.inria.fr

Quôc Peyrot wrote:

>> loop. So this branches twice per loop.
>
>
> That is so strange! #@$!?
> Can someone enlighten us?

Don't worry: this presumably is just some strange behaviour of the
present version of the OCaml compiler. One presumably should expect
neither past nor future versions to behave in the same way.

Yes, it is strange. And maybe the compiler dev team should look
into that...

--
best regards,
Thomas Fischbacher
t...@functionality.de

_______________________________________________

Jon Harrop

unread,

Jun 28, 2007, 10:50:57 AM6/28/07

to caml...@yquem.inria.fr

On Thursday 28 June 2007 14:33:15 Thomas Fischbacher wrote:
> Jon Harrop wrote:
> >>You are still evading the issue: you nevertheless pass multiple arguments
> >> to a continuation, rather than consing a return value.
> >
> > Continuations and consing have nothing to do with this.
>
> Continuations and consing was what the discussion was about
> before you changed the subject.

I am more than happy to talk about continuations and consing but you need to
post code that uses continuations or conses before anyone can help.

On Thursday 28 June 2007 12:18:44 Thomas Fischbacher wrote:
> > > Pattern matching requires constructors, which cons.

> ...

> You are evading the question.

You didn't ask a question. You made an incorrect statement: "Pattern matching

requires constructors, which cons".

In the context of avoiding allocation, that is a critical misunderstanding as
there was no allocation to avoid.

> How do you return two arguments from a function without constructing a
> 2-tuple (which is a consing operation).

You can rewrite:

let f() = 3, 4

let g() =
let x, y = f() in
x + y

in CPS as:

let f k = k 3 4

let g = f ( + )

Performance is within 1%.

> A continuation call to a higher order function is one way to get
> something similar to MULTIPLE-VALUE-*.

Not really. Lisp's MULTIPLE-VALUE-* is used to avoid Lisp's heinously slow
allocator. OCaml opted for a fast allocator and no MULTIPLE-VALUE-*.

> But often, this is a hack.

Using CPS to avoid inefficiencies that don't exist is certainly a bad idea.

> According to your usually-screwed-up metrics...

Time taken?

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Jon Harrop

unread,

Jun 28, 2007, 10:52:11 AM6/28/07

to caml...@yquem.inria.fr

On Thursday 28 June 2007 12:39:04 Thomas Fischbacher wrote:
> Let's take some animated 3d application. You may want to introduce a
> dynamically scoped notion of, say, "pre-allocated stack of buffers for
> position vectors which behaves like a C stack with respect to functions
> temporarily using this for computations", so that all the functions that
> call one another during rendering of a frame may use that very same
> stack in an orderly fashion (with scoping behaviour just as the C
> stack). Still, the existence and visibility of this stack is confined
> to some "render-animation" function.

Pass the stack as an argument to those functions that need it.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Thomas Fischbacher

unread,

Jun 28, 2007, 12:02:35 PM6/28/07

to Jon Harrop, caml...@yquem.inria.fr

Jon Harrop wrote:

> I am more than happy to talk about continuations and consing but you need to
> post code that uses continuations or conses before anyone can help.

Let us look at the following example:

===>

let print_gc_info verbose =
let s = if verbose then Gc.stat() else Gc.quick_stat() in
Printf.printf
"=== GC Stats (%s) ===
minor_words: %f
promoted_words: %f
major_words: %f
minor_collections: %d
major_collections: %d
heap_words: %d
heap_chunks: %d
live_words: %d
live_blocks: %d
free_words: %d
free_blocks: %d
largest_free: %d
fragments: %d
compactions: %d
top_heap_words: %d
"
(if verbose then "verbose" else "quick")
s.Gc.minor_words
s.Gc.promoted_words
s.Gc.major_words
s.Gc.minor_collections
s.Gc.major_collections
s.Gc.heap_words
s.Gc.heap_chunks
s.Gc.live_words
s.Gc.live_blocks
s.Gc.free_words
s.Gc.free_blocks
s.Gc.largest_free
s.Gc.fragments
s.Gc.compactions
s.Gc.top_heap_words
;;

let twofuns_v1 x =
(x mod 3, x mod 7)
;;

let twofuns_v2 c x =
c (x mod 3) (x mod 7)
;;

let walk_pairs_v1 ?(target=[|0;0|]) n =
begin
target.(0) <- 0;
target.(1) <- 0;
for i=0 to n-1 do
let (f1,f2) = twofuns_v1 i in
begin
target.(0) <- target.(0) + f1;
target.(1) <- target.(1) + f2;
end
done;
target
end
;;

let walk_pairs_v2 ?(target=[|0;0|]) n =
begin
target.(0) <- 0;
target.(1) <- 0;
(let cont x y =
begin
target.(0) <- target.(0) + x;
target.(1) <- target.(1) + y;
end
in
for i=0 to n-1 do
twofuns_v2 cont i
done);
target
end
;;

let () =
let walker =
if Sys.argv.(1) = "v1"
then
let () = Printf.printf "Using variant #1\n" in
walk_pairs_v1
else
let () = Printf.printf "Using variant #2\n" in
walk_pairs_v2
in
let target=[|0;0|] in
begin
ignore(walker ~target 100000000);
Printf.printf "%d %d\n" target.(0) target.(1)
end
;;

print_gc_info true;;
<===

I get (for the non-continuation approach):

=== GC Stats (verbose) ===
minor_words: 300018751.000000
promoted_words: 65.000000
major_words: 74.000000
minor_collections: 9155
major_collections: 0
heap_words: 61440
heap_chunks: 1
live_words: 74
live_blocks: 23
free_words: 61366
free_blocks: 1
largest_free: 61366
fragments: 0
compactions: 0
top_heap_words: 61440

And using the continuation:

=== GC Stats (verbose) ===
minor_words: 447.000000
promoted_words: 0.000000
major_words: 9.000000
minor_collections: 0
major_collections: 0
heap_words: 61440
heap_chunks: 1
live_words: 9
live_blocks: 3
free_words: 61431
free_blocks: 1
largest_free: 61431
fragments: 0
compactions: 0
top_heap_words: 61440

..so the continuation-based approach can execute not using the GC
at all - neither major nor minor. Sure, the OCaml GC behaves so
nicely that it does not make a difference in terms of run-time for
this particular small example (...or is it that calling a closure
is at present so inefficient that it outweighs the benefits of not
having to cons?) - but (1) is this the same if the minor heap is
more complicated, as in a real application and (2) shouldn't there
be huge potential for optimization in the second case then?

In particular, concerning point (2), when comparing run times
with the following bit of C code, I find that both OCaml
variants are slower than the C variant by more than a factor
of 3:

===>

#include <stdio.h>

typedef void (*cfii)(int,int);

static int buf[2]={0,0};

void twofuns_cont(cfii c,int x)
{
c(x%3,x%7);
}

void incbuf(int x,int y)
{
buf[0]+=x;
buf[1]+=y;
}

int main(void)
{
int i;

for(i=0;i<100000000;i++)
{
twofuns_cont(&incbuf,i);
}
printf("%d %d\n",buf[0],buf[1]);
}
<===

>>According to your usually-screwed-up metrics...
>
>
> Time taken?

We are talking about your ray-tracer here.

For those who do not know yet, the fundamental problem with that study
of yours is that you kept on setting the *criteria* what to consider as
a permissible solution only after seeing the result, and doing so in
such a way that the outcome is the one you desired, i.e. to create the
impression OCaml were the best system around. This is not proper
scientific behaviour.

--
best regards,
Thomas Fischbacher
t...@functionality.de

_______________________________________________

Thomas Fischbacher

unread,

Jun 28, 2007, 12:05:02 PM6/28/07

to Jon Harrop, caml...@yquem.inria.fr

Jon Harrop wrote:

> Pass the stack as an argument to those functions that need it.

Precisely this is how you would have to do this with ML. But then, if
you do not just deal with one stack, but have more complicated context,
OR if you have to start thinking about threading the stack through
functions that do not use it, but use other functions which use it,
things easily get quite awkward.

--
best regards,
Thomas Fischbacher
t...@functionality.de

_______________________________________________

Dirk Thierbach

unread,

Jun 28, 2007, 3:42:41 PM6/28/07

to caml...@yquem.inria.fr

On Thu, Jun 28, 2007 at 05:03:25PM +0100, Thomas Fischbacher wrote:
> Jon Harrop wrote:
> >Pass the stack as an argument to those functions that need it.

> Precisely this is how you would have to do this with ML. But then, if
> you do not just deal with one stack, but have more complicated context,
> OR if you have to start thinking about threading the stack through
> functions that do not use it, but use other functions which use it,
> things easily get quite awkward.

So could one "hide the plumbing" in a monad?

- Dirk

Thomas Fischbacher

unread,

Jun 28, 2007, 6:16:57 PM6/28/07

to Dirk Thierbach, caml...@yquem.inria.fr

Dirk Thierbach wrote:

>>>Pass the stack as an argument to those functions that need it.
>
>
>>Precisely this is how you would have to do this with ML. But then, if
>>you do not just deal with one stack, but have more complicated context,
>>OR if you have to start thinking about threading the stack through
>>functions that do not use it, but use other functions which use it,
>>things easily get quite awkward.
>
>
> So could one "hide the plumbing" in a monad?

Hmmm. In principle, yes. I think that should be possible here.
(Done this for a related problem where contextual typesetting
information is passed around.)

--
best regards,
Thomas Fischbacher
t...@functionality.de

_______________________________________________

Jon Harrop

unread,

Jun 28, 2007, 9:17:51 PM6/28/07

to caml...@yquem.inria.fr

On Thursday 28 June 2007 23:12:34 Thomas Fischbacher wrote:
> Dirk Thierbach wrote:
> >>>Pass the stack as an argument to those functions that need it.
> >>
> >>Precisely this is how you would have to do this with ML. But then, if
> >>you do not just deal with one stack, but have more complicated context,
> >>OR if you have to start thinking about threading the stack through
> >>functions that do not use it, but use other functions which use it,
> >>things easily get quite awkward.
> >
> > So could one "hide the plumbing" in a monad?
>
> Hmmm. In principle, yes. I think that should be possible here.
> (Done this for a related problem where contextual typesetting
> information is passed around.)

Is this the kind of thing you're after:

http://okmij.org/ftp/ML/#dynvar

I must confess, I don't see the lure when you have currying...

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Jon Harrop

unread,

Jun 28, 2007, 9:19:15 PM6/28/07

to Thomas Fischbacher

On Thursday 28 June 2007 17:01:05 you wrote:
> ...so the continuation-based approach can execute not using the GC

> at all - neither major nor minor. Sure, the OCaml GC behaves so
> nicely that it does not make a difference in terms of run-time for
> this particular small example

Yes.

> (1) is this the same if the minor heap is more complicated, as in a real
> application and

Converting to CPS typically slows programs down IME.

> (2) shouldn't there be huge potential for optimization in the second case
> then?

Why?

> In particular, concerning point (2), when comparing run times
> with the following bit of C code, I find that both OCaml
> variants are slower than the C variant by more than a factor
> of 3:

You are benchmarking "mod" and bounds checking.

> >>According to your usually-screwed-up metrics...
> >
> > Time taken?
>
> We are talking about your ray-tracer here.

Yes.

> For those who do not know yet, the fundamental problem with that study
> of yours is that you kept on setting the *criteria* what to consider as
> a permissible solution only after seeing the result, and doing so in
> such a way that the outcome is the one you desired, i.e. to create the
> impression OCaml were the best system around. This is not proper
> scientific behaviour.

My choice of the intermediate implementations is subjective. The shortest and
faster are a free for all. Feel free to post a faster Lisp implementation.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

_______________________________________________

Dirk Thierbach

unread,

Jun 29, 2007, 2:28:58 AM6/29/07

to caml...@yquem.inria.fr

Thomas Fischbacher wrote:
> Dirk Thierbach wrote:

>>>> Pass the stack as an argument to those functions that need it.

>> So could one "hide the plumbing" in a monad?

> Hmmm. In principle, yes. I think that should be possible here.
> (Done this for a related problem where contextual typesetting
> information is passed around.)

And I think it has the additional advantage that deallocation is safe,
because the typesystem guarantees that no reference to the stack(s)
can escape the monad.

It's more or less just a variation of the state monad.

- Dirk

Thomas Fischbacher

unread,

Jun 29, 2007, 6:57:21 AM6/29/07

to Jon Harrop, caml...@yquem.inria.fr

Jon Harrop wrote:

> Is this the kind of thing you're after:
>
> http://okmij.org/ftp/ML/#dynvar

Wait a second...

Evidently, you can do tricks like that once you have some handle on the
continuation (this is how Oleg does it). But as far as I can see, this
will require Delimcc. Isn't that a bytecode-only thing, and also a
non-officially-supported hack on top of the OCaml system?

--
best regards,
Thomas Fischbacher
t...@functionality.de

_______________________________________________

Bill Wood

unread,

Jun 29, 2007, 9:18:21 AM6/29/07

to Thomas Fischbacher, caml...@yquem.inria.fr

On Wed, 2007-06-27 at 14:55 +0100, Thomas Fischbacher wrote:
. . .
> One thing OCaml can do better than, say, CMU CL, is to define globally
> visible functions that depend on some otherwise inaccessible context,
> as in the following example:
>
> let float_factorial =
> let _known_factorials = ref [|1.0;1.0;2.0;6.0;24.0;120.0;720.0|] in
> (fun n ->
> let known_factorials = !_known_factorials in
> let nr_known = Array.length known_factorials in
> if n < nr_known
> then
> known_factorials.(n)
> else
> let new_known_factorials = Array.make (n+1) 0.0 in
> begin
> for i=0 to nr_known-1 do
> new_known_factorials.(i) <- known_factorials.(i)
> done;
> (let rec fill f_pos pos =
> if pos > n then ()
> else
> let () = new_known_factorials.(pos) <- f_pos in
> fill (f_pos *. (float_of_int (pos+1))) (pos+1)
> in
> fill (known_factorials.(nr_known-1)*.(float_of_int nr_known)) nr_known);
> _known_factorials := new_known_factorials;
> new_known_factorials.(n)
> end)
> ;;
>
> A corresponding
>
> (let ((buffer (make-array ...)))
> (defun float-factorial (n)
> ...))
>
> just plainly does not work with CMU CL/SBCL. :-(

Just to set the record straight on this, Thomas and I did a little
experimentation and discovered that change happens -- CMU CL (at least)
now does the Right Thing here. You can indeed define stateful
functions, like the OCaml one above, in CMU Common Lisp using 'defun'.

Pierre Etchemaïté

unread,

Jun 30, 2007, 4:21:37 AM6/30/07

to caml...@yquem.inria.fr

Le Wed, 27 Jun 2007 14:53:05 +0100, Jon Harrop <j...@ffconsultancy.com> a écrit :

>
> # let symbol =
> let m = Hashtbl.create 1 in
> fun string ->
> try Hashtbl.find m string with Not_found ->
> Hashtbl.add m string string;
> string;;
> val symbol : '_a -> '_a = <fun>

The usual way to do this in OCaml is using a weak hashtable. Benefit is
that the strings can still be collected when not longer used. Weak
hashtables also have a "merge" function that do exactly what we want:

module H = Weak.Make(struct
type t = string
let hash = Hashtbl.hash
let equal (x:t) y = x = y
end)

let symbol =
let symbols = H.create 13 in
H.merge symbols