[Caml-list] "Ref" and copy of functions

Pierre-Evariste Dagand

unread,

Dec 13, 2007, 12:28:24 PM12/13/07

to caml...@yquem.inria.fr

Hi Caml-list,

I'm looking for advices about a "clean way" of doing something which,
by design, isn't. So, let states the problem.

I'm doing a combinator library based on Arrows for a Yampa-like system :

type ('a,'b) arrow = Arrow of ( 'a -> 'b )

Thus, I can instantiate an arrow with :

let arr f = Arrow ( f )
val arr : ('a -> 'b) -> ('a, 'b) arrow

Then, I have, for example, the composition of arrows :

let (>>>) (Arrow f) (Arrow g) =Arrow ( fun c -> g ( f c ) )
val ( >>> ) : ('a, 'b) arrow -> ('b, 'c) arrow -> ('a, 'c) arrow

Right here, everything plays well (excepted some weak type variables
issues but that's not a big deal).

But (and that's here that the trouble is) I need a "loop" combinator,
that I wrote this way :

let loop init (Arrow f) =
let state = ref init in
Arrow (
fun c ->
let new_state , output = f ( !state , c ) in
state := new_state;
output
)
val loop : 'a -> ('a * 'b, 'a * 'c) arrow -> ('b, 'c) arrow

Finally, I can write a small arrow :

let arr_counter = loop 0 ( arr ( fun ( counter , x ) -> counter+1 ,
counter+x ) )
let arr_double = arr ( fun x -> 2*x )
let arr_my_small_arrow = arr_counter >>> arr_double

And I execute it with :

let run (Arrow f) input =
f input
val run : ('a, 'b) arrow -> 'a -> 'b

And it works. Right.

But now, I would like to be able to "copy" a built arrow and to be
able to execute the copy without side-effecting on the first one.
Obviously, I cannot do that in this implementation.

My current solution is really *ugly* :

let arrow_string = Marshal.to_string arrow [ Marshal.No_sharing ;
Marshal.Closures ]

Then I "Marshal.from_string arrow_string". I'm not even sure if it
will not blow out of my hands with "strange" things in the Ref cell.

I'm aware of implementations in CPS but I use to think that it has a
performance cost that I'm not ready to pay (I'm fighting against a C++
code so...).

So if someone knows an elegant solution to my problem (if someone has
read this mail until here), please let me know :-)

Best regards,

P.S.: For those who are just curious, these combinators are used in a
functional-reactive library for Peer-to-Peer overlays programming.

--
Pierre-Evariste DAGAND
http://perso.eleves.bretagne.ens-cachan.fr/~dagand/

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Christophe Papazian

unread,

Dec 14, 2007, 4:22:52 AM12/14/07

to

You can use some kind of copy closure like :

type ('a,'b) a = {f:'a -> 'b; c: unit -> ('a,'b) a} (* f is the
function, and c is a fresh copy *)
let rec create f = {f=f; c = fun () -> create f} (* we have to suppose
here that f is purely functional *)
let rec loop i a = let s = ref i in {f= (fun c -> let ni,nr = a.f (!
s,c) in s:=ni; nr); c= fun () -> loop !s a}

type ('a, 'b) a = { f : 'a -> 'b; c : unit -> ('a, 'b) a; }
val create : ('a -> 'b) -> ('a, 'b) a = <fun>
val loop : 'a -> ('a * 'b, 'a * 'c) a -> ('b, 'c) a = <fun>

let a = create (fun (b,s) -> not b , if b then String.lowercase s else
String.uppercase s)
let b = loop true a

val a : (bool * string, bool * string) a = {f = <fun>; c = <fun>}
val b : (string, string) a = {f = <fun>; c = <fun>}

# b.f "t";;
- : string = "t"
# b.f "t";;
- : string = "T"
# b.f "t";;
- : string = "t"
# let c = b.c ();; (* fresh copy *)
val c : (string, string) a = {f = <fun>; c = <fun>}
# c.f "t";;
- : string = "T"
# b.f "t";;
- : string = "T"
# c.f "t";;
- : string = "t"
# b.f "t";;
- : string = "t"

Does it help you ?

Christophe

Zheng Li

unread,

Dec 14, 2007, 6:15:56 AM12/14/07

to caml...@inria.fr

Hi, "Pierre-Evariste Dagand" <peda...@gmail.com> writes: > I'm

looking for advices about a "clean way" of doing something which,
> by design, isn't. So, let states the problem. > I'm doing a

combinator library based on Arrows for a Yampa-like system : I
don't understand what "arrow" is and the rational behind it, so
don't take my suggestion seriously. > type ('a,'b) arrow = Arrow
of ( 'a -> 'b ) > let arr f = Arrow ( f ) > val arr : ('a -> 'b)
-> ('a, 'b) arrow > let (>>>) (Arrow f) (Arrow g) =Arrow ( fun c
-> g ( f c ) ) > let loop init (Arrow f) = > let state = ref

init in > Arrow ( > fun c -> > let new_state ,
output = f ( !state , c ) in > state := new_state; >

output > ) > let arr_counter = loop 0 ( arr ( fun ( counter ,

x ) -> counter+1 , > counter+x ) ) > let arr_double = arr ( fun x
-> 2*x ) > let arr_my_small_arrow = arr_counter >>> arr_double >

let run (Arrow f) input = > f input At least on this
particular example, you can use the following encoding (I
deliberately use rectypes here, but you can always use extra
variant instead)

---definition---

(* Only in mind: type ('a,'b) arrow = 'a -> 'b * ('a, 'b) arrow;;
*)

let rec arr f x = f x, arr f;;

let rec (>>>) f g x =
let rf,nf = f x in let rg,ng = g rf in
rg, nf >>> ng;;

let rec loop init f x =
let ((init', r), n) = f (init,x) in
r, loop init' n;;

let rec run f = function
| [] -> []
| h::t -> let r,n = f h in r :: run n t;;

---test---

let arr_counter = loop 0 (arr (fun (c,x) -> c+1,c+x));;

let arr_double = arr (( * ) 2);;

let arr_my_small_arrow = arr_counter >>> arr_double;;

run arr_my_small_arrow [6;5;4;3];;
- : int list = [12; 12; 12; 12]

Since it doesn't use mutable variable, it's free to make copy. >

P.S.: For those who are just curious, these combinators are used
in a > functional-reactive library for Peer-to-Peer overlays

programming. It looks like "arrow" is for some kind of flow-like
programming, then you may also take a look at SDFlow [1]. Your
example can be encoded in a point-free style as

---flow---

let arr_counter =
comb |- map (fun (x,c) -> x+c,c+1) |- split |> curry |-
feedr[<'0>];;

let arr_double = map (( * ) 2);;

let arr_my_small_arrow = arr_counter |- arr_double;;

arr_my_small_arrow [<'6;'5;'4;'3>] |> to_list;;
- : int list = [12; 12; 12; 12]

Note however, SDFlow is just a proof of concept. The syntax on
recursion can be made much more straightforward with some camlp4
script to allow recursive value with preconditions. I just don't
have time to do that.

HTH.

--
Zheng Li
http://www.pps.jussieu.fr/~li

David Teller

unread,

Dec 14, 2007, 9:52:01 AM12/14/07

to Caml

Speaking of SDFlow, I've slightly extended it with three functions
to_array, to_stream and of_array. I may extend it a bit further, but I
don't think so. Do you accept submissions ?

I've also made use of SDFlow in a Camlp4 extension for stream
comprehension, by the way. I'll try and publish this within a few days.

Cheers,
David

On Fri, 2007-12-14 at 11:49 +0100, Zheng Li wrote:

> Note however, SDFlow is just a proof of concept. The syntax on
> recursion can be made much more straightforward with some camlp4
> script to allow recursive value with preconditions. I just don't
> have time to do that.
>
> HTH.

--
David Teller
Security of Distributed Systems
http://www.univ-orleans.fr/lifo/Members/David.Teller
Angry researcher: French Universities need reforms, but the LRU act brings liquidations.

Zheng Li

unread,

Dec 14, 2007, 11:17:13 AM12/14/07

to caml...@inria.fr

Hi,

"Pierre-Evariste Dagand" <peda...@gmail.com> writes:
> Yes, that's a CPS-way of doing things. I am aware of it and my first
> implementation was in this style.
>
> The reasons why I decided to switch to Ref were that :
> 1/ The implementation is much more "intuitive" (I don't have a CPS
> pre-processor in my brain)
> 2/ I thought I could get rid of my hack (Marshal/UnMarshal) easily
> 3/ I used to think that I will be faster
>
> Point 1 is not a problem, Santa Claus might bring me this CPS
> pre-processor in a few days.
>
> Point 2 seems to be wrong : being fast is nice but if it's for hitting
> the wall, well... one should slow down. And for the moment, I haven't
> found a clean solution and I'm not sure whether my hack is safe or
> not.

I guess you won't be able to find such a solution. In the Ref-based
implementation, the result arrow type is stateful. You won't be able to
_implicitly_ copy a state _inside_ the language, otherwise first-class
continuation would already appear.

A related issue appears on the "arr" function, if the argument "f" is
itself stateful, even with a functional encoding like CPS won't help to
copy the state inside. You can only solve this with a out-of-language
solution such as Marshal or independent process evaluation . However,
they also charge you cost. AFAIR, marshal (native code) has some
cross-module safety problem.

> So, remains point 3. That's why I have carried out an experience on my
> initial Ref implementation, your Rectypes implementation and a CPS
> implementation without rectypes (because rectypes frighten me).

Since the corresponding mechanics in Ref and CPS are mutation and
closure (de)construction, no wonder the former is faster. One ad-hoc
solution, if you only have to provide high-level functional API to
outside, is to make states explicit, like

type ('a,'b) arrow = {mutable state: 'c; eval: 'c -> 'a -> 'b}

This is not valid OCaml, since the 'c type is out of scope. You can
either encode it with existential type (which also involve closure
(de)construction, but I'm not sure about the cost). Or simply using
Obj.t instead of 'c as far as you won't expose it, at least it won't be
unsafer than Marshal and faster.

Zheng Li

unread,

Dec 14, 2007, 11:20:47 AM12/14/07

to caml...@inria.fr

Hello,

David Teller <David....@ens-lyon.org> writes:
> Speaking of SDFlow, I've slightly extended it with three functions
> to_array, to_stream and of_array. I may extend it a bit further, but I
> don't think so. Do you accept submissions ?

Sure, submissions are always welcome, or you may derive new version
based on it as you like.

> I've also made use of SDFlow in a Camlp4 extension for stream
> comprehension, by the way. I'll try and publish this within a few days.

Glad to know.

Best wishes.

--
Zheng Li
http://www.pps.jussieu.fr/~li

_______________________________________________

Loup Vaillant

unread,

Dec 14, 2007, 11:31:26 AM12/14/07

to Pierre-Evariste Dagand, caml...@yquem.inria.fr, Zheng Li

2007/12/14, Pierre-Evariste Dagand <peda...@gmail.com>:
> And the results are ...
> 1/ Ref : 0.75 microseconds
> 2/ Variant CPS : 2.69 microseconds
> 3/ Rectypes : 2.80 microseconds
> [ compiled with ocamlopt.opt, with -rectypes for 3/ ]
>
> It's about 4 times longer and, against a C++ code, I fear that I can't
> afford it.

Err, Is this likely to be a bottoleneck? Or do you want it to be a
library, so it has to be fast anyway?

Loup

Pierre-Evariste Dagand

unread,

Dec 14, 2007, 11:56:13 AM12/14/07

to Zheng Li, caml...@yquem.inria.fr

> I guess you won't be able to find such a solution. In the Ref-based
> implementation, the result arrow type is stateful. You won't be able to
> _implicitly_ copy a state _inside_ the language, otherwise first-class
> continuation would already appear.

That's why I asked this to the Caml-list : to find a black magic recipe :-)

> Since the corresponding mechanics in Ref and CPS are mutation and
> closure (de)construction, no wonder the former is faster. One ad-hoc
> solution, if you only have to provide high-level functional API to
> outside, is to make states explicit, like
>
> type ('a,'b) arrow = {mutable state: 'c; eval: 'c -> 'a -> 'b}
>
> This is not valid OCaml, since the 'c type is out of scope. You can
> either encode it with existential type (which also involve closure
> (de)construction, but I'm not sure about the cost). Or simply using
> Obj.t instead of 'c as far as you won't expose it, at least it won't be
> unsafer than Marshal and faster.

This solution looks promising, I will look at it and see if it suits my needs.

Thanks,

--
Pierre-Evariste DAGAND

Pierre-Evariste Dagand

unread,

Dec 14, 2007, 11:57:29 AM12/14/07

to caml...@yquem.inria.fr

[I forgot to cc the list, sorry]

> Err, Is this likely to be a bottoleneck? Or do you want it to be a
> library, so it has to be fast anyway?

Indeed, it will be a library and I would like to get it the faster I can.

These arrows can be connected to :
1/ A simulator that simulates a whole network (target size : 10.000 nodes)
2/ A network interface that connects the arrow with the World
3/ If I have enough time, a debugger and a model checker

In the 2/ case, I'm looking for documentation about the granularity of
the Unix/Windows threads switching (as I rely on threads to interact
with the world).

For the other cases, the faster, the better.

--
Pierre-Evariste DAGAND

Jacques Garrigue

unread,

Dec 16, 2007, 12:18:08 AM12/16/07

to peda...@gmail.com, caml...@yquem.inria.fr

From: "Pierre-Evariste Dagand" <peda...@gmail.com>

> I'm looking for advices about a "clean way" of doing something which,
> by design, isn't. So, let states the problem.
>
> I'm doing a combinator library based on Arrows for a Yampa-like system :
>
> type ('a,'b) arrow = Arrow of ( 'a -> 'b )

[...]

> But now, I would like to be able to "copy" a built arrow and to be
> able to execute the copy without side-effecting on the first one.
> Obviously, I cannot do that in this implementation.

Here is yet another solution, using objects, which actually combines
Zheng's and Oleg's ideas. That is, it separates state and function,
but provides only access to state through a cloning method, so that it
is completely type safe. This is just what objects are good at!

class ['a,'b] arrow (f : 'a -> 'b) =
object (self) method call = f method copy = self end

let (>>>) rf rg : ('a,'b) arrow =
object
val rf : ('a,'c) arrow = rf
val rg : ('c,'b) arrow = rg
method call x = rg#call (rf#call x)
method copy = {< rf = rf#copy; rg = rg#copy >}
end

let loop init rf : ('b,'c) arrow =
object
val mutable state = init
val rf : ('a*'b,'a*'c) arrow = rf
method call x =
let state', y = rf#call (state, x) in
state <- state'; y
method copy = {< rf = rf#copy >}
end

let arr = new arrow

let arr_counter = loop 0 (arr (fun (counter,x) -> counter+1, counter+x))
let arr_double = arr (fun x -> 2*x)
let arr_my_small_arrow = arr_counter >>> arr_double

The key here is the {< ... >} construct, which creates a shallow copy
of an object, eventually with some fields changed. As a result, in
loop there is no need to handle the state explicitly: the state field
in the copied object will be distinct from the state field in the
original object. On the other hand, you must explicitly update fields
holding arrows, since the copy is shallow. Note that the explicit
"val" fields are needed to allow updating their contents when copying.

One difference with Oleg's approach is that we take a copy of the
original object, rather than creating a completely new record. In this
case, this doesn't mean much, since there is no extra computation
involved. Still, the state after copying is not the original but the
current one. And this may matter more if the construction is more
complicated.

Jacques Garrigue

Pierre-Evariste Dagand

unread,

Dec 16, 2007, 11:39:26 AM12/16/07

to caml...@yquem.inria.fr

> Here is yet another solution, using objects, which actually combines
> Zheng's and Oleg's ideas. That is, it separates state and function,
> but provides only access to state through a cloning method, so that it
> is completely type safe. This is just what objects are good at!
>
> class ['a,'b] arrow (f : 'a -> 'b) =
> object (self) method call = f method copy = self end

Yet another nice solution I would never have imagined :-)

And the performance is quite good :

1.428032 microseconds on my small benchmark

(the unsafe Ref implantation takes 0.75 microseconds while the CPS
implantation takes 2.7 microseconds).

Nevertheless, I have carried out my benchmark on Oleg's implementation
and find an impressive performance :

0.74 microseconds !

Now, I will re-write my combinators and see which style better suits my needs.

Thanks,

--
Pierre-Evariste DAGAND
http://perso.eleves.bretagne.ens-cachan.fr/~dagand/

_______________________________________________

Pierre-Evariste Dagand

unread,

Dec 16, 2007, 1:27:45 PM12/16/07

to caml...@yquem.inria.fr

> Now, I will re-write my combinators and see which style better suits my needs.

Finally, impressed by the speed of the record solution, I have
re-written the whole library with records.

I have measured the performance of this implementation against the
Ref-based one on "real" code. Thus, I have simulated my Chord [1]
implementation with a simulator instrumented such that it drops the
processing time for each arrow processing.

The results can be found at [2].

In a nutshell : the record solution has a negligible overhead compared
to the Ref solution.

Thanks all,

[1]: http://pdos.csail.mit.edu/chord/
[2]: http://perso.eleves.bretagne.ens-cachan.fr/~dagand/projets/opis/processing_speed_records.pdf

--
Pierre-Evariste DAGAND