Assume we define an OCaml representation of Erlang terms like the following:
type term =
  | T_atom of string
  | T_bool of bool
  | T_float of float
  | T_list of term list
  | ...
Assume we want an implementation where Erlang communicates with OCaml via TCP/IP, sending terms as byte streams, instead of sharing the same address space. Then I can think of two ways to convert the terms encoded as binaries to and from the OCaml representation: 1) via OCaml's FFI, using the erl_interface library that is part of OTP, and 2) implementing the conversion in OCaml, using jinterface as inspiration. I've interfaced a dynamically typed language that provides a C interface with OCaml before, using the FFI, but I haven't used erl_interface or jinterface. Maybe someone who has could indicate which of my suggestions above would be simpler (or provide a better one).
TCP/IP is the easiest way to interface; however, it is not the fastest. There are three options:
1. Via some network transport (e.g. TCP)
2. Port program
3. Driver
I haven't done any benchmarks between #1 and #2, but as far as I recall #2 is about 10 times slower than #3. In terms of complexity, #1 and #2 are on the same order, whereas #3 is more difficult to implement and test, but if you are looking for speed, that would be a good incentive.
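For option #2, a common setup is to open the port with the {packet, 2} option, so that every message is preceded by a 2-byte big-endian length. Below is a minimal sketch of that framing on the OCaml side. The names read_packet and write_packet are made up for illustration and are not part of any library; on Windows the channels would also need set_binary_mode_in / set_binary_mode_out.

```ocaml
(* Sketch of {packet, 2} framing for a port program: each message is
   preceded by its length as a 2-byte big-endian integer.
   read_packet and write_packet are illustrative names only. *)

(* Read one framed message from [ic]; None on end of input. *)
let read_packet (ic : in_channel) : string option =
  match really_input_string ic 2 with
  | exception End_of_file -> None
  | hdr ->
    let len = (Char.code hdr.[0] lsl 8) lor Char.code hdr.[1] in
    (match really_input_string ic len with
     | exception End_of_file -> None
     | payload -> Some payload)

(* Write one framed message to [oc] and flush so the peer sees it at once. *)
let write_packet (oc : out_channel) (payload : string) : unit =
  let len = String.length payload in
  if len > 0xFFFF then invalid_arg "write_packet: payload too large";
  output_char oc (Char.chr (len lsr 8));
  output_char oc (Char.chr (len land 0xFF));
  output_string oc payload;
  flush oc

(* Echo every packet back until the port is closed.
   A port program would call: echo_loop stdin stdout *)
let rec echo_loop (ic : in_channel) (oc : out_channel) : unit =
  match read_packet ic with
  | None -> ()
  | Some msg -> write_packet oc msg; echo_loop ic oc
```

The same two functions would work unchanged over a TCP socket wrapped in channels, which is one reason #1 and #2 end up similar in complexity.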
I would stay away from jinterface for any of these tasks, and stick to the ei interface library for #2 and a combination of ei and erl_interface for #1. #3 requires a somewhat different approach (check out inet_drv.c for examples of how to marshal data while avoiding extra copying).
On the OCaml side, for #1 and #2 you can use the ei interface library and create C -> OCaml interface wrappers to deal with OCaml values. The most difficult term to encode would probably be a tuple, as Erlang and OCaml represent tuples with different semantics. In OCaml a tuple cannot be distinguished from an array on the C side (as OCaml doesn't quite have RTTI aside from the most generic tags for strings, doubles, etc.), and on the OCaml side you don't have the luxury of the element(N, Tuple) and size(Tuple) calls that are available in Erlang.
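The point about tuples and arrays can be checked from OCaml itself with the Obj module, which exposes the same block tags that C stubs see. This is only an illustration of the runtime layout (tag_of is a made-up helper), not something to use in real code:

```ocaml
(* Tuples and non-float arrays are both plain blocks with tag 0, so code that
   only inspects the runtime value cannot tell them apart. Strings and boxed
   floats do carry their own tags. *)
let tag_of (v : 'a) : int = Obj.tag (Obj.repr v)

let () =
  Printf.printf "tuple: tag %d, size %d\n"
    (tag_of (1, "a")) (Obj.size (Obj.repr (1, "a")));
  Printf.printf "array: tag %d, size %d\n"
    (tag_of [| 1; 2 |]) (Obj.size (Obj.repr [| 1; 2 |]));
  Printf.printf "string has string_tag: %b\n" (tag_of "abc" = Obj.string_tag);
  Printf.printf "float has double_tag: %b\n" (tag_of 3.14 = Obj.double_tag)
```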
Serge
Fermin Reig wrote:
> Hi,
> [...]
On Jun 30, 11:48 pm, "F Reig" <fermin.r...@gmail.com> wrote:
> Is the Erlang Term Format documented somewhere, or do we have to
> reverse-engineer it by reading the implementations of ei_decode_* and
> ei_encode_* ?
It is documented, at least as a text file in the source code distribution (erts/emulator/internal_doc/erl_ext_dist.txt).
There's certainly enough confusion around the naming of term formats:
- Erlang Term Format
- External Binary Format
- Driver Term Format
I think the first one is what's represented by ETERM and used in erl_interface (see the erl_interface user's guide). The second one is the result of serializing the internal representation of Erlang terms, using the term_to_binary() and binary_to_term() functions on the Erlang side and the ei interface on the C side. Finally, the Driver Term Format is documented in the erts / erl_driver section of the documentation. This one is used when writing drivers and is a fast way of sending data from C to Erlang by avoiding copying and serialization.
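To give an idea of what decoding the second format (the output of term_to_binary()) directly in OCaml might look like, here is a rough sketch covering only a handful of tags. The tag numbers are the ones listed in erl_ext_dist.txt; the type, the exception and the function names are my own assumptions, and the code assumes a 64-bit OCaml so that 32-bit values fit in an int.

```ocaml
(* Sketch of a decoder for a subset of the external format produced by
   term_to_binary(). Names are illustrative; assumes 64-bit OCaml ints. *)
type term =
  | T_atom of string
  | T_int of int
  | T_float of float
  | T_string of string   (* STRING_EXT: a list of bytes *)
  | T_binary of string
  | T_tuple of term array
  | T_list of term list

exception Decode_error of string

(* Decode the term starting at [pos]; return it with the position after it.
   Truncated input surfaces as Invalid_argument from the string accessors. *)
let rec decode_at (s : string) (pos : int) : term * int =
  let u8 i = Char.code s.[i] in
  let u16 i = (u8 i lsl 8) lor u8 (i + 1) in
  let u32 i = (u16 i lsl 16) lor u16 (i + 2) in
  let rec elems n p acc =
    if n = 0 then (List.rev acc, p)
    else
      let (t, p') = decode_at s p in
      elems (n - 1) p' (t :: acc)
  in
  match u8 pos with
  | 97 -> (T_int (u8 (pos + 1)), pos + 2)               (* SMALL_INTEGER_EXT *)
  | 98 ->                                               (* INTEGER_EXT, signed *)
    let v = u32 (pos + 1) in
    (T_int (if v land 0x80000000 <> 0 then v - 0x100000000 else v), pos + 5)
  | 70 ->                                               (* NEW_FLOAT_EXT *)
    let bits = ref 0L in
    for k = 1 to 8 do
      bits := Int64.logor (Int64.shift_left !bits 8) (Int64.of_int (u8 (pos + k)))
    done;
    (T_float (Int64.float_of_bits !bits), pos + 9)
  | 100 | 118 ->                                        (* ATOM_EXT, ATOM_UTF8_EXT *)
    let n = u16 (pos + 1) in
    (T_atom (String.sub s (pos + 3) n), pos + 3 + n)
  | 115 | 119 ->                                        (* SMALL_ATOM_EXT, SMALL_ATOM_UTF8_EXT *)
    let n = u8 (pos + 1) in
    (T_atom (String.sub s (pos + 2) n), pos + 2 + n)
  | 104 ->                                              (* SMALL_TUPLE_EXT *)
    let (es, p) = elems (u8 (pos + 1)) (pos + 2) [] in
    (T_tuple (Array.of_list es), p)
  | 106 -> (T_list [], pos + 1)                         (* NIL_EXT *)
  | 107 ->                                              (* STRING_EXT *)
    let n = u16 (pos + 1) in
    (T_string (String.sub s (pos + 3) n), pos + 3 + n)
  | 108 ->                                              (* LIST_EXT: count, elements, tail *)
    let (es, p) = elems (u32 (pos + 1)) (pos + 5) [] in
    (match decode_at s p with
     | (T_list [], p') -> (T_list es, p')
     | _ -> raise (Decode_error "improper lists are not handled"))
  | 109 ->                                              (* BINARY_EXT *)
    let n = u32 (pos + 1) in
    (T_binary (String.sub s (pos + 5) n), pos + 5 + n)
  | tag -> raise (Decode_error (Printf.sprintf "unsupported tag %d" tag))

(* Decode a complete binary, checking the leading version byte (131). *)
let decode (s : string) : term =
  if String.length s = 0 || Char.code s.[0] <> 131 then
    raise (Decode_error "bad version byte");
  fst (decode_at s 1)
```

For example, term_to_binary({ok, 42}) with the older atom encoding is the byte sequence 131,104,2,100,0,2,111,107,97,42, and decode on those bytes gives T_tuple [| T_atom "ok"; T_int 42 |]. Pids, refs, funs, bignums and large tuples are left out here.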
Serge
F Reig wrote:
> On 6/30/07, Ulf Wiger <u...@wiger.net> wrote:
>> I would like to add as an option that the Erlang Term Format
>> is decoded directly in Ocaml:
>
> Is the Erlang Term Format documented somewhere, or do we have to
> reverse-engineer it by reading the implementations of ei_decode_* and
> ei_encode_* ?
Of course, using driver_send_term() when sending data from OCaml to Erlang would be nice - fast and flexible in that it can also send to any Erlang process.
But it will only work for the case where OCaml is linked into the Erlang runtime, right?
Item 1 on the agenda is perhaps to determine the high-level data representation, independent of encoding. It wouldn't have to be type-rich. Something similar in scope to UBF(B) would probably go a long way. http://www.sics.se/~joe/ubf/site/home.html (Ignore the "things change frequently" bit - it hasn't for years.)
Ulf Wiger wrote:
> Of course, using driver_send_term() when sending data
> from OCaml to Erlang would be nice - fast and flexible
> in that it can also send to any Erlang process.
>
> But it will only work for the case where OCaml is
> linked into the erlang runtime, right?
Yes, indeed.
> Item 1 on the agenda is perhaps to determine the
> high-level data representation, independent of
> encoding. It wouldn't have to be type-rich. Something
> similar in scope to UBF(B) would probably go a long way.
> http://www.sics.se/~joe/ubf/site/home.html
> (Ignore the "things change frequently" bit - it
> hasn't for years.)
Agreed. Selecting the interface method (TCP/Port/Driver/etc.) is of less importance at this time. Thinking about the data representation will also help to get a clearer picture of the capabilities of the bindings in both languages.
> 2007/7/1, Serge Aleynikov <sal...@gmail.com>:
>> [...]
> [...]
> Item 1 on the agenda is perhaps to determine the
> high-level data representation, independent of
> encoding. It wouldn't have to be type-rich. Something
> similar in scope to UBF(B) would probably go a long way.
> http://www.sics.se/~joe/ubf/site/home.html
> (Ignore the "things change frequently" bit - it
> hasn't for years.)
So, what do people have in mind as the kind of interface? I'm asking from the OCaml point of view. Option 1 is that, on the OCaml side, you have a type similar to this:
type term =
  | T_atom of string
  | T_bool of bool
  | T_float of float
  | T_list of term list
  | ...
and everything that comes from the Erlang world is of this type. (Of course, you can have variations on this, such as variants or phantom types, but the general idea is that of tagged values.)
Option 2 is to use arbitrary OCaml types in the interface, such as bool or the record type { name:string; age:int }. What cannot be accommodated in this model is collections of arbitrary length containing values of different types. Those would need to be collections of tagged values.
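The two options are not mutually exclusive: Option 2 can be layered on top of Option 1 by writing a conversion per OCaml type. A small sketch follows, where the wire shape (a two-element list [Name, Age]) and the function names are assumptions made up for the example:

```ocaml
(* Option 1: everything from Erlang arrives as a tagged value. *)
type term =
  | T_atom of string
  | T_int of int
  | T_string of string
  | T_list of term list

(* Option 2: the application works with an ordinary OCaml record. *)
type person = { name : string; age : int }

(* Convert a tagged value to the record, failing when the shape does not
   match. Assumed wire shape: the two-element list [Name, Age]. *)
let person_of_term : term -> person option = function
  | T_list [T_string name; T_int age] -> Some { name; age }
  | _ -> None

let term_of_person (p : person) : term =
  T_list [T_string p.name; T_int p.age]

(* A heterogeneous collection only type-checks because its elements are tagged. *)
let mixed : term = T_list [T_atom "a"; T_int 1; T_string "b"]
```

The shape check that Erlang would do at run time with a pattern match ends up in person_of_term, and the rest of the OCaml program stays statically typed.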
I would like to see tuples and binaries as well. If the OCaml side can handle Erlang binaries, it can handle any Erlang term as an opaque value (the Erlang side would use term_to_binary() and binary_to_term() to pack and unpack the value).
If the data being transferred is tagged, you can of course send binaries as strings, and tuples as lists.
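For the other direction, here is a hedged sketch of how the OCaml side could emit bytes that binary_to_term() should accept, including binaries and tuples. It uses the tag numbers from erl_ext_dist.txt and the older ATOM_EXT atom encoding (newer releases prefer the UTF-8 atom tags); the type and function names are assumptions, and integers are assumed to fit in 32 bits.

```ocaml
(* Sketch of an encoder for a subset of the external term format.
   Names are illustrative; T_int values are assumed to fit in 32 bits. *)
type term =
  | T_atom of string
  | T_int of int
  | T_binary of string
  | T_tuple of term array
  | T_list of term list

let rec encode_into (b : Buffer.t) (t : term) : unit =
  let u8 n = Buffer.add_char b (Char.chr (n land 0xFF)) in
  let u16 n = u8 (n lsr 8); u8 n in
  let u32 n = u16 (n lsr 16); u16 n in
  match t with
  | T_atom a when String.length a < 256 ->              (* ATOM_EXT *)
    u8 100; u16 (String.length a); Buffer.add_string b a
  | T_atom _ -> invalid_arg "encode: atom too long"
  | T_int n when n >= 0 && n < 256 -> u8 97; u8 n        (* SMALL_INTEGER_EXT *)
  | T_int n -> u8 98; u32 n                              (* INTEGER_EXT *)
  | T_binary s ->                                        (* BINARY_EXT *)
    u8 109; u32 (String.length s); Buffer.add_string b s
  | T_tuple es when Array.length es < 256 ->             (* SMALL_TUPLE_EXT *)
    u8 104; u8 (Array.length es); Array.iter (encode_into b) es
  | T_tuple _ -> invalid_arg "encode: large tuples are not handled"
  | T_list [] -> u8 106                                  (* NIL_EXT *)
  | T_list es ->                                         (* LIST_EXT with NIL tail *)
    u8 108; u32 (List.length es); List.iter (encode_into b) es; u8 106

(* Produce a complete binary, starting with the version byte 131. *)
let encode (t : term) : string =
  let b = Buffer.create 64 in
  Buffer.add_char b (Char.chr 131);
  encode_into b t;
  Buffer.contents b
```

An opaque Erlang value would simply travel as a T_binary whose contents OCaml never looks at.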
One way to represent Erlang tuples could be to treat them as arrays, whereas binaries could be represented as strings (OCaml strings can contain the 0 byte, as they don't rely on it as the end-of-string marker):
type term =
  | ...
  | T_tuple of term array
  | T_binary of string
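With tuples stored as arrays, counterparts of Erlang's element(N, Tuple) and size(Tuple) become easy to write. A small sketch (the helper names mirror the Erlang BIFs but are otherwise my own):

```ocaml
type term =
  | T_atom of string
  | T_int of int
  | T_binary of string
  | T_tuple of term array
  | T_list of term list

(* Counterpart of element/2: 1-based access into a tuple. *)
let element (n : int) (t : term) : term =
  match t with
  | T_tuple es when n >= 1 && n <= Array.length es -> es.(n - 1)
  | _ -> invalid_arg "element: badarg"

(* Counterpart of size/1: defined for tuples and binaries only. *)
let size (t : term) : int =
  match t with
  | T_tuple es -> Array.length es
  | T_binary s -> String.length s
  | _ -> invalid_arg "size: badarg"

(* A binary with an embedded 0 byte keeps its full length. *)
let three_bytes : term = T_binary "a\000b"
```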
Why not map Erlang tuples to OCaml tuples? If we have an OCaml function that returns a tuple received from Erlang, it would have to have a fixed signature with a given number of elements in the return value:
let get_tuple : ... -> term * term
This is because OCaml determines the shape of a tuple statically, and the only way to work with the result of such a function is by pattern matching on its shape:
match get_tuple ... with
| (T_atom "ok", T_int i) -> ...
| (T_atom "error", T_string s) -> ...;;
Alternatively we can reserve a custom tuple type "pair" for marshaling two-value tuples, on which OCaml offers fst and snd value retrieval functions, and use arrays for N-tuples:
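A sketch of what such a type could look like. T_pair is a hypothetical constructor name; its argument is written as (term * term) in parentheses so that the payload is a real OCaml pair that fst and snd can be applied to:

```ocaml
type term =
  | T_atom of string
  | T_int of int
  | T_string of string
  | T_pair of (term * term)   (* hypothetical: 2-tuples as a real OCaml pair *)
  | T_tuple of term array     (* every other arity *)

(* fst and snd apply directly to the payload of T_pair. *)
let first (t : term) : term option =
  match t with
  | T_pair p -> Some (fst p)
  | _ -> None

(* Pairs match on their components; N-tuples match with array patterns. *)
let describe (t : term) : string =
  match t with
  | T_pair (T_atom "ok", T_int i) -> Printf.sprintf "ok: %d" i
  | T_pair (T_atom "error", T_string s) -> "error: " ^ s
  | T_pair _ -> "some other pair"
  | T_tuple [| _; _; _ |] -> "a 3-tuple"
  | T_tuple es -> Printf.sprintf "a %d-tuple" (Array.length es)
  | _ -> "not a tuple"
```

The cost is that a 2-tuple now has two possible representations (T_pair and a two-element T_tuple), so the decoder would have to pick one consistently.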
F Reig wrote:
> On 7/1/07, Ulf Wiger <u...@wiger.net> wrote:
>> [...]
Hello everyone,
I'm reading the discussions in this group ... quite interesting in
fact! :)
I've been using Erlang for only a year, and I'm a complete newbie in
OCaml ... but I would love to spend some time coding on this project! My
idea is to try to do something similar to py_interface (an Erlang node
implemented in Python, communicating through an Erlang port ... if I'm
not wrong). I have absolutely no idea how much work it would take to
replicate this in OCaml ... I guess a first step would be to reuse the
previous OCaml stream decoding and communicate through a TCP
connection.
So I have a few questions ...
First, is it too ambitious as a project to learn OCaml? ;)
Second, does anybody have some code already?
Any advice/guidance is welcome!
Thanks,
ludo
On Jul 3, 5:32 am, "F Reig" <fermin.r...@gmail.com> wrote: