http://people.csail.mit.edu/mikelin/ocaml+twt
This version introduces a major backwards-incompatible change: the
eradication of "in" from let expressions, and the need to indent the let
body (as suggested by the F# lightweight syntax). This reduces the
familiar phenomenon of long function bodies getting progressively more
indented as they go along. That is, where before you had:
let x = 5 in
  printf "%d\n" x
  let y = x+1 in
    printf "%d\n" y
You'd now just write:
let x = 5
printf "%d\n" x
let y = x+1
printf "%d\n" y
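For comparison, here is a minimal sketch of what the same sequence looks like in plain OCaml, with the explicit "in" and ";" that ocaml+twt lets you omit. The function name render and the use of a Buffer (so the output can be inspected) are illustrative assumptions, not part of ocaml+twt.

```ocaml
(* Plain OCaml version of the example: each let needs "in", and
   sequenced expressions need ";". *)
let render (buf : Buffer.t) : unit =
  let x = 5 in
  Printf.bprintf buf "%d\n" x;
  let y = x + 1 in
  Printf.bprintf buf "%d\n" y

let () =
  let buf = Buffer.create 16 in
  render buf;
  print_string (Buffer.contents buf)
```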
I was hesitant to introduce this feature because it's extra hackish in
implementation (even moreso than the rest of this house of cards). It also
removes some programmer freedom, because you cannot have the let body on the
same line as the let, and you cannot have a statement sequentially following
the let, outside the scope of the binding. But after playing with it, I
think it is worthwhile. Please let me know what you think. I am still not
completely sure that I haven't broken something profound that will force me
to totally backtrack this change, but let's give it a try. I will obviously
keep the 0.8x versions around for those who prefer them and for existing code
(including a lot of my own).
Standard disclaimer: ocaml+twt is a flagrant, stupendous,
borderline-ridiculous hack, but it works quite well, I write all my new code
using it, and I recommend it if you like this style. On the other hand, if
someone with more free time and knowledge of camlp4 wants to step up, I have
a couple ideas about how you might do it right...
Mike
I get a segmentation fault when marshalling
a large data structure. I could produce a file
of ~30MB, but for a larger data structure of
the same kind, I get a seg fault.
Do you know of any limit in the marshalling
functions w.r.t. size ?
Some parts of my data structure are big doubly linked
graphs.
---
Sébastien Ferré
Indeed, the marshalling/unmarshalling functions can overflow the
execution stack. You could try to increase maximum stack size for your
process (ulimit -s with a Unix shell).
--
Olivier
This comes from the fact that you go through marshalling, that is, you
turn your data into a character string. Strings have a size limit (see
the Sys module for the exact value), hence the seg fault.
In my opinion, try writing your value directly to the file with
output_value, or use "ocaml xml" to read/write the data in XML format
(it is much slower, but it will certainly get past the 30 MB
limit).
Regards,
Frédéric Gava
Sebastien Ferre wrote:
Regards,
Sebastien
> And yet I do go through a call to output_value
> on a file, without using an intermediate string.
Maybe output_value uses a string internally. Try with a bytecode
version of your executable, an exception should be raised (or have a
look at the implementation of output_value).
Best,
Daniel
output_value doesn't use a string internally, it uses malloc. Anyway,
if the marshalling function runs out of memory (whether because malloc
returns NULL or because the caml string is too large), an
Out_of_memory exception is raised.
If it segfaults, that's most probably because the marshalling runs out
of executable stack (because of too much recursion). I've seen it do
this before. The "fix" is to increase the maximum size of the
executable stack.
The behavior is the same with bytecode or native code since it's not
the interpreter's stack that overflows, it's the C one.
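As a point of reference for the discussion above, here is a minimal sketch of writing a cyclic, doubly linked structure straight to a file with output_value and reading it back with input_value. The node type and the helpers make_ring and roundtrip are hypothetical names chosen for illustration; the sketch shows only the API usage and says nothing about how much C stack a given structure needs.

```ocaml
(* Hypothetical doubly linked node, standing in for the poster's graphs. *)
type node = {
  id : int;
  mutable prev : node option;
  mutable next : node option;
}

(* Build a ring of n >= 1 nodes, each linked to both neighbours. *)
let make_ring (n : int) : node =
  let nodes = Array.init n (fun id -> { id; prev = None; next = None }) in
  Array.iteri
    (fun i nd ->
      nd.next <- Some nodes.((i + 1) mod n);
      nd.prev <- Some nodes.((i + n - 1) mod n))
    nodes;
  nodes.(0)

(* Marshal directly to a channel (no intermediate OCaml string in user
   code), then read the value back from the same temporary file. *)
let roundtrip (v : node) : node =
  let path = Filename.temp_file "ring" ".bin" in
  let oc = open_out_bin path in
  output_value oc v;
  close_out oc;
  let ic = open_in_bin path in
  let (v' : node) = input_value ic in
  close_in ic;
  Sys.remove path;
  v'
```

Sharing and cycles are preserved by default, so the value read back is again a ring rather than an infinite unfolding.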
>> And yet I do go through a call to output_value
>> on a file, without using an intermediate string.
>
> Maybe output_value uses a string internally. Try with a bytecode
> version of your executable, an exception should be raised (or have a
> look at the implementation of output_value).
I used a bytecode version.
I checked the code of output_value, and it uses an internal
string. So it won't work.
Anyway, I knew I would have to go for a more serious
solution as soon as the data gets really large. I am thinking of
using something like GDBM.
Thanks for the help.
Sebastien
I think the question is more along the lines "byte code threads" vs. native
(e.g. POSIX) threads rather than "byte vs. native code". It's true that
byte code threads, which can naturally only be used with byte code, require
an intermediate copy step to OCaml-strings if you want to write to
channels. That's bad on 32-bit platforms due to the size limitations on
strings (< 16MB).
I'd recommend using Bigarrays of characters to marshal out data in cases
where OCaml-strings don't suffice. The code for this is extremely simple:
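#include <caml/mlvalues.h>
#include <caml/bigarray.h>
#include <caml/intext.h>  /* caml_output_value_to_malloc, caml_output_value_to_block */

CAMLprim value bigstring_marshal_stub(value v, value v_flags)
{
  char *buf;
  intnat len;
  int alloc_flags = BIGARRAY_UINT8 | BIGARRAY_C_LAYOUT | BIGARRAY_MANAGED;
  /* Marshal v into a malloc'ed buffer; buf and len are set on return. */
  caml_output_value_to_malloc(v, v_flags, &buf, &len);
  /* Wrap the buffer in a 1-dimensional bigarray that owns and frees it. */
  return alloc_bigarray(alloc_flags, 1, buf, &len);
}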
The signature of the OCaml-function is:
external marshal : 'a -> Marshal.extern_flags list -> t =
  "bigstring_marshal_stub"
Where type "t" is a bigarray of characters with C-layout.
You can even do without the intermediate copying if you know the maximum
size of the marshalled data and preallocate a bigarray for that. Use
"caml_output_value_to_block" for that purpose. It's defined in
"byterun/extern.c" of the OCaml-distribution.
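The preallocation idea can also be sketched at the OCaml level with Marshal.to_buffer, which writes into a caller-supplied buffer and raises Failure if the value does not fit. This is only an analogue of the approach: it uses Bytes rather than a bigarray, so it remains subject to the string size limits mentioned above, and the helper name marshal_into and the 1024-byte size are assumptions made for illustration.

```ocaml
(* Marshal v into a preallocated buffer and return the number of bytes
   written. Marshal.to_buffer raises Failure if v does not fit. *)
let marshal_into (buf : Bytes.t) (v : 'a) : int =
  Marshal.to_buffer buf 0 (Bytes.length buf) v []

let () =
  (* Assumed upper bound on the marshalled size for this small value. *)
  let buf = Bytes.create 1024 in
  let written = marshal_into buf [1; 2; 3] in
  (* Read the value back from the same buffer to check the round trip. *)
  let (back : int list) = Marshal.from_bytes buf 0 in
  assert (back = [1; 2; 3]);
  Printf.printf "wrote %d bytes\n" written
```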
Regards,
Markus
--
Markus Mottl http://www.ocaml.info markus...@gmail.com
> If it segfaults, that's most probably because the marshalling runs out
> of executable stack (because of too much recursion). I've seen it do
> this before. The "fix" is to increase the maximum size of the
> executable stack.
Indeed, you're right.
I could solve the problem by using the 'ulimit -s' command.
> The behavior is the same with bytecode or native code since it's not
> the interpreter's stack that overflows, it's the C one.
I didn't know the existence of this C stack.
How can I have an idea of the necessary size ?
Is it related to the depth of data structures to
be marshaled ?
Thanks !
Sébastien
I downloaded the new version a few days ago and immediately fell in love
with the compact syntax. In my opinion it feels much more natural.
I noticed in particular that it took me more effort to convert old
ocaml+twt code (lots of semantically relevant indentation changes) than
it did to convert vanilla ocaml code (essentially s/ *\( in\|;\)$//g
plus some optional parentheses removal).
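As a rough illustration of what that substitution does to a single line, here is a hand-written OCaml sketch that avoids any regex library. The function names has_suffix and strip_in_or_semi are hypothetical; the sketch only mirrors the sed expression (drop a trailing " in" or ";" together with the spaces before it) and does not handle parentheses removal.

```ocaml
(* True if s ends with suffix. *)
let has_suffix ~(suffix : string) (s : string) : bool =
  let ls = String.length s and lx = String.length suffix in
  ls >= lx && String.sub s (ls - lx) lx = suffix

(* Mirror of s/ *\( in\|;\)$//: remove a trailing " in" or ";" and any
   run of spaces immediately before it; other lines are left unchanged. *)
let strip_in_or_semi (line : string) : string =
  let cut n = String.sub line 0 (String.length line - n) in
  let body =
    if has_suffix ~suffix:" in" line then Some (cut 3)
    else if has_suffix ~suffix:";" line then Some (cut 1)
    else None
  in
  match body with
  | None -> line
  | Some b ->
    (* Drop the spaces that preceded the removed token. *)
    let n = ref (String.length b) in
    while !n > 0 && b.[!n - 1] = ' ' do decr n done;
    String.sub b 0 !n
```

Applied line by line to a vanilla OCaml source file, this would perform the first step of the conversion described above; indentation still has to be checked by hand.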
> I was hesitant to introduce this feature because it's extra hackish in
> implementation (even moreso than the rest of this house of cards). It also
> removes some programmer freedom, because you cannot have the let body on the
> same line as the let, and you cannot have a statement sequentially following
> the let, outside the scope of the binding.
A let body beginning in the first line is no problem if you add an
additional semicolon:
let print x y = print_string x ; (* <-- note the semicolon *)
  print_string " "
  print_string y
print "Hello" "World"
If you need a function in private scope you can easily declare and call
it inside a 'let _ =' block:
let x = 5
printf "%d\n" x
let _ =
  let y = x+1
  printf "%d\n" y
printf "no y here"
I ran into some minor problems due to ocaml+twt not recognizing the
object related syntax. As I personally use it only in rare cases, I
ended up with just putting the critical section in one long line.
I suggest implementing the '#light' pragma (as in F#), which would allow
switching indentation awareness on and off on the fly. This would also
enable me to replace all ocaml compilers by wrappers calling ocaml+twt
implicitly. If you want, I can prepare a little patch.
Thanks for your effort -- keep going on
Ingo
--
Ingo Bormuth, voicebox & fax: +49-(0)-12125-10226517
public key 86326EC9, http://ibormuth.efil.de/contact
You're right. I isolated the problem to the following piece of code:
let _x = ref 0
_x := 1
ocaml+twt complains: 'syntax error at line 2'
I think you should add a '_' to the regular expression for identifiers
in line 218 of ocaml+twt.ml.
Sorry for the false alarm about object orientation (in my code it
had 'val __dbg' inside a class definition).
Anyway I'd regard the #light pragma as very desirable.