[Caml-list] Search for the smallest possible OCaml segfault....

Till Varoquaux

Nov 8, 2007, 9:17:48 AM

to caml-list

I have an open bug in OCaml
(http://caml.inria.fr/mantis/view.php?id=4321) that leads very simply
to a segfault. The bug has been there for more than 4 months and is
still marked as "new". Since it seems to be stalling I thought I might
give it a gentle prod: what is the smallest possible OCaml program you
can come up with that leads to a reproducible segfault without using
the FFI, Obj or Marshal? Here is mine:

Scanf.sscanf "\"%2$c%1$s\"" "%{%c%s%}" (fun f->Printf.printf f 'x' "xy");;

My suggestion would be to disable positional parameters until they are fixed.
Till
P.S.: I am very grateful for the great work of the INRIA team; my
intention is not to criticise it. My impression was that bugs reported
via the mailing list were acknowledged faster, and I'm putting this
theory to the test.
--
http://till-varoquaux.blogspot.com/

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Jean-Christophe Filliâtre

Nov 8, 2007, 9:52:47 AM

to Till Varoquaux, caml-list

Till Varoquaux wrote:
> what is the smallest possible ocaml program you
> can come up with that leads to a reproducible segfault without using
> FFI's Obj or Marshal.

and I guess you mean in both bytecode and nativecode and without
compiling with -unsafe, right?

--
Jean-Christophe Filliâtre
http://www.lri.fr/~filliatr/

Till Varoquaux

Nov 8, 2007, 10:20:58 AM

to Jean-Christophe Filliâtre, caml-list

>
> and I guess you mean in both bytecode and nativecode and without
> compiling with -unsafe, right?
>

Yes, of course -unsafe is out of the picture. Using unsafe functions
(String.unsafe_get etc.) will also disqualify your solution.

And I'll push it even further: using -rectypes counts as a penalty but
crashing the interpreter is a bonus.

Till
P.S. I forgot to give credit in my previous mail: the bug was found
with Stéphane Glondu and Jérôme Vouillon while trying to understand
scanf; however, they should not be held responsible for this post.
Adam Chlipala pointed out this might be the record for a segfault in
OCaml, giving me the idea of launching this contest...

Adrien

Nov 8, 2007, 10:55:50 AM

to Till Varoquaux, Jean-Christophe Filliâtre, caml-list

With the bytecode interpreter, I've been able to make ocaml segfault by
using Graphics.
I don't know how reproducible this is but, iirc, I had to call
Graphics.open_graph "", then close the window with the cross on the window,
and finally call Graphics.open_graph "" again (this may have to be done
twice, with a Graphics.close_graph() in between).
I encountered this on ocaml-3.10.0 under Linux. I did not try to reproduce
it as it wasn't my goal and right now I don't have ocaml available. Anyway,
this or something very similar would raise a fatal I/O error which,
surprisingly, was sometimes fatal to the bytecode interpreter but not always.

Moral: use close_graph() ;)

---

Adrien Nader

Alain Frisch

Nov 8, 2007, 11:06:20 AM

to Till Varoquaux, caml-list

Till Varoquaux wrote:
> I have a open bug in ocaml
> (http://caml.inria.fr/mantis/view.php?id=4321) that leads very simply
> to a segfault. The bug has been there for more than 4 months and is
> still marked as "new". Since it seems to be stalling I thought I might
> give it a gentle prod: what is the smallest possible ocaml program you
> can come up with that leads to a reproducible segfault without using
> FFI's Obj or Marshal. Here is mine:
>
> Scanf.sscanf "\"%2$c%1$s\"" "%{%c%s%}" (fun f->Printf.printf f 'x' "xy");;

Till, what a childish attitude ;-)

The following is certainly not the smallest, but it uses only the
Pervasives module, so maybe it is cute enough to qualify for the Jury's
prize.

a.mli = sub/a.mli: type t val x: t val f: t -> unit
a.ml: type t = int let x,f = 0,print_int
sub/a.ml: type t = string let x,f = "",print_string
b.ml: let r = A.x
c.ml: A.f B.r;;

To be compiled with:

ocamlc -o main a.mli a.ml b.ml sub/a.mli sub/a.ml c.ml

-- Alain

Jeremy Yallop

Nov 8, 2007, 11:08:44 AM

to Till Varoquaux, caml-list

Till Varoquaux wrote:
> I have a open bug in ocaml
> (http://caml.inria.fr/mantis/view.php?id=4321) that leads very simply
> to a segfault. The bug has been there for more than 4 months and is
> still marked as "new". Since it seems to be stalling I thought I might
> give it a gentle prod: what is the smallest possible ocaml program you
> can come up with that leads to a reproducible segfault without using
> FFI's Obj or Marshal. Here is mine:
>
> Scanf.sscanf "\"%2$c%1$s\"" "%{%c%s%}" (fun f->Printf.printf f 'x' "xy");;

I've already reported this (on the mailing list) and it's probably been
fixed by now, but in OCaml 3.10.0:

!((object val virtual x:'a method x=x end)#x)

Jeremy.

Jeremy Yallop

Nov 8, 2007, 11:12:35 AM

to Till Varoquaux, caml-list

Jeremy Yallop wrote:
> Till Varoquaux wrote:
>> I have a open bug in ocaml
>> (http://caml.inria.fr/mantis/view.php?id=4321) that leads very simply
>> to a segfault. The bug has been there for more than 4 months and is
>> still marked as "new". Since it seems to be stalling I thought I might
>> give it a gentle prod: what is the smallest possible ocaml program you
>> can come up with that leads to a reproducible segfault without using
>> FFI's Obj or Marshal. Here is mine:
>>
>> Scanf.sscanf "\"%2$c%1$s\"" "%{%c%s%}" (fun f->Printf.printf f 'x'
>> "xy");;
>
> I've already reported this (on the mailing list) and it's probably been
> fixed by now, but in OCaml 3.10.0:
>
> !((object val virtual x:'a method x=x end)#x)

I made it shorter:

!((object val virtual x:_ method x=x end)#x)

Till Varoquaux

Nov 8, 2007, 11:17:54 AM

to Jeremy Yallop, caml-list

It is fixed now:

> This class should be virtual. The following variables are undefined : x

It would have been a good contender for the shortest bug.
Till

Martin Jambon

Nov 8, 2007, 11:31:40 AM

to Alain Frisch, caml-list

I think that the standard library should provide a Pervasives.segfault
function.

Martin

--
http://wink.com/profile/mjambon
http://martin.jambon.free.fr

Pascal Zimmer

Nov 8, 2007, 12:03:35 PM

to caml-list

What about this one:

Unix.kill 0 11;;

;-)

Pascal

Stefano Zacchiroli

Nov 8, 2007, 12:03:50 PM

to Inria Ocaml Mailing List

On Thu, Nov 08, 2007 at 11:17:08AM -0500, Till Varoquaux wrote:
> > This class should be virtual. The following variables are undefined : x
> It would have been a good contender for the shorter bug.

Not here:

$ cat a.ml

!((object val virtual x:_ method x = x end)#x)

$ ocamlc a.ml
$ ./a.out
Segmentation fault
$ ocamlc -version
3.10.0

Cheers.

--
Stefano Zacchiroli -*- PhD in Computer Science ............... now what?
zack@{cs.unibo.it,debian.org,bononia.it} -%- http://www.bononia.it/zack/
(15:56:48) Zack: e la demo dema ? /\ All one has to do is hit the
(15:57:15) Bac: no, la demo scema \/ right keys at the right time

Till Varoquaux

Nov 8, 2007, 12:10:56 PM

to Inria Ocaml Mailing List

> $ ocamlc -version
> 3.10.0
>
> Cheers.

It's fixed in the development release

$ ocamlc -version
3.10.1+dev0 (2007-05-21)

Till


--
http://till-varoquaux.blogspot.com/

Zheng Li

Nov 8, 2007, 12:11:32 PM

to caml...@inria.fr

"Till Varoquaux" <till.va...@gmail.com> writes:
> I have a open bug in ocaml
> (http://caml.inria.fr/mantis/view.php?id=4321) that leads very simply
> to a segfault. The bug has been there for more than 4 months and is
> still marked as "new". Since it seems to be stalling I thought I might
> give it a gentle prod: what is the smallest possible ocaml program you
> can come up with that leads to a reproducible segfault without using
> FFI's Obj or Marshal. Here is mine:
>
> Scanf.sscanf "\"%2$c%1$s\"" "%{%c%s%}" (fun f->Printf.printf f 'x' "xy");;
>

exception E of [>];;
try raise(E`X) with E`X x-> !x

53 bytes in total, tested with the v3.10 toplevel. If compiled, just
provide it with a 0-byte interface file.

--
Zheng Li
http://www.pps.jussieu.fr/~li

Oliver Bandel

Nov 8, 2007, 12:12:10 PM

to caml-list

Quoting Till Varoquaux <till.va...@gmail.com>:

> I have a open bug in ocaml
> (http://caml.inria.fr/mantis/view.php?id=4321) that leads very simply
> to a segfault. The bug has been there for more than 4 months and is
> still marked as "new". Since it seems to be stalling I thought I might
> give it a gentle prod: what is the smallest possible ocaml program you
> can come up with that leads to a reproducible segfault without using
> FFI's Obj or Marshal. Here is mine:
>
> Scanf.sscanf "\"%2$c%1$s\"" "%{%c%s%}" (fun f->Printf.printf f 'x' "xy");;

[...]

Strange... I tried it in the toplevel, and it crashed. :(

This and the other examples make a book of horror stories... :(

I hope these strange things will be fixed soon.

Ciao,
Oliver

Oliver Bandel

Nov 8, 2007, 12:12:59 PM

to caml-list

Quoting Pascal Zimmer <pzi...@janestcapital.com>:

> What about this one:
>
> Unix.kill 0 11;;

[...]

heheh, good joke :)

segfault humor ;-)

Ciao,
Oliver

Xavier Leroy

Nov 8, 2007, 12:56:24 PM

to Till Varoquaux, caml-list

Guys,

Before posting this kind of message, I'd like you to stop for a
second and think about what you're doing. The answer is: a disservice
to the Caml community. (This also applies to most of Skaller's rants
and some of Harrop's marketing, by the way.)

Yes, OCaml has bugs, like all software of this complexity -- and
probably a lot less than your own software. Be supportive and
cooperative: post bug reports with repro cases where they belong, on
the bug tracking system, and let us developers handle them the way we
see fit.

That kind of snickering post is neither helpful nor supportive. What
benefits do you expect from trashing Caml in public? Especially since
most of you never paid a cent for it, and some of you make a living
out of it.

> My impression was that bug reported via the mail ling list were
> acknowledged faster, I'm putting this theory to the test.

You've just demonstrated that this practice is very effective at
pissing me off. Are you satisfied?

- Xavier Leroy

Tom Primožič

Nov 8, 2007, 1:12:12 PM

to Xavier Leroy, caml-list

Every time I see a mail from Xavier, I am surprised - they are so
incredibly rare, and usually only in very long and "important" threads.

However, this one beats all the previous ones!

Hail to the King, Baby!

- Tom

Robert Fischer

Nov 8, 2007, 1:24:20 PM

to Tom Primožič, caml-list, Xavier Leroy

Martin Jambon

Nov 8, 2007, 1:26:15 PM

to caml-list

On Thu, 8 Nov 2007, Martin Jambon wrote:

> I think that the standard library should provide a Pervasives.segfault
> function.

Of course, this is a joke.
I don't want to advertise against OCaml since I'm making a living out of
it like many of us on this list.

The meaning is really that in OCaml it is simply impossible to
get segmentation faults.

My joke was about suggesting a way of making it much easier for
beginners to trigger segmentation faults because they may miss them
sometimes if they are used to other inferior tools.

Till Varoquaux

Nov 8, 2007, 1:34:10 PM

to Xavier Leroy, caml-list

>
> > My impression was that bug reported via the mail ling list were
> > acknowledged faster, I'm putting this theory to the test.
>
> You've just demonstrated out that this practice is very effective at
> pissing me off. Are you satisfied?
>
> - Xavier Leroy

I owe you a public apology: this was very childish behavior.

My aim was not to discredit the language or the implementation. I
feel blessed to be able to use it in my day-to-day job and I think we
all agree it is miles ahead of other "professional" languages and more
practical than most pure research languages.

Triggering segfaults without resorting to unsafe techniques is very
hard; otherwise my previous post would have been pointless, so I guess
you could see this as praise in disguise.

Till

Basile STARYNKEVITCH

Nov 8, 2007, 2:02:10 PM

to Robert Fischer, caml-list, Xavier Leroy

Robert Fischer wrote:
> Now I'm imagining Xavier fighting zombies with a chainsaw hand.

I really think that the OCaml community owes a great deal to Xavier Leroy,
so I really suggest that we all stop this rant.

And I also know that OCaml is much cleaner and more robust (i.e. less
buggy) than a lot of other compilers (either commercial or open source).

Finally, I want to publicly congratulate Xavier Leroy on this list for
his Michel Monpetit prize
http://www-c.inria.fr/Internet/scientific-research/researchers-news/prizes-and-distinctions-1/prizes-and-distinctions-1/inria-receives-awards-for-the-quality-of-its-research

I still hope that Xavier Leroy won't be pissed off for too long.

And I had the honor of working for one year in Xavier Leroy's team. It was my
best professional year, and I greatly appreciated Xavier's personal
qualities (in addition to his scientific ones).

--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***

Oliver Bandel

Nov 8, 2007, 2:07:12 PM

to caml-list

Hello Xavier,

Quoting Xavier Leroy <Xavier...@inria.fr>:

> Guys,
>
> Before posting this kind of messages, I'd like you to stop for a
> second and think about what you're doing. The answer is: a disservice
> to the Caml community.

[...]

I hope that nobody intended this.
And I doubt that people wanted to do the OCaml community a disservice.

For me OCaml is the best language I've used.
I think most people here will agree.

Segfaults in OCaml are rare, but nevertheless
those rarely seen segfaults should be fixed.

The original poster stated that the bug he
posted had been in status "new" for four months.

This was a little bit astonishing, and possibly
the reason why this thread was started.

[...]

> Yes, OCaml has bugs, like all software of this complexity -- and
> probably a lot less than your own software. Be supportive and
> cooperative: post bug reports with repro cases where they belong, on
> the bug tracking system, and let us developers handle them the way we
> see fit.

[...]

Yes, I agree here.

But it's not so easy to find a real bug in OCaml,
so people might not be motivated to create a login account
on the bug tracker and remember another password for
something that is seldom in use. ;-)

>
> That kind of snickering posts is neither helpful nor supportive.

I hope these bugs will all be reported.
And I also hope they will then be fixed.

I hope that you don't get any trouble from these posts.

For my part, I have only stated that I hope the bugs will be fixed.

The thing that I laughed about really was a joke:
sending SIGSEGV to PID 0 delivers the signal to every process in the
sender's process group, including the program itself, so the program
has to abort with a SIGSEGV.

So, this was NOT an OCaml bug, and it was NOT a bug at all.
So, I think laughing about that is NOT laughing about OCaml.

I hope you can see this thread with humor, if not today,
maybe later.

There is no reason for you to be bothered here.
Your work is undisputed.

Best Regards,
Oliver Bandel

Pierre Weis

Nov 9, 2007, 1:10:15 PM

to Oliver Bandel, caml-list

Hello world,

> Segfaults in OCaml are seldom, but nevertheless
> those seldom seen segfaults should be fixed.
>
> The original poster stated out that the bug he
> posted was four months on status "new".
>
> This was a littlebid astonishing, and possibly
> the reason why this thread was started.

I think this is my fault: I implemented Scanf in the first place.

Maybe some of you missed the point in the segfault example involving Scanf
that was given on this list: the example involves using positional parameters
in the string argument passed to sscanf and using meta format specifications
in the format string argument.

Positional parameters:
----------------------

Positional parameters are parameter number specifications that allow format
strings to refer to a parameter other than the next one in presentation
order. This is supposed to be useful for internationalization, where you can
change the printing order of the parameters to reflect the translation.

This new feature was only introduced in the documentation for Printf
just after the successful correction of the long-standing strange behaviour
of printf with respect to partial evaluation. Positional parameters have
never been mentioned in the Scanf documentation and Scanf is not supposed
to support them.

To say the least, this feature is still experimental and not yet completely
implemented in the current sources of the compiler. In fact, the typechecker
abruptly rejects format strings with positional parameters, as exemplified
here:

# Printf.printf "%2$i %1$s" "toto" 1;;
Bad conversion %$, at char number 0 in format string ``%2$i %1$s''

As mentioned above, Scanf is not supposed to handle positional parameters,
and indeed rejects them at runtime in the format string (this could seem
overkill, given that the type-checker rejects positional parameters in the
first place, but well, 2 checks are better than one!).

Meta format specifications:
---------------------------

However, Scanf is still capable of reading a format string lexeme given in the
input, provided the format used to read this lexeme involves a meta format
that properly describes the format string lexeme to be read. For instance,
Scanf.sscanf s "%{ %i %f %}" is supposed to read from the input (the string s)
a format string that specifies reading first an integer, then a floating point
number.

A more practical working example could be:

# let fmt =
Scanf.sscanf "\"Reference: %i Price : %f\"" "%{%i%f%}"
(fun fmt -> fmt);;
val fmt : (int -> float -> '_a, '_b, '_c, '_d, '_d, '_a) format6 = <abstr>
# string_of_format fmt;;
- : string = "Reference: %i Price : %f"

This feature allows a procedure to read the format string it has to use to
read a file: the procedure just reads the format as the first line of the
file.
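
A minimal sketch of that idea, not the author's code: the standard library function Scanf.format_from_string performs the same kind of run-time conversion of a string into a typed format, so a reader can take its format from the first line of its input. The helper name parse_with_header and the sample data are hypothetical, and the sketch assumes an OCaml release whose Scanf provides format_from_string.

```ocaml
(* Hypothetical helper: the first line carries the format string,
   the remaining lines carry the data to be read with it. *)
let parse_with_header (lines : string list) : (int * float) list =
  match lines with
  | [] -> []
  | header :: rows ->
    (* Checked at run time: [header] must have the same type as "%i%f",
       otherwise Scanf.Scan_failure is raised. *)
    let fmt = Scanf.format_from_string header "%i%f" in
    List.map (fun row -> Scanf.sscanf row fmt (fun r p -> (r, p))) rows

let records =
  parse_with_header
    [ "Reference: %i Price: %f";
      "Reference: 1 Price: 2.5";
      "Reference: 7 Price: 0.5" ]
```

If the header does not have the type of "%i%f", format_from_string raises Scanf.Scan_failure rather than returning a mistyped format.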

The example seg-fault analysis:
-------------------------------

So, the seg-faulting example given is quite involved, since it uses scanf's
capability to read a format string in the input, in order to create a format
string with positional specification. This was clearly unexpected, given the
``axiom'' that says "the type checker will prevent that in the first place
since it rejects any positional parameter!". In this case, the typechecker
cannot reject the format string given in the program, since it has no
positional specification; and it cannot reject the format read, since this
format is unknown at compile time!

This simply means that there is a bug in the type compatibility runtime test
for format strings that fails to properly reject positional parameters. This
is not difficult to correct, though not at all satisfactory.
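
The run-time compatibility test can be observed from user code. Here is a small sketch, assuming a Scanf that provides format_from_string; the predicate name has_char_string_type is hypothetical. It makes no claim about how any particular release treats positional parameters; it only contrasts a candidate of the expected type with one of a different type.

```ocaml
(* Hypothetical predicate: does the string [s], seen as a format string,
   have the same type as the format "%c%s"? *)
let has_char_string_type (s : string) : bool =
  try
    ignore (Scanf.format_from_string s "%c%s");
    true
  with Scanf.Scan_failure _ -> false

let ok = has_char_string_type "<%c|%s>"   (* same conversions, extra text *)
let bad = has_char_string_type "%i"       (* different conversions *)
```

A check of this kind is the only barrier between a string read at run time and the typed world, which is why a gap in it can defeat type safety.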

The corrections or problem suppression:
---------------------------------------

I once thought that introducing positional parameters in Scanf, Printf, and
Format would be a piece of cake in comparison to correcting the hard bug of
printf's treatment of partial evaluation (this ``misfeature'' stood there for
more than 10 years before a correction could be figured out). Unfortunately, I
was wrong: positional parameters are not at all easy, even now that the printf
behaviour is corrected. Admittedly, the runtime implementation for positional
parameters was not too hard for Printf and it was done quickly; on the
other hand, the type checking of format strings with positional parameters
proved to be nontrivial: you need a deep breath to dig into the old code, you
need once more to understand it in depth, and basically you must rewrite it with
a new logic that supports positional parameters. This has not yet been done,
unfortunately.

In conclusion, I have to correct the runtime compatibility check for formats
to suppress the problem, and remove any mention of positional parameters from
the documentation, until I achieve the new type checking stuff for format
strings.

Or just give up on this complex feature that has already upset too many people.

I agree with anyone who cares that I was wrong to introduce a not yet
properly baked feature in Caml. The natural over-optimistic tendency of my
researcher's enthusiasm caught me there ...

[...]

> I for myself have only stated that I hope the bugs will be fixed.

I hope they get fixed as well. Sometimes it's difficult. Sometimes we lack
the time and concentration to find the solution. Even worse, sometimes we
do not find the solution for years...

Best regards,

PS: You can check in the bug tracking system that there is more than one
message concerning positional parameters in format strings, and that I
have already answered a lot of them.

--
Pierre Weis

INRIA Rocquencourt, http://bat8.inria.fr/~weis/

Bünzli Daniel

Nov 10, 2007, 9:33:01 AM

to caml-list List

On 9 Nov 2007, at 19:09, Pierre Weis wrote:

> In conclusion, I have to correct the runtime compatibility check for
> formats
> to suppress the problem, and remove any mention of positional
> parameters from
> the documentation, until I achieve the new type checking stuff for
> format
> strings.

A question I have is why caml's formatting libraries were not
deprecated in favor of an implementation using Danvy's functional
unparsing [1]. This approach doesn't require an extension to the type
system and if I read correctly these results [2] it seems at least as
efficient as the current implementation. Scanf seems also doable [3].
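
For readers unfamiliar with the technique, here is a minimal sketch of functional unparsing in the style of [1]. The combinator names follow those used elsewhere in this thread (lit, int, str, $, format), but the definitions are illustrative assumptions, not the code of any existing library. Each directive takes a continuation and the string accumulated so far, so the type of the whole format is built from ordinary polymorphic functions.

```ocaml
(* Each directive receives the continuation [k] and the output built so far. *)
let lit (s : string) k acc = k (acc ^ s)               (* literal text *)
let int k acc (n : int) = k (acc ^ string_of_int n)    (* consumes an int *)
let str k acc (s : string) = k (acc ^ s)               (* consumes a string *)

(* Sequencing two directives is function composition on continuations. *)
let ( $ ) d1 d2 k = d1 (d2 k)

(* Run a directive with the identity continuation and an empty accumulator. *)
let format d = d (fun acc -> acc) ""

let greeting = format (lit "x = " $ int $ lit ", name = " $ str) 42 "foo"
```

Here greeting evaluates to "x = 42, name = foo", and passing a string where the int is expected is rejected by the ordinary type checker, with no special treatment of format strings.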

The less complexity there is in the type system the safer we are in
the end.

Best,

Daniel

[1] http://www.brics.dk/RS/98/12/
[2] http://tkb.mpl.com/~tkb/software/misc/cpsio-test.pdf
[3] http://caml.inria.fr/pub/ml-archives/caml-list/2002/04/156ee5ae044ee4ff06ff988b384be6c2.fr.html

Jon Harrop

Nov 10, 2007, 10:07:53 AM

to caml...@yquem.inria.fr

On Saturday 10 November 2007 14:32, Bünzli Daniel wrote:
> A question I have is why caml's formatting libraries were not
> deprecated in favor of an implementation using Danvy's functional
> unparsing [1]. This approach doesn't require an extension to the type
> system and if I read correctly these results [2] it seems at least as
> efficient as the current implementation. Scanf seems also doable [3].

Functional unparsing requires a lot more code, produces worse error messages,
is much harder to learn, is incompatible with the excellent Format module,
and the number of OCaml programs performance bound by these advanced printf
constructs is negligible.

I'd much rather see effort put into visualization and GUI tools rather than
ASCII text tools...

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e

Bünzli Daniel

Nov 10, 2007, 10:49:13 AM

to caml-list caml-list

On 10 Nov 2007, at 15:58, Jon Harrop wrote:

> Functional unparsing requires a lot more code,

It's a little bit less concise but I wouldn't say it is a *lot*.

> produces worse error messages,

Example please.

> is much harder to learn,

I don't think so. There is really nothing hard in it, it is just ...
different.

> is incompatible with the excellent Format module,

Which wouldn't prevent the design of a new format module.

> I'd much rather see effort put into visualization and GUI tools
> rather than
> ASCII text tools...

I'd rather have a simple and correct type system.

Daniel

Jon Harrop

Nov 10, 2007, 2:22:56 PM

to caml...@yquem.inria.fr

On Saturday 10 November 2007 15:43, Bünzli Daniel wrote:
> On 10 Nov 2007, at 15:58, Jon Harrop wrote:

> > Functional unparsing requires a lot more code,
>
> It's a little bit less concise but I wouldn't say it is a *lot*.

printf "%25s %d\n%25s %g\n%25s %s\n\n%25s ]%*d[\n%25s ]%*d[\n\
%25s ]%*g[\n%25s ]%*g[\n%25s ]%*s[\n%25s ]%*s[\n"
"int:" 10 "float:" 1.234 "string:" "foo"
"int with width:" 20 24 "int with -width:" (-20) 42
"float with width:" 20 1.234 "float with -width:" (-20) 567.8
"string with width:" 20 "Hello"
"string with -width:" (-20) "Goodbye"

vs:

print_string
(format
(wlit "int:" 25 $ lit " " $ int $ nl $
wlit "float:" 25 $ lit " " $ flt $ nl $
wlit "string:" 25 $ lit " " $ str $ nnl 2 $
wlit "int with width:" 25 $ lit " ]" $ intw $ lit "[" $ nl $
wlit "int with -width:" 25 $ lit " ]" $ intw $ lit "[" $ nl $
wlit "float with width:" 25 $ lit " ]" $ fltw $ lit "[" $ nl $
wlit "float with -width:" 25 $ lit " ]" $ fltw $ lit "[" $ nl $
wlit "string with width:" 25 $ lit " ]" $ strw $ lit "[" $ nl $
wlit "string with -width:" 25 $ lit " ]" $ strw $ lit "[" $ nl )
()
10 1.234 "foo" (* int, float, string *)
24 20 42 (-20) (* int with width, -width spec *)
1.234 20 567.8 (-20) (* float with width, -width spec *)
"Hello" 20 (* string with width spec *)
"Goodbye" (-20) (* string with -width spec *)
)

> > produces worse error messages,
>
> Example please.

Oops, I forgot one of the 36 superfluous "$" operators. OCaml now fails to
catch my trivial (but likely) mistake and now my program produces incorrect
output and the patient dies on the table whilst gurgling out "should've spent
less time proving correctness and more time testing":

# let test1 () =
print_string
(format
(wlit "int:" 25 $ lit " " $ int $ nl $
wlit "float:" 25 $ lit " " $ flt $ nl $
wlit "string:" 25 $ lit " " $ str $ nnl 2 $
wlit "int with width:" 25 $ lit " ]" $ intw $ lit "[" $ nl
wlit "int with -width:" 25 $ lit " ]" $ intw $ lit "[" $ nl $
wlit "float with width:" 25 $ lit " ]" $ fltw $ lit "[" $ nl $
wlit "float with -width:" 25 $ lit " ]" $ fltw $ lit "[" $ nl $
wlit "string with width:" 25 $ lit " ]" $ strw $ lit "[" $ nl $
wlit "string with -width:" 25 $ lit " ]" $ strw $ lit "[" $ nl )
()
10 1.234 "foo" (* int, float, string *)
24 20 42 (-20) (* int with width, -width spec *)
1.234 20 567.8 (-20) (* float with width, -width spec *)
"Hello" 20 (* string with width spec *)
"Goodbye" (-20) (* string with -width spec *)
);;
val test1 : unit -> unit = <fun>
# test1();;
int: 10
float: 1.234
string: foo

int with width: ] 24[ int with -width:
]42 [
float with width: ] 1.234[
float with -width: ]567.8 [
string with width: ] Hello[
string with -width: ]Goodbye [
- : unit = ()
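
Why the missing $ still type checks can be reproduced with a four-line model of the combinators. The definitions below are illustrative assumptions, not the library used above: each directive takes a continuation and the string accumulated so far, so applying nl to another directive and that directive's argument is itself well typed and simply moves the newline.

```ocaml
let lit (s : string) k acc = k (acc ^ s)   (* literal text *)
let nl k acc = k (acc ^ "\n")              (* newline *)
let ( $ ) d1 d2 k = d1 (d2 k)              (* sequencing *)
let format d = d (fun acc -> acc) ""

(* Intended: "a", newline, "b". *)
let intended = format (lit "a" $ nl $ lit "b")

(* One $ forgotten: [nl lit "b"] is [nl] applied to the continuation [lit]
   and the accumulator "b", which reduces to [lit "b\n"]. *)
let mistaken = format (lit "a" $ nl lit "b")
```

In this model intended is "a\nb" while mistaken is "ab\n": the program compiles either way, and only the output reveals the slip.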

> > is much harder to learn,
>
> I don't think so. There really nothing hard in it, it is just ...
> different.

Exactly: how many people have heard of printf and how many have heard of
continuation passing style, let alone functional unparsers?

> > is incompatible with the excellent Format module,
>
> Which wouldn't prevent the design of a new format module.

Sure. But we're an extremely finite-sized community and need to prioritize
where we put our efforts. My vote simply goes to putting effort elsewhere.

> > I'd much rather see effort put into visualization and GUI tools
> > rather than
> > ASCII text tools...
>
> I'd rather have a simple and correct type system.

I'd rather have feature-complete software that works.

In practice, a language implementation will only become robust if it has both
a solid theoretical foundation and a significant user base to test the
implementation. As Knuth said: "Beware of bugs in the above code; I have only
proved it correct, not tried it".

With the benefit of hindsight, Standard ML put too much emphasis on
theoretical correctness and not enough on practical utility. Consequently,
Standard ML does not enjoy OCaml's popularity, has fewer libraries, no
competitively performant compilers and so on. I would not like to see OCaml
take that route.

In this context, I would have thought that printf is very commonly used but
scanf is not. If scanf is difficult to fix then these features could simply
be removed from it.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e

Pierre Weis

Nov 13, 2007, 3:53:57 AM

to Bünzli Daniel, caml-list List

[...]

> A question I have is why caml's formatting libraries were not
> deprecated in favor of an implementation using Danvy's functional
> unparsing [1]. This approach doesn't require an extension to the type
> system and if I read correctly these results [2] it seems at least as
> efficient as the current implementation. Scanf seems also doable [3].

If this functional unparsing can be rendered fully backward compatible with
the existing printf and scanf related Caml users' code base, then we can give
it a try. Otherwise, we would have to maintain two libraries in parallel...

> The less complexity there is in the type system the safer we are in
> the end.

The type system is not modified. We just have to add a new basic
(polymorphic) type constant to the set of basic types; and the corresponding
constant values are then type checked using the plain old polymorphic type
algebra.

The problem we have with the dynamic format string compatibility check is
similar to any other bug in any other basic Caml primitive: a bug in those
primitives can be fatal to type safety. (The bug in the compatibility
check has similar consequences to those of a bug in the implementation of the
int_of_string primitive, if int_of_string were erroneously returning a float
value in some rare cases, while still keeping the regular string -> int type
scheme.)

All the best,

--
Pierre Weis

INRIA Rocquencourt, http://bat8.inria.fr/~weis/

Pierre Weis

Nov 13, 2007, 4:13:59 AM

to Bünzli Daniel, caml-list caml-list

> I'd rather have a simple and correct type system.

You have a correct type system. Admittedly, its simplicity can be discussed,
if we consider the many features added to the language that really impact the
type algebra.

The format strings feature is not from this family of new and deep
modifications of the type system: it neither impacts the type algebra nor is a
late addition (it was introduced more than 10 years ago). Believe me, it is
in essence rock solid and fully type safe.

On the other hand, yes, there is still some work to do to fully support the
new additional feature of positional parameters. I agree, this is not
easy. But no, the format strings are not essentially flawed by the bug
reported here: the new feature implementation will be corrected or the new
feature will be removed.

Best regards,

--
Pierre Weis

INRIA Rocquencourt, http://bat8.inria.fr/~weis/

Pierre Weis

Nov 13, 2007, 4:22:38 AM

to Jon Harrop, caml...@yquem.inria.fr

[...]

> In this context, I would have thought that printf is very commonly used but
> scanf is not. If scanf is difficult to fix then these features could simply
> be removed from it.

Maybe I was not clear in my previous messages: scanf is not difficult to fix
since these features have never been introduced in scanf (hence removing them
is particularly easy).

The problem is not in scanf but in the primitive that checks the format
strings compatibility.

Best regards,

--
Pierre Weis

INRIA Rocquencourt, http://bat8.inria.fr/~weis/
