[Caml-list] Scripting in ocaml

Denis Bueno

Dec 20, 2006, 10:43:31 PM
to OCaml Mailing List
I've been writing bash scripts to perform various build- and
development-related tasks, and I don't enjoy it. I won't bore you with
detailed reasons why. The upshot is that I'd like to script in OCaml.

I have considered writing a few camlp4 extensions to make it easier to
write scripts:

1) create a syntax which grabs environment variables:

e.g. $FOO would grab the value of the environment variable FOO

2) some sort of more convenient process interaction, e.g., for piping.

(1) seems pretty straightforward, though I haven't found the time to
implement it yet. (2) is not as clear to me, but I'll think about it
and probably look at scsh.
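
For reference, (1) can be approximated in plain OCaml without any syntax extension. A minimal sketch, assuming a hypothetical helper named `env` (not part of any library) built on the standard `Sys.getenv`:

```ocaml
(* Hypothetical helper: read an environment variable, falling back to
   [default] when it is not set. Sys.getenv raises Not_found in that case. *)
let env ?(default = "") name =
  try Sys.getenv name with Not_found -> default

let () =
  (* Rough equivalent of writing $HOME or $FOO in the proposed syntax. *)
  Printf.printf "HOME=%s\n" (env "HOME");
  Printf.printf "FOO=%s\n" (env ~default:"<unset>" "FOO")
```

A camlp4 extension would then only need to rewrite `$FOO` into a call such as `env "FOO"`.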

I googled a bit but couldn't find anything related to this. Has
anyone done, or started doing, anything like this?

-Denis

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Erik de Castro Lopo

Dec 20, 2006, 11:36:34 PM
to caml...@inria.fr
Denis Bueno wrote:

> I've been writing bash scripts to perform various build- and
> development-related tasks, and I don't enjoy it. I won't bore you with
> detailed reasons why. The upshot is that I'd like to script in OCaml.

Makes a lot of sense. I used to do scripting style tasks in Python
but nowadays I prefer to use Ocaml.

Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"We are shut up in schools and college recitation rooms for ten or
fifteen years, and come out at last with a belly full of words and
do not know a thing." -- Ralph Waldo Emerson

skaller

Dec 21, 2006, 2:24:54 AM
to Erik de Castro Lopo
On Thu, 2006-12-21 at 15:34 +1100, Erik de Castro Lopo wrote:
> Denis Bueno wrote:
>
> > I've been writing bash scripts to perform various build- and
> > development-related tasks, and I don't enjoy it. I won't bore you with
> > detailed reasons why. The upshot is that I'd like to script in OCaml.
>
> Makes a lot of sense. I used to do scripting style tasks in Python
> but nowadays I prefer to use Ocaml.

As one of the authors of several major pieces of Python,
my comment is that whilst it provides great convenience
and good code structure .. dynamic typing just doesn't
scale.

The big problem using Ocaml (bytecode) for scripting
is probably the ugly dynamic loading support, and
for long running systems, the lack of unloading support.
(plus the syntax which isn't really well suited to small
scale scripting applications).

As an alternative you might consider Neko and/or NekoML.

--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

Till Varoquaux

Dec 21, 2006, 4:14:54 AM
to caml...@inria.fr
You might want to have a look at cash:
http://pauillac.inria.fr/cash/
It aims at being an ocaml equivalent of scsh. I don't know if it is
still maintained.

chamo (one of cameleon's components [http://pauillac.inria.fr/cash/])
is also ocaml scriptable. They included an enhanced toplevel. You
could reuse part of their work.

Till
P.S. sorry john for spamming you...

Chad Perrin

Dec 21, 2006, 4:21:23 AM
to skaller
On Thu, Dec 21, 2006 at 06:22:36PM +1100, skaller wrote:
>
> As one of the authors of several major pieces of Python,
> my comment is that whilst it provides great convenience
> and good code structure .. dynamic typing just doesn't
> scale.
>
> The big problem using Ocaml (bytecode) for scripting
> is probably the ugly dynamic loading support, and
> for long running systems, the lack of unloading support.
> (plus the syntax which isn't really well suited to small
> scale scripting applications).

Actually, I find that static typing is the biggest problem I have with
administration script writing in OCaml. It can be great for bigger
projects where dynamic typing doesn't always scale (as you mentioned),
but for something under thirty lines of code (to pick a number out of
thin air) the syntax requirements imposed by static typing and the
semantic gymnastics that must sometimes be done (string_of_int, et
cetera) can get in the way a bit.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]
Ben Franklin: "As we enjoy great Advantages from the Inventions of
others we should be glad of an Opportunity to serve others by any
Invention of ours, and this we should do freely and generously."

skaller

Dec 21, 2006, 5:31:52 AM
to Chad Perrin
On Thu, 2006-12-21 at 02:18 -0700, Chad Perrin wrote:
> On Thu, Dec 21, 2006 at 06:22:36PM +1100, skaller wrote:

> > The big problem using Ocaml (bytecode) for scripting
> > is probably the ugly dynamic loading support, and
> > for long running systems, the lack of unloading support.
> > (plus the syntax which isn't really well suited to small
> > scale scripting applications).
>
> Actually, I find that static typing is the biggest problem I have with
> administration script writing in OCaml. It can be great for bigger
> projects where dynamic typing doesn't always scale (as you mentioned),
> but for something under thirty lines of code (to pick a number out of
> thin air) the syntax requirements imposed by static typing and the
> semantic gymnastics that must sometimes be done (string_of_int, et
> cetera) can get in the way a bit.

If you examine your
scripting code I'm going to bet you find that most variables
actually have a single statically known type!

What I find useful with Python code is stuff like:

(a) exec/eval of loaded strings into executable code

(b) scope control, turning dictionaries into scopes
to modify the behaviour of script

This is indeed dynamics .. and requires dynamic typing,
but it isn't really that variables have a changing type,
it seems more about the ability of a system to extend
itself dynamically.


--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

Serge Aleynikov

Dec 21, 2006, 8:31:28 AM
to skaller
skaller wrote:
> On Thu, 2006-12-21 at 15:34 +1100, Erik de Castro Lopo wrote:
>> Denis Bueno wrote:
>>
>>> I've been writing bash scripts to perform various build- and
>>> development-related tasks, and I don't enjoy it. I won't bore you with
>>> detailed reasons why. The upshot is that I'd like to script in OCaml.
>> Makes a lot of sense. I used to do scripting style tasks in Python
>> but nowadays I prefer to use Ocaml.
>
> As one of the authors of several major pieces of Python,
> my comment is that whilst it provides great convenience
> and good code structure .. dynamic typing just doesn't
> scale.

Could you please illustrate your point by more concrete reasoning.
Erlang uses dynamic typing (or rather strict typing) and scales very well.

> The big problem using Ocaml (bytecode) for scripting
> is probably the ugly dynamic loading support, and
> for long running systems, the lack of unloading support.
> (plus the syntax which isn't really well suited to small
> scale scripting applications).

Why is dynamic loading ugly? In Erlang, dynamic loading allows
hot-swappable code reloading, which is a very neat feature for
long-running systems.

> As an alternative you might consider Neko and/or NekoML.
>

Regards,

Serge

skaller

Dec 21, 2006, 8:55:51 AM
to Serge Aleynikov
On Thu, 2006-12-21 at 08:30 -0500, Serge Aleynikov wrote:
> skaller wrote:

> > As one of the authors of several major pieces of Python,
> > my comment is that whilst it provides great convenience
> > and good code structure .. dynamic typing just doesn't
> > scale.
>
> Could you please illustrate your point by more concrete reasoning.

It isn't a matter of reasoning but experience. Many people believe
static typing is good because it improves the likelihood a program
is correct, perhaps by making it easier for both the human and
machine to reason about the code -- and in particular catch
errors which would otherwise require testing, and even then
the actual location of a bug might be hard to determine.

I guess the arguments for static typing and scalability are
well known. So the reasoning part is clear: we infer larger
programs without static typing are harder to get right.

That has indeed been my experience. For small programs I can
see all at once on the screen, dynamic typing is ok. For larger
programs, the type system seems to help by abstracting the
code, so even the larger bulk of code can be scanned and
remembered as a whole, if not in detail.

I guess static typing allows some kind of divide-and-conquer
approach.

> Erlang uses dynamic typing (or rather strict typing) and scales very well.

Perhaps that is because it is purely functional?

> > The big problem using Ocaml (bytecode) for scripting
> > is probably the ugly dynamic loading support, and
> > for long running systems, the lack of unloading support.
> > (plus the syntax which isn't really well suited to small
> > scale scripting applications).
>
> Why is dynamic loading ugly?

Sorry, I wasn't clear. I meant to refer to dynamic loading in Ocaml,
not in general. Exactly why that is I don't know. Perhaps the
interaction of the static typing and dynamic loading is not
easy because it is not well understood how to make it so,
in Ocaml, and perhaps in general (without losing type safety)


--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

Serge Aleynikov

Dec 21, 2006, 10:02:34 AM
to skaller
skaller wrote:
> On Thu, 2006-12-21 at 08:30 -0500, Serge Aleynikov wrote:
>> skaller wrote:
>
>>> As one of the authors of several major pieces of Python,
>>> my comment is that whilst it provides great convenience
>>> and good code structure .. dynamic typing just doesn't
>>> scale.
>> Could you please illustrate your point by more concrete reasoning.
>
> It isn't a matter of reasoning but experience. Many people believe
> static typing is good because it improves the likelihood a program
> is correct, perhaps by making it easier for both the human and
> machine to reason about the code -- and in particular catch
> errors which would otherwise require testing, and even then
> the actual location of a bug might be hard to determine.

OCaml has static typing but doesn't provide an actual location of code
in exceptions. Is it because it was hard or intentional?

> I guess the arguments for static typing and scalability are
> well known. So the reasoning part is clear: we infer larger
> programs without static typing are harder to get right.

I agree with your statements as applied to imperative/OO dynamically typed
languages. However, if a dynamic language is functional, this makes it
possible to provide run-time static analysis to identify type-related
bugs and unreachable code. An example of this is the Dialyzer tool for
Erlang.

In this case running such a tool after compilation reveals the same
problems that a compile-time static checker would, but allows for
run-time code reloading, safer term serialization/deserialization,
and safer cross-language integration. Needless to say, these benefits
come at the price of efficiency, but there are many cases where these
features outweigh a reasonable performance loss that can be regained
by parallelizing computations.

> Sorry, I wasn't clear. I meant to refer to dynamic loading in Ocaml,
> not in general. Exactly why that is I don't know. Perhaps the
> interaction of the static typing and dynamic loading is not
> easy because it is not well understood how to make it so,
> in Ocaml, and perhaps in general (without losing type safety)

I believe that introducing strict typing in the language might help in
bridging static typing and dynamic loading. Strict typing would allow
to do run-time type verification and type-specific guard checks.
Though, this would no longer be Ocaml as we know it today. ;-)

Serge

Richard Jones

Dec 21, 2006, 10:06:10 AM
to Denis Bueno
On Wed, Dec 20, 2006 at 10:41:20PM -0500, Denis Bueno wrote:
> I've been writing bash scripts to perform various build- and
> development-related tasks, and I don't enjoy it. I won't bore you with
> detailed reasons why. The upshot is that I'd like to script in OCaml.
>
> I have considered writing a few camlp4 extensions to make it easier to
> write scripts:
>
> 1) create a syntax which grabs environment variables:
>
> e.g. $FOO would grab the value of the environment variable FOO
>
> 2) some sort of more convenient process interaction, e.g., for piping.

I think it's a great idea - I'd love to push OCaml for scripting.
However I hope your camlp4-fu is up to snuff. You'd want, as you say,
a syntax for pipelines and file redirection, but more importantly
you'd want a very simple syntax for running commands. So you can
write some unholy OCaml/sh combination like:

let nr_files = int_of_string ` ls | wc -l `

Actually, even Perl isn't a very usable alternative for shell
scripting because of the amount of code you have to write just to fork
off a command and capture the output.
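
Without new syntax, the backquote example can be approximated with a small helper. A minimal sketch using only the standard library; the names `read_command` and `nr_files` are hypothetical, and redirecting to a temp file is just one way to do it (`Unix.open_process_in` is the usual alternative when the unix library is linked):

```ocaml
(* Hypothetical helper: run [cmd] through the shell and return everything it
   wrote to standard output. The output goes through a temporary file. *)
let read_command cmd =
  let tmp = Filename.temp_file "cmd" ".out" in
  let status =
    Sys.command (Printf.sprintf "( %s ) > %s" cmd (Filename.quote tmp))
  in
  let ic = open_in_bin tmp in
  let out = really_input_string ic (in_channel_length ic) in
  close_in ic;
  Sys.remove tmp;
  if status <> 0 then failwith (Printf.sprintf "%S exited with %d" cmd status);
  out

(* Plain-OCaml counterpart of: int_of_string ` ls | wc -l ` *)
let nr_files () = int_of_string (String.trim (read_command "ls | wc -l"))

let () = Printf.printf "%d files\n" (nr_files ())
```

The `String.trim` call is there because `wc -l` ends its output with a newline and may pad the number with spaces on some systems.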

Rich.

--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Internet Marketing and AdWords courses - http://merjis.com/courses - NEW!
Merjis blog - http://blog.merjis.com - NEW!

Chad Perrin

Dec 21, 2006, 3:23:44 PM
to skaller
On Thu, Dec 21, 2006 at 09:29:15PM +1100, skaller wrote:
> On Thu, 2006-12-21 at 02:18 -0700, Chad Perrin wrote:
> > On Thu, Dec 21, 2006 at 06:22:36PM +1100, skaller wrote:
>
> > > The big problem using Ocaml (bytecode) for scripting
> > > is probably the ugly dynamic loading support, and
> > > for long running systems, the lack of unloading support.
> > > (plus the syntax which isn't really well suited to small
> > > scale scripting applications).
> >
> > Actually, I find that static typing is the biggest problem I have with
> > administration script writing in OCaml. It can be great for bigger
> > projects where dynamic typing doesn't always scale (as you mentioned),
> > but for something under thirty lines of code (to pick a number out of
> > thin air) the syntax requirements imposed by static typing and the
> > semantic gymnastics that must sometimes be done (string_of_int, et
> > cetera) can get in the way a bit.
>
> If you examine your
> scripting code I'm going to bet you find that most variables
> actually have a single statically known type!

Actually, I typically don't even use variables in small OCaml scripts.
I occasionally use "let n =", but don't find need to do the reference
thing very often when the body of code is kept small.

The most common problem I have with static typing getting in the way is
having to decide on either multiple print statements or constructing a
string from several different data types to use a single print
statement. Where something like that takes one short line of code in
the dynamically typed languages I'm used to using, it can be half my
code when doing something similar in an OCaml script.
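
For comparison, the standard Printf module handles the mixed-type case in one line. A small sketch; the functions `describe_manual` and `describe` and their arguments are made up for illustration:

```ocaml
(* Build one line of text from a string, an int and a float. *)

(* Manual conversions and concatenation, the style described above. *)
let describe_manual name size ratio =
  name ^ ": " ^ string_of_int size ^ " bytes, ratio " ^ string_of_float ratio

(* The same idea with a statically checked format string. *)
let describe name size ratio =
  Printf.sprintf "%s: %d bytes, ratio %.2f" name size ratio

let () =
  print_endline (describe_manual "backup.tar" 2048 0.37);
  print_endline (describe "backup.tar" 2048 0.37)
```

The format string is still type checked at compile time, so passing a float where `%d` is expected is rejected by the compiler.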

Plus, y'know, list handling is easier for me in Perl or UCBLogo, with
the former providing better general purpose text processing capability
and the latter having an attractive prefix notation functional syntax
(among other benefits). Neither compiles to persistent executable
native binaries, though, which (combined with its high level syntax and
several other means of running OCaml code) is really my main reason for
liking OCaml.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

print substr("Just another Perl hacker", 0, -2);

Chad Perrin

Dec 21, 2006, 3:28:09 PM
to caml...@yquem.inria.fr
On Thu, Dec 21, 2006 at 09:59:15AM -0500, Serge Aleynikov wrote:
>
> I believe that introducing strict typing in the language might help in
> bridging static typing and dynamic loading. Strict typing would allow
> to do run-time type verification and type-specific guard checks.
> Though, this would no longer be Ocaml as we know it today. ;-)

I think you lost me. What do you mean by "strict typing" such that
OCaml doesn't do it? From what I've seen, OCaml is both statically and
strongly typed.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

"A script is what you give the actors. A program
is what you give the audience." - Larry Wall

Chad Perrin

Dec 21, 2006, 3:29:20 PM
to caml...@yquem.inria.fr
On Thu, Dec 21, 2006 at 02:59:54PM +0000, Richard Jones wrote:
>
> Actually, even Perl isn't a very usable alternative for shell
> scripting because of the amount of code you have to write just to fork
> off a command and capture the output.

That's not my experience. Running an external command from Perl is
easy, and there are a number of ways to do it, to suit pretty much any
purpose.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

unix virus: If you're using a unixlike OS, please forward
this to 20 others and erase your system partition.

Daniel Bünzli

Dec 21, 2006, 3:41:38 PM
to Chad Perrin

On 21 Dec 06, at 21:25, Chad Perrin wrote:

> I think you lost me. What do you mean by "strict typing" such that
> OCaml doesn't do it? From what I've seen, OCaml is both statically
> and
> strongly typed.

What do you mean by strong typing ?

Daniel

Serge Aleynikov

Dec 21, 2006, 4:12:55 PM
to Chad Perrin
Chad Perrin wrote:
> On Thu, Dec 21, 2006 at 09:59:15AM -0500, Serge Aleynikov wrote:
>> I believe that introducing strict typing in the language might help in
>> bridging static typing and dynamic loading. Strict typing would allow
>> to do run-time type verification and type-specific guard checks.
>> Though, this would no longer be Ocaml as we know it today. ;-)
>
> I think you lost me. What do you mean by "strict typing" such that
> OCaml doesn't do it? From what I've seen, OCaml is both statically and
> strongly typed.
>

What I meant by "strict typing" was performing type checks *at runtime*,
i.e. that there are no unsafe operations.

Say, as an example, doing type checks of function arguments at runtime:

let f = fun i when is_integer (i) -> i
| x when is_float (x) -> int_of_float x;;

(raising Bad_match of some sort if neither one of the two guards pass)

Direct application of this could be for making the Marshal/Unmarshal
modules more safe when marshaling data over files/sockets between
applications written in heterogeneous languages.
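
To illustrate the concern: the standard Marshal module stores no type information, so the type claimed on the reading side is trusted rather than checked. A minimal sketch of a correct round trip (the function name `roundtrip` is chosen here for illustration):

```ocaml
(* Serialize an int list and read it back. Nothing in the serialized bytes
   records that the value was an int list. *)
let roundtrip (xs : int list) : int list =
  let bytes = Marshal.to_string xs [] in
  (* The result type is taken on faith. Claiming a different type here
     would still compile and would misbehave at run time. *)
  Marshal.from_string bytes 0

let () = assert (roundtrip [1; 2; 3] = [1; 2; 3])
```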

Serge


--
Serge Aleynikov
Routing R&D, IDT Telecom
Tel: +1 (973) 438-3436
Fax: +1 (973) 438-1464

Philippe Wang

Dec 21, 2006, 4:29:51 PM
to Serge Aleynikov, caml...@inria.fr
Serge Aleynikov wrote:

> What I meant by "strict typing" was performing type checks *at runtime*,
> i.e. that there are no unsafe operations.
>
> Say, as an example, doing type checks of function arguments at runtime:
>
> let f = fun i when is_integer (i) -> i
> | x when is_float (x) -> int_of_float x;;
>
> (raising Bad_match of some sort if neither one of the two guards pass)
>
> Direct application of this could be for making the Marshal/Unmarshal
> modules more safe when marshaling data over files/sockets between
> applications written in heterogeneous languages.

If you want to do that, use Lisp...
Or use sum types...
type t = Int of int | Float of float | ...
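
Spelled out, the sum-type version of the guarded function quoted above might look like this (a sketch; the type name `num` is chosen here for illustration):

```ocaml
(* A tagged number: the constructor plays the role of the run-time type. *)
type num = Int of int | Float of float

(* Counterpart of the guarded [f]: return an int in both cases. The compiler
   checks that every constructor is handled, so no Bad_match can occur. *)
let f = function
  | Int i -> i
  | Float x -> int_of_float x

let () = Printf.printf "%d %d\n" (f (Int 3)) (f (Float 2.9))
```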

OCaml forgets types at runtime !
This means that you can't know without a huge cost (cf. SafeUnmarshal
costs), because what you can do in O(1) is to know whether a value is an
int or a pointer...

If OCaml is particularly efficient, it's probably mostly because of that!
..

Cheers,

--
Philippe Wang
mail(at)philippewang.info

Serge Aleynikov

Dec 21, 2006, 5:13:59 PM
to Philippe Wang
Philippe Wang wrote:
>> What I meant by "strict typing" was performing type checks *at
>> runtime*, i.e. that there are no unsafe operations.
>>
>> Say, as an example, doing type checks of function arguments at runtime:
>>
>> let f = fun i when is_integer (i) -> i
>> | x when is_float (x) -> int_of_float x;;
>>
>> (raising Bad_match of some sort if neither one of the two guards pass)
>>
>> Direct application of this could be for making the Marshal/Unmarshal
>> modules more safe when marshaling data over files/sockets between
>> applications written in heterogeneous languages.
>
> If you want to do that, use Lisp...

Indeed. I am also quite happy with having that feature in Erlang. ;-)

> Or use sum types...
> type t = Int of int | Float of float | ...
>
> OCaml forgets types at runtime !
> This means that you can't know without a huge cost (cf. SafeUnmarshal
> costs), because what you can do in O(1) is to know whether a value is an
> int or a pointer...

There doesn't seem to be a large overhead for knowing if a value is a
closure, string, float or float array either (they have dedicated tag
values accessed with a single dereferencing). With having the compiler
reserve some tag values for other basic types that could optionally be
inspected at run-time, it perhaps wouldn't penalize efficiency very badly.
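
These tags can already be observed through the Obj module. A small sketch for illustration only: Obj is an unsafe, implementation-level interface, and the function name `classify` is made up:

```ocaml
(* Classify a value by its run-time representation. *)
let classify (v : Obj.t) =
  if Obj.is_int v then "immediate"
  else
    let tag = Obj.tag v in
    if tag = Obj.string_tag then "string"
    else if tag = Obj.double_tag then "float"
    else if tag = Obj.double_array_tag then "float array"
    else if tag = Obj.closure_tag then "closure"
    else Printf.sprintf "block with tag %d" tag

let () =
  print_endline (classify (Obj.repr 42));
  print_endline (classify (Obj.repr "hello"));
  print_endline (classify (Obj.repr 3.14));
  print_endline (classify (Obj.repr (fun x -> x + 1)))
```

Values such as `42`, `true` and `()` all come out as immediates here, which is why more tags would have to be reserved before ints, bools and constant constructors could be told apart.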

> If OCaml is particularly efficient, it's probably mostly because of that!

> ...

This thread began with John's statement that "interaction of the static
typing and dynamic loading was not easy". Any dynamic type check would
by all means reduce the efficiency of a statically typed language. But maybe
it's possible to make dynamic type checking optional and available only
if explicitly requested. At least it would make dynamic code loading
less complicated.

BR,
Serge


--
Serge Aleynikov
Routing R&D, IDT Telecom
Tel: +1 (973) 438-3436
Fax: +1 (973) 438-1464

Chad Perrin

Dec 21, 2006, 5:19:15 PM
to Daniel Bünzli
On Thu, Dec 21, 2006 at 09:41:01PM +0100, Daniel Bünzli wrote:
> On 21 Dec 06, at 21:25, Chad Perrin wrote:

>
> >I think you lost me. What do you mean by "strict typing" such that
> >OCaml doesn't do it? From what I've seen, OCaml is both statically
> >and
> >strongly typed.
>
> What do you mean by strong typing ?

I mean that it doesn't allow you to go around doing in-place type
changes willy-nilly the way something like C does.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

"It's just incredible that a trillion-synapse computer could actually
spend Saturday afternoon watching a football game." - Marvin Minsky

Chad Perrin

Dec 21, 2006, 5:21:51 PM
to Serge Aleynikov
On Thu, Dec 21, 2006 at 04:11:00PM -0500, Serge Aleynikov wrote:
> Chad Perrin wrote:
> > On Thu, Dec 21, 2006 at 09:59:15AM -0500, Serge Aleynikov wrote:
> >> I believe that introducing strict typing in the language might help in
> >> bridging static typing and dynamic loading. Strict typing would allow
> >> to do run-time type verification and type-specific guard checks.
> >> Though, this would no longer be Ocaml as we know it today. ;-)
> >
> > I think you lost me. What do you mean by "strict typing" such that
> > OCaml doesn't do it? From what I've seen, OCaml is both statically and
> > strongly typed.
> >
>
> What I meant by "strict typing" was performing type checks *at runtime*,
> i.e. that there are no unsafe operations.

So, basically, by "strict typing" you mean something like both
compile-time and runtime type checking? If it was *only* runtime type
checking, you'd just be using a dynamic type system (the opposite of
static typing, basically).

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

"The first rule of magic is simple. Don't waste your time waving your
hands and hopping when a rock or a club will do." - McCloctnick the Lucid

Martin Jambon

Dec 21, 2006, 6:38:41 PM
to Richard Jones
On Thu, 21 Dec 2006, Richard Jones wrote:

> On Wed, Dec 20, 2006 at 10:41:20PM -0500, Denis Bueno wrote:
> > I've been writing bash scripts to perform various build- and
> > development-related tasks, and I don't enjoy it. I won't bore you with
> > detailed reasons why. The upshot is that I'd like to script in OCaml.
> >
> > I have considered writing a few camlp4 extensions to make it easier to
> > write scripts:
> >
> > 1) create a syntax which grabs environment variables:
> >
> > e.g. $FOO would grab the value of the environment variable FOO
> >
> > 2) some sort of more convenient process interaction, e.g., for piping.
>
> I think it's a great idea - I'd love to push OCaml for scripting.
> However I hope your camlp4-fu is up to snuff. You'd want, as you say,
> a syntax for pipelines and file redirection, but more importantly
> you'd want a very simple syntax for running commands. So you can
> write some unholy OCaml/sh combination like:
>
> let nr_files = int_of_string ` ls | wc -l `

It's something that I'd love to have too. An implementation of a
simple subset of sh would be nice. The programming features would be
handled by ocaml, so we need a way to use ocaml variables as
shell variables (of type string, string list or command) in addition to
environment variables.
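
As a rough illustration of passing OCaml values into a command line without any preprocessor, here is a sketch with hypothetical helpers `cmd`, `pipe` and `count_files` that build a shell string and quote arguments with the standard `Filename.quote`:

```ocaml
(* Hypothetical helpers: build shell command strings from OCaml values. *)

(* A program name followed by safely quoted arguments. *)
let cmd prog args = String.concat " " (prog :: List.map Filename.quote args)

(* Connect two commands with a shell pipe. *)
let pipe a b = a ^ " | " ^ b

(* The OCaml variable [dir] is used where sh would use "$dir". *)
let count_files dir = pipe (cmd "ls" [dir]) (cmd "wc" ["-l"])

let () = ignore (Sys.command (count_files "."))
```

A quotation such as <:cmd< ls $dir | wc -l >> could expand to an expression of this shape; that expansion is an assumption, not an existing camlp4 feature.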

Should camlp4's quotations be used for this, or should the special syntax
be handled by another preprocessor? I don't know.
A quotation looks like << ls | wc -l >> or <:cmd< ls | wc -l >>.

If something like this already exists, please let us know.
If not, I'd be glad to help design the thing.

Martin

--
Martin Jambon, PhD
http://martin.jambon.free.fr

skaller

Dec 21, 2006, 9:54:59 PM
to Serge Aleynikov
On Thu, 2006-12-21 at 16:11 -0500, Serge Aleynikov wrote:
> Chad Perrin wrote:

> What I meant by "strict typing" was performing type checks *at runtime*,
> i.e. that there are no unsafe operations.
>
> Say, as an example, doing type checks of function arguments at runtime:
>
> let f = fun i when is_integer (i) -> i
> | x when is_float (x) -> int_of_float x;;
>
> (raising Bad_match of some sort if neither one of the two guards pass)
>

Just BTW .. it is very bad to raise exceptions on type errors.
The program should be aborted.

The reason is that raising such exceptions also allows for
catching them, which means doing a type error is no longer
unsafe and no longer a bug, but a legitimate technique.
This in turn defeats most static type analysis you might do.

For example this destroys the ability to analyse Python
statically for the purpose of optimising it.

It is *essential* that the language description not
mandate raising exceptions on type errors, but rather
specify the action is undefined .. even if the implementation
raises an exception, the language specification must NOT
require that. This prevents programmers actually relying on it
and allows a static analyser to optimise the code on the
assumption it is well typed.


--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

Jon Harrop

Dec 22, 2006, 6:37:48 AM
to caml...@yquem.inria.fr
On Thursday 21 December 2006 21:11, Serge Aleynikov wrote:
> What I meant by "strict typing" was performing type checks *at runtime*,
> i.e. that there are no unsafe operations.

I'd call run-time type checking "dynamic typing" and type checking such that
there are no unsafe operations "strong typing". So OCaml is strongly,
statically typed.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

Daniel Bünzli

Dec 22, 2006, 7:22:06 AM
to Chad Perrin, Jon Harrop

On 21 Dec 06, at 23:16, Chad Perrin wrote:

> I mean that it doesn't allow you to go around doing in-place type
> changes willy-nilly the way something like C does.

(Well in fact you can with Obj.magic, but that's not my point)

The problem is that this weak/strong terminology is hopelessly
confused (see [1],[2]). Since there is no clear unique definition of
strong/weak typing I think this terminology should be avoided. I tend
to favour the definitions you can find in the introduction of this
book [3] which are imho less confusing.

Basically the author distinguishes on one hand between statically and
dynamically typed languages, and on the other hand, between safe and
unsafe languages.

Static and dynamic type checking refers to whether type checks are
respectively performed at compilation or run time.

Safety is broadly defined as follows :

"A safe language is one that protects its own abstractions. Every
high-level language provides abstractions of machine services. Safety
refers to the language's ability to guarantee the integrity of these
abstractions and of higher-level abstractions introduced by the
programmer using the definitional facilities of the language"

Later he gives the following chart

       | Statically checked      | Dynamically checked
-------+-------------------------+-------------------------------------
safe   | ML, Haskell, Java, etc. | Lisp, Scheme, Perl, Postscript, etc.
unsafe | C, C++, etc.            |

Subsequently he adds :

"Language safety is seldom absolute. Safe languages often offer
programmers "escape hatches", such as foreign function calls to code
written in other, possibly unsafe, languages. Indeed such escape
hatches are sometimes provided in a controlled form within the
language itself--Obj.magic in Ocaml, ... "

These are just definitions. But it is hard to argue when words do not
have a common meaning between arguers. I just think these definitions
make it simpler to have a common understanding of what we are talking
about.

Best,

Daniel

[1] <http://en.wikipedia.org/wiki/Strong_typing>
[2] <http://www.artima.com/intv/strongweak.html>
[3] <http://www.cis.upenn.edu/~bcpierce/tapl/>

Jon Harrop

Dec 22, 2006, 7:40:32 AM
to caml...@yquem.inria.fr
On Thursday 21 December 2006 21:27, Philippe Wang wrote:
> If you want to do that, use Lisp...

Lisp is too slow.

> Or use sum types...
> type t = Int of int | Float of float | ...

Manual boxing is too verbose.

> OCaml forgets types at runtime!

Some type related information is certainly retained, e.g. to unbox float
arrays.

> This means that you can't know without a huge cost (cf. SafeUnmarshal
> costs), because what you can do in O(1) is to know whether a value is an
> int or a pointer...

I'd like to quantify this cost. I've read papers and heard work stating that
carrying run-time type information can be cheap but I see evidence that might
point to the contrary, e.g. F# is significantly slower than OCaml but it has
concurrent GC that was designed for a non-FPL.

In F#, you have run-time type information. Amongst other things, this allows
you to dispatch to more efficient type-specialised functions. For example,
you can write functions over polymorphic arrays and dispatch to optimised
BLAS versions for float arrays when the input happens to be a float array.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

Jon Harrop

Dec 22, 2006, 7:42:48 AM
to caml...@yquem.inria.fr
On Thursday 21 December 2006 22:19, Chad Perrin wrote:
> So, basically, by "strict typing" you mean something like both
> compile-time and runtime type checking? If it was *only* runtime type
> checking, you'd just be using a dynamic type system (the opposite of
> static typing, basically).

As many Lisp compilers try to do type inference and check types at run-time
(giving warnings), Lisp is not dynamically typed according to your argument.
Moreover, this is a property of the implementation and not of the language.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

Jon Harrop

Dec 22, 2006, 10:25:08 AM
to caml...@yquem.inria.fr
On Friday 22 December 2006 02:51, skaller wrote:
> Just BTW .. it is very bad to raise exceptions on type errors.

For what definition of "type"?

> The reason is that raising such exceptions also allows for
> catching them, which means doing a type error is no longer
> unsafe and no longer a bug, but a legitimate technique.
> This in turn defeats most static type analysis you might do.

Absolutely. But the ability to do run-time dispatch based upon type is an
advantage of dynamic typing, so it is something that you do not want to lose.

> For example this destroys the ability to analyse Python
> statically for the purpose of optimising it.

Yes. An optimising Python compiler will only be adopted/useful if it can
evaluate any Python. Note that this could mean reverting to interpreted
bytecode when the program is inherently dynamically typed.

> It is *essential* that the language description not
> mandate raising exceptions on type errors, but rather
> specify the action is undefined .. even if the implementation
> raises an exception, the language specification must NOT
> require that. This prevents programmers actually relying on it
> and allows a static analyser to optimise the code on the
> assumption it is well typed.

You can raise exceptions from unexpectedly typed code whilst keeping the
advantages of static checking and performance in F#, for example. This gives
you the advantages of both worlds: performance/reliability when leveraging
static typing and brevity/generality when leveraging dynamic typing.

For example, I recently benchmarked C++, F#, OCaml and Python for computing
discrete wavelet transforms. F# (on 32-bit Windows XP) was slightly faster than
OCaml (on 64-bit Debian), so it can have very competitive performance:

http://groups.google.co.uk/group/comp.lang.python/msg/0229d2c6484ea491?hl=en&
http://groups.google.co.uk/group/comp.lang.python/msg/daf7bbb2bd7e99f3?hl=en&

Yet F# retains run-time type information so you can use a generic print
function (print_any) on any type, have your dynamic code loading and so on.
The best of both worlds.

On a related note, F# supports operator overloading, which greatly simplifies
many mathematical expressions at the cost of requiring more type annotations.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists

Tom

Dec 22, 2006, 11:53:40 AM
to Daniel Bünzli
>
>
> Later he gives the following chart
>
>        | Statically checked      | Dynamically checked
> -------+-------------------------+--------------------------------------
> safe   | ML, Haskell, Java, etc. | Lisp, Scheme, Perl, Postscript, etc.
> unsafe | C, C++, etc.            |
>
>
But this chart is not expressive enough... I believe that the properties
implied by "weak/strong" refer to the ability (or the disability) of the
compiler/runtime (or rather semantics of the language) to change types at
will (actually, whenever this seems useful, in cases such as "string" + 7 or
"9" - "3").

This category would include C and C++ (implicit conversions of numbers) and
certainly dynamically checked languages such as PHP, JavaScript, (probably
also) Ruby, Python, ...

I believe that these languages need to be distinguished.

- Tom

Daniel Bünzli

Dec 22, 2006, 12:35:18 PM
to Tom

Le 22 déc. 06 à 17:51, Tom a écrit :

> But this chart is not expressive enough...

Its aim is not to describe everything you can say about programming
languages. It describes the particular notion that Chad was implying
by "strong" typing.

> I believe that the properties implied by "weak/strong" refer to the
> ability (or the disability) of the compiler/runtime (or rather
> semantics of the language) to change types at will (actually,
> whenever this seems useful, in cases such as "string" + 7 or "9" -
> "3").

That's _one_ of the bullet points listed in [1] as a _possible_
meaning for weak/strong typing. Note that your reaction makes my
point: do not use the notion of weak/strong typing, because it always means
something different to different people and hence doesn't mean anything.

A language that has the property you describe can simply be said to
have implicit type conversions.
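
As a point of contrast, OCaml has no implicit type conversions, so the expressions mentioned above only type-check once the conversions are spelled out. A small sketch (the binding names are invented for this example):

```ocaml
(* "string" + 7 is rejected by the type checker; the conversion is explicit. *)
let concatenated = "string" ^ string_of_int 7

(* "9" - "3" likewise needs both operands converted first. *)
let difference = int_of_string "9" - int_of_string "3"

(* Even int and float do not mix: 7 +. 1.5 is a type error. *)
let mixed = float_of_int 7 +. 1.5

let () =
  Printf.printf "%s %d %g\n" concatenated difference mixed
```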

Best,

Daniel

[1] <http://en.wikipedia.org/wiki/Strong_typing>

skaller

Dec 22, 2006, 1:19:10 PM
to Tom
On Fri, 2006-12-22 at 17:51 +0100, Tom wrote:
>
> Later he gives the following chart
>
>        | Statically checked      | Dynamically checked
> -------+-------------------------+--------------------------------------
> safe   | ML, Haskell, Java, etc. | Lisp, Scheme, Perl, Postscript, etc.
> unsafe | C, C++, etc.            |
>
>
> But this chart is not expressive enough...

It is also unclear what you mean by 'unsafe'.

Ocaml is not safe:

let a = Array.make 0 0 in
let y = a.(99) in (* WOOPS: raises Invalid_argument "index out of bounds" *)
ignore y

The fact that an exception is thrown may or may not
make the language safe depending on whether or not
you INTEND to trigger an exception. The best you can
say is that if you don't catch it, it's a bug.

Otherwise, you have to read the comments to know if
the out of bounds access was deliberate .. something
compilers can't do very easily.
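
To illustrate the ambiguity, here is a sketch in which the out-of-bounds access is deliberate and the exception is part of the control flow. The helper `get_or` is hypothetical, and the behaviour assumes the program is not compiled with bounds checking disabled (`-unsafe`).

```ocaml
(* Deliberately relies on the bounds check: an out-of-range index
   is treated as "use the default", not as a bug. *)
let get_or (default : 'a) (a : 'a array) (i : int) : 'a =
  try a.(i) with Invalid_argument _ -> default

let empty : int array = Array.make 0 0

(* Same access as in the example above, but now it is "legitimate". *)
let y = get_or (-1) empty 99
```

Nothing in the code distinguishes this intended use from an accidental one, so neither a reader nor a compiler can tell whether `a.(i)` going out of range is an error.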

In fact C arrays are safer precisely because the behaviour
is not defined. That way you KNOW an out of bound access is
an error, so reasoning about the code .. and optimising it ..
is easier.

Furthermore you can always enable array bounds checking
with an instrumentation switch or tool like Purify.

Well, array bounds probably aren't contentious: most people
would say an access violation was a bug, even in Ocaml.
But you can't be so sure with

try Some (Hashtbl.find table key)
with Not_found -> None

That doesn't look like a bug, if the key isn't found.

There's a difference in intent .. but the language
fails to express it. Exceptions and specified dynamic
checks are 'evil' :)
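
One way to put the intent into the types rather than into an exception handler is an option-returning lookup. The sketch below uses `Hashtbl.find_opt`, which was added to the standard library in OCaml 4.05, well after this thread was written. The table contents and the `lookup` name are made up for the example.

```ocaml
(* Example table; the key and value are arbitrary. *)
let table : (string, int) Hashtbl.t = Hashtbl.create 16
let () = Hashtbl.replace table "answer" 42

(* A missing key is an ordinary result here, visible in the return type. *)
let lookup (key : string) : int option = Hashtbl.find_opt table key

let () =
  match lookup "missing" with
  | Some v -> Printf.printf "found %d\n" v
  | None -> print_endline "not found"
```

With this style, `Hashtbl.find` can be reserved for cases where a missing key really is a bug.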

--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

Daniel Bünzli

Dec 22, 2006, 1:48:23 PM