Re: [Caml-list] Wanted: your feedback on the hierarchy of OCaml Batteries Included


David Teller

Nov 18, 2008, 6:18:40 AM
to Richard Jones, OCaml
This raises two questions:
1) how important is it to allow third-party modules to extend the
namespace?
2) how important is it to offer a uniform package structure (where
levels are always separated by '.' rather than some level by '.' and
some by '_')?

For the moment, we have considered point 1 not very important and point
2 a little more. There are several reasons to disregard point 1. Among
these, clarity of origin (as in "is this module endorsed by Batteries or
not?") and documentation issues (as in "gosh, this module pretends to be
part of [Data] but I can't find the documentation anywhere in the
documentation of Batteries, wtf?").

Do you believe that we should have chosen otherwise?

Cheers,
David

On Tue, 2008-11-18 at 10:06 +0000, Richard Jones wrote:
> Your biggest problem is using dot ('.') instead of underscore ('_').
> Using a dot means that the System namespace cannot be extended by
> external packages. If you use an underscore then an external package
> can extend the namespace (eg. by providing System_Newpackage)
>
> Rich.
>
--
David Teller-Rajchenbach
Security of Distributed Systems
http://www.univ-orleans.fr/lifo/Members/David.Teller
Angry researcher: French Universities need reforms, but the LRU act brings liquidations.

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

David Teller

Nov 18, 2008, 6:22:24 AM
to Zheng Li, OCaml, Richard Jones
I thought the linker only linked in symbols which were actually used?

On Tue, 2008-11-18 at 11:21 +0100, Zheng Li wrote:
> > Your biggest problem is using dot ('.') instead of underscore ('_').
> > Using a dot means that the System namespace cannot be extended by
> > external packages. If you use an underscore then an external package
> > can extend the namespace (eg. by providing System_Newpackage)
>

> And doesn't that force all submodules to be linked into the final
> executable even if we only use one of them?

Daniel Bünzli

Nov 18, 2008, 6:35:39 AM
to OCaml List

On 18 Nov 2008, at 11:29, Erkki Seppala wrote:

> For example I prefer using the least amount of opening of modules,
> to make it easier to see where the values come from

Same here. This is why I'm a little bit sceptical about this hierarchy.

With the current standard library if I suddenly want to use
Int32.of_int, I know I just need to type Int32.of_int in my source.
With your proposal I need to remember that it is in Data.Numeric and
go at the beginning of my file to open it or write
Data.Numeric.Int32.of_int, to me this brings bureaucracy without any
benefit. And lack of bureaucracy is one of the reasons I like ocaml
(and dislike java for example).
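
To make the trade-off concrete, here is a minimal sketch of the three ways of reaching a nested module. The Data.Numeric wrapper below is a stand-in defined locally for illustration; it is an assumption, not the actual Batteries module.

```ocaml
(* Hypothetical stand-in for the proposed hierarchy; the real Batteries
   layout may differ. *)
module Data = struct
  module Numeric = struct
    module Int32 = Int32
  end
end

(* 1. Fully qualified path at the use site. *)
let a = Data.Numeric.Int32.of_int 42

(* 2. Local open, which keeps the origin visible close to the use site. *)
let b = let open Data.Numeric in Int32.of_int 42

(* 3. A one-line alias, typically placed at the top of the file. *)
module I32 = Data.Numeric.Int32
let c = I32.of_int 42

let () = Printf.printf "%ld %ld %ld\n" a b c
```

All three forms denote the same function; the difference is only in how much of the path has to be remembered and typed.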

Besides, hierarchies are limited in their descriptive power anyway, and
one day you'll find something that fits in two places. Rope is
already an example, being both Data.Persistent and Data.Text.

Thus my proposal would be to _present_ them as a hierarchy (but even
here a means to tag/browse the modules with/by keywords would do a
better job) but keep the actual module structure of Batteries as flat
as possible, with everything just under the toplevel Batteries. When I
code I really don't want to have to think about all these open
directives that essentially bring nothing.

Best,

Daniel

Thomas Gazagnaire

Nov 18, 2008, 6:47:46 AM
to Daniel Bünzli, OCaml List
>
> With the current standard library if I suddenly want to use Int32.of_int, I
> know I just need to type Int32.of_int in my source. With your proposal I
> need to remember that it is in Data.Numeric and go at the beginning of my
> file to open it or write Data.Numeric.Int32.of_int, to me this brings
> bureaucracy without any benefit. And lack of bureaucracy is one of the
> reasons I like ocaml (and dislike java for example).
>
> Besides Hierarchies are anyway limited in their descriptive power and one
> day you'll find something that will fit in two places, Rope is already an
> example being both Data.Persistent and Data.Text.
>

I use modules in the same way, mostly to be able to grep for Int32.of_int
in my code when needed (grepping for of_int alone would make the results
less precise).


> Thus my proposal would be to _present_ them as a hierarchy (but even here a
> mean to tag/browse the modules with/by keywords would do a better job) but
> keep the actual module structure of Batteries as flat as possible,
> everything just under the toplevel Batteries. When I code I really don't
> want to have to think about all these open directives that essentially bring
> nothing.
>

A tag system for modules is a good idea, and I would add that searching for
functions by type (which ocamlbrowser already does) is also nice.
--
Thomas

David Teller

Nov 18, 2008, 7:15:43 AM
to Daniel Bünzli, OCaml List
On Tue, 2008-11-18 at 12:34 +0100, Daniel Bünzli wrote:
>Besides Hierarchies are anyway limited in their descriptive power and
>one day you'll find something that will fit in two places, Rope is
>already an example being both Data.Persistent and Data.Text.

That's correct, there are plenty of modules which could fit in different
places. For the moment, we decided that every module should appear only
in one place. However, we could easily change this -- in fact, to allow
this, we only need to alter our documentation generator.

> Thus my proposal would be to _present_ them as a hierarchy (but even
> here a mean to tag/browse the modules with/by keywords would do a
> better job) but keep the actual module structure of Batteries as flat
> as possible, everything just under the toplevel Batteries. When I code
> I really don't want to have to think about all these open directives
> that essentially bring nothing.

Browsing by keywords sounds like an interesting idea. I'm adding this to
our TODO list. Of course, the next step will be to actually add these
keywords, and that's going to take much longer if we intend to tag all
values.

However, we disagree on the necessity of a hierarchy. There are two good
reasons why the base library of OCaml doesn't have a hierarchy (almost):
it's small and there are almost no redundancies between modules. Neither
is true for Batteries.

For an example of this redundancy, consider threads. For the moment, we
have five thread-related modules: [Threads], [Mutex], [RMutex],
[Condition] and [Event]. These modules, which are essentially the same
modules as those of the base library, are all submodules of
[Control.Concurrency.Threads]. Now, I personally like
[Control.Concurrency] but I agree that this is debatable. The reason why
we group these modules into [Threads] is because sooner or later, we
are going to have four or five other thread-related modules called
[Threads], [Mutex], [Condition], [Event] and perhaps [RMutex]. These
modules will get into [Control.Concurrency.CoThreads]. They won't
replace the first batch, they will exist side-by-side. Of course, we
could trim the hierarchy and remove [Control.Concurrency] -- trimming
the hierarchy is the main reason for launching this thread,
incidentally. But, to keep things ordered, we will still need modules
[Threads.Threads], [Threads.Mutex], [Threads.RMutex]...
[CoThreads.Threads], [CoThreads.Mutex]... and, well, that's a hierarchy
already.

coThreads is not an exceptional case, mind you. We may end up with two
definitions of [Graphics], several data structures with the same name
but different purposes, etc.

There's also the issue of labels and other partial redefinitions of
modules. The OCaml base library defines [Array]/[ArrayLabels],
[List]/[ListLabels], [Map]/[MoreLabels.Map] etc. In Batteries
Included, we define [Array], [Array.Labels], [List], [List.Labels],
which clutters the list of modules less and makes for something more
consistent, especially since [FooLabels] is not the only kind of "module
[Foo] with a variant": we also have [Array.ExceptionLess], for
operations without exceptions, and [Array.Cap] for read-only/write-only
arrays. Other variants may still appear.
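
As an illustration of the "module with variants" layout, here is a minimal sketch. MyList and its contents are hypothetical names chosen for the example; they are not the Batteries definitions.

```ocaml
(* Hypothetical module in the style described above. *)
module MyList = struct
  (* Default flavour: raises Not_found, like the standard List.find. *)
  let find = List.find

  (* Variant without exceptions: failure is reported as None. *)
  module ExceptionLess = struct
    let find p l = try Some (List.find p l) with Not_found -> None
  end

  (* Variant with a labelled predicate argument. *)
  module Labels = struct
    let find ~f l = List.find f l
  end
end

let () =
  match MyList.ExceptionLess.find (fun x -> x > 5) [1; 2; 3] with
  | None -> print_endline "no element found"
  | Some x -> Printf.printf "found %d\n" x
```

The variants live under the module they modify, so the toplevel list only grows by one name per data structure.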

Do you see any better way of managing the complexity of all this?

Cheers,
David



Richard Jones

Nov 18, 2008, 7:22:58 AM
to David Teller, OCaml
On Tue, Nov 18, 2008 at 12:17:28PM +0100, David Teller wrote:
> This raises two questions:
> 1) how important is it to allow third-party modules to extend the
> namespace?
> 2) how important is it to offer a uniform package structure (where
> levels are always separated by '.' rather than some level by '.' and
> some by '_')?
>
> For the moment, we have considered point 1 not very important and point
> 2 a little more. There are several reasons to disregard point 1. Among
> these, clarity of origin (as in "is this module endorsed by Batteries or
> not?") and documentation issues (as in "gosh, this module pretends to be
> part of [Data] but I can't find the documentation anywhere in the
> documentation of Batteries, wtf?").
>
> Do you believe that we should have chosen otherwise?

Easy - look at CPAN[1]. If you want to scale a project you have to
make decisions that allow a distributed network of people to
cooperate, without needing too much central coordination. CPAN is a
great example of this loose coupling because packages make their own
decision about naming (albeit they can become "official" later - but
they won't need to rename unless there is an actual naming conflict).

If the problem is documentation or provenance of packages, then add a
mechanism to solve that problem. Perl also solves this through an
existing, lightweight, distributed mechanism (a standard location to
install man-pages, and a standard man-page format and man-page
generating mechanism -- POD).

Rich.

[1] http://www.cpan.org/

--
Richard Jones
Red Hat

Richard Jones

Nov 18, 2008, 7:32:41 AM
to David Teller, Daniel Bünzli, OCaml List
On Tue, Nov 18, 2008 at 01:15:39PM +0100, David Teller wrote:
> Do you see any better way of managing the complexity of all this?

I'm still not getting where the benefit of having this hierarchy is,
except that it adds a Java-like complexity and will create
hard-to-manage churn if a module ever moves.

API changes are handled really badly in OCaml, ironically because of
the lack of a textual preprocessor. You can't just write this every
time lablgtk / calendar / latest culprit decides to change their API:

#if LABLGTK < 210
  let icon = GMisc.image () in
  icon#set_stock icon_type ~size:size;
  icon
#else
  let icon = GMisc.image () in
  icon#set_stock `DIALOG_ERROR;
  icon#set_icon_size `DIALOG;
  icon
#endif

(Well, you can run -pp cpp, but that breaks other stuff)

Rich.

--
Richard Jones
Red Hat


David Teller

Nov 18, 2008, 7:40:08 AM
to Daniel Bünzli, OCaml List
On Tue, 2008-11-18 at 12:34 +0100, Daniel Bünzli wrote:
> On 18 Nov 2008, at 11:29, Erkki Seppala wrote:
>
> > For example I prefer using the least amount of opening of modules,
> > to make it easier to see where the values come from
>
> Same here. This is why I'm a little bit sceptical about this hierarchy.
>
> With the current standard library if I suddenly want to use
> Int32.of_int, I know I just need to type Int32.of_int in my source.
> With your proposal I need to remember that it is in Data.Numeric and
> go at the beginning of my file to open it or write
> Data.Numeric.Int32.of_int, to me this brings bureaucracy without any
> benefit. And lack of bureaucracy is one of the reasons I like ocaml
> (and dislike java for example).

I forgot to answer that part.

In Batteries, for the moment, we decided to keep the module names of the
base library as shortcuts to our new modules. Consequently, you can
still write your [Int32.of_int] in addition to our new [Int32.print],
etc. The old modules are still available as submodules of [Legacy], if
needed.
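
A minimal sketch of how such a shortcut can coexist with the original module, written in plain OCaml. The [print] signature below is an assumption made for the example and may not match the one in Batteries.

```ocaml
(* The untouched standard module stays reachable under Legacy. *)
module Legacy = struct
  module Int32 = Int32
end

(* The short name now denotes the extended module. *)
module Int32 = struct
  include Legacy.Int32
  (* Hypothetical signature; the real Batteries [print] may differ. *)
  let print oc n = output_string oc (to_string n)
end

let () =
  Int32.print stdout (Int32.of_int 7);
  print_newline ()
```

Existing code that calls Int32.of_int keeps compiling, and the new functions are found under the same name.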

Should you wish to flatten the complete hierarchy, assuming that it's
possible and that there are no collisions on names, that's also
something which you can do quite easily. We even provide some syntactic
sugar for this. It's just a matter of writing a file my_batteries.ml
along the lines of

  module Array          = Data.Mutable.Array
  module List           = Data.Persistent.List
  ..
  module PosixThreads   = Control.Concurrency.Threads.Threads
  module PosixMutex     = Control.Concurrency.Threads.Mutex
  module CoThreads      = Control.Concurrency.CoThreads.Threads
  ..
  module ArrayExn       = Data.Mutable.Array include ExceptionLess (*syntactic sugar*)
  module ArrayLabels    = Data.Mutable.Array include Labels
  module ArrayCapExn    = Data.Mutable.Array.Cap include ExceptionLess
  module ArrayCapLabels = Data.Mutable.Array.Cap include Labels
  ..

I personally don't like the name [ArrayCapLabels] but I can't think of any
better name to represent this once we have removed any hierarchy.
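
Without the syntactic sugar, a flattening file of this kind could be written in plain OCaml roughly as follows. This is only a sketch: the small Data.Mutable.Array tree is defined locally as a stand-in, and it assumes that the sugar amounts to including the module and then its variant submodule, which the thread does not spell out.

```ocaml
(* Hypothetical stand-in for a fragment of the hierarchy. *)
module Data = struct
  module Mutable = struct
    module Array = struct
      include Array
      module ExceptionLess = struct
        (* Returns None instead of raising Invalid_argument. *)
        let get a i =
          if i >= 0 && i < Array.length a then Some (Array.get a i) else None
      end
    end
  end
end

(* Flat aliases, as they could appear in my_batteries.ml. *)
module Array = Data.Mutable.Array

(* Assumed plain-OCaml reading of "Array include ExceptionLess":
   the second include shadows [get] with the exception-less version. *)
module ArrayExn = struct
  include Data.Mutable.Array
  include Data.Mutable.Array.ExceptionLess
end

let () =
  match ArrayExn.get [| 1; 2 |] 5 with
  | None -> print_endline "out of bounds"
  | Some x -> Printf.printf "%d\n" x
```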

I personally prefer the hierarchy but, once again, the majority may
disagree. So if you believe this is better, the next logical step would
be to design a full and consistent list of modules including all the
modules which already appear in the current version of Batteries, and
with some space left for OCamlnet, OCamlnae, Reins, Camomile, ULex,
Camlp4, CoThreads and a few others. I truly mean it, if you can provide
us with something you consider more comfortable and as future-proof, we
may adopt it.

Cheers,
David


Zheng Li

Nov 18, 2008, 7:51:05 AM
to David Teller, OCaml, Richard Jones
David Teller wrote:
> I thought the linker only linked in symbols which were actually used?

You really should check.

I have not yet looked too much into the source, but if
batteries_core.ml is among the modules that get referenced anyway, I'm
afraid all modules (not just parents/siblings) will be linked.

Try to compile the following source into executable:

----
open Batteries.Data.Persistent.List

let _ = iter
----

You will end up being asked for numerous unrelated modules during
the linking phase, or you can use the recommended "ocamlfind
batteries/ocamlc" shortcut. Either way, the resulting executable is more
than 50 times bigger (i.e. over 1 MB for these 2 lines) than the one
produced with the standard List.

--
Zheng

> On Tue, 2008-11-18 at 11:21 +0100, Zheng Li wrote:
>>> Your biggest problem is using dot ('.') instead of underscore ('_').
>>> Using a dot means that the System namespace cannot be extended by
>>> external packages. If you use an underscore then an external package
>>> can extend the namespace (eg. by providing System_Newpackage)
>> And doesn't that force all submodules to be linked into the final
>> executable even if we only use one of them?
>


David Teller

Nov 18, 2008, 7:51:43 AM
to Benedikt Grundmann, OCaml
Ok, that's an interesting point. Now, we just need to all agree on one
standard :)

On Tue, 2008-11-18 at 12:28 +0000, Benedikt Grundmann wrote:
> > Do you see any better way of managing the complexity of all this?

> Yes: don't introduce it at all. Make a decision to use or not use labels
> and stick with it. Similarly, make a decision to use or not use exceptions
> as the "default", and suffix / rename alternative functions as appropriate
> (consistently). Consistency is a big win. Not only does it speed you up
> when you read/modify other people's code, it also reduces the number
> of decisions you have to make when writing new code.
>
> http://ocaml.janestreet.com/?q=node/28
>
> Cheers,
>
> Bene

David Teller

Nov 18, 2008, 7:52:23 AM
to Richard Jones, OCaml
On Tue, 2008-11-18 at 12:22 +0000, Richard Jones wrote:
> > Do you believe that we should have chosen otherwise?
>
> Easy - look at CPAN[1]. If you want to scale a project you have to
> make decisions that allow a distributed network of people to
> cooperate, without needing too much central coordination. CPAN is a
> great example of this loose coupling because packages make their own
> decision about naming (albeit they can become "official" later - but
> they won't need to rename unless there is an actual naming conflict).

Interesting point. So far, the approach of Batteries has certainly been
different, in large part because we don't want everything to end up part
of the Batteries hierarchy (or, well, lack thereof). Of course, this is
in contradiction with our sometimes imperialistic tendencies, so we may
be guilty of schizophrenia.

Perhaps we should organise a poll on this subject.

> If the problem is documentation or provenance of packages, then add a
> mechanism to solve that problem. Perl also solves this through an
> existing, lightweight, distributed mechanism (a standard location to
> install man-pages, and a standard man-page format and man-page
> generating mechanism -- POD).

I'm not sure the man-page format quite scales up to the kind of
hyperlinked complexity we have in Batteries for the moment. But yes, I
agree, we can certainly work something out. In fact, we could say that
we've started on this track, albeit perhaps not with such grand
ambitions.

Thanks for the idea,
David

P.S.: I've pointedly ignored your perch on POD :) In my mind, that's a
very different topic. For the moment, we'll stick with ocamldoc.


David Teller

Nov 18, 2008, 7:58:00 AM
to Richard Jones, Daniel Bünzli, OCaml List
On Tue, 2008-11-18 at 12:32 +0000, Richard Jones wrote:
> API changes are handled really badly in OCaml, ironically because of
> the lack of a textual preprocessor. You can't just write this every
> time lablgtk / calendar / latest culprit decides to change their API:
>
> #ifdef LABLGTK < 210
> let icon = GMisc.image () in
> icon#set_stock icon_type ~size:size;
> icon
> #else
> let icon = GMisc.image () in
> icon#set_stock `DIALOG_ERROR;
> icon#set_icon_size `DIALOG;
> icon
> #endif

Side-note: That's certainly something we could add to Batteries, if
needed. Camlp4 is pretty-much necessary to use Batteries anyway and
Camlp4 already defines IFDEF, INCLUDE, etc. We would just need to
complete that DSL perhaps to accept any valid OCaml expression and call
the ocaml interpreter to evaluate these expressions.

Cheers,
David


Daniel Bünzli

Nov 18, 2008, 8:26:12 AM
to OCaml List

On 18 Nov 2008, at 13:15, David Teller wrote:

> But, to keep things ordered, we will still need modules
> [Threads.Threads], [Threads.Mutex], [Threads.RMutex]...
> [CoThreads.Threads], [CoThreads.Mutex]... and, well, that's a
> hierarchy
> already.

If you include in Batteries an external package that has its own
hierarchy and is designed to be opened, I don't mind having that
hierarchy. In that case you can just add the new toplevel entry
CoThreads. And if I want to use CoThreads, I just open CoThreads, not
Control.Concurrency.CoThreads. Just try to keep it as flat as
possible; don't force modules into an ad-hoc hierarchical
taxonomy just to sort them. I don't care if the toplevel list
of modules is three hundred pages long if there is an efficient means
to access their documentation (like tags). I do however care a lot if
it becomes bureaucratic to be able to _use_ a module in my code.


On 18 Nov 2008, at 13:22, Richard Jones wrote:

> Easy - look at CPAN[1]. If you want to scale a project you have to
> make decisions that allow a distributed network of people to
> cooperate, without needing too much central coordination.

But (unfortunately, sorry to repeat that) Batteries is not a CPAN-like
initiative. It aims at giving a library of modules/syntax extensions
selected by the library maintainers; as such it is inherently
centralized and I don't think that questions (1) or (2) are actually
pertinent for the project.

Best,

Daniel

Dario Teixeira

Nov 18, 2008, 8:32:06 AM
to OCaml List
Hi,

> I personally prefer the hierarchy but, once again, the majority
> may disagree. So if you believe this is better, the next logical
> step would be to design a full and consistent list of modules
> including all the modules which already appear in the current
> version of Batteries, and with some space left for OCamlnet,
> OCamlnae, Reins, Camomile, ULex, Camlp4, CoThreads and a few
> others. I truly mean it, if you can provide us with something
> you consider more comfortable and as future-proof, we may adopt it.

Paraphrasing Einstein, I think the hierarchy should be as flat
as possible, but no flatter. For example, I see no reason to
materialise in the hierarchy the separation between persistent
and mutable data structures. That should be a documentation
issue. However, and as you noted, there are cases where some
hierarchisation may remove namespace clutter and allow for
better code reuse.

Cheers,
Dario Teixeira

Alain Frisch

Nov 18, 2008, 9:11:26 AM
to David Teller, OCaml, Richard Jones, Zheng Li
David Teller wrote:
> I thought the linker only linked in symbols which were actually used?

No, it is not the case.

The only automatic mechanism for code pruning is at the level of
individual modules embedded in a library. As soon as you pack, you
obtain a monolithic module which can only be linked as a whole.

-- Alain

David Teller

Nov 18, 2008, 9:23:43 AM
to Dario Teixeira, OCaml List
On Tue, 2008-11-18 at 05:31 -0800, Dario Teixeira wrote:
> Paraphrasing Einstein, I think the hierarchy should be as flat
> as possible, but no flatter. For example, I see no reason to
> materialise in the hierarchy the separation between persistent
> and mutable data structures. That should be a documentation
> issue. However, and as you noted, there are cases where some
> hierarchisation may remove namespace clutter and allow for
> better code reuse.

Duly noted. As you may see on our candidate replacement hierarchy, we
intend to merge Data.Persistent and Data.Mutable into Data.Containers.

Whether we flatten further remains open to debate.

Thanks,
David


Stefano Zacchiroli

Nov 18, 2008, 9:40:42 AM
to caml...@yquem.inria.fr
On Tue, Nov 18, 2008 at 03:23:33PM +0100, David Teller wrote:
> On Tue, 2008-11-18 at 05:31 -0800, Dario Teixeira wrote:
> > Paraphrasing Einstein, I think the hierarchy should be as flat
> > as possible, but no flatter. For example, I see no reason to
> > materialise in the hierarchy the separation between persistent
> > and mutable data structures. That should be a documentation
> > issue. However, and as you noted, there are cases where some
> > hierarchisation may remove namespace clutter and allow for
> > better code reuse.
>
> Duly noted. As you may see on our candidate replacement hierarchy, we
> intend to merge Data.Persistent and Data.Mutable into Data.Containers.

More generally, I would like to advertise a bit more the proposed
*replacement* hierarchy reported at the bottom of David's blog post
[1]; do a text search for "One possible replacement" and start reading
from there.

Several problems with the current hierarchy which have been pointed
out in this thread were noticed by us as well, and are already,
at least partly, solved by the proposed new hierarchy.

Cheers.

[1] http://dutherenverseauborddelatable.wordpress.com/2008/11/18/batteries-hierarchy/

--
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
zack@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..| . |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...........| ..: |.... Je dis tu à tous ceux que j'aime

David Teller

Nov 18, 2008, 9:47:07 AM
to Alain Frisch, OCaml, Richard Jones, Zheng Li
Ok, good to know. Since we're packing anyway, there's nothing we can do
yet. However, we've already planned to work on a dynamically linked
version of Batteries. Just not for release 1.0.

So back to square 1 on this argument.

Thanks Alain & Zheng


On Tue, 2008-11-18 at 15:10 +0100, Alain Frisch wrote:
> David Teller wrote:
> > I thought the linker only linked in symbols which were actually used?
>
> No, it is not the case.
>
> The only automatic mechanism for code pruning is at the level of
> individual modules embedded in a library. As soon as you pack, you
> obtain a monolithic module which can only be linked as a whole.
>
> -- Alain
>
>


David Teller

Nov 18, 2008, 9:47:07 AM
to Daniel Bünzli, OCaml List
On Tue, 2008-11-18 at 14:24 +0100, Daniel Bünzli wrote:
> On 18 Nov 2008, at 13:15, David Teller wrote:
>
> > But, to keep things ordered, we will still need modules
> > [Threads.Threads], [Threads.Mutex], [Threads.RMutex]...
> > [CoThreads.Threads], [CoThreads.Mutex]... and, well, that's a
> > hierarchy
> > already.
>
> If you include in batteries an external package that has its own
> hierarchy and is designed to be opened I don't mind having that
> hierarchy.
>
> In that case you can just add the new toplevel entry
> CoThread. And if I want to use CoThread, I just open CoThreads, not
> Control.Concurrency.CoThreads. Just try to keep it as flat as
> possible, don't try to force modules in an ad-hoc hierarchical
> taxonomy to try to sort out modules. I don't care if the toplevel list
> of modules is three hundred pages long if there is an efficient mean
> to access their documentation (like tags). I do however care a lot if
> it becomes bureaucratic to be able to _use_ a module in my code.

I concur that tags make a considerable difference.

But let us return to threads for one second. There is a very good reason
to have two distinct modules [Threads] and [CoThreads] with 4-5
submodules each: functors. Assuming [Threads] and [CoThreads] implement
the same interface -- which they do -- I can write a module which takes
as argument either [Threads], [CoThreads] or [WhateverThreads] and
produces a pseudo-concurrent/truly concurrent/whatever implementation of
an algorithm. The same thing could apply to latin-1 strings vs. Unicode
strings (this is essentially what happens in Camomile).
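
The functor argument can be sketched as follows. Everything here is illustrative: the THREADS signature is reduced to a mutex, and the two implementations are dummies standing in for [Threads] and [CoThreads], so that the example runs without any thread library.

```ocaml
(* Hypothetical common interface, reduced to what the example needs. *)
module type THREADS = sig
  module Mutex : sig
    type t
    val create : unit -> t
    val lock : t -> unit
    val unlock : t -> unit
  end
end

(* Dummy implementation: locking does nothing. *)
module NoopThreads : THREADS = struct
  module Mutex = struct
    type t = unit
    let create () = ()
    let lock () = ()
    let unlock () = ()
  end
end

(* Dummy implementation that checks lock/unlock are properly paired. *)
module CheckedThreads : THREADS = struct
  module Mutex = struct
    type t = bool ref
    let create () = ref false
    let lock m = assert (not !m); m := true
    let unlock m = assert !m; m := false
  end
end

(* The algorithm is written once, against the interface. *)
module SafeCounter (T : THREADS) = struct
  let m = T.Mutex.create ()
  let n = ref 0
  let bump () =
    T.Mutex.lock m;
    n := !n + 1;
    T.Mutex.unlock m
  let get () = !n
end

module A = SafeCounter (NoopThreads)
module B = SafeCounter (CheckedThreads)

let () =
  A.bump (); A.bump (); B.bump ();
  Printf.printf "%d %d\n" (A.get ()) (B.get ())
```

This only works if each implementation groups its submodules under one parent, which is the point being made about [Threads] and [CoThreads].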

Now, there are certainly several possibilities.

Here's one which doesn't involve a deep hierarchy:
* [Thread], [Mutex], [Concurrent], [Event] remain top-level modules
* [Threads] is also a top-level module, which contains aliases to
[Thread], [Mutex], [Concurrent], [Event]
* [CoThreads] is also a top-level module, which contains its own
implementations of [Thread], [Mutex], [Concurrent], [Event]


We could do the same for strings
* [String], [Char], [Rope], [UChar] remain top-level modules
* we introduce a new module [Strings] containing [String] and [Char]
* we introduce another new module [UStrings] containing an alias
[String] to [Rope] and an alias [Char] to [UChar]

And for numbers
* [Float], [Int], [SafeInt], [BigInt] and hypothetical [SafeFloat] and
[BigFloat] (don't ask me what a BigFloat is supposed to be) remain
top-level modules
* we introduce a new module [Numeric] containing [Float] and [Int]
* we introduce a new module [SafeNumeric] containing [SafeFloat] aliased
as [Float], [SafeInt] aliased as [Int]
* we introduce a new module [BigNumeric] containing [BigFloat] aliased
as [Float], [BigInt] aliased as [Int]

etc.

To me, this seems like the only way to combine no hierarchy and
modularity. However, I have the nasty feeling that this is going to end
up messy, cluttered and otherwise both unmaintainable and unusable
(despite tags).
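
For the record, the alias-based layout described above for numbers could be sketched like this. The modules are toy versions written for the example (the local Int and SafeInt only provide [add]); none of this is actual Batteries code.

```ocaml
(* Toy top-level modules; they remain directly accessible. *)
module Int = struct
  let add a b = a + b
end

module SafeInt = struct
  (* Fails on overflow instead of wrapping around. *)
  let add a b =
    let s = a + b in
    if (a >= 0) = (b >= 0) && (s >= 0) <> (a >= 0)
    then failwith "overflow"
    else s
end

(* Grouping modules that only contain aliases. *)
module Numeric = struct module Int = Int end
module SafeNumeric = struct module Int = SafeInt end

(* A functor can then pick either family by name. *)
module type NUMERIC = sig
  module Int : sig val add : int -> int -> int end
end

module Sum (N : NUMERIC) = struct
  let sum l = List.fold_left N.Int.add 0 l
end

module PlainSum = Sum (Numeric)
module CheckedSum = Sum (SafeNumeric)

let () = Printf.printf "%d\n" (CheckedSum.sum [1; 2; 3])
```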

>
> On 18 Nov 2008, at 13:22, Richard Jones wrote:
>
> > Easy - look at CPAN[1]. If you want to scale a project you have to
> > make decisions that allow a distributed network of people to
> > cooperate, without needing too much central coordination.
>
> But (unfortunately, sorry to repeat that) Batteries is not a CPAN like
> initiative. It aims at giving a library of modules/syntax extensions
> selected by the library maintainers, as such it is inherently
> centralized and I don't think that questions (1) or (2) are actually
> pertinent for the project.

No, we're not CPAN. If someone wishes to build a CPAN, please feel free
to do it. That may actually be easier to do once Batteries 1.0 has
landed. However, Richard's remark remains interesting. So perhaps
redesigning Batteries to have an open namespace structure is a good
idea.

Cheers,
David


Richard Jones

Nov 18, 2008, 10:20:42 AM
to David Teller, OCaml
On Tue, Nov 18, 2008 at 01:49:09PM +0100, David Teller wrote:
> P.S.: I've pointedly ignored your perch on POD :) In my mind, that's a
> very different topic. For the moment, we'll stick with ocamldoc.

I've used POD selectively even in OCaml projects, mainly because it is
by far the easiest way to generate man pages. OCamldoc is great for
developer documentation (APIs etc) but POD is super-simple for making
manual pages.

cf man page:
http://hg.et.redhat.com/virt/applications/virt-top--devel/?f=5b38082d8aa4;file=virt-top/virt-top.pod
vs ocamldoc documentation:
http://hg.et.redhat.com/virt/applications/ocaml-libvirt--devel/?f=893899664388;file=libvirt/libvirt.mli

One place where POD really stands out, and could be replicated by
camlp4, is for standalone programs that combine argument parsing,
usage and man page all in one place. In many cases you can keep the
option parsing, implementation of the option, and documentation for
the option right next to each other.

http://perldoc.perl.org/Getopt/Long.html#Documentation-and-help-texts

Rich.

--
Richard Jones
Red Hat


Richard Jones

Nov 18, 2008, 1:59:22 PM
to Jon Harrop, caml...@yquem.inria.fr
On Tue, Nov 18, 2008 at 06:17:23PM +0000, Jon Harrop wrote:
> I don't follow. Can you not use "include" to extend an existing module:
>
> # module Array = struct
> include Array

You're missing the point which is scalability - how to deal with
distributed parties who are loosely coordinated. The above scheme
allows one person to extend the Array module, but not two people,
unless they coordinate with each other about which order they extend
it (or both have incompatible extensions).

Jon Harrop

Nov 18, 2008, 2:15:42 PM
to caml...@yquem.inria.fr, Richard Jones
On Tuesday 18 November 2008 18:59:14 Richard Jones wrote:
> On Tue, Nov 18, 2008 at 06:17:23PM +0000, Jon Harrop wrote:
> > I don't follow. Can you not use "include" to extend an existing module:
> >
> > # module Array = struct
> > include Array
>
> You're missing the point which is scalability - how to deal with
> distributed parties who are loosely coordinated. The above scheme
> allows one person to extend the Array module, but not two people,
> unless they coordinate with each other about which order they extend
> it (or both have incompatible extensions).

If the library creator did not use functors or classes to make their design
reusable then the only solution for the user is to include all of the
implementations they require:

module Array = struct
include RichardsArray
include JonsArray
end

Given the lack of libraries available for OCaml anyway, this seems like a very
minor concern to me.
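Spelled out as a runnable sketch (the module names come from the snippet above; their contents, and the extra "include Array" that keeps the standard functions visible, are assumptions made for illustration):

```ocaml
(* Two independently written extensions; their contents are made up. *)
module RichardsArray = struct
  let empty = [||]
end

module JonsArray = struct
  let sum a = Array.fold_left ( + ) 0 a
end

(* The user, not either author, assembles the final module. *)
module Array = struct
  include Array
  include RichardsArray
  include JonsArray
end

let () =
  Printf.printf "%d %d\n" (Array.length Array.empty) (Array.sum [| 1; 2; 3 |])
```

This only works smoothly when neither extension was itself written as a replacement Array that includes the original.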

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

Richard Jones

Nov 18, 2008, 2:22:49 PM
to Jon Harrop, caml...@yquem.inria.fr
On Tue, Nov 18, 2008 at 08:17:36PM +0000, Jon Harrop wrote:
> If the library creator did not use functors or classes to make their design
> reusable then the only solution for the user is to include all of the
> implementations they require:

You're talking about something completely different.

In Perl they have:

Net
Net::Amazon
Net::BitTorrent
Net::FTPServer
(and a million others[1])

The proposal is to have a hierarchy of OCaml modules, of this sort:

Net
Net.Amazon
Net.BitTorrent
Net.FTPServer
(and a million more)

which doesn't scale. However, using '_' as a separator scales because
distributed, loosely coordinated parties can add new modules ad hoc to
such a namespace.
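A rough sketch of the '_' scheme, with everything in one file for brevity. In practice each module would be its own compilation unit, possibly shipped by a different author; the contents and the name Net_Newpackage are hypothetical:

```ocaml
(* The "core" module, published by one party. *)
module Net = struct
  let default_port = 80
end

(* An add-on that uses Net but lives beside it, not inside it. *)
module Net_Amazon = struct
  let port = Net.default_port
end

(* A third party adds this later without touching or rebuilding Net. *)
module Net_Newpackage = struct
  let port = Net.default_port + 8000
end

let () = Printf.printf "%d %d\n" Net_Amazon.port Net_Newpackage.port
```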

Rich.

[1] http://www.cpan.org/modules/by-module/Net/

--
Richard Jones
Red Hat


Daniel Bünzli

Nov 18, 2008, 2:52:00 PM
to OCaml List

On 18 Nov 2008, at 20:22, Richard Jones wrote:

> The proposal is to have a hierarchy of OCaml modules, of this sort:
>
> Net
> Net.Amazon
> Net.BitTorrent
> Net.FTPServer
> (and a million more)
>
> which doesn't scale.

If there is nothing in the Net module (and ignoring the linking issue)
you can actually achieve that by using -pack. Just redo the pack on
the client whenever it installs a new package in the namespace. No ?
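To make the idea concrete, here is roughly what client code would see after such a pack, written out as an ordinary nested module. The file names in the comment and the module contents are assumptions for illustration only:

```ocaml
(* With separate files amazon.ml and bitTorrent.ml, the pack step
   would look something like:

     ocamlopt -for-pack Net -c amazon.ml bitTorrent.ml
     ocamlopt -pack -o net.cmx amazon.cmx bitTorrent.cmx

   and would have to be rerun whenever a new member is installed.
   The result behaves like this hand-written structure: *)
module Net = struct
  module Amazon = struct let name = "amazon" end
  module BitTorrent = struct let name = "bittorrent" end
end

let () = print_endline Net.BitTorrent.name
```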

Best,

Daniel

Jon Harrop

Nov 18, 2008, 4:40:54 PM
to Nicolas Pouillard, Caml_mailing list
On Tuesday 18 November 2008 17:51:21 Nicolas Pouillard wrote:
> Excerpts from Jon Harrop's message of Tue Nov 18 19:17:23 +0100 2008:

> > # module Array = struct
> > include Array
> > let empty = [||]
> > end;;
> > module Array :
> > sig
> > external length : 'a array -> int = "%array_length"
> > ...
> > val empty : 'a array
> > end
>
> Yes but that's the same as saying you can change a value:
>
> let x = 42
> let x = x + 1
>
> So you make a new module but don't extend it.

In what way is that unsatisfactory?
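One observable consequence of the distinction, as a small sketch (Greeter and Client are hypothetical names): code written against the module before it is rebound keeps using the old definition.

```ocaml
module Greeter = struct
  let greet () = "hello"
end

(* Client code that refers to Greeter at this point. *)
module Client = struct
  let run () = Greeter.greet ()
end

(* A new module that reuses the name; the old one is untouched. *)
module Greeter = struct
  include Greeter
  let greet () = "hi"
end

let () = Printf.printf "%s %s\n" (Client.run ()) (Greeter.greet ())
```

Client.run still reaches the first Greeter, so the "extension" is only visible to code that comes after the rebinding.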

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e


Richard Jones

Nov 18, 2008, 4:50:25 PM
to Daniel Bünzli, OCaml List
On Tue, Nov 18, 2008 at 08:50:51PM +0100, Daniel Bünzli wrote:
> On 18 Nov 2008, at 20:22, Richard Jones wrote:

> >The proposal is to have a hierarchy of OCaml modules, of this sort:
> >
> > Net
> > Net.Amazon
> > Net.BitTorrent
> > Net.FTPServer
> > (and a million more)
> >
> >which doesn't scale.
>
> If there is nothing in the Net module (and ignoring the linking issue)
> you can actually achieve that by using -pack. Just redo the pack on
> the client whenever it installs a new package in the namespace. No ?

No because Net isn't necessarily an empty module, nor does it
magically pull in all the modules underneath it (which would be
impossible because the Net::* space is constantly changing).

Rich.

--
Richard Jones
Red Hat


Alain Frisch

Nov 18, 2008, 5:07:48 PM
to caml...@yquem.inria.fr
On 11/18/2008 7:17 PM, Jon Harrop wrote:
> I don't follow. Can you not use "include" to extend an existing module:
>
> # module Array = struct
> include Array
> let empty = [||]
> end;;
> module Array :
> sig
> external length : 'a array -> int = "%array_length"
> ...
> val empty : 'a array
> end

In addition to being non-modular, this extension scheme does not work
well with hierarchies, as it forces you to mention all the siblings of
the module you want to extend and of each of its ancestors.

E.g. if you start from:

module M = struct
  module M1 = struct
    module M11 = struct ... end
    module M12 = struct ... end
    module M13 = struct ... end
    ...
  end
  module M2 = struct
    ...
  end
  module M3 = struct
    ...
  end
  ...
end

and you want to extend M11, you need to write:

module M' = struct
  module M1 = struct
    module M11 = struct include M.M1.M11 (* extension here *) end
    module M12 = M.M1.M12
    module M13 = M.M1.M13
    ...
  end
  module M2 = M.M2
  module M3 = M.M3
  ...
end
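
A cut-down version of the same pattern that compiles as written. The values x, y, z and double_x are placeholders invented for this sketch:

```ocaml
module M = struct
  module M1 = struct
    module M11 = struct let x = 1 end
    module M12 = struct let y = 2 end
  end
  module M2 = struct let z = 3 end
end

module M' = struct
  module M1 = struct
    module M11 = struct
      include M.M1.M11
      let double_x = 2 * x     (* the extension *)
    end
    module M12 = M.M1.M12      (* sibling, re-exported by hand *)
  end
  module M2 = M.M2             (* sibling of the ancestor, likewise *)
end

let () = Printf.printf "%d %d %d\n" M'.M1.M11.double_x M'.M1.M12.y M'.M2.z
```

Every sibling left out of M' would silently disappear from the extended hierarchy.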


Frankly, I don't think that having a nice and well-organized hierarchy
of modules really matters. Things like having uniform interfaces,
consistent idioms and compatible types across libraries seem much more
important to me. Anyway, if a hierarchy is desired, I fail to see any
advantage of using "." instead of e.g. "_" (easily extensible + does not
force you to link everything).

-- Alain

Jon Harrop

Nov 18, 2008, 5:27:43 PM
to caml...@yquem.inria.fr
On Tuesday 18 November 2008 09:56:18 David Teller wrote:
> Now, we've decided that our current hierarchy is perhaps somewhat clumsy
> and that it may benefit from some reworking. Before we proceed, we'd
> like some feedback from the community...

I only have one major concern: you say "with the large number of modules
involved, we would need a hierarchy of modules" but the number of modules
involved is tiny (a few dozen in OCaml compared to tens or even hundreds of
thousands in any industrial-strength language) because OCaml has very few
libraries. Yet your module hierarchies are already enormous and often require
a longer sequence of modules to reach simple functionality than is required
in a comparatively-huge library like .NET.

To me, the most striking example is printf which is just printf in F#,
Printf.printf in OCaml and is now Text.Printf.printf in OCaml+Batteries.
Surely this is a step in the wrong direction?

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e


Jon Harrop

Nov 18, 2008, 5:46:58 PM
to caml...@yquem.inria.fr
On Tuesday 18 November 2008 22:07:33 Alain Frisch wrote:
> and you want to extend M11, you need to write:
>
> module M' = struct
> module M1 = struct
> module M11 = struct include M.M1.M11 (* extension here *) end
> module M12 = M.M1.M12
> module M13 = M.M1.M13
> ...
> end
> module M2 = M.M2
> module M3 = M.M3
> ...
> end

Ah, yes. Otherwise you get "Multiple definition of the module name ...".

Perhaps that could be solved with extensive Camlp4 hacking to rename the
previous modules (even coming from an "include") to avoid the clash?

> Frankly, I don't think that having a nice and well-organized hierarchy
> of modules really matters. Things like having uniform interfaces,
> consistent idioms and compatible types across libraries seem much more
> important to me.

Indeed. I think the current system would withstand an order of magnitude more
(popular) libraries. I'd also recommend the SML Basis library and F# for
inspiration: they both contain some great designs.

> Anyway, if a hierarchy is desired, I fail to see any advantage of using "."
> instead of e.g. "_" (easily extensible + does not force you to link
> everything).

That brings its own problems, of course. You no longer have a real hierarchy,
so you cannot do anything at a given depth in the hierarchy, e.g. apply
a functor to a mid-level module.
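
As an illustration of that point (the signature NET, the functor Describe and the module contents are hypothetical): with "." the parent is itself a module and can be handed to a functor, whereas flat names such as Net_Amazon and Net_FTPServer have no parent to pass.

```ocaml
module type NET = sig
  module Amazon : sig val name : string end
  module FTPServer : sig val name : string end
end

(* A real hierarchy: Net is a module in its own right. *)
module Net : NET = struct
  module Amazon = struct let name = "Amazon" end
  module FTPServer = struct let name = "FTPServer" end
end

(* A functor that works on the whole mid-level module at once. *)
module Describe (N : NET) = struct
  let names = [ N.Amazon.name; N.FTPServer.name ]
end

module D = Describe (Net)

let () = List.iter print_endline D.names
```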

No doubt people will want both, so we'll end up with an ad hoc mix of "."
and "_" separators. In that case, I'd prefer to flatten every "_" (assuming
names didn't clash).

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e


Alain Frisch

Nov 18, 2008, 6:14:44 PM