For the moment, we have considered point 1 not very important and point
2 a little more. There are several reasons to disregard point 1. Among
these, clarity of origin (as in "is this module endorsed by Batteries or
not?") and documentation issues (as in "gosh, this module pretends to be
part of [Data] but I can't find the documentation anywhere in the
documentation of Batteries, wtf?").
Do you believe that we should have chosen otherwise?
Cheers,
David
On Tue, 2008-11-18 at 10:06 +0000, Richard Jones wrote:
> Your biggest problem is using dot ('.') instead of underscore ('_').
> Using a dot means that the System namespace cannot be extended by
> external packages. If you use an underscore then an external package
> can extend the namespace (eg. by providing System_Newpackage)
>
> Rich.
>
--
David Teller-Rajchenbach
Security of Distributed Systems
http://www.univ-orleans.fr/lifo/Members/David.Teller
Angry researcher: French Universities need reforms, but the LRU act brings liquidations.
_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
On Tue, 2008-11-18 at 11:21 +0100, Zheng Li wrote:
> > Your biggest problem is using dot ('.') instead of underscore ('_').
> > Using a dot means that the System namespace cannot be extended by
> > external packages. If you use an underscore then an external package
> > can extend the namespace (eg. by providing System_Newpackage)
>
> And, doesn't that force all sub modules to be linked into the final
> executables even if we only use one of them?
> For example I prefer using the least amount of opening of modules,
> to make it easier to see where the values come from
Same here. This is why I'm a little bit sceptical about this hierarchy.
With the current standard library if I suddenly want to use
Int32.of_int, I know I just need to type Int32.of_int in my source.
With your proposal I need to remember that it is in Data.Numeric and
go at the beginning of my file to open it or write
Data.Numeric.Int32.of_int, to me this brings bureaucracy without any
benefit. And lack of bureaucracy is one of the reasons I like ocaml
(and dislike java for example).
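To make the trade-off concrete, here is a minimal sketch of the ways one could reach the same function. The nested [Data.Numeric] module below is only a local stand-in for the proposed hierarchy (an assumption for illustration, not the actual Batteries layout); it simply re-exports the standard [Int32].

```ocaml
(* Hypothetical stand-in for the proposed hierarchy: it only re-exports
   the standard library's Int32 under a deeper path. *)
module Data = struct
  module Numeric = struct
    module Int32 = Int32
  end
end

(* 1. Flat access, as with the current standard library. *)
let flat = Int32.of_int 42

(* 2. Fully qualified access through the hierarchy. *)
let qualified = Data.Numeric.Int32.of_int 42

(* 3. A local open keeps the path short without a file-wide open. *)
let local_open = let open Data.Numeric in Int32.of_int 42

(* 4. A one-line alias, which still leaves a prefix to grep for. *)
module I32 = Data.Numeric.Int32
let aliased = I32.of_int 42
```

All four bindings denote the same value; the difference is only in how much of the path has to be remembered and typed.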
Besides, hierarchies are limited in their descriptive power anyway, and
one day you'll find something that fits in two places. Rope is
already an example, being both Data.Persistent and Data.Text.
Thus my proposal would be to _present_ them as a hierarchy (but even
here a means to tag/browse the modules with/by keywords would do a
better job) but keep the actual module structure of Batteries as flat
as possible, everything just under the toplevel Batteries. When I code
I really don't want to have to think about all these open directives
that essentially bring nothing.
Best,
Daniel
I use modules in the same way, mostly to be able to grep Int32.of_int in my
code when needed (as grepping for of_int only would make the result less
precise).
> Thus my proposal would be to _present_ them as a hierarchy (but even here a
> means to tag/browse the modules with/by keywords would do a better job) but
> keep the actual module structure of Batteries as flat as possible,
> everything just under the toplevel Batteries. When I code I really don't
> want to have to think about all these open directives that essentially bring
> nothing.
>
A tag system for modules is a good idea, and I would like to add that type
search for functions (which is already done by ocamlbrowser) is also nice.
--
Thomas
That's correct, there are plenty of modules which could fit in different
places. For the moment, we decided that every module should appear only
in one place. However, we could easily change this -- in fact, to allow
this, we only need to alter our documentation generator.
> Thus my proposal would be to _present_ them as a hierarchy (but even
> here a means to tag/browse the modules with/by keywords would do a
> better job) but keep the actual module structure of Batteries as flat
> as possible, everything just under the toplevel Batteries. When I code
> I really don't want to have to think about all these open directives
> that essentially bring nothing.
Browsing by keywords sounds like an interesting idea. I'm adding this to
our TODO list. Of course, the next step will be to actually add these
keywords and that's going to be much longer if we intend to tag all
values.
However, we disagree on the necessity of a hierarchy. There are two good
reasons why the base library of OCaml doesn't have a hierarchy (almost):
it's small and there are almost no redundancies between modules. Neither
is true for Batteries.
For an example of this redundancy, consider threads. For the moment, we
have five thread-related modules: [Threads], [Mutex], [RMutex],
[Condition] and [Event]. These modules, which are essentially the same
modules as those of the base library, are all submodules of
[Control.Concurrency.Threads]. Now, I personally like
[Control.Concurrency] but I agree that this is debatable. The reason why
we group these modules into [Threads] is because sooner or later, we
are going to have four or five other thread-related modules called
[Threads], [Mutex], [Condition], [Event] and perhaps [RMutex]. These
modules will get into [Control.Concurrency.CoThreads]. They won't
replace the first batch, they will exist side-by-side. Of course, we
could trim the hierarchy and remove [Control.Concurrency] -- trimming
the hierarchy is the main reason for launching this thread,
incidentally. But, to keep things ordered, we will still need modules
[Threads.Threads], [Threads.Mutex], [Threads.RMutex]...
[CoThreads.Threads], [CoThreads.Mutex]... and, well, that's a hierarchy
already.
coThreads is not an exceptional case, mind you. We may end up with two
definitions of [Graphics], several data structures with the same name
but different purposes, etc.
There's also the issue of labels and other partial redefinitions of
modules. The OCaml base library defines [Array]/[ArrayLabels],
[List]/[ListLabels], [Map]/[MoreLabels.Map] etc. In Batteries
Included, we define [Array], [Array.Labels], [List], [List.Labels],
which clutters the list of modules less and makes for something more
consistent, especially since [FooLabels] is not the only kind of "module
[Foo] with a variant": we also have [Array.ExceptionLess], for
operations without exceptions, and [Array.Cap] for read-only/write-only
arrays. Other variants may still appear.
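A minimal sketch of this "module with variants as submodules" layout follows. It is a hypothetical miniature, not the Batteries API: the choice of [map] and [get] is only illustrative, and it assumes a compiler recent enough to expose the standard library as [Stdlib].

```ocaml
(* Hypothetical miniature of the layout described above; the function
   selection is illustrative, not the Batteries API. *)
module Array = struct
  (* Everything the standard Array offers stays available. *)
  include Stdlib.Array

  (* Variant with labelled arguments. *)
  module Labels = struct
    let map ~f a = Stdlib.Array.map f a
  end

  (* Variant returning options instead of raising exceptions. *)
  module ExceptionLess = struct
    let get a i =
      if i >= 0 && i < Stdlib.Array.length a
      then Some (Stdlib.Array.get a i)
      else None
  end
end

let sample = [| 10; 20; 30 |]
let incremented = Array.Labels.map ~f:succ sample
let in_bounds = Array.ExceptionLess.get sample 1
let out_of_bounds = Array.ExceptionLess.get sample 7
```

Only one name, [Array], appears in the top-level list of modules; each variant is reached through it.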
Do you see any better way of managing the complexity of all this?
Cheers,
David
Easy - look at CPAN[1]. If you want to scale a project you have to
make decisions that allow a distributed network of people to
cooperate, without needing too much central coordination. CPAN is a
great example of this loose coupling because packages make their own
decision about naming (albeit they can become "official" later - but
they won't need to rename unless there is an actual naming conflict).
If the problem is documentation or provenance of packages, then add a
mechanism to solve that problem. Perl also solves this through an
existing, lightweight, distributed mechanism (a standard location to
install man-pages, and a standard man-page format and man-page
generating mechanism -- POD).
Rich.
--
Richard Jones
Red Hat
I'm still not getting where the benefit of having this hierarchy is,
except that it adds a Java-like complexity and will create
hard-to-manage churn if a module ever moves.
API changes are handled really badly in OCaml, ironically because of
the lack of a textual preprocessor. You can't just write this every
time lablgtk / calendar / latest culprit decides to change their API:
#if LABLGTK < 210
  let icon = GMisc.image () in
  icon#set_stock icon_type ~size:size;
  icon
#else
  let icon = GMisc.image () in
  icon#set_stock `DIALOG_ERROR;
  icon#set_icon_size `DIALOG;
  icon
#endif
(Well, you can run -pp cpp, but that breaks other stuff)
Rich.
I forgot to answer that part.
In Batteries, for the moment, we decided to keep the module names of the
base library as shortcuts to our new modules. Consequently, you can
still write your [Int32.of_int] in addition to our new [Int32.print],
etc. The old modules are still available as submodules of [Legacy], if
needed.
Should you wish to flatten the complete hierarchy, assuming that it's
possible and that there are no collisions on names, that's also
something which you can do quite easily. We even provide some syntactic
sugar for this. It's just the matter of writing a file my_batteries.ml
along the lines of
module Array = Data.Mutable.Array
module List = Data.Persistent.List
..
module PosixThreads = Control.Concurrency.Threads.Threads
module PosixMutex = Control.Concurrency.Threads.Mutex
module CoThreads = Control.Concurrency.CoThreads.Threads
..
module ArrayExn = Data.Mutable.Array include ExceptionLess (* syntactic sugar *)
module ArrayLabels = Data.Mutable.Array include Labels
module ArrayCapExn = Data.Mutable.Array.Cap include ExceptionLess
module ArrayCapLabels= Data.Mutable.Array.Cap include Labels
..
I personally don't like the name [ArrayCapLabels] but I can't think of any
better name to represent this once we have removed any hierarchy.
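For readers without the Batteries syntax extension, the flattening can be sketched in plain OCaml. The [Data.Mutable.Array] module below is a hypothetical stand-in, and the expansion shown for the "include" sugar is an assumption about its meaning (two successive includes), not a description of what the extension actually generates.

```ocaml
(* Hypothetical miniature hierarchy, standing in for Batteries. *)
module Data = struct
  module Mutable = struct
    module Array = struct
      include Stdlib.Array
      module Labels = struct
        let iter ~f a = Stdlib.Array.iter f a
      end
    end
  end
end

(* What a flattening file such as my_batteries.ml could contain. *)
module Array = Data.Mutable.Array

(* Assumed plain-OCaml reading of
     module ArrayLabels = Data.Mutable.Array include Labels
   The labelled functions shadow the unlabelled ones of the same name. *)
module ArrayLabels = struct
  include Data.Mutable.Array
  include Data.Mutable.Array.Labels
end

(* The flattened module offers both the labelled iter and the rest. *)
let total =
  let acc = ref 0 in
  ArrayLabels.iter ~f:(fun x -> acc := !acc + x) [| 1; 2; 3 |];
  !acc
```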
I personally prefer the hierarchy but, once again, the majority may
disagree. So if you believe this is better, the next logical step would
be to design a full and consistent list of modules including all the
modules which already appear in the current version of Batteries, and
with some space left for OCamlnet, OCamlnae, Reins, Camomile, ULex,
Camlp4, CoThreads and a few others. I truly mean it, if you can provide
us with something you consider more comfortable and as future-proof, we
may adopt it.
Cheers,
David
You really should check.
I have not yet looked too much into the source, but if the
batteries_core.ml is one of them to be referenced anyway, I'm afraid all
modules (not just parents/siblings) will be linked.
Try to compile the following source into executable:
----
open Batteries.Data.Persistent.List
let _ = iter
----
You will end up being asked for numerous unrelated modules during
the linking phase, or you can use the recommended "ocamlfind
batteries/ocamlc" shortcut. Either way, the resulting executable will be
more than 50 times bigger (i.e. over 1M for the 2 lines) than one using
the standard List.
--
Zheng
> On Tue, 2008-11-18 at 11:21 +0100, Zheng Li wrote:
>>> Your biggest problem is using dot ('.') instead of underscore ('_').
>>> Using a dot means that the System namespace cannot be extended by
>>> external packages. If you use an underscore then an external package
>>> can extend the namespace (eg. by providing System_Newpackage)
>> And, doesn't that force all sub modules to be linked into the final
>> executables even if we only use one of them?
>
On Tue, 2008-11-18 at 12:28 +0000, Benedikt Grundmann wrote:
> > Do you see any better way of managing the complexity of all this?
> Yes don't introduce it at all, make a decision to use or not use labels
> and stick with it. Similarly make a decision to use or not use exceptions
> as the "default", suffix / rename alternative functions as appropriate
> (consistently). Consistency is a big win. Not only as it speeds you up
> when you read/modify other people's code it also reduces the amount
> of decisions you have to do when writing new code.
>
> http://ocaml.janestreet.com/?q=node/28
>
> Cheers,
>
> Bene
Interesting point. So far, the approach of Batteries has certainly been
different, in large part because we don't want everything to end up part
of the Batteries hierarchy (or, well, lack thereof). Of course, this is
in contradiction with our sometimes imperialistic tendencies, so we may
be guilty of schizophrenia.
Perhaps we should organise a poll on this subject.
> If the problem is documentation or provenance of packages, then add a
> mechanism to solve that problem. Perl also solves this through an
> existing, lightweight, distributed mechanism (a standard location to
> install man-pages, and a standard man-page format and man-page
> generating mechanism -- POD).
I'm not sure the man-page format quite scales up to the kind of
hyperlinked complexity we have in Batteries for the moment. But yes, I
agree, we can certainly work something out. In fact, we could say that
we've started on this track, albeit perhaps not with such grand
ambitions.
Thanks for the idea,
David
P.S.: I've pointedly ignored your pitch on POD :) In my mind, that's a
very different topic. For the moment, we'll stick with ocamldoc.
Side-note: That's certainly something we could add to Batteries, if
needed. Camlp4 is pretty-much necessary to use Batteries anyway and
Camlp4 already defines IFDEF, INCLUDE, etc. We would just need to
complete that DSL perhaps to accept any valid OCaml expression and call
the ocaml interpreter to evaluate these expressions.
Cheers,
David
> But, to keep things ordered, we will still need modules
> [Threads.Threads], [Threads.Mutex], [Threads.RMutex]...
> [CoThreads.Threads], [CoThreads.Mutex]... and, well, that's a
> hierarchy
> already.
If you include in batteries an external package that has its own
hierarchy and is designed to be opened I don't mind having that
hierarchy. In that case you can just add the new toplevel entry
CoThreads. And if I want to use CoThreads, I just open CoThreads, not
Control.Concurrency.CoThreads. Just try to keep it as flat as
possible; don't try to force modules into an ad-hoc hierarchical
taxonomy just to sort them. I don't care if the toplevel list
of modules is three hundred pages long if there is an efficient means
to access their documentation (like tags). I do however care a lot if
it becomes bureaucratic to be able to _use_ a module in my code.
On 18 Nov 2008, at 13:22, Richard Jones wrote:
> Easy - look at CPAN[1]. If you want to scale a project you have to
> make decisions that allow a distributed network of people to
> cooperate, without needing too much central coordination.
But (unfortunately, sorry to repeat that) Batteries is not a CPAN like
initiative. It aims at giving a library of modules/syntax extensions
selected by the library maintainers, as such it is inherently
centralized and I don't think that questions (1) or (2) are actually
pertinent for the project.
Best,
Daniel
> I personally prefer the hierarchy but, once again, the majority
> may disagree. So if you believe this is better, the next logical
> step would be to design a full and consistent list of modules
> including all the modules which already appear in the current
> version of Batteries, and with some space left for OCamlnet,
> OCamlnae, Reins, Camomile, ULex, Camlp4, CoThreads and a few
> others. I truly mean it, if you can provide us with something
> you consider more comfortable and as future-proof, we may adopt it.
Paraphrasing Einstein, I think the hierarchy should be as flat
as possible, but no flatter. For example, I see no reason to
materialise in the hierarchy the separation between persistent
and mutable data structures. That should be a documentation
issue. However, and as you noted, there are cases where some
hierarchisation may remove namespace clutter and allow for
better code reuse.
Cheers,
Dario Teixeira
No, it is not the case.
The only automatic mechanism for code pruning is at the level of
individual modules embedded in a library. As soon as you pack, you
obtain a monolithic module which can only be linked as a whole.
-- Alain
Duly noted. As you may see on our candidate replacement hierarchy, we
intend to merge Data.Persistent and Data.Mutable into Data.Containers.
Whether we flatten further remains open to debate.
Thanks,
David
More generally, I would like to advertise a bit more the proposed
*replacement* hierarchy reported at the bottom of David's blog post
[1]; do a text search for "One possible replacement" and start reading
from there.
Several problems with the current hierarchy which have been pointed
out in this thread were noticed by us as well, and are already,
at least partly, solved by the proposed new hierarchy.
Cheers.
[1] http://dutherenverseauborddelatable.wordpress.com/2008/11/18/batteries-hierarchy/
--
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
zack@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
Dietro un grande uomo c'è ..| . |. Et ne m'en veux pas si je te tutoie
sempre uno zaino ...........| ..: |.... Je dis tu à tous ceux que j'aime
So back to square 1 on this argument.
Thanks Alain & Zheng
On Tue, 2008-11-18 at 15:10 +0100, Alain Frisch wrote:
> David Teller wrote:
> > I thought the linker only linked in symbols which were actually used?
>
> No, it is not the case.
>
> The only automatic mechanism for code pruning is at the level of
> individual modules embedded in a library. As soon as you pack, you
> obtain a monolithic module which can only be linked as a whole.
>
> -- Alain
>
>
--
David Teller-Rajchenbach
I concur that tags make a considerable difference.
But let us return to threads for one second. There is a very good reason
to have two distinct modules [Threads] and [CoThreads] with 4-5
submodules each: functors. Assuming [Threads] and [CoThreads] implement
the same interface -- which they do -- I can write a module which takes
as argument either [Threads], [CoThreads] or [WhateverThreads] and
produces a pseudo-concurrent/truly concurrent/whatever implementation of
an algorithm. The same thing could apply to latin-1 strings vs. Unicode
strings (this is essentially what happens in Camomile).
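The functor argument can be sketched as follows. This is a toy model under stated assumptions: the [THREADS] signature and the two implementations [Eager] and [Deferred] are hypothetical stand-ins for [Threads] and [CoThreads] (neither creates real threads), chosen only so that the example runs without any threading library.

```ocaml
(* Hypothetical common interface shared by the implementations. *)
module type THREADS = sig
  val spawn : (unit -> unit) -> unit
  val wait_all : unit -> unit
end

(* Stand-in for one implementation: runs each task immediately. *)
module Eager : THREADS = struct
  let spawn f = f ()
  let wait_all () = ()
end

(* Stand-in for another implementation: queues tasks, runs them later. *)
module Deferred : THREADS = struct
  let pending : (unit -> unit) Queue.t = Queue.create ()
  let spawn f = Queue.add f pending
  let wait_all () =
    while not (Queue.is_empty pending) do
      (Queue.pop pending) ()
    done
end

(* The algorithm is written once, against the interface only. *)
module Sum (T : THREADS) = struct
  let sum xs =
    let total = ref 0 in
    List.iter (fun x -> T.spawn (fun () -> total := !total + x)) xs;
    T.wait_all ();
    !total
end

module EagerSum = Sum (Eager)
module DeferredSum = Sum (Deferred)
```

For the functor to be applicable, each implementation has to exist as one module value with the agreed interface, which is what grouping the 4-5 submodules under [Threads] and [CoThreads] provides.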
Now, there are certainly several possibilities.
Here's one which doesn't involve a deep hierarchy:
* [Thread], [Mutex], [Condition], [Event] remain top-level modules
* [Threads] is also a top-level module, which contains aliases to
[Thread], [Mutex], [Condition], [Event]
* [CoThreads] is also a top-level module, which contains its own
implementations of [Thread], [Mutex], [Condition], [Event]
We could do the same for strings
* [String], [Char], [Rope], [UChar] remain top-level modules
* we introduce a new module [Strings] containing [String] and [Char]
* we introduce another new module [UStrings] containing an alias
[String] to [Rope] and an alias [Char] to [UChar]
And for numbers
* [Float], [Int], [SafeInt], [BigInt] and hypothetical [SafeFloat] and
[BigFloat] (don't ask me what a BigFloat is supposed to be) remain
top-level modules
* we introduce a new module [Numeric] containing [Float] and [Int]
* we introduce a new module [SafeNumeric] containing [SafeFloat] aliased
as [Float], [SafeInt] aliased as [Int]
* we introduce a new module [BigNumeric] containing [BigFloat] aliased
as [Float], [BigInt] aliased as [Int]
etc.
To me, this seems like the only way to combine no hierarchy and
modularity. However, I have the nasty feeling that this is going to end
up messy, cluttered and otherwise both unmaintainable and unusable
(despite tags).
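The numeric variant of this scheme could look like the sketch below. It is only an illustration under assumptions: [SafeInt] here is a hypothetical overflow-checking module written for the example, the [NUMERIC] signature is invented, and [Int] is the standard library module of that name (available in OCaml 4.08 and later).

```ocaml
(* Hypothetical top-level module: addition that refuses to overflow. *)
module SafeInt = struct
  type t = int
  exception Overflow
  let add a b =
    let s = a + b in
    (* Overflow happened iff both operands share a sign the sum lacks. *)
    if (a >= 0) = (b >= 0) && (s >= 0) <> (a >= 0)
    then raise Overflow
    else s
end

(* Grouping modules that only contain aliases. *)
module Numeric = struct module Int = Int end
module SafeNumeric = struct module Int = SafeInt end

(* Invented interface that both groups satisfy. *)
module type NUMERIC = sig
  module Int : sig
    type t = int
    val add : t -> t -> t
  end
end

(* Client code picks a flavour of arithmetic by picking a group. *)
module Sum (N : NUMERIC) = struct
  let sum xs = List.fold_left N.Int.add 0 xs
end

module PlainSum = Sum (Numeric)
module SafeSum = Sum (SafeNumeric)
```

The top-level list now holds [Int], [SafeInt], [Numeric] and [SafeNumeric] for a single concept, which gives a feel for the clutter mentioned above.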
>
> On 18 Nov 2008, at 13:22, Richard Jones wrote:
>
> > Easy - look at CPAN[1]. If you want to scale a project you have to
> > make decisions that allow a distributed network of people to
> > cooperate, without needing too much central coordination.
>
> But (unfortunately, sorry to repeat that) Batteries is not a CPAN like
> initiative. It aims at giving a library of modules/syntax extensions
> selected by the library maintainers, as such it is inherently
> centralized and I don't think that questions (1) or (2) are actually
> pertinent for the project.
No, we're not CPAN. If someone wishes to build a CPAN, please feel free
to do it. That may actually be easier to do once Batteries 1.0 has
landed. However, Richard's remark remains interesting. So perhaps
redesigning Batteries to have an open namespace structure is a good
idea.
Cheers,
David
I've used POD selectively even in OCaml projects, mainly because it is
by far the easiest way to generate man pages. OCamldoc is great for
developer documentation (APIs etc) but POD is super-simple for making
manual pages.
cf man page:
http://hg.et.redhat.com/virt/applications/virt-top--devel/?f=5b38082d8aa4;file=virt-top/virt-top.pod
vs ocamldoc documentation:
http://hg.et.redhat.com/virt/applications/ocaml-libvirt--devel/?f=893899664388;file=libvirt/libvirt.mli
One place where POD really stands out, and could be replicated by
camlp4, is for standalone programs that combine argument parsing,
usage and man page all in one place. In many cases you can keep the
option parsing, implementation of the option, and documentation for
the option right next to each other.
http://perldoc.perl.org/Getopt/Long.html#Documentation-and-help-texts
Rich.
You're missing the point which is scalability - how to deal with
distributed parties who are loosely coordinated. The above scheme
allows one person to extend the Array module, but not two people,
unless they coordinate with each other about which order they extend
it (or both have incompatible extensions).
If the library creator did not use functors or classes to make their design
reusable then the only solution for the user is to include all of the
implementations they require:
module Array = struct
  include RichardsArray
  include JonsArray
end
Given the lack of libraries available for OCaml anyway, this seems like a very
minor concern to me.
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
You're talking about something completely different.
In Perl they have:
Net
Net::Amazon
Net::BitTorrent
Net::FTPServer
(and a million others[1])
The proposal is to have a hierarchy of OCaml modules, of this sort:
Net
Net.Amazon
Net.BitTorrent
Net.FTPServer
(and a million more)
which doesn't scale. However, using '_' as a separator scales because
distributed, loosely coordinated parties can add new modules ad hoc to
such a namespace.
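A minimal sketch of the difference, in a single file for convenience. All module names and values below are hypothetical; in practice each '_'-separated module would be its own compilation unit (e.g. a file net_amazon.ml) shipped by its own package.

```ocaml
(* Hypothetical library code: a closed "dot" namespace. Once this module
   is compiled, nobody outside can add a Net.Amazon submodule to it. *)
module Net = struct
  module Ftp_server = struct
    let default_port = 21
  end
end

(* A third party can only build an extended copy under a new name. *)
module Net_extended = struct
  include Net
  module Amazon = struct
    let host = "example.invalid" (* placeholder value *)
  end
end

(* With '_' as separator, each package ships its own top-level module
   and no existing module needs to be touched or rebuilt. *)
module Net_amazon = struct
  let host = "example.invalid" (* placeholder value *)
end

module Net_bittorrent = struct
  let default_port = 6881 (* illustrative value *)
end
```

The price of the second style is that [Net_amazon] and [Net_bittorrent] are related by naming convention only; there is no [Net] module value grouping them.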
Rich.
[1] http://www.cpan.org/modules/by-module/Net/
> The proposal is to have a hierarchy of OCaml modules, of this sort:
>
> Net
> Net.Amazon
> Net.BitTorrent
> Net.FTPServer
> (and a million more)
>
> which doesn't scale.
If there is nothing in the Net module (and ignoring the linking issue)
you can actually achieve that by using -pack. Just redo the pack on
the client whenever it installs a new package in the namespace. No ?
Best,
Daniel
In what way is that unsatisfactory?
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
No because Net isn't necessarily an empty module, nor does it
magically pull in all the modules underneath it (which would be
impossible because the Net::* space is constantly changing).
Rich.
In addition to this being non-modular, this extension scheme does not
work well with a hierarchy, as it forces you to mention all the siblings
of the ancestors of the module you want to extend.
E.g. if you start from:
module M = struct
  module M1 = struct
    module M11 = struct ... end
    module M12 = struct ... end
    module M13 = struct ... end
    ...
  end
  module M2 = struct
    ...
  end
  module M3 = struct
    ...
  end
  ...
end
and you want to extend M11, you need to write:
module M' = struct
  module M1 = struct
    module M11 = struct include M.M1.M11 (* extension here *) end
    module M12 = M.M1.M12
    module M13 = M.M1.M13
    ...
  end
  module M2 = M.M2
  module M3 = M.M3
  ...
end
Frankly, I don't think that having a nice and well-organized hierarchy
of modules really matters. Things like having uniform interfaces,
consistent idioms and compatible types across libraries seem much more
important to me. Anyway, if a hierarchy is desired, I fail to see any
advantage of using "." instead of e.g. "_" (easily extensible + does not
force you to link everything).
-- Alain
I only have one major concern: you say "with the large number of modules
involved, we would need a hierarchy of modules" but the number of modules
involved is tiny (a few dozen in OCaml compared to tens or even hundreds of
thousands in any industrial-strength language) because OCaml has very few
libraries. Yet your module hierarchies are already enormous and often require
a longer sequence of modules to reach simple functionality than is required
in a comparatively-huge library like .NET.
To me, the most striking example is printf which is just printf in F#,
Printf.printf in OCaml and is now Text.Printf.printf in OCaml+Batteries.
Surely this is a step in the wrong direction?
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
Ah, yes. Otherwise you get "Multiple definition of the module name ...".
Perhaps that could be solved with extensive Camlp4 hacking to rename the
previous modules (even coming from an "include") to avoid the clash?
> Frankly, I don't think that having a nice and well-organized hierarchy
> of modules really matters. Things like having uniform interfaces,
> consistent idioms and compatible types across libraries seem much more
> important to me.
Indeed. I think the current system would withstand an order of magnitude more
(popular) libraries. I'd also recommend the SML Basis library and F# for
inspiration: they both contain some great designs.
> Anyway, if a hierarchy is desired, I fail to see any advantage of using "."
> instead of e.g. "_" (easily extensible + does not force you to link
> everything).
That brings its own problems, of course. You no longer have a real hierarchy
so you cannot do anything at a given depth in the hierarchy, e.g. pass a
mid-level module to a functor.
No doubt people will want both so we'll end up with an ad-hoc mix of "."
and "_" separators. In that case, I'd prefer to flatten every "_" (assuming
names didn't clash).
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.