[Caml-list] On module distribution

Bünzli Daniel

Jan 15, 2008, 6:20:30 AM
to caml-list
A few months ago, after a discussion on cherry-picking modules in the
reins library, I thought a little about devising a system to
facilitate module sharing: a system to simplify the tedious and
uninteresting actions needed to use and publish modules. At
that time I started a design document for it, but as is expected in
such cases the effort didn't last long.

However, since people will be meeting in Paris in a few weeks to discuss
these things, I thought there might be some ideas to take from this very
rough and incomplete draft of a system that I will never implement. So
FWIW here's a link [1] to this document. In summary, the main ideas
were:

1. A decentralized system: anyone who can publish on a web server can
publish a package. A central authority is a bottleneck to publication.

2. Use Atom feeds [2] as the distribution medium. Atom feeds contain
all the semantic information (authors, contributors, entries, rights,
link to enclosure, labels, etc.) needed to represent a package and its
versions with release notes. In short, a package is a feed and a version is
an entry with an enclosure link to an archive. The only extensions
needed (Atom allows this via XML namespaces) are new link attributes
to describe version dependencies. Packages as feeds make it possible to
follow their evolution with a plain newsfeed reader (which would also
facilitate the maintenance of repositories like the hump). To avoid
angle brackets, package feeds are generated from a tagged plain text
README file.

3. Manage packages per project (vs. per machine) to make project
dependencies explicit. Thus a single command can install the
(OCaml + C stubs only) dependencies of your project on a fresh system.
If your project is a package itself, this facilitates its packaging.

4. Rely on ocamlbuild to do the hard work, roughly in the way
described here [3]. This may be unrealistic for big projects, but on
Unix systems resource consumption could be mitigated by making hard links
to a cache maintained per user or machine (inspired by ideas in this
message [4]).
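
To make idea 2 concrete, here is a minimal sketch of how a client might read a package feed. It is only an illustration under stated assumptions: the feed content, the example.org URLs, and the `pkg:depends` attribute (with its namespace) are hypothetical and are not taken from the design document or from RFC 4287. Only the Atom namespace and the `enclosure` link relation are standard.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
# Hypothetical extension namespace for dependency attributes (an assumption).
PKG = "{http://example.org/ns/pkg}"

# A made-up package feed: one feed per package, one entry per version.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:pkg="http://example.org/ns/pkg">
  <title>mylib</title>
  <id>http://example.org/mylib/feed</id>
  <updated>2008-01-15T06:20:30Z</updated>
  <author><name>Jane Doe</name></author>
  <entry>
    <title>0.2.0</title>
    <id>http://example.org/mylib/0.2.0</id>
    <updated>2008-01-15T06:20:30Z</updated>
    <summary>Release notes for 0.2.0.</summary>
    <link rel="enclosure" href="http://example.org/mylib/mylib-0.2.0.tar.gz"/>
    <link rel="related" href="http://example.org/otherlib/feed"
          pkg:depends="&gt;= 1.0"/>
  </entry>
  <entry>
    <title>0.1.0</title>
    <id>http://example.org/mylib/0.1.0</id>
    <updated>2007-11-02T10:00:00Z</updated>
    <summary>First release.</summary>
    <link rel="enclosure" href="http://example.org/mylib/mylib-0.1.0.tar.gz"/>
  </entry>
</feed>
"""

def versions(feed_xml):
    """Return one dict per entry: version, archive URL, and dependencies."""
    root = ET.fromstring(feed_xml)
    result = []
    for entry in root.findall(ATOM + "entry"):
        archive = None
        depends = []
        for link in entry.findall(ATOM + "link"):
            if link.get("rel") == "enclosure":
                archive = link.get("href")
            constraint = link.get(PKG + "depends")
            if constraint is not None:
                # The link points at the feed of the package depended upon.
                depends.append((link.get("href"), constraint))
        result.append({
            "version": entry.findtext(ATOM + "title"),
            "archive": archive,
            "depends": depends,
        })
    return result

for v in versions(FEED):
    print(v["version"], v["archive"], v["depends"])
```

Because the feed is ordinary Atom, the same file can be read by a plain newsfeed reader, which would simply ignore the extension attribute.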

Best,

Daniel

[1] http://erratique.ch/writings/mod.pdf
[2] http://tools.ietf.org/html/rfc4287
[3] http://brion.inria.fr/gallium/index.php/Working_on_dependent_projects_with_ocamlbuild
[4] http://caml.inria.fr/pub/ml-archives/caml-list/2007/04/ea46e76c646854347ad02dc10862a6ee.fr.html

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Berke Durak

Jan 15, 2008, 8:38:27 AM
to Bünzli Daniel, caml-list
Bünzli Daniel wrote:

> 3. Manage packages per project (vs. per machine) to make project
> dependencies explicit. Thus a single command can install you the (OCaml
> + C stubs only) dependencies of your project on a fresh system. If your
> project is a package itself, it facilitates its packaging .
>
> 4. Rely on ocamlbuild to do the hard work. Grosso modo in the way
> described here [3], which may be unrealistic for big projects, but on
> unices ressource consumption could be mitigated by making hard links to
> a cache maintained per user or machine (inspired by ideas in this
> message [4]).

I'm not a big fan of hardlinking external stuff into the _build directory.

I think we should rather add to ocamlbuild a module for calling
ocamlfind, parsing its output, etc. This way ocamlbuild plugins could
easily call ocamlfind, be it for configuration or compilation.

Pursuing the ocamlbuild philosophy of having a simple solution for
simple problems, a built-in tag syntax and associated rules should allow
such simple (e.g. regular) projects to easily use a package registered
in ocamlfind.

I'm thinking of a tag use_ocamlfind(PROJECTNAME) so that you could write:

<myproject.{byte,native}>: use_ocamlfind(pcre), use_ocamlfind(mysql)

Of course you still need to register those packages in ocamlfind.
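
As a rough illustration of what such a tag would have to expand to, the sketch below turns the proposed tag line into an ocamlfind command line. It is not how ocamlbuild is implemented, and the `use_ocamlfind(...)` tag is only the proposal above. The helper names and the file names `myproject.ml` and `myproject.native` are assumptions made for the example. The `-package` and `-linkpkg` options are ocamlfind's own.

```python
import re

# Matches the proposed (hypothetical) tag, capturing the package name.
TAG = re.compile(r"use_ocamlfind\(([^)]+)\)")

def packages_from_tags(line):
    """Return the package names named by use_ocamlfind(...) tags, in order."""
    _, _, tags = line.partition(":")
    return TAG.findall(tags)

def link_command(compiler, packages, source, output):
    """Build an ocamlfind argument list; compiler is "ocamlc" or "ocamlopt"."""
    cmd = ["ocamlfind", compiler]
    if packages:
        cmd += ["-package", ",".join(packages), "-linkpkg"]
    return cmd + [source, "-o", output]

line = "<myproject.{byte,native}>: use_ocamlfind(pcre), use_ocamlfind(mysql)"
pkgs = packages_from_tags(line)
print(" ".join(link_command("ocamlopt", pkgs, "myproject.ml", "myproject.native")))
```

A real plugin would compile and link each module separately. The point here is only that the tag carries enough information to derive the ocamlfind flags.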

As OCaml binaries are brittle, a solution for compiling from source such
as GODI is welcome.

However, GODI needs to be kept up to date with respect to the OCaml
distribution... it is currently only available for 3.09!
--
Berke DURAK

Gerd Stolpmann

Jan 15, 2008, 9:24:53 AM
to Berke Durak, Bünzli Daniel, caml-list
> As Ocaml binaries are brittle, a solution for compiling from source such
> as Godi is welcome.
>
> However Godi needs to be kept up-to-date with respect to the Ocaml
> distribution... it is currently only available for 3.09!

Sorry, but this is not true. You can use GODI with OCaml 3.10 by passing
"-section 3.10" to the bootstrap script.

What is true is that GODI for OCaml 3.10 has not yet been publicly announced.
The reason is that a few libraries have not been kept up to date. In
particular, there are still libraries using camlp4 that are not
available for 3.10. So we simply cannot recommend blindly upgrading to
3.10 yet.

It is unclear how we will proceed. Maybe we will drop some libraries.

Gerd
--
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
ge...@gerd-stolpmann.de http://www.gerd-stolpmann.de
Phone: +49-6151-153855 Fax: +49-6151-997714
------------------------------------------------------------

Sylvain Le Gall

Jan 15, 2008, 10:08:19 AM
to caml...@inria.fr
On 15-01-2008, Bünzli Daniel <daniel....@erratique.ch> wrote:
> A few month ago, after a discussion on cherry-picking modules in the
> reins library I thought a little bit about devising a system to
> facilitate module sharing. A system to simplify the tedious and
> uninteresting actions needed to be able to use and publish modules. At
> that time I started a design document for it, but as is expected in
> such cases the effort didn't last long.
>
> However since people will be meeting in Paris in a few week to discuss
> these things I thought there may be some ideas to take in this very
> rough and incomplete draft of a system that I will never implement. So
> FWIW here's a link [1] to this document. In summary the main ideas
> were :
>
> 1. A decentralized system, anyone who can publish on a web server can
> publish a package. A central authority is a bottleneck to publication.
>

Unfortunately, a decentralized system also has several drawbacks:
* the initial specification of how to take part in the decentralized system
must be precise and complete enough that it never needs updating --
decentralized systems always need a clear "contract" to collaborate.
This part is far from trivial if you are not 100% sure of what you
want to build and if you have never done it before (it is just
like designing a network protocol).
* you need to provide a backup for each node of your system. Otherwise,
every node becomes a point of failure. This is critical: let's
consider a package A that build-depends on packages B, C and
D. With a centralized system your "download" point of failure is the
central location: either it is up or it is down. With a decentralized
approach your "download" points of failure are the locations of A,
B, C and D. You have to find a way to circumvent this problem...
* automatic builds and various checks are more easily done in a
central repository (because everything is in the same place)
* hijacking (taking over) modules is more easily done in a central
repository. This point is important because OSS developers tend to go
Missing In Action.
* ...
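
The point-of-failure argument can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that each host is reachable 99% of the time and that hosts fail independently. Neither the figure nor the independence assumption comes from this thread.

```python
def all_reachable(availability, hosts):
    """Probability that `hosts` independent hosts are all up at the same time."""
    return availability ** hosts

p = 0.99  # assumed per-host availability (illustrative only)
central = all_reachable(p, 1)        # one central repository
decentralized = all_reachable(p, 4)  # A, B, C and D on four separate hosts
print(f"central: {central:.4f}, decentralized: {decentralized:.4f}")
```

Under these assumptions the chance of a complete download drops from 0.99 to roughly 0.96 with four hosts, and it keeps shrinking as the dependency list grows.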

In fact, Debian users reading this will see that I am making the same
sort of arguments that Debian makes about other distributions.
Debian has developed a very centralized repository for all its packages,
which other Linux distributions have not done. This tends to give
more control over the QA of everything, which is better to my mind.

> 2. Use atom feeds [2] as the distribution medium. Atom feeds contain
> all the semantic information (authors, contributors, entries, rights,
> link to enclosure, labels etc.) needed to represent a package and its
> versions with release notes. Shortly a package is a feed, a version is
> an entry with an enclosure link to an archive. The only extensions
> needed (Atom allows this via xml name spaces) are new link attributes
> to describe version dependencies. Packages as feeds allow to follow
> their evolution with a plain newsfeed reader (which would also
> facilitates the maintenance of repositories like the hump). To avoid
> angle brackets, package feeds are generated from a tagged plain text
> README file.
>

You should have a look at DOAP
http://usefulinc.com/doap/

> 3. Manage packages per project (vs. per machine) to make project
> dependencies explicit. Thus a single command can install you the
> (OCaml + C stubs only) dependencies of your project on a fresh system.
> If your project is a package itself, it facilitates its packaging .
>

I don't agree: project and package are not the same thing. You should
take into consideration that different distributions have different
packaging policies. Trying to have the same packaging policy for every
distribution is not feasible (well, in fact it is possible, but it is
very long-term work -- something like the Grand Unification Theory).

Regards,
Sylvain Le Gall

David Thomas

Jan 15, 2008, 1:46:23 PM
to caml-list
One thing that might well be worth consideration is
hooks between whatever module system we devise and
various platform specific package management systems.
I've no idea what form these should take, but it seems
to me there's a bit too much SEP in this space.



Bünzli Daniel

Jan 15, 2008, 3:41:52 PM
to caml-list
On 15 Jan 08 at 14:38, Berke Durak wrote:

> I think we should rather add to Ocamlbuild a module for calling
> ocamlfind, parsing its output, etc. This way ocamlbuild plugins
> could easily call ocamlfind, be it for configuration or compilation.

My problem with ocamlfind is that it takes too much control over me.
Also it doesn't help you with the tedious publishing aspect (which I
try to mitigate by using news feeds) and it won't help you with the
binary update problem.

On 15 Jan 08 at 16:07, Sylvain Le Gall wrote:

> Unfortunately, a decentralized system has also several drawbacks:

[...]

Yes, of course. But the point is that we already have a decentralized
system: all these tarballs that are referenced from the hump and are not
part of GODI. My aim is to be able to quickly install or publish such
decentralized bits. Currently these two tasks take too much time:
using them, because everyone does it their own way; publishing them,
because you have to devise your own way (make a readme, think about
how to structure the tarball, how to manage releases, announce on the
mailing list, etc.). The idea is to simplify all this uninteresting
business to entice people to share their modules. Lowering the bar may
mean a decrease in quality, but in the end good modules and reliable
publishers will be identified by the community.

Also note that the proposal in itself doesn't prevent the development
of a more authoritative, centralized and stable source of packages.

> In fact, Debian user reading this will see that i am having the same
> sort of arguments that Debian has concerning the other distributions.
> Debian has developped a very centric repository for all its packages
> which other Linux distribution have not done. This tends to lead to
> have
> more control on the QA of everything. Which is better to my mind.

If the aim is to support an operating system I completely agree with
you. But the aim of my proposal is to support the OCaml development
bazaar, which is not the same thing.

>> 3. Manage packages per project (vs. per machine) to make project
>> dependencies explicit. Thus a single command can install you the
>> (OCaml + C stubs only) dependencies of your project on a fresh
>> system.
>> If your project is a package itself, it facilitates its packaging .
>>
>
> I don't agree project and package are not the same thing. You should
> take into consideration that different distribution have different
> packaging policy.

That's not what I said. The _if_ of the last sentence is for when you
are developing an OCaml library with dependencies; in that case your
project may become a package. If you are making an end-user
application, this should not be used as a distribution mechanism; I
explicitly say that in the proposal. It is a tool for OCaml
_developers_. But still, from a developer's perspective, it is a good
thing to have a mechanical way to track the external dependencies of
your project, whether it is an end-user application or not; hence
packages should be (conceptually) managed per project.
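
As a small sketch of what "a mechanical way to track the external dependencies" could look like, the code below computes an install order from a per-project dependency table. The package names and the `INDEX` table are hypothetical, and the use of Python's `graphlib` is just one convenient way to do a topological sort. The proposal itself does not prescribe an algorithm.

```python
from graphlib import TopologicalSorter

# Hypothetical index: each package maps to the packages it directly needs.
INDEX = {
    "myproject": {"pcre", "netclient"},
    "netclient": {"pcre", "xmlparser"},
    "pcre": set(),
    "xmlparser": set(),
}

def install_order(project, index):
    """Return the project's transitive dependencies, dependencies first."""
    needed = {}
    stack = [project]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed[name] = index[name]
            stack.extend(index[name])
    # static_order() yields a node only after all the nodes it depends on.
    order = list(TopologicalSorter(needed).static_order())
    order.remove(project)
    return order

print(install_order("myproject", INDEX))
```

With the dependencies recorded per project, a single command on a fresh system only has to walk this list and fetch each package in turn.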

Best,

Daniel

Will Farr

Jan 15, 2008, 3:56:41 PM
to caml-list
You might take a look at the PLaneT (http://planet.plt-scheme.org/)
system for PLT Scheme. It's a centralized repository, so not directly
applicable if you stick with your current idea, but they handle the
issues of interdependence and ease of publishing with *extreme*
elegance (from my point of view---I've published a few PLaneT
packages), and I think they have a paper discussing some of the design
issues they've come across. (The paper is, in fact, here:
http://scheme2006.cs.uchicago.edu/04-matthews.pdf .)

Good luck, and have fun at the conference!

Will

On Jan 15, 2008 3:41 PM, Bünzli Daniel <daniel....@erratique.ch> wrote:
> Yes of course. But the point is that we already have a decentralized
> system. All these tarballs that are referenced from the hump and not
> part of godi. My aim is to be able to quickly install or publish such
> decentralized bits. Currently these two tasks take too much time:
> using them, because everyone does it its own way, publishing them,
> because you have to devise your own way (make a readme, think about
> how to structure the tarball how to manage releases, announce on the
> mailing list, etc.). The idea is to simplify all this uninteresting
> business to entice people to share their modules. Lowering the bar may
> mean a decrease in quality but in the end good modules and reliable
> publishers will be identified by the community.


Vlad Skvortsov

Jan 15, 2008, 3:56:44 PM
to Bünzli Daniel, caml-list
Bünzli Daniel wrote:
> On 15 Jan 08 at 14:38, Berke Durak wrote:

>
>> I think we should rather add to Ocamlbuild a module for calling
>> ocamlfind, parsing its output, etc. This way ocamlbuild plugins could
>> easily call ocamlfind, be it for configuration or compilation.
>
> My problem with ocamlfind is that it takes too much control over me.
> Also it doesn't help you with the tedious publishing aspect (which I
> try to mitigate by using news feeds) and it won't help you with the
> binary update problem.
>
> On 15 Jan 08 at 16:07, Sylvain Le Gall wrote:

>
>> Unfortunately, a decentralized system has also several drawbacks:
> [...]
>
> Yes of course. But the point is that we already have a decentralized
> system. All these tarballs that are referenced from the hump and not
> part of godi. My aim is to be able to quickly install or publish such
> decentralized bits. Currently these two tasks take too much time:
> using them, because everyone does it its own way, publishing them,
> because you have to devise your own way (make a readme, think about
> how to structure the tarball how to manage releases, announce on the
> mailing list, etc.). The idea is to simplify all this uninteresting
> business to entice people to share their modules. Lowering the bar may
> mean a decrease in quality but in the end good modules and reliable
> publishers will be identified by the community.
>
> Also note that the proposal in itself doesn't prevent the development
> of a more authoritative, centralized and stable source of packages.

I've just started with OCaml, and my immediate perception was that
modules and libraries are quite hard to find. This is due to several
factors, one being that the web interface for Hump doesn't allow complex
searches and stuff, doesn't offer RSS to keep track of updates, etc.

Did you guys look at how this problem is solved in Python? There is a
standard library module which allows one to write package metadata in
a "minilanguage". Once the metadata is there, it's possible to use
it for packaging (again, through the standard utility) and/or to
update the centralized package index (pypi.python.org). It is as easy as:

# Create metadata, listing the files included in the package and some
# mandatory fields like version, homepage, author, keywords, etc.
$ vi setup.py

# Create a tarball for source distribution
$ python setup.py sdist

# ...or a binary distribution
$ python setup.py bdist

# Now tell the world we have a new package here
$ python setup.py register

More information here:
http://docs.python.org/dist/setup-script.html

My 0.02.

--
Vlad Skvortsov, v...@73rus.com, http://vss.73rus.com

Sylvain Le Gall

Jan 15, 2008, 4:27:25 PM
to caml...@inria.fr
On 15-01-2008, Bünzli Daniel <daniel....@erratique.ch> wrote:

I think one of the best ways to manage the bazaar is to follow the track
of some other major language (let's say Perl) that has implemented a
standard way to publish projects with a good naming scheme et al. (let's
say CPAN).

My point is that, unfortunately, managing the OCaml bazaar requires more
standard procedures and knowledge from each member of the community. You
will never reach 100% compliance for every bit of the hump. If you
get something like 10% it will already be something great (I really
think so).

As in Debian, you need some required knowledge before you can begin
publishing good projects. You cannot speed up this step.

BUT once this standard common knowledge leads to uniform
packaging, you will be able to perform the second task: using them
quickly...

FYI, I recommend that you browse a little around Perl/CPAN; it is
a great piece of work on a centralized module publishing system:
http://pause.perl.org

An extract, to show you what required basic knowledge means:
[quote]
Your duties, the basics, traps

We trust that you have read the perlmodinstall, perlmodlib,
perlmodstyle, and perlnewmod manpages and that you regularly check out
uploads to CPAN and that you have been watching CPAN activities for a
while to have an impression of how things fit together. It usually boils
down to (slogan shamelessly stolen and adapted from sudo(1)):

1. Think, better even talk before you upload
2. Respect the namespace of others
[/quote]

In particular, perlmodstyle, perlmodlib and perlmodinstall are really a
good way to understand years of publishing...

>>> 3. Manage packages per project (vs. per machine) to make project
>>> dependencies explicit. Thus a single command can install you the
>>> (OCaml + C stubs only) dependencies of your project on a fresh
>>> system.
>>> If your project is a package itself, it facilitates its packaging .
>>>
>>
>> I don't agree project and package are not the same thing. You should
>> take into consideration that different distribution have different
>> packaging policy.
>
> That's not what I say. The _if_ of the last sentence is for when you
> are developing an ocaml library with dependencies in that case your
> project may become a package. If you are making an end-user
> application this should not be used as a distribution mechanism, I
> explicitly say that in the proposal, it is a tool for ocaml
> _developers_. But still from a developer perspective it is a good
> thing to have a mechanical way to track the external dependencies of
> your project whether this is an end-user application or not, hence
> packages should be (conceptually) managed per project.
>

OK, sorry, I did not understand what you had written.

Regards,
Sylvain Le Gall

Jonathan Bryant

Jan 15, 2008, 7:11:30 PM
to caml...@yquem.inria.fr
Oops. Forgot to CC the list...

---------- Forwarded message ----------
From: Jonathan Bryant <waterso...@gmail.com>
Date: Jan 15, 2008 7:10 PM
Subject: Re: [Caml-list] Re: On module distribution
To: Sylvain Le Gall <syl...@le-gall.net>


On Jan 15, 2008 10:07 AM, Sylvain Le Gall <syl...@le-gall.net> wrote:

>
> Unfortunately, a decentralized system has also several drawbacks:
>

[...]

> * you need to provide a backup foreach node of your system. Otherwise,
> every node will become a point of failure. This is critical: lets
> consider you have a package A that build depends on package B, C and
> D. With a centralized system you "download" point of failure is the
> central location, either it is up or down. With a decentralized
> approach your "download" point of failure will be the location of A,
> B, C and D. You have to find a way to circumvent this problem...
>

> [...]


Why not take a Bittorrent style approach to decentralization? Given that a
package format is agreed upon, you can download a small file that has basic
info such as an MD5 sum. Every person has a P2P-style client that caches
any packages they've downloaded, and when you download a new package, pulls
from everybody who has a copy with the same MD5 sum, and after it is
downloaded it is offered for redistribution. Updates could simply search
for all users who are offering the old version and alert them that there is
a new version. Any dependencies will be offered at least as much as the
packages that depend on them.

That eliminates the single point of failure at least.
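
A minimal sketch of the checksum-addressed cache described above, with the networking left out: "peers" are simulated as other local cache directories, and all class and function names are hypothetical. MD5 is used only because it is what the message mentions. It is no longer considered collision-resistant, so a real system would want a stronger hash.

```python
import hashlib
import os
import tempfile

def md5_hex(data):
    """Hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()

class PackageCache:
    """Content-addressed store: archives are filed under their MD5 digest."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def path(self, digest):
        return os.path.join(self.root, digest)

    def has(self, digest):
        return os.path.exists(self.path(digest))

    def add(self, expected_digest, data):
        # Refuse anything whose content does not match the advertised digest.
        if md5_hex(data) != expected_digest:
            raise ValueError("checksum mismatch, refusing to cache")
        with open(self.path(expected_digest), "wb") as f:
            f.write(data)

    def get(self, digest):
        with open(self.path(digest), "rb") as f:
            return f.read()

def fetch(digest, own_cache, peers):
    """Return the archive from our cache, or from the first peer whose copy verifies."""
    if own_cache.has(digest):
        return own_cache.get(digest)
    for peer in peers:
        if peer.has(digest):
            data = peer.get(digest)
            try:
                own_cache.add(digest, data)  # cached copies can be re-offered
            except ValueError:
                continue  # corrupted copy: try the next peer
            return data
    raise LookupError("no peer offers " + digest)

with tempfile.TemporaryDirectory() as tmp:
    archive = b"pretend this is mylib-0.2.0.tar.gz"
    digest = md5_hex(archive)
    peer = PackageCache(os.path.join(tmp, "peer"))
    peer.add(digest, archive)
    mine = PackageCache(os.path.join(tmp, "mine"))
    assert fetch(digest, mine, [peer]) == archive
    assert mine.has(digest)
```

Since every copy is verified against the digest from the small descriptor file, it does not matter which peer the bytes come from, which is what removes the dependence on any single host.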

--Jonathan

Jonathan Bryant

Jan 16, 2008, 12:26:38 AM
to caml...@yquem.inria.fr
I forgot to CC the list, and then I sent it from the wrong email address.
Sorry for multiple posts...


Maxence Guesdon

Jan 16, 2008, 5:17:34 AM
to caml...@yquem.inria.fr
On Tue, 15 Jan 2008 12:56:13 -0800
Vlad Skvortsov <v...@73rus.com> wrote:

> Bünzli Daniel wrote:
> > On 15 Jan 08 at 14:38, Berke Durak wrote:


> >
> >> I think we should rather add to Ocamlbuild a module for calling
> >> ocamlfind, parsing its output, etc. This way ocamlbuild plugins could
> >> easily call ocamlfind, be it for configuration or compilation.
> >
> > My problem with ocamlfind is that it takes too much control over me.
> > Also it doesn't help you with the tedious publishing aspect (which I
> > try to mitigate by using news feeds) and it won't help you with the
> > binary update problem.
> >

> > On 15 Jan 08 at 16:07, Sylvain Le Gall wrote:


> >
> >> Unfortunately, a decentralized system has also several drawbacks:
> > [...]
> >
> > Yes of course. But the point is that we already have a decentralized
> > system. All these tarballs that are referenced from the hump and not
> > part of godi. My aim is to be able to quickly install or publish such
> > decentralized bits. Currently these two tasks take too much time:
> > using them, because everyone does it its own way, publishing them,
> > because you have to devise your own way (make a readme, think about
> > how to structure the tarball how to manage releases, announce on the
> > mailing list, etc.). The idea is to simplify all this uninteresting
> > business to entice people to share their modules. Lowering the bar may
> > mean a decrease in quality but in the end good modules and reliable
> > publishers will be identified by the community.
> >
> > Also note that the proposal in itself doesn't prevent the development
> > of a more authoritative, centralized and stable source of packages.
>
> I've just started with OCaml, and my immediate perception was that
> modules and libraries are quite hard to find. This is due to several
> factors, one being that the web interface for Hump doesn't allow complex
> searches and stuff, doesn't offer RSS to keep track of updates, etc.

There is an RSS channel on this page:
http://caml.inria.fr/resources/index.en.html
in the hump box on the right.
The url of the channel is:
http://caml.inria.fr/hump.rss

Regards,

Maxence Guesdon
