New Contrib Guidance?

Sean Corfield

Sep 8, 2011, 01:19:42

to cloju...@googlegroups.com

As Stuart (S) recently said "it's hard for people to A) find
libraries, B) figure out how to use them, and C) know what the latest
version is."

One of the biggest issues we have for 1.3.0 is adoption and a large
part of that in my opinion is shifting people from contrib 1.2.0 to
the new contrib libraries. So Stuart's right that people need to find
the libraries (or be told about them) etc.

In order to solve that problem, I think we need clear communication
(from Clojure/core) about the libraries - which means we first need
consensus on a number of problems and how to solve them:

On discoverability, Clojure is available as a download, and Contrib
used to be available as a download. That's not practical for 30+
individual contrib libraries. That means we have to acknowledge build
tools and we somehow need to tell people which libraries are
available, how to get them into your projects and how to figure out
what versions are available (seems like that should be done
automatically from Maven Central?).

On guidelines for contrib library developers...

For a given new contrib library, how should users expect it to relate
to its stated old contrib counterpart?

e.g., clojure.tools.cli is indicated as a replacement for
clojure.contrib.command-line but it is a completely different library.

For a given old contrib library that has been migrated to new contrib,
how should "dropped" features be documented?

e.g., clojure.core.incubator is claimed to have "contrib.def stuff"
but defnk has not been migrated (because Clojure now supports map
destructuring).
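
For anyone who has not seen the replacement idiom, here is a minimal sketch of keyword arguments with defaults using rest-argument map destructuring, which is roughly the job defnk did. The function name `connect` and its parameters are made up for illustration and do not come from any contrib library.

```clojure
;; Hypothetical function, for illustration only.
;; Rest-argument map destructuring (Clojure 1.2+) gives keyword
;; arguments with defaults, without the defnk macro.
(defn connect
  [host & {:keys [port timeout]
           :or   {port 80 timeout 1000}}]
  {:host host :port port :timeout timeout})

(connect "example.org")
;; => {:host "example.org", :port 80, :timeout 1000}

(connect "example.org" :port 8080)
;; => {:host "example.org", :port 8080, :timeout 1000}
```

Callers migrating from defnk would still need to know that the macro itself is gone, which is the documentation gap in question.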

In the past, users expected contrib to have a version number that
lined up with Clojure. In the new world, contrib libraries move
forward at their own pace but what expectations should users have
around versions? Should all new contrib libraries be aiming for 1.0.0
releases to indicate stability? Saying "it's all semantic versioning"
is fine but, really, what will a user think when confronted with
data.finger-tree 0.0.1 and core.logic 0.6.3? (and everything in
between)

There was some discussion that a "first version" of a new contrib
library should be identical to the contrib 1.2.0 version (except for
the namespace) but that's clearly not happening. Should the version
number give any indication of compatibility with old contrib?

What really are the criteria for a library becoming a new contrib candidate?

e.g., clojure.data.csv is cljcsv which seems to be used far less
widely than clojure-csv (which was offered up as a contrib library in
the past) - cljcsv has had no discussion on the list (and only one
thread on the Clojure list mentioned it at all, compared to several
dozen threads mentioning clojure-csv).

e.g., what about clj-http? This seems to be very heavily used but
hasn't been updated since February (and there are active forks with a
lot of updates since). Clearly there are the usual concerns about
copyright and the CA but this seems like a library just crying out for
a place in new contrib.

Finally, are there areas that we feel need a new contrib library to
address, in order to improve the overall richness of the Clojure
platform? I mentioned clj-http above but a nice date/time library
seems like a good candidate too. Should we maintain a list of themes /
areas needing attention?

There are probably other issues not listed here so please pile in!
--
Sean A Corfield -- (904) 302-SEAN
An Architect's View -- http://corfield.org/
World Singles, LLC. -- http://worldsingles.com/
Railo Technologies, Inc. -- http://www.getrailo.com/

"Perfection is the enemy of the good."
-- Gustave Flaubert, French realist novelist (1821-1880)

Meikel Brandmeyer

Sep 8, 2011, 02:23:47

to cloju...@googlegroups.com

Hi,

On Thursday, September 8, 2011 07:19:42 UTC+2, Sean Corfield wrote:

> On discoverability, Clojure is available as a download, and Contrib
> used to be available as a download. That's not practical for 30+
> individual contrib libraries. That means we have to acknowledge build
> tools and we somehow need to tell people which libraries are
> available, how to get them into your projects and how to figure out
> what versions are available (seems like that should be done
> automatically from Maven Central?).

Looking for some library mostly leads me to Google or mvnrepository:

http://mvnrepository.com/search.html?query=org.clojure

The published version should be trivial to find once you know where to look. So a site like mvnrepository should probably be pointed out to people having trouble with that.

> What really are the criteria for a library becoming a new contrib candidate?

I've long given up on that one.

Sincerely
Meikel

Konrad Hinsen

Sep 8, 2011, 04:04:59

to cloju...@googlegroups.com

On 8 sept. 11, at 07:19, Sean Corfield wrote:

> On discoverability, Clojure is available as a download, and Contrib
> used to be available as a download. That's not practical for 30+
> individual contrib libraries. That means we have to acknowledge build
> tools and we somehow need to tell people which libraries are
> available, how to get them into your projects and how to figure out
> what versions are available (seems like that should be done
> automatically from Maven Central?).

Is there any reason not to provide a single jar containing all of
Contrib, like we did before? Contrib was split up into parts in order
to permit more fine-grained dependency handling, but not to enforce
it. Especially for newcomers, a single jar with all of Contrib, or
even a "batteries included" Clojure+Contrib jar, would be much easier
to handle.

However, this does require a decision about the following point:

> In the past, users expected contrib to have a version number that
> lined up with Clojure. In the new world, contrib libraries move
> forward at their own pace but what expectations should users have
> around versions?

I agree that a common policy would help users, but I don't have a good
idea for what would be a good and maintainable scheme.

> What really are the criteria for a library becoming a new contrib
> candidate?

My best guess is that there aren't any. It comes down to a kind of
vote. Still, I don't see this as a major problem for the moment.

Konrad.

Jonas Enlund

Sep 8, 2011, 06:47:27

to cloju...@googlegroups.com

On Thursday, September 8, 2011 8:19:42 AM UTC+3, Sean Corfield wrote:

> What really are the criteria for a library becoming a new contrib candidate?
> e.g., clojure.data.csv is cljcsv which seems to be used far less
> widely than clojure-csv (which was offered up as a contrib library in
> the past) - cljcsv has had no discussion on the list (and only one
> thread on the Clojure list mentioned it at all, compared to several
> dozen threads mentioning clojure-csv).

I was asked by Clojure/core if I was willing to contribute cljcsv to contrib, which of course I was.

I don't know why cljcsv was chosen instead of clojure-csv but I think one factor is how easy it is to find all the contributors and ensure that everyone has signed the CA. One would also hope that the quality and usefulness of the library is investigated. For cljcsv, now data.csv, I have focused a lot on performance. At https://gist.github.com/1203122 is a gist comparing the two libraries. I hope I have used clojure-csv as it is intended.

/Jonas

Herwig Hochleitner

Sep 8, 2011, 07:40:43

to cloju...@googlegroups.com

2011/9/8 Konrad Hinsen <konrad...@fastmail.net>:

> Especially for newcomers, a single jar with all of Contrib, or even a "batteries included" Clojure+Contrib jar, would be much easier to handle.

+1

Also for rapid prototyping. One doesn't want to bother with
fine-grained dependencies, before knowing the code will actually be
used.

> However, this does require a decision about the following point:
>
>> In the past, users expected contrib to have a version number that
>> lined up with Clojure. In the new world, contrib libraries move
>> forward at their own pace but what expectations should users have
>> around versions?
>
> I agree that a common policy would help users, but I don't have a good idea
> for what would be a good and maintainable scheme.

How about an org.clojure/batteries artefact whose version number is
kept in sync with the org.clojure/clojure version?
It would depend on the most recent stable versions (at the time of
release) of the contrib projects.

It could also be curated, to only include the more mature projects
under the org.clojure umbrella.

kind regards
--
__________________________________________________________________
Herwig Hochleitner

Hugo Duncan

Sep 8, 2011, 08:06:59

to cloju...@googlegroups.com

On Thu, 08 Sep 2011 01:19:42 -0400, Sean Corfield <seanco...@gmail.com>
wrote:

> One of the biggest issues we have for 1.3.0 is adoption and a large
> part of that in my opinion is shifting people from contrib 1.2.0 to
> the new contrib libraries. So Stuart's right that people need to find
> the libraries (or be told about them) etc.

One thing that would greatly simplify conversion to 1.3.0 would be an
"old-contrib" compatibility lib, containing all the old contrib code that
hasn't made it into new contrib, or that has made it into 1.3 core.
Obviously this would be a fair amount of work to set up and make work with
1.3.0, but if driving 1.3.0 adoption is a priority, this would, I think,
definitely facilitate the process for users.

--
Hugo Duncan

Alex Miller

Sep 8, 2011, 08:58:38

to cloju...@googlegroups.com

If you assume that the majority of users are using a
Maven-dependency-aware build system (Maven, Leiningen, etc) then I
think it would be an excellent idea to create an
org.clojure/contrib-all project that did not contain any code itself
but was merely a pom that depended on all of the latest stable
versions of the various new contrib projects.

That way you would only need to put two deps in your (for example)
Leiningen project:

    :dependencies [[org.clojure/clojure "1.3.0"]
                   [org.clojure/contrib-all "1.3.0"]]

And you would get all of the libs via dependency. You *could* even
name this project in an identical fashion to the old contrib
(clojure-contrib). I go back and forth on whether that would be
better or worse for helping users understand what's happening.

I imagine that this umbrella project would probably rev frequently
(this could even be made automatic when the contrib projects release
new versions).
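
As a rough sketch of what such an umbrella might look like if it were built with Leiningen rather than a hand-written pom: "contrib-all" is only the name proposed in this thread, not a published artifact, and the two versions shown are simply the ones quoted earlier in this thread. A real umbrella would list every contrib library at its latest stable release.

```clojure
;; project.clj for a hypothetical, code-free umbrella artifact.
;; "contrib-all" is a proposal from this thread, not something on Maven Central.
(defproject org.clojure/contrib-all "1.3.0"
  :description "Umbrella artifact: no code, only dependencies on contrib libraries."
  :dependencies [[org.clojure/data.finger-tree "0.0.1"]
                 [org.clojure/core.logic "0.6.3"]])
```

A project depending on this one artifact would pull in the listed libraries transitively, which is the whole point of the proposal.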

Alex

> --
> You received this message because you are subscribed to the Google Groups "Clojure Dev" group.
> To post to this group, send email to cloju...@googlegroups.com.
> To unsubscribe from this group, send email to clojure-dev...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/clojure-dev?hl=en.
>
>

Stuart Sierra

Sep 8, 2011, 09:00:42

to cloju...@googlegroups.com

"Is there any reason not to provide a single jar containing all of Contrib, like we did before?"

No, just time needed to do it.

-S

David Santiago

Sep 9, 2011, 03:09:26

to cloju...@googlegroups.com

You know, it's interesting. I've gotten a lot of requests for enhancements to clojure-csv over the years. Mostly people seem to want increased configurability so they can slide some csv-like format through by changing the parameters (commas to tabs, etc), and I've had some requests for certain strictness checks. Which I guess is pretty unsurprising, as Python and Perl have similar options in their CSV libraries. I've never gotten performance as a request, though.

Still, I have been aware of some ways that clojure-csv has suboptimal code for performance, and have been meaning to check them out for a while now. I went ahead and played around with it today in my spare moments this afternoon.

The first thing is, you aren't really calling parse-csv the way it's intended to be called. It takes a CharSequence, but if you use the reader to char-seq adapter, you're creating a list of chars out of a file. Loading the file into memory with slurp and passing that in as a string really improves the speed without changing any code. When I tried your benchmark that way, clojure-csv was only about 4x slower for me.

Still, reading ridiculously large csv files isn't a use case that I've given much thought to, and being able to stream the data in from a reader would be nice, so I converted the code to work off of a reader instead of a char seq (plus a few other tweaks here and there), and your benchmark program turns in best times of 26409msecs for data.csv vs. 37734msecs for clojure-csv (apparently my laptop is less awesome than yours).

So anyways, I bring all this up just to point out that had there been a public discussion about adding csv support to contrib, we could have had an interesting discussion about how to best serve users' needs and how far each of the projects was from the goals. As it happens, I signed a CA last summer, anticipating a proposal for contrib after 1.3 was out (it was my understanding, though perhaps I have not paid close enough attention here, that new contrib projects were on hold until 1.3 was done). On the other hand, it's been interesting to hear about a shortcoming that I wasn't aware was turning people off, and now I can go back to my users with some big speedups.

David


Jonas Enlund

Sep 9, 2011, 04:38:28

to cloju...@googlegroups.com

Hi

On Friday, September 9, 2011 10:09:26 AM UTC+3, David Santiago wrote:

> The first thing is, you aren't really calling parse-csv the way it's intended to be called. It takes a CharSequence, but if you use the reader to char-seq adapter, you're creating a list of chars out of a file. Loading the file into memory with slurp and passing that in as a string really improves the speed without changing any code. When I tried your benchmark that way, clojure-csv was only about 4x slower for me.

Sorry about that. I looked at http://corfield.org/blog/post.cfm/parsing-powermta-accounting-files for guidance.

> Still, reading ridiculously large csv files isn't a use case that I've given much thought to, and being able to stream the data in from a reader would be nice, so I converted the code to work off of a reader instead of a char seq (plus a few other tweaks here and there), and your benchmark program turns in best times of 26409msecs for data.csv vs. 37734msecs for clojure-csv (apparently my laptop is less awesome than yours).

In my experience, large csv files are not uncommon so streaming through the data without consuming too much memory is important.
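
To make the memory point concrete, here is a small sketch that walks a reader line by line instead of slurping the whole input into one string. It only counts lines and does no CSV parsing at all (no quoting, no embedded newlines); the helper name `count-lines-streaming` is made up for this example and is not part of data.csv or clojure-csv.

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical helper for illustration; it counts lines, it does not parse CSV.
(defn count-lines-streaming
  "Counts the lines in anything io/reader accepts, without
  building one big string of the whole input first."
  [source]
  (with-open [r (io/reader source)]
    ;; line-seq is lazy and reduce does not hold on to its head,
    ;; so lines already counted can be garbage collected.
    (reduce (fn [n _] (inc n)) 0 (line-seq r))))

(count-lines-streaming (java.io.StringReader. "a,b\nc,d\n"))
;; => 2
```

The slurp-based equivalent would have to hold the entire file in memory before counting anything, which is what hurts with very large files.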

> So anyways, I bring all this up just to point out that had there been a public discussion about adding csv support to contrib, we could have had an interesting discussion about how to best serve users' needs and how far each of the projects was from the goals.

I agree. Maybe we could work together to improve data.csv? It would be nice to walk through the feature-set of clojure-csv to see what data.csv lacks. If you were given commit access to the data.csv repo we could both enhance and maintain the code together with ideas from clojure-csv moved over to data.csv. Maybe someone from Clojure/core can make a design page on http://dev.clojure.org, then much of the decision making and design concerning the project would be public.

Cheers,

Jonas

Stuart Halloway

Sep 9, 2011, 09:01:42

to cloju...@googlegroups.com

> So anyways, I bring all this up just to point out that had there been a public discussion about adding csv support to contrib, we could have had an interesting discussion about how to best serve users' needs and how far each of the projects was from the goals. As it happens, I signed a CA last summer, anticipating a proposal for contrib after 1.3 was out (it was my understanding, though perhaps I have not paid close enough attention here, that new contrib projects were on hold until 1.3 was done). On the other hand, it's been interesting to hear about a shortcoming that I wasn't aware was turning people off, and now I can go back to my users with some big speedups.

Contrib projects are definitely not on hold! A key objective of being modular is to decouple lifecycle decisions across projects.

If there is any information floating around suggesting contrib is on hold, please point it out and we will try to fix it.

Stuart Halloway
Clojure/core
http://clojure.com

Sean Corfield

Sep 9, 2011, 13:01:46

to cloju...@googlegroups.com

On Fri, Sep 9, 2011 at 1:38 AM, Jonas Enlund <jonas....@gmail.com> wrote:
> On Friday, September 9, 2011 10:09:26 AM UTC+3, David Santiago wrote:
>> The first thing is, you aren't really calling parse-csv the way it's
>> intended to be called. It takes a CharSequence, but if you use the reader to
>> char-seq adapter, you're creating a list of chars out of a file. Loading the
>> file into memory with slurp and passing that in as a string really improves
>> the speed without changing any code. When I tried your benchmark that way,
>> clojure-csv was only about 4x slower for me.
>
> Sorry about that. I looked at
> http://corfield.org/blog/post.cfm/parsing-powermta-accounting-files for
> guidance.

I'm handling 200MB files there with half a million lines so I took the
simple approach and it performed well enough for my needs. I very
specifically did not want to load the whole file into memory.

As David notes, performance wasn't a problem here. Flexibility might have been.

The issue, from my point of view, is that there was ZERO discussion on
creating data.csv and of the two CSV libraries, the one that is hardly
used has been promoted over the more heavily used and feature-rich
library:

http://clojuresphere.herokuapp.com/cljcsv
http://clojuresphere.herokuapp.com/data.csv

http://clojuresphere.herokuapp.com/clojure-csv

>> Still, reading ridiculously large csv files isn't a use case that I've
>> given much thought to, and being able to stream the data in from a reader
>> would be nice, so I converted the code to work off of a reader instead of a
>> char seq (plus a few other tweaks here and there), and your benchmark
>> program turns in best times of 26409msecs for data.csv vs. 37734msecs for
>> clojure-csv (apparently my laptop is less awesome than yours).

Sounds like that new version would be a good upgrade for us at World Singles.

> I agree. Maybe we could work together to improve data.csv? It would be nice
> to walk through the feature-set of clojure-csv to see what data.csv lacks.
> If you were given commit access to the data.csv repo we could both enhance
> and maintain the code together with ideas from clojure-csv moved over to
> data.csv. Maybe someone from Clojure/core can make a design page on
> http://dev.clojure.org, then much of the decision making and design
> concerning the project would be public.

Hopefully this situation works out well, but it really does point to a
fundamental flaw in how contrib is being managed that could easily
have been solved by some public discussion on the selection of the
library.

pmbauer

Sep 9, 2011, 19:03:24

to cloju...@googlegroups.com

+1 for an umbrella artifact with a version # tied to a clojure release #.

One consequence of decoupling the contrib version numbers from Clojure proper is that it will not be immediately evident which versions of Clojure are compatible with a particular version of an individual contrib library.

If Clojure 1.4 follows the trend of breaking changes in 1.3, this issue will compound.

Sean Corfield

Sep 9, 2011, 20:15:42

to cloju...@googlegroups.com

On Fri, Sep 9, 2011 at 4:03 PM, pmbauer <paul.mich...@gmail.com> wrote:
> One consequence of decoupling the contrib version numbers from Clojure
> proper is that it will not be immediately evident which versions of
> Clojure are compatible with a particular version of an individual contrib
> library.

That begs another question for which contrib maintainers need
guidance: right now new contrib must run on Clojure 1.2 and Clojure
1.3 - what does the future compatibility path look like? Will new
contrib always be able to run on Clojure 1.2? If not, what is
acceptable in terms of backward compatibility and how should
maintainers indicate this (or is it a build system responsibility to
tag compatibility somehow for each build?).

Stuart Sierra

Sep 10, 2011, 15:42:05

to cloju...@googlegroups.com

> That begs another question for which contrib maintainers need
> guidance: right now new contrib must run on Clojure 1.2 and Clojure
> 1.3 - what does the future compatibility path look like? Will new
> contrib always be able to run on Clojure 1.2?

No formal decision has been made. I configured build.clojure.org to test all contrib libraries on 1.2.0, 1.2.1, and the latest 1.3.0-* releases, on the assumption that those are the most widely-deployed versions of Clojure.

I would like all contrib libraries to continue to support widely-deployed versions, but that certainly doesn't mean 1.2.0 forever. Maintaining compatibility with 1.2.x and 1.3.x is fairly easy because there is not much feature-difference between the two. As new features come into the language, library maintainers will have to balance the advantages of new language features versus breaking compatibility with older releases. There may be a policy for this in the future, but for now use your best judgment.
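
One way a maintainer might handle that trade-off inside a single library is to check the running Clojure version at load time and pick an implementation accordingly. This is only a sketch: `clojure-at-least?` and `implementation` are hypothetical names, not part of any contrib library or agreed policy; `*clojure-version*` itself is a standard var in clojure.core.

```clojure
;; Hypothetical helper, not part of any contrib library.
(defn clojure-at-least?
  "True if the running Clojure is at least version major.minor."
  [major minor]
  (let [{cur-major :major cur-minor :minor} *clojure-version*]
    (or (> cur-major major)
        (and (= cur-major major) (>= cur-minor minor)))))

;; Decide once, when the namespace loads, which code path to use.
(def implementation
  (if (clojure-at-least? 1 3)
    :new-api     ; may rely on features added in 1.3
    :fallback))  ; sticks to what 1.2.x provides
```

This keeps one released jar working across the versions tested on build.clojure.org, at the cost of carrying two code paths.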

-Stuart Sierra
clojure.com

Stuart Sierra

Sep 10, 2011, 15:44:45

to cloju...@googlegroups.com

> One consequence of decoupling the contrib version numbers from Clojure proper is that it will not be immediately evident which versions of Clojure are compatible with a particular version of an individual contrib library.

I'm hoping we can automate this. The "matrix builds" in Hudson are a start, see http://build.clojure.org/job/core.logic-test-matrix/ for an example. It may also be possible for libraries to declare which versions of Clojure they support.

-Stuart Sierra
clojure.com
