1. What do we want to achieve?
2. What arrangements (social, technical, etc.) best allow us to achieve these goals?
I'll take this opportunity to say just a few words in response to each question. First, in terms of desired outcomes, I would like to see the development of the server code accelerate in known directions (e.g. continue to build out the API) as well as to see the unexpected ingenuity that this injection of fresh talent will undoubtedly produce. Frank Bennett's amazing work to build a new citation processing engine, now folded into the Zotero client trunk, is a great example of what we might achieve.
Second, in terms of a social and technical infrastructure designed to encourage such development, I'm inclined to offer just a minimum of suggestions until we all have the chance to hear what the members of the Zotero development community think. For starters, we might ask whether this mailing list can expand its scope to encompass discussions of the server code. Given its relatively low traffic, I would say probably yes. Even if that's the case, however, the Zotero development team directed by the Center for History and New Media won't be able to provide technical support. Nonetheless, we're putting together some extremely basic documentation that should be enough to get the most technically-savvy among you started, and this documentation can of course be fleshed out by the community as time goes on.
Looking forward to getting your feedback.
One of the "unexpected" areas that I'd like to contribute to is an
alternative framework for translators. At the moment it's just a
little toy in my mind. The idea is to host the translators on the
server-side, then send HTML pages over to the server to get parsed,
and return some JSON or YAML. The reason why this would be worthwhile
is that it's a single point of failure that would get updated
immediately if the translator/parser gets out of date. Translator
so that the translators don't have to run on a browser in a VM behind
the server (blah). If anywhere there was opportunity to get this
rolled out, it's zotero.org :-).
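A minimal sketch of the server-side step described above: the client sends a page's HTML and gets JSON back. It is written in Python purely for illustration. The names `MetaTagCollector` and `translate_html` are hypothetical, and it assumes the page carries `citation_*` meta tags. A real translator would need per-site logic, and the HTTP endpoint around this function is left out.

```python
import json
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects <title> text and <meta name="citation_*"> values from a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = attr.get("name") or ""
            content = attr.get("content")
            if name.startswith("citation_") and content:
                # Repeated tags (e.g. several authors) accumulate in a list.
                key = name[len("citation_"):]
                self.fields.setdefault(key, []).append(content)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def translate_html(html: str) -> str:
    """Parse a page the client sent over and return item metadata as JSON."""
    collector = MetaTagCollector()
    collector.feed(html)
    collector.close()
    # Single-valued fields become plain strings; repeated ones stay lists.
    item = {key: values[0] if len(values) == 1 else values
            for key, values in collector.fields.items()}
    # Fall back to the page <title> if no citation_title tag was present.
    item.setdefault("title", "".join(collector._title_parts).strip())
    return json.dumps(item, sort_keys=True)
```

Because the parsing lives in one function on the server, fixing a stale translator would mean redeploying that function rather than pushing an update to every client.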
> I think this needs to be a project separate from zotero. For one
> thing, the official project has shown no interest in such efforts,
> even seems resistant. For another, there are divergent goals here.
> Zotero's longer-term direction is toward building a community at
> zotero.org, and gaining value from the central data store. People
> running their own servers is a different direction entirely. These
> are incompatible missions at some level, but that doesn't mean they
> can't both thrive, and even produce some positive feedback in both
> directions. Finally, I'll be honest, I'm not comfortable with
> zotero's development process - IMO too opaque and tightly-controlled.
> That's about it, I think a separate repository and separate forum for
> discussion is appropriate. In relation to zotero.org, cooperative and
> collaborative from day one, but distinct.
So I suppose I could ask this question one way ("what changes do you
think would have to be made with zotero proper to avoid the need for
this?") but will instead ask: what concretely are you thinking that
will enable a separate project to be "cooperative and collaborative
from day one"? What sorts of metrics could we look to measure these
sorts of things?
Given my repeated ranting about translators, I'm happy to see this
sort of initiative. But I would suggest you start by articulating
concrete use cases before settling on particular technical solutions.
E.g. I want whatever solution to enable:
1) easier translator authoring
2) mobile and other non-Firefox client support
3) portable translators (right now, I'm interested in the Sakai 3
library integration project being able to reuse the work)
4) ____ ?
BTW, beyond Zotero, the other projects that use similar solutions
include CiteULike, Connotea, Mendeley.
I'd also probably echo some of Alex's concerns about the development process (I'm used to a very anarchistic model in the other open source code I'm involved with). However, I suspect that this is partly because zotero is primarily application code, and the anarchistic model seems less appropriate for this kind of code base (e.g. Linux, OpenOffice) and more suitable for the development of tools. However, it's still a laudable aim to keep the barrier to entry for development in the zotero ecosystem as low as possible. How to achieve this?
Well, the community of developers around CSL shows that the anarchistic model certainly works well for a project with well-defined aims where unit tests are easy to write. I haven't contributed because I haven't had the need, and besides, XML brings me out in a rash. Mind you, I do watch with interest.
However, that's a side issue. What do we need to get the barrier to entry for Zotero as low as possible? The work I've done with zotero-browser is a start - it makes getting started with programming the zotero API as easy as PHP development. I haven't looked at baking zotero+POW+xulrunner together to provide a headless zotero repository, but that's got a lot of potential as an approach. However, the POW Firefox extension seems insufficiently robust, and the maintainer appears to lack the resources to get a version working with FF 3.6 on all platforms. This super-easy development environment (which would be useful for developing bookmarklets, and for prototyping the things needed to get zotero working outside of Firefox) would be a valuable addition to zotero - but I suspect that resources would need to be put in to make it robust, and possibly better integrated with zotero (e.g. addition of convenience functions). This would be a great Google Summer of Code project.
What's this got to do with the server? Directly, not much. Indirectly, we need to describe the core tools for the zotero ecosystem from the perspective of a developer/administrator. As far as I can see, this comprises: The application, the headless xulrunner zotero, my easy dev environment, and the sync server.
Providing these tools so that they scale from the smallest scope (lone developer) to the largest deployment (e.g. zotero.org) without too much effort on behalf of the end user is in my experience probably the best way to get an environment where the social/technical etc. issues become obvious, the process becomes low friction, and the barrier to entry is as low as possible. This low barrier to entry is of critical importance, as the user base for zotero is not your typical programmer, but more likely a researcher who happens to do some programming (like me).
The other part of the process is that, while it's clear that the zotero application needs to be in some sense centrally controlled, the anarchistic model of open source contribution is a much better way to develop a vibrant ecosystem. Having tools in place that make that possible will massively enhance the potential of zotero's tools to have a major influence for the better in aspects of scholarly communication.
A niggle I talked about before, which is more important from a programmer perspective than a user perspective, is to have the documentation available for use and modification offline. I half-implemented a solution based on the wiki and git a while ago, but I stopped caring for a while. However, it is another source of development process friction.
Penultimately, most significant open source projects have at least one IRC channel. I'd recommend looking into establishing a #zotero-dev and #zotero channel, probably on freenode.
Finally, I think it would be a very good idea to start thinking about how to recruit Google Summer of Code students to develop components of the zotero developer ecosystem, as well as considering any other possible sources of funding.
I will get around to looking at the sync server at some point, to look at the feasibility of making it database server agnostic (so a lone dev can use it with sqlite for testing for example). This may or may not be a feasible goal.
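One way to picture the database-agnostic goal is a small storage interface that the sync logic codes against, with SQLite as the backend a lone developer uses for testing. The sketch below is in Python purely for illustration. `ItemStore`, `SqliteItemStore`, `sync_item` and the `items` table are all hypothetical names, not taken from the sync server code.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class ItemStore(ABC):
    """Hypothetical storage interface the sync logic would code against."""

    @abstractmethod
    def put(self, item_key: str, payload: str) -> None:
        ...

    @abstractmethod
    def get(self, item_key: str) -> Optional[str]:
        ...


class SqliteItemStore(ItemStore):
    """SQLite-backed store, e.g. for a lone developer's test setup."""

    def __init__(self, path: str = ":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS items "
            "(item_key TEXT PRIMARY KEY, payload TEXT NOT NULL)"
        )

    def put(self, item_key, payload):
        # INSERT OR REPLACE is SQLite-specific; another backend would
        # implement put() with its own upsert statement.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO items (item_key, payload) VALUES (?, ?)",
                (item_key, payload),
            )

    def get(self, item_key):
        row = self._conn.execute(
            "SELECT payload FROM items WHERE item_key = ?", (item_key,)
        ).fetchone()
        return row[0] if row else None


def sync_item(store: ItemStore, item_key: str, payload: str) -> bool:
    """Write an item only if it changed; returns True when a write happened."""
    if store.get(item_key) == payload:
        return False
    store.put(item_key, payload)
    return True
```

The sync logic (`sync_item` here) never touches SQL directly, so a larger deployment could supply a different `ItemStore` subclass without changing it. Whether the real server's queries can be separated this cleanly is exactly the open feasibility question.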
Just a note on my credentials here. I've been involved in the Catalyst web framework project for a number of years, where I am one of the people seen as in charge of the documentation. I wrote a book on it for the publisher Apress (Springer's tech book arm), and based on that I was the guest speaker at an international (to me) Perl conference where I talked about open source community issues, and documentation (separately). Nonetheless I'm a distinctly amateur programmer who dabbles from time to time. For my real job, at the moment I'm a PhD student doing work on the sociology of IT implementations in aged care homes, and I was unit coordinator for a subject at my institution teaching "Organisational issues in IT", although I've been doing research on organisations for a number of years, mainly with a focus on virtual organisations, and social/IT issues at work.
Catalyst is a good example of a set of tools that, as well as being optimised for developer convenience, scales to the very largest scope - e.g. as well as the single user research data management/analysis tools I write with Catalyst, it also runs the BBC iPlayer and some other very large websites. It's rather more optimised for flexibility than the other major web frameworks (e.g. Rails, Django).
> About metrics, actually I'm not quite sure what you mean by that. I'm
> happy to have a discussion like this, here.
I'm asking a question larger than the particular projects: what
specific things do we need to see to declare what we do "more open"?
So "metrics" might be number of people with particular kinds of
responsibility, commit rights, etc. It might be particular policy
> Ah, I see where you're going Bruce. To over-simplify, seems like
> there's a continuum of process from tight control to what Kieren aptly
> terms the "anarchic" model.
It's becoming clear to me that the GUI is not a particularly good place to apply the anarchistic model. XUL development is quite tricky, and somewhat fragile across Firefox upgrades, so the skills required are quite specialised and high. The internationalisation doesn't really help either (I tried to patch a "list open url resolvers" link into the preferences screen a few days ago and got very lost, because the i18n stuff makes it very difficult to grep the source tree). OpenOffice.org also has problems with using the anarchistic model. Firefox doesn't because of the relative ease with which extensions can be developed, and the distribution mechanism (although harder than the typical skill set of a potential zotero developer I would suspect).
So my key point was that there need to be tools available to provide an easy dev environment by default - call it the zotero developer ecosystem or whatever. That way there's a way to encourage the development of the clearly superior anarchistic model while maintaining the independence of the GUI.
As an aside, I was doing a lecture on the management of open source projects the other week, and I was talking about the different contribution models. Normally the class was pretty quiet, but a student asked if the evolution of low friction development processes in open source was caused by the development of the git version control software. To which the answer was that no, git was a symptom, not a cause, but that it seems to have a strong synergistic effect in lowering the friction of the development process even further.
> To de-simplify a little, there isn't one
> continuum but multiple: continua of design method, development
> process, management structures. And to further complicate things, even
> continua of development tools and especially how they're employed
> (e.g. how broadly commit privileges are granted). Personally I prefer
> some fluidity here. Consider a project like zotero - must be
> thousands more users than a couple years ago when I signed on.
I think I saw in another thread that Zotero has around 4 million users, which is pretty substantial.
> Implementing the sync feature radically changed what it's about. With
> changes like that usually process needs to evolve as well. Multiple,
> interrelated projects, as has already happened with CSL is a natural
> development as well.
> OK, I admit it, I'm theorizing "metrics" not defining them. :-/
> Partly because I think these things have to be hashed out by doing -
> and discussing as well, but you can't just define them in the
> abstract. My attitude is very situational: depending on who else is
> involved, the nature of the problem, and other conditions, you choose
> one point on the continuum vs. another.
> If there's demand for a
> more sophisticated web server, we could potentially use something like
> Mozilla's httpd.js (see http://mxr.mozilla.org/mozilla-central/source/netwerk/test/httpserver/),
> although to date I've shied away from this because the codebase is
> more than 20x the size of our mini-server, and the ability to execute multiple
> requests at once is not a priority for us at this time.
It's not really demand. I've demonstrated the strong potential, but POW seems too fragile and unmaintained. A good quality web server as a separate plugin with some zotero convenience functions would be useful. I wouldn't baulk at httpd.js - it's not heavy like Apache for example ;). In fact it looks about the same size and scale as the production-grade-as-a-reverse-proxy-or-low-user-web-server that I use as a Perl development environment.